Step 1: Hello World
Installing Deep Lake and accessing your first Deep Lake Dataset.
How to Install Deep Lake and Get Started
Installing Deep Lake
Deep Lake can be installed through pip. By default, Deep Lake does not install dependencies for audio, video, google-cloud, and other features. Details on all installation options are available here.
pip install deeplake
Fetching Your First Deep Lake Dataset
Let's load MNIST, the hello world dataset of machine learning.
First, instantiate a Dataset
by pointing to its storage location. Datasets hosted on Activeloop Platform are typically identified by the namespace of the organization followed by the dataset name: activeloop/mnist-train
.
import deeplake
dataset_path = 'hub://activeloop/mnist-train'
ds = deeplake.load(dataset_path) # Returns a Deep Lake Dataset but does not download data locally
Reading Samples From a Deep Lake Dataset
Data is not immediately read into memory because Deep Lake operates lazily. You can fetch data by calling the .numpy()
method, which reads data into a NumPy array.
# Indexing
img = ds.images[0].numpy() # Fetch the 1st image and return a NumPy array
label = ds.labels[0].numpy(aslist=True) # Fetch the 1st label and store it as a
# as a list
text_labels = ds.labels[0].data()['text'] # Fetch the first labels and return them as text
# Slicing
imgs = ds.images[0:100].numpy() # Fetch 100 images and return a NumPy array
# The method above produces an exception if
# the images are not all the same size
labels = ds.labels[0:100].numpy(aslist=True) # Fetch 100 labels and store
# them as a list of NumPy arrays
Congratulations, you've got Deep Lake working on your local machine🤓
Last updated
Was this helpful?