v3.1.5
Step 1: Hello World

Installing Deep Lake and accessing your first Deep Lake Dataset.

Last updated 2 years ago

How to Install Deep Lake and Get Started

Installing Deep Lake

Deep Lake can be installed through pip. By default, Deep Lake does not install dependencies for audio, video, google-cloud, and other features. Details on all installation options are available in the Deep Lake documentation.

pip install deeplake
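
After installing, it can help to confirm which version is present, since this page is written for v3.1.5. The following is a minimal sketch that uses only the Python standard library; the helper name `installed_version` is hypothetical and not part of Deep Lake.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed.

    Hypothetical helper for illustration; it is not part of Deep Lake.
    """
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("deeplake")
if version is None:
    print("deeplake is not installed; run: pip install deeplake")
else:
    print(f"deeplake {version} is installed")
```

If the reported version differs from the one this page documents, some calls shown here may behave differently.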

Fetching Your First Deep Lake Dataset

Let's load MNIST, the hello world dataset of machine learning.

First, instantiate a Dataset by pointing to its storage location. Datasets hosted on Activeloop Platform are typically identified by the namespace of the organization followed by the dataset name: activeloop/mnist-train.

import deeplake

dataset_path = 'hub://activeloop/mnist-train'
ds = deeplake.load(dataset_path) # Returns a Deep Lake Dataset but does not download data locally
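
The path convention described above can be sketched as a small helper. The function name `hub_path` is hypothetical and only illustrates the `hub://<organization>/<dataset>` pattern seen in this example; the `deeplake.load` call is the one already shown and is left commented out so the sketch runs without network access.

```python
def hub_path(org: str, dataset: str) -> str:
    """Build a path of the form hub://<org>/<dataset> (hypothetical helper)."""
    if not org or not dataset:
        raise ValueError("both the organization and the dataset name are required")
    return f"hub://{org}/{dataset}"


dataset_path = hub_path("activeloop", "mnist-train")
print(dataset_path)  # prints hub://activeloop/mnist-train

# With Deep Lake installed, the dataset can then be loaded as shown earlier:
# import deeplake
# ds = deeplake.load(dataset_path)
```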

Reading Samples From a Deep Lake Dataset

Data is not immediately read into memory because Deep Lake operates lazily. You can fetch data by calling the .numpy() method, which reads data into a NumPy array.

# Indexing
img = ds.images[0].numpy()              # Fetch the 1st image and return a NumPy array
label = ds.labels[0].numpy(aslist=True) # Fetch the 1st label and return it as a list

text_labels = ds.labels[0].data()['text'] # Fetch the 1st label and return it as text

# Slicing
imgs = ds.images[0:100].numpy() # Fetch 100 images and return a NumPy array
                                # The method above produces an exception if 
                                # the images are not all the same size

labels = ds.labels[0:100].numpy(aslist=True) # Fetch 100 labels and store 
                                             # them as a list of NumPy arrays
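
To make the idea of lazy access concrete, here is a toy sketch in plain Python. It is not Deep Lake's implementation: the class `LazySamples`, its `fetch` method, and `fake_loader` are hypothetical names chosen for illustration. Indexing and slicing only record which samples are wanted, and data is read when `fetch` is called, in the same spirit as calling `.numpy()` above.

```python
from typing import Callable, List


class LazySamples:
    """Toy stand-in for a lazily loaded column; not Deep Lake's implementation."""

    def __init__(self, loader: Callable[[int], List[int]], indices: List[int]):
        self._loader = loader    # called only when data is actually requested
        self._indices = indices  # which samples this view refers to

    def __getitem__(self, key):
        # Indexing or slicing returns another lazy view; nothing is loaded here.
        picked = self._indices[key]
        if isinstance(picked, int):
            picked = [picked]
        return LazySamples(self._loader, picked)

    def fetch(self) -> List[List[int]]:
        # Data is read only at this point, one loader call per selected sample.
        return [self._loader(i) for i in self._indices]


load_calls: List[int] = []


def fake_loader(i: int) -> List[int]:
    """Pretend to read sample i from storage and record that a read happened."""
    load_calls.append(i)
    return [i, i + 1]


samples = LazySamples(fake_loader, list(range(1000)))
view = samples[0:100]                  # no reads yet
calls_before_fetch = len(load_calls)
data = view.fetch()                    # reads exactly the 100 selected samples
calls_after_fetch = len(load_calls)
print(calls_before_fetch, calls_after_fetch)
```

The point of this design is that selecting a subset is cheap, and the cost of reading is paid only for the samples that are eventually fetched.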

Congratulations, you've got Deep Lake working on your local machine 🤓