LogoLogo
API ReferenceGitHubSlackService StatusLogin
v3.8.16
v3.8.16
  • Deep Lake Docs
  • Vector Store Quickstart
  • Deep Learning Quickstart
  • Storage & Credentials
    • Storage Options
    • User Authentication
    • Storing Deep Lake Data in Your Own Cloud
      • Microsoft Azure
        • Provisioning Federated Credentials
        • Enabling CORS
      • Amazon Web Services
        • Provisioning Role-Based Access
        • Enabling CORS
  • List of ML Datasets
  • 🏢High-Performance Features
    • Introduction
    • Performant Dataloader
    • Tensor Query Language (TQL)
      • TQL Syntax
      • Sampling Datasets
    • Deep Memory
      • How it Works
    • Index for ANN Search
      • Caching and Optimization
    • Managed Tensor Database
      • REST API
      • Migrating Datasets to the Tensor Database
  • 📚EXAMPLE CODE
    • Getting Started
      • Vector Store
        • Step 1: Hello World
        • Step 2: Creating Deep Lake Vector Stores
        • Step 3: Performing Search in Vector Stores
        • Step 4: Customizing Vector Stores
      • Deep Learning
        • Step 1: Hello World
        • Step 2: Creating Deep Lake Datasets
        • Step 3: Understanding Compression
        • Step 4: Accessing and Updating Data
        • Step 5: Visualizing Datasets
        • Step 6: Using Activeloop Storage
        • Step 7: Connecting Deep Lake Datasets to ML Frameworks
        • Step 8: Parallel Computing
        • Step 9: Dataset Version Control
        • Step 10: Dataset Filtering
    • Tutorials (w Colab)
      • Vector Store Tutorials
        • Vector Search Options
          • Deep Lake Vector Store API
          • REST API
          • LangChain API
        • Image Similarity Search
        • Deep Lake Vector Store in LangChain
        • Deep Lake Vector Store in LlamaIndex
        • Improving Search Accuracy using Deep Memory
      • Deep Learning Tutorials
        • Creating Datasets
          • Creating Complex Datasets
          • Creating Object Detection Datasets
          • Creating Time-Series Datasets
          • Creating Datasets with Sequences
          • Creating Video Datasets
        • Training Models
          • Splitting Datasets for Training
          • Training an Image Classification Model in PyTorch
          • Training Models Using MMDetection
          • Training Models Using PyTorch Lightning
          • Training on AWS SageMaker
          • Training an Object Detection and Segmentation Model in PyTorch
        • Updating Datasets
        • Data Processing Using Parallel Computing
      • Concurrent Writes
        • Concurrency Using Zookeeper Locks
    • Playbooks
      • Querying, Training and Editing Datasets with Data Lineage
      • Evaluating Model Performance
      • Training Reproducibility Using Deep Lake and Weights & Biases
      • Working with Videos
    • Low-Level API Summary
  • 🔬Technical Details
    • Best Practices
      • Creating Datasets at Scale
      • Training Models at Scale
      • Storage Synchronization and "with" Context
      • Restoring Corrupted Datasets
      • Concurrent Writes
    • Data Layout
    • Version Control and Querying
    • Dataset Visualization
    • Tensor Relationships
    • Visualizer Integration
    • Shuffling in dataloaders
    • How to Contribute
Powered by GitBook
On this page
  • How to Install Deep Lake and Get Started
  • Installing Deep Lake
  • Fetching Your First Deep Lake Dataset
  • Reading Samples From a Deep Lake Dataset

Was this helpful?

  1. EXAMPLE CODE
  2. Getting Started
  3. Deep Learning

Step 1: Hello World

Installing Deep Lake and accessing your first Deep Lake Dataset.

PreviousDeep LearningNextStep 2: Creating Deep Lake Datasets

Was this helpful?

How to Install Deep Lake and Get Started

Installing Deep Lake

Deep Lake can be installed through pip. By default, Deep Lake does not install dependencies for audio, video, google-cloud, and other features. .

! pip install deeplake

Fetching Your First Deep Lake Dataset

Let's load MNIST, the hello world dataset of machine learning.

First, instantiate a Dataset by pointing to its storage location. Datasets hosted on Activeloop Platform are typically identified by the namespace of the organization followed by the dataset name: activeloop/mnist-train.

import deeplake

dataset_path = 'hub://activeloop/mnist-train'
ds = deeplake.load(dataset_path) # Returns a Deep Lake Dataset but does not download data locally

Reading Samples From a Deep Lake Dataset

Data is not immediately read into memory because Deep Lake operates . You can fetch data by calling the .numpy() method, which reads data into a NumPy array.

# Indexing
img = ds.images[0].numpy()              # Fetch the 1st image and return a NumPy array
label = ds.labels[0].numpy(aslist=True) # Fetch the 1st label and store it as a 
                                        # as a list
                              
text_labels = ds.labels[0].data()['text'] # Fetch the first labels and return them as text

# Slicing
imgs = ds.images[0:100].numpy() # Fetch 100 images and return a NumPy array
                                # The method above produces an exception if 
                                # the images are not all the same size

labels = ds.labels[0:100].numpy(aslist=True) # Fetch 100 labels and store 
                                             # them as a list of NumPy arrays

Congratulations, you've got Deep Lake working on your local machine

📚
🤓
Details on all installation options are available here
lazily