Deep Learning Quickstart

A jump-start guide to using Deep Lake for Deep Learning.


How to Get Started with Deep Learning in Deep Lake in Under 5 Minutes

Installing Deep Lake

Deep Lake can be installed using pip. By default, Deep Lake does not install dependencies for video, google-cloud, compute engine, and other features. Details on all installation options are available in the Installation guide.

!pip install deeplake

Fetching Your First Deep Lake Dataset

Let's load the Visdrone dataset, a rich dataset with many object detections per image. Datasets hosted by Activeloop are identified by the host organization id followed by the dataset name: activeloop/visdrone-det-train.

import deeplake

dataset_path = 'hub://activeloop/visdrone-det-train'
ds = deeplake.load(dataset_path) # Returns a Deep Lake Dataset but does not download data locally

Reading Samples From a Deep Lake Dataset

Data is not immediately read into memory because Deep Lake operates lazily. You can fetch data by calling the .numpy() or .data() methods:

# Indexing
image = ds.images[0].numpy() # Fetch the first image and return a numpy array
labels = ds.labels[0].data() # Fetch the labels in the first image

# Slicing
label_list = ds.labels[0:100].numpy(aslist=True) # Fetch the first 100 labels and store
                                                 # them as a list of numpy arrays

Other metadata such as the mapping between numerical labels and their text counterparts can be accessed using:

labels_list = ds.labels.info['class_names']
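The numeric labels returned by .numpy() index into this class_names list. A minimal, self-contained sketch of that mapping (the class_names values below are hypothetical stand-ins for the actual contents of ds.labels.info['class_names']):

```python
# Hypothetical stand-in for ds.labels.info['class_names']
class_names = ['pedestrian', 'car', 'van']

# Numeric labels as returned by, e.g., ds.labels[0].numpy().flatten()
numeric_labels = [0, 2]

# Map each numeric label to its text name
text_labels = [class_names[i] for i in numeric_labels]
print(text_labels)  # ['pedestrian', 'van']
```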

Visualizing a Deep Lake Dataset

Deep Lake enables users to visualize and interpret large datasets. The tensor layout for a dataset can be inspected with ds.summary(), and the dataset can be rendered with ds.visualize():

ds.summary()
ds.visualize()

Creating Your Own Deep Lake Datasets

You can access all of the features above and more with your own datasets! If your source data conforms to one of the supported formats (YOLO, COCO, or classification folders), you can ingest it directly with one line of code. The ingestion functions support source data from the cloud, as well as creation of Deep Lake datasets in the cloud.

For example, a COCO format dataset can be ingested using:

dataset_path = 's3://bucket_name_deeplake/dataset_name' # Destination for the Deep Lake dataset

images_folder = 's3://bucket_name_source/images_folder'
annotations_files = ['s3://bucket_name_source/annotations.json'] # Can be a list of COCO jsons.

ds = deeplake.ingest_coco(images_folder, annotations_files, dataset_path, src_creds = {...}, dest_creds = {...})
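For context, each COCO annotation file is a JSON document with three top-level keys: images, annotations, and categories. A minimal in-memory sketch (the dict below is illustrative only, standing in for a parsed annotations.json):

```python
# Minimal stand-in for a parsed COCO annotations.json
coco = {
    "images": [{"id": 1, "file_name": "0001.jpg", "width": 640, "height": 480}],
    "annotations": [{"id": 1, "image_id": 1, "category_id": 3,
                     "bbox": [10, 20, 50, 80]}],  # bbox is [x, y, width, height]
    "categories": [{"id": 3, "name": "car"}],
}

# A well-formed COCO file contains these three top-level keys
assert {"images", "annotations", "categories"} <= set(coco)
print(len(coco["annotations"]), "annotation(s) across", len(coco["images"]), "image(s)")
```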

Authentication

Environment Variable

Set the environment variable ACTIVELOOP_TOKEN to your API token. In Python, this can be done using:

import os

os.environ['ACTIVELOOP_TOKEN'] = '<your_token>'
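Any Deep Lake call made later in the same process picks the token up from the environment. A quick sanity check (the token value here is a placeholder, not a real token):

```python
import os

# Placeholder value for illustration; substitute your real API token
os.environ['ACTIVELOOP_TOKEN'] = 'example-token'

# Confirm the variable is visible to the current process
assert 'ACTIVELOOP_TOKEN' in os.environ
print(os.environ['ACTIVELOOP_TOKEN'])  # example-token
```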

Pass the Token to Individual Methods

You can pass your API token to individual methods that require authentication such as:

ds = deeplake.load('hub://org_name/dataset_name', token='<your_token>')

Next Steps

The dataset can be visualized in the Deep Lake UI, or in an iframe in a Jupyter notebook using ds.visualize().

Visualizing datasets in the Deep Lake UI will unlock more features and faster performance compared to visualization in Jupyter notebooks.

For creating datasets that do not conform to one of the formats above, you can use our methods for manually creating datasets and tensors and populating them with data.

To use Deep Lake features that require authentication (Activeloop storage, Tensor Database storage, connecting your cloud dataset to the Deep Lake UI, etc.), you should register in the Deep Lake App and authenticate on the client using the methods described above.

Check out our Getting Started Guide for a comprehensive walk-through of Deep Lake. Also check out tutorials on Running Queries, Training Models, and Creating Datasets, as well as Playbooks about powerful use-cases that are enabled by Deep Lake.

Congratulations, you've got Deep Lake working on your local machine! 🤓
