v3.0.10
Performant Dataloader (Beta)

How to use Deep Lake's new dataloader built and optimized in C++


Last updated 2 years ago


How to use Deep Lake's performant Dataloader built and optimized in C++

Deep Lake offers an Alpha version of its dataloader that was built and optimized in C++. The new dataloader is 2-3X faster in many applications, but because it is still experimental, it is not as reliable as the pure-Python dataloader.

Both dataloaders can be used interchangeably, but their syntax differs, as shown below.

Pure-Python Dataloader

# Pure-Python dataloader: all options are keyword arguments to ds.pytorch().
train_loader = ds_train.pytorch(num_workers=8,
                                transform=transform,
                                batch_size=32,
                                tensors=['images', 'labels'],
                                shuffle=True)

C++ Dataloader

The C++ dataloader is currently available only on Linux machines. It also returns image tensors as PIL images, whereas the pure-Python dataloader returns them as numpy arrays.
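Because the two dataloaders hand images to the transform in different types, a transform shared between them may need to accept both. The sketch below is one way to do that, under the assumption that the transform receives a dict-like sample keyed by tensor name (`images`, `labels`). The helper name `to_numpy_image` is hypothetical and not part of Deep Lake. It relies on `np.asarray`, which accepts both PIL images and numpy arrays.

```python
import numpy as np


def to_numpy_image(image):
    """Return the image as a numpy array, whether it arrives as a
    PIL image (C++ dataloader) or a numpy array (pure-Python dataloader)."""
    # np.asarray converts PIL images and passes numpy arrays through unchanged.
    return np.asarray(image)


def transform(sample):
    # Assumption: `sample` maps tensor names to values for a single sample.
    return {
        "images": to_numpy_image(sample["images"]),
        "labels": sample["labels"],
    }
```

With a transform like this, the same function can be passed to either dataloader. If the rest of the pipeline expects PIL images instead, the conversion would go in the opposite direction.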

from deeplake.experimental import dataloader

# C++ dataloader: options are chained as method calls, ending with .pytorch().
train_loader = (dataloader(ds_train)
                .transform(transform)
                .batch(32)
                .shuffle()
                .pytorch(tensors=['images', 'labels'], num_workers=8))
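Since the speedup depends on the application, it can be worth timing both dataloaders on the actual dataset before switching. The following is a minimal sketch that uses only the Python standard library. The helper name `time_loader` and its `max_batches` parameter are hypothetical, and it assumes only that the loader is iterable.

```python
import time
from itertools import islice


def time_loader(loader, max_batches=100):
    """Iterate over at most `max_batches` batches of `loader`.

    Returns (number of batches read, elapsed seconds).
    """
    start = time.perf_counter()
    count = 0
    for _ in islice(loader, max_batches):
        count += 1
    elapsed = time.perf_counter() - start
    return count, elapsed


# Example usage, with any iterable standing in for a dataloader:
# n, seconds = time_loader(train_loader, max_batches=50)
# print(f"{n} batches in {seconds:.2f} s")
```

Timing the same number of batches with identical `batch_size`, `num_workers`, and `transform` settings keeps the comparison fair. The first few batches may include worker start-up time, so a larger `max_batches` gives a steadier estimate.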