Step 4: Accessing Data

Learn how Deep Lake Datasets can be accessed or loaded from a variety of storage locations.

How to Access and Load Datasets with Deep Lake

Loading Datasets

Deep Lake Datasets can be loaded from a variety of storage locations using:

import deeplake

# Local Filepath
ds = deeplake.load('./my_dataset_path') # Similar functionality to deeplake.dataset(path)

# S3
ds = deeplake.load('s3://my_dataset_bucket', creds={...})

# Public Dataset hosted by Activeloop
## Activeloop Storage - See Step 6
ds = deeplake.load('hub://activeloop/public_dataset_name')

# Dataset in another organization on Activeloop Platform
ds = deeplake.load('hub://org_name/dataset_name')

Referencing Tensors

Deep Lake lets you reference a specific tensor either by key or with "." (attribute) notation. Both forms return the same tensor object.

Note: these commands only return references to tensors; no data is loaded yet.

Accessing Data

Data within the tensors is loaded and accessed using the .numpy(), .data(), and .tobytes() methods. When the underlying data can be converted to a NumPy array, .data() and .numpy() return equivalent objects.

The .numpy() method raises an exception if the samples in the requested tensor do not all have the same shape. In that case, .numpy(aslist=True) solves the problem by returning a list of NumPy arrays, one per sample, in sample order.
