Data Processing Using Parallel Computing
Deep Lake offers built-in methods for parallelizing dataset computations to achieve faster data processing.
How to use deeplake.compute for parallelizing workflows
This tutorial is also available as a Colab Notebook.
Transformations on New Datasets
import deeplake
from PIL import Image
import numpy as np

@deeplake.compute
def flip_vertical(sample_in, sample_out):
    # The first two arguments are always present:
    #   1st argument: an element of the input iterable (list, dataset, array, ...)
    #   2nd argument: a dataset sample
    # Append the label and the vertically flipped image to the output sample
    sample_out.append({'labels': sample_in.labels.numpy(),
                       'images': np.flip(sample_in.images.numpy(), axis=0)})
    return sample_out

Transformations on Existing Datasets
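Regardless of whether the transform writes to a new or an existing dataset, the image operation inside `flip_vertical` is plain NumPy: `np.flip(..., axis=0)` reverses the row (height) axis. A standalone check with a toy 2×2 image (the array values here are illustrative only, not from the tutorial):

```python
import numpy as np

# Toy 2x2 "image": rows correspond to the height axis (axis 0)
image = np.array([[1, 2],
                  [3, 4]])

# The same operation flip_vertical applies to sample_in.images.numpy()
flipped = np.flip(image, axis=0)

print(flipped)  # rows reversed: [[3, 4], [1, 2]]
```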
Dataset Processing Pipelines
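A processing pipeline chains several per-sample functions so that each sample flows through every step in order. The sketch below mimics that idea with plain Python function composition; the names `run_pipeline`, `flip_vertical_np`, and `normalize` are illustrative helpers, not part of the Deep Lake API:

```python
import numpy as np

def flip_vertical_np(img):
    # Step 1: reverse the height axis, as in the decorated function above
    return np.flip(img, axis=0)

def normalize(img):
    # Step 2 (illustrative): scale pixel values from [0, 255] to [0, 1]
    return img / 255.0

def run_pipeline(sample, steps):
    # Apply each step in order, like a composed transform pipeline
    for step in steps:
        sample = step(sample)
    return sample

steps = [flip_vertical_np, normalize]
out = run_pipeline(np.array([[0, 255], [255, 0]]), steps)
print(out)  # [[1.0, 0.0], [0.0, 1.0]]
```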
Recovering From Errors
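When a transform fails partway through a large dataset, a common recovery pattern is to catch per-sample errors, record the failing indices, and reprocess only those samples afterward. This is a generic sketch of that pattern using only the standard library, not Deep Lake's actual error-handling API:

```python
def process_all(samples, fn):
    # Run fn over every sample; collect failures instead of aborting the run
    results, failed = {}, []
    for i, sample in enumerate(samples):
        try:
            results[i] = fn(sample)
        except Exception:
            failed.append(i)  # remember the index for a later retry pass
    return results, failed

# Illustrative transform that rejects negative values
def double_positive(x):
    if x < 0:
        raise ValueError("bad sample")
    return 2 * x

results, failed = process_all([1, -2, 3], double_positive)
print(results, failed)  # {0: 2, 2: 6} [1]
```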