TQL Syntax
How to properly format TQL queries
Query syntax for the Tensor Query Language (TQL)
CONTAINS and ==
-- Exact match, which generally requires that the sample
-- has 1 value, i.e. no lists or multi-dimensional arrays
select * where tensor_name == 'text_value' # If value is text
select * where tensor_name == numeric_value # If value is numeric
select * where contains(tensor_name, 'text_value')
Tensor or group names with special characters should be wrapped with double-quotes:
select * where contains("tensor-name", 'text_value')
select * where "tensor_name/group_name" == numeric_value
Make sure to wrap double-quotes with escape characters in Python:
select * where contains(\"tensor-name\", 'text_value')
SHAPE
select * where shape(tensor_name)[dimension_index] > numeric_value
select * where shape(tensor_name)[1] > numeric_value # Second array dimension > value
LIMIT
AND, OR, NOT
UNION and INTERSECT
ORDER BY
ANY, ALL, and ALL_STRICT
all adheres to NumPy and list logic where all(empty_sample) returns True
all_strict is more intuitive for queries so all_strict(empty_sample) returns False
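The two operators differ only for empty samples. A minimal Python sketch of that behaviour follows; the `all_strict` function below is a hypothetical stand-in written for illustration, not Deep Lake's implementation.

```python
def all_strict(values):
    """Hypothetical stand-in for TQL's all_strict: an empty sample does not match."""
    values = list(values)
    return len(values) > 0 and all(values)


empty_sample = []
print(all(empty_sample))         # True: Python's built-in all, like NumPy, accepts empty input
print(all_strict(empty_sample))  # False: an empty sample is rejected
print(all_strict([True, True]))  # True: non-empty samples behave the same as all
```

In a filter, this means `all` keeps samples that have no values for the tensor, while `all_strict` drops them.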
IN and BETWEEN
Only works for scalar numeric values and text references to class_names
LOGICAL_AND and LOGICAL_OR
REFERENCING SAMPLES IN EXISTING TENSORS
SAMPLE BY
weight_choice resolves the weight that is used when multiple expressions evaluate to True for a given sample. Options are max_weight, sum_weight. For example, if weight_choice is max_weight, then the maximum weight will be chosen for that sample.
replace determines whether samples should be drawn with replacement. It defaults to True.
limit specifies the number of samples that should be returned. If unspecified, the sampler will return the number of samples corresponding to the length of the dataset.
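The interaction of the three options can be sketched in plain Python. This is an illustration of the semantics described above, not Deep Lake's sampler: `resolve_weight` and `sample_by` are hypothetical helpers, and the without-replacement strategy shown here is an assumption.

```python
import random


def resolve_weight(matching_weights, weight_choice):
    """Combine the weights of every expression that evaluated to True for one sample."""
    if not matching_weights:
        return 0.0  # no expression matched, so the sample is never drawn
    if weight_choice == "max_weight":
        return max(matching_weights)
    if weight_choice == "sum_weight":
        return sum(matching_weights)
    raise ValueError(f"unknown weight_choice: {weight_choice}")


def sample_by(per_sample_weights, weight_choice, replace=True, limit=None, seed=None):
    """Return sampled indices; per_sample_weights[i] lists the weights matched by sample i."""
    rng = random.Random(seed)
    weights = [resolve_weight(w, weight_choice) for w in per_sample_weights]
    indices = list(range(len(weights)))
    # With no limit, return as many samples as the dataset has.
    k = len(indices) if limit is None else limit
    if replace:
        return rng.choices(indices, weights=weights, k=k)
    # Without replacement: draw one index at a time and remove it from the pool.
    pool = [(i, w) for i, w in zip(indices, weights) if w > 0]
    chosen = []
    while pool and len(chosen) < k:
        pick = rng.choices(range(len(pool)), weights=[w for _, w in pool], k=1)[0]
        chosen.append(pool.pop(pick)[0])
    return chosen


# Sample 0 matched two expressions, sample 1 matched one, sample 2 matched none.
matched = [[0.5, 0.25], [1.0], []]
print(resolve_weight(matched[0], "max_weight"))  # 0.5
print(resolve_weight(matched[0], "sum_weight"))  # 0.75
print(len(sample_by(matched, "max_weight", seed=0)))  # 3, the dataset length
```

With `sum_weight`, a sample that satisfies several expressions becomes more likely to be drawn than with `max_weight`, which only considers its single largest weight.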
EMBEDDING SEARCH
Deep Lake supports several vector operations for embedding search. Typically, a vector operation is used by ordering the returned data by the score that the chosen vector search method computes.
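To make "ordered by the score" concrete, here is a small Python sketch that ranks embeddings against a query vector. Cosine similarity is used only as an example score, and `rank_by_score` is a hypothetical helper, not part of the Deep Lake API.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_by_score(embeddings, query, limit=None):
    """Return (index, score) pairs ordered from most to least similar."""
    scored = [(i, cosine_similarity(e, query)) for i, e in enumerate(embeddings)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored if limit is None else scored[:limit]


embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
top = rank_by_score(embeddings, query=[1.0, 0.0], limit=2)
print([i for i, _ in top])  # [0, 2]
```

For a distance-based score such as an L2 norm, the ordering would be ascending instead, since smaller distances mean closer matches.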
VIRTUAL TENSORS
Virtual tensors are the result of a computation and are not tensors in the Deep Lake dataset. However, they can be treated as tensors in the API.
When combining embedding search with filtering (where conditions), the filter condition is evaluated prior to the embedding search.
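The evaluation order matters when a limit is applied: the nearest samples are chosen from those that pass the filter, not from the whole dataset. A minimal Python sketch of that order, using a hypothetical `filtered_search` helper and made-up sample data:

```python
def filtered_search(samples, predicate, query, limit):
    """Apply the filter first, then rank only the surviving samples by L2 distance."""
    candidates = [(i, s) for i, s in enumerate(samples) if predicate(s)]

    def l2_distance(vec):
        return sum((x - y) ** 2 for x, y in zip(vec, query)) ** 0.5

    candidates.sort(key=lambda pair: l2_distance(pair[1]["embedding"]))
    return [i for i, _ in candidates[:limit]]


samples = [
    {"label": "cat", "embedding": [0.0, 0.0]},
    {"label": "dog", "embedding": [0.1, 0.0]},  # nearest to the query, but filtered out
    {"label": "cat", "embedding": [0.9, 0.0]},
]
result = filtered_search(samples, lambda s: s["label"] == "cat", query=[0.1, 0.0], limit=2)
print(result)  # [0, 2]
```

Because the filter runs first, the result still contains two samples even though the overall nearest neighbour does not satisfy the condition.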
GROUP BY AND UNGROUP BY
Group by creates a sequence of data based on the common properties that are being grouped (e.g. frames into videos). Ungroup by splits sequences into their individual elements (e.g. videos into images).
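The two operations are inverses in spirit, which a short Python sketch can show. The helpers `group_by` and `ungroup_by` and the `video_id` field are hypothetical names chosen for illustration; the sketch keeps groups in first-seen order, which is an assumption rather than documented TQL behaviour.

```python
def group_by(samples, key):
    """Collect samples that share the same key value into sequences."""
    groups = {}
    for sample in samples:
        groups.setdefault(sample[key], []).append(sample)
    return list(groups.values())


def ungroup_by(sequences):
    """Split sequences back into a flat list of individual elements."""
    return [sample for sequence in sequences for sample in sequence]


frames = [
    {"video_id": "a", "frame": 0},
    {"video_id": "b", "frame": 0},
    {"video_id": "a", "frame": 1},
]
videos = group_by(frames, "video_id")
print(len(videos))                       # 2
print([f["frame"] for f in videos[0]])   # [0, 1]
print(len(ungroup_by(videos)))           # 3
```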
EXPAND BY
Expand by includes the samples before and after each sample that satisfies a query condition.
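One way to picture this is as a window around each matching sample. The Python sketch below assumes the window is given as a count of samples before and after each match; `expand_by`, `before`, and `after` are hypothetical names, not TQL keywords.

```python
def expand_by(matches, before, after):
    """Return sorted indices of matching samples plus their neighbours in the window."""
    n = len(matches)
    keep = set()
    for i, hit in enumerate(matches):
        if hit:
            # Clamp the window to the bounds of the dataset.
            keep.update(range(max(0, i - before), min(n, i + after + 1)))
    return sorted(keep)


# Only the sample at index 2 satisfies the query condition.
matches = [False, False, True, False, False, False]
print(expand_by(matches, before=1, after=2))  # [1, 2, 3, 4]
```

Overlapping windows from neighbouring matches are merged, so no sample is returned twice.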