Vector Functions
This page covers vector operations and distance functions available in Rockset. Rockset internally handles and manipulates vectors as homogeneous arrays.
Note: Arrays input to vector functions must have the same lengths and the elements must be of the types int
or float
.
Otherwise the functions will throw an error. Various error messages are outlined in the following sections.
Operations between an array of ints
and an array of floats
will return an array of floats
. Learn more about type functions.
Vector Operations
VECTOR_ENFORCE
VECTOR_ENFORCE(array, length, type)
Returns the input vector if it fulfills the length and type requirements. Otherwise returns null
. Input the length argument as an int
and the type argument as the string
'int' or 'float'.
Use VECTOR_ENFORCE
within your ingest transformations to ensure
compatibility between vectors during query execution. VECTOR_ENFORCE
also
signals to Rockset that the ingested array should be treated as a vector for
storage and access optimizations.
SELECT
VECTOR_ENFORCE([1, 2, 3], 3, 'int')
[1, 2, 3]
SELECT
VECTOR_ENFORCE([1, 2, 3], 3, 'float')
null
SELECT
VECTOR_ENFORCE([1.5, 2.6, 3.7, 4.8], 4, 'float')
[1.5, 2.6, 3.7, 4.8]
SELECT
VECTOR_ENFORCE([1.5, 2.6, 3.7, 4.8], 1, 'float')
null
SELECT
VECTOR_ENFORCE([1.5, 2.6, 3.7, 4.8], 4, 1)
Error: Passed in type must be a name of type string not of type int.
VECTOR_ADD
VECTOR_ADD(array, addend)
Adds a scalar of type int
or float
to a vector. Alternatively, adds two vectors.
SELECT
VECTOR_ADD([1, 2, 3, 4], 4)
[5, 6, 7, 8]
SELECT
VECTOR_ADD([1, 2, 3, 4], [5, 6, 7, 8])
[6, 8, 10, 12]
SELECT
VECTOR_ADD([1.5, 2.6, 3.7, 4.8], 4.4)
[5.9, 7.0, 8.1, 9.2]
SELECT
VECTOR_ADD([1.1, 2.2, 3.3, 4.3], [1.5, 2.6, 3.7, 4.8])
[2.6, 4.8, 7.0, 9.1]
SELECT
VECTOR_ADD(
[1.1, 2.2, 3.3, 4.3],
['0':1.5, '1':2.6, '2':3.7, '3':4.8]
)
Error: Cannot perform vector operations on datatype object.
VECTOR_SUBTRACT
VECTOR_SUBTRACT(array, subtrahend)
Subtracts a scalar of type int
or float
from a vector. Alternatively, subtracts a vector from another vector.
SELECT
VECTOR_SUBTRACT([5, 6, 7, 8], 4)
[1, 2, 3, 4]
SELECT
VECTOR_SUBTRACT([5, 6, 7, 8], [1, 2, 3, 4])
[4, 4, 4, 4]
SELECT
VECTOR_SUBTRACT([5.1, 6.2, 7.3, 8.4], 4.4)
[0.7, 1.8, 2.9, 4.0]
SELECT
VECTOR_SUBTRACT([5.5, 6.6, 7.7, 8.8], [1.1, 2.2, 3.3, 4.4])
[4.4, 4.4, 4.4, 4.4]
SELECT
VECTOR_SUBTRACT([1.1], [5.5, 6.6, 7.7, 8.8])
Error: Cannot apply operation VECTOR_SUBTRACT on vectors of different sizes 1 and 4.
VECTOR_MULTIPLY
VECTOR_MULTIPLY(array, multiplier)
Multiplies a vector by a scalar of type int
or float
or computes the element-wise multiplication of two vectors as the Hadamard product.
SELECT
VECTOR_MULTIPLY([1, 2, 3, 4], 2)
[2, 4, 6, 8]
SELECT
VECTOR_MULTIPLY([1, 2, 3, 4], [1, 2, 3 , 4])
[1, 4, 9, 16]
SELECT
VECTOR_MULTIPLY([5, 6, 7, 8], 1.5)
[7.5, 9.0, 10.5, 12.0]
SELECT
VECTOR_MULTIPLY([5.5, 6.6, 7.7, 8.8], [5.5, 6.6, 7.7, 8.8])
[30.25, 43.56, 59.29, 77.44]
SELECT
VECTOR_MULTIPLY([5.5, 6.6, 7.7, 8.8], [5.5, 6.6, 7.7, 'foo'])
Error: Cannot perform vector operations on datatype `string`.
VECTOR_DIVIDE
VECTOR_DIVIDE(array, divisor)
Divides a vector by a scalar of type int
or float
or computes the element-wise divison of two vectors. Throws an error when division by zero occurs.
SELECT
VECTOR_DIVIDE([2, 4, 6, 8], 2)
[1, 2, 3, 4]
SELECT
VECTOR_DIVIDE([2, 4, 6, 8], [2, 4, 6, 8])
[1, 1, 1, 1]
SELECT
VECTOR_DIVIDE([2.2, 4.4, 6.6, 8.8], 2.2)
[1.0, 2.0, 3.0, 4.0]
SELECT
VECTOR_DIVIDE([2.2, 4.4, 6.6, 8.8], [2.2, 4.4, 6.6, 8.8])
[1.0, 1.0, 1.0, 1.0]
SELECT
VECTOR_DIVIDE([2, 4, 6, 8], 0)
Error: The divisor in a VECTOR_DIVIDE operation was zero.
SELECT
VECTOR_DIVIDE([2.2, 4.4, 6.6, 8.8], [2.2, 4.4, 0.0, 0.0])
Error: The divisor in a VECTOR_DIVIDE operation was zero.
Vector Distance Functions
DOT_PRODUCT
DOT_PRODUCT(array, array)
Computes the dot product of two vectors.
SELECT
DOT_PRODUCT([1, 2, 3, 4], [5, 6, 7, 8])
70
(= 1 * 5 + 2 * 6 + 3 * 7 + 4 * 8)
SELECT
DOT_PRODUCT([1.1, 2.2, 3.3, 4.4], [5.5, 6.6, 7.7, 8.8])
84.7
(= 1.1 * 5.5 + 2.2 * 6.6 + 3.3 * 7.7 + 4.4 * 8.8)
SELECT
DOT_PRODUCT([5, 6], [1, 2, 3, 4])
Error: Cannot apply operation DOT_PRODUCT on vectors of different sizes 2 and 4.
EUCLIDEAN_DIST
EUCLIDEAN_DIST(array, array)
Computes the Euclidean distance of two vectors (which is also referred to as the L2 norm). Euclidean distance represents the square root of the sum of squared differences between corresponding elements of two vectors.
SELECT
EUCLIDEAN_DIST([1, 2, 3, 4], [5, 6, 7, 8])
8
SELECT
EUCLIDEAN_DIST([1.1, 2.2, 3.3, 4.4], [5.5, 6.6, 7.7, 8.8])
8.8
SELECT
EUCLIDEAN_DIST([1, 2, 3, 4], [5.5, 6.6, 7.7, 8.8])
9.302688
COSINE_SIM
COSINE_SIM(array, array)
Computes the cosine similarity of two vectors. Cosine similarity represents the dot product of two vectors divided by their magnitude
SELECT
COSINE_SIM([1, 2, 3, 4], [5, 6, 7, 8])
0.968864
SELECT
COSINE_SIM([1.1, 2.2, 3.3, 4.4], [5.1, 6.2, 7.3, 8.4])
0.971264
SELECT
COSINE_SIM([1.1, 2.2, 3.3, 4.4], [1, 1, 1, 1])
0.912871