Rockset

    Vector Functions

    This page covers vector operations and distance functions available in Rockset. Rockset internally handles and manipulates vectors as homogeneous arrays.

    Note: Arrays input to vector functions must have the same lengths and the elements must be of the types int or float. Otherwise the functions will throw an error. Various error messages are outlined in the following sections. Operations between an array of ints and an array of floats will return an array of floats. Learn more about type functions.

    Vector Operations

    VECTOR_ENFORCE

    VECTOR_ENFORCE(array, length, type)

    Returns the input vector if it fulfills the length and type requirements. Otherwise returns null. Input the length argument as an int and the type argument as the string 'int' or 'float'.

    We recommend using VECTOR_ENFORCE within your ingest transformations to ensure compatibility between vectors during query execution.

    SELECT
        VECTOR_ENFORCE([1, 2, 3], 3, 'int')
    [1, 2, 3]
    SELECT
        VECTOR_ENFORCE([1, 2, 3], 3, 'float')
    null
    SELECT
        VECTOR_ENFORCE([1.5, 2.6, 3.7, 4.8], 4, 'float')
    [1.5, 2.6, 3.7, 4.8]
    SELECT
        VECTOR_ENFORCE([1.5, 2.6, 3.7, 4.8], 1, 'float')
    null
    SELECT
        VECTOR_ENFORCE([1.5, 2.6, 3.7, 4.8], 4, 1)
    Error: Passed in type must be a name of type string not of type int.

    VECTOR_ADD

    VECTOR_ADD(array, addend)

    Adds a scalar of type int or float to a vector. Alternatively, adds two vectors.

    SELECT
        VECTOR_ADD([1, 2, 3, 4], 4)
    [5, 6, 7, 8]
    SELECT
        VECTOR_ADD([1, 2, 3, 4], [5, 6, 7, 8])
    [6, 8, 10, 12]
    SELECT
        VECTOR_ADD([1.5, 2.6, 3.7, 4.8], 4.4)
    [5.9, 7.0, 8.1, 9.2]
    SELECT
        VECTOR_ADD([1.1, 2.2, 3.3, 4.3], [1.5, 2.6, 3.7, 4.8])
    [2.6, 4.8, 7.0, 9.1]
    SELECT
        VECTOR_ADD(
            [1.1, 2.2, 3.3, 4.3],
            ['0':1.5, '1':2.6, '2':3.7, '3':4.8]
        )
    Error: Cannot perform vector operations on datatype object.

    VECTOR_SUBTRACT

    VECTOR_SUBTRACT(array, subtrahend)

    Subtracts a scalar of type int or float from a vector. Alternatively, subtracts a vector from another vector.

    SELECT
        VECTOR_SUBTRACT([5, 6, 7, 8], 4)
    [1, 2, 3, 4]
    SELECT
        VECTOR_SUBTRACT([5, 6, 7, 8], [1, 2, 3, 4])
    [4, 4, 4, 4]
    SELECT
        VECTOR_SUBTRACT([5.1, 6.2, 7.3, 8.4], 4.4)
    [0.7, 1.8, 2.9, 4.0]
    SELECT
        VECTOR_SUBTRACT([5.5, 6.6, 7.7, 8.8], [1.1, 2.2, 3.3, 4.4])
    [4.4, 4.4, 4.4, 4.4]
    SELECT
        VECTOR_SUBTRACT([1.1], [5.5, 6.6, 7.7, 8.8])
    Error: Cannot apply operation VECTOR_SUBTRACT on vectors of different sizes 1 and 4.

    VECTOR_MULTIPLY

    VECTOR_MULTIPLY(array, multiplier)

    Multiplies a vector by a scalar of type int or float or computes the element-wise multiplication of two vectors as the Hadamard product.

    SELECT
        VECTOR_MULTIPLY([1, 2, 3, 4], 2)
    [2, 4, 6, 8]
    SELECT
        VECTOR_MULTIPLY([1, 2, 3, 4], [1, 2, 3 , 4])
    [1, 4, 9, 16]
    SELECT
        VECTOR_MULTIPLY([5, 6, 7, 8], 1.5)
    [7.5, 9.0, 10.5, 12.0]
    SELECT
        VECTOR_MULTIPLY([5.5, 6.6, 7.7, 8.8], [5.5, 6.6, 7.7, 8.8])
    [30.25, 43.56, 59.29, 77.44]
    SELECT
        VECTOR_MULTIPLY([5.5, 6.6, 7.7, 8.8], [5.5, 6.6, 7.7, 'foo'])
    Error: Cannot perform vector operations on datatype `string`.

    VECTOR_DIVIDE

    VECTOR_DIVIDE(array, divisor)

    Divides a vector by a scalar of type int or float or computes the element-wise divison of two vectors. Throws an error when division by zero occurs.

    SELECT
        VECTOR_DIVIDE([2, 4, 6, 8], 2)
    [1, 2, 3, 4]
    SELECT
        VECTOR_DIVIDE([2, 4, 6, 8], [2, 4, 6, 8])
    [1, 1, 1, 1]
    SELECT
        VECTOR_DIVIDE([2.2, 4.4, 6.6, 8.8], 2.2)
    [1.0, 2.0, 3.0, 4.0]
    SELECT
        VECTOR_DIVIDE([2.2, 4.4, 6.6, 8.8], [2.2, 4.4, 6.6, 8.8])
    [1.0, 1.0, 1.0, 1.0]
    SELECT
        VECTOR_DIVIDE([2, 4, 6, 8], 0)
    Error: The divisor in a VECTOR_DIVIDE operation was zero.
    SELECT
        VECTOR_DIVIDE([2.2, 4.4, 6.6, 8.8], [2.2, 4.4, 0.0, 0.0])
    Error: The divisor in a VECTOR_DIVIDE operation was zero.

    Vector Distance Functions

    DOT_PRODUCT

    DOT_PRODUCT(array, array)

    Computes the dot product of two vectors.

    SELECT
        DOT_PRODUCT([1, 2, 3, 4], [5, 6, 7, 8])
    70

    (= 1 * 5 + 2 * 6 + 3 * 7 + 4 * 8)

    SELECT
        DOT_PRODUCT([1.1, 2.2, 3.3, 4.4], [5.5, 6.6, 7.7, 8.8])
    84.7

    (= 1.1 * 5.5 + 2.2 * 6.6 + 3.3 * 7.7 + 4.4 * 8.8)

    SELECT
        DOT_PRODUCT([5, 6], [1, 2, 3, 4])
    Error: Cannot apply operation DOT_PRODUCT on vectors of different sizes 2 and 4.

    EUCLIDEAN_DIST

    EUCLIDEAN_DIST(array, array)

    Computes the Euclidean distance of two vectors (which is also referred to as the L2 norm). Euclidean distance represents the square root of the sum of squared differences between corresponding elements of two vectors.

    SELECT
        EUCLIDEAN_DIST([1, 2, 3, 4], [5, 6, 7, 8])
    8
    SELECT
        EUCLIDEAN_DIST([1.1, 2.2, 3.3, 4.4], [5.5, 6.6, 7.7, 8.8])
    8.8
    SELECT
        EUCLIDEAN_DIST([1, 2, 3, 4], [5.5, 6.6, 7.7, 8.8])
    9.302688

    COSINE_SIM

    COSINE_SIM(array, array)

    Computes the cosine similarity of two vectors. Cosine similarity represents the dot product of two vectors divided by their magnitude

    SELECT
        COSINE_SIM([1, 2, 3, 4], [5, 6, 7, 8])
    0.968864
    SELECT
        COSINE_SIM([1.1, 2.2, 3.3, 4.4], [5.1, 6.2, 7.3, 8.4])
    0.971264
    SELECT
        COSINE_SIM([1.1, 2.2, 3.3, 4.4], [1, 1, 1, 1])
    0.912871