Commands
This page documents all available SQL commands in Rockset.
In the command templates shown here, all uppercase words are keywords and lowercase words are
user-provided values. In practice, all keywords and function names are case insensitive; for
example, describe "_events"
is equivalent to DESCRIBE "_events"
.
DESCRIBE
DESCRIBE collection_name
The output of DESCRIBE has the following fields:
- field: Every distinct field name in the collection
- type: The data type of the field
- occurrences: The number of documents that have this field in the given type
- total: Total number of documents in the collection for top level fields, and total number of documents that have the parent field for nested fields
Note: We use the delimiter
*
(glob) in the field name to indicate descending into an array. It can be replaced by an integer index to construct an instance of the field path. If multiple data types appear for a certain field path,DESCRIBE
will return one entry for each type.
Suppose a collection names
has the following documents:
{
"name": "John",
"age": 31,
"city": "New York"
},
{
"name": "Michael",
"age": "notfound",
"experiences": [
{
"title": "XYZ",
"years": 4
}
]
}
DESCRIBE names
+--------------------------------------+---------------+---------+-----------+
| field | occurrences | total | type |
|--------------------------------------+---------------+---------+-----------|
| ['name'] | 2 | 2 | string |
| ['age'] | 1 | 2 | string |
| ['age'] | 1 | 2 | int |
| ['city'] | 1 | 2 | string |
| ['experiences'] | 4 | 4 | string |
| ['experiences', '*'] | 1 | 1 | object |
| ['experiences', '*', 'title'] | 1 | 1 | string |
| ['experiences', '*', 'years'] | 1 | 1 | int |
+--------------------------------------+---------------+---------+-----------+
Let's look at how DESCRIBE works with collections that have documents with each of the following properties:
- Mixed Types
- Nested Objects
- Sparse Fields and Null Values
- Nested Arrays
Mixed Types
Consider the collection zipcodes
with the following documents:
{"zipcode": 94542},
{"zipcode": "91126"},
{"zipcode": 94110.0},
{"zipcode": "94020"}
Every document in this collection has the same field called zipcode, but its type is different across the collection. This is captured below.
DESCRIBE zipcodes
+-------------+-------------+-------+--------+
| field | occurrences | total | type |
|-------------+-------------+-------+--------|
| ['zipcode'] | 1 | 4 | int |
| ['zipcode'] | 2 | 4 | string |
| ['zipcode'] | 1 | 4 | float |
+-------------+-------------+-------+--------+
We can use the typeof
SQL function to get all zipcodes of type string
:
SELECT
zipcode
FROM
zipcodes
WHERE
TYPEOF(zipcode) = 'string'
+-----------+
| zipcode |
|-----------|
| 94020 |
| 91126 |
+-----------+
Then to cast field zipcode
as string as we retrieve it, we can do the following:
SELECT
CAST(zipcode AS string) zipcode
FROM
zipcodes
+-----------+
| zipcode |
|-----------|
| 94020 |
| 94110 |
| 94542 |
| 91126 |
+-----------+
Nested Objects
Documents can have objects with scalars, nested arrays, or nested objects. Consider a simple
collection of documents with names of Turing Award winners, which we will call
turing_award_winners
.
{"name": {"first": "John", "last": "HopCroft"}},
{"name": {"first": "Robert", "last": "Tarjan"}},
{"name": {"first": "Alan", "last": "Kay"}},
{"name": {"first":"Edsger", "last": "Dijkstra"}}
Each document has an object with a field called name
that has nested fields first
and last
. To
access nested fields inside objects, we concatenate the field names with a . (dot) as separator. For
example, use name.first
and name.last
to access the first and last names, respectively, in each
of these documents. The fields name.first
and name.last
are present in all documents as scalars
of type 'string'. Hence, the occurrences and total for each field in this document will be the same
as the total number of documents.
DESCRIBE turing_award_winners
+--------------------------------------+---------------+---------+-----------+
| field | occurrences | total | type |
|--------------------------------------+---------------+---------+-----------|
| ['name'] | 4 | 4 | object |
| ['name', 'first'] | 4 | 4 | string |
| ['name', 'last'] | 4 | 4 | string |
+--------------------------------------+---------------+---------+-----------+
Say we want to list all the first and last names from this collection.
SELECT
turing_award_winners.name.first,
turing_award_winners.name.last
FROM
turing_award_winners
+---------+----------+
| first | last |
|---------+----------|
| Alan | Kay |
| John | HopCroft |
| Robert | Tarjan |
| Edsger | Dijkstra |
+---------+----------+
Sparse Fields and Null Values
Suppose collection turing_award_winners
now has the following documents.
{"name": {"first": "John", "last": "HopCroft"}},
{"name": {"first": "Robert", "last": "Tarjan"}},
{"name": {"first": "Alan", "middle": "Curtis", "last": "Kay"}},
{"name": {"first": "Edsger", "last": "Dijkstra"}}
Notice that this is similar to the previous collection with an additional nested field middle added
to the third document. The field name.middle
is a sparse field which is a string in 1 document. It
is treated as undefined
in the other 3 documents that it is absent in. Note that this is different
from null
. This is clarified in further detail
in the data-types page. DESCRIBE
helps capture such
sparse fields as well.
DESCRIBE turing_award_winners
+--------------------------------------+---------------+---------+-----------+
| field | occurrences | total | type |
|--------------------------------------+---------------+---------+-----------|
| ['name'] | 4 | 4 | object |
| ['name', 'first'] | 4 | 4 | string |
| ['name', 'last'] | 4 | 4 | string |
| ['name', 'middle'] | 1 | 4 | string |
+--------------------------------------+---------------+---------+-----------+
While fields with type undefined
are not captured in the smart schema, those with type null
are.
Say the last document in the collection was
{"name": {"first": "Edsger", "middle": null, "last": "Dijkstra"}}
instead.
DESCRIBE turing_award_winners
+-------------------+----------------+---------+-----------+
| field | occurrences | total | type |
|------------------------------------+---------------------+
| ['name'] | 4 | 4 | object |
| ['name', 'first'] | 4 | 4 | string |
| ['name', 'last'] | 4 | 4 | string |
| ['name', 'middle'] | 1 | 4 | null |
| ['name', 'middle'] | 1 | 4 | string |
+----------------------------------------------+-----------+
Both null
and undefined
types behave the same way in almost all situations. The predicate
IS NULL
will return true
for both null
and undefined
. You can filter the fields that are
undefined
with a special predicate IS UNDEFINED
.
SELECT
COUNT(*)
FROM
turing_award_winners
WHERE
turing_award_winners.name.middle IS NOT NULL
+----------+
| ?count |
|----------|
| 1 |
+----------+
SELECT
turing_award_winners.name.middle
FROM
turing_award_winners
WHERE
turing_award_winners.name.middle IS NOT NULL
+----------------------+
| middle |
|----------------------|
| Curtis |
+----------------------+
If you want to also include the fields with a null
value, you can use the IS NOT UNDEFINED
predicate:
SELECT
turing_award_winners.name.middle
FROM
turing_award_winners
WHERE
turing_award_winners.name.middle IS NOT UNDEFINED
+----------------------+
| middle |
|----------------------|
| Curtis |
| null |
+----------------------+
Nested Arrays
Just like nested objects, documents can also have arrays comprising of nested fields that are
scalars, objects, or arrays. Consider the following collection of documents with Turing Award
winners by year. Let's call this collection turing_award_winners_by_year
.
{
"year" : "1972",
"winners" : [
{
"first" : "Edsger",
"last" : "Dijkstra",
"subjects" : [ "Program Verification", "Programming" ]
}
]
},
{
"year" : "1986",
"winners" : [
{
"first" : "John",
"last" : "HopCroft",
"subjects" : [ "Analysis of Algorithms", "Data Structures" ]
},
{
"first" : "Robert",
"last" : "Tarjan",
"subjects" : [ "Analysis of Algorithms", "Data Structures" ]
}
]
},
{
"year" : "2003",
"winners" : [
{
"first" : "Alan",
"last" : "Kay",
"subjects" : [ "Personal Computing", "Programming Languages" ]
}
]
}
This shape of this collection is succinctly captured in the smart schema below.
DESCRIBE turing_award_winners_by_year
+--------------------------------------------+---------------+---------+-----------+
| field | occurrences | total | type |
|--------------------------------------------+---------------+---------+-----------|
| ['winners'] | 3 | 3 | array |
| ['winners', '*'] | 4 | 4 | object |
| ['winners', '*', 'first'] | 4 | 4 | string |
| ['winners', '*', 'last'] | 4 | 4 | string |
| ['winners', '*', 'subjects'] | 4 | 4 | array |
| ['winners', '*', 'subjects', '*'] | 8 | 8 | string |
| ['year'] | 3 | 3 | string |
+--------------------------------------------+---------------+---------+-----------+
The occurrence count for each of the nested elements is equal to the number of array elements of the given type, summed across all documents in the collection.
The total count for each such entry is equal to the total number of array elements at that level
of nesting, summed across all documents in the collection. Note that we use the delimiter '*'
(glob) in the field name to indicate descending into an array. Further, Rockset treats arrays as
virtual collections and allows you to DESCRIBE
arrays as well. When using a nested array as a
target collection in queries we use the delimiter : (colon) as a separator between the root
collection and the nested fields.
DESCRIBE turing_award_winners_by_year :winners [*]
+------------------------+---------------+---------+--------+
| field | occurrences | total | type |
|------------------------+---------------+---------+--------|
| ['*'] | 4 | 4 | object |
| ['*', 'first'] | 4 | 4 | string |
| ['*', 'last'] | 4 | 4 | string |
| ['*', 'subjects'] | 4 | 4 | array |
| ['*', 'subjects', '*'] | 8 | 8 | string |
+------------------------+---------------+---------+--------+
DESCRIBE turing_award_winners_by_year :winners [*].subjects
+---------+---------------+---------+--------+
| field | occurrences | total | type |
|---------+---------------+---------+--------|
| ['*'] | 8 | 8 | string |
+---------+---------------+---------+--------+
SELECT
[ WITH with_query [, ...] ]
SELECT [ ALL | DISTINCT ] select_expr [, ...]
[ FROM from_item [, ...] ]
[ WHERE condition ]
[ GROUP BY [ ALL | DISTINCT ] grouping_element [, ...] ]
[ HAVING condition]
[ { UNION | INTERSECT | EXCEPT } [ ALL | DISTINCT ] select_query ]
[ ORDER BY expression [ ASC | DESC ] [, ...] ]
[ LIMIT [ count | ALL ] ]
[ OFFSET count ]
Returns result set from one or more collections. May include various other clauses:
with_query
is of the formalias_name AS select_query
.select_expr
is a valid SQL expression.from_item
is one of:collection_name [ [ AS ] alias [ ( column_alias [, ...] ) ] ] [ TABLESAMPLE BERNOULLI (percentage) ]
percentage
is the percentage of input to be sampled (ranges from 0 to 100)
from_item join_type from_item [ ON join_condition | USING ( join_column [, ...] ) ]
wherejoin_type
is one of:[ INNER ] JOIN
LEFT [ OUTER ] JOIN
RIGHT [ OUTER ] JOIN
CROSS JOIN
(which should not be followed byON
/USING
, all other should)
condition
is a SQL expression that evaluates to a boolean. See the section on conditional operators.grouping_element
is one of:()
expression
GROUPING SETS ( ( column [, ...] ) [, ...] )
CUBE ( column [, ...] )
ROLLUP ( column [, ...] )
select_query
is itself aSELECT
command with the same recursive format.expression
is a valid SQL expression.count
is the number of results to be returned.
Below we cover common uses some clauses in more detail.
To return all available fields, instead of listing them out as select expressions, specify a single
select expression *
.
SELECT
*
FROM
_events
+--------------------------------------+---------------+--------+---------------+------+
| _id | eventTime | kind | label | type |
+--------------------------------------+---------------+--------+---------------+------+
| cf0e193d-9ee0-4c6a-9f2b-cab9893e97e3 | 1536164080248 | QUERY | QUERY_SUCCESS | INFO |
| 4a5e941e-7683-4db2-b8e0-ee0e4807d6a7 | 1536164079995 | QUERY | QUERY_SUCCESS | INFO |
| 6a9c5d49-ee3b-4b78-b653-006c540f70c3 | 1536162113586 | QUERY | QUERY_ERROR | INFO |
+--------------------------------------+---------------+--------+---------------+------+
To return all available fields except some use the EXCEPT
clause in conjunction with *
.
SELECT
*
EXCEPT(label, type)
FROM
_events
+--------------------------------------+---------------+--------+
| _id | eventTime | kind |
+--------------------------------------+---------------+--------+
| cf0e193d-9ee0-4c6a-9f2b-cab9893e97e3 | 1536164080248 | QUERY |
| 4a5e941e-7683-4db2-b8e0-ee0e4807d6a7 | 1536164079995 | QUERY |
| 6a9c5d49-ee3b-4b78-b653-006c540f70c3 | 1536162113586 | QUERY |
+--------------------------------------+---------------+--------+
To return only unique results, include DISTINCT
after SELECT
.
SELECT
DISTINCT label
FROM
_events
+---------------+
| label |
+---------------+
| QUERY_SUCCESS |
| QUERY_ERROR |
+---------------+
To return all available subfields of an object, use the .*
notation. As with SELECT *
you may use EXCEPT
to ignore some subfields of the object.
WITH foo as (
SELECT
{ 'a': 1,
'b': 2 } x
)
SELECT
foo.x.*
FROM
foo
+---+---+
| a | b |
+---+---|
| 1 | 2 |
+---+---+
WITH Clause
The WITH
clause can be used to chain queries by defining a named subquery that can be then
used within the main query.
Using WITH
clauses can improve the readability and maintainability of complex queries.
WITH x AS (
SELECT
_id,
label
FROM
_events
)
SELECT
_id,
label
FROM
x
+--------------------------------------+---------------+
| _id | label |
|--------------------------------------+---------------|
| 23658547-1ae5-4be8-97e5-6e043f682c31 | QUERY_SUCCESS |
| 85e16122-d749-444d-ab63-bb5a0544368b | QUERY_SUCCESS |
| 972485f2-e3b0-43c4-bb7e-c70b7df3e86c | QUERY_SUCCESS |
+--------------------------------------+---------------+
There can be multiple subqueries in a WITH
clause, and one can use any of the previous ones.
WITH x AS (
SELECT
_id,
label,
kind
FROM
_events
),
y AS (
SELECT
_id,
kind AS c
FROM
x
)
SELECT
_id,
c
FROM
y
+--------------------------------------+-------+
| _id | c |
|--------------------------------------+-------|
| 589eaee9-0c62-46f9-8f7a-0a5a3fcf2e2c | QUERY |
| 3e2e99b9-896f-4852-9669-ab5d1a1a6ebf | QUERY |
| a2b5d818-cad6-49b8-8f9d-bcc77146e060 | QUERY |
+--------------------------------------+-------+
FROM Clause
When querying a single collection, the FROM
clause consists simply of the collection name.
However, the FROM
clause can also be used to reshape the data that is being queried.
UNNEST
UNNEST
is a function that can be used to expand arrays into several rows.
UNNEST
is a row generating function which takes as input an array and outputs one row per element
of the input array. Unlike other functions, it outputs multiple rows (a relation) rather than one
value (a scalar). That means it appears in queries in the same places a collection appears: a FROM
or JOIN
clause.
This query takes an array of 3 numbers, and produces 3 rows. An UNNEST
can have two aliases - an
outer alias which names the relation, and an inner alias which names the field within the relation.
SELECT
numbers.aNumber
FROM
UNNEST(array [1, 2, 3] AS aNumber) AS numbers
+---------+
| aNumber |
|---------|
| 1 |
| 2 |
| 3 |
+---------+
We can also UNNEST
arrays of objects. We can access fields of each object with the usual dot
notation.
SELECT
myFruits.someFruit.name,
myFruits.someFruit.color
FROM
UNNEST(
JSON_PARSE(
'[{"name": "strawberry", "color": "red"}, {"name": "lime", "color": "green"}]'
) AS someFruit
) AS myFruits
+------------+---------+
| name | color |
|------------+---------|
| strawberry | red |
| lime | green |
+------------+---------+
UNNEST
is particularly useful when we have an array field within a collection. Suppose collection
companies
contains the following documents:
{
"make": "Ford",
"models": ["Focus", "Mustang", "F-150"],
"offices": [
{"city": "Dearborn", "state": "MI"},
{"city": "Denver", "state": "CO"}
]
},
{
"make": "Toyota",
"models": ["Prius", "Highlander", "Camry"],
"offices": [
{"city": "Jacksonville", "state": "FL"},
{"city": "Ann Arbor", "state": "MI"}
]
}
For each company, we want to generate one row for each model of car. This will let us manipulate the
models more easily, for instance by aggregating or filtering them. We do this by joining the
companies
collection with the output of UNNEST
. For each company, UNNEST
generates one row per
model. We then perform a correlated cross join. It is correlated because the input to the right side
of the join depends on the current row being processed on the left side. It is a cross join because
we take every output from the right side and join it with the current row from the left. There is no
join condition.
SELECT
companies.make,
models.*
FROM
companies
CROSS JOIN UNNEST(companies.models AS model) AS models
+---------+------------+
| make | model |
|---------+------------|
| Ford | Focus |
| Ford | Mustang |
| Ford | F-150 |
| Toyota | Prius |
| Toyota | Highlander |
| Toyota | Camry |
+---------+------------+
In SQL, we can cross join two relations implicitly by just listing them with a comma in between them, so the following query is equivalent to the above:
SELECT
companies.make,
models.*
FROM
companies,
UNNEST(companies.models AS model) AS models
Note: There is an important caveat with this shorthand though. Mixing implicit joins (i.e.
SELECT * FROM t1, t2 WHERE t1.x = t2.x
) with explicit joins (i.e.SELECT * FROM t1 JOIN t2 WHERE t1.x = t2.x
) can cause confusing results. If you want to unnest a field from a collection and join with another collection, perform all joins explicitly:SELECT
companies.*,models.model,makeInfo.*FROM
companiesCROSS JOIN UNNEST(companies.models AS model) AS modelsJOIN makeInfo ON companies.make = makeInfo.make
The index within the original array can also be extracted by appending
WITH ORDINALITY AS alias_name
inside the UNNEST
function.
SELECT
models.*
FROM
companies,
UNNEST(
companies.models AS name WITH ORDINALITY AS index
) AS models
+---------+------------+
| index | name |
|---------+------------|
| 1 | Prius |
| 2 | Highlander |
| 3 | Camry |
| 1 | Focus |
| 2 | Mustang |
| 3 | F-150 |
+---------+------------+
When the unnested array contains objects, the values returned by the aliased field are objects, and so they can be dereferenced using the usual dot-notation syntax.
SELECT
offices.value.state,
ARRAY_AGG(offices.value.city) AS cities
FROM
companies,
UNNEST(companies.offices AS value) AS offices
GROUP BY
offices.value.state
+---------------------------+---------+
| cities | state |
|---------------------------+---------|
| ['Denver'] | CO |
| ['Jacksonville'] | FL |
| ['Dearborn', 'Ann Arbor'] | MI |
+---------------------------+---------+
If all elements of an array are objects, you can also omit the UNNEST alias, which allows you to query object's fields directly. Note that this doesn't work for non-object array elements; those will produce empty rows in the UNNESTed table.
SELECT
offices.state,
ARRAY_AGG(offices.city) AS cities
FROM
companies,
UNNEST(companies.offices) AS offices
GROUP BY
offices.state
+---------------------------+---------+
| cities | state |
|---------------------------+---------|
| ['Denver'] | CO |
| ['Jacksonville'] | FL |
| ['Dearborn', 'Ann Arbor'] | MI |
+---------------------------+---------+
JOIN
Rockset supports standard SQL joins.
- An
INNER JOIN
returns only results where there is a match on both sides. - A
LEFT
(orRIGHT
)OUTER JOIN
returns all results from the left (or right) side, populating the other side's columns when possible and withnull
otherwise. - A
CROSS JOIN
returns the Cartesian product of the two sides' results (i.e all combinations of documents in two collections).
SELECT
collection1.name,
collection2.name
FROM
collection1
JOIN collection2 ON collection1._id = collection2._id
Cross joins can either be specified using the explicit CROSS JOIN
syntax or by specifying multiple
collections in the FROM
clause. The queries below are equivalent.
SELECT
collection1.id,
collection2.id
FROM
collection1,
collection2
SELECT
collection1.id,
collection2.id
FROM
collection1
CROSS JOIN collection2
Full outer joins are not directly supported but they can be imitated by unioning a LEFT and RIGHT JOIN as done in the example below.
SELECT
collection1.name,
collection2.name
FROM
collection1
LEFT JOIN collection2 ON collection1._id = collection2._id
UNION
SELECT
collection1.name,
collection2.name
FROM
collection1
RIGHT JOIN collection2 ON collection1._id = collection2._id
TABLESAMPLE
Rockset only supports BERNOULLI
sampling.
SELECT
id,
name
FROM
collection1 TABLESAMPLE BERNOULLI (50)
This returns approximately 50% of the results that would have been otherwise obtained.
SELECT
avg(height)
FROM
people TABLESAMPLE BERNOULLI (25)
This returns the average of the height field based on approximately 25% of rows in the table.
GROUP BY Clause
Often, interesting results are obtained by grouping records together on some similar attributes.
GROUP BY
is the mechanism by which the result of a SELECT
is partitioned based on a particular field.
Non-aggregated fields can only be included in aggregate functions in select
expressions with the presense of a GROUP BY
clause.
A simple GROUP BY
clause may contain any expression composed of input columns.
SELECT
count(*) AS c,
label
FROM
_events
GROUP BY
label
+--------+------------------------------+
| c | label |
|--------+------------------------------|
| 41 | API_KEY_DELETED |
| 35 | API_KEY_CREATED |
| 88256 | QUERY_SUCCESS |
| 133 | QUERY_INVALID |
| 24 | API_KEY_ERROR |
| 72 | INGEST_WARNING |
| 16 | COLLECTION_DROPPED |
| 121988 | INGEST_INFO |
+--------+------------------------------+
HAVING
is used to filter results, based on the output of aggregate functions.
SELECT
count(*) AS c,
label
FROM
_events
GROUP BY
label
HAVING
count(*) < 50
+--------+------------------------------+
| c | label |
|--------+------------------------------|
| 41 | API_KEY_DELETED |
| 35 | API_KEY_CREATED |
| 24 | API_KEY_ERROR |
| 16 | COLLECTION_DROPPED |
+--------+------------------------------+
For more complex aggregations, Rockset supports GROUPING SETS
, CUBE
and ROLLUP
. These can be
used to perform aggregations on multiple fields in a single query.
Functions are not supported in
GROUP BY
clause. Consider using a subquery instead.
GROUPING SETS
Grouping sets allow users to specify multiple lists of fields to group on.
The fields that are not part of a given sublist of grouping fields are set to null
.
It can logically be thought of as the UNION
of a single SELECT
query with a simple
GROUP BY
for each GROUPING SET
.
SELECT
label,
kind
FROM
_events
GROUP BY
GROUPING SETS ((label), (label, kind))
+--------------+------------------------------+
| kind | label |
|--------------+------------------------------|
| null | API_KEY_DELETED |
| API_KEY | API_KEY_DELETED |
| COLLECTION | COLLECTION_CREATED |
| null | COLLECTION_CREATED |
| QUERY | QUERY_SUCCESS |
| QUERY | QUERY_COLLECTION_NOT_READY |
| INGEST | INGEST_INFO |
| INGEST | INGEST_INITIALIZED |
+--------------+------------------------------+
CUBE
CUBE
is a higher level operator that can be used to generate GROUPING SETS
.
It generates all possible GROUPING SETS
given a set of fields.
SELECT
label,
kind
FROM
_events
GROUP BY
CUBE (label, kind)
This is equivalent to:
SELECT
label,
kind
FROM
_events
GROUP BY
GROUPING SETS (
(label, kind),
(label),
(kind),
()
)
ROLLUP
The ROLLUP
operator is similar to CUBE
in that it's a higher order abstraction over
GROUPING SETS
. It generates all possible starting subsequences for a given set of fields.
SELECT
label,
kind
FROM
_events
GROUP BY
ROLLUP (label, kind)
This is equivalent to:
SELECT
label,
kind
FROM
_events
GROUP BY
GROUPING SETS ((label, kind), (label), ())
ORDER BY Clause
The ORDER BY
clause is used to sort a result set by one or more output expressions.
By default ordering for an expression is ascending. Use DESC
to get result set in descending order.
SELECT
*
FROM
users
ORDER BY
users.username DESC
Use multiple expressions, each with a different ordering, by separating with commas.
SELECT
*
FROM
users
ORDER BY
users.first_name ASC,
users.last_name DESC
LIMIT Clause
The LIMIT
clause restricts the number of rows in the result.
The following query limits the result set to contain a maximum of 10 rows. The result set represents the top 10 customers based on customer rating.
SELECT
*
FROM
customers
ORDER BY
customers.customer_rating DESC
LIMIT
10
OFFSET Clause
The OFFSET
clause skips the first n
rows of results.
The following query selects the 6th through the 15th rows.
SELECT
*
FROM
customers
ORDER BY
customers.customer_rating DESC
LIMIT
10 OFFSET 5
CLUSTER BY Clause
The CLUSTER BY
clause is used in an ingest transformation to configure data clustering on a collection. It is not allowed during normal query execution.
See the ingest transformation docs for more.
Combining Result Sets
UNION
returns the combination of results of two SELECT
queries.
SELECT
id,
name
FROM
collection1
UNION
SELECT
userId,
name
FROM
collection2
INTERSECT
returns only distinct common results of two SELECT
queries.
SELECT
id,
name
FROM
collection1
INTERSECT
SELECT
userId,
name
FROM
collection2
EXCEPT
returns only distinct results of the first query that are not in the second query.
SELECT
id,
name
FROM
collection1
EXCEPT
SELECT
userId,
name
FROM
collection2
INSERT INTO
INSERT INTO collection_name
[ (column_alias [, ...]) ]
select_query
Inserts result of a query into a collection.
collection_name
is the collection into which query results are ingested.column_alias
renames fields in the result rows. If specified, this list must contain as many items as the number of fields selected by theselect_query
select_query
is a validSELECT
command.
The ingestion of query results into the specified collection begins after the SELECT
command
finishes successfully. The console will display num_docs_inserted
( the API equivalent is
result_set_document_count
) which is the result count of the SELECT
command. If there is an ingest
transformation on the collection, it will then be applied to the result docs before they are finally
inserted into the collection.
Below we cover common uses of INSERT INTO
.
To update some documents in a collection. Note that following queries select _id
column, so a
document with specified id gets updated.
INSERT INTO
collection1
SELECT
_id,
price + 10 AS price
FROM
collection1
where
price < 5
INSERT INTO
collection1
SELECT
_id,
NULL AS age
FROM
collection1
To replicate a collection:
INSERT INTO
collection2
SELECT
*
FROM
collection1