Window Functions
This page covers functions that can be used in window queries.
Window functions are similar to aggregate functions, except that they return one row per input row,
instead of aggregating all of the rows into a single result. Window functions operate over
partititons in the input data, specified by an OVER()
clause. Within a partition, window frames
can be defined, and the window function can be applied in a given order to these window frames.
Window function calls are permitted only in the SELECT
list and the ORDER BY
clause of the
query.
WITH numbers AS (
SELECT
1 AS number
UNION ALL
SELECT
2
UNION ALL
SELECT
3
UNION ALL
SELECT
4
)
SELECT
ROW_NUMBER() OVER()
FROM
numbers
+--------------+
| row_number |
|--------------+
| 1 |
| 2 |
| 3 |
| 4 |
+--------------+
Over
To execute a window function, an OVER()
clause is required. An OVER()
clause can contain
information about the partitions a window function should be applied on, the order in which a window
function will operate over a given partition, as well as the window frames over which aggregation or
certain window functions should be computed on.
PARTITION BY
The PARTITION BY
option groups the rows of the query into partitions, which the window function
processes separately. Without PARTITION BY
, the query is treated as having only one partititon.
/*
In this example, we end up with three partitions.
One partition includes all rows with 1 as the number, one
includes only the row with 3 as the number value, and one
with only 4 as the number value.
*/
WITH numbers AS (
SELECT
1 AS number
UNION ALL
SELECT
1
UNION ALL
SELECT
3
UNION ALL
SELECT
4
)
SELECT
number,
ROW_NUMBER() OVER(PARTITION BY number)
FROM
numbers
+-----------+------------+
| number | row_number |
|-----------+------------|
| 1 | 1 |
| 1 | 2 |
| 3 | 1 |
| 4 | 1 |
+-----------+------------+
ORDER BY
The ORDER BY
option determines the order in which the rows of a partition are processed. It works
similarly to a query-level ORDER BY
clause. Without ORDER BY
, rows are processed in an
unspecified order. Note that ORDER BY
inside of a window function may not affect the row ordering
in the output columns. The only thing ORDER BY
guarantees is the order in which the window
function is applied across a partition.
/*
In this example, we are applying the sum function
in descending order relative to the number column.
Note that both rows with number value 1 are treated
as peers, and are therefore aggregated together.
*/
WITH numbers AS (
SELECT
1 AS number
UNION ALL
SELECT
1
UNION ALL
SELECT
3
UNION ALL
SELECT
4
)
SELECT
number,
SUM(number) OVER(
ORDER BY
number DESC
)
FROM
numbers
+-----------+-----+
| number | sum |
|-----------+-----|
| 4 | 4 |
| 3 | 7 |
| 1 | 9 |
| 1 | 9 |
+-----------+-----+
RANGE/ROWS BETWEEN
The window frame specification for the current partition. This is either RANGE
or ROWS
. This
defaults to RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
. With ORDER BY
, this sets the
frame to be all rows from the partition start up to the current row's last peer. Without ORDER BY
,
all rows of the partition are included in the window frame. Restrictions are that frame_start cannot
be UNBOUNDED FOLLOWING
, frame_end cannot be UNBOUNDED PRECEDING
, and the frame_end choice cannot
appear earlier in the above list than the frame_start choice — for example
RANGE BETWEEN CURRENT ROW AND 1 PRECEDING
is not allowed.
/*
Here, our window frame is rows between 1 preceding and 1 following.
The COUNT(*) function operates over an independent window frame for
each row it processes.
*/
WITH numbers AS (
SELECT
1 AS number
UNION ALL
SELECT
1
UNION ALL
SELECT
3
UNION ALL
SELECT
4
)
SELECT
number,
COUNT(*) OVER(
ORDER BY
number ROWS BETWEEN 1 PRECEDING
AND 1 FOLLOWING
)
FROM
numbers
+-----------+-------+
| number | count |
|-----------+-------|
| 1 | 2 |
| 1 | 3 |
| 3 | 3 |
| 4 | 2 |
+-----------+-------+
/*
Here, our window frame is everything in the range
between the current row and all rows after. Note
that both rows with number value 1 are treated as
peer rows, and recieve the same value for count.
*/
WITH numbers AS (
SELECT
1 AS number
UNION ALL
SELECT
1
UNION ALL
SELECT
3
UNION ALL
SELECT
4
)
SELECT
number,
COUNT(*) OVER(
ORDER BY
number RANGE BETWEEN CURRENT ROW
AND UNBOUNDED FOLLOWING
)
FROM
numbers
+-----------+-------+
| number | count |
|-----------+-------|
| 1 | 4 |
| 1 | 4 |
| 3 | 2 |
| 4 | 1 |
+-----------+-------+
/*
Here, our window frame is between 1 preceding
and implicitly the current row. Note that the
rows with 1 as the number value are not treated
as peers.
*/
WITH numbers AS (
SELECT
1 AS number
UNION ALL
SELECT
1
UNION ALL
SELECT
3
UNION ALL
SELECT
4
)
SELECT
number,
COUNT(*) OVER(
ORDER BY
number ROWS 1 PRECEDING
)
FROM
numbers
+-----------+-------+
| number | count |
|-----------+-------|
| 1 | 1 |
| 1 | 2 |
| 3 | 2 |
| 4 | 2 |
+-----------+-------+
/*
Here, our window frame is only peer rows to the current row.
*/
WITH numbers AS (
SELECT
1 AS number
UNION ALL
SELECT
1
UNION ALL
SELECT
3
UNION ALL
SELECT
4
)
SELECT
number,
COUNT(*) OVER(
ORDER BY
number RANGE CURRENT ROW
)
FROM
numbers
+-----------+-------+
| number | count |
|-----------+-------|
| 1 | 2 |
| 1 | 2 |
| 3 | 1 |
| 4 | 1 |
+-----------+-------+
Aggregate Functions
All aggregate functions can also be used as window
functions. When an aggregate function is used as a window function, it aggregates over the rows
within the current row's window frame. An aggregate used with ORDER BY
and the default window
frame definition produces a "running sum" behavior. To obtain aggregation over a row's entire
partition, omit the ORDER BY
clause entirely or use
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
to capture the entire partition.
Window Functions
ROW_NUMBER
ROW_NUMBER()
Number of the current row within its partition, counting from 1.
WITH examples AS (
SELECT
'John' AS name
UNION ALL
SELECT
'Jane'
UNION ALL
SELECT
'Angela'
UNION ALL
SELECT
'James'
)
SELECT
ROW_NUMBER() OVER()
FROM
examples
+--------------+
| row_number |
|--------------+
| 1 |
| 2 |
| 3 |
| 4 |
+--------------+
RANK
RANK()
Rank of the current row with gaps; same as row_number of its first peer.
WITH examples AS (
SELECT
'John' AS name
UNION ALL
SELECT
'Jane'
UNION ALL
SELECT
'Angela'
UNION ALL
SELECT
'James'
)
SELECT
name,
RANK() OVER(
ORDER BY
name
)
FROM
examples
+-----------+------+
| name | rank |
|-----------+------|
| Angela | 1 |
| James | 2 |
| Jane | 3 |
| John | 4 |
+-----------+------+
DENSE_RANK
DENSE_RANK()
Rank of the current row without gaps; this function counts peer groups.
WITH examples AS (
SELECT
'John' AS name
UNION ALL
SELECT
'Jane'
UNION ALL
SELECT
'Angela'
UNION ALL
SELECT
'Angela'
UNION ALL
SELECT
'James'
)
SELECT
name,
DENSE_RANK() OVER(
ORDER BY
name
)
FROM
examples
+-----------+------------+
| name | dense_rank |
|-----------+------------|
| Angela | 1 |
| Angela | 1 |
| James | 2 |
| Jane | 3 |
| John | 4 |
+-----------+------------+
PERCENT_RANK
PERCENT_RANK()
Relative rank of the current row: (rank - 1) / (total rows - 1).
WITH examples AS (
SELECT
'John' AS name
UNION ALL
SELECT
'Jane'
UNION ALL
SELECT
'Angela'
UNION ALL
SELECT
'Angela'
UNION ALL
SELECT
'James'
)
SELECT
name,
PERCENT_RANK() OVER(
ORDER BY
name
)
FROM
examples
+-----------+--------------+
| name | percent_rank |
|-----------+--------------|
| Angela | 0 |
| Angela | 0 |
| James | 0.5 |
| Jane | 0.75 |
| John | 1 |
+-----------+--------------+
CUME_DIST
CUME_DIST()
Relative rank of the current row: (number of rows preceding or peer with current row) / (total rows).
WITH examples AS (
SELECT
'John' AS name
UNION ALL
SELECT
'Jane'
UNION ALL
SELECT
'Angela'
UNION ALL
SELECT
'Angela'
UNION ALL
SELECT
'James'
)
SELECT
name,
CUME_DIST() OVER(
ORDER BY
name
)
FROM
examples
+-----------+------------+
| name | cume_dist |
|-----------+------------|
| Angela | 0.4 |
| Angela | 0.4 |
| James | 0.6 |
| Jane | 0.8 |
| John | 1 |
+-----------+------------+
NTILE
NTILE(num_buckets)
Integer ranging from 1 to the argument value, dividing the partition as equally as possible.
WITH examples AS (
SELECT
'John' AS name
UNION ALL
SELECT
'Jane'
UNION ALL
SELECT
'Angela'
UNION ALL
SELECT
'Angela'
UNION ALL
SELECT
'James'
)
SELECT
name,
NTILE(2) OVER(
ORDER BY
name
)
FROM
examples
+-----------+-------+
| name | ntile |
|-----------+-------|
| Angela | 1 |
| Angela | 1 |
| James | 1 |
| Jane | 2 |
| John | 2 |
+-----------+-------+
LAG
LAG(value [, offset [, default ]])
Returns value evaluated at the row that is offset rows before the current row within the partition; if there is no such row, instead return default. Both offset and default are evaluated with respect to the current row. If omitted, offset defaults to 1 and default to null.
WITH examples AS (
SELECT
'John' AS name
UNION ALL
SELECT
'Jane'
UNION ALL
SELECT
'Angela'
UNION ALL
SELECT
'Angela'
UNION ALL
SELECT
'James'
)
SELECT
name,
LAG(name, 2) OVER(
ORDER BY
name
)
FROM
examples
+-----------+------------+
| name | lag |
|-----------+------------|
| Angela | null |
| Angela | null |
| James | Angela |
| Jane | Angela |
| John | James |
+-----------+------------+
LEAD
LEAD(value [, offset [, default ]])
Returns value evaluated at the row that is offset rows after the current row within the partition; if there is no such row, instead return default. Both offset and default are evaluated with respect to the current row. If omitted, offset defaults to 1 and default to null.
WITH examples AS (
SELECT
'John' AS name
UNION ALL
SELECT
'Jane'
UNION ALL
SELECT
'Angela'
UNION ALL
SELECT
'Angela'
UNION ALL
SELECT
'James'
)
SELECT
name,
LEAD(name, 2) OVER(
ORDER BY
name
)
FROM
examples
+-----------+------------+
| name | lead |
|-----------+------------|
| Angela | James |
| Angela | Jane |
| James | John |
| Jane | null |
| John | null |
+-----------+------------+
FIRST_VALUE
FIRST_VALUE(value)
Returns value evaluated at the row that is the first row of the window frame.
WITH examples AS (
SELECT
'John' AS name
UNION ALL
SELECT
'Jane'
UNION ALL
SELECT
'Angela'
UNION ALL
SELECT
'Angela'
UNION ALL
SELECT
'James'
)
SELECT
name,
FIRST_VALUE(name) OVER(
ORDER BY
name ROWS BETWEEN 1 PRECEDING
AND 1 FOLLOWING
)
FROM
examples
+-----------+-------------+
| name | first_value |
|-----------+-------------|
| Angela | Angela |
| Angela | Angela |
| James | Angela |
| Jane | James |
| John | Jane |
+-----------+-------------+
LAST_VALUE
LAST_VALUE(value)
Returns value evaluated at the row that is the last row of the window frame.
WITH examples AS (
SELECT
'John' AS name
UNION ALL
SELECT
'Jane'
UNION ALL
SELECT
'Angela'
UNION ALL
SELECT
'Angela'
UNION ALL
SELECT
'James'
)
SELECT
name,
LAST_VALUE(name) OVER(
ORDER BY
name ROWS BETWEEN 1 PRECEDING
AND 1 FOLLOWING
)
FROM
examples
+-----------+-------------+
| name | first_value |
|-----------+-------------|
| Angela | Angela |
| Angela | James |
| James | Jane |
| Jane | John |
| John | John |
+-----------+-------------+
NTH_VALUE
NTH_VALUE(value, nth)
Returns value evaluated at the row that is the nth row of the window frame (counting from 1); null if no such row.
WITH examples AS (
SELECT
'John' AS name
UNION ALL
SELECT
'Jane'
UNION ALL
SELECT
'Angela'
UNION ALL
SELECT
'Angela'
UNION ALL
SELECT
'James'
)
SELECT
name,
NTH_VALUE(name, 2) OVER(
ORDER BY
name ROWS BETWEEN 1 PRECEDING
AND 1 FOLLOWING
)
FROM
examples
+-----------+-------------+
| name | first_value |
|-----------+-------------|
| Angela | Angela |
| Angela | Angela |
| James | James |
| Jane | Jane |
| John | John |
+-----------+-------------+
All of the functions listed above depend on the sort ordering specified by the ORDER BY
clause of
the associated window definition. Rows that are not distinct in the ORDER BY ordering are peers; the
four ranking functions are defined so that they give the same answer for any two peers.
Note that FIRST_VALUE
, LAST_VALUE
, and NTH_VALUE
consider only the rows within the window
frame.