Window Functions
This page unpacks QuestDB window functions and provides references.
Window functions exist within many SQL dialects. QuestDB is consistent with expected function.
What is a Window Function?
A window function performs a calculation across a set of rows that are related
to the current row. This set of related rows is called a "window", defined by an
OVER
clause that follows the window function.
In practical terms, window functions are used when you need to perform a
calculation that depends on a group of rows, but you want to retain the
individual rows in the result set. This is different from aggregate functions
like a cumulative sum
or avg
, which perform calculations on a group of rows
and return a single result.
The underlying mechanism of a window function involves three components:
-
Partitioning: The
PARTITION BY
clause divides the result set into partitions (groups of rows) upon which the window function is applied. If no partition is defined, the function treats all rows of the query result set as a single partition. -
Ordering: The
ORDER BY
clause within theOVER
clause determines the order of the rows in each partition. -
Frame Specification: This defines the set of rows included in the window, relative to the current row. For example,
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
includes all rows from the start of the partition to the current row.
Use cases for window functions are vast.
They are often used in analytics for tasks such as:
- Calculating running totals or averages
- Finding the maximum or minimum value in a sequence or partition
- Ranking items within a specific category or partition
- Calculating moving averages or cumulative sums
Window functions are tough to grok.
An analogy before we get to building:
Imagine a group of cars in a race. Each car has a number, a name, and a finish
time. If you wanted to know the average finish time, you could use an aggregate
function like avg
to calculate it. But this would only give you a single
result: the average time. You wouldn't know anything about individual cars'
times.
Now, let's say you want to know how each car's time compares to the average. Enter window functions. A window function allows you to calculate the average finish time (the window), but for each car (row) individually.
For example, you could use a window function to calculate the average finish
time for all cars, but then apply this average to each car to see if they were
faster or slower than the average. The OVER
clause in a window function is
like saying, "for each car, compare their time to the average time of all cars."
So, in essence, window functions allow you to perform calculations that consider more than just the individual row or the entire table, but a 'window' of related rows. This 'window' could be all rows with the same value in a certain column, like all cars of the same engine size, or it could be a range of rows based on some order, like the three cars who finished before and after a certain car.
This makes window functions incredibly powerful for complex calculations and analyses.