Windows

DUQL Window Function

The window function in DUQL is a powerful tool for performing calculations across a set of rows that are related to the current row. It allows for complex analytics and comparisons within specified ranges or groups of data.

Window applies a step pipeline to segments of rows, producing one output value for every input value.

Syntax

window:
  <window_type>: <value>
  <steps>

Parameters

For each row, the segment over which the pipeline is applied is determined by one of:

ParameterTypeDescription

rows

string or integer

Specifies a range of rows relative to the current row position.

range

string

Specifies a range of values relative to the current row value.

expanding

boolean

When true, creates a cumulative window (alias for rows: ..0).

rolling

integer

Specifies a rolling window of n rows, including the current row.

0 references the current row.

For ease of use, there are two flags that override rows or range:

  • expanding:true is an alias for rows:..0. A sum using this window is also known as “cumulative sum”.

  • rolling:n is an alias for rows:(-n+1)..0, where n is an integer. This will include n last values, including current row. An average using this window is also knows as a Simple Moving Average.

The bounds of the range are inclusive. If a bound is omitted, the segment will extend until the edge of the table or group.

Window Types

Row Range

window:
  rows: <start>..<end>
  • Specifies a range of rows relative to the current row position.

  • Examples:

    • "0..2": Current row and next two rows

    • "-2..0": Previous two rows and current row

    • "..0": All preceding rows and current row

    • "..": All rows in the partition

Gotcha: Large row ranges can impact performance on large datasets.

Value Range

window:
  range: <start>..<end>
  • Specifies a range of values relative to the current row value.

  • Example: "-1000..1000": All rows within 1000 units of the current row's value

Gotcha: Range is based on the values in the sorted column, not row numbers.

Expanding Window

window:
  expanding: true
  • Creates a cumulative window (alias for rows: ..0).

  • A sum using this window is also known as “cumulative sum”.

  • Useful for running totals or cumulative aggregations.

Gotcha: Can lead to performance issues on very large datasets.

Rolling Window

window:
  rolling: <n>
  • Specifies a rolling window of n rows, including the current row.

  • rolling:n is an alias for rows:(-n+1)..0, where n is an integer. This will include n last values, including current row.

  • Example: 7 for a 7-day rolling window.

  • An average using this window is also knows as a Simple Moving Average.

Gotcha: Ensure the window size is appropriate for your data and analysis needs.

Best Practices

  1. 🎯 Choose the appropriate window type for your analysis needs.

  2. 🔢 Be mindful of window sizes, especially on large datasets.

  3. 📊 Use sorting within window functions to ensure consistent results.

  4. 🚀 Consider performance implications, especially with expanding windows or large ranges.

  5. 🧪 Test window functions with sample data to verify results.

  6. 📈 Combine window functions with other DUQL operations for complex analyses.

  • sort: Often used within window functions to order data.

  • generate: Used to create new columns based on window calculations.

  • select: Can be used to choose specific window function results.

Limitations and Considerations

  • Window functions can be computationally expensive, especially on large datasets.

  • Not all database systems support all types of window functions. Check your target database's capabilities.

  • Complex window functions may require careful optimization for performance.


💡 Tip: Window functions are powerful tools for time-series analysis, rankings, and running calculations. They allow you to perform complex analytics without the need for self-joins or subqueries. Experiment with different window types and combinations to unlock deeper insights from your data!

Last updated