Groups
The group
function in DUQL is a powerful tool for aggregating and analyzing data based on specified columns. It supports a wide range of operations, allowing for complex data transformations and analyses within grouped data.
Syntax
Parameters
by
string or array
Yes
Columns to group by. Can be a single column, an array of columns, or 'table.*' for all columns of a table.
additional_steps
object
Yes (at least one)
One or more additional transformation steps to apply to the grouped data.
Additional Steps
The group
function can include any of the following steps after grouping:
summarize
: Perform aggregations on the grouped data.filter
: Apply conditions to filter the grouped results.generate
: Create new columns based on grouped data.join
: Join the grouped data with another dataset.select
: Choose specific columns from the grouped results.sort
: Order the grouped results.take
: Limit the number of grouped results returned.window
: Perform window functions on the grouped data.loop
: Perform iterative operations on the grouped data.
Examples
Basic Grouping with Aggregation and Filtering
This example groups employees by department, calculates average salary and employee count, filters for departments with high average salaries, sorts by average salary descending, and takes the top 5 results.
Complex Business Logic with Case Statements
This query categorizes products based on their inventory status and filters for those needing attention.
Best Practices
🎯 Choose grouping columns carefully to ensure meaningful aggregations.
📊 Use
summarize
orgenerate
to calculate aggregate metrics.🔍 Leverage
filter
after aggregation to focus on important results.📈 Utilize
window
functions for advanced analytics within groups.🔢 Use
sort
andtake
to prioritize and limit results for better performance.🧮 Combine multiple steps for complex analyses in a single, readable query.
Related Functions
summarize
: Often used withingroup
to perform aggregations.filter
: Can be used before or after grouping to refine the dataset.generate
: Useful for creating new columns based on grouped data.window
: Enables advanced analytical functions within grouped data.
Limitations and Considerations
Grouping operations can be computationally expensive on large datasets. Use indexing strategies on grouping columns when possible.
Be mindful of the order of operations within the
group
function, as it can affect the final results.Some combinations of steps might not be logically valid or may produce unexpected results. Always test your queries thoroughly.
💡 Tip: The
group
function is the cornerstone of data analysis in DUQL. By combining it with various additional steps, you can perform complex aggregations, transformations, and analyses all within a single, readable query structure. Experiment with different combinations to unlock deeper insights from your data!
Last updated
Was this helpful?