Take
The take function in DUQL limits the number of rows returned or selects a specific range of rows. It's useful for pagination, sampling, or selecting the top or bottom N rows.
Syntax
take: <number_or_range>
Parameters
Parameter: take
Type: integer or string
Required: Yes
Description: Number of rows to take, or a range specification
Behavior
When given an integer, it returns that many rows from the beginning of the dataset.
When given a range, it returns the specified range of rows.
Can be used after sorting to get top/bottom N rows.
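The sort-then-take pattern described above can be sketched as follows. This is a minimal illustration, not a definitive query: the sales dataset and the quantity column are borrowed from the use case on this page, and the "-" prefix for descending order follows the sort step shown there.

```yaml
dataset: sales
steps:
  - sort: -quantity   # descending order, so the largest quantities come first
  - take: 5           # keep the first 5 rows of the sorted result (top 5)
```

Dropping the "-" prefix would presumably sort in ascending order and yield the bottom 5 rows instead; that ascending default is an assumption, not something this page states.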
Examples
Take First N Rows
take: 10
Take No Rows (Useful for Schema Inspection)
take: 0
Take a Range of Rows
take: '5..10'
Take All Rows from a Specific Point
take: '100..'
Take All Rows up to a Specific Point
take: '..50'
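Since take is described as useful for pagination, here is a hedged sketch of fetching one page with a range. It assumes ranges are 1-based and inclusive at both ends, which this page does not spell out, and it uses a hypothetical page size of 25 sorted by product_id.

```yaml
dataset: sales
steps:
  - sort: product_id   # a stable order is needed for pages to be consistent
  - take: '51..75'     # page 3 at 25 rows per page, assuming 1-based inclusive ranges
```

Under those assumptions, page n with page size s covers rows (n - 1) * s + 1 through n * s. If ranges turn out to be 0-based or end-exclusive, the bounds would need adjusting.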
Best Practices
Use take in combination with sort to get meaningful subsets of data.
Consider using take: 0 to inspect the schema of your query result without processing all rows.
Use range syntax for more flexible row selection.
Place take at the end of your query pipeline for best performance.
Be cautious with small take values when working with grouped or aggregated data.
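To illustrate the caution about grouped data, the sketch below assumes that take may appear inside the nested steps of a group and then applies to each group separately rather than to the whole result. This page does not confirm that placement, so treat it as hypothetical; the column names are illustrative and borrowed from the use case on this page.

```yaml
dataset: sales
steps:
  - group:
      by: [category]
      steps:
        - sort: -quantity
        - take: 3   # assumed: up to 3 rows per category, not 3 rows overall
```

If that assumption holds, moving take: 3 out of the group and to the end of the top-level steps would return only 3 rows in total, which is why small values deserve a second look in grouped queries.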
Real-World Use Case
Here's an example of a DUQL query that uses take to analyze the top-selling products:
dataset: sales
steps:
  - join:
      dataset: products
      where: sales.product_id == products.id
  - generate:
      revenue: price * quantity
  - group:
      by: [product_id, product_name, category]
      steps:
        - summarize:
            total_revenue: sum revenue
            units_sold: sum quantity
  - sort: -total_revenue
  - take: 20 # Top 20 products by revenue
  - generate:
      average_price: total_revenue / units_sold
      rank: sql'ROW_NUMBER() OVER (ORDER BY total_revenue DESC)'
into: top_selling_products
This query demonstrates:
Joining sales data with product information
Calculating revenue
Grouping and summarizing by product
Sorting by total revenue
Taking the top 20 products
Generating additional metrics and ranking
The take: 20 step ensures that we only get the top 20 selling products, making the analysis more focused and manageable.
Tip: Use the take function judiciously to control the size of your query results. It's particularly useful in combination with sorting to get top N or bottom N results quickly!