Take
The take function in DUQL is used to limit the number of rows returned or to select specific ranges of rows. It's useful for pagination, sampling, or selecting top/bottom N rows.
Syntax
    take: <number_or_range>
Parameters
| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| take | integer or string | Yes | Number of rows to take or a range specification |
Behavior
When given an integer, it returns that many rows from the beginning of the dataset.
When given a range, it returns the specified range of rows.
Can be used after sorting to get top/bottom N rows.
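As a sketch of how the last two points combine, the fragment below sorts before taking a range, which yields the second page of a ranked result. The dataset and column names (orders, order_total) are hypothetical, and it assumes ranges are 1-based and inclusive at both ends, as in PRQL; the text above does not state this.

```yaml
dataset: orders            # hypothetical dataset name
steps:
  - sort: -order_total     # descending, so the largest orders come first
  - take: '21..40'         # page 2 at 20 rows per page, assuming 1-based inclusive ranges
```

Without the sort step, most SQL backends do not guarantee a stable row order, so the same range may return different rows from one run to the next.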
Examples
Take First N Rows
    take: 10
Take No Rows (Useful for Schema Inspection)
    take: 0
Take a Range of Rows
    take: '5..10'
Take All Rows from a Specific Point
    take: '100..'
Take First N Rows Using Range Syntax
    take: '..50'
Best Practices
🎯 Use take in combination with sort to get meaningful subsets of data.
🔢 Consider using take: 0 to inspect the schema of your query result without processing all rows.
📊 Use range syntax for more flexible row selection.
🚀 Place take at the end of your query pipeline for best performance.
🧪 Be cautious with small take values when working with grouped or aggregated data.
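The last caution is easiest to see by comparing where take sits relative to group. The fragment below is only a sketch: the dataset and column names are hypothetical, and it assumes that sort and take are accepted inside a group's steps list and then apply per group, as the equivalent transforms do in PRQL.

```yaml
dataset: sales                  # hypothetical dataset with category and revenue columns
steps:
  - group:
      by: [category]
      steps:
        - sort: -revenue
        - take: 1               # assumed to keep one row per category, not one row overall
  - sort: -revenue
  - take: 5                     # applied after grouping: at most five rows in total
```

If that assumption holds, moving a small take from inside the group to after it changes the result from "the best row in each category" to "the best rows overall", which is why the placement deserves a second look.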
Real-World Use Case
Here's an example of a DUQL query that uses take to analyze the top-selling products:
    dataset: sales
    steps:
      - join:
          dataset: products
          where: sales.product_id == products.id
      - generate:
          revenue: price * quantity
      - group:
          by: [product_id, product_name, category]
          steps:
            - summarize:
                total_revenue: sum revenue
                units_sold: sum quantity
      - sort: -total_revenue
      - take: 20 # Top 20 products by revenue
      - generate:
          average_price: total_revenue / units_sold
          rank: sql'ROW_NUMBER() OVER (ORDER BY total_revenue DESC)'
    into: top_selling_products
This query demonstrates:
Joining sales data with product information
Calculating revenue
Grouping and summarizing by product
Sorting by total revenue
Taking the top 20 products
Generating additional metrics and ranking
The take: 20 step ensures that we only get the top 20 selling products, making the analysis more focused and manageable.
💡 Tip: Use the take function judiciously to control the size of your query results. It's particularly useful in combination with sorting to get top N or bottom N results quickly!