Select
The select
function in DUQL is used to specify which columns to include in the output, rename columns, compute new columns, or select all columns from a specific table. There's also a select!
variant used to exclude specified columns.
Syntax
select:
<new_column_name>: <column_or_expression>
or
select: [<column1>, <column2>, ...]
or
select!: [<column1_to_exclude>, <column2_to_exclude>, ...]
Parameters
select
object or array
Yes (if not using select!
)
Columns to include or rename
select!
array
Yes (if not using select
)
Columns to exclude
Behavior
When
select
is an object, it allows renaming and computing new columns.When
select
is an array, it simply selects the specified columns.select!
excludes the specified columns and keeps all others.Using
table.*
selects all columns from a specific table.
Examples
Basic Column Selection
select: [id, name, email]
Renaming and Computing Columns
select:
Customer ID: customer_id
Full Name: f"{first_name} {last_name}"
Total Spent: sum(order_total)
Selecting All Columns from a Table
select: employees.*
Excluding Columns
select!: [password, credit_card_number]
Mixed Selection with Computation
select:
- order_id
- customer_name
- order_date
Total: price * quantity
Complex Expressions
select:
BMI: weight / (height / 100) ^ 2
Weight Status:
case:
- BMI < 18.5: "Underweight"
- BMI < 25: "Normal"
- BMI < 30: "Overweight"
- true: "Obese"
Selecting from Multiple Tables
select:
- products.*
- inventory.quantity_on_hand
Using SQL Functions
select:
User ID: id
Username: username
Last Login:
sql'DATE_FORMAT(last_login, ''%Y-%m-%d %H:%i:%s'')'
Window Functions
select:
- category
- product_name
Sales: sum quantity * price
"% of Total Sales":
sql'SUM(quantity * price) / SUM(SUM(quantity * price)) OVER () * 100'
Best Practices
🎯 Be explicit about which columns you need to improve query performance and readability.
🏷️ Use meaningful names when renaming columns or creating computed columns.
🧮 Leverage the power of expressions and case statements for complex transformations.
🔍 Use
select!
when it's easier to specify columns to exclude rather than include.📊 Be cautious when using
table.*
as it may include unnecessary columns.🚀 Place computationally intensive selections later in your query pipeline for better performance.
Real-World Use Case
Here's an example of a DUQL query that uses select
to prepare a comprehensive customer report:
dataset: customers
steps:
- join:
dataset: orders
where: customers.id == orders.customer_id
- join:
dataset: products
where: orders.product_id == products.id
- group:
by: customers.id
steps:
- summarize:
total_spent: sum orders.amount
order_count: count orders.id
last_order_date: max orders.order_date
- select:
Customer ID: customers.id
Full Name: f"{customers.first_name} {customers.last_name}"
Email: customers.email
Total Spent: total_spent
Average Order Value: total_spent / order_count
Order Count: order_count
Last Order Date: last_order_date
Customer Since:
sql'DATE_FORMAT(customers.created_at, ''%Y-%m-%d'')'
Loyalty Status:
case:
- order_count > 10: "VIP"
- order_count > 5: "Regular"
- true: "New"
Days Since Last Order:
sql'DATEDIFF(CURRENT_DATE, last_order_date)'
into: customer_summary_report
This query demonstrates:
Joining customer data with orders and products
Aggregating order information
Selecting and renaming relevant columns
Computing new columns using various expressions and SQL functions
Creating a categorical column based on order count
The resulting customer_summary_report
provides a comprehensive view of customer activity and value, showcasing the versatility of the select
function in DUQL.
💡 Tip: The
select
function is your tool for shaping the final output of your DUQL query. Use it to focus on the most relevant data, create meaningful derivations, and present your results in the most useful format for your analysis or reporting needs!
Last updated
Was this helpful?