Loops
The loop function in DUQL performs iterative processing. It applies a sequence of steps repeatedly to an initial dataset until a termination condition is met, typically when the step function returns an empty table.
Syntax
Parameters
The loop function takes no options of its own. Its body is a list of steps that is executed iteratively until the input to the next iteration (the result of the previous iteration) is empty.
steps (array, required): A list of transformation steps to be applied iteratively.
Each step in the loop can be any valid DUQL transformation, such as filter, generate, or group.
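The iteration rule described above can be modelled outside DUQL. The following is a minimal Python sketch, not DUQL syntax: tables are lists of dicts, each step is a plain function, and the names run_loop, keep_below_four, and increment are hypothetical. It also assumes the loop keeps the initial table and every non-empty intermediate result, which is how recursive queries usually behave; the source does not confirm that DUQL does the same.

```python
from typing import Callable

Row = dict
Table = list[Row]
Step = Callable[[Table], Table]


def run_loop(initial: Table, steps: list[Step]) -> list[Table]:
    """Apply `steps` in order, over and over, until they yield an empty table.

    Returns the initial table followed by each non-empty intermediate result.
    (Keeping the intermediate results is an assumption of this sketch.)
    """
    history = [initial]
    current = initial
    while current:
        for step in steps:
            current = step(current)
        if current:
            history.append(current)
    return history


# Hypothetical stand-in for a filter step: this is the termination condition.
def keep_below_four(table: Table) -> Table:
    return [row for row in table if row["n"] < 4]


# Hypothetical stand-in for a generate step: derive the next value of n.
def increment(table: Table) -> Table:
    return [{**row, "n": row["n"] + 1} for row in table]


history = run_loop([{"n": 1}], [keep_below_four, increment])
values = [table[0]["n"] for table in history]
print(values)
```

The filter step is what ends the loop: once no row satisfies n < 4, the steps return an empty table and iteration stops.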
Examples
Basic Iterative Calculation
This loop continues to process payments until the remaining balance is zero or negative.
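The DUQL for this example is not reproduced here. As a rough illustration of the behaviour described, the Python sketch below models the same idea under stated assumptions: a single account row, a fixed payment per iteration, and hypothetical names (apply_payments, balance, period). It is a model of the logic, not the DUQL query itself.

```python
def apply_payments(starting_balance: int, payment: int) -> list[dict]:
    """Subtract `payment` each period until the balance is zero or negative."""
    rows = [{"period": 0, "balance": starting_balance}]
    history = list(rows)
    while rows:
        # filter: only rows with an outstanding balance continue
        rows = [r for r in rows if r["balance"] > 0]
        # generate: apply one payment and advance the period counter
        rows = [
            {"period": r["period"] + 1, "balance": r["balance"] - payment}
            for r in rows
        ]
        history.extend(rows)
    return history


schedule = apply_payments(starting_balance=1000, payment=300)
balances = [row["balance"] for row in schedule]
print(balances)
```

The last recorded row has a non-positive balance, so the filter drops it on the next pass and the loop ends.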
Hierarchical Data Processing
This loop traverses an employee hierarchy, joining the manager table with itself until it reaches employees with no manager (top level).
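The DUQL for this example is not reproduced here. The traversal it describes can be sketched in Python as a repeated self-join: each pass looks up the manager of the current ancestor and stops when every chain has reached an employee without a manager. The table layout (id, manager_id) and the function name management_chains are assumptions made for illustration.

```python
def management_chains(employees: list[dict]) -> list[tuple[int, int, int]]:
    """Return (employee_id, ancestor_id, depth) for every manager above each employee."""
    manager_of = {e["id"]: e["manager_id"] for e in employees}
    # Start with every employee pointing at themselves at depth 0.
    frontier = [(e["id"], e["id"], 0) for e in employees]
    chains = []
    while frontier:
        next_frontier = []
        for employee_id, ancestor_id, depth in frontier:
            manager_id = manager_of[ancestor_id]  # the "join" back onto the same table
            if manager_id is not None:  # filter: top-level employees drop out
                next_frontier.append((employee_id, manager_id, depth + 1))
        chains.extend(next_frontier)
        frontier = next_frontier
    return chains


employees = [
    {"id": 1, "manager_id": None},  # top level
    {"id": 2, "manager_id": 1},
    {"id": 3, "manager_id": 2},
    {"id": 4, "manager_id": 2},
]
chains = management_chains(employees)
print(sorted(chains))
```

This sketch assumes the hierarchy has no cycles; a cycle in manager_id would keep the frontier non-empty forever.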
Use Cases
Recursive Calculations: Perform calculations that depend on previous results, such as compound interest or depreciation.
Hierarchical Data Processing: Traverse tree-like structures, such as organizational hierarchies or bill of materials.
Iterative Data Cleaning: Apply data cleaning steps repeatedly until certain quality criteria are met.
Convergence Algorithms: Implement algorithms that iterate until a convergence condition is satisfied.
Best Practices
⚠️ Always include a termination condition to prevent infinite loops. This is typically done using a filter step that will eventually return an empty result.
🔢 Consider adding a maximum iteration count as a safeguard against unexpected infinite loops.
📊 Use generate steps within the loop to create or update variables that track the iteration progress or accumulate results.
🧮 When possible, try to express your logic without loops for better performance. Only use loops when iterative processing is truly necessary.
📝 Document the purpose and expected behavior of your loop clearly, especially for complex iterative processes.
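The first three practices can be pictured together in a short Python sketch. It is a model of the idea rather than DUQL: bounded_loop, halve_until_small, and never_empties are hypothetical names, and the iteration column stands in for a counter maintained by a generate step.

```python
def bounded_loop(initial, step, max_iterations=100):
    """Run `step` until it returns no rows, failing loudly once the cap is hit."""
    current = initial
    iterations = 0
    while current:
        if iterations >= max_iterations:
            raise RuntimeError(
                f"loop did not terminate within {max_iterations} iterations"
            )
        current = step(current)
        iterations += 1
    return iterations


def halve_until_small(rows):
    # generate: halve the value and bump the iteration counter
    updated = [
        {"value": r["value"] / 2, "iteration": r["iteration"] + 1} for r in rows
    ]
    # filter: the termination condition, which eventually leaves no rows
    return [r for r in updated if r["value"] >= 1]


def never_empties(rows):
    # A step with no termination condition: the cap is the only thing that stops it.
    return rows


terminating_runs = bounded_loop([{"value": 8, "iteration": 0}], halve_until_small)

try:
    bounded_loop([{"value": 1, "iteration": 0}], never_empties, max_iterations=10)
    hit_cap = False
except RuntimeError:
    hit_cap = True

print(terminating_runs, hit_cap)
```

Raising an error at the cap, rather than silently returning, makes a missing termination condition visible instead of producing a truncated result.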
Related Functions
filter: Often used as a termination condition in loops
generate: Used to update variables within the loop
append: Useful for accumulating results across iterations
Limitations and Considerations
Loops can be computationally expensive, especially on large datasets. Use them judiciously.
Not all database systems support iterative processing natively.
Complex loops can be difficult to optimize. Consider alternative non-looping approaches if performance becomes an issue.
💡 Tip: While loops are powerful for certain types of problems, they should be used sparingly in data processing pipelines. Often, set-based operations are more efficient and easier to optimize. Always consider if there's a non-iterative way to express your logic before resorting to a loop!
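As one illustration of replacing a loop with a single pass, compound interest can be computed period by period or directly from its closed form. The Python sketch below compares the two; the function names and the sample figures are assumptions chosen for the example.

```python
def balance_by_loop(principal: float, rate: float, years: int) -> float:
    """Iterative version: one multiplication per year."""
    balance = principal
    for _ in range(years):
        balance *= 1 + rate
    return balance


def balance_closed_form(principal: float, rate: float, years: int) -> float:
    """Non-iterative version: principal * (1 + rate) ** years."""
    return principal * (1 + rate) ** years


accounts = [
    {"principal": 1000.0, "rate": 0.05, "years": 10},
    {"principal": 2500.0, "rate": 0.03, "years": 4},
]

# One pass over all rows, with no per-row iteration to a stopping condition.
closed = [balance_closed_form(**a) for a in accounts]
looped = [balance_by_loop(**a) for a in accounts]
print([round(x, 2) for x in closed])
```

Both versions agree up to floating-point rounding, but the closed form does a fixed amount of work per row regardless of the number of periods.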