Pipeline Concepts
This page explains what you just experienced in the 2-Minute Win.
What You Just Did
You connected nodes together on a canvas.
→ That's called a PIPELINE.
A pipeline is just data flowing from one step to another.
Why it matters:
- You can transform any data without writing code
- Each step does one thing well
- You can see exactly what changed at each step
A pipeline consists of:
- Input nodes – your source JSON data
- Step nodes – utilities that transform the data
- Connections – wires that define how data flows from one step to the next
Data flows left-to-right through the pipeline. Each step receives JSON from its connected input, applies a transformation, and passes the result to the next step.
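Conceptually, a pipeline is just function composition: each step takes JSON in and hands JSON out. A minimal sketch (the step names here are illustrative, not the tool's real utilities):

```typescript
// Each step is a function from JSON to JSON; a pipeline chains them.
type Json = unknown;
type Step = (input: Json) => Json;

// Run the input through each step in order, left to right.
const runPipeline = (input: Json, steps: Step[]): Json =>
  steps.reduce((data, step) => step(data), input);

// Two toy steps: keep only the "name" field, then uppercase it.
const pickName: Step = (d) => ({ name: (d as { name: string }).name });
const upperName: Step = (d) => ({
  name: (d as { name: string }).name.toUpperCase(),
});

const result = runPipeline({ name: "ada", age: 36 }, [pickName, upperName]);
// result is { name: "ADA" }
```

On the canvas you do the same thing visually: the wires decide the order in which the functions are applied.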
You dragged a "Filter Fields" utility onto the canvas.
→ That's called a STEP NODE (or UTILITY NODE).
Step nodes transform your data in specific ways.
Why it matters:
- You don't need to remember syntax
- Each utility has a clear purpose
- You can chain them together to do complex things
The system has 28+ built-in utilities organized into categories:
- Convert – CSV ↔ JSON, format conversions
- Merge – Combine multiple data sources
- Structure – Pick/remove fields, flatten/nest, restructure
- Cleanup – Clean JSON, normalize, truncate
- Analysis – Aggregate, summarize, diff
- Generation – Mock data generation, compute
- Localization – Add localization features
- Parsing – Parse console logs, NDS logs
- Transform – Map values, format values
- Redaction – Remove sensitive data
Each utility is a building block. You chain them together to create complex transformations.
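To make "building block" concrete, here is a sketch of what a "Filter Fields"-style utility might look like under the hood: a factory that turns configuration (which keys to keep) into a reusable step. The function name and shape are assumptions for illustration, not the tool's actual implementation.

```typescript
type Obj = Record<string, unknown>;

// Configure once (the keys to keep), get back a step you can reuse.
const filterFields = (keep: string[]) => (input: Obj): Obj =>
  Object.fromEntries(Object.entries(input).filter(([k]) => keep.includes(k)));

const step = filterFields(["id", "email"]);
const out = step({ id: 1, email: "a@b.c", password: "hunter2" });
// out is { id: 1, email: "a@b.c" } – the extra field is gone
```

This is why you don't need to remember syntax: the configuration lives in the node's options, and the transformation itself is fixed.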
You pressed Run and saw the pipeline execute.
→ That's called PIPELINE EXECUTION.
Execution is data flowing through your pipeline from start to finish.
Why it matters:
- You can inspect outputs step by step
- Errors are caught immediately
- You can debug by looking at each step
When you click Run, the pipeline engine:
- Builds a dependency graph from the connections between nodes
- Sorts topologically to determine the correct execution order
- Executes each step in order, passing outputs as inputs to downstream steps
- Stores the result of the final step as the pipeline output
Steps that don't depend on each other could theoretically run in parallel, but the current engine executes them sequentially in topological order.
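The ordering step above (build a graph, sort topologically, execute in order) can be sketched with a standard Kahn's-algorithm topological sort. The node ids and graph shape below are illustrative, not the engine's internal types:

```typescript
type Edge = { from: string; to: string };

function topoSort(nodes: string[], edges: Edge[]): string[] {
  // Count incoming connections for each node.
  const indegree = new Map(nodes.map((n) => [n, 0]));
  for (const e of edges) indegree.set(e.to, (indegree.get(e.to) ?? 0) + 1);

  // Start from nodes with no incoming wires (the inputs).
  const queue = nodes.filter((n) => indegree.get(n) === 0);
  const order: string[] = [];
  while (queue.length > 0) {
    const n = queue.shift()!;
    order.push(n);
    for (const e of edges.filter((e) => e.from === n)) {
      const d = indegree.get(e.to)! - 1;
      indegree.set(e.to, d);
      if (d === 0) queue.push(e.to);
    }
  }
  return order; // execute steps in this order, piping outputs downstream
}

const order = topoSort(
  ["input", "filter", "output"],
  [{ from: "input", to: "filter" }, { from: "filter", to: "output" }]
);
// order is ["input", "filter", "output"]
```

Sorting this way guarantees every step runs only after all of its inputs are ready, which is also why errors surface at the exact step that caused them.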
You inspected step and output results.
→ That's called RESULT INSPECTION.
The system lets you review the output produced by a step or output node.
Why it matters:
- You can confirm what a utility produced
- You can trust your pipeline step by step
- You can learn by exploring outputs
When you view results:
- Successful steps can show output in tree or text form
- Failed steps show errors instead of output
- Output nodes show the final pipeline result
Node Types
Input Nodes
Every pipeline starts with an Input node. This is where you provide your source JSON data.
You can add multiple inputs for utilities that require two inputs (e.g., Deep Merge, Diff).
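To see why a utility like Deep Merge needs two inputs, here is a simplified recursive merge of a base document and an overlay. This is a sketch of the general technique, not the tool's actual Deep Merge implementation:

```typescript
type Obj = Record<string, unknown>;

const isObj = (v: unknown): v is Obj =>
  typeof v === "object" && v !== null && !Array.isArray(v);

// Overlay wins on conflicts; nested objects are merged recursively.
function deepMerge(base: Obj, overlay: Obj): Obj {
  const out: Obj = { ...base };
  for (const [k, v] of Object.entries(overlay)) {
    out[k] = isObj(out[k]) && isObj(v) ? deepMerge(out[k] as Obj, v) : v;
  }
  return out;
}

const merged = deepMerge(
  { user: { name: "Ada", role: "admin" } },
  { user: { role: "viewer" } }
);
// merged is { user: { name: "Ada", role: "viewer" } }
```

A single wire can only carry one document, so merge- and diff-style utilities expose a second input handle instead.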
Step Nodes
Step nodes contain utilities that transform your data. Each utility has different configuration options:
- Text fields – for search terms, field names, expressions
- Dropdowns – for selecting modes, strategies, or formats
- Toggles – for enabling/disabling options
- Forge JSON Editors – for complex configuration like field mappings
Output Nodes
Output nodes display the final result of your pipeline. The last stepβs output becomes the pipeline result.
Group Nodes
Group nodes let you visually organize related steps. They don't transform data – they're just for keeping your canvas tidy.
Connections
Connections (wires) define how data flows from one step to another.
How to connect nodes:
- Drag from the output handle (right side) of one node
- Drop it on the input handle (left side) of another node
Auto-connect: When you drag a utility onto the canvas near existing nodes, it automatically connects to the nearest compatible node. This is why your first pipeline was so easy!
Connection types:
- Primary input (solid blue handle) – main data flow
- Secondary input (solid orange handle) – for utilities that merge two data sources
- Output (hollow green handle) – result output
Pipeline Execution Models
Demo vs. Saved Pipelines
Demo pipeline (/demo/pipeline)
- Works without sign-in
- Saved to localStorage
- Great for testing and learning
Saved pipelines (/dashboard/pipeline/[id])
- Requires sign-in
- Saved to the cloud with full version history
- Supports sharing and collaboration
Execution in the Background
Pipeline execution happens in a Web Worker – a separate thread from the main browser interface. This means:
- The UI stays responsive even during complex transformations
- Large datasets don't freeze your browser
- Multiple pipelines can run simultaneously
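The hand-off to a worker can be sketched as a small message protocol. The message shapes and step table below are assumptions for illustration – the real engine's protocol may differ – and the handler is written as a plain function so the same logic would work inside a Worker's `onmessage`:

```typescript
type RunRequest = { kind: "run"; input: unknown; stepIds: string[] };
type RunResult =
  | { kind: "done"; output: unknown }
  | { kind: "error"; message: string };

// Illustrative step table; real utilities live in the engine.
const workerSteps: Record<string, (d: unknown) => unknown> = {
  double: (d) => (d as number) * 2,
};

// Runs the requested steps and reports either the output or the error.
function handleMessage(msg: RunRequest): RunResult {
  try {
    let data = msg.input;
    for (const id of msg.stepIds) data = workerSteps[id](data);
    return { kind: "done", output: data };
  } catch (e) {
    return { kind: "error", message: String(e) };
  }
}

// In the browser this runs off the main thread:
//   worker.postMessage({ kind: "run", input: 21, stepIds: ["double"] });
//   worker.onmessage = (e) => { /* e.data is the RunResult */ };
```

Because the main thread only exchanges messages with the worker, the canvas stays interactive no matter how heavy the transformation is.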
Step outputs are stored using a 3-tier storage system:
- OPFS (Origin Private File System) – fastest, on desktop Chrome/Edge
- IndexedDB – fallback for browsers without OPFS
- Memory – final fallback
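A tiered store like this typically probes each backend in order and falls through to the next. The sketch below assumes simple feature detection (`navigator.storage.getDirectory` for OPFS, the `indexedDB` global for IndexedDB); the interface and names are illustrative, and the real OPFS/IndexedDB tiers would be asynchronous, while this memory tier is kept synchronous for brevity:

```typescript
interface StepStore {
  put(key: string, value: string): void;
  get(key: string): string | undefined;
}

// Probe the tiers in order of preference.
function pickTier(): "opfs" | "indexeddb" | "memory" {
  const g = globalThis as {
    navigator?: { storage?: { getDirectory?: unknown } };
    indexedDB?: unknown;
  };
  if (typeof g.navigator?.storage?.getDirectory === "function") return "opfs";
  if (g.indexedDB !== undefined) return "indexeddb";
  return "memory";
}

// Memory tier: the last-resort fallback, always available.
function memoryStore(): StepStore {
  const m = new Map<string, string>();
  return {
    put: (k, v) => { m.set(k, v); },
    get: (k) => m.get(k),
  };
}
```

Keeping step outputs outside the page's memory (when OPFS or IndexedDB is available) is what lets large intermediate results survive without bloating the main thread.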
Next Steps
Now that you understand the concepts:
- Build your own pipeline – go beyond the tutorial
- Explore all utilities – reference for all 28+ utilities
- Find solutions to your problems – "I want to clean messy API data"
- Running Pipelines – execute and view results