Pipeline Concepts
This page explains what you just experienced in the 2-Minute Win.
What You Just Did
You connected nodes together on a canvas.
→ That's called a PIPELINE.
A pipeline is just data flowing from one step to another.
Why it matters:
- You can transform any data without writing code
- Each step does one thing well
- You can see exactly what changed at each step
A pipeline consists of:
- Input nodes – your source JSON data
- Step nodes – utilities that transform the data
- Connections – wires that define how data flows from one step to the next
Data flows left-to-right through the pipeline. Each step receives JSON from its connected input, applies a transformation, and passes the result to the next step.
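Conceptually, a pipeline is just function composition: each step takes JSON in and hands JSON out. A minimal sketch (the step names here are illustrative, not the tool's real utilities):

```typescript
// Each step is a function from JSON to JSON; a pipeline chains them.
type Json = unknown;
type Step = (input: Json) => Json;

// Run the input through each step in order, left to right.
const runPipeline = (input: Json, steps: Step[]): Json =>
  steps.reduce((data, step) => step(data), input);

// Two toy steps: keep only the "name" field, then uppercase it.
const pickName: Step = (d) => ({ name: (d as { name: string }).name });
const upperName: Step = (d) => ({
  name: (d as { name: string }).name.toUpperCase(),
});

const result = runPipeline({ name: "ada", age: 36 }, [pickName, upperName]);
// result is { name: "ADA" }
```

On the canvas you do the same thing visually: the wires decide the order in which the functions are applied.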
You dragged a "Filter Fields" utility onto the canvas.
→ That's called a STEP NODE (or UTILITY NODE).
Step nodes transform your data in specific ways.
Why it matters:
- You don't need to remember syntax
- Each utility has a clear purpose
- You can chain them together to do complex things
The system has 28+ built-in utilities organized into categories:
- Convert – CSV ↔ JSON, format conversions
- Merge – Combine multiple data sources
- Structure – Pick/remove fields, flatten/nest, restructure
- Cleanup – Clean JSON, normalize, truncate
- Analysis – Aggregate, summarize, diff
- Generation – Mock data generation, compute
- Localization – Add localization features
- Parsing – Parse console logs, NDS logs
- Transform – Map values, format values
- Redaction – Remove sensitive data
Each utility is a building block. You chain them together to create complex transformations.
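To make "building block" concrete, here is a sketch of what a "Filter Fields"-style utility might look like under the hood: a factory that turns configuration (which keys to keep) into a reusable step. The function name and shape are assumptions for illustration, not the tool's actual implementation.

```typescript
type Obj = Record<string, unknown>;

// Configure once (the keys to keep), get back a step you can reuse.
const filterFields = (keep: string[]) => (input: Obj): Obj =>
  Object.fromEntries(Object.entries(input).filter(([k]) => keep.includes(k)));

const step = filterFields(["id", "email"]);
const out = step({ id: 1, email: "a@b.c", password: "hunter2" });
// out is { id: 1, email: "a@b.c" } – the extra field is gone
```

This is why you don't need to remember syntax: the configuration lives in the node's options, and the transformation itself is fixed.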
You pressed Run and saw the pipeline execute.
→ That's called PIPELINE EXECUTION.
Execution is data flowing through your pipeline from start to finish.
Why it matters:
- You can inspect outputs step by step
- Errors are caught immediately
- You can debug by looking at each step
When you click Run, the pipeline engine:
- Builds a dependency graph from the connections between nodes
- Sorts topologically to determine the correct execution order
- Executes each step in order, passing outputs as inputs to downstream steps
- Stores the result of the final step as the pipeline output
Steps that don't depend on each other could theoretically run in parallel, but the current engine executes them sequentially in topological order.
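The ordering step above (build a graph, sort topologically, execute in order) can be sketched with a standard Kahn's-algorithm topological sort. The node ids and graph shape below are illustrative, not the engine's internal types:

```typescript
type Edge = { from: string; to: string };

function topoSort(nodes: string[], edges: Edge[]): string[] {
  // Count incoming connections for each node.
  const indegree = new Map(nodes.map((n) => [n, 0]));
  for (const e of edges) indegree.set(e.to, (indegree.get(e.to) ?? 0) + 1);

  // Start from nodes with no incoming wires (the inputs).
  const queue = nodes.filter((n) => indegree.get(n) === 0);
  const order: string[] = [];
  while (queue.length > 0) {
    const n = queue.shift()!;
    order.push(n);
    for (const e of edges.filter((e) => e.from === n)) {
      const d = indegree.get(e.to)! - 1;
      indegree.set(e.to, d);
      if (d === 0) queue.push(e.to);
    }
  }
  return order; // execute steps in this order, piping outputs downstream
}

const order = topoSort(
  ["input", "filter", "output"],
  [{ from: "input", to: "filter" }, { from: "filter", to: "output" }]
);
// order is ["input", "filter", "output"]
```

Sorting this way guarantees every step runs only after all of its inputs are ready, which is also why errors surface at the exact step that caused them.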
You inspected step and output results.
→ That's called RESULT INSPECTION.
The system lets you review the output produced by a step or output node.
Why it matters:
- You can confirm what a utility produced
- You can trust your pipeline step by step
- You can learn by exploring outputs
When you view results:
- Successful steps can show output in tree or text form
- Failed steps show errors instead of output
- Output nodes show the final pipeline result
Node Types
Input Nodes
Every pipeline starts with an Input node. This is where you provide your source JSON data.
You can add multiple inputs for utilities that require two inputs (e.g., Deep Merge, Diff).
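To see why a utility like Deep Merge needs two inputs, here is a simplified recursive merge of a base document and an overlay. This is a sketch of the general technique, not the tool's actual Deep Merge implementation:

```typescript
type Obj = Record<string, unknown>;

const isObj = (v: unknown): v is Obj =>
  typeof v === "object" && v !== null && !Array.isArray(v);

// Overlay wins on conflicts; nested objects are merged recursively.
function deepMerge(base: Obj, overlay: Obj): Obj {
  const out: Obj = { ...base };
  for (const [k, v] of Object.entries(overlay)) {
    out[k] = isObj(out[k]) && isObj(v) ? deepMerge(out[k] as Obj, v) : v;
  }
  return out;
}

const merged = deepMerge(
  { user: { name: "Ada", role: "admin" } },
  { user: { role: "viewer" } }
);
// merged is { user: { name: "Ada", role: "viewer" } }
```

A single wire can only carry one document, so merge- and diff-style utilities expose a second input handle instead.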
Step Nodes
Step nodes contain utilities that transform your data. Each utility has different configuration options:
- Text fields – for search terms, field names, expressions
- Dropdowns – for selecting modes, strategies, or formats
- Toggles – for enabling/disabling options
- Forge JSON Editors – for complex configuration like field mappings
Output Nodes
Output nodes display the final result of your pipeline. The last stepβs output becomes the pipeline result.
Group Nodes
Group nodes let you visually organize related steps. They don't transform data – they're just for keeping your canvas tidy.
Connections
Connections (wires) define how data flows from one step to another.
How to connect nodes:
- Drag from the output handle (right side) of one node
- Drop it on the input handle (left side) of another node
Auto-connect: When you drag a utility onto the canvas near existing nodes, it automatically connects to the nearest compatible node. This is why your first pipeline was so easy!
Connection types:
- Primary input (solid blue handle) – main data flow
- Secondary input (solid orange handle) – for utilities that merge two data sources
- Output (hollow green handle) – result output
Pipeline Execution Models
Demo vs. Saved Pipelines
Demo pipeline (/demo/pipeline)
- Works without sign-in
- Saved to localStorage
- Great for testing and learning
Saved pipelines (/dashboard/pipeline/[id])
- Requires sign-in
- Saved to the cloud with full version history
- Supports sharing and collaboration
Execution in the Background
Pipeline execution happens in a Web Worker – a separate thread from the main browser interface. This means:
- The UI stays responsive even during complex transformations
- Large datasets don't freeze your browser
- Multiple pipelines can run simultaneously
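The hand-off to a worker can be sketched as a small message protocol. The message shapes and step table below are assumptions for illustration – the real engine's protocol may differ – and the handler is written as a plain function so the same logic would work inside a Worker's `onmessage`:

```typescript
type RunRequest = { kind: "run"; input: unknown; stepIds: string[] };
type RunResult =
  | { kind: "done"; output: unknown }
  | { kind: "error"; message: string };

// Illustrative step table; real utilities live in the engine.
const workerSteps: Record<string, (d: unknown) => unknown> = {
  double: (d) => (d as number) * 2,
};

// Runs the requested steps and reports either the output or the error.
function handleMessage(msg: RunRequest): RunResult {
  try {
    let data = msg.input;
    for (const id of msg.stepIds) data = workerSteps[id](data);
    return { kind: "done", output: data };
  } catch (e) {
    return { kind: "error", message: String(e) };
  }
}

// In the browser this runs off the main thread:
//   worker.postMessage({ kind: "run", input: 21, stepIds: ["double"] });
//   worker.onmessage = (e) => { /* e.data is the RunResult */ };
```

Because the main thread only exchanges messages with the worker, the canvas stays interactive no matter how heavy the transformation is.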
Step outputs are stored using a 3-tier storage system:
- OPFS (Origin Private File System) – fastest, on desktop Chrome/Edge
- IndexedDB – fallback for browsers without OPFS
- Memory – final fallback
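A tiered store like this typically probes each backend in order and falls through to the next. The sketch below assumes simple feature detection (`navigator.storage.getDirectory` for OPFS, the `indexedDB` global for IndexedDB); the interface and names are illustrative, and the real OPFS/IndexedDB tiers would be asynchronous, while this memory tier is kept synchronous for brevity:

```typescript
interface StepStore {
  put(key: string, value: string): void;
  get(key: string): string | undefined;
}

// Probe the tiers in order of preference.
function pickTier(): "opfs" | "indexeddb" | "memory" {
  const g = globalThis as {
    navigator?: { storage?: { getDirectory?: unknown } };
    indexedDB?: unknown;
  };
  if (typeof g.navigator?.storage?.getDirectory === "function") return "opfs";
  if (g.indexedDB !== undefined) return "indexeddb";
  return "memory";
}

// Memory tier: the last-resort fallback, always available.
function memoryStore(): StepStore {
  const m = new Map<string, string>();
  return {
    put: (k, v) => { m.set(k, v); },
    get: (k) => m.get(k),
  };
}
```

Keeping step outputs outside the page's memory (when OPFS or IndexedDB is available) is what lets large intermediate results survive without bloating the main thread.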
Next Steps
Now that you understand the concepts:
- Build your own pipeline – go beyond the tutorial
- Explore all utilities – reference for all 28+ utilities
- Find solutions to your problems – "I want to clean messy API data"
- Running Pipelines – execute and view results