
Why Constellation

The Problem

Pain Point

Backend services that aggregate data from multiple sources accumulate a specific class of bugs over time: runtime type errors in pipeline composition. Field name typos, type mismatches, and null values hide in code that compiles fine but fails at production runtime.

Consider a typical Scala service that fetches user data from three APIs, merges the results, and returns a subset of fields:

```scala
for {
  profile  <- profileApi.get(userId)
  activity <- activityApi.get(userId)
  prefs    <- prefsApi.get(userId)
} yield {
  val merged = profile ++ activity ++ prefs
  // Typo: "usrName" instead of "userName" — compiles, fails at runtime
  Json.obj("name" -> merged("usrName"), "score" -> merged("activityScore"))
}
```

These bugs share common traits:

  • They compile successfully — the types are all Map[String, Any] or Json
  • They surface late — in staging, in production, or during edge-case inputs
  • They're hard to trace — the error message says "key not found" with no indication of which pipeline step produced the wrong shape
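
The failure mode is easy to reproduce in a few lines of plain Scala; the map contents below are stand-ins for the three API responses:

```scala
import scala.util.Try

// Stand-ins for the three API responses, typed as loosely as in the example above.
val profile  = Map[String, Any]("userName" -> "ada", "email" -> "ada@example.com")
val activity = Map[String, Any]("activityScore" -> 42, "lastLogin" -> "2024-01-01")
val prefs    = Map[String, Any]("theme" -> "dark")

val merged = profile ++ activity ++ prefs

// Compiles without complaint; throws NoSuchElementException only when executed,
// and the message names the missing key, not the pipeline step that caused it.
val lookup = Try(merged("usrName"))
assert(lookup.isFailure)
```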

How Constellation Solves It

Constellation validates field accesses and type operations at compile time, before any code runs:

```
in profile: { userName: String, email: String }
in activity: { activityScore: Int, lastLogin: String }
in prefs: { theme: String }

merged = profile + activity + prefs

# Compile error: field 'usrName' not found. Did you mean 'userName'?
result = merged[usrName, activityScore]
```
The compiler knows the exact shape of merged: { userName: String, email: String, activityScore: Int, lastLogin: String, theme: String } — and rejects any access to a field that doesn't exist.

Separation of Concerns

This works because Constellation separates pipeline definition (a declarative .cst script) from module implementation (Scala functions with typed inputs/outputs). The compiler can reason about the entire pipeline graph before execution.
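
A minimal sketch of the module side of that split. The function name and registration style here are assumptions for illustration, not Constellation's actual interface; the point is that a typed signature tells the pipeline compiler each step's output shape:

```scala
// Hypothetical module: the fields mirror the earlier example, but the exact
// module API is an assumption, not confirmed Constellation syntax.
case class Profile(userName: String, email: String)

// A typed output means the pipeline compiler knows this step contributes
// exactly { userName: String, email: String } to any record it is merged into.
def fetchProfile(userId: String): Profile =
  Profile(userName = "ada", email = s"$userId@example.com") // stub in place of a real API call

assert(fetchProfile("ada").email == "ada@example.com")
```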

What You Get

| Capability | How it works |
| --- | --- |
| Compile-time field validation | Record types track every field name and type through merges, projections, and module calls |
| Automatic parallelization | The DAG compiler identifies independent branches and runs them concurrently — no manual parMapN |
| Declarative resilience | retry, timeout, fallback, cache are per-call options, not wrapper code |
| Hot reload | Change a .cst file, hit the API — no recompile, no restart |
| Built-in observability | Execution traces, DAG visualization, and metrics are available out of the box |
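
For comparison, the manual fan-out/fan-in that automatic parallelization replaces looks roughly like this in plain Scala (stubbed Futures stand in for the API clients):

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

// Stubs standing in for the three API clients.
def profile(id: String)  = Future(Map[String, Any]("userName" -> "ada"))
def activity(id: String) = Future(Map[String, Any]("activityScore" -> 42))
def prefs(id: String)    = Future(Map[String, Any]("theme" -> "dark"))

// Manual version: start all three Futures before sequencing them,
// so the independent calls actually run concurrently.
def fetchAll(id: String): Future[Map[String, Any]] = {
  val (p, a, r) = (profile(id), activity(id), prefs(id))
  for { pm <- p; am <- a; rm <- r } yield pm ++ am ++ rm
}

val merged = Await.result(fetchAll("u1"), 5.seconds)
assert(merged.keySet == Set("userName", "activityScore", "theme"))
```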

When to Use Constellation

API composition (Backend-for-Frontend)
You aggregate data from 3+ microservices, merge fields, and return a shaped response. Constellation's type algebra (+ for merge, [] for projection) makes this declarative and type-safe.

Data enrichment pipelines
You take a batch of records, enrich each with data from external sources, and filter/score the results. Candidates<T> batch types and per-call resilience options handle this pattern directly.
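
A plain-Scala sketch of that enrich/filter/score shape. The record type, stub enrichment, and threshold are illustrative only; Constellation's Candidates<T> batch type is not modeled here:

```scala
// Illustrative record and enrichment stub; real code would call an external source.
case class Record(id: String, score: Double)
def enrich(r: Record): Record = r.copy(score = r.score + 1.0)

val batch  = List(Record("a", 0.2), Record("b", 0.9))
val scored = batch.map(enrich).filter(_.score > 1.5) // keep only well-scored records
assert(scored.map(_.id) == List("b"))
```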

Type-safety-first teams
You already use Scala 3, Cats Effect, and value compile-time guarantees. Constellation extends those guarantees to your pipeline composition layer.

When Not to Use Constellation

Know the Limits

Constellation is designed for pipeline orchestration within a single JVM. It is not a replacement for distributed compute frameworks or stream processing systems.

| Use case | Better tool | Why |
| --- | --- | --- |
| CRUD applications | An ORM (Slick, Doobie) | Constellation orchestrates pipelines, not database operations |
| Stream processing | Kafka Streams, Flink, fs2 | Constellation handles request/response pipelines, not unbounded streams |
| ETL at scale | Spark, dbt | Constellation runs in a single JVM; it's not a distributed compute framework |
| Simple request handlers | Direct Scala code | If your endpoint calls one service and returns the result, Constellation adds overhead without benefit |

Key Differentiators

| Feature | Constellation | Manual Scala | Workflow engines (Temporal, Airflow) |
| --- | --- | --- | --- |
| Compile-time type checking | Yes — field-level | Scala compiler only | No — YAML/JSON configs |
| Declarative DSL | .cst files | Scala code | YAML/DAG definitions |
| Auto-parallelization | From DAG structure | Manual parMapN | Task-level scheduling |
| Hot reload | Yes — no restart | Requires recompile | Varies |
| Built-in resilience | Per-call options | Manual retry/circuit-breaker wrappers | Framework-specific |
| Latency | Sub-100ms compile + execute | Native | Seconds to minutes (scheduler overhead) |
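
To make the resilience row concrete, here is a hypothetical .cst fragment in the style of the earlier example. The with-clause syntax, option spellings, and module names are assumptions based on the option names above, not confirmed Constellation syntax:

```
in userId: String

# Per-call options replace hand-written retry/circuit-breaker wrappers
profile  = fetchProfile(userId) with retry(3), timeout(500ms)
activity = fetchActivity(userId) with cache(60s), fallback({ activityScore: 0 })

result = (profile + activity)[userName, activityScore]
```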

Next Steps

  • Introduction — overview of the framework and its components
  • Tutorial — build your first pipeline step by step
  • Cookbook — 25 example pipelines for common patterns