Execution Modes
This guide explains the different ways to execute Constellation pipelines, comparing hot vs cold execution patterns and HTTP vs embedded API deployment models. Understanding these modes helps you make informed architectural decisions based on your performance, scalability, and deployment requirements.
Quick Reference
| Mode | What Gets Reused | Typical Latency | Use Case |
|---|---|---|---|
| Cold Execution | Nothing | Parse + Compile + Execute | One-off scripts, CI/CD pipelines |
| Hot Execution (Compile-Once) | Compiled pipeline | Execute only (~1-10ms overhead) | Production workloads, low-latency APIs |
| Hot Execution (Cached) | Compiled pipeline + cache hits | <5ms for unchanged source | Interactive tools, LSP, high-throughput services |
| Deployment | Architecture | Overhead | Use Case |
|---|---|---|---|
| Embedded API | In-process library calls | ~0.15ms/node | Batch jobs, embedded ML pipelines, tight control loops |
| HTTP API | Network + JSON serialization | +2-10ms per request | Multi-language clients, microservices, dashboard UI |
Part 1: Hot vs Cold Execution
Cold Execution
Definition: Every execution starts from source text. The pipeline is compiled from scratch each time.
Flow:
Source Text → Parse → TypeCheck → IR Generation → Optimize → DAG Compile → Execute → Results
(5-50ms) (5-50ms) (5-20ms) (5-20ms) (10-50ms) (1-10000ms)
Example:
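import cats.effect.IO
import cats.effect.unsafe.implicits.global // IORuntime required by unsafeRunSync()
import cats.implicits._ // traverse, leftMap
import io.constellation.impl.ConstellationImpl
import io.constellation.stdlib.StdLib
import io.constellation.lang.LangCompiler
import io.constellation.TypeSystem.CValue._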
val source = """
in text: String
result = Uppercase(text)
out result
"""
// Full cold execution on every call
def coldExecute(input: String): IO[String] =
for {
constellation <- ConstellationImpl.init
_ <- StdLib.allModules.values.toList.traverse(constellation.setModule)
compiler = StdLib.compiler
// 1. Compile from source (cold)
compiled <- IO.fromEither(
compiler.compile(source, "my-pipeline")
.leftMap(errs => new RuntimeException(errs.map(_.message).mkString("\n")))
)
// 2. Execute
sig <- constellation.run(compiled.pipeline, Map("text" -> VString(input)))
// 3. Extract result
result <- IO.fromOption(sig.outputs.get("result").collect { case VString(s) => s })(
new NoSuchElementException("Missing output")
)
} yield result
// Every call pays full compilation cost
coldExecute("hello").unsafeRunSync() // ~100ms first time
coldExecute("world").unsafeRunSync() // ~100ms again (no reuse)
When to Use Cold Execution:
- One-off scripts: Shell scripts, CI/CD pipelines, cron jobs
- Dynamic code generation: When pipeline source changes every execution
- Prototyping: Fast iteration during development (though caching is faster)
- Low execution frequency: When runs are minutes or hours apart, so per-run compilation cost is negligible
Performance Characteristics:
| Program Size | Parse | TypeCheck | Compile | Total Overhead |
|---|---|---|---|---|
| Small (10 nodes) | <5ms | <5ms | <10ms | ~20-30ms |
| Medium (50 nodes) | <50ms | <50ms | <50ms | ~100-150ms |
| Large (100 nodes) | <200ms | <200ms | <150ms | ~300-500ms |
For reference, these are target values from dev/benchmarks/performance-benchmarks.md.
Limitations:
- High latency unsuitable for request/response APIs
- Redundant work if source text doesn't change
- Cannot leverage compilation caching
- Module registry must be rebuilt each time
Hot Execution (Compile-Once)
Definition: Compile once, execute many times. The compiled LoadedPipeline is stored and reused.
Flow (First Execution):
Source Text → Parse → TypeCheck → IR Gen → Optimize → DAG Compile → Store in PipelineStore
(5-50ms) (5-50ms) (5-20ms) (5-20ms) (10-50ms)
↓
(returns hash: abc123...)
Flow (Subsequent Executions):
Pipeline Reference (hash or alias) → PipelineStore Lookup → Execute → Results
(<1ms) (1-10000ms)
Example:
import cats.effect.IO
import cats.implicits._ // traverse, leftMap
import io.constellation.impl.ConstellationImpl
import io.constellation.stdlib.StdLib
import io.constellation.lang.LangCompiler
import io.constellation.TypeSystem.CValue._
import io.constellation.ExecutionOptions
val source = """
in text: String
result = Uppercase(text)
out result
"""
// One-time setup (application startup)
val setup: IO[(io.constellation.Constellation, String)] =
for {
constellation <- ConstellationImpl.init
_ <- StdLib.allModules.values.toList.traverse(constellation.setModule)
compiler = StdLib.compiler
// Compile once
compiled <- IO.fromEither(
compiler.compile(source, "uppercase-pipeline")
.leftMap(errs => new RuntimeException(errs.map(_.message).mkString("\n")))
)
// Store the pipeline image (content-addressed storage)
hash <- constellation.PipelineStore.store(compiled.pipeline.image)
// Create human-readable alias
_ <- constellation.PipelineStore.alias("uppercase", hash)
} yield (constellation, hash)
// Production usage (executes many times)
def hotExecute(
constellation: io.constellation.Constellation,
pipelineRef: String,
input: String
): IO[String] =
for {
// Execute by reference (hash or alias) - no compilation
sig <- constellation.run(
pipelineRef, // "uppercase" or hash
Map("text" -> VString(input)),
ExecutionOptions()
)
result <- IO.fromOption(sig.outputs.get("result").collect { case VString(s) => s })(
new NoSuchElementException("Missing output")
)
} yield result
// Usage pattern
for {
(constellation, hash) <- setup // ~100ms (once at startup)
// All executions reuse compiled pipeline
r1 <- hotExecute(constellation, "uppercase", "hello") // ~1-10ms overhead
r2 <- hotExecute(constellation, "uppercase", "world") // ~1-10ms overhead
r3 <- hotExecute(constellation, hash, "test") // ~1-10ms overhead (by hash)
} yield ()
PipelineStore Operations:
// Each operation returns IO, so they are sequenced in a for-comprehension.
// `pipelineImage` and `inputs` are supplied by the surrounding code.
for {
  // Store pipeline by structural hash (content-addressed)
  hash <- constellation.PipelineStore.store(pipelineImage)
  // hash: "a3f7c2e8b..." (SHA-256 of DAG structure)
  // Create human-readable alias
  _ <- constellation.PipelineStore.alias("my-pipeline", hash)
  // Resolve alias to hash (Option[String])
  resolvedHash <- constellation.PipelineStore.resolve("my-pipeline")
  // Retrieve by hash (Option[PipelineImage])
  byHash <- constellation.PipelineStore.get(hash)
  // Retrieve by alias (Option[PipelineImage])
  byName <- constellation.PipelineStore.getByName("my-pipeline")
  // Execute by reference (alias or hash)
  _ <- constellation.run("my-pipeline", inputs, ExecutionOptions())
  _ <- constellation.run(s"sha256:$hash", inputs, ExecutionOptions())
} yield ()
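To make the content-addressing idea concrete, here is a minimal, language-neutral sketch in Python. It is a toy model and not the Constellation implementation: `ToyPipelineStore`, the canonical-JSON hashing, and the shape of the image dictionary are all assumptions made for illustration.

```python
import hashlib
import json
from typing import Dict, Optional


class ToyPipelineStore:
    """Hypothetical in-memory stand-in for a content-addressed store with aliases."""

    def __init__(self) -> None:
        self._images: Dict[str, dict] = {}  # structural hash -> image
        self._aliases: Dict[str, str] = {}  # alias -> structural hash

    def store(self, image: dict) -> str:
        # Canonical JSON so that structurally equal images hash identically.
        canonical = json.dumps(image, sort_keys=True, separators=(",", ":"))
        digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
        self._images[digest] = image
        return digest

    def alias(self, name: str, digest: str) -> None:
        if digest not in self._images:
            raise KeyError(f"unknown hash: {digest}")
        self._aliases[name] = digest

    def resolve(self, name: str) -> Optional[str]:
        return self._aliases.get(name)

    def get(self, ref: str) -> Optional[dict]:
        # Accept either an alias or a raw hash.
        return self._images.get(self._aliases.get(ref, ref))


store = ToyPipelineStore()
image = {"nodes": ["Uppercase"], "edges": [["text", "Uppercase"]]}
h1 = store.store(image)
h2 = store.store(dict(image))  # same structure, so same hash and a single entry
store.alias("uppercase", h1)
```

Because the key is derived from the content, storing the same structure twice is idempotent, which is what makes deduplication and alias re-pointing cheap.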
When to Use Hot Execution:
- Production APIs: Request/response services where latency matters
- High-throughput batch processing: Millions of executions with same pipeline
- Long-running services: Web servers, microservices, daemon processes
- Stable pipeline definitions: When source text rarely changes
Performance Characteristics:
| Operation | Latency | Notes |
|---|---|---|
| PipelineStore lookup (in-memory) | <1ms | Hash-based retrieval |
| Pipeline rehydration | <1ms | Reconstruct LoadedPipeline from image |
| DAG execution overhead | ~0.15ms/node | Pure orchestration cost (no module time) |
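For a 10-node pipeline where each module takes 5ms and modules run sequentially, assuming ~150ms of compilation overhead (the small-program target above is lower, ~20-30ms) and ignoring the ~1.5ms of orchestration:
- Cold: ~150ms (compile) + 50ms (execute) = ~200ms
- Hot: 50ms (execute only), roughly 4x faster per execution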
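The per-execution comparison above can be extended to a whole run. The sketch below is a back-of-the-envelope model under stated assumptions (sequential executions, a fixed ~150ms compile cost, 50ms of module time); the function name `total_ms` is hypothetical and the numbers are the illustrative ones from this section, not measurements.

```python
def total_ms(runs: int, compile_ms: float, execute_ms: float, hot: bool) -> float:
    """Total wall time for `runs` sequential executions of one pipeline."""
    compiles = 1 if hot else runs  # hot compiles once; cold compiles on every run
    return compiles * compile_ms + runs * execute_ms


# Illustrative figures from this section: ~150ms compile, 10 modules x 5ms = 50ms execute.
COMPILE_MS, EXECUTE_MS = 150.0, 50.0

for runs in (1, 10, 1000):
    cold = total_ms(runs, COMPILE_MS, EXECUTE_MS, hot=False)
    hot = total_ms(runs, COMPILE_MS, EXECUTE_MS, hot=True)
    print(f"{runs:>5} runs: cold={cold:.0f}ms hot={hot:.0f}ms speedup={cold / hot:.2f}x")
```

With a single run the two modes cost the same, since hot execution still pays for its one compilation. The speedup only approaches 4x as the number of runs grows.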
Key Benefits:
- Eliminates redundant parsing, type-checking, and compilation
- Predictable latency (no compilation jitter)
- Enables content-addressed caching (structural hash deduplication)
- PipelineStore can be backed by persistent storage (survives restarts)
Hot Execution with Caching
Definition: Combines compile-once with automatic cache-hit detection. If source text hasn't changed, the compiler returns a cached LoadedPipeline immediately.
Flow (Cache Hit):
Source Text → Syntactic Hash → PipelineStore Syntactic Index Lookup → Return Cached Pipeline
(instant) (<1ms) (<1ms) (total: <5ms)
Flow (Cache Miss):
Source Text → Parse → TypeCheck → IR Gen → Optimize → DAG Compile → Store & Index → Return Pipeline
(5-50ms) (5-50ms) (5-20ms) (5-20ms) (10-50ms) (<1ms)
How It Works:
The PipelineStore maintains a syntactic index that maps (syntacticHash, registryHash) → structuralHash:
- Syntactic Hash: SHA-256 of the source text (unchanged source = same hash)
- Registry Hash: Hash of the function registry (detects when available modules change)
- Structural Hash: SHA-256 of the compiled DAG structure
When you compile source:
- Compiler computes syntactic hash immediately (no parsing required)
- Checks syntactic index for (syntacticHash, registryHash) pair
- If found: returns cached LoadedPipeline (cache hit)
- If not found: compiles, stores result, indexes (syntactic, registry) → structural
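The lookup steps above can be sketched as a small, language-neutral Python model. It is an illustration only: `ToyCachingCompiler` and `fake_compile` are hypothetical names, and how the real registry hash is derived is not specified here, so hashing the sorted module names is an assumption.

```python
import hashlib
from typing import Callable, Dict, Tuple


def sha256_hex(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


class ToyCachingCompiler:
    """Caches compile results keyed by (syntactic hash, registry hash)."""

    def __init__(self, compile_fn: Callable[[str], str], registry: Tuple[str, ...]) -> None:
        self._compile_fn = compile_fn
        # Assumption: the registry hash is derived from the sorted module names.
        self.registry_hash = sha256_hex("\n".join(sorted(registry)))
        self._index: Dict[Tuple[str, str], str] = {}  # (syntactic, registry) -> structural
        self.hits = 0
        self.misses = 0

    def compile(self, source: str) -> str:
        key = (sha256_hex(source), self.registry_hash)  # computed without parsing
        if key in self._index:
            self.hits += 1
            return self._index[key]
        self.misses += 1
        structural = self._compile_fn(source)  # the expensive path
        self._index[key] = structural
        return structural


def fake_compile(source: str) -> str:
    # Stand-in for parse/typecheck/compile: hash the whitespace-normalized source.
    return sha256_hex(" ".join(source.split()))


compiler = ToyCachingCompiler(fake_compile, registry=("Uppercase", "Lowercase"))
src = "in text: String\nresult = Uppercase(text)\nout result"
first = compiler.compile(src)                            # miss: first time seen
second = compiler.compile(src)                           # hit: identical source
third = compiler.compile(src.replace("Upper", "Lower"))  # miss: source changed

# A different module registry yields a different registry hash, so old keys never match.
other = ToyCachingCompiler(fake_compile, registry=("Uppercase",))
```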
Example:
import cats.effect.IO
import cats.implicits._ // traverse
import io.constellation.lang.{CachingLangCompiler, LangCompiler}
import io.constellation.impl.ConstellationImpl
import io.constellation.stdlib.StdLib
// Setup with caching compiler
val setup: IO[(io.constellation.Constellation, CachingLangCompiler)] =
for {
constellation <- ConstellationImpl.init
_ <- StdLib.allModules.values.toList.traverse(constellation.setModule)
baseCompiler = StdLib.compiler
// Wrap with caching layer
cachingCompiler = CachingLangCompiler.withDefaults(baseCompiler)
} yield (constellation, cachingCompiler)
val source = """
in text: String
result = Uppercase(text)
out result
"""
// Usage pattern demonstrating cache hits
for {
(constellation, compiler) <- setup
// First compilation (cold - cache miss)
compiled1 <- compiler.compileIO(source, "uppercase")
// Took ~100ms (full compilation pipeline)
// Store the compiled pipeline
hash <- constellation.PipelineStore.store(compiled1.pipeline.image)
// Second compilation of identical source (cache hit)
compiled2 <- compiler.compileIO(source, "uppercase")
// Took <5ms (cache hit - no parsing/typechecking/compilation)
// Third compilation with modified source (cache miss)
modifiedSource = """
in text: String
result = Lowercase(text) # Changed: Uppercase → Lowercase
out result
"""
compiled3 <- compiler.compileIO(modifiedSource, "lowercase")
// Took ~100ms (cache miss - different syntactic hash)
} yield ()
Cache Performance:
| Scenario | Latency | Speedup |
|---|---|---|
| Cold compile (medium program) | ~100ms | Baseline |
| Warm cache hit | <5ms | 20x faster |
| Cache hit after server restart (persistent store) | <5ms | 20x faster |
| Source modified (cache miss) | ~100ms | No speedup (expected) |
When to Use Cached Execution:
- LSP (Language Server Protocol): Every keystroke triggers recompilation
- Interactive dashboards: Users edit and re-run pipelines frequently
- Development environments: Rapid edit-test cycles
- Multi-tenant platforms: Many users running identical pipelines
Example: LSP Autocomplete Performance
// User types: "result = Upper|" (cursor at |)
// LSP needs to compile to provide autocomplete suggestions
// Without caching:
// Every keystroke = full recompilation = 100ms per keystroke = unusable
// With caching:
// Syntactic hash changes only when source changes, so repeated requests on unchanged source are cache hits
// Cache hit = <5ms compile step, which keeps the overall autocomplete response near 50ms = smooth UX
Cache Invalidation:
The cache automatically invalidates when:
- Source text changes: Different syntactic hash triggers cache miss
- Function registry changes: Adding/removing modules changes registry hash
- Module signatures change: Version bump in module metadata changes registry hash
No manual invalidation is required. Every index entry is keyed on both the source hash and the registry hash, so a change to either one simply misses the old entry.
Monitoring Cache Effectiveness:
# Check cache hit rate via HTTP metrics endpoint
curl http://localhost:8080/metrics | jq .cache
# Example output:
{
"hits": 847,
"misses": 23,
"hitRate": 0.973,
"size": 45
}
Target hit rate: >80% for production workloads with stable pipelines.
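As a quick check of how the hit rate relates to the raw counters, here is a small sketch. The helper name `hit_rate` is hypothetical, and the counter values are the ones from the example output above.

```python
def hit_rate(hits: int, misses: int) -> float:
    """Fraction of compile requests served from the cache."""
    total = hits + misses
    return hits / total if total else 0.0


TARGET = 0.80  # production target stated in this guide

metrics = {"hits": 847, "misses": 23}  # counters from the example output
rate = hit_rate(metrics["hits"], metrics["misses"])
meets_target = rate > TARGET
print(f"hit rate: {rate:.1%}, meets target: {meets_target}")
```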
Execution Mode Comparison Table
| Aspect | Cold Execution | Hot Execution (Compile-Once) | Hot Execution (Cached) |
|---|---|---|---|
| Compilation | Every execution | Once at startup | On cache miss only |
| Typical Overhead | 50-500ms | 1-10ms | <5ms (hit) / 50-500ms (miss) |
| Memory Usage | Low (no persistent state) | Medium (PipelineStore) | High (PipelineStore + cache) |
| Startup Time | Instant | Medium (pre-compile) | High (warm cache) |
| Runtime Latency | High (compile + execute) | Low (execute only) | Low (cache lookup + execute) |
| Source Changes | Always fresh | Requires recompile | Auto-detects via hash |
| Best For | Scripts, CI/CD | Production APIs | LSP, dashboards, dev tools |
Decision Matrix: Which Execution Mode?
Choose Cold Execution if:
- ✅ Execution frequency is low (minutes to hours between runs)
- ✅ Source text changes every execution (dynamic generation)
- ✅ Minimal memory footprint is required
- ✅ Startup time matters more than runtime latency
- ❌ You need sub-50ms response times
- ❌ You execute the same pipeline thousands of times
Choose Hot Execution (Compile-Once) if:
- ✅ Same pipeline executes many times (hundreds to millions)
- ✅ Pipeline definition is stable (source rarely changes)
- ✅ Latency matters (APIs, request/response services)
- ✅ You can afford startup cost (pre-compilation)
- ❌ Source text changes frequently (then use caching instead)
- ❌ You need to support interactive editing (then use caching)
Choose Hot Execution (Cached) if:
- ✅ Source text changes frequently but repeats (interactive editing)
- ✅ Multiple users may run identical pipelines (multi-tenant)
- ✅ You need to support LSP or live tooling
- ✅ Sub-5ms compilation latency is critical
- ✅ Memory for cache is available
- ❌ Every execution uses unique source text (no cache benefit)
- ❌ Minimal memory footprint is required
Part 2: HTTP API vs Embedded API
Embedded API (In-Process)
Definition: Constellation runs as a library within your application. All interactions are direct Scala method calls.
Architecture:
┌─────────────────────────────────────┐
│ Your Application (JVM Process) │
│ ┌────────────────────────────────┐ │
│ │ Application Code │ │
│ │ ↓ │ │
│ │ constellation.run(...) │ │ Direct method calls
│ │ ↓ │ │ No network overhead
│ │ Constellation Engine │ │ No serialization
│ │ ↓ │ │
│ │ Module Execution │ │
│ │ ↓ │ │
│ │ Results (CValue) │ │
│ └────────────────────────────────┘ │
└─────────────────────────────────────┘
Example:
import cats.effect._
import cats.implicits._
import io.constellation._
import io.constellation.TypeSystem._
import io.constellation.TypeSystem.CValue._
import io.constellation.impl.ConstellationImpl
import io.constellation.stdlib.StdLib
object EmbeddedExample extends IOApp.Simple {
def run: IO[Unit] =
for {
// 1. Initialize Constellation instance (embedded in your process)
constellation <- ConstellationImpl.init
// 2. Register modules
_ <- StdLib.allModules.values.toList.traverse(constellation.setModule)
// 3. Compile pipeline
compiler = StdLib.compiler
compiled <- IO.fromEither(
compiler.compile("""
in numbers: List[Int]
sum = Sum(numbers)
avg = Average(numbers)
out sum
out avg
""", "stats-pipeline").leftMap(errs =>
new RuntimeException(errs.map(_.message).mkString("\n"))
)
)
// 4. Execute directly (no HTTP, no serialization)
sig <- constellation.run(
compiled.pipeline,
Map("numbers" -> VList(List(VLong(10), VLong(20), VLong(30))))
)
// 5. Access results as typed Scala values
sum <- IO.fromOption(sig.outputs.get("sum").collect { case VLong(n) => n })(
new NoSuchElementException("Missing sum")
)
avg <- IO.fromOption(sig.outputs.get("avg").collect { case VFloat(d) => d })(
new NoSuchElementException("Missing avg")
)
_ <- IO.println(s"Sum: $sum, Average: $avg")
} yield ()
}
Performance Characteristics:
| Operation | Latency | Notes |
|---|---|---|
| Method call overhead | <0.01ms | Direct JVM method dispatch |
| Input conversion | <0.1ms | Scala values → CValue (zero-copy for most types) |
| DAG execution overhead | ~0.15ms/node | Pure orchestration (measured in benchmarks) |
| Output extraction | <0.1ms | CValue → Scala values |
For a 10-node pipeline where each module takes 5ms:
- Execution: 10 nodes × 5ms/node = 50ms (module logic)
- Orchestration: 10 nodes × 0.15ms/node = 1.5ms (DAG overhead)
- Total: 51.5ms, of which only 1.5ms is engine overhead
Advantages:
- ✅ Lowest latency: No network round-trip, no JSON serialization
- ✅ Type safety: Compile-time guarantees for inputs/outputs
- ✅ Direct control: Fine-grained configuration (scheduler, backends, lifecycle)
- ✅ Efficient resource usage: Shared memory, no IPC overhead
- ✅ Easier debugging: Single process, stack traces work normally
Trade-offs:
- ❌ JVM only: Cannot call from Python, JavaScript, or other languages
- ❌ Same process: Crashes in modules crash your application
- ❌ Deployment coupling: Must redeploy application to update Constellation version
- ❌ No cross-network execution: Cannot distribute across machines
When to Use Embedded API:
- Batch processing jobs (ETL pipelines, data transformations)
- ML inference servers (where latency is critical)
- Embedded systems (IoT devices running JVM)
- Tight control loops (real-time systems, robotics)
- Single-tenant applications (desktop apps, CLI tools)
Production Configuration:
import io.constellation.impl.ConstellationImpl
import io.constellation.execution.{GlobalScheduler, ConstellationLifecycle}
import io.constellation.spi.ConstellationBackends
import scala.concurrent.duration._
// Full production setup
GlobalScheduler.bounded(
maxConcurrency = 16,
starvationTimeout = 30.seconds
).use { scheduler =>
ConstellationLifecycle.create.flatMap { lifecycle =>
val constellation = ConstellationImpl.builder()
.withScheduler(scheduler)
.withBackends(ConstellationBackends(
metrics = myPrometheusMetrics,
tracer = myOtelTracer,
listener = myKafkaListener,
cache = Some(myRedisCache)
))
.withDefaultTimeout(60.seconds)
.withLifecycle(lifecycle)
.build()
// ... use constellation ...
// Graceful shutdown
lifecycle.shutdown(drainTimeout = 30.seconds)
}
}
HTTP API (Out-of-Process)
Definition: Constellation runs as a standalone HTTP server. Clients interact via REST API over the network.
Architecture:
┌──────────────────┐ ┌──────────────────────────┐
│ Client App │ HTTP POST │ Constellation Server │
│ (Any Language) │ /compile │ (JVM Process) │
│ ┌─────────────┐ │ ──────────────────>│ ┌────────────────────┐ │
│ │ Python │ │ │ │ HTTP Routes │ │
│ │ JavaScript │ │ JSON Response │ │ ↓ │ │
│ │ Go │ │ <──────────────────│ │ Constellation API │ │
│ │ Ruby │ │ │ │ ↓ │ │
│ └─────────────┘ │ HTTP POST │ │ Module Execution │ │
└──────────────────┘ /execute │ │ ↓ │ │
──────────────────>│ │ JSON Response │ │
│ └────────────────────┘ │
└──────────────────────────┘
Example (Server):
import cats.effect._
import cats.implicits._ // traverse
import io.constellation.impl.ConstellationImpl
import io.constellation.stdlib.StdLib
import io.constellation.http._
object HttpServerExample extends IOApp.Simple {
def run: IO[Unit] =
for {
constellation <- ConstellationImpl.init
_ <- StdLib.allModules.values.toList.traverse(constellation.setModule)
compiler = StdLib.compiler
// Start HTTP server
_ <- ConstellationServer
.builder(constellation, compiler)
.withHost("0.0.0.0")
.withPort(8080)
.withDashboard // Optional: web UI for interactive testing
.run
} yield ()
}
Example (Client - curl):
# Compile a pipeline
curl -X POST http://localhost:8080/compile \
-H "Content-Type: application/json" \
-d '{
"name": "stats-pipeline",
"source": "in numbers: List[Int]\nsum = Sum(numbers)\nout sum"
}'
# Response:
# {
# "success": true,
# "structuralHash": "a3f7c2e8b...",
# "syntacticHash": "d9e1f4a2c...",
# "name": "stats-pipeline"
# }
# Execute by name
curl -X POST http://localhost:8080/execute \
-H "Content-Type: application/json" \
-d '{
"ref": "stats-pipeline",
"inputs": {
"numbers": [10, 20, 30]
}
}'
# Response:
# {
# "success": true,
# "outputs": {
# "sum": 60
# },
# "status": "completed",
# "executionId": "f8a2c3d4-..."
# }
Example (Client - Python):
import requests
# Compile pipeline
compile_response = requests.post('http://localhost:8080/compile', json={
'name': 'stats-pipeline',
'source': '''
in numbers: List[Int]
sum = Sum(numbers)
avg = Average(numbers)
out sum
out avg
'''
})
pipeline_hash = compile_response.json()['structuralHash']
# Execute pipeline
execute_response = requests.post('http://localhost:8080/execute', json={
'ref': 'stats-pipeline', # or use pipeline_hash
'inputs': {
'numbers': [10, 20, 30]
}
})
results = execute_response.json()['outputs']
print(f"Sum: {results['sum']}, Average: {results['avg']}")
Example (Client - JavaScript/TypeScript):
// Compile pipeline
const compileResponse = await fetch('http://localhost:8080/compile', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
name: 'stats-pipeline',
source: `
in numbers: List[Int]
sum = Sum(numbers)
out sum
`
})
});
const { structuralHash } = await compileResponse.json();
// Execute pipeline
const executeResponse = await fetch('http://localhost:8080/execute', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
ref: 'stats-pipeline',
inputs: { numbers: [10, 20, 30] }
})
});
const { outputs } = await executeResponse.json();
console.log(`Sum: ${outputs.sum}`);
Performance Characteristics:
| Operation | Latency | Notes |
|---|---|---|
| Network round-trip (same datacenter) | 0.5-2ms | Varies by network topology |
| Network round-trip (cross-region) | 10-50ms | Significant overhead |
| JSON serialization (request) | 0.1-1ms | Depends on input size |
| JSON deserialization (request) | 0.1-1ms | Circe parsing |
| DAG execution | ~0.15ms/node + module time | Same as embedded |
| JSON serialization (response) | 0.1-1ms | Output encoding |
| JSON deserialization (response) | 0.1-1ms | Client parsing |
For a 10-node pipeline where each module takes 5ms:
- Module execution: 50ms (same as embedded)
- Orchestration: 1.5ms (same as embedded)
- HTTP overhead: 2-10ms (network + serialization)
- Total: ~53.5-61.5ms within one datacenter (a cross-region round-trip adds 10-50ms)
Typical overhead: +2-10ms per request compared to embedded API.
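The two latency budgets can be reproduced with a few lines of arithmetic. This is a sketch under the same assumptions as the worked examples (modules run sequentially, 0.15ms of orchestration per node, 2-10ms of HTTP overhead); the function names are hypothetical.

```python
from typing import Tuple

ORCHESTRATION_MS_PER_NODE = 0.15  # DAG overhead per node, from the tables above
HTTP_OVERHEAD_MS = (2.0, 10.0)    # network + JSON, same-datacenter range


def embedded_ms(nodes: int, module_ms: float) -> float:
    """Sequential execution: module time plus orchestration overhead."""
    return nodes * module_ms + nodes * ORCHESTRATION_MS_PER_NODE


def http_ms(nodes: int, module_ms: float) -> Tuple[float, float]:
    """Embedded cost plus the low and high ends of the HTTP overhead range."""
    base = embedded_ms(nodes, module_ms)
    low, high = HTTP_OVERHEAD_MS
    return base + low, base + high


embedded = embedded_ms(nodes=10, module_ms=5.0)
http_low, http_high = http_ms(nodes=10, module_ms=5.0)
print(f"embedded: {embedded:.1f}ms, http: {http_low:.1f}-{http_high:.1f}ms")
```

For pipelines dominated by module time, the HTTP overhead is a small fraction of the total. It matters most for short pipelines with fast modules.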
Advantages:
- ✅ Language-agnostic: Call from Python, JavaScript, Go, etc.
- ✅ Process isolation: Module crashes don't affect client
- ✅ Independent deployment: Update Constellation without redeploying clients
- ✅ Horizontal scaling: Run multiple servers behind a load balancer
- ✅ Built-in dashboard: Web UI for testing and visualization
Trade-offs:
- ❌ Higher latency: Network + serialization overhead
- ❌ JSON overhead: Large inputs/outputs slow requests down significantly
- ❌ No type safety: Clients pass JSON (runtime errors possible)
- ❌ More moving parts: Network failures, load balancers, authentication
When to Use HTTP API:
- Multi-language environments (polyglot microservices)
- Multi-tenant platforms (SaaS applications)
- Web dashboards (browser-based clients)
- External integrations (third-party services)
- Distributed systems (cross-network execution)
Production Configuration:
ConstellationServer
.builder(constellation, compiler)
.withPort(8080)
.withDashboard
// Security hardening
.withAuth(AuthConfig(apiKeys = Map(
"admin-key" -> ApiRole.Admin,
"app-key" -> ApiRole.Execute
)))
.withCors(CorsConfig(allowedOrigins = Set("https://app.example.com")))
.withRateLimit(RateLimitConfig(requestsPerMinute = 200, burst = 40))
// Health checks
.withHealthChecks(HealthCheckConfig(enableDetailEndpoint = true))
// Pipeline management
.withPipelineLoader(PipelineLoaderConfig(
directory = java.nio.file.Paths.get(".constellation-pipelines"),
filePattern = "*.cst"
))
.withPersistentPipelineStore(java.nio.file.Paths.get(".constellation-store"))
.run
Available Endpoints:
| Endpoint | Method | Purpose |
|---|---|---|
| /compile | POST | Compile constellation-lang source to pipeline |
| /execute | POST | Execute pipeline by reference (hash or alias) |
| /pipelines | GET | List all stored pipelines |
| /pipelines/:ref | GET | Get pipeline details |
| /pipelines/:ref | DELETE | Delete a pipeline |
| /modules | GET | List available modules |
| /health | GET | Health check (uptime, status) |
| /health/ready | GET | Readiness probe (Kubernetes) |
| /health/live | GET | Liveness probe (Kubernetes) |
| /metrics | GET | Prometheus-style metrics |
| /lsp | WebSocket | Language Server Protocol (LSP) for editors |
Authentication (opt-in):
# Set API keys via environment variables
CONSTELLATION_API_KEYS="admin:Admin,app:Execute"
# Client must include X-API-Key header
curl -X POST http://localhost:8080/execute \
-H "X-API-Key: admin" \
-H "Content-Type: application/json" \
-d '{ "ref": "my-pipeline", "inputs": {} }'
Rate Limiting (opt-in):
# Configure rate limits
CONSTELLATION_RATE_LIMIT_RPM=100 # 100 requests per minute per IP
CONSTELLATION_RATE_LIMIT_BURST=20 # Allow bursts of 20
# Exceeding limits returns HTTP 429 Too Many Requests
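To show how a requests-per-minute limit and a burst allowance interact, here is a token-bucket sketch. The algorithm the server actually uses is not stated in this guide, so the token bucket is an assumption chosen for illustration, and `TokenBucket` and `status_for` are hypothetical names.

```python
class TokenBucket:
    """Token bucket: refills at a steady rate, holds at most `burst` tokens."""

    def __init__(self, requests_per_minute: int, burst: int) -> None:
        self.rate_per_sec = requests_per_minute / 60.0
        self.capacity = float(burst)
        self.tokens = float(burst)  # start full, so an initial burst is allowed
        self.last = 0.0

    def allow(self, now: float) -> bool:
        # Refill according to elapsed time, capped at the burst capacity.
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate_per_sec)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


def status_for(bucket: TokenBucket, now: float) -> int:
    """Map the bucket decision to an HTTP status code."""
    return 200 if bucket.allow(now) else 429


bucket = TokenBucket(requests_per_minute=100, burst=20)
# 25 requests arriving at the same instant: the burst allowance covers the first 20.
burst_statuses = [status_for(bucket, now=0.0) for _ in range(25)]
# Six seconds later about ten tokens have refilled, so requests succeed again.
later_status = status_for(bucket, now=6.0)
```

Clients that receive a 429 would typically back off and retry after a short delay.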
Deployment Comparison Table
| Aspect | Embedded API | HTTP API |
|---|---|---|
| Latency | ~0.15ms/node overhead | +2-10ms per request |
| Languages | Scala/JVM only | Any (Python, JS, Go, Ruby, ...) |
| Network | No network (in-process) | HTTP/1.1 or HTTP/2 |
| Serialization | None (direct CValue) | JSON (Circe encoder/decoder) |
| Type Safety | Compile-time (Scala types) | Runtime (JSON validation) |
| Isolation | Same process (shared fate) | Separate process (fault isolation) |
| Scalability | Vertical (scale JVM) | Horizontal (multiple servers) |
| Deployment | Bundled with app | Independent service |
| Debugging | Easy (single process) | Harder (distributed tracing needed) |
| Best For | Latency-critical, JVM-only | Multi-language, distributed systems |
Decision Matrix: HTTP vs Embedded?
Choose Embedded API if:
- ✅ Your application is written in Scala or another JVM language
- ✅ Sub-millisecond latency is critical (real-time systems)
- ✅ You process millions of executions per second (throughput-bound)
- ✅ Deployment coupling is acceptable (single binary)
- ✅ You need fine-grained control over scheduler/backends/lifecycle
- ❌ You need to call from non-JVM languages
- ❌ You want independent deployment of Constellation and clients
Choose HTTP API if:
- ✅ Clients use multiple languages (Python, JavaScript, Go, etc.)
- ✅ You need process isolation (module crashes shouldn't affect clients)
- ✅ Horizontal scaling is required (load balancer + multiple servers)
- ✅ You want a web dashboard for interactive testing
- ✅ You have a microservices architecture
- ❌ Network overhead is unacceptable (<1ms latency required)
- ❌ Large payloads make JSON serialization prohibitively slow
Part 3: Combining Execution Modes
Hot Execution + Embedded API (Best Performance)
Use Case: High-throughput batch processing, ML inference servers.
Pattern:
// Startup: compile once (assumes `import fs2.Stream` plus the imports from the earlier examples)
for {
constellation <- ConstellationImpl.init
_ <- StdLib.allModules.values.toList.traverse(constellation.setModule)
compiler = StdLib.compiler
compiled <- compiler.compileIO(source, "my-pipeline")
hash <- constellation.PipelineStore.store(compiled.pipeline.image)
_ <- constellation.PipelineStore.alias("my-pipeline", hash)
// Runtime: execute millions of times with zero compilation overhead
_ <- Stream.iterate(0)(_ + 1) // Infinite stream
.evalMap { i =>
constellation.run("my-pipeline", inputs(i), ExecutionOptions())
}
.compile
.drain
} yield ()
Performance: ~0.15ms/node overhead + module execution time. No compilation, no network, no serialization.
Hot Execution + HTTP API (Best Flexibility)
Use Case: Multi-tenant SaaS platforms, web applications.
Pattern:
// Server: pre-compile common pipelines at startup
for {
constellation <- ConstellationImpl.init
_ <- StdLib.allModules.values.toList.traverse(constellation.setModule)
compiler = StdLib.compiler
// Pre-compile top 10 most-used pipelines
_ <- commonPipelines.traverse { case (name, source) =>
compiler.compileIO(source, name).flatMap { compiled =>
constellation.PipelineStore.store(compiled.pipeline.image).flatMap { hash =>
constellation.PipelineStore.alias(name, hash)
}
}
}
// Start HTTP server
_ <- ConstellationServer.builder(constellation, compiler).run
} yield ()
Clients: Execute pre-compiled pipelines by name (no compilation overhead on each request).
Performance: ~2-10ms HTTP overhead + 0.15ms/node overhead + module execution time.
Cached Execution + HTTP API (Best for Interactive Tools)
Use Case: LSP servers, web dashboards, live editors.
Pattern:
// Server with caching compiler
for {
constellation <- ConstellationImpl.init
_ <- StdLib.allModules.values.toList.traverse(constellation.setModule)
baseCompiler = StdLib.compiler
// Enable caching
cachingCompiler = CachingLangCompiler.withDefaults(baseCompiler)
// Start HTTP server with caching compiler
_ <- ConstellationServer.builder(constellation, cachingCompiler).run
} yield ()
Client workflow:
- User edits source in web UI
- Client POSTs to /compile on every change
- Server returns cached result (<5ms) if source unchanged
- Client executes pipeline via /execute
Performance: <5ms compile (cache hit) + 2-10ms HTTP overhead + execution time.
Part 4: Advanced Patterns
Hybrid: Embedded Execution with HTTP Management API
Use Case: Local high-performance execution with remote pipeline management.
Architecture:
┌─────────────────────────────────────────┐
│ Application (JVM) │
│ ┌──────────────────────────────────┐ │
│ │ Embedded Constellation │ │ Hot execution
│ │ ↓ │ │ (in-process, fast)
│ │ constellation.run(...) │ │
│ └──────────────────────────────────┘ │
│ ↑ │
│ │ Sync pipelines │
└─────────┼──────────────────────────────┬─┘
│ │
│ HTTP (periodic sync) │
│ │
↓ ↓
┌──────────────────────────────────────────┐
│ Central Constellation Server │
│ (Pipeline registry + compilation) │
└──────────────────────────────────────────┘
Pattern:
// Application: embedded execution + periodic sync from HTTP registry
for {
localConstellation <- ConstellationImpl.init
_ <- StdLib.allModules.values.toList.traverse(localConstellation.setModule)
// Background fiber: sync pipelines from central server every 60 seconds
_ <- Stream.fixedRate[IO](60.seconds).evalMap { _ =>
for {
// Fetch latest pipelines from HTTP registry
response <- client.get("http://registry:8080/pipelines")
pipelines <- response.as[List[PipelineMetadata]]
// Update local PipelineStore
_ <- pipelines.traverse { pm =>
client.get(s"http://registry:8080/pipelines/${pm.hash}").flatMap { resp =>
resp.as[PipelineImage].flatMap { image =>
localConstellation.PipelineStore.store(image) *>
localConstellation.PipelineStore.alias(pm.name, pm.hash)
}
}
}
} yield ()
}.compile.drain.start
// Foreground: execute locally (hot, embedded)
_ <- localConstellation.run("my-pipeline", inputs, ExecutionOptions())
} yield ()
Benefits:
- ✅ Centralized pipeline management (single source of truth)
- ✅ Local execution performance (no network on hot path)
- ✅ Cache-friendly (pipelines update infrequently)
Persistent PipelineStore (Survive Restarts)
Use Case: Production servers where pipelines should survive process restarts.
Pattern:
ConstellationServer
.builder(constellation, compiler)
.withPersistentPipelineStore(java.nio.file.Paths.get(".constellation-store"))
.withPipelineLoader(PipelineLoaderConfig(
directory = java.nio.file.Paths.get(".constellation-pipelines"),
filePattern = "*.cst"
))
.run
How it works:
- Startup: Scans .constellation-pipelines/ for *.cst files
- Compilation: Compiles each file and stores in PipelineStore
- Persistence: Writes PipelineStore to .constellation-store/ (JSON files)
- Restart: Loads pipelines from .constellation-store/ (no re-compilation)
Directory structure:
.constellation-pipelines/
text-processing.cst
data-analysis.cst
ml-inference.cst
.constellation-store/
images/
a3f7c2e8b....json # PipelineImage (structural hash)
d9e1f4a2c....json
aliases.json # { "text-processing": "a3f7c2e8b...", ... }
syntactic_index.json # { ("syntax-hash", "registry-hash"): "structural-hash" }
Performance:
- Cold start (first boot): Compiles all .cst files (~100ms each)
- Warm start (restart): Loads from .constellation-store/ (<1ms per pipeline)
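The save-then-reload cycle behind a warm start can be sketched in a few lines. This toy model mirrors the directory layout shown above (an images/ directory plus aliases.json) but is not the actual on-disk format. The helpers `save_store` and `load_store` and the image contents are hypothetical.

```python
import hashlib
import json
import tempfile
from pathlib import Path
from typing import Dict, Tuple


def save_store(root: Path, images: Dict[str, dict], aliases: Dict[str, str]) -> None:
    """Write one JSON file per image, named by its hash, plus an alias map."""
    (root / "images").mkdir(parents=True, exist_ok=True)
    for digest, image in images.items():
        (root / "images" / f"{digest}.json").write_text(json.dumps(image, sort_keys=True))
    (root / "aliases.json").write_text(json.dumps(aliases, sort_keys=True))


def load_store(root: Path) -> Tuple[Dict[str, dict], Dict[str, str]]:
    """Read everything back; no compilation is involved."""
    images = {p.stem: json.loads(p.read_text()) for p in (root / "images").glob("*.json")}
    aliases = json.loads((root / "aliases.json").read_text())
    return images, aliases


image = {"nodes": ["Uppercase"]}
digest = hashlib.sha256(json.dumps(image, sort_keys=True).encode("utf-8")).hexdigest()

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / ".constellation-store"
    save_store(root, {digest: image}, {"text-processing": digest})
    # Simulated restart: rebuild the in-memory state purely from disk.
    loaded_images, loaded_aliases = load_store(root)
```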
Part 5: Performance Optimization Strategies
Strategy 1: Pre-Compilation at Build Time
Problem: First request after deployment is slow (cold start).
Solution: Compile pipelines during Docker build.
Dockerfile:
FROM eclipse-temurin:17-jre
# Copy application
COPY target/assembly.jar /app/constellation.jar
COPY pipelines/ /app/pipelines/
# Pre-compile pipelines during image build
RUN java -cp /app/constellation.jar io.constellation.cli.Main compile \
--input /app/pipelines/ \
--output /app/.constellation-store/
# Runtime: server loads pre-compiled pipelines
CMD ["java", "-jar", "/app/constellation.jar", "server", \
"--pipeline-store", "/app/.constellation-store"]
Result: No compilation on the request path after deployment. The server only loads the pre-compiled pipelines already baked into the image.
Strategy 2: Pipeline Versioning and Canary Deployments
Problem: Updating a pipeline in place exposes all traffic, including executions already in flight, to the new version at once.
Solution: Version pipelines and use canary routing.
Pattern:
// Deploy new version alongside old version
for {
// Old version (v1)
_ <- constellation.PipelineStore.alias("my-pipeline-v1", oldHash)
// New version (v2)
compiled <- compiler.compileIO(newSource, "my-pipeline")
newHash <- constellation.PipelineStore.store(compiled.pipeline.image)
_ <- constellation.PipelineStore.alias("my-pipeline-v2", newHash)
// Canary: 90% traffic to v1, 10% to v2
_ <- canaryRouter.setWeights("my-pipeline", Map(
"my-pipeline-v1" -> 0.9,
"my-pipeline-v2" -> 0.1
))
// Clients execute "my-pipeline" (router chooses version)
_ <- constellation.run("my-pipeline", inputs, ExecutionOptions())
} yield ()
Benefits:
- ✅ Gradual rollout (reduce blast radius)
- ✅ A/B testing (compare performance/accuracy)
- ✅ Instant rollback (flip weights to 100% old version)
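The pattern above calls a `canaryRouter` that this guide does not define. One way such a router could choose a version is weighted random selection, sketched below. `CanaryRouter`, `setWeights`, and `choose` are hypothetical names for illustration, not part of the Constellation API, and the class is not thread-safe as written.

```scala
import scala.util.Random

// Hypothetical weighted router; not part of the Constellation API.
// Not thread-safe: a real router would keep its state in a Ref or AtomicReference.
final class CanaryRouter(rng: Random = new Random()) {
  private var weights: Map[String, Vector[(String, Double)]] = Map.empty

  // Register version aliases and their traffic shares for a logical pipeline name.
  def setWeights(name: String, w: Map[String, Double]): Unit = {
    require(
      w.nonEmpty && w.values.forall(_ >= 0.0) && w.values.sum > 0.0,
      "weights must be non-negative with a positive total"
    )
    weights = weights.updated(name, w.toVector.sortBy(_._1))
  }

  // Pick a version alias in proportion to its weight.
  def choose(name: String): Option[String] =
    weights.get(name).map { versions =>
      val total = versions.map(_._2).sum
      val roll  = rng.nextDouble() * total
      // Running upper bounds of each version's slice of [0, total).
      val bounds = versions.scanLeft(0.0)(_ + _._2).tail
      versions
        .zip(bounds)
        .collectFirst { case ((alias, _), upper) if roll < upper => alias }
        .getOrElse(versions.last._1) // guards against floating-point rounding
    }
}
```

A caller would resolve the logical name to a version alias with `choose` and then execute that alias. Rolling back is a single `setWeights` call that gives the old version a weight of 1.0 and the new one 0.0.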
Strategy 3: Module Result Caching
Problem: Expensive modules (ML inference, DB queries) recompute for identical inputs.
Solution: Implement CacheBackend to cache module results.
Pattern:
import io.constellation.cache.CacheBackend
import scala.concurrent.duration._
// Implement CacheBackend (example: Redis)
val redisCache: CacheBackend = new RedisCacheBackend(redisClient)
// Configure Constellation with cache
val constellation = ConstellationImpl.builder()
.withBackends(ConstellationBackends(
cache = Some(redisCache)
))
.build()
// Modules automatically cache results by (moduleName, inputs) hash
// No code changes required in modules
Performance:
- Cache miss: Full module execution (e.g., 500ms for ML inference)
- Cache hit: <5ms (Redis lookup)
- Speedup: ~100x in this example (500ms down to <5ms); real gains depend on the hit rate and on how expensive the module is
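To make the hit/miss behaviour concrete, here is a minimal sketch of a result cache keyed by module name and inputs. It is an in-memory stand-in, not an implementation of Constellation's CacheBackend: `ResultCache`, `getOrCompute`, and `hitRate` are hypothetical names, and inputs are simplified to a `Map[String, String]`.

```scala
import java.util.concurrent.atomic.AtomicLong
import scala.collection.concurrent.TrieMap

// Hypothetical in-memory cache; it does not implement Constellation's CacheBackend.
final class ResultCache[V] {
  // The (module, inputs) pair is the key; Scala hashes and compares it structurally.
  private val entries = TrieMap.empty[(String, Map[String, String]), V]
  private val hits    = new AtomicLong(0)
  private val misses  = new AtomicLong(0)

  // Return the cached value for identical inputs, otherwise compute and remember it.
  def getOrCompute(module: String, inputs: Map[String, String])(compute: => V): V =
    entries.get((module, inputs)) match {
      case Some(v) =>
        hits.incrementAndGet()
        v
      case None =>
        misses.incrementAndGet()
        val v = compute
        // If another thread stored a value first, keep that one.
        entries.putIfAbsent((module, inputs), v).getOrElse(v)
    }

  def hitRate: Double = {
    val total = hits.get + misses.get
    if (total == 0) 0.0 else hits.get.toDouble / total
  }
}
```

This sketch has no eviction or TTL, so it only suits modules whose output depends purely on their inputs. A module that reads mutable external state would need an expiry policy on top.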
Strategy 4: Bounded Scheduler for Multi-Tenant Load
Problem: One tenant's heavy load starves other tenants' requests.
Solution: Use bounded scheduler with priority-based execution.
Pattern:
import cats.syntax.all._
import scala.concurrent.duration._
GlobalScheduler.bounded(
  maxConcurrency = 16,
  starvationTimeout = 30.seconds
).use { scheduler =>
  val constellation = ConstellationImpl.builder()
    .withScheduler(scheduler)
    .build()
  // Combine the three runs into one IO. Written as bare statements,
  // all but the last would be discarded and never executed.
  (
    // Tenant A: high-priority (paying customer)
    constellation.run(pipelineA, inputsA, ExecutionOptions(priority = 80)),
    // Tenant B: normal priority (free tier)
    constellation.run(pipelineB, inputsB, ExecutionOptions(priority = 50)),
    // Tenant C: background (analytics job)
    constellation.run(pipelineC, inputsC, ExecutionOptions(priority = 20))
  ).parTupled
}
Result: When the scheduler is saturated, Tenant A's requests run first, Tenant B's are delayed, and Tenant C's wait longest, though starvation prevention ensures they eventually run.
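The ordering described above can be modelled as a pure selection function. This sketch assumes one plausible reading of `starvationTimeout`: a task that has waited at least that long is served before any higher-priority task. `Task` and `PriorityPick` are hypothetical names, and the real scheduler's policy may differ.

```scala
// Hypothetical model of priority ordering with a starvation timeout.
final case class Task(name: String, priority: Int, enqueuedAtMs: Long)

object PriorityPick {
  // Choose the next task to run:
  //  - anything that has waited at least the timeout goes first (oldest first);
  //  - otherwise the highest priority wins, with ties broken by age.
  def next(queue: Seq[Task], nowMs: Long, starvationTimeoutMs: Long): Option[Task] = {
    val (starved, fresh) =
      queue.partition(t => nowMs - t.enqueuedAtMs >= starvationTimeoutMs)
    if (starved.nonEmpty) Some(starved.minBy(_.enqueuedAtMs))
    else if (fresh.nonEmpty) Some(fresh.minBy(t => (-t.priority, t.enqueuedAtMs)))
    else None
  }
}
```

With priorities 80, 50, and 20 queued together, the priority-80 task is picked while every task is still within the timeout. Once the priority-20 task has waited the full timeout, it is picked next regardless of priority.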
Part 6: Troubleshooting
Problem: High Latency in HTTP API
Symptoms:
- Requests take >100ms even for small pipelines
- Network latency is acceptable (<5ms)
Diagnosis:
# Check metrics endpoint
curl http://localhost:8080/metrics | jq .
# Look for:
# - High compilation times (cache not working?)
# - High queue depth (scheduler overloaded?)
# - Low cache hit rate (<80% is bad)
Solutions:
- Enable caching: Use CachingLangCompiler
- Pre-compile pipelines: Use PipelineLoader at startup
- Increase scheduler concurrency: CONSTELLATION_SCHEDULER_MAX_CONCURRENCY=32
- Check module performance: Profile module execution times
Problem: Cache Not Hitting
Symptoms:
- Identical source text recompiles every time
- Cache hit rate is 0% or very low
Diagnosis:
// Check cache stats
compiler.cacheStats.flatMap { stats =>
IO.println(s"Hits: ${stats.hits}, Misses: ${stats.misses}, Hit Rate: ${stats.hitRate}")
}
Solutions:
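- Function registry changing: Changes to module versions or signatures invalidate the cache; keep the registry stable between compilations
- Source whitespace differences: Even extra spaces or newlines change the syntactic hash; send byte-identical source for identical pipelines
- Cache eviction: The LRU cache is too small; increase max entries
- Wrong compiler instance: A different compiler instance per request cannot reuse earlier results; create one instance at startup and share it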
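The whitespace point is easy to reproduce with any content hash. The sketch below uses SHA-256 of the raw source as a stand-in for the syntactic hash (an assumption, since the real hashing scheme is not described here). It also shows an optional client-side normalization step, which is safe only if trailing whitespace and blank lines carry no meaning in your pipelines. `SourceKey` is a hypothetical helper.

```scala
import java.nio.charset.StandardCharsets
import java.security.MessageDigest

// Hypothetical helper; SHA-256 stands in for the compiler's syntactic hash.
object SourceKey {
  def sha256(s: String): String =
    MessageDigest.getInstance("SHA-256")
      .digest(s.getBytes(StandardCharsets.UTF_8))
      .map(b => f"${b & 0xff}%02x")
      .mkString

  // Normalize line endings, strip trailing whitespace, and drop blank lines.
  // Assumption: none of these are significant in the pipeline source.
  def normalize(source: String): String =
    source.linesIterator
      .map(_.replaceAll("\\s+$", ""))
      .filter(_.nonEmpty)
      .mkString("\n")
}
```

Two sources that differ only in trailing spaces or line endings hash differently as raw text but identically after `normalize`. An editor integration or client library could apply the same normalization before submitting source.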
Problem: Embedded API Memory Leak
Symptoms:
- Heap usage grows unbounded
- GC pressure increases over time
Diagnosis:
# Heap dump
jmap -dump:format=b,file=heap.bin <pid>
# Analyze with VisualVM or Eclipse MAT
Common Causes:
- PipelineStore unbounded growth: No eviction policy, every compilation stores forever
- Module state leaks: Modules holding onto resources (unclosed connections, file handles)
- Listener accumulation: Adding execution listeners without removing them
Solutions:
- Implement PipelineStore eviction: Remove unused pipelines periodically
- Use Resource for modules: Ensure cleanup on shutdown
- Remove listeners: Unregister listeners when no longer needed
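For the eviction point, one common approach is a least-recently-used bound on the number of stored entries. The sketch below builds on `java.util.LinkedHashMap` in access order. `BoundedStore` is a hypothetical wrapper, not a Constellation type, and how it would plug into PipelineStore is left open.

```scala
import java.util.{LinkedHashMap => JLinkedHashMap, Map => JMap}

// Hypothetical bounded store: evicts the least recently used entry
// once more than maxEntries are held.
final class BoundedStore[K, V](maxEntries: Int) {
  // accessOrder = true makes iteration order follow recency of access.
  private val underlying = new JLinkedHashMap[K, V](16, 0.75f, true) {
    override def removeEldestEntry(eldest: JMap.Entry[K, V]): Boolean =
      this.size() > maxEntries
  }

  def put(key: K, value: V): Unit = synchronized {
    underlying.put(key, value)
    ()
  }

  // A successful get counts as a use and protects the entry from eviction.
  def get(key: K): Option[V] = synchronized(Option(underlying.get(key)))

  def entryCount: Int = synchronized(underlying.size())
}
```

The trade-off is that an evicted pipeline must be compiled again on its next use, so the bound should comfortably exceed the number of pipelines in active rotation.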
Summary Table: Execution Mode Decision Matrix
| Scenario | Recommended Mode | Key Reason |
|---|---|---|
| One-off scripts | Cold Execution | Simplest, no state management |
| Production APIs (JVM) | Hot Execution + Embedded API | Lowest latency, highest throughput |
| Production APIs (polyglot) | Hot Execution + HTTP API | Language-agnostic, scalable |
| LSP / Interactive tools | Cached Execution + HTTP API | Sub-5ms compilation, responsive UX |
| Batch processing (millions) | Hot Execution + Embedded API | No network or serialization overhead, maximum throughput |
| Multi-tenant SaaS | Cached Execution + HTTP API | Per-tenant isolation, cache sharing |
| ML inference server | Hot Execution + Embedded API + Cache | Low latency, result caching |
| Microservices | Hot Execution + HTTP API | Process isolation, independent deployment |
| CI/CD pipelines | Cold Execution | Fresh compilation, no state persistence |
| Dashboard / Web UI | Cached Execution + HTTP API | Interactive, browser clients |
Next Steps
- Embedding Guide — Complete embedded API reference
- HTTP API Reference — Full endpoint documentation