Skip to main content

Execution Modes

This guide explains the different ways to execute Constellation pipelines, comparing hot vs cold execution patterns and HTTP vs embedded API deployment models. Understanding these modes helps you make informed architectural decisions based on your performance, scalability, and deployment requirements.

Quick Reference

ModeWhat Gets ReusedTypical LatencyUse Case
Cold ExecutionNothingParse + Compile + ExecuteOne-off scripts, CI/CD pipelines
Hot Execution (Compile-Once)Compiled pipelineExecute only (~1-10ms overhead)Production workloads, low-latency APIs
Hot Execution (Cached)Compiled pipeline + cache hits<5ms for unchanged sourceInteractive tools, LSP, high-throughput services
DeploymentArchitectureOverheadUse Case
Embedded APIIn-process library calls~0.15ms/nodeBatch jobs, embedded ML pipelines, tight control loops
HTTP APINetwork + JSON serialization+2-10ms per requestMulti-language clients, microservices, dashboard UI

Part 1: Hot vs Cold Execution

Cold Execution

Definition: Every execution starts from source text. The pipeline is compiled from scratch each time.

Flow:

Source Text → Parse → TypeCheck → IR Generation → Optimize → DAG Compile → Execute → Results
(5-50ms) (5-50ms) (5-20ms) (5-20ms) (10-50ms) (1-10000ms)

Example:

import cats.effect.IO
import io.constellation.impl.ConstellationImpl
import io.constellation.stdlib.StdLib
import io.constellation.lang.LangCompiler
import io.constellation.TypeSystem.CValue._

val source = """
in text: String
result = Uppercase(text)
out result
"""

// Full cold execution on every call
def coldExecute(input: String): IO[String] =
for {
constellation <- ConstellationImpl.init
_ <- StdLib.allModules.values.toList.traverse(constellation.setModule)
compiler = StdLib.compiler

// 1. Compile from source (cold)
compiled <- IO.fromEither(
compiler.compile(source, "my-pipeline")
.leftMap(errs => new RuntimeException(errs.map(_.message).mkString("\n")))
)

// 2. Execute
sig <- constellation.run(compiled.pipeline, Map("text" -> VString(input)))

// 3. Extract result
result <- IO.fromOption(sig.outputs.get("result").collect { case VString(s) => s })(
new NoSuchElementException("Missing output")
)
} yield result

// Every call pays full compilation cost
coldExecute("hello").unsafeRunSync() // ~100ms first time
coldExecute("world").unsafeRunSync() // ~100ms again (no reuse)

When to Use Cold Execution:

  • One-off scripts: Shell scripts, CI/CD pipelines, cron jobs
  • Dynamic code generation: When pipeline source changes every execution
  • Prototyping: Fast iteration during development (though caching is faster)
  • Low execution frequency: When compilation cost is amortized over long intervals

Performance Characteristics:

Program SizeParseTypeCheckCompileTotal Overhead
Small (10 nodes)<5ms<5ms<10ms~20-30ms
Medium (50 nodes)<50ms<50ms<50ms~100-150ms
Large (100 nodes)<200ms<200ms<150ms~300-500ms

For reference, these are target values from dev/benchmarks/performance-benchmarks.md.

Limitations:

  • High latency unsuitable for request/response APIs
  • Redundant work if source text doesn't change
  • Cannot leverage compilation caching
  • Module registry must be rebuilt each time

Hot Execution (Compile-Once)

Definition: Compile once, execute many times. The compiled LoadedPipeline is stored and reused.

Flow (First Execution):

Source Text → Parse → TypeCheck → IR Gen → Optimize → DAG Compile → Store in PipelineStore
(5-50ms) (5-50ms) (5-20ms) (5-20ms) (10-50ms)

(returns hash: abc123...)

Flow (Subsequent Executions):

Pipeline Reference (hash or alias) → PipelineStore Lookup → Execute → Results
(<1ms) (1-10000ms)

Example:

import cats.effect.IO
import io.constellation.impl.ConstellationImpl
import io.constellation.stdlib.StdLib
import io.constellation.lang.LangCompiler
import io.constellation.TypeSystem.CValue._
import io.constellation.ExecutionOptions

val source = """
in text: String
result = Uppercase(text)
out result
"""

// One-time setup (application startup)
val setup: IO[(io.constellation.Constellation, String)] =
for {
constellation <- ConstellationImpl.init
_ <- StdLib.allModules.values.toList.traverse(constellation.setModule)
compiler = StdLib.compiler

// Compile once
compiled <- IO.fromEither(
compiler.compile(source, "uppercase-pipeline")
.leftMap(errs => new RuntimeException(errs.map(_.message).mkString("\n")))
)

// Store the pipeline image (content-addressed storage)
hash <- constellation.PipelineStore.store(compiled.pipeline.image)

// Create human-readable alias
_ <- constellation.PipelineStore.alias("uppercase", hash)

} yield (constellation, hash)

// Production usage (executes many times)
def hotExecute(
constellation: io.constellation.Constellation,
pipelineRef: String,
input: String
): IO[String] =
for {
// Execute by reference (hash or alias) - no compilation
sig <- constellation.run(
pipelineRef, // "uppercase" or hash
Map("text" -> VString(input)),
ExecutionOptions()
)

result <- IO.fromOption(sig.outputs.get("result").collect { case VString(s) => s })(
new NoSuchElementException("Missing output")
)
} yield result

// Usage pattern
for {
(constellation, hash) <- setup // ~100ms (once at startup)

// All executions reuse compiled pipeline
r1 <- hotExecute(constellation, "uppercase", "hello") // ~1-10ms overhead
r2 <- hotExecute(constellation, "uppercase", "world") // ~1-10ms overhead
r3 <- hotExecute(constellation, hash, "test") // ~1-10ms overhead (by hash)
} yield ()

PipelineStore Operations:

// Store pipeline by structural hash (content-addressed)
val hash: IO[String] = constellation.PipelineStore.store(pipelineImage)
// Returns: "a3f7c2e8b..." (SHA-256 of DAG structure)

// Create human-readable alias
constellation.PipelineStore.alias("my-pipeline", hash)

// Resolve alias to hash
val resolvedHash: IO[Option[String]] = constellation.PipelineStore.resolve("my-pipeline")

// Retrieve by hash
val byHash: IO[Option[PipelineImage]] = constellation.PipelineStore.get(hash)

// Retrieve by alias
val byName: IO[Option[PipelineImage]] = constellation.PipelineStore.getByName("my-pipeline")

// Execute by reference (alias or hash)
constellation.run("my-pipeline", inputs, ExecutionOptions())
constellation.run(s"sha256:$hash", inputs, ExecutionOptions())

When to Use Hot Execution:

  • Production APIs: Request/response services where latency matters
  • High-throughput batch processing: Millions of executions with same pipeline
  • Long-running services: Web servers, microservices, daemon processes
  • Stable pipeline definitions: When source text rarely changes

Performance Characteristics:

OperationLatencyNotes
PipelineStore lookup (in-memory)<1msHash-based retrieval
Pipeline rehydration<1msReconstruct LoadedPipeline from image
DAG execution overhead~0.15ms/nodePure orchestration cost (no module time)

For a 10-node pipeline where each module takes 5ms, total execution time:

  • Cold: ~150ms (compile) + 50ms (execute) = 200ms
  • Hot: 50ms (execute only) = 50ms4x faster

Key Benefits:

  • Eliminates redundant parsing, type-checking, and compilation
  • Predictable latency (no compilation jitter)
  • Enables content-addressed caching (structural hash deduplication)
  • PipelineStore can be backed by persistent storage (survives restarts)

Hot Execution with Caching

Definition: Combines compile-once with automatic cache-hit detection. If source text hasn't changed, the compiler returns a cached LoadedPipeline immediately.

Flow (Cache Hit):

Source Text → Syntactic Hash → PipelineStore Syntactic Index Lookup → Return Cached Pipeline
(instant) (<1ms) (<1ms) (total: <5ms)

Flow (Cache Miss):

Source Text → Parse → TypeCheck → IR Gen → Optimize → DAG Compile → Store & Index → Return Pipeline
(5-50ms) (5-50ms) (5-20ms) (5-20ms) (10-50ms) (<1ms)

How It Works:

The PipelineStore maintains a syntactic index that maps (syntacticHash, registryHash) → structuralHash:

  1. Syntactic Hash: SHA-256 of the source text (unchanged source = same hash)
  2. Registry Hash: Hash of the function registry (detects when available modules change)
  3. Structural Hash: SHA-256 of the compiled DAG structure

When you compile source:

  • Compiler computes syntactic hash immediately (no parsing required)
  • Checks syntactic index for (syntacticHash, registryHash) pair
  • If found: returns cached LoadedPipeline (cache hit)
  • If not found: compiles, stores result, indexes (syntactic, registry) → structural

Example:

import io.constellation.lang.{CachingLangCompiler, LangCompiler}
import io.constellation.impl.ConstellationImpl
import io.constellation.stdlib.StdLib

// Setup with caching compiler
val setup: IO[(io.constellation.Constellation, CachingLangCompiler)] =
for {
constellation <- ConstellationImpl.init
_ <- StdLib.allModules.values.toList.traverse(constellation.setModule)
baseCompiler = StdLib.compiler

// Wrap with caching layer
cachingCompiler = CachingLangCompiler.withDefaults(baseCompiler)
} yield (constellation, cachingCompiler)

val source = """
in text: String
result = Uppercase(text)
out result
"""

// Usage pattern demonstrating cache hits
for {
(constellation, compiler) <- setup

// First compilation (cold - cache miss)
compiled1 <- compiler.compileIO(source, "uppercase")
// Took ~100ms (full compilation pipeline)

// Store the compiled pipeline
hash <- constellation.PipelineStore.store(compiled1.pipeline.image)

// Second compilation of identical source (cache hit)
compiled2 <- compiler.compileIO(source, "uppercase")
// Took <5ms (cache hit - no parsing/typechecking/compilation)

// Third compilation with modified source (cache miss)
modifiedSource = """
in text: String
result = Lowercase(text) # Changed: Uppercase → Lowercase
out result
"""
compiled3 <- compiler.compileIO(modifiedSource, "lowercase")
// Took ~100ms (cache miss - different syntactic hash)

} yield ()

Cache Performance:

ScenarioLatencySpeedup
Cold compile (medium program)~100msBaseline
Warm cache hit<5ms20x faster
Cache hit after server restart (persistent store)<5ms20x faster
Source modified (cache miss)~100msNo speedup (expected)

When to Use Cached Execution:

  • LSP (Language Server Protocol): Every keystroke triggers recompilation
  • Interactive dashboards: Users edit and re-run pipelines frequently
  • Development environments: Rapid edit-test cycles
  • Multi-tenant platforms: Many users running identical pipelines

Example: LSP Autocomplete Performance

// User types: "result = Upper|" (cursor at |)
// LSP needs to compile to provide autocomplete suggestions

// Without caching:
// Every keystroke = full recompilation = 100ms per keystroke = unusable

// With caching:
// Syntactic hash changes only when source changes
// Cache hit = <5ms = 50ms autocomplete response = smooth UX

Cache Invalidation:

The cache automatically invalidates when:

  1. Source text changes: Different syntactic hash triggers cache miss
  2. Function registry changes: Adding/removing modules changes registry hash
  3. Module signatures change: Version bump in module metadata changes registry hash

No manual invalidation is required — cache correctness is guaranteed by the hash-based indexing.

Monitoring Cache Effectiveness:

# Check cache hit rate via HTTP metrics endpoint
curl http://localhost:8080/metrics | jq .cache

# Example output:
{
"cache": {
"hits": 847,
"misses": 23,
"hitRate": 0.973,
"size": 45
}
}

Target hit rate: >80% for production workloads with stable pipelines.


Execution Mode Comparison Table

AspectCold ExecutionHot Execution (Compile-Once)Hot Execution (Cached)
CompilationEvery executionOnce at startupOn cache miss only
Typical Overhead50-500ms1-10ms<5ms (hit) / 50-500ms (miss)
Memory UsageLow (no persistent state)Medium (PipelineStore)High (PipelineStore + cache)
Startup TimeInstantMedium (pre-compile)High (warm cache)
Runtime LatencyHigh (compile + execute)Low (execute only)Lowest (cache hit + execute)
Source ChangesAlways freshRequires recompileAuto-detects via hash
Best ForScripts, CI/CDProduction APIsLSP, dashboards, dev tools

Decision Matrix: Which Execution Mode?

Choose Cold Execution if:

  • ✅ Execution frequency is low (minutes to hours between runs)
  • ✅ Source text changes every execution (dynamic generation)
  • ✅ Minimal memory footprint is required
  • ✅ Startup time matters more than runtime latency
  • ❌ You need sub-50ms response times
  • ❌ You execute the same pipeline thousands of times

Choose Hot Execution (Compile-Once) if:

  • ✅ Same pipeline executes many times (hundreds to millions)
  • ✅ Pipeline definition is stable (source rarely changes)
  • ✅ Latency matters (APIs, request/response services)
  • ✅ You can afford startup cost (pre-compilation)
  • ❌ Source text changes frequently (then use caching instead)
  • ❌ You need to support interactive editing (then use caching)

Choose Hot Execution (Cached) if:

  • ✅ Source text changes frequently but repeats (interactive editing)
  • ✅ Multiple users may run identical pipelines (multi-tenant)
  • ✅ You need to support LSP or live tooling
  • ✅ Sub-5ms compilation latency is critical
  • ✅ Memory for cache is available
  • ❌ Every execution uses unique source text (no cache benefit)
  • ❌ Minimal memory footprint is required

Part 2: HTTP API vs Embedded API

Embedded API (In-Process)

Definition: Constellation runs as a library within your application. All interactions are direct Scala method calls.

Architecture:

┌─────────────────────────────────────┐
│ Your Application (JVM Process) │
│ ┌────────────────────────────────┐ │
│ │ Application Code │ │
│ │ ↓ │ │
│ │ constellation.run(...) │ │ Direct method calls
│ │ ↓ │ │ No network overhead
│ │ Constellation Engine │ │ No serialization
│ │ ↓ │ │
│ │ Module Execution │ │
│ │ ↓ │ │
│ │ Results (CValue) │ │
│ └────────────────────────────────┘ │
└─────────────────────────────────────┘

Example:

import cats.effect._
import cats.implicits._
import io.constellation._
import io.constellation.TypeSystem._
import io.constellation.TypeSystem.CValue._
import io.constellation.impl.ConstellationImpl
import io.constellation.stdlib.StdLib

object EmbeddedExample extends IOApp.Simple {

def run: IO[Unit] =
for {
// 1. Initialize Constellation instance (embedded in your process)
constellation <- ConstellationImpl.init

// 2. Register modules
_ <- StdLib.allModules.values.toList.traverse(constellation.setModule)

// 3. Compile pipeline
compiler = StdLib.compiler
compiled <- IO.fromEither(
compiler.compile("""
in numbers: List[Int]
sum = Sum(numbers)
avg = Average(numbers)
out sum
out avg
""", "stats-pipeline").leftMap(errs =>
new RuntimeException(errs.map(_.message).mkString("\n"))
)
)

// 4. Execute directly (no HTTP, no serialization)
sig <- constellation.run(
compiled.pipeline,
Map("numbers" -> VList(List(VLong(10), VLong(20), VLong(30))))
)

// 5. Access results as typed Scala values
sum <- IO.fromOption(sig.outputs.get("sum").collect { case VLong(n) => n })(
new NoSuchElementException("Missing sum")
)
avg <- IO.fromOption(sig.outputs.get("avg").collect { case VFloat(d) => d })(
new NoSuchElementException("Missing avg")
)

_ <- IO.println(s"Sum: $sum, Average: $avg")

} yield ()
}

Performance Characteristics:

OperationLatencyNotes
Method call overhead<0.01msDirect JVM method dispatch
Input conversion<0.1msScala values → CValue (zero-copy for most types)
DAG execution overhead~0.15ms/nodePure orchestration (measured in benchmarks)
Output extraction<0.1msCValue → Scala values

For a 10-node pipeline where each module takes 5ms:

  • Execution: 10 nodes × 5ms/node = 50ms (module logic)
  • Orchestration: 10 nodes × 0.15ms/node = 1.5ms (DAG overhead)
  • Total: 51.5ms (pure engine cost)

Advantages:

  • Lowest latency: No network round-trip, no JSON serialization
  • Type safety: Compile-time guarantees for inputs/outputs
  • Direct control: Fine-grained configuration (scheduler, backends, lifecycle)
  • Efficient resource usage: Shared memory, no IPC overhead
  • Easier debugging: Single process, stack traces work normally

Trade-offs:

  • JVM only: Cannot call from Python, JavaScript, or other languages
  • Same process: Crashes in modules crash your application
  • Deployment coupling: Must redeploy application to update Constellation version
  • No cross-network execution: Cannot distribute across machines

When to Use Embedded API:

  • Batch processing jobs (ETL pipelines, data transformations)
  • ML inference servers (where latency is critical)
  • Embedded systems (IoT devices running JVM)
  • Tight control loops (real-time systems, robotics)
  • Single-tenant applications (desktop apps, CLI tools)

Production Configuration:

import io.constellation.impl.ConstellationImpl
import io.constellation.execution.{GlobalScheduler, ConstellationLifecycle}
import io.constellation.spi.ConstellationBackends
import scala.concurrent.duration._

// Full production setup
GlobalScheduler.bounded(
maxConcurrency = 16,
starvationTimeout = 30.seconds
).use { scheduler =>

ConstellationLifecycle.create.flatMap { lifecycle =>

val constellation = ConstellationImpl.builder()
.withScheduler(scheduler)
.withBackends(ConstellationBackends(
metrics = myPrometheusMetrics,
tracer = myOtelTracer,
listener = myKafkaListener,
cache = Some(myRedisCache)
))
.withDefaultTimeout(60.seconds)
.withLifecycle(lifecycle)
.build()

// ... use constellation ...

// Graceful shutdown
lifecycle.shutdown(drainTimeout = 30.seconds)
}
}

HTTP API (Out-of-Process)

Definition: Constellation runs as a standalone HTTP server. Clients interact via REST API over the network.

Architecture:

┌──────────────────┐                    ┌──────────────────────────┐
│ Client App │ HTTP POST │ Constellation Server │
│ (Any Language) │ /compile │ (JVM Process) │
│ ┌─────────────┐ │ ──────────────────>│ ┌────────────────────┐ │
│ │ Python │ │ │ │ HTTP Routes │ │
│ │ JavaScript │ │ JSON Response │ │ ↓ │ │
│ │ Go │ │ <──────────────────│ │ Constellation API │ │
│ │ Ruby │ │ │ │ ↓ │ │
│ └─────────────┘ │ HTTP POST │ │ Module Execution │ │
└──────────────────┘ /execute │ │ ↓ │ │
──────────────────>│ │ JSON Response │ │
│ └────────────────────┘ │
└──────────────────────────┘

Example (Server):

import cats.effect._
import io.constellation.impl.ConstellationImpl
import io.constellation.stdlib.StdLib
import io.constellation.http._

object HttpServerExample extends IOApp.Simple {

def run: IO[Unit] =
for {
constellation <- ConstellationImpl.init
_ <- StdLib.allModules.values.toList.traverse(constellation.setModule)
compiler = StdLib.compiler

// Start HTTP server
_ <- ConstellationServer
.builder(constellation, compiler)
.withHost("0.0.0.0")
.withPort(8080)
.withDashboard // Optional: web UI for interactive testing
.run

} yield ()
}

Example (Client - curl):

# Compile a pipeline
curl -X POST http://localhost:8080/compile \
-H "Content-Type: application/json" \
-d '{
"name": "stats-pipeline",
"source": "in numbers: List[Int]\nsum = Sum(numbers)\nout sum"
}'

# Response:
# {
# "success": true,
# "structuralHash": "a3f7c2e8b...",
# "syntacticHash": "d9e1f4a2c...",
# "name": "stats-pipeline"
# }

# Execute by name
curl -X POST http://localhost:8080/execute \
-H "Content-Type: application/json" \
-d '{
"ref": "stats-pipeline",
"inputs": {
"numbers": [10, 20, 30]
}
}'

# Response:
# {
# "success": true,
# "outputs": {
# "sum": 60
# },
# "status": "completed",
# "executionId": "f8a2c3d4-..."
# }

Example (Client - Python):

import requests
import json

# Compile pipeline
compile_response = requests.post('http://localhost:8080/compile', json={
'name': 'stats-pipeline',
'source': '''
in numbers: List[Int]
sum = Sum(numbers)
avg = Average(numbers)
out sum
out avg
'''
})

pipeline_hash = compile_response.json()['structuralHash']

# Execute pipeline
execute_response = requests.post('http://localhost:8080/execute', json={
'ref': 'stats-pipeline', # or use pipeline_hash
'inputs': {
'numbers': [10, 20, 30]
}
})

results = execute_response.json()['outputs']
print(f"Sum: {results['sum']}, Average: {results['avg']}")

Example (Client - JavaScript/TypeScript):

// Compile pipeline
const compileResponse = await fetch('http://localhost:8080/compile', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
name: 'stats-pipeline',
source: `
in numbers: List[Int]
sum = Sum(numbers)
out sum
`
})
});

const { structuralHash } = await compileResponse.json();

// Execute pipeline
const executeResponse = await fetch('http://localhost:8080/execute', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
ref: 'stats-pipeline',
inputs: { numbers: [10, 20, 30] }
})
});

const { outputs } = await executeResponse.json();
console.log(`Sum: ${outputs.sum}`);

Performance Characteristics:

OperationLatencyNotes
Network round-trip (same datacenter)0.5-2msVaries by network topology
Network round-trip (cross-region)10-50msSignificant overhead
JSON serialization (request)0.1-1msDepends on input size
JSON deserialization (request)0.1-1msCirce parsing
DAG execution1ms + module timeSame as embedded
JSON serialization (response)0.1-1msOutput encoding
JSON deserialization (response)0.1-1msClient parsing

For a 10-node pipeline where each module takes 5ms:

  • Module execution: 50ms (same as embedded)
  • Orchestration: 1.5ms (same as embedded)
  • HTTP overhead: 2-10ms (network + serialization)
  • Total: ~53.5-61.5ms (+ network latency)

Typical overhead: +2-10ms per request compared to embedded API.

Advantages:

  • Language-agnostic: Call from Python, JavaScript, Go, etc.
  • Process isolation: Module crashes don't affect client
  • Independent deployment: Update Constellation without redeploying clients
  • Horizontal scaling: Run multiple servers behind a load balancer
  • Built-in dashboard: Web UI for testing and visualization

Trade-offs:

  • Higher latency: Network + serialization overhead
  • JSON overhead: Large inputs/outputs slow down significantly
  • No type safety: Clients pass JSON (runtime errors possible)
  • More moving parts: Network failures, load balancers, authentication

When to Use HTTP API:

  • Multi-language environments (polyglot microservices)
  • Multi-tenant platforms (SaaS applications)
  • Web dashboards (browser-based clients)
  • External integrations (third-party services)
  • Distributed systems (cross-network execution)

Production Configuration:

ConstellationServer
.builder(constellation, compiler)
.withPort(8080)
.withDashboard

// Security hardening
.withAuth(AuthConfig(apiKeys = Map(
"admin-key" -> ApiRole.Admin,
"app-key" -> ApiRole.Execute
)))
.withCors(CorsConfig(allowedOrigins = Set("https://app.example.com")))
.withRateLimit(RateLimitConfig(requestsPerMinute = 200, burst = 40))

// Health checks
.withHealthChecks(HealthCheckConfig(enableDetailEndpoint = true))

// Pipeline management
.withPipelineLoader(PipelineLoaderConfig(
directory = java.nio.file.Paths.get(".constellation-pipelines"),
filePattern = "*.cst"
))
.withPersistentPipelineStore(java.nio.file.Paths.get(".constellation-store"))

.run

Available Endpoints:

EndpointMethodPurpose
/compilePOSTCompile constellation-lang source to pipeline
/executePOSTExecute pipeline by reference (hash or alias)
/pipelinesGETList all stored pipelines
/pipelines/:refGETGet pipeline details
/pipelines/:refDELETEDelete a pipeline
/modulesGETList available modules
/healthGETHealth check (uptime, status)
/health/readyGETReadiness probe (Kubernetes)
/health/liveGETLiveness probe (Kubernetes)
/metricsGETPrometheus-style metrics
/lspWebSocketLanguage Server Protocol (LSP) for editors

Authentication (opt-in):

# Set API keys via environment variables
CONSTELLATION_API_KEYS="admin:Admin,app:Execute"

# Client must include X-API-Key header
curl -X POST http://localhost:8080/execute \
-H "X-API-Key: admin" \
-H "Content-Type: application/json" \
-d '{ "ref": "my-pipeline", "inputs": {} }'

Rate Limiting (opt-in):

# Configure rate limits
CONSTELLATION_RATE_LIMIT_RPM=100 # 100 requests per minute per IP
CONSTELLATION_RATE_LIMIT_BURST=20 # Allow bursts of 20

# Exceeding limits returns HTTP 429 Too Many Requests

Deployment Comparison Table

AspectEmbedded APIHTTP API
Latency~0.15ms/node overhead+2-10ms per request
LanguagesScala/JVM onlyAny (Python, JS, Go, Ruby, ...)
NetworkNo network (in-process)HTTP/1.1 or HTTP/2
SerializationNone (direct CValue)JSON (Circe encoder/decoder)
Type SafetyCompile-time (Scala types)Runtime (JSON validation)
IsolationSame process (shared fate)Separate process (fault isolation)
ScalabilityVertical (scale JVM)Horizontal (multiple servers)
DeploymentBundled with appIndependent service
DebuggingEasy (single process)Harder (distributed tracing needed)
Best ForLatency-critical, JVM-onlyMulti-language, distributed systems

Decision Matrix: HTTP vs Embedded?

Choose Embedded API if:

  • ✅ Your application is written in Scala or another JVM language
  • ✅ Sub-millisecond latency is critical (real-time systems)
  • ✅ You process millions of executions per second (throughput-bound)
  • ✅ Deployment coupling is acceptable (single binary)
  • ✅ You need fine-grained control over scheduler/backends/lifecycle
  • ❌ You need to call from non-JVM languages
  • ❌ You want independent deployment of Constellation and clients

Choose HTTP API if:

  • ✅ Clients use multiple languages (Python, JavaScript, Go, etc.)
  • ✅ You need process isolation (module crashes shouldn't affect clients)
  • ✅ Horizontal scaling is required (load balancer + multiple servers)
  • ✅ You want a web dashboard for interactive testing
  • ✅ You have a microservices architecture
  • ❌ Network overhead is unacceptable (<1ms latency required)
  • ❌ Large payloads make JSON serialization prohibitively slow

Part 3: Combining Execution Modes

Hot Execution + Embedded API (Best Performance)

Use Case: High-throughput batch processing, ML inference servers.

Pattern:

// Startup: compile once
for {
constellation <- ConstellationImpl.init
_ <- StdLib.allModules.values.toList.traverse(constellation.setModule)
compiler = StdLib.compiler

compiled <- compiler.compileIO(source, "my-pipeline")
hash <- constellation.PipelineStore.store(compiled.pipeline.image)
_ <- constellation.PipelineStore.alias("my-pipeline", hash)

// Runtime: execute millions of times with zero compilation overhead
_ <- Stream.iterate(0)(_ + 1) // Infinite stream
.evalMap { i =>
constellation.run("my-pipeline", inputs(i), ExecutionOptions())
}
.compile
.drain

} yield ()

Performance: ~0.15ms/node overhead + module execution time. No compilation, no network, no serialization.


Hot Execution + HTTP API (Best Flexibility)

Use Case: Multi-tenant SaaS platforms, web applications.

Pattern:

// Server: pre-compile common pipelines at startup
for {
constellation <- ConstellationImpl.init
_ <- StdLib.allModules.values.toList.traverse(constellation.setModule)
compiler = StdLib.compiler

// Pre-compile top 10 most-used pipelines
_ <- commonPipelines.traverse { case (name, source) =>
compiler.compileIO(source, name).flatMap { compiled =>
constellation.PipelineStore.store(compiled.pipeline.image).flatMap { hash =>
constellation.PipelineStore.alias(name, hash)
}
}
}

// Start HTTP server
_ <- ConstellationServer.builder(constellation, compiler).run

} yield ()

Clients: Execute pre-compiled pipelines by name (no compilation overhead on each request).

Performance: ~2-10ms HTTP overhead + 0.15ms/node overhead + module execution time.


Cached Execution + HTTP API (Best for Interactive Tools)

Use Case: LSP servers, web dashboards, live editors.

Pattern:

// Server with caching compiler
for {
constellation <- ConstellationImpl.init
_ <- StdLib.allModules.values.toList.traverse(constellation.setModule)
baseCompiler = StdLib.compiler

// Enable caching
cachingCompiler = CachingLangCompiler.withDefaults(baseCompiler)

// Start HTTP server with caching compiler
_ <- ConstellationServer.builder(constellation, cachingCompiler).run

} yield ()

Client workflow:

  1. User edits source in web UI
  2. Client POSTs to /compile on every change
  3. Server returns cached result (<5ms) if source unchanged
  4. Client executes pipeline via /execute

Performance: <5ms compile (cache hit) + 2-10ms HTTP overhead + execution time.


Part 4: Advanced Patterns

Hybrid: Embedded Execution with HTTP Management API

Use Case: Local high-performance execution with remote pipeline management.

Architecture:

┌─────────────────────────────────────────┐
│ Application (JVM) │
│ ┌──────────────────────────────────┐ │
│ │ Embedded Constellation │ │ Hot execution
│ │ ↓ │ │ (in-process, fast)
│ │ constellation.run(...) │ │
│ └──────────────────────────────────┘ │
│ ↑ │
│ │ Sync pipelines │
└─────────┼──────────────────────────────┬─┘
│ │
│ HTTP (periodic sync) │
│ │
↓ ↓
┌──────────────────────────────────────────┐
│ Central Constellation Server │
│ (Pipeline registry + compilation) │
└──────────────────────────────────────────┘

Pattern:

// Application: embedded execution + periodic sync from HTTP registry
for {
localConstellation <- ConstellationImpl.init
_ <- StdLib.allModules.values.toList.traverse(localConstellation.setModule)

// Background fiber: sync pipelines from central server every 60 seconds
_ <- Stream.fixedRate[IO](60.seconds).evalMap { _ =>
for {
// Fetch latest pipelines from HTTP registry
response <- client.get("http://registry:8080/pipelines")
pipelines <- response.as[List[PipelineMetadata]]

// Update local PipelineStore
_ <- pipelines.traverse { pm =>
client.get(s"http://registry:8080/pipelines/${pm.hash}").flatMap { resp =>
resp.as[PipelineImage].flatMap { image =>
localConstellation.PipelineStore.store(image) *>
localConstellation.PipelineStore.alias(pm.name, pm.hash)
}
}
}
} yield ()
}.compile.drain.start

// Foreground: execute locally (hot, embedded)
_ <- localConstellation.run("my-pipeline", inputs, ExecutionOptions())

} yield ()

Benefits:

  • ✅ Centralized pipeline management (single source of truth)
  • ✅ Local execution performance (no network on hot path)
  • ✅ Cache-friendly (pipelines update infrequently)

Persistent PipelineStore (Survive Restarts)

Use Case: Production servers where pipelines should survive process restarts.

Pattern:

ConstellationServer
.builder(constellation, compiler)
.withPersistentPipelineStore(java.nio.file.Paths.get(".constellation-store"))
.withPipelineLoader(PipelineLoaderConfig(
directory = java.nio.file.Paths.get(".constellation-pipelines"),
filePattern = "*.cst"
))
.run

How it works:

  1. Startup: Scans .constellation-pipelines/ for *.cst files
  2. Compilation: Compiles each file and stores in PipelineStore
  3. Persistence: Writes PipelineStore to .constellation-store/ (JSON files)
  4. Restart: Loads pipelines from .constellation-store/ (no re-compilation)

Directory structure:

.constellation-pipelines/
text-processing.cst
data-analysis.cst
ml-inference.cst

.constellation-store/
images/
a3f7c2e8b....json # PipelineImage (structural hash)
d9e1f4a2c....json
aliases.json # { "text-processing": "a3f7c2e8b...", ... }
syntactic_index.json # { ("syntax-hash", "registry-hash"): "structural-hash" }

Performance:

  • Cold start (first boot): Compiles all .cst files (~100ms each)
  • Warm start (restart): Loads from .constellation-store/ (<1ms per pipeline)

Part 5: Performance Optimization Strategies

Strategy 1: Pre-Compilation at Build Time

Problem: First request after deployment is slow (cold start).

Solution: Compile pipelines during Docker build.

Dockerfile:

FROM eclipse-temurin:17-jre

# Copy application
COPY target/assembly.jar /app/constellation.jar
COPY pipelines/ /app/pipelines/

# Pre-compile pipelines during image build
RUN java -cp /app/constellation.jar io.constellation.cli.Main compile \
--input /app/pipelines/ \
--output /app/.constellation-store/

# Runtime: server loads pre-compiled pipelines
CMD ["java", "-jar", "/app/constellation.jar", "server", \
"--pipeline-store", "/app/.constellation-store"]

Result: Zero cold-start latency (all pipelines pre-compiled in image).


Strategy 2: Pipeline Versioning and Canary Deployments

Problem: Updating a pipeline affects all in-flight executions.

Solution: Version pipelines and use canary routing.

Pattern:

// Deploy new version alongside old version
for {
// Old version (v1)
_ <- constellation.PipelineStore.alias("my-pipeline-v1", oldHash)

// New version (v2)
compiled <- compiler.compileIO(newSource, "my-pipeline")
newHash <- constellation.PipelineStore.store(compiled.pipeline.image)
_ <- constellation.PipelineStore.alias("my-pipeline-v2", newHash)

// Canary: 90% traffic to v1, 10% to v2
_ <- canaryRouter.setWeights("my-pipeline", Map(
"my-pipeline-v1" -> 0.9,
"my-pipeline-v2" -> 0.1
))

// Clients execute "my-pipeline" (router chooses version)
_ <- constellation.run("my-pipeline", inputs, ExecutionOptions())

} yield ()

Benefits:

  • ✅ Gradual rollout (reduce blast radius)
  • ✅ A/B testing (compare performance/accuracy)
  • ✅ Instant rollback (flip weights to 100% old version)

Strategy 3: Module Result Caching

Problem: Expensive modules (ML inference, DB queries) recompute for identical inputs.

Solution: Implement CacheBackend to cache module results.

Pattern:

import io.constellation.cache.CacheBackend
import scala.concurrent.duration._

// Implement CacheBackend (example: Redis)
val redisCache: CacheBackend = new RedisCacheBackend(redisClient)

// Configure Constellation with cache
val constellation = ConstellationImpl.builder()
.withBackends(ConstellationBackends(
cache = Some(redisCache)
))
.build()

// Modules automatically cache results by (moduleName, inputs) hash
// No code changes required in modules

Performance:

  • Cache miss: Full module execution (e.g., 500ms for ML inference)
  • Cache hit: <5ms (Redis lookup)
  • Speedup: 100x for expensive cached operations

Strategy 4: Bounded Scheduler for Multi-Tenant Load

Problem: One tenant's heavy load starves other tenants' requests.

Solution: Use bounded scheduler with priority-based execution.

Pattern:

GlobalScheduler.bounded(
maxConcurrency = 16,
starvationTimeout = 30.seconds
).use { scheduler =>

val constellation = ConstellationImpl.builder()
.withScheduler(scheduler)
.build()

// Tenant A: high-priority (paying customer)
constellation.run(pipelineA, inputsA, ExecutionOptions(priority = 80))

// Tenant B: normal priority (free tier)
constellation.run(pipelineB, inputsB, ExecutionOptions(priority = 50))

// Tenant C: background (analytics job)
constellation.run(pipelineC, inputsC, ExecutionOptions(priority = 20))
}

Result: Tenant A's requests complete first, Tenant B's requests are delayed, Tenant C's requests wait (but eventually run due to starvation prevention).


Part 6: Troubleshooting

Problem: High Latency in HTTP API

Symptoms:

  • Requests take >100ms even for small pipelines
  • Network latency is acceptable (<5ms)

Diagnosis:

# Check metrics endpoint
curl http://localhost:8080/metrics | jq .

# Look for:
# - High compilation times (cache not working?)
# - High queue depth (scheduler overloaded?)
# - Low cache hit rate (<80% is bad)

Solutions:

  1. Enable caching: Use CachingLangCompiler
  2. Pre-compile pipelines: Use PipelineLoader at startup
  3. Increase scheduler concurrency: CONSTELLATION_SCHEDULER_MAX_CONCURRENCY=32
  4. Check module performance: Profile module execution times

Problem: Cache Not Hitting

Symptoms:

  • Identical source text recompiles every time
  • Cache hit rate is 0% or very low

Diagnosis:

// Check cache stats
compiler.cacheStats.flatMap { stats =>
IO.println(s"Hits: ${stats.hits}, Misses: ${stats.misses}, Hit Rate: ${stats.hitRate}")
}

Solutions:

  1. Function registry changing: Module versions or signatures changing invalidates cache
  2. Source whitespace differences: Even spaces/newlines change syntactic hash
  3. Cache eviction: LRU cache too small, increase max entries
  4. Wrong compiler instance: Using different compiler instance per request (create once at startup)

Problem: Embedded API Memory Leak

Symptoms:

  • Heap usage grows unbounded
  • GC pressure increases over time

Diagnosis:

# Heap dump
jmap -dump:format=b,file=heap.bin <pid>

# Analyze with VisualVM or Eclipse MAT

Common Causes:

  1. PipelineStore unbounded growth: No eviction policy, every compilation stores forever
  2. Module state leaks: Modules holding onto resources (unclosed connections, file handles)
  3. Listener accumulation: Adding execution listeners without removing them

Solutions:

  1. Implement PipelineStore eviction: Remove unused pipelines periodically
  2. Use Resource for modules: Ensure cleanup on shutdown
  3. Remove listeners: Unregister listeners when no longer needed

Summary Table: Execution Mode Decision Matrix

ScenarioRecommended ModeKey Reason
One-off scriptsCold ExecutionSimplest, no state management
Production APIs (JVM)Hot Execution + Embedded APILowest latency, highest throughput
Production APIs (polyglot)Hot Execution + HTTP APILanguage-agnostic, scalable
LSP / Interactive toolsCached Execution + HTTP APISub-5ms compilation, responsive UX
Batch processing (millions)Hot Execution + Embedded APIZero overhead, maximum throughput
Multi-tenant SaaSCached Execution + HTTP APIPer-tenant isolation, cache sharing
ML inference serverHot Execution + Embedded API + CacheLow latency, result caching
MicroservicesHot Execution + HTTP APIProcess isolation, independent deployment
CI/CD pipelinesCold ExecutionFresh compilation, no state persistence
Dashboard / Web UICached Execution + HTTP APIInteractive, browser clients

Next Steps