List Processing Pipeline

Use higher-order functions to filter and transform lists - the foundation of functional data processing.

Use Case

Functional Style

If you are familiar with functional programming, these operations (map, filter, all, any) work exactly as you would expect from Scala, Haskell, or similar languages.

You have a list of items and need to:

Filter items matching criteria
Transform each item
Check conditions across all items

This is essential for data preprocessing, validation, and transformation.

The Pipeline

# list-processing.cst
# Filter and transform lists with lambda functions

use stdlib.collection

in numbers: List<Int>
in threshold: Int

# Filter: keep only numbers greater than threshold
filtered = filter(numbers, (x) => gt(x, threshold))

# Map: double each number
doubled = map(numbers, (x) => multiply(x, 2))

# All: check if all numbers are positive
allPositive = all(numbers, (x) => gt(x, 0))

# Any: check if any number exceeds 100
anyLarge = any(numbers, (x) => gt(x, 100))

out filtered
out doubled
out allPositive
out anyLarge

Explanation

Function	Description	Signature
`filter`	Keep items matching predicate	`(List<T>, (T) => Boolean) => List<T>`
`map`	Transform each item	`(List<T>, (T) => U) => List<U>`
`all`	True if all items match	`(List<T>, (T) => Boolean) => Boolean`
`any`	True if any item matches	`(List<T>, (T) => Boolean) => Boolean`

Running the Example

Input

{
  "numbers": [5, 25, 15, 150, 10],
  "threshold": 10
}

Expected Output

{
  "filtered": [25, 15, 150],
  "doubled": [10, 50, 30, 300, 20],
  "allPositive": true,
  "anyLarge": true
}

Variations

Filter and Compute

Filter then calculate statistics:

use stdlib.collection

in scores: List<Int>
in passingScore: Int

# Get only passing scores
passing = filter(scores, (s) => gte(s, passingScore))

# Calculate stats on passing scores
passingCount = list-length(passing)
passingAvg = Average(passing)

out passingCount
out passingAvg

Chain Transformations

Apply multiple transformations in sequence:

use stdlib.collection

in numbers: List<Int>

# First double all numbers
step1 = map(numbers, (x) => multiply(x, 2))

# Then filter to keep only those > 50
step2 = filter(step1, (x) => gt(x, 50))

# Finally check if any exceed 200
hasVeryLarge = any(step2, (x) => gt(x, 200))

out step2
out hasVeryLarge

Validation Pipeline

Validate data meets criteria:

use stdlib.collection

in values: List<Int>
in minValue: Int
in maxValue: Int

# Check all values are within range
allAboveMin = all(values, (v) => gte(v, minValue))
allBelowMax = all(values, (v) => lte(v, maxValue))

# Check if any values are exactly at boundaries
anyAtMin = any(values, (v) => eq-int(v, minValue))
anyAtMax = any(values, (v) => eq-int(v, maxValue))

out allAboveMin
out allBelowMax
out anyAtMin
out anyAtMax

Score Normalization

Transform scores to a different scale:

use stdlib.collection

in rawScores: List<Int>
in multiplier: Int

# Scale all scores
scaled = map(rawScores, (score) => multiply(score, multiplier))

# Filter to valid range (0-100 after scaling)
valid = filter(scaled, (s) => and(gte(s, 0), lte(s, 100)))

out scaled
out valid

Lambda Syntax

Lambdas in constellation-lang use arrow syntax:

# Single parameter
(x) => expression

# With comparison
(x) => gt(x, 10)

# With arithmetic
(x) => multiply(x, 2)

# With boolean logic
(x) => and(gt(x, 0), lt(x, 100))

Real-World Applications

Data Cleaning

Filter out invalid entries
Transform values to standard format

Validation

Check all items meet criteria
Find items that violate rules

Feature Engineering

Scale features for ML models
Create derived features

Business Logic

Filter products by criteria
Apply bulk transformations

Performance Considerations

Short-Circuit Evaluation

any stops at the first matching element, and all stops at the first non-matching element. This makes them efficient for early exit conditions.

Filter early: Reduce list size before expensive operations
Avoid redundant passes: Combine conditions when possible
Use short-circuit evaluation: any stops at first match, all stops at first failure

Best Practices

Lambda Complexity

Keep lambdas to single expressions. If you need complex logic, break it into multiple pipeline steps with meaningful variable names.

Keep lambdas simple: Complex logic should be in separate steps
Name intermediate results: Makes pipeline readable and debuggable
Validate inputs: Check list isn't empty before operations like list-first

Data Statistics Pipeline - Compute stats on filtered lists
Scoring Pipeline - Use filtering in scoring logic

Use Case​

The Pipeline​

Explanation​

Running the Example​

Input​

Expected Output​

Variations​

Filter and Compute​

Chain Transformations​

Validation Pipeline​

Score Normalization​

Lambda Syntax​

Real-World Applications​

Data Cleaning​

Validation​

Feature Engineering​

Business Logic​

Performance Considerations​

Best Practices​

Related Examples​