Language Profiles

Why language profiles?

Up to v0.3.x, @fractal-co-design/fca-index could only scan TypeScript projects — the file/dir classification rules were hardcoded. v0.4.0 introduces a declarative LanguageProfile shape that captures everything the scanner needs to know about a single language ecosystem: file extensions, package markers, file/dir → FCA part rules, and language-specific extractors.

Five built-in profiles ship with v0.4.0:

| Profile | Source ext. | L3 markers | Test files |
| --- | --- | --- | --- |
| typescript (default) | .ts | package.json | *.test.ts, *.spec.ts |
| scala | .scala | build.sbt, *.sbt, pom.xml | *Spec.scala, *Test.scala |
| python | .py, .pyi | pyproject.toml, setup.py, setup.cfg | test_*.py, *_test.py, conftest.py |
| go | .go | go.mod, go.sum | *_test.go |
| markdown-only | .md, .mdx | (none) | (none) |

You can mix any subset: a TS+Scala monorepo uses languages: ['typescript', 'scala'], a docs-only RFC vault uses languages: ['markdown-only'], and a Python service codebase uses languages: ['python'].

Selecting profiles

Custom profiles are SDK-only. YAML accepts only built-in profile names (string array). To register a custom LanguageProfile, pass it to createFcaIndex/createDefaultFcaIndex programmatically — see Programmatically below. There is no YAML syntax for declaring a custom profile because LanguageProfile carries function fields (extractInterfaceExcerpt, extractDocBlock) that are not serializable.

Via .fca-index.yaml

The simplest route — list built-in profile names. Order matters when two profiles match the same file (the earlier-listed profile wins).

# .fca-index.yaml
languages:
  - typescript
  - scala

Both block list and inline flow forms are accepted:

languages: [typescript, scala]

When languages is absent, the scanner uses ['typescript'] only — preserving v0.3.x behavior.

Unknown profile names throw a LanguageProfileError at the start of the next scan call; the error message lists the known built-ins.
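The default and the unknown-name failure described above can be sketched as follows. This is a minimal illustration, not the library's resolver: resolveLanguageNames and LanguageProfileErrorSketch are hypothetical names, and the error message format is an assumption. Only the list of built-in names and the ['typescript'] fallback come from this page.

```typescript
// Built-in names as listed in the table at the top of this page.
const BUILT_IN_PROFILE_NAMES = ['typescript', 'scala', 'python', 'go', 'markdown-only'];

// Local stand-in for the library's LanguageProfileError; the real class may
// carry different fields or a different message format.
class LanguageProfileErrorSketch extends Error {}

// Hypothetical helper: validates the `languages` value read from YAML.
function resolveLanguageNames(configured?: string[]): string[] {
  // An absent key falls back to the v0.3.x-compatible default.
  if (configured === undefined) return ['typescript'];

  const unknown = configured.filter(n => BUILT_IN_PROFILE_NAMES.indexOf(n) === -1);
  if (unknown.length > 0) {
    // The message lists the known built-ins, as the prose above describes.
    throw new LanguageProfileErrorSketch(
      `Unknown language profile(s): ${unknown.join(', ')}. ` +
        `Known built-ins: ${BUILT_IN_PROFILE_NAMES.join(', ')}`,
    );
  }
  return configured;
}

console.log(resolveLanguageNames());                 // [ 'typescript' ]
console.log(resolveLanguageNames(['go', 'python'])); // [ 'go', 'python' ]
```

Calling resolveLanguageNames(['typescript', 'rust']) in this sketch throws, and the message names both the offending entry and the five built-ins.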

Programmatically

Pass LanguageProfile[] to createFcaIndex or createDefaultFcaIndex. Use this when you have a custom profile (see “Authoring a custom profile” below) or want to override what .fca-index.yaml declares:

import { createDefaultFcaIndex, scalaProfile, typescriptProfile } from '@fractal-co-design/fca-index';

const fca = await createDefaultFcaIndex({
  projectRoot: '/path/to/repo',
  voyageApiKey: process.env.VOYAGE_API_KEY!,
  languages: [typescriptProfile, scalaProfile],
});
await fca.scan();

If both .fca-index.yaml and FcaIndexConfig.languages are set, the programmatic profiles come first and the YAML-resolved ones are appended. Concretely:

// .fca-index.yaml has: languages: [typescript]
// SDK call:
const fca = await createDefaultFcaIndex({
  projectRoot,
  voyageApiKey,
  languages: [scalaProfile], // SDK-supplied
});
// Effective active profiles, in order: [scalaProfile, typescriptProfile]
// → Scala rules are tried first when classifying any file the two profiles
//   both claim.

This gives SDK callers the final say on profile precedence; YAML config acts as a set of additive defaults. To exclude YAML profiles entirely, omit the languages: key from the manifest, or pass an explicit empty list in code paths that bypass the manifest.
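The merge order can be sketched in a few lines. This is an illustration under stated assumptions rather than the library's code: ProfileRef and resolveActiveProfiles are hypothetical names, and de-duplicating by profile name is a choice made for this sketch that the documentation does not confirm.

```typescript
// Hypothetical, trimmed-down stand-in for LanguageProfile; only `name` matters here.
interface ProfileRef {
  name: string;
}

// Merge rule taken from the prose above: SDK-supplied profiles first,
// YAML-resolved profiles appended. Dropping later duplicates by name is an
// assumption of this sketch, not documented behavior.
function resolveActiveProfiles(
  sdkProfiles: ProfileRef[],
  yamlProfiles: ProfileRef[],
): ProfileRef[] {
  const seen = new Set<string>();
  const active: ProfileRef[] = [];
  for (const profile of [...sdkProfiles, ...yamlProfiles]) {
    if (seen.has(profile.name)) continue; // keep the earliest occurrence
    seen.add(profile.name);
    active.push(profile);
  }
  return active;
}

const mergedProfiles = resolveActiveProfiles([{ name: 'scala' }], [{ name: 'typescript' }]);
console.log(mergedProfiles.map(p => p.name)); // [ 'scala', 'typescript' ]
```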

File classification ordering

When two active profiles claim the same file (e.g. both typescript and scala include *.md in their filePatterns), the first profile in the active list wins. The “active list” is the resolved LanguageProfile[] after merging programmatic + YAML, in the order described above.

Concrete example — README.md in a directory scanned with languages: [typescript, scala]:

| Profile order | First-matching profile | File classified by |
| --- | --- | --- |
| [typescript, scala] | typescript | typescript’s *.md rule |
| [scala, typescript] | scala | scala’s *.md rule |

For most cases the difference is invisible because both profiles assign README.md to the documentation part. But if you author a custom profile that classifies *.md differently (e.g. as architecture for ADR files), profile order determines which classification wins.

After a part has been attributed once for a component (by the first matching file), subsequent files matching the same part are ignored — same as v0.3.x. So profile order matters at the file granularity, not at the part granularity.
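The two rules above (first profile wins per file, first file wins per part) can be sketched as follows. The types and helpers here (ClassifierProfile, classifyFile, attributeParts) are hypothetical simplifications, and the two example profiles are invented for illustration; the sketch also assumes files are visited in the order given.

```typescript
// Hypothetical, reduced view of a profile: just a name and its filePatterns.
interface PatternRule {
  pattern: RegExp;
  part: string;
}
interface ClassifierProfile {
  name: string;
  filePatterns: PatternRule[];
}

// File granularity: the earliest profile with a matching rule decides the part.
function classifyFile(
  basename: string,
  profiles: ClassifierProfile[],
): { profile: string; part: string } | undefined {
  for (const profile of profiles) {
    for (const rule of profile.filePatterns) {
      if (rule.pattern.test(basename)) {
        return { profile: profile.name, part: rule.part };
      }
    }
  }
  return undefined;
}

// Part granularity: once a part has a locator file, later matches are ignored.
function attributeParts(
  basenames: string[],
  profiles: ClassifierProfile[],
): Map<string, string> {
  const parts = new Map<string, string>();
  for (const basename of basenames) {
    const hit = classifyFile(basename, profiles);
    if (hit && !parts.has(hit.part)) parts.set(hit.part, basename);
  }
  return parts;
}

// Illustrative profiles: one treats ADR markdown as architecture.
const adrProfile: ClassifierProfile = {
  name: 'adr',
  filePatterns: [{ pattern: /^adr-.*\.md$/, part: 'architecture' }],
};
const docsProfile: ClassifierProfile = {
  name: 'docs',
  filePatterns: [{ pattern: /\.md$/, part: 'documentation' }],
};

console.log(classifyFile('adr-001.md', [adrProfile, docsProfile]));
// { profile: 'adr', part: 'architecture' }
console.log(classifyFile('adr-001.md', [docsProfile, adrProfile]));
// { profile: 'docs', part: 'documentation' }
```

Swapping the profile order changes the classification of adr-001.md, while attributeParts(['README.md', 'NOTES.md'], [docsProfile]) records only README.md for the documentation part.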

Authoring a custom LanguageProfile

The LanguageProfile shape is the SDK extension point. Built-in profiles use exactly the same shape — there’s no special path for new languages. Worked example: defining a Scala profile from scratch.

import type { LanguageProfile } from '@fractal-co-design/fca-index';

const MAX_EXCERPT = 600;

export const scalaProfile: LanguageProfile = {
  // 1) Identity: lowercase, kebab-case, stable.
  name: 'scala',

  // 2) File extensions for source files in this language. Used for
  //    component qualification + boundary/L1 detection. Include the dot.
  sourceExtensions: ['.scala'],

  // 3) Files whose presence at a directory's root marks it as L3.
  //    Wildcards: a leading `*` is allowed for suffix matching
  //    (`*.sbt` → any file ending in `.sbt`). Other markers are exact
  //    filename matches.
  packageMarkers: ['build.sbt', '*.sbt', 'pom.xml'],

  // 4) Filename → FCA part rules. First match across all active
  //    profiles wins per file. Use a `^...$` regex to anchor when the
  //    pattern needs to match the entire basename.
  filePatterns: [
    { pattern: /^README\.md$/, part: 'documentation' },
    { pattern: /^(?!.*\.test\.md$).*\.md$/, part: 'documentation' },
    { pattern: /Spec\.scala$/, part: 'verification' },
    { pattern: /Test\.scala$/, part: 'verification' },
    { pattern: /^[Aa]rchitecture\.scala$/, part: 'architecture' },
    { pattern: /Metrics\.scala$/, part: 'observability' },
    { pattern: /Port\.scala$/, part: 'port' },
    { pattern: /Domain\.scala$/, part: 'domain' },
    { pattern: /^package\.scala$/, part: 'interface' },
  ],

  // 5) Subdirectory-name → FCA part rules. A child directory whose
  //    name matches a key marks the parent as having that part. The
  //    locator file is the first source-extension file found inside.
  subdirPatterns: {
    ports: 'port',
    observability: 'observability',
    arch: 'architecture',
    domain: 'domain',
  },

  // 6) Component qualification: a directory qualifies if either
  //    `interfaceFile` is present (case-sensitive) or it has at least
  //    `minSourceFiles` files matching `sourceExtensions`.
  componentRule: {
    interfaceFile: 'package.scala',
    minSourceFiles: 2,
  },

  // 7) Optional: pull the public-API excerpt from the interface file.
  //    Receives raw content; return the trimmed excerpt (the caller
  //    truncates to ≤600 chars).
  extractInterfaceExcerpt(content) {
    const lines = content.split('\n');
    const sigLines = lines.filter(line =>
      /^\s*(?:final\s+|sealed\s+|abstract\s+|case\s+|implicit\s+|private\s+|protected\s+|override\s+)*(?:trait|class|object|case\s+class|case\s+object|def|val|var|type)\s+/.test(line),
    );
    return sigLines.length > 0
      ? sigLines.join('\n').slice(0, MAX_EXCERPT).trimEnd()
      : content.slice(0, MAX_EXCERPT).trimEnd();
  },

  // 8) Optional: pull the leading documentation block from any source
  //    file (JSDoc / ScalaDoc / docstring / godoc). Return empty string
  //    when absent.
  extractDocBlock(content) {
    const match = content.match(/^\s*\/\*\*([\s\S]*?)\*\//);
    return match ? match[0].slice(0, MAX_EXCERPT).trimEnd() : '';
  },
};

Pass it to the factory just like a built-in:

import { createDefaultFcaIndex } from '@fractal-co-design/fca-index';
import { scalaProfile } from './scala-profile.js';

const fca = await createDefaultFcaIndex({
  projectRoot,
  voyageApiKey,
  languages: [scalaProfile],
});
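The packageMarkers wildcard rule from step 3 of the worked example (a leading * means suffix match, anything else is an exact filename match) could be evaluated along these lines. matchesPackageMarker and isL3Root are hypothetical helper names; the scanner's actual matching code may differ.

```typescript
// Sketch of the marker rule: a leading `*` means "ends with the rest of the
// marker"; every other marker must equal the filename exactly.
function matchesPackageMarker(marker: string, filename: string): boolean {
  if (marker.startsWith('*')) {
    return filename.endsWith(marker.slice(1));
  }
  return filename === marker;
}

// A directory counts as L3 when any file at its root matches any marker.
function isL3Root(rootFilenames: string[], markers: string[]): boolean {
  return rootFilenames.some(file =>
    markers.some(marker => matchesPackageMarker(marker, file)),
  );
}

const scalaMarkers = ['build.sbt', '*.sbt', 'pom.xml'];
console.log(isL3Root(['Main.scala', 'plugins.sbt'], scalaMarkers)); // true
console.log(isL3Root(['Main.scala', 'pom.xml.bak'], scalaMarkers)); // false
```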

Behavior in polyglot scans

When multiple profiles are active:

  • Component qualification is a UNION — a directory qualifies if ANY active profile considers it a component (interface file present, or ≥ minSourceFiles of that profile’s extensions).
  • Level detection uses the UNION of all packageMarkers for L3 decisions and the UNION of sourceExtensions for L1/source counts.
  • File classification is FIRST-WINS by profile order: the earliest profile in the list whose filePatterns matches the file’s basename attributes the part. After a part is attributed once for a component, later files that would match the same part are ignored (same as v0.3.x).
  • Default source globs (when sourcePatterns is unset): for the TypeScript-only default, the globs are ['src/**', 'packages/*/src/**'], the same as v0.3.x. For multi-language or non-TS scans, the default broadens to ['src/**', 'packages/*/src/**', 'modules/**', 'apps/**']. Override via sourcePatterns: in .fca-index.yaml for tighter control.
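Two of the polyglot rules above, union-based component qualification and the default source globs, can be sketched as follows. QualifyingProfile, isComponent, and defaultSourceGlobs are hypothetical names, the TypeScript minSourceFiles value is an illustrative guess, and detecting the "TypeScript-only default" by checking for a single profile named typescript is an assumption of this sketch.

```typescript
// Hypothetical, reduced view of a profile for qualification purposes.
interface QualifyingProfile {
  name: string;
  sourceExtensions: string[];
  componentRule: { interfaceFile: string; minSourceFiles: number };
}

// One profile's rule: interface file present, or enough matching source files.
function qualifiesUnder(profile: QualifyingProfile, filenames: string[]): boolean {
  if (filenames.some(f => f === profile.componentRule.interfaceFile)) return true;
  const sourceCount = filenames.filter(f =>
    profile.sourceExtensions.some(ext => f.endsWith(ext)),
  ).length;
  return sourceCount >= profile.componentRule.minSourceFiles;
}

// Union: a directory is a component if ANY active profile accepts it.
function isComponent(filenames: string[], profiles: QualifyingProfile[]): boolean {
  return profiles.some(profile => qualifiesUnder(profile, filenames));
}

// Default globs when sourcePatterns is unset, using the lists quoted above.
function defaultSourceGlobs(profiles: QualifyingProfile[]): string[] {
  const tsOnly = profiles.length === 1 && profiles[0].name === 'typescript';
  return tsOnly
    ? ['src/**', 'packages/*/src/**']
    : ['src/**', 'packages/*/src/**', 'modules/**', 'apps/**'];
}

// Illustrative values; minSourceFiles for typescript is assumed, not documented here.
const tsLike: QualifyingProfile = {
  name: 'typescript',
  sourceExtensions: ['.ts'],
  componentRule: { interfaceFile: 'index.ts', minSourceFiles: 2 },
};
const scalaLike: QualifyingProfile = {
  name: 'scala',
  sourceExtensions: ['.scala'],
  componentRule: { interfaceFile: 'package.scala', minSourceFiles: 2 },
};

console.log(isComponent(['Foo.scala', 'Bar.scala'], [tsLike, scalaLike])); // true
console.log(isComponent(['Foo.scala'], [tsLike, scalaLike]));              // false
console.log(defaultSourceGlobs([tsLike]).length);                          // 2
console.log(defaultSourceGlobs([tsLike, scalaLike]).length);               // 4
```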

Built-in profile reference

typescript (default)

| Aspect | Value |
| --- | --- |
| Source ext. | .ts |
| L3 markers | package.json |
| Interface file | index.ts (with export keyword) |
| Verification | *.test.ts, *.spec.ts, *.contract.test.ts |
| Port | *port.ts (incl. .port.ts, -port.ts) |
| Observability | *.metrics.ts, *.observability.ts |
| Domain | *-domain.ts |
| Architecture | architecture.ts |
| Subdirs | ports/, observability/, arch/, domain/ |
| Doc block | JSDoc /** ... */ |
| Interface excerpt | exported `type |

scala

| Aspect | Value |
| --- | --- |
| Source ext. | .scala |
| L3 markers | build.sbt, *.sbt, pom.xml |
| Interface file | package.scala |
| Verification | *Spec.scala, *Test.scala, *IntegrationSpec.scala, *IT.scala |
| Port | *Port.scala |
| Observability | *Metrics.scala, *Observability.scala, *Telemetry.scala |
| Domain | *Domain.scala |
| Architecture | architecture.scala, Architecture.scala |
| Subdirs | ports/, observability/, arch/, domain/ |
| Doc block | ScalaDoc /** ... */ |
| Interface excerpt | top-level def, val, var, trait, class, object, case class, case object, type |

python

| Aspect | Value |
| --- | --- |
| Source ext. | .py, .pyi |
| L3 markers | pyproject.toml, setup.py, setup.cfg |
| Interface file | __init__.py |
| Verification | test_*.py, *_test.py, tests.py, conftest.py |
| Port | *_port.py, port.py, ports.py |
| Observability | metrics.py, *_metrics.py, observability.py, telemetry.py |
| Domain | domain.py, *_domain.py |
| Architecture | architecture.py, arch.py |
| Subdirs | ports/, observability/, arch/, domain/ |
| Doc block | module-level """...""" or '''...''' docstring |
| Interface excerpt | top-level def, class, from … import, import, __all__ |

go

| Aspect | Value |
| --- | --- |
| Source ext. | .go |
| L3 markers | go.mod, go.sum |
| Interface file | doc.go |
| Verification | *_test.go |
| Port | port.go, ports.go, *_port.go |
| Observability | metrics.go, *_metrics.go, observability.go, telemetry.go |
| Domain | domain.go, *_domain.go |
| Architecture | architecture.go, arch.go |
| Subdirs | ports/, observability/, arch/, domain/ |
| Doc block | leading // ... lines (godoc) or leading /* ... */ block |
| Interface excerpt | top-level func, type, var, const |

markdown-only

| Aspect | Value |
| --- | --- |
| Source ext. | .md, .mdx |
| L3 markers | (none) |
| Interface file | README.md |
| Doc parts | README.md, README.rst, *.md, *.mdx |

Useful as one entry in a polyglot list to ensure README/docs always count when other profiles miss them, or alone for docs-only repos.

Migration from v0.3.x

No migration required. v0.3.x scans behave identically in v0.4.0 — the implicit default is ['typescript'], and the typescript profile reproduces the v0.3.x rule set exactly.

To opt into polyglot, add a single line to .fca-index.yaml:

languages: [typescript, scala] # or any other combination

See also

  • Guide 38 — fca-index getting started
  • Guide 39 — fca-index MCP tools
  • PRD 057 — Language profiles (this change’s PRD)
  • docs/arch/fca-index.md — architecture overview, including the LanguageProfile surface