This document covers power-user features: running your own registry peer, airgap deployments, custom embeddings, the granularity dial, telemetry inspection, bulk shave, and the yakcc v2 self-shave demo.
For common errors and diagnostic steps, see TROUBLESHOOTING.md. For the basic walkthrough, see USING_YAKCC.md.
Serve your local registry to teammates over HTTP:
yakcc federation serve \
--registry .yakcc/registry.sqlite \
--port 8080
This starts a read-only HTTP server that exposes blocks by their BlockMerkleRoot. Any yakcc client can mirror from it:
# On a teammate's machine:
yakcc federation mirror \
--remote http://your-host:8080 \
--registry .yakcc/registry.sqlite
F1 has no authentication — run it behind a reverse proxy or on a private network. Every transferred block is integrity-checked by recomputing its content-address from the received bytes; a tampered transfer is rejected loudly, never silently accepted.
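The integrity check described above can be sketched roughly as follows. This is an illustration under stated assumptions, not yakcc's actual implementation: SHA-256 from Node's built-in `crypto` module stands in for whatever hash yakcc uses to derive a BlockMerkleRoot, and `contentAddress` and `verifyTransferredBlock` are hypothetical names.

```typescript
import { createHash } from "node:crypto";

// Hypothetical stand-in for yakcc's content-address function.
// ASSUMPTION: SHA-256 is used only because it ships with Node;
// the real BlockMerkleRoot derivation may differ.
function contentAddress(bytes: Uint8Array): string {
  return createHash("sha256").update(bytes).digest("hex");
}

// Recompute the address from the received bytes and fail loudly on mismatch,
// so a tampered transfer can never be stored under the requested root.
function verifyTransferredBlock(expectedRoot: string, received: Uint8Array): Uint8Array {
  const actual = contentAddress(received);
  if (actual !== expectedRoot) {
    throw new Error(`integrity check failed: expected ${expectedRoot}, got ${actual}`);
  }
  return received;
}
```

Because the address is derived from the bytes themselves, the receiver needs no trust in the transport or the peer: any bit flipped in transit changes the recomputed address.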
Pull the full atom set from a team registry:
yakcc federation mirror \
--remote https://team-registry.example.com \
--registry .yakcc/registry.sqlite
Cherry-pick a single known atom instead of mirroring everything:
yakcc federation pull \
--remote https://team-registry.example.com \
--root <BlockMerkleRoot> \
--registry .yakcc/registry.sqlite
Update your peer list in .yakccrc.json to point at the new registry:
{
"version": 1,
"registry": { "path": ".yakcc/registry.sqlite" },
"federation": {
"peers": ["https://team-registry-new.example.com"]
}
}
Yakcc is offline-first by design. The entire pipeline — shave, compile, registry query, hook intercept — operates with zero outbound network calls when:
The embedding model is cached locally. On first use, bge-small-en-v1.5 is downloaded once to the model cache. To pre-populate the cache before airgap:
# On an internet-connected machine, pre-warm the model cache
yakcc registry rebuild --path /dev/null # downloads + caches the model
cp -r ~/.cache/yakcc/ <portable-cache-dir>/
The seed corpus is pre-loaded. On the airgap machine:
# Copy your registry.sqlite to the target machine
yakcc seed --yakcc # skips download if registry.sqlite is already present
No federation peers are configured. yakcc federation mirror requires HTTP access; omit it in airgapped environments.
Verify the install works without network:
# Block outbound on macOS with pf or on Linux with iptables, then:
yakcc query "store a block by content address"
# Should return registry hits from the local corpus
The default embedding model is bge-small-en-v1.5 (per DEC-EMBED-MODEL-DEFAULT-002). After an upgrade that changes the model, existing registry vectors must be regenerated:
yakcc registry rebuild --path .yakcc/registry.sqlite
rebuild is idempotent and preserves all atom content byte-for-byte; only the embedding index is regenerated. Use it whenever you upgrade to a release that changes the embedding model, or see outcome: "passthrough" reported with a vector mismatch diagnostic.
Custom model (when the flag is available; see open issue for the configuration surface):
# Planned syntax — check `yakcc registry rebuild --help` for current flags
yakcc registry rebuild --path .yakcc/registry.sqlite --model <model-name>
The shave pipeline exposes a --granularity flag (range 1–5, per #463) that controls how finely the decomposer splits atoms:
| Level | Behaviour |
|---|---|
| 1 | Coarse — only top-level named exports are extracted |
| 3 (default) | Balanced — decomposes into logical sub-expressions |
| 5 | Fine — maximally atomic; more splits, smaller individual atoms |
yakcc shave src/my-utils.ts --granularity=5
Higher granularity produces more atoms with narrower intent, which improves hit rate on specific sub-problems. Lower granularity produces fewer, broader atoms that are more likely to match whole-function queries.
Every hook invocation appends a JSON line to ~/.yakcc/telemetry/<session-id>.jsonl. This file is local-only; nothing leaves your machine.
One event per emission:
{
t: 1715568000000, // unix-ms timestamp
intentHash: "blake3:…", // BLAKE3 of the emission text
toolName: "Edit" | "Write" | "MultiEdit",
latencyMs: 12,
outcome: "registry-hit" | "synthesis-required" | "passthrough",
substituted: true,
substitutedAtomHash: "7f3a1c…" // BMR[:8] of the substituted atom, or null
}
Quick hit/miss tally for the current session:
jq -s 'group_by(.outcome) | map({outcome: .[0].outcome, count: length})' \
~/.yakcc/telemetry/<session-id>.jsonl
Rolling 7-day view across all sessions:
jq -s 'map(select(.t > (now * 1000 - 7*24*3600*1000)))
| group_by(.outcome) | map({outcome: .[0].outcome, count: length})' \
~/.yakcc/telemetry/*.jsonl
A dedicated yakcc telemetry subcommand is on the roadmap; until then jq is the read surface.
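For post-processing outside jq, the event shape above maps directly onto a TypeScript type. The sketch below is an assumption-laden illustration, not part of yakcc: `TelemetryEvent` is transcribed from the event description above, and `hitRateSince` is a hypothetical helper that combines the two jq queries into a single registry-hit ratio over a time window.

```typescript
// One telemetry line, transcribed from the documented event shape.
type Outcome = "registry-hit" | "synthesis-required" | "passthrough";

interface TelemetryEvent {
  t: number; // unix-ms timestamp
  intentHash: string;
  toolName: "Edit" | "Write" | "MultiEdit";
  latencyMs: number;
  outcome: Outcome;
  substituted: boolean;
  substitutedAtomHash: string | null;
}

// Hypothetical helper: fraction of events newer than `sinceMs`
// whose outcome is "registry-hit". Returns 0 when no events qualify.
function hitRateSince(jsonl: string, sinceMs: number): number {
  let hits = 0;
  let total = 0;
  for (const line of jsonl.split("\n")) {
    if (line.trim() === "") continue; // skip blank lines such as a trailing newline
    const event = JSON.parse(line) as TelemetryEvent;
    if (event.t <= sinceMs) continue; // same cutoff rule as the jq filter (.t > cutoff)
    total += 1;
    if (event.outcome === "registry-hit") hits += 1;
  }
  return total === 0 ? 0 : hits / total;
}
```

To reproduce the 7-day view, read a session file into a string and pass `Date.now() - 7 * 24 * 3600 * 1000` as `sinceMs`.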
To ingest an entire TypeScript workspace into the registry:
yakcc bootstrap
This traverses every file in the workspace, decomposes JSDoc-annotated exports into atoms, and writes a manifest at bootstrap/expected-roots.json. Add --verify to byte-compare the produced manifest against a committed baseline:
yakcc bootstrap --verify
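A byte comparison is stricter than a structural JSON comparison: a change in whitespace or key order counts as drift. The sketch below shows the idea only; how `--verify` is implemented inside yakcc is not shown in this document, and `firstDifference` is a hypothetical helper name.

```typescript
// Hypothetical helper: return the offset of the first differing byte,
// or -1 when the two buffers are byte-identical.
function firstDifference(produced: Uint8Array, baseline: Uint8Array): number {
  const shared = Math.min(produced.length, baseline.length);
  for (let i = 0; i < shared; i++) {
    if (produced[i] !== baseline[i]) return i;
  }
  // If one buffer is a strict prefix of the other, they differ at the shorter length.
  return produced.length === baseline.length ? -1 : shared;
}
```

Reporting the first differing offset, rather than a plain boolean, makes it easier to locate which manifest entry drifted.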
After an embedding-model upgrade, regenerate vectors without re-shaving:
yakcc registry rebuild --path .yakcc/registry.sqlite
For a single file:
yakcc shave src/my-utils.ts
Re-shaving is a no-op for unchanged files (content-addressed idempotency). Shave your most heavily-reused modules first for the fastest return on corpus density.
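Content-addressed idempotency can be illustrated with a toy in-memory store. Everything here is an assumption made for illustration: `AtomStore` is a hypothetical class, not yakcc's registry, and SHA-256 stands in for the real content-address function.

```typescript
import { createHash } from "node:crypto";

// Toy store keyed by content address. Writing the same bytes twice
// resolves to the same key, so the second write changes nothing.
class AtomStore {
  private readonly atoms = new Map<string, Uint8Array>();

  put(bytes: Uint8Array): { root: string; inserted: boolean } {
    // ASSUMPTION: SHA-256 as a stand-in for the real address derivation.
    const root = createHash("sha256").update(bytes).digest("hex");
    if (this.atoms.has(root)) return { root, inserted: false };
    this.atoms.set(root, bytes);
    return { root, inserted: true };
  }

  get size(): number {
    return this.atoms.size;
  }
}
```

Since the key is a pure function of the content, re-running a shave over unchanged source can only produce keys that already exist.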
yakcc shaves the meaningfully-reusable parts of arbitrary TypeScript — including its own source — recompiles itself from those atoms, and the recompiled yakcc produces the same manifest. Reproducible from a fresh clone in 4 commands:
pnpm install --frozen-lockfile && pnpm -r build
node packages/cli/dist/bin.js bootstrap --verify
node packages/cli/dist/bin.js compile-self --output=dist-recompiled/
YAKCC_TWO_PASS=1 pnpm --filter @yakcc/v2-self-shave-poc test two-pass-equivalence
What each pass proves:
Pass 1 (bootstrap --verify): yakcc shaves its own source into a content-addressed manifest of 3,807 atoms. The manifest matches bootstrap/expected-roots.json byte-for-byte.
Pass 2 (YAKCC_TWO_PASS=1): the recompiled yakcc (assembled entirely from its own atoms) produces the same manifest. Byte-identity across the compile-self round-trip.
This is the “moat” claim: if yakcc can shave and recompile itself without drift, it can do the same for any sufficiently well-structured TypeScript codebase.
For the full fresh-clone reproduction with captured output and the “If equivalence fails” taxonomy, see docs/V2_SELF_SHAVE_DEMO.md.
For pass-1 internals (bootstrap mechanics, manifest semantics, CI integration) see docs/archive/developer/V2_SELF_HOSTING_DEMO.md.