Benchmarks

Ten properties of the yakcc paradigm and what the evidence shows for each. Each section gives the problem statement, what yakcc changes, and a real-data infographic.

Every claim cites real measurement data; results that are partial, in flight, or pending are shown as such.

Your import for isEmail ships 86 other validators you'll never call

An AI generating code to validate an email instinctively writes `import validator from 'validator'`. That single line drags in 86 different validators — isCreditCard, isBtcAddress, isEthereumAddress, isFQDN, isIBAN — plus 425 internal helpers. Every byte ships to your users. Every package gets scanned by every supply-chain tool. Every transitive dep is something you didn't audit.

What yakcc changes: yakcc serves a 6-function atom that does exactly the email validation you needed. No transitive closure. No extra surface. No bytes for behaviors you didn't ask for.

Measured: 511 reachable functions → 6 reachable functions (98.8% reduction). 260,056 bytes → 5,301 bytes (98.0%). 114 files → 1 file.
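
As a quick check of the arithmetic behind the figures above, here is a minimal TypeScript sketch. The helper name `reductionPercent` is ours for illustration, not part of yakcc; the inputs are the measured counts quoted above.

```typescript
// Percentage reduction from a "before" value to an "after" value,
// formatted to one decimal place.
function reductionPercent(before: number, after: number): string {
  return (((before - after) / before) * 100).toFixed(1);
}

// Inputs are the figures quoted in the "Measured" line.
const functionsCut = reductionPercent(511, 6);
const bytesCut = reductionPercent(260056, 5301);

console.log(`functions: ${functionsCut}%, bytes: ${bytesCut}%`);
```

Running it prints `functions: 98.8%, bytes: 98.0%`, which matches the quoted reductions.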

Side-by-side comparison: validator npm package showing all 86 is* validators with isEmail highlighted vs the yakcc atom's 6 function call graph
When you import a JSON schema validator, ~78% of what ships is dead on your contract

ajv@8 is the de facto JSON Schema validator on npm. To validate a single schema your code uses, it ships 471 functions across 6 packages and 133 files. The 156-case JSON Schema 2020-12 test suite, the same one yakcc's atom passes, exercises at most about 22% of those functions. The rest is dead-on-your-contract code, but it is still in your bundle, still scanned by every CVE tool, and still attack surface.

What yakcc changes: yakcc's JSON Schema validator atom is 17 functions, 1 file, 11 KB minified. Every function is exercised by the same 156-case suite. There is no dead code because there is nothing in the bundle that isn't part of the contract.

Measured: 471 functions → 17 functions (96.4% reduction). 117,520 bytes → 11,352 bytes (90.3%). 12 packages → 1 atom. 156 / 156 test cases pass on both.

Dot cluster of all 471 ajv functions (faded for dead-on-contract, dark red for exercised) vs the 17-function yakcc atom call graph with every function colored as exercised
Every transitive dep is a backdoor opportunity your team didn't review

Recent supply-chain attacks, most of them on npm, all entered through deps users barely knew they had. event-stream (2018) added a Bitcoin wallet stealer to a transitive dep of millions of projects. ua-parser-js (2021) had its maintainer's account hijacked and a crypto miner pushed to a package with 7M weekly downloads. colors / faker (2022) self-sabotaged, breaking thousands of production apps. polyfill.io (2024) was sold, and the new owner injected malware into 100K+ sites. xz-utils (2024) was nearly the worst: a multi-year social engineering attack to backdoor sshd, caught shortly before it would have shipped in stable Debian and Fedora releases.

What yakcc changes: Atoms have no transitive closure to compromise. They are not packages with maintainers; they are content-addressed bytes. The only way to alter an atom without changing its address is to find a BLAKE3 hash collision, and no practical attack of that kind is known. There is no registry mirror to poison and no dep tree to walk.

Architecture proof: every transferred atom is integrity-checked by recomputing its BlockMerkleRoot from the received bytes. A tampered transfer fails loud. No maintainer trust required.
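
The integrity check described above can be sketched as follows. This is an illustration under stated assumptions, not yakcc's implementation: `verifyTransfer` and `standInHash` are hypothetical names, and the hash is a small non-cryptographic FNV-1a stand-in used only to keep the example self-contained, where the text above names BLAKE3 and a BlockMerkleRoot.

```typescript
type Hash = (bytes: Uint8Array) => string;

// Stand-in hash (FNV-1a, 32-bit). NOT cryptographic: it only keeps this
// sketch dependency-free. The page itself names BLAKE3.
function standInHash(bytes: Uint8Array): string {
  let h = 0x811c9dc5;
  bytes.forEach((b) => {
    h ^= b;
    h = Math.imul(h, 0x01000193) >>> 0;
  });
  return ("00000000" + h.toString(16)).slice(-8);
}

// Recompute the content address from the received bytes and fail loudly
// if it does not match the address that was requested.
function verifyTransfer(received: Uint8Array, expectedRoot: string, hash: Hash): Uint8Array {
  const actual = hash(received);
  if (actual !== expectedRoot) {
    throw new Error(`integrity check failed: expected ${expectedRoot}, got ${actual}`);
  }
  return received;
}

// Hypothetical atom payload: the ASCII bytes of "isEmail".
const atomBytes = new Uint8Array([105, 115, 69, 109, 97, 105, 108]);
const root = standInHash(atomBytes);

// Simulate tampering in transit by changing one byte.
const tampered = new Uint8Array(atomBytes);
tampered[0] = 73;

function transferSucceeds(bytes: Uint8Array): boolean {
  try {
    verifyTransfer(bytes, root, standInHash);
    return true;
  } catch {
    return false;
  }
}

console.log(`intact: ${transferSucceeds(atomBytes)}, tampered: ${transferSucceeds(tampered)}`);
```

The receiver needs only the expected address and the bytes; nothing in the check depends on who sent them, which is the sense in which no maintainer trust is required.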

Five named supply-chain attacks (event-stream, ua-parser-js, colors/faker, polyfill.io, xz-utils) with weekly download impact, vs yakcc's content-addressed atom model with no maintainer to compromise

Your team writes parseRfc3339Datetime 47 times. So does every team.

Every project generates parseInt, debounce, isEmail, and 6,000 other small utilities from scratch — from an LLM, from Stack Overflow, from copy-paste. Each implementation is slightly different. Each gets its own bugs. Each consumes review time. Across the industry, the same handful of functions get re-derived millions of times a year.

What yakcc changes: Write each atom exactly once, globally. Every reuse pulls the same content-addressed bytes the original author wrote. The first person pays the cost; everyone after gets it free. The commons accumulates one verified atom per problem, period.

Current commons size: 6,000+ atoms (bootstrap/expected-roots.json over git history). Per-user reuse rate sprint measurement: pending #187.

In flight: the per-user reuse rate measurement (B3 sprint) is filed and pending dispatch.

Status-quo column listing five duplicated-implementation scenarios vs yakcc commons with 6000+ atoms, write-once-pay-once model
When the registry is confident, the model emits a reference instead of re-writing the code

When an LLM emits a function, it spends 500–2000 output tokens generating the implementation. Complex tasks need expensive models — Opus, Sonnet, GPT-4 — because cheaper models fail. That's the per-emission cost: real money, real latency, real carbon.

What yakcc changes: TOKEN ECONOMICS (measured, B4-v5, 162 runs). When the registry is confident (auto_accept), the model emits a compact reference (~25–35 tokens in isolation) instead of rewriting the implementation. Those auto_accept runs pass the oracle 91% of the time, and within a full multi-turn run the output is ~1.1–5.2× smaller than writing the body verbatim. In a controlled reference-emit experiment, the isolated emission is 17–100× smaller than the full body. Prompt caching cuts cost by 36–53% with no quality loss. The ceiling today is resolve coverage (56–72%): when confidence is low the model may ignore the reference, and the hooked context can then underperform the unhooked baseline, so raising coverage is the next milestone.

Measured 2026-06-01 (B4-v5, 162 runs, $24.40). auto_accept tier: 91% oracle pass (58/64 runs). candidate_list tier: 14% (6/44). Prompt cache: 36–53% lower cost, no quality change. Coverage ceiling: 56–72% (auto_accept coverage by model). In the raw aggregate, hooked runs have a lower pass rate than unhooked runs, so the conditional win is the honest headline.

TOKEN ECONOMICS: bar chart showing followed path 538–780 tokens vs ignored path 700–2772 tokens (in-run B4-v5 data); auto_accept 91% oracle pass; prompt cache −36–53% cost; coverage ceiling 56–72%
Atoms can be mathematically proven to do exactly what's advertised — and nothing more

A function the LLM hands you compiles and passes the tests you wrote. That's all you know about it. Does it log to /var/log? Does it phone home? Does it correctly match RFC 5321 across all edge cases? Without exhaustive property tests, you don't know — you just trust.

What yakcc changes: Every yakcc atom carries three artifacts: spec (the contract), impl (the strict-subset pure-function code), and proof (fast-check property tests). The property tests generate thousands of inputs; the impl passes every one. The atom does exactly what the spec promises — verifiable, not trusted.

Property-test contract authority: packages/contracts/src/proof-manifest.ts. Strict-subset enforcement: no I/O, no globals, no eval. Coverage discipline tracked via the bench/B9 min-surface benchmark.
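
To make the spec / impl / proof triplet concrete, here is a hand-rolled property check in TypeScript. It is a sketch, not yakcc's proof format: the text above names fast-check, which would normally handle input generation and shrinking, while this example substitutes a tiny seeded generator so it runs without dependencies. `clamp`, `buggyClamp`, `makeRng`, and `findCounterexample` are illustrative names. Note that a property check samples inputs, so it raises confidence in the spec rather than enumerating every case.

```typescript
type Clamp = (x: number, lo: number, hi: number) => number;

// impl: a pure function with no I/O, no globals, no eval.
const clamp: Clamp = (x, lo, hi) => Math.min(Math.max(x, lo), hi);

// A deliberately buggy variant that forgets the upper bound.
const buggyClamp: Clamp = (x, lo, _hi) => Math.max(x, lo);

// Small seeded linear congruential generator so runs are repeatable.
function makeRng(seed: number): () => number {
  let state = seed >>> 0;
  return () => {
    // The product stays below 2^53, so this arithmetic is exact.
    state = (state * 1664525 + 1013904223) % 4294967296;
    return state / 4294967296; // value in [0, 1)
  };
}

// spec: the result lies in [lo, hi], clamping is idempotent,
// and values already inside the range come back unchanged.
// Returns the first failing input [x, lo, hi], or null if none is found.
function findCounterexample(impl: Clamp, runs: number, seed: number): number[] | null {
  const rng = makeRng(seed);
  const int = () => Math.floor(rng() * 2001) - 1000; // integer in [-1000, 1000]
  for (let i = 0; i < runs; i++) {
    const a = int();
    const b = int();
    const x = int();
    const lo = Math.min(a, b);
    const hi = Math.max(a, b);
    const y = impl(x, lo, hi);
    const inRange = lo <= y && y <= hi;
    const idempotent = impl(y, lo, hi) === y;
    const unchangedInside = x < lo || x > hi || y === x;
    if (!(inRange && idempotent && unchangedInside)) return [x, lo, hi];
  }
  return null;
}

const good = findCounterexample(clamp, 5000, 42);
const bad = findCounterexample(buggyClamp, 5000, 42);

console.log("clamp holds:", good === null, "| buggyClamp caught:", bad !== null);
```

The buggy variant is included to show that the properties have teeth: an implementation that compiles and looks plausible still fails once the generated inputs reach the case it forgot.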

Hand-written validateEmail with no verifiable bound on behavior vs yakcc atom triplet (spec, impl, proof) with property-test verification
Shave once. Use from any supported language.

Your team writes the same parsing function in TypeScript for the frontend, Python for the data pipeline, and Go for the service backend. Three implementations. Three sets of bugs. Three review cycles. The function is the same; the languages are an implementation detail.

What yakcc changes: yakcc stores atoms in a language-neutral intermediate representation. Compile adapters lower the same atom into TypeScript, Python, or Go on demand. The atom is the contract; the languages are surfaces.

Round-trip verified: 51 samber/lo functions to Go, 9 bs4 functions to Python. Same atom, byte-identical lowering in each target language.
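
As a rough illustration of lowering one atom to several surfaces, the sketch below emits a function signature for each target from a single description. The `IrSignature` shape, the type tables, and the `lowerTo*` helpers are assumptions made for this example; yakcc's actual IR and compile adapters are not shown on this page, and real adapters would lower function bodies as well as signatures.

```typescript
// Hypothetical, minimal IR for an atom's signature. Not yakcc's real IR.
type IrType = "string" | "int" | "bool";

interface IrSignature {
  name: string; // lowerCamelCase in the IR
  params: { name: string; type: IrType }[];
  returns: IrType;
}

// Per-target spellings of the same IR types.
const TS_TYPES: Record<IrType, string> = { string: "string", int: "number", bool: "boolean" };
const PY_TYPES: Record<IrType, string> = { string: "str", int: "int", bool: "bool" };
const GO_TYPES: Record<IrType, string> = { string: "string", int: "int", bool: "bool" };

// Naming conventions differ per language: snake_case for Python,
// exported PascalCase for Go.
const toSnake = (s: string): string => s.replace(/[A-Z]/g, (c) => "_" + c.toLowerCase());
const toPascal = (s: string): string => s.charAt(0).toUpperCase() + s.slice(1);

function lowerToTypeScript(sig: IrSignature): string {
  const params = sig.params.map((p) => `${p.name}: ${TS_TYPES[p.type]}`).join(", ");
  return `function ${sig.name}(${params}): ${TS_TYPES[sig.returns]}`;
}

function lowerToPython(sig: IrSignature): string {
  const params = sig.params.map((p) => `${toSnake(p.name)}: ${PY_TYPES[p.type]}`).join(", ");
  return `def ${toSnake(sig.name)}(${params}) -> ${PY_TYPES[sig.returns]}:`;
}

function lowerToGo(sig: IrSignature): string {
  const params = sig.params.map((p) => `${p.name} ${GO_TYPES[p.type]}`).join(", ");
  return `func ${toPascal(sig.name)}(${params}) ${GO_TYPES[sig.returns]}`;
}

const isEmailSig: IrSignature = {
  name: "isEmail",
  params: [{ name: "input", type: "string" }],
  returns: "bool",
};

console.log(lowerToTypeScript(isEmailSig));
console.log(lowerToPython(isEmailSig));
console.log(lowerToGo(isEmailSig));
```

Because each adapter is a pure function of the IR, lowering the same signature twice yields the same string, which is the property a byte-identity check would rely on.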

Central yakcc atom in IR form with three arrows to TypeScript, Python, and Go compile adapters, each showing the same atom's signature in the target language
Zero outbound network during a full pipeline run

Regulated industries (defense, finance, healthcare), unreliable-connectivity environments, and air-gapped networks can't take a hard dependency on always-on internet. Most modern dev tooling fails this bar — npm install alone fires hundreds of outbound requests; LLM coding assistants make a network call per generation.

What yakcc changes: yakcc's full pipeline — install, init, shave, query, compile, verify — runs with zero outbound network calls. The local registry is the source of truth. The network is optional: useful for syncing the global commons, but not required for any operation.

Measured 2026-05-11: 15.4-second wall-clock, 0 outbound connections (pcap-verified), 0 step failures.

Timeline of a 15.4-second full yakcc pipeline run with six phases (install, init, shave, query, compile, verify) and a giant '0' representing outbound packets sent
Same atom on every machine produces byte-identical bytes

"Works on my machine" isn't a meme; it's an industry. The same Python program produces different floating-point results on different CPUs. The same npm install produces different lockfiles on different days. The same Docker build varies by ~110 bytes between dev and CI because of build-time timestamps and host-specific paths. Reproducibility is asserted constantly and proven almost never.

What yakcc changes: yakcc's reproducibility is asserted on every push to main, against the strongest possible test: yakcc shaves its own source code into atoms, then shaves the same source again, and every single one of the 7,064 resulting atoms must have a byte-identical BlockMerkleRoot between the two passes. Same logical content → same canonical bytes → same content hash. If this ever goes red, the headline claim is broken; if it stays green, every other atom in the commons inherits the same property.

Measured on every push: 7,064 atoms in the workspace, all byte-identical between pass-1 and pass-2 of the self-shave test. Authority: DEC-V2-CI-GATE-FINAL-001 (unconditional gate) + DEC-V2-HARNESS-STRICT-EQUALITY-001 (strict byte-identity assertion).
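
The two-pass check described above reduces to: shave the same source twice and require every atom's root to match. A minimal sketch follows. The `Roots` and `Shave` types and both stub shavers are hypothetical, invented for illustration; the real gate compares BlockMerkleRoots produced by yakcc itself.

```typescript
// Atom name -> content root. Hypothetical shape for this sketch.
type Roots = Record<string, string>;
type Shave = (source: string) => Roots;

// Names whose roots differ between passes, or that appear in only one pass.
function diffPasses(pass1: Roots, pass2: Roots): string[] {
  const names = Object.keys(pass1).concat(Object.keys(pass2));
  const mismatched: string[] = [];
  for (const name of names) {
    if (pass1[name] !== pass2[name] && mismatched.indexOf(name) === -1) {
      mismatched.push(name);
    }
  }
  return mismatched.sort();
}

// Strict equality gate: any mismatch at all fails the check.
function isReproducible(shave: Shave, source: string): boolean {
  return diffPasses(shave(source), shave(source)).length === 0;
}

// Deterministic stub: each non-empty line becomes an atom whose root
// depends only on the line's text.
const stableShave: Shave = (source) => {
  const roots: Roots = {};
  source
    .split("\n")
    .filter((line) => line.trim() !== "")
    .forEach((line, i) => {
      roots[`atom${i}`] = `root(${line.trim()})`;
    });
  return roots;
};

// Non-deterministic stub: leaks a per-run counter into every root,
// the way a build timestamp or host path would.
let runCounter = 0;
const leakyShave: Shave = (source) => {
  runCounter += 1;
  const roots = stableShave(source);
  for (const name of Object.keys(roots)) {
    roots[name] += `@run${runCounter}`;
  }
  return roots;
};

const sampleSource = "const a = 1;\nconst b = a + 1;\n";
const stableOk = isReproducible(stableShave, sampleSource);
const leakyOk = isReproducible(leakyShave, sampleSource);

console.log(`stable: ${stableOk}, leaky: ${leakyOk}`);
```

The leaky stub shows why the assertion is strict byte-identity: a single run-dependent value in the canonical form changes every root, and the gate goes red.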

Status quo column showing nondeterministic Python floats, varying package sha256s, and differing build output sizes, vs yakcc's two-pass self-shave proof asserting 7,064 atoms byte-identical on every commit
Programs are graphs of verified atoms, not opaque LLM-emitted monoliths

When an LLM gives you a 500-line file, the unit of trust is the entire file. You read it linearly. If there's a bug, you re-read it linearly. If a reviewer wants to verify it, they read it linearly. Provenance per function is informal at best — the file is one undifferentiated thing emitted in one breath. There is no graph to walk, no atom to bisect, no individual piece you can verify and check off.

What yakcc changes: `yakcc compile` produces a program as a graph: a top-level binding, its child atoms, their child atoms, and so on. Every node carries a content-addressed BlockMerkleRoot. Every program ships with a provenance manifest naming every constituent atom by its content hash. Code review collapses from "read all 500 lines" to "verify only the atoms not already verified." Bug in production? Identify the atom hash; every other program using the same atom is bug-compatible, so the bug isolates instantly.

Structural property of every yakcc compile output. Provenance manifest produced by packages/compile/src. README authority: "Every assembled program carries a provenance manifest naming every constituent block by its content-address. Bit-for-bit reproducibility is not a build option — it is the default."
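
One way to picture "verify only the atoms not already verified" is a walk over the program graph that collects distinct content hashes and subtracts the set already reviewed. The sketch below assumes a simple `AtomNode` shape and short made-up hashes; it is not the manifest format produced by packages/compile/src.

```typescript
// Hypothetical node shape: a content hash plus child atoms.
interface AtomNode {
  hash: string;
  children: AtomNode[];
}

// Walk the graph and list every distinct atom hash, sorted.
// Shared atoms are visited once, however many parents reference them.
function manifest(root: AtomNode): string[] {
  const seen = new Set<string>();
  const stack: AtomNode[] = [root];
  while (stack.length > 0) {
    const node = stack.pop() as AtomNode;
    if (seen.has(node.hash)) continue;
    seen.add(node.hash);
    for (const child of node.children) stack.push(child);
  }
  return Array.from(seen).sort();
}

// Atoms in the program that a reviewer has not already verified.
function needsReview(root: AtomNode, alreadyVerified: ReadonlySet<string>): string[] {
  return manifest(root).filter((hash) => !alreadyVerified.has(hash));
}

// Made-up hashes for illustration; leafA is shared by two parents.
const leafA: AtomNode = { hash: "aaa1", children: [] };
const leafB: AtomNode = { hash: "bbb2", children: [] };
const mid: AtomNode = { hash: "ccc3", children: [leafA, leafB] };
const program: AtomNode = { hash: "ddd4", children: [mid, leafA] };

const allAtoms = manifest(program);
const toReview = needsReview(program, new Set(["aaa1", "bbb2"]));

console.log("manifest:", allAtoms.join(", "));
console.log("needs review:", toReview.join(", "));
```

Here the manifest lists four atoms, and with the two leaves already verified only `ccc3` and `ddd4` remain to review. The same lookup run in reverse (which programs contain a given hash) is what lets a production bug be traced to every affected program.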

Status quo monolithic 500-line LLM-emitted file you have to read top to bottom, vs yakcc compile output as a tree of named atoms each with a content hash and verifiable independently

For benchmark nerds: raw results by codename

The internal naming convention is B1..B10. The grid below maps each codename to its current status.

B6-Airgap (PROVEN): Installable with zero network during install.
B7-Commit (PROVEN): Zero outbound network during commit verification.
B10-Import (PROVEN): Drop-in replacement for transitively-bundled npm packages.
v0-Smoke (PROVEN): End-to-end install + run smoke test on real OS.
B2-Bloat (MEASUREMENT-LIMITED): Validator atom is 91% smaller than equivalent ajv bundle (cold corpus).
B5-Coherence (PARTIAL): Cross-language semantic equivalence (TS↔Python). Below directional target; iterating.
B1-Latency (PARTIAL): Per-atom cold-cache latency vs hand-written baseline.
B4-Tokens (PARTIAL): When confident (auto_accept): 91% oracle pass; prompt cache −36–53%. Coverage-gated (56–72%).
B8-Synthetic (PARTIAL): Synthetic-vs-real harness validation. Slice 1 of 3.
B9-Min Surface (PARTIAL): Minimum atom-surface for substrate self-replication.
B3-Cache Hit (PENDING): Atom cache hit-rate on real workloads.
B8-Curve (PENDING): Coverage curve vs corpus size.