MetaGenesis Core – offline verification for computational claims

metagenesis-core.dev

15 points by Lama9901 5 days ago


MetaGenesis Core is a verification protocol layer for computational results.

It lets a third party verify a packaged computational claim offline, with one command, without access to the original environment.

I built it solo, after hours, while working construction, using AI tools heavily. I kept running into the same wall: even when a result looks good, there's no simple way for someone else to check it independently without re-running the full environment or trusting the number on faith.

That problem shows up everywhere:
- ML: "our model reached 94.3% accuracy"
- materials: "our simulation matches lab data within 1%"
- pharma: "our pipeline passed quality checks"
- finance: "our risk model was independently validated"

Different domains, same structure.

---

The gap

MLflow / W&B / DVC / Sigstore / SLSA solve adjacent problems well. What they don't provide is an offline third-party verification step with a semantic layer for the claim itself. File integrity alone is not enough.

The bypass attack:
1. remove core semantic evidence (job_snapshot)
2. recompute all SHA-256 hashes
3. rebuild the manifest
4. submit

A hash-only check still passes. MetaGenesis Core adds a second layer:
- integrity layer → PASS
- semantic layer → FAIL (job_snapshot missing)

That attack is covered by an adversarial test in the public repo.
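
To make the bypass concrete, here is a minimal sketch of why a hash-only check passes after the attacker rebuilds the manifest. It is not the repo's code: the file name `evidence.json`, the manifest layout, and the helper names are all assumptions made for illustration.

```python
import hashlib
import json


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def build_manifest(files: dict[str, bytes]) -> dict:
    """Hash each file, then hash the sorted name->hash map into one root hash."""
    file_hashes = {name: sha256_hex(blob) for name, blob in sorted(files.items())}
    root_hash = sha256_hex(json.dumps(file_hashes, sort_keys=True).encode())
    return {"files": file_hashes, "root_hash": root_hash}


def integrity_ok(files: dict[str, bytes], manifest: dict) -> bool:
    """Layer 1: recompute every hash and compare against the shipped manifest."""
    return build_manifest(files) == manifest


def semantic_errors(files: dict[str, bytes]) -> list[str]:
    """Layer 2 (one rule only): the evidence must still carry job_snapshot."""
    evidence = json.loads(files["evidence.json"])
    return [] if "job_snapshot" in evidence else ["job_snapshot missing"]


# An honest bundle. File name and field layout are hypothetical.
evidence = {"payload": {"kind": "ml_benchmark"}, "job_snapshot": {"seed": 42}}
honest_files = {"evidence.json": json.dumps(evidence).encode()}
honest_manifest = build_manifest(honest_files)

# The attack: strip job_snapshot, recompute hashes, rebuild the manifest.
stripped = {k: v for k, v in evidence.items() if k != "job_snapshot"}
attacked_files = {"evidence.json": json.dumps(stripped).encode()}
attacked_manifest = build_manifest(attacked_files)

print(integrity_ok(attacked_files, attacked_manifest))  # True: hash-only check passes
print(semantic_errors(attacked_files))                  # ['job_snapshot missing']
```

The rebuilt manifest is internally consistent, so nothing at the hash level can tell it apart from an honest one. Only a rule about what the bundle must contain catches the missing evidence.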

---

How it works

Layer 1 (integrity): SHA-256 per file + root hash
Layer 2 (semantic): required fields present, payload.kind matches claim type, provenance intact

  python scripts/mg.py verify --pack /path/to/bundle
  → PASS
  → FAIL: job_snapshot missing
  → FAIL: payload.kind does not match registered claim
Same workflow across domains — ML, materials, pharma, finance, engineering. The claim type changes, not the protocol.
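
The semantic layer described above could look roughly like the sketch below. The claim ids come from this thread, but the registry, the `kind` strings, the required-field set, and the function name are assumptions, and the provenance check is left out because its rules are not described here.

```python
# Hypothetical registry: claim id -> expected payload.kind.
REGISTERED_KINDS = {
    "ML_BENCH-01": "ml_benchmark",
    "MTR-1": "materials_calibration",
}
REQUIRED_FIELDS = ("payload", "job_snapshot")  # assumed minimal set


def semantic_verdict(claim_id: str, evidence: dict) -> str:
    """Return PASS or a FAIL message, checking fields first, then payload.kind."""
    for field in REQUIRED_FIELDS:
        if field not in evidence:
            return f"FAIL: {field} missing"
    payload = evidence["payload"]
    kind = payload.get("kind") if isinstance(payload, dict) else None
    expected = REGISTERED_KINDS.get(claim_id)
    # An unregistered claim id fails too, instead of matching None against None.
    if expected is None or kind != expected:
        return "FAIL: payload.kind does not match registered claim"
    return "PASS"


good = {"payload": {"kind": "ml_benchmark"}, "job_snapshot": {"seed": 42}}
print(semantic_verdict("ML_BENCH-01", good))
print(semantic_verdict("MTR-1", good))
print(semantic_verdict("ML_BENCH-01", {"payload": {"kind": "ml_benchmark"}}))
```

Only the registry entry changes per domain. The checking function stays the same, which is the sense in which the claim type changes but the protocol does not.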

---

Current state

  python scripts/steward_audit.py        → PASS
  python -m pytest tests/ -q            → 91 passed
  python demos/open_data_demo_01/run_demo.py → PASS / PASS
No API keys. No network. Python 3.11+.

---

Honest limitations

Not validated by an external production team yet. The protocol works on the public codebase and tests, the adversarial scenario is caught, the demo is reproducible — but real-world integration still needs proof.

Limitations are machine-readable in reports/known_faults.yaml.

That first external "yes, this worked on our pipeline" is what I'm looking for.

---

If you think this is flawed, I want to know where. If it overlaps with an existing tool I'm missing, I want to know that too.

  Site:    https://metagenesis-core.dev
  Repo:    https://github.com/Lama999901/metagenesis-core-public
  Contact: yehor@metagenesis-core.dev

  Inventor: Yehor Bazhynov
  Patent pending: USPTO #63/996,819
Lama9901 - 13 hours ago

Update — March 15, 2026. I owe this thread a more honest picture. When I posted this, the repo had 91 tests and 5 claims. Today: 14 claims across 7 domains (ML, materials science, digital twin, pharma, finance, IoT), 223 passing tests, 3 independent verification layers, and a Cross-Claim Cryptographic Chain that links physical constants (E = 70 GPa for aluminum) to simulation outputs — cryptographically, end-to-end. Check the commit history: https://github.com/Lama999901/metagenesis-core-public/commit... From PPA filing to 14 verified claims in 10 days. Solo. After-hours. Working construction by day.

On building with AI. I've seen "AI-assisted" used as a caveat, a disclaimer, almost an apology. I want to say the opposite. I didn't use Claude as autocomplete. I used it as a collaborator for architecture decisions, adversarial red-teaming, and systematic auditing.

Every claim in this repo has:
- a deterministic implementation
- a 4-step cryptographic execution trace (Step Chain)
- a bypass attack test that proves the semantic layer actually catches tampering
- a governance invariant enforced on every PR

That level of rigor, solo, in this timeframe, is only possible with AI as a genuine development partner. Not a crutch. A force multiplier. The question isn't "did AI write the code." The question is: does the code work, are the proofs real, and can you break it?

  git clone https://github.com/Lama999901/metagenesis-core-public
  cd metagenesis-core-public
  python scripts/deep_verify.py
  # → ALL 10 TESTS PASSED

That script runs steward audit, 223 tests, verifies Step Chain in all 14 claims, executes a live bypass attack, and checks site-to-code consistency. One command, no network, no trust required.

If you think the architecture is wrong, the verification logic is broken, or this problem is already solved, I genuinely want to know. Comments, issues, or yehor@metagenesis-core.dev.

The reproducibility crisis costs $28B/year in biomedical research alone. Meta confirmed Llama 4 benchmarks were fudged. FDA is finalizing AI verification requirements. The protocol layer for this doesn't exist yet. Proof, not trust. That's the whole point.

Lama9901 - a day ago

shipped two more things today. want to explain what they actually mean.

one: step chain verification is now in all 8 claims, not just ML_BENCH-01. every computation in the protocol (materials calibration, FEM verification, drift monitoring, data pipelines, system identification) now produces a 4-step cryptographic execution trace. trace_root_hash commits to the exact sequence. change any input, skip any step, and the hash breaks. 153 adversarial tests. all pass.

two: cross-claim cryptographic chain. this is the part i haven't seen anywhere else. the trace_root_hash of one claim can be embedded as anchor_hash in the next:

  E = 70 GPa (aluminum, measured in thousands of labs, not my number)
    ↓
  MTR-1      trace_root_hash: "abc..."
    ↓ anchor_hash = "abc..." baked into DT-FEM-01 step 1
  DT-FEM-01  trace_root_hash: "def..."
    ↓ anchor_hash = "def..." baked into DRIFT-01 step 1
  DRIFT-01   trace_root_hash: "ghi..."

the final hash commits to the entire chain. tamper MTR-1 and every downstream hash breaks. verify the full chain with one command, offline, without accessing any simulation environment.

this is not a claim in a document. it's in the code, tested:

  tests/steward/test_cross_claim_chain.py
    ::test_full_chain_is_cryptographically_linked
    ::test_tampered_anchor_hash_changes_chain

i built this working construction full-time. if you find something that does end-to-end cryptographic chain verification from a physical constant to a simulation output, show me. i want to know.

  git clone https://github.com/Lama999901/metagenesis-core-public
  cd metagenesis-core-public
  python -m pytest tests/steward/test_cross_claim_chain.py -v

proof, not trust.
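
The anchoring idea in the comment above can be sketched in a few lines. This is an illustration under assumptions, not the repo's implementation: the step names are borrowed from the ML_BENCH-01 example elsewhere in the thread and reused for every claim, the JSON serialization is a choice made here, and the input field `E_GPa` is hypothetical.

```python
import hashlib
import json
from typing import Optional

# Step names reused from the ML_BENCH-01 example; other claims may differ.
STEPS = ("init_params", "generate_dataset", "compute_metrics", "threshold_check")


def step_hash(step: str, data: dict, prev_hash: str) -> str:
    """Hash the step name, its data, and the previous hash together."""
    blob = json.dumps({"step": step, "data": data, "prev": prev_hash}, sort_keys=True)
    return hashlib.sha256(blob.encode()).hexdigest()


def trace_root_hash(inputs: dict, anchor_hash: Optional[str] = None) -> str:
    """Run the 4-step chain; the upstream anchor, if any, enters at step 1."""
    prev = "genesis"
    for index, step in enumerate(STEPS):
        data = dict(inputs)
        if index == 0 and anchor_hash is not None:
            data["anchor_hash"] = anchor_hash
        prev = step_hash(step, data, prev)
    return prev


def full_chain(e_gpa: float) -> tuple:
    """MTR-1 -> DT-FEM-01 -> DRIFT-01, each anchored to the previous root."""
    mtr = trace_root_hash({"claim": "MTR-1", "E_GPa": e_gpa})
    fem = trace_root_hash({"claim": "DT-FEM-01"}, anchor_hash=mtr)
    drift = trace_root_hash({"claim": "DRIFT-01"}, anchor_hash=fem)
    return mtr, fem, drift


honest = full_chain(70.0)
tampered = full_chain(71.0)  # change the physical constant at the source
print(all(h != t for h, t in zip(honest, tampered)))  # True: every link changes
```

Because each downstream claim hashes its upstream root at step 1, a change to the MTR-1 inputs propagates to the DT-FEM-01 and DRIFT-01 roots without anyone re-running a simulation.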

Lama9901 - 2 days ago

let me be direct about where i see this going.

right now there's no standard way to verify a computational result independently. you either trust the number or you don't. that's true for ML benchmarks, simulation outputs, pharma pipelines, financial models — everything.

what this builds toward: any result, any domain, packaged once, verifiable forever by anyone with python and 5 minutes. no access to the original environment. no trust required.

the physical anchor is the part that excites me most — for materials and engineering, the chain connects to actual physical reality. not a number i chose. not a convention. physics.

that's a different category of proof than anything that exists right now in this space.

if you're working in a domain where results need to be audited, reproduced, or submitted to regulators — this is the missing layer. try it:

  git clone https://github.com/Lama999901/metagenesis-core-public
  cd metagenesis-core-public
  python demos/open_data_demo_01/run_demo.py
if it works — let's talk about your use case. if it doesn't — tell me exactly where it breaks.

proof not trust. that's the whole thing.

Lama9901 - 3 days ago

Author update: spent the day doing a final pass before asking HN to re-up the post.

What changed since the original submission:
- 8 active claims (added DT-FEM-01: FEM/digital twin verification)
- 107 tests passing, steward_audit PASS
- Every link on the site now points to the actual file in the repo
- system_manifest.json synced, all docs consistent

Still solo, still transparent about limitations (reports/known_faults.yaml). Happy to answer any questions about the protocol design.

itsthecourier - 2 days ago

"A hash-only check still passes. MetaGenesis Core adds a second layer: - integrity layer → PASS - semantic layer → FAIL (job_snapshot missing)"

could you please elaborate on this?

rubyrfranklin2 - 2 days ago

Real-time speech translation is something I think about constantly running heyvid.ai — we're always chasing that latency vs. quality tradeoff for multilingual video. JEPA's approach is interesting because it sidesteps the typical encode-decode bottleneck that kills most real-time pipelines. I'd be curious how it holds up on accented or fast speech. Back at Adobe I saw how even 200ms of lag completely destroyed the perceived quality of live demos. The latency budget for translation is so much tighter than transcription-only, so any architectural win like this is worth watching closely.

ddfproof - 2 days ago

looked at the repo — the bypass attack test caught my eye.

strip job_snapshot, recompute hashes, rebuild manifest — hash-only verifier passes silently.

how common is this attack in practice? like do you actually see people trying to game verification systems this way or is it more of a theoretical concern you're protecting against?

measurablefunc - 2 days ago

This is another "art" project. Nice work OP.

Lama9901 - a day ago

shipped something today. then found a problem with it. fixed it. here's the full story.

there are three ways a computational result can lie to you:

1. the file was changed after the fact: SHA-256 catches this
2. the evidence was stripped from the bundle: the semantic layer catches this
3. the computation itself was run differently than claimed: nothing catches this

until today. i added Step Chain Verification to ML_BENCH-01. every step of the computation hashes itself into the next:

  init_params       → hash_1
  hash_1 + dataset  → hash_2
  hash_2 + metrics  → hash_3
  hash_3 + verdict  → trace_root_hash

change the seed, skip a step, reorder anything, and trace_root_hash doesn't match. the chain breaks.

this isn't blockchain. no network, no consensus, no tokens. same idea as git commits: each commit hashes its parent. except here it's computation steps, not code commits.

then i checked the actual verifier. mg.py verify --pack bundle.zip, the command i've been telling people to run, wasn't checking trace_root_hash at all. the chain was in the data. the construction tests passed. but the verifier itself ignored it entirely.

so "three verification layers" was technically true in the data structure. not true in what the verifier actually ran.

i fixed it before posting. added to scripts/mg.py _verify_semantic():

- trace_root_hash must equal the final step hash
- if one field exists without the other → FAIL
- if any step hash isn't valid 64-char hex → FAIL

then wrote tests/steward/test_cert03_step_chain_verify.py: 5 tests that attack the verifier specifically, not just the chain construction.

now mg.py verify actually runs all three layers:

- integrity: SHA-256 root_hash match
- semantic: job_snapshot present, payload.kind correct
- step chain: trace_root_hash == final step hash

118 tests total. steward_audit PASS.

the lesson: "i implemented X" and "X runs when you call verify" are two different things. found that gap myself. fixed it first.

  # the chain is just SHA-256, chained:
  hash_1 = SHA256("init_params" + data + "genesis")
  hash_2 = SHA256("generate_dataset" + data + hash_1)
  hash_3 = SHA256("compute_metrics" + data + hash_2)
  trace_root_hash = SHA256("threshold_check" + data + hash_3)

change anything (seed, sample count, noise level, step order) and trace_root_hash changes. the verifier catches it.

118 tests. three independent layers. MIT license. no network. no trust required.

  git clone https://github.com/Lama999901/metagenesis-core-public
  cd metagenesis-core-public
  python -m pytest tests/steward/test_cert03_step_chain_verify.py -v
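
The three verifier rules listed in this comment can be sketched as below. This is a reading of the description, not the actual `_verify_semantic()` code: the field name `execution_trace`, the per-step `hash` key, the lowercase-only hex rule, and the choice to pass bundles that carry neither field are all assumptions.

```python
import hashlib
import re

HEX64 = re.compile(r"[0-9a-f]{64}")  # lowercase hex, as hashlib's hexdigest() emits


def verify_step_chain(evidence: dict) -> str:
    """Apply the three rules: paired fields, valid hex, root equals final step."""
    has_root = "trace_root_hash" in evidence
    has_trace = "execution_trace" in evidence  # hypothetical field name
    if not has_root and not has_trace:
        return "PASS"  # assumption: bundles without a step chain are not rejected
    if has_root != has_trace:
        return "FAIL: trace_root_hash and execution_trace must appear together"
    hashes = [step.get("hash") for step in evidence["execution_trace"]]
    if not hashes:
        return "FAIL: execution_trace is empty"  # assumption, not stated in the thread
    if any(not (isinstance(h, str) and HEX64.fullmatch(h)) for h in hashes):
        return "FAIL: step hash is not valid 64-char hex"
    if evidence["trace_root_hash"] != hashes[-1]:
        return "FAIL: trace_root_hash does not equal final step hash"
    return "PASS"


# Build a small well-formed trace to exercise the rules.
prev, trace = "genesis", []
for name in ("init_params", "generate_dataset", "compute_metrics", "threshold_check"):
    prev = hashlib.sha256((name + prev).encode()).hexdigest()
    trace.append({"name": name, "hash": prev})

ok_bundle = {"execution_trace": trace, "trace_root_hash": prev}
print(verify_step_chain(ok_bundle))
print(verify_step_chain({"execution_trace": trace, "trace_root_hash": "0" * 64}))
print(verify_step_chain({"execution_trace": trace}))
```

This sketch checks only the consistency of the stored hashes. It does not recompute them from the step data.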
