LLM as Sensor. Verdict as Math.

Why ZK-proven LLM inference is the wrong stack for prediction market resolution

A growing school of prediction-market infrastructure has decided that the path to trustworthy oracle resolution runs through one architectural commitment: make the LLM itself deterministic and prove it cryptographically.

Pin the GPU kernel. Lock the PRNG. Force floating-point associativity. Wrap the model in a TDX trusted execution environment. Pipe the attestation into a ZKVM. Recompute the proof from scratch. Emit a STARK. Wrap the STARK in a Groth16 SNARK. Verify on Ethereum.

It is heroic engineering against the wrong axis.

This dispatch is a structural critique of that approach, and a position statement for the alternative DJZS Protocol has shipped: LLMs detect. TypeScript decides. The verdict is mathematics.

// TRANSMISSION_ID     : djzs-doc-LLMSENS-v3.2
// SYS_ID              : djzs-mainnet-01
// DOC_CLASS           : POSITION_STATEMENT
// AUDIENCE            : ORACLE_DESIGNERS / PREDICTION_MARKET_OPS / ZKML_RESEARCHERS
// FRAMEWORK           : DST v3.0 (Deterministic Simulation Thesis)
// TAXONOMY            : DJZS-LF v1 (11 codes / 5 domains / weight=200)
// THESIS              : LLM_as_sensor + verdict_as_math
// PAYMENT_RAIL        : x402 on Base Mainnet
// ANCHOR_PRIMARY      : ProofOfLogicNFT (ERC-721, Base)
// ANCHOR_SECONDARY    : Irys Datachain
// STATUS              : CANONICAL — supersedes v1.1

Trust the math you can replay, not the oracle you cannot inspect.

The Conflation

Every "deterministic AI oracle" thesis I have read in the last twelve months performs the same sleight of hand. Two different problems get welded into one phrase.

Problem A — Computational Determinism. Given identical inputs, does the LLM produce identical outputs? Hard. Solvable. The deterministic-oracle stack solves this beautifully.

Problem B — Resolution Correctness. Given the actual state of the world, does the LLM produce the right answer? Different problem. Different category.

The deterministic-oracle architecture solves Problem A. It says nothing about Problem B. A bit-identical wrong answer is still wrong. You have built a system that produces reliably incorrect resolutions and cryptographically attests to their reliability.

This is a textbook DJZS-S01 [CIRCULAR_LOGIC] failure pattern. The trust claim is recursive: the LLM is correct because the proof is valid because the LLM is the LLM we proved. Nothing in that loop touches ground truth.

// LAW_INVOKED :: DST-L04 [UNCERTAINTY_IS_OBSERVER_LOCAL]

Uncertainty does not vanish because you moved it inside an opaque model and bolted a SNARK to the exterior. It still lives in the model's weights, its training corpus, its prompt-framing sensitivity. The ZK proof verifies the wrapper. The wrapper is not where the uncertainty lives.
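
The gap between Problem A and Problem B fits in a few lines. What follows is a toy sketch, not anyone's real oracle: pinnedModel, FROZEN_ANSWERS, and WorldState are hypothetical names, and a lookup table stands in for frozen model weights. It shows a component that passes every reproducibility check and still resolves the market wrongly.

```typescript
// Illustrative sketch only: pinnedModel, FROZEN_ANSWERS, and WorldState are
// hypothetical names, not part of any real oracle stack.

type WorldState = { leaderWoreSuit: boolean };

// Stand-in for a fully pinned model: frozen "weights" map each prompt to one
// answer, so identical inputs always give identical outputs.
const FROZEN_ANSWERS: Readonly<Record<string, boolean>> = {
  "Did the leader wear a suit?": true,
};

function pinnedModel(prompt: string): boolean {
  return FROZEN_ANSWERS[prompt] ?? false;
}

// Problem A: does re-running the model reproduce the same output?
function isReproducible(prompt: string, runs: number): boolean {
  const first = pinnedModel(prompt);
  for (let i = 1; i < runs; i++) {
    if (pinnedModel(prompt) !== first) return false;
  }
  return true;
}

// Problem B: does the output match the actual state of the world?
function isCorrect(prompt: string, world: WorldState): boolean {
  return pinnedModel(prompt) === world.leaderWoreSuit;
}

const question = "Did the leader wear a suit?";
const world: WorldState = { leaderWoreSuit: false }; // ground truth the model never sees

console.log("reproducible:", isReproducible(question, 1000)); // true
console.log("correct:", isCorrect(question, world));          // false
```

No amount of proof machinery wrapped around pinnedModel changes the second line of output, because nothing in the proof takes world as an input.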

Steel-Manning the Stack

To be fair, let me steel-man it. The architecture is genuinely impressive.

Open-source models keep parameters inspectable and no closed vendor (OpenAI, Anthropic, Google) sits in the trust path.

TDX execution, because SGX cannot fit modern model weights. TDX gives a DCAP attestation that the binary that ran is the binary that was specified.

ZKVM recompute (RISC Zero, Succinct) reproduces the attestation inside a zero-knowledge environment and emits a STARK proof, roughly 200 KB.

Groth16 wrapping compresses the STARK into a SNARK that a Solidity contract can verify in a few hundred thousand gas.

Container digest pins GPU architecture, driver version, kernel, decoding policy (temperature, top-K, top-P), the PRNG, and the floating-point execution order.

The last item is where the engineering goes deepest. Floating-point addition is not associative: (a + b) + c ≠ a + (b + c) in general, and the gap surfaces when you sum massive matrices of floating-point values across thousands of GPU cores. The reduction has to be forcibly serialized into one fixed order. Variable batching has to be killed. KV-cache memory layout has to be made invariant. Kernel scheduling has to be deterministic.

Each of those problems is a doctoral thesis.

And every one of them is solving Problem A.
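
For readers who have not hit floating-point non-associativity in practice, here is a minimal demonstration. It uses plain IEEE 754 doubles on a CPU, which is an assumption of the sketch: a GPU reduction is far larger and parallel, but the failure is the same one. The helper sumInOrder is a hypothetical name.

```typescript
// Illustrative only: plain IEEE 754 doubles on a CPU, not a GPU kernel.

const a = 0.1;
const b = 0.2;
const c = 0.3;

const leftFirst = (a + b) + c;  // 0.6000000000000001
const rightFirst = a + (b + c); // 0.6

console.log(leftFirst === rightFirst); // false

// A reduction whose result depends on the order the terms arrive in.
// sumInOrder is a hypothetical helper for this sketch.
function sumInOrder(values: readonly number[]): number {
  let total = 0;
  for (const v of values) total += v;
  return total;
}

const forward = sumInOrder([a, b, c]);
const backward = sumInOrder([c, b, a]);

console.log(forward === backward);              // false: order changed the bits
console.log(sumInOrder([a, b, c]) === forward); // true: fixed order, fixed bits
```

Pinning the order restores reproducibility. It does nothing to tell you whether either sum was the number you should have been computing.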

A Category Error, Named

The thing being proved and the thing prediction markets actually need are not the same thing.

A prediction market needs to know: did the leader wear a suit? The market needs a function from world-state to boolean. What the deterministic-oracle stack delivers is a function from (prompt, model_weights, seed) to boolean, plus a proof that the function was evaluated correctly.

There is no morphism from one to the other.

The model's mapping from the world to its boolean output is the entire problem. The model is exactly the part of the system that is not under deterministic control. Its weights are statistical. Its outputs are framing-sensitive. Its self-correction is non-existent. It does not know what year it is.

The proof attests that the sensor reported what the sensor reported. It does not attest that the sensor was looking at the world.

This is the category error. A ZK-proof of LLM inference is a proof about the inference process, not about the answer. Equating the two is the foundational confusion of the field.

It is also a DJZS-E01 [ORACLE_UNVERIFIED] failure mode at the architectural layer. The cryptographic guarantees attach to the sensor instead of to the verdict. Misplacing the trust anchor by one structural layer is enough to make the entire pipeline produce attested falsehood.

The DJZS Move

DJZS Protocol resolves this by refusing to put the LLM on the critical path of the verdict.

DJZS CANONICAL FLOW

  01  LLM emits boolean detection flags only
      "does this argument show CIRCULAR_LOGIC?  y/n"

  02  computeVerdict(flags) runs in pure TypeScript
        deterministic by construction
        weighted, hashable, replayable

  03  computeVerdictHash() emits SHA-256 over scored output

  04  Certificate anchored on Irys + Base Mainnet

The LLM is a sensor. Its job is detection: narrow, bounded, classification-shaped. The verdict, the thing the market resolves on and the thing the user pays for, is computed by a deterministic TypeScript function over the sensor's outputs. That function returns bit-identical output for identical flags, every time you run it. Its hash is its own proof.
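
A minimal sketch of what steps 02 and 03 could look like. The function names computeVerdict and computeVerdictHash come from the flow above; everything else is an assumption made for illustration: the signatures, the three-code subset of the taxonomy, the weights, the scoring direction, and the serialization format. It is not the shipped engine.

```typescript
import { createHash } from "node:crypto";

// ASSUMPTION: the three codes appear in this article; the weights are invented
// for illustration and are not the real DJZS-LF v1 weights.
type FlagCode = "DJZS-S01" | "DJZS-S02" | "DJZS-E01";
type Flags = Readonly<Record<FlagCode, boolean>>;

const WEIGHTS: Readonly<Record<FlagCode, number>> = {
  "DJZS-E01": 40,
  "DJZS-S01": 30,
  "DJZS-S02": 30,
};

// Fixed code order so the output never depends on object key order.
const CODES: readonly FlagCode[] = ["DJZS-E01", "DJZS-S01", "DJZS-S02"];

interface Verdict {
  score: number;
  maxScore: number;
  flagged: FlagCode[];
}

// Pure function: integer arithmetic over booleans, no I/O, no clock, no randomness.
// ASSUMPTION: each raised flag subtracts its weight from the maximum score.
function computeVerdict(flags: Flags): Verdict {
  let maxScore = 0;
  let penalty = 0;
  const flagged: FlagCode[] = [];
  for (const code of CODES) {
    maxScore += WEIGHTS[code];
    if (flags[code]) {
      penalty += WEIGHTS[code];
      flagged.push(code);
    }
  }
  return { score: maxScore - penalty, maxScore, flagged };
}

// SHA-256 over a canonical serialization (an array, so field order is fixed).
function computeVerdictHash(verdict: Verdict): string {
  const canonical = JSON.stringify([verdict.score, verdict.maxScore, verdict.flagged]);
  return createHash("sha256").update(canonical, "utf8").digest("hex");
}

// Flags as a sensor might emit them (hypothetical values).
const sensorFlags: Flags = { "DJZS-S01": true, "DJZS-S02": false, "DJZS-E01": false };
const verdict = computeVerdict(sensorFlags);
const hash = computeVerdictHash(verdict);

// Replay: anyone holding the same flags can recompute and compare.
console.log(verdict.score, "/", verdict.maxScore);                     // 70 / 100
console.log(hash === computeVerdictHash(computeVerdict(sensorFlags))); // true
```

The design choice that matters is the canonical serialization: hashing a fixed-order array rather than an arbitrary object keeps the digest independent of key ordering, which is what makes the replay check meaningful.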

We did not need TDX. We did not need RISC Zero. We did not need a 200 KB STARK or a Groth16 wrapper. We sidestepped the entire stack with a schema decision: never let the LLM compute the score.

The deterministic-oracle school is hardening the LLM. We isolated it.

These are not equivalent strategies. One produces cryptographic guarantees about a sensor's reproducibility. The other produces cryptographic guarantees about the logic applied to the sensor's reading. Only the second is an audit.

// AXIOM :: LLM detects. TypeScript decides. The contract enforces.

What Is Worth Importing

The deterministic-oracle literature is right about one thing, and it is the thing worth taking seriously: single-model resolution is structurally weak.

A single LLM tasked with planning, executing, and self-critiquing degrades. The technical term in the field is "single model echo." The model is overconfident, framing-sensitive, blind to its own knowledge gaps. The canonical example: an LLM that knows who Tom Cruise's mother is may fail to answer who Mary Lee Pfeiffer's son is. The information is in the weights. The retrieval path is asymmetric.

The proposed remedy is a multi-agent reasoning system: an LLM council with role differentiation, persistent memory, reputation, anonymized models, and adversarial deliberation between them.

This is sound. It maps cleanly onto DJZS's existing architecture without compromising the LLM-as-sensor principle. A Council Tier sits on top of the existing detection layer:

COUNCIL_TIER spec (scoped, not yet built)

  N >= 3 heterogeneous detection models
    (GLM 5.1, Claude Opus, plus rotating third)
  Independent boolean flag emission per model
  TypeScript reconciliation layer:
    - majority vote
    - weighted by historical detector accuracy
    - inter-model variance surfaced to certificate
  Variance becomes a first-class confidence signal
  Reconciliation is deterministic. Hash holds.

The council's deliberation stays in the LLM layer. The aggregation stays in TypeScript. We get the accuracy gains of multi-agent reasoning without surrendering the determinism property.
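
Since the Council Tier is scoped but not built, the following is only a sketch of how the reconciliation layer might behave. DetectorVote, Reconciled, reconcile, and the model IDs are hypothetical names. Storing accuracy as integer basis points is also an assumption, chosen so that the weighted vote uses exact integer arithmetic and does not depend on the order the votes arrive in.

```typescript
// ASSUMPTIONS: DetectorVote, Reconciled, and reconcile are hypothetical names.
// Accuracy is stored as integer basis points (9000 = 90%) so the weighted
// vote is exact integer arithmetic and independent of vote order.

interface DetectorVote {
  modelId: string;
  accuracyBps: number; // historical detector accuracy, in basis points
  flag: boolean;       // this model's boolean detection
}

interface Reconciled {
  flag: boolean;      // accuracy-weighted majority
  yesVotes: number;
  totalVotes: number;
  variance: number;   // p * (1 - p) over raw votes; 0 means unanimous
}

function reconcile(votes: readonly DetectorVote[]): Reconciled {
  if (votes.length < 3) throw new Error("council requires N >= 3 detectors");

  let yesVotes = 0;
  let yesWeight = 0;
  let totalWeight = 0;
  for (const v of votes) {
    totalWeight += v.accuracyBps;
    if (v.flag) {
      yesVotes += 1;
      yesWeight += v.accuracyBps;
    }
  }

  const n = votes.length;
  return {
    flag: 2 * yesWeight > totalWeight, // strict weighted majority
    yesVotes,
    totalVotes: n,
    variance: (yesVotes * (n - yesVotes)) / (n * n),
  };
}

// Hypothetical council of three detectors disagreeing on one flag.
const council: DetectorVote[] = [
  { modelId: "model-a", accuracyBps: 9000, flag: true },
  { modelId: "model-b", accuracyBps: 8000, flag: true },
  { modelId: "model-c", accuracyBps: 7000, flag: false },
];

const reconciled = reconcile(council);
console.log(reconciled.flag);     // true
console.log(reconciled.yesVotes); // 2
```

The variance field is what would be surfaced to the certificate: a non-zero value tells the reader the sensors disagreed, even though the reconciled flag itself is a single boolean.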

What This Means for the Field

If "ZK-proven LLM inference" becomes the default vocabulary of agent-economy infrastructure — and several well-funded projects are pushing in that direction — DJZS will be miscategorized. We will be lumped in with a school of architecture we do not belong to and do not endorse.

The position has to be stated bluntly and repeated:

LLM as sensor. Verdict as math. ZK-proving a sensor is a category error.

That is the line. It belongs on the homepage. It belongs in the deck. It is the cleanest one-line refutation of the entire architectural school.

The deeper claim underneath it is older than crypto, older than AI, and lives in financial security primitives a hundred years old: trust the math you can replay, not the oracle you cannot inspect. DJZS applies that principle to autonomous reasoning.

Everyone else is trying to build a more honest oracle. We built a system in which the oracle does not get to be the judge.

Terminal Transmission

// THE SENSOR REPORTED WHAT THE SENSOR REPORTED.
// THE VERDICT IS THE MATH THAT FOLLOWED.
// ZK-PROVING A SENSOR IS A CATEGORY ERROR.

DJZS Protocol // djzs.ai // username.dj-z-s.eth // GitHub

{
  "PROOF_OF_LOGIC": {
    "sys_id"             : "djzs-mainnet-01",
    "transmission_id"    : "djzs-doc-LLMSENS-v3.2",
    "doc_class"          : "POSITION_STATEMENT",
    "thesis"             : "LLM_as_sensor_verdict_as_math",
    "framework"          : "DST_v3.0",
    "law_invoked"        : "DST-L04_UNCERTAINTY_IS_OBSERVER_LOCAL",
    "logic_taxonomy"     : "DJZS-LF v1",
    "lf_findings_against_competing_school" : [
      {
        "code"     : "DJZS-S01",
        "name"     : "CIRCULAR_LOGIC",
        "subtype"  : "TAUTOLOGICAL_TRUST_LOOP",
        "evidence" : "Trust loop -- LLM correct because proof valid because LLM is the proven LLM"
      },
      {
        "code"     : "DJZS-E01",
        "name"     : "ORACLE_UNVERIFIED",
        "subtype"  : "MISPLACED_TRUST_ANCHOR",
        "evidence" : "Cryptographic guarantees attached to sensor whose output should not carry the verdict"
      },
      {
        "code"     : "DJZS-S02",
        "name"     : "MISSING_FALSIFIABILITY",
        "subtype"  : "PROOF_PROVES_PROCESS_NOT_ANSWER",
        "evidence" : "Architecture admits no condition under which the proof would invalidate the verdict"
      }
    ],
    "stack_anchors"      : {
      "chain"     : "base-mainnet",
      "payment"   : "x402-usdc",
      "datachain" : "irys",
      "engine"    : "@djzs/trust"
    },
    "council_tier_spec"  : {
      "status"            : "SCOPED_NOT_YET_BUILT",
      "model_count"       : "N >= 3 heterogeneous",
      "reconciliation"    : "TypeScript_deterministic",
      "variance_role"     : "first_class_confidence_signal"
    },
    "next_actions"       : [
      "Council Tier specification — N>=3 deterministic reconciliation",
      "intelligence_context injection (Polymarket / Limitless)",
      "Homepage copy update — adopt thesis line"
    ],
    "audit_verdict"      : "POSITIONAL_LOCK",
    "score"              : 200,
    "max_score"          : 200,
    "confidence_baseline": 0.94,
    "logic_hash"         : "0x9c1f3e7a8b4d2e6f",
    "payment_verified"   : true,
    "version"            : "3.2",
    "supersedes"         : ["v1.1"]
  }
}
// END_TRANSMISSION. //