GKR for LLM Inference Proving — A New Era for Verifiable AI
As AI models—especially Large Language Models—become integral to sensitive and trust-critical decision-making, the ability to verify their outputs is no longer just a nice-to-have; it’s becoming essential. But historically, generating cryptographic proofs for the massive computations behind an LLM has been prohibitively slow.
This is where the GKR (Goldwasser-Kalai-Rothblum) protocol enters the picture, marking a significant shift in how we approach Verifiable AI.
The “Layered” Advantage of GKR
Unlike general-purpose zero-knowledge proof systems, which often buckle under the sheer size of neural-network computations, GKR is mathematically tailor-made for their layered structure. Vitalik Buterin recently published a fantastic GKR Tutorial that explains why this specific protocol is so effective.
The core advantage lies in its structure. Neural networks are “big” computations in two specific ways: they have many layers, and they process large batches of data (like the tokens in an LLM) in parallel. Traditional proving methods often require costly cryptographic “commitments” for every single intermediate value, which becomes a massive bottleneck. GKR sidesteps this by committing only to the inputs and outputs. It then uses an interactive protocol called “sumcheck,” applied layer by layer, to reduce a claim about each layer’s outputs to a claim about its inputs, verifying everything in between without the heavy cryptographic overhead. A minimal sketch of one sumcheck instance follows below.
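To make the sumcheck idea concrete, here is a toy sketch in Python. It is purely illustrative and makes simplifying assumptions: a small 31-bit prime field rather than the large fields production systems use, an interactive verifier rather than a Fiat-Shamir transform, and a multilinear polynomial g (degree at most one in each variable), which matches the shape of the layer-by-layer claims GKR reduces to.

```python
# Toy sumcheck: verify a claimed sum of g over the Boolean cube {0,1}^n
# while evaluating g itself only once, at a random point, at the very end.
import random

P = 2**31 - 1  # toy prime field modulus (real systems use much larger fields)

def bits(x, n):
    """The n-bit binary expansion of x, as a tuple of 0/1 ints."""
    return tuple((x >> j) & 1 for j in range(n))

def sumcheck(g, n):
    """Interactively verify the claim H = sum of g(x) over x in {0,1}^n,
    for a multilinear g of n variables."""
    # Prover: compute the claimed sum (the expensive part).
    claim = sum(g(*bits(x, n)) for x in range(2 ** n)) % P

    challenges = []  # random points r_1, ..., r_i fixed so far
    for i in range(n):
        free = n - i - 1  # variables still summed over the Boolean cube
        # Prover: send the round-i univariate polynomial as its values at
        # X = 0 and X = 1 (two points pin down a degree-1 polynomial).
        ev = [sum(g(*challenges, X, *bits(x, free)) for x in range(2 ** free)) % P
              for X in (0, 1)]
        # Verifier: the two values must be consistent with the running claim.
        assert (ev[0] + ev[1]) % P == claim, "proof rejected"
        # Verifier: pick a random challenge and reduce to a smaller claim.
        r = random.randrange(P)
        claim = (ev[0] + r * (ev[1] - ev[0])) % P  # the round polynomial at X = r
        challenges.append(r)

    # Final check: a single evaluation of g at a random point. In GKR this is
    # where the next layer's sumcheck (or, at the input layer, a commitment
    # opening) takes over, so no per-layer commitments are ever needed.
    assert g(*challenges) % P == claim, "proof rejected"
    return True

# Example: g(x1, x2, x3) = x1*x2 + 3*x3, multilinear in three variables.
assert sumcheck(lambda a, b, c: (a * b + 3 * c) % P, 3)
```

The last assertion is the whole point: the verifier never sees the intermediate values of the computation, only a single random evaluation of the final claim, which is what lets GKR skip committing to every layer.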
Vitalik notes that this approach can cut prover overhead from roughly 100x the cost of the underlying computation down to under 15x. To put that in perspective, a forward pass that takes one second would take well over a minute to prove at 100x overhead, but only about fifteen seconds at 15x. For AI, that difference is the bridge between theoretical research and practical, real-world application.
Bringing GKR to Production: JSTProve
Building on these theoretical efficiencies, our team at Inference Labs has been working to turn this potential into a usable toolkit for AI developers.
In our latest paper, JSTProve: Succinct Zero-Knowledge Proofs for Efficient Machine Learning Inference, we introduce a zkML toolkit built specifically for this purpose. By combining GKR-based proofs with DSperse (our distributed proving framework) and leveraging the Polyhedra Network’s Expander backend, we’ve been able to demonstrate verifiable inference at a scale that was previously out of reach.
Our work shows that it is possible to generate proofs for real LLM workloads at practical speed, without drowning in computational costs. JSTProve allows developers to verify that an AI model ran exactly as intended, confirming both the model’s integrity and the correctness of its output, while maintaining the performance needed for production systems. The toy sketch below illustrates the shape of that guarantee.
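Here is a deliberately simplified sketch of what verifiable inference means at the interface level. Every name in it (commit, prove_inference, verify) is a hypothetical placeholder, not JSTProve’s actual API, and the “proof” is stubbed out with plain recomputation purely so the sketch runs; in a real system that stub is replaced by a succinct GKR proof that is far cheaper to check than re-running the model.

```python
# Hypothetical shape of a verifiable-inference workflow (illustrative only).
import hashlib
import numpy as np

def commit(weights: np.ndarray) -> str:
    """Toy binding commitment to the model weights (a plain hash)."""
    return hashlib.sha256(weights.tobytes()).hexdigest()

def run_model(weights: np.ndarray, x: np.ndarray) -> np.ndarray:
    """Stand-in 'model': a single linear layer."""
    return weights @ x

def prove_inference(weights, x):
    """Prover side: run the model and attach evidence of correct execution.
    (Stub: the 'proof' here is just the data needed to recompute; a real
    system attaches a succinct GKR proof instead.)"""
    return run_model(weights, x), {"weights": weights, "x": x}

def verify(commitment, x, y, proof):
    """Verifier side: accept only if the committed model maps x to y.
    (Stub: recomputes the model; a real verifier checks the succinct
    proof, which is far cheaper than re-running inference.)"""
    return (commit(proof["weights"]) == commitment
            and np.array_equal(proof["x"], x)
            and np.array_equal(run_model(proof["weights"], x), y))

weights = np.arange(6, dtype=np.int64).reshape(2, 3)  # the "model"
x = np.array([1, 2, 3], dtype=np.int64)               # the "prompt"
c = commit(weights)                                    # published once
y, proof = prove_inference(weights, x)
assert verify(c, x, y, proof)
print("output", y, "verified against model commitment", c[:16], "...")
```

The design point the sketch captures is the trust model: the model owner publishes one commitment up front, and every subsequent inference carries a proof binding that commitment, the input, and the output together.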
We are entering a new era where “trust me, it’s AI” isn’t enough. With tools like GKR and JSTProve, we can finally say, “don’t trust, verify.”
For a deeper dive into the mechanics of GKR, I highly recommend reading Vitalik’s post. To see how we implemented these concepts for ML inference, check out our paper on arXiv.