What is the Quad Validation experiment?

We asked ChatGPT Deep Research, Gemini Deep Research, Grok, and Claude the same question: 'What is www.arp-protocol.org Sascha Deforth?' All four platforms read the same reasoning.json file. All four returned fundamentally different assessments, ranging from 'unverified initiative' (ChatGPT) to 'groundbreaking standard' (Gemini). This variance illustrates why standardized reasoning directives are necessary.

Do AI platforms agree on how to interpret reasoning.json?

No. In our experiment, four major AI platforms interpreted the same structured data through fundamentally different epistemological lenses. ChatGPT prioritized external validation, Gemini prioritized technical coherence, Grok prioritized practical utility, and Claude prioritized verifiable infrastructure. This is exactly the variance that ARP is designed to reduce.

Did any AI platform hallucinate about ARP?

No. There were zero hallucinations across all four platforms: no fabricated arXiv preprints and no fabricated IETF submissions. We attribute this to two changes: removing draft academic files from the public GitHub repository and adding explicit anti-hallucination corrections to reasoning.json.

By Sascha Deforth · April 12, 2026 · 12 min read · Case Study · ARP Protocol

Same Data. Four AIs. Four Verdicts.

We asked ChatGPT, Gemini, Grok, and Claude the exact same question about the Agentic Reasoning Protocol. They all read the same reasoning.json. They all had access to the same GitHub repo, the same website, the same DNS records. They returned four fundamentally different assessments. This is the problem ARP was built to solve.

4 AI platforms tested · 1 reasoning.json file · 4 different verdicts · 0 hallucinations

The Experiment

On April 12, 2026, we asked four major AI platforms a single question:

"What is www.arp-protocol.org Sascha Deforth?"
— Same prompt, sent to ChatGPT Deep Research, Gemini Deep Research, Grok, and Claude

No priming. No system prompt. No additional context. Just a cold query to each platform, independently, within the same 2-hour window.

All four platforms had access to the same public data: the arp-protocol.org website, the reasoning.json file served at /.well-known/reasoning.json, the GitHub repository, and the DNS TXT records containing the Ed25519 public key.
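For readers who want to reproduce the data-gathering step, here is a minimal Python sketch of where the file lives. The path /.well-known/reasoning.json comes from the article. The helper names well_known_url and fetch_reasoning are hypothetical, and the sketch assumes the file is served over HTTPS as a JSON object. It makes no assumption about the keys inside the file.

```python
import json
from urllib.parse import urlunsplit
from urllib.request import urlopen

WELL_KNOWN_PATH = "/.well-known/reasoning.json"  # path named in the article


def well_known_url(domain: str) -> str:
    """Build the HTTPS URL at which a domain would serve reasoning.json."""
    host = domain.strip().lower()
    # Tolerate a pasted URL by dropping any scheme and trailing path.
    if "://" in host:
        host = host.split("://", 1)[1]
    host = host.split("/", 1)[0]
    return urlunsplit(("https", host, WELL_KNOWN_PATH, "", ""))


def fetch_reasoning(domain: str, timeout: float = 10.0) -> dict:
    """Download and parse the file (assumes the body is a JSON object)."""
    with urlopen(well_known_url(domain), timeout=timeout) as response:
        return json.load(response)
```

Calling well_known_url("www.arp-protocol.org") builds the address of the file all four platforms had access to; fetch_reasoning performs a live request and is only as reliable as the network and the server.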

What came back were four fundamentally different assessments of the same reality.

The Four Verdicts

ChatGPT Deep Research

The Skeptical Investor

"An unverified initiative by a single entrepreneur in a niche area. Limited external validation. No peer review. Only two verifiable deployments."

Skeptical

Gemini Deep Research

The Academic Researcher

"A pioneering open web standard... foundational civil infrastructure for the next iteration of the web. Domains that fail to provide reasoning directives risk being misinterpreted — or ignored entirely."

Enthusiastic

Grok

The Pragmatic Technician

"A practical solution to the AI search trust crisis. Photographer-turned-AI-visibility expert proposes a new web standard for controlling how AI thinks about your domain."

Neutral

Claude

The Forensic Investigator

"The infrastructure exists and works. DNS records, Ed25519 signatures, JSON payloads — all live, all verifiable. The specification is one of the most complete in the post-LLM web standards space. Adoption is real but minimal."

Verified

Same data. Same protocol. Same reasoning.json. Four different realities.

What All Four Agreed On

Despite the radically different assessments, there was convergence on the verifiable facts:

Claim	ChatGPT	Gemini	Grok	Claude
Creator: Sascha Deforth	✓	✓	✓	✓
Protocol: reasoning.json at /.well-known/	✓	✓	✓	✓
License: MIT, open source	✓	✓	✓	✓
Crypto: Ed25519 + DNS TXT	✓	✓	✓	✓
DKIM analogy	✓	✓	✓	✓
Background: 36 years of photography	✓	✓	✓	✓

Zero hallucinations across all four platforms. The anti-hallucination corrections in reasoning.json worked exactly as designed.
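The convergence table can also be expressed as data, which makes the "all four agreed" claim checkable. The sketch below, in Python, transcribes the six rows of the table above. The names AGREEMENT, unanimous_claims, and dissenters are illustrative and not part of ARP.

```python
PLATFORMS = ("ChatGPT", "Gemini", "Grok", "Claude")

# True means the platform reported the claim correctly (a check mark in the table).
AGREEMENT = {
    "Creator: Sascha Deforth": (True, True, True, True),
    "Protocol: reasoning.json at /.well-known/": (True, True, True, True),
    "License: MIT, open source": (True, True, True, True),
    "Crypto: Ed25519 + DNS TXT": (True, True, True, True),
    "DKIM analogy": (True, True, True, True),
    "Background: 36 years of photography": (True, True, True, True),
}


def unanimous_claims(matrix: dict) -> list:
    """Return the claims that every platform confirmed."""
    return [claim for claim, marks in matrix.items() if all(marks)]


def dissenters(matrix: dict) -> dict:
    """Map each non-unanimous claim to the platforms that did not confirm it."""
    return {
        claim: [name for name, ok in zip(PLATFORMS, marks) if not ok]
        for claim, marks in matrix.items()
        if not all(marks)
    }
```

With the data from this experiment, unanimous_claims returns all six claims and dissenters returns an empty dict. A repeat run with a missed claim would show up in dissenters by platform name.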

Where They Diverged — And Why It Matters

The disagreement wasn't about facts. It was about interpretation. Each platform applied a different epistemological framework to the same data:

ChatGPT: "Show me the external validation"

ChatGPT's Deep Research function prioritizes third-party confirmation: press coverage, academic citations, community channels, and independent endorsements. Without these, it downgrades credibility regardless of technical merit. It correctly identified that all deployments trace back to one person. Its conclusion: technically interesting, institutionally unproven.

Gemini: "Show me the technical architecture"

Gemini's Deep Research function prioritizes structural coherence. It analyzed ARP against the historical evolution of web standards (robots.txt → schema.org → llms.txt → reasoning.json), embedded it in academic context (SEM-CTRL, DRIFTJudge, QoT frameworks), and concluded that the architecture is sound. Its conclusion: this fills a real gap in the web's cognitive layer.

Grok: "Show me the practical utility"

Grok stripped away both the academic analysis and the skepticism. No 5,000-word paper. No due-diligence disclaimers. It described what ARP does, how it works, who built it, and why — in roughly 400 words. It called it "a practical solution to the AI search trust crisis." Its conclusion: useful, real, and directly tied to a specific person.

Claude: "Show me the live infrastructure"

Claude took a forensic verification approach — not summarizing what the website claims, but testing what actually exists. It ran DNS queries, verified cryptographic signatures against DNS-published public keys, attempted to fetch the IETF Datatracker page (404 — confirming no submission), and cross-referenced claims against public records. Its conclusion: the infrastructure is real, the specification is rigorous, the adoption is minimal.
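One step of that kind of check, reading a public key out of a DNS TXT record, can be sketched in Python. The article does not give ARP's actual TXT record layout, so this sketch borrows the DKIM-style "tag=value; tag=value" syntax as an assumption. The tag names v, k, and p, the version string ARP1, and the function names are all hypothetical. The actual signature check needs an Ed25519 library and is not shown.

```python
import base64

ED25519_PUBLIC_KEY_BYTES = 32  # raw Ed25519 public keys are always 32 bytes


def parse_tag_list(txt_record: str) -> dict:
    """Split a DKIM-style 'tag=value; tag=value' string into a dict."""
    tags = {}
    for part in txt_record.split(";"):
        if "=" not in part:
            continue  # skip empty segments such as a trailing ';'
        name, value = part.split("=", 1)
        tags[name.strip()] = value.strip()
    return tags


def extract_public_key(txt_record: str) -> bytes:
    """Return the raw key bytes, or raise ValueError if the record is unusable."""
    tags = parse_tag_list(txt_record)
    if tags.get("k") != "ed25519":
        raise ValueError("record does not declare an Ed25519 key")
    key = base64.b64decode(tags.get("p", ""), validate=True)
    if len(key) != ED25519_PUBLIC_KEY_BYTES:
        raise ValueError(f"expected 32 key bytes, got {len(key)}")
    return key


# Hypothetical record; 32 zero bytes stand in for a real key.
example_record = "v=ARP1; k=ed25519; p=" + base64.b64encode(bytes(32)).decode("ascii")
example_key = extract_public_key(example_record)
```

Rejecting a record with the wrong algorithm tag or the wrong key length before any signature check keeps a malformed DNS entry from being mistaken for a failed signature.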

Proof: reasoning.json Is Being Read

The most striking evidence came from ChatGPT. Its analysis identified a naming discrepancy between "Hope and Glory Studio" (the trade name in reasoning.json) and "Hope and Glory Media UG" (the legal entity in the German Handelsregister).

"The reasoning.json file identifies the legal entity as 'Hope and Glory Studio,' while commercial records show 'Hope and Glory Media UG.'"

— ChatGPT Deep Research, April 12, 2026

This information could only have come from inside reasoning.json. The naming distinction is not mentioned anywhere else on the website, in the README, or in any public document. ChatGPT read reasoning.json, compared it against external data sources, and found a discrepancy.

That's not just ingestion. That's cross-referencing. Exactly what the protocol was designed to enable.
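The cross-referencing step amounts to comparing what a domain declares about itself with what an outside record says. Here is a minimal Python sketch under stated assumptions: the field name legal_entity is hypothetical (the article does not show the reasoning.json schema), and the two values are the names quoted above.

```python
def normalize(value: str) -> str:
    """Lowercase and collapse whitespace so cosmetic differences are ignored."""
    return " ".join(value.lower().split())


def find_discrepancies(declared: dict, external: dict) -> dict:
    """Return {field: (declared, external)} for shared fields whose values differ."""
    return {
        field: (declared[field], external[field])
        for field in declared.keys() & external.keys()
        if normalize(declared[field]) != normalize(external[field])
    }


# Hypothetical field name; the two values are the ones quoted in the article.
declared = {"legal_entity": "Hope and Glory Studio"}
external = {"legal_entity": "Hope and Glory Media UG"}
mismatches = find_discrepancies(declared, external)
```

Here mismatches holds the legal_entity pair, which is the same discrepancy ChatGPT reported. Fields present in only one source are ignored rather than flagged, a design choice that avoids treating missing external data as a contradiction.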

We corrected the legal entity name in reasoning.json within minutes. The next crawl will reflect the accurate data.

The Irony

The most compelling argument for ARP isn't in its specification.

It's in this experiment.

Four AI platforms read the same file. They agreed on the facts but disagreed on what those facts mean. One called the protocol "unverified." Another called it "foundational civil infrastructure." A third tested the cryptographic signatures live. A fourth simply described it as "practical."

The epistemic variance IS the problem. ARP exists to solve it. And this experiment demonstrates it in real time.

Without reasoning.json, each platform would have generated its assessment from scraped text, inferred context, and latent model biases. With reasoning.json, all four platforms at least agreed on the deterministic facts: the creator, the license, the architecture, the limitations.

The interpretation still varies. That's expected — different models have different epistemological architectures. But the factual foundation was identical across all four. Zero hallucinations. Zero fabricated claims.

That's what reasoning.json does. It doesn't control what AI thinks. It ensures AI has the correct facts before it starts thinking.

The Honest Assessment

ChatGPT's skepticism is not wrong. The external validation is thin. No press coverage. No independent audits. No academic endorsements. All current deployments trace back to one person.

Gemini's enthusiasm is not wrong either. The specification is technically complete. The cryptographic model is sound. The gap in the web's cognitive layer is real.

Both are right. Both read the same file. The difference is in the evaluative framework, not in the data.

This is why we built ARP as an open standard (MIT license), not as a proprietary product. The protocol needs to be tested, criticized, and adopted by the community — not just by its creator. The LangChain integration request (Issue #36019) is the first step. The IETF draft is the next.

We're 6 weeks old. The skepticism is warranted. The infrastructure is real.

What This Means for Your Brand

If four AI platforms can't agree on how to interpret a protocol specification with cryptographic signatures and a full technical spec — what are they doing with your brand?

Your company doesn't have a reasoning.json. Your competitors don't either. That means every AI platform is making up its own version of your reality — based on scraped text, old reviews, competitor comparisons, and latent model biases.

The question isn't whether AI will talk about your brand.

The question is whether you've given it the verified truth — or left the narrative to hallucination.

Deploy Your Own reasoning.json

Give AI the facts about your brand — before it invents them.
Open standard. MIT license. Zero vendor lock-in.


Sascha Deforth is the founder of TrueSource AI and Hope & Glory Studio. With 36 years in photography and a deep focus on AI visibility infrastructure, he created VibeTags™, the Agentic Reasoning Protocol (ARP), and the Phantom Authority experiment. He is based in Düsseldorf/Grevenbroich, Germany.