A single AI model's answer is a probability-weighted output from one statistical process trained on one corpus. It is useful, but its confidence scores are not calibrated against an external standard — the model is reporting its own uncertainty estimate, not an independently verified accuracy rate. Multi-engine consensus replaces this single probability estimate with an adversarial check among three independently trained systems.
The Independence Requirement
The credibility of multi-engine consensus depends on the engines being genuinely independent — trained on different data, with different architectures, by different organisations. ChatGPT (OpenAI), Perplexity (using its own retrieval infrastructure), and Gemini (Google) satisfy this requirement to a reasonable degree: their training data overlaps but is not identical; their architectures differ; their owners have independent commercial interests. This independence sharply reduces — though, given the overlapping training data, does not eliminate — the chance that their errors are systematically correlated.
The Mathematics of Correlated Error
If each engine errs on a given claim with an independent probability of 10%, a false consensus requires all three to err simultaneously — and to err in the same direction. The probability that three independent 10%-error engines all err on the same claim is 0.1 × 0.1 × 0.1 = 0.1%, compared to 10% for a single engine, and the probability that they also converge on the same false verdict is smaller still. This two-orders-of-magnitude error reduction is the statistical justification for multi-engine consensus.
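The arithmetic above can be sketched in a few lines. This is a minimal illustration, not a measurement: the 10% error rate is the text's illustrative figure, and the calculation assumes fully independent errors, which the previous section notes is only an approximation.

```python
def all_engines_err(p: float, n: int = 3) -> float:
    """Probability that all n engines err on the same claim,
    assuming each errs independently with probability p.
    Note: this bounds the false-consensus rate from above;
    agreeing on the *same* wrong verdict is rarer still."""
    return p ** n

single_engine = 0.10                      # illustrative 10% error rate
consensus = all_engines_err(single_engine)

print(f"single engine error rate:   {single_engine:.1%}")   # 10.0%
print(f"3-engine co-error (upper bound): {consensus:.3%}")  # 0.100%
```

Note that `p ** n` is an upper bound on false consensus: it counts every case where all three engines err, including cases where they err in different directions and therefore fail to agree at all.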