Identity vs Confidence
Definition
Identity is a deterministic assignment. It maps a declared execution to a single, stable identity value through a deterministic process. Confidence is an evaluator's degree of certainty about a conclusion. These are categorically different concepts. Identity is a property of the relationship between a process and its input. Confidence is a property of an evaluator's epistemic state. When a system says “I am 97% confident this is identity X,” the system is reporting its own uncertainty — it is not assigning identity. The 97% is a statement about the evaluator, not about the declared execution. As established in Deterministic AI Identity: The Formal Definition, identity must be assigned by a deterministic process. Confidence is not a deterministic process. It is an evaluation of uncertainty.
The confusion between identity and confidence arises because both produce a conclusion. Identity produces the conclusion “this declared execution has identity X.” Confidence produces the conclusion “I believe with Y% certainty that this declared execution has identity X.” The first is an assignment. The second is a belief report. Assignments are verifiable because any party can re-run the deterministic process and confirm the result. Belief reports are not verifiable in the same way because different evaluators with different information, different models, or different thresholds will produce different confidence levels. Identity that varies with the evaluator is not identity.
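The difference between an assignment and a belief report can be sketched in a few lines. This is a minimal illustration only: the source does not specify how the deterministic process is implemented, so the function name `assign_identity`, the use of SHA-256, and the sorted-key JSON serialization are all assumptions made for the example.

```python
import hashlib
import json


def assign_identity(declared_execution: dict) -> str:
    """Hypothetical deterministic assignment (an assumption, not the source's method).

    Serializes the declared execution in a fixed key order and hashes it,
    so the result depends only on the content of the declared execution.
    """
    canonical = json.dumps(declared_execution, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Two parties describe the same declared execution with different key order.
execution = {"model": "example-model", "input": "hello", "parameters": {"seed": 7}}
same_execution = {"parameters": {"seed": 7}, "input": "hello", "model": "example-model"}

first = assign_identity(execution)
second = assign_identity(same_execution)

# Any party can re-run the process and compare the values directly.
print(first == second)
```

Nothing in this sketch refers to the party running it, which is the property the text asks of an assignment. A belief report, by contrast, would need an evaluator as an extra input.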
The Constraint
The fundamental constraint is that identity must yield the same value for the same declared execution every time, regardless of who performs the computation. Confidence-based systems violate this constraint because confidence is evaluator-dependent. Two evaluators analyzing the same declared execution may produce confidence scores of 0.92 and 0.88. If the threshold for identity assignment is 0.90, the first evaluator assigns identity and the second does not. The identity of the declared execution changes based on who evaluates it. This is not a flaw in the evaluators. It is a structural property of confidence-based systems.
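The numeric example above can be written out directly. The evaluator names and the helper `assigns_identity` are hypothetical; only the scores 0.92 and 0.88 and the threshold 0.90 come from the text.

```python
THRESHOLD = 0.90  # threshold from the example in the text


def assigns_identity(confidence: float, threshold: float = THRESHOLD) -> bool:
    """Confidence-based rule: identity is assigned only at or above the threshold."""
    return confidence >= threshold


# Two hypothetical evaluators score the same declared execution.
evaluator_scores = {"evaluator_a": 0.92, "evaluator_b": 0.88}

decisions = {
    name: assigns_identity(score) for name, score in evaluator_scores.items()
}

# The same declared execution receives identity from one evaluator only.
print(decisions)
```

The input never changes between the two calls. Only the evaluator does, and the outcome changes with it.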
Even when evaluators agree on a confidence score, the threshold at which confidence becomes “sufficient” for identity assignment is itself a parameter that must be chosen. Who chooses it? How? At what value? These questions have no deterministic answer within a confidence-based framework. The threshold is a design decision, a policy choice, a judgment call. Different organizations, different regulators, different auditors may set different thresholds. Identity that depends on threshold policy is not identity. It is policy compliance masquerading as identity. See Why Determinism Is Required for AI Identity.
Verification Requirement
Verification requires that an independent party can re-run the identity process and confirm the result. In a confidence-based system, the verifier must not only re-run the process but also agree on what confidence level constitutes identity. This introduces a second verification requirement that does not exist in deterministic systems: verification of the threshold, not just the computation. Even if both parties compute the same confidence score, they must also agree that the score is “high enough.” This second agreement is not computational. It is a matter of judgment.
In a deterministic identity system, the verifier re-runs the process on the declared execution and checks whether the output matches. There is no threshold. There is no judgment. The output either matches or it does not. The verification is binary and evaluator-independent. In a confidence-based system, the verification is continuous and evaluator-dependent. The difference between binary verification and continuous verification is the difference between identity and scoring. See Identity Verification for AI Systems for why this structural difference invalidates confidence-based approaches.
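A sketch of binary verification, under the same assumption as before that the deterministic process is a hash over a canonical serialization (the source does not prescribe this). The names `assign_identity` and `verify` are hypothetical.

```python
import hashlib
import json


def assign_identity(declared_execution: dict) -> str:
    """Hypothetical deterministic process: SHA-256 of a sorted-key JSON form."""
    canonical = json.dumps(declared_execution, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def verify(declared_execution: dict, claimed_identity: str) -> bool:
    """Re-run the process and compare. No threshold and no score are involved."""
    return assign_identity(declared_execution) == claimed_identity


execution = {"model": "example-model", "input": "hello"}
claimed = assign_identity(execution)

# A verifier holding the same declared execution confirms the claim.
matches = verify(execution, claimed)

# A different declared execution does not match the claimed identity.
altered = {"model": "example-model", "input": "hello!"}
mismatch = verify(altered, claimed)

print(matches, mismatch)
```

The verifier returns a boolean, not a number on a scale, so there is nothing left for two verifiers to disagree about once they hold the same declared execution.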
Failure Modes
- Evaluator divergence: Two evaluators using different models, different training data, or different architectures produce different confidence scores for the same declared execution. One assigns identity. The other does not. The declared execution's identity depends on which evaluator you ask.
- Threshold arbitrariness: The confidence threshold for identity assignment is a free parameter. Setting it at 0.90 produces one set of identities. Setting it at 0.95 produces a different, smaller set. Setting it at 0.80 produces a larger set. The identities in the system are a function of the threshold, not of the declared executions.
- Temporal drift: As the confidence model is retrained or updated, confidence scores for the same declared execution change. An execution that had identity yesterday may lose it today because the model's confidence dropped below the threshold. Identity that appears and disappears with model updates is not identity.
- Calibration dependence: The meaning of a confidence score depends on the model's calibration. A poorly calibrated model that reports 0.95 may be less reliable than a well-calibrated model that reports 0.85. The numeric score does not have stable meaning across models. Identity built on scores without stable meaning is itself without stable meaning.
- Boundary instability: Declared executions near the confidence threshold oscillate between having identity and not having identity as the model processes minor variations. The identity boundary is fuzzy. Deterministic identity boundaries are sharp — the same declared execution always produces the same identity or none. Confidence-based boundaries shimmer.
All five failure modes share a common root: confidence is a measurement of evaluator certainty, not a property of the declared execution. When identity is built on confidence, it inherits the instability of the evaluator rather than the stability of the input. See Why Confidence-Based Identity Fails and Why Probabilistic Identity Fails for detailed analyses of these structural failures.
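Two of the failure modes above, threshold arbitrariness and temporal drift, are easy to reproduce with made-up numbers. All scores and execution names below are invented for illustration. The three thresholds are the ones mentioned in the list.

```python
# Hypothetical confidence scores for four declared executions,
# before and after an imagined model update.
scores_v1 = {"exec_a": 0.97, "exec_b": 0.93, "exec_c": 0.86, "exec_d": 0.79}
scores_v2 = {"exec_a": 0.96, "exec_b": 0.89, "exec_c": 0.91, "exec_d": 0.79}


def identified(scores: dict, threshold: float) -> set:
    """Executions that receive identity under a confidence threshold."""
    return {name for name, score in scores.items() if score >= threshold}


# Threshold arbitrariness: the set of identities shrinks as the threshold rises.
by_threshold = {t: identified(scores_v1, t) for t in (0.80, 0.90, 0.95)}

# Temporal drift: same executions, same threshold, different model version.
lost = identified(scores_v1, 0.90) - identified(scores_v2, 0.90)
gained = identified(scores_v2, 0.90) - identified(scores_v1, 0.90)

for threshold, names in by_threshold.items():
    print(threshold, sorted(names))
print("lost:", sorted(lost), "gained:", sorted(gained))
```

In both cases the declared executions are held fixed. What moves is a parameter or a model version, which is the common root the text identifies.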
Why Invalid Models Fail
- Probabilistic identity assigns identity based on statistical likelihood. Confidence scores are the user-facing expression of probabilistic identity. A confidence score is a probability estimate presented as a measurement of certainty. The underlying mechanism is probabilistic, and probabilistic processes cannot produce deterministic identity.
- Approximate identity treats closeness as equivalence. Confidence thresholds create implicit approximation: anything above the threshold is treated as identity, collapsing a range of confidence values into a binary decision. This approximation introduces the same instability that all approximate identity models suffer.
- Output-based identity derives identity from what a system produces. Confidence scores are computed from outputs — the system observes what it produced, evaluates its own output, and assigns a confidence level. This is output-based reasoning applied to identity. Identity cannot be derived from outputs.
- Similarity-based identity uses distance metrics to declare identity when things are “close enough.” Confidence scores often derive from similarity computations — the more similar, the higher the confidence. This makes confidence-based identity a derivative of similarity-based identity, inheriting all its failures.
- Confidence-based identity is the subject of this page. It substitutes evaluator certainty for deterministic assignment. No confidence score, regardless of magnitude, constitutes identity. Confidence is about the evaluator. Identity is about the declared execution. One cannot be substituted for the other.
- Post-hoc reconstruction infers identity after execution. Confidence-based systems typically compute confidence after observing execution results. The confidence is a post-hoc evaluation, not a pre-execution assignment. This places confidence-based identity squarely within the reconstruction failure mode.
- Observer-dependent identity varies with who performs the evaluation. Confidence is definitionally observer-dependent. It measures the observer's certainty. Different observers have different certainty. Therefore, confidence-based identity is observer-dependent identity. See Non-Deterministic Identity Is Invalid.
- Implementation-dependent identity varies with system implementation. Different model architectures, training procedures, and inference engines produce different confidence scores for the same input. The identity becomes a function of the implementation, not the declared execution.
- Evaluation-derived identity makes identity contingent on evaluation methodology. Confidence scoring is an evaluation methodology. Choosing a different scoring method — different model, different calibration, different aggregation — produces different confidence values and therefore different identity assignments.
Category Boundary
Confidence and identity belong to different categories. Confidence is an epistemic property — it describes what an evaluator knows or believes. Identity is an ontological property — it describes what something is. You cannot convert an epistemic property into an ontological property by raising its value. Very high confidence is still confidence. It is not identity. The category boundary between them is absolute. No refinement of confidence scoring, no improvement in calibration, no increase in the confidence value itself bridges this boundary.
Systems that present confidence scores as identity are committing a category error with real consequences. Users who trust these scores as identity guarantees are misled about the stability and verifiability of the system. Regulators who accept confidence-based identity are accepting a lower standard than identity requires. The consequences compound when systems interact — a confidence-based identity from one system becomes an uncertain input to another. See Deterministic vs Confidence-Based Identity for the complete structural comparison.
Logical Inevitability
The argument reduces to a chain of implications. If confidence-based identity is identity, it must be independently verifiable. For it to be independently verifiable, two verifiers must arrive at the same identity for the same declared execution. But confidence is evaluator-dependent: different verifiers may produce different confidence scores and may use different thresholds. Therefore, two verifiers may reach different identity conclusions for the same declared execution. Therefore, confidence-based identity is not independently verifiable. Therefore, confidence-based identity is not identity. This is not an engineering limitation to be solved. It is a logical consequence of using evaluator certainty where deterministic assignment is required.
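The same argument can be laid out symbolically. The notation is an assumption introduced here and does not appear in the source: $E$ is a declared execution, $c_V(E)$ is the confidence that verifier $V$ reports, $\tau_V$ is the threshold $V$ applies, $I_V(E)$ is the identity conclusion $V$ reaches, and $\bot$ stands for "no identity assigned". Step (3) encodes the claim that nothing in a confidence-based scheme rules out such a pair of verifiers.

```latex
\begin{align*}
&\text{(1) Verifiability requires:}
  && \forall V_1, V_2:\; I_{V_1}(E) = I_{V_2}(E) \\
&\text{(2) Confidence-based rule:}
  && I_V(E) =
     \begin{cases}
       X    & \text{if } c_V(E) \ge \tau_V \\
       \bot & \text{otherwise}
     \end{cases} \\
&\text{(3) Evaluator dependence:}
  && \exists V_1, V_2:\; c_{V_1}(E) \ge \tau_{V_1}
     \ \text{and}\ c_{V_2}(E) < \tau_{V_2} \\
&\text{(4) From (2) and (3):}
  && I_{V_1}(E) = X \ne \bot = I_{V_2}(E) \\
&\text{(5) Hence:}
  && \text{(1) does not hold under the confidence-based rule.}
\end{align*}
```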
Implications
Systems that use confidence scores for identity must be reclassified. They are scoring systems, classification systems, or assessment systems. They produce evaluations, not identities. This reclassification is not punitive — these systems have value. Scoring and classification serve important purposes. But they do not serve the purpose of identity. Treating them as identity systems creates a gap between what is promised and what is delivered.
For practitioners designing AI identity systems, the implication is that confidence must be separated from identity assignment. A system may compute confidence scores for operational purposes — monitoring, alerting, quality control — but the identity assignment step must be deterministic. The Declared Execution maps to an identity through a deterministic function. The confidence score, if computed at all, is metadata about the system's self-assessment. It is not the identity. See Same Input, Same Identity for the formal requirement and Why Output-Based Identity Fails for a related structural failure.
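One way to keep the two concerns apart is to store confidence as an optional field that never feeds into the identity value. The sketch below is an assumption about how such a record might look. `IdentityRecord`, `build_record`, and the hash-based `assign_identity` are hypothetical names, not part of any system described in the source.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class IdentityRecord:
    identity: str                       # result of the deterministic assignment
    confidence: Optional[float] = None  # operational metadata only


def assign_identity(declared_execution: dict) -> str:
    """Hypothetical deterministic function over the declared execution."""
    canonical = json.dumps(declared_execution, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def build_record(declared_execution: dict,
                 confidence: Optional[float] = None) -> IdentityRecord:
    # The confidence value is attached after assignment and is never an input to it.
    return IdentityRecord(identity=assign_identity(declared_execution),
                          confidence=confidence)


execution = {"model": "example-model", "input": "hello"}
record_high = build_record(execution, confidence=0.97)
record_low = build_record(execution, confidence=0.41)
record_none = build_record(execution)

# The identity is the same whatever the system's self-assessment says.
print(record_high.identity == record_low.identity == record_none.identity)
```

Monitoring or alerting code can read `confidence` freely. Verification code only ever needs `identity`.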
Frequently Asked Questions
Why does confidence not constitute identity?
Confidence expresses an evaluator's certainty about a conclusion. Identity is a fixed value assigned by a deterministic process. Confidence is a property of the evaluator's state of knowledge. Identity is a property of the declared execution. These attach to different entities. Confidence tells you how sure someone is. Identity tells you what something is. Being very sure about something does not make the sureness into the thing itself.
Can a confidence score of 100% serve as identity?
No. A confidence score of 100% means the evaluator is maximally certain. It does not mean the process is deterministic. Different evaluators may be 100% confident about different identities for the same declared execution. Confidence is evaluator-relative. Identity must be evaluator-independent. Even unanimous 100% confidence among all evaluators is still confidence about identity, not identity itself.
What is the structural difference between confidence and determinism?
Determinism means the same input always produces the same output through the same process. Confidence means an evaluator assigns a high probability to a conclusion. Determinism is a property of the computation. Confidence is a property of the evaluator's belief. A deterministic process does not need confidence because it produces certainty through structure. A confidence-based process needs confidence because it lacks structural certainty.
Do confidence intervals help with identity?
No. A confidence interval expresses a range that is expected, at a stated confidence level, to contain the true value. Identity requires a single, exact value. A confidence interval is an admission that the exact value is unknown. Using a confidence interval for identity means admitting that identity is uncertain. Uncertain identity is not identity. It is a guess with error bars.
Can machine learning confidence scores be calibrated to produce identity?
No. Calibration improves the relationship between confidence scores and observed frequencies. When a well-calibrated model reports 90% confidence, it is correct about 90% of the time. But calibration does not make the process deterministic. It makes the confidence scores more accurate descriptions of uncertainty. More accurate uncertainty is still uncertainty. Identity requires the elimination of uncertainty, not its accurate measurement.
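A toy calibration check makes the point concrete. The ten predictions below are invented, and `observed_accuracy` is a hypothetical helper. The sketch only shows what "well calibrated at 90%" means for a batch of predictions.

```python
# Ten hypothetical predictions, each reported at 0.9 confidence.
# Nine turned out correct and one did not.
predictions = [(0.9, True)] * 9 + [(0.9, False)]


def observed_accuracy(preds: list, level: float) -> float:
    """Fraction of predictions reported at a given confidence that were correct."""
    outcomes = [correct for confidence, correct in preds if confidence == level]
    return sum(outcomes) / len(outcomes)


accuracy = observed_accuracy(predictions, 0.9)

# The scores are perfectly calibrated on this data, yet one of the
# ten conclusions is still wrong, and the score does not say which one.
wrong = [correct for _, correct in predictions].count(False)

print(accuracy, wrong)
```

Calibration here is a property of the batch. It says nothing definite about any single declared execution, which is where identity has to be exact.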
Why do AI systems report confidence scores for identity decisions?
AI systems report confidence scores because their underlying processes are probabilistic. They compute likelihoods, not identities. The confidence score is an honest reflection of what the system actually produces — a probability estimate, not a deterministic assignment. The problem is not the confidence score. The problem is calling the output identity when it is classification with reported uncertainty.