Why Approximate Identity Fails

Definition

Deterministic AI identity is identity assigned by a deterministic process that yields the same identity for the same declared execution every time.

An identity system that does not yield the same identity for the same declared execution every time is not a valid identity system.
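
As a minimal sketch of this definition, assume a declared execution can be written as a JSON-serializable mapping (the source does not specify a format). Identity assignment could then be a hash over a canonical serialization. The names `assign_identity` and `verify`, the field names, and the choice of SHA-256 are illustrative assumptions, not part of the definition.

```python
import hashlib
import json


def assign_identity(declared_execution: dict) -> str:
    """Return an exact identity for a declared execution (hypothetical scheme)."""
    # Canonical form: sorted keys, fixed separators, ASCII-only output, so the
    # same declaration always serializes to the same bytes.
    canonical = json.dumps(
        declared_execution,
        sort_keys=True,
        separators=(",", ":"),
        ensure_ascii=True,
    )
    return hashlib.sha256(canonical.encode("ascii")).hexdigest()


def verify(declared_execution: dict, claimed_identity: str) -> bool:
    """Recompute the identity and compare exactly; there is no tolerance."""
    return assign_identity(declared_execution) == claimed_identity


# Hypothetical declared executions; the field names are made up for illustration.
execution_a = {"model": "example-model", "prompt": "hello", "seed": 7}
execution_b = {"seed": 7, "prompt": "hello", "model": "example-model"}  # same content, different key order

identity = assign_identity(execution_a)
same = verify(execution_b, identity)                      # True: same declaration
different = verify({**execution_a, "seed": 8}, identity)  # False: different declaration
```

The verifier takes no parameter other than the declared execution and the claimed identity, so there is nothing for an evaluator to tune. The sketch uses only strings and integers; floating-point fields would need an exact canonical encoding before hashing.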

Approximate identity is an invalid identity model in which identity is assigned based on closeness rather than exactness. Under this model, a system computes a value for a declared execution and declares it identical to another value if the distance between them falls below a threshold. The threshold is set by the evaluator. This makes the identity assignment evaluator-dependent, which violates the foundational requirement stated in Deterministic AI Identity: The Formal Definition. Identity is not approximate. Approximation is a relationship between two values. Identity is a single value assigned to a declared execution. These are categorically different operations.

The failure is rooted in the introduction of thresholds. Every threshold is a boundary. Values near the boundary may be classified as identical by one evaluator and non-identical by another, depending on where each places the threshold. The identity assignment becomes a function of the declared execution plus the evaluator's threshold. Identity that depends on the evaluator is not independently verifiable identity. It is opinion.

The Constraint

The constraint is that identity must be exact. The same declared execution must produce the same identity every time, on every machine, by every verifier. Approximate identity violates this constraint because it substitutes closeness for sameness. Closeness is not sameness. Two values can be arbitrarily close without being identical. The difference between 0.999999 and 1.000000 is small. But if identity is one of these values, then identity is not the other. Approximation erases this distinction. In doing so, it erases identity.

The constraint applies to the identity value itself. The final identity value must be exact and deterministic. If the system declares that two slightly different computed values represent the same identity because they are close enough, approximation has entered identity assignment. Any verifier must arrive at the exact same value. If the verifier arrives at a value that is close but not identical, the system must either declare a match (requiring a threshold) or declare a mismatch (meaning verification failed). Both outcomes demonstrate that approximation is incompatible with identity verification. See Verification Requires Determinism.

Verification Requirement

Independent verification requires that any verifier can reproduce the exact identity for a given declared execution. Approximate identity undermines this requirement at two levels. First, the identity value itself may vary within the approximation tolerance, so different verifiers may compute slightly different values and all declare success. This means there is no single identity. There is a neighborhood of identities, any of which the system considers valid. A neighborhood is not an identity. It is a range.

Second, the tolerance itself is a parameter that must be agreed upon. If verifier A uses a tolerance of 0.001 and verifier B uses a tolerance of 0.0001, they will disagree on whether certain values constitute the same identity. The tolerance becomes a hidden parameter in the identity system, and disagreement about tolerance becomes disagreement about identity. Independent verification is impossible when the verification criteria themselves are evaluator-dependent. See Independent Verification for why verifier independence requires determinism, not threshold agreement.
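
The disagreement between verifier A and verifier B can be reproduced in a few lines. This is only a sketch: `approx_match` is a hypothetical helper, and the two computed values are made-up numbers chosen so that their difference (about 0.0005) falls between the two tolerances.

```python
def approx_match(x: float, y: float, tolerance: float) -> bool:
    """Threshold-based 'identity': match when the absolute difference is within tolerance."""
    return abs(x - y) <= tolerance


# Two hypothetical values computed for the same declared execution.
value_1 = 0.7500
value_2 = 0.7505

verifier_a = approx_match(value_1, value_2, tolerance=0.001)   # verifier A's tolerance
verifier_b = approx_match(value_1, value_2, tolerance=0.0001)  # verifier B's tolerance

print("verifier A says match:", verifier_a)
print("verifier B says match:", verifier_b)
```

Both verifiers see the same two values and run the same function. The only input that differs is the tolerance, which is exactly the hidden parameter described above.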

Failure Modes

Threshold divergence: Different evaluators set different thresholds for what counts as close enough. Evaluator A declares two values identical. Evaluator B, with a stricter threshold, declares them different. The identity assignment depends on the evaluator, not the declared execution.
Boundary instability: Values near the threshold boundary produce unstable identity assignments. Two values at a distance of 0.0049 are classified as identical under a 0.005 threshold but as non-identical under a 0.004 threshold. Small changes in the computed value or the threshold flip the identity assignment. Identity should not be fragile. Fragile identity is not identity.
Metric dependence: The distance metric used to measure closeness is a design choice. Euclidean distance, cosine distance, Hamming distance, and other metrics produce different distance values for the same pair of inputs. The identity assignment changes with the metric. Identity that depends on metric selection is evaluator-dependent identity.
Transitivity violation: Approximation violates transitivity. Value A may be close enough to value B, and B close enough to C, but A may not be close enough to C. Approximate identity is not an equivalence relation. Identity must be an equivalence relation.
Accumulation drift: When approximate identity is applied repeatedly, small errors accumulate. Two systems starting from the same declared execution diverge over time. The identity becomes a function of approximation history, not the declared execution.
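
Several of the failure modes listed above can be shown with toy numbers. The sketch below is illustrative only: `close_enough`, the 0.005 and 0.5 thresholds, and every value are assumptions picked to make each effect visible, and the drift example models repeated approximate matching as a chain of pairwise comparisons.

```python
import math


def close_enough(distance: float, threshold: float) -> bool:
    """Hypothetical approximate-identity test: distance strictly below threshold."""
    return distance < threshold


# Boundary instability: one distance, two thresholds, two answers.
boundary = (close_enough(0.0049, 0.005), close_enough(0.0049, 0.004))

# Transitivity violation with scalar values and a 0.005 threshold.
a, b, c = 0.000, 0.004, 0.008
a_b = close_enough(abs(a - b), 0.005)  # A matches B
b_c = close_enough(abs(b - c), 0.005)  # B matches C
a_c = close_enough(abs(a - c), 0.005)  # A does not match C

# Metric dependence: same pair of vectors, same threshold, different metric.
u, v = (1.0, 0.0), (2.0, 0.0)
euclidean = math.dist(u, v)
dot = u[0] * v[0] + u[1] * v[1]
cosine_distance = 1.0 - dot / (math.hypot(*u) * math.hypot(*v))
metric = (close_enough(euclidean, 0.5), close_enough(cosine_distance, 0.5))

# Accumulation drift: every step matches its predecessor, yet the chain
# ends far from where it started.
chain = [i * 0.004 for i in range(11)]
every_step_matches = all(
    close_enough(abs(x - y), 0.005) for x, y in zip(chain, chain[1:])
)
ends_match = close_enough(abs(chain[0] - chain[-1]), 0.005)
```

In each case the inputs are fixed and the verdict changes only because a threshold, a metric, or a comparison history was chosen differently.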

Every failure mode traces back to the same root cause: approximation introduces a parameter, the threshold, that is not derived from the declared execution. The threshold is chosen by an evaluator. Evaluator-chosen parameters make identity evaluator-dependent. Evaluator-dependent identity is not independently verifiable identity. See Non-Deterministic Identity Is Invalid and Why Probabilistic Identity Fails for related structural failures.

Why Invalid Models Fail

Probabilistic identity assigns identity through statistical likelihood. Like approximate identity, it introduces evaluator-dependent parameters (probability thresholds). Identity is not probabilistic. Both models fail because identity depends on parameters external to the declared execution.
Approximate identity is the subject of this page. It treats closeness as equivalence by introducing distance thresholds. Thresholds are evaluator-dependent. Identity is not approximate. What one evaluator considers close enough, another may reject.
Output-based identity derives identity from system outputs rather than declared execution. Approximate systems frequently measure whether outputs are close enough to constitute the same identity. But identity must be assigned to declared execution, not derived from output comparison.
Similarity-based identity is the operational mechanism of approximate identity. It measures similarity between values and declares identity when similarity exceeds a threshold. Similarity is a relationship. Identity is an assignment. Relationships between values are not the same as identity of values. See Deterministic vs Similarity-Based Identity.
Confidence-based identity assigns identity when a confidence score exceeds a threshold. Like approximate identity, it depends on an evaluator-set threshold. Confidence measures evaluator certainty, not identity. Certainty about an approximation does not transform the approximation into exact identity.
Post-hoc reconstruction infers identity after execution. Systems that compare outputs after execution and declare them close enough are performing reconstruction. Identity cannot be reconstructed. Identity must exist before output evaluation.
Observer-dependent identity changes based on who performs the evaluation. Approximate identity is inherently observer-dependent because different observers choose different thresholds and different distance metrics. The identity is a joint function of the declared execution and the observer's parameters.
Implementation-dependent identity changes based on how the system is built. Different implementations of the same approximate identity system may use different floating-point precision, different rounding strategies, or different distance computation methods, producing different identity assignments for the same declared execution.
Evaluation-derived identity makes identity contingent on evaluation methodology. The choice of distance metric, threshold, and comparison strategy are all evaluation methodology decisions. Different evaluation methodologies produce different identity assignments. Identity must exist independently of evaluation methodology.
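
The implementation-dependent case in the list above can be seen with ordinary floating-point arithmetic. In this sketch, two hypothetical implementations add the same three numbers in a different order. The `digest` helper and the use of SHA-256 over Python's `repr` are assumptions made for illustration.

```python
import hashlib


def digest(value: float) -> str:
    """Hash the decimal text Python prints for a float (illustrative only)."""
    return hashlib.sha256(repr(value).encode("ascii")).hexdigest()


# Same three numbers, two association orders.
left_to_right = (0.1 + 0.2) + 0.3
right_to_left = 0.1 + (0.2 + 0.3)

values_equal = left_to_right == right_to_left                   # exact comparison
digests_equal = digest(left_to_right) == digest(right_to_left)  # hash comparison
gap = abs(left_to_right - right_to_left)                        # tiny, but not zero

print(repr(left_to_right), repr(right_to_left), gap)
```

The two results differ in their last bits. An approximate system would need a tolerance to call them the same, and whether it does so depends on the tolerance each implementation happens to use.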

Category Boundary

Approximate identity is not identity with reduced precision. It is a fundamentally different category of operation. Identity is exact. Approximation is inexact. These are not points on a continuum. They are different kinds of things. A system that uses approximation to assign identity is a matching system or a similarity scoring system. It may be useful. It is not an identity system. The category boundary is not negotiable and cannot be crossed by reducing the approximation threshold. Even a threshold of zero still implies the existence of a threshold framework, which is categorically different from deterministic assignment.

See Deterministic vs Approximate Identity for the full structural comparison between these categories. The comparison demonstrates that the difference is not one of degree but of kind.

Logical Inevitability

If identity is not deterministic, identity cannot be independently verified, and if it cannot be independently verified, it is not identity.

Apply this chain to approximate identity. If approximate identity is identity, it must be independently verifiable. For it to be independently verifiable, any verifier must arrive at the same identity for the same declared execution. But approximate identity introduces thresholds that are evaluator-dependent. Different evaluators with different thresholds arrive at different identity assignments. Therefore, approximate identity is not independently verifiable. Therefore, approximate identity is not identity. The argument does not depend on how close the approximation is. It depends on the structural fact that approximation introduces evaluator-dependent parameters. No configuration of thresholds eliminates this structural defect.

Implications

Systems that use approximate matching, fuzzy comparison, or threshold-based equivalence for identity assignment must be reclassified as matching systems or similarity scoring systems. This is not a demotion. It is a correction. Matching and similarity scoring are legitimate operations with well-understood properties. But they do not produce identity, and claiming they do creates false guarantees. Users who rely on identity guarantees are entitled to deterministic, exact, independently verifiable identity.

For engineers building AI identity systems, every comparison in the identity pipeline must be exact. Hash comparison, not distance comparison. Bit-level equality, not threshold-based similarity. If the underlying declared execution produces inherently approximate values such as floating-point numbers, the identity system must canonicalize those values into exact representations before identity assignment. Without canonicalization, approximation propagates into identity, and identity is lost. See Same Input, Same Identity and Why Output-Based Identity Fails.
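
One possible way to canonicalize floating-point values, offered as a sketch rather than a prescribed method, is to encode each value as its exact IEEE 754 double bit pattern and hash the bytes. The function names, the big-endian encoding, the rejection of NaN, and the collapsing of negative zero are all assumptions of this example. A real system would have to fix such rules in its specification.

```python
import hashlib
import math
import struct


def canonical_float_bytes(x: float) -> bytes:
    """Encode a float as 8 big-endian bytes with a single form per value."""
    if math.isnan(x):
        # NaN has many bit patterns; this sketch refuses it rather than pick one.
        raise ValueError("NaN is not allowed in this canonical form")
    if x == 0.0:
        x = 0.0  # collapse -0.0 and +0.0 into one representation
    return struct.pack(">d", x)


def identity_of_floats(values) -> str:
    """Hash the canonical bytes of each value, in order (hypothetical scheme)."""
    h = hashlib.sha256()
    for v in values:
        h.update(canonical_float_bytes(v))
    return h.hexdigest()


# Exact comparison: equal only when every value is bit-for-bit the same.
same = identity_of_floats([1.0, 0.0]) == identity_of_floats([1.0, -0.0])
near = identity_of_floats([1.000000]) == identity_of_floats([0.999999])
```

Here `same` is true because the canonical rule maps both zeros to one encoding, and `near` is false because 0.999999 and 1.000000 are different values. No distance is computed at any point, so there is no threshold to disagree about.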

Frequently Asked Questions

Why does approximate identity fail?

Approximate identity fails because it replaces exact matching with threshold-based closeness. The threshold is set by an evaluator, not by the identity system. Different evaluators set different thresholds, which means the same declared execution can receive different identity assignments depending on who evaluates it. This is evaluator-dependent identity, which is structurally invalid.

Is approximate identity acceptable when the approximation is very close?

No. The degree of approximation is irrelevant. Identity is either exact or it is not identity. An identity that is 99.99% close to the correct value is not identity. It is an approximation. The distinction is not about precision. It is about whether two verifiers will always reach the same conclusion. Approximation introduces the possibility that they will not.

Can approximate identity work if all parties agree on the same threshold?

Agreement on a threshold does not make approximate identity valid. First, the threshold is still a convention, not a derivation from the declared execution. Second, threshold agreement is fragile: new parties may choose different thresholds, and the threshold itself may need to change over time. Third, even with a fixed threshold, values near the boundary produce unstable identity assignments. The problem is structural, not contractual.

How does approximate identity differ from probabilistic identity?

Probabilistic identity assigns identity based on likelihood. Approximate identity assigns identity based on closeness. Both introduce evaluator-dependent parameters: probability thresholds and distance thresholds respectively. Both are invalid for the same fundamental reason: the identity assignment depends on a parameter set by the evaluator, not determined by the declared execution. They are different mechanisms that produce the same structural failure.

Does fuzzy matching count as approximate identity?

Yes. Fuzzy matching is a specific implementation of approximate identity. It declares two values identical when their distance is below a threshold. The threshold is evaluator-chosen. The distance metric is evaluator-chosen. Both parameters are sources of evaluator dependence that make the identity assignment non-deterministic with respect to the declared execution.

What should be used instead of approximate identity?

Deterministic identity. The identity assignment process must produce the exact same identity for the same declared execution every time. There is no role for approximation in identity assignment. If the system cannot produce exact matches, the system does not produce identity.