When ClaimHit returns an 84% HIGH, the first question most practitioners ask is what the number actually represents. Probability of infringement? AI confidence? Percentage of claim elements matched?
None of those — and the distinction matters practically. A Hit Score is built from four independent components, each measuring a different dimension of infringement evidence. Getting the interpretation right tells you where to direct attorney time, and equally where not to.
One thing worth flagging upfront: MEDIUM results get dismissed too readily. In our experience, a MEDIUM result on a product with limited public documentation is often more worth pursuing than a HIGH on a product with extensive public specs. The former suggests evidence that may only surface in discovery; the latter may simply reflect documentation volume rather than infringement depth.
Why a single AI output is not enough
Every AI model has blind spots. Training data coverage varies, different models weight evidence sources differently, and reasoning patterns applied to the same claim language can diverge significantly between providers.
A single model flagging a product at 85% confidence is a starting point, nothing more. When nine independent models from Anthropic, OpenAI, Google, DeepSeek, Mistral, and Perplexity converge on the same target without any knowledge of each other's outputs, that agreement carries genuinely different weight. Convergence across independent systems is harder to explain as coincidence or hallucination than a single confident output.
ClaimHit treats model agreement and disagreement as a signal in its own right — separate from what any individual model says.
Component 1 — Model Consensus (40%)
Consensus measures how many of the AI models independently flagged a given company or standard. A product flagged by six out of seven successful model runs scores substantially higher on this component than one flagged by two — and the relationship is not linear.
The consensus ratio uses square root scaling, which means even two or three independent flags produce a meaningful signal while fully rewarding strong convergence. Products with single-model flags and weak evidence are penalised separately by the noise penalty described below.
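A minimal sketch of that scaling, assuming the component is simply the square root of the flag ratio (the function name and the zero-run handling are illustrative assumptions; only the square-root shape is described above):

```python
import math

def consensus_component(models_flagging: int, successful_runs: int) -> float:
    """Square-root scaling of the consensus ratio (hypothetical sketch).

    The square root lifts low ratios, so two or three independent flags
    still register, while strong convergence still scores near the top.
    """
    if successful_runs == 0:
        return 0.0
    return math.sqrt(models_flagging / successful_runs)

# 2 of 7 runs -> ~0.53 (vs 0.29 linear); 6 of 7 -> ~0.93 (vs 0.86 linear)
print(round(consensus_component(2, 7), 2), round(consensus_component(6, 7), 2))
```

The concave curve is the point: doubling the flag count matters most at the low end, where the difference between one flag and three is the difference between noise and signal.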
Consensus is weighted highest (40%) for a straightforward reason: it is the hardest factor to game through training data or prompt construction, and therefore the single most important factor in the Hit Score. Independent agreement across nine models from different companies, with different training data, is structurally harder to explain as hallucination than any single model's confident output.
Component 2 — Weighted Claim Coverage (30%)
Claim coverage measures how many elements of the independent claim appear to be implemented — but weights them by where they fall in the claim structure.
A patent claim's preamble ("A system comprising...") establishes the broad category. The novel, inventive steps come in the characterising elements toward the end of the claim, which is where infringement analysis actually turns. Matching only the preamble is not interesting; matching the inventive core is.
ClaimHit weights early claim elements at 0.5x and characterising features at up to 1.5x. Two products can show the same number of element matches but score differently based on which elements match. A product that hits the preamble and misses the inventive steps scores meaningfully lower than one that does the reverse.
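A sketch of how position weighting might work. Only the 0.5x and 1.5x endpoints come from the description above; the linear ramp between them is an assumption:

```python
def weighted_coverage(matches: list[bool]) -> float:
    """Hypothetical sketch: claim coverage weighted by element position.

    Weights ramp linearly from 0.5 (preamble) to 1.5 (characterising
    elements at the end), so matching the inventive core moves the
    score more than matching the preamble.
    """
    n = len(matches)
    if n == 0:
        return 0.0
    weights = [0.5 + (i / (n - 1)) if n > 1 else 1.0 for i in range(n)]
    matched = sum(w for w, m in zip(weights, matches) if m)
    return matched / sum(weights)

# Same number of matched elements, very different scores:
print(weighted_coverage([True, True, False, False, False]))   # preamble only: 0.25
print(weighted_coverage([False, False, False, True, True]))   # inventive core: 0.55
```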
Component 3 — Evidence Strength (15%)
Evidence strength grades the quality and specificity of citations found by the AI models. A specific product datasheet URL with a version number and a named feature directly tied to a claim element is strong evidence. A general product description that infers a company "likely implements" the technology is weak. ClaimHit tiers evidence quality across six levels (a code sketch follows the list):
- Specific URL + version number + named specification document — 100%
- URL + specific document type (datasheet, FCC filing, whitepaper) — 80%
- URL only, or version reference + specific document — 60%
- Named document or version reference without URL — 40%
- Generic description or marketing page — 25%
- Inference only ("likely implements", "known to support") — 10%
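In code, the tiering reduces to a lookup plus a grading rule. The tier keys and the take-the-best-citation rule are assumptions for illustration; the percentages come from the list above:

```python
# Hypothetical tier-to-score mapping from the six levels above.
EVIDENCE_TIERS = {
    "url_version_spec":   1.00,  # specific URL + version + named specification
    "url_doc_type":       0.80,  # URL + datasheet / FCC filing / whitepaper
    "url_or_version_doc": 0.60,  # URL only, or version reference + specific doc
    "named_doc_no_url":   0.40,  # named document or version, no URL
    "generic_page":       0.25,  # generic description or marketing page
    "inference_only":     0.10,  # "likely implements", "known to support"
}

def evidence_strength(citation_tiers: list[str]) -> float:
    """Sketch: grade each citation by tier, keep the strongest found."""
    return max((EVIDENCE_TIERS[t] for t in citation_tiers), default=0.0)
```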
Evidence strength is weighted lower than consensus (15% vs 40%) deliberately. Strong evidence from a single model is less reliable than moderate evidence agreed upon by multiple models independently. Products in markets with limited public documentation — proprietary hardware, enterprise software — are not penalised for documentation scarcity, only for the quality of what the models actually found.
Component 4 — Functional Equivalence (15%)
Functional equivalence tests whether the product performs the same function, in substantially the same way, to reach the same result as the patented invention — the doctrine of equivalents standard from patent law, applied at the preliminary screening stage.
ClaimHit proxies this by analysing the distribution of element-level match scores. A product with strong matches on most elements scores high on functional equivalence. A product with many partial matches but few strong ones scores lower; widespread partial matching tends to indicate surface similarity rather than genuine functional correspondence.
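A sketch of one way to score that distribution. The strong/partial thresholds and the discount on partials are illustrative assumptions; only the intent (reward strong matches, discount widespread partials) comes from the text:

```python
def functional_equivalence(element_scores: list[float]) -> float:
    """Hypothetical proxy: reward distributions dominated by strong matches.

    element_scores are per-element match scores in [0, 1]. Strong matches
    (>= 0.8, an assumed threshold) pull the component up; widespread
    partials (0.3 to 0.8) suggest surface similarity and pull it down.
    """
    if not element_scores:
        return 0.0
    n = len(element_scores)
    strong = sum(1 for s in element_scores if s >= 0.8) / n
    partial = sum(1 for s in element_scores if 0.3 <= s < 0.8) / n
    return max(0.0, strong - 0.5 * partial)

print(functional_equivalence([0.9, 0.85, 0.9, 0.4]))   # mostly strong -> 0.625
print(functional_equivalence([0.5, 0.6, 0.4, 0.55]))   # all partial  -> 0.0
```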
The Noise Penalty
The noise penalty peaks when all three risk indicators are simultaneously at their weakest: only one model flagged the target, the evidence is inference-only, and element matching is mostly partial. When all three converge, the result is most likely speculative rather than a genuine infringement candidate.
The formula:
(1 - consensus) × (1 - evidence) × (1 - funcEq) × 0.15
The maximum penalty (0.15) only triggers when all three components are at their minimum. Any improvement in any single component reduces it proportionally — so a single strong evidence source, even with weak consensus, substantially dampens the penalty.
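For concreteness, here is a minimal Python sketch of the penalty and of one plausible way the pieces combine. The weighted-sum composition and the function names are assumptions; the text gives the component weights and the penalty formula but does not state how they are assembled:

```python
def noise_penalty(consensus: float, evidence: float, func_eq: float) -> float:
    """Multiplicative penalty: peaks at 0.15 only when all three inputs
    are zero; any strength in any component shrinks it proportionally."""
    return (1 - consensus) * (1 - evidence) * (1 - func_eq) * 0.15

def hit_score(consensus: float, coverage: float,
              evidence: float, func_eq: float) -> float:
    """ASSUMED composition: 40/30/15/15 weighted sum minus the penalty."""
    base = (0.40 * consensus + 0.30 * coverage
            + 0.15 * evidence + 0.15 * func_eq)
    return max(0.0, base - noise_penalty(consensus, evidence, func_eq))

# Single-model flag, inference-only evidence, mostly-partial matching:
print(round(hit_score(consensus=0.1, coverage=0.2, evidence=0.1, func_eq=0.1), 3))
```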
From score to band
HIGH (above 60%) requires multiple models agreeing, core claim elements covered, and documented evidence. MEDIUM (32%–60%) indicates real potential that warrants closer investigation. LOW results are filtered by default — insufficient signal to justify attorney time.
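The band thresholds translate directly; the function name is an assumption:

```python
def band(score: float) -> str:
    """Map a Hit Score in [0, 1] to the reporting bands described above."""
    if score > 0.60:
        return "HIGH"
    if score >= 0.32:
        return "MEDIUM"
    return "LOW"  # filtered from default results
```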
What the score does not tell you
A Hit Score is an early-stage research signal. It is not a legal determination, an infringement opinion, or a prediction of litigation outcome.
Its purpose is to tell you where attorney time is worth spending. A product at 84% HIGH will almost always hold up to attorney scrutiny as worth investigating — the evidence is there, the consensus is there, and the claim coverage is substantive. The score does not do the legal work; it directs it.
Treat any HIGH result as a starting point for qualified patent counsel, not a conclusion.