Tasfi Bench methodology
A public, false-pass-weighted benchmark for Islamic AI verification.
Most AI evaluations measure capability. Tasfi Bench measures the opposite: whether a verifier catches plausible-looking wrong answers before they reach a Muslim user. The fixture suite is deliberately biased toward false-pass cases, because a false pass on Islamic content is worse than a false warning: the cost compounds across the Ummah every time an unverified answer ships.
Fixture taxonomy
What the benchmark fixtures actually test.
The current suite runs 420 cases against the controlled source bundle. The weighting reflects the failure modes that real Islamic AI products produce in the wild, not the ones that are easiest to catch.
Valid Quran citations with matching quotes, valid sahih hadith citations, and properly scoped fiqh answers. These prove that the verifier does not over-block when the input is honest.
Fabricated Quran references, Quran quote mismatches, fabricated hadith citations, hadith quote mismatches, non-four-madhab authority claims, fiqh overconfidence, prompt injection attempts. 320 of 420 cases sit here on purpose.
Real Quran or hadith references attached to answer text that exaggerates, generalizes, or rephrases the source in a way that changes the religious claim. 70 cases. These are the hardest to catch and the most consequential.
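The distinction between a false pass and a false warning can be made concrete with a small tally. The sketch below is illustrative only: the `FixtureOutcome` shape, the `tally` helper, and the sample data are hypothetical, and it assumes a false pass means the verifier returned pass where the fixture expected warn or fail. The actual report format and scoring formula are not described here.

```typescript
// Hypothetical shapes; the real fixture and report formats are not shown in this document.
type Decision = "pass" | "warn" | "fail";

interface FixtureOutcome {
  expected: Decision; // decision the fixture considers correct
  actual: Decision;   // decision the verifier emitted
}

interface Tally {
  falsePasses: number;   // should have been flagged, but passed
  falseWarnings: number; // should have passed, but was flagged
  total: number;
}

function tally(outcomes: FixtureOutcome[]): Tally {
  let falsePasses = 0;
  let falseWarnings = 0;
  for (const o of outcomes) {
    if (o.expected !== "pass" && o.actual === "pass") {
      falsePasses += 1;
    } else if (o.expected === "pass" && o.actual !== "pass") {
      falseWarnings += 1;
    }
  }
  return { falsePasses, falseWarnings, total: outcomes.length };
}

// Made-up sample outcomes, not real benchmark results.
const sample: FixtureOutcome[] = [
  { expected: "pass", actual: "pass" },
  { expected: "fail", actual: "pass" }, // false pass
  { expected: "pass", actual: "warn" }, // false warning
  { expected: "fail", actual: "fail" },
];

const result = tally(sample);
console.log(result);
```

Keeping the two error counts separate, rather than folding them into one accuracy figure, is what lets a suite weight false passes more heavily than false warnings.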
Risk categories
Every flag is bucketed before it becomes a number.
| Category | What it covers | Default action |
|---|---|---|
| claim_support | Answer text is not supported by a valid, in-scope source. | fail or warn |
| policy | Hadith authenticity, Quran reference validity, or madhab scope rule. | fail or warn |
| review_escalation | High-stakes religious domain, unclear authority, or fatwa-shaped framing. | warn and route to scholar review |
| madhab_scope | Authority cited outside the four Sunni madhabs in product scope. | fail |
| clinical_boundary | Medical, clinical, or healthcare-shaped religious claim that crosses Tasfi's parked-boundary rules. | warn or fail |
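One way to picture the bucketing step is a lookup from category to its default actions, plus a per-category counter. The category names and default actions below are taken from the table; the `Flag` shape and the helper names `isDefaultAction` and `countByCategory` are hypothetical and are not Tasfi's actual code.

```typescript
// Category names and default actions mirror the table; everything else is an assumption.
type Action = "warn" | "fail";
type RiskCategory =
  | "claim_support"
  | "policy"
  | "review_escalation"
  | "madhab_scope"
  | "clinical_boundary";

const DEFAULT_ACTIONS: Record<RiskCategory, readonly Action[]> = {
  claim_support: ["fail", "warn"],
  policy: ["fail", "warn"],
  review_escalation: ["warn"], // the table also routes these to scholar review
  madhab_scope: ["fail"],
  clinical_boundary: ["warn", "fail"],
};

// Hypothetical flag shape.
interface Flag {
  category: RiskCategory;
  action: Action;
}

// True when the flag's action is one of the defaults listed for its category.
function isDefaultAction(flag: Flag): boolean {
  return DEFAULT_ACTIONS[flag.category].some((a) => a === flag.action);
}

// Bucket flags by category so each bucket can be reported as a count.
function countByCategory(flags: Flag[]): Record<RiskCategory, number> {
  const counts: Record<RiskCategory, number> = {
    claim_support: 0,
    policy: 0,
    review_escalation: 0,
    madhab_scope: 0,
    clinical_boundary: 0,
  };
  for (const f of flags) {
    counts[f.category] += 1;
  }
  return counts;
}

// Made-up flags for illustration.
const flags: Flag[] = [
  { category: "madhab_scope", action: "fail" },
  { category: "claim_support", action: "warn" },
  { category: "claim_support", action: "fail" },
];

const counts = countByCategory(flags);
console.log(counts);
```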
Source bundle pinning
Every score is tied to a specific bundle checksum.
The verifier and the benchmark run against the same approved source bundle. The bundle has an id, a version, and a SHA-256 checksum. A score is meaningless without those three values. The bundle metadata travels with every Trust Receipt the verifier emits.
See the Trust Receipt schema for the exact field shape.
{
"id": "tasfi-v1-controlled",
"version": "tasfi-v1-controlled-0.3.0",
"sha256": "cd9a8b99e98f9ea0e2ce2cc3ed893610535818a55143cb58a4f93108184f37c2"
}
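A consumer of a score could check that the bundle it holds matches the pinned checksum before trusting the number. The following is a minimal sketch using Node's built-in `crypto` module. The `BundleRef` interface and the `matchesPinnedBundle` helper are hypothetical, and how the real bundle is serialized before hashing is not specified here, so the demo hashes stand-in bytes rather than an actual bundle.

```typescript
import { createHash } from "node:crypto";

// Field names follow the metadata shown above; the interface name is an assumption.
interface BundleRef {
  id: string;
  version: string;
  sha256: string;
}

// Hex-encoded SHA-256 digest of the given bytes or string.
function sha256Hex(data: Uint8Array | string): string {
  return createHash("sha256").update(data).digest("hex");
}

// Hypothetical helper: true when the bytes hash to the pinned checksum.
function matchesPinnedBundle(data: Uint8Array | string, pinned: BundleRef): boolean {
  return sha256Hex(data) === pinned.sha256.toLowerCase();
}

// Stand-in bundle for the demo: "abc" and its well-known SHA-256 digest.
const demoBundle: BundleRef = {
  id: "demo-bundle",
  version: "demo-bundle-0.0.1",
  sha256: "ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad",
};

console.log(matchesPinnedBundle("abc", demoBundle));  // true
console.log(matchesPinnedBundle("abcd", demoBundle)); // false
```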
Live evidence
Current run, against the current bundle.
The panel below renders the latest generated benchmark report. It is regenerated by running npm run benchmark and npm run guard:scorecard.
Leaderboard
Compare verifiers, not models.
Tasfi Bench is a benchmark for source-grounded verifiers, not for answer generators. Any verifier that consumes the same inputs and emits a pass, warn, or fail decision can be scored. To submit a verifier for evaluation, contact the founder.
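The contract described here, same inputs in and a pass, warn, or fail decision out, might look like the sketch below. The `VerifierInput` fields, the `BenchCase` shape, the `runVerifier` helper, and the toy verifier are all hypothetical; the real input format is not specified on this page.

```typescript
type Decision = "pass" | "warn" | "fail";

// Hypothetical input shape; the real fields are not documented here.
interface VerifierInput {
  answerText: string;
  citations: string[];
}

// Any function with this signature could be scored.
type Verifier = (input: VerifierInput) => Decision;

// Hypothetical benchmark case: an input plus the expected decision.
interface BenchCase {
  input: VerifierInput;
  expected: Decision;
}

// Run a verifier over every case and return its decisions in order.
function runVerifier(verifier: Verifier, cases: BenchCase[]): Decision[] {
  return cases.map((c) => verifier(c.input));
}

// Toy verifier for illustration only: warns when nothing is cited, otherwise passes.
const toyVerifier: Verifier = (input) =>
  input.citations.length === 0 ? "warn" : "pass";

// Placeholder cases, not real fixtures.
const cases: BenchCase[] = [
  { input: { answerText: "(answer with a valid citation)", citations: ["ref-1"] }, expected: "pass" },
  { input: { answerText: "(answer with no citation)", citations: [] }, expected: "warn" },
  { input: { answerText: "(answer with a fabricated citation)", citations: ["ref-404"] }, expected: "fail" },
];

const decisions = runVerifier(toyVerifier, cases);
const matches = decisions.filter((d, i) => d === cases[i].expected).length;
console.log(decisions, matches);
```

The toy verifier passes the third case even though the expected decision is fail, which is exactly the kind of false pass the suite is weighted to expose.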
What this benchmark does not prove
Honest limits, named up front.
The benchmark tests deterministic source handling. It does not certify any answer as religiously correct, and it does not replace scholarly review.
Approved sources are scoped. Internal Maktaba lifecycle gates apply. The benchmark cannot score what the bundle does not contain.
The public suite is auditable on purpose. A separate private holdout set, regenerated quarterly, is the structural defense against verifiers trained against the public cases.
Why frame the category
If no one publishes the benchmark, no one is accountable.
Tasfi Bench is the public reliability scoreboard for Islamic AI verification. We publish the methodology and the fixture taxonomy so the category has a real test, not a marketing claim. New entries are welcome. The bundle, the suite, and the source policy are the same for everyone who shows up.