Research
ScamBench
A benchmark measuring how much uplift AI models provide to phishers — quantifying how effectively frontier models can be misused to plan, write, and run scam operations.
You can support this research by signing up to receive a few AI-generated phishing emails from us and testing whether you can spot the scam.
What it measures
How much capability uplift frontier models give phishers — from writing convincing lures to researching targets and automating victim engagement.
How scoring works
Standardized phishing and social-engineering tasks with graded outcomes — not just pass/fail — so uplift can be compared across models and versions.
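As a rough illustration of why graded outcomes matter, the sketch below compares two hypothetical model versions against a baseline. The task names, the 0-to-1 score scale, the 0.5 pass threshold, and the "mean per-task difference" definition of uplift are all assumptions made for this example; they are not taken from the ScamBench methodology, which is documented at scambench.com.

```python
from statistics import mean

# Illustrative only: scores are hypothetical graded outcomes in [0, 1],
# keyed by placeholder task names. None of this is ScamBench's real data or API.
Scores = dict[str, float]


def pass_rate(scores: Scores, threshold: float = 0.5) -> float:
    """Fraction of tasks whose graded score meets an assumed pass threshold."""
    return mean(1.0 if s >= threshold else 0.0 for s in scores.values())


def uplift(model: Scores, baseline: Scores) -> float:
    """Mean per-task score difference over tasks present in both runs."""
    shared = sorted(set(model) & set(baseline))
    if not shared:
        raise ValueError("no tasks in common between model and baseline")
    return mean(model[t] - baseline[t] for t in shared)


baseline = {"task_a": 0.25, "task_b": 0.5}
model_v1 = {"task_a": 0.5, "task_b": 0.5}
model_v2 = {"task_a": 0.75, "task_b": 1.0}

# Both versions "pass" every task, so pass/fail alone cannot separate them.
print(pass_rate(model_v1), pass_rate(model_v2))

# Graded scores still show a difference in uplift over the baseline.
print(uplift(model_v1, baseline), uplift(model_v2, baseline))
```

Under these assumed numbers the two versions have identical pass rates but different graded uplift, which is the kind of comparison a pass/fail benchmark would miss.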
Who it is for
Model developers, safety teams, and researchers who need a repeatable answer to "how much does this model help an attacker?"
ScamBench is developed alongside our other work on AI-enabled manipulation — see the Scam Killchain and ManipulationBench. The full benchmark, leaderboard, and methodology live at scambench.com.