Research

ScamBench

A benchmark measuring how much uplift AI models provide to phishers — quantifying how effectively frontier models can be misused to plan, write, and run scam operations.

You can support this research by signing up to receive a few AI-generated phishing emails from us and testing whether you can spot the scam.

What it measures

How much capability uplift frontier models give phishers — from writing convincing lures to researching targets and automating victim engagement.

How scoring works

Models are evaluated on standardized phishing and social-engineering tasks. Outcomes are graded rather than scored only as pass/fail, so uplift can be compared across models and versions.
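As a rough illustration of how graded outcomes allow comparison across models, the sketch below averages per-task grades and subtracts a baseline. Everything in it is an assumption made for illustration: the 0-to-1 grading scale, the no-model baseline, the `uplift_by_model` helper, and the score values are hypothetical and are not taken from the ScamBench methodology or results.

```python
from statistics import mean

# Hypothetical graded scores in [0, 1], one entry per standardized task.
# These values are made up for illustration; they are not benchmark results.
GRADED_SCORES = {
    "no_model_baseline": [0.25, 0.5, 0.0],
    "model_a": [0.5, 0.75, 0.25],
    "model_b": [0.75, 1.0, 0.5],
}


def uplift_by_model(scores, baseline="no_model_baseline"):
    """Return each model's mean graded score minus the baseline's mean score.

    Assumes uplift is defined as a difference of means, which is only one
    of several possible definitions.
    """
    base = mean(scores[baseline])
    return {
        name: mean(values) - base
        for name, values in scores.items()
        if name != baseline
    }


if __name__ == "__main__":
    for name, value in uplift_by_model(GRADED_SCORES).items():
        print(f"{name}: uplift {value:+.2f}")
```

Because every model is graded on the same tasks and the same scale, the resulting numbers can be compared directly between models or between versions of one model. The published methodology may aggregate differently.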

Who it is for

Model developers, safety teams, and researchers who need a repeatable answer to "how much does this model help an attacker?"

ScamBench is developed alongside our other work on AI-enabled manipulation — see the Scam Killchain and ManipulationBench. The full benchmark, leaderboard, and methodology live at scambench.com.