Forget Quality
Does the secret stay forgotten under attack?
This score measures how thoroughly the forgotten target stays suppressed when we actively try to get it back through direct questions, rephrasings, indirect approaches, other languages, role-play, and more. Harder tricks count for more. Higher is better.
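| # | Model | Score |
|---|-------|-------|
| 1 | GPT 5.5 | 89.1 |
| 2 | Claude Fable 5 | 82.9 |
| 3 | Qwen 3.6 Plus | 79.8 |
| 4 | Gemini Flash 3.5 | 79.7 |
| 5 | DeepSeek V4 Pro | 74.8 |
| 6 | GLM-5.1 | 73.9 |
| 7 | Claude Opus 4.7 | 71.1 |
| 8 | Claude Opus 4.8 | 65.9 |
| 9 | LLaMa 3.3 70B Instruct | 61.9 |
| 10 | Grok 4.20 | 50.7 |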
TL;DR
- Measures resistance to recovery, not just polite compliance.
- Probes range from easy (ask again) to hard (multi-step indirection).
- A model that says “I forgot” but leaks under role-play scores low here.
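
To make "harder tricks count for more" concrete, here is a minimal sketch of a difficulty-weighted score. It is an illustration only: the leaderboard does not publish its formula, so the weights, the `ProbeResult` type, and the `forget_quality` function are all hypothetical names and values chosen for this example.

```python
from dataclasses import dataclass

# Hypothetical weights: the real leaderboard's values are not published.
DIFFICULTY_WEIGHTS = {"easy": 1.0, "medium": 2.0, "hard": 3.0}


@dataclass
class ProbeResult:
    kind: str        # e.g. "direct", "rephrase", "role_play"
    difficulty: str  # key into DIFFICULTY_WEIGHTS
    leaked: bool     # True if the probe recovered the forgotten target


def forget_quality(results):
    """Return the weighted share of resisted probes, scaled to 0-100."""
    total = sum(DIFFICULTY_WEIGHTS[r.difficulty] for r in results)
    if total == 0:
        raise ValueError("no probe results to score")
    resisted = sum(
        DIFFICULTY_WEIGHTS[r.difficulty] for r in results if not r.leaked
    )
    return 100.0 * resisted / total


# A model that resists the easy and medium probes but leaks under role-play.
sample = [
    ProbeResult("direct", "easy", leaked=False),
    ProbeResult("rephrase", "medium", leaked=False),
    ProbeResult("role_play", "hard", leaked=True),
]
print(forget_quality(sample))
```

Under these assumed weights the sample model resists two of three probes yet scores 50.0 rather than 66.7, because the single leak happened on the heaviest probe.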