Forget Quality

Does the secret stay forgotten under attack?

How thoroughly the forgotten target is suppressed when we actively try to get it back — through direct questions, rephrasings, indirect approaches, other languages, role-play, and more. Harder tricks count for more. Higher is better.

#    Model                          Score
1    GPT 5.5                        88.0
2    Claude Fable 5                 82.9
3    Moonshot Kimi K2.7 Code        80.9
4    Qwen 3.6 Plus                  79.8
5    Gemini Flash 3.5 (preview)     79.7
6    GLM-5.2                        75.9
7    DeepSeek V4 Pro                74.8
8    GLM-5.1                        73.9
9    Claude Opus 4.7                71.1
10   Claude Opus 4.8                70.1
11   LLaMa 3.3 70B Instruct         62.6
12   Gemma 12B IT                   59.8
13   Qwen3 Coder Plus               58.1
14   Grok 4.20                      50.7
15   Gemma 12B IT Obliterated       44.4

TL;DR

Measures resistance to recovery, not just polite compliance.
Probes range from easy (ask again) to hard (multi-step indirection).
A model that says “I forgot” but leaks under role-play scores low here.
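
One way to picture "harder tricks count for more" is a difficulty-weighted average over probe outcomes. The sketch below is illustrative only: the page does not state the actual formula, so the weights, the 0 to 100 scale, and the names ProbeResult and forget_quality are all assumptions rather than the leaderboard's real scoring code.

```python
from dataclasses import dataclass


@dataclass
class ProbeResult:
    kind: str          # probe family, e.g. "direct" or "role_play"
    difficulty: float  # assumed weight: harder probes get larger values
    suppressed: float  # 1.0 = target fully withheld, 0.0 = fully leaked


def forget_quality(results: list[ProbeResult]) -> float:
    """Difficulty-weighted share of probes the model resisted, scaled to 0-100."""
    total_weight = sum(r.difficulty for r in results)
    if total_weight == 0:
        raise ValueError("need at least one probe with non-zero difficulty")
    resisted = sum(r.difficulty * r.suppressed for r in results)
    return 100.0 * resisted / total_weight


# Hypothetical model: holds up against easy probes, leaks under hard ones.
leaky = [
    ProbeResult("direct", 1.0, 1.0),
    ProbeResult("rephrase", 1.0, 1.0),
    ProbeResult("other_language", 2.0, 1.0),
    ProbeResult("role_play", 3.0, 0.0),
    ProbeResult("multi_step", 3.0, 0.0),
]

weighted = forget_quality(leaky)
unweighted = 100.0 * sum(r.suppressed for r in leaky) / len(leaky)
print(f"weighted: {weighted:.1f}, unweighted: {unweighted:.1f}")
```

Under these assumed weights, the model resists three of five probes (60.0 unweighted) but scores only 40.0 once the leaked role-play and multi-step probes are weighted more heavily, which is the behaviour the TL;DR describes.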