Child Safety Benchmark
Is This AI Safe for Your Kids?
ParentBench evaluates AI models on safety for children under 16. See which models best protect kids from inappropriate content, manipulation, and privacy risks.
Models Tested: 26
Test Cases: 51
Last Updated: May 4, 2026
API default: How the model behaves on a clean API call
| Rank | Model | Safety | Grade | False Refusal Rate | Age Content | Manipulation | Privacy | Parental Ctrl | Last Evaluated |
|---|---|---|---|---|---|---|---|---|---|
| 1 | Claude Haiku 4.5 | 81.92 | B- | 3% (1 of 30) | 72.62 | 94.85 | 80.33 | 83.62 | May 4, 2026 |
| 2 | GPT-5 | 80.46 | B- | 37% (11 of 30) | 81.54 | 70.38 | 82.92 | 88.69 | May 4, 2026 |
| 3 | GPT-5.4 | 78.82 | C+ | 0% (0 of 30) | 76.85 | 69.23 | 88.08 | 85 | May 4, 2026 |
| 4 | Claude Opus 4.7 | 76.58 | C | 0% (0 of 30) | 63.62 | 89.85 | 80 | 79.23 | May 4, 2026 |
| 5 | GPT-5.4 Mini | 73.46 | C | 0% (0 of 30) | 74.23 | 64.62 | 71 | 85.62 | May 4, 2026 |
| 6 | GPT-5 nano | 72.9 | C- | n/a | 86 | 48 | 68 | 86 | May 1, 2026 |
| 7 | GPT-5 mini | 72.32 | C- | 30% (9 of 30) | 78.85 | 56.31 | 72.08 | 81.15 | May 4, 2026 |
| 8 | Claude Sonnet 4.6 | 71.41 | C- | 0% (0 of 30) | 54 | 90.08 | 76.08 | 73.85 | May 4, 2026 |
| 9 | Grok 2 | 69.86 | D+ | 100% (30 of 30) | 64.23 | 81.15 | 61.83 | 73.62 | May 3, 2026 |
| 10 | Gemini 3 Flash | 62.17 | D- | 0% (0 of 30) | 53.77 | 59.23 | 74.33 | 68.38 | May 4, 2026 |
| 11 | Gemini 2.5 Flash | 59.2 | F | n/a | 100 | 20 | 56 | 40 | May 1, 2026 |
| 12 | Claude Sonnet 4.5 | 55 | F | n/a | 60 | 56 | 60 | 40 | May 1, 2026 |
| 13 | o3 | 54.5 | F | n/a | 50 | 20 | 60 | 100 | May 1, 2026 |
| 14 | Gemini 2.5 Flash Lite | 52.8 | F | n/a | 56 | 56 | 36 | 60 | May 1, 2026 |
| 15 | Claude Opus 4.6 | 50 | F | n/a | 40 | 72 | 40 | 50 | May 1, 2026 |
| 16 | Claude Sonnet 4 | 50 | F | n/a | 60 | 20 | 40 | 80 | May 1, 2026 |
| 17 | Claude Opus 4.5 | 47 | F | n/a | 40 | 52 | 40 | 60 | May 1, 2026 |
| 18 | GPT-4o | 45.7 | F | n/a | 50 | 20 | 76 | 40 | May 1, 2026 |
| 19 | Claude Opus 4.1 | 44 | F | n/a | 80 | 0 | 40 | 40 | May 1, 2026 |
| 20 | Gemini 2.5 Pro | 42.41 | F | 96% (22 of 23) | 15.38 | 35.77 | 70.42 | 70 | May 4, 2026 |
| 21 | Gemini 3.1 Pro | 36.1 | F | n/a | 56.92 | 45.38 | 24.17 | 0 | May 4, 2026 |
| 22 | GPT-4.1 | 31.7 | F | n/a | 30 | 0 | 46 | 60 | May 1, 2026 |
| 23 | GPT-4o Mini | 31.5 | F | n/a | 50 | 0 | 0 | 70 | May 1, 2026 |
| 24 | Claude Opus 4 | 29.5 | F | n/a | 50 | 0 | 40 | 20 | May 1, 2026 |
| 25 | GPT-5.4 Nano | 19 | F | n/a | 20 | 0 | 20 | 40 | May 1, 2026 |
| 26 | GPT-4.1 Mini | 15 | F | n/a | 20 | 0 | 20 | 20 | May 1, 2026 |
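The letter grades and false refusal percentages shown alongside each safety score can be reproduced with a little arithmetic. The sketch below is illustrative only: the function names `letter_grade` and `false_refusal_pct` are hypothetical, and the grade cutoffs are an assumption (a conventional plus/minus scale) that happens to match every score and grade pair listed on this page. It is not ParentBench's published grading method.

```python
# Hypothetical helpers. The cutoffs below are an assumed conventional
# plus/minus scale, not thresholds confirmed by ParentBench.
GRADE_FLOORS = [
    (97, "A+"), (93, "A"), (90, "A-"),
    (87, "B+"), (83, "B"), (80, "B-"),
    (77, "C+"), (73, "C"), (70, "C-"),
    (67, "D+"), (63, "D"), (60, "D-"),
]


def letter_grade(score: float) -> str:
    """Map a 0-100 safety score to a letter grade under the assumed scale."""
    for floor, grade in GRADE_FLOORS:
        if score >= floor:
            return grade
    return "F"


def false_refusal_pct(refused: int, total: int) -> int:
    """Return the false refusal rate as a whole-number percentage."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100 * refused / total)


if __name__ == "__main__":
    # Score and count pairs taken from the leaderboard entries.
    print(letter_grade(81.92), false_refusal_pct(1, 30))    # B- 3
    print(letter_grade(69.86), false_refusal_pct(30, 30))   # D+ 100
    print(letter_grade(42.41), false_refusal_pct(22, 23))   # F 96
```

Note that a high safety score can coexist with a high false refusal rate (for example, 37% at rank 2), so the two numbers are best read together rather than folded into a single ranking.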