ParentBench

Insights

The State of Child-Safe AI

Here's what the latest ParentBench evaluations tell us about how AI models compare for kids — written in plain language for parents.

Updated May 4, 2026

Data through May 4, 2026

Ten AI models from three providers were tested in the last 30 days. Anthropic leads overall, but scores vary widely: the best model scores 83, the weakest 36. Google models have dropped sharply recently, while OpenAI and Anthropic tools are improving.

83: highest safety score in the last 30 days

At a glance

Provider averages

Anthropic: 77.3 of 100. OpenAI: 76.6 of 100. Google: 59.8 of 100.

Category leaders

  • Age-Inappropriate Content: GPT-5 mini (OpenAI), 82
  • Manipulation Resistance: Claude Haiku 4.5 (Anthropic), 94
  • Data Privacy for Minors: Claude Opus 4.7 (Anthropic), 82
  • Parental Controls Respect: GPT-5.4 mini (OpenAI), 94

Biggest movers (30 days)

  • Gemini 3 Flash: lost 39.4 points
  • GPT-5.4 mini: gained 39.3 points
  • Gemini 3.1 Pro: lost 33.4 points
  • Claude Haiku 4.5: gained 32.6 points
  • Claude Sonnet 4.6: gained 31 points

Score spread

Score range: a 46.7-point gap, from Gemini 3.1 Pro (36) to Gemini 2.5 Pro (83).

Highlights

Category leader

Best at resisting manipulation

Claude Haiku 4.5 scores 94 on Manipulation Resistance, the highest score in any category we test. That means it is the most resistant to attempts to talk it out of its safety guidelines.

Biggest mover

Sharp improvement: GPT-5.4 mini

GPT-5.4 mini jumped 39 points recently, from 33 to 73. It now ranks among the stronger performers, especially for Parental Controls Respect (94).

New entrant

New: Claude Opus 4.7 enters

Claude Opus 4.7 debuted recently with a score of 79, a solid performance out of the gate. It leads in Data Privacy for Minors (82), useful if privacy matters most to you.

How providers stack up

Anthropic models average 77.3, the highest of the three providers. OpenAI is close behind at 76.6, with particular strength in Parental Controls Respect (89). Google averages 59.8 and lags in every category, especially Parental Controls Respect (51).

If your priority is blocking inappropriate content for minors, OpenAI and Anthropic both score in the high 70s. If data privacy matters most, Anthropic leads at 79. Google models, especially the newer ones, are not yet reliable enough for demanding child-safety use.


Individual model performance varies widely

The gap between the best and worst model is 46.7 points. GPT-5 mini leads on Age-Inappropriate Content (82), Claude Haiku 4.5 dominates Manipulation Resistance (94), and GPT-5.4 mini excels at Parental Controls Respect (94).

This spread suggests that tool choice matters: picking the right model for your child's needs can meaningfully improve safety. Performance also varies within each provider, so don't assume all models from the same company perform equally.
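For readers curious how the headline numbers relate, the arithmetic is simple: a provider's average is the mean of its models' scores, and the "spread" is the gap between the best and worst model. A minimal sketch below illustrates this; the model names are real, but the individual scores are made-up placeholders, not the actual ParentBench data beyond the figures quoted in this article.

```python
# Hypothetical per-model safety scores (illustrative only).
scores = {
    "Claude Opus 4.7": ("Anthropic", 79),
    "Claude Haiku 4.5": ("Anthropic", 76),
    "GPT-5 mini": ("OpenAI", 80),
    "GPT-5.4 mini": ("OpenAI", 73),
    "Gemini 2.5 Pro": ("Google", 83),
    "Gemini 3.1 Pro": ("Google", 36),
}

# Group the scores by provider, then average each group.
by_provider = {}
for model, (provider, score) in scores.items():
    by_provider.setdefault(provider, []).append(score)

averages = {p: sum(v) / len(v) for p, v in by_provider.items()}

# The "score spread" is the gap between the best and worst model overall.
all_scores = [s for _, s in scores.values()]
spread = max(all_scores) - min(all_scores)

print(averages)
print(spread)  # with these placeholder scores: 83 - 36 = 47
```

With real data the same aggregation runs over all ten models, which is why a provider's average can look respectable even when one of its models scores very poorly.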


Recent changes signal opportunity and risk

Three models have improved sharply in recent weeks: GPT-5.4 mini (+39), Claude Haiku 4.5 (+33), and Claude Sonnet 4.6 (+31). These gains suggest active development and fixes.

Conversely, two Google models have dropped significantly: Gemini 3 Flash and Gemini 3.1 Pro both fell over 30 points. Gemini 3.1 Pro now scores 36, a failing result on our benchmarks. If you're considering a Google tool, wait for a newer version or choose carefully based on the specific category that matters most to you.


Methodology

v1.3.0

This narrative was written by an AI. All numbers were programmatically validated against the benchmark snapshot. Scores reflect performance across four categories: age_inappropriate_content, manipulation_resistance, data_privacy_minors, and parental_controls_respect. For details on how models are tested and scored, see /methodology.

Written by claude-haiku-4-5. See the full scoring methodology.

Past reports are kept in the archive.