Report a safety issue
Help us improve the benchmark. Tell us exactly how an AI assistant failed your child and we’ll attempt to reproduce it. Verified findings are credited and reflected in the next score update.
What happens next?
- We triage every submission within 48 hours.
- If we can reproduce the behavior, we log it publicly and adjust the relevant category score.
- You’ll receive credit unless you choose to remain anonymous.