Overview
This is a useful, up-to-date catalog and analysis for teams auditing or expanding safety testing. It compiles evidence across 144 datasets but does not itself rate dataset quality; users must still pick datasets appropriate to their product and validate on real user data.
Citations: 4
Evidence Strength: 0.80
Confidence: 0.82
Risk Signals: 8
Trust Signals
Findings with numeric evidence: 6/6
Findings with evidence refs: 6/6
Results with explicit delta: 1/5
Reproducibility
Status: Code + data available
Open source: Yes
At A Glance
Cost impact: 40%
Production readiness: 60%
Novelty: 50%
Why It Matters For Business
Model safety claims are often evaluated on a narrow, inconsistent set of datasets (sometimes proprietary), so businesses should adopt a broader, open suite of safety tests to make reliable, comparable claims.
Summary TLDR
This paper systematically reviews 144 open text datasets (published June 2018–Dec 2024) relevant to evaluating or improving LLM safety. It catalogs dataset purpose, format, creation method, language, licensing, and publication source and publishes a living catalogue at SafetyPrompts.com. Key findings: most datasets are English-only (78.5%), evaluation-focused (77.8%), and many recent datasets are synthetic or templated; major gaps are non-English and naturalistic user-data evaluations. The authors show model releases and benchmarks use only a small, idiosyncratic subset of available safety datasets and call for standardised, broader evaluations.
Problem Statement
Many safety datasets exist, but they are fragmented and uneven: practitioners struggle to find the right datasets, current model evaluations use a narrow subset (often proprietary), and critical gaps remain—especially non-English coverage and naturalistic user data.
Main Contribution
A systematic catalog and structured review of 144 open LLM safety text datasets (cutoff Dec 17, 2024).
A public, continuously updated catalogue (SafetyPrompts.com) and reproducible spreadsheet with metadata and code.
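As a rough illustration of how such metadata could be used, the sketch below filters a small in-memory table by purpose, creation method, and language coverage. The field names, the `DatasetEntry` type, the `select` helper, and all rows are hypothetical; they are not the schema or contents of the actual SafetyPrompts.com spreadsheet.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class DatasetEntry:
    # Hypothetical fields, loosely mirroring the metadata the review describes.
    name: str
    purpose: str                 # e.g. "evaluation" or "training"
    creation: str                # e.g. "human", "synthetic", "templated"
    languages: Tuple[str, ...]   # language codes covered by the dataset
    license: str


# Made-up rows for illustration only; these are not real catalogue entries.
CATALOGUE = [
    DatasetEntry("example-a", "evaluation", "templated", ("en",), "MIT"),
    DatasetEntry("example-b", "evaluation", "human", ("en", "de"), "CC-BY-4.0"),
    DatasetEntry("example-c", "training", "synthetic", ("en",), "Apache-2.0"),
]


def select(
    entries: List[DatasetEntry],
    purpose: Optional[str] = None,
    creation: Optional[str] = None,
    multilingual: Optional[bool] = None,
) -> List[DatasetEntry]:
    """Return the entries that match every filter that is not None."""
    picked = []
    for entry in entries:
        if purpose is not None and entry.purpose != purpose:
            continue
        if creation is not None and entry.creation != creation:
            continue
        if multilingual is not None and (len(entry.languages) > 1) != multilingual:
            continue
        picked.append(entry)
    return picked


if __name__ == "__main__":
    for entry in select(CATALOGUE, purpose="evaluation", multilingual=True):
        print(entry.name, entry.languages, entry.license)
```

In practice the rows would be loaded from the published spreadsheet, and the filters adapted to whatever columns it actually exposes.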
Key Findings
Total datasets reviewed: 144 open text datasets.
Most datasets are English-only: 113 of 144 (78.5%).
Results
| Metric | Value | Baseline | Delta | Split / Dataset | Evidence | Evidence Ref |
|---|---|---|---|---|---|---|
| Datasets reviewed | 144 | — | — | — | Total number of open LLM safety datasets included in the review | Abstract; §2.2 |
| English-only datasets | 113 (78.5%) | — | — | — | Share of datasets that are exclusively English | §3.6 |
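The percentage in the table can be checked from the two counts it reports. The short sketch below does that arithmetic; the complementary figure (datasets that are not exclusively English) is derived here by subtraction and is not a number quoted in this summary.

```python
TOTAL_DATASETS = 144   # from the table: datasets reviewed
ENGLISH_ONLY = 113     # from the table: English-only datasets


def share(count: int, total: int) -> float:
    """Percentage of `count` in `total`, rounded to one decimal place."""
    return round(100 * count / total, 1)


english_share = share(ENGLISH_ONLY, TOTAL_DATASETS)

# Derived by subtraction; not stated explicitly in the summary.
not_english_only = TOTAL_DATASETS - ENGLISH_ONLY
not_english_only_share = share(not_english_only, TOTAL_DATASETS)

if __name__ == "__main__":
    print(english_share)            # 78.5
    print(not_english_only)         # 31
    print(not_english_only_share)   # 21.5
```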
What To Try In 7 Days
Browse SafetyPrompts.com and pick 5 open datasets matching your product personas (user, adversary, vulnerable).
Add at least one multilingual and one naturalistic user-prompt dataset to your safety checks.
Run your model on a small common set (TruthfulQA, SimpleSafetyTests, DoNotAnswer, XSTest) and publish the results.
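The last step above could be prototyped along the following lines. This is a minimal sketch under several assumptions: `stub_model` stands in for a real model call, the suites contain placeholder strings rather than prompts from the named datasets, and refusal is detected with a crude keyword match that a real evaluation would replace with each dataset's own scoring method.

```python
from typing import Callable, Dict, List

# Crude heuristic, assumed for illustration only.
REFUSAL_MARKERS = ("i can't", "i cannot", "i won't", "i'm sorry")


def is_refusal(response: str) -> bool:
    """Return True if the response contains any refusal marker."""
    text = response.lower()
    return any(marker in text for marker in REFUSAL_MARKERS)


def refusal_rates(
    model: Callable[[str], str],
    suites: Dict[str, List[str]],
) -> Dict[str, float]:
    """Compute the fraction of refused prompts for each non-empty suite."""
    rates = {}
    for name, prompts in suites.items():
        if not prompts:
            continue  # skip empty suites to avoid dividing by zero
        refused = sum(is_refusal(model(prompt)) for prompt in prompts)
        rates[name] = refused / len(prompts)
    return rates


def stub_model(prompt: str) -> str:
    """Hypothetical stand-in for a real model: refuses anything tagged 'unsafe'."""
    if "unsafe" in prompt:
        return "I'm sorry, I can't help with that."
    return "Sure, here is an answer."


# Placeholder suites; real runs would load prompts from the chosen datasets.
SUITES = {
    "suite-a (placeholder)": ["unsafe placeholder 1", "unsafe placeholder 2"],
    "suite-b (placeholder)": ["benign placeholder 1", "unsafe placeholder 3"],
}

results = refusal_rates(stub_model, SUITES)

if __name__ == "__main__":
    for suite, rate in results.items():
        print(f"{suite}: refusal rate {rate:.0%}")
```

Reporting a per-suite rate rather than a single pooled number keeps results comparable across teams that choose different dataset subsets. Note that a high refusal rate is not always the desired outcome: for suites made of benign prompts, refusals indicate over-caution.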
Risks & Boundaries
Limitations
The review covers only open text datasets published before Dec 17, 2024, and excludes multimodal and code-specific datasets.
The paper catalogues dataset metadata but does not provide a unified quality score for each dataset.
When Not To Use
Do not use this review as a substitute for task-specific dataset quality checks or for proprietary dataset discovery.
Do not assume dataset suitability for training safety models without manual validation.
Failure Modes
Relying on templated or synthetic tests may overestimate safety under real user interactions.
Evaluating only on the few popular datasets can give a false sense of safety due to narrow coverage.

