Overview
The paper compiles public sources and developer feedback into 67 detailed cards. The results consistently show high capability disclosure but weak public safety disclosure, though private practices may be underreported.
Citations: 3
Evidence Strength: 0.80
Confidence: 0.86
Risk Signals: 9
Trust Signals
Findings with numeric evidence: 4/4
Findings with evidence refs: 4/4
Results with explicit delta: 0/6
Reproducibility
Status: Code + data available
Open source: Partial
At A Glance
Cost impact: 70%
Production readiness: 60%
Novelty: 50%
Why It Matters For Business
Agentic systems are moving into products; you need to verify safety practices before integrating them because public capability docs are common but safety disclosures are rare.
Who Should Care
Summary TLDR
The authors build and publish the AI Agent Index: a curated dataset of 67 deployed "agentic" AI systems (agents that plan and act). For each system they record technical components, intended uses, and safety practices from public sources and developer correspondence. Key takeaways: most developers publish documentation (47/67, 70.1%) and many release code (33/67, 49.3%), but few disclose formal safety policies (13/67, 19.4%) or report external safety audits (6/67, 9.0%). The index and raw data are available online; the paper is a snapshot as of Dec 31, 2024.
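The percentages quoted above follow directly from the reported counts out of 67 systems. A minimal sketch of that arithmetic is below; only the counts come from the summary, and the dictionary keys and the `share` helper are illustrative names, not part of the index.

```python
# Counts reported in the summary, each out of 67 indexed systems.
TOTAL_SYSTEMS = 67

# Key names are illustrative labels, not field names from the index.
disclosure_counts = {
    "public_documentation": 47,
    "released_code": 33,
    "formal_safety_policy": 13,
    "external_safety_audit": 6,
}


def share(count: int, total: int = TOTAL_SYSTEMS) -> float:
    """Return count/total as a percentage rounded to one decimal place."""
    return round(100 * count / total, 1)


shares = {name: share(n) for name, n in disclosure_counts.items()}

for name, pct in shares.items():
    print(f"{name}: {disclosure_counts[name]}/{TOTAL_SYSTEMS} = {pct}%")
```

Rounded to one decimal place, the external-audit share comes out at 9.0%, roughly a tenth of the share of systems with public documentation.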
Problem Statement
There is no structured, public framework documenting deployed agentic AI systems' technical design, uses, and safety practices. That gap makes it hard for users, auditors, and policymakers to compare systems, assess risks, or design governance.
Main Contribution
A structured template (33 fields) for recording technical, safety, and policy-relevant features of deployed agentic systems.
A public index of 67 deployed agentic systems (snapshot as of Dec 31, 2024) summarizing components, domains, openness, and safety practices.
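To make the idea of a structured card concrete, here is a heavily reduced sketch of what one record might look like. The summary does not list the 33 template fields, so every field name below is a hypothetical stand-in grouped loosely by the categories the paper mentions (components, uses, openness, safety practices).

```python
from dataclasses import dataclass, field


@dataclass
class AgentCard:
    """Hypothetical, reduced stand-in for one index entry.

    The real template has 33 fields; these names are assumptions
    chosen for illustration only.
    """

    name: str
    components: list[str] = field(default_factory=list)
    domains: list[str] = field(default_factory=list)
    has_public_docs: bool = False
    code_released: bool = False
    safety_policy_published: bool = False
    external_audit_reported: bool = False

    def missing_safety_disclosures(self) -> list[str]:
        """List the safety disclosures this card does not report."""
        gaps = []
        if not self.safety_policy_published:
            gaps.append("safety_policy")
        if not self.external_audit_reported:
            gaps.append("external_audit")
        return gaps


# A made-up entry: well documented, but with no public safety disclosures.
card = AgentCard(
    name="example-agent",
    components=["planner", "tool-use"],
    domains=["software engineering"],
    has_public_docs=True,
    code_released=True,
)
gaps = card.missing_safety_disclosures()
print(gaps)
```

Keeping openness and safety fields as separate booleans mirrors the paper's central contrast: a system can be well documented and open source while still disclosing nothing about its safety practices.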
Key Findings
The index catalogs 67 deployed agentic AI systems.
Most developers publish documentation and many release code.
Results
| Metric | Value | Baseline | Delta | Split / Dataset | Evidence | Evidence Ref |
|---|---|---|---|---|---|---|
| indexed_systems | 67 systems | — | — | — | Total count of indexed agentic systems | Sec 3 |
| public_documentation | 70.1% | — | — | 67 agents | 47 of 67 agents publish documentation | Figure 1, Sec 5 |
What To Try In 7 Days
Browse the index (aiagentindex.mit.edu) and spot agents similar to your use case.
If evaluating an external agent, request its safety policy and audit reports before production.
Run a short red-team or jailbreak test focused on your critical workflows and data flows.
Agent Features
Memory
Planning
Tool Use
Frameworks
Is Agentic: Yes
Architectures
Collaboration
Risks & Boundaries
Limitations
Definition of 'agent' is loose and contested; inclusion choices can be subjective.
Snapshot limited to systems available or announced by Dec 31, 2024; the field moves fast.
When Not To Use
When you need exhaustive or up-to-date coverage of every deployed agentic system.
When assessing internal-only agents or non-English systems not included in the index.
Failure Modes
Selective disclosure by developers can give a false sense of safety.
Index snapshot can become outdated quickly as new agents and audits appear.

