Overview
The study uses a large, real-world corpus and mixed methods. Findings about prevalence and maintenance are well supported. Generalization is limited to public repos and three agent tools.
Citations: 0
Evidence Strength: 0.80
Confidence: 0.85
Risk Signals: 9
Trust Signals
Findings with numeric evidence: 8/8
Findings with evidence refs: 8/8
Results with explicit delta: 3/8
Reproducibility
Status: Code + data available
Open source: Partial
At A Glance
Cost impact: 60%
Production readiness: 60%
Novelty: 45%
Why It Matters For Business
Agent context files control what AI coding agents do in your codebase. If they lack security or performance rules, agents will likely produce code that works but is vulnerable or inefficient. Treat these files like configuration and governance documents so agents follow team standards.
Summary TLDR
The authors analyze 2,303 agent context files (e.g., CLAUDE.md, AGENTS.md) from 1,925 repos and find that these files are long, hard to read, actively maintained, and biased toward functional instructions (build, testing, implementation). Non-functional concerns like security and performance are rare. Automated labeling of these files is feasible (micro-averaged F1 of 0.79): it works for concrete topics but struggles with abstract guidance.
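To make the micro F1 figure concrete, here is a minimal sketch of how a micro-averaged F1 is computed for multi-label file annotations. The label names come from the taxonomy mentioned in this summary, but the toy gold and predicted labels are invented for illustration and are not the study's data; the function name `micro_f1` is likewise hypothetical.

```python
from typing import List, Set


def micro_f1(gold: List[Set[str]], pred: List[Set[str]]) -> float:
    """Micro-averaged F1: pool true/false positives and false negatives
    over every (file, label) pair, then compute one F1 from the totals."""
    tp = fp = fn = 0
    for gold_labels, pred_labels in zip(gold, pred):
        tp += len(gold_labels & pred_labels)   # labels correctly predicted
        fp += len(pred_labels - gold_labels)   # predicted but not in gold
        fn += len(gold_labels - pred_labels)   # in gold but missed
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy annotations for three files (illustrative only).
gold = [{"Build & Run", "Testing"}, {"Architecture"}, {"Security", "Testing"}]
pred = [{"Build & Run", "Testing"}, {"Architecture", "Testing"}, {"Testing"}]

score = micro_f1(gold, pred)
print(f"micro F1 = {score:.2f}")
```

Because the counts are pooled before averaging, frequent labels dominate the score, so a rarely used label such as Security can be missed often without moving the number much.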
Problem Statement
AI coding agents rely on persistent, project-level instruction files to act correctly. We lack evidence about what those files contain, how they evolve, and whether we can automatically monitor them. Without that evidence, teams cannot tell when agents are well informed about how to run code but poorly constrained on safety, performance, or quality.
Main Contribution
A large empirical corpus: 2,303 agent context files from 1,925 open-source repositories across Claude Code, OpenAI Codex, and GitHub Copilot.
A 16‑label taxonomy of agent instructions (e.g., Build & Run, Testing, Architecture, Security) and prevalence counts.
Key Findings
Collected 2,303 agent context files across 1,925 repositories.
Files are long and differ by tool: Copilot and Claude files (median 535 and 485 words) are longer than Codex files (median 335.5 words).
Results
| Metric | Value | Baseline | Delta | Split / Dataset | Evidence | Evidence Ref |
|---|---|---|---|---|---|---|
| Corpus size | 2,303 agent context files from 1,925 repos | — | — | — | Section 3; Table 1 | Table 1 |
| Median words per file | Copilot 535, Claude 485, Codex 335.5 | — | Copilot & Claude > Codex | By agent type | Figure 3a; Section 4.1.3 | Figure 3a |
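The per-tool medians in the table can be reproduced on your own files with a few lines of code. The sketch below assumes a "word" is a whitespace-separated token, which may not match the paper's definition, and uses invented sample texts; `median_words_by_tool` is a hypothetical helper name.

```python
from collections import defaultdict
from statistics import median
from typing import Dict, Iterable, List, Tuple


def median_words_by_tool(files: Iterable[Tuple[str, str]]) -> Dict[str, float]:
    """Group (tool, text) pairs by tool and return the median word count.

    Assumption: words are whitespace-separated tokens.
    """
    counts: Dict[str, List[int]] = defaultdict(list)
    for tool, text in files:
        counts[tool].append(len(text.split()))
    return {tool: median(values) for tool, values in counts.items()}


# Invented sample texts, not taken from the study's corpus.
sample = [
    ("Claude", "Run make test before committing."),                      # 5 words
    ("Claude", "Use pnpm."),                                             # 2 words
    ("Codex", "Build with cargo build --release and run cargo test."),   # 9 words
]

result = median_words_by_tool(sample)
print(result)
```

With an even number of files the median is the mean of the two middle counts, which is how a half-word value such as 335.5 can arise.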
What To Try In 7 Days
Scan your repo for agent context files (CLAUDE.md, AGENTS.md, copilot-instructions.md).
Add a short 'Non-functional requirements' section that lists mandatory security and performance rules.
Include context-file checks in PR templates: 'Did you update the agent manifest if the build or API changed?' and require CODEOWNERS approval for manifest edits.
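The first two steps above can be sketched as a small audit script. It looks for the three file names listed here and flags files that never mention security or performance. The keyword list, the function names, and the simple substring check are assumptions made for illustration; a real check would likely need a richer notion of what counts as a non-functional rule.

```python
from pathlib import Path
from typing import Dict, List

# File names taken from the checklist above.
CONTEXT_FILE_NAMES = {"CLAUDE.md", "AGENTS.md", "copilot-instructions.md"}

# Assumption: a crude keyword match stands in for "has an NFR rule".
NFR_KEYWORDS = ("security", "performance")


def find_context_files(repo_root: str) -> List[Path]:
    """Recursively collect agent context files under repo_root."""
    root = Path(repo_root)
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and p.name in CONTEXT_FILE_NAMES
    )


def missing_nfr_topics(path: Path) -> List[str]:
    """Return the NFR keywords that never appear in the file (case-insensitive)."""
    text = path.read_text(encoding="utf-8", errors="ignore").lower()
    return [kw for kw in NFR_KEYWORDS if kw not in text]


def audit(repo_root: str) -> Dict[str, List[str]]:
    """Map each context file path to the NFR topics it does not mention."""
    return {str(p): missing_nfr_topics(p) for p in find_context_files(repo_root)}


if __name__ == "__main__":
    for file_path, missing in audit(".").items():
        status = "ok" if not missing else "missing: " + ", ".join(missing)
        print(f"{file_path}: {status}")
```

Run from the repository root, this prints one line per context file. Note that `rglob` also walks directories such as `.git` and `node_modules`, so on large repos you may want to skip those.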
Agent Features
Is Agentic: Yes
Risks & Boundaries
Limitations
Manual labels record only presence, not the depth of a topic (binary labeling may overstate importance).
Dataset limited to public repos and three agent tools; private corpora may differ.
When Not To Use
Do not generalize prevalence numbers to private or enterprise-only repositories without further sampling.
Avoid using the taxonomy as a strict checklist for highly domain-specific projects without tailoring.
Failure Modes
Agents produce insecure or inefficient code if manifests omit non-functional requirements (NFRs) like security and performance.
Manifests may become append-only and contradictory if not versioned or reviewed.

