Overview
Production Readiness
0.4
Novelty Score
0.55
Cost Impact Score
0.65
Citation Count
0
Why It Matters For Business
Connecting AI agents to blockchains can automate treasury, trading and governance but also creates irreversible financial and governance risk; invest in intent typing, policy gates, hardened custody, and MEV-aware execution before granting any agent signing power.
Summary TLDR
This paper surveys how LLM-driven autonomous agents can safely interact with public blockchains. The authors screened 3,270 records, synthesized 317 relevant works, coded 85 systems in depth, and compared 20+ platforms across custody, policy, observability, and execution. They propose a five-part integration taxonomy (from read-only analytics to multi-agent workflows), a threat model focused on prompt injection, MEV, and key compromise, and a practical roadmap: standardize a Transaction Intent Schema (TIS) and a Policy Decision Record (PDR), plus reproducible benchmarks and evaluation checklists.
Problem Statement
Connecting probabilistic, tool-using AI agents to immutable blockchains creates unique risks: irreversible financial loss, bearer-key authorization, adversarial transaction ordering (MEV), and cross-chain complexity. We need standardized, auditable interfaces and enforcement layers so agents can plan and act without exposing users and protocols to unacceptable economic or governance risk.
Main Contribution
A five-part taxonomy of agent→chain integration patterns from read-only analytics to multi-agent workflows.
A threat model tailored to agent-driven transaction pipelines, mapping attack classes across observe→verify stages.
A comparative capability matrix analyzing 20+ representative systems across 13 security and integration dimensions.
A roadmap proposing two interface standards: Transaction Intent Schema (TIS) and Policy Decision Record (PDR).
A proposed reproducibility package, with a concrete evaluation checklist and a benchmark suite for safety and economic robustness.
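To make the two proposed interfaces more concrete, here is a minimal sketch of what a TIS-style intent and a PDR-style decision record might look like. The summary does not give the actual schemas, so every field name, the `decide` helper, and the `policy_version` string are hypothetical placeholders, not the authors' definitions.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Set, Tuple


@dataclass(frozen=True)
class TransactionIntent:
    """Hypothetical TIS-style record: states what the agent wants, not raw calldata."""
    chain_id: int
    action: str        # e.g. "transfer" or "swap" (illustrative values)
    asset: str         # token symbol or contract address
    amount: Decimal
    recipient: str
    max_fee_wei: int

    def __post_init__(self) -> None:
        # Reject obviously malformed intents at construction time.
        if self.amount <= 0:
            raise ValueError("amount must be positive")
        if self.max_fee_wei < 0:
            raise ValueError("max_fee_wei must be non-negative")


@dataclass(frozen=True)
class PolicyDecisionRecord:
    """Hypothetical PDR-style record: the auditable outcome of one policy check."""
    intent: TransactionIntent
    allowed: bool
    reasons: Tuple[str, ...]
    policy_version: str


def decide(intent: TransactionIntent, allowed_recipients: Set[str],
           spend_cap: Decimal, policy_version: str = "v0-example") -> PolicyDecisionRecord:
    """Evaluate two illustrative rules and record every reason for denial."""
    reasons = []
    if intent.recipient not in allowed_recipients:
        reasons.append("recipient not on allowlist")
    if intent.amount > spend_cap:
        reasons.append("amount exceeds spend cap")
    return PolicyDecisionRecord(intent, not reasons, tuple(reasons), policy_version)
```

Keeping both records immutable (`frozen=True`) is a design choice for this sketch: a decision that can be edited after the fact is of little use as an audit trail.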
Key Findings
Systematic literature review found 317 relevant works from 3,270 records.
85 systems were coded in depth and 20+ representative systems were analyzed across 13 dimensions.
Account abstraction (ERC-4337) materially lowers the blast radius of agent delegation by enabling on-chain validation logic.
Maximal Extractable Value (MEV) is a persistent, structural execution risk for agents that broadcast intent to public mempools.
Defense-in-depth converges to four core controls: typed intents, policy gating, preflight simulation, and hardened signing (MPC/TEE).
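The four controls in the last finding can be read as an ordered, fail-closed pipeline: nothing reaches the signer unless every earlier stage passes. The sketch below illustrates that ordering only. The stage functions, the 100-unit cap, and the stub signer are assumptions made for illustration; a real deployment would replace them with a schema validator, a policy engine, a forked-state simulator, and an MPC or TEE signer.

```python
from typing import Callable, Dict, List, Tuple

# A check takes an intent and returns (passed, human-readable detail).
Check = Callable[[Dict], Tuple[bool, str]]


def run_pipeline(intent: Dict, stages: List[Tuple[str, Check]],
                 sign: Callable[[Dict], str]) -> Dict:
    """Run stages in order, stop at the first failure, sign only if all pass."""
    for name, check in stages:
        try:
            ok, detail = check(intent)
        except Exception as exc:  # fail closed: an erroring check counts as a denial
            ok, detail = False, f"error: {exc}"
        if not ok:
            return {"signed": False, "failed_stage": name,
                    "detail": detail, "signature": None}
    return {"signed": True, "failed_stage": None,
            "detail": "all checks passed", "signature": sign(intent)}


def typed_intent_check(intent: Dict) -> Tuple[bool, str]:
    """Placeholder for schema validation of a typed intent."""
    required = {"action", "asset", "amount", "recipient"}
    missing = required - set(intent)
    return (not missing, f"missing fields: {sorted(missing)}" if missing else "ok")


def policy_gate(intent: Dict) -> Tuple[bool, str]:
    """Placeholder policy: a hypothetical cap of 100 units per transaction."""
    within = intent["amount"] <= 100
    return (within, "within cap" if within else "over cap")


def preflight_simulation(intent: Dict) -> Tuple[bool, str]:
    """Stub: a real check would execute the transaction against forked state."""
    return (True, "simulated ok (stub)")


def stub_signer(intent: Dict) -> str:
    """Stand-in for a hardened signer; returns a fixed marker, not a signature."""
    return "stub-signature"
```

A possible invocation passes the stages as `[("typed_intent", typed_intent_check), ("policy", policy_gate), ("simulation", preflight_simulation)]`; the returned `failed_stage` then names the first control that rejected the intent.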
Results
Records screened
3,270
Studies included in qualitative synthesis
317
Systems deep-coded for matrix
85
Representative systems compared
20+
Who Should Care
What To Try In 7 Days
Run a small pilot: agent generates TIS-style intents but humans sign every transaction.
Add preflight simulation to any production transaction pipeline (test each transaction against forked chain state before signing).
Audit wallet flows: move from raw keys to session keys or smart-account modules with strict spend limits.
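For the wallet audit item, the idea of a session key with strict spend limits can be sketched as a small off-chain guard. This is only an illustration under assumptions: the class name, the two caps, and the absence of a daily reset are all hypothetical, and in a smart-account design the equivalent limits would normally be enforced by on-chain validation logic rather than by agent-side code.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass
class SessionKeyLimits:
    """Hypothetical spend guard for a delegated session key."""
    per_tx_cap: Decimal
    daily_cap: Decimal
    spent_today: Decimal = Decimal("0")

    def authorize(self, amount: Decimal) -> bool:
        """Approve and record a spend only if both caps would still hold."""
        if amount <= 0 or amount > self.per_tx_cap:
            return False
        if self.spent_today + amount > self.daily_cap:
            return False
        # Record the spend only after every check has passed.
        self.spent_today += amount
        return True
```

Resetting `spent_today` at a day boundary is deliberately left out to keep the sketch short.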
Agent Features
Memory
- short-term context windows
- retrieval-augmented state from indexers (cached world model)
Planning
- outcome-first intent planning
- simulation-driven plan scoring
- MEV-aware route selection
Tool Use
- function calling
- RPC/indexer tooling
- oracle-augmented data fetch
Frameworks
- MCP (Model Context Protocol)
- UTCP-like tool-call standards
- ERC-4337 smart accounts
Is Agentic
true
Architectures
- chain-of-thought / ReAct-style planner
- tool-augmented LLMs (function calling)
- multi-agent proposer/verifier/executor
Collaboration
- quorum-based approval
- role-specialized agent workflows
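Quorum-based approval can be illustrated with a simple k-of-n check. The sketch below assumes each approval is a pair of approver ID and the hash of the intent that approver reviewed; the function name, the role names in the usage note, and the hash strings are hypothetical. It counts distinct authorized approvers only, so duplicate votes and votes cast for a different intent do not help reach the threshold.

```python
from typing import Iterable, Set, Tuple


def quorum_reached(approvals: Iterable[Tuple[str, str]], intent_hash: str,
                   authorized: Set[str], threshold: int) -> bool:
    """Return True if at least `threshold` distinct authorized approvers
    approved exactly this intent hash."""
    if threshold < 1 or threshold > len(authorized):
        raise ValueError("threshold must be between 1 and the number of approvers")
    voters = {
        who for who, approved_hash in approvals
        if who in authorized and approved_hash == intent_hash
    }
    return len(voters) >= threshold
```

For example, with authorized roles `{"proposer", "verifier", "risk"}` and a threshold of 2, two votes from `"verifier"` alone would not be enough. This check does not address collusion among approvers, which remains a separate risk.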
Optimization Features
Token Efficiency
- intent summarization to reduce action complexity
Infra Optimization
- bundlers and paymasters for ERC-4337
- MPC/TEE signing to remove keys from agent host
Model Optimization
- not applicable (survey)
System Optimization
- preflight simulation and canonical intent hashing
- policy checking off-chain before signing
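Canonical intent hashing can be sketched as follows: serialize the intent deterministically, hash it, and compare the hash of what was previewed with the hash of what is about to be signed. The choice of sorted-key JSON and SHA-256 here is an assumption for illustration only; EVM-oriented systems often use Keccak-256 and typed-data hashing such as EIP-712 instead, and the helper names are hypothetical.

```python
import hashlib
import json


def canonical_intent_hash(intent: dict) -> str:
    """Hash a deterministic serialization so key order cannot change the digest."""
    canonical = json.dumps(intent, sort_keys=True,
                           separators=(",", ":"), ensure_ascii=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def unchanged_since_preview(previewed: dict, to_sign: dict) -> bool:
    """Detect any mutation between the previewed intent and the signing payload."""
    return canonical_intent_hash(previewed) == canonical_intent_hash(to_sign)
```

Amounts are best carried as strings in such a record, since floating-point values may not serialize identically across languages and libraries.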
Inference Optimization
- MEV-aware routing and solver selection
- private orderflow to reduce public mempool exposure
Reproducibility
Code Available
Data Available
Open Source Status
- partial
Risks & Boundaries
Limitations
- Focuses on public, permissionless blockchains (mainly Ethereum/EVM); private chains get limited coverage.
- Emphasizes LLM-based agents; other agent paradigms (pure RL) receive limited treatment.
- The ecosystem is evolving rapidly; specific system statuses and vendor implementations may change after early 2026.
When Not To Use
- For fully autonomous high-frequency trading without hardened MPC or TEE custody.
- When target execution venues do not support account abstraction or intent-based settlement.
- If you lack operational monitoring, kill switches, and tested recovery procedures.
Failure Modes
- Prompt injection steering the planner to malicious intents.
- Tool/data-plane spoofing (compromised RPC or oracle) causing unsafe decisions.
- Middleware mutation between preview and signing (WYSIWYS failure).
- Key or credential exfiltration from agent runtime.
- MEV extraction via public mempool reordering.
- Multi-agent collusion or quorum capture in distributed approval designs.
Core Entities
Models
- GPT-3
- ReAct
- Chain-of-Thought
- Toolformer
- Gorilla
- NexusRaven
Metrics
- PIR (prompt injection resistance)
- MEV leakage
- Accuracy
Datasets
- AndroidWorld
- AndroidLab
- AgentClinic
- MLGym
Benchmarks
- AndroidWorld
- AndroidLab
- AgentClinic
- MLGym
Context Entities
Models
- LLM agents (tool-using architectures)
Metrics
- production_readiness
- evidence_strength
Benchmarks
- proposed safety/economic evaluation suite (this paper)

