Overview
Production Readiness
0.2
Novelty Score
0.6
Cost Impact Score
0.4
Citation Count
0
Why It Matters For Business
This design lets teams scale software work with many specialized AI agents while enforcing rules (privacy, security, legal). That reduces manual coordination and speeds routine work, but requires rule design and human oversight.
Summary TLDR
This vision and system-design paper proposes two things. BDIM-SE is a cognitive agent architecture for autonomous software-engineering (SE) agents that adds persistent memory and LLM links to classic BDI agents. NorMAS-SE is a normative multi-agent team framework that uses commitments and deontic norms to ensure compliant coordination between human developers and SE agents. The authors have implemented core BDIM-SE components and lay out a practical roadmap for single-agent tests, multi-agent experiments, and human studies.
Problem Statement
Current LLM-based multi-agent SE frameworks use scripted workflows and role prompts. They lack genuine agency, persistent memory, norm-awareness, and scalable coordination that supports human collaboration and legal/ethical compliance.
Main Contribution
BDIM-SE: A cognitive architecture that extends BDI (Belief-Desire-Intention) agents with persistent, LLM-empowered memory (episodic/semantic/procedural) and query hooks to LLMs.
NorMAS-SE: A normative multi-agent systems design that represents software-development norms as deontic modalities (obligations, prohibitions, permissions) and encodes coordination as commitments.
A normative reasoner for self-regulation that filters goals, plans, and actions for norm compliance and triggers runtime remedies on violations.
A commitment-to-protocol approach that auto-generates interaction protocols and remedies, reducing the need for exhaustively scripted coordination plans.
A roadmap and prototype status: core BDIM-SE components implemented; planned evaluations at single-agent, multi-agent, and human-interaction levels.
Key Findings
BDIM-SE extends BDI agents by adding persistent memory and direct LLM queries to support longitudinal reasoning.
Norms are modeled as deontic modalities and encoded as JSON norms and commitments to make agent interactions accountable.
The normative reasoner enforces compliance at three levels: goals (desires), plan selection (intentions), and runtime actions.
Commitments can auto-generate interaction protocols and remedies for violations, reducing manual coordination scripting.
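The three-level filtering described above can be illustrated with a minimal sketch. The JSON field names (`modality`, `level`, `target`) and the function names are assumptions made for illustration; the paper's actual norm schema is not given in this summary.

```python
import json

# Hypothetical norm schema: field names are illustrative assumptions,
# not the paper's actual JSON format.
NORMS_JSON = """
[
  {"id": "n1", "modality": "prohibition", "level": "action", "target": "push_to_main"},
  {"id": "n2", "modality": "obligation",  "level": "plan",   "target": "run_tests"},
  {"id": "n3", "modality": "prohibition", "level": "goal",   "target": "export_user_data"}
]
"""


def load_norms(text):
    """Parse the JSON norm list into Python dictionaries."""
    return json.loads(text)


def goal_allowed(goal, norms):
    """Goal level: reject a desire that a goal-level prohibition forbids."""
    return not any(
        n["level"] == "goal" and n["modality"] == "prohibition" and n["target"] == goal
        for n in norms
    )


def plan_allowed(plan_steps, norms):
    """Plan level: require every obligated step and exclude prohibited actions."""
    for n in norms:
        if n["level"] == "plan" and n["modality"] == "obligation" and n["target"] not in plan_steps:
            return False
        if n["level"] == "action" and n["modality"] == "prohibition" and n["target"] in plan_steps:
            return False
    return True


def check_action(action, norms):
    """Runtime level: return the id of the violated norm, or None if compliant."""
    for n in norms:
        if n["level"] == "action" and n["modality"] == "prohibition" and n["target"] == action:
            return n["id"]
    return None
```

In this sketch a violation id returned by `check_action` is the point at which a runtime remedy would be triggered; permissions and norm conflicts are deliberately left out.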
Who Should Care
What To Try In 7 Days
Map two recurring development tasks (e.g., PR review, privacy scan) and express their rules as simple JSON norms.
Prototype one BDIM-SE agent by connecting an LLM to a small belief store (task list + repo URL) and implementing a plan that runs tests and reports results.
Run a tabletop test where the agent generates a commit and a separate 'testing' agent enacts the commitment, then observe the failure and remedy behavior.
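For the tabletop test, a commitment can be modeled as a small state machine. The sketch below assumes a conventional debtor/creditor/antecedent/consequent formulation with a logical-time deadline; the class, its field names, and the remedy string are hypothetical and not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Commitment:
    debtor: str        # agent that owes the outcome
    creditor: str      # agent that expects the outcome
    antecedent: str    # event that makes the commitment binding
    consequent: str    # event the debtor must bring about
    deadline: int      # hypothetical logical-time deadline
    remedy: str        # hypothetical action to trigger on violation
    state: str = "conditional"

    def observe(self, event, time):
        """Advance the state for an event observed at a logical time."""
        if self.state == "conditional" and event == self.antecedent:
            self.state = "detached"
        elif (self.state == "detached" and event == self.consequent
              and time <= self.deadline):
            self.state = "satisfied"
        return self.state

    def tick(self, time):
        """Mark a detached commitment violated after its deadline.

        Returns the remedy to run, or None if nothing is due.
        """
        if self.state == "detached" and time > self.deadline:
            self.state = "violated"
            return self.remedy
        return None
```

A tabletop run would create one commitment per handoff (for example, a tester owing a test report once a pull request opens), feed it events by hand, and check that `tick` returns the remedy once the deadline passes.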
Agent Features
Memory
- Short-term working memory
- Long-term episodic memory
- Semantic and procedural memory
- LLM-backed belief queries
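A minimal sketch of how these memory features might fit together is shown here, covering only working memory, episodic memory, and an LLM-backed belief lookup. The `AgentMemory` class and the `ask_llm` hook are hypothetical; `ask_llm` stands in for a real LLM call and is injected so the sketch runs without any external service.

```python
from typing import Callable, Dict, List, Optional


class AgentMemory:
    def __init__(self, ask_llm: Callable[[str], Optional[str]]):
        self.working: Dict[str, str] = {}   # short-term beliefs for the current task
        self.episodic: List[Dict] = []      # long-term log of past events
        self.ask_llm = ask_llm              # hypothetical hook standing in for an LLM call

    def record(self, event: Dict) -> None:
        """Append an event to episodic memory."""
        self.episodic.append(event)

    def recall(self, **match) -> List[Dict]:
        """Return past events whose fields equal all the given key/value pairs."""
        return [e for e in self.episodic
                if all(e.get(k) == v for k, v in match.items())]

    def belief(self, key: str) -> Optional[str]:
        """Look up a belief; on a miss, query the LLM hook once and cache the answer."""
        if key not in self.working:
            answer = self.ask_llm(key)
            if answer is not None:
                self.working[key] = answer
        return self.working.get(key)
```

Caching the answer in working memory keeps repeated lookups from calling the model again; a real agent would also need to decide when a cached belief has gone stale.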
Planning
- Plan library with invocation and context conditions
- Plan selection with fallback alternatives
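The planning features above follow the usual BDI pattern, sketched here under assumptions: the `Plan` fields, the `achieve` function, and the example plans are all hypothetical names chosen for illustration, not the paper's plan library.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Plan:
    name: str
    trigger: str                      # invocation condition: the goal that invokes the plan
    context: Callable[[Dict], bool]   # context condition evaluated over current beliefs
    body: Callable[[Dict], bool]      # plan body; returns True on success


def achieve(goal: str, beliefs: Dict, library: List[Plan]) -> Optional[str]:
    """Try applicable plans in library order, falling back to the next on failure."""
    for plan in library:
        if plan.trigger == goal and plan.context(beliefs):
            if plan.body(beliefs):
                return plan.name
    return None


# Illustrative library: prefer CI when it is available, otherwise run tests locally.
LIBRARY = [
    Plan("run_ci", "verify_change",
         context=lambda b: b.get("ci_available", False),
         body=lambda b: b.get("ci_healthy", False)),
    Plan("run_local_tests", "verify_change",
         context=lambda b: True,
         body=lambda b: True),
]
```

With beliefs `{"ci_available": True, "ci_healthy": False}`, the CI plan is applicable but fails, so `achieve("verify_change", ...)` falls back to the local-test plan.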
Tool Use
- LLM integration for code and feasibility queries
- Tool connectors for GitHub, JIRA, CI/CD
Frameworks
- NorMAS-SE
Is Agentic
true
Architectures
- BDIM-SE
Collaboration
- Commitment-based coordination
- Normative reasoner for compliance
- Auto-generated interaction protocols
Reproducibility
Open Source Status
- unknown
Risks & Boundaries
Limitations
- No empirical results yet; evaluation is future work.
- Relies on current LLMs, which can hallucinate and behave brittly.
- Norm authoring and jurisdiction-aware rule design are complex and human-intensive.
- Integration with existing toolchains and heterogeneous data sources is nontrivial.
When Not To Use
- In high-assurance or safety-critical systems without strong human oversight and formal verification.
- When you cannot invest in norm design, governance, and monitoring.
- For very small teams where the coordination overhead outweighs automation benefits.
Failure Modes
- LLM hallucination producing norm-violating outputs.
- Conflicting or incomplete norms causing incorrect plan filtering.
- Runtime monitor misdetection or delayed remedies leading to leaks or bad commits.
- Over-reliance by humans on agent decisions without adequate review.
Core Entities
Models
- foundational LLM (unnamed)
Context Entities
Models
- LLM-based code generation and testing models (surveyed references)

