A configurable multi-agent framework that adds persona trees and a skill-backed cognitive architecture to make LLM agents act more human in social scenes

August 24, 2023 · 7 min

Overview

Production Readiness

0.4

Novelty Score

0.6

Cost Impact Score

0.5

Citation Count

5

Authors

Shi Jinxin, Zhao Jiabao, Wang Yilei, Wu Xingjiao, Li Jiawen, He Liang

Links

Abstract / PDF

Why It Matters For Business

CGMI lets product teams simulate social workflows (training, UX, game NPCs, edtech) with more realistic agent behavior by adding persona trees and memory-driven planning.

Summary TLDR

CGMI is a framework that combines a tree-structured persona model, a cognitive architecture (declarative, procedural, and working memory plus a skill library), and auxiliary general and supervisory agents to simulate realistic multi-agent social scenes. The paper demonstrates CGMI on a virtual classroom built with GPT-3.5-turbo-16k. In the simulated lessons, teacher talk averaged 61.23% of discourse, and persona-driven selection of which student answers produced more realistic student responses than random selection. The system is presented as a research platform, not a production product, and the authors plan to open-source it.

Problem Statement

LLM-based agents can act in roles but tend to forget role settings, produce shallow content, and lack structured memory and coordination. The paper asks how to give agents stable personalities, deeper domain-aware reasoning, and realistic multi-agent communication for social simulations.

Main Contribution

A tree-structured persona model for assigning, testing, and restoring agent traits to keep roles stable across long dialogues.

A cognitive architecture (working, declarative, procedural memories) plus a configurable skill library that uses Chain-of-Thought and Chain-of-Action to form and retrieve domain knowledge.

CGMI: a configurable multi-agent framework that composes role agents, general agents, and supervisory agents and demonstrates classroom simulations using GPT-3.5-turbo-16k.
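The assign, test, and restore cycle for a persona tree can be illustrated with a small sketch. Everything here is an assumption made for illustration: the `PersonaNode` layout, the trait names, the `ask_agent` and `remind_agent` callables (stand-ins for LLM calls), and the substring match used to decide whether a trait was remembered. It is not the paper's implementation.

```python
import random
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class PersonaNode:
    """One node of a hypothetical persona tree: a trait name plus an optional value."""
    name: str
    value: str = ""
    children: List["PersonaNode"] = field(default_factory=list)

    def leaves(self, prefix: str = "") -> List[Tuple[str, str]]:
        """Return (path, value) pairs for every leaf trait below this node."""
        path = f"{prefix}/{self.name}" if prefix else self.name
        if not self.children:
            return [(path, self.value)]
        out: List[Tuple[str, str]] = []
        for child in self.children:
            out.extend(child.leaves(path))
        return out


def probe_and_restore(
    root: PersonaNode,
    ask_agent: Callable[[str], str],
    remind_agent: Callable[[str, str], None],
    sample_size: int,
    rng: random.Random,
) -> List[str]:
    """Randomly probe some leaf traits and re-send any the agent no longer reports."""
    leaves = root.leaves()
    probed = rng.sample(leaves, min(sample_size, len(leaves)))
    restored: List[str] = []
    for path, expected in probed:
        answer = ask_agent(path)
        # Assumption: a case-insensitive substring match counts as "remembered".
        if expected.lower() not in answer.lower():
            remind_agent(path, expected)
            restored.append(path)
    return restored


# Usage with a stubbed agent that has "forgotten" one trait.
persona = PersonaNode("Emily", children=[
    PersonaNode("personality", children=[
        PersonaNode("openness", "high"),
        PersonaNode("extraversion", "low"),
    ]),
    PersonaNode("learning_style", "visual"),
])
agent_state: Dict[str, str] = dict(persona.leaves())
agent_state["Emily/personality/extraversion"] = "unknown"  # simulated forgetting

restored = probe_and_restore(
    persona,
    ask_agent=lambda path: agent_state.get(path, ""),
    remind_agent=lambda path, value: agent_state.__setitem__(path, value),
    sample_size=3,
    rng=random.Random(0),
)
print(restored)
```

In a real system the two callables would wrap prompts to the role agent, and the comparison would likely need something more robust than a substring check.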

Key Findings

Teacher utterances dominated classroom discourse in simulated lessons.

Numbers: Teacher behavior averaged 61.23% of discourse (across C1–C3).

Persona-aware selection yields notably different answer patterns than random choice.

Numbers: The willingness agent recommended Emily 9 times, versus 3 times under random selection (the other students' counts shifted accordingly).

The cognitive architecture produced measurable within- and between-lesson adaptation by agents.

Numbers: Across multiple lessons, teacher reflection influenced later plans (qualitative traces shown in Figure 4).

Persona trees improve expressiveness and stability of agent utterances.

Numbers: Without persona, student expressions became uniform; with persona they showed distinct styles (qualitative, Figure 5).

Results

Teacher discourse proportion (overall)

Value: 61.23%

Student discourse proportion (facilitated by teacher prompts)

Value: 23.53%

Student-initiated interactions

Value: 15.23%

Answer recommendations per student (persona-based)

Value: John: 4, Emily: 9, Ryan: 6, Samantha: 1, Ying Zheng: 8

Baseline (random selection): John: 7, Emily: 3, Ryan: 4, Samantha: 6, Ying Zheng: 8
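The recommendation counts above can be compared directly. Both conditions total 28 recommendations, so the per-student differences show how persona-based selection redistributes the same number of turns. The snippet below only restates the reported figures; the variable names are illustrative.

```python
# Counts as reported in the Results section.
persona_based = {"John": 4, "Emily": 9, "Ryan": 6, "Samantha": 1, "Ying Zheng": 8}
random_baseline = {"John": 7, "Emily": 3, "Ryan": 4, "Samantha": 6, "Ying Zheng": 8}

total_persona = sum(persona_based.values())
total_random = sum(random_baseline.values())

# Positive shift: persona-based selection picked this student more often than random.
shift = {name: persona_based[name] - random_baseline[name] for name in persona_based}

for name, delta in shift.items():
    share = persona_based[name] / total_persona
    print(
        f"{name:<11} persona={persona_based[name]:>2} "
        f"random={random_baseline[name]:>2} shift={delta:+d} "
        f"persona_share={share:.1%}"
    )
```

The largest shifts are Emily (+6) and Samantha (-5), while Ying Zheng's count is unchanged.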

Who Should Care

What To Try In 7 Days

Build a small CGMI demo: 1 teacher + 3 students + 1 supervisor using GPT-3.5 and a simple persona tree.

Compare random vs persona-based selection of which student answers, and measure engagement (counts of responses per student).

Add a tiny skill library (3 domain prompts) to let one agent reflect and plan across two short sessions.
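The random-versus-persona comparison in this list could start from a harness like the one below. It is a sketch under stated assumptions: the question labels and the `WILLINGNESS` scores are made-up placeholders, and in an actual demo each score would come from an LLM call conditioned on the student's persona and the teacher's question.

```python
import random
from collections import Counter
from typing import Dict, List

STUDENTS: List[str] = ["John", "Emily", "Ryan"]

# Hypothetical willingness scores in [0, 1], keyed by placeholder question labels.
WILLINGNESS: Dict[str, Dict[str, float]] = {
    "Q1": {"John": 0.2, "Emily": 0.9, "Ryan": 0.5},
    "Q2": {"John": 0.3, "Emily": 0.6, "Ryan": 0.8},
    "Q3": {"John": 0.4, "Emily": 0.7, "Ryan": 0.1},
}


def pick_random(question: str, rng: random.Random) -> str:
    """Baseline: ignore personas and pick any student uniformly."""
    return rng.choice(STUDENTS)


def pick_by_persona(question: str) -> str:
    """Persona-based: pick the student with the highest willingness score."""
    scores = WILLINGNESS[question]
    return max(STUDENTS, key=lambda name: scores[name])


rng = random.Random(42)  # fixed seed so the baseline run is repeatable
random_counts = Counter(pick_random(q, rng) for q in WILLINGNESS)
persona_counts = Counter(pick_by_persona(q) for q in WILLINGNESS)

print("random :", dict(random_counts))
print("persona:", dict(persona_counts))
```

Counting responses per student under each policy gives the same kind of table as the paper's answer-recommendation comparison, just at a much smaller scale.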

Agent Features

Memory

  • Working memory (short-term)
  • Declarative memory (facts)
  • Procedural memory (skills/actions)
  • Skill library (configurable domain knowledge)

Planning

  • Reflection module
  • Planning module
  • Chain-of-Thought (CoT) for declarative memory
  • Chain-of-Action (CoA) for procedural memory
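The memory and planning components listed above can be pictured as one small agent class. This is a toy sketch, not the paper's architecture: the class name, the skill names (`summarize`, `next_action`, `plan`), the prompt templates, and the working-memory size of 5 are all assumptions, and `echo_llm` is a stub that stands in for a real model call.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Callable, Deque, Dict, List

# Hypothetical skill library: skill name -> prompt template.
SKILLS: Dict[str, str] = {
    "summarize": "Think step by step and state one fact learned from:\n{context}",
    "next_action": "List the next action to take given:\n{context}",
    "plan": "Goal: {goal}\nKnown facts: {facts}\nPrior actions: {actions}\nWrite the plan.",
}


@dataclass
class CognitiveAgent:
    """Toy agent with working, declarative, and procedural memory plus a skill library."""
    name: str
    skill_library: Dict[str, str] = field(default_factory=dict)
    working: Deque[str] = field(default_factory=lambda: deque(maxlen=5))
    declarative: List[str] = field(default_factory=list)
    procedural: List[str] = field(default_factory=list)

    def observe(self, event: str) -> None:
        """Store a recent event; the oldest event is dropped once the deque is full."""
        self.working.append(event)

    def reflect(self, llm: Callable[[str], str]) -> None:
        """Distil working memory into long-term memory, then clear it."""
        context = "\n".join(self.working)
        # CoT-style prompt yields a fact for declarative memory.
        self.declarative.append(llm(self.skill_library["summarize"].format(context=context)))
        # CoA-style prompt yields an action step for procedural memory.
        self.procedural.append(llm(self.skill_library["next_action"].format(context=context)))
        self.working.clear()

    def plan(self, llm: Callable[[str], str], goal: str) -> str:
        """Build a planning prompt from stored facts and actions and send it to the model."""
        prompt = self.skill_library["plan"].format(
            goal=goal,
            facts="; ".join(self.declarative),
            actions="; ".join(self.procedural),
        )
        return llm(prompt)


def echo_llm(prompt: str) -> str:
    """Stub model that returns the prompt unchanged, so the data flow is visible."""
    return prompt


teacher = CognitiveAgent("teacher", skill_library=SKILLS)
teacher.observe("Lesson 1: students struggled with the second example.")
teacher.reflect(echo_llm)
plan = teacher.plan(echo_llm, goal="Lesson 2")
print(plan)
```

Because the stub echoes its input, the printed plan shows how the Lesson 1 observation travels through reflection into the Lesson 2 planning prompt, which is the between-lesson adaptation the paper describes qualitatively.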

Tool Use

  • Supervisory agents
  • Assistant agents
  • Consistency checker agent

Frameworks

  • CGMI

Is Agentic

true

Architectures

  • Tree-structured persona model
  • Cognitive architecture (working/declarative/procedural memory)

Collaboration

  • Multi-agent coordination with supervisory arbitration
  • Persona-based answer selection
  • Role + general agent binding

Reproducibility

Open Source Status

  • partial

Risks & Boundaries

Limitations

  • Evaluation is limited to a single classroom domain and GPT-3.5-turbo-16k; generality is untested.
  • Quantitative claims rely on expert annotation and selected examples rather than large-scale user studies.
  • Persona restoration uses random testing; edge cases and adversarial forgetting are not fully explored.

When Not To Use

  • High-stakes or safety-critical decision systems (medical, legal).
  • Production agents requiring rigorous provable correctness or compliance.
  • Scenarios demanding large, diverse human-subject validation before deployment.

Failure Modes

  • LLM may still produce superficial or off-role outputs despite persona trees (persona forgetting).
  • Supervisory or assistant agents could mis-route actions and break the intended interaction flow.
  • Skill library retrieval might return irrelevant guidance if prompts or skills are poorly authored.

Core Entities

Models

  • gpt-3.5-turbo-16k

Metrics

  • FIAS (Flanders Interaction Analysis System) interaction category proportions
  • Answer recommendation counts (per student)
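Interaction category proportions of this kind can be computed from a sequence of coded utterances. The sketch below assumes the standard Flanders coding (categories 1 to 7 are teacher talk, 8 is student response, 9 is student initiation, 10 is silence or confusion); the paper's exact coding scheme may differ, and `sample_codes` is made-up data rather than anything from the paper.

```python
from collections import Counter
from typing import Dict, Iterable


def fias_group(code: int) -> str:
    """Map a standard Flanders category code (1-10) to a coarse group."""
    if 1 <= code <= 7:
        return "teacher_talk"
    if code == 8:
        return "student_response"
    if code == 9:
        return "student_initiation"
    if code == 10:
        return "silence"
    raise ValueError(f"not a FIAS category code: {code}")


def group_proportions(codes: Iterable[int]) -> Dict[str, float]:
    """Return the share of coded utterances that falls into each group."""
    counts = Counter(fias_group(code) for code in codes)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no codes given")
    return {group: n / total for group, n in counts.items()}


# Made-up coding of ten utterances, for illustration only.
sample_codes = [5, 5, 4, 8, 5, 4, 8, 9, 5, 2]
proportions = group_proportions(sample_codes)
for group, share in proportions.items():
    print(f"{group:<20} {share:.1%}")
```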