Overview
Production Readiness
0.3
Novelty Score
0.6
Cost Impact Score
0.5
Citation Count
11
Why It Matters For Business
Simulated multi-agent LLM planning can surface local needs early, cutting the time and cost of preliminary consultation before human stakeholders are engaged. It also lets planners test many “what-if” land-use options quickly while keeping service coverage competitive.
Summary TLDR
This paper builds a multi-agent system of LLMs that simulates a planner plus thousands of resident agents to produce land-use plans. Residents are role-played from census distributions and discuss via a fishbowl mechanism (inner/outer circles). The planner (GPT-4 vision) proposes an initial map, residents discuss in rounds, discussion is summarized, and the planner revises the plan. On two Beijing regions the method raises need-aware metrics (Satisfaction and Inclusion) substantially vs baselines while keeping service/access metrics competitive. The setup omits costs, ownership, and many real-world constraints.
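The propose, discuss, summarize, and revise flow described above can be sketched as a short control loop. This is a minimal illustration, not the paper's implementation: every function name below is hypothetical, and the `stub_*` functions stand in for LLM calls so the example runs without any API access.

```python
from typing import Callable, List


def run_planning_loop(
    propose: Callable[[], str],
    discuss: Callable[[str, str], List[str]],
    summarize: Callable[[List[str]], str],
    revise: Callable[[str, str], str],
    rounds: int = 3,
) -> str:
    """Propose a plan, gather resident discussion over several rounds, then revise once."""
    plan = propose()
    summary = ""
    for _ in range(rounds):
        # Residents see the current plan and the running summary, not the full transcript.
        utterances = discuss(plan, summary)
        # Fold the previous summary and the new utterances into one fresh summary.
        summary = summarize(([summary] if summary else []) + utterances)
    return revise(plan, summary)


# Stubs standing in for LLM calls (assumptions for illustration only).
def stub_propose() -> str:
    return "plan-v0"


def stub_discuss(plan: str, summary: str) -> List[str]:
    return [f"need: park ({plan})", "need: clinic"]


def stub_summarize(items: List[str]) -> str:
    # Keep each distinct need once, dropping the parenthetical plan reference.
    needs = sorted(
        {part.split("(")[0].strip() for item in items for part in item.split(";")}
    )
    return "; ".join(needs)


def stub_revise(plan: str, summary: str) -> str:
    return f"plan-v1 addressing [{summary}]"


final_plan = run_planning_loop(stub_propose, stub_discuss, stub_summarize, stub_revise)
print(final_plan)
```

Passing the four steps in as callables keeps the loop independent of any particular model, so the same skeleton could be run with real planner and resident prompts.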
Problem Statement
Traditional participatory planning is slow, costly, and hard to scale to thousands of residents. How can we simulate many stakeholders cheaply and efficiently so planners can create land-use plans that actually reflect diverse residents' needs?
Main Contribution
A multi-agent LLM framework that role-plays a planner and many residents to simulate participatory urban planning.
A fishbowl discussion mechanism (inner/outer circles + summaries) to scale resident discussion and limit context length.
Deployment on two real Beijing regions with quantitative metrics showing higher resident satisfaction and inclusion than baselines and human experts.
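One way to picture the inner/outer circle split is a rotating window over the resident list. The sketch below is an assumption about how seats might rotate; the summary above does not say how the paper selects inner-circle members, and `fishbowl_round` is a hypothetical helper name.

```python
from typing import List, Sequence, Tuple


def fishbowl_round(
    residents: Sequence[str], inner_size: int, round_idx: int
) -> Tuple[List[str], List[str]]:
    """Split residents into an inner (speaking) and outer (listening) circle.

    The inner circle is a window of `inner_size` seats that shifts each round,
    so different residents get to speak over time (a rotation rule assumed here).
    """
    n = len(residents)
    if n == 0:
        return [], []
    size = min(inner_size, n)
    start = (round_idx * inner_size) % n
    inner_idx = [(start + k) % n for k in range(size)]
    chosen = set(inner_idx)
    inner = [residents[i] for i in inner_idx]
    outer = [r for i, r in enumerate(residents) if i not in chosen]
    return inner, outer


residents = [f"r{i}" for i in range(6)]
for t in range(3):
    inner, outer = fishbowl_round(residents, inner_size=2, round_idx=t)
    print(t, inner, outer)
```

Only the inner circle's statements need to be sent to the model in a given round, which is what keeps the per-round context small even with many residents.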
Key Findings
Simulated participatory planning raised resident Satisfaction to 0.787 on the Huilongguan (HLG) region.
Inclusion for marginalized groups improved to 0.773 on HLG.
Service accessibility remained competitive while optimizing for residents.
Fishbowl rounds materially affect outcomes: 3 rounds improved need-aware metrics.
Ablations show role-play and discussion matter.
Results
[Chart: Satisfaction, Inclusion, Service, and Ecology scores]
Who Should Care
What To Try In 7 Days
Run a small pilot: create 100 resident agents from local demographics and role-play one community, using GPT-4 vision as the planner and GPT-3.5 for the residents.
Use 3 fishbowl rounds and compare Satisfaction/Inclusion vs a planner-only baseline.
Produce summaries after each round to keep context short and reuse them in prompts.
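For the first step of the pilot, resident profiles could be drawn from demographic marginals and turned into role-play prompts. The sketch below is illustrative only: the `DEMOGRAPHICS` figures are made-up placeholders (not census data), attributes are sampled independently for simplicity, and the prompt wording is an assumption rather than the paper's prompt.

```python
import random
from typing import Dict, List

# Hypothetical marginal distributions; replace with local census figures.
DEMOGRAPHICS: Dict[str, Dict[str, float]] = {
    "age_group": {"18-34": 0.35, "35-59": 0.45, "60+": 0.20},
    "household": {"single": 0.30, "couple": 0.30, "family with children": 0.40},
}


def sample_residents(
    n: int, demographics: Dict[str, Dict[str, float]], seed: int = 0
) -> List[Dict[str, str]]:
    """Draw n resident profiles, sampling each attribute independently."""
    rng = random.Random(seed)  # seeded so the pilot is repeatable
    residents = []
    for _ in range(n):
        profile = {
            attr: rng.choices(list(dist), weights=list(dist.values()), k=1)[0]
            for attr, dist in demographics.items()
        }
        residents.append(profile)
    return residents


def role_play_prompt(profile: Dict[str, str]) -> str:
    """Build a role-play prompt for one resident (wording is an assumption)."""
    traits = ", ".join(f"{key}: {value}" for key, value in profile.items())
    return (
        f"You are a resident of this community ({traits}). "
        "State which land uses you need nearby and why."
    )


pilot_residents = sample_residents(100, DEMOGRAPHICS, seed=42)
print(role_play_prompt(pilot_residents[0]))
```

Independent sampling ignores correlations between attributes (for example, age and household type); if joint census tables are available, sampling from those would give more realistic profiles.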
Agent Features
Memory
- Short-term discussion summary
- Round-by-round history aggregation
Planning
- Planning with LLMs
- Task Decomposition
- Community-level revision
Tool Use
- Multimodal map input
- Prompt-based role-play
Frameworks
- Inner/outer fishbowl
- Role-play prompts
Is Agentic
true
Architectures
- GPT-4-vision
- GPT-3.5
Collaboration
- Fishbowl discussion
- Sequential community revision
Optimization Features
Token Efficiency
- Use summaries to limit token growth
Inference Optimization
- Reduce context by summarizing rounds
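To see why summaries limit token growth, the sketch below compares the context size when the full transcript is carried forward against carrying only the latest round summary. The data is synthetic, `build_context` is a hypothetical helper, and the whitespace word count is only a crude stand-in for a real tokenizer.

```python
from typing import List


def rough_token_count(text: str) -> int:
    """Crude whitespace-based proxy; a real tokenizer would give different numbers."""
    return len(text.split())


def build_context(
    plan: str,
    transcript: List[List[str]],
    summaries: List[str],
    use_summaries: bool,
) -> str:
    """Assemble the prompt context from either the latest summary or the full transcript."""
    if use_summaries:
        # Carry forward only the most recent round summary.
        memory = summaries[-1] if summaries else ""
    else:
        # Carry forward every utterance from every round.
        memory = "\n".join(u for round_utts in transcript for u in round_utts)
    return f"Plan:\n{plan}\n\nDiscussion so far:\n{memory}"


# Synthetic example: 3 rounds with 50 utterances each.
utterance = "I would like a small park within walking distance"
transcript = [[utterance] * 50 for _ in range(3)]
summaries = ["Residents mainly ask for parks, clinics, and schools nearby."] * 3

full = rough_token_count(build_context("plan-v0", transcript, summaries, use_summaries=False))
short = rough_token_count(build_context("plan-v0", transcript, summaries, use_summaries=True))
print(f"full transcript: {full} words, latest summary: {short} words")
```

With the full transcript, context grows linearly with rounds and speakers; with summaries, it stays roughly constant, at the cost of whatever detail the summarizer drops.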
Reproducibility
Open Source Status
- no
Risks & Boundaries
Limitations
- Does not model ownership, development cost, or regulatory constraints.
- Land-use types and requirements are simplified to eight categories.
- Relies heavily on manually designed prompts and map descriptions.
When Not To Use
- For legally binding or final planning decisions that require ownership/cost modeling.
- When decisions must be transparent and auditable without depending on prompt engineering.
- If the site requires domain data not encoded in prompts (e.g., soil, utilities).
Failure Modes
- LLM-generated residents may hallucinate unrealistic needs or locations.
- Prompt bias can skew which resident concerns are surfaced.
- Discussions may stagnate or degrade outcomes beyond about 3 rounds.
Core Entities
Models
- gpt-4-vision-preview
- gpt-3.5-turbo-1106
Metrics
- Service
- Ecology
- Satisfaction
- Inclusion
Datasets
- Huilongguan (HLG)
- Dahongmen (DHM)

