iAgents: agents mirror human social networks to trade private info and solve group tasks under information asymmetry

Overview

Decision Snapshot: Needs Validation

InfoNav gives clear multi-turn guidance that boosts small-network reasoning; mixed memory and recursive communication matter most when retrieval over many messages is required.

Citations: 1

Evidence Strength: 0.70

Confidence: 0.80

Risk Signals: 10

Trust Signals

Findings with numeric evidence: 4/4

Findings with evidence refs: 4/4

Results with explicit delta: 2/8

Reproducibility

Status: Code + data available

Open source: Yes

At A Glance

Cost impact: 60%

Production readiness: 50%

Novelty: 70%

Authors

Wei Liu, Chenxi Wang, Yifei Wang, Zihao Xie, Rennai Qiu, Yufan Dang, Zhuoyun Du, Weize Chen, Cheng Yang, Chen Qian

Why It Matters For Business

iAgents lets one agent per user coordinate across private data without centralizing it, enabling multi-user scheduling, concierge and workflow automation—but expect higher token costs and privacy trade-offs.

Who Should Care

Product Manager, ML Engineer, CTO, Engineering Lead, Founder

Summary TLDR

This paper defines information asymmetry for multi-agent systems (each agent sees only its own user's private data) and proposes iAgents: a system of one agent per user that proactively requests and exchanges only the human information a task needs. It rests on two core ideas. InfoNav is an explicit plan that tracks which facts (rationales) are still unknown and guides multi-turn questions. Mixed Memory combines exact-span 'Clear Memory' with embedding-based 'Fuzzy Memory' for retrieval. The authors release InformativeBench (5 datasets) and show that GPT-4 achieves ~50% on average, while smaller LLMs perform worse. Ablations show that InfoNav is critical for small-network reasoning, and that mixed memory plus recursive communication matter most when retrieval over many messages is required.

Problem Statement

Multi-agent systems assume shared context but real human collaborations are asymmetric: each agent only sees its user's private information. That breaks coordination. The challenge is to enable agents to acquire and exchange needed facts without centralizing private data, while scaling retrieval over many messages and keeping multi-turn communication focused.

Main Contribution

Formulate the problem of information asymmetry in multi-agent collaboration and shift focus from a single shared virtual entity to agents that mirror users.

Propose iAgents, which integrates InfoNav (plan-driven communication) and Mixed Memory (Clear + Fuzzy) to retrieve and exchange human information without centralizing all data.

Key Findings

GPT-4-backed iAgents solved many tasks but performance varies strongly by dataset difficulty.

Numbers: GPT-4: Schedule Easy 56.67%, Schedule Medium 51.00%, Schedule Hard 22.80%, NP 64.00%, FriendsTV 57.94%

Practical Use: Use strong LLM backends (GPT-4-class) for best results; expect major drops on hard algorithmic or large-network tasks.

Evidence Ref: Table 1
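As a quick check on the "~50% on average" figure, the five GPT-4 scores listed above can be averaged directly. This is a minimal sketch: it assumes an unweighted mean across datasets, and the variable names are illustrative. The scores may use different metrics per dataset, so the mean is only a rough summary.

```python
# GPT-4 scores per dataset, copied from the Numbers line above (Table 1).
gpt4_scores = {
    "Schedule Easy": 56.67,
    "Schedule Medium": 51.00,
    "Schedule Hard": 22.80,
    "NP": 64.00,
    "FriendsTV": 57.94,
}

# Unweighted mean across datasets (an assumption about how "average" is meant).
mean_score = sum(gpt4_scores.values()) / len(gpt4_scores)

# Gap between the best and worst dataset, in percentage points.
spread = max(gpt4_scores.values()) - min(gpt4_scores.values())

print(f"mean = {mean_score:.2f}%, spread = {spread:.2f} points")
# prints: mean = 50.48%, spread = 41.20 points
```

The roughly 41-point gap between NP and Schedule Hard is what the finding means by performance varying strongly with dataset difficulty.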

iAgents scaled to a large simulated social network and retrieved many messages during runs.

Numbers: FriendsTV: 140 nodes, 588 edges; agents searched ~70,000 messages and completed tasks within ~3 minutes

Practical Use: Mixed-memory + ANN retrieval can handle hundreds of users and tens of thousands of messages, but expect nontrivial latency and token cost.

Evidence Ref: Abstract, Figure 6, Section 1

Results

Metric	Value	Baseline	Delta	Split / Dataset	Evidence	Evidence Ref
Schedule Easy (precision)	56.67%	—	—	Schedule Easy	GPT-4 row in Table 1	Table 1
Schedule Medium (precision)	51.00%	—	—	Schedule Medium	GPT-4 row in Table 1	Table 1

What To Try In 7 Days

Prototype InfoNav prompts on a small 4–6 person calendar use case to test multi-turn info exchange.

Build a mixed memory of exact spans + session summaries and compare retrieval quality.

Run InformativeBench (NP or ScheduleEasy) with your preferred LLM to measure baseline accuracy and token cost.

Agent Features

Memory

Mixed Memory: Clear Memory (exact spans)

Mixed Memory: Fuzzy Memory (session summaries + embeddings)
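To make the Clear/Fuzzy split concrete, here is a minimal sketch of a two-part memory. It is not the authors' implementation. The class `MixedMemory`, the helper `embed`, and all method names are hypothetical. The "embedding" is a toy bag-of-words vector standing in for a real embedding model, and the search is brute force rather than ANN.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy bag-of-words vector; a stand-in for a real embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class MixedMemory:
    def __init__(self):
        self.messages = []   # Clear Memory: verbatim messages
        self.summaries = []  # Fuzzy Memory: (summary text, vector) pairs

    def add_message(self, text):
        self.messages.append(text)

    def add_summary(self, summary):
        self.summaries.append((summary, embed(summary)))

    def clear_search(self, keyword):
        """Exact-span lookup: messages that contain the keyword verbatim."""
        return [m for m in self.messages if keyword.lower() in m.lower()]

    def fuzzy_search(self, query, k=1):
        """Similarity lookup over session summaries (brute force, not ANN)."""
        q = embed(query)
        ranked = sorted(self.summaries, key=lambda s: cosine(q, s[1]), reverse=True)
        return [text for text, _ in ranked[:k]]


memory = MixedMemory()
memory.add_message("Alice: I am free on Tuesday after 3pm.")
memory.add_message("Bob: The budget review moved to Friday.")
memory.add_summary("Alice and Bob discussed meeting availability for next week.")
memory.add_summary("Bob shared updates about the budget review schedule.")

print(memory.clear_search("Tuesday"))
print(memory.fuzzy_search("when is the budget review"))
```

In this sketch the exact lookup returns the original wording, which suits facts that must be quoted precisely, while the similarity lookup finds the relevant session even when the query shares no exact span with the stored messages.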

Planning

InfoNav (explicit plan tracking)

Consensus reasoning (plan-based merge)
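The plan-tracking idea behind InfoNav can be sketched as a small data structure that records which rationales are still unknown and turns the next unknown into a question. This is an illustrative sketch under stated assumptions, not the paper's prompt format: `InfoNavPlan`, its fields, and the question template are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class InfoNavPlan:
    goal: str
    # Maps rationale name -> value; the value stays None while unknown.
    rationales: dict = field(default_factory=dict)

    def unknown(self):
        """Names of rationales that have not been filled in yet."""
        return [name for name, value in self.rationales.items() if value is None]

    def next_question(self):
        """Question for the first unknown rationale, or None if all are known."""
        pending = self.unknown()
        return f"Can you tell me: {pending[0]}?" if pending else None

    def fill(self, name, value):
        """Record an answer; reject facts that are not part of the plan."""
        if name not in self.rationales:
            raise KeyError(f"'{name}' is not part of the plan")
        self.rationales[name] = value

    def is_complete(self):
        return not self.unknown()


plan = InfoNavPlan(
    goal="Find a slot when Alice and Bob are both free",
    rationales={"Alice's free slots": None, "Bob's free slots": None},
)
plan.fill("Alice's free slots", ["Tue 15:00", "Wed 10:00"])

print(plan.next_question())  # prints: Can you tell me: Bob's free slots?
print(plan.is_complete())    # prints: False
```

Keeping the unknowns explicit is what keeps a multi-turn dialog focused: each turn targets one missing fact instead of an open-ended exchange.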

Tool Use

embedding-based retrieval (ANN)

LLM summarizer for session-level summaries

Frameworks

iAgents (InfoNav + Mixed Memory)

InformativeBench

Is Agentic

Yes

Architectures

one-agent-per-user mirroring

role-play prompt-created agents

Collaboration

recursive inter-agent communication

multi-turn autonomous dialogs (max 10 turns in experiments)
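A rough sketch of how recursive communication under a turn cap might look: an agent that lacks a fact asks its contacts, who may in turn ask theirs, until the budget runs out. The `Agent` class and its `answer` method are hypothetical, and treating the 10-turn cap as one budget shared across the whole recursion is an assumption of this sketch, not something the summary confirms.

```python
MAX_TURNS = 10  # the experiments cap dialogs at 10 turns


class Agent:
    def __init__(self, name, private_facts):
        self.name = name
        self.private_facts = private_facts  # visible to this agent only
        self.contacts = []                  # agents this user can reach

    def answer(self, key, budget, visited=None):
        """Return (value, turns_used); value is None if nothing was found."""
        visited = visited if visited is not None else set()
        visited.add(self.name)
        if key in self.private_facts:
            return self.private_facts[key], 0
        used = 0
        for contact in self.contacts:
            if contact.name in visited or used >= budget:
                continue
            # One turn to ask this contact, plus whatever it spends recursively.
            value, spent = contact.answer(key, budget - used - 1, visited)
            used += 1 + spent
            if value is not None:
                return value, used
        return None, used


alice = Agent("Alice", {})
bob = Agent("Bob", {})
carol = Agent("Carol", {"venue": "Room 4"})
alice.contacts = [bob]
bob.contacts = [carol]

print(alice.answer("venue", MAX_TURNS))  # prints: ('Room 4', 2)
print(alice.answer("venue", 1))          # prints: (None, 1)
```

Only the requested fact travels back along the chain; each agent's other private facts stay local, which is the point of not centralizing the data.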

Optimization Features

Token Efficiency

The paper reports ~30k input tokens per task and flags this as a cost concern.
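For budgeting, the per-task figure can be scaled to a planned evaluation run. This is a back-of-the-envelope sketch: `estimate_cost` is a hypothetical helper, and the price per million tokens is a placeholder to replace with your provider's actual rate, not a figure from the paper.

```python
def estimate_cost(num_tasks, tokens_per_task=30_000, usd_per_million_tokens=10.0):
    """Return (total input tokens, estimated USD) for a batch of tasks.

    tokens_per_task defaults to the ~30k reported above.
    usd_per_million_tokens is a placeholder price, not a real quote.
    """
    total_tokens = num_tasks * tokens_per_task
    return total_tokens, total_tokens / 1_000_000 * usd_per_million_tokens


tokens, usd = estimate_cost(num_tasks=100)
print(f"{tokens:,} input tokens, about ${usd:.2f} at the placeholder price")
```

Output tokens and retries are not counted here, so treat the result as a lower bound.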

Reproducibility

Code Available: Yes

Data Available: Yes

Open Source Status: Yes

License: Unknown

Code URLs

https://github.com/thinkwee/iAgents

Data URLs

https://github.com/thinkwee/iAgents

Risks & Boundaries

Limitations

Privacy vs. utility trade-off: stronger privacy restrictions noticeably reduce accuracy.

High token and latency cost: experiments report ~30,000 tokens per task.

When Not To Use

When absolute local-only privacy is required (L3 level) and edge models cannot match performance.

For tiny tasks where centralizing data is simpler and cheaper.

Failure Modes

Agents hallucinate 'fake solved' rationales and pass incorrect facts into consensus.

Pretrained model priors override user-provided evidence, leading to prior-distraction errors.

Core Entities

Models

gpt-4-0125-preview

gpt-3.5-turbo-16k

gemini-1.0-pro-latest

claude-sonnet 2

Metrics

Precision

F1

IoU
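For reference, the three metrics can be computed over sets of predicted and gold answers using their standard definitions. This is a sketch only: the function names are illustrative, and the exact per-dataset definitions used in InformativeBench may differ.

```python
def precision(predicted, gold):
    """Share of predicted items that are correct."""
    predicted, gold = set(predicted), set(gold)
    return len(predicted & gold) / len(predicted) if predicted else 0.0


def recall(predicted, gold):
    """Share of gold items that were predicted."""
    predicted, gold = set(predicted), set(gold)
    return len(predicted & gold) / len(gold) if gold else 0.0


def f1(predicted, gold):
    """Harmonic mean of precision and recall."""
    p, r = precision(predicted, gold), recall(predicted, gold)
    return 2 * p * r / (p + r) if (p + r) else 0.0


def iou(predicted, gold):
    """Intersection over union of the two sets."""
    predicted, gold = set(predicted), set(gold)
    union = predicted | gold
    return len(predicted & gold) / len(union) if union else 0.0


predicted = {"Tue 15:00", "Wed 10:00", "Fri 09:00"}
gold = {"Tue 15:00", "Wed 10:00", "Thu 14:00", "Thu 16:00"}

print(f"precision={precision(predicted, gold):.3f}")  # 2 of 3 predictions correct
print(f"f1={f1(predicted, gold):.3f}")
print(f"iou={iou(predicted, gold):.3f}")              # 2 shared out of 5 distinct
```

IoU penalizes both missed and spurious items, so it is the strictest of the three when the predicted and gold sets differ in size.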

Datasets

InformativeBench

Needle in the Persona (NP)

FriendsTV

Schedule (Easy/Medium/Hard)

Benchmarks

InformativeBench

Context Entities

Models

role-play prompting agents (prior MAS baselines referenced)
