RAPS: intent-driven, reputation-aware publish–subscribe for adaptive multi-agent LLM coordination

February 8, 2026 | 7 min

Overview

Decision Snapshot: Needs Validation

RAPS is a practical, training-free coordination layer that improves multi-agent performance and robustness in experiments, but it depends on backbone LLM quality and requires reputation warm-up in new deployments.

Citations: 0

Evidence Strength: 0.80

Confidence: 0.88

Risk Signals: 11

Trust Signals

Findings with numeric evidence: 3/3

Findings with evidence refs: 3/3

Results with explicit delta: 0/6

Reproducibility

Status: Code + data available

Open source: Partial

At A Glance

Cost impact: 60%

Production readiness: 60%

Novelty: 70%

Authors

Rui Li, Zeyu Zhang, Xiaohe Bo, Quanyu Dai, Chaozhuo Li, Feng Wen, Xu Chen

Why It Matters For Business

RAPS offers inference-time, training-free coordination that improves task accuracy, scales with more agents, and reduces single-point failures — useful for teams building modular LLM services and open agent marketplaces.

Summary TLDR

This paper reframes multi-agent LLM coordination as dynamic ad-hoc networking and introduces RAPS: a distributed publish–subscribe substrate plus two overlays — Reactive Subscription (intent refinement) and Bayesian Reputation (decentralized trust). RAPS routes messages by semantic intent, lets agents refine intents at runtime, and uses Bayesian watchdogs to detect and isolate bad actors. On five standard benchmarks (reasoning, math, code) RAPS improved average performance and scaled better with more agents while staying robust to injected adversaries.

Problem Statement

Current automatic coordination either (a) fixes communication topologies and cannot adapt at inference, or (b) uses a central meta-controller that becomes a single point of failure and scalability bottleneck. The challenge is to design a coordination protocol that is adaptive (message-level), scalable (dynamic membership), and robust (resists misbehavior) without heavy training.

Main Contribution

Perspective: cast LLM agent coordination as dynamic ad-hoc networking to unify adaptivity, scalability, and robustness goals.

Communication substrate: a distributed publish–subscribe protocol that routes by semantic match between publications and subscriptions.
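
The routing idea can be illustrated with a minimal sketch. Everything here is an assumption made for illustration rather than the paper's implementation: the `Broker` class, the `threshold` parameter, and the toy bag-of-words `embed` function, which stands in for a real embedding model such as the one named under Tool Use.

```python
import math
from collections import Counter


def embed(text):
    """Toy bag-of-words vector (assumption: a stand-in for a real embedding model)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class Broker:
    """Hypothetical broker: delivers a publication to agents whose intent matches it."""

    def __init__(self, threshold=0.3):
        self.threshold = threshold  # illustrative cut-off, not a value from the paper
        self.subscriptions = {}     # agent name -> embedded intent

    def subscribe(self, agent, intent):
        self.subscriptions[agent] = embed(intent)

    def route(self, publication):
        pub = embed(publication)
        scored = [(cosine(pub, sub), agent) for agent, sub in self.subscriptions.items()]
        # Best match first; agents below the threshold are never invoked.
        return [agent for score, agent in sorted(scored, reverse=True) if score >= self.threshold]


broker = Broker()
broker.subscribe("math_agent", "solve arithmetic word problems")
broker.subscribe("code_agent", "write python code functions")
recipients = broker.route("please solve these arithmetic problems")
print(recipients)
```

The publisher never names a recipient: delivery depends only on how well the message content matches each standing subscription, which is what lets agents join or leave without rewiring a topology.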

Key Findings

RAPS achieves strong end-task gains across five benchmarks.

Numbers: 90.0% average accuracy/Pass@1 across MMLU, GSM8K, SVAMP, AQuA, HumanEval (Table 1).

Practical Use: Use RAPS to boost multi-agent performance on mixed reasoning and code tasks without additional model training.

Evidence Ref: Table 1

Reactive Subscription materially improves results.

Numbers: Removing RS lowers MMLU by 2.6 pts (88.2 → 85.6) and HumanEval by 2.2 pts (91.5 → 89.3) (Table 3).

Practical Use: Allow agents to refine their intent prompts at runtime to get 1–3 percentage point gains on reasoning and code tasks.

Evidence Ref: Table 3
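
A rough sketch of the control flow behind runtime intent refinement follows. The `Agent` class and the `keyword_rewriter` stub are hypothetical. In a real system the rewriter would be an LLM call that rewrites the agent's intent from its interaction history; the stub only appends long unseen words from the latest message so that the example runs without any model.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Agent:
    name: str
    intent: str                                # standing subscription, e.g. the system prompt
    rewrite: Callable[[str, List[str]], str]   # hypothetical rewriter; an LLM call in practice
    history: List[str] = field(default_factory=list)

    def receive(self, message: str) -> None:
        """Store the message, then refine the subscription before the next routing round."""
        self.history.append(message)
        self.intent = self.rewrite(self.intent, self.history)


def keyword_rewriter(intent: str, history: List[str]) -> str:
    """Stub rewriter (assumption): add unseen words longer than six characters."""
    known = set(intent.lower().split())
    new_terms = [w for w in history[-1].lower().split() if len(w) > 6 and w not in known]
    if not new_terms:
        return intent
    return intent + " " + " ".join(dict.fromkeys(new_terms))  # de-duplicate, keep order


agent = Agent("math_agent", "solve math problems", keyword_rewriter)
agent.receive("need probability estimates for dice")
print(agent.intent)
```

The point of the sketch is the ordering: the subscription changes after each received message, so later publications are matched against the refined intent rather than the original one.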

Results

Metric | Value | Baseline | Delta | Split / Dataset | Evidence | Evidence Ref
Accuracy | 88.2% | - | - | MMLU test subset | Table 1 reports RAPS MMLU = 88.2 | Table 1
Accuracy | 95.4% | - | - | GSM8K test | Table 1 reports RAPS GSM8K = 95.4 | Table 1

What To Try In 7 Days

Prototype a publish–subscribe wrapper around existing LLM agents and route by semantic similarity instead of static addresses.

Add a lightweight intent-refinement step: re-run a prompt rewriter on received messages before generating replies.

Implement a simple Bayesian score (beta counts) to downweight agents that produce flagged errors and observe robustness changes.
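
The beta-count suggestion above could be prototyped roughly as follows. This is a sketch under stated assumptions, not the paper's reputation overlay: the `BetaReputation` class, the uniform Beta(1, 1) prior, and the `weighted_vote` aggregation are all illustrative choices.

```python
from dataclasses import dataclass


@dataclass
class BetaReputation:
    alpha: float = 1.0  # prior pseudo-count of clean outcomes (assumed uniform prior)
    beta: float = 1.0   # prior pseudo-count of flagged outcomes

    def update(self, flagged: bool) -> None:
        """Add one observation: a flagged error raises beta, a clean answer raises alpha."""
        if flagged:
            self.beta += 1.0
        else:
            self.alpha += 1.0

    @property
    def score(self) -> float:
        """Posterior mean of the Beta distribution, used as the trust weight."""
        return self.alpha / (self.alpha + self.beta)


def weighted_vote(answers, reputations):
    """Pick the answer with the largest total reputation behind it."""
    totals = {}
    for agent, answer in answers.items():
        totals[answer] = totals.get(answer, 0.0) + reputations[agent].score
    return max(totals, key=totals.get)


reps = {"honest": BetaReputation(), "adv1": BetaReputation(), "adv2": BetaReputation()}
for _ in range(4):  # four audited rounds
    reps["honest"].update(flagged=False)
    reps["adv1"].update(flagged=True)
    reps["adv2"].update(flagged=True)

answers = {"honest": "42", "adv1": "7", "adv2": "7"}
winner = weighted_vote(answers, reps)
print(winner, round(reps["honest"].score, 3), round(reps["adv1"].score, 3))
```

In this toy run a plain majority vote would pick "7", because two of the three agents agree on it. After four audited rounds the two flagged agents carry too little combined weight to outvote the single trusted one.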

Agent Features

Memory
Local context buffer H_i (interaction history); System prompt S_i as standing subscription
Planning
Intent-driven routing (semantic matching); Reactive intent refinement at inference time
Tool Use
Embedding-based semantic broker (text-embedding-3-small); LLM-driven broker variant (GPT-4o-mini); LLM watchdog audits for first-hand evaluation
Frameworks
RAPS (this paper's framework)
Is Agentic

Yes

Architectures
Distributed publish–subscribe substrate; Reactive subscription overlay; Bayesian reputation overlay
Collaboration
Content-centric publish–subscribe interactions; Spontaneous marketplace-style collaboration among agents

Optimization Features

Token Efficiency
Accuracy; Broker filters to avoid unnecessary agent invocations
System Optimization
Decentralized routing to avoid central bottleneck; Reputation filtering to prevent communication with low-trust agents
Training Optimization
Training-free, inference-time intent refinement
Inference Optimization
Embedding-based broker for fast semantic matching; Selective dissemination (top-k subscribers) to reduce message routing
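
Selective dissemination combined with reputation filtering might look like the sketch below. The function name, the `min_reputation` cut-off, and the neutral default of 0.5 for agents with no recorded reputation are assumptions for illustration. The similarity scores are taken as given, for example from an embedding-based broker.

```python
import heapq


def select_subscribers(similarity, reputation, k=2, min_reputation=0.5):
    """Return up to k agents with the highest similarity among sufficiently trusted ones.

    similarity: agent -> semantic match score for the current publication
    reputation: agent -> trust score in [0, 1]; unknown agents default to 0.5 (assumption)
    """
    trusted = {
        agent: score
        for agent, score in similarity.items()
        if reputation.get(agent, 0.5) >= min_reputation
    }
    # nlargest returns the agents in descending order of similarity.
    return heapq.nlargest(k, trusted, key=trusted.get)


similarity = {"a": 0.9, "b": 0.8, "c": 0.4, "d": 0.7}
reputation = {"a": 0.9, "b": 0.2, "c": 0.8, "d": 0.6}
chosen = select_subscribers(similarity, reputation)
print(chosen)
```

Agent "b" is a strong semantic match but is excluded by its low reputation, so the message goes to "a" and "d" only. Filtering before ranking keeps low-trust agents from consuming one of the k delivery slots.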

Reproducibility

Code Available: Yes
Data Available: Yes
Open Source Status: Partial
License: Unknown

Risks & Boundaries

Limitations

Performance depends on the quality of underlying LLM backbones; RAPS does not fix weak agents.

Reputation has a cold-start problem: initial interactions may not provide enough evidence to isolate adversaries.

When Not To Use

You have a small, fixed agent chain where static orchestration is sufficient.

Your LLM backbone is weak and cannot provide reliable audits or intent rewriting.

Failure Modes

Cold-start reputation allows early adversary influence until posterior evidence accumulates.

Colluding adversaries may initially pass deviation tests and poison reputations.
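
The cold-start window can be made concrete with a short worked example. It assumes a Beta-Bernoulli reputation model with a uniform prior and an isolation threshold \tau; neither is confirmed as the paper's exact formulation, and the symbols \alpha_0, \beta_0, g, b and \tau are introduced here for illustration only.

```latex
% Posterior mean trust after g clean audits and b flagged audits,
% starting from a Beta(\alpha_0, \beta_0) prior:
\[
\mathbb{E}[\theta \mid g, b] = \frac{\alpha_0 + g}{\alpha_0 + \beta_0 + g + b}
\]
% Uniform prior (\alpha_0 = \beta_0 = 1) and an adversary flagged on every
% one of n audits (g = 0, b = n):
\[
\mathbb{E}[\theta \mid 0, n] = \frac{1}{n + 2} < \tau
\quad\Longleftrightarrow\quad
n > \frac{1}{\tau} - 2
\]
% Example: with \tau = 0.2 this requires n > 3, so the adversary is isolated
% only after four flagged audits, when trust reaches 1/6 \approx 0.167.
```

Under these assumptions an adversary keeps enough trust to influence the first three rounds, even when every one of its outputs is flagged. A lower threshold or a weaker prior changes the length of this window but does not remove it.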

Core Entities

Models

GPT-4o-mini; LLM-driven brokers (GPT-4o-mini variant)

Metrics

Accuracy, Pass@1

Datasets

MMLU, GSM8K, SVAMP, AQuA, HumanEval

Benchmarks

MMLU, GSM8K, SVAMP, AQuA, HumanEval