Use mined "shortcuts" from past multi-agent runs to cut tokens and speed up code generation

May 28, 2025 · 7 min

Overview

Decision Snapshot: Needs Validation

The idea is practical: reuse past successful agent transitions to cut redundant turns. Experiments on SRDD show token savings and quality gains, but only for tasks with similar historical analogs and on a single dataset.

Citations: 0

Evidence Strength: 0.60

Confidence: 0.62

Risk Signals: 9

Trust Signals

Findings with numeric evidence: 3/3

Findings with evidence refs: 3/3

Results with explicit delta: 5/5

Reproducibility

Status: Partial assets available

Open source: Unknown

At A Glance

Cost impact: 70%

Production readiness: 40%

Novelty: 60%

Authors

Rennai Qiu, Chen Qian, Ran Li, Yufan Dang, Weize Chen, Cheng Yang, Yingli Zhang, Ye Tian, Xuantang Xiong, Lei Han, Zhiyuan Liu, Maosong Sun

Why It Matters For Business

Co-Saving can cut token bills and developer compute costs by reusing prior multi-agent transitions, while keeping or improving code quality on similar tasks, so teams can scale automated software generation under a fixed budget.

Who Should Care

Summary TLDR

Co-Saving adds a small memory of past successful agent interactions (called "shortcuts") to multi-agent software-development systems. It ranks shortcuts by value vs cost (time and token usage), applies a dynamic emergency factor tied to remaining budget, and forces termination when interaction cost hits reference limits. On the SRDD software tasks, Co-Saving reports a large cut in token use and higher overall code quality versus prior multi-agent systems, while ablations show shortcut selection and the emergency factor materially affect success and budget completion.

Problem Statement

Multi-agent systems for software development produce good results but often waste tokens and time through redundant interactions. The paper aims to make multi-agent collaboration resource-aware so agents can reuse prior successful transitions to save tokens/time while keeping or improving code quality.

Main Contribution

Introduce "shortcuts": instruction fragments mined from historical multi-agent trajectories that connect non-adjacent solution states and can bypass redundant reasoning steps.

Design a value-vs-cost scoring and filtering pipeline (time and token costs are normalized and combined via a harmonic mean) plus an "emergency factor" that weights cost more heavily as the budget depletes.
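
A minimal sketch of how such a scoring pipeline might look, under stated assumptions. The paper's exact formulas are not reproduced here: the max-normalization, the form of the emergency factor (fraction of budget already spent), and the final weighted score are illustrative choices, and the names `Shortcut`, `harmonic_mean`, and `rank_shortcuts` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Shortcut:
    name: str
    value: float   # estimated benefit of taking the shortcut (assumed to lie in [0, 1])
    time_s: float  # historical wall-clock cost in seconds (assumed > 0)
    tokens: int    # historical token cost (assumed > 0)


def harmonic_mean(a, b):
    """Harmonic mean of two non-negative numbers; 0 when both are 0."""
    return 0.0 if a + b == 0 else 2 * a * b / (a + b)


def rank_shortcuts(shortcuts, remaining_budget, total_budget):
    """Order shortcuts best-first, weighting cost more as the budget runs out."""
    if not shortcuts:
        return []
    # Assumed emergency factor: 0 with a full budget, 1 with an empty one.
    emergency = 1.0 - remaining_budget / total_budget
    max_time = max(s.time_s for s in shortcuts)
    max_tokens = max(s.tokens for s in shortcuts)
    scored = []
    for s in shortcuts:
        # Normalize each cost by the largest candidate, then combine.
        cost = harmonic_mean(s.time_s / max_time, s.tokens / max_tokens)
        score = (1.0 - emergency) * s.value - emergency * cost
        scored.append((score, s))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [s for _, s in scored]
```

With a full budget this sketch ranks purely by value; as the remaining budget shrinks, cheaper shortcuts overtake more valuable but expensive ones.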

Key Findings

Co-Saving reduces token usage versus ChatDev.

Numbers: 50.85% average reduction in tokens (paper abstract).

Practical Use: If you run multi-agent code generation, storing and reusing shortcuts can roughly halve token bills for similar tasks on SRDD-style workloads.

Evidence Ref: Abstract

Co-Saving improves measured overall code quality versus ChatDev.

Numbers: Paper reports a 10.06% improvement in overall code quality (abstract).

Practical Use: Using shortcut-guided paths can yield measurable quality gains on evaluated software tasks; expect better completeness/executability trade-offs when budgets are respected.

Evidence Ref: Abstract

Results

| Metric | Value | Baseline | Delta | Split / Dataset | Evidence | Evidence Ref |
| --- | --- | --- | --- | --- | --- | --- |
| Token usage reduction vs ChatDev | 50.85% reduction | ChatDev | -50.85% | SRDD (experiments) | Abstract claim: average reduction of 50.85% in token usage | Abstract |
| Overall code quality improvement vs ChatDev | 10.06% improvement | ChatDev | +10.06% | SRDD (experiments) | Abstract claim: improves the overall code quality by 10.06% | Abstract |
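
To turn the reported token delta into a rough budget estimate, the arithmetic can be sketched as below. The baseline token count and the price per 1,000 tokens are placeholders to be replaced with your own figures, and the function name is hypothetical. The sketch also assumes the 50.85% average carries over to your workload, which the source only supports for similar tasks on SRDD.

```python
def projected_token_cost(baseline_tokens, usd_per_1k_tokens, reduction=0.5085):
    """Return (baseline_cost, reduced_cost, savings) in USD.

    `reduction` defaults to the average token reduction reported in the
    paper abstract; treat it as an optimistic estimate outside SRDD.
    """
    baseline_cost = baseline_tokens / 1000 * usd_per_1k_tokens
    reduced_cost = baseline_cost * (1.0 - reduction)
    return baseline_cost, reduced_cost, baseline_cost - reduced_cost


# Placeholder inputs: 2M tokens per batch at a hypothetical $0.002 per 1k tokens.
baseline, reduced, saved = projected_token_cost(2_000_000, 0.002)
print(f"baseline=${baseline:.2f} reduced=${reduced:.2f} saved=${saved:.2f}")
```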

What To Try In 7 Days

Log agent interactions as (state, instruction, next state) triples and build a small shortcut index from past successful tasks.

Implement a cheap embedding retrieval (text-embedding-ada-002 or similar) to find reference tasks for new requirements.

Add simple cost filters: estimate token/time cost for candidate shortcuts and drop those exceeding remaining budget; test forced termination thresholds.
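
The first and third steps above might be sketched as follows. This assumes each successful run is logged as an ordered list of (state, instruction, next state) triples with a per-turn token count. Estimating a shortcut's cost from the final turn it replaces is an illustrative guess, not the paper's method, and `Triple`, `Shortcut`, `mine_shortcuts`, and `within_budget` are hypothetical names.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    state: str
    instruction: str
    next_state: str
    tokens: int  # tokens spent on this turn (assumed to be logged)


class Shortcut(NamedTuple):
    start: str
    end: str
    skipped_turns: int  # turns saved by jumping directly from start to end
    est_tokens: int     # rough token estimate for taking the jump


def mine_shortcuts(trajectory, min_gap=2):
    """Pair each state with later, non-adjacent states from one successful run."""
    if not trajectory:
        return []
    states = [trajectory[0].state] + [t.next_state for t in trajectory]
    shortcuts = []
    for i in range(len(states)):
        for j in range(i + min_gap, len(states)):
            # Assumption: the jump costs about as much as the last turn it replaces.
            est = trajectory[j - 1].tokens
            shortcuts.append(Shortcut(states[i], states[j], j - i - 1, est))
    return shortcuts


def within_budget(shortcuts, remaining_tokens):
    """Drop candidates whose estimated cost exceeds the remaining token budget."""
    return [s for s in shortcuts if s.est_tokens <= remaining_tokens]
```

A stored index would hold these records per reference task; the budget filter runs at lookup time, once the remaining budget is known.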

Agent Features

Memory
reference task retrieval (shortcut memory)
Planning
task decomposition; reference-guided plan shortcuts
Tool Use
external code compilation/execution environment; semantic embeddings for retrieval
Frameworks
ChatDev (used as base for experiments); MetaGPT (baseline)
Is Agentic

Yes

Architectures
multi-agent system (role-based agents)
Collaboration
iterative instruction-exchange (chat chain); role assignment (programmer/reviewer)

Optimization Features

Token Efficiency
token-aware shortcut filtering; normalization and ranking of token/time cost
System Optimization
budget-aware emergency factor to shift priorities
Inference Optimization
interaction pruning via shortcuts; forced termination when path length exceeds reference
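
Forced termination can be illustrated with a small loop that caps the number of agent turns at the length of the retrieved reference path. This is only a sketch: `step` and `is_done` stand in for a real agent turn and completion check, and the `slack` multiplier is a hypothetical knob, not something the source describes.

```python
def run_with_forced_termination(step, initial_state, is_done,
                                reference_length, slack=1.0):
    """Run agent turns until the task is done or the turn cap is reached.

    `reference_length` is the number of turns the reference task needed;
    `slack` (hypothetical) loosens or tightens the cap.
    Returns (final_state, turns_used, finished).
    """
    max_turns = max(1, int(reference_length * slack))
    state = initial_state
    turns = 0
    while not is_done(state) and turns < max_turns:
        state = step(state)
        turns += 1
    return state, turns, is_done(state)
```

The returned `finished` flag lets the caller distinguish a completed task from one that was cut off, which matters when over-pruning is a concern.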

Reproducibility

Code Available: No
Data Available: Yes
Open Source Status: Unknown
License: Unknown

Data URLs

SRDD dataset referenced via [9] (ChatDev paper)

Risks & Boundaries

Limitations

Relies on finding similar historical tasks; cold-start tasks get no shortcut benefit.

Embedding-based similarity may miss fine-grained code semantics and produce imperfect matches.
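
Both limitations show up in even a simple retrieval routine. The sketch below assumes task embeddings are already computed and stored as plain lists of floats. The `min_similarity` threshold and the function names are hypothetical, and returning `None` models the cold-start case, where the system would fall back to the unassisted multi-agent flow.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 0.0 if norm_u == 0 or norm_v == 0 else dot / (norm_u * norm_v)


def retrieve_reference(query_vec, store, min_similarity=0.8):
    """Return (task_id, similarity) for the closest stored task, or None.

    `store` maps task ids to embedding vectors. None means no stored task
    is similar enough, so no shortcuts should be applied.
    """
    best = None
    for task_id, vec in store.items():
        sim = cosine(query_vec, vec)
        if best is None or sim > best[1]:
            best = (task_id, sim)
    if best is None or best[1] < min_similarity:
        return None
    return best
```

A high similarity score here only says the requirement texts are close; it does not guarantee the reference task's code structure fits the new task.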

When Not To Use

For novel tasks without historical analogs in the shortcut store.

When budgets are so large that extra reasoning improves quality and cost is irrelevant.

Failure Modes

Applying an incorrect shortcut that produces semantically wrong code despite compiling.

Over-pruning useful interactions and returning incomplete implementations.

Core Entities

Models

GPT-3.5-Turbo, GPT-4, LLaMA 3 70B, GPT-Engineer, ReAct, MetaGPT, ChatDev, Co-Saving (this work)

Metrics

Completeness, Executability, Consistency, Granularity, Quality, BCR (Budgeted Completion Rate)

Datasets

SRDD (subset used for training shortcuts and testing)