ALAS: modular LLM agents with persistent memory that locally repair plans under runtime disruptions

May 18, 2025 · 7 min

Overview

Decision Snapshot: Needs Validation

ALAS is a strong prototype for reactive planning: it shows reproducible benchmark gains and clear mechanisms (memory + local compensation), but it needs deployment tests, robust code generation safeguards, and dynamic cost tuning before industrial use.

Citations: 2

Evidence Strength: 0.70

Confidence: 0.80

Risk Signals: 9

Trust Signals

Findings with numeric evidence: 4/4

Findings with evidence refs: 4/4

Results with explicit delta: 4/4

Reproducibility

Status: No open assets linked

Open source: Partial

At A Glance

Cost impact: 60%

Production readiness: 60%

Novelty: 70%

Authors

Edward Y. Chang, Longling Geng

Links

Abstract / PDF

Why It Matters For Business

ALAS turns LLMs into practical schedulers for dynamic operations by keeping state, validating plans, and repairing disruptions locally—reducing rework, travel, and missed deadlines in logistics and operations.

Who Should Care

Summary TLDR

ALAS (Adaptive LLM Agent System) turns a monolithic LLM planner into a network of role-specialized agents with a shared persistent execution memory and a Local Reactive Compensation Protocol (LRCP). The system (1) validates templates, (2) instantiates agents (via code or LLM prompts), and (3) reacts to runtime failures with local compensation instead of full replanning. ALAS reduces travel distance on the Urban Ride Sharing (URS) task, narrows the gap to optimal on job-shop scheduling (JSSP) benchmarks, and recovers from runtime disruptions that broke standalone LLMs. Key limits: no physical deployment yet, reliance on reliable code generation, and static cost models.

Problem Statement

Standalone LLMs struggle with real-time, transaction-style planning because they lack self-checks, persistent state, long-context fidelity, and disruption recovery. ALAS addresses these gaps by decomposing plans across role-specific agents and adding persistent execution memory, validators, and a Local Reactive Compensation Protocol (LRCP) that prefers local fixes over costly global replanning.

Main Contribution

Three-layer architecture (workflow blueprint, agent factory, runtime monitor) that builds validated templates and instantiates agents.

Persistent execution memory that logs state transitions, enabling rollback, targeted compensation, and causal checks.
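The idea of logging state transitions so that a failure triggers a targeted rollback can be illustrated with a minimal sketch. This is not the authors' implementation: the `ExecutionMemory` and `Transition` classes, the state names, and the task names are all hypothetical, chosen only to show how a dependency graph can limit a rollback to the affected tasks.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Transition:
    """One logged state change for a task (hypothetical record layout)."""
    task: str
    old_state: str
    new_state: str


@dataclass
class ExecutionMemory:
    """Append-only log plus a task dependency graph (illustrative only)."""
    deps: Dict[str, Set[str]] = field(default_factory=dict)  # task -> prerequisites
    state: Dict[str, str] = field(default_factory=dict)
    log: List[Transition] = field(default_factory=list)

    def record(self, task: str, new_state: str) -> None:
        """Append a transition to the log and update the current state."""
        old = self.state.get(task, "pending")
        self.log.append(Transition(task, old, new_state))
        self.state[task] = new_state

    def affected_by(self, failed: str) -> Set[str]:
        """Return the failed task and every task that transitively depends on it."""
        hit = {failed}
        changed = True
        while changed:
            changed = False
            for task, prereqs in self.deps.items():
                if task not in hit and prereqs & hit:
                    hit.add(task)
                    changed = True
        return hit

    def rollback(self, failed: str) -> Set[str]:
        """Reset only the affected tasks to 'pending'; leave the rest untouched."""
        scope = self.affected_by(failed)
        for task in sorted(scope):
            if self.state.get(task, "pending") != "pending":
                self.record(task, "pending")
        return scope


# Made-up tasks: "drive" depends on "pickup"; "cook" is independent.
memory = ExecutionMemory(deps={"pickup": set(), "drive": {"pickup"}, "cook": set()})
for name in ("pickup", "drive", "cook"):
    memory.record(name, "done")
scope = memory.rollback("pickup")  # touches pickup and drive, not cook
```

Because the rollback itself is written to the log, the history of what was undone and why stays available for later causal checks.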

Key Findings

ALAS produces shorter ride-sharing routes than standalone LLM baselines on the URS task.

Numbers: Average distance 95.1 km vs 118.9 km (20% reduction, p<0.01)

Practical Use: If you use ALAS for dynamic dispatch, expect meaningful travel savings vs naive LLM outputs; adopt the template+agent pattern for routing tasks.

Evidence Ref: Section 4.1, Fig. 2

ALAS reliably handles mid-run disruptions in a family-event scenario while many LLMs fail.

Numbers: ALAS succeeded 10/10 reactive trials; DeepSeek and Claude failed 7/10 each

Practical Use: For workflows where deadlines and interdependent tasks change at run time, use persistent state + LRCP to avoid infeasible plans during replanning.

Evidence Ref: Section 4.2, D.9

Results

| Metric | Value | Baseline | Delta | Split / Dataset | Evidence | Evidence Ref |
| --- | --- | --- | --- | --- | --- | --- |
| URS total travel distance | 95.1 km (ALAS mean) | 118.9 km (baseline LLMs mean) | -20% | URS (10 runs) | Section 4.1, Fig. 2 | Fig. 2 |
| Family Reunion reactive success | 10/10 feasible replans (ALAS) | 3/10 feasible replans (DeepSeek/Claude typical success) | ALAS +7 trials | Family Reunion reactive test (10 trials) | Section 4.2, D.9 | D.9 |
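The deltas reported above can be re-derived from the stated values. The short check below uses only numbers quoted in the results; the helper name `percent_reduction` is an illustrative choice.

```python
def percent_reduction(baseline: float, value: float) -> float:
    """Relative reduction of `value` against `baseline`, in percent."""
    return (baseline - value) / baseline * 100.0


# URS: 95.1 km (ALAS mean) vs 118.9 km (baseline LLMs mean)
urs_reduction = percent_reduction(118.9, 95.1)  # about 20%

# Family Reunion: 10 feasible replans vs 3 (a 7/10 failure rate means 3/10 successes)
baseline_successes = 10 - 7
success_gain = 10 - baseline_successes  # 7 additional feasible trials
```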

What To Try In 7 Days

Prototype ALAS for one scheduling pipeline: build role templates, run a validator, and log persistent state for a small set of jobs.

Implement one compensator (LRCP) that performs local swaps and measures WIP movement vs global replanning cost.

Compare baseline LLM scheduling vs ALAS on a small real workflow (10–50 tasks) and track feasibility and downstream rework.
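For the validator step and the feasibility comparison, a starting point could look like the sketch below. It assumes a job-shop style plan where each operation has a job, a position within that job, a machine, a start time, and a duration. The `Op` record and `validate` function are hypothetical and check only two constraints (job precedence and machine overlap); a real validator would need the constraints of the actual workflow.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Op:
    job: str
    index: int      # position of the operation within its job
    machine: str
    start: int
    duration: int

    @property
    def end(self) -> int:
        return self.start + self.duration


def validate(schedule: List[Op]) -> List[str]:
    """Return human-readable violations; an empty list means no violation was found."""
    problems: List[str] = []
    # Precedence: consecutive operations of the same job must not overlap.
    ops = sorted(schedule, key=lambda o: (o.job, o.index))
    for prev, nxt in zip(ops, ops[1:]):
        if prev.job == nxt.job and nxt.start < prev.end:
            problems.append(
                f"precedence: {nxt.job}[{nxt.index}] starts before {prev.job}[{prev.index}] ends"
            )
    # Capacity: neighbours in start order on the same machine must not overlap.
    by_machine = sorted(schedule, key=lambda o: (o.machine, o.start))
    for prev, nxt in zip(by_machine, by_machine[1:]):
        if prev.machine == nxt.machine and nxt.start < prev.end:
            problems.append(
                f"overlap on {prev.machine}: {prev.job}[{prev.index}] and {nxt.job}[{nxt.index}]"
            )
    return problems


# Made-up plan: J1's second operation collides with J2 on machine M2.
plan = [
    Op("J1", 0, "M1", 0, 3), Op("J1", 1, "M2", 3, 2),
    Op("J2", 0, "M2", 0, 4), Op("J2", 1, "M1", 4, 2),
]
violations = validate(plan)

# Delaying the colliding operation by one time unit removes the violation.
repaired = [Op("J1", 1, "M2", 4, 2) if (o.job, o.index) == ("J1", 1) else o for o in plan]
violations_after = validate(repaired)
```

Running the same `validate` over plans from a baseline LLM and from the agent pipeline gives a simple feasibility count to track across trials.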

Agent Features

Memory
persistent execution memory (state transitions, logs); dependency graphs for rollback
Planning
workflow template construction; LRCP local reactive compensation; queue reordering with WIP penalty
Tool Use
LLMs to generate code and prompts; external validators and classical heuristics
Frameworks
ALAS; LRCP
Is Agentic

Yes

Architectures
three-layer (blueprint, agent factory, runtime); role-specialized agent graph
Collaboration
master coordinator LLM; message-based inter-agent alerts and DELAY_NOTIFY
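One way message-based alerts could keep a disruption local is sketched below. Only the `DELAY_NOTIFY` name comes from the source; the message fields, the `TaskAgent` class, and the slack-absorbing rule are assumptions made for illustration, not the protocol as specified by the authors.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, Dict, List


@dataclass
class Message:
    kind: str        # "DELAY_NOTIFY" is named in the source; the fields are assumed
    sender: str
    receiver: str
    new_finish: int  # updated finish time of the sender's task


@dataclass
class TaskAgent:
    name: str
    start: int
    duration: int
    successors: List[str]

    @property
    def finish(self) -> int:
        return self.start + self.duration

    def handle(self, msg: Message) -> List[Message]:
        """Absorb the delay if slack allows; otherwise shift and notify successors."""
        if msg.kind != "DELAY_NOTIFY" or msg.new_finish <= self.start:
            return []  # upstream still finishes before this task starts
        self.start = msg.new_finish
        return [Message("DELAY_NOTIFY", self.name, s, self.finish) for s in self.successors]


def propagate(agents: Dict[str, TaskAgent], first: List[Message]) -> List[str]:
    """Deliver messages in FIFO order; return the receivers in delivery order."""
    inbox: Deque[Message] = deque(first)
    contacted: List[str] = []
    while inbox:
        msg = inbox.popleft()
        contacted.append(msg.receiver)
        inbox.extend(agents[msg.receiver].handle(msg))
    return contacted


# Made-up chain: load -> drive -> unload, with slack before unload.
agents = {
    "load": TaskAgent("load", 0, 3, ["drive"]),
    "drive": TaskAgent("drive", 4, 5, ["unload"]),
    "unload": TaskAgent("unload", 12, 2, []),
}
# "load" now finishes at 5 instead of 3, so it alerts its successor.
contacted = propagate(agents, [Message("DELAY_NOTIFY", "load", "drive", 5)])
```

In this toy run the delay shifts `drive` by one time unit, and `unload` absorbs the rest through its slack, so no global replan is needed.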

Optimization Features

Token Efficiency
compartmentalize context per agent to reduce long-context erosion
System Optimization
swap-limited queue optimization (practical bound O(S J O_max)); WIP penalty to avoid excessive reordering
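A swap budget combined with a WIP penalty can be illustrated on a single machine queue. The sketch below is a simple greedy stand-in, not the paper's algorithm: the job layout, the tardiness objective, and the `repair_queue` function are assumptions, and this version costs roughly O(S J^2) for a budget of S swaps over J jobs rather than the bound quoted above.

```python
from typing import List, Tuple

Job = Tuple[str, int, int]  # (name, processing_time, due_time), hypothetical layout


def total_tardiness(queue: List[Job], start: int) -> int:
    """Sum of max(0, completion - due) when jobs run back to back from `start`."""
    clock, late = start, 0
    for _, duration, due in queue:
        clock += duration
        late += max(0, clock - due)
    return late


def repair_queue(
    queue: List[Job], start: int, max_swaps: int, wip_penalty: int
) -> Tuple[List[Job], int]:
    """Greedy adjacent swaps; a swap is kept only if it cuts tardiness by more than wip_penalty."""
    queue = list(queue)
    swaps = 0
    while swaps < max_swaps:
        current = total_tardiness(queue, start)
        best_gain, best_i = 0, -1
        for i in range(len(queue) - 1):
            queue[i], queue[i + 1] = queue[i + 1], queue[i]
            gain = current - total_tardiness(queue, start) - wip_penalty
            queue[i], queue[i + 1] = queue[i + 1], queue[i]  # undo the trial swap
            if gain > best_gain:
                best_gain, best_i = gain, i
        if best_i < 0:
            break  # no swap pays for its penalty
        queue[best_i], queue[best_i + 1] = queue[best_i + 1], queue[best_i]
        swaps += 1
    return queue, swaps


# Made-up queue; a disruption pushes the machine's start time from 0 to 2.
jobs = [("A", 4, 20), ("B", 2, 5), ("C", 3, 9)]
before = total_tardiness(jobs, start=2)
repaired, used = repair_queue(jobs, start=2, max_swaps=2, wip_penalty=1)
after = total_tardiness(repaired, start=2)
```

Raising `wip_penalty` makes the repair more conservative: with a large enough penalty no swap is accepted and the queue is left as it was.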

Reproducibility

Code Available: No
Data Available: No
Open Source Status: Partial
License: Unknown

Risks & Boundaries

Limitations

No in-factory or live deployment; results are simulation and benchmark-based.

Agent Factory assumes reliable LLM code generation; complex agents may need human review.

When Not To Use

Purely static optimization where classical combinatorial solvers already outperform LLM-based approaches.

Safety-critical domains that cannot tolerate automated code generation without human certification.

Failure Modes

Faulty agent code generation creating incorrect compensators or logging gaps.

Validator blind spots that miss constraint interactions leading to invalid templates.

Core Entities

Models

GPT-4o-Task; DeepSeek R1; Claude 3.7 Sonnet; Gemini 2.5 Pro; SeEvo-GPT3.5

Metrics

makespan; gap-to-upper-bound (%); total travel distance (km); reactive success rate (feasible replans / trials)
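The two job-shop metrics can be computed as sketched below. The gap formula is the usual relative-excess definition, assumed here rather than taken from the paper, and the schedule and bound are made-up values, not benchmark data.

```python
from typing import List, Tuple

Operation = Tuple[int, int]  # (start, duration), hypothetical layout


def makespan(ops: List[Operation]) -> int:
    """Completion time of the last operation to finish."""
    return max(start + duration for start, duration in ops)


def gap_to_upper_bound(value: float, upper_bound: float) -> float:
    """Relative excess over the reference upper bound, in percent."""
    return (value - upper_bound) / upper_bound * 100.0


toy_ops = [(0, 3), (3, 2), (0, 4), (4, 7)]  # made-up schedule
toy_makespan = makespan(toy_ops)            # 11: the last operation runs from 4 to 11
toy_gap = gap_to_upper_bound(toy_makespan, upper_bound=10)  # made-up bound of 10
```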

Datasets

Demirkol-DMU (DMU) JSSP; Taillard (TA) JSSP; Urban Ride Sharing (URS) synthetic; Family Reunion (custom event scenario)

Benchmarks

DMU job-shop benchmarks; Taillard (TA) job-shop benchmarks

Context Entities

Models

Gemini; OpenAI APIs; Claude