Overview
The system is practical and has been demonstrated on Kaggle tasks with concrete configurations, but the evidence is limited to demos: there is no public code and there are no broad benchmark results yet.
Citations: 0
Evidence Strength: 0.50
Confidence: 0.78
Risk Signals: 11
Trust Signals
Findings with numeric evidence: 5/5
Findings with evidence refs: 5/5
Results with explicit delta: 0/4
Reproducibility
Status: No open assets linked
Open source: Partial
At A Glance
Cost impact: 50%
Production readiness: 60%
Novelty: 60%
Why It Matters For Business
CEDAR reduces repetitive scripting by automating stepwise data-science workflows while keeping data local, which speeds up prototyping and improves privacy controls for enterprise projects.
Who Should Care
Summary TLDR
CEDAR is a small multi-agent app that structures data‑science work for LLMs. A main orchestrator routes requests to two sub-agents (Text and Code). The system produces numbered plan-and-code steps like a readable notebook, runs code locally in Docker, keeps only compact history (configurable truncation, default 10k chars), and exposes configs (max steps, retries). The authors demonstrate CEDAR on Kaggle-style DS tasks and emphasize privacy (data stays local) and fault recovery via iterative code re-generation.
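The compact-history idea can be sketched as follows. This is a minimal illustration, not CEDAR's actual code: the function name `truncate_output`, the marker text, and the choice to keep both the head and the tail of the output are assumptions. Only the 10,000-character default comes from the summary above.

```python
DEFAULT_MAX_CHARS = 10_000  # default stated in the summary; everything else is assumed


def truncate_output(text: str, max_chars: int = DEFAULT_MAX_CHARS) -> str:
    """Keep the head and tail of a long cell output and drop the middle."""
    if max_chars < 0:
        raise ValueError("max_chars must be non-negative")
    if len(text) <= max_chars:
        return text
    marker = "\n...[truncated]...\n"  # hypothetical marker text
    if max_chars <= len(marker):
        # Budget too small for a marker: fall back to a plain prefix.
        return text[:max_chars]
    keep = max_chars - len(marker)
    head = (keep + 1) // 2
    tail = keep // 2
    return text[:head] + marker + text[len(text) - tail:]
```

Keeping the tail as well as the head is a design choice made here because error tracebacks usually end with the most informative line.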
Problem Statement
LLMs can simplify data science, but single-shot prompts fail on multi-step workflows, large or private data, and math-heavy tasks, and they run into context-length limits. Practitioners need a structured, transparent way to split planning, code, and outputs while keeping data local and recovering when code errors occur.
Main Contribution
A practical three-agent architecture (orchestrator, text agent, and code agent) that produces interleaved plan and executable code cells.
Structured prompts and JSON-based function calls to prevent hallucinated routing and make tool calls robust.
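One way JSON-based function calls can guard against hallucinated routing is to validate each orchestrator reply against a fixed set of actions before dispatching it. The sketch below assumes a reply shaped like `{"action": ..., "arguments": {...}}`. The action names and the function `parse_routing_call` are hypothetical, since the source does not list CEDAR's actual schema.

```python
import json

# Hypothetical action names; the source does not publish the real schema.
ALLOWED_ACTIONS = {"call_text_agent", "call_code_agent", "finish"}


def parse_routing_call(raw: str) -> dict:
    """Parse an orchestrator reply and reject anything outside the schema."""
    try:
        call = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"reply is not valid JSON: {exc}") from exc
    if not isinstance(call, dict):
        raise ValueError("reply must be a JSON object")
    action = call.get("action")
    if not isinstance(action, str) or action not in ALLOWED_ACTIONS:
        raise ValueError(f"unknown action: {action!r}")
    args = call.get("arguments", {})
    if not isinstance(args, dict):
        raise ValueError("'arguments' must be a JSON object")
    return {"action": action, "arguments": args}
```

A rejected reply can be fed back to the orchestrator as an error message, so an invented action name triggers a re-prompt instead of an undefined dispatch.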
Key Findings
CEDAR uses three LLM roles: an orchestrator plus separate text and code agents to produce a readable stepwise notebook.
Code runs locally in Docker; only aggregate snapshots and outputs are passed to LLMs, so raw data need not leave the host.
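To illustrate what an "aggregate snapshot" might look like, the sketch below summarizes a CSV into per-column statistics and returns no raw rows. The function name `aggregate_snapshot` and the specific statistics are assumptions, because the source does not specify what CEDAR includes in its snapshots.

```python
import csv
import io
import statistics


def aggregate_snapshot(csv_text: str) -> dict:
    """Summarize a CSV as per-column aggregates; no raw rows are returned."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    snapshot = {"n_rows": len(rows), "columns": {}}
    for name in reader.fieldnames or []:
        # Treat empty strings and short rows (None) as missing values.
        values = [r[name] for r in rows if r.get(name) not in (None, "")]
        info = {"non_missing": len(values)}
        if not values:
            info["type"] = "empty"
        else:
            try:
                numbers = [float(v) for v in values]
            except ValueError:
                info["type"] = "text"
                info["n_unique"] = len(set(values))
            else:
                info["type"] = "numeric"
                info["min"] = min(numbers)
                info["max"] = max(numbers)
                info["mean"] = statistics.mean(numbers)
        snapshot["columns"][name] = info
    return snapshot
```

Note that `min` and `max` still expose individual values, so a stricter privacy policy might drop them or replace them with binned ranges.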
Results
| Metric | Value | Baseline | Delta | Split / Dataset | Evidence | Evidence Ref |
|---|---|---|---|---|---|---|
| Autorun runtime for demo | ≈3 minutes for 10–20 steps | — | — | Demo Kaggle task | Section 3.1: autorun ≃ 3 minutes for 10–20 steps | Section 3.1 |
| Default maximum solution steps | 30 steps | — | — | Configuration | Section 3.3: default 30 | Section 3.3 |
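The exposed settings could be grouped in a small configuration object, as sketched below. The class name `RunConfig` and the field names are hypothetical. The 30-step default comes from the table above and the 10,000-character truncation default from the summary. The source gives no default for retries, so this sketch makes that field required instead of guessing one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunConfig:
    # No default is given in the source for retries, so it must be supplied.
    max_retries: int
    max_steps: int = 30              # default reported in Section 3.3
    history_max_chars: int = 10_000  # truncation default from the summary

    def __post_init__(self) -> None:
        if self.max_steps < 1:
            raise ValueError("max_steps must be at least 1")
        if self.max_retries < 0:
            raise ValueError("max_retries must be non-negative")
        if self.history_max_chars < 0:
            raise ValueError("history_max_chars must be non-negative")
```

For example, `RunConfig(max_retries=3)` yields a configuration with the documented step and truncation defaults.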
What To Try In 7 Days
Run CEDAR on a small Kaggle task to compare autorun vs manual scripting.
Set up local Docker execution and confirm that raw data never leaves your host.
Try structured prompts (project summary form) and measure time saved assembling steps versus free-form prompting.
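For the local Docker check, one hedged starting point is to run a generated step in a container with networking disabled and the data directory mounted read-only. The helper below only builds the command. The function name, the `/work` mount point, and the `python:3.12-slim` image are assumptions, not CEDAR's documented setup. `--rm`, `--network none`, `-v ...:ro`, and `-w` are standard `docker run` options.

```python
from pathlib import Path


def docker_run_command(data_dir: str, script: str,
                       image: str = "python:3.12-slim") -> list[str]:
    """Build a `docker run` command for one step script inside data_dir."""
    host_dir = Path(data_dir).resolve()
    return [
        "docker", "run", "--rm",
        "--network", "none",            # the container gets no network access
        "-v", f"{host_dir}:/work:ro",   # data is mounted read-only
        "-w", "/work",
        image,
        "python", script,
    ]
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`. With `--network none`, code executed in the container cannot send data off the host, although this says nothing about what the host-side application sends to the LLM. A read-only mount also means the step cannot write output files into `/work`.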
Agent Features
Memory
Planning
Tool Use
Frameworks
Is Agentic
Yes
Architectures
Collaboration
Optimization Features
Token Efficiency
Infra Optimization
System Optimization
Reproducibility
Risks & Boundaries
Limitations
Demo scope limited to beginner-to-intermediate Kaggle tasks, not large real-world pipelines.
Agent complexity is basic; no independent faithfulness or critic agent implemented yet.
When Not To Use
For very large datasets that do not fit in RAM.
When you need provable correctness for critical math-heavy pipelines.
Failure Modes
Hallucinated action names (mitigated by JSON schema but still possible if schema changes).
Code execution errors needing multiple retries for complex environments.
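The retry-based recovery behind the last failure mode can be sketched as a loop that feeds the previous error back to the code generator. This is an illustration under stated assumptions: `generate_code` is a hypothetical stand-in for the code agent, and the code runs in a local Python subprocess instead of Docker to keep the example self-contained.

```python
import subprocess
import sys
from typing import Callable, Optional


def run_with_retries(generate_code: Callable[[Optional[str]], str],
                     max_retries: int) -> tuple[bool, str]:
    """Run generated code, regenerating it from the last error on failure.

    Returns (True, stdout) on success or (False, last stderr) once the
    initial attempt plus max_retries retries have all failed.
    """
    last_error: Optional[str] = None
    for _ in range(max_retries + 1):
        code = generate_code(last_error)  # hypothetical code-agent call
        result = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True, text=True, timeout=60,
        )
        if result.returncode == 0:
            return True, result.stdout
        last_error = result.stderr
    return False, last_error or ""
```

Capping the loop matters in practice: a complex environment problem, such as a missing system library, will not be fixed by regenerating code, so the loop should give up and surface the last error.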

