Overview
Production Readiness
0.5
Novelty Score
0.5
Cost Impact Score
0.4
Citation Count
0
Why It Matters For Business
Automates the end-to-end path from raw tables to a polished report, saving analyst time on repetitive chart creation, basic QA, and first-draft narrative. The system produces multiple scored insight candidates so teams can select defensible findings instead of relying on a single model output.
Summary TLDR
A2P-Vis is a two-part multi-agent system. The Data Analyzer profiles data, generates and executes plotting code, filters out poor figures, and scores candidate insights; the Presenter orders topics and writes a coherent, chart-grounded report. The pipeline emphasizes automated quality checks (schema profiling, code rectification, chart legibility) and a rubric-based insight scorer to produce ready-to-publish narratives from raw tables.
Problem Statement
Current LLM-based data pipelines often (1) fail to produce diverse, evidence-rich visual insights, and (2) do not assemble those charts and findings into a coherent, professional report without manual work.
Main Contribution
Design of the Data Analyzer: profiles data, proposes visualization directions, generates and executes plotting code, rejects low-quality charts, and scores candidate insights.
Design of the Presenter: ranks topics, composes chart-grounded narratives with transitions, summarizes takeaways, and revises the draft into a polished Markdown report.
An end-to-end agentic workflow that couples quality gating (metadata sniffing, code rectifier, chart judger) with a structured insight rubric to reduce routine failures and improve narrative coherence.
Key Findings
The Insight Generator produces multiple candidate insights per chart, scores them against the rubric, and returns only a small vetted set.
Lightweight dataset profiling (Sniffer) enforces a schema contract to avoid common failures.
The Visualizer runs plotting code through an execute-rectify-validate loop to improve plot reliability.
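The execute-rectify-validate idea can be illustrated with a minimal sketch. This is not the paper's implementation (no code is published): the function names `run_with_rectification`, `toy_rectify`, and `toy_validate` are hypothetical, the rectifier is a hard-coded stand-in for an LLM call, and the validator checks a variable rather than a rendered chart.

```python
from typing import Callable, Optional


def run_with_rectification(
    code: str,
    rectify: Callable[[str, str], str],
    validate: Callable[[dict], bool],
    max_attempts: int = 3,
) -> Optional[dict]:
    """Execute `code`; on an error or failed validation, ask `rectify`
    for a revised version and retry. Returns the resulting namespace,
    or None if no attempt passes within `max_attempts`."""
    for _ in range(max_attempts):
        namespace: dict = {}
        try:
            # A real system would run this in a sandbox, not in-process.
            exec(code, namespace)
        except Exception as exc:
            code = rectify(code, f"{type(exc).__name__}: {exc}")
            continue
        if validate(namespace):
            return namespace
        code = rectify(code, "validation failed")
    return None


# Hypothetical stand-in for an LLM rectifier: it fixes one known typo.
def toy_rectify(code: str, error: str) -> str:
    return code.replace("valeus", "values")


# Hypothetical stand-in for a chart judge: it checks one computed value.
def toy_validate(namespace: dict) -> bool:
    return namespace.get("total") == 6


broken = "values = [1, 2, 3]\ntotal = sum(valeus)"
result = run_with_rectification(broken, toy_rectify, toy_validate)
unfixable = run_with_rectification("1 / 0", lambda c, e: c, lambda ns: True)
```

Here the first attempt fails with a `NameError`, the rectifier repairs the typo, and the second attempt passes validation. Capping the number of attempts keeps code that cannot be repaired from looping indefinitely.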
Results
Candidate insights generated per chart
Final insights returned per chart after scoring
Visualizer flow steps
Who Should Care
What To Try In 7 Days
Run the Sniffer on a representative table to extract a schema profile and check for mis-typed columns.
Use the Visualizer flow on one dashboard: generate directions, auto-generate code, execute, and inspect the rectified plots.
Generate 5–7 candidate insights per chart and apply a simple rubric to pick the top 3. Compare human picks to the scorer.
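For the first step above, a lightweight schema profile can be approximated with the standard library alone. The sketch below is an assumption about what such a profile might contain (inferred type, missing count, distinct count); the paper does not specify the Sniffer's output format, and `infer_type` and `profile_table` are hypothetical names. The sample table is made up for illustration.

```python
import csv
import io


def infer_type(values):
    """Guess a column type from its non-empty string values."""
    non_empty = [v for v in values if v.strip() != ""]
    if not non_empty:
        return "empty"
    for name, cast in (("int", int), ("float", float)):
        try:
            for v in non_empty:
                cast(v)
            return name
        except ValueError:
            continue
    return "string"


def profile_table(csv_text):
    """Return {column: {"type", "missing", "distinct"}} for a CSV string."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    profile = {}
    columns = list(rows[0].keys()) if rows else []
    for col in columns:
        # Short rows yield None for missing fields; treat those as empty.
        values = [row[col] or "" for row in rows]
        profile[col] = {
            "type": infer_type(values),
            "missing": sum(1 for v in values if v.strip() == ""),
            "distinct": len(set(values)),
        }
    return profile


# Illustrative data: "price" contains a stray "n/a", so it profiles as
# a string column even though it is meant to be numeric.
sample = "region,units,price\nNorth,10,2.5\nSouth,,3.0\nEast,7,n/a\n"
schema = profile_table(sample)
```

A column such as `price` that is expected to be numeric but profiles as `string` is the kind of mis-typed column worth catching before any plotting code is generated.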
Agent Features
Memory
- Short-term metadata profile (schema contract)
- Per-run topic and chart outputs
Planning
- Task Decomposition
- Topic Sequencing
- Visualization Direction Planning
Tool Use
- Code generation for plots
- Automated code execution
- Error-rectification callbacks
- Chart legibility checking
Frameworks
- Chain-of-thought style revision
Is Agentic
true
Collaboration
- Multi-agent pipeline (Analyzer ↔ Presenter)
- Module handoffs via structured metadata
Reproducibility
Data Available
Open Source Status
- unknown
Risks & Boundaries
Limitations
- No quantitative evaluation or user study reported in the paper.
- No public code repository provided; reproducibility is limited.
- Relies on LLMs for text and code generation; quality depends on underlying model and prompts.
When Not To Use
- When you need statistically rigorous inference or peer-reviewed analysis (not just automated reporting).
- When data is highly sensitive and cannot be profiled or passed to external services.
- When you require end-to-end reproducible pipelines with open-source implementations (paper provides no code).
Failure Modes
- LLM-generated code may still produce incorrect plots despite rectification.
- Insight scorer can surface plausible but incorrect explanations (model hallucination).
- Topic ordering or transitions may be superficially coherent but miss domain nuance without human review.
Context Entities
Models
- Google Data Science Agent (referenced)
Metrics
- Insight scoring rubric (Correctness, Specificity, Depth, So-what quality)
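A rubric of this shape can be applied with a few lines of code. The sketch below assumes a 1 to 5 scale per criterion and equal weights, neither of which is stated in the source; `ScoredInsight` and `top_insights` are hypothetical names, and the scores are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class ScoredInsight:
    text: str
    correctness: float
    specificity: float
    depth: float
    so_what: float

    def total(self) -> float:
        # Assumption: equal weights across the four rubric criteria.
        return (self.correctness + self.specificity
                + self.depth + self.so_what) / 4


def top_insights(candidates, k=3):
    """Return the k highest-scoring candidates, best first."""
    return sorted(candidates, key=lambda c: c.total(), reverse=True)[:k]


# Invented scores on an assumed 1-5 scale, for illustration only.
candidates = [
    ScoredInsight("insight A", 5, 2, 2, 2),
    ScoredInsight("insight B", 5, 5, 4, 4),
    ScoredInsight("insight C", 3, 4, 3, 3),
    ScoredInsight("insight D", 4, 4, 4, 5),
    ScoredInsight("insight E", 2, 3, 2, 1),
]
best = top_insights(candidates, k=3)
```

Keeping the per-criterion scores alongside the total makes it easier to see why a candidate was selected, which helps when comparing the scorer's picks against a human reviewer's.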

