A two-stage agent pipeline that turns raw tables into vetted charts and a publication-ready narrative report.

December 26, 2025 · 7 min

Overview

Production Readiness

0.5

Novelty Score

0.5

Cost Impact Score

0.4

Citation Count

0

Authors

Shuyu Gan, Renxiang Wang, James Mooney, Dongyeop Kang


Why It Matters For Business

Automates the end-to-end path from raw tables to a polished report, saving analyst time on repetitive chart creation, basic QA, and first-draft narrative. The system produces multiple scored insight candidates so teams can select defensible findings instead of relying on a single model output.

Summary TLDR

A2P-Vis is a two-part multi-agent system: a Data Analyzer that profiles data, generates and executes plotting code, filters poor figures, and scores candidate insights; and a Presenter that orders topics and writes a coherent chart‑grounded report. The pipeline emphasizes automated quality checks (schema profiling, code rectification, chart legibility) and a rubric-based insight scorer to produce ready-to-publish narratives from raw tables.

Problem Statement

Current LLM-based data pipelines often (1) fail to produce diverse, evidence-rich visual insights, and (2) do not assemble those charts and findings into a coherent, professional report without manual work.

Main Contribution

Design of the Data Analyzer: profile data, propose visualization directions, generate and execute plotting code, reject low-quality charts, and score candidate insights.

Design of the Presenter: rank topics, compose chart-grounded narratives with transitions, summarize takeaways, and revise the draft into a polished Markdown report.

An end-to-end agentic workflow that couples quality gating (metadata sniffing, code rectifier, chart judger) with a structured insight rubric to reduce routine failures and improve narrative coherence.
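The Presenter's job of ordering topics and assembling a chart-grounded Markdown draft can be sketched roughly as below. This is a minimal illustration, not the paper's implementation: the `Topic` structure, the `score` field, and the rule of using each topic's first insight as a takeaway are all assumptions, and the actual system relies on LLM agents for the writing and revision steps.

```python
from dataclasses import dataclass


@dataclass
class Topic:
    """One chart plus its vetted insights (hypothetical structure)."""
    title: str
    chart_path: str
    insights: list[str]
    score: float  # assumed aggregate of the chart's insight scores


def build_report(report_title: str, topics: list[Topic]) -> str:
    """Order topics by score (highest first) and render a Markdown draft."""
    ordered = sorted(topics, key=lambda t: t.score, reverse=True)
    lines = [f"# {report_title}", ""]
    for topic in ordered:
        lines.append(f"## {topic.title}")
        lines.append("")
        lines.append(f"![{topic.title}]({topic.chart_path})")
        lines.append("")
        lines.extend(f"- {insight}" for insight in topic.insights)
        lines.append("")
    lines.append("## Key takeaways")
    lines.append("")
    # Placeholder rule (an assumption): reuse each topic's first insight.
    lines.extend(f"- {t.insights[0]}" for t in ordered if t.insights)
    return "\n".join(lines) + "\n"
```

A human or a revising agent would then edit this draft for transitions and domain nuance.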

Key Findings

The Insight Generator creates multiple alternatives and delivers a small set of vetted insights.

Numbers: Produces 5–7 candidate insights per chart; returns the top 3 per chart after scoring.
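One way to picture the score-then-select step is the sketch below. It assumes each candidate carries a numeric score for the four rubric dimensions named in this summary (Correctness, Specificity, Depth, So-what quality) and that the dimensions are weighted equally; the equal weighting, the dictionary layout, and the function names are illustrative assumptions rather than details from the paper.

```python
from statistics import mean

# Rubric dimensions as listed in this summary; key names are hypothetical.
RUBRIC = ("correctness", "specificity", "depth", "so_what")


def overall_score(scores: dict[str, float]) -> float:
    """Equal-weight mean across the rubric dimensions (an assumption)."""
    return mean(scores[dim] for dim in RUBRIC)


def top_insights(candidates: list[dict], k: int = 3) -> list[dict]:
    """Return the k highest-scoring candidates, best first."""
    ranked = sorted(
        candidates,
        key=lambda c: overall_score(c["scores"]),
        reverse=True,
    )
    return ranked[:k]
```

Each candidate is expected to look like `{"text": "...", "scores": {"correctness": 4, "specificity": 3, "depth": 3, "so_what": 5}}`, with the scores supplied by whatever scorer (model or human) is in use.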

Lightweight dataset profiling (Sniffer) enforces a schema contract to avoid common failures.
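A lightweight profile of this kind might look like the following sketch, which infers a coarse type for each column and counts missing and distinct values using only the standard library. The function names, the three-way type classification, and the profile fields are assumptions made for illustration; the paper's Sniffer may record different metadata.

```python
import csv
import io


def infer_type(values: list[str]) -> str:
    """Classify a column as 'int', 'float', 'string', or 'empty'."""
    non_empty = [v for v in values if v.strip()]
    if not non_empty:
        return "empty"
    for name, caster in (("int", int), ("float", float)):
        try:
            for value in non_empty:
                caster(value)
            return name
        except ValueError:
            continue  # at least one cell did not parse; try the next type
    return "string"


def sniff(csv_text: str) -> dict[str, dict]:
    """Build a per-column profile that later stages can treat as a schema contract."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    profile = {}
    for column in reader.fieldnames or []:
        values = [row[column] or "" for row in rows]
        profile[column] = {
            "type": infer_type(values),
            "missing": sum(1 for v in values if not v.strip()),
            "distinct": len({v for v in values if v.strip()}),
        }
    return profile
```

Downstream code generation can then be checked against the profile, for example by rejecting a plot that treats a `string` column as numeric.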

Visualizer uses a looped code-execute-rectify-validate flow to improve plot reliability.

Numbers: Four-step flow: direction → code → execute (rectify on error) → chart judger
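The control flow of such a loop can be sketched as follows. The three callables (`generate_code`, `rectify`, `judge`) stand in for the LLM-backed agents and are hypothetical, as is the `max_attempts` retry limit; the sketch only shows how execution errors feed back into rectification before a chart reaches the judge.

```python
from typing import Callable, Optional


def run_visualizer(
    direction: str,
    generate_code: Callable[[str], str],
    rectify: Callable[[str, str], str],
    judge: Callable[[dict], bool],
    max_attempts: int = 3,
) -> Optional[dict]:
    """Return the namespace produced by accepted plotting code, or None."""
    code = generate_code(direction)
    for _ in range(max_attempts):
        namespace: dict = {}
        try:
            # Model-generated code should only ever run inside a sandbox.
            exec(code, namespace)
        except Exception as error:
            # Hand the failing code and the error message to the rectifier.
            code = rectify(code, repr(error))
            continue
        # Execution succeeded: the judge accepts or rejects the result.
        return namespace if judge(namespace) else None
    return None
```

Returning `None` for both a rejected chart and an exhausted retry budget keeps the sketch short; a fuller version would distinguish the two so the caller can log why a direction was dropped.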

Results

Candidate insights generated per chart

Value: 5–7 candidates

Final insights returned per chart after scoring

Value: Top 3 insights

Visualizer flow steps

Value: direction → code → execute (rectify on error) → chart judger

Who Should Care

What To Try In 7 Days

Run the Sniffer on a representative table to extract a schema profile and check for mis-typed columns.

Use the Visualizer flow on one dashboard: generate directions, auto-generate code, execute, and inspect the rectified plots.

Generate 5–7 candidate insights per chart and apply a simple rubric to pick the top 3. Compare human picks to the scorer.
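For the comparison between human picks and the scorer, a simple set-overlap measure is one option. The sketch below uses Jaccard overlap on insight IDs; the choice of metric and the function name are assumptions, and any agreement measure the team already trusts would serve equally well.

```python
def pick_agreement(human: set[str], scorer: set[str]) -> float:
    """Jaccard overlap between two sets of insight IDs (1.0 means identical picks)."""
    if not human and not scorer:
        return 1.0  # two empty selections agree trivially
    return len(human & scorer) / len(human | scorer)


# Illustrative IDs only: two of the three picks match.
agreement = pick_agreement({"i1", "i2", "i3"}, {"i2", "i3", "i5"})
print(agreement)  # 0.5
```

Tracking this value per chart over a week of trials shows where the rubric and human judgment diverge most.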

Agent Features

Memory

  • Short-term metadata profile (schema contract)
  • Per-run topic and chart outputs

Planning

  • Task Decomposition
  • Topic Sequencing
  • Visualization Direction Planning

Tool Use

  • Code generation for plots
  • Automated code execution
  • Error-rectification callbacks
  • Chart legibility checking

Frameworks

  • Chain-of-thought-style revision

Is Agentic

true

Collaboration

  • Multi-agent pipeline (Analyzer ↔ Presenter)
  • Module handoffs via structured metadata
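A handoff via structured metadata could be as simple as a typed record serialized to JSON between the two stages. The `ChartHandoff` fields below are hypothetical and chosen only to illustrate the idea; the paper does not specify the exact schema exchanged between the Analyzer and the Presenter.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class ChartHandoff:
    """Record passed from the Analyzer to the Presenter (hypothetical fields)."""
    chart_id: str
    image_path: str
    direction: str
    insights: list[str] = field(default_factory=list)


def to_payload(handoff: ChartHandoff) -> str:
    """Serialize a handoff record to a JSON string with stable key order."""
    return json.dumps(asdict(handoff), sort_keys=True)


def from_payload(payload: str) -> ChartHandoff:
    """Rebuild a handoff record from its JSON string."""
    return ChartHandoff(**json.loads(payload))
```

Keeping the record explicit makes it easy to validate at the stage boundary and to log exactly what the Presenter received.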

Reproducibility

Data Available

Open Source Status

  • unknown

Risks & Boundaries

Limitations

  • No quantitative evaluation or user study reported in the paper.
  • No public code repository provided; reproducibility is limited.
  • Relies on LLMs for text and code generation; quality depends on underlying model and prompts.

When Not To Use

  • When you need statistically rigorous inference or peer-reviewed analysis (not just automated reporting).
  • When data is highly sensitive and cannot be profiled or passed to external services.
  • When you require end-to-end reproducible pipelines with open-source implementations (paper provides no code).

Failure Modes

  • LLM-generated code may still produce incorrect plots despite rectification.
  • Insight scorer can surface plausible but incorrect explanations (model hallucination).
  • Topic ordering or transitions may be superficially coherent but miss domain nuance without human review.

Context Entities

Models

  • Google Data Science Agent (referenced)

Metrics

  • Insight scoring rubric (Correctness, Specificity, Depth, So-what quality)