Overview
Production Readiness
0.7
Novelty Score
0.6
Cost Impact Score
0.45
Citation Count
1
Why It Matters For Business
PlotEdit turns static chart images in PDFs into editable, high-fidelity charts that can be changed with natural-language instructions, speeding up content updates and improving accessibility for visually impaired users.
Summary TLDR
PlotEdit is a multi-agent system that edits chart images (from PDFs or scans) based on plain-language instructions. Five LLM agents handle the work: three extract the chart's data, style, and code, one decomposes the user's instruction into steps, and one applies the edits. Multimodal feedback (numeric, visual, code) is used iteratively to fix errors. On the ChartCraft test set it improves structural and style fidelity versus prior methods (e.g., overall SSIM 89.0 vs 82.4 for a strong baseline). The system is most useful when you need faithful, editable reconstructions from chart images and when Python replotting is acceptable.
Problem Statement
Charts in PDFs and scans are often images with no source data or style metadata, so edits to data, layout, or style are hard or must be done manually. Existing single-shot vision-language models and plain LLM prompting struggle because they hallucinate or fail to recover accurate tables, styles, or executable code.
Main Contribution
A five-agent pipeline (Chart2Table, Chart2Vision, Chart2Code, Instruction Decomposition, Multimodal Editing) that de-renders and edits chart images from natural language.
Three linked feedback modes — code checks, visual comparison, numeric checks — used iteratively for self-reflection and error correction.
Empirical gains on ChartCraft: higher SSIM, RMS, and style scores versus ChartLLaMA, ChartReformer, and in-context LLM baselines; ablations show feedback matters.
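The iterative feedback loop described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `refine`, `collect`, `code_check`, `make_numeric_check`, and `toy_revise` are hypothetical names, the visual comparison is omitted, and the reviser is a hard-coded stand-in for an LLM agent.

```python
from typing import Callable, List, Tuple

# Hypothetical interfaces (not from the paper): a check returns a list of
# problems found in a candidate script; a reviser returns a new script.
Check = Callable[[str], List[str]]
Reviser = Callable[[str, List[str]], str]


def collect(script: str, checks: List[Check]) -> List[str]:
    """Run every check and merge the feedback into one list."""
    return [msg for check in checks for msg in check(script)]


def refine(script: str, checks: List[Check], revise: Reviser,
           max_rounds: int = 3) -> Tuple[str, List[str]]:
    """Revise until all checks pass or max_rounds revisions have been used."""
    feedback = collect(script, checks)
    rounds = 0
    while feedback and rounds < max_rounds:
        script = revise(script, feedback)
        feedback = collect(script, checks)
        rounds += 1
    return script, feedback


def code_check(script: str) -> List[str]:
    """Code feedback: does the script at least compile?"""
    try:
        compile(script, "<candidate>", "exec")
    except SyntaxError as exc:
        return [f"syntax error: {exc.msg}"]
    return []


def make_numeric_check(expected: List[int]) -> Check:
    """Numeric feedback: does the script define the expected `values` list?"""
    def check(script: str) -> List[str]:
        scope: dict = {}
        try:
            exec(script, scope)  # a real system should sandbox this call
        except Exception:
            return []  # left to code_check to report
        if scope.get("values") != expected:
            return [f"values should be {expected}, got {scope.get('values')}"]
        return []
    return check


def toy_revise(script: str, feedback: List[str]) -> str:
    """Stand-in for an LLM reviser: applies two hard-coded fixes."""
    if any(msg.startswith("syntax error") for msg in feedback):
        return script.replace("30", "30]")
    return "values = [10, 20, 35]\n"


checks = [code_check, make_numeric_check([10, 20, 35])]
final, remaining = refine("values = [10, 20, 30\n", checks, toy_revise)
```

Returning the leftover feedback together with the script lets a caller tell a converged edit from one that ran out of rounds.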
Key Findings
PlotEdit produces more faithful edited charts than prior methods on ChartCraft.
Multimodal feedback improves edit quality.
Agentic orchestration beats single-shot in-context LLM prompting.
Results
Overall SSIM: 89.0 (PlotEdit) vs 82.4 (a strong baseline)
Layout SSIM
Data-centric SSIM
Ablation: overall SSIM without multimodal feedback
Who Should Care
What To Try In 7 Days
Run PlotEdit on 20 representative PDF charts to measure replot fidelity versus your current process.
Prototype a pipeline: use Chart2Table + Chart2Code to extract tables and generate Python plots, then apply visual feedback checks.
Test accessibility use cases: adjust color/contrast and re-evaluate readability for low-vision settings.
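For the accessibility test above, one possible readability check is the WCAG contrast ratio between each series colour and the chart background. The sketch below is an assumption about how such a check could look, not part of PlotEdit: the function names, the example palette, and the 3:1 minimum (the WCAG 2.1 guideline for graphical objects) are choices made here for illustration.

```python
from typing import Dict, List


def _linear(channel: int) -> float:
    """Convert an 8-bit sRGB channel to linear light (WCAG 2.x definition)."""
    c = channel / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_color: str) -> float:
    """Relative luminance of a colour given as '#rrggbb'."""
    h = hex_color.lstrip("#")
    r, g, b = (int(h[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _linear(r) + 0.7152 * _linear(g) + 0.0722 * _linear(b)


def contrast_ratio(color_a: str, color_b: str) -> float:
    """WCAG contrast ratio, from 1 (identical) to 21 (black on white)."""
    la, lb = relative_luminance(color_a), relative_luminance(color_b)
    lighter, darker = max(la, lb), min(la, lb)
    return (lighter + 0.05) / (darker + 0.05)


def low_contrast_series(palette: Dict[str, str], background: str = "#ffffff",
                        minimum: float = 3.0) -> List[str]:
    """Names of series whose colour falls below the minimum contrast."""
    return [name for name, color in palette.items()
            if contrast_ratio(color, background) < minimum]


# Hypothetical palette extracted from a chart's style attributes.
palette = {"revenue": "#1f77b4", "cost": "#ffff00"}
flagged = low_contrast_series(palette)
```

Series returned by `low_contrast_series` are candidates for a recolouring instruction before the chart is re-evaluated.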
Agent Features
Memory
- short-term iterative feedback state
Planning
- instruction decomposition
- self-reflection loops
Tool Use
- code execution and dynamic checks
- AST static analysis
- image similarity (MS-SSIM)
- pandas for data edits
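As an illustration of the AST static analysis listed above, the sketch below parses a generated plotting script with Python's standard `ast` module and flags imports and calls that are not expected. The allow-list, the ban-list, and the name `static_check` are assumptions made for this example; the source does not say which rules PlotEdit applies.

```python
import ast
from typing import List

# Assumed policy for this example only.
ALLOWED_MODULES = {"matplotlib", "numpy", "pandas"}
BANNED_CALLS = {"eval", "exec", "open", "__import__"}


def static_check(source: str) -> List[str]:
    """Return problems found without executing the script."""
    try:
        tree = ast.parse(source)
    except SyntaxError as exc:
        return [f"line {exc.lineno}: syntax error: {exc.msg}"]

    problems: List[str] = []
    for node in ast.walk(tree):
        # Collect module names from both import forms.
        if isinstance(node, ast.Import):
            modules = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            modules = [node.module or "<relative import>"]
        else:
            modules = []
        for name in modules:
            if name.split(".")[0] not in ALLOWED_MODULES:
                problems.append(
                    f"line {node.lineno}: import of {name} is not on the allow-list")
        # Flag direct calls to banned built-ins such as eval(...).
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Name)
                and node.func.id in BANNED_CALLS):
            problems.append(
                f"line {node.lineno}: call to {node.func.id}() is not permitted")
    return problems


candidate = (
    "import matplotlib.pyplot as plt\n"
    "import os\n"
    "plt.bar(['a', 'b'], [1, 2])\n"
    "eval('1 + 1')\n"
)
problems = static_check(candidate)
```

A static pass like this is cheap to run before code execution, and its messages can be returned to the code-generating agent as feedback.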
Frameworks
- chain-of-thought prompting
- few-shot in-context learning
Is Agentic
true
Architectures
- multi-agent LLM orchestration
Collaboration
- sequential agent orchestration
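Sequential orchestration of the five agents named under Main Contribution can be sketched as a list of functions that pass a shared state along. Everything here is hypothetical: the `ChartState` layout, the stub agents, the placeholder values they write, and the assumption that the agents run in the order they are listed. Each stub stands in for an LLM call.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class ChartState:
    """Shared state handed from one agent to the next (assumed layout)."""
    image_path: str
    instruction: str
    table: List[Dict[str, float]] = field(default_factory=list)
    style: Dict[str, str] = field(default_factory=dict)
    code: str = ""
    steps: List[str] = field(default_factory=list)
    trace: List[str] = field(default_factory=list)


Agent = Callable[[ChartState], ChartState]


def chart2table(state: ChartState) -> ChartState:
    state.table = [{"x": 0.0, "y": 0.0}]  # placeholder, not real extraction
    return state


def chart2vision(state: ChartState) -> ChartState:
    state.style = {"bar_color": "placeholder"}  # placeholder style attribute
    return state


def chart2code(state: ChartState) -> ChartState:
    state.code = "# placeholder plotting script\n"
    return state


def decompose(state: ChartState) -> ChartState:
    # Naive split on " and "; a real decomposer would be model-driven.
    state.steps = [part.strip() for part in state.instruction.split(" and ")]
    return state


def edit(state: ChartState) -> ChartState:
    # Record each step as a comment instead of performing a real edit.
    for step in state.steps:
        state.code += f"# TODO apply: {step}\n"
    return state


PIPELINE: List[Tuple[str, Agent]] = [
    ("Chart2Table", chart2table),
    ("Chart2Vision", chart2vision),
    ("Chart2Code", chart2code),
    ("Instruction Decomposition", decompose),
    ("Multimodal Editing", edit),
]


def run_pipeline(state: ChartState,
                 pipeline: List[Tuple[str, Agent]] = PIPELINE) -> ChartState:
    """Run the agents in order and log which ones have touched the state."""
    for name, agent in pipeline:
        state = agent(state)
        state.trace.append(name)
    return state


result = run_pipeline(
    ChartState("chart.png", "make the bars blue and add a title"))
```

Keeping the agents behind one common signature makes it straightforward to swap a stub for a real model call or to insert a feedback step between stages.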
Reproducibility
Data Urls
- ChartCraft dataset (used for evaluation; see Table 1)
Data Available
Open Source Status
- unknown
Risks & Boundaries
Limitations
- Relies on large multimodal LLMs (GPT-4V/GPT-4o); access and cost can be barriers.
- Evaluation is on ChartCraft; real-world charts may be more diverse.
- Workflow assumes Python plotting and may not support all charting libraries or bespoke visuals.
When Not To Use
- If you cannot run or pay for multimodal LLMs.
- When charts must stay in a non-Python or proprietary rendering pipeline, since edits are replotted in Python.
- For legally sensitive charts where sending images to external models is not allowed.
Failure Modes
- Poor de-rendering when input images are extremely low-resolution or heavily occluded.
- Hallucinated or incorrect data tables leading to wrong edits.
- Runtime errors in generated code if the execution environment lacks the libraries or versions the code assumes.
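One way to catch the environment-mismatch failure before running generated code is to confirm that the modules it needs can be found. The sketch below uses the standard library's `importlib.util.find_spec`; the helper name `missing_modules` and the `REQUIRED` list are assumptions for illustration, and the check covers only top-level module names, not versions.

```python
import importlib.util
from typing import Iterable, List


def missing_modules(names: Iterable[str]) -> List[str]:
    """Return the top-level module names that cannot be located for import."""
    return [name for name in names
            if importlib.util.find_spec(name) is None]


# Hypothetical requirement list; replace with the libraries the generated
# script actually imports (for example its plotting and data libraries).
REQUIRED = ["json", "csv"]
missing = missing_modules(REQUIRED)
if missing:
    print("Install before running generated code:", ", ".join(missing))
```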
Core Entities
Models
- GPT-4V
- GPT-4o
- ChartReformer
- ChartLLaMA
- In-context Learning (LLM prompts)
- PlotEdit (this work)
Metrics
- SSIM
- MS-SSIM
- RMS (Relative Mapping Similarity)
- VAES (Visual Attribute Edit Score)
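For reference, the standard definition of SSIM between two image windows x and y is given below. The summary does not state how the reported scores were computed, so treating them as this standard form is an assumption; the reported value of 89.0 suggests a 0-100 scale, which is also an assumption.

```latex
\[
\mathrm{SSIM}(x, y) =
  \frac{(2\mu_x \mu_y + C_1)\,(2\sigma_{xy} + C_2)}
       {(\mu_x^2 + \mu_y^2 + C_1)\,(\sigma_x^2 + \sigma_y^2 + C_2)},
\qquad C_1 = (K_1 L)^2, \quad C_2 = (K_2 L)^2
\]
```

Here \mu_x and \mu_y are the window means, \sigma_x^2 and \sigma_y^2 the variances, \sigma_{xy} the covariance, and L the dynamic range of pixel values. K_1 = 0.01 and K_2 = 0.03 are the commonly used constants. MS-SSIM combines the same comparison terms across several image scales.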
Datasets
- ChartCraft
Benchmarks
- ChartCraft evaluation (style/layout/format/data edits)

