Agentic copilot that converts natural language into P&ID DEXPI XML and Visio drawings

December 17, 2024 · 7 min

Overview

Production Readiness

0.4

Novelty Score

0.6

Cost Impact Score

0.6

Citation Count

2

Authors

Shreeyash Gowaikar, Srinivasan Iyengar, Sameer Segal, Shivkumar Kalyanaraman


Why It Matters For Business

Automating P&ID creation cuts manual drafting time and improves auditability by producing interoperable DEXPI XML and editable Visio drafts.

Summary TLDR

The authors build ACPID, an "agentic" copilot that turns plain-English descriptions of piping systems into machine-readable DEXPI XML and draft Visio diagrams. The system uses a Plan-and-Execute LLM workflow to emit a compact DSL, deterministically translates that DSL to DEXPI Proteus XML, and renders visuals via the Visio API. On a small DEXPI-based test bench, the copilot reaches 96.96% soundness (recall of prompted elements) and 92.97% syntactic completeness, versus 65.90% and 68.28% for few-shot GPT-4-Turbo and 58.33% and 0% for zero-shot. Limitations: a small public test set, the need for careful prompt design, and higher inference time than single-shot methods.

Problem Statement

Creating P&ID diagrams is manual, slow, error-prone, and hard to audit. Prior ML work digitizes existing diagrams but does not generate interoperable P&ID machine formats directly from natural-language requests. The paper aims to automate subsystem-level P&ID creation from text while producing editable, interoperable DEXPI XML and draft Visio diagrams.

Main Contribution

ACPID copilot: an agentic, multi-step Plan-and-Execute system that converts natural language to a DSL and then to DEXPI Proteus XML.

Deterministic rule-based translator from the DSL to DEXPI XML plus a Visual Diagram Generator that renders draft Visio (.vsdx) outputs.

Evaluation on DEXPI example files showing large gains in soundness and completeness versus zero-shot and few-shot GPT-4-Turbo.

Design for subsystem-level, iterative authoring with a human-in-the-loop editing step to improve provenance and correctness.
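The deterministic DSL-to-XML step listed above can be pictured with a minimal sketch. The DSL line format (`add`, `connect`), the `RULES` table, and the XML element and attribute names are all hypothetical, chosen only to look Proteus-like; they are not the paper's DSL and not the actual DEXPI schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical rule table: DSL kind -> (XML tag, ComponentClass value).
# Illustrative only; not taken from the paper or the DEXPI schema.
RULES = {
    "pump": ("Equipment", "CentrifugalPump"),
    "tank": ("Equipment", "Tank"),
    "valve": ("PipingComponent", "GateValve"),
}

def translate(dsl: str) -> ET.Element:
    """Deterministically map each DSL line to XML; unknown input raises."""
    root = ET.Element("PlantModel")
    ids = set()
    for raw in dsl.strip().splitlines():
        parts = raw.split()
        if not parts:
            continue  # skip blank lines
        if parts[0] == "add" and len(parts) == 3:
            _, kind, tag_id = parts
            if kind not in RULES:
                raise ValueError(f"no rule for kind {kind!r}")
            xml_tag, component_class = RULES[kind]
            ET.SubElement(root, xml_tag, ID=tag_id, ComponentClass=component_class)
            ids.add(tag_id)
        elif parts[0] == "connect" and len(parts) == 3:
            _, src, dst = parts
            if src not in ids or dst not in ids:
                raise ValueError(f"connect references unknown ID in {raw!r}")
            ET.SubElement(root, "Connection", FromID=src, ToID=dst)
        else:
            raise ValueError(f"unrecognized DSL line: {raw!r}")
    return root

dsl_text = """
add tank T-101
add pump P-101
connect T-101 P-101
"""
print(ET.tostring(translate(dsl_text), encoding="unicode"))
```

Because the same DSL always yields the same XML and anything outside the rule table is rejected, variability is confined to the LLM step that writes the DSL.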

Key Findings

ACPID achieves much higher soundness than single-pass GPT-4-Turbo.

Numbers: ACPID 96.96% vs. Zero-shot 58.33% and Few-shot 65.90%

ACPID produces substantially more syntactically complete DEXPI XML.

Numbers: ACPID 92.97% completeness vs. Few-shot 68.28% and Zero-shot 0%

Evaluation is small and constrained by public data availability.

Numbers: Soundness tested on ~132 artifacts; completeness on 555 sections

Results

Soundness (proportion of prompted elements present)

Value: 96.96%

Baseline: Zero-shot 58.33%; Few-shot 65.90%

Completeness (DEXPI XML syntactic completeness)

Value: 92.97%

Baseline: Zero-shot 0%; Few-shot 68.28%

Evaluation scale (soundness)

Value: 132 artifacts evaluated

Evaluation scale (completeness)

Value: 555 sections evaluated
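Soundness is described above as the proportion of prompted elements present in the output. A minimal sketch of that kind of recall-style score follows; the function name and the toy element tags are assumptions for illustration, and the paper's exact matching procedure may differ.

```python
from typing import Iterable

def proportion_present(expected: Iterable[str], produced: Iterable[str]) -> float:
    """Fraction of expected items that appear in the produced output."""
    expected_set = set(expected)
    if not expected_set:
        raise ValueError("expected must not be empty")
    return len(expected_set & set(produced)) / len(expected_set)

# Toy data, not from the paper's test bench.
prompted = ["T-101", "P-101", "V-101", "V-102"]
generated = ["T-101", "P-101", "V-101", "HX-9"]  # HX-9 was never requested

score = proportion_present(prompted, generated)
print(f"soundness = {score:.2%}")  # 3 of 4 prompted elements are present
```

A score of this form does not penalize extra elements such as `HX-9`, so hallucinated components would need a separate check.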

Who Should Care

What To Try In 7 Days

Run ACPID on a couple of small subsystem descriptions to compare time-to-draft vs manual creation.

Convert existing simple P&ID text descriptions into DEXPI XML and open results in Visio for quick edits.

Use the DSL+rule translator idea to add deterministic checks to your diagram export pipeline.
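As one way to start on the deterministic-checks idea, the sketch below validates an exported file before it moves downstream. It checks only well-formedness, duplicate IDs, and dangling connection references. The `ID`, `FromID`, and `ToID` attributes and the `Connection` tag are hypothetical placeholders; substitute the names your export format actually uses.

```python
import xml.etree.ElementTree as ET
from typing import List

def check_export(xml_text: str) -> List[str]:
    """Return a list of problems; an empty list means all checks passed."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        return [f"not well-formed XML: {exc}"]

    problems: List[str] = []
    seen = set()
    # Check 1: every ID attribute must be unique.
    for el in root.iter():
        el_id = el.get("ID")
        if el_id is None:
            continue
        if el_id in seen:
            problems.append(f"duplicate ID {el_id!r}")
        seen.add(el_id)
    # Check 2: every connection endpoint must reference a known ID.
    for conn in root.iter("Connection"):
        for attr in ("FromID", "ToID"):
            ref = conn.get(attr)
            if ref not in seen:
                problems.append(f"Connection {attr}={ref!r} has no matching element")
    return problems

sample = (
    '<PlantModel>'
    '<Equipment ID="T-101"/>'
    '<Connection FromID="T-101" ToID="P-999"/>'
    '</PlantModel>'
)
for problem in check_export(sample):
    print(problem)
```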

Agent Features

Memory

  • Short-term context via appending prior executed steps

Planning

  • LLM-generated execution plans (plan step list)

Tool Use

  • Deterministic DSL→DEXPI translation
  • Microsoft Visio C# API for rendering

Frameworks

  • PwR (Programming with Representation)

Is Agentic

true

Architectures

  • Plan-and-Execute agents

Collaboration

  • Human-in-the-loop editing and validation
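The Planning, Memory, and Architectures entries above can be tied together in a minimal Plan-and-Execute loop. This is a sketch under stated assumptions: `fake_planner` and `fake_executor` are hypothetical stand-ins for LLM calls, and the step strings are invented; the paper's prompts and PwR integration are not reproduced here.

```python
from typing import Callable, List

def plan_and_execute(
    request: str,
    planner: Callable[[str], List[str]],
    executor: Callable[[str, List[str]], str],
) -> List[str]:
    """Plan once, then run each step with prior results as context."""
    steps = planner(request)
    history: List[str] = []  # short-term memory of executed steps
    for step in steps:
        # Pass a copy so the executor cannot mutate the memory.
        result = executor(step, list(history))
        history.append(f"{step} -> {result}")
    return history

# Hypothetical stand-ins; a real system would prompt a model here.
def fake_planner(request: str) -> List[str]:
    return ["add tank T-101", "add pump P-101", "connect T-101 P-101"]

def fake_executor(step: str, history: List[str]) -> str:
    return f"ok (saw {len(history)} prior steps)"

for line in plan_and_execute("Tank feeding a pump", fake_planner, fake_executor):
    print(line)
```

A human-in-the-loop checkpoint would fit naturally between planning and execution, or after each step, by letting a reviewer edit `steps` or `history` before the loop continues.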

Optimization Features

Token Efficiency

  • Agent edits reduce need to resend whole XML as context

Infra Optimization

  • Not addressed; paper notes higher inference time as trade-off

System Optimization

  • Rule-based deterministic translation to reduce LLM variability

Inference Optimization

  • Partial token savings by editing XML directly instead of sending full XML context
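The token-saving idea above, editing the XML directly instead of resending it, can be illustrated with a small in-place edit. The edit format (`id`, `attr`, `value`) and the element names are hypothetical, and character counts are used only as a rough proxy for tokens.

```python
import json
import xml.etree.ElementTree as ET

def apply_edit(root: ET.Element, edit: dict) -> None:
    """Apply one attribute edit in place; raise if the target ID is missing."""
    for el in root.iter():
        if el.get("ID") == edit["id"]:
            el.set(edit["attr"], edit["value"])
            return
    raise KeyError(f"no element with ID {edit['id']!r}")

xml_text = (
    '<PlantModel>'
    '<Equipment ID="T-101" ComponentClass="Tank"/>'
    '<Equipment ID="P-101" ComponentClass="CentrifugalPump"/>'
    '</PlantModel>'
)
root = ET.fromstring(xml_text)

# The model only has to emit this small edit, not the whole document.
edit = {"id": "P-101", "attr": "ComponentClass", "value": "GearPump"}
apply_edit(root, edit)

# Character counts are only a rough stand-in for token counts.
print(len(json.dumps(edit)), "chars in the edit vs", len(xml_text), "chars of full XML")
```

The gap between the two counts grows with document size, since the edit stays roughly constant while the full XML does not.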

Reproducibility

Data Available

Open Source Status

  • partial

Risks & Boundaries

Limitations

  • Evaluation limited to small public DEXPI examples; proprietary plant data not tested.
  • Rules-driven translation requires careful prompt design and can be rigid.
  • Higher inference time than single-shot generation methods.
  • Focused on subsystem-level generation; full-plant automation not demonstrated.
  • No open release of code or full implementation details in paper.

When Not To Use

  • When you need end-to-end full-plant diagrams in one shot without iterative steps.
  • When low-latency, real-time diagram generation is required.
  • If your organization uses a non-DEXPI P&ID standard without an easy mapping.

Failure Modes

  • Missing or mis-linked connections between elements due to LLM planning errors.
  • Incorrect or incomplete XML attributes despite element presence.
  • Rigid rule mapping causing syntax-correct but semantically wrong XML.
  • Hallucinated components not present in the prompt.

Core Entities

Models

  • GPT-4-Turbo

Metrics

  • soundness
  • completeness

Datasets

  • DEXPI example P&IDs (DEXPI Consortium examples)

Context Entities

Datasets

  • DEXPI P&ID Specification 1.3