Overview
The method is experimentally validated across four standard ABSA datasets with multiple baselines, but it depends on a closed LLM API and on aspect-extraction quality.
Citations: 1
Evidence Strength: 0.80
Confidence: 0.80
Risk Signals: 11
Trust Signals
Findings with numeric evidence: 5/5
Findings with evidence refs: 5/5
Results with explicit delta: 6/6
Reproducibility
Status: No open assets linked
Open source: Unknown
At A Glance
Cost impact: 65%
Production readiness: 60%
Novelty: 45%
Why It Matters For Business
IDG can produce usable labeled ABSA data from unlabeled text, lowering annotation cost and quickly bootstrapping sentiment models in new domains.
Who Should Care
Summary TLDR
The paper presents IDG, a three-stage pipeline that uses an LLM (GPT‑3.5‑turbo) to extract domain aspects from unlabeled text, expand them, generate single- and multi-aspect sentence-aspect-polarity triplets via iterative prompting, and filter the outputs with an LLM-based discriminator. On four SemEval ABSA benchmarks, synthetic data from IDG matches or improves the performance of five baseline ABSA models. Key wins: training on generated data alone often approaches training on manual labels; mixing generated and original data yields consistent gains (up to +4.01% F1); and both the discriminator and multi-aspect generation materially help. The method requires access to an LLM, and careful aspect extraction and filtering.
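The three stages described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the prompts, the function names, and the `LLM` callable type are all assumptions, aspect expansion and multi-aspect generation are omitted, and `stub_llm` stands in for a real model so the sketch runs offline.

```python
from typing import Callable, List, Tuple

Triplet = Tuple[str, str, str]  # (sentence, aspect, polarity)
LLM = Callable[[str], str]      # hypothetical interface: prompt in, raw text out


def extract_aspects(llm: LLM, corpus: List[str]) -> List[str]:
    """Stage 1: ask for aspect terms (one per line) and deduplicate them."""
    seen, aspects = set(), []
    for text in corpus:
        for line in llm(f"List the aspect terms in: {text}").splitlines():
            term = line.strip().lower()
            if term and term not in seen:
                seen.add(term)
                aspects.append(term)
    return aspects


def generate(llm: LLM, aspects: List[str], rounds: int) -> List[Triplet]:
    """Stage 2: iterative generation; recent outputs are fed back into the prompt."""
    out: List[Triplet] = []
    for _ in range(rounds):
        for aspect in aspects:
            for polarity in ("positive", "negative", "neutral"):
                history = " | ".join(s for s, _, _ in out[-5:])
                prompt = f"Write a {polarity} sentence about '{aspect}'. Avoid: {history}"
                out.append((llm(prompt).strip(), aspect, polarity))
    return out


def keep(judge: LLM, triplets: List[Triplet]) -> List[Triplet]:
    """Stage 3: keep a triplet only if the judge's reply starts with 'yes'."""
    kept = []
    for sentence, aspect, polarity in triplets:
        verdict = judge(f"Is '{sentence}' {polarity} toward '{aspect}'? Answer yes or no.")
        if verdict.strip().lower().startswith("yes"):
            kept.append((sentence, aspect, polarity))
    return kept


def stub_llm(prompt: str) -> str:
    """Offline stand-in for a real LLM call (assumption for illustration only)."""
    if prompt.startswith("List"):
        return "battery\nScreen\nbattery"
    if prompt.startswith("Is"):
        return "yes" if "positive" in prompt else "no"
    return "Generated sentence."


aspects = extract_aspects(stub_llm, ["The battery and screen are fine."])
candidates = generate(stub_llm, aspects, rounds=1)
filtered = keep(stub_llm, candidates)
```

Passing the model as a plain callable keeps the sketch independent of any particular API client; a real run would replace `stub_llm` with a wrapper around whichever LLM is available.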
Problem Statement
Aspect-based sentiment models need many labeled sentence–aspect–polarity examples, but manual annotation is expensive. Existing augmentation methods either tweak individual words or paraphrase whole sentences, and they still suffer from poor fluency, low diversity, or a dependence on labeled seed data. Prompting LLMs directly is promising but leads to hallucinations and low-quality pseudo-labels. The goal is to produce diverse, fluent, high-quality ABSA training data from an unlabeled corpus using LLMs while controlling hallucination.
Main Contribution
IDG: a three-stage, iterative LLM pipeline (aspect extraction/extension, iterative generation, LLM-based evaluation/filtering) to produce pseudo-labeled ABSA data from unlabeled text.
A self-reflection discriminator that uses the LLM as a judge plus automatic scoring to remove low-quality outputs.
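One way to picture a discriminator that combines an LLM judge with an automatic check is the sketch below. The summary does not specify the scoring rule, so everything here is an assumption: the 1-10 rating prompt, the `min_score` threshold, and the simple "aspect term must appear in the sentence" check are hypothetical stand-ins.

```python
import re
from typing import Callable, Optional


def parse_score(reply: str) -> Optional[int]:
    """Pull the first standalone integer in 1..10 out of a free-form judge reply."""
    match = re.search(r"\b(10|[1-9])\b", reply)
    return int(match.group(1)) if match else None


def accept(sentence: str, aspect: str, polarity: str,
           judge: Callable[[str], str], min_score: int = 7) -> bool:
    """Accept a triplet only if it passes both the automatic check and the judge."""
    # Automatic check (assumption): the aspect term must occur in the sentence.
    if aspect.lower() not in sentence.lower():
        return False
    reply = judge(
        f"Rate from 1 to 10 how well this sentence expresses {polarity} "
        f"sentiment toward '{aspect}': {sentence}"
    )
    score = parse_score(reply)
    # An unparseable reply counts as a rejection rather than a pass.
    return score is not None and score >= min_score
```

Treating an unparseable judge reply as a rejection is a conservative choice: it trades some recall for fewer noisy samples reaching training.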
Key Findings
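IDG-generated data can match or exceed manually labeled training data for ABSA models; for example, R-GAT on Laptop14 reaches 80.25 accuracy and 76.18 F1 against baselines of 78.37 and 73.92 (Table IV).
Mixing IDG synthetic data with the original labeled data consistently improves the baseline models, by up to +4.01% F1.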
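Mixing synthetic and original examples can be as simple as the sketch below. The `ratio` parameter and the sampling strategy are assumptions made for illustration; the summary does not state how the authors combined the two sets.

```python
import random
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def mix(original: Sequence[T], generated: Sequence[T],
        ratio: float = 1.0, seed: int = 0) -> List[T]:
    """Return all original examples plus up to ratio * len(original) generated ones, shuffled."""
    rng = random.Random(seed)  # fixed seed so the mix is reproducible
    k = min(len(generated), round(ratio * len(original)))
    mixed = list(original) + rng.sample(list(generated), k)
    rng.shuffle(mixed)
    return mixed
```

For example, `mix(labeled, synthetic, ratio=1.0)` yields a training set about twice the size of `labeled`, provided enough synthetic examples are available.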
Results
| Metric | Value | Baseline | Delta | Split / Dataset | Evidence | Evidence Ref |
|---|---|---|---|---|---|---|
| R-GAT Accuracy | 80.25 | 78.37 (R-GAT base) | +1.88 | Laptop14 (Generated data) | Table IV: R-GAT + IDG Acc 80.25 vs baseline 78.37 | Table IV |
| R-GAT F1 | 76.18 | 73.92 (R-GAT base) | +2.26 | Laptop14 (Generated data) | Table IV: R-GAT + IDG F1 76.18 vs baseline 73.92 | Table IV |
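The reported deltas can be checked directly from the table values. The snippet below is only an arithmetic sanity check; it uses `Decimal` so that the subtraction is exact and not subject to floating-point rounding.

```python
from decimal import Decimal

# (metric, value, baseline, reported delta), copied from the table above.
rows = [
    ("R-GAT accuracy", "80.25", "78.37", "+1.88"),
    ("R-GAT F1", "76.18", "73.92", "+2.26"),
]


def delta(value: str, baseline: str) -> Decimal:
    """Exact difference between a result and its baseline."""
    return Decimal(value) - Decimal(baseline)


for name, value, baseline, reported in rows:
    assert delta(value, baseline) == Decimal(reported), name
```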
What To Try In 7 Days
Run IDG on your domain's unlabeled corpus to generate synthetic data at roughly 1× the size of your labeled training set, then train a BERT-based ABSA model on it.
Enable few-shot examples for aspect extraction to raise aspect F1 quickly.
Include the discriminator (LLM-as-judge + score threshold) before training to avoid noisy samples harming performance.
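For the few-shot aspect-extraction step in the list above, a prompt could be assembled along these lines. The prompt wording, the comma-separated reply format, and both function names are assumptions, not the prompts used in the paper.

```python
from typing import List, Sequence, Tuple


def build_aspect_prompt(examples: Sequence[Tuple[str, List[str]]], text: str) -> str:
    """Build a few-shot prompt: worked examples first, then the sentence to label."""
    lines = ["Extract the aspect terms from each review sentence.", ""]
    for sentence, aspects in examples:
        lines.append(f"Sentence: {sentence}")
        lines.append(f"Aspects: {', '.join(aspects)}")
        lines.append("")
    lines.append(f"Sentence: {text}")
    lines.append("Aspects:")  # left open for the model to complete
    return "\n".join(lines)


def parse_aspects(reply: str) -> List[str]:
    """Split a comma-separated reply into normalized aspect terms."""
    return [a.strip().lower() for a in reply.split(",") if a.strip()]


shots = [
    ("The battery life is great but the screen is dim.", ["battery life", "screen"]),
    ("Keyboard feels cheap.", ["keyboard"]),
]
prompt = build_aspect_prompt(shots, "The trackpad is responsive.")
```

Ending the prompt on an open `Aspects:` line nudges the model to answer in the same format as the examples, which keeps `parse_aspects` simple.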
Risks & Boundaries
Limitations
Requires access to a high-quality LLM (authors use GPT‑3.5‑turbo); API cost and privacy may limit adoption.
Performance depends on accuracy of extracted aspects; gold aspects give a clear upper bound.
When Not To Use
When you already have ample, high-quality labeled ABSA data; manual labels may be better.
When LLM use is disallowed for privacy or compliance reasons.
Failure Modes
LLM hallucination produces wrong aspect–polarity pairs that degrade training if not filtered.
Without iterative feedback, outputs become repetitive and low in diversity, which reduces model gains.
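A quick way to spot repetitive generations is a distinct-n ratio: the share of unique n-grams among all n-grams in the generated set. This is a commonly used diversity heuristic swapped in for illustration, not a metric the summary attributes to the paper, and the whitespace tokenization is a simplifying assumption.

```python
from typing import Iterable


def distinct_n(sentences: Iterable[str], n: int = 2) -> float:
    """Unique n-grams divided by total n-grams; values near 0 signal heavy repetition."""
    total = 0
    unique = set()
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i in range(len(tokens) - n + 1):
            unique.add(tuple(tokens[i:i + n]))
            total += 1
    return len(unique) / total if total else 0.0
```

Tracking this ratio across generation rounds gives an early warning: if it drops sharply, the generator is probably recycling the same phrasing.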

