Build query-specific evidence graphs on the fly to fix missing links and filter distractor facts

January 12, 2026 | 7 min

Overview

Decision Snapshot: Ready For Pilot

Results show consistent gains on five benchmarks, ablations confirm the role of each component, and robustness tests show that performance holds up under heavy KG sparsity.

Citations: 0

Evidence Strength: 0.80

Confidence: 0.90

Risk Signals: 10

Trust Signals

Findings with numeric evidence: 4/4

Findings with evidence refs: 4/4

Results with explicit delta: 3/3

Reproducibility

Status: Code + data available

Open source: Partial

At A Glance

Cost impact: 60%

Production readiness: 60%

Novelty: 70%

Authors

Manzong Huang, Chenyang Bu, Yi He, Xingrui Zhuo, Xindong Wu

Why It Matters For Business

Relink reduces multi-hop QA errors and increases robustness by building only the facts a query needs, cutting wrong reasoning and making answers easier to verify.

Summary TLDR

Relink replaces the usual build-then-reason pipeline over a static knowledge graph with a "reason-and-construct" flow that builds a compact, query-specific evidence graph. It combines a high-precision KG backbone with a high-recall pool of latent relations derived from entity co-occurrence and PMI. A query-driven ranker (a coarse trainable ranker followed by an LLM re-ranker) iteratively selects edges; when needed, an LLM instantiates latent relations into factual triples. On five multi-hop QA benchmarks, Relink improves average EM by 5.4% and F1 by 5.2% over strong GraphRAG baselines and stays robust when most KG edges are removed.
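As a rough illustration of the latent relation pool, the sketch below scores entity pairs by passage-level PMI. It assumes entity linking has already produced one entity set per passage; the function name `build_latent_pool`, the `min_pmi` threshold, and the toy passages are hypothetical and not taken from the paper or its code.

```python
import math
from collections import Counter
from itertools import combinations


def build_latent_pool(passages, min_pmi=0.0):
    """Return {(entity_a, entity_b): pmi} for entity pairs that co-occur in a passage.

    `passages` is a list of entity sets, one per passage (assumed input format).
    """
    n = len(passages)
    entity_count = Counter()
    pair_count = Counter()
    for entities in passages:
        unique = sorted(set(entities))
        entity_count.update(unique)
        pair_count.update(combinations(unique, 2))

    pool = {}
    for (a, b), c_ab in pair_count.items():
        # PMI = log( p(a, b) / (p(a) * p(b)) ) using passage-level frequencies.
        pmi = math.log((c_ab * n) / (entity_count[a] * entity_count[b]))
        if pmi > min_pmi:  # keep only pairs that co-occur more than chance
            pool[(a, b)] = pmi
    return pool


# Toy corpus, for illustration only.
passages = [
    {"Marie Curie", "Pierre Curie", "Paris"},
    {"Marie Curie", "Pierre Curie"},
    {"Paris", "France"},
    {"France", "Europe"},
]
pool = build_latent_pool(passages)
for pair, pmi in sorted(pool.items()):
    print(pair, round(pmi, 3))
```

Pairs in the pool are untyped candidates, not facts: they only say two entities appear together more often than chance, which is why a later step is needed to turn a selected pair into a concrete triple.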

Problem Statement

GraphRAG methods rely on a static, pre-built knowledge graph. Static KGs are often incomplete, and they contain many facts that are related to the query but misleading. Missing links break multi-hop reasoning chains, and distractor facts get amplified during retrieval, so systems need a way to dynamically repair missing links and filter out misleading KG facts.

Main Contribution

Diagnose the limits of the build-then-reason paradigm: KG incompleteness and distractor facts break GraphRAG reasoning.

Propose Relink: a reason-and-construct framework that dynamically builds a compact, query-specific evidence graph from a factual KG plus a latent relation pool.
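The reason-and-construct idea can be sketched as a greedy, hop-by-hop loop. This is a simplified stand-in, not the authors' algorithm: `score` is a hypothetical placeholder for the coarse ranker plus LLM re-ranker, `instantiate` is a hypothetical placeholder for the LLM call that turns a latent pair into a typed relation, and latent pairs are treated as directed for brevity.

```python
def build_evidence_graph(seeds, kg_edges, latent_pairs, score, instantiate,
                         max_hops=2, top_k=1):
    """Grow a small query-specific evidence graph from the seed entities.

    kg_edges:     list of (head, relation, tail) factual triples.
    latent_pairs: list of (head, tail) untyped candidate links.
    score:        candidate -> float, hypothetical query-aware ranker.
    instantiate:  (head, tail) -> relation or None, hypothetical LLM stand-in.
    """
    frontier = set(seeds)
    evidence, seen = [], set()
    for _ in range(max_hops):
        # Candidates come from both the factual backbone and the latent pool.
        candidates = [(h, r, t) for h, r, t in kg_edges if h in frontier]
        candidates += [(h, None, t) for h, t in latent_pairs if h in frontier]
        candidates = [c for c in candidates if c not in seen]
        ranked = sorted(candidates, key=score, reverse=True)[:top_k]

        next_frontier = set()
        for h, r, t in ranked:
            seen.add((h, r, t))
            if r is None:  # latent link: ask the (stand-in) LLM for a relation
                r = instantiate(h, t)
                if r is None:
                    continue
            evidence.append((h, r, t))
            next_frontier.add(t)
        if not next_frontier:
            break
        frontier = next_frontier
    return evidence


# Toy query: "Where was Marie Curie's spouse born?"
kg = [
    ("Marie Curie", "born_in", "Warsaw"),   # related to the query but a distractor
    ("Pierre Curie", "born_in", "Paris"),
]
latent = [("Marie Curie", "Pierre Curie")]  # link missing from the KG


def toy_score(candidate):
    # Hand-written stand-in for a query-aware ranker.
    return 1.0 if candidate[2] in {"Pierre Curie", "Paris"} else 0.0


def toy_instantiate(head, tail):
    # Hand-written stand-in for an LLM call.
    return "spouse" if (head, tail) == ("Marie Curie", "Pierre Curie") else None


evidence = build_evidence_graph({"Marie Curie"}, kg, latent, toy_score, toy_instantiate)
print(evidence)
```

In this toy run the missing spouse link is repaired from the latent pool, and the Warsaw fact never enters the evidence graph because it is ranked below the selected edge.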

Key Findings

Relink yields consistent accuracy gains over leading GraphRAG baselines on five multi-hop QA datasets.

Numbers: avg +5.4% EM; avg +5.2% F1 across five benchmarks

Practical Use: Switching to dynamic, query-driven graph construction can raise multi-hop QA accuracy noticeably in practice.

Evidence Ref: Abstract; Experiments; Table 1

On 2WikiMultiHopQA Relink achieves EM=0.628 and F1=0.722.

Numbers: 2WikiMultiHopQA EM 0.628, F1 0.722

Practical Use: Expect strong per-dataset gains for structured multi-hop queries when using Relink-like pipelines.

Evidence Ref: Table 1

Results

| Metric | Value | Baseline | Delta | Split / Dataset | Evidence | Evidence Ref |
| --- | --- | --- | --- | --- | --- | --- |
| EM (2WikiMultiHopQA) | 0.628 | HippoRAG 0.578 | +0.050 | 2WikiMultiHopQA test (500 samples) | Table 1 shows Relink EM 0.628 vs HippoRAG 0.578 | Table 1 |
| EM (HotpotQA) | 0.558 | HippoRAG 0.498 | +0.060 | HotpotQA test (500 samples) | Table 1 shows Relink EM 0.558 vs HippoRAG 0.498 | Table 1 |
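For readers reproducing these numbers, the sketch below shows one common way to compute EM and token-level F1 and to check the deltas in the table. The normalization (lowercasing, stripping punctuation and English articles) follows the widely used SQuAD-style convention; that the paper uses exactly this normalization is an assumption, not something stated in this summary.

```python
import re
import string
from collections import Counter


def normalize(text):
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction, gold):
    return float(normalize(prediction) == normalize(gold))


def token_f1(prediction, gold):
    pred_tokens = normalize(prediction).split()
    gold_tokens = normalize(gold).split()
    # Multiset intersection counts each shared token at most as often as it appears in both.
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)


# EM values reported in the table above (Relink vs HippoRAG).
reported = {"2WikiMultiHopQA": (0.628, 0.578), "HotpotQA": (0.558, 0.498)}
deltas = {name: round(relink - base, 3) for name, (relink, base) in reported.items()}
print(deltas)
```

Both metrics are per-question scores; the dataset-level figures are averages over the evaluated questions (500 samples per test split in the table).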

What To Try In 7 Days

Run a Relink-style pipeline on a small QA slice: add a PMI-based latent relation pool built from your corpus.

Train a lightweight coarse ranker to prioritize candidates for few-hop paths and compare EM/F1 to your static KG baseline.

Use an LLM to instantiate top latent relations and inspect provenance for a handful of failing queries.
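To make the provenance inspection in the last step practical, each edge in the evidence graph can carry a tag saying where it came from. The sketch below is one possible record layout, assumed for illustration; the class `EvidenceEdge`, its fields, and the `audit` helper are hypothetical names, not part of Relink's released code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class EvidenceEdge:
    head: str
    relation: str
    tail: str
    source: str                    # "kg" for backbone facts, "llm" for instantiated latent relations
    support: Optional[str] = None  # passage shown to the LLM, if any


def audit(edges):
    """Split an evidence graph into backbone facts, generated facts, and unsupported generated facts."""
    backbone = [e for e in edges if e.source == "kg"]
    generated = [e for e in edges if e.source == "llm"]
    unsupported = [e for e in generated if not e.support]
    return backbone, generated, unsupported


# Toy evidence graph for one failing query (illustrative values only).
edges = [
    EvidenceEdge("Pierre Curie", "born_in", "Paris", source="kg"),
    EvidenceEdge("Marie Curie", "spouse", "Pierre Curie", source="llm",
                 support="Marie Curie married Pierre Curie in 1895."),
    EvidenceEdge("Marie Curie", "colleague", "Paris", source="llm"),
]
backbone, generated, unsupported = audit(edges)
print(len(backbone), len(generated), [e.relation for e in unsupported])
```

Generated edges without a supporting passage are the first ones to check by hand, since they are where hallucinated triples are most likely to hide.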

Reproducibility

Code Available: Yes
Data Available: Yes
Open Source Status: Partial
License: Unknown

Risks & Boundaries

Limitations

Relies on LLM quality to instantiate latent relations; poor LLM outputs can introduce false facts.

Latent relation pool built from co-occurrence + PMI may surface spurious links without semantic filtering.

When Not To Use

When strict, immutable provenance is required and generated relations are unacceptable.

In low-latency or low-cost environments where extra LLM calls are prohibitive.

Failure Modes

LLM-instantiated relations hallucinate plausible but incorrect triples.

Ranker fails to distinguish useful vs. merely related facts, letting distractors through.

Core Entities

Models

deepseek-v3-0324, gpt-4o-2024-07-06, RAPTOR, GraphRAG, HippoRAG, G-Retriever, TOG, Vanilla RAG

Metrics

EM, F1

Datasets

2WikiMultiHopQA, HotpotQA, ConcurrentQA, MuSiQue-Ans, MuSiQue-Full

Benchmarks

2WikiMultiHopQA, HotpotQA, ConcurrentQA, MuSiQue-Ans, MuSiQue-Full

Context Entities

Models

OpenAI text-embedding-3-small

Datasets

2WikiMultiHopQA, HotpotQA