Overview
FiDeLiS is practical for KG-backed QA: it increases answer fidelity and lowers runtime versus agentic baselines, but it still depends on multiple LLM calls per query and on KG coverage. Expect moderate engineering effort to index the KG and tune beam-search parameters.
Citations: 3
Evidence Strength: 0.80
Confidence: 0.90
Risk Signals: 10
Trust Signals
Findings with numeric evidence: 5/5
Findings with evidence refs: 5/5
Results with explicit delta: 5/5
Reproducibility
Status: Code + data available
Open source: Partial
At A Glance
Cost impact: 70%
Production readiness: 60%
Novelty: 60%
Why It Matters For Business
FiDeLiS improves factual QA without model retraining by combining KG retrieval with stepwise logic checks, raising answer accuracy and cutting runtime. This is useful where auditability and verifiable facts matter.
Summary TLDR
FiDeLiS is a training-free framework that grounds LLM answers in verifiable knowledge-graph (KG) reasoning paths. It has two parts. Path-RAG preselects a small set of KG entity/relation candidates using dense embeddings and graph connectivity. Deductive-Verification Beam Search (DVBS) is a beam search guided by LLM planning, with a stepwise deductive check that stops the search once the question is provably answered. Across KGQA benchmarks (WebQSP, CWQ, CR-LT), FiDeLiS raises accuracy versus prior KG+LLM methods and cuts runtime versus agentic baselines, while producing shorter, more verifiable reasoning paths. Code is released.
Problem Statement
LLMs often hallucinate or produce invalid multi-step reasoning. Knowledge graphs can anchor reasoning but prior approaches either (a) retrieve imprecise facts or miss graph structure, or (b) treat LLMs as agents and incur high latency. The field needs a method that is both faithful (verifiable steps) and efficient (limited LLM calls and search). RoG's analysis shows only 67% of generated reasoning steps were valid, leaving room for improvement.
Main Contribution
FiDeLiS: a training-free pipeline that combines Path-RAG (retrieval) and Deductive-Verification Beam Search (DVBS) to ground LLM answers in KG paths.
Path-RAG narrows KG search by combining semantic similarity with one-hop structural connectivity to preselect high-quality candidates.
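The idea of mixing semantic similarity with one-hop connectivity can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the toy graph, the 2-D embeddings, the `alpha` weight, and the scoring rule (pair similarity plus a weighted one-hop look-ahead) are hypothetical and are not taken from the FiDeLiS code or paper.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def score_candidates(query_vec, entity, graph, embed, alpha=0.5, top_k=2):
    """Rank the one-hop (relation, neighbor) pairs leaving `entity`.

    Hypothetical scoring rule (an assumption, not the paper's formula):
      semantic  = best similarity of the relation or the neighbor to the query
      lookahead = best similarity among the neighbor's own outgoing relations
      score     = semantic + alpha * lookahead
    """
    scored = []
    for relation, neighbor in graph.get(entity, []):
        semantic = max(cosine(query_vec, embed[relation]),
                       cosine(query_vec, embed[neighbor]))
        lookahead = max((cosine(query_vec, embed[r]) for r, _ in graph.get(neighbor, [])),
                        default=0.0)
        scored.append((semantic + alpha * lookahead, relation, neighbor))
    scored.sort(key=lambda t: t[0], reverse=True)
    return scored[:top_k]

# Toy KG as adjacency lists: entity -> [(relation, neighbor), ...]
graph = {
    "Obama": [("born_in", "Honolulu"), ("spouse", "Michelle")],
    "Honolulu": [("located_in", "Hawaii")],
    "Michelle": [("born_in", "Chicago")],
}
# Toy 2-D embeddings: axis 0 is roughly "place", axis 1 is roughly "person".
embed = {
    "born_in": [0.9, 0.1], "spouse": [0.1, 0.9], "located_in": [1.0, 0.0],
    "Honolulu": [0.8, 0.2], "Michelle": [0.0, 1.0],
    "Hawaii": [1.0, 0.0], "Chicago": [0.8, 0.2],
}
query_vec = [1.0, 0.0]  # stands in for "Which state was Obama born in?"

ranked = score_candidates(query_vec, "Obama", graph, embed)
for score, relation, neighbor in ranked:
    print(f"{relation} -> {neighbor}: {score:.3f}")
```

In this toy setup the `born_in -> Honolulu` candidate ranks first because it is both similar to the query and leads to a promising next hop. A real deployment would replace the toy vectors with embeddings from a sentence encoder and would need to tune `alpha` and `top_k`.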
Key Findings
FiDeLiS improves top-answer accuracy on WebQSP with strong LLMs: 84.39% Hits@1 with GPT‑4‑turbo versus 81.84% for ToG (+2.55pp).
Path-RAG meaningfully increases retrieval quality vs vanilla retrievers.
Results
| Metric | Value | Baseline | Delta | Split / Dataset | Evidence | Evidence Ref |
|---|---|---|---|---|---|---|
| WebQSP Hits@1 | 84.39% (FiDeLiS, GPT‑4‑turbo) | 81.84% (ToG, GPT‑4‑turbo) | +2.55pp | WebQSP | Table 1: FiDeLiS vs ToG on WebQSP | Table 1 |
| CWQ Hits@1 | 71.47% (FiDeLiS, GPT‑4‑turbo) | 68.51% (ToG, GPT‑4‑turbo) | +2.96pp | CWQ | Table 1: FiDeLiS vs ToG on CWQ | Table 1 |
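
For readers reproducing the table, the following sketch shows one common way to compute Hits@1 and the percentage-point deltas. The `hits_at_1` definition (top-ranked prediction is in the gold answer set) is the usual KGQA convention and is assumed here; the paper's evaluation script may differ. The toy predictions are invented for illustration, while the four percentages are the ones reported in the table.

```python
def hits_at_1(predictions, gold_answers):
    """Fraction of questions whose top-ranked prediction is in the gold answer set.

    predictions:  list of ranked answer lists, one per question
    gold_answers: list of sets of acceptable answers, one per question
    """
    hits = sum(1 for ranked, gold in zip(predictions, gold_answers)
               if ranked and ranked[0] in gold)
    return hits / len(gold_answers)

def delta_pp(system_pct, baseline_pct):
    """Difference between two percentages, in percentage points."""
    return round(system_pct - baseline_pct, 2)

# Toy example (hypothetical data): one hit out of three questions.
toy_predictions = [["Hawaii", "Honolulu"], ["Paris"], []]
toy_gold = [{"Hawaii"}, {"Lyon"}, {"Berlin"}]
toy_score = hits_at_1(toy_predictions, toy_gold)

# Deltas recomputed from the percentages reported in the table.
webqsp_delta = delta_pp(84.39, 81.84)  # 2.55
cwq_delta = delta_pp(71.47, 68.51)     # 2.96
print(webqsp_delta, cwq_delta)
```
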
What To Try In 7 Days
Run FiDeLiS end-to-end on a small KGQA subset to compare Hits@1 and runtime versus your current pipeline.
Swap your retriever for Path‑RAG (embed entities+relations and add one‑hop scoring) to reduce candidate load.
Add a simple deductive-check prompt to stop reasoning when the answer can be logically deduced.
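The third step, stopping as soon as the answer can be deduced, can be prototyped before wiring in a real model. The sketch below is a simplified beam search with a verification hook. It is not the FiDeLiS implementation: `score_step` and `can_deduce` are hypothetical callables that stand in for LLM prompts, the toy graph is invented, and beams are ranked by the latest step's score only.

```python
from typing import Callable, Dict, List, Optional, Tuple

Triple = Tuple[str, str, str]  # (head, relation, tail)
KGPath = List[Triple]

def verified_beam_search(
    question: str,
    start: str,
    graph: Dict[str, List[Tuple[str, str]]],
    score_step: Callable[[str, KGPath, Triple], float],  # stand-in for an LLM planning call
    can_deduce: Callable[[str, KGPath], bool],           # stand-in for an LLM deductive check
    beam_width: int = 2,
    max_depth: int = 3,
) -> Optional[KGPath]:
    """Expand KG paths from `start`; return the first path the verifier accepts."""
    beams: List[KGPath] = [[]]  # the empty path sits at `start`
    for _ in range(max_depth):
        expansions = []
        for path in beams:
            head = path[-1][2] if path else start
            for relation, tail in graph.get(head, []):
                step = (head, relation, tail)
                expansions.append((score_step(question, path, step), path + [step]))
        if not expansions:
            break
        # Keep the best-scoring extensions (simplification: latest step score only).
        expansions.sort(key=lambda t: t[0], reverse=True)
        beams = [p for _, p in expansions[:beam_width]]
        # Deductive check: stop as soon as one path suffices to answer the question.
        for path in beams:
            if can_deduce(question, path):
                return path
    return None

# Toy graph and rule-based stubs; a real system would prompt an LLM in both hooks.
graph = {
    "Obama": [("born_in", "Honolulu"), ("spouse", "Michelle")],
    "Honolulu": [("located_in", "Hawaii")],
    "Michelle": [("born_in", "Chicago")],
}

def score_step(question, path, step):
    return 1.0 if step[1] in {"born_in", "located_in"} else 0.0

def can_deduce(question, path):
    return [relation for _, relation, _ in path] == ["born_in", "located_in"]

result = verified_beam_search("Which state was Obama born in?", "Obama",
                              graph, score_step, can_deduce)
print(result)
```

Keeping the verifier behind a plain callable makes it easy to log its verdicts and measure how often it stops too early, which is one of the failure modes this summary lists.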
Agent Features
Memory
Planning
Tool Use
Frameworks
Is Agentic
Yes
Architectures
Collaboration
Optimization Features
Token Efficiency
System Optimization
Training Optimization
Inference Optimization
Risks & Boundaries
Limitations
Performance depends on KG quality and coverage; missing or outdated KG facts can still cause failures.
Beam search still requires multiple LLM calls; latency remains non-trivial for deep multi-hop cases.
When Not To Use
When no structured KG is available or building one is infeasible.
For ultra-low-latency real-time services where tens of seconds per query are unacceptable.
Failure Modes
Incorrect deductive-verifier judgments lead to premature stopping or false positives.
Path-RAG may still miss correct candidates if KG lacks relation labels or entity surface forms.

