Overview
Results are strong on a curated SIDER subset and show clear gains from structured retrieval, but real‑world noise and unreported events remain untested.
Citations: 0
Evidence Strength: 0.80
Confidence: 0.90
Risk Signals: 11
Trust Signals
Findings with numeric evidence: 5/5
Findings with evidence refs: 5/5
Results with explicit delta: 2/5
Reproducibility
Status: Code + data available
Open source: Partial
At A Glance
Cost impact: 60%
Production readiness: 70%
Novelty: 50%
Why It Matters For Business
Graph‑backed retrieval plus a small LLM turns a curated safety database into an almost error‑free lookup service for side‑effect presence, cutting clinician search time and reducing misinformation risk.
Summary TLDR
This paper builds two retrieval-augmented systems to answer binary questions like “Is X a side effect of Y?” using the SIDER 4.1 drug–side-effect database. A vector-based RAG (Pinecone + ada-002 embeddings) and a graph-based GraphRAG (Neo4j + Cypher) each feed a Llama-3 8B model. On a balanced subset of 19,520 pairs (976 drugs, 3,851 side effects), GraphRAG scored 0.9999 accuracy and RAG with the pairwise format scored 0.998, while a standalone Llama-3 8B scored 0.529. Code is available on GitHub.
Problem Statement
Off-the-shelf LLMs hallucinate and lack reliable domain knowledge for pharmacovigilance. Clinicians need fast, accurate answers about whether a drug is known to cause a specific side effect. The paper asks: can retrieval (text or graph) plus a small LLM deliver reliable, binary drug–side-effect retrieval?
Main Contribution
Design and implement two retrieval-augmented pipelines for drug–side-effect lookup, vector RAG and GraphRAG, both using SIDER 4.1 as the knowledge base.
Show that GraphRAG (Neo4j graph + Cypher) plus Llama-3 8B gives near-perfect binary retrieval on a 19,520-pair balanced test set.
Key Findings
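GraphRAG (Neo4j graph + Llama‑3 8B) achieved near‑perfect retrieval accuracy (0.9999 on the balanced 19,520-pair subset).
Data representation strongly affects RAG performance: with the pairwise text format, vector RAG reached 0.998 accuracy.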
Results
| Metric | Value | Baseline | Delta | Split / Dataset | Evidence | Evidence Ref |
|---|---|---|---|---|---|---|
| Accuracy (GraphRAG) | 0.9999 | — | — | Balanced SIDER subset (19,520 pairs) | Fig. 3; Results section | Results section; Fig. 3 |
| F1 (GraphRAG) | 0.9999 | — | — | Balanced SIDER subset | Fig. 3; Results section | Results section; Fig. 3 |
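
For readers who want to compute the same two metrics on their own yes/no predictions, the sketch below derives accuracy and F1 from confusion counts. The counts in the example are made up for illustration and are not taken from the paper; the even split of positives and negatives is an assumption based on the test set being described as balanced.

```python
def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Share of all predictions that are correct."""
    total = tp + tn + fp + fn
    return (tp + tn) / total if total else 0.0


def f1_score(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall, written in count form."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


if __name__ == "__main__":
    # Hypothetical counts: 9,760 positives and 9,760 negatives (19,520 pairs),
    # with one false positive and one false negative.
    tp, tn, fp, fn = 9759, 9759, 1, 1
    print(round(accuracy(tp, tn, fp, fn), 4))
    print(round(f1_score(tp, fp, fn), 4))
```

On a set of this size, even a handful of errors leaves both metrics at or above 0.999, so small differences between systems correspond to very few pairs.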
What To Try In 7 Days
Index a small, curated drug–side‑effect table as pairwise text (Data Format B) and test RAG similarity retrieval.
Load the same pairs into a simple Neo4j graph and run direct existence queries with Cypher.
Add entity extraction and a binary (yes/no) prompt for a small LLM, then compare the two pipelines on the same questions.
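
The three steps above can be sketched in a few lines of Python. This is a minimal illustration, not the paper's implementation: the sentence template, the `HAS_SIDE_EFFECT` relationship name, the node labels, and the prompt wording are all assumptions, and an in-memory set stands in for a live Neo4j or Pinecone instance so the example runs on its own.

```python
# Toy pairs for illustration only; real rows would come from SIDER 4.1.
PAIRS = [("aspirin", "nausea"), ("aspirin", "tinnitus"), ("metformin", "diarrhoea")]


def to_pairwise_text(drug: str, effect: str) -> str:
    """Render one drug/side-effect pair as a short sentence for embedding.

    The wording is an assumption, not the paper's exact Data Format B.
    """
    return f"{drug} is associated with the side effect {effect}."


# Assumed schema: (:Drug {name})-[:HAS_SIDE_EFFECT]->(:SideEffect {name}).
EXISTS_QUERY = (
    "MATCH (d:Drug {name: $drug})-[:HAS_SIDE_EFFECT]->(s:SideEffect {name: $effect}) "
    "RETURN count(*) > 0 AS found"
)


def pair_exists(pairs, drug: str, effect: str) -> bool:
    """In-memory stand-in for running EXISTS_QUERY against a Neo4j instance."""
    known = {(d.lower(), e.lower()) for d, e in pairs}
    return (drug.lower(), effect.lower()) in known


def build_binary_prompt(drug: str, effect: str, found: bool) -> str:
    """Wrap the retrieval result in a yes/no prompt for a small LLM."""
    evidence = (
        to_pairwise_text(drug, effect)
        if found
        else f"No record links {drug} to {effect}."
    )
    return (
        f"Context: {evidence}\n"
        f"Question: Is {effect} a side effect of {drug}?\n"
        "Answer with YES or NO only."
    )


if __name__ == "__main__":
    found = pair_exists(PAIRS, "Aspirin", "Nausea")
    print(build_binary_prompt("aspirin", "nausea", found))
```

With a real graph, `EXISTS_QUERY` would be sent through a Neo4j driver session with `drug` and `effect` passed as query parameters, which keeps user input out of the query string itself.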
Risks & Boundaries
Limitations
Evaluation uses a balanced subset of SIDER 4.1; real-world reports and underreported events are not covered.
System only supports single‑drug queries; no multi‑drug, class, or reverse queries yet.
When Not To Use
When you need to discover novel or unreported adverse events from noisy real‑world data.
When you need causal inference about whether a drug caused an event, rather than a documented association.
Failure Modes
Missed new or underreported side effects because SIDER lacks post‑marketing signals.
Entity-recognition errors (drug or side‑effect spelling/variant mismatch) leading to false negatives.
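
One common way to reduce the spelling and variant mismatches mentioned above is to normalize names before the lookup. The sketch below is an assumption about how that could be done, not something the paper describes; the `SYNONYMS` table is hypothetical and would need to be built from a real vocabulary.

```python
import unicodedata

# Hypothetical variant table mapping alternative spellings to one canonical name.
SYNONYMS = {
    "acetylsalicylic acid": "aspirin",
    "diarrhea": "diarrhoea",
}


def normalize_name(name: str) -> str:
    """Casefold, unify Unicode forms, collapse whitespace and ASCII hyphens,
    then map known variants to a canonical name."""
    text = unicodedata.normalize("NFKC", name).casefold()
    text = " ".join(text.replace("-", " ").split())
    return SYNONYMS.get(text, text)


if __name__ == "__main__":
    print(normalize_name("  Acetylsalicylic   Acid "))
    print(normalize_name("Diarrhea"))
```

Applying the same function to both the stored names and the extracted query entities keeps the two sides comparable; it does not help with typos or with names missing from the variant table.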

