Overview
DepsRAG is an engineering prototype: code and dataset access exist, and results are measured but small-scale. Production use would require integration work and, likely, stronger LLMs.
Citations: 1
Evidence Strength: 0.60
Confidence: 0.80
Risk Signals: 8
Trust Signals
Findings with numeric evidence: 2/2
Findings with evidence refs: 2/2
Results with explicit delta: 1/2
Reproducibility
Status: Code + data available
Open source: Yes
At A Glance
Cost impact: 50%
Production readiness: 40%
Novelty: 50%
Why It Matters For Business
DepsRAG automates dependency analysis and vulnerability lookup, cutting manual checks that delay library approvals and enabling faster, evidence-backed decisions.
Summary TLDR
DepsRAG is a multi-agent assistant that builds a dependency knowledge graph (KG) for a given package, augments queries with retrieval (KG + web + vulnerability DB), and uses an Agent–Critic loop to iteratively refine answers. In a proof-of-concept using GPT-4-Turbo and Llama-3, adding the Critic-Agent raised answer precision from 13.3% to 40% (threefold). The system is implemented in Python with Langroid, Neo4j, and the Deps.Dev API; code and demo are published.
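The Agent–Critic loop mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions, not the DepsRAG implementation: `agent_critic_loop`, `toy_agent`, `toy_critic`, and `max_rounds` are hypothetical names, and plain Python callables stand in for the LLM calls.

```python
from typing import Callable, Optional, Tuple

# Hypothetical signatures: the agent maps (question, feedback) to an answer;
# the critic maps (question, answer) to (accepted, feedback).
Agent = Callable[[str, Optional[str]], str]
Critic = Callable[[str, str], Tuple[bool, str]]


def agent_critic_loop(question: str, agent: Agent, critic: Critic,
                      max_rounds: int = 3) -> Tuple[Optional[str], int]:
    """Return (accepted answer or None, number of rounds used)."""
    feedback: Optional[str] = None
    for round_no in range(1, max_rounds + 1):
        answer = agent(question, feedback)
        accepted, feedback = critic(question, answer)
        if accepted:
            return answer, round_no
    # Iteration cap reached without an accepted answer.
    return None, max_rounds


# Toy stand-ins so the sketch runs without an LLM.
def toy_agent(question: str, feedback: Optional[str]) -> str:
    return "depth is 3" if feedback else "depth is unknown"


def toy_critic(question: str, answer: str) -> Tuple[bool, str]:
    if "unknown" in answer:
        return False, "Query the graph for the actual depth."
    return True, ""


result = agent_critic_loop("How deep is the dependency tree?", toy_agent, toy_critic)
```

The iteration cap bounds token cost: if the critic never accepts, the loop returns `None` instead of running indefinitely.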
Problem Statement
Developers need faster, more reliable tools to reason about direct and transitive software dependencies, security risks, and maintainability before importing third-party packages. Existing tools are fragmented (security, visualization, manual checks) and miss issues like circular dependencies, transitive risks, and up-to-date vulnerability context, creating approval bottlenecks.
Main Contribution
Design of DepsRAG: a multi-agent, retrieval-augmented framework for reasoning about software dependencies.
A dependency Knowledge Graph builder that captures direct and transitive package relations via Deps.Dev and Neo4j.
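To illustrate what a dependency graph with direct and transitive relations supports, here is a small sketch that uses an in-memory dictionary in place of Neo4j and does not call Deps.Dev. The package names and the helpers `transitive_deps` and `has_cycle` are hypothetical.

```python
from typing import Dict, List, Set

# Hypothetical direct-dependency map (package -> direct dependencies).
DIRECT_DEPS: Dict[str, List[str]] = {
    "app": ["web", "utils"],
    "web": ["http", "utils"],
    "utils": ["http"],
    "http": [],
}


def transitive_deps(graph: Dict[str, List[str]], root: str) -> Set[str]:
    """All packages reachable from root, excluding root itself."""
    seen: Set[str] = set()
    stack = list(graph.get(root, []))
    while stack:
        pkg = stack.pop()
        if pkg in seen:
            continue
        seen.add(pkg)
        stack.extend(graph.get(pkg, []))
    seen.discard(root)  # root can reappear if it sits on a cycle
    return seen


def has_cycle(graph: Dict[str, List[str]]) -> bool:
    """Detect a circular dependency with a three-colour depth-first search."""
    WHITE, GREY, BLACK = 0, 1, 2
    colour = {pkg: WHITE for pkg in graph}

    def visit(pkg: str) -> bool:
        colour[pkg] = GREY  # pkg is on the current DFS path
        for dep in graph.get(pkg, []):
            state = colour.get(dep, WHITE)
            if state == GREY:  # back edge: dep is already on the path
                return True
            if state == WHITE and visit(dep):
                return True
        colour[pkg] = BLACK  # fully explored, no cycle through pkg
        return False

    return any(colour[pkg] == WHITE and visit(pkg) for pkg in graph)


all_deps = transitive_deps(DIRECT_DEPS, "app")
```

In a graph database the same questions would be expressed as path queries; the traversal logic here only shows what those queries compute.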
Key Findings
Adding a Critic-Agent raised answer precision from 13.3% to 40% on evaluated tasks.
GPT-4-Turbo generated correct Cypher queries on first attempt for all test questions; Llama-3 needed retries and produced an incorrect final answer for one question.
Results
| Metric | Value | Baseline | Delta | Split / Dataset | Evidence | Evidence Ref |
|---|---|---|---|---|---|---|
| Answer precision (with vs without Critic-Agent) | 40% (with Critic) | 13.3% (without Critic) | ≈3× | Three multi-step tasks, ten iterations (GPT-4-Turbo) | Section 5.2.2, Figure 3 | — |
| Cypher query generation trials | GPT-4-Turbo: 0 retries; Llama-3: up to 2 retries | — | — | Questions on Chainlit v1.1.200 dependency KG | Section 5.2.1, Table 1, Listing 1 | — |
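The ≈3× delta in the first row follows directly from the two precision values. The sketch below also derives implied run counts, which are an assumption: the source does not state the denominator, so the code supposes 3 tasks × 10 iterations = 30 runs.

```python
baseline = 13.3     # percent, without Critic-Agent (from the table)
with_critic = 40.0  # percent, with Critic-Agent (from the table)

ratio = with_critic / baseline     # relative improvement
abs_gain = with_critic - baseline  # gain in percentage points

# Assumption (not stated in the source): 30 runs form the denominator.
runs = 3 * 10
implied_baseline_hits = round(baseline / 100 * runs)
implied_critic_hits = round(with_critic / 100 * runs)
```

Under that assumption the figures would correspond to 4 and 12 correct runs out of 30, which is why the sample should be read as small-scale.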
What To Try In 7 Days
Run DepsRAG on a critical package to generate a dependency KG and identify top-risk transitive dependencies.
Integrate Critic-Agent style validation into existing LLM QA flows to reduce wrong answers.
Add schema retrieval + retry logic when converting natural language to DB queries.
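The schema-retrieval and retry suggestion could look roughly like the sketch below. Every name here (`query_with_retries`, `toy_schema`, `toy_generate`, `toy_run`) is hypothetical, the schema string is invented for illustration, and `ValueError` stands in for whatever error a real database driver would raise.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical callables standing in for the LLM and the database.
# GenerateQuery: (question, schema, last_error) -> query string
GenerateQuery = Callable[[str, str, Optional[str]], str]
RunQuery = Callable[[str], List[dict]]


def query_with_retries(question: str, get_schema: Callable[[], str],
                       generate: GenerateQuery, run: RunQuery,
                       max_retries: int = 2) -> Tuple[List[dict], int]:
    """Return (rows, retries used); raise if no attempt succeeds."""
    schema = get_schema()  # retrieve the schema once, before any generation
    last_error: Optional[str] = None
    for attempt in range(max_retries + 1):
        # Feed the previous error back so the generator can correct itself.
        query = generate(question, schema, last_error)
        try:
            return run(query), attempt
        except ValueError as exc:  # stand-in for a DB syntax/schema error
            last_error = str(exc)
    raise RuntimeError(f"no valid query after {max_retries} retries: {last_error}")


# Toy stand-ins so the sketch runs without an LLM or a database.
def toy_schema() -> str:
    return "(:Package {name, version})-[:DEPENDS_ON]->(:Package)"


def toy_generate(question: str, schema: str, last_error: Optional[str]) -> str:
    # First attempt uses a wrong label; the retry corrects it.
    label = "Package" if last_error else "Pkg"
    return f"MATCH (p:{label}) RETURN count(p) AS n"


def toy_run(query: str) -> List[dict]:
    if ":Package" not in query:
        raise ValueError("unknown label in query")
    return [{"n": 4}]


rows, retries = query_with_retries("How many packages?", toy_schema,
                                   toy_generate, toy_run)
```

Passing the last error message into the next generation call is the design choice that makes a retry more than a blind resample.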
Agent Features
Memory
Planning
Tool Use
Frameworks
Is Agentic: Yes
Architectures
Collaboration
Optimization Features
Token Efficiency
System Optimization
Risks & Boundaries
Limitations
LLM fragility: incorrect DB-query translation can yield wrong graph answers.
In this work the Critic-Agent validates only final answers, and the extra Agent–Critic exchanges increase token cost and runtime.
When Not To Use
For trivial dependency checks where existing tools suffice and LLM cost is unjustified.
Where strict low-latency or low-cost constraints prohibit multiple agent exchanges.
Failure Modes
Unproductive Agent–Critic loops that run until the iteration cap forces termination.
Hallucinated or overly general Cypher queries that return irrelevant results.
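One inexpensive guard against hallucinated queries is to check a generated Cypher string against the known schema before running it. The sketch below is a regex heuristic, not a Cypher parser; the schema vocabulary (`Package`, `DEPENDS_ON`) and the helper `unknown_names` are assumptions for illustration. It catches out-of-schema names only and does not detect overly general queries.

```python
import re
from typing import Set

# Assumed schema vocabulary; the actual DepsRAG schema may differ.
KNOWN_LABELS: Set[str] = {"Package"}
KNOWN_RELS: Set[str] = {"DEPENDS_ON"}

# Heuristic patterns: "(var:Label" for nodes and "[var:TYPE" for relationships.
LABEL_RE = re.compile(r"\(\s*\w*\s*:\s*(\w+)")
REL_RE = re.compile(r"\[\s*\w*\s*:\s*(\w+)")


def unknown_names(cypher: str) -> Set[str]:
    """Return labels and relationship types absent from the known schema."""
    labels = set(LABEL_RE.findall(cypher))
    rels = set(REL_RE.findall(cypher))
    return (labels - KNOWN_LABELS) | (rels - KNOWN_RELS)


ok = unknown_names("MATCH (a:Package)-[:DEPENDS_ON]->(b:Package) RETURN b.name")
bad = unknown_names("MATCH (a:Library)-[:USES]->(b) RETURN b")
```

A non-empty result can be fed back to the query generator as an error message rather than sent to the database.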

