Probabilistic federated RAG that routes across product domains to boost multi-product QA

January 25, 2025 · 7 min

Overview

Decision Snapshot: Needs Validation

The paper provides a concrete method, datasets, and Azure-based evaluations. Results are consistent across uni- and cross-domain tests, but code and public data release are pending, and exact numeric improvements are shown only in figures.

Citations: 0

Evidence Strength: 0.70

Confidence: 0.80

Risk Signals: 9

Trust Signals

Findings with numeric evidence: 2/3

Findings with evidence refs: 3/3

Results with explicit delta: 0/4

Reproducibility

Status: No open assets linked

Open source: Partial

At A Glance

Cost impact: 60%

Production readiness: 60%

Novelty: 60%

Authors

Parshin Shojaee, Sai Sree Harsha, Dan Luo, Akash Maharaj, Tong Yu, Yunyao Li

Why It Matters For Business

If product support queries span multiple products, probabilistic federated retrieval increases correct-document retrieval and improves answer quality without per-product LLM finetuning.

Summary TLDR

The paper introduces MKP-QA, a multi-product RAG system that combines a learned domain router, stochastic gating, and a dense bi-encoder retriever to federate search across product domains. The authors also build Adobe-focused uni- and cross-product datasets (AEP, Target, CJA). On these datasets, MKP-QA consistently outperforms baselines in top-1 retrieval accuracy and in LLM-judged relevancy and faithfulness, with larger gains for cross-domain queries. The paper describes the datasets and includes deployment notes; code and public data release are pending Adobe approval.

Problem Statement

Enterprise product questions often span multiple products and require cross-product knowledge. Existing RAG pipelines either search every domain (slower and more prone to hallucination) or pick a single domain (which can miss cross-product information). There is also no suitable public benchmark for multi-product QA.

Main Contribution

MKP-QA: a probabilistic federated RAG pipeline that combines a learned query-domain router, stochastic gating for exploration-exploitation, and a dense bi-encoder retriever to rank documents across product domains.

A stochastic gating mechanism that samples domains based on router likelihoods and adaptive entropy-based thresholds to reduce selection errors and enable exploration.
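The gating idea above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function names, the `base_threshold` parameter, and the specific rule (threshold shrinks linearly with normalized entropy, sub-threshold domains are sampled with probability equal to their router likelihood) are all assumptions made for the example, since the summary does not give the exact formula.

```python
import math
import random

def normalized_entropy(probs):
    """Shannon entropy of the router distribution, scaled to [0, 1]."""
    n = len(probs)
    if n < 2:
        return 0.0
    h = -sum(p * math.log(p) for p in probs if p > 0)
    return h / math.log(n)

def gate_domains(router_probs, base_threshold=0.5, rng=None):
    """Choose which domains to search for one query.

    router_probs: dict of domain name -> router probability (sums to 1).
    Hypothetical rule: the threshold drops as router entropy rises, so an
    uncertain router lets more domains through (exploration). Domains
    below the threshold can still be sampled with probability equal to
    their router likelihood.
    """
    rng = rng or random.Random()
    entropy = normalized_entropy(list(router_probs.values()))
    threshold = base_threshold * (1.0 - entropy)
    active = [
        domain
        for domain, p in router_probs.items()
        if p >= threshold or rng.random() < p
    ]
    if not active:
        # Always search at least the router's top choice.
        active.append(max(router_probs, key=router_probs.get))
    return active, threshold

# Illustrative router output for a query that looks cross-product.
domains, threshold = gate_domains(
    {"AEP": 0.55, "CJA": 0.40, "Target": 0.05}, rng=random.Random(0)
)
print(domains, round(threshold, 3))
```

With a confident router (one domain near probability 1) the threshold stays high and only that domain is searched; with a flat distribution the threshold approaches zero and every domain is searched, which is the latency trade-off to watch.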

Key Findings

MKP-QA outperforms baselines on retrieval and response quality.

Practical Use: Use probabilistic federated routing plus dense retrieval when queries may require cross-product knowledge; it raises correct-document retrieval and improves generated answers versus single-index or hard-router methods.

Evidence Ref: Fig. 2, Fig. 3

A large synthetic dataset was created for each product with GPT-4 assistance.

Numbers: SLA pairs (AEP 28,860; CJA 27,820; Target 29,610)

Practical Use: You can train and evaluate multi-domain retrievers on tens of thousands of synthetic, SME-vetted query-doc pairs per product.

Evidence Ref: Table 1 (Section 4.4)

Results

| Metric | Value | Baseline | Delta | Split / Dataset | Evidence | Evidence Ref |
| --- | --- | --- | --- | --- | --- | --- |
| SLA dataset size per product | AEP 28,860; CJA 27,820; Target 29,610 query-doc pairs | | | SLA uni-domain | Table 1 (Section 4.4) | Table 1 |
| % positive pairs (SLA) | AEP 17.53%; CJA 18.28%; Target 20.26% | | | SLA uni-domain | Table 1 (Section 4.4) | Table 1 |

What To Try In 7 Days

Run a small federated retrieval prototype: train a domain router and a Sentence-BERT retriever on existing product docs, compare top-1 retrieval against unified search.

Implement entropy-based adaptive gating to allow low-confidence domains to be sampled and measure cross-product recall lift.

Use GPT-4 (or an internal judge model) to cheaply evaluate relevancy and faithfulness on a held-out sample before full deployment.
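For the first item in the list above, a small harness like the following could compare top-1 retrieval accuracy between unified search and domain-restricted search. It is a sketch under stated assumptions: the two-dimensional vectors, document ids, and `routed_domains` fields are toy placeholders standing in for real Sentence-BERT embeddings and real router output, and none of the data comes from the paper.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def top1(query_vec, docs, allowed_domains=None):
    """Id of the best-scoring doc, optionally restricted to some domains."""
    candidates = [
        d for d in docs
        if allowed_domains is None or d["domain"] in allowed_domains
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda d: cosine(query_vec, d["vec"]))["id"]

def top1_accuracy(queries, docs, use_routing):
    """Fraction of queries whose top-1 doc is the gold doc."""
    hits = 0
    for q in queries:
        allowed = q["routed_domains"] if use_routing else None
        hits += top1(q["vec"], docs, allowed) == q["gold_id"]
    return hits / len(queries)

# Toy corpus: one document per product domain (placeholder embeddings).
docs = [
    {"id": "aep-1", "domain": "AEP", "vec": [1.0, 0.0]},
    {"id": "cja-1", "domain": "CJA", "vec": [0.0, 1.0]},
    {"id": "target-1", "domain": "Target", "vec": [0.7, 0.7]},
]
# routed_domains stands in for the gated router output per query.
queries = [
    {"vec": [0.9, 0.1], "gold_id": "aep-1", "routed_domains": ["AEP"]},
    {"vec": [0.6, 0.8], "gold_id": "cja-1", "routed_domains": ["AEP", "CJA"]},
]

print("unified:", top1_accuracy(queries, docs, use_routing=False))  # 0.5
print("routed: ", top1_accuracy(queries, docs, use_routing=True))   # 1.0
```

In the toy data the second query is pulled toward a Target document under unified search, and restricting candidates to the routed domains recovers the gold document. On real data the same harness would also expose the opposite case, where the router drops the domain that holds the gold document.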

Agent Features

Tool Use
Uses GPT-4/GPT-3.5 for query generation and evaluation

Optimization Features

Infra Optimization
LoRA
System Optimization
Federated domain selection reduces the number of domains searched per query
Training Optimization
Contrastive fine-tuning of bi-encoder with symmetric InfoNCE
Inference Optimization
Offline document embedding and vector DB for fast retrieval
Planned: parallel domain routing and caching (deployment)
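The symmetric InfoNCE objective listed under Training Optimization can be illustrated with a small pure-Python sketch. It shows the standard symmetric form (cross-entropy over query-to-document and document-to-query similarities, averaged), with in-batch documents as negatives. The temperature value, the function names, and the assumption that vectors are already L2-normalized are choices made for this example; the summary does not state the paper's settings.

```python
import math

def _mean_nll_of_diagonal(rows):
    """Mean negative log-softmax of the diagonal entry of each row."""
    total = 0.0
    for i, row in enumerate(rows):
        m = max(row)  # subtract the max for numerical stability
        log_z = m + math.log(sum(math.exp(s - m) for s in row))
        total += log_z - row[i]
    return total / len(rows)

def symmetric_info_nce(query_vecs, doc_vecs, temperature=0.05):
    """Symmetric InfoNCE over a batch of aligned (query, doc) pairs.

    query_vecs[i] and doc_vecs[i] form the positive pair; every other
    document in the batch acts as a negative. Vectors are assumed to be
    L2-normalized, so the dot product is the cosine similarity.
    """
    sims = [
        [sum(a * b for a, b in zip(q, d)) / temperature for d in doc_vecs]
        for q in query_vecs
    ]
    sims_t = [list(col) for col in zip(*sims)]  # document-to-query view
    q2d = _mean_nll_of_diagonal(sims)
    d2q = _mean_nll_of_diagonal(sims_t)
    return 0.5 * (q2d + d2q)

queries = [[1.0, 0.0], [0.0, 1.0]]
aligned_docs = [[1.0, 0.0], [0.0, 1.0]]
swapped_docs = [[0.0, 1.0], [1.0, 0.0]]
print(symmetric_info_nce(queries, aligned_docs))  # close to 0
print(symmetric_info_nce(queries, swapped_docs))  # large (about 20)
```

A lower temperature sharpens the softmax, so mismatched pairs are penalized more heavily; in practice it is usually tuned together with batch size.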

Reproducibility

Code Available: No
Data Available: No
Open Source Status: Partial
License: Unknown

Risks & Boundaries

Limitations

Dataset and code release are pending Adobe approval, so exact replication is currently limited.

Performance depends on the quality of the domain router; misclassification can still exclude needed domains despite stochastic gating.

When Not To Use

If you cannot afford a vector DB or offline embedding infrastructure for retrieval at scale.

If queries are strictly single-domain and a simple index yields sufficient accuracy.

Failure Modes

The router assigns near-zero probability to a relevant domain and gating fails to sample it, causing missed evidence.

Too many active domains (low threshold) increases latency and may introduce irrelevant context that hurts LLM faithfulness.

Core Entities

Models

BERT variant (domain router)
Sentence-BERT bi-encoder (retriever)
GPT-3.5-turbo-1106 (generation/eval)
GPT-4-0314 (generation/eval)
GPT-4 (query generation and annotation assistance)

Metrics

Accuracy
Relevancy (LLM judged)
Faithfulness (RAGAS2 + GPT-4 judged)

Datasets

Adobe Experience Platform (AEP) multi-product dataset
Adobe Target multi-product dataset
Adobe Customer Journey Analytics (CJA) multi-product dataset
Cross-domain combinations: AEP+CJA, AEP+Target, CJA+Target

Benchmarks

Adobe multi-product uni-domain and cross-domain RAG datasets (new, pending release)