ProbDPP: pick diverse data that’s also likely to arrive — and learn reliabilities online

Overview

Decision Snapshot: Needs Validation

Theoretical proofs and simulation experiments support the claims, but evaluations are limited to simulated dropouts, two datasets, and a single LLM; more real-world tests are needed before large-scale deployment.

Citations: 0

Evidence Strength: 0.70

Confidence: 0.85

Risk Signals: 12

Trust Signals

Findings with numeric evidence: 2/5

Findings with evidence refs: 5/5

Results with explicit delta: 8/8

Reproducibility

Status: Partial assets available

Open source: Unknown

At A Glance

Cost impact: 50%

Production readiness: 60%

Novelty: 60%

Authors

Ahmad Sarlak, Abolfazl Razi

Why It Matters For Business

When some data sources are unreliable, selecting only diverse items can backfire. ProbDPP improves downstream QA and prompt quality by preferring items that are both diverse and likely to be available, reducing wasted context budget under noisy links or flaky tools.

Who Should Care

Product Manager, ML Engineer, Engineering Lead, CTO, Data Scientist

Summary TLDR

Selecting diverse inputs for LLM prompts or fine-tuning fails when some sources drop out at random. The paper proves that the naive expected log-det diversity objective collapses under Bernoulli dropouts, proposes ProbDPP (a minimally regularized k‑DPP) that adds a per-item reliability reward, and gives a KL‑UCB semi-bandit algorithm to learn unknown source reliabilities online. The theory gives matching regret bounds, and simulations (MeetingBank, HotpotQA) show consistent gains under stochastic unavailability.

Problem Statement

Diversity-based subset selection (e.g., k‑DPP) assumes selected items are always available. In realistic pipelines sources can drop or be corrupted randomly. The straightforward expected log-det under Bernoulli dropouts is mathematically ill-posed (diverges to -∞) and cannot guide selection. We need a diversity objective that stays finite under random dropouts and a practical method to select when reliabilities are unknown.
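A short derivation shows where the -∞ comes from. It assumes one natural masking model, which is a reading of the setup and not a quotation from the paper: each item i in the selected set S has an availability indicator Z_i ~ Bernoulli(α_i), D_Z = diag(Z_i), the masked kernel is D_Z L_S D_Z with L_S positive definite, and log 0 is taken as -∞.

```latex
\begin{align*}
\log\det\!\left(D_Z L_S D_Z\right)
  &= \log\det(L_S) + 2\sum_{i\in S}\log Z_i, \\
\mathbb{E}_Z\!\left[\log\det\!\left(D_Z L_S D_Z\right)\right]
  &= \log\det(L_S) + 2\sum_{i\in S}\Big[\alpha_i\log 1 + (1-\alpha_i)\log 0\Big]
   = -\infty \quad\text{whenever some } \alpha_i < 1 .
\end{align*}
```

The first line uses det(D L D) = det(D)^2 det(L). Because a single possible failure already sends the expectation to -∞, every subset containing an imperfect item scores the same, so the objective cannot rank candidates.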

Main Contribution

Proof that expected log-det under independent Bernoulli dropouts is ill-posed (diverges to -∞) when any chosen item can fail

ProbDPP: a regularized k‑DPP objective that decomposes into geometric diversity (log-det) plus an additive per-item reliability reward
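To illustrate how a decomposition of this shape can arise, here is a worked example under an assumed regularizer. This is a hypothetical choice made for the sketch; the paper's own definition (Section 3.1, Lemma 3.2) may differ. Suppose a dropped item is not zeroed out but has its row and column scaled by √ε, with 0 < ε < 1 and Z_i ~ Bernoulli(α_i) independent.

```latex
\begin{align*}
D_{Z,\varepsilon} &= \operatorname{diag}\!\big(Z_i + (1-Z_i)\sqrt{\varepsilon}\big), \\
\log\det\!\left(D_{Z,\varepsilon} L_S D_{Z,\varepsilon}\right)
  &= \log\det(L_S) + \sum_{i\in S}(1-Z_i)\log\varepsilon, \\
\mathbb{E}_Z\!\left[\log\det\!\left(D_{Z,\varepsilon} L_S D_{Z,\varepsilon}\right)\right]
  &= \log\det(L_S) + \sum_{i\in S} r_i(\alpha_i,\varepsilon),
  \qquad r_i(\alpha_i,\varepsilon) = (1-\alpha_i)\log\varepsilon .
\end{align*}
```

Under this assumption the reward r_i is finite, non-positive, and increasing in α_i, so a reliable item is penalized less than a flaky one while the log-det term still measures geometric diversity.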

Key Findings

Naive expected log-det collapses under independent Bernoulli dropouts.

Practical Use: Do not use plain expected log-det if selected items can fail; optimization will be ill-posed and meaningless.

Evidence Ref: Lemma 3.1; Appendix A.1

Regularizing the masked kernel by ε>0 yields a finite expected objective that splits into log-det diversity plus per-item reliability rewards.

Practical Use: Add a small ridge ε to the masked kernel to retain geometric diversity while favoring dependable sources.

Evidence Ref: Lemma 3.2; Section 3.1
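The following is a minimal numerical sketch of such a split, not the paper's implementation. It assumes a specific, hypothetical regularizer: a dropped item's row and column are scaled by sqrt(eps) instead of being zeroed. Under that assumption the expected log-det equals log det(L_S) plus the sum of (1 - alpha_i) * log(eps), and the code checks this by enumerating every dropout pattern. All function names and the example kernel are illustrative.

```python
import math
from itertools import product


def logdet_pd(M):
    """Log-determinant of a symmetric positive-definite matrix via Cholesky."""
    n = len(M)
    chol = [[0.0] * n for _ in range(n)]
    total = 0.0
    for i in range(n):
        for j in range(i + 1):
            s = M[i][j] - sum(chol[i][k] * chol[j][k] for k in range(j))
            if i == j:
                if s <= 0:
                    raise ValueError("matrix is not positive definite")
                chol[i][i] = math.sqrt(s)
                total += 2.0 * math.log(chol[i][i])
            else:
                chol[i][j] = s / chol[j][j]
    return total


def scaled_kernel(L_S, z, eps):
    """ASSUMED regularizer: scale rows/columns of dropped items by sqrt(eps)."""
    d = [1.0 if zi else math.sqrt(eps) for zi in z]
    n = len(L_S)
    return [[d[i] * L_S[i][j] * d[j] for j in range(n)] for i in range(n)]


def expected_logdet_enum(L_S, alpha, eps):
    """Exact expectation by enumerating all 2^k independent dropout patterns."""
    total = 0.0
    for z in product([0, 1], repeat=len(alpha)):
        prob = 1.0
        for zi, a in zip(z, alpha):
            prob *= a if zi else (1.0 - a)
        total += prob * logdet_pd(scaled_kernel(L_S, z, eps))
    return total


def expected_logdet_closed(L_S, alpha, eps):
    """Closed form under the same assumption: diversity + per-item rewards."""
    diversity = logdet_pd(L_S)
    rewards = sum((1.0 - a) * math.log(eps) for a in alpha)
    return diversity + rewards


# Illustrative 3-item similarity kernel (diagonally dominant, hence PD).
L_S = [[1.0, 0.5, 0.2],
       [0.5, 1.0, 0.3],
       [0.2, 0.3, 1.0]]
alpha = [0.9, 0.6, 0.8]   # hypothetical per-item availability probabilities
eps = 0.01

print(expected_logdet_enum(L_S, alpha, eps))
print(expected_logdet_closed(L_S, alpha, eps))
```

The two printed values should agree up to floating-point error. Enumeration costs 2^k evaluations, so it is only a sanity check; the closed form is what a selector would actually optimize.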

Results

Metric	Value	Baseline	Delta	Split / Dataset	Evidence	Evidence Ref
Token-F1	31.3 (ProbDPP)	LLMLingua2 28.8	+2.5 abs (+8.7% rel)	MeetingBank (max 30 chunks)	Table 1 reports Token-F1 on MeetingBank	Table 1
ROUGE-L	31.2 (ProbDPP)	LLMLingua2 28.8	+2.4 abs (+8.3% rel)	MeetingBank (max 30 chunks)	Table 1 reports ROUGE-L on MeetingBank	Table 1

What To Try In 7 Days

Add a small ridge ε to your masked similarity kernel and re-evaluate existing DPP-based selection

Measure per-source availability (success/failure) and plug empirical rates into the reliability term r_i(α_i,ε)

If reliabilities are unknown, run the ProbDPP KL‑UCB loop to learn reliabilities while selecting under budget
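For the last step, the piece most teams would need first is the optimistic reliability estimate. The sketch below shows the standard Bernoulli KL-UCB index plus a per-item count update from semi-bandit feedback. It is a generic illustration under stated assumptions, not the paper's algorithm: the function names, the exploration constant `c`, and the bisection depth are all choices made here.

```python
import math


def bernoulli_kl(p, q):
    """KL(Bern(p) || Bern(q)), with inputs clamped away from 0 and 1."""
    tiny = 1e-12
    p = min(max(p, tiny), 1.0 - tiny)
    q = min(max(q, tiny), 1.0 - tiny)
    return p * math.log(p / q) + (1.0 - p) * math.log((1.0 - p) / (1.0 - q))


def kl_ucb_index(successes, pulls, t, c=0.0, iters=50):
    """Largest q >= p_hat with pulls * KL(p_hat, q) <= log t + c * log log t.

    t is the current round (t >= 1). Found by bisection, since KL(p_hat, q)
    is increasing in q on [p_hat, 1].
    """
    if pulls == 0:
        return 1.0  # never-observed sources get the most optimistic value
    p_hat = successes / pulls
    log_t = math.log(t)
    budget = (log_t + c * math.log(max(log_t, 1.0))) / pulls
    lo, hi = p_hat, 1.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if bernoulli_kl(p_hat, mid) <= budget:
            lo = mid
        else:
            hi = mid
    return lo


def update_counts(successes, pulls, selected, observed):
    """Semi-bandit update: only the selected items reveal success/failure."""
    for i in selected:
        pulls[i] += 1
        successes[i] += int(observed[i])


# Illustrative usage with three hypothetical sources.
successes, pulls = [0, 0, 0], [0, 0, 0]
update_counts(successes, pulls, selected=[0, 2], observed={0: True, 2: False})
optimistic = [kl_ucb_index(s, n, t=2) for s, n in zip(successes, pulls)]
print(optimistic)
```

Each round, the optimistic values would stand in for the unknown α_i when scoring candidate subsets, and the observed per-item arrivals feed `update_counts`. The index shrinks toward the empirical rate as a source is observed more often, which is what drives exploration of rarely selected sources.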

Optimization Features

Token Efficiency

Prompt compression (context pruning)

Training Optimization

Data-efficient Training

Inference Optimization

Context Selection, Token Budgeting

Reproducibility

Code Available: No

Data Available: Yes

Open Source Status: Unknown

License: Unknown

Risks & Boundaries

Limitations

Assumes independent Bernoulli dropouts; does not handle correlated or adversarial failures

Uses a fixed similarity kernel; real systems may need query-dependent kernels

When Not To Use

If failures are highly correlated or adversarial (violates independence assumption)

If you only get aggregate/episodic feedback (no per-item semibandit signals)

Failure Modes

Objective collapse (-∞ log-det) if regularization ε is omitted and items can drop

Wrong reliability estimates cause persistent suboptimal selection during learning

Core Entities

Models

llama3, k-DPP, ProbDPP

Metrics

Token-F1, ROUGE-L, BERTScore, Exact Match (EM)

Datasets

MeetingBank, HotpotQA (distractor)

Benchmarks

HotpotQA distractor, MeetingBank long-context QA
