Overview
Production Readiness
0.7
Novelty Score
0.45
Cost Impact Score
0.6
Citation Count
0
Why It Matters For Business
You can keep advanced agentic reasoning for hard requests while giving fast answers for common lookups. This reduces user wait time and the number of dialogue rounds per session, which likely raises engagement and lowers operating cost.
Summary TLDR
AdaptJobRec is an agentic conversational job recommender built for Walmart. It classifies incoming queries as simple or complex. Simple queries bypass the planner and memory modules and call fast APIs directly; complex queries go through a few-shot memory filter and a nested task planner that groups subtasks that can run in parallel. On Walmart data, this routing plus personalization cuts response latency by ~53% and reduces dialogue rounds while slightly improving ranking metrics.
Problem Statement
Agentic conversational recommenders give richer answers but are slow. Simple queries (e.g., 'check application status') waste time if they always trigger the planner and memory modules. The paper asks: can we keep agentic reasoning for hard queries while giving fast responses for simple ones?
Main Contribution
AdaptJobRec: an LLM-powered agent that classifies query complexity and routes simple queries directly to fast tools while reserving full agentic flow for complex queries.
A few-shot memory processing module that keeps only the relevant segments of the chat history, reducing redundant planning.
A task decomposition planner that outputs nested sub-task lists and groups subtasks that can run in parallel to save time.
An end-to-end deployment design integrating the People.AI knowledge graph (1.6M nodes, 83M edges), Redis caching, Kafka streaming, and Cypher-based tools.
An evaluation on real Walmart production data (job recommendation and career path tasks) with statistical tests showing latency and accuracy gains.
Key Findings
AdaptJobRec cuts average response latency by about half compared with a RAG baseline in the pilot user study.
AdaptJobRec reduces the number of conversation rounds needed to get the target information.
Job ranking metrics improve modestly over strong agentic baselines.
Career path prediction accuracy is higher than tuned LLM baselines while keeping low latency.
Results
Hit@10 (job recommendation)
NDCG@10 (job recommendation)
MAP@10 (job recommendation)
Hit (real transitions) (career path)
Latency (career path)
Average conversation rounds (pilot)
Average response latency (pilot)
Who Should Care
What To Try In 7 Days
Add a lightweight complexity classifier to route simple queries straight to existing APIs.
Implement a small memory filter (few-shot prompt) that returns only relevant chat snippets for complex queries.
Cache frequent tool responses (Redis) to avoid repeated LLM calls for high-frequency lookups.
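The first step above can be prototyped without any model at all. The following is a minimal sketch, assuming a keyword-based classifier; the paper does not publish its classifier, so the intent names, trigger phrases, and the `handle` function are all hypothetical stand-ins for whatever lightweight model or prompt you would actually use.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple

# Hypothetical "simple" intents and trigger phrases (assumptions, not from the paper).
SIMPLE_INTENTS: Dict[str, Tuple[str, ...]] = {
    "application_status": ("application status", "status of my application"),
    "job_lookup": ("job id", "requisition"),
}


@dataclass
class Route:
    path: str    # "fast" or "agentic"
    intent: str  # matched simple intent, or "complex"


def classify(query: str) -> Route:
    """Label a query as simple (fast path) or complex (agentic path)."""
    text = query.lower()
    for intent, phrases in SIMPLE_INTENTS.items():
        if any(phrase in text for phrase in phrases):
            return Route("fast", intent)
    # Anything unrecognized goes to the full planner + memory flow.
    return Route("agentic", "complex")


def handle(
    query: str,
    fast_tools: Dict[str, Callable[[str], str]],
    agentic_flow: Callable[[str], str],
) -> str:
    """Dispatch simple queries straight to a tool, everything else to the agent."""
    route = classify(query)
    if route.path == "fast":
        return fast_tools[route.intent](query)
    return agentic_flow(query)
```

Defaulting unknown queries to the agentic path trades some latency for safety: a misrouted simple query is only slow, while a misrouted complex query gets an incomplete answer.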
Agent Features
Memory
- Few-shot memory processing module (filters chat history)
- Integrates profile and recent activity into query
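A memory filter of this kind can be sketched as a prompt builder plus a parser. This is an illustration under stated assumptions: the few-shot examples, the JSON index-list output format, and the injected `llm` callable are all hypothetical, since the paper's actual prompt is not reproduced here.

```python
import json
from typing import Callable, List

# Hypothetical few-shot examples; the real prompt would use domain-specific ones.
FEW_SHOT = """Select the history turns needed to answer the query.
Reply with a JSON list of turn indices.

History:
[0] I work as a cashier in Dallas.
[1] What is the weather today?
Query: Which roles could I move into next?
Relevant: [0]
"""


def build_filter_prompt(history: List[str], query: str) -> str:
    """Number each turn so the model can refer to turns by index."""
    numbered = "\n".join(f"[{i}] {turn}" for i, turn in enumerate(history))
    return f"{FEW_SHOT}\nHistory:\n{numbered}\nQuery: {query}\nRelevant: "


def filter_memory(history: List[str], query: str, llm: Callable[[str], str]) -> List[str]:
    """Return only the turns the model marks as relevant.

    Falls back to the full history when the reply cannot be parsed, so a bad
    model response never silently drops context.
    """
    raw = llm(build_filter_prompt(history, query))
    try:
        indices = json.loads(raw)
    except json.JSONDecodeError:
        return list(history)
    if not isinstance(indices, list):
        return list(history)
    return [history[i] for i in indices if type(i) is int and 0 <= i < len(history)]
```

Passing the model in as a plain callable keeps the filter testable with a stub and independent of any particular LLM client.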
Planning
- Task decomposition into nested lists
- Grouping of asynchronously executable subtasks
Tool Use
- Personalized recommendation engines
- Predefined Cypher templates
- Text-to-Cypher generation
- Job Application Microservice APIs
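Predefined Cypher templates can be kept as parameterized strings so that user input never gets spliced into the query text. The sketch below assumes a hypothetical schema (`Job` nodes, a `TRANSITION_TO` relationship with a `count` property); the actual People.AI schema is not described in this summary. Only the `$name` parameter syntax is standard Cypher.

```python
import re
from typing import Any, Dict, Tuple

# Hypothetical template; labels, relationship type, and properties are assumptions.
CYPHER_TEMPLATES: Dict[str, str] = {
    "next_roles": (
        "MATCH (:Job {title: $title})-[t:TRANSITION_TO]->(next:Job) "
        "RETURN next.title AS title, t.count AS moves "
        "ORDER BY moves DESC LIMIT $limit"
    ),
}


def render(name: str, **params: Any) -> Tuple[str, Dict[str, Any]]:
    """Return (query, params) after checking the parameters match the template."""
    query = CYPHER_TEMPLATES[name]
    expected = set(re.findall(r"\$(\w+)", query))
    if expected != set(params):
        raise ValueError(f"expected parameters {sorted(expected)}, got {sorted(params)}")
    # The query text is returned unchanged; values travel separately as parameters.
    return query, params
```

The returned pair is what a graph driver would typically accept as query text plus a parameter map; text-to-Cypher generation would then be reserved for questions no template covers.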
Frameworks
- Model Context Protocol (MCP)
- People.AI knowledge graph
Is Agentic
true
Architectures
- LLM-based reasoning agent
- Planner + Memory + Tool invocation
Optimization Features
Token Efficiency
- Few-shot memory processing to reduce unnecessary context
Infra Optimization
- Independent microservices for tools to scale separately
- Use of MCP server for tool execution and Cypher queries
System Optimization
- Kafka for streaming and orchestration
- Redis caching for frequent queries
- Cassandra for conversation history
Inference Optimization
- Complexity-based routing to avoid planner for simple queries
- Planner parallelization via nested async groups
- Cache Augmented Generation (Redis) to reuse tool results
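The caching idea can be illustrated with a small read-through cache keyed on the tool name and its arguments. To stay runnable without a server, this sketch uses an in-memory `TTLCache` as a stand-in for Redis; that class, the key scheme, and the TTL value are assumptions, not details from the paper.

```python
import hashlib
import json
import time
from typing import Any, Callable, Dict, Optional, Tuple


class TTLCache:
    """In-memory stand-in for Redis so the sketch runs without a server."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._store: Dict[str, Tuple[float, Any]] = {}

    def get(self, key: str) -> Optional[Any]:
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self.clock() >= expires_at:
            del self._store[key]  # expired: drop it and report a miss
            return None
        return value

    def set(self, key: str, value: Any) -> None:
        self._store[key] = (self.clock() + self.ttl, value)


def cache_key(tool_name: str, args: Dict[str, Any]) -> str:
    """Stable key: the same tool and arguments always hash to the same string."""
    payload = json.dumps({"tool": tool_name, "args": args}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def cached_call(cache: TTLCache, tool_name: str, tool: Callable[..., Any], args: Dict[str, Any]) -> Any:
    """Return a cached tool result if present, otherwise call the tool and store it."""
    key = cache_key(tool_name, args)
    hit = cache.get(key)
    if hit is not None:  # note: a tool result of None is never treated as a hit
        return hit
    value = tool(**args)
    cache.set(key, value)
    return value
```

A short TTL matters for lookups such as application status, where a stale answer is worse than a slow one.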
Reproducibility
Open Source Status
- unknown
Risks & Boundaries
Limitations
- Evaluation uses internal Walmart data and a small pilot (150 sessions); external generalization is untested.
- System depends on a rich People.AI knowledge graph; missing or sparse KG coverage will hurt recommendations.
- Complexity classifier errors can misroute queries, causing either unnecessary latency or lower-quality answers.
When Not To Use
- You lack a structured knowledge graph or matching recommendation APIs.
- Most queries are complex and require deep planning in every session (less benefit from routing).
- You need an open-source, reproducible baseline (paper uses internal data and closed deployment).
Failure Modes
- Misclassifying a complex query as simple leads to incomplete answers.
- Memory filter omits relevant history, causing planner to miss necessary context.
- Text-to-Cypher can produce incorrect queries if the schema or prompt is off.
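For the Text-to-Cypher failure mode, one possible mitigation (not described in the paper) is a cheap pre-execution check on generated queries. The sketch below is deliberately conservative and regex-based: it rejects write clauses and node labels outside an allow-list. The label names are hypothetical, and a regex check like this can produce false positives, for example when a keyword appears inside a string literal.

```python
import re
from typing import Iterable, List

# Clauses that modify the graph; generated read queries should not contain them.
WRITE_CLAUSES = re.compile(r"\b(CREATE|MERGE|DELETE|DETACH|SET|REMOVE|DROP)\b", re.IGNORECASE)
# Matches node patterns such as (j:Job) or (:Skill) and captures the first label.
NODE_LABEL = re.compile(r"\(\s*\w*\s*:\s*(\w+)")


def validate_cypher(query: str, allowed_labels: Iterable[str]) -> List[str]:
    """Return a list of problems; an empty list means the query passed the checks."""
    problems: List[str] = []
    if WRITE_CLAUSES.search(query):
        problems.append("write clause present")
    unknown = sorted(set(NODE_LABEL.findall(query)) - set(allowed_labels))
    if unknown:
        problems.append(f"unknown labels: {', '.join(unknown)}")
    return problems
```

A query that fails the check could be regenerated, or the request could fall back to a predefined template instead of executing the generated text.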
Core Entities
Models
- Llama-3.1-8B (fine-tuned as Llama-Capa)
- DeepSeek-R1-Distill-Qwen-7B (fine-tuned as DeepSeek-Capa)
- AdaptJobRec (system integrating LLM agent + tools)
Metrics
- Hit@10
- NDCG@10
- MAP@10
- Average response latency (s / ms)
- Average conversation rounds
- Statistical significance (Welch's t-test, p-values)
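For reference, the ranking metrics above can be computed as follows for binary relevance. This is a sketch using one common convention (AP@k normalized by min(k, number of relevant items)); the paper may use a different variant. The Welch statistic is included without the p-value, which needs the t-distribution.

```python
import math
import statistics
from typing import Sequence, Set


def hit_at_k(ranked: Sequence[str], relevant: Set[str], k: int = 10) -> float:
    """1.0 if any relevant item appears in the top k, else 0.0."""
    return 1.0 if any(item in relevant for item in ranked[:k]) else 0.0


def ndcg_at_k(ranked: Sequence[str], relevant: Set[str], k: int = 10) -> float:
    """Binary-relevance NDCG: DCG of the ranking divided by the ideal DCG."""
    dcg = sum(1.0 / math.log2(i + 2) for i, item in enumerate(ranked[:k]) if item in relevant)
    ideal = sum(1.0 / math.log2(i + 2) for i in range(min(k, len(relevant))))
    return dcg / ideal if ideal else 0.0


def ap_at_k(ranked: Sequence[str], relevant: Set[str], k: int = 10) -> float:
    """Average precision at k; MAP@k is the mean of this value over users."""
    hits, total = 0, 0.0
    for rank, item in enumerate(ranked[:k], start=1):
        if item in relevant:
            hits += 1
            total += hits / rank  # precision at this rank
    denom = min(k, len(relevant))
    return total / denom if denom else 0.0


def welch_t(a: Sequence[float], b: Sequence[float]) -> float:
    """Welch's t statistic for two samples with unequal variances (needs len >= 2 each)."""
    var_a, var_b = statistics.variance(a), statistics.variance(b)
    standard_error = math.sqrt(var_a / len(a) + var_b / len(b))
    return (statistics.mean(a) - statistics.mean(b)) / standard_error
```

The functions assume each ranked list contains no duplicate items. If SciPy is available, `scipy.stats.ttest_ind(a, b, equal_var=False)` runs Welch's test and also returns the p-value.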
Datasets
- Walmart Job Recommendation logs (10,014 users)
- Walmart job transition records (932,854 training records)
- Walmart job transition test set (471,495 records, 2024)
Benchmarks
- Job Recommendation (Hit@10, NDCG@10, MAP@10)
- Career Path Prediction (Hit real transitions, latency)
- Pilot user study (conversation rounds, response latency)

