Overview
The system is practically oriented and tested in a real e-commerce deployment, but results come from one case study and adapted attack datasets, so broader validation is still needed.
Citations0
Evidence Strength0.80
Confidence0.85
Risk Signals11
Trust Signals
Findings with numeric evidence: 5/5
Findings with evidence refs: 5/5
Results with explicit delta: 6/6
Reproducibility
Status: Partial assets available
Open source: Partial
At A Glance
Cost impact: 80%
Production readiness: 78%
Novelty: 48%
Why It Matters For Business
Small businesses can run secure, low-cost RAG chatbots on commodity hardware while keeping strong tenant isolation and practical defenses against prompt injection.
Who Should Care
Summary TLDR
This paper presents an open-source, multi-tenant platform for small businesses to deploy retrieval-augmented chatbots on low-cost, distributed k3s clusters. Security is handled at the platform level with container isolation, PII screening, guard prompts, and a pre-generation detector (GenTel-Shield). In an e-commerce case study, guard prompts alone give near-100% recall; GenTel-Shield achieves high precision (99.51%) and moderate recall (81.6%); combined defenses reach ~100% recall and ~99.8% F1. The k3s private cloud matched or reduced latency versus bare-metal for evaluated LLMs.
Problem Statement
Small businesses lack budget and engineering staff to run cloud GPU fleets and to harden RAG chatbots against prompt injection and data leakage; we need a low-cost, deployable platform that enforces tenant isolation and practical prompt-injection defenses without retraining models.
Main Contribution
An open-source, multi-tenant platform built on lightweight k3s clusters and an encrypted overlay network for small-business LLM deployments.
A layered, platform-level prompt-injection mitigation combining system-level guard prompts and the GenTel-Shield detector that avoids model retraining.
Key Findings
Guard prompts block prompt-injection attacks almost perfectly in the case study.
GenTel-Shield provides model-agnostic detection with high precision but misses some attacks.
Results
| Metric | Value | Baseline | Delta | Split / Dataset | Evidence | Evidence Ref |
|---|---|---|---|---|---|---|
| Guard Prompts recall | 99.6–100% | Pure LLM | Large increase vs. baseline | Balanced benign/adversarial set (250/250) | Table 1 shows near-100% recall and F1 for Guard Prompts | Table 1 |
| GenTel-Shield precision / recall / F1 | 99.51% / 81.6% / ~89.7% | Pure LLM | High precision, moderate recall | Balanced benign/adversarial set (250/250) | Table 1 GenTel-Shield row | Table 1 |
What To Try In 7 Days
Run a k3s demo cluster on spare machines to test private-cloud deployment.
Add simple system-level guard prompts to existing LLM prompts and test with known injection samples.
Integrate a model-agnostic pre-generation detector (e.g., GenTel-Shield) and measure false positives and missed attacks.
Optimization Features
Infra Optimization
System Optimization
Inference Optimization
Reproducibility
Code URLs
Risks & Boundaries
Limitations
Guard prompts require manual, scenario-specific tuning and may not generalise to new domains.
GenTel-Shield misses some attacks (recall ~81.6%) when used alone.
When Not To Use
High-stakes autonomous decision systems where full formal verification is required.
Unrestricted creative generation tasks where prompt constraints would block desired outputs.
Failure Modes
Detector false negatives allow obfuscated injection to reach the model.
Guard prompts can be bypassed by novel or domain-specific obfuscation without prompt updates.

