Open-source, low-cost platform that secures RAG chatbots for small businesses using k3s clusters and layered prompt-defences

Overview

Decision SnapshotNeeds Validation

The system is practically oriented and tested in a real e-commerce deployment, but results come from one case study and adapted attack datasets, so broader validation is still needed.

Citations0

Evidence Strength0.80

Confidence0.85

Risk Signals11

Trust Signals

Findings with numeric evidence: 5/5

Findings with evidence refs: 5/5

Results with explicit delta: 6/6

Reproducibility

Status: Partial assets available

Open source: Partial

At A Glance

Cost impact: 80%

Production readiness: 78%

Novelty: 48%

Authors

Jiazhu Xie, Bowen Li, Heyu Fu, Chong Gao, Ziqi Xu, Fengling Han

Links

Abstract / PDF / Code

Why It Matters For Business

Small businesses can run secure, low-cost RAG chatbots on commodity hardware while keeping strong tenant isolation and practical defenses against prompt injection.

Who Should Care

CTO Product Manager Founder Engineering Lead ML Engineer

Summary TLDR

This paper presents an open-source, multi-tenant platform for small businesses to deploy retrieval-augmented chatbots on low-cost, distributed k3s clusters. Security is handled at the platform level with container isolation, PII screening, guard prompts, and a pre-generation detector (GenTel-Shield). In an e-commerce case study, guard prompts alone give near-100% recall; GenTel-Shield achieves high precision (99.51%) and moderate recall (81.6%); combined defenses reach ~100% recall and ~99.8% F1. The k3s private cloud matched or reduced latency versus bare-metal for evaluated LLMs.

Problem Statement

Small businesses lack budget and engineering staff to run cloud GPU fleets and to harden RAG chatbots against prompt injection and data leakage; we need a low-cost, deployable platform that enforces tenant isolation and practical prompt-injection defenses without retraining models.

Main Contribution

An open-source, multi-tenant platform built on lightweight k3s clusters and an encrypted overlay network for small-business LLM deployments.

A layered, platform-level prompt-injection mitigation combining system-level guard prompts and the GenTel-Shield detector that avoids model retraining.

Key Findings

Guard prompts block prompt-injection attacks almost perfectly in the case study.

NumbersRecall 99.6–100%, F1 ~100% (Table 1)

Practical UseIf you carefully craft and test guard prompts, you can achieve near-complete attack blocking without model changes, but expect manual tuning and maintenance.

Evidence RefTable 1

GenTel-Shield provides model-agnostic detection with high precision but misses some attacks.

NumbersPrecision 99.51%, Recall 81.6%, F1 ~89.7% (Table 1)

Practical UseDeploy GenTel-Shield to reduce false positives and simplify operations; pair it with other controls to catch missed attacks.

Evidence RefTable 1

Results

Metric	Value	Baseline	Delta	Split / Dataset	Evidence	Evidence Ref
Guard Prompts recall	99.6–100%	Pure LLM	Large increase vs. baseline	Balanced benign/adversarial set (250/250)	Table 1 shows near-100% recall and F1 for Guard Prompts	Table 1
GenTel-Shield precision / recall / F1	99.51% / 81.6% / ~89.7%	Pure LLM	High precision, moderate recall	Balanced benign/adversarial set (250/250)	Table 1 GenTel-Shield row	Table 1

What To Try In 7 Days

Run a k3s demo cluster on spare machines to test private-cloud deployment.

Add simple system-level guard prompts to existing LLM prompts and test with known injection samples.

Integrate a model-agnostic pre-generation detector (e.g., GenTel-Shield) and measure false positives and missed attacks.

Optimization Features

Infra Optimization

k3s lightweight KubernetesPooling commodity hardware to reduce cost

System Optimization

Encrypted overlay networkingMulti-tenant container isolation

Inference Optimization

GPU-aware schedulingDistributed inference across heterogeneous nodesContainerised workload placement

Reproducibility

Code AvailableYes

Data AvailableNo

Open Source StatusPartial

LicenseUnknown

Code URLs

https://aisuko.github.io/secure_llm/

Risks & Boundaries

Limitations

Guard prompts require manual, scenario-specific tuning and may not generalise to new domains.

GenTel-Shield misses some attacks (recall ~81.6%) when used alone.

When Not To Use

High-stakes autonomous decision systems where full formal verification is required.

Unrestricted creative generation tasks where prompt constraints would block desired outputs.

Failure Modes

Detector false negatives allow obfuscated injection to reach the model.

Guard prompts can be bypassed by novel or domain-specific obfuscation without prompt updates.

Core Entities

Models

GPT-4.1GPT-4.1-miniMinistral-3BGenTel-Shield (detector)

Metrics

PrecisionRecallF1End-to-end inference latency (s)

Datasets

Customer Support queries (ATS)GenTel-Safe prompt-injection attack dataset (adapted)

Benchmarks

GenTel-Safe

Overview

Trust Signals

Reproducibility

At A Glance

Authors

Links

Why It Matters For Business

Who Should Care

Summary TLDR

Problem Statement

Main Contribution

Key Findings

Guard prompts block prompt-injection attacks almost perfectly in the case study.

GenTel-Shield provides model-agnostic detection with high precision but misses some attacks.

Results

What To Try In 7 Days

Optimization Features

Reproducibility

Code URLs

Risks & Boundaries

Limitations

When Not To Use

Failure Modes

Core Entities

Models

Metrics

Datasets

Benchmarks

You May Also Want to Read

Short adversarial suffixes can flip LLM-as-a-Judge decisions; CUA >30% success

Key finding

BackdoorAgent: a stage-aware framework and benchmark showing memory backdoors persist across multi-step LLM agents

Key finding

JudgeDeceiver: automatically craft prompts that reliably trick LLM-as-a-Judge to pick an attacker’s response

Key finding

Make tool-using LLM agents provably safe by combining safety engineering, info-flow labels, and MCP extensions

Key finding

A systematic, practitioner-focused map of 193 multi-agent security threats and how 16 frameworks cover them

Key finding