Add an editable triplet memory to LLMs via a read/write API and vector lookup

May 23, 2023 · 7 min

Overview

Decision Snapshot: Needs Validation

The idea is practical and cheap to prototype (LoRA + LSH), but the evidence is limited to synthetic training data and qualitative examples; broader robustness and scaling are untested.

Citations: 6

Evidence Strength: 0.30

Confidence: 0.60

Risk Signals: 13

Trust Signals

Findings with numeric evidence: 0/3

Findings with evidence refs: 3/3

Results with explicit delta: 0/2

Reproducibility

Status: No open assets linked

Open source: Unknown

At A Glance

Cost impact: 40%

Production readiness: 30%

Novelty: 60%

Authors

Ali Modarressi, Ayyoob Imani, Mohsen Fayyaz, Hinrich Schütze

Why It Matters For Business

An external editable memory lets products keep facts up to date, audit what an LLM used to answer, and combine scattered facts without retraining the model.

Summary TLDR

RET-LLM is a concept for giving language models an external, editable read/write memory. The memory stores extracted facts as triplets <arg1, relation, arg2>, keeps vector embeddings for fuzzy lookup with LSH, and exposes a text-based memory API ([MEM_WRITE], [MEM_READ]). The authors finetune Alpaca-7B (with LoRA) on synthetic triplet tasks so the model learns to emit API calls. Qualitative examples show RET-LLM answering questions correctly where the base Alpaca model failed. No large-scale quantitative evaluation is provided yet.
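To make the text-based API concrete, here is a minimal sketch of how a controller might detect memory calls in generated text and run them against a store. The call syntax (`[MEM_WRITE{t1>>relation>>t2}]`, with an empty slot marking the unknown field in a read) is an assumption made for illustration, as are the names `parse_calls`, `MemoryCall`, and `Controller`; the summary above only names the two operations. The store here uses exact text matching, not vector lookup.

```python
from __future__ import annotations

import re
from typing import NamedTuple, Optional

# Assumed call syntax (illustration only):
#   [MEM_WRITE{t1>>relation>>t2}]   and   [MEM_READ{t1>>relation>>}]
CALL_PATTERN = re.compile(r"\[(MEM_WRITE|MEM_READ)\{([^{}]*)\}\]")


class MemoryCall(NamedTuple):
    op: str
    t1: Optional[str]
    relation: Optional[str]
    t2: Optional[str]


def parse_calls(generated_text: str) -> list[MemoryCall]:
    """Extract well-formed memory calls from model output."""
    calls = []
    for op, body in CALL_PATTERN.findall(generated_text):
        fields = [part.strip() or None for part in body.split(">>")]
        if len(fields) != 3:
            continue  # malformed call: wrong number of fields
        if op == "MEM_WRITE" and None in fields:
            continue  # a write needs all three fields
        calls.append(MemoryCall(op, *fields))
    return calls


class Controller:
    """Runs parsed calls against an exact-match, in-memory triplet list."""

    def __init__(self) -> None:
        self.rows: list[tuple[str, str, str]] = []

    def execute(self, call: MemoryCall) -> list[tuple[str, str, str]]:
        if call.op == "MEM_WRITE":
            self.rows.append((call.t1, call.relation, call.t2))
            return []
        query = (call.t1, call.relation, call.t2)
        # An empty (None) slot in a read matches any value.
        return [row for row in self.rows
                if all(q is None or q == v for q, v in zip(query, row))]


controller = Controller()
for call in parse_calls("Noted. [MEM_WRITE{Alice Smith>>employed by>>Acme Corp}]"):
    controller.execute(call)

results = [controller.execute(c)
           for c in parse_calls("[MEM_READ{>>employed by>>Acme Corp}]")]
print(results)
```

In a full loop, the controller would append the read result to the prompt so the model can continue generating its answer; that step is omitted here.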

Problem Statement

LLMs encode knowledge implicitly in their parameters. They lack a dedicated, editable memory that can store, update, and aggregate facts across documents and time. As a result, handling changing facts, aggregations, and explicit retrieval is hard without retraining.

Main Contribution

Design of an external read/write memory that stores facts as triplets ⟨t1, relation, t2⟩ and keeps mean vector embeddings for each triplet field.

A simple text-based memory API (MEM_WRITE / MEM_READ) so an LLM can call memory via generated text and a controller.
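The memory structure in the first contribution, a triplet stored alongside one mean embedding per field, could be sketched as follows. This is an illustrative sketch, not the authors' implementation: `token_vector` is a hypothetical hash-based stand-in for a real embedding model, and `DIM`, `TripletEntry`, and `TripletMemory` are names invented for the example.

```python
from __future__ import annotations

import hashlib
from dataclasses import dataclass

DIM = 16  # toy embedding size, chosen only for illustration


def token_vector(token: str) -> list[float]:
    """Hypothetical stand-in for an embedding model: hash a token to DIM floats in [-1, 1]."""
    digest = hashlib.sha256(token.lower().encode("utf-8")).digest()
    return [(byte / 127.5) - 1.0 for byte in digest[:DIM]]


def mean_embedding(text: str) -> list[float]:
    """Average the token vectors of one triplet field."""
    tokens = text.split()
    if not tokens:
        return [0.0] * DIM
    vectors = [token_vector(t) for t in tokens]
    return [sum(column) / len(vectors) for column in zip(*vectors)]


@dataclass
class TripletEntry:
    t1: str
    relation: str
    t2: str
    t1_vec: list[float]
    relation_vec: list[float]
    t2_vec: list[float]


class TripletMemory:
    """Keeps the text of each triplet next to one mean vector per field."""

    def __init__(self) -> None:
        self.entries: list[TripletEntry] = []

    def write(self, t1: str, relation: str, t2: str) -> TripletEntry:
        entry = TripletEntry(
            t1, relation, t2,
            mean_embedding(t1), mean_embedding(relation), mean_embedding(t2),
        )
        self.entries.append(entry)
        return entry


memory = TripletMemory()
entry = memory.write("Alice Smith", "employed by", "Acme Corp")
print(entry.t1, "|", entry.relation, "|", entry.t2, "|", len(entry.t1_vec))
```

Keeping a separate vector per field, instead of one vector per triplet, lets a query match on any subset of the three fields.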

Key Findings

In qualitative examples, RET-LLM produced correct answers while the base Alpaca-7B produced incorrect answers despite having the same contextual text.

Practical Use: Storing extracted facts and retrieving them through the memory can fix some retrieval and answering errors without re-supplying the full context or retraining the base model.

Evidence Ref: Figures 3 and 5; qualitative examples in Section 4

The memory stores both text triplets and their mean vector embeddings, using LSH to return semantically similar entries when exact text matches are absent.

Practical Use: Use vector-based fuzzy lookup to retrieve related facts even when wording differs; this supports aggregation across documents.

Evidence Ref: Section 3.1 (Memory Structure) and 3.2 (Memory-API & Dataflow)
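As an illustration of the fuzzy lookup described in this finding, the sketch below implements random-hyperplane LSH for cosine similarity in plain Python. The summary does not say which LSH family the authors use, so this particular scheme is an assumption, and `HyperplaneLSH`, `num_bits`, and `num_tables` are names invented for the example. The three-dimensional vectors are toy values, not real embeddings.

```python
import math
import random
from collections import defaultdict


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class HyperplaneLSH:
    """Random-hyperplane (sign) LSH: nearby directions tend to share buckets."""

    def __init__(self, dim, num_bits=8, num_tables=4, seed=0):
        rng = random.Random(seed)
        self.tables = []
        for _ in range(num_tables):
            planes = [[rng.gauss(0.0, 1.0) for _ in range(dim)]
                      for _ in range(num_bits)]
            self.tables.append((planes, defaultdict(list)))
        self.vectors = {}

    @staticmethod
    def _signature(planes, vec):
        # One bit per hyperplane: which side of the plane the vector lies on.
        return tuple(sum(p * v for p, v in zip(plane, vec)) >= 0.0
                     for plane in planes)

    def add(self, key, vec):
        self.vectors[key] = vec
        for planes, buckets in self.tables:
            buckets[self._signature(planes, vec)].append(key)

    def query(self, vec, top_k=1):
        # Collect keys sharing a bucket in any table, then rank by true cosine.
        candidates = set()
        for planes, buckets in self.tables:
            candidates.update(buckets.get(self._signature(planes, vec), []))
        ranked = sorted(candidates,
                        key=lambda k: cosine(vec, self.vectors[k]),
                        reverse=True)
        return ranked[:top_k]


index = HyperplaneLSH(dim=3)
index.add("employed by", [1.0, 0.2, 0.0])
index.add("born in", [-0.5, 1.0, 0.3])
index.add("located in", [0.0, -0.4, 1.0])
print(index.query([2.0, 0.4, 0.0]))  # same direction as "employed by"
```

A query can come back empty when it shares no bucket with any stored vector. Using more tables or fewer bits per table raises recall at the cost of more candidates to rank.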

Results

Metric: qualitative QA correctness
Value: RET-LLM correct vs. Alpaca-7B incorrect on the provided examples
Baseline: Alpaca-7B zero-shot
Delta: (none)
Split / Dataset: hand-crafted qualitative examples (Figures 3, 5)
Evidence: Section 4 and Figures 3 and 5 show examples where the base model fails but RET-LLM answers correctly
Evidence Ref: Figures 3, 5

Metric: finetuning resource
Value: LoRA on Alpaca-7B, finetuned on a single A6000 48GB GPU
Baseline: (none)
Delta: (none)
Split / Dataset: authors' synthetic dataset
Evidence: Section 3.3 states that LoRA was used to finetune on one A6000 48GB GPU
Evidence Ref: Section 3.3

What To Try In 7 Days

Prototype a triplet extractor that writes simple ⟨entity,relation,entity⟩ rows from documents.

Store triplets with embeddings and use an off-the-shelf LSH index for fuzzy lookup.

Finetune a small instruction model (LoRA) on synthetic read/write examples so it emits MEM_READ/MEM_WRITE calls and test a few QA flows.
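For the third step, a generator for synthetic read/write training pairs might look like the sketch below. Everything in it is a hypothetical illustration: the people, organizations, and relations are made up and are not the authors' dataset, the `instruction`/`output` record fields are an assumed instruction-tuning layout, and the call syntax is the same assumed format (`[MEM_WRITE{t1>>relation>>t2}]`, with an empty slot for the unknown field in a read).

```python
import json
import random

# Hypothetical vocabulary for illustration; not the authors' dataset.
PEOPLE = ["Alice Smith", "Bob Lee", "Carla Gomez"]
ORGS = ["Acme Corp", "Globex", "Initech"]
# (relation, statement template, question template)
RELATIONS = [
    ("employed by", "{p} works at {o}.", "Who works at {o}?"),
    ("founder of", "{p} founded {o}.", "Who founded {o}?"),
]


def make_examples(n_facts, seed=0):
    """Return one write example and one read example per sampled fact."""
    rng = random.Random(seed)
    examples = []
    for _ in range(n_facts):
        person = rng.choice(PEOPLE)
        org = rng.choice(ORGS)
        relation, statement, question = rng.choice(RELATIONS)
        # A statement of fact should trigger a memory write.
        examples.append({
            "instruction": statement.format(p=person, o=org),
            "output": f"[MEM_WRITE{{{person}>>{relation}>>{org}}}]",
        })
        # A question should trigger a memory read with the unknown slot empty.
        examples.append({
            "instruction": question.format(o=org),
            "output": f"[MEM_READ{{>>{relation}>>{org}}}]",
        })
    return examples


examples = make_examples(3)
for example in examples:
    print(json.dumps(example))
```

Real training data would also need examples where the read result is fed back and the model produces the final answer; this sketch covers only the first turn of each flow.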

Agent Features

Memory
read-write, updatable, aggregatable across documents, interpretable (triplet rows), scalable in design (claimed; not empirically tested)
Tool Use
memory-API (text-based read/write), LSH for vector lookup
Frameworks
Davidsonian-style triplet representation (<arg1, relation, arg2>)
Is Agentic

Yes

Architectures
LLM + external memory + controller

Optimization Features

Training Optimization
LoRA
Inference Optimization
LSH for fast approximate retrieval

Reproducibility

Code Available: No
Data Available: No
Open Source Status: Unknown
License: Unknown

Risks & Boundaries

Limitations

Only qualitative examples provided; no quantitative benchmarks or large-scale evaluation.

Finetuning and evaluation use a synthetic population dataset, not real-world corpora.

When Not To Use

High-stakes or safety-critical settings without rigorous evaluation.

Tasks requiring complex relational structures beyond simple triplets.

Failure Modes

Poor triplet extraction yields wrong or missing memory entries.

LSH misses semantically similar items, leading to empty query results.

Core Entities

Models

Alpaca-7B

Datasets

synthetic triplet population (authors generated names, relations, orgs)

Context Entities

Models

MemLLM (follow-up work referenced)