A small, formal language that turns vague memory commands into safe, verifiable operations for LLM agents

September 14, 2025 · 7 min

Overview

Decision Snapshot: Needs Validation

The design is concrete and practical (schema, pipeline, SQL prototype), but empirical validation is planned rather than reported. The engineering foundation is sound; what is missing are published experiments comparing adapters or model performance.

Citations: 1

Evidence Strength: 0.60

Confidence: 0.80

Risk Signals: 9

Trust Signals

Findings with numeric evidence: 3/3

Findings with evidence refs: 3/3

Results with explicit delta: 0/3

Reproducibility

Status: No open assets linked

Open source: Unknown

At A Glance

Cost impact: 50%

Production readiness: 60%

Novelty: 60%

Authors

Yi Wang, Lihai Yang, Boyu Chen, Gongyi Zou, Kerun Xu, Bo Tang, Feiyu Xiong, Siheng Chen, Zhiyu Li


Why It Matters For Business

Text2Mem makes memory commands predictable and auditable. That reduces bugs from inconsistent agent behavior, improves portability across memory backends, and makes long-running agent behavior testable and repeatable.

Who Should Care

Summary TLDR

Text2Mem defines a compact JSON-based language and an execution pipeline that converts natural-language memory instructions into validated, typed operations. It standardizes 12 memory verbs (encode, update, promote, demote, merge, split, lock, expire, label, delete, retrieve, summarize), a 5-field schema, and a validator-parser-adapter pathway. A companion benchmark (Text2Mem Bench) separates planning (NL→schema) from execution (schema→SQL/backend effects) so systems can be measured for both correctness and real effects.

Problem Statement

Current agent memory systems expose inconsistent, ad-hoc commands. Natural-language memory requests are ambiguous about scope, action, and lifecycle. This causes unpredictable behavior, poor portability across systems, and hard-to-reproduce experiments.

Main Contribution

A verb-centered operation language (Text2Mem) with twelve mutually exclusive operations covering encoding, storage, and retrieval.

A compact, schema-based JSON contract (five backbone fields: stage, op, target, args, meta) plus a validator → parser → adapter pipeline that enforces safety and determinism before execution.
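The validator → parser → adapter pathway can be pictured with a minimal sketch. Everything below is illustrative rather than the paper's implementation: the `Command` shape, the `DictAdapter` backend, the `id` and `text` keys, and the stage strings are assumptions chosen only to show the order of the three stages.

```python
from dataclasses import dataclass
from typing import Any, Callable

BACKBONE = ("stage", "op", "target", "args", "meta")


@dataclass(frozen=True)
class Command:
    """Internal form produced by the parser (hypothetical shape)."""
    op: str
    target: dict[str, Any]
    args: dict[str, Any]
    actor: str


def validate(raw: dict[str, Any]) -> dict[str, Any]:
    """Stage 1: reject anything that is not exactly the five-field backbone."""
    missing = [f for f in BACKBONE if f not in raw]
    extra = [f for f in raw if f not in BACKBONE]
    if missing or extra:
        raise ValueError(f"bad schema: missing={missing} extra={extra}")
    return raw


def parse(valid: dict[str, Any]) -> Command:
    """Stage 2: turn validated JSON into a typed command object."""
    return Command(
        op=valid["op"].lower(),
        target=dict(valid["target"]),
        args=dict(valid["args"]),
        actor=valid["meta"].get("actor", "unknown"),
    )


class DictAdapter:
    """Stage 3: toy backend that maps commands onto an in-memory dict."""

    def __init__(self) -> None:
        self.store: dict[str, str] = {}
        self.handlers: dict[str, Callable[[Command], Any]] = {
            "encode": self._encode,
            "retrieve": self._retrieve,
        }

    def execute(self, cmd: Command) -> Any:
        if cmd.op not in self.handlers:
            raise NotImplementedError(f"adapter does not support {cmd.op!r}")
        return self.handlers[cmd.op](cmd)

    def _encode(self, cmd: Command) -> str:
        self.store[cmd.target["id"]] = cmd.args["text"]
        return cmd.target["id"]

    def _retrieve(self, cmd: Command) -> Any:
        return self.store.get(cmd.target["id"])


def run(raw: dict[str, Any], adapter: DictAdapter) -> Any:
    """Nothing reaches the backend unless validation and parsing succeed."""
    return adapter.execute(parse(validate(raw)))
```

Because `run` composes the stages in a fixed order, a malformed command fails in `validate` before the adapter is ever touched, which is the property the pipeline is meant to guarantee.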

Key Findings

Text2Mem defines a fixed inventory of twelve memory operations covering encoding, storage, and retrieval.

Numbers: 12 operations (Table I; encoding/storage/retrieval split)

Practical Use: Design agent memory workflows by composing these 12 verbs; expect consistent behavior when adapters implement them.

Evidence Ref: Section III-A; Table I

Every operation is a typed JSON object with a five-field backbone: stage, op, target, args, meta.

Numbers: 5 backbone fields (stage, op, target, args, meta)

Practical Use: When building a memory API, require these fields to avoid ambiguous commands and to enable automated validation.

Evidence Ref: Section III-B (Schema Architecture)
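One way to make the two findings above concrete is a typed loader that accepts only the twelve verbs and the five backbone fields. This is a sketch under assumptions: the verb and field names come from the summary, but the class names, the lowercase string values, and the choice to require JSON objects for `target`, `args`, and `meta` are illustrative and not taken from the paper.

```python
import json
from dataclasses import dataclass
from enum import Enum
from typing import Any


class Op(str, Enum):
    """The twelve verbs listed in the summary; string values are assumed."""
    ENCODE = "encode"
    UPDATE = "update"
    PROMOTE = "promote"
    DEMOTE = "demote"
    MERGE = "merge"
    SPLIT = "split"
    LOCK = "lock"
    EXPIRE = "expire"
    LABEL = "label"
    DELETE = "delete"
    RETRIEVE = "retrieve"
    SUMMARIZE = "summarize"


FIELDS = {"stage", "op", "target", "args", "meta"}


@dataclass(frozen=True)
class MemoryOp:
    stage: str
    op: Op
    target: dict[str, Any]
    args: dict[str, Any]
    meta: dict[str, Any]

    @classmethod
    def from_json(cls, text: str) -> "MemoryOp":
        data = json.loads(text)
        if not isinstance(data, dict) or set(data) != FIELDS:
            raise ValueError("expected exactly: stage, op, target, args, meta")
        for key in ("target", "args", "meta"):
            if not isinstance(data[key], dict):
                raise ValueError(f"{key} must be a JSON object")
        # Op(...) raises ValueError for any verb outside the fixed inventory.
        return cls(data["stage"], Op(data["op"]),
                   data["target"], data["args"], data["meta"])
```

A command such as `{"stage": "storage", "op": "forget", ...}` is rejected at load time because `forget` is not one of the twelve verbs, so ambiguity is caught before any backend is involved.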

Results

Metric | Value | Baseline | Delta | Split / Dataset | Evidence | Evidence Ref
Operation inventory | 12 verbs covering encode/storage/retrieval | - | - | - | Table I lists 12 operations and their motivation. | Section III-A; Table I
Schema backbone | 5 fields (stage, op, target, args, meta) | - | - | - | Schema architecture and invariants described for each field. | Section III-B

What To Try In 7 Days

Map a small set of your agent memory flows to the Text2Mem verbs and enforce the five-field schema for incoming commands.

Run the SQL prototype for one workflow (encode → retrieve → promote) to confirm expected DB effects and logs.
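Since no implementation is linked, a stand-in for that encode → retrieve → promote check can be written against SQLite. The table layout, column names, and the `op_log` audit table below are assumptions made for illustration; they are not the paper's SQL prototype.

```python
import sqlite3


def open_db() -> sqlite3.Connection:
    """Create an in-memory database with a memory table and an operation log."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE memory (
            id INTEGER PRIMARY KEY,
            text TEXT NOT NULL,
            priority INTEGER NOT NULL DEFAULT 0
        );
        CREATE TABLE op_log (
            seq INTEGER PRIMARY KEY AUTOINCREMENT,
            op TEXT NOT NULL,
            memory_id INTEGER
        );
        """
    )
    return conn


def _log(conn: sqlite3.Connection, op: str, memory_id) -> None:
    conn.execute("INSERT INTO op_log (op, memory_id) VALUES (?, ?)", (op, memory_id))


def encode(conn: sqlite3.Connection, text: str) -> int:
    """Store a new memory and return its row id."""
    with conn:  # commit the write and its log entry together
        cur = conn.execute("INSERT INTO memory (text) VALUES (?)", (text,))
        _log(conn, "encode", cur.lastrowid)
    return cur.lastrowid


def retrieve(conn: sqlite3.Connection, keyword: str) -> list:
    """Return matching rows, highest priority first."""
    with conn:
        rows = conn.execute(
            "SELECT id, text, priority FROM memory "
            "WHERE text LIKE ? ORDER BY priority DESC, id",
            (f"%{keyword}%",),
        ).fetchall()
        _log(conn, "retrieve", None)
    return rows


def promote(conn: sqlite3.Connection, memory_id: int, by: int = 1) -> int:
    """Raise a memory's priority; returns the number of rows changed."""
    with conn:
        cur = conn.execute(
            "UPDATE memory SET priority = priority + ? WHERE id = ?",
            (by, memory_id),
        )
        _log(conn, "promote", memory_id)
    return cur.rowcount
```

Checking both the `memory` rows and the `op_log` sequence after each step gives the "expected DB effects and logs" confirmation in a form that can be asserted in a test.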

Add validator checks for destructive actions (global writes, hard deletes) and require confirmation/dry-run flags.
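A guard of this kind might look like the sketch below. The flag names (`meta.dry_run`, `meta.confirmation`, `target.all`, `args.soft`) and the decision to treat every verb except retrieve and summarize as a write are assumptions for illustration; the summary does not specify how the paper encodes them.

```python
from typing import Any

# Assumption: every verb except retrieve and summarize mutates stored memory.
WRITE_OPS = frozenset({
    "encode", "update", "promote", "demote", "merge",
    "split", "lock", "expire", "label", "delete",
})


def guard(command: dict[str, Any]) -> list[str]:
    """Return a list of violations; an empty list means the command may run."""
    op = command.get("op")
    target = command.get("target", {})
    args = command.get("args", {})
    meta = command.get("meta", {})

    # The caller must opt in explicitly with a real boolean True.
    acknowledged = meta.get("dry_run") is True or meta.get("confirmation") is True

    problems: list[str] = []
    if op in WRITE_OPS and target.get("all") is True and not acknowledged:
        problems.append("global write requires meta.confirmation or meta.dry_run")
    if op == "delete" and not args.get("soft", False) and not acknowledged:
        problems.append("hard delete requires meta.confirmation or meta.dry_run")
    return problems
```

Returning all violations at once, rather than raising on the first, lets the caller show the user everything that needs confirming in a single round trip.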

Agent Features

Memory
long-term memory lifecycle controls; priority and governance (promote/demote/lock/expire)
Planning
schema generation from natural language; multi-step workflow planning (schema lists)
Tool Use
LLM summarization; embedding services
Frameworks
validator-parser-adapter; SQL prototype backend; adapters for MemOS/mem0/Letta
Is Agentic

Yes

Architectures
memory operating layer; schema-driven adapter
Collaboration
auditable workflows with actor/meta fields

Optimization Features

System Optimization
separation of planning and execution to reduce ambiguity; schema validation moves safety checks earlier

Reproducibility

Code Available: No
Data Available: No
Open Source Status: Unknown
License: Unknown

Risks & Boundaries

Limitations

No released implementation or evaluation results in this paper; benchmark results are promised later.

Relies on LLM services for encoding and summarization, which shifts cost and variability to external models.

When Not To Use

When memory needs are simple and ephemeral (no governance or lifecycle controls).

When you cannot run or afford LLM-based services for encoding/summarization.

Failure Modes

Incorrect schema generation from ambiguous natural language leading to wrong actions.

Adapter mismatch where backend lacks a semantic equivalent of a verb, producing inconsistent effects.
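The adapter-mismatch failure mode can be reduced by checking a whole plan against the verbs a backend declares before executing any step. The sketch below assumes each adapter exposes a set of supported verbs; that capability set, the `UnsupportedVerb` exception, and the example backend are hypothetical and do not describe MemOS, mem0, or Letta.

```python
from typing import Any, Iterable

ALL_VERBS = frozenset({
    "encode", "update", "promote", "demote", "merge", "split",
    "lock", "expire", "label", "delete", "retrieve", "summarize",
})


class UnsupportedVerb(Exception):
    """Raised when a plan uses a verb the chosen backend cannot honor."""


def preflight(plan: Iterable[dict[str, Any]], supported: frozenset[str]) -> None:
    """Fail before execution if any step's verb is unknown or unsupported."""
    used = {step["op"] for step in plan}
    unknown = sorted(used - ALL_VERBS)
    if unknown:
        raise ValueError(f"not Text2Mem verbs: {unknown}")
    missing = sorted(used - supported)
    if missing:
        raise UnsupportedVerb(f"backend lacks: {missing}")


# Hypothetical key-value backend that has no notion of locking or priority.
KEY_VALUE_VERBS = frozenset({"encode", "update", "retrieve", "delete"})
```

Failing the whole plan up front avoids a workflow that encodes a memory and then silently skips the `lock` step, which would leave the store in a state the schema never described.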

Core Entities

Metrics

SMA, ESR, EMR

Datasets

Text2Mem Bench

Benchmarks

Text2Mem Bench