KnowEdit benchmark and EasyEdit toolkit: a unified study and comparison of methods to change facts inside LLMs

January 2, 2024 (9 min read)

Overview

Production Readiness

0.5

Novelty Score

0.7

Cost Impact Score

0.6

Citation Count

20

Authors

Ningyu Zhang, Yunzhi Yao, Bozhong Tian, Peng Wang, Shumin Deng, Mengru Wang, Zekun Xi, Shengyu Mao, Jintian Zhang, Yuansheng Ni, Siyuan Cheng, Ziwen Xu, Xin Xu, Jia-Chen Gu, Yong Jiang, Pengjun Xie, Fei Huang, Lei Liang, Zhiqiang Zhang, Xiaowei Zhu, Jun Zhou, Huajun Chen

Why It Matters For Business

Knowledge editing can cheaply update specific facts or behaviors in an LLM without full retraining, saving compute and time. However, edits can fail to generalize and may break unrelated behavior, so careful validation is required.

Summary TLDR

This paper surveys methods to update facts inside large language models, proposes a three-phase taxonomy (resort to external knowledge, merge knowledge into the model, edit intrinsic parameters), and releases a new benchmark (KnowEdit) plus the EasyEdit toolkit. The authors run a large empirical comparison on Llama2-7b-chat across retrieval-based, parameter-efficient, and locate-and-edit methods. Results show that many methods can force a target answer (high edit success) but struggle with portability (making edits usable in reasoning) and with large-scale or erasure edits. The authors also analyze where edits change weights and show that knowledge-locating methods find entity-related areas but not full facts.

Problem Statement

Updating or removing specific facts in a trained LLM should be fast, local, and low-cost. Full retraining is expensive and brittle. Existing editing methods vary widely in how reliably they change a fact, how much they break unrelated knowledge, and how well edits generalize to related queries. This paper benchmarks and analyzes these trade-offs.

Main Contribution

A simple three-phase taxonomy for knowledge editing: recognition (external memory), association (merge representations), mastery (edit weights).

KnowEdit: a multi-task benchmark (WikiData recent, ZsRE, WikiBio, WikiData counterfact, ConvSent, Sanitation) and evaluation protocol for insertion, modification, and erasure.

EasyEdit: an open-source framework to run, compare, and reproduce editing methods.

A large empirical study on Llama2-7b-chat comparing representative methods (SERAC, ICE, AdaLoRA, MEND, ROME, MEMIT, FT-L, FT-M) and analyses of weight-change sparsity, locating methods, and sequential edits.

Key Findings

Several editing methods can reach near-perfect edit success on fact-insertion and fact-modification datasets.

Numbers: WikiData recent edit success: AdaLoRA=100, FT-M=100 (Table 4)

Portability (ability to use edited facts in related reasoning or aliases) remains low across methods.

Numbers: Example: WikiData recent portability ranges from about 36.9 to 65.4 across methods (ICE=36.93, AdaLoRA=64.69, FT-M=65.44)
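The gap between edit success and portability can be illustrated with a minimal sketch. It assumes plain exact-match scoring, which may differ from the benchmark's own scoring, and uses a hypothetical `toy_model` lookup in place of an edited LLM. All prompts, names, and answers below are invented for illustration and are not taken from KnowEdit.

```python
from typing import Callable, List, Tuple

def exact_match_rate(model: Callable[[str], str],
                     pairs: List[Tuple[str, str]]) -> float:
    """Percentage of prompts whose answer matches the target (case-insensitive)."""
    if not pairs:
        return 0.0
    hits = sum(
        1 for prompt, target in pairs
        if model(prompt).strip().lower() == target.strip().lower()
    )
    return 100.0 * hits / len(pairs)

# Hypothetical stand-in for an edited model: it memorized the edited prompt
# and one rephrasing, but cannot chain the new fact into further reasoning.
EDITED_ANSWERS = {
    "Who is the CEO of Acme Corp?": "Jane Doe",
    "Who is Acme Corp's chief executive?": "Jane Doe",
}

def toy_model(prompt: str) -> str:
    return EDITED_ANSWERS.get(prompt, "unknown")

# The prompt that was directly edited.
edit_prompts = [("Who is the CEO of Acme Corp?", "Jane Doe")]

# Portability probes: one alias-style rephrasing and one one-hop question.
portability_prompts = [
    ("Who is Acme Corp's chief executive?", "Jane Doe"),
    ("Which university did the CEO of Acme Corp attend?", "Example University"),
]

edit_success = exact_match_rate(toy_model, edit_prompts)
portability = exact_match_rate(toy_model, portability_prompts)
print(f"edit success: {edit_success:.1f}, portability: {portability:.1f}")
```

In this toy setup the directly edited prompt is always answered correctly, while the one-hop probe fails, which mirrors the pattern of high edit success alongside lower portability.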

Locality varies: some editors change few parameters, others spread updates broadly.

Numbers: Sparse editors (ROME/MEMIT/MEND) show concentrated column updates; FT-L spreads changes (Figure 6 and Table 4 locality:M
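One way to see why some editors produce highly structured weight changes is a toy rank-one edit on a single linear layer. The sketch below is a simplification under stated assumptions: it uses a plain least-change rank-one update with no covariance weighting, random weights, and hypothetical names (`rank_one_edit`, `k`, `v_new`). It is not the ROME, MEMIT, or MEND procedure itself.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out = 8, 6

W = rng.normal(size=(d_out, d_in))   # toy layer weights
k = rng.normal(size=d_in)            # "key": representation of the edited subject
v_new = rng.normal(size=d_out)       # "value": output that encodes the new fact

def rank_one_edit(W: np.ndarray, k: np.ndarray, v_new: np.ndarray) -> np.ndarray:
    """Return W' = W + (v_new - W k) k^T / (k^T k), so that W' k == v_new."""
    residual = v_new - W @ k
    return W + np.outer(residual, k) / (k @ k)

W_edited = rank_one_edit(W, k, v_new)

# The structured edit changes W by a rank-one matrix.
delta_rank_one = W_edited - W
# A dense perturbation, standing in for an unconstrained fine-tuning step.
delta_dense = 0.01 * rng.normal(size=W.shape)

print("edited output matches target:", np.allclose(W_edited @ k, v_new))
print("rank of structured update:", np.linalg.matrix_rank(delta_rank_one))
print("rank of dense update:", np.linalg.matrix_rank(delta_dense))

# Inputs orthogonal to the key are left untouched by the rank-one edit.
q = rng.normal(size=d_in)
q -= (q @ k) / (k @ k) * k
print("orthogonal input unchanged:", np.allclose(W_edited @ q, W @ q))
```

The last check hints at why structured edits can preserve locality in principle: only inputs that overlap with the edited key see a different output.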

Erasure (making model forget) is inconsistent and can harm unrelated knowledge.

Numbers: Sanitation edit success: ROME=85, FT-M=75, but the corresponding locality is often low (e.g., ROME locality ~50.31)

Sequential or large-scale edits degrade performance: many methods fail after hundreds to thousands of edits.

Numbers: Performance drops sharply after 1,000 sequential edits; AdaLoRA is more stable up to ~100 edits (Figure 4)

Results

LoRA

Value: 100

Portability (example values)

Value: Often 36.9–76.1 across datasets and methods

Sanitation - Edit Success (ROME)

Value: 85

Sequential edits - robustness

Value: Performance collapses after ~1,000 edits; AdaLoRA is stable up to ~100 edits

Who Should Care

What To Try In 7 Days

Install EasyEdit and run the provided recipes on a small Llama2-7b-chat snapshot.

Reproduce one simple insertion (WikiData recent) with AdaLoRA and FT-M and compare edit success and portability.

Run a locality check: measure unchanged answers on a held-out 'retain' set after the edit and log failures.
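The locality check in the last step could be sketched as follows. This is a minimal illustration, assuming locality is scored as the percentage of retain-set prompts whose answer is identical before and after the edit. The two model callables and the retain prompts are hypothetical stand-ins, not part of EasyEdit.

```python
import logging
from typing import Callable, Iterable, List, Tuple

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("locality_check")

def locality_check(
    before: Callable[[str], str],
    after: Callable[[str], str],
    retain_prompts: Iterable[str],
) -> Tuple[float, List[Tuple[str, str, str]]]:
    """Compare pre-edit and post-edit answers on a held-out retain set.

    Returns the percentage of unchanged answers and the list of failures
    as (prompt, old_answer, new_answer) tuples.
    """
    failures: List[Tuple[str, str, str]] = []
    total = 0
    for prompt in retain_prompts:
        total += 1
        old, new = before(prompt), after(prompt)
        if old != new:
            failures.append((prompt, old, new))
            log.warning("locality failure: %r changed from %r to %r", prompt, old, new)
    score = 100.0 * (total - len(failures)) / total if total else 0.0
    return score, failures

# Hypothetical stand-ins for the model before and after an edit.
PRE_EDIT = {
    "Capital of France?": "Paris",
    "Capital of Japan?": "Tokyo",
    "Capital of Italy?": "Rome",
    "Capital of Spain?": "Madrid",
}
POST_EDIT = dict(PRE_EDIT, **{"Capital of Italy?": "Milan"})  # unintended side effect

score, failures = locality_check(PRE_EDIT.get, POST_EDIT.get, PRE_EDIT.keys())
print(f"locality: {score:.1f}%, failures: {len(failures)}")
```

Keeping the failure tuples, rather than only the score, makes it easier to see which unrelated facts an edit disturbed.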

Agent Features

Architectures

  • Transformer

Optimization Features

Model Optimization

  • LoRA
  • MEND (hypernetwork rank-one updates)

Training Optimization

  • FT-M: constrained fine-tuning objective on an FFN layer
  • Hypernetwork meta-learning for ∆W

Reproducibility

Code Available

Data Available

Open Source Status

  • yes

Risks & Boundaries

Limitations

  • Portability is low: edits rarely propagate cleanly into related reasoning chains.
  • Erasure and privacy sanitization are inconsistent and can damage unrelated knowledge.
  • Sequential or large-scale edits break current editors; methods are not robust for mass updates.
  • Knowledge-locating methods often identify entity-related areas but not the whole fact chain.

When Not To Use

  • When you need provable, auditable deletion of sensitive data at scale.
  • When you must apply thousands of edits without retraining or external memory.
  • When edits must be guaranteed to generalize across reasoning and alias forms without testing.

Failure Modes

  • Partial token replacement (conflicting residual memory)
  • Meaningless or repeated token generation
  • Missing tokens (incomplete target answers)
  • Knowledge-irrelevant generation (side effects from broad parameter changes)

Core Entities

Models

  • Llama2-7b-chat

Metrics

  • Edit Success
  • Portability
  • Locality
  • Fluency
  • ROUGE-1
  • KL-divergence

Datasets

  • KnowEdit
  • WikiData recent
  • ZsRE
  • WikiBio
  • WikiData counterfact
  • ConvSent
  • Sanitation

Benchmarks

  • KnowEdit

Context Entities

Models

  • GPT-4
  • GPT-3
  • LoRA

Metrics

  • Hit@10
  • Hit@50
  • n-gram entropy (fluency)
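As a rough illustration of n-gram entropy as a fluency signal, the sketch below computes the Shannon entropy of the n-gram distribution in a generated string. It assumes whitespace tokenization and an unweighted average of bigram and trigram entropy; the exact weighting used in the paper's evaluation is not given here, so `fluency_score` is a hypothetical helper.

```python
import math
from collections import Counter
from typing import List

def ngram_entropy(tokens: List[str], n: int) -> float:
    """Shannon entropy (in bits) of the empirical n-gram distribution."""
    ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    if not ngrams:
        return 0.0
    counts = Counter(ngrams)
    total = len(ngrams)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def fluency_score(text: str) -> float:
    """Unweighted mean of bigram and trigram entropy (an assumed weighting)."""
    tokens = text.split()
    return (ngram_entropy(tokens, 2) + ngram_entropy(tokens, 3)) / 2

degenerate = "the the the the the the"
varied = "the edited model answers the question about the new fact"

print(f"degenerate: {fluency_score(degenerate):.3f}")
print(f"varied: {fluency_score(varied):.3f}")
```

Repetitive output, one of the failure modes listed above, collapses the n-gram distribution to a single item and drives the entropy to zero, which is why a low score flags degenerate generations.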

Datasets

  • RealToxicityPrompts (safety context)
  • FEVER, Vitamin-C (fact-checking context)