An LLM agent that plans CRISPR experiments, designs guides and protocols, and was validated in a wet‑lab knockout

Overview

Decision Snapshot: Needs Validation

Prototype with real wet‑lab validation and 12‑expert review; promising for lab design workflows but needs broader external validation and integration before clinical or high‑throughput deployment.

Citations: 9

Evidence Strength: 0.70

Confidence: 0.85

Risk Signals: 11

Trust Signals

Findings with numeric evidence: 4/4

Findings with evidence refs: 4/4

Results with explicit delta: 1/3

Reproducibility

Status: No open assets linked

Open source: Unknown

At A Glance

Cost impact: 60%

Production readiness: 60%

Novelty: 70%

Authors

Yuanhao Qu, Kaixuan Huang, Ming Yin, Kanghong Zhan, Dyllan Liu, Di Yin, Henry C. Cousins, William A. Johnson, Xiaotong Wang, Mihir Shah, Russ B. Altman, Denny Zhou, Mengdi Wang, Le Cong

Why It Matters For Business

Automating CRISPR design reduces expert time, speeds prototyping, and lowers error risk in early‑stage research; it can cut planning cycles and standardize lab protocols for teams without CRISPR specialists.

Who Should Care

Product Manager, CTO, ML Engineer, Data Scientist

Summary TLDR

CRISPR-GPT is an LLM-powered agent that combines a planner, a tool wrapper, and state‑machine task executors to automate CRISPR experiment design. It supports 22 task states organized into 4 meta‑pipelines and calls tools such as Primer3 and guide libraries. Its safety filters block inputs containing sequences of ≥20 bp and warn on human targets. Experts rated it higher than base ChatGPT on design tasks, and it helped non-experts run a 4‑gene knockout in A375 cells that was validated by NGS. The system is a prototype: useful for design automation but not a replacement for wet‑lab expertise or clinical use.

Problem Statement

General LLMs produce confident but sometimes incorrect or incomplete guidance for CRISPR experiments (wrong guides, missing protocol details, unsafe suggestions). Researchers need a domain-aware agent that integrates tools and checks to produce practical, verifiable experimental designs for beginners and non-experts.

Main Contribution

An agent architecture combining an LLM planner, a Tool Provider wrapper, and a state‑machine Task Executor to break CRISPR workflows into subgoals.

Implementation of 22 task states across 4 predefined meta‑pipelines (knockout, base editing, prime editing, activation/repression) and 13 Auto‑Mode tasks.
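
The state‑machine idea behind the Task Executor can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `TaskExecutor`, the three state names, and the context keys are all hypothetical, and each handler only records a placeholder where a real state would call an LLM or an external tool.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

# A handler reads and updates the shared context, then returns the name of
# the next state, or None when the pipeline is finished.
Handler = Callable[[dict], Optional[str]]


@dataclass
class TaskExecutor:
    """Minimal state machine: each state is a named handler (hypothetical design)."""
    handlers: Dict[str, Handler] = field(default_factory=dict)

    def add_state(self, name: str, handler: Handler) -> None:
        self.handlers[name] = handler

    def execute(self, start: str, context: dict, max_steps: int = 100) -> List[str]:
        """Run from `start` until a handler returns None; return the visited states."""
        visited: List[str] = []
        current: Optional[str] = start
        while current is not None:
            if len(visited) >= max_steps:
                raise RuntimeError("step limit reached; possible cycle")
            visited.append(current)
            current = self.handlers[current](context)
        return visited


# Hypothetical states for a knockout meta-pipeline (names are illustrative only).
def select_cas_system(ctx: dict) -> str:
    ctx["cas_system"] = ctx.get("cas_system", "user-selected")
    return "design_guides"


def design_guides(ctx: dict) -> str:
    # A real state would query a guide library; here we only record the request.
    ctx["guide_requests"] = [f"guide lookup for {gene}" for gene in ctx["genes"]]
    return "design_primers"


def design_primers(ctx: dict) -> None:
    # A real state would call a primer design tool; here we only set a flag.
    ctx["primer_step_done"] = True
    return None


def build_knockout_executor() -> TaskExecutor:
    executor = TaskExecutor()
    executor.add_state("select_cas_system", select_cas_system)
    executor.add_state("design_guides", design_guides)
    executor.add_state("design_primers", design_primers)
    return executor


if __name__ == "__main__":
    context = {"genes": ["TGFBR1", "SNAI1", "BAX", "BCL2L1"]}
    print(build_knockout_executor().execute("select_cas_system", context))
```

Chaining several such executors, one per task, would give something like a meta‑pipeline. The step limit is an added safeguard against cycles and is not described in the source.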

Key Findings

Domain‑augmented agent scored higher than general ChatGPT on expert design ratings.

Numbers: 12 experts; 1–5 rating scale; CRISPR‑GPT rated above ChatGPT 3.5 and ChatGPT 4 on Accuracy, Reasoning, Completeness, and Conciseness

Practical Use: Use a tool‑augmented agent rather than general chat models when you need experimentally actionable CRISPR designs.

Evidence Ref: Figure 6; Section 2.2

CRISPR‑GPT executed a real knockout workflow and produced validation‑ready results.

Numbers: 4 target genes (TGFBR1, SNAI1, BAX, BCL2L1) in A375 cells; validated by NGS

Practical Use: You can prototype cell‑line CRISPR knockouts using agent‑generated guides and protocols, but validate all reagents and steps experimentally.

Evidence Ref: Section 3.3 and Figure 7

Results

Metric	Value	Baseline	Delta	Split / Dataset	Evidence	Evidence Ref
Accuracy	CRISPR‑GPT scored higher than ChatGPT 3.5 and ChatGPT 4 in expert ratings (1–5 scale)	ChatGPT 3.5 / ChatGPT 4	Higher mean scores across metrics (Figure 6)	12 CRISPR experts, multiple design tasks	Section 2.2 and Figure 6	Figure 6
Wet‑lab validation — editing outcome	Consistent high rate of expected edits across 4 targeted genes by NGS	—	—	A375 cell line; targets: TGFBR1, SNAI1, BAX, BCL2L1	Section 3.3 and Figure 7	Figure 7

What To Try In 7 Days

Run Auto Mode to design an sgRNA knockout for a non‑clinical cell line and compare with your current design workflow

Integrate Primer3 calls into your pipeline to auto‑generate and BLAST‑check PCR primers

Set up the ≥20 bp input filter and human‑target warning flow to test privacy and safety gates
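
The input filter and human‑target warning from the last item could be prototyped roughly as below. This is a sketch under assumptions, not the system's actual filter: the function name `screen_prompt`, the keyword list `HUMAN_TERMS`, and the returned dictionary shape are hypothetical. Only the 20 bp threshold comes from the summary above.

```python
import re

# Matches any uninterrupted run of 20 or more DNA letters.
# The 20 bp threshold is taken from the summary; the regex itself is an assumption.
LONG_SEQUENCE = re.compile(r"[ACGT]{20,}", re.IGNORECASE)

# Hypothetical keyword list; a production system would need a far more robust check.
HUMAN_TERMS = ("human", "homo sapiens", "patient")


def screen_prompt(text: str) -> dict:
    """Flag prompts with long raw sequences and prompts that mention human targets."""
    blocked = LONG_SEQUENCE.search(text) is not None
    human_warning = any(term in text.lower() for term in HUMAN_TERMS)
    return {"blocked": blocked, "human_warning": human_warning}


if __name__ == "__main__":
    print(screen_prompt("Knock out BAX in human A375 cells"))
    print(screen_prompt("Design primers around " + "ACGT" * 5))
```

This version does not catch sequences split by spaces or line breaks, so a real gate would normalize whitespace first.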

Agent Features

Memory

Session interaction history used in prompts

No autonomous persistent memory or dynamic task creation

Planning

Task decomposition table, ReAct chain‑of‑thought prompting, chained state machines per meta‑task

Tool Use

Web search, Primer3 primer design, gRNA library retrieval, off‑target prediction tools (CRISPRitz), BLAST checks
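
A Tool Provider wrapper that exposes tools like these behind one interface might look like the sketch below. The class name `ToolProvider`, its methods, and the registered tool names are assumptions for illustration. The two tools are stubs that echo their inputs and do not call Primer3 or any guide library.

```python
from typing import Callable, Dict, List


class ToolProvider:
    """Hypothetical registry that lets an agent call external tools by name."""

    def __init__(self) -> None:
        self._tools: Dict[str, Callable[..., dict]] = {}

    def register(self, name: str, tool: Callable[..., dict]) -> None:
        self._tools[name] = tool

    def names(self) -> List[str]:
        return sorted(self._tools)

    def call(self, name: str, **kwargs) -> dict:
        if name not in self._tools:
            raise KeyError(f"unknown tool: {name}")
        return self._tools[name](**kwargs)


# Stub tools: placeholders standing in for real integrations.
def primer_design_stub(template: str) -> dict:
    return {"tool": "primer_design", "template_length": len(template)}


def guide_lookup_stub(gene: str) -> dict:
    return {"tool": "guide_lookup", "gene": gene, "guides": []}


def build_provider() -> ToolProvider:
    provider = ToolProvider()
    provider.register("primer_design", primer_design_stub)
    provider.register("guide_lookup", guide_lookup_stub)
    return provider


if __name__ == "__main__":
    provider = build_provider()
    print(provider.names())
    print(provider.call("guide_lookup", gene="BAX"))
```

Keeping tools behind a name‑based registry means the planner only needs the list of tool names and their arguments, not the details of each integration.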

Frameworks

ReAct prompting, chain‑of‑thought, state‑machine orchestration

Is Agentic

Yes

Architectures

LLM planner + LLM Agent, state‑machine Task Executor, Tool Provider wrapper

Collaboration

Human‑in‑the‑loop oversight and manual correction

Agent executes steps and asks for required user inputs

Reproducibility

Code Available: No

Data Available: No

Open Source Status: Unknown

License: Unknown

Risks & Boundaries

Limitations

Cannot generate complete DNA constructs or vectors from natural language inputs.

Performance degrades on rare or complex biological cases and needs up‑to‑date domain data.

When Not To Use

Clinical decision‑making or patient care without expert oversight

Designs for human germline or embryo editing (legal/ethical restrictions apply)

Failure Modes

Proposed sgRNA sequences that do not align to the target genome if external checks are skipped

Incomplete protocols missing reagent quantities or timing details in edge cases
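
The first failure mode can be partly guarded against with a simple sanity check before ordering reagents. The sketch below assumes an SpCas9‑style guide with a 3' NGG PAM and checks for exact matches only. The function names are hypothetical, and this is no substitute for the off‑target tools or BLAST checks listed above.

```python
def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase or lowercase A/C/G/T sequence."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_guide_sites(guide: str, target: str) -> list:
    """Return (strand, start) for each exact guide match followed by an NGG PAM.

    `start` is 0-based. For the "-" strand it is measured on the reverse
    complement of `target`, not on the original coordinates.
    """
    guide = guide.upper()
    hits = []
    for strand, seq in (("+", target.upper()), ("-", reverse_complement(target))):
        start = seq.find(guide)
        while start != -1:
            pam = seq[start + len(guide): start + len(guide) + 3]
            # NGG: any base followed by two guanines (assumes SpCas9).
            if len(pam) == 3 and pam[1:] == "GG":
                hits.append((strand, start))
            start = seq.find(guide, start + 1)
    return hits


if __name__ == "__main__":
    # Synthetic toy sequences, not real guides or genomic loci.
    toy_guide = "GATTACAGATTACAGATTAC"
    toy_target = "TT" + toy_guide + "TGG" + "AA"
    print(find_guide_sites(toy_guide, toy_target))
```

An empty result means the guide has no exact PAM‑adjacent match in the supplied sequence and should be reviewed before use.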

Core Entities

Models

gpt-4-0613, ChatGPT 3.5, GPT-4

Metrics

Accuracy, reasoning (1-5), completeness (1-5), conciseness (1-5), NGS editing rate

Datasets

Broad Institute gold-standard gRNA libraries, pre-designed multi-species guide RNA database

Context Entities

Models

Gemini, Claude
