Tippy: a production-ready multi-agent system that automates drug discovery lab workflows

Overview

Decision Snapshot: Needs Validation

The paper covers end-to-end deployment concerns (Kubernetes, CI/CD, observability) and lists each agent's tools in detail, which indicates strong engineering readiness. However, it reports no quantitative benchmarks of lab outcome improvements and links no public code, so the evidence is moderate.

Citations: 1

Evidence Strength: 0.60

Confidence: 0.80

Risk Signals: 10

Trust Signals

Findings with numeric evidence: 7/7

Findings with evidence refs: 7/7

Results with explicit delta: 0/4

Reproducibility

Status: No open assets linked

Open source: Unknown

At A Glance

Cost impact: 60%

Production readiness: 70%

Novelty: 60%

Authors

Yao Fehlis, Charles Crain, Aidan Jensen, Michael Watson, James Juhasz, Paul Mandel, Betty Liu, Shawn Mahon, Daren Wilson, Nick Lynch-Jonely, Ben Leedom, David Fuller

Links

Abstract / PDF

Why It Matters For Business

Tippy shows how to turn multi-agent lab automation from a concept into a deployable platform, enabling more automated DMTA cycles, reproducible deployments, and scalable instrument orchestration.

Who Should Care

CTO, Product Manager, ML Engineer, Engineering Lead, Data Scientist, Founder

Summary TLDR

This paper describes the engineering and deployment of Tippy, a production-focused multi-agent system for automating drug-discovery lab workflows. Tippy uses a Supervisor agent plus specialized Molecule, Lab, Analysis, and Report agents, integrates tools through the Model Context Protocol (MCP), and runs on Kubernetes with Helm, CI/CD, an Envoy proxy, and vector DBs for retrieval. The authors detail tool lists, orchestration via OpenAI Agents SDK, Git-based configuration tracking, observability, and safety guardrails, but provide no quantitative benchmark of scientific outcomes.

Problem Statement

Laboratory automation needs reliable, production-grade software that coordinates heterogeneous instruments, human operators, and multi-step DMTA (Design-Make-Test-Analyze) workflows. Existing lab systems handle data and basic tracking but lack coordinated autonomous orchestration, standardized tool integration, and deployment practices required for scalable, repeatable automation.

Main Contribution

A production multi-agent microservices architecture with a Supervisor plus specialized Molecule, Lab, Analysis, and Report agents.

Integration pattern using the Model Context Protocol (MCP) to expose lab tools and instrument controls to agents.

Key Findings

Tippy uses five specialized agents (Supervisor, Molecule, Lab, Analysis, Report) plus a Safety Guardrail.

Numbers: 5 specialized agents

Practical Use: Design services as focused agents; separate responsibilities (design, execution, analysis, reporting, safety) to simplify integration and scaling.

Evidence Ref: Section 2, Figure 1
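The supervisor-plus-specialists split described above can be sketched in a few lines of plain Python. This is a minimal illustration, not Tippy's implementation: the handler functions, the `Supervisor` class, and the keyword routing table are all hypothetical stand-ins, and a real supervisor would presumably let the LLM choose the handoff (the paper uses the OpenAI Agents SDK for this) rather than match keywords.

```python
from dataclasses import dataclass
from typing import Callable, Dict


# Hypothetical agent handlers. The names mirror the agent roles in the paper,
# but the bodies are placeholders and do not reflect Tippy's real logic.
def molecule_agent(task: str) -> str:
    return f"[molecule] proposed candidates for: {task}"


def lab_agent(task: str) -> str:
    return f"[lab] scheduled job for: {task}"


def analysis_agent(task: str) -> str:
    return f"[analysis] analyzed data for: {task}"


def report_agent(task: str) -> str:
    return f"[report] drafted summary for: {task}"


@dataclass
class Supervisor:
    """Routes each task to exactly one specialized agent."""

    agents: Dict[str, Callable[[str], str]]
    # Assumed keyword table, checked in insertion order.
    routes: Dict[str, str]
    default_agent: str = "report"

    def route(self, task: str) -> str:
        lowered = task.lower()
        for keyword, agent_name in self.routes.items():
            if keyword in lowered:
                return agent_name
        return self.default_agent

    def handle(self, task: str) -> str:
        return self.agents[self.route(task)](task)


def build_supervisor() -> Supervisor:
    return Supervisor(
        agents={
            "molecule": molecule_agent,
            "lab": lab_agent,
            "analysis": analysis_agent,
            "report": report_agent,
        },
        routes={
            "design": "molecule",
            "run": "lab",
            "analyze": "analysis",
            "summarize": "report",
        },
    )


if __name__ == "__main__":
    supervisor = build_supervisor()
    print(supervisor.handle("Design analogs of the lead compound"))
    print(supervisor.handle("Run the HPLC purity check"))
```

Keeping each agent behind a single callable boundary is what makes it possible to deploy, scale, or replace one role without touching the others.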

The Lab Agent exposes 13 MCP tools covering job creation, instrument control, lookup, and execution.

Numbers: Lab Agent: 13 MCP tools

Practical Use: Map each lab capability (scheduling, control, queries) to explicit tool APIs to let agents reason about and safely invoke instruments.

Evidence Ref: Section 2.4
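One way to picture "explicit tool APIs" is a small registry in which every lab capability is a named function with declared required arguments. The sketch below is an MCP-like pattern in plain Python, not the MCP SDK and not Tippy's actual tool list: `ToolRegistry`, `create_job`, `get_job_status`, and their arguments are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass
class Tool:
    name: str
    description: str
    required: List[str]
    fn: Callable[..., Any]


class ToolRegistry:
    """Holds named tools and validates arguments before invoking them."""

    def __init__(self) -> None:
        self._tools: Dict[str, Tool] = {}

    def register(self, name: str, description: str, required: List[str]):
        def decorator(fn: Callable[..., Any]) -> Callable[..., Any]:
            self._tools[name] = Tool(name, description, required, fn)
            return fn

        return decorator

    def list_tools(self) -> List[str]:
        return sorted(self._tools)

    def call(self, name: str, **kwargs: Any) -> Any:
        if name not in self._tools:
            raise KeyError(f"unknown tool: {name}")
        tool = self._tools[name]
        missing = [p for p in tool.required if p not in kwargs]
        if missing:
            raise ValueError(f"{name} is missing arguments: {missing}")
        return tool.fn(**kwargs)


registry = ToolRegistry()


# Hypothetical tools; the job id and status values are placeholders.
@registry.register("create_job", "Create a lab job from a protocol", ["protocol"])
def create_job(protocol: str) -> Dict[str, str]:
    return {"job_id": "job-1", "protocol": protocol, "status": "queued"}


@registry.register("get_job_status", "Look up the status of a job", ["job_id"])
def get_job_status(job_id: str) -> Dict[str, str]:
    return {"job_id": job_id, "status": "queued"}


if __name__ == "__main__":
    print(registry.list_tools())
    print(registry.call("create_job", protocol="hplc_purity"))
```

Because each tool declares its required arguments, a malformed call from an agent fails at the registry boundary instead of reaching an instrument.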

Results

Metric	Value	Baseline	Delta	Split / Dataset	Evidence	Evidence Ref
Specialized agents	5 (Supervisor + Molecule + Lab + Analysis + Report)	—	—	—	Figure 1; Sections 2–2.7	Sections 2, Figure 1
Lab Agent tool count	13 MCP tools	—	—	—	Section 2.4 lists 13 tools	Section 2.4

What To Try In 7 Days

Map key lab capabilities to MCP-like tool APIs (job start, status, instrument control).

Containerize one agent and deploy it with Helm on a local Kubernetes cluster.

Add Git-based config/versioning for agent prompts and tool configs to enable rollbacks.
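The versioning-and-rollback idea in the last step can be tried out before wiring up a real repository. The sketch below is an in-memory stand-in for Git, not Git itself and not the paper's tooling: `ConfigHistory` is a hypothetical class that stores each prompt or tool configuration under a content hash and lets you roll back to an earlier one.

```python
import hashlib
import json
from typing import Dict, List


class ConfigHistory:
    """Content-addressed history of agent configs (a toy stand-in for Git)."""

    def __init__(self) -> None:
        self._blobs: Dict[str, str] = {}  # digest -> serialized config
        self._log: List[str] = []         # ordered list of active digests

    def commit(self, config: dict) -> str:
        # Sorting keys makes the digest independent of dict insertion order.
        payload = json.dumps(config, sort_keys=True)
        digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()[:12]
        self._blobs[digest] = payload
        self._log.append(digest)
        return digest

    def current(self) -> dict:
        if not self._log:
            raise LookupError("no config has been committed yet")
        return json.loads(self._blobs[self._log[-1]])

    def rollback(self, digest: str) -> dict:
        if digest not in self._blobs:
            raise KeyError(f"unknown version: {digest}")
        # Rolling back appends to the log, so the history stays auditable.
        self._log.append(digest)
        return self.current()


if __name__ == "__main__":
    history = ConfigHistory()
    v1 = history.commit({"agent": "lab", "prompt": "Follow protocol strictly."})
    history.commit({"agent": "lab", "prompt": "Follow protocol; ask if unsure."})
    print(history.rollback(v1))
```

In an actual deployment the same role would be played by commits and tags in a Git repository, with the deployed agent pinned to a specific revision.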

Agent Features

Memory

retrieval memory (vector DB)

context window management
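Retrieval memory backed by a vector DB comes down to nearest-neighbor search over embeddings. The sketch below shows that core step with cosine similarity in pure Python. It assumes embeddings already exist: the three-dimensional vectors and document ids in `MEMORY` are made up for illustration, and the paper does not specify which vector database or embedding model Tippy uses.

```python
import math
from typing import Dict, List, Tuple


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(
    query: List[float], store: Dict[str, List[float]], k: int = 2
) -> List[Tuple[str, float]]:
    """Return the k stored items most similar to the query, best first."""
    scored = [(doc_id, cosine(query, vec)) for doc_id, vec in store.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Made-up embeddings standing in for stored lab documents.
MEMORY: Dict[str, List[float]] = {
    "hplc_protocol": [1.0, 0.0, 0.0],
    "assay_results": [0.0, 1.0, 0.0],
    "hplc_troubleshooting": [0.9, 0.1, 0.0],
}

if __name__ == "__main__":
    for doc_id, score in top_k([1.0, 0.0, 0.0], MEMORY):
        print(f"{doc_id}: {score:.3f}")
```

A production vector database replaces the linear scan with an approximate index, but the ranking criterion is the same.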

Planning

dynamic agent routing

context sharing for task handoff

Tool Use

Model Context Protocol (MCP) tool integration

function calling for tool selection

RAG-backed retrieval for memory

Frameworks

OpenAI Agents SDK

OpenAI Responses API

MCP Federation

Is Agentic

Yes

Architectures

microservices

supervisor-agent pattern

distributed multi-agent

Collaboration

handoff via OpenAI Agents SDK

asynchronous communication patterns

Optimization Features

Infra Optimization

Kubernetes + Helm

Docker containerization

Envoy reverse proxy

CI/CD with GitHub Actions

Model Optimization

GPU acceleration for MolMIM

System Optimization

Horizontal Pod Autoscaling

blue-green and rolling updates
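For readers unfamiliar with Horizontal Pod Autoscaling, the scaling rule documented by Kubernetes is desiredReplicas = ceil(currentReplicas × currentMetricValue / desiredMetricValue), with no change when the ratio is within a tolerance of 1 (0.1 by default). The sketch below reproduces that arithmetic. The replica bounds and the example metric values are assumptions for illustration and are not taken from the paper.

```python
import math


def desired_replicas(
    current_replicas: int,
    current_metric: float,
    target_metric: float,
    min_replicas: int = 1,
    max_replicas: int = 10,
    tolerance: float = 0.1,
) -> int:
    """Apply the documented HPA ratio rule, then clamp to the replica bounds."""
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    ratio = current_metric / target_metric
    # Within the tolerance band the autoscaler leaves the replica count alone.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    desired = math.ceil(current_replicas * ratio)
    return max(min_replicas, min(max_replicas, desired))


if __name__ == "__main__":
    # Example: 2 pods at 90% average CPU against a 60% target.
    print(desired_replicas(2, current_metric=90.0, target_metric=60.0))
```

The real autoscaler also applies stabilization windows and readiness handling, which this sketch leaves out.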

Inference Optimization

GPU inference clusters

configurable resource limits per agent

Reproducibility

Code Available: No

Data Available: No

Open Source Status: Unknown

License: Unknown

Risks & Boundaries

Limitations

No quantitative benchmarks showing scientific or time-to-result gains.

Safety guardrail is described at a high level and not evaluated with adversarial tests.

When Not To Use

If you lack Kubernetes/container expertise or GPU resources.

When provable, auditable safety certification is required before lab actions.

Failure Modes

Tool or instrument API failures leading to incorrect job execution.

Model hallucination or incorrect protocol interpretation causing unsafe actions.
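Both failure modes argue for a deterministic check between the model's proposed action and the instrument. The paper describes its Safety Guardrail only at a high level, so the sketch below is an assumption about what such a check could look like, not the authors' design: the tool names, argument names, and numeric limits in `LIMITS` are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class ToolCall:
    tool: str
    args: Dict[str, float]


# Hypothetical allow-list: tool name -> argument name -> (min, max).
LIMITS: Dict[str, Dict[str, Tuple[float, float]]] = {
    "set_temperature": {"celsius": (4.0, 95.0)},
    "dispense": {"volume_ul": (0.5, 1000.0)},
}


def check_call(call: ToolCall) -> Tuple[bool, str]:
    """Return (allowed, reason) for a proposed instrument action."""
    if call.tool not in LIMITS:
        return False, f"tool '{call.tool}' is not on the allow-list"
    bounds = LIMITS[call.tool]
    for name, (low, high) in bounds.items():
        if name not in call.args:
            return False, f"missing argument '{name}'"
        value = call.args[name]
        if not low <= value <= high:
            return False, f"{name}={value} is outside [{low}, {high}]"
    extra = sorted(set(call.args) - set(bounds))
    if extra:
        return False, f"unexpected arguments: {extra}"
    return True, "ok"


if __name__ == "__main__":
    print(check_call(ToolCall("set_temperature", {"celsius": 37.0})))
    print(check_call(ToolCall("set_temperature", {"celsius": 150.0})))
```

A rule-based gate like this does not prevent a model from misreading a protocol, but it bounds what a misreading can do and gives an auditable reason for every rejection.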
