Overview
Production Readiness
0.6
Novelty Score
0.6
Cost Impact Score
0.5
Citation Count
0
Why It Matters For Business
If you let agents mutate your lakehouse without transactional, runtime isolation, they can corrupt production data or leak secrets. Building a small, enforceable run API and sandboxed functions reduces risk and makes governance feasible.
Summary TLDR
Enterprises distrust autonomous agents because existing lakehouse infrastructure doesn't provide strong multi-step transactions and runtime isolation across heterogeneous tools. The paper proposes Bauplan: an agent-first lakehouse that (1) records immutable table snapshots and supports copy-on-write branching and atomic merges across multi-table pipelines, (2) runs each pipeline function in an isolated, network-blocked FaaS container, and (3) exposes a single declarative run API so agent runs publish to temporary branches and only merge on full success. This design restores transactional correctness for multi-node pipelines and makes governance practical through a small, checkable API surface.
Problem Statement
Traditional lakehouses decouple storage and compute and support many runtimes. That decoupling breaks transactional guarantees across multi-step pipelines and inflates the attack surface for agents. Without new primitives, agents can leave the lake in inconsistent states or run untrusted code that harms production data.
Main Contribution
Diagnosis: why MVCC-style database transactions cannot be transplanted naively to a decoupled, multi-runtime lakehouse.
Design: Bauplan — an agent-first lakehouse with copy-on-write branching, temporary runs, and atomic merges that span multiple tables and pipeline nodes.
Compute model: use FaaS-based, containerized, network-isolated functions per pipeline node to enforce runtime isolation and limit attack vectors.
Programming abstraction: declarative I/O (functions accept/output tables) plus a single run API that ties data branches and compute runs together.
Worked example: a self-healing pipeline pattern in which an agent produces code and a verifier; the platform runs the verifier, and a human reviews the result before merge.
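The branch-and-merge idea in the design above can be illustrated with a toy, in-memory sketch. This is not Bauplan's implementation or API: the `Catalog` class, the `run` function, and the table names are all hypothetical, and the sketch assumes a single writer with no merge conflicts. It only shows the shape of the guarantee: a branch stores just the tables it writes, and main changes in one step or not at all.

```python
class Catalog:
    """Toy catalog (hypothetical): main maps table names to immutable snapshots."""

    def __init__(self, tables=None):
        self.main = dict(tables or {})
        self.branches = {}  # branch name -> overlay of tables written on it

    def create_branch(self, name):
        # Copy-on-write: a new branch starts as an empty overlay, no rows copied.
        self.branches[name] = {}

    def write(self, branch, table, rows):
        # Snapshots are stored as tuples so they cannot be mutated in place.
        self.branches[branch][table] = tuple(rows)

    def read(self, branch, table):
        overlay = self.branches[branch]
        return overlay[table] if table in overlay else self.main[table]

    def merge(self, branch):
        # Build the new table mapping first, then swap it in with one assignment,
        # so readers of main never observe a partially merged pipeline.
        self.main = {**self.main, **self.branches.pop(branch)}

    def drop(self, branch):
        self.branches.pop(branch, None)


def run(catalog, steps, run_id="tmp"):
    """Run every step on a temporary branch; merge only if all steps succeed."""
    branch = f"run-{run_id}"
    catalog.create_branch(branch)

    def read(table):
        return catalog.read(branch, table)

    try:
        for out_table, fn in steps:
            catalog.write(branch, out_table, fn(read))
    except Exception:
        catalog.drop(branch)  # assumption: failed runs are simply discarded
        return False
    catalog.merge(branch)
    return True


# Hypothetical two-node pipeline plus a node that always fails.
def clean(read):
    return [row for row in read("orders") if row[1] > 0]

def totals(read):
    return [(sum(row[1] for row in read("clean_orders")),)]

def broken(read):
    raise ValueError("simulated node failure")
```

With this sketch, a run whose second node fails leaves main holding only its original tables, while a fully successful run publishes `clean_orders` and `totals` together.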
Key Findings
Multi-node pipelines need atomic commits across tables, not per-table transactions.
Copy-on-write branching avoids full data copies, which the paper argues keeps branching cheap enough for large workloads.
Compute isolation is achieved by running each pipeline function in a network-isolated container (FaaS).
A single unified run API (bauplan.run) ties branching, data fetch, function execution and atomic merge into one flow.
Agent work can be made auditable and human-reviewed via data-branch outputs and verifiers.
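To make the network-isolation finding concrete, here is a minimal sketch of how a single pipeline node could be launched in a container with no network access. It assumes Docker as the container runtime, which the summary does not specify; the function name, image, and paths are hypothetical. The code only builds the command line and does not execute it.

```python
def sandbox_command(image, code_dir, entrypoint):
    """Build a `docker run` command for one network-isolated pipeline node.

    Assumption: Docker stands in for whatever FaaS runtime the platform uses.
    """
    return [
        "docker", "run",
        "--rm",                 # remove the container when the node finishes
        "--network", "none",    # no network interfaces besides loopback
        "--read-only",          # root filesystem mounted read-only
        "--volume", f"{code_dir}:/app:ro",  # node code mounted read-only
        image,
        "python", f"/app/{entrypoint}",
    ]


# Hypothetical usage: the resulting list can be passed to subprocess.run(...).
cmd = sandbox_command("python:3.12-slim", "/tmp/node_code", "node.py")
```

A real setup would likely also need a writable scratch mount and a controlled way to pass input and output tables, both of which are omitted here.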
Who Should Care
What To Try In 7 Days
Run a prototype pipeline: put agent outputs into a temporary branch and practice merge-on-success.
Sandbox a single pipeline node in a network-blocked container and test package whitelisting.
Map current pipelines to a declarative I/O interface (functions that accept tables and return tables) to see gaps.
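For the mapping exercise, a small sketch can show what "functions that accept tables and return tables" buys you. The convention used here, where a parameter name refers to the table of the same name, is an assumption for illustration and not Bauplan's actual interface; the function and table names are hypothetical. Because inputs are declared in the signature, the execution order can be derived rather than hand-written.

```python
import inspect
from graphlib import TopologicalSorter


# Hypothetical pipeline nodes: each parameter names an input table,
# and the function name is the name of the table it produces.
def clean_orders(orders):
    return [row for row in orders if row["amount"] > 0]

def revenue(clean_orders):
    return [{"total": sum(row["amount"] for row in clean_orders)}]


def plan(functions, source_tables):
    """Return node names in an order that respects the declared inputs."""
    fns = {f.__name__: f for f in functions}
    deps = {name: set(inspect.signature(f).parameters) for name, f in fns.items()}
    missing = set().union(*deps.values()) - set(fns) - set(source_tables)
    if missing:
        raise ValueError(f"unresolved inputs: {sorted(missing)}")
    # static_order() also yields source tables; keep only the function nodes.
    return [n for n in TopologicalSorter(deps).static_order() if n in fns]


def execute(functions, sources):
    """Run every node once, feeding each one the tables its signature names."""
    fns = {f.__name__: f for f in functions}
    tables = dict(sources)
    for name in plan(functions, sources):
        params = inspect.signature(fns[name]).parameters
        tables[name] = fns[name](**{p: tables[p] for p in params})
    return tables
```

Any existing job that cannot be expressed this way, for example one that reads from an external service mid-run, is a gap worth recording.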
Agent Features
Memory
- branch snapshots (immutable table history)
Planning
- ReAct loop
- pipeline orchestration via unified run API
Tool Use
- containerized runtimes
- package whitelisting
- no-internet sandboxing
Frameworks
- Bauplan
Is Agentic
true
Architectures
- FaaS
- branch-and-merge (git-like)
- declarative I/O
Collaboration
- human-in-the-loop verification
- branch-review-merge workflow
Optimization Features
Infra Optimization
- centralized run API enabling platform-side optimizations
System Optimization
- copy-on-write branching to avoid full data copies
Reproducibility
Code Available
Open Source Status
- partial
Risks & Boundaries
Limitations
- Position paper without quantitative benchmarks or experiments.
- Relies on a central platform controlling run API and FaaS; not directly applicable to fragmented/legacy infra.
- Package whitelisting and network isolation reduce risk but may limit some valid workloads.
When Not To Use
- You cannot change the platform or impose a single run API.
- Workloads require direct internet access from runtime.
- You must support many legacy jobs that cannot be containerized easily.
Failure Modes
- Merge conflicts across branches that require manual resolution and delay deployment.
- Verifier false negatives — automated checks miss edge-case failures.
- Malicious or buggy packages slip through whitelist or via supply-chain vectors.
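The whitelist failure mode is easier to reason about with a concrete check in hand. The sketch below validates a requirements list against a pinned allowlist; the allowlist contents, function name, and the choice to require exact `==` pins are assumptions for illustration, not details from the paper.

```python
# Hypothetical allowlist: package name -> set of approved exact versions.
ALLOWLIST = {
    "pandas": {"2.2.2"},
    "pyarrow": {"16.1.0"},
}


def check_requirements(requirements, allowlist):
    """Return a list of human-readable violations (empty means approved)."""
    violations = []
    for line in requirements:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments
        name, sep, version = line.partition("==")
        name, version = name.strip().lower(), version.strip()
        if not sep:
            violations.append(f"{line}: version must be pinned with ==")
        elif name not in allowlist:
            violations.append(f"{name}: package not on allowlist")
        elif version not in allowlist[name]:
            violations.append(f"{name}=={version}: version not approved")
    return violations
```

A check like this only narrows the problem: an approved name and version can still be compromised upstream, so it would need to be paired with something stronger such as hash verification.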

