Overview
Results are robust across two base models and two alignment algorithms, but rely on public benchmarks and GPT-4 judgments; dataset quality (UltraChat/UltraFeedback) and merging stability are open risks.
Citations: 4
Evidence Strength: 0.80
Confidence: 0.80
Risk Signals: 10
Trust Signals
Findings with numeric evidence: 4/4
Findings with evidence refs: 4/4
Results with explicit delta: 4/5
Reproducibility
Status: Partial assets available
Open source: Partial
At A Glance
Cost impact: 60%
Production readiness: 70%
Novelty: 60%
Why It Matters For Business
PAFT can preserve both task accuracy and alignment without retraining large models end-to-end; companies can run SFT and alignment in parallel, sparsify adapters, and merge them to ship stronger, aligned models faster.
Summary TLDR
PAFT runs supervised fine-tuning (SFT) and preference alignment (DPO/ORPO) in parallel on the same pre-trained model, makes the SFT adapter sparse via an L1 penalty, and then merges the two adapters into a single model. Sparsifying the SFT adapter (over 90% sparsity reported) reduces parameter interference during merging and yields stronger merged models. On public benchmarks, PAFT models top the Hugging Face Open LLM Leaderboard for the tested size classes and improve AlpacaEval performance over many baselines.
Problem Statement
Applying SFT and then preference alignment in sequence often incurs an 'alignment tax': the aligned model loses or degrades capabilities learned during SFT. The paper asks whether training SFT and alignment in parallel, combined with sparsifying the SFT adapter, reduces that tax and yields a stronger merged model.
Main Contribution
Introduce PAFT: learn SFT and preference-alignment adapters in parallel on the same base model and fuse them by weight merging.
Show that SFT adapters are dense while alignment adapters are naturally sparse; add an L1 penalty during SFT to increase sparsity and reduce interference when merging.
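The two contributions above can be written compactly as follows. This is a notational sketch only: the symbols $\Delta W_{\text{SFT}}$ and $\Delta W_{\text{align}}$ (the weight updates contributed by each adapter), $\lambda$ (the penalty strength), and the choice to penalize the full update rather than the individual LoRA factors are assumptions made for illustration, not notation taken from the paper.

```latex
\mathcal{L}_{\text{SFT-sparse}}(\theta)
  = \mathcal{L}_{\text{SFT}}(\theta)
  + \lambda \,\lVert \Delta W_{\text{SFT}}(\theta) \rVert_1,
\qquad
\theta_{\text{merged}}
  = \operatorname{Merge}\bigl(\theta_{\text{base}},\;
      \Delta W_{\text{SFT}},\; \Delta W_{\text{align}}\bigr)
```

Here $\operatorname{Merge}$ stands for whichever weight-merging method is used (for example TIES or Task Arithmetic); a larger $\lambda$ pushes more entries of $\Delta W_{\text{SFT}}$ toward zero.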
Key Findings
Parallel training (PAFT) plus L1-sparsified SFT improves merged-model scores versus sequential or standalone training on the 6-task Open LLM suite.
Inducing sparsity in the SFT adapter greatly reduces merging interference and can yield large gains for some merge methods (for example, +0.063 on the 6-task average with TIES on Mistral-7B).
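To make "sparsity" and "merging interference" concrete, here is a minimal sketch in plain Python. The helper names `sparsity` and `sign_conflict_rate`, the tolerance, and the toy vectors are all hypothetical; counting opposite-sign overlaps is one simple proxy for interference, not necessarily the measure used in the paper.

```python
def sparsity(delta, tol=1e-8):
    """Fraction of entries whose magnitude is at most tol."""
    if not delta:
        raise ValueError("delta must be non-empty")
    return sum(1 for x in delta if abs(x) <= tol) / len(delta)


def sign_conflict_rate(delta_a, delta_b, tol=1e-8):
    """Fraction of positions where both deltas are non-zero with opposite signs."""
    if len(delta_a) != len(delta_b):
        raise ValueError("deltas must have the same length")
    conflicts = sum(
        1
        for a, b in zip(delta_a, delta_b)
        if abs(a) > tol and abs(b) > tol and (a > 0) != (b > 0)
    )
    return conflicts / len(delta_a)


# Toy flattened weight updates (illustrative values, not from the paper).
dense_sft = [0.4, -0.3, 0.2, -0.1, 0.5, -0.6, 0.1, 0.3, -0.2, 0.7]
sparse_sft = [0.0, 0.0, 0.0, 0.0, 0.5, 0.0, 0.0, 0.0, 0.0, 0.0]
dpo = [-0.2, 0.0, 0.0, 0.1, 0.3, 0.0, 0.0, -0.1, 0.0, 0.0]

print("dense SFT sparsity:", sparsity(dense_sft))
print("sparse SFT sparsity:", sparsity(sparse_sft))
print("conflicts, dense SFT vs DPO:", sign_conflict_rate(dense_sft, dpo))
print("conflicts, sparse SFT vs DPO:", sign_conflict_rate(sparse_sft, dpo))
```

In this toy case the dense update collides with the alignment update at several positions, while the sparse one overlaps at a single position with the same sign, which is the intuition behind sparsifying before merging.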
Results
| Metric | Value | Baseline | Delta | Split / Dataset | Evidence | Evidence Ref |
|---|---|---|---|---|---|---|
| Avg score on 6-task Open LLM suite (Mistral-7B, TIES merge) | 0.65243 | DPO-alone 0.6333 | +0.01913 | Open LLM Leaderboard (ARC, HellaSwag, MMLU, TruthfulQA, Winogrande, GSM8K) | Table 1 (Mistral-7B, PAFT SFTsparse+DPO TIES) | Table 1 |
| TIES merge gap: sparse vs non-sparse SFT (Mistral-7B) | PAFT 0.65243 | Parallel SFT+DPO 0.58928 | +0.06315 | Open LLM Leaderboard (6-task avg) | Table 1 (TIES rows) | Table 1 |
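The Delta column is the value minus the baseline. A short check using the scores from the table (the dictionary keys are descriptive labels chosen here, not names from the paper):

```python
# Scores copied from the Results table: (value, baseline).
rows = {
    "PAFT (TIES) vs DPO alone": (0.65243, 0.6333),
    "PAFT (TIES) vs parallel SFT+DPO (TIES)": (0.65243, 0.58928),
}

# Round to 5 decimals to match the precision reported in the table.
deltas = {
    name: round(value - baseline, 5)
    for name, (value, baseline) in rows.items()
}

for name, delta in deltas.items():
    print(f"{name}: {delta:+.5f}")
```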
What To Try In 7 Days
Train SFT and DPO adapters in parallel on your base model using LoRA.
Add a small L1 regularization term (λ≈1e-4 or 1e-3) to the SFT loss to induce sparsity.
Experiment with simple merging methods (TIES, Task Arithmetic, or linear) and evaluate the merged model on your core metrics.
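The last two steps can be sketched on flattened weight updates in plain Python. This is a simplified illustration under stated assumptions: `l1_penalty`, `trim`, `ties_merge`, and `task_arithmetic` are hypothetical helpers written for this example, the TIES variant below (trim, elect sign, average agreeing entries) omits details of the published method such as the final scaling factor, and real adapters would be merged tensor by tensor rather than as Python lists.

```python
def l1_penalty(adapter_weights, lam=1e-4):
    """L1 term to add to the SFT loss; lam is the regularization strength."""
    return lam * sum(abs(w) for w in adapter_weights)


def trim(delta, keep_fraction):
    """Keep roughly the top keep_fraction of entries by magnitude, zero the rest.

    Entries tied with the threshold are all kept, so slightly more than
    keep_fraction may survive.
    """
    k = max(1, round(len(delta) * keep_fraction))
    threshold = sorted((abs(x) for x in delta), reverse=True)[k - 1]
    return [x if x != 0.0 and abs(x) >= threshold else 0.0 for x in delta]


def ties_merge(deltas, keep_fraction=0.2):
    """Simplified TIES-style merge of several weight updates."""
    trimmed = [trim(d, keep_fraction) for d in deltas]
    merged = []
    for values in zip(*trimmed):
        total = sum(values)
        if total == 0.0:
            merged.append(0.0)  # nothing kept, or the kept values cancel out
            continue
        # Elect the sign of the summed update, then average agreeing entries.
        agreeing = [v for v in values if v != 0.0 and (v > 0) == (total > 0)]
        merged.append(sum(agreeing) / len(agreeing))
    return merged


def task_arithmetic(base, deltas, scale=1.0):
    """Add the scaled sum of all weight updates to the base weights."""
    return [b + scale * sum(vals) for b, vals in zip(base, zip(*deltas))]


# Toy values, not from the paper.
sft = [0.0, 0.0, 0.5, 0.0, 0.0]   # sparse SFT update
dpo = [-0.2, 0.0, 0.3, 0.0, 0.1]  # alignment update
base = [1.0, 1.0, 1.0, 1.0, 1.0]

ties_delta = ties_merge([sft, dpo], keep_fraction=0.4)
merged_ties = [b + d for b, d in zip(base, ties_delta)]
merged_ta = task_arithmetic(base, [sft, dpo])
print("TIES-style merged weights:", merged_ties)
print("Task Arithmetic merged weights:", merged_ta)
```

Comparing both merged outputs on your own evaluation set is the point of the exercise; which method works better is an empirical question for your model and data.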
Risks & Boundaries
Limitations
No causal explanation of why DPO adapters are naturally sparse while SFT adapters are dense.
Scalability and the operational workflow for iterative merges in production are underexplored.
When Not To Use
You cannot merge adapters reliably due to incompatible architectures or runtime constraints.
Your SFT data is not similar to dialogue or is highly out-of-domain relative to alignment data.
Failure Modes
The merged model still suffers from parameter interference if SFT sparsity is insufficient.
Retraining the merged model can induce catastrophic forgetting of earlier traits.

