Overview
Production Readiness
0.6
Novelty Score
0.45
Cost Impact Score
0.65
Citation Count
0
Why It Matters For Business
Task-vector editing is a low-cost way to tune subgroup parity without full retraining. It can serve as an operational knob for reducing worst-case demographic gaps while keeping accuracy close to that of existing adaptation methods.
Summary TLDR
This paper measures how "task vectors" (the weight differences between a fine-tuned model and its base model) affect group fairness in binary text and image classification. Across hate-speech, toxicity, and age detection datasets and multiple models (LLaMA2-7B, DistilBERT, Qwen2.5-0.5B, ViT-Base/16), uniformly scaled task-vector merges (a single scalar λ) often preserve accuracy while substantially reducing demographic parity difference (DPD) and equalized odds difference (EOD). Injecting subgroup-specific vectors offers an additional, targeted knob: some subgroup vectors improve parity for certain groups, while others worsen it. The paper also gives a theoretical bound connecting vector scaling and parity gaps.
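The task-vector definition above can be sketched in a few lines. This is a minimal illustration, not the paper's code: it assumes model weights are held as dictionaries of NumPy arrays, and it assumes the merge adds the λ-scaled sum of subgroup vectors to the base weights. The function names `task_vector` and `merge` and the toy values are hypothetical.

```python
import numpy as np

def task_vector(finetuned, base):
    """Per-parameter difference: fine-tuned weights minus base weights."""
    return {name: finetuned[name] - base[name] for name in base}

def merge(base, vectors, lam):
    """Add the lambda-scaled sum of task vectors to the base weights.

    Assumption: a single scalar `lam` scales every subgroup vector uniformly.
    """
    merged = {}
    for name, weight in base.items():
        total = sum(v[name] for v in vectors)
        merged[name] = weight + lam * total
    return merged

# Toy weights standing in for real checkpoints (hypothetical values).
base = {"w": np.array([1.0, 2.0]), "b": np.array([0.5])}
ft_group_a = {"w": np.array([1.5, 2.0]), "b": np.array([0.7])}
ft_group_b = {"w": np.array([1.0, 1.0]), "b": np.array([0.5])}

tau_a = task_vector(ft_group_a, base)
tau_b = task_vector(ft_group_b, base)
merged = merge(base, [tau_a, tau_b], lam=0.5)
```

Setting `lam=0` returns the base weights unchanged, which gives a convenient sanity check before sweeping other values.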
Problem Statement
Fine-tuning large models is expensive and can keep or amplify subgroup biases. Task arithmetic (adding/subtracting model-weight differences) is cheap, but its effects on group fairness are not well understood. This paper asks: can task-vector edits match accuracy while reducing group disparities, and can scaling or subgroup-specific vectors be used as practical fairness controls?
Main Contribution
Conduct the first systematic empirical study of group fairness for task arithmetic versus full fine-tuning (FFT) and LoRA across text and vision tasks.
Show that sweeping a single global scaling coefficient λ over merged subgroup task vectors traces a smooth fairness-accuracy frontier: for many λ values, task addition preserves accuracy and reduces DPD/EOD.
Demonstrate subgroup-targeted edits: injecting specific subgroup task vectors into an FFT model can selectively improve or worsen fairness for particular groups.
Provide a theoretical upper bound linking deviations in task-vector scaling to increases in DPD and EOD, explaining empirical sensitivity to vector norms.
Key Findings
Uniformly scaled task-vector merges can reduce group disparities while keeping accuracy close to FFT/LoRA.
Injecting subgroup-specific task vectors moves fairness in group-dependent ways: some subgroup vectors improve parity, others hurt it.
A theoretical bound shows that fairness gaps grow with deviations in the scaling coefficient λ and with the norms of the subgroup task vectors.
Task arithmetic can recover much of LoRA’s fairness gains while closing some of LoRA’s accuracy loss on vision and text tasks.
Results
[Result panels not recovered. Surviving labels: Accuracy; Civil Comments (DistilBERT) worst-case DPD; typical fairness reductions on Civil Comments (midpoint comparisons).]
Who Should Care
What To Try In 7 Days
Compute subgroup-specific task vectors by fine-tuning small models on subgroup slices and subtracting base weights.
Merge vectors with a single scalar λ and sweep λ on validation to find acceptable accuracy–DPD/EOD trade-offs.
Test injecting worst-performing subgroup vectors into FFT models and measure effects across all subgroups before deployment.
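The λ sweep in the steps above might look like the following sketch. It assumes weights are dictionaries of NumPy arrays and that an `evaluate` callback returns validation accuracy and DPD for a set of weights. The toy linear classifier, the data, and the names `sweep_lambda` and `evaluate` are all hypothetical stand-ins for a real model and validation set.

```python
import numpy as np

def sweep_lambda(base, vectors, lambdas, evaluate):
    """Evaluate merged weights at each lambda and collect one record per value."""
    summed = {k: sum(v[k] for v in vectors) for k in base}
    records = []
    for lam in lambdas:
        merged = {k: base[k] + lam * summed[k] for k in base}
        accuracy, dpd = evaluate(merged)
        records.append({"lam": float(lam), "accuracy": accuracy, "dpd": dpd})
    return records

# Toy validation set: four examples, two features, two subgroups (hypothetical).
X = np.array([[1.0, 0.2], [0.8, -0.1], [-0.5, 0.9], [-0.7, 0.4]])
y = np.array([1, 1, 1, 0])
group = np.array([0, 0, 1, 1])

def evaluate(weights):
    """Accuracy and demographic parity difference of a linear threshold model."""
    pred = (X @ weights["w"] > 0).astype(int)
    accuracy = float((pred == y).mean())
    rates = [pred[group == g].mean() for g in (0, 1)]
    return accuracy, float(max(rates) - min(rates))

base = {"w": np.array([1.0, 0.0])}
vectors = [{"w": np.array([0.0, 1.0])}, {"w": np.array([0.0, 1.0])}]
records = sweep_lambda(base, vectors, [0.0, 0.5, 1.0], evaluate)
for r in records:
    print(r)
```

In this toy setup the middle value of λ gives the highest accuracy while the largest value gives the smallest DPD, so the records form a small trade-off curve to choose from.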
Optimization Features
Infra Optimization
- Reported experiments used about 30 H100 GPU-hours in total; the budget per vector is smaller
Model Optimization
- Use of task-vector arithmetic to edit behavior without retraining
System Optimization
- One-dimensional λ control reduces tuning complexity
Training Optimization
- Compute subgroup fine-tuned models separately; store differences
Inference Optimization
- Merged weights are applied once at load time, so inference adds no per-input compute overhead
Reproducibility
Data Urls
Code Available
Data Available
Open Source Status
- yes
Risks & Boundaries
Limitations
- Scope limited to open-weight 0.5–7B models; no evaluation on large proprietary API-only models.
- Single global λ cannot express richer, intersectional fairness constraints.
- Binary prediction tasks only; multi-label or generative settings not evaluated.
When Not To Use
- When you only have API access and cannot edit model weights.
- When fairness goals require intersectional or counterfactual guarantees beyond DPD/EOD.
- When subgroup task vectors have very large norms and even small λ changes cause large parity swings.
Failure Modes
- Negative transfer: merging vectors that benefit one group may degrade others.
- Overfitting to validation: selecting λ to maximize validation accuracy can increase worst-group disparity.
- Unstable effects on rare subgroups due to small subgroup sample sizes.
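One way to guard against the validation-overfitting failure mode is to select λ by fairness within an accuracy tolerance rather than by accuracy alone. The sketch below shows that selection rule. The paper is not stated to use it, and the function name `pick_lambda`, the record fields, the tolerance, and the numbers are illustrative assumptions.

```python
def pick_lambda(records, baseline_accuracy, max_accuracy_drop=0.01):
    """Return the record with the smallest worst-case gap among those
    whose accuracy stays within the allowed drop from the baseline.

    Returns None when no candidate meets the accuracy tolerance.
    """
    floor = baseline_accuracy - max_accuracy_drop
    eligible = [r for r in records if r["accuracy"] >= floor]
    if not eligible:
        return None
    # Ties on the gap are broken in favour of higher accuracy.
    return min(eligible, key=lambda r: (r["worst_gap"], -r["accuracy"]))

# Hypothetical sweep results, not figures from the paper.
records = [
    {"lam": 0.2, "accuracy": 0.90, "worst_gap": 0.20},
    {"lam": 0.5, "accuracy": 0.895, "worst_gap": 0.08},
    {"lam": 0.8, "accuracy": 0.85, "worst_gap": 0.03},
]
best = pick_lambda(records, baseline_accuracy=0.90, max_accuracy_drop=0.01)
```

Here the rule skips the most accurate setting and the fairest setting and returns the λ = 0.5 record, which is the fairest one inside the tolerance. Rechecking the chosen λ on a separate held-out split helps with the rare-subgroup instability noted above.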
Core Entities
Models
- LLaMA2-7B
- DistilBERT
- Qwen2.5-0.5B
- ViT-Base/16
Metrics
- Demographic Parity Difference (DPD)
- Equalized Odds Difference (EOD)
- Accuracy
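For reference, the two fairness metrics can be computed as in the sketch below. It uses common definitions: DPD is the largest gap in positive-prediction rate between groups, and EOD is the larger of the true-positive-rate gap and the false-positive-rate gap. The paper's exact formulation may differ, and the function names and toy arrays are hypothetical.

```python
import numpy as np

def demographic_parity_difference(y_pred, group):
    """Largest gap in positive-prediction rate between any two groups."""
    rates = [y_pred[group == g].mean() for g in np.unique(group)]
    return float(max(rates) - min(rates))

def equalized_odds_difference(y_true, y_pred, group):
    """Larger of the FPR gap and the TPR gap across groups.

    Assumes every group contains at least one example of each label.
    """
    gaps = []
    for label in (0, 1):  # label 0 yields the FPR gap, label 1 the TPR gap
        rates = []
        for g in np.unique(group):
            mask = (group == g) & (y_true == label)
            rates.append(y_pred[mask].mean())
        gaps.append(max(rates) - min(rates))
    return float(max(gaps))

# Toy binary predictions for two groups of four examples each.
y_true = np.array([1, 0, 1, 0, 1, 0, 1, 0])
y_pred = np.array([1, 0, 1, 1, 1, 0, 0, 0])
group = np.array([0, 0, 0, 0, 1, 1, 1, 1])

dpd = demographic_parity_difference(y_pred, group)      # 0.75 vs 0.25 -> 0.5
eod = equalized_odds_difference(y_true, y_pred, group)  # TPR gap 0.5, FPR gap 0.5 -> 0.5
```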
Datasets
- Berkeley D-Lab Hate Speech
- Civil Comments
- UTKFace

