Overview
Empirical evidence spans three datasets and four model families with bootstrap CIs; theoretical bounds explain observed trends and predict sensitivity to vector norms.
Citations: 0
Evidence Strength: 0.78
Confidence: 0.79
Risk Signals: 9
Trust Signals
Findings with numeric evidence: 4/4
Findings with evidence refs: 4/4
Results with explicit delta: 5/5
Reproducibility
Status: Code + data available
Open source: Yes
At A Glance
Cost impact: 65%
Production readiness: 60%
Novelty: 45%
Why It Matters For Business
Task-vector editing is a low-cost way to tune subgroup parity without full retraining. It can serve as an operational knob that reduces worst-case demographic gaps while keeping accuracy close to existing adaptation methods such as full fine-tuning and LoRA.
Summary TLDR
This paper measures how "task vectors" (the weight differences between a fine-tuned model and its base model) affect group fairness in binary text and image classification. Across hate-speech, toxicity, and age detection datasets and multiple models (LLaMA2-7B, DistilBERT, Qwen-2.5, ViT-Base), uniformly scaled task-vector merges (a single scalar λ) often preserve accuracy while substantially reducing demographic parity difference (DPD) and equalized odds difference (EOD). Injecting subgroup-specific vectors offers an additional, targeted knob: some subgroup vectors improve parity for certain groups, others worsen it. The paper also gives a theoretical bound connecting vector scaling and parity.
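The two fairness metrics named above can be sketched as follows. This is a minimal illustration that assumes the common definitions (DPD as the largest gap in positive-prediction rate across groups, EOD as the larger of the TPR gap and the FPR gap); the paper's exact "worst-case" variants may differ, and the function names are hypothetical.

```python
import numpy as np

def _rate(pred, mask):
    """Mean positive-prediction rate inside mask; NaN if the mask is empty."""
    return float(pred[mask].mean()) if mask.any() else float("nan")

def demographic_parity_difference(y_pred, groups):
    """Max minus min positive-prediction rate across groups (assumed definition)."""
    y_pred, groups = np.asarray(y_pred), np.asarray(groups)
    rates = [_rate(y_pred, groups == g) for g in np.unique(groups)]
    return max(rates) - min(rates)

def equalized_odds_difference(y_true, y_pred, groups):
    """Larger of the TPR gap and the FPR gap across groups (assumed definition)."""
    y_true, y_pred, groups = map(np.asarray, (y_true, y_pred, groups))
    gaps = []
    for label in (1, 0):  # label 1 gives the TPR gap, label 0 the FPR gap
        rates = [_rate(y_pred, (groups == g) & (y_true == label))
                 for g in np.unique(groups)]
        gaps.append(np.nanmax(rates) - np.nanmin(rates))
    return float(max(gaps))

# Made-up toy labels, not data from the paper.
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
print(demographic_parity_difference(y_pred, groups))
print(equalized_odds_difference(y_true, y_pred, groups))
```

Both metrics are zero when every group is treated identically, so lower is better.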
Problem Statement
Fine-tuning large models is expensive and can keep or amplify subgroup biases. Task arithmetic (adding/subtracting model-weight differences) is cheap, but its effects on group fairness are not well understood. This paper asks: can task-vector edits match accuracy while reducing group disparities, and can scaling or subgroup-specific vectors be used as practical fairness controls?
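As a rough illustration of the weight arithmetic involved, a task vector can be computed and re-applied over a dictionary of named parameters. This is a sketch under assumptions: plain NumPy arrays stand in for real model weights, and `task_vector` and `apply_task_vector` are hypothetical helper names, not the paper's code.

```python
import numpy as np

def task_vector(finetuned, base):
    """tau = theta_finetuned - theta_base, computed per named parameter."""
    return {name: finetuned[name] - base[name] for name in base}

def apply_task_vector(base, tau, lam=1.0):
    """theta = theta_base + lam * tau; a negative lam subtracts the task."""
    return {name: base[name] + lam * tau[name] for name in base}

# Made-up weights standing in for a base and a fine-tuned checkpoint.
base = {"layer.weight": np.array([[0.5, -0.2], [0.1, 0.3]]),
        "layer.bias": np.array([0.0, 0.1])}
finetuned = {"layer.weight": np.array([[0.7, -0.1], [0.0, 0.3]]),
             "layer.bias": np.array([0.2, 0.1])}

tau = task_vector(finetuned, base)
half_edit = apply_task_vector(base, tau, lam=0.5)  # halfway between base and fine-tuned
print(half_edit["layer.bias"])
```

Because the edit is only elementwise addition, it costs far less than a training run, which is the motivation the problem statement gives.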
Main Contribution
Presents the first systematic empirical study of group fairness for task arithmetic versus full fine-tuning (FFT) and LoRA across text and vision tasks.
Shows that sweeping a single global scaling coefficient λ over merged subgroup task vectors traces a smooth fairness-accuracy frontier: for many λ values, task addition preserves accuracy and reduces DPD/EOD.
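In symbols, the λ sweep can be written as below. The notation is an assumption made for illustration ($\theta_0$ for the base weights, $\theta_g$ for the model fine-tuned on subgroup $g$, $G$ for the set of subgroups, $\Lambda$ for the grid of swept values), and summation is assumed as the merge rule; the paper's own formulation may differ.

```latex
\tau_g = \theta_g - \theta_0, \qquad
\theta(\lambda) = \theta_0 + \lambda \sum_{g \in G} \tau_g, \qquad
\mathcal{F} = \bigl\{ \bigl(\mathrm{Acc}(\theta(\lambda)),\ \mathrm{DPD}(\theta(\lambda)),\ \mathrm{EOD}(\theta(\lambda))\bigr) : \lambda \in \Lambda \bigr\}
```

Under this reading, the frontier $\mathcal{F}$ is a one-parameter curve, which is why a single scalar suffices to move along it.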
Key Findings
Uniformly scaled task-vector merges can reduce group disparities while keeping accuracy close to FFT/LoRA.
Injecting subgroup-specific task vectors moves fairness in group-dependent ways: some subgroup vectors improve parity, others hurt it.
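The group-dependent effect of injection can be illustrated with a toy linear classifier. Everything here is an assumption for illustration: the data, the weights, and the two subgroup vectors are made up, and `injection_effect` is a hypothetical helper rather than the paper's procedure.

```python
import numpy as np

def predict(theta, X):
    """Toy linear classifier: positive when the score X @ w exceeds zero."""
    return (X @ theta["w"] > 0).astype(int)

def per_group_accuracy(theta, X, y, groups):
    pred = predict(theta, X)
    return {str(g): float((pred[groups == g] == y[groups == g]).mean())
            for g in np.unique(groups)}

def injection_effect(theta, tau, lam, X, y, groups):
    """Per-group accuracy change after adding lam * tau to theta."""
    edited = {k: theta[k] + lam * tau[k] for k in theta}
    before = per_group_accuracy(theta, X, y, groups)
    after = per_group_accuracy(edited, X, y, groups)
    return {g: after[g] - before[g] for g in before}

# Made-up toy data: columns are [feature, is_group_b].
X = np.array([[1.0, 0.0], [-1.0, 0.0], [1.0, 1.0], [-1.0, 1.0]])
y = np.array([1, 0, 1, 0])
groups = np.array(["a", "a", "b", "b"])
theta_fft = {"w": np.array([1.0, -3.0])}     # stand-in for a fine-tuned model

tau_helps_b = {"w": np.array([0.0, 3.0])}    # hypothetical subgroup vector
tau_hurts_a = {"w": np.array([-2.0, 0.0])}   # hypothetical subgroup vector

print(injection_effect(theta_fft, tau_helps_b, 1.0, X, y, groups))
print(injection_effect(theta_fft, tau_hurts_a, 1.0, X, y, groups))
```

In this toy setup the first vector raises accuracy for group b without touching group a, while the second lowers accuracy for group a, which mirrors the mixed outcomes described in the finding.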
Results
| Metric | Value | Baseline | Delta | Split / Dataset | Evidence | Evidence Ref |
|---|---|---|---|---|---|---|
| Accuracy | 0.9395 (point estimate) | DistilBERT SFT 0.9457–0.9476 | ≈ -0.006 to -0.008 | Civil Comments (Gender) | Table 2 DistilBERT entries | Table 2 |
| Worst-case DPD (DistilBERT) | 0.0454 (Task Addition) | SFT 0.0887–0.1101; LoRA 0.0735–0.0812 | ≈ -0.043 to -0.065 (vs SFT range; ≈ -0.054 vs SFT midpoint) | Civil Comments (Gender) | Table 2 DistilBERT entries | Table 2 |
What To Try In 7 Days
Compute subgroup-specific task vectors by fine-tuning small models on subgroup slices and subtracting base weights.
Merge vectors with a single scalar λ and sweep λ on validation to find acceptable accuracy–DPD/EOD trade-offs.
Test injecting worst-performing subgroup vectors into FFT models and measure effects across all subgroups before deployment.
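The merge-and-sweep step in the list above could be organized roughly as follows. This is a sketch under stated assumptions: subgroup vectors are merged by summation, `evaluate` is a caller-supplied hypothetical function that returns `(accuracy, gap)` on a validation set, and the stand-in evaluator in the usage lines is a placeholder, not a real model evaluation.

```python
import numpy as np

def merge_vectors(vectors):
    """Sum per-subgroup task vectors parameter by parameter (assumed merge rule)."""
    names = vectors[0].keys()
    return {n: sum(v[n] for v in vectors) for n in names}

def sweep_lambda(base, vectors, lambdas, evaluate):
    """Evaluate base + lam * merged for each lam and collect one row per value."""
    merged = merge_vectors(vectors)
    rows = []
    for lam in lambdas:
        theta = {n: base[n] + lam * merged[n] for n in base}
        accuracy, gap = evaluate(theta)
        rows.append({"lam": float(lam), "accuracy": accuracy, "gap": gap})
    return rows

# Placeholder inputs; replace `toy_evaluate` with real validation metrics.
base = {"w": np.zeros(2)}
vectors = [{"w": np.array([1.0, 0.0])}, {"w": np.array([0.0, 1.0])}]

def toy_evaluate(theta):
    w = theta["w"]
    return float(w.sum()), float(abs(w[0] - w[1]))

for row in sweep_lambda(base, vectors, [0.0, 0.5, 1.0], toy_evaluate):
    print(row)
```

Keeping the evaluator as a parameter lets the same loop report DPD, EOD, or any per-subgroup metric without changing the sweep itself.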
Risks & Boundaries
Limitations
Scope limited to open-weight 0.5–7B models; no evaluation on large proprietary API-only models.
Single global λ cannot express richer, intersectional fairness constraints.
When Not To Use
When you only have API access and cannot edit model weights.
When fairness goals require intersectional or counterfactual guarantees beyond DPD/EOD.
Failure Modes
Negative transfer: merging vectors that benefit one group may degrade others.
Overfitting to validation: selecting λ to maximize validation accuracy can increase worst-group disparity.
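One possible guard against the validation-overfitting failure mode is to pick λ by maximizing accuracy only among candidates whose disparity stays under a budget. This selection rule is an illustrative assumption, not a method from the paper, and the rows below are made-up numbers rather than reported results.

```python
def select_lambda(rows, max_gap):
    """Pick the most accurate row whose disparity gap is within max_gap.

    Falls back to the smallest-gap row when no candidate meets the budget.
    """
    feasible = [r for r in rows if r["gap"] <= max_gap]
    if not feasible:
        return min(rows, key=lambda r: r["gap"])
    return max(feasible, key=lambda r: r["accuracy"])

# Made-up sweep results for illustration only.
rows = [
    {"lam": 0.2, "accuracy": 0.930, "gap": 0.040},
    {"lam": 0.6, "accuracy": 0.940, "gap": 0.045},
    {"lam": 1.0, "accuracy": 0.945, "gap": 0.090},
]

accuracy_only = max(rows, key=lambda r: r["accuracy"])
constrained = select_lambda(rows, max_gap=0.05)
print(accuracy_only["lam"], constrained["lam"])
```

In this toy example the accuracy-only choice lands on the λ with the largest gap, while the constrained choice gives up a small amount of accuracy to stay within the budget.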

