Send tasks as tiny label payloads: train clients from a shared image pool with under 1 MB of transfer

Overview

Decision Snapshot: Ready For Pilot

Method is simple and relies on public datasets and standard compression; results are strong on many natural-image tasks but require clients to store a large reference set and need tuned filtering for far-OOD domains.

Citations: 0

Evidence Strength: 0.80

Confidence: 0.80

Risk Signals: 10

Trust Signals

Findings with numeric evidence: 4/4

Findings with evidence refs: 4/4

Results with explicit delta: 3/4

Reproducibility

Status: Partial assets available

Open source: Partial

At A Glance

Cost impact: 80%

Production readiness: 70%

Novelty: 60%

Authors

Elad Kimchi Shoshani, Leeyam Gabay, Yedid Hoshen

Links

Abstract / PDF / Data

Why It Matters For Business

If clients can store a shared unlabeled image pool, servers can deliver new classification tasks with tiny label-only payloads (<1 MB). This cuts recurring transfer costs drastically and enables operation over very low-bandwidth links.

Who Should Care

Product Manager, ML Engineer, Founder, Data Scientist

Summary TLDR

PLADA (Pseudo-Labels as Data) lets a server convey a classification task by sending only hard labels for images in a large, preloaded reference dataset (e.g., ImageNet-21K). Using energy-based pruning, a class-preserving Safety-Net, and standard compression (Zstd), PLADA often fits the task payload under 1 MB (85–206 KB at 1% keep) while retaining strong classification accuracy on many natural-image benchmarks. Far-OOD tasks (medical images) need a different selection rule (keeping high-energy images) and show bigger accuracy drops. The method requires clients to store the reference image pool beforehand.

Problem Statement

Dataset servers must repeatedly send large training data to heterogeneous clients. Sending model weights is not always feasible. Existing dataset distillation struggles to scale to high-resolution data or to produce tiny payloads. We need a method that compresses the training signal by orders of magnitude while keeping client-side training effective under extreme bandwidth limits.

Main Contribution

PLADA: represent a task by sending only hard pseudo-labels for a preloaded reference image pool, eliminating pixel transfer.

Pruning + Safety-Net: use energy-based OOD scores to keep a tiny fraction (1%–10%) of reference images and a class quota to avoid class collapse under extreme compression.
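As a rough illustration of the two contributions above, the sketch below turns teacher logits into hard pseudo-labels, ranks reference images by an energy score, and applies a per-class quota. It assumes the common energy definition E(x) = -T * logsumexp(logits / T); the paper's exact score, its quota rule, and every name used here (`energy`, `hard_label`, `select`, `min_per_class`, `prefer_low_energy`) are illustrative assumptions, not the authors' implementation.

```python
import math
from collections import defaultdict


def energy(logits, temperature=1.0):
    """Assumed energy score: -T * logsumexp(logits / T). Lower = more in-distribution."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    return -temperature * (m + math.log(sum(math.exp(s - m) for s in scaled)))


def hard_label(logits):
    """Hard pseudo-label: index of the largest teacher logit."""
    return max(range(len(logits)), key=logits.__getitem__)


def select(teacher_logits, keep_fraction=0.01, min_per_class=1, prefer_low_energy=True):
    """Return a sorted list of (reference_index, pseudo_label) pairs to transmit."""
    scored = [(energy(l), i, hard_label(l)) for i, l in enumerate(teacher_logits)]
    scored.sort(reverse=not prefer_low_energy)  # best-ranked images first
    budget = max(1, round(keep_fraction * len(scored)))

    kept = {}
    per_class = defaultdict(int)
    # Pass 1 (hypothetical Safety-Net): reserve the best-ranked images of each class.
    for _, i, y in scored:
        if per_class[y] < min_per_class:
            kept[i] = y
            per_class[y] += 1
    # Pass 2: fill whatever budget remains purely by rank.
    for _, i, y in scored:
        if len(kept) >= budget:
            break
        kept.setdefault(i, y)
    return sorted(kept.items())
```

Setting `prefer_low_energy=False` mimics the high-energy selection that the summary mentions for far-OOD tasks. In this sketch the quota pass can exceed the nominal budget when the number of classes is large relative to the keep rate.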

Key Findings

Task transfer with payloads well below 1 MB is practical.

Numbers: Zstd-compressed payload at 1% keep: 85–206 KB (Table 4)

Practical Use: If clients preload a large unlabeled image set, servers can send task labels instead of images and fit extreme links (deep-sea, rover).

Evidence Ref: Table 4
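To put the 85–206 KB range in perspective, the small calculation below estimates how many compressed bits are available per kept entry. The pool size is an assumption: this summary does not state it, and the round figure of 14 million images is only the size commonly cited for ImageNet-21K. The helper name `bits_per_kept_entry` is hypothetical, and the Safety-Net may make the true kept count differ slightly from exactly 1%.

```python
# Assumed round figure for the reference pool size; not stated in this summary.
ASSUMED_POOL_SIZE = 14_000_000


def bits_per_kept_entry(payload_kb, pool_size=ASSUMED_POOL_SIZE, keep_fraction=0.01):
    """Average compressed bits available per kept (index, label) entry."""
    kept = round(pool_size * keep_fraction)
    return payload_kb * 1000 * 8 / kept  # treats 1 KB as 1000 bytes


low = bits_per_kept_entry(85)
high = bits_per_kept_entry(206)
print(f"{low:.1f} to {high:.1f} bits per kept entry")
```

Under these assumptions the budget works out to roughly 5 to 12 bits per kept image, covering both its position in the pool and its label.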

Aggressive pruning improves or preserves accuracy on many natural-image tasks.

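Numbers: Example: Caltech-101 accuracy at 1% energy-filtered keep = 79.84% vs 92.74% with the full reference set, a 12.90 pp drop on this dataset (Table 1)

Practical Use: Send labels for only the 1%–10% lowest-energy reference images to cut bandwidth. The summary reports that this often preserves or improves the client's final accuracy, but validate per task: the Caltech-101 example above loses 12.90 pp at 1% keep.

Evidence Ref: Table 1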

Results

Metric	Value	Baseline	Delta	Split / Dataset	Evidence	Evidence Ref
Compressed payload size (Zstd)	85–206 KB at 1% keep (ImageNet-21K reference)	—	—	Aggregate (Table 4 ranges)	Table 4: Zstd sizes for p=1%	Table 4
Accuracy	Caltech-101: 79.84%	Full reference (100%): 92.74%	-12.90 pp	Caltech-101 (Table 1)	Table 1, 1% vs 100%	Table 1

What To Try In 7 Days

Preload a moderate reference image pool on test clients (ImageNet-like or domain-specific).

Implement teacher-side pseudo-labeling and energy-based ranking on one target task.

Compress the selected indices+labels with delta/RLE and Zstd and measure payload vs local training accuracy.
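For the third step, a minimal round-trip sketch might look like the following. It delta-encodes the sorted indices, stores gaps and labels as variable-length integers, and compresses the result. Two simplifications are assumptions of this sketch: `zlib` stands in for Zstd because it ships with Python, and the RLE stage is omitted. The byte layout and the function names (`encode_payload`, `decode_payload`) are hypothetical, not the paper's format.

```python
import zlib


def _write_varint(n, out):
    """Append a non-negative integer as a little-endian base-128 varint."""
    while n >= 0x80:
        out.append((n & 0x7F) | 0x80)
        n >>= 7
    out.append(n)


def _read_varint(buf, pos):
    """Read one varint starting at pos; return (value, next_pos)."""
    shift = value = 0
    while True:
        b = buf[pos]
        pos += 1
        value |= (b & 0x7F) << shift
        if b < 0x80:
            return value, pos
        shift += 7


def encode_payload(pairs):
    """pairs: iterable of (reference_index, label) with unique, non-negative indices."""
    pairs = sorted(pairs)
    raw = bytearray()
    _write_varint(len(pairs), raw)
    prev = 0
    for idx, _ in pairs:  # gaps between sorted indices are small numbers
        _write_varint(idx - prev, raw)
        prev = idx
    for _, label in pairs:  # labels follow in the same order
        _write_varint(label, raw)
    return zlib.compress(bytes(raw), 9)  # stand-in for Zstd


def decode_payload(blob):
    """Invert encode_payload; returns (index, label) pairs sorted by index."""
    raw = zlib.decompress(blob)
    count, pos = _read_varint(raw, 0)
    indices, prev = [], 0
    for _ in range(count):
        gap, pos = _read_varint(raw, pos)
        prev += gap
        indices.append(prev)
    labels = []
    for _ in range(count):
        label, pos = _read_varint(raw, pos)
        labels.append(label)
    return list(zip(indices, labels))
```

Measuring `len(encode_payload(pairs))` against the student's accuracy after local training gives the payload-versus-accuracy curve that the step asks for.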

Agent Features

Tool Use

Zstd, RLE, Huffman coding

Optimization Features

Infra Optimization

reduce bandwidth by replacing pixel transfer with label payload

System Optimization

delta-index encoding, bitmap vs index choice
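One way to reason about the bitmap-versus-index choice is to compare raw, pre-compression costs: a bitmap spends one bit per reference image regardless of how many are kept, while an index list spends a fixed number of bits per kept image. The sketch below applies that comparison; the decision rule and the names (`index_bits`, `raw_cost_bits`, `cheaper_layout`) are assumptions for illustration, and a real pipeline would compare sizes after compression.

```python
def index_bits(pool_size):
    """Bits needed to address any image in a pool of the given size."""
    return max(1, (pool_size - 1).bit_length())


def raw_cost_bits(pool_size, kept):
    """Uncompressed cost of marking which images are kept, for both layouts."""
    return {"bitmap": pool_size, "index": kept * index_bits(pool_size)}


def cheaper_layout(pool_size, kept):
    """Pick the layout with the smaller raw cost (ties go to the bitmap)."""
    costs = raw_cost_bits(pool_size, kept)
    return "bitmap" if costs["bitmap"] <= costs["index"] else "index"
```

At low keep rates the index list tends to win on raw size, which is where delta encoding of sorted indices becomes useful; at high keep rates the bitmap takes over.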

Training Optimization

pruning dataset with energy-based OOD scores, Safety-Net class quota to avoid collapse, importance weighting (discussed)

Reproducibility

Code Available: No

Data Available: Yes

Open Source Status: Partial

License: Unknown

Data URLs

ImageNet-21K and ImageNet-1K (public references); target datasets listed in paper (public sources)

Risks & Boundaries

Limitations

Requires clients to store a large unlabeled reference dataset locally.

Works only for classification tasks as evaluated; regression/generative tasks are not handled yet.

When Not To Use

Clients cannot store the reference image pool due to storage or privacy constraints.

Tasks are regression or generative and cannot be represented by hard labels alone.

Failure Modes

Class collapse when extreme pruning removes rare classes unless Safety-Net is used.

Spurious label mappings for far-OOD tasks causing student collapse (medical datasets).

Core Entities

Models

ConvNeXt-V2-Tiny (teacher), ResNet-18 (student)

Metrics

Accuracy, payload size (KB/MB), keep rate p (%)

Datasets

ImageNet-21K (reference), ImageNet-1K (reference), Caltech-101, CIFAR-10, CUB-200, DTD, FGVC-Aircraft, Food-101, Oxford-Flowers-102, Oxford-IIIT-Pet, Places365, RESISC45, BloodMNIST, DermaMNIST, RetinaMNIST, NCT-CRC-HE-100K

Benchmarks

14 classification datasets (10 natural + 4 medical OOD)

Context Entities

Models

linear probe baseline, INT8 quantized ResNet-18 (baseline variants)

Metrics

Accuracy, intersection fraction (<1% overlap)

Datasets

reference vs target split analysis for leakage checks
