The key idea is elegantly simple: select training points where the current model's loss is high, but a reference model trained on clean holdout data can predict them well.
For each candidate point (x, y), compute:
$$\text{Reducible Loss} = \underbrace{L[y|x; \text{current model}]}_{\text{training loss}} - \underbrace{L[y|x; \text{IL model}]}_{\text{irreducible loss}}$$
Select points with the highest reducible loss for training.
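The selection rule above can be sketched in a few lines. This is a minimal illustration with made-up per-example losses, not the paper's implementation: in practice `train_losses` would come from the model being trained and `irreducible` from the frozen IL model, both evaluated on the same candidate `(x, y)` pairs.

```python
# Hypothetical per-example losses for four candidate points.
train_losses = [2.3, 0.1, 1.8, 2.5]  # L[y|x; current model]
irreducible  = [2.2, 0.1, 0.2, 2.4]  # L[y|x; IL model]

# Reducible loss = training loss - irreducible loss, per point.
reducible = [t - i for t, i in zip(train_losses, irreducible)]

# Keep the top-k points by reducible loss for the gradient step.
k = 2
selected = sorted(range(len(reducible)), key=reducible.__getitem__, reverse=True)[:k]
# Point 2 (large gap) ranks first; point 1 (already learned) is never picked.
```

Note that points 0 and 3 have high training loss but are barely selected ahead of point 1: their irreducible loss is also high, so subtracting it filters them out.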
Step 1: Before training, train a small "irreducible loss (IL) model" on a holdout set. This model learns what's predictable from the data distribution.
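As a concrete stand-in for Step 1, here is a toy sketch: a deliberately tiny IL "model" (logistic regression trained by gradient descent on a synthetic holdout set). The data, model class, and hyperparameters are all illustrative assumptions; the only requirement is that the IL model is trained on holdout data and then frozen, exposing a per-example loss.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical holdout set: 2-D features with a linearly predictable label.
X_ho = rng.normal(size=(200, 2))
y_ho = (X_ho[:, 0] + X_ho[:, 1] > 0).astype(float)

# A deliberately small IL model: logistic regression fit by gradient descent.
w = np.zeros(2)
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-X_ho @ w))
    w -= 0.1 * X_ho.T @ (p - y_ho) / len(y_ho)

def irreducible_loss(X, y):
    """Per-example cross-entropy under the frozen IL model."""
    p = 1.0 / (1.0 + np.exp(-X @ w))
    return -(y * np.log(p + 1e-9) + (1 - y) * np.log(1 - p + 1e-9))
```

After this one-off fit, `irreducible_loss` is the only thing the main training loop needs from the IL model.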
Step 2: During training, score each batch of candidate points by reducible loss and train only on the top scorers. This scoring naturally separates three kinds of points:
| Point Type | Training Loss | Irreducible Loss | Reducible Loss | Selected? |
|---|---|---|---|---|
| Noisy/mislabeled | High | High (IL model also can't predict wrong labels) | Low | ✗ |
| Already learned | Low | Low | Low | ✗ |
| Learnable & not yet learned | High | Low | High | ✓ |
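The table's three cases can be checked mechanically. A small sketch (with hypothetical loss values chosen to match the table rows, one per point type):

```python
def select_batch(train_losses, irreducible_losses, k):
    """Rank candidates by reducible loss and return the top-k indices."""
    reducible = [t - i for t, i in zip(train_losses, irreducible_losses)]
    return sorted(range(len(reducible)), key=reducible.__getitem__, reverse=True)[:k]

# Rows of the table: noisy, already learned, learnable & not yet learned.
train = [3.0, 0.2, 2.8]  # training loss: high, low, high
irred = [2.9, 0.1, 0.3]  # irreducible loss: high, low, low
# Reducible losses are roughly [0.1, 0.1, 2.5]: only the learnable point stands out.
```

Calling `select_batch(train, irred, 1)` picks index 2, the learnable-but-not-yet-learned point, exactly as the ✓ in the table indicates.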
The IL model can be surprisingly cheap: a small CNN suffices even when the main model is a large ResNet. Because the IL model is frozen after Step 1, its per-example losses can be computed once and reused across all training runs, hyperparameter sweeps, and architectures.
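One way to realize that reuse is a simple cache keyed by example id, so the IL forward pass runs at most once per training point. A minimal sketch (the cache structure and `compute_loss` callback are illustrative, not from the source):

```python
# Maps example_id -> precomputed irreducible loss for that example.
irreducible_cache = {}

def get_irreducible(example_id, compute_loss):
    """Return the cached IL loss, computing it only on first access."""
    if example_id not in irreducible_cache:
        irreducible_cache[example_id] = compute_loss(example_id)
    return irreducible_cache[example_id]
```

In practice the cache would be persisted to disk (e.g. as an array indexed by example id) so later runs and sweeps skip the IL forward pass entirely.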