
🏔️ Understanding Plateaus & ReduceLROnPlateau

1. What is a Plateau?

Plateau: When your model's loss (or accuracy) stops improving and stays flat for several epochs.
For example, a training curve might fall quickly for the first 10 epochs and then hover near 0.3 loss from epoch 10 through 25. The loss barely changes; the model is "stuck".
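To see what a plateau looks like without a real training run, here is a tiny illustrative sketch (the curve is synthetic, chosen to mimic the shape described above, not data from an actual model):

import numpy as np

# Synthetic loss curve: decays quickly, then flattens near 0.3
epochs = np.arange(1, 31)
loss = 0.3 + 0.7 * np.exp(-epochs / 3)

for e, l in zip(epochs, loss):
    note = "  <- plateau: barely changing" if e >= 10 else ""
    print(f"Epoch {e:2d}: loss = {l:.3f}{note}")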

2. Why Does a Plateau Happen?

❌ Problem: Learning Rate Too Large

Steps are too big - the model "bounces around" the optimal point without settling in.

Analogy: Like trying to walk down stairs by jumping 5 steps at a time - you'll miss the bottom!

❌ Problem: Learning Rate Too Small

Steps are too tiny - progress is so slow that training appears stuck.

Analogy: Like walking down stairs taking baby steps - it takes forever!
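Both failure modes are easy to reproduce with plain gradient descent on the toy function f(x) = x², whose minimum is at x = 0. This sketch is my own illustration, not from the original text:

def descend(lr, steps=20):
    """Gradient descent on f(x) = x**2; the gradient is 2*x."""
    x = 5.0
    for _ in range(steps):
        x -= lr * 2 * x   # step against the gradient
    return x

print(descend(lr=1.0))    # too large: x flips between +5 and -5, never settles (returns 5.0)
print(descend(lr=0.001))  # too small: x barely moves from 5.0, looks stuck (returns ~4.80)
print(descend(lr=0.3))    # well-chosen: x converges very close to the minimum at 0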

3. What is ReduceLROnPlateau?

A callback that automatically reduces the learning rate when training plateaus.

The Logic: "If the model hasn't improved in a while, let's try smaller steps!"
For example: if a plateau is detected at epoch 10 and the LR drops from 0.01 to 0.005, the loss typically starts improving again. Over a long training run this detect-and-reduce cycle can repeat several times.

4. How ReduceLROnPlateau Works

Walking through six epochs with patience=3 and factor=0.5, starting from LR = 0.01:

| Epoch | Val Loss | Best So Far | Patience Counter | Action | Learning Rate |
|-------|----------|-------------|------------------|--------|---------------|
| 1 | 0.500 | 0.500 | 0 | ✅ New best! | 0.01 |
| 2 | 0.450 | 0.450 | 0 | ✅ New best! | 0.01 |
| 3 | 0.455 | 0.450 | 1 | ⚠️ No improvement | 0.01 |
| 4 | 0.452 | 0.450 | 2 | ⚠️ No improvement | 0.01 |
| 5 | 0.451 | 0.450 | 3 → 0 (reset) | 🔻 REDUCE LR! (patience=3 reached) | 0.005 |
| 6 | 0.430 | 0.430 | 0 | ✅ New best! (smaller LR helps) | 0.005 |
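The decision rule in the table boils down to a patience counter. Here is a simplified sketch of that logic (my own simplification; the real Keras callback also handles min_delta, cooldown, and mode):

best, wait, lr = float("inf"), 0, 0.01
patience, factor, min_lr = 3, 0.5, 1e-6

for epoch, val_loss in enumerate([0.500, 0.450, 0.455, 0.452, 0.451, 0.430], start=1):
    if val_loss < best:
        best, wait = val_loss, 0           # new best: reset the counter
    else:
        wait += 1                          # no improvement this epoch
        if wait >= patience:
            lr = max(lr * factor, min_lr)  # reduce, but never below min_lr
            wait = 0                       # reset after reducing
    print(f"Epoch {epoch}: val_loss={val_loss:.3f}  lr={lr}")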

5. Code Example

from tensorflow.keras.callbacks import ReduceLROnPlateau

reduce_lr = ReduceLROnPlateau(
    monitor='val_loss',      # Watch validation loss
    factor=0.5,              # Reduce LR by half (new_lr = old_lr * 0.5)
    patience=10,             # Wait 10 epochs without improvement
    min_lr=1e-6,             # Don't go below 0.000001
    verbose=1                # Print messages
)

# `model` is assumed to be a compiled Keras model; capture the
# History object so the logged metrics can be inspected afterwards
history = model.fit(
    X_train, y_train,
    validation_data=(X_val, y_val),
    epochs=100,
    callbacks=[reduce_lr]
)
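If you capture the History object returned by model.fit() (as done above), the callback also records the learning rate it used each epoch. One caveat: depending on your version, the history key is 'lr' (older tf.keras releases) or 'learning_rate' (Keras 3), so this sketch checks for both:

# Inspect how the learning rate evolved across epochs
lr_key = 'learning_rate' if 'learning_rate' in history.history else 'lr'
print(history.history[lr_key])  # e.g. [0.001, 0.001, ..., 0.0005, 0.0005, ...]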

6. Key Parameters Explained

| Parameter | What It Does | Example |
|-----------|--------------|---------|
| monitor | Which metric to watch | 'val_loss', 'val_accuracy' |
| patience | How many epochs without improvement before reducing | 10 = wait 10 epochs |
| factor | Multiplier applied to the LR on each reduction | 0.5 = cut in half, 0.2 = reduce to 20% |
| min_lr | Floor below which the LR is never reduced | 1e-6 = never go below 0.000001 |
| verbose | Print a message when the LR is reduced | 1 = yes, 0 = no |
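The callback can also watch a metric where higher is better. A variant configuration (the parameter values here are just an example, my choice rather than anything from the original text):

reduce_lr_acc = ReduceLROnPlateau(
    monitor='val_accuracy',  # watch validation accuracy instead of loss
    mode='max',              # "improvement" now means the metric going UP
    factor=0.2,              # more aggressive: new_lr = old_lr * 0.2
    patience=5,
    min_lr=1e-6,
    verbose=1
)

The default mode='auto' usually infers the direction from the metric name, but setting mode='max' makes the intent explicit.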

7. Real Example Output

Epoch 1/100
loss: 0.6500 - val_loss: 0.6200

Epoch 10/100
loss: 0.3200 - val_loss: 0.3500

Epoch 20/100
loss: 0.3150 - val_loss: 0.3480
Epoch 21: ReduceLROnPlateau reducing learning rate to 0.0005.

Epoch 25/100
loss: 0.2800 - val_loss: 0.3100  ← Loss improving again!

Epoch 35/100
loss: 0.2750 - val_loss: 0.3050
Epoch 36: ReduceLROnPlateau reducing learning rate to 0.00025.
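The arithmetic behind this log: each reduction multiplies the LR by factor, and min_lr puts a floor under the sequence. A quick sketch of every value the LR can take with factor=0.5, starting from Adam's default of 0.001 (which is what the log above implies):

lr, factor, min_lr = 0.001, 0.5, 1e-6
schedule = [lr]
while lr > min_lr:
    lr = max(lr * factor, min_lr)  # halve, but clamp at the floor
    schedule.append(lr)
print(schedule)  # [0.001, 0.0005, 0.00025, ..., 1e-06]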

8. When to Use ReduceLROnPlateau?

✅ Use When:

  • You're not sure which learning rate to use
  • You're training for many epochs (100+)
  • You want the learning rate adjusted automatically
  • You're working with small datasets

❌ Don't Use When:

  • Running an LR-finder experiment (the LR needs to follow a fixed sweep)
  • Already using a LearningRateScheduler (the two callbacks would conflict over the LR)
  • Training for very few epochs (patience never gets a chance to trigger)
  • You need a fixed LR for fair comparisons between runs

9. Summary

🎯 The Big Picture:

  1. Plateau = the model stops improving (it's stuck)
  2. ReduceLROnPlateau = automatically shrinks the learning rate when training is stuck
  3. Why it works = smaller steps let the optimizer settle into a minimum it was bouncing over, so fine-tuning can continue
  4. Result = better final performance without manual tuning!

Think of it as: Your model's GPS saying "You're close to the destination, take smaller steps now!"