Part 25: Training Neural Networks, Optimizers, and Learning Rate Schedulers

by digitaltech2.com

Optimizers are algorithms that adjust the parameters of a neural network to minimize the loss function. Learning rate schedulers adjust the learning rate during training to improve performance and convergence.
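
Conceptually, an optimizer step nudges each parameter in the direction that reduces the loss. The snippet below is a minimal, purely illustrative sketch of a single manual gradient-descent update (roughly what optim.SGD does internally in the simplest case, with no momentum or weight decay):

import torch

# Illustrative only: one manual gradient-descent step on a single parameter
w = torch.tensor([1.0], requires_grad=True)
loss = (w * 3.0 - 6.0).pow(2).sum()   # toy loss, minimized at w = 2
loss.backward()                        # populates w.grad

lr = 0.01
with torch.no_grad():
    w -= lr * w.grad                   # w <- w - lr * dL/dw
w.grad.zero_()                         # clear the gradient before the next step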

Common Optimizers

PyTorch provides several built-in optimizers that implement various optimization algorithms.

Stochastic Gradient Descent (SGD):

  • Description: A simple and widely used optimization algorithm that updates the model parameters in the direction opposite to the gradient of the loss function, optionally with momentum.
  • Usage:
import torch.optim as optim

optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
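
As a quick sanity check (a standalone sketch that uses a single tensor as the parameter rather than the model above, with momentum left at its default of 0), one optimizer.step() applies exactly the update w <- w - lr * grad:

import torch
import torch.optim as optim

w = torch.tensor([2.0], requires_grad=True)
opt = optim.SGD([w], lr=0.1)   # momentum defaults to 0

loss = (w ** 2).sum()          # dL/dw = 2w = 4.0
loss.backward()
opt.step()                     # w <- 2.0 - 0.1 * 4.0 = 1.6
opt.zero_grad()

print(w.item())                # ~1.6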

Adam (Adaptive Moment Estimation):

  • Description: An optimization algorithm that combines the benefits of AdaGrad and RMSProp, using adaptive learning rates for each parameter.
  • Usage:
optimizer = optim.Adam(model.parameters(), lr=0.001)
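
Adam also exposes the decay rates of its first and second moment estimates (betas) and an optional weight decay term; the call below is the same optimizer with those defaults spelled out:

optimizer = optim.Adam(
    model.parameters(),
    lr=0.001,
    betas=(0.9, 0.999),   # decay rates for the moment estimates
    eps=1e-08,            # small constant for numerical stability
    weight_decay=0.0      # L2 penalty (0 disables it)
)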

RMSProp:

  • Description: An optimization algorithm that adapts the learning rate for each parameter based on a moving average of its squared gradients.
  • Usage:
optimizer = optim.RMSprop(model.parameters(), lr=0.01)
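
The smoothing constant for RMSProp's squared-gradient average is exposed as alpha (default 0.99), and a momentum term can be added as well:

optimizer = optim.RMSprop(
    model.parameters(),
    lr=0.01,
    alpha=0.99,    # smoothing constant for the squared-gradient average
    momentum=0.9   # optional momentum term (default 0)
)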

Example: Using Different Optimizers

  • Optimizer Example:
# Using SGD
optimizer_sgd = optim.SGD(model.parameters(), lr=0.01, momentum=0.9)

# Using Adam
optimizer_adam = optim.Adam(model.parameters(), lr=0.001)

# Using RMSProp
optimizer_rmsprop = optim.RMSprop(model.parameters(), lr=0.01)

Learning Rate Schedulers

Learning rate schedulers adjust the learning rate as training progresses, typically lowering it over time to help training converge.

StepLR:

  • Description: Multiplies the learning rate by a factor gamma every step_size epochs.
  • Usage:
scheduler = optim.lr_scheduler.StepLR(optimizer, step_size=30, gamma=0.1)
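
As a self-contained illustration (using a throwaway single-parameter optimizer rather than the one above), the loop below shows the learning rate being cut by gamma every step_size epochs:

import torch
import torch.optim as optim

w = torch.zeros(1, requires_grad=True)
opt = optim.SGD([w], lr=0.1)
sched = optim.lr_scheduler.StepLR(opt, step_size=2, gamma=0.1)

for epoch in range(6):
    print(epoch, opt.param_groups[0]['lr'])   # learning rate in effect this epoch
    opt.step()                                # a real training step would go here
    sched.step()
# prints 0.1 for epochs 0-1, 0.01 for epochs 2-3, 0.001 for epochs 4-5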

ExponentialLR:

  • Description: Multiplies the learning rate by gamma after every epoch, so it decays exponentially.
  • Usage:
scheduler = optim.lr_scheduler.ExponentialLR(optimizer, gamma=0.9)
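
With ExponentialLR, the learning rate after n calls to scheduler.step() is simply the initial rate multiplied by gamma**n, which is easy to check by hand:

# Closed-form check (assuming an initial learning rate of 0.01 and gamma=0.9)
initial_lr, gamma, epochs = 0.01, 0.9, 10
print(initial_lr * gamma ** epochs)   # ~0.00349 after 10 epochs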

ReduceLROnPlateau:

  • Description: Reduces the learning rate when a metric has stopped improving.
  • Usage:
scheduler = optim.lr_scheduler.ReduceLROnPlateau(optimizer, 'min')
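
ReduceLROnPlateau also takes a factor (how much to shrink the learning rate) and a patience (how many epochs without improvement to tolerate); the call below spells out their defaults:

scheduler = optim.lr_scheduler.ReduceLROnPlateau(
    optimizer,
    mode='min',    # a lower metric (e.g. validation loss) counts as improvement
    factor=0.1,    # multiply the learning rate by this factor on a plateau
    patience=10    # epochs with no improvement before reducing the learning rate
)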

Example: Combining Optimizers and Schedulers

  • Combined Example:
import torch.optim as optim

model = SimpleNN()  # assumes SimpleNN, loss_fn, and dataloader are defined as in earlier parts
optimizer = optim.Adam(model.parameters(), lr=0.001)
scheduler = optim.lr_scheduler.StepLR(optimizer, step_size=30, gamma=0.1)

for epoch in range(100):
    for batch_inputs, batch_targets in dataloader:
        optimizer.zero_grad()
        outputs = model(batch_inputs)
        loss = loss_fn(outputs, batch_targets)
        loss.backward()
        optimizer.step()

    scheduler.step()  # Update the learning rate

    if epoch % 10 == 0:
        print(f'Epoch [{epoch}/100], Loss: {loss.item():.4f}, LR: {scheduler.get_last_lr()[0]}')

Example: ReduceLROnPlateau Scheduler

  • ReduceLROnPlateau Example:
scheduler = optim.lr_scheduler.ReduceLROnPlateau(optimizer, 'min')

for epoch in range(100):
    for batch_inputs, batch_targets in dataloader:
        optimizer.zero_grad()
        outputs = model(batch_inputs)
        loss = loss_fn(outputs, batch_targets)
        loss.backward()
        optimizer.step()

    # Compute the validation loss (val_inputs and val_targets come from a held-out set)
    with torch.no_grad():
        val_loss = loss_fn(model(val_inputs), val_targets)
    scheduler.step(val_loss)  # reduce the LR if the validation loss has plateaued

    if epoch % 10 == 0:
        print(f'Epoch [{epoch}/100], Loss: {loss.item():.4f}, Validation Loss: {val_loss.item():.4f}, LR: {optimizer.param_groups[0]["lr"]}')

These examples demonstrate how to combine optimizers and learning rate schedulers to train a neural network more effectively.
