Part 10: Tensor Broadcasting and Advanced Operations

Tensor broadcasting is a powerful feature in PyTorch that allows you to perform operations on tensors of different shapes without explicitly reshaping them. Broadcasting automatically expands the smaller tensor to match the shape of the larger tensor, enabling element-wise operations across mismatched shapes.

How Broadcasting Works

Broadcasting follows specific rules to match the shapes of tensors. The rules are:

If the tensors have a different number of dimensions, prepend dimensions of size 1 to the smaller tensor until both tensors have the same number of dimensions.
The size of each dimension must either match or be 1. If a dimension size is 1 in one tensor, it is expanded to match the size of the corresponding dimension in the other tensor.

Example of Broadcasting

Consider two tensors of different shapes. We can perform element-wise addition using broadcasting.

Broadcasting Example:

import torch

tensor_a = torch.tensor([[1, 2, 3], [4, 5, 6]])
tensor_b = torch.tensor([1, 2, 3])

result = tensor_a + tensor_b
print(result)

In this example, tensor_b is broadcast to match the shape of tensor_a, resulting in:

tensor([[2, 4, 6],
        [5, 7, 9]])

Advanced Broadcasting

Broadcasting can be applied to more complex cases involving higher-dimensional tensors.

Broadcasting with Higher Dimensions:

tensor_c = torch.tensor([[1], [2], [3]])
tensor_d = torch.tensor([4, 5, 6])

result = tensor_c * tensor_d
print(result)

In this case, tensor_c is broadcast to a shape of (3, 3) and tensor_d to a shape of (3,), resulting in:

tensor([[ 4,  5,  6],
        [ 8, 10, 12],
        [12, 15, 18]])

Practical Use Case: Normalizing a Batch of Data

Broadcasting is often used in machine learning to normalize batches of data.

Normalizing Data:

batch = torch.randn(10, 3)  # 10 samples, each with 3 features
mean = batch.mean(dim=0, keepdim=True)
std = batch.std(dim=0, keepdim=True)

normalized_batch = (batch - mean) / std
print(normalized_batch)

In this example, mean and std are computed for each feature across the batch. Broadcasting allows us to subtract the mean and divide by the standard deviation for each sample in the batch.

Advanced Tensor Operations

Advanced tensor operations are essential for performing complex manipulations and computations in deep learning models. These operations include linear algebra functions, element-wise operations, reductions, and more.

Linear Algebra Operations

PyTorch provides a range of linear algebra operations that are crucial for building neural networks and performing mathematical computations.

Matrix Multiplication:

import torch

matrix_a = torch.tensor([[1, 2], [3, 4]])
matrix_b = torch.tensor([[5, 6], [7, 8]])

matrix_product = torch.matmul(matrix_a, matrix_b)
print(matrix_product)

Matrix Inversion:

matrix = torch.tensor([[1.0, 2.0], [3.0, 4.0]])
matrix_inverse = torch.inverse(matrix)
print(matrix_inverse)

Eigenvalues and Eigenvectors:

eigvals, eigvecs = torch.eig(matrix, eigenvectors=True)
print("Eigenvalues:", eigvals)
print("Eigenvectors:\n", eigvecs)

Element-wise Operations

Element-wise operations are performed on individual elements of tensors.

Exponential and Logarithm:

tensor = torch.tensor([1.0, 2.0, 3.0])

exp_tensor = torch.exp(tensor)
log_tensor = torch.log(tensor)

print("Exponential:", exp_tensor)
print("Logarithm:", log_tensor)

Trigonometric Functions:

angles = torch.tensor([0.0, 3.1416 / 2, 3.1416])

sin_tensor = torch.sin(angles)
cos_tensor = torch.cos(angles)

print("Sine:", sin_tensor)
print("Cosine:", cos_tensor)

Reduction Operations

Reduction operations reduce the dimensions of tensors by performing operations such as summation or averaging.

Summation

tensor = torch.tensor([[1, 2, 3], [4, 5, 6]])

sum_all = torch.sum(tensor)
sum_dim0 = torch.sum(tensor, dim=0)
sum_dim1 = torch.sum(tensor, dim=1)

print("Sum of all elements:", sum_all)
print("Sum along dim 0:", sum_dim0)
print("Sum along dim 1:", sum_dim1)

Mean:

mean_all = torch.mean(tensor.float())
mean_dim0 = torch.mean(tensor.float(), dim=0)
mean_dim1 = torch.mean(tensor.float(), dim=1)

print("Mean of all elements:", mean_all)
print("Mean along dim 0:", mean_dim0)
print("Mean along dim 1:", mean_dim1)

Example: Implementing a Custom Operation

You can implement custom operations using basic tensor operations.

Example: Element-wise Square Root:

tensor = torch.tensor([1.0, 4.0, 9.0, 16.0])
sqrt_tensor = torch.sqrt(tensor)
print("Square root:", sqrt_tensor)