PyTorch

Language: Python

ML/AI

PyTorch was developed by Facebook's AI Research lab (FAIR) and released in 2016. It was designed as a more intuitive and flexible alternative to the static-graph frameworks of the time, most notably TensorFlow: its dynamic (define-by-run) computation graphs make debugging and experimentation easier. PyTorch quickly became popular in both research and production environments.

PyTorch is an open-source deep learning framework for Python that provides dynamic computation graphs, GPU acceleration, and a flexible platform for building neural networks and machine learning models.

Installation

pip: pip install torch torchvision torchaudio
conda (CPU-only build): conda install pytorch torchvision torchaudio cpuonly -c pytorch
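
To verify the installation, a quick sanity check (CPU-only builds will report False for CUDA availability):

import torch

print(torch.__version__)          # installed PyTorch version
print(torch.cuda.is_available())  # True if a CUDA-capable GPU and driver are detected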

Usage

PyTorch provides tensors, autograd for automatic differentiation, neural network modules (torch.nn), optimizers (torch.optim), and utilities for loading and preprocessing data (torch.utils.data). The same tensor API runs on both CPU and GPU with minimal code changes.

Creating tensors

import torch
x = torch.tensor([[1, 2], [3, 4]])  # 2x2 integer tensor from nested lists
y = torch.rand(2, 2)                # 2x2 tensor of uniform random values in [0, 1)
print(x + y)                        # element-wise addition (x is promoted to float)

Creates a fixed tensor `x` and a random tensor `y` and performs element-wise addition.
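
Tensors can also be created with explicit shapes and dtypes; a short sketch of other common constructors:

import torch

a = torch.zeros(2, 3)                      # 2x3 tensor of zeros (float32 by default)
b = torch.ones(2, 3, dtype=torch.float64)  # explicit dtype
c = torch.arange(6).reshape(2, 3)          # values 0..5 reshaped into 2x3
print(a.shape, b.dtype, c)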

Matrix multiplication

import torch
x = torch.rand(2, 3)
y = torch.rand(3, 2)
print(torch.mm(x, y))  # (2x3) @ (3x2) -> 2x2 result

Performs matrix multiplication between two tensors using `torch.mm`.
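
Note that `torch.mm` only handles 2-D tensors. For batched or broadcasting matrix products, `torch.matmul` (or the `@` operator) is the more general choice; a brief sketch:

import torch

x = torch.rand(2, 3)
y = torch.rand(3, 2)
print(x @ y)                         # equivalent to torch.mm(x, y) for 2-D tensors

batch = torch.rand(4, 2, 3)          # a batch of four 2x3 matrices
print(torch.matmul(batch, y).shape)  # torch.Size([4, 2, 2]) via broadcasting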

Defining a simple neural network

import torch
import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(10, 5)  # 10 input features -> 5 hidden units
        self.fc2 = nn.Linear(5, 1)   # 5 hidden units -> 1 output

    def forward(self, x):
        x = F.relu(self.fc1(x))         # hidden layer with ReLU activation
        x = torch.sigmoid(self.fc2(x))  # squash the output into (0, 1)
        return x

model = Net()

Defines a feedforward neural network with one hidden layer using PyTorch's nn.Module.
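
A forward pass is just a call on the model. As a quick sketch, feeding a batch of 4 samples with 10 features each (matching `fc1`'s input size):

import torch

batch = torch.rand(4, 10)  # 4 samples, 10 features each
out = model(batch)         # invokes Net.forward under the hood
print(out.shape)           # torch.Size([4, 1])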

Training loop example

import torch.optim as optim

x = torch.rand(100, 10)  # 100 samples, 10 features
y = torch.rand(100, 1)   # 100 regression targets
criterion = nn.MSELoss()
optimizer = optim.SGD(model.parameters(), lr=0.01)

for epoch in range(10):
    optimizer.zero_grad()         # clear gradients from the previous step
    outputs = model(x)            # forward pass
    loss = criterion(outputs, y)  # mean squared error
    loss.backward()               # backpropagate
    optimizer.step()              # update parameters
    print(f'Epoch {epoch+1}, Loss: {loss.item()}')

Demonstrates a basic training loop for a regression model using MSE loss and SGD optimizer.
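
After training, inference is normally run with gradient tracking disabled; a minimal sketch using the `model` defined above:

model.eval()                      # switch layers like dropout/batchnorm to eval mode
with torch.no_grad():             # disable gradient tracking for inference
    preds = model(torch.rand(5, 10))
print(preds)

Call `model.train()` to switch back before resuming training.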

Using GPU (CUDA)

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model.to(device)                   # moves the model's parameters in place
x, y = x.to(device), y.to(device)  # .to() returns new tensors on the device
outputs = model(x)

Moves the model and tensors to GPU for accelerated computation if available.
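
Results usually need to come back to the CPU (and be detached from the autograd graph) before conversion to NumPy; a short sketch continuing from the snippet above:

print(next(model.parameters()).device)   # confirm where the model's weights live
result = outputs.detach().cpu().numpy() # detach from autograd, move to CPU, convert
print(result.shape)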

Automatic differentiation

x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
y = x.pow(2).sum()  # y = x1^2 + x2^2 + x3^2
y.backward()        # compute dy/dx
print(x.grad)       # tensor([2., 4., 6.]), i.e. 2*x

Uses autograd to automatically compute gradients of `y` with respect to `x`.
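
Gradients accumulate across `backward()` calls, which is why training loops call `optimizer.zero_grad()` each step; a small illustration:

x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
for _ in range(2):
    y = x.pow(2).sum()
    y.backward()
print(x.grad)   # tensor([4., 8., 12.]) -- gradients from both passes accumulated
x.grad.zero_()  # reset in place before the next backward pass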

Error Handling

RuntimeError: CUDA out of memory: Reduce batch size or move computations to CPU if GPU memory is insufficient.
RuntimeError: shape mismatch: Ensure the input and target tensors have compatible shapes for the model and loss function.
ModuleNotFoundError: No module named 'torch': Install PyTorch using pip or conda in the current Python environment.
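
One defensive pattern for the out-of-memory case is a CPU fallback; this is only a hedged sketch, as the right recovery strategy depends on the workload:

try:
    outputs = model(x.to(device))
except RuntimeError as e:
    if 'out of memory' in str(e):
        torch.cuda.empty_cache()  # release cached blocks held by the allocator
        model.cpu()               # move the model back to system memory
        outputs = model(x.cpu())  # retry the forward pass on the CPU
    else:
        raise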

Best Practices

Use `torch.nn.Module` to structure neural networks cleanly.

Leverage `torch.utils.data.DataLoader` for batching and shuffling datasets (see the sketch after this list).

Always zero gradients with `optimizer.zero_grad()` before backpropagation.

Move tensors to GPU using `.to(device)` for faster training.

Use PyTorch Lightning or similar frameworks for cleaner training loops in production.
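
As referenced in the list above, a minimal `DataLoader` sketch reusing the synthetic `x`, `y`, `model`, `criterion`, and `optimizer` from the training example:

from torch.utils.data import TensorDataset, DataLoader

dataset = TensorDataset(x, y)                              # pair inputs with targets
loader = DataLoader(dataset, batch_size=16, shuffle=True)  # shuffled mini-batches

for inputs, targets in loader:
    optimizer.zero_grad()
    loss = criterion(model(inputs), targets)
    loss.backward()
    optimizer.step()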