PyTorch

PyTorch tensors are like NumPy arrays, but with two key extensions:

  1. GPU placement — tensors can be moved to a GPU for accelerated computation
  2. Autograd — tensors participate in a computational graph that enables backpropagation
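Both extensions can be seen in a short sketch (the device check falls back to CPU when no GPU is present):

```python
import torch

# A tensor behaves much like a NumPy array
x = torch.tensor([[1.0, 2.0], [3.0, 4.0]])
doubled = x * 2  # elementwise ops, broadcasting, slicing all work

# Extension 1: GPU placement — move the tensor to an accelerator if available
device = "cuda" if torch.cuda.is_available() else "cpu"
x_dev = x.to(device)

# Extension 2: autograd — opt this tensor into the computational graph
w = torch.ones(2, 2, requires_grad=True)
```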

The Computational Graph

  • PyTorch builds the graph dynamically as your code executes (define-by-run). Other frameworks, such as TensorFlow 1.x with its static graphs, declared the graph upfront (define-and-run).
  • Each operation on tracked tensors adds a node to the graph, recording how outputs were derived from inputs
  • This is what makes .backward() possible: PyTorch traverses the graph in reverse to compute gradients
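A small worked example of the graph in action — here z = 2x² + 1, so the gradient dz/dx = 4x:

```python
import torch

# Each operation on a tracked tensor adds a node to the graph
x = torch.tensor(3.0, requires_grad=True)
y = x ** 2        # records a pow node
z = 2 * y + 1     # records mul and add nodes; z = 2x^2 + 1

# .backward() traverses the graph in reverse, accumulating gradients
z.backward()
print(x.grad)     # dz/dx = 4x = 12 at x = 3
```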

requires_grad

  • Graph tracking is opt-in: a tensor only participates if requires_grad=True
  • Learnable parameters in a model have this set automatically
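Both points can be checked directly (the `nn.Linear` layer stands in for any model with learnable parameters):

```python
import torch
import torch.nn as nn

# Plain tensors are not tracked by default
a = torch.randn(3)
print(a.requires_grad)           # False

# Opt in explicitly; results derived from tracked tensors are tracked too
b = torch.randn(3, requires_grad=True)
print((b * 2).requires_grad)     # True

# Learnable parameters of an nn.Module are tracked automatically
layer = nn.Linear(4, 2)
print(all(p.requires_grad for p in layer.parameters()))  # True
```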

torch.no_grad()

  • During inference you don’t need gradients, so wrapping code in torch.no_grad() disables graph tracking
  • This saves both memory and computation
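The effect is visible on the outputs: inside the context, results carry no graph history:

```python
import torch

w = torch.randn(3, requires_grad=True)

# Normal forward pass: the output is tracked and carries a grad_fn
y = w * 2
print(y.requires_grad)   # True

# Inference: no graph is built inside the context
with torch.no_grad():
    y_inf = w * 2
print(y_inf.requires_grad)  # False
print(y_inf.grad_fn)        # None — no history to backpropagate through
```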