PyTorch
PyTorch tensors are like NumPy arrays, but with two key extensions:
- GPU placement — tensors can be moved to a GPU for accelerated computation
- Autograd — tensors participate in a computational graph that enables backpropagation
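A minimal sketch (not part of the original notes) showing both extensions on a toy computation:

```python
import torch

# An ordinary tensor, much like a NumPy array
x = torch.tensor([1.0, 2.0, 3.0])

# GPU placement: move the tensor to a GPU if one is available
device = "cuda" if torch.cuda.is_available() else "cpu"
x = x.to(device)

# Autograd: opt in to gradient tracking with requires_grad=True
w = torch.tensor([4.0, 5.0, 6.0], device=device, requires_grad=True)
y = (w * x).sum()   # operation is recorded in the computational graph
y.backward()        # backpropagate through the graph
print(w.grad)       # d y / d w, which here equals x
```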
The Computational Graph
- PyTorch builds the graph dynamically as your code executes (define-by-run). Some frameworks, such as TensorFlow 1.x, instead declare the graph upfront (define-and-run)
- Each operation on tracked tensors adds a node to the graph, recording how outputs were derived from inputs
- This is what makes .backward() possible: PyTorch traverses the graph in reverse to compute gradients
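A small sketch (my own example, not from the notes) of define-by-run: because the graph is rebuilt on each call, ordinary Python control flow decides which operations get recorded, and .backward() walks whichever graph was traced:

```python
import torch

def forward(x):
    # Ordinary Python branching: the graph is rebuilt on every call,
    # so each run can trace a different sequence of operations
    if x.sum() > 0:
        return (x * 2).sum()
    return (x ** 2).sum()

x = torch.tensor([1.0, -3.0], requires_grad=True)
y = forward(x)   # x.sum() is -2, so the x ** 2 branch is traced
y.backward()     # traverse the recorded graph in reverse
print(x.grad)    # gradient of sum(x**2) is 2*x
```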
requires_grad
- Graph tracking is opt-in: a tensor only participates if requires_grad=True
- Learnable parameters in a model have this set automatically
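To illustrate the opt-in behavior (a quick sketch, not from the notes), compare a plain tensor, a tensor created with requires_grad=True, and the parameters of a built-in module:

```python
import torch
import torch.nn as nn

# Plain tensors do not participate in the graph...
a = torch.ones(3)
print(a.requires_grad)   # False

# ...unless you opt in at creation time
b = torch.ones(3, requires_grad=True)
print(b.requires_grad)   # True

# Learnable parameters of a module have it set automatically
layer = nn.Linear(3, 2)
print(all(p.requires_grad for p in layer.parameters()))  # True
```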
torch.no_grad()
- During inference you don't need gradients, so wrapping code in torch.no_grad() disables graph tracking
- This saves both memory and computation
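A brief sketch (my own example) contrasting the same computation inside and outside the torch.no_grad() context:

```python
import torch

w = torch.randn(3, requires_grad=True)
x = torch.ones(3)

# Inside no_grad, operations are not recorded in the graph
with torch.no_grad():
    y = (w * x).sum()

print(y.requires_grad)   # False: no graph was built for y
print(y.grad_fn)         # None

# Outside the context, the same computation is tracked
z = (w * x).sum()
print(z.requires_grad)   # True
```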