PyTorch

PyTorch tensors are like NumPy arrays, but with two key extensions:

  1. GPU placement — tensors can be moved to a GPU for accelerated computation
  2. Autograd — tensors participate in a computational graph that enables backpropagation
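Both extensions can be seen in a short sketch (the device check falls back to CPU when no GPU is present):

```python
import torch

# A tensor behaves much like a NumPy array
x = torch.tensor([[1.0, 2.0], [3.0, 4.0]])
doubled = x * 2  # elementwise ops, broadcasting, slicing all work

# Extension 1: GPU placement — move the tensor to an accelerator if available
device = "cuda" if torch.cuda.is_available() else "cpu"
x_dev = x.to(device)

# Extension 2: autograd — opt this tensor into the computational graph
w = torch.ones(2, 2, requires_grad=True)
```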

The Computational Graph

  • PyTorch builds the graph dynamically as your code executes (define-by-run). Other frameworks, such as TensorFlow 1.x with its static graphs, declared the graph upfront (define-and-run).
  • Each operation on tracked tensors adds a node to the graph, recording how outputs were derived from inputs
  • This is what makes .backward() possible: PyTorch traverses the graph in reverse to compute gradients
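A small worked example of the graph in action — here z = 2x² + 1, so the gradient dz/dx = 4x:

```python
import torch

# Each operation on a tracked tensor adds a node to the graph
x = torch.tensor(3.0, requires_grad=True)
y = x ** 2        # records a pow node
z = 2 * y + 1     # records mul and add nodes; z = 2x^2 + 1

# .backward() traverses the graph in reverse, accumulating gradients
z.backward()
print(x.grad)     # dz/dx = 4x = 12 at x = 3
```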

requires_grad

  • Graph tracking is opt-in: a tensor only participates if requires_grad=True
  • Learnable parameters in a model have this set automatically
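Both points can be checked directly (the `nn.Linear` layer stands in for any model with learnable parameters):

```python
import torch
import torch.nn as nn

# Plain tensors are not tracked by default
a = torch.randn(3)
print(a.requires_grad)           # False

# Opt in explicitly; results derived from tracked tensors are tracked too
b = torch.randn(3, requires_grad=True)
print((b * 2).requires_grad)     # True

# Learnable parameters of an nn.Module are tracked automatically
layer = nn.Linear(4, 2)
print(all(p.requires_grad for p in layer.parameters()))  # True
```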

torch.no_grad()

  • During inference you don’t need gradients, so wrapping code in torch.no_grad() disables graph tracking
  • This saves both memory and computation
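The effect is visible on the outputs: inside the context, results carry no graph history:

```python
import torch

w = torch.randn(3, requires_grad=True)

# Normal forward pass: the output is tracked and carries a grad_fn
y = w * 2
print(y.requires_grad)   # True

# Inference: no graph is built inside the context
with torch.no_grad():
    y_inf = w * 2
print(y_inf.requires_grad)  # False
print(y_inf.grad_fn)        # None — no history to backpropagate through
```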