 What is attention in neural networks? Attention is a mechanism a neural network uses to determine which parts of the input to focus on when generating a specific output. It can be applied to many types of inputs. The most common application is in transformer networks, where the model attends to certain parts of the input text when encoding that text into vectors. Attention can also be used on images, for example in image captioning, where the model generates one word at a time, focusing on a different part of the image at each time step.
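
To make the idea concrete, here is a minimal sketch of scaled dot-product attention, the form used in transformers. The query, key, and value matrices and their shapes are toy values chosen for illustration; in a real model they are learned projections of the input.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the max for numerical stability before exponentiating.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    # Scores say how strongly each query position should focus on
    # each key position; scaling by sqrt(d_k) keeps them well-behaved.
    scores = Q @ K.T / np.sqrt(K.shape[-1])
    weights = softmax(scores, axis=-1)  # each row sums to 1
    # The output is a weighted average of the value vectors.
    return weights @ V, weights

# Toy example: 3 input tokens, embedding dimension 4.
rng = np.random.default_rng(0)
Q = rng.normal(size=(3, 4))
K = rng.normal(size=(3, 4))
V = rng.normal(size=(3, 4))

out, w = attention(Q, K, V)
print(out.shape)       # one output vector per query token
print(w.sum(axis=-1))  # attention weights normalize to 1 per row
```

Each row of `w` is a probability distribution over the input positions, which is exactly the "where to focus" signal described above.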