5 Concepts in Transformers, Part 2.

1. Layer Normalization. A method that stabilizes the values output by a layer of a neural network, which makes training faster. There's a small sketch of it below, paired with concept 2.

2. Residual Connections. Connections that skip over one or more layers, adding a layer's input directly back to its output. They prevent vanishing gradients, and that's what lets a deep transformer learn at all. See the sketch below.

3. Embeddings. Vector representations of words. Words that are closer in meaning have vectors that are closer to each other too. A tiny similarity example follows below.

4 and 5. This one is a twofer: the encoder-decoder architecture. It's a model that has an encoder and a decoder working together to accomplish a task. The encoder encodes the input into a representation, some vector; the decoder decodes that representation to generate the output. The transformer neural network is an example use case. A skeleton of the flow is sketched below.

To build a transformer from scratch, completely in code, do check out this playlist on the channel.
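To make concepts 1 and 2 concrete, here is a minimal sketch in PyTorch of the "add and norm" step a transformer block performs. The class name AddAndNorm, the toy sublayer, and all sizes are illustrative assumptions, not taken from any particular library.

```python
import torch
import torch.nn as nn

# LayerNorm rescales each token vector to zero mean and unit variance
# across its features (then applies a learned scale and shift), and the
# residual connection adds the block's input back onto its output so
# gradients can flow through the identity path.
class AddAndNorm(nn.Module):
    def __init__(self, d_model, sublayer):
        super().__init__()
        self.sublayer = sublayer           # e.g. attention or a feed-forward net
        self.norm = nn.LayerNorm(d_model)  # stabilizes the layer's outputs

    def forward(self, x):
        # Residual: x skips around the sublayer and is added back in.
        return self.norm(x + self.sublayer(x))

# Illustrative usage with a toy feed-forward sublayer.
block = AddAndNorm(d_model=16, sublayer=nn.Linear(16, 16))
x = torch.randn(2, 5, 16)   # (batch, tokens, features)
print(block(x).shape)       # torch.Size([2, 5, 16])
```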
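For concept 3, a tiny sketch of word embeddings, again assuming PyTorch. The vocabulary, indices, and dimensions here are made up, and since these embeddings are untrained the printed similarities are random; after training, related words like "king" and "queen" would score higher than unrelated pairs.

```python
import torch
import torch.nn.functional as F

# Hypothetical three-word vocabulary mapping words to row indices.
vocab = {"king": 0, "queen": 1, "banana": 2}
emb = torch.nn.Embedding(num_embeddings=3, embedding_dim=8)

# Look up the vector for each word.
king = emb(torch.tensor(vocab["king"]))
queen = emb(torch.tensor(vocab["queen"]))
banana = emb(torch.tensor(vocab["banana"]))

# Cosine similarity measures how close two vectors point; trained
# embeddings place similar-meaning words closer together.
print(F.cosine_similarity(king, queen, dim=0))
print(F.cosine_similarity(king, banana, dim=0))
```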
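And for concepts 4 and 5, a skeleton of the encoder-decoder flow, assuming PyTorch's built-in nn.Transformer. All sizes and the random input tensors are illustrative; in practice the inputs would be embedded tokens.

```python
import torch
import torch.nn as nn

# The encoder maps the source sequence to a representation ("memory");
# the decoder consumes that memory plus the target-so-far to produce output.
d_model = 32
model = nn.Transformer(d_model=d_model, nhead=4,
                       num_encoder_layers=2, num_decoder_layers=2,
                       batch_first=True)

src = torch.randn(1, 10, d_model)  # encoder input: 10 source tokens
tgt = torch.randn(1, 7, d_model)   # decoder input: 7 target tokens

memory = model.encoder(src)        # encode: input -> representation
out = model.decoder(tgt, memory)   # decode: representation -> output
print(out.shape)                   # torch.Size([1, 7, 32])
```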