This sequence-to-sequence model had equal-size inputs and outputs. Most applications, though, don't. In language translation, for example, a 10-word sentence in English may not have a 10-word German translation. Even in text summarization, the input is a set of sentences, but the output, by definition, is a smaller set of sentences. Clearly, to deal with this new set of problems, we need another type of architecture: one that takes in an input sequence but outputs a sequence whose length can differ from the input's. It exists, and it's called the encoder-decoder architecture. I'm pretty sure you can predict its two parts. The first is the encoder, which converts the input sequence into a vector. The second is the decoder, which converts that vector into the output sequence.
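To make the shape of the idea concrete, here's a minimal toy sketch (hypothetical code, not a trained model): the encoder folds a variable-length input into one fixed-size vector, and the decoder unrolls that vector into an output whose length need not match the input's.

```python
def encode(sequence, size=4):
    # Toy "encoder": fold the whole input sequence into a fixed-size
    # vector by summing each element into a slot. (A real encoder would
    # use a recurrent or attention-based network.)
    vector = [0.0] * size
    for i, x in enumerate(sequence):
        vector[i % size] += x
    return vector

def decode(vector, out_len):
    # Toy "decoder": expand the fixed-size vector into out_len outputs.
    # (A real decoder generates one token at a time until an end token.)
    return [vector[t % len(vector)] for t in range(out_len)]

source = [1.0, 2.0, 3.0, 4.0, 5.0]   # 5-element input sequence
context = encode(source)             # fixed-size vector, whatever the input length
target = decode(context, out_len=3)  # 3-element output: lengths can differ

print(len(source), len(context), len(target))  # -> 5 4 3
```

The point of the sketch is only the interface: whatever happens inside, the encoder's output is a fixed-size vector, which is exactly what lets the decoder produce a sequence of any length.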