The third architecture we looked at is the sequence-to-sequence model, where both the input and the output are sequences. The input was the sequence of words in a sentence, and the output at each step was a prediction of the next word in that sequence. With sufficient training, this word-level language model can generate sentences of its own.
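To make the idea concrete, here is a minimal sketch of such a word-level language model. It assumes PyTorch; the tiny corpus, the LSTM architecture, and all hyperparameters are illustrative placeholders rather than the exact model discussed above. The training loop shifts the sequence by one position so that the target at every step is the next word, and generation feeds each sampled word back in as the next input.

```python
# A minimal word-level language model sketch (assumes PyTorch).
# Corpus, model size, and training settings are illustrative only.
import torch
import torch.nn as nn

corpus = "the cat sat on the mat the dog sat on the rug".split()
vocab = sorted(set(corpus))
word_to_idx = {w: i for i, w in enumerate(vocab)}
ids = torch.tensor([word_to_idx[w] for w in corpus])

class WordLM(nn.Module):
    def __init__(self, vocab_size, embed_dim=32, hidden_dim=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.rnn = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, x, state=None):
        h, state = self.rnn(self.embed(x), state)
        return self.out(h), state  # logits over the vocabulary at each step

model = WordLM(len(vocab))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)
loss_fn = nn.CrossEntropyLoss()

# Train: at every position, the target is the next word in the sequence.
inputs, targets = ids[:-1].unsqueeze(0), ids[1:].unsqueeze(0)
for _ in range(200):
    logits, _ = model(inputs)
    loss = loss_fn(logits.reshape(-1, len(vocab)), targets.reshape(-1))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# Generate: sample a word, then feed it back in as the next input.
word = torch.tensor([[word_to_idx["the"]]])
state, generated = None, ["the"]
with torch.no_grad():
    for _ in range(8):
        logits, state = model(word, state)
        probs = torch.softmax(logits[0, -1], dim=-1)
        word = torch.multinomial(probs, 1).view(1, 1)
        generated.append(vocab[word.item()])
print(" ".join(generated))
```

Note that generation carries the recurrent `state` forward between steps, so each sampled word is conditioned on everything generated so far; this is what lets the trained model produce sentences rather than isolated word predictions.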