In come LSTM networks, introduced in 1997, which replace the plain recurrent neuron with a long short-term memory cell. This cell has a branch, the cell state, that lets past information skip much of the processing in the current step and flow on to the next. This allows memory to be retained over longer sequences.

Now to that second point: we seem to be able to deal with longer sequences well. Or are we? Well, kind of. Probably on the order of hundreds of words rather than thousands. However, to the first point, normal RNNs are slow, and LSTMs are even slower because they're more complex. For these RNN and LSTM networks, input data needs to be passed sequentially, one element after the other: we need the previous state as an input before we can do any computation on the current state. Such sequential flow does not make good use of today's GPUs, which are designed for parallel computation.
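To make that sequential dependence concrete, here is a minimal sketch using PyTorch's nn.LSTMCell (the tensor sizes are made-up example values, not anything from the original discussion). The point is that each step of the loop needs the hidden state and cell state produced by the previous step, so the steps cannot be run in parallel the way a GPU would like.

```python
import torch
import torch.nn as nn

# Example sizes (assumptions for illustration only).
input_size, hidden_size, seq_len = 32, 64, 100

cell = nn.LSTMCell(input_size, hidden_size)

x = torch.randn(seq_len, 1, input_size)  # one sequence of seq_len tokens
h = torch.zeros(1, hidden_size)          # hidden state (short-term)
c = torch.zeros(1, hidden_size)          # cell state (the "memory" branch)

for t in range(seq_len):
    # The cell state c is the branch that carries past information forward,
    # skipping most of the current step's processing.
    # Step t cannot start until step t-1 has produced h and c,
    # which is why this loop is inherently serial.
    h, c = cell(x[t], (h, c))
```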