 Why does ChatGPT generate one word at a time? ChatGPT is based on GPT models, which are inherently language models. A language model has a mathematical understanding of language, specifically of the probability distribution over word sequences. Here, P is the probability of the i-th word of the response, given the user's prompt and every word the GPT model has generated before it. At each time step the model uses this distribution to produce the next word, and in ChatGPT this shows up as the response appearing one word at a time.
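
To make the word-by-word loop concrete, here is a minimal sketch in Python. It is not how a GPT model is actually implemented. The lookup table `TOY_MODEL` and the functions `next_word_distribution` and `generate` are hypothetical names made up for illustration, and the toy only looks at the previous word, whereas a real GPT model conditions on the whole prompt and every word generated so far.

```python
import random

# Hypothetical toy "language model": maps the previous word to a
# probability distribution over the next word. This is an assumption for
# illustration; a real GPT model computes this distribution with a neural
# network that sees the prompt and all previously generated words.
TOY_MODEL = {
    "<start>": {"the": 1.0},
    "the": {"cat": 0.6, "dog": 0.4},
    "cat": {"sat": 0.7, "ran": 0.3},
    "dog": {"sat": 0.2, "ran": 0.8},
    "sat": {"<end>": 1.0},
    "ran": {"<end>": 1.0},
}


def next_word_distribution(prompt, generated):
    """Stand-in for P(i-th word | prompt, words generated so far).

    The toy ignores the prompt and uses only the last generated word.
    """
    last = generated[-1] if generated else "<start>"
    return TOY_MODEL[last]


def generate(prompt, max_words=10, seed=0):
    """Generate a response one word per time step."""
    rng = random.Random(seed)
    generated = []
    for _ in range(max_words):
        dist = next_word_distribution(prompt, generated)
        words = list(dist.keys())
        weights = list(dist.values())
        # Sample the next word from the distribution.
        word = rng.choices(words, weights=weights, k=1)[0]
        if word == "<end>":
            break
        # The new word becomes part of the context for the next step.
        generated.append(word)
    return generated


if __name__ == "__main__":
    print(" ".join(generate("Tell me a story")))
```

Each pass through the loop is one time step: the model is asked for a distribution over the next word, one word is sampled, and that word is appended to the context before the next step. That feedback loop is why the output can only appear one word at a time.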