So, how do these generative AI systems actually work? Generative AI systems are based on massive amounts of data and on a set of techniques known as machine learning. The data are used as training material: they are fed to a supercomputer, which applies sophisticated machine learning techniques and, after an enormous number of compute cycles, produces a trained model. Once the model is fully trained, the data are no longer needed. The trained model consists of billions of numbers that have all been set to the right values during training. Together, these numbers, called parameters, determine the knowledge that the model has learned about the domain.

For a trained large language model to answer a question like "What is the largest city in France?", it needs to know a great deal about the grammar of English. It needs to know, to some extent, the meaning of words, for instance, that big is the opposite of small. It needs to know, implicitly, facts about the world, including that Paris is larger than Lyon. All of that knowledge, at different levels of abstraction, is somehow encoded in the billions of parameters.

Similarly, an image generator needs to know all kinds of things about images, and about the relation between images and their descriptions in English. It needs to know what color red is. It needs to know that in portraits, the eyes of the main subject are typically a little above the middle of the image, and that the background is often a bit blurry. In short, it needs to know what typical photographs and drawings look like, what the world looks like, and how humans describe these images.

One of the big discoveries of the last few years is that machine learning, plus large amounts of data, plus large amounts of computing time on supercomputers, can indeed deliver generative AI systems that have acquired knowledge of this type. One final important note is that the models have not simply memorized the entire dataset.
Rather, they have learned patterns that many sentences or images have in common, although occasionally they generate sentences or images that are uncomfortably similar to what they have seen in the training data.
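The core training idea, adjusting numeric parameters until they capture a pattern in the data, after which only the parameters are kept, can be sketched in miniature. The following toy example is purely illustrative and vastly simpler than training a real generative model; the function name, learning rate, and the pattern y = 3x + 1 are all assumptions chosen for the sketch. It fits just two parameters by gradient descent:

```python
# Toy sketch of training (illustrative only, far removed from a real LLM):
# gradient descent nudges numeric parameters until they capture the pattern
# hidden in the training data. Afterward, the data can be discarded; the
# trained "model" is just the numbers.

def train(data, lr=0.01, steps=2000):
    w, b = 0.0, 0.0  # the model's two "parameters", initially arbitrary
    for _ in range(steps):
        # Gradients of the mean squared error over the training data.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / len(data)
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / len(data)
        w -= lr * grad_w  # move each parameter toward a better value
        b -= lr * grad_b
    return w, b  # the trained model: just numbers, no data

# Training data sampled from the hidden pattern y = 3x + 1.
data = [(x, 3 * x + 1) for x in range(10)]
w, b = train(data)
print(round(w, 2), round(b, 2))  # the parameters end up near 3 and 1
```

A real large language model works the same way in spirit, except that it has billions of parameters rather than two, and the "pattern" it fits spans grammar, word meanings, and facts about the world.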
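This distinction, learning patterns that many sentences share rather than storing the sentences themselves, can also be made concrete with a deliberately tiny sketch. The bigram model below is an illustrative stand-in, orders of magnitude simpler than a real language model, and the training sentences are invented for the example:

```python
import random

# Tiny illustrative sketch: a bigram "model" learns which word follows
# which in a handful of sentences, then generates text by chaining those
# learned transitions. The sentences here are invented for the example.
training = [
    "the cat sat on the mat",
    "the dog sat on the rug",
    "the cat chased the dog",
]

# "Training": record, for each word, the words observed to follow it.
follows = {}
for sentence in training:
    words = sentence.split()
    for a, b in zip(words, words[1:]):
        follows.setdefault(a, []).append(b)

# "Generation": walk the learned transitions from a starting word.
random.seed(0)
word, out = "the", ["the"]
while word in follows and len(out) < 8:
    word = random.choice(follows[word])
    out.append(word)
print(" ".join(out))
```

What the model stores is not the sentences but the transitions they share, so it can produce novel combinations such as "the dog sat on the mat" that never appeared in the training data. Occasionally, though, a random walk reproduces a training sentence verbatim, which is the memorization caveat above in miniature.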