This is a summary of Sparks of AGI: Early Experiments with GPT-4, a talk given by Sébastien Bubeck at MIT on March 22, 2023. The original video is 48 minutes long. This is a quick AI summary. The link to the full video is in the description below. The video discusses early experiments with GPT-4 and its potential to be a form of artificial general intelligence. The experiments cover GPT-4's abilities in problem solving, drawing, coding, and math, and its limitations are also outlined. The speaker suggests that GPT-4 offers an opportunity to rethink what intelligence means and to explore other examples of intelligence beyond the natural intelligence of the world, but also warns about the impact and potential uses of this technology, which range from data analysis and privacy detection to medical and legal knowledge, as well as playing games and managing files.

Zero hours, zero minutes, and zero seconds. In this section of the video, Sébastien Bubeck discusses his experiments with an early version of GPT-4, which he had access to through Microsoft. He mentions that although he was initially skeptical, he now believes that GPT-4 has the potential to be a form of artificial general intelligence. Bubeck acknowledges that the model he studied took text input and produced text output only, and that it was an early version that has since been modified for safety. He also credits OpenAI for creating this powerful tool and clarifies that he had nothing to do with its creation.

Zero hours, five minutes, and zero seconds. In this section, the speaker clarifies that the results of the experiments may not be reproducible, but that the focus is on demonstrating a qualitative jump in intelligence rather than on quantitative benchmarks. The speaker cautions against dismissing the intelligence of AI systems on the grounds that they lack internal representations or rely on statistics, arguing that these systems learn operators and algorithms rather than performing simple pattern matching.
The speaker presents an example of a puzzle given to GPT-4, which initially struggled but eventually found a creative solution that involved stacking the items in a unique way. The speaker argues that examples like this demonstrate the emergence of common sense and an understanding of the real world in AI systems.

Zero hours, 10 minutes, and zero seconds. In this section of the video, the speaker discusses the development of common sense and theory of mind in artificial intelligence models such as GPT-4. While acknowledging that GPT-4 grasps some common-sense knowledge, such as the fragility of eggs, the speaker points out that theory of mind, including human emotions and motives, is beyond its capabilities. The speaker then presents a paper, to be published soon, that suggests GPT-4 has a theory of mind, and explains how this can affect the subfield of machine learning interpretability. Despite this, the speaker contends that intelligence is a complex mental capability that entails more than just common sense and theory of mind. Therefore, a clear definition of intelligence is needed before discussing the intelligence of AI.

Zero hours, 15 minutes, and zero seconds. In this section, the speaker discusses an assessment of GPT-4's abilities along six dimensions: planning, abstract thinking, comprehension of complex ideas, fast learning, learning from experience, and problem-solving. The speaker states that GPT-4 can solve problems and think abstractly, but that it cannot plan and has no real-time learning or memory. The assessment is based not on benchmarks but on creative tasks that fall outside what GPT-4 has seen before, across a broad range of domains such as vision, coding, mathematics, and privacy and harmfulness detection.
The speaker provides an example of asking GPT-4 to write a proof of the infinitude of primes in which every line rhymes, and notes that GPT-4 completes the task correctly, rhymes included, demonstrating problem-solving and comprehension of complex ideas.

Zero hours, 20 minutes, and zero seconds. In this section, the speaker discusses using GPT-4 to draw an illustration of a proof in SVG format and a unicorn in TikZ. Despite the speaker's belief that nobody would waste their time drawing a unicorn in TikZ, GPT-4 was able to produce a very abstract but recognizable unicorn. The speaker also emphasizes how much better GPT-4's results are than those of previous versions of GPT. Additionally, GPT-4 was able to use diffusion models to improve its unicorn drawing. The speaker mentions that the human-readable code produced by GPT-4 includes helpful comments that guide the user through its thinking.

Zero hours, 25 minutes, and zero seconds. In this section, the speaker demonstrates the drawing capability of GPT-4 using a unicorn as an example. The AI model was given code to draw a unicorn and was able to recognize the head and the mane. The speaker notes that the unicorn drawing has been used as a benchmark of intelligence and says that understanding is the same as intelligence. This capability is useful because GPT-4 can follow instructions and complete tasks accurately. The speaker then shows an example of GPT-4 accurately following instructions to draw a screenshot of a 3D building game, which opens up a lot of possibilities for the future. The speaker also mentions the potential usefulness of GPT-4 for coding, with the example of writing 3D games in HTML.

Zero hours, 30 minutes, and zero seconds. In this section, the speaker discusses experiments with GPT-4 and demonstrates how it can generate code for a simple game. The generated code is much more complex than what GPT-3 and other AI models could produce.
The speaker also shares a demo of GPT-4 passing mock coding interviews with flying colors and in record time, showcasing superhuman coding ability. However, the speaker notes that GPT-4 is far from perfect and has its own set of weaknesses, but that it is intelligent enough to use available tools such as search engines, calculators, and APIs to complete tasks.

Zero hours, 35 minutes, and zero seconds. In this section, the speaker demonstrates some early experiments with GPT-4, an AI language model. The model is shown to be able to automate tasks such as scheduling a dinner, reasoning over inputs, and even answering math problems at a middle school level. However, when asked to solve a more abstract math problem involving polynomial compositions, the model becomes lost in computation and has trouble understanding the question. The speaker notes that while the model's abilities are impressive, there are still limitations and areas for improvement, particularly in arithmetic.

Zero hours, 40 minutes, and zero seconds. In this section, the speaker discusses a slide that shows the internal representation and reasoning of GPT-4 when solving a math problem. The slide shows that GPT-4 initially gives an incorrect answer of 120 but then produces the correct answer of 92. The speaker explains that despite making a mistake, GPT-4 was trained enough to overcome it and arrive at the correct answer through its attention-based system. The slide also highlights GPT-4's inability to do true planning, as demonstrated in a task where it needed to modify an integer to make an equation equal 106, showing the need for further training to improve its reasoning abilities.

Zero hours, 45 minutes, and zero seconds. In this section, the speaker ponders whether GPT-4 should be considered intelligent, noting that the answer may depend on one's definition of intelligence.
While it lacks some key abilities, such as real-time learning and advanced planning, some behaviors exhibited by GPT-4 are impressive and useful. The speaker suggests that GPT-4 may offer an opportunity to rethink what intelligence means and to explore other examples of intelligence beyond the natural intelligence of the world. The speaker ultimately asserts that there is much more on the horizon with GPT-4 and that society needs to move beyond debates about its capabilities to confront the real questions about its impact and potential uses. The speaker notes that GPT-4 has a variety of applications, from data analysis and privacy detection to medical and legal knowledge, and that it can even play games and manage files.