Hi Finland, I'm Jenny, an AI avatar created by Synthesia. With me, you can transform text into video in just a few minutes. I'm one of the newer versions, trained to sound and act even more human than my predecessors. But for now, please welcome Victor Riparbelli from Synthesia and Philippe Botteri from Accel. The stage is all yours. Thank you. Hi, everyone. So quick question before we get started: how many of you think that this video looked very real? If you think it looked really real, just raise your hand. Yes. Wow, that's nice. I think if we had asked that same question a year ago, I bet the answer would have been very, very different. It's been amazing to see how fast the technology has been evolving, and I can't even imagine what you're going to see a year from now, which is super exciting. So I'm Philippe Botteri, I'm a partner with Accel, a global venture firm. I have the pleasure to be here today with Victor Riparbelli, the founder and CEO of Synthesia. As you've seen, Synthesia is the leading text-to-video AI platform, and it recently achieved unicorn status. So very, very happy to be here with you, Victor. Just to get started, tell us a bit about your entrepreneurship journey and how you went from skateboarding and listening to punk rock in Copenhagen to founding one of the leading AI companies in London. Thanks for having me. Very exciting to be here. I think the short version is that I grew up loving computers, and I've always loved counterculture; more fringe ideas have just always been more attractive to me. In my teenage years that meant punk rock and skateboarding and that sort of stuff, and in my later, more mature years, I've always had a very deep interest in the fringes of technology.
And so the way that Synthesia came about was that I had been involved in startups, doing a few projects myself, and really figured out that I wanted to build a company. But most of what I had been building up until then had been much more traditional SaaS, business process, bookkeeping tools, which was of course exciting, but I've always had this deep interest in science fiction and emerging technologies, and I wanted to combine those two things. So I moved to London and spent a year working on virtual reality and augmented reality, which I was very captivated by back in 2016. I still am today, but I realized back then that that probably wasn't the market I wanted to go after. Through my work there, though, I met my co-founder, Professor Matthias Nießner, who was a professor at Stanford at the time. He'd done the seminal paper in the space we operate in now, a paper called Face2Face, which was really the first time the world saw video frames generated by a neural network that looked more or less photorealistic. When I saw this technology, I just felt like I saw magic. I could immediately start imagining how this would change everything we know about media production. And I anchored on this thesis: that in 10 years, you're going to be able to make a Hollywood film from your bedroom without needing any cameras, microphones, actors, or studios. You'll only need your imagination. This was back in 2016, so now we're three years out from that vision, and I actually think it's going to more or less play out. We're getting pretty close with most of the technologies; they just need to be joined together. So it's been a very exciting time. But I think for me, I just love these kind of weird ideas. And back in 2016, when we tried to raise venture capital for this idea, it was definitely very fringe. It was really hard.
But obviously, especially in the last 12 months, it's been very different. I think the world has now seen and been captivated by the amazing power that AI promises to give us all. Nice. So as you mentioned, a lot of people would say, well, this is text-to-video, so it's a much more effective way to produce video, which, of course, is true. But that is not really your vision. When you look at Synthesia, you can say, well, actually any text can now become a video. And then the mission of the company becomes not how you make video, but how any text can now become a video. Therefore, the amount of video is exploding. And the reason for that is that videos are a lot more effective at making people absorb content, and they are a lot more engaging. Just to give you a quick example: every year in October, Accel launches a report called The Euroscape, which is kind of a state of the cloud in Europe. A few days before the release, I did one post on LinkedIn, which was just text, and I had 2,000 impressions. Four days later, I did the same post using my Synthesia avatar, and I had 20,000 impressions. So that's 10 times the ROI on similar content. So was it your vision from the start to say, well, actually any PDF document, any PowerPoint presentation is going to become a video in the future? Or is that something that evolved with the technology? I would say the clarity around that mission is definitely something that has evolved over time. In some ways, we did what you're supposed not to do, which is we started with a technology that we were very excited about, and then we tried to figure out which problems we could actually solve with it. When you're building these kinds of deep tech companies like Synthesia, where you're doing fundamental science, you have to think about how you sequence your way to the end goal.
So the end goal, from a technical perspective, would be that our systems are so powerful you can make that Hollywood film. But back in 2016, it was very clearly unrealistic that we could build a product around that. So we went on this journey to figure out, at the current maturity stage of the technology, what could we turn into a product? What could we sell? And the first thing we tried was actually to sell technology to people who were already making lots of video content: marketing agencies, video production studios, even film studios, who we were trying to work with. And it wasn't a disaster, in the sense that we actually could sell it. We sold this AI dubbing product, which was: give me a normal video, and I'll translate it to a different language by animating the face to look like the person is speaking in Spanish or French. But the thing we realized was that for all these people whose job it is to make videos today, their house is not really on fire. They already get to make lots of cool content, and they know how to go about it. They just want to do their job better; they're not desperate to make more video content. And so the technology we could build for them would always be a vitamin, not a painkiller. Throughout those years, though, we spent a lot of time just talking to as many people as we could about video. And what we realized was that there is a huge number, like millions, if not billions, of people who really, really want to make video content, but they can't, because they don't know how to use a camera, they don't have the budget for it, they don't know where to start. So they're stuck with making essentially text content, so PowerPoints, PDFs, emails, et cetera. They can't make video, but they're really desperate to make video.
And for these people, if we could build a solution that's a thousand times easier to use and a thousand times more affordable, then they're okay with the video quality being a bit lower than a real camera, which it definitely was when we launched the product in 2020, right? And the thing that clicked for us there is that it's very much about what you're comparing it with. So let's say you're in a big company and you need to train and onboard people to work in a restaurant. Either you could do that with a 40-page handbook they have to sit down and read, or you could do it with an AI video. Those are the two choices, right? There's no choice to actually make a real video; they would never do that beforehand. And in this situation, it's just very, very clear that the AI video is way better than the text. From there it really evolved into this insight that when you look at the world today, everybody wants to watch and listen to their content. Most people in their private lives, at least, listen to podcasts and watch YouTube videos, TikTok, whatever, more than they read, and that's a trend that's only going in one direction. So that led us on to this thesis that what we are really doing here is helping people who usually would only be able to write, to make videos. In five or ten years, I think we're probably still going to be texting, but for most of the content we consume, I think it's going to be video and audio, and that's the platform we're building. And was it clear to you from the beginning? Because if you look at a lot of AI startups, they started with a consumer or prosumer product, which is kind of self-serve, everybody can use it, a lot of hacks, funny things going viral on TikTok, YouTube, et cetera. But your focus is, and was, kind of enterprise, right?
So was it clear to you from the beginning that you wanted something that drives real ROI for enterprise customers? And why did you choose that path versus, oh, let's have a broader platform with, you know, very limited moderation and let it go viral? Well, as a founding team, I think we always wanted to build a business, not a vehicle to raise VC money and hope it's going to pan out in 10 years' time. So we always veered towards the path of: what can we do for people that creates sustainable, long-term business value, not just cool demos, right? But that said, I think it is incredibly important, and has been very important for us, to also have somewhat of an open platform where people can go in and try out lots of different things. Because with new technologies, we actually need a lot of people to iterate and experiment with it to figure out where it's valuable, right? We would probably not be as clear on our thesis and where we deliver value today if we'd just been sitting in a room by ourselves saying, hey, we're going to build this product for this particular use case. That's how we go about it. And it's a bit trite to say, but especially in the last two years, where the funding market has changed significantly and the macro has changed significantly, it is now more important than ever to build products that real users like because they give them real utility and help them do their job better, right? Not just a cool, wow, kind of aha moment. So, just to make it more concrete and real for the audience, can you give us a couple of examples of how large companies are using Synthesia and how that drives value for them? So generally, with our enterprise customers, it's very much about driving operational excellence with video. One of my recent favorite examples is Zoom, the video calling platform.
Zoom has around 1,000 salespeople all around the world. And it's of course really important for Zoom that these sales folks are up to date on what's happening in the competitive landscape, what new features Zoom is pushing, what's happening in the news. The way they used to do that was that they would send emails, right? Or it would be a Slack message or something, and then you would expect people to read it. And that worked to some extent. But now that they've switched to video, what they've seen is that they can ramp people much faster and keep their sales force updated much better. And it's actually pretty simple; it's just very basic human nature. There's been a lot of research on the difference between watching something in a video versus reading it in text. You remember around 10% of what you read in text and around 80% of what you watch in a video. So you take that across 1,000 salespeople who remember 80% of the material you sent them versus 10%. That really moves the needle for your business, right? So that's one example. Heineken is another one of our customers. They, of course, have thousands of people all around the world that need to be onboarded and trained to do their jobs really well. And we've gone in there and really helped them increase information retention and also decrease the budgets they usually would have for video production of this sort of content, which you can now do directly on the Synthesia platform. And are most of the use cases today focused more internally, or do you start to see companies using it for external marketing and communication? So a year ago, it was very, very much focused on internal content. In the last six months, we have seen quite a big push towards more outward-facing use cases. That could be customer support, customer success, some elements of B2B marketing, partner marketing. We're definitely not doing Super Bowl ads and really well produced, high-quality YouTube videos.
Not yet. Not yet. We'll get there. But right now, I'd say it's more focused on the sales process: the materials, the touch points you have with the customer. You can go in and give them video instead of text, which just works much, much better. Next year, we're releasing a big upgrade to the avatars. And I think once we push through the uncanny valley, as in we truly get to the last 1% that may still be missing in terms of making it completely like a normal video, with avatars that can be emotional and expressive, I think we'll begin to see many, many more outward-facing use cases. I think a lot of the content we consume on social media is going to be AI-generated. And I really think that this technology has yet to come of age, but the next 12 months are going to be really wild, I think. And do you see interest from celebrities in having their own avatars? Maybe to specify: what you saw in the video is a stock avatar. In Synthesia, you have what, 200 stock avatars? Yeah, something like that. But you can also have your personalized avatar, right? And the process is very simple: you go into a studio, you record a 20-minute video, and then you can see yourself speaking, which is quite an interesting experience. So do you see a lot of interest in personal avatars, whether on the business side, the celebrity side, or the creator side? For sure. I mean, it's a huge selling point for the platform. On celebrities, it's pretty interesting because there are definitely two camps. There are those who definitely do not like AI, don't want to have anything to do with it, and are scared of it. They don't like the fact that you can actually create content with AI. And then there's the other camp, which totally embraces it and sees it as an amazing new tool to drive new revenue streams, right? Or just efficiency for them.
So for example, we're not that far off from you probably being able to make an advertisement with a celebrity without the celebrity having to be on set. A lot of celebrities would like that. What I think is more interesting is that as a celebrity you can begin to use your likeness in different ways. So we work with one really big celebrity who owns a hotel chain in China. When you book a suite in this hotel, you'll come in, and on the TV there'll be a personalized welcome message from the celebrity, right? Just a little fun thing, but it creates a social media moment. And of course, for the celebrity, there's no extra work involved, right? I think we'll start to see this a lot more, where celebrities can license their likeness in ways that weren't possible before. Maybe in the future, we'll have chatbots where you're not chatting with a customer support agent; you're actually chatting with the celebrity who also does the brand's billboards and Super Bowl ads. That's a fun thing you can do with an avatar. So I think it's kind of bifurcated, but it'll be a huge thing in the future, I'm sure. And how open are companies, big enterprises, today to using synthetic video, given the noise around AI and deepfakes? How are companies thinking about it, how do you see people starting to overcome this barrier, and at Synthesia, how do you prevent the technology from becoming deepfakes? Well, I think the landscape has definitely changed significantly in the three years since we launched the platform. In the beginning, there was a bit of uneasiness about these avatars: is it weird? Do people want to watch these? Generally, what we find is that people like it, as long as you disclose that it's an avatar. Like, hey, this is your virtual sales trainer. Then people think it's a really fun thing.
I think what we've seen in the last 12 months, with the whole world being captivated by generative AI, is that a lot of people have moved from "I'm not really sure I want to do this" to "we need to really use generative AI to propel our business forward," right? And this kind of use case is one that's easy to access. There's no risk of hallucination, and it's fairly easy to control. I think that's a big part of why we've seen this switch to more external-facing content: the cultural acceptance of these kinds of technologies. And I think in the coming years this will become, I mean, as normal as many other technologies we use day to day. But it is, of course, incredibly important to safeguard these systems. We have always invested a lot in AI safety. Around 9% of the company works on it today. You obviously can't create an avatar of someone without their consent. And we also content moderate. So we take a fairly strict stance on what kind of content you can create on the platform, not just the content you obviously don't want, but also things like gambling, which is not something we allow, for example. So we take quite a strong stance on what you can make on the platform. And I think that is important. These technologies are really powerful, and we want to make sure we roll them out into society in the best way possible. Yeah, because today you cannot just use a person. The only way to have a video created of a personal avatar is to have that person OK the release of the video. So it's very strict. You can't create a Trump avatar and make him say whatever you want, for example. It's kind of like setting up a bank account or something like that: you have to go through a check to make sure that it is actually you who's making an avatar of yourself. OK, that makes sense. And so right now there isn't much regulation around it, but it's coming.
So what do you think about that? If you had a magic wand, how would you set the regulation to make sure that on one side you still enable the technology to develop and thrive, but on the other side you protect personal identity, personal rights, et cetera? It's obviously a really hard question exactly what it should look like; I was on a panel this morning debating this as well. At a high level, the way we should think about this is that what we want to reduce is harmful outcomes. We don't want to limit technologies. And there are some places around the world where governments are focused on the technologies and the algorithms themselves. For example, the US, right? They're now thinking about capping the size of training runs you can do unless you have almost government approval. In Europe, there have been talks about specific algorithms. And I think that's not very helpful. We should focus on the outcomes. We know that most of the bad things you can do with AI are not net new things. People have been defrauding people for a very long time, even before the internet and computers existed. People have impersonated people. AI will amplify all these things. The first step is to make sure that our existing laws work in a digital age. With deepfakes, for example, should we revisit impersonation, like how we punish impersonation? We probably should, right? To make sure that it works in the age of AI. Maybe we want to be more punitive if you're using AI-generated tools than if you're just pretending to be someone on email. And there's still work to do there, for sure. So I'm very pro-regulation. I think we should have safeguards in place. But we should focus on the outcomes of the technology. We shouldn't be overly focused on what size of model you're training or what specific algorithm you're using, because it's also just not going to stand the test of time.
What today may be a huge model run to create a very powerful model, in a year's time you're probably going to be able to do with a hundredth of the compute and a hundredth of the data. So I really think we should focus on the outcomes, not the technologies themselves. And do you think, in particular in the EU, that technology regulation is going in the right direction right now? Not really. Unfortunately, I don't think Europe or the EU has a great history of creating a great regulatory landscape for startups to thrive in, and I'm a little bit worried about the current direction. But there's been a great pushback from industry and academics, so I'm hopeful that the next three to six months will bring some fruitful discussions and get us to a framework that both puts the right guardrails around it and doesn't stifle innovation, so we can compete with the US here from Europe. Yeah, fingers crossed. I hope so. So we have about five minutes left, and I'd love to finish by talking about the future of where this is going. As I mentioned at the beginning of the panel, it's been so impressive to see where the technology was two years ago, when you were watching it and you'd say, well, it's funny, it's a bit of a bobblehead, you know, animated and speaking. And today you're like, wow, is this real or is this synthetic? So tell us a bit about how the technology is evolving. Right now you've spent a lot of time in recent months working on the face: the eye movement, the wrinkle movement, the smile, and you're starting to have some upper body movements. So how fast do you think the technology is going to evolve? Where do you think you're going to be a year from now, and how far are we from having avatars that are full body, able to move freely on a stage? Yeah. I think the next 12 months are going to be absolutely wild, not just for Synthesia, but for any AI company.
Like the progress we're seeing right now is just absolutely immense. For us, the first thing we're focusing on, which is going to be early next year, is getting a next generation of avatars out where they begin to have emotions and expressiveness. What that means is that when you as a user put in a script, the avatar understands that script and performs accordingly. So let's say you put in a very sad script: the avatar will look sad, it will sound sad, it'll probably talk a little bit slower. If you put in a script of an avatar selling you a used car, it'll be a lot more upbeat, with a lot more energy in the voice, which is what a human would do, right? If you gave a human actor a script, they would not perform everything in the same tone of voice, with the same prosody, and so on. And this is going to be a huge platform shift in how we think of these avatars. And then later in the year, it's going to be about taking the avatars into rooms, so not having them just look at the camera. They need to be able to sit like we're doing right now; they need to be able to pick up objects. Essentially, we want to enable our users to create much more rich and interesting scenes with Synthesia. The big, big breakthrough, and the reason I think next year is going to be really wild, is of course large models, which is not a new idea, but we've really seen the power of that in the last 12 to 18 months. ChatGPT is of course the most obvious example. OpenAI trained these incredibly large text models and demonstrated just how powerful, and in some sense at least intelligent, these models can be. And we're now seeing that translate to different modalities as well: image generation, video generation, speech generation, general audio generation.
And I think what the last 12 months have really shown us is that when we do these very, very large training runs, these models have inside of them the capability to imagine or produce most of the content we want to produce, right? With image generation today, you cannot tell that it's a generated image most of the time. Video is lagging a little bit behind, but inside these models, the capabilities are there. We just need to figure out how to extract them in the right way. Today, these systems are kind of like slot machines: you put in a text prompt, you pull the handle, and you get something out. It'll look photorealistic, but it's very difficult to control. And that's okay if you're a hobbyist and you're fine with doing 35 prompts to get to the image that you want. But what we really want these systems to be is super controllable, in a way where you can create consistent characters and instruct the model to do exactly what you want it to do. And I think this next year is going to be about taking these powerful models with amazing capabilities, putting control structures around them, and making sure that you as a user can generate exactly the image you want, exactly the video you want, and exactly the speech you want. While that's still a hard problem, I think the harder part of that problem, actually rendering or imagining this content at incredibly high fidelity, has been demonstrated to be solvable by these large models. So I think we'll see a lot of really magical, amazing technologies in the next 12 months. Nice. So maybe just one last question to conclude this discussion: if you look at the AI space more broadly, what gets you the most excited? Well, actually, I think it's what I just spoke about. I think we've seen how much power and world understanding these models have.
Now it's about taming that and shaping it into products that are really useful for everyone. And I think that's going to go across modalities, from text to image and speech and audio. By the end of next year, the way we think about content creation, if it hasn't already been upended this year, I think the next 12 months, if anything, are going to bring even more advancements. And what we're going to see is that these tools and technologies begin to replace workflows that people have today. People will start to make content where they're not going to tell you it's AI-generated; they're just going to do it because it makes their life easier. Whereas today, a lot of what we see is like, look at this AI-generated image I made, this AI-generated video, right? It's more those PR moments than actually replacing real workflows for real content creators. I think we'll see that by the end of the year. Good. Well, thank you, Victor. I think time is up, but it was a very interesting discussion. Thank you again for being here today. Thank you.