Thanks for joining us. Before we get started, I have to show you a safe harbor statement. You've probably seen these before. Basically, we're going to show you some stuff that's in research, some experimental stuff, and there's no promise that any of it will end up in product. So I've got to legally say that.

All right. My name is Evan Atherton. I'm a senior research engineer for Autodesk Research in the AI and Robotics Lab. And today, my colleagues and I are going to share some of the work we've been doing at Autodesk to leverage AI, both in the tools we make today and in how it's going to change the way designers and artists work in the future. So I want to start by setting some high-level context about where we see the role of AI in the design process. This super scientific and definitely accurate graph shows the complexity of our world over time. As our world becomes more complex, the problems we need to solve become more complex, and therefore the tools we need to solve those problems also become more complex. To illustrate the point, imagine you've never seen a hammer and you've never seen a chisel. If you walked into a room and saw someone doing this, I think it would take everyone in here maybe 15 seconds to figure out what the problem was and what the tool does, right? You might take the blunt thing and hit the sharp pointy thing, and a chip of wood goes flying off. Super simple problem, super simple tool. Now let's say you walked into a room and you saw a wood lathe. The problem is similar but a little bit more complex. You still want to remove some wood, but you want to do it on this cylindrical object that's spinning, and maybe you want a more ornate pattern. So you think, all right, I can do this. Let me put the cylindrical thing in the spinny thing and turn it on and take a chisel to it. But then maybe wood goes flying, the chisel goes flying, hits someone in the head. It's kind of a mess. So you have to start understanding things like grain direction and spindle speed, and maybe the machine takes more maintenance. The problem you're trying to solve has become more complex, and again, the tool we've created to solve that problem has also become more complex, but it's still within our ability as humans to grasp. If any of us spent enough time in a room with a wood lathe, we'd sort of figure it out. But unfortunately, our ability as humans to manage this increasing complexity is finite. Our brains don't benefit from Moore's law, for instance. So to take our analogy home, let's look at a CNC machine. This is a massively more complex machine that can take months of training to become proficient at. If you've ever seen G-code or had to write G-code by hand, it's pretty gnarly. But humans have developed software techniques that manage a lot of that complexity for us, that write the G-code and control the machine for us. That lets us as designers and engineers leverage this complex tool to make more and more complex parts. So this space between these two curves is where we think the really profound impact of AI is gonna be. There will come a point on this curve beyond which we won't be able to solve problems without it.
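To give a flavor of what that hidden complexity looks like, here's a toy Python sketch of the kind of G-code a CAM package writes for us. This is not real CAM output, just a minimal zig-zag facing pass over a rectangle with made-up parameters, to show why nobody wants to write this by hand.

```python
# A toy illustration of the kind of G-code CAM software generates for us.
# Minimal sketch only: a zig-zag facing pass over a width x depth rectangle.

def facing_pass(width, depth, step_over, feed_rate, z_cut):
    """Return G-code lines for a zig-zag facing pass over a width x depth area."""
    lines = [
        "G21 ; units in millimeters",
        "G90 ; absolute positioning",
        "G0 Z5.0 ; retract above the stock",
        "G0 X0 Y0 ; move to the start corner",
        f"G1 Z{z_cut:.3f} F{feed_rate} ; plunge to cutting depth",
    ]
    y, direction = 0.0, 1
    while y <= depth:
        x_end = width if direction > 0 else 0.0
        lines.append(f"G1 X{x_end:.3f} Y{y:.3f} F{feed_rate} ; cut across")
        y += step_over
        direction *= -1
        if y <= depth:
            lines.append(f"G1 Y{y:.3f} ; step over to the next pass")
    lines.append("G0 Z5.0 ; retract when done")
    return lines

for line in facing_pass(width=50, depth=30, step_over=5, feed_rate=300, z_cut=-0.5):
    print(line)
```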
So what we wanna do is use AI to create tools that understand what we're trying to accomplish and can then augment our abilities, so that together we can solve problems that may otherwise be impossible for either of us to solve on our own. It's hard to see, but that's a VCR clock trying to be set. I feel like I'm barely old enough to still make that joke. So the way we think this is gonna happen is in three phases. The first phase is what we're calling smart tools. These are tools that on the surface don't really seem that much different to interact with than the tools we're already using. There might be single push-button automations. But on the back end, what's going on is wildly different. These are tools that use mountains and mountains of data to train algorithms that help solve problems for us. They're able to solve problems that are tedious, super time consuming, and also problems that to this day we just haven't been able to solve. And what's cool about them is they can continue to collect data as we use them, refine themselves, and get better and better. This is actually the stage we're in right now. The next stage, which I think is close, is what we're calling intelligent assistants. These are tools that will actually learn who we are and what we're trying to accomplish, and will be able to reason about the actions we need to take to achieve our goals. If you're an architect, for instance, and you need to lay out an office, you don't need to be pushing polygons around, right? That's not the problem. The problem is, how do I balance the needs and requirements of the people in the room? So why not have a tool that lets you balance the needs and requirements of the people in the room? Because of that, the types of interactions we have with our tools fundamentally have to start changing, so that we can represent what our goals are. And finally, the third phase is when our tools become truly trusted collaborators. This is when they'll not just understand our goals but the context we're working in. They'll help us achieve our goals, but beyond that, they'll be able to give us insights to make sure the goals we're trying to achieve are actually the right goals. And again, the experience of working with the tools in this stage is gonna change. We're gonna have interactions with our machines that are much closer to how we interact with other humans than how we interact with the tools we have today. So with that, I'm gonna hand it off to Will, who's gonna share some of the work that's already making its way into product.

Thank you, Evan. So I'm Will Harris, product manager here at Autodesk for the Flame family of products. If you haven't heard of Flame, it's what we call a non-linear editor, a video editing tool that can also do visual effects compositing as a procedural node graph, has a 3D scene environment for motion graphics and 3D scene compositing, and also does color grading. So it's a suite of tools you might think of as kind of a Mac Daddy version of Adobe Premiere or Final Cut Pro, if you're familiar with those. With that as our background, we've been doing a lot of problem solving in the world of visual effects compositing, trying to do image segmentation. So our AI, if you like, has been focused on image processing.
So your first thought for AI might be Google predicting what you're gonna search, or picking which images in a CAPTCHA are traffic lights or crosswalks. But our approach was very specific to the problems we face in compositing and working with images, the complexity that's outside our reach, like Evan was saying earlier. This illustration really brings in the idea that in a CG-rendered world, as well as a beauty image of the spaceships you wanna composite, you also have AOVs, arbitrary output variables, or render passes. They're more informative, richer asset types, if you like, things that allow you to apply depth of field, defocus through Z-depth, and use normals to potentially relight those spaceships post-render. That richness is something we have nothing like for our background plate, right? There's no AOV for my live-action scene shot on a Sony or an Arri or other traditional camera today. What if we could generate that level of asset for live action? So enter AI; what we really need, we sometimes joke, is AIVs. And here you have some examples of what our applied research into this specific image processing problem led to. We wanted to be able to generate depth for a live-action scene so that we could brighten the dark areas in the background and add light in the foreground, and we wanted to be able to generate normals for human faces, but not CG-rendered human faces: the equivalent asset for live-action content. So the power of normals brought to working with live-action footage. That's pretty cool. Let's have a look at how an artist actually uses it. We want to, in some ways, do VFX at the speed of grading, as we sometimes call it. And you can see how, through the kind of mechanism you might associate with color grading, like keying and doing a track on a shot, we can come right in here and quickly generate normals. Then from that, we can use a widget like this to change the lighting on a face. So this is a really powerful new capability that an AI tool, a push-button automation, as we said earlier, has brought us. There's nothing to do, no special knowledge needed, but there's no way you could paint or build the complexity of a normals pass for a human face by hand. It would take months to build this asset, right? And from it, you can quickly generate a matte that you can then use for color adjustment. We can also defocus the background or change the lighting in the background, based on depth. So in this second example, what we have is the depth, doing global analysis for the scene. It doesn't have to be a traditionally camera-trackable scene; it can be blurry, handheld. It's using the intelligence of our training to know things like: the sky is typically at the top of the image and at the back, the foreground is typically closer at the lower part of the image, and there are blobs of things that are gonna move around in the mid-ground, all learned from the hundreds of thousands of images we trained it with. This quickly lets you relight the background, add light in the background. Historically that would have been, okay, I have to rotoscope all these people by hand, or maybe with some traditional pixel-based keying. But by having this richness, you can start doing the kind of thing you would do if this were a CG-rendered scene.
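To make that concrete, here's a minimal Python sketch of how a relighting widget can use a generated normals pass, assuming the normals arrive as a per-pixel XYZ image, the kind of "AOV for live action" described above. It uses simple Lambertian shading; the real Flame tools are far more sophisticated, and every name and parameter here is just for illustration.

```python
import numpy as np

# Minimal sketch: add a new light to a plate using an AI-generated normals pass.
# `normals` is an (H, W, 3) array of per-pixel surface normals, `plate` is the
# original (H, W, 3) image with values in [0, 1]. Simple Lambertian shading only.

def relight(plate, normals, light_dir, light_color=(1.0, 1.0, 1.0), strength=0.5):
    l = np.asarray(light_dir, dtype=np.float32)
    l /= np.linalg.norm(l)                                    # unit light direction
    n = normals / (np.linalg.norm(normals, axis=-1, keepdims=True) + 1e-8)
    lambert = np.clip(np.einsum("hwc,c->hw", n, l), 0.0, 1.0)  # N . L per pixel
    added = lambert[..., None] * np.asarray(light_color) * strength
    return np.clip(plate + added * plate, 0.0, 1.0)           # light scaled by plate

# Usage with dummy data: a random frame, all normals facing camera, lit from left.
plate = np.random.rand(4, 4, 3).astype(np.float32)
normals = np.tile(np.array([0.0, 0.0, 1.0], np.float32), (4, 4, 1))
out = relight(plate, normals, light_dir=(-1.0, 0.0, 0.5))
```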
Okay. So just to summarize: we took our applied research, we built these tensor graphs that were then computed into algorithms, or trained models. That was a big part of our development process. We then packaged that into a tool inside the Flame software itself, to give our artists quick and easy access to it, so they can do some of the cleverness of VFX but within, in this example, the color grading discipline. We think that's a great symbiosis, as we said. And we'd just like to let you know that we think this is only the beginning of AI tools for us. We're already starting to think about other specialized trainings, like being able to extract sky, just sky: once you understand depth, being able to key the sky in different types of environments. A lot of our customers are replacing overcast skies with nice puffy clouds, or removing clouds, adding clouds, or matching the sky from one shot to another. So an automatic keyer for that, we think, would be good, but we're looking for feedback on what other specializations would be useful as well. Do you need to be able to isolate buildings? Do you need to be able to up-res your footage? There's AI that can do pixel up-resing, right? And the last thing to mention: if you're a larger company, if you have teams that would like to utilize this framework and give us specialized data sets, we could then build you a way to recognize, say, a certain kind of character. If you have a whole bunch of movies coming up with one type of character, that might be useful. So this is a framework we built that we think could be specialized. If you'd like to hear more about this, please join me tomorrow at 4:30. I've got a special guest, Stefan from A52, one of our coolest and most innovative customers, to show some of his real work with this. With that, let me hand you over to Seb. Sebastian, thank you very much.

Thank you, Will. Cool stuff. So I'm Sebastian. I'm a senior software engineer at Autodesk, and I work on products like 3ds Max and Maya. Today I'm gonna talk about this project that we call UVAI. As a refresher, for those of you who don't know what UV mapping is: traditionally, when you want to texture a 3D model, that texture is defined in a 2D image, and UV mapping is simply the process of establishing the correspondence between the 3D model and the 2D image. This process is usually done in three steps. In the first step, we add seams to the model, separating it into different pieces that we call UV shells or UV islands. Then we unwrap each of those shells so that they're flat. And in the final step, we lay those shells out in UV space, typically in one or more square UV tiles. This is an important problem because it takes our customers 40 to 60% of their asset creation time, and it's been a top-10 customer request for Maya and 3ds Max for more than three years. It turns out we have pretty good tools for automatic unwrapping and layout, but we don't have very good tools for automatic seams. So let's take a step back to understand why we need seams and why they're hard. Imagine you have a sphere where you wanna place a map of the world. You could add a single vertical seam, and you'd get a nice square UV shell. But you'd notice that around the equator there's a lot of stretching, which is represented by red.
And around the poles, you'll find a lot of compression, which is represented by blue. Now, you could also add more seams, generating more shells, which leads to less distortion, which is why all these shells are mostly white. But then imagine having a texture of the world split like this. It might be hard to work on and hard to understand. There's more: the first layout I showed you is very efficient because it uses almost the entire square, leaving nearly no gaps, whereas the second one leaves many little gaps in between. That's very important in video games, for example, where every megabyte counts. In addition to the geometric and optimization constraints, seams are also used to create semantic boundaries. That's because these UV maps are going to be used by a texture artist down the pipeline, and it's useful if they can identify the parts of the model easily. So you can see here that there's a head shell, a torso, legs, arms, et cetera. So we've seen three main characteristics: seams are used to reduce distortion, they define semantic boundaries, and they influence how efficient the layout can be. Unfortunately, when you add seams to a model, you're most likely going to add some visual artifacts around those seams. It might be a little hard to see the visual artifact in the image on the right. I don't know if anyone can see it. Let's zoom in. You can see there's a discontinuity on both sides of the seam, and you might think, well, this is okay because I'm looking at this asset from afar. But in a video game where you can get close to this character, or in a movie where you're filming this character from behind, this is going to break the illusion. That's why artists most often place their seams in non-apparent places, such as under the arms, between the legs, or on the back. Now, when we compare where an artist would place their seams with what our current auto-seam tools generate, we can see that our auto-seam tool keeps distortion low, but it's not preserving the semantic boundaries. It's very hard for me to see which parts of the model are represented where in UV space. But artists have been doing this for years and years, so maybe a purely geometric approach is not the best way to produce automatic seams. What if instead we could leverage the knowledge of artists, take all these models they have already created, feed that into a machine learning algorithm, and let it learn to produce seams? This is exactly what we did for this early prototype. I wanna emphasize this is a very early prototype in Maya. Here the artist is going to generate the seams with one click, do the unwrapping with one click, and the layout with two clicks. So first they click on AI seams and wait for a little bit while the seams are calculated. Now that they have good seams, they select the model, unfold all the shells, and finally orient and lay them out. So in just a few clicks, they got pretty decent UVs. How does this compare to what the artist had originally imagined? Well, we have very similar distortion and a similar number of UV shells, so similar fragmentation. But more importantly, we're again preserving these semantic boundaries. You can see there's a shell for the head, a shell for the torso and the legs, and two shells for the arms.
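As a quick aside, here's roughly what's behind that red/blue distortion coloring. This is a minimal numpy sketch, not the exact metric Maya or 3ds Max uses: for each triangle we compare its 3D shape to its UV shape through the singular values of the linear map between them. Values above 1 mean stretching (red), values below 1 mean compression (blue). The function and array names are just for illustration.

```python
import numpy as np

# Per-triangle UV distortion: singular values of the 3D-to-UV Jacobian.
# sigma > 1 means the texture is stretched; sigma < 1 means it is compressed.

def triangle_distortion(p3d, uv):
    """p3d: (3, 3) triangle vertices in 3D; uv: (3, 2) matching UV coordinates."""
    # Express the triangle in a local 2D frame lying on its own plane.
    e1, e2 = p3d[1] - p3d[0], p3d[2] - p3d[0]
    x = e1 / np.linalg.norm(e1)
    n = np.cross(e1, e2)
    y = np.cross(n / np.linalg.norm(n), x)
    P = np.array([[0.0, 0.0], [e1 @ x, e1 @ y], [e2 @ x, e2 @ y]])  # local coords
    # Solve for the 2x2 Jacobian J mapping local triangle coords to UV coords.
    U = uv - uv[0]
    J, *_ = np.linalg.lstsq(P[1:], U[1:], rcond=None)
    return np.linalg.svd(J, compute_uv=False)  # (sigma_max, sigma_min)

tri = np.array([[0, 0, 0], [1, 0, 0], [0, 1, 0]], float)
print(triangle_distortion(tri, np.array([[0, 0], [2, 0], [0, 1]], float)))
# -> roughly [2., 1.]: stretched 2x in one direction, preserved in the other.
```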
But then you might be wondering, well, the layout there doesn't quite look like the artist's layout, the ground-truth layout. Yes, but that's because we're not predicting the unwrapping and we're not predicting the layout, only the seams for now. And while we're still working on perfecting this seam prediction, we're already thinking about what's next for this project. First of all, there's this idea of generalization. We have tested on a few categories like characters and vehicles, but there are so many more assets that get placed in a 3D world: vegetation, buildings, furniture. We wanna support all of these assets in our system. There's style transfer, which is very similar to what Will mentioned at the end. Maybe you're a big studio or an artist that has tons of assets, and you already have your own seam style that you wanna apply to new assets you create; you wanna transfer your style. We want to enable this workflow. And inevitably, when you have an automatic system, there might be things in the result that you don't like, and you might want to make some tweaks, but you want the system to remember those tweaks for next time. That's the idea of personalization that we also wanna enable. And finally, one of the biggest complaints about machine learning is that it's very hard to control the output it generates. So we want to allow artists to guide the process: by, for example, adding initial seams and letting the system take it from there, or blocking certain regions that would usually have seams so the system doesn't generate seams there. And with that, I wanna hand it back to Evan, who's gonna talk about all the great work they do at Autodesk Research. Thank you.

Hello again. So as promised, I'd like to share some of the work we've been doing in Autodesk Research. It's a little bit further out there; we're really trying to explore what I mentioned earlier about leveraging AI to create completely new design paradigms, and to start moving toward that vision of the trusted collaborator. The first project I wanna talk about is called Deep Form. What we're trying to do with Deep Form is use machine learning to essentially parameterize the essence of an object. We wanna learn what the key features are that make something an airplane or a guitar or a table. The way we're doing that is by showing a network, in this case using autoencoders, if you're interested, a bunch of pictures of airplanes, and then it's able to encode their airplane-ness, for lack of a better word, as a vector of numbers. So it's essentially compressing what it means to be an airplane or guitar or table into a string of numbers that we can then represent as a 3D point cloud. This is real output from that network: what it thinks a laptop is, what it thinks a vase is. What's really cool about doing this is that once we have a vector of numbers representing an object, we can add and subtract objects the same way we add and subtract numbers. So this is what we call the Chairoplane: here we're finding all of the objects that exist between a chair and an airplane. That's kind of a cute example, but for something more practical, we can take two objects of a similar class and interpolate between them to find all sorts of design variations in between, perhaps variations we might not have thought of.
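To make the latent-space arithmetic concrete, here's a minimal sketch assuming a trained shape autoencoder with an encode step (shape to latent vector) and a decode step (latent vector to point cloud). The toy model below is a random, untrained stand-in invented purely to show the interpolation and add/subtract pattern; it is not Deep Form's actual architecture.

```python
import torch

# Toy stand-in for a trained shape autoencoder. Real models would use point
# cloud encoders and be trained on thousands of shapes; this only shows the API.
class ToyShapeAE(torch.nn.Module):
    def __init__(self, n_points=1024, latent=128):
        super().__init__()
        self.enc = torch.nn.Linear(n_points * 3, latent)
        self.dec = torch.nn.Linear(latent, n_points * 3)

    def encode(self, pts):                      # pts: (n_points, 3) point cloud
        return self.enc(pts.reshape(1, -1))     # -> (1, latent) vector

    def decode(self, z):                        # z: (1, latent)
        return self.dec(z).reshape(-1, 3)       # -> (n_points, 3) point cloud

model = ToyShapeAE()
chair = torch.rand(1024, 3)                     # stand-ins for real shapes
plane = torch.rand(1024, 3)
z_chair, z_plane = model.encode(chair), model.encode(plane)

# "Chairoplane": decode every point on the line between a chair and an airplane.
for t in torch.linspace(0, 1, 5):
    blend = model.decode((1 - t) * z_chair + t * z_plane)  # one in-between shape

# Feature arithmetic, as in the vase example described below:
# z_result = z_lip - z_no_lip + z_straight, then decode z_result.
```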
So here's real output from the network as we interpolate between a table on the left that has no base and a table on the right that has a base; you can see it computing the shapes in between. But we can actually go beyond a straight linear interpolation between two objects and control individual features. For a chair, a feature might be the style of the legs, the style of the back, whether or not it has armrests, or the hole in the back. So again, here's some real output. This time we're taking a vase that has sort of a lip at the top and is pretty curvy, subtracting a vase that doesn't have a lip at the top, and then adding a vase that has straight sides. The result you see here is a vase with sort of a lip at the top and straight sides. What this approach ultimately lets us do is draw from the massive body of knowledge and design language that already exists. That lets us push our designs further, faster, and it also lets us come up with designs that, again, perhaps we wouldn't have considered on our own. We can also use this approach to build tools that help people learn. We've been doing some experiments with Tinkercad. If you're not familiar with Tinkercad, it's a free, web-based CAD tool, so a lot of kids and hobbyists use it, and because of that, learning is a really important part of the Tinkercad platform. It has millions of users who've generated a ton of public data that we've been able to mine. Essentially, we're learning what types of models people are creating and then how they create them. After a while, the algorithm has sort of learned to predict what people are trying to make, and it's continually updating. You can see down there a continually updating prediction of what it thinks you're trying to make; in this case, it thinks the person wants to make a fidget spinner. It's also able to understand, if you're trying to make a certain object, the steps you might need to take to get to that object, and provide those as suggestions, so you can use them to learn how you might design the object. You can see now, just by adding the block on the back, it thinks you want to make an airplane.

So, switching gears a little bit. The last thing I want to share with you is something we spend a lot of time on, and that's simulation. Simulation is in all of our tools; it's super powerful, but it's really hard to use. And more than anything, simulations just take forever. They take a long time to set up, and then they're really computationally expensive, so they can take a long time to compute, which makes it really hard to iterate. Often you might try to set up a simulation, go get a coffee or whatever, come back, and maybe your simulation didn't work the way you wanted it to. This is true whether you're doing computational fluid dynamics of flow through a pipe or an explosion for visual effects. So here's an example of a finite element problem, the kind often used in mechanical product design to calculate the static stress on an object. This is a couple of cylinders. On the left we have a boundary condition: the left of the cylinder is attached to a wall. The green arrow here is basically a force vector. And what you see on the right is us using AI to approximate the simulation calculation, and we're able to do that in around four milliseconds.
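To give a flavor of what such an approximator can look like, here's a minimal sketch of a neural surrogate: a small network mapping a handful of design parameters plus a load vector straight to a sampled stress field. The architecture, input sizes, and names are all invented for illustration; the point is that inference is a fixed, tiny cost no matter how long the reference solver takes.

```python
import time
import torch

# Sketch of a surrogate for a finite element solve: design parameters and a
# force vector in, a sampled stress field out. Sizes are arbitrary placeholders.
surrogate = torch.nn.Sequential(
    torch.nn.Linear(8, 256), torch.nn.ReLU(),   # 5 shape params + 3D force vector
    torch.nn.Linear(256, 256), torch.nn.ReLU(),
    torch.nn.Linear(256, 2048),                 # stress at 2048 sample points
)

# In practice this would be trained on (design, load) -> stress pairs produced
# by a conventional FEA solver; here the weights are random, so the output is
# meaningless and only the timing is illustrative.
design_params = torch.rand(1, 5)
force_vector = torch.tensor([[0.0, -500.0, 0.0]])

with torch.no_grad():
    start = time.perf_counter()
    stress = surrogate(torch.cat([design_params, force_vector], dim=1))
    print(f"inference: {(time.perf_counter() - start) * 1e3:.2f} ms")
```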
What this enables us to do is adjust the design in real time and see the updated simulation result as we do it, so we can have a more interactive approach to designing and simulating at the same time. I'll also add that as geometry gets more complex, the time typical numerical approaches take to calculate tends to increase exponentially. That isn't true of these AI techniques: as the geometry gets more complex, it should keep computing on the order of milliseconds. We've also done this with computational fluid dynamics. These types of simulations are typically solving crazy nonlinear partial differential equations and the like; they're super gnarly. What you see here is again an approximation, in this case of laminar flow over this object, updating in real time. You can see 60 frames per second up there as the object is being manipulated. These are pretty hardcore engineering problems, but they're essentially the same problems we face with effects simulations in tools like Maya. You can also see here on the right the sheer number of parameters an artist has to play with. Especially if you're a new artist, you might not know the effect some of these are gonna have, so you can burn a lot of time up front setting up a simulation that won't end up giving you the result you're looking for. So I'll leave you with sort of a what-if. What if we could just show Maya a ton of simulations, and it could use all that information to essentially learn how to use itself and form some intuition? Then it could do what artists do all the time, which is take some piece of reference photography or footage, and use the intuition it has formed to perhaps suggest the Bifrost graph you might need to get started, recreating that simulation in 3D so you can then manipulate it. Again, the goal is not to replace the artist; rather, it's to augment their abilities so they can start further, faster, so they're not faced with the blank page and can spend more time on the artistry and less on the setup. I've never met an artist who finished their sim early and was just chillin', right? These are really complicated problems, so I wanted to leave you with three main challenges we see in leveraging AI for this type of work. The first is identifying problems suited to the technology, and I mean this in two ways. One is that in pop culture, AI is often presented as something magical that you can sprinkle on a problem and call it a day, but certain types of problems are better suited to these machine learning techniques than others. The second is that we wanna make sure we're solving problems our users care about, problems that can help them do their jobs better and faster. The second major challenge, and this is true for any AI practitioner, is getting good data. In fact, an inferior neural network trained on more and better data will often outperform a superior network trained on worse data. The higher the quality of your data, and the more of it you have, the better your algorithms will be. Which finally brings me to the third challenge: we don't believe any one group will be able to solve the toughest of these design problems with AI on its own. We have brilliant researchers, but no one understands your workflows better than you do.
No one has better data than the companies that are using our tools day in and day out to create magic on the screen. So we're looking to partner with groups and help people solve these really challenging problems. If that sounds rad and you wanna talk to us about it, or get some more information about what we can do, reach out to us at ai.autodesk.com, or come talk to us after. And yeah, that's it. I think we'll take a couple of questions. Thank you.

Thank you so much, it was really inspiring. When you talk about the airplane-ness of an object, could you analyze a whole scene the same way? For example, its emotion-ness, or its compositional correctness, following some rules of the scene, which could drive render recommendations: how to light this scene to make it more impressive, or more neutral. So not only analyzing objects, but the scene itself. Or, for example, you could analyze render logs and predict light setups or render times, a more global level of render analysis. It almost looks like life analyzing itself. Your presentation was really very inspiring.

Thank you. Appreciate it. No question there? Any other questions? Go ahead.

Okay, yes. Looking farther ahead, do you see a space for AI to collaborate with humans on things at higher semantic levels, like the story and the emotion the artist is trying to communicate? This might get into automating the blocking of the characters, the movement of the camera, the composition of the camera shots. Do you see a need for AI assistance at these higher storytelling levels?

Absolutely. And I will say that I actually have an intern, a PhD researcher, in the lab right now trying to figure out how we can use machine learning to map cinematography styles. It's really the early stages of that. I think when we start to talk about things like emotion, that's a little bit harder, but we can start by giving creatives tools so they can explore these spaces a little better. Maybe you do have a camera shot, and if we train on enough data, we can start to see clusterings of camera movement that feel more anxious to us, and then they could pick one of those camera features and add it to their camera. So I absolutely think that's where we're going and where I'd like to see us head.

We can all make it look like J.J. Abrams.

Yeah, basically. And to be clear, I certainly don't want to replace cinematographers, and I don't think that's what a tool like this would do. But again, it's about having an assistant that can help you explore these different types of shot styles, to your specific question.

Okay, thank you.

Yeah. Hello. I was wondering, for the UVAI project, what type of models were you using to handle the 3D data?

We're still developing our models, so unfortunately we can't talk too much about them yet.

Okay, thanks.

Is there any kind of rough estimate of when technology like this will become available for artists to use? Just roughly: is it coming in a few years? Is it coming sooner than that?

I think we showed the three futures, kind of. So Flame, the stuff that's already shipping, is what I was showing there. And then I guess yours is sort of active research. Yeah. Autodesk stuff aside, I think we're already starting to see, certainly in the last year, more papers being written on tools for creatives, more examples. I think there's one from Google that's music related.
So music generators and things like that. In some ways we're already there. For the further vision, the stuff I showed like manipulating objects, I think we're a little bit further out. I always hesitate to give time ranges, but five to ten years, somewhere on that order.

Very exciting. Thank you.

We'll stick around if anyone has individual questions for us. One more question. Yeah, one more. We just can't see him very well.

There was some interesting research involving Vimeo. They fed the comments about videos to a neural network and got recommendations for how to make a video better, or how to maximize its emotional impact on the spectator. It starts to look like a kind of manipulation. With these tools we're developing, we can easily manipulate the spectator in some ways, push them to feel something or to go in some direction. So what is your personal opinion?

It's good that we're not in the social media business.

Yeah. We just make creative tools for people to make cool stuff.

Yeah. I think certainly for things like non-linear editors there's room for it: if I have a ton of footage that we shot and I just want to see a quick assembly edit, can we show that tool a script, have it understand sort of the feeling in a particular scene, and then cut together a quick trailer or a quick sequence from that footage?

Well, we've had customers request shot grouping. Like, all the shots of this actor that need beauty work: we wish we could scan our entire movie and find the 500 shots that have that person in them. That would be another scenario where we might use this.

This feels related, but just this morning I read or saw an article that someone trained an AI to watch all of the Queer Eye for the Straight Guy episodes and then write an episode based on that. So you see that kind of stuff happening already.

Yeah, that's semantic at that point.

Yeah. Cool, all right, no one else is popping up, so maybe we'll hang out. We've got a couple of minutes to answer questions if you're interested. Thanks for coming. Thanks for sticking around. Thank you.