works Animation, and I'm going to talk to you today about how and why we're using Swift in our animated feature film production pipeline. To do that, I'm going to talk a little bit about the company, what we do and how we do it, a little bit about the feature animation process itself, and then about the ways we're using Swift in production and the ways we intend to use it in the future.

Since we're a media and entertainment company, I can introduce us by playing a movie. We celebrated our 20th anniversary last year; we were founded in 1994. DreamWorks Animation and DreamWorks Studios was the first new Hollywood studio in over 75 years. It's a pretty hard business to get into, a pretty big investment. Between economic downturns, changes in audience consumption preferences, and things like that over the years, what's left of our company is the feature animation and television division that I work with, and Steven Spielberg's live-action television and movie production company.

How do I advance on this thing? Spacebar? There it goes. Oops. Okay, so the movie didn't play. Or maybe it's just playing slowly. It says it's playing, so imagine a movie that's showing you pictures of all the properties we've worked on. We've produced over 35 animated films over the last 20 years. We started out as a traditional hand-drawn animation shop and have since converted fully to CGI. Interestingly, hand-drawn animation has come back as an art form more than a medium, so it's being used to enhance CGI films or to tell the story with a different visual medium.

Little text formatting issue, so bear with me. DreamWorks Animation is a recognized family brand: a feature-length animation, television, and consumer products company. Our product is our characters. Our intellectual property becomes the characters and the situations those characters get involved in, and the manifestation of our product is data. Everybody here has data.
It usually describes some other product; our actual deliverable product happens to be data. So we have a slightly different relationship to information than some companies do.

Excuse me. This slide represents our two main studio investments. We've got DreamWorks feature animation, which has a studio in Glendale, California, and another studio in Bangalore, India. And we're a joint venture partner with a studio in Shanghai, China, called Oriental DreamWorks. That studio's intent is to produce content in China for Chinese consumption and then export it to the rest of the world.

Some of our characters you might recognize: Shrek, Puss in Boots, the characters from The Croods, things like that. If you have kids, you've seen these movies four or five hundred times, probably from the back seat of your car. I apologize for that. You know all the songs.

One of the things that's interesting about our environment is that it's a constant arms race between technology, the ability to deliver a capability, and creative ambition. The creatives on the films always want to put more pixels on the screen. They want more vibrant colors. They want to do more motion.
They want to do more simulation. They want to change the way you see the film, to make the visual medium, and the 3D use of the visual medium, help tell the story. In the past, things like furry characters and clothing were difficult to do, and they were especially difficult to do together. In Puss in Boots, as an example, we had a quota on the creatives: they could only have two instances where wet furry characters touched each other, because the amount of computation required to do fur simulation with water and 3D collisions in space was more than we had on the compute farm. That's an example of where a creative decision gets squashed by the technology decisions. One of the things we're trying to do as technologists is not be a barrier. We can give them some guardrails and some guidance, but we don't want creative ambition to be stifled by the technology's ability to deliver it.

So I think what's happening is it's reading off the USB stick. Slowly. Very slowly.

Kung Fu Panda. You've all seen the first two Kung Fu Panda movies, I think; think about all the plush toys your kids are enamored with. The next film coming out of our studio is, interestingly, going to release in China first. Kung Fu Panda 3 is being targeted and worked on as a co-production with our studio in China. It's one of the very first feature-length animations done in the US with China as a co-producer. It puts us in an interesting position, for the revenue streams as well as for the responsibility to be true to Chinese business practices and culture. The movie's being targeted to come out at Christmas in China, but not until January for us, because Star Wars comes out at Christmas in the US, and you could be second or you could move your release date.
That's pretty much how it works. We picked second.

A couple more examples of Po. One of the interesting things about the process is that it's a little bit like software development. Someone has an idea: a script, a story element, or an elevator pitch. Then you start doing rough drafts; you start doing top-level designs. In the OpenStack community, you might propose a blueprint. You decide that you want to tell a story, and then you start designing that story. This is an example of some concept art. On the left is a hand-drawn animation of what Po might look like in one of the sequences, showing body position and facial position. Believe it or not, this sort of reference material becomes input into the technology group, because we look for things and scenes like this that are going to be hard to make.

One of the challenges, and there's a slide in here, if it'll ever come up, that talks about this, is that animation is pure creation: you can make anything your mind can imagine. But imagine if you had an infinite number of choices; it's almost impossible to make them. So one of the big challenges facing the filmmakers is to bound their ambition a little bit, not by what's possible, but by what tells a good story, what makes the story compelling, and what kind of makes sense.

This is kind of bizarre. I'm going to talk about it a little bit later, but in the animation process there are anywhere from 12 to 15 different departments, depending on what sort of story you're telling and what sort of elements it needs to be told. One of the very first departments is the story department. The story department's job is to figure out, for 90 minutes, what the story is going to look like: how the characters interact, how they emote, what their motivation is, all that typical Hollywood storytelling stuff.
But it's important, because if you don't have a good, coherent story, partway through you get bored; partway through you might decide that those characters aren't doing things that make sense, you're disinterested, and we lose you as an audience member. So the process itself is highly iterative. It takes five years to make a movie, and the first two or three years of it are iterating on how to tell the story. The mechanism used to tell that story is called storyboards.

That's interesting. That wasn't in my pitch. Is it telling me I'm not going fast enough? Beautiful. Thank you. This clip worked, but I don't have audio.

So this is an example of a story reel. This is the manifestation of telling the story as a series of paper panels. The old way, we used to draw on paper and put the drawings up on these big four-by-eight bulletin boards, and then people would pitch the story. They'd go up and pantomime the characters and act it out, and show all the creatives and the people making the film what they were thinking about the character movement, how the camera was blocked, how the action might sync up with the audio. And we'll iterate like this for a couple of years.

It's interesting: in the old days, they were drawn on paper and then scanned in and presented as a digital reel like this one. Currently, they're drawn on the computer using Cintiq tablets and then printed out and stuck on bulletin boards. So it's the same process; we just turned it around. And the thing that's amazing about it is that walls full of storyboards are still a very easy medium to edit: a couple of thumbtacks, move the paper around, draw a line through it, cross it out. It's the fastest way to iterate on a whole lot of visual information. I always imagine someday we'll have walls covered in e-ink and we can just run around and scribble, but we're not there yet.

So after it iterates in story for a while, it goes into a part called animation. Even though the process is called animation, only about 10% of the
people working on the product animate. So animation is the part where you take the character that's been modeled by a modeler and rigged by a rigger (rigging is adding control points to the hands, joints, fingers, arms, all that sort of stuff), and you pose that character and make them do their performance. The animators are the true actors. They get the character to emote and tell their part of the story.

So this is the same clip with the same audio. Imagine the little guy running around and Jack Black shouting some things as he jumps out. It's without the background, flat lit, without a lot of what are called effects, just to get the character animation figured out. The characters themselves are moving with full fidelity, using the fully rigged model at the best resolution and the best performance you can get. Interestingly, note right here: you can see Crane in full glory. It's the whole character. He was animated in complete detail.

And we get to the next step, which is called lighting. Lighting is the very last step of the show. It's when we go through and position lights in space and calculate the collision of light rays with all the geometry in 3D, given where the lights are. Crane turned into a shadow. Imagine if you were the animator who had worked on this guy for two years, and in this particular scene he's just a shadow, because a lighter decided later on that, for cinematography reasons, it would look better if he were just a shadow.

So this is the final lit version of this piece of the sequence, and you can see it has shading. It has shadow. It has light.
It has reflections, it has ambient occlusion, it has all that stuff. The logo is in full 3D, and it looks glorious. This is almost the last step of the process for our Kung Fu Panda franchise.

So then, as I said, a little bit about the production process. Five years to make a movie. Animation 101: a 90-minute movie is chopped up into 35 to 50 sequences. A sequence is a location, and the sequence in turn is chopped up into 20 to 50 scenes. So if we were doing a movie about people at a conference who've been here for four days and are waiting for lunch, this would be our location. We'll call this sequence 100. I'd aim the camera over there: that's shot one. Camera over there: that's shot two. The movie itself is chopped up into work like that, issued to different departments, and worked on in parallel. So the movie is made out of order, and you only view it linearly pretty much at the end.

130,000 frames to make a movie. A film goes through a projector at 24 frames per second. That's left over from the days when Edison couldn't make the projector mechanics go any faster without jamming. We're digital now. There's no film projector or physical media here. We still do 24 frames a second. It's just legacy.

A frame is thousands of assets. An asset, and that's an important bit for when we get to the Swift part, is a file, or a description of a file, or a collection of descriptions of files that describes the relationship of all the things that go into making a frame. And then each character, Po, Crane, has control points. They're a three-dimensional model that has a kinematic system. It has bones.
It has hair, skin, muscles. Cloth and hair and things like that are added afterward by a different department. Each of those, in order to pose the character and get their animation, has control points: finger, hair, joint, facial animation. In How to Train Your Dragon, as an example, the dragon Toothless had 4,500 different control points. So an animator's job is essentially to edit a graph of how things move through space.

So the end result of this, and I mentioned this earlier: we produce data that happens to look good played at 24 hertz through a projector, but primarily we manufacture data. The data represents intellectual property, represents characters, hopefully tells a compelling story, and convinces you to go to Walmart and buy plush toys. That's kind of the goal.

We're also a file-based HPC shop. One film will consume 75 million CPU hours over its lifetime, most of that in the last three or four months. That's about 8,500 CPU years, so if you start now, you can get one movie done, perhaps. We have about a 20,000-core compute farm dedicated to visualization, computation, and rendering. A show at peak will use about half of that. Since these shows are four or five years long and we release two shows a year, we've got six or seven in flight at any given time. We concentrate mostly, from a resource perspective, on the next three. The one that's going to get released next is the most important, but the two after that are doing screenings and trailers and other sorts of production work that's just as critical.

Each show completes with a 300 to 500 terabyte storage footprint. That's the really important stuff that artists create; we call those assets. There's the intermediate stuff we make, the cutting-room floor, if you will, things that are used to inform the later processes. Then there's the as-delivered movie.

2.2 billion asset transactions: that's some process or some user or some compute machine asking the asset management system for an asset. I need Po's right arm.
I need this eyeball. I need this piece of a tree. I need this piece of a rock. That's done as a software-as-a-service middleware deployment that we built ourselves. That's important for how we adopted Swift, because we needed that layer between applications that don't understand object stores and the object store itself. And the end product, what we manufacture at the end of the day, is 250 billion pixels, all neatly organized in the right way so that the movie is compelling to watch.

So what's Swift got to do with it? Where does Swift come in, and why do we care about Swift, both as an object store and as a delivery mechanism? If you think about it, file services are really three parts. There's persistence: durability, data protection, making sure that you don't lose anything. There's permission: security and protection. And then there's delivery: you have to get the bits from where they are to where you need them. Write-once or write-only file systems aren't very useful.

This is the entire department layout for an animated feature.
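
As a back-of-the-envelope check on the production figures quoted above (90 minutes at 24 frames per second, 75 million CPU hours, 250 billion pixels), the arithmetic can be sketched in a few lines of Python. The 365-day year and the use of the rounded 130,000-frame count for the per-frame estimate are assumptions made here for illustration, not figures from the talk.

```python
# Rough sanity check of the production numbers quoted in the talk.
# Assumptions (not from the talk): a 365-day year, and the rounded
# 130,000-frame figure when estimating pixels per frame.

MOVIE_MINUTES = 90
FRAMES_PER_SECOND = 24
frames = MOVIE_MINUTES * 60 * FRAMES_PER_SECOND
print(frames)  # 129600, which the talk rounds to 130,000

CPU_HOURS = 75_000_000
HOURS_PER_YEAR = 24 * 365
cpu_years = CPU_HOURS / HOURS_PER_YEAR
print(round(cpu_years))  # 8562, i.e. "about 8,500 CPU years"

TOTAL_PIXELS = 250_000_000_000
pixels_per_frame = TOTAL_PIXELS / 130_000
print(round(pixels_per_frame))  # 1923077, roughly two megapixels per frame
```

The numbers hang together: the frame count, the CPU-year figure, and a per-frame pixel count of about two million are all consistent with one another.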
We talked a little bit about story. They iterate at the top until we have something we want to tell. It's the rough draft. Manufacturing the movie is very expensive and very time-consuming, so we try not to start until the movie's fairly well understood. Each of those lines represents what's called an asset handoff. I might do some work product in, say, the editorial department, which describes how long each scene and sequence is, what order they're in, and what the audio looks like, and I hand that off to each of the other departments. Each of those asset handoffs is potentially millions of files, and each department gets handoffs from its upstream and hands things to its downstream departments multiple times a day.

The first place we're using Swift is in our production management and production asset management environment, where we're taking advantage of the high-performance, scalable, durable features of Swift. We use it as the back end for content that artists create, for some of the intermediate components, and for the delivered product. Our asset middleware, called PAM, in turn leverages Swift's middleware to do some things it needed to do that weren't native to Swift. This wasn't a full OpenStack deploy; this was a deploy of object storage. So we selected SwiftStack as the vendor to help manage and install that infrastructure.

One of the things we needed was something Swift didn't have, which is object immutability: the ability to guarantee that an object can't be changed by somebody else once you put it into a certain state. The notion of immutability and an eventually consistent, globally accessible object store was kind of a mismatch. So we worked with SwiftStack engineering, and they came up with a way to do delegated authorization. When you do a Swift object PUT with a certain header in the request, it calls out to our authentication server. We give a thumbs up or a thumbs down on whether or not you have access: if you can overwrite it,
if you can delete it, or if we should go call the police. So it gives us a way to intercept in the Swift proxy and do authentication. You can use it too. It's in the product. Pretty cool.

The key to using an asset management system based on Swift in production is that our data set is almost entirely read-only once it's been published. We need to do highly scalable reads into the compute farm in order to do the computations to get the next piece of the data. We currently deliver most of that data set using NFS. We do about a million and a half NFS ops a second into the compute farm at scale. The objects being delivered out of Swift are more complex, so we don't have to do a million and a half ops, but we're looking at anywhere from 300,000 to 500,000 object GETs per second. Most of those will come out of the cache. I couldn't do that out of the true object store; I can do maybe 600 or 800. But if I get the object once out of Swift, I put it into a scalable caching tier, and then I deliver from there at a pretty high rate. It works out well.

The other thing we like about Swift: geographic diversity and global reach. I mentioned we have studios in Southern California, a studio in Bangalore, and our studio in China. Animation is a very collaborative sport. You need people working on the same data set at the same time, delivering content into the same repository for reuse by everybody else. As a nice, handy side effect of working collaboratively, I get geographic durability. I get my replicas not just in different parts of the data center; I get them in different parts of the world.
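
The read path described above, where an object is fetched once from the object store and then served from a caching tier, can be illustrated with a minimal read-through cache. This is a sketch under stated assumptions, not the studio's implementation: the `ReadThroughCache` class, the `fake_origin` function, and the asset name are hypothetical, and the in-memory dictionary stands in for what would be a distributed caching tier. Caching without invalidation is only reasonable here because, as the talk notes, published assets are almost entirely read-only.

```python
from typing import Callable, Dict


class ReadThroughCache:
    """Serve objects from memory; fall back to the origin store on a miss.

    Hypothetical sketch: `fetch_from_origin` stands in for an object GET
    against the origin object store.
    """

    def __init__(self, fetch_from_origin: Callable[[str], bytes]) -> None:
        self._fetch = fetch_from_origin
        self._objects: Dict[str, bytes] = {}
        self.origin_reads = 0  # how many times we had to go to the origin

    def get(self, name: str) -> bytes:
        if name not in self._objects:
            # The first read pays the origin penalty and populates the cache.
            self._objects[name] = self._fetch(name)
            self.origin_reads += 1
        # Every later read is served locally.
        return self._objects[name]


def fake_origin(name: str) -> bytes:
    """Stand-in for the slow, durable origin store (assumption)."""
    return f"payload for {name}".encode()


cache = ReadThroughCache(fake_origin)
for _ in range(1000):
    cache.get("po/right_arm/v3")  # hypothetical asset name

print(cache.origin_reads)  # 1: a thousand reads, one trip to the origin
```

The design choice this illustrates is offload: the origin only has to sustain the rate of first reads, while the cache absorbs the repeat reads from the compute farm.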
So if all of California explodes (could happen), the India operation could continue production on the film, because the assets are all in country. And our PAM middleware piece knows the geographic location of the data set. If I ask for an asset and I don't have it in that data center, it'll go get it from wherever it lives and bring it back. By using Swift under the hood, Swift can do the object motion, and our middleware piece can just expect that the data is available in those geos. And if it hasn't replicated yet, Swift will go get it for you. That's one of the nice features.

Archiving. If you recognize this, it's the scene at the end of Raiders of the Lost Ark. Archiving and asset preservation are near and dear to our hearts, because our product is data, and in order to monetize and use that data for other revenue streams down the road, I have to have it. I have to have it with high integrity, and I have to keep it for a low price. Next slide.

So, archiving is an event. I have a data set. I like it. I archive it. I put it in a cardboard box, I put it in the warehouse next to the Ark of the Covenant, and I forget all about it. That doesn't really work, because your data needs to be preserved. It's more like art. You have to steward it. You have to keep it clean. You have to check on it every once in a while. So I like to say asset preservation is a lifestyle. If you're not consciously stewarding your data, doing media refreshes, doing technology refreshes, in 50-plus years that data won't have any value to you, because you won't be able to find it or get at it. And 50 years isn't really very long. That's the time horizon we look at, because that's the durability of physical film. Other companies, if you talk to the people from Ancestry or the LDS Church or the genomics guys, they're basically keeping their data sets forever. While the data is persistent, the technology changes. The mechanisms for moving data around will change.
I mean, they're talking about storing data in DNA, which is funny, because in genetics I can store data about DNA in DNA, which is confusing. But the point I'm trying to make is that you need to be very strategic and intentional about stewarding your data and keeping it around. And if you can't do it at a price point lower than the value of the data, then you're storing liabilities. And nobody wants to keep liabilities around. Either delete it or keep it forever, safely: high integrity at a low price.

One of the other use cases I use sometimes is kind of an odd one. It's to treat an object store as a big, giant LUN with named, variable-length blocks. If you think about it, if I named my blocks one, two, three, four, through some number, and they were all 512 bytes long, I could treat my object store exactly like a block device. The model works there in two different places. We're using the SwiftStack NAS gateway as a way to get the NFS and Samba protocols translated into an object store to support legacy applications. And we're using Avere scalable NAS caches backed by object stores as a way to get low-cost, durable storage for high-performance NAS delivery of assets. That data set is primarily our transient data set. There's a large amount of stuff we compute only to reuse for anywhere from a few hours to a few weeks. It's recreatable upstream, but at a cost.

Oh, and the typical things you'd expect with OpenStack. We have Swift underneath our OpenStack deployments to do the Glance image and container repository, to hold build artifacts and deployments, to store logs, to do all the stuff you'd expect it to do in an OpenStack environment, and for the applications that are enlightened enough to talk to storage, sorry, the object store, directly.

Okay, so Peabody is the smartest dog I know
I know and he's gonna help me answer questions Questions got to be questions Given that Manila is coming out with with the technology to share Through through files, would you do this differently now or are you still using Swift and if so, how do you compare these two? Which vendor had you mentioned to In kilo release they have Manila, which is the file share mechanism. So My basic question is I mean, is there a reason why you opted for Swift As opposed to File share an object store as opposed to a file share. Well, most file shares don't work well globally The ones that do work globally that can give you the same namespace and access to the same bytes with the same path Have an object store underneath them. I haven't met any globally distributed NFS servers for example that actually work Because I've got people at both sides doing millions of ops on the same data set and I needed to transit the globe NFS in Particular chatty samba's even worse and you can't do either you can't do locking at any kind of scale more than about 10 or 15 milliseconds So the object store lets me make my NAS eventually consistent If not Swift it would be some other object store But at last summer was the summer of object store. I looked at all of them Swift has the best global characteristics. We've enabled the sort of file sharing at a distance that we're looking for So you're actually writing to this total, right? It's not just read. No, we're writing to both So it's what's interesting as I mentioned that middleware piece provides one of its functions in addition to asset management is address Translation, so if an application says I need this piece of data, it looks it up in a database and says that's available on NFS That's available in Swift. 
That's available in Swift, but in California. And we'll transit the data around and provide it to you. When an artist at their workstation does a model or creates a piece of imagery, that's committed as an asset. Our versioning system is only whole new objects. It's all write, and everything's versioned. We don't delete anything or modify anything in place, so I can have multiple different versions of things in flight. It tends to be read-mostly because, for example, the lighting artists read the models, but they don't change them. They read the texture maps, but they don't change them. The render farm reads everything and only produces imagery at the end. So almost everything's read-mostly.

Thank you.

Can you share a little more detail on the archive, in terms of what actual technologies you're using there? Is it LTO tape? You know, when you need to actually move things?

Yeah, so it's in transition. What we're doing today is a collection of some hand-whittled scripts. If you talk to a digital archivist, they encourage multiple copies, multiple ways, multiple technologies. We use, and many people in the media industry use, a product called Front Porch Digital, which is a content store that's tape-backed. The intent, and we're building some middleware to do this ourselves, and I'm shopping for media asset management systems that can do this, is that I ingest a piece of media that I want to preserve, and it makes a copy in Swift and a copy through Front Porch onto tape. The tape goes off site, and I never look at it unless my Swift copy explodes. It's the golden copy.
It's the one that has the most value. By abstracting tape away with a piece of, I guess if you have too many middles it's sideware, it'll do media management, media refresh, and technology refresh of the tape media, while the middleware piece just tracks the fact that it's another object-related system. But I've got them in two technologies with two different delivery mechanisms and two different latency characteristics. All my reads for old material will come out of Swift, and all my writes will go to both places. Someday one or more of those legs might be a service. But if you've ever done the cost math on storing objects in the cloud, the cost as a service over time, if you ever read the data, is ridiculous. You can always do it cheaper on-premises, today.

And are you tracking... you said with preservation, you need some attention over time. Certainly at a 50-year horizon, you're challenged with the format of the files or the format of the objects. Do you have a system to track that? As in, "We can get the file back, but we don't have the app to read it"?

Yeah. So one of the things we do is publish the source code and the description of the file formats along with it, like a Rosetta Stone. The intent is that some clever software engineer in the future will be able to read the Python script and the description of the thing and either run an emulator that knows how to run the emulator that knows how to read the stuff back, or they'll just read the stuff back using current technologies. If you think about it, certain file formats have been around forever. There will always be a reader for TIFF files. It's open source. You can compile it yourself. For our proprietary file formats, we've talked about maybe normalizing them into something else. But that's publishing the source code in a Rosetta Stone.
We've talked about publishing the VMs that run the application stack that can read the files. It gets kind of crazy, and it's even worse when you think past 50 years to, like, 500 years. Hopefully we leave enough evidence that the archaeologists can figure it out. Or the guy after me. As long as it's still there when I retire, we're in good shape.

So caching is one of the key pieces to making all of this work?

Yes.

Is your caching implementation ubiquitous, with anything being able to cache anywhere, or is it lumpy, where certain caches get certain pieces of data and other caches don't?

We've done both in the past. Currently the caches are unified. What happens is that for the little bit of stuff that's infrequently used, you pay a first-read penalty, and after that it tends to stay in the cache. Metadata, for example, we pin. Data sets we tend to let float, because we have pretty good locality over time. We're experimenting. Today we're using Apache Traffic Server and nginx caches. They're very web-centric, and they do a lot of extra stuff, so we're experimenting with writing our own Swift-specific storage delivery cache layer. One of the things Swift can't do, as an example, is inject Cache-Control headers into the mix. So by writing our own, we can have an API where we can inject Cache-Control headers when the object's persisted, maybe as metadata on the object, so when it gets read you get time-to-live characteristics or other sorts of invalidation.

Is it cached so that all of your 30,000 cores have equally quick access to it?

No, actually. The render farm's distributed.
So the model is that the caches are adjacent to the compute, and they provide WAN acceleration for the data set, offload from the origin Swift service, and scale-out within the local data center. We have a compute-only data center in Las Vegas, which is about nine milliseconds away from Southern California. The first read goes over the WAN, but every read after that is satisfied by the local cache.

Have you integrated your batch scheduling? I think you use MRG now.

We use MRG, yeah. Condor.

Have you integrated it with that, so that when batch jobs hit the system, they try to go somewhere where it's already cached, or do they just go somewhere?

We haven't yet. Data affinity scheduling is really important. The amount of inertia that data has versus the inertia that compute has is a really, really big difference. Moving, you know, 500 terabytes of data to do some math on it is dumb. Move the compute to where the data is. So we're working really hard on data-affinity-based scheduling, to move the compute to where the data most likely was or is.

Thank you. Anything else?

So I wondered, had you looked at Ceph, and do you have any editorial about the difference between Swift and Ceph?

We have looked at Ceph. I looked a couple of years ago, when it was a little bit premature, and, not necessarily Ceph's fault, the underlying file systems had challenges. Ceph is less an object store than a block store that has object capabilities. Since Red Hat bought them, I expect some commercialized versions of Ceph as an object store. But it wasn't there when I looked, and when I looked last summer, for doing all the sorts of things we wanted it to do, it was still a little manual to operate. It doesn't have some of the same characteristics. In another year or so, the decision might be different. One of the things we like about our strategy is having abstracted middleware: I can put a different back end in there without changing the applications at all. And so I get sort of the best of everything.

Assuming that
you had data stored elsewhere before you went to Swift, have you moved everything over, or is this all net new? Did you migrate data from existing archives or other stores?

Yeah, we are migrating today. The existing archives are on IBM tape managed by TSM, which is a backup system. Backup systems are terrible archive systems, but sometimes you use what you have. So there's an ongoing initiative to read that material back in, do a little curating to see if it's still stuff we want that makes sense, reorganize it, and then re-ingest it into the archive. That's labor-intensive. You don't want to have to do that very often. You want your media refresh to really be an automated, low-cost, low-labor thing. But since we have to do labor on this one, we're going to take the chance to cull the data set a little bit.

Anybody else? All right, I'll be around for a little while if you have other things you want to talk about. Thank you.