And yes, we are back at Big Things Conference 2020, up in the attic room at the top of the house, in this home edition we created for this special ninth edition of Big Things Conference. That was a fantastic first talk with Brian O'Neill, and we have a second speaker ready, whom I'll introduce straight away. Localizing a video game that is preparing for its release in a new region or country can make a huge difference to its success and sales figures, but it's a big job. It's not just the translation; it can mean new audio, new graphics, and even altering aspects of the game because of cultural sensitivities. One of the major game studios, Electronic Arts, is developing AI tools to make this faster and easier. To tell us more, we have with us the director of localization analytics and video and audio at Electronic Arts, Mario Bergan Piños. Mario, welcome. Hello, thank you. How are you? Thank you for having me. Very good. Thank you. Lovely to have you in this super cool office of yours. Yeah, it's not a background, it's my actual office. It is full of toys, by the looks of it? Yes, it is. It's work, it's work. You have to say, no, this is work. Of course, this is totally work. The console is hidden over here. We're looking forward to listening to you, so whenever you're ready, go ahead. Okay. All right. So, I hope you can see my screen. Good evening, everyone. My name is Mario Bergan Piños and I'm director for localization analytics as well as the video and audio departments in EA localization. As you can see, this is quite a mixed role, and everybody asks me the same thing: why data, audio and video together? Why is this? So, I'll quickly sum up my experience. I come from broadcasting technology, so 3D, video compositing and audio. After working for several multimedia and broadcasting companies, I started at EA in a summer job, tried it out, and I've stayed for 16 years.
So, it's been a long ride. My profile has been primarily focused on audio and video localization, then I moved between different departments and ended up in leadership roles in Madrid. I can say I'm not a data expert in the sense that I don't code, but I'm a data enthusiast, as we call it, and I'm a business expert, that's for real. This mixed experience, together with my leadership role and co-leading innovation for localization, created the perfect mix to give me the opportunity to create and manage a data team. So, how come I suddenly started getting involved in data? My entry point was mainly player data. I started looking into the relationship between player data and localization, specifically in FIFA. This was about four years ago. After that, I got curious about asset telemetry data, then financial data, then machine translation data, then vendor data, and that's how I realized that cross-referencing data, and having all of it in the same place, was something really powerful. To start doing this, we created this loc analytics team, and as soon as I created it, I realized, thanks to Noelia, our data manager, that EA could benefit not only from merging and exploiting all of the data we had from different sources, but also from machine learning technologies like machine translation, speech-to-text and text-to-speech, or even from automating several processes through algorithms. So, my talk today is about all of these changes and this journey: creating this team, the learnings we had, and then telling you a little bit about what kind of data we're handling and what localization in video games is all about. Before I start speaking about data, some of you might be wondering what localization is about.
Our team at EA centralizes all of the translations, voice dubbing, testing, manuals, video, and basically all content that is not English, for all of the EA games. Our operations are quite complex in the sense that we're always working towards a simultaneous ship, which means that at the same time as the English version ships, we're shipping the rest of the languages. As you can imagine, this means a lot of changes and complexity in the operation. So we not only need specific tools, we also retrieve quite a lot of data that helps us make better decisions in our operations, in quality, and even in language decision-making. We have our guiding principles. We localize all of the EA games, text and audio, and all of the player-facing content. We partner with all of the EA studios, publishing and worldwide customer experience. We're highly scalable through technology, automation and industry partnerships. We consider that we don't just localize, we craft: we ensure that the artistic style targets the languages and cultures, and we try to achieve that immersive experience for players around the world. We also consider that localization generates revenue, in the sense that we should analyze and maximize the return on investment of each of the languages. So, we try to influence those localization decisions based on data. Our mission is similar to EA's: if EA inspires the world to play, localization makes that possible around the world, because of the languages. And our vision is basically to be EA's voice for the right localization solutions. So, moving on to the next one.
Here, I wanted to give you a snapshot of all of the data streams that we handle and considered, especially when we started doing our taxonomy in EA localization. We have audio, video, translations, financials, testing and player data. If I start with the audio data: we gather data from all of our audio departments and the recording studios we work with, covering almost everything that has to do with the speed of recordings, the cost, vendor data, and all of the files that enter our content management system. We are aware of every single move these audio files make through our workflows, and if something has gone through re-recording or the quality wasn't good, we capture all of those telemetries. The audio quality KPIs are really based on manual checks and data entry, so we know which assets were good or bad, when they got into our systems, and how many times they went back and forth. As you can imagine, this gives us excellent insights about, for example, our vendor accuracy or the development studios' asset churn; that is, how many times we actually redid some of the localization because, as I mentioned before, we have to do everything in simultaneous ship: all of the changes made in the English content have to be replicated in the rest of the languages, sometimes, for audio specifically, in up to 12 languages. So, it's quite a lot. The video data is mainly anything that has to do with marketing videos. We localize the marketing videos into all of the languages, so everything that you see on web pages, Twitter, tweets, whatever, we do all of that localization. In this department we work a lot with automated image recognition to automate our quality assurance.
And that gives us quite a nice metric of our own workflow accuracy. We also track quite a lot of things that have to do with scopes, and the back and forth of the videos; and because we are dealing with marketing, as you can imagine, there are tons of changes in these videos. For translations, we have two different areas. One is what we would call regular translations, in the sense that these are the translations done by human translators. Here we track all of the quality-based feedback from our testing teams, the speed of translation, the cost, and anything that has to do with the translators. The other area is machine translation. Here, of course, we have a really big focus on machine learning, and we are basically trying to improve the quality of those languages and also creating our own KPIs. In terms of financials, we mainly track forecasting and billing internally at EA. We're very good at this in the sense that we have been doing it for quite a long time, but the data was quite distributed, so for this one it was really a challenge to gather all of that financial data and get it into the same place. This is something we're working on right now. These days we're also really focused on anything that has to do with live services and live games, because games right now are not just a matter of shipping a game, waiting 12 months and then shipping another one; content is constantly being changed, so we also have to track the financials around that. For testing, we track bugs, speed and quality. Everything is based on the amount and severity of bugs. This gives us all of the trends we need, what to focus on and what is important. There's also quite a lot of data regarding resources.
We have quite a big internal testing team, so there are quite a few tools helping us handle all of these resources, and we also track data around that. And then player data, one of the most important ones. This is where we study in-game behaviors. Depending on the area, it can be usage of the game or monetization, but for loc specifically, for localization, what's really important is how the game is being used in terms of text and audio. For example, are people skipping cinematics in a specific language? Are people using one language over another within the same country? We have also experimented with player sentiment metrics, in which we extract data from forums and social media, mainly by assigning keywords to positive or negative sentiments, and this can be quite useful. So, this gives you a brief snapshot of our complexity, and of how we could lose a great opportunity if we didn't organize all of this data. When we started looking at all of the possibilities of cross-referencing the data I just mentioned, we found ourselves with 35 different databases. We also needed to speed up our ROI process based on player data. And, of course, we wanted a data-driven culture and so on. But we didn't have the team to do this, we didn't have the talent to do this. That's when we reached the conclusion that we needed help, and professional help. So, that's when we started creating the team, and this is what it looks like. When we started the hiring, the first thing I did was look for a data manager, in this case Noelia González. She was actually supposed to be here with me, so hi to Noelia. She's watching. When I first met Noelia, we decided to divide the team into four different areas.
We have a business analyst area that helps us with the management and understanding of the business requests. We have a data engineering area that is creating our data infrastructure, which I will explain a little later. We have a data science area for applying machine learning techniques to localization. And we have a data analyst area, which is basically creating the APIs and the visualization of all our insights. The team is not big, but it's very senior. We try to feed it with secondments from other teams in localization, temporary staff, and trainees for specific projects, so we also work a lot with universities. When we had the team in place, we wanted to strive for that data-driven culture and get our own departments in localization really excited about this. In this slide, I wanted to mention some of the learnings we went through on this journey. For example, when we started all of this, when we started talking about different initiatives or data lakes or whatnot, our first reaction was: okay, let's ask ourselves the right questions. What do we want to accomplish? What problems are we trying to solve? And we found ourselves with quite a lot of projects where we could help our different departments. The next thing we had to do was get to know our data sources, a little bit of what I showed before with the different areas we had. This was definitely the most tedious task. We had to look into every single database we had, track those volumetries: how many times are those databases updated, are they real-time, where is the data coming from, is this basically an Excel file on someone's computer, what is this? We had to go through all of that.
Then, when we had all of our initiatives, something really important was mapping them: measuring how much effort each of them was going to take, measuring the output, and making a bit of a return-on-investment calculation for each initiative. And then, of course, MVPs; we also heard Brian talk about this before. MVPs are key for us, because we didn't really have that culture of creating minimum viable products; we were always immersing ourselves in huge projects that would then last for years, and you don't realize how much time you can lose doing something like that. So we're also striving to fail fast if we need to and then iterate from there. And then data evangelization, which is, again, trying to get people excited about data. How are we doing this? Data evangelization for us is key to getting people excited, as I said. A good entry point, and this is again a learning we had throughout our journey, is to go for fast, quick wins: create some of them for your teams and, all of a sudden, you will realize that people are getting engaged with data. You can see some examples here, like machine translation stories or graphs, or small data visualizations that we have. We try to ship some of these every month or so, or even more often. Then we have things like data talks: having people come to your office or, right now obviously, joining through a webinar or Zoom or whatever it is. And try to do it yourself: go out and talk at forums like this one.
And I can say that the ones we had in the office were quite successful; we had quite a lot of people, almost the whole office, coming in. So, that was great. Something we also do quite a lot: we have a mentorship program at EA, and what we're trying to do there is be really active, spreading our message by coaching and mentoring internally. We also do something we call secondments, which is basically borrowing people from other teams to work with us for a few months and training them at the same time. Internal and external visibility is quite important too, again, like this forum. And then something like serious branding: create your own brand and go for it. This is something we did as well. And then training, which is quite important. You need people to start doing their own visualizations as soon as you create your data lake or whatever it is; you need people to start using Power BI, Tableau, whatever it is. So there's also a bit of training, and here we tried something different, something we call the data school. The data school is our attempt to create our own data training program, based on experience and recommendations from our area experts. The idea is to create four different training paths: business analyst, data engineering, data scientist and data analyst. And the reason we're doing this is that it becomes a three-way tool. First, you train team members from other departments, as I said, for secondments onto our team. Second, it helps with onboarding external candidates or trainees onto your team. And third, we feel it's a super important personal development tool.
So, if you're not really familiar with the data world, this is an excellent tool. To give you an example, we could have a software engineer coming to our department from another team to work with us for six months, and in this example this person would try to get their data into our data lake. What better way than doing it themselves, right? And this is something we're actually doing. This program also helps us fill the gaps these people might have in some of the areas. Training sources can range from training we prepare ourselves, like recording videos, to some paid trainings. But I think the interesting bit is that we created our own sort of application in which people can do their own evaluation of how much they know about coding, specifically, or data exploration. As soon as they have filled this in, they get results for each of the areas and then some recommendations. The cool thing about this application is that it serves a couple of purposes. It helps people see visually which trainings would be best for them, but it also gives us insights about the different departments we have at EA and the level of knowledge they have. One of the ideas we have is that as soon as we ship this to the EA localization team, we're probably going to do something for the rest of EA, because I think it's quite important that we gather as many trainings as we can. So, okay. That was nice, but how do you actually organize your data? Going back to my original point: you have 35 different databases, you want to mix data, you want quality of data.
You want people to have easy access to it. So, how do you do it? That's when we started building our infrastructure. We started creating Ada, which is our EA loc data hub. This hub is an environment in which we can centralize data in a single repository, away from all those dependencies, saving costs and effort and, most importantly, after transforming those data sources, giving us the quality standards that we need. I like the name Ada for several reasons, but the one I like the most is that it comes from Ada Lovelace, who is considered to be the first computer programmer. So, what does it look like? I'm not going to get really technical here, but our data hub gives our different business users the autonomy to work with their data almost as a marketplace. The data hub is an open business data source with different solutions. It obviously has its governance, its access rights and security, but the idea is to have all those 35 different databases I talked about ingested and running inside that data hub, so people can use their business intelligence tools, create APIs, and so on. All of the transformed data resides in our data lake, which is called Hall of Justice. Yes, we're pretty geeky about video games and also about heroes. It's based on Amazon Web Services, so it's obviously quite scalable in terms of computing, but at the same time we also have our own computers at the office that we use for machine learning. In this case we call it Polygon, and it's a really powerful computer that also helps us save some credits on Amazon.
So, this is pretty much what it looks like. All right. That was really interesting as well, but what about the projects? What are you working on? I think this is probably the most exciting part. I'll start with some ideas and some of the projects we're working on in terms of player data, which a lot of people ask me about. In this slide, I thought about asking ourselves some questions, and obviously all of them have a yes as an answer. For example: can we create customized content based on player data, for example, recording more voice-over lines? This means that if I knew from player data that in Spain, in a game like FIFA, people are playing with a specific team, playing specific friendly matches, making so many substitutions of a specific player, and things like this, I would know the behaviors of the players we have inside FIFA, meaning that whenever I need to localize something, I won't do it exactly as in English, because that wouldn't make any sense for us. So what we're doing is customizing per language: whenever people are listening to the speech and the commentary, we try to refresh that, not just following the same concept as the English game, but doing it specifically for each of the languages. We found that pretty cool and customized. Another question would be: do we know which languages to invest in depending on player usage? Yes, of course. This is what we call our ROI process, and it takes information about sales, it takes information about our own localization costs, but also something really important now that everything is about player engagement: as I said before, games are not only shipped every 12 months, we have content constantly, and we need to know if people are engaged.
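To make the ROI process concrete, here is a minimal, hypothetical sketch of the kind of per-language comparison described above. All numbers, field names and thresholds are invented for illustration; the real process at EA combines far richer sales, cost and engagement telemetry.

```python
# Hypothetical sketch: weight each language's revenue by how much players
# actually use the localized audio, then compare against localization cost.

def language_roi(revenue, loc_cost, audio_usage_share):
    """Simple ROI score, discounted by real player usage of the localization."""
    return (revenue * audio_usage_share - loc_cost) / loc_cost

# Invented telemetry per language: (revenue, localization cost, audio usage share)
languages = {
    "es": (500_000, 40_000, 0.92),
    "nl": (200_000, 35_000, 0.70),   # many players switching to English audio
    "pl": (90_000, 30_000, 0.55),
}

for lang, (rev, cost, usage) in languages.items():
    score = language_roi(rev, cost, usage)
    decision = "full audio + text" if score > 2 else "consider text only"
    print(f"{lang}: ROI={score:.1f} -> {decision}")
```

The point of the sketch is the shape of the decision, not the numbers: raw sales alone would overstate the value of a language whose players switch away from it.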
So, knowing whether people are engaged with the language itself, with the text, with the audio, whether people are actually using it, is pretty important for us to make the best decisions. Should we localize Polish in a specific game, or maybe do only audio, or only text? We wouldn't know things like this without player data. How do we get live sentiment about our localized games? This is something as simple and as difficult as going into social media, or doing web scraping of forums, and using keywords. We can basically know whether a game has been doing well or not, or extract information about whether people dislike a specific commentator, and things like this. And the last one: could we change FIFA commentators based on players choosing English over their local language? I actually have an example of this, which happened between FIFA 19 and FIFA 20, and it's quite an interesting case, because when we started to retrieve data about our Dutch commentators, we realized there was something off, something weird. You can see that in FIFA 19 quite a lot of people were playing in English. When I say this, I mean that when people started FIFA, they actually changed the language from Dutch to English, specifically for audio, meaning they were changing the commentary and the commentators. At first glance you think: okay, well, the Netherlands is probably the highest in English proficiency across Europe, so maybe we shouldn't be really worried. But this was strange compared to other languages. So then we started looking at player sentiment, looking at the forums, and we realized that people were complaining about the commentators.
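The keyword-based sentiment approach described above can be sketched very simply. The keyword lists and forum posts here are invented; a production pipeline would scrape real forums and social media and use a much richer lexicon or a trained model.

```python
# Minimal keyword-sentiment sketch: count positive vs. negative keywords
# per scraped post; a score > 0 reads as positive, < 0 as negative.

POSITIVE = {"great", "love", "amazing", "immersive"}
NEGATIVE = {"annoying", "boring", "hate", "repetitive"}

def sentiment_score(text):
    words = text.lower().split()
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)

# Invented forum posts standing in for scraped data
posts = [
    "I love the new stadium audio, really immersive",
    "the dutch commentators are so repetitive and annoying",
    "great game but the commentary is boring",
]

scores = [sentiment_score(p) for p in posts]
print(scores)  # [2, -2, 0]
```

Even this crude scoring is enough to surface a pattern like "the commentator complaints cluster in one language", which is exactly the signal described in the Dutch FIFA case.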
So we took the chance and, after quite a lot of talks with marketing, we changed the commentators, and this is what happened the next year: we decreased the usage of English in the game to 12% and increased Dutch to 86%. A 16-point uplift is quite big; if you think about going from 70 to 86, for us it was quite a success story, and I think the real success was being able to track this and to make decisions with this kind of data. So, in terms of machine learning, what are we doing? Can we customize machine learning engines, for example speech-to-text, text-to-speech or machine translation? This is something we're working on right now, and machine translation is obviously the one where we're most advanced. Are we able to automate tests with image recognition? This is something we're also doing with our marketing videos, and I will show you a couple of examples later. Or can we even predict financial forecasts based on historical series? We have a lot of financial data, so this is also something we're working on. To give you an example of what we're doing right now in speech-to-text: we're obviously trying to find the best quality we can get, but we're also tracking that quality as a KPI. We're tracking the word error rate of each of the engines, for example Azure, Google, IBM's Watson and other companies, and also the character error rate. We track this constantly for each of the languages, so we know which engine is behaving better than the others for each language, and which one has had an uplift in the last few months. That way, whenever we want to invest some time into improving one of the languages, we know which engine we actually need to go for.
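The word error rate (WER) tracking mentioned above can be sketched as follows: WER is the word-level edit distance between a reference transcript and an engine's hypothesis, divided by the number of reference words. The engine names and transcripts below are only illustrative labels; no real API is called.

```python
# Word error rate via word-level Levenshtein distance (dynamic programming).

def wer(reference, hypothesis):
    r, h = reference.split(), hypothesis.split()
    d = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]
    for i in range(len(r) + 1):
        d[i][0] = i
    for j in range(len(h) + 1):
        d[0][j] = j
    for i in range(1, len(r) + 1):
        for j in range(1, len(h) + 1):
            cost = 0 if r[i - 1] == h[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + cost) # substitution
    return d[len(r)][len(h)] / len(r)

# Invented reference line and per-engine transcripts for one language
ref = "el delantero marca un gol espectacular"
results = {
    "engine_a": "el delantero marca un gol espectacular",
    "engine_b": "el delantero marca gol espectacular",  # dropped "un"
}
for engine, hyp in results.items():
    print(engine, round(wer(ref, hyp), 3))
```

Logging this per engine and per language over time gives exactly the trend view described: which engine is currently best for a language, and which has improved recently.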
And we're doing the same thing for text-to-speech and machine translation. Then image recognition, something that was quite helpful this past year. We went for an MVP just a few months ago, and now it's embedded in our workflow, so this is one of the examples of how we use machine learning technologies. For image recognition, in this case, we went for the age rating agency logos. What we're trying to do is recognize these age ratings. As I said here, we do more than 10,000 videos a year, and we need to embed these age ratings in each of the videos. We were checking this and doing the quality assurance almost manually on each video, just to make sure we were not shipping a video that was supposed to be rated 12 when it showed a 7, right? Now, with image recognition, we get this automatically: not only knowing that this is a 7, but exactly where it is placed and at what time, because everything has to be compliant. It needs to be in a specific place and it needs to appear at a specific time. So we no longer need people watching these videos, and it actually saves quite a lot of money. And, as with what we said about MVPs, we're now iterating on that MVP and doing more: we're looking at subtitle transcription and then entity recognition, which is basically recognizing characters from each of our games. So we can not only transcribe the subtitles, but also recognize game characters in videos and get that into a database.
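The compliance side of that check can be sketched as follows: given detections from an image-recognition pass (rating label, screen position, timestamp), verify that a video shows the expected rating in the required place at the required time. The detection format, thresholds and PEGI labels here are assumptions for illustration, not EA's actual pipeline.

```python
# Hedged sketch of an age-rating compliance check over recognition output.

def check_rating(detections, expected_rating, required_pos, required_time,
                 pos_tol=20, time_tol=0.5):
    """detections: list of (rating_label, (x, y), time_in_seconds).
    Returns True if the expected rating appears within the position and
    time tolerances, False if the video should be flagged."""
    for rating, (x, y), t in detections:
        if (rating == expected_rating
                and abs(x - required_pos[0]) <= pos_tol
                and abs(y - required_pos[1]) <= pos_tol
                and abs(t - required_time) <= time_tol):
            return True
    return False

# Hypothetical recognition output for one marketing video: it shows a
# PEGI 7 logo, but this market's build was supposed to carry PEGI 12.
detections = [("PEGI 7", (1820, 60), 1.2)]

ok = check_rating(detections, "PEGI 12", (1820, 60), 1.0)
print("compliant" if ok else "flag for manual review")
```

This mirrors the workflow described: the automated pass replaces a human watching every video, and only mismatches get escalated.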
So, I hope that wasn't too fast or too much information, but this is, in a nutshell, what we are doing right now in EA localization in terms of data. We have a lot of work ahead of us, but we're really excited, and I'm totally open for questions. Thank you so much, Mario. That wasn't too fast, but it was a lot. I know. It was fantastic. Well, I believe you're a football fan, obviously: you started talking about FIFA and you almost finished talking about FIFA. There's a pattern there, isn't there? Yes, totally. You're totally transparent. Thank you for telling us about the cultural change you had to drive to get people excited about data, and you mentioned the importance of the team. You actually mentioned Noelia several times, so hey, Noelia, she's now known worldwide. Noelia, and the importance of having a strong team: you needed the help and you found the strong team. It actually surprises me that you have to evangelize about data. You mentioned all the things you're doing, and I wonder: data scientists, data engineers, data analysts, there are plenty of jobs with fantastic salaries, so I'm surprised you actually have to get people excited about data. You mentioned the loc data school and everything you're doing, and the fantastic app you have for people to find out their level of knowledge and get suggestions. People are asking about the next exciting talks and openings at Electronic Arts. If you need any extra help with all that amount of work, tell us: what are you looking for, and what are the next talks that you might be sharing through the loc data school, or that we can join? Yeah, totally.
So, one of the things we're doing now, for example, is talks about our infrastructure and all of the work in machine learning, and there's a lot of material about visualization and storytelling specifically, because it's one of the things most of the departments within EA localization were really asking for: they have tons of graphics and tons of data, but they don't really know how to present it and how to tell the story around it. So this is something we're also helping them with. Sorry, Mario, but are you talking about the storytelling within the games, for example, or the storytelling to understand and interpret the data? Are they two different things? Yeah, it's the storytelling about the data itself. How do you tell the story behind the data? Because we've all seen so many PowerPoints with graphs that nobody understands, and this is about that: it's not only about retrieving the data, it's about visualizing it in a really nice way and telling the story about it. Well, they say that the power of data is not the data itself but what it means, and just gathering data by itself is of no use. So I guess that's what you call storytelling, which I thought was just interpreting the data, but of course it's knowing the story behind it and being able to sell it, to offer it. Excellent. So when is that happening, Mario? One of the things we are doing is looking at forums like this one and having people from our teams presenting there. The data school, unfortunately, is internal to EA for now, but some of the talks we have internally are going to be really similar to the ones we're going to do externally. So we'll keep you posted. Okay, and for people who want to know about the openings, do we have to check your website, or will you keep us posted through social media, etc.?
Exactly, so we have jobs at EA.com, I think, and there you will find openings mostly based in Madrid, but now, with COVID, you never know. We have people all around the world; within our team in Madrid right now we have people in Salamanca, Valladolid and Valencia, so it's all good, we can have people from anywhere. Excellent news. I think people have seen your cool office and they're saying, I want one like that, so you're going to get a lot of requests. Just one last thing before we go, Mario, because you mentioned Ada Lovelace: you named the EA loc data hub Ada, which I really like. I don't know if you know that she has a special day, Ada Lovelace Day, which is the second Tuesday of every October. So I expect Electronic Arts to do something special for that second Tuesday of October next year, obviously, because this year's has already passed, regarding your EA loc data hub, Ada. Please tell us that you're going to come up with something to honour and celebrate the achievements of women in STEM careers. I'm putting you on the spot there. We're really focused on that, honestly, so yes, yes, yes. I didn't know the date, but now I have it written down, so we'll make a big thing of it. Noelia is writing it down on the agenda. Noelia, second Tuesday of October: we expect something special from Electronic Arts. There you go, some homework for you. You weren't expecting that, were you, Mario? No, I didn't expect that. Sorry about that, but it was lovely having you. We've run out of time, so we want to say goodbye. Hopefully we'll see you around, because we're also in Madrid. Thank you for this fantastic talk, congratulations on all your achievements, good luck with all the work, and keep us posted, yes, please keep us posted, okay? We will. A big kiss to Electronic Arts and Mario, and for you, the audience, don't go, because in a few minutes we'll be back with our third speaker here at the attic.