funding, talent acquisition, customer acquisition. So we're trying to help companies start and scale, and we're focusing on things that are more tech-intensive. We're trying not to do things that are simply app-based; we're trying to do things that have real economic value and that hopefully somebody would like to buy for a big price. So that's what we're doing with SGInnovate. We've got a lot of different investors. We have one shareholder, the Ministry of Finance, but we're set up as a private limited company, which means we're not a government agency. We are this hybrid. What we're trying to do is experiment with how we can support entrepreneurs while still being connected. So part of what we're doing is a little inside-outside experiment. That's why we had to give Tharman here a couple of weeks to help us get off the ground. What I'd like to do is just start by saying thanks for being here, and thank you to the team for helping pull it all together. If we can help you build a startup that you think has real economic value, we'd like to be part of it. That's it for the promo. That's the only price that I ask in order to give away my space, so thank you for joining us.
So thanks to SGInnovate for letting us use their venue, and for the free food and, of course, the drinks at the back. So now, on to the main show. We're going to get John to talk.
Actually, you probably don't want to hear me talk. So yeah, you see Eugene is here introducing, doing what I normally do. If you haven't noticed, there hasn't been a lot of Data Science SG or Big Data SG lately; I've been very busy. So, Eugene and team, where are your hands? They're from the Data Science SG group, they've been instrumental in growing it, and now they're helping bring more life back into Big Data SG. So thank you guys. Big round of applause.
So, I'm going to be doing a lot of reading off the meetup invite here. We're going to be talking about driving the value of data for the organization. We have three panelists tonight. Would you guys come on up here? Are you here? So, first off, I'm going to mangle Julie's name. I'm from Chicago; I should know Polish names. But she's the Executive Director of MSD Global Innovation, also called Merck. Our second guest here is Manga John, right? He's Director of the Data Science Division; we'll talk about that. And if you've been to Big Data SG before, you probably recognize the gentleman at the end, although he's getting slimmer by the day: Castle Winter, who's the Global Chief Architect, who's the Intercharter of Data Science. For how long? For how long more?
I'm curious to go around and see who's in the audience and what you're here for. So how many people here are working in data as their full-time job? Okay, all right. How many people want to be working in data as a full-time job? You don't know. How many people consider themselves data scientists? Okay, data engineers? Okay. Business people with a keen awareness of data and its value? Don't be shy, don't be shy. So it's good, we've got a very good mix of different people. Right, so do you guys have anything to add? Anything you want to say about yourselves? That's great. Good job. All right, so I've got a list of questions that have been submitted by members of the group or random people on the internet. But they seem to be good questions, so I'll use them. And we're just going to have a conversation, right? If you guys have questions, we'll answer some questions.
All the new questions, which I think will be included into the library. So, let's start with a question. In your opinion, what's the biggest challenge of driving the value of data science? Is it hiring the right people, cultivating a data-driven culture, persuading stakeholders to use your solutions? Or what? Let's start with Julie.
Yes. What was the third option?
Hiring the right people, cultivating a data-driven culture, persuading stakeholders to use your solutions.
I think probably the data-driven culture. So, just to step back: MSD, right? It's a pharmaceutical company. And we've always used data in R&D, right? So, I would say R&D has a very strong data culture. It's sort of the nature of R&D, right?
I mean, I know a ridiculous number of data scientists in engineering and research; they just have a different title, not data scientist.
Yeah, exactly. That's a big topic of discussion in our R&D organizations: why do data scientists get paid more than statisticians, right? But we can have that holy war discussion at some other time. So, I feel like we've got a good base of people who can do data, but we need to move that across into our manufacturing areas and our sales and marketing areas. And that's where I need people who can help drive that data-driven mindset in manufacturing and sales and marketing.
Um, then just to... If you present a multiple-choice question, I'm probably going to pick a bit of a different thing. So, my data science team and the R&D team work with a range of different agencies. So, an agency has a certain issue in transportation, health care or social services, and we try to use data science to solve those problems. And from what I see, I mean, all those things are important, but the starting point is really what question you're trying to solve.
So, I always go to some kind of director or senior person and ask them, what's your problem statement? What's your hypothesis? Okay, tell me. If your problem statement is "my CEO wants me to do more analytics", that's not a problem statement. You're going to have to give me a bit more to work with. And, um...
"We've got to have data science on the roadmap by Q4."
Right. Right now. Right, right. Yeah. So, it's really thinking about the nature of an organization. What does success have to look like to them? How do I translate that into a metric, into something that's measurable? And then, do you have the right data, and what kind of methods make sense to do that? So, for example, if you're working with a strategy or operations team, you don't necessarily want to jump right into an algorithm. You just want to give them some sense of the state of the operations: descriptive statistics and lots of visualizations. And then, really trying to investigate for them, what's the return on investment? In government agencies, it's not a profit motivation, but how many people can you help with a given sum of money in terms of social services? How much manpower do you save? And it's really figuring out, is it a hundred-thousand-dollar problem? Is it a billion-dollar problem? Will it fundamentally change the way we do social services, or not? And I think figuring out the right problem, the strategy, and the return on the actual data that exists is really important. And without that, we're not going to hire the right people. We're not going to drive data into the culture. And all the other things are kind of a moot problem.
That's fine. Yeah, I mean, listening to the panelists, I think there are more important questions. So, the question, in the bank, I mean, I almost don't get it.
Because, really, if you operate a professional business, right, there are certain things you need, and one of them is data science. I mean, do you argue about having a programmer to develop a mobile app, or do you argue about having a CEO or a CFO or something? I think that's a risky question to have. The way we have looked at it, and it's not just me, and just as a matter of sharing, is this: we have established data science groups across different domains, so that they're specific to a particular industry or a particular part of the bank. We serve a lot of people. And then we created a data science lifecycle, which is about how to discover data, how to find insight in data, create better data models, better models, and help to implement these models into applications which provide insight and drive the right decisions at the application level. I want to make sure my application is able to make a decision on what a customer should be offered, what a customer should see, and these things take some sophistication, right? And it's not enough to give that tour, this is how this will develop into a new data science group, right? And again, think about what I've said, now that's a, again, it's some value that always comes from, I have to talk to my boss on the internet. Now I think there's an example in any industry. I mean, you're going to get business cases there, or, you know, if you need a consultant for that, you can look at it as well. But I think that it's truly beyond the point that with our data, in this day and age, you're just, I mean, you're about three years old as many as you need. So sorry, to be honest, I'm happy to be more specific if, you know, everybody wants to talk about this. Happy to talk about it.
Yeah, I think data is both a true measure of the advantages right now, and how you can handle it. I have a couple of points on this. Because I want to go back to this question, you know, what's most important: hiring the right people, cultivating the data-driven culture, persuading stakeholders to use your solutions? I mean, it's a little bit of a chicken-and-egg problem, right? You've got the three things, and you kind of need to have all three, right? You have to have the people that can deliver the results, you have to have stakeholders that accept that, and eventually you have to develop a data-driven culture that encourages this. But I think the main thing, one second, calm down, cowboy. I think the main thing is you really need a champion inside your organization. I'd say that it has to be somebody who will champion the team and get it started and allow you something right now, because it doesn't have all kinds of things that you want to do. Here, for the same person as well, you can possibly appear to be producing the value to meet that change that needs to be.
Should we repeat the question? What's the question? So the question is about whether there's a lack of trust: a lack of trust from stakeholders to hire the right people and to do data science right. Go ahead.
I think generally what we found working with government agencies is that data scientists can be very threatening to incumbents in the organization. It makes people very uncomfortable, like, who are these data engineers who come in here and try to tell me things that I've known for years? So, I think that's the first thing that you need to realize. I think it's the transparency that comes with good data. I've presented scatter plots and visualizations of data on transportation.
And sometimes the senior guys are like, wow, you mean it was so bad? That's because you've always seen averages. You never see the outliers, because they've never been presented to you in the reports that have gone up. They've always been presented information in a way where they haven't seen the distribution of the data before. So it can be pretty serious. So I think the first thing is to realize that data science can be really threatening. So I think one of the first things we do is we say, you know, how can we help people scope a project that helps people in operations? So you start with something that brings immediate benefits to the operations. Like, how do I help you save time or save money? Maybe doing something that's great for strategy and senior officers doesn't have immediate benefits. Then maybe it's going to be rejected right now.
So that's, I sympathize with that statement and it's very true and that's where they can engage with model and have different types of data like this data I don't have had to do this way.
Yeah, I think this is the key, right? I mean, you need trust in any work environment to get anything done, but particularly in this space because it could be so transformative, right? And I have some simple examples where my predecessor, who's no longer with us, took the approach of doing almost secret data science projects, right? And then coming and saying, I think I can reduce the sales force by 50 people and save $10 million, right? And of course the group that she brought that to was not necessarily receptive, because they hadn't been part of the process, right? Whereas now we're kind of turning it around and saying, hey, we know you need to reduce costs, maybe you need to reduce the size of your sales force. Let's work together and figure out how that might be possible and how we can do it together, rather than kind of surprising somebody with your analysis.
Could be exactly the same analysis, right? But you don't want to surprise anybody with it.
Yeah, and it's important to add on to things. One of the things we did not have in the program, but I'm interested in the program that we didn't have. And it's good to see that people are already on there. I think we also started different ASS systems. You know, kind of a data-based strategy, and more in the tools and the architecture, to create a framework which enables data science in a much better way. So if you didn't have that, how would you do that? And what would you do? Do you want a database to find some data? I mean, I tell you, I'm going to go for a quick win. That's going to be like two years later, right? It's a big organization. So I would say it is important in all of these initiatives to show value. I think that definitely helps a lot, right? And, you know, for data science, if you look at data discovery, getting to data discovery was in the beginning a bit difficult because people didn't believe that you could find something just by looking at the data, that you could find gold in it, right? Just searching the data, how would you do that? You kind of think you know it yourself. And clearly, obviously, you don't, right? There are always surprises out there. At the moment, surely, the motivation to do that is stronger than before, but again, some people are saying, well, you know, this is better, right? So I would say, if you start it up, skunkworks might not be the thing; just have a proof point seen quickly, which leads again to the side story that you need to have a lifecycle which is aligned with different organizations and teams. It's not only data science.
Equally also, I think, there are enough things that I mean, I really see the process starting.
Okay, there's something you said that I really like: the skunkworks. I used that today, actually; in a management meeting I mentioned skunkworks. So if you don't know what skunkworks is, it's like stealth, right? You try to do it under the radar so nobody knows what's going on. And that's a bit strange. So what we did, initially, was a combination of both, with low-hanging fruit, right? So every company's got a really easy data science problem that's got high impact and high visibility. And you can put one person on that and get something done in six months. And then, you know, in those six months you've got three people working on really important long-term stuff. You know, just sneak it in, right? You have to do something good so you can get that.
I mean, just a caution on that, because the example I gave, right, is that you don't want to be too secretive and too skunkworks-y, such that you surprise people with what you have and then get pushback.
Right, right. I definitely agree you can do things low-profile, but not so low-profile that you surprise people.
So maybe I just wanted to add a little anecdote, a story that we encountered. So a couple of my team members recently were called in on the Circle Line. There was an issue with the Circle Line for about three months where nobody could find out what the cause was, and basically it was a nice story in that we were able to figure out that there was actually a single train causing these disruptions. So, sales pitch: you can go to our blog to look at the code and the visualizations.
But essentially there were a bunch of experts looking at the problem, and a bunch of data scientists came in and basically did a bunch of visualizations and looked at the data, and we were able to isolate the problem to a single train. Now what lessons did we get from this? I think the first thing was it was actually pretty intimidating for the data scientists to go in and tell train engineers and experts from three different companies, some of which are MNCs, that their initial hypothesis was wrong. The challenge was that the working team had gone down a line of inquiry which in the end turned out to be false, and the data scientists were looking at the issue with a fresh perspective. So, to cut a long story short, the story was this: the faults were happening almost randomly across the island. It was actually this rogue train; as it was passing along the track it was, in effect, hitting trains on the other side of the track. Initially the hypothesis was that this was not possible from an electrical signal standpoint, that there were no faults, but the data scientists had a fresh perspective, and so they really had to challenge the experts. Now how were they able to do that?
One key ingredient was the fact that we had a, well, it was a big committee with 30 or so people, which was pretty scary. The first thing was we had a champion: we had somebody in the ministry who felt that we should be looking at this. He wasn't a data scientist, but he was an advocate who kind of brought us to the table and said to the powers that be, we need you guys to give us access. The second thing that I think was really compelling is graphics: you need to keep it easy at first and visualise the data. Look at the data and present that. There is nothing like a good data visualisation to get the point across to these individuals, because data is very scary to most people. When you show them an equation and their eyes glaze over, you can see that they don't get it. So, compelling visualisations; do this first, before you go into models. Sometimes you get too stuck in.
I would say, you know, storytelling, that's a different topic, but what I like to do is actually sit down with somebody who's, I try to think, like the least data-savvy person, and sit down and explain the problem, right? And if you can't explain it to them, you're probably not going to do a good job going to the management. So, you know, it's like trimming it down to the essence and speaking about it as a business problem, right? And if you go in with clustering, showing the seven clusters and here's the raw data that we derived that from, boom, you've lost the CEO. You know, "get out of here". If you're a data scientist you might be interested in that. But for this question I would say it's kind of like, you know, when you start doing data science, it's like, what's that one, I can't remember, the five stages of grief, right? It's like denial, anger, resentment, negotiation, and then acceptance, right? That's what you go through starting to do data science,
right? You know, people are in denial: "we don't need that". Anger: "we're doing that already, we can do that". You know, then it progresses, and really you see shouldn't be organizational and hopefully you don't die.
Okay, here's a good question. This is one of my favorites that I wanted to ask if I had the opportunity. So, lots of data analysis done these days involves retrospective data that's already generated. Sometimes you need to carry out experiments or acquire new data. So, to the panel: how do you approach the assessment of data completeness and quality?
Thank you very much for the question. Do I understand the question correctly? From my perspective: there's the data you have; is the data you have what you need; and what do you do? Well, if I turn it around, for me it's more a problem of the quality of data, right? I think there is a way for me to understand whether it's comprehensive. What I don't always necessarily understand is whether the quality is appropriate, or whether it's really high quality, because through the stream of data as we pick it up, I can say for internal data I can know what is complete, fine. For external data, the comprehensiveness is a bit more difficult to understand because you don't know what you don't know. That's been the problem, so you need to get back to the source. And again, I run the longest performance in the world technically. And then there's the last bit, which is the quality of data, and that could be comprehensiveness, as in some systems they just combine the data which are useful at a certain time or in a certain way. If you don't know that, then the data again is not used effectively. I mean, if I look at anything in presentation, it is the quality and creating the right, useful initial data that's really significant in that context. And so there are definitely kinds of methods
to use, methods saying, hey, can we reconcile the data, can we buy everybody the data? And some data is less important; that context is enough for us. And in other data there's more, basically for transactions, where you need data immediately: is it really the same data as in the two or five systems? So I think there can be a check sometimes against the original system to make sure that it's the right data. And it's not necessarily only a technical concern; some of it can only be validated based on this, which makes that extremely complicated, as I said. So I think other organizations probably have their own to use because there's more information. But even there I've seen startups, I'm not putting anyone down, but I have seen a number of startups who have difficulty with data quite significantly. And then of course, all the data, how do the data start?
So in a pharmaceutical company, right, we've collected data in a very structured way for many years, for a clinical trial for example. So you define up front what you want, and as scientists, people always want to collect just one more piece of data, right? "You know what we could do with that information?" And in our company in particular, it became even hard to get people to participate in our trials because we were asking for so much data, because we have this scientific curiosity about it. And then we started to realize there's a cost to collecting each piece of data, especially if you want it to be quality, right? As you said, you've got to define your rules, you've got to clean it up, you've got to do all that, right? Because we can get data more directly from people just creating data as part of their normal life, without thinking that I have to ask them to create it specifically for us, I'm hoping that the price of data drops, and that will have an effect on everyone, right? Right. But I don't have to ask you how much alcohol you drink each day. I'll have a little
implant that clearly tells me, and it's probably more accurate than what you would tell me.
Yeah, can I add just two points to that, about data quality and the low cost of data? I think one challenge is that agencies come to us all the time asking, should I just collect more data? And sometimes there are costs to this, and it's hard to say. So, an example: the ticketing system here was built for payments and transactions, because you have SBS and SMRT and you need to split the payments; that's why they designed the system. It turns out that you can use that data for lots of transport planning, and the research universities have worked with it like that. So you can repurpose data. But I think my advice is always: don't try to get all the data. See if you can make some headway with the users on the existing data that you have. Show them something. People tend to say, I've done phase one and I really can't do it, but you might be surprised, because if you can show them something with the existing data that you have, and say, you know what, wouldn't it be cool if I could have more data and do this, they will tell you surprising heuristics. So I would say just work with what you have.
You have to be careful not to start a project that just goes on. But sometimes just giving it a chance, if you are trying to do it properly and quickly at the beginning, is probably very smart. You think about time and development, you would say, hey, let's do this later, and then you look at them and probably make something out of it. However, what I would say is that all data can potentially be interesting. Let's say, this is a stupid example, what type of banknotes I gave you at the ATM, 50s or 10s. How would I know that it's not interesting? Some of the banknotes which carry the data when you go around and you know junkie
this is kind of, so why would that not be interesting at some point, maybe for a security problem? I don't know. So I would say that I do like the idea that you can consolidate, even through data visualization, or through conflicting data in real time, and I think that's very important. Thank you.
Just a couple of things on that. When it comes to data, I err on the side of collecting more data. Now, the cost of collecting and storing data is actually quite low; if you do it in the right way, it's not expensive. But, you know, analyze the data you have. That's where data engineering comes in. So, going on to some of these questions, I'm just going to answer some of these myself because there's only one way to answer. Is there a shortage of data scientists to make sense of the data? Okay, this is for you. Okay, thank you. So I have these two questions: what qualities do you look for in data scientists, and which non-technical skills do you find the most lacking in data scientists? So, what qualities do you look for in data scientists, and what kind of interview questions do you use to assess those qualities?
I just look for curiosity. I just want people who have a natural curiosity and who are willing to follow that to wherever it leads.
I mean, when you meet a person, right, can you tell if they have a sense of curiosity or not? There was a very famous Supreme Court case in America a long time ago. It was about pornography, and the Chief Justice, they said, how do you define pornography? He says, well, I can't define pornography, but I know it when I see it.
So not everything has to be a data-driven question. You don't have to quantify it; you don't have to go to three decimal points. But that question is something that a curious person would ask. So we give them a dataset and ask some questions of it, and you're right that some of it is not
quantifiable, but I think as you watch people go through questions, you can have a different discussion, and when you do that the right way you can assess some of that. And also you can look at their side projects.
I want to say, though, that I don't know what the details are, but essentially it's an advocate, a scientist, a designer, and any of you probably have your own definition. And I think the problem is sometimes people expect one person has to be good at everything, and I think it's really important to figure out what this person is good at, kind of their strong point. So I just get very annoyed when people think of it as one thing; it's many different disciplines.
So there is a need to be interested in some technologies. I don't expect data scientists to be diartic or just so, but have some understanding, and be willing to try out different ways to do it. And also, domain understanding is expected. Being a generic, industry-agnostic data scientist, I think, is not going to work well. And be willing to understand some optimizations of the top four of the hard things of business roles and what is the data communication accordingly. I think some of this also requires a little bit of system understanding, right? So we'll be a little bit there on the end.
Yeah, I've got to add my opinion to everything; I can't help it. But these two, I think you're right about the curiosity, but passion, you know, is the combination. "Well, you know, I'm very curious about that." So if you're not doing anything about it, then it's useless. So, kind of a passion.
You know, people that are passionate about it, they do things that go a little beyond. God bless them, they go home, they do data science problems. Sure, there are table stakes that people have to have. You cannot take a plumber with no high school education and, because he's passionate about data science, make him a data scientist. There are certain table stakes, but you can take someone with a minimal skill set and a lot of passion, and they will make something of themselves. You can't stop them, even if you're trying to stop them.
So here's the question I was trying to find. Yes.
It's really about the job description. So something that I get confused with: do you think all of this is one person, or what is a good kind of job description?
So they're often written by HR. I had this recruiter contact me, and I responded; I generally don't respond to recruiters. This person had a laundry list. It's like, you need six years of Scala, you know, R, you know, Python, you need to know networking, data engineering, business savvy, you know, all this stuff. And I just wrote back and said, what you've described is a team of eight people. A real team of eight people. If you had one person with all those skills, you could never utilize all those skills. So, you know, you've got to separate the job description that's written by somebody from the real thing. You know, if I wrote a job description, and somebody came to me and said, well, you know, I don't have all those qualities, but I used this API and built something that would predict, you know, when your grandma would be gone, and it works, I would listen to them. But, you know, I mean, it's like, okay. Great. Show me, if you know what I mean.
You can't put out a job description that just says curiosity. Right? You need a little more meat around it. You gotta keep some of the riff around for a while. I wonder what happens if you keep them, right? That'd be fun. Yeah. Yeah. Do you think anyone would apply? I wonder who that is. Should we try this for you? You have to. You should introduce yourself because you're the chief officer. All right. Let's move on. Okay. Well, all of these, you wouldn't want to be a scientist. What non-technical skills do you find most important in a data scientist? How successful have you been at training data scientists in these skills? That's the easiest one to do. Why don't I start with that? No, I'm not going to. Which one? What I see is, according to high-key studies, a lot of people, don't understand these things. I find that very, very difficult. If you don't understand how some of the work and customer studies are working and why and if you find it to happen and what the difference is, it is very difficult to do a decent job. I think you don't need to go to that one, this is not negative, but domain skills, if you really don't have it, are maybe severe skills. So that would be my first thing. And we need the detail. Try out the products, understand what's happening, and then come from a place where you understand the environment. That is going to be the first thing. I agree. But let me add one more. It's kind of what you came back to, the storytelling. Once you've nailed the domain, how can you tell that story? And how can you make it real? And how can you keep it moving forward? So I think one thing that we found is that data science is inherently multidisciplinary. You have people with, like I said, many specialties and many stuff. 
And so I don't know if you can teach this, but I think the most important knowledge and skill is almost that willingness to learn from people from very different perspectives and backgrounds. So if you're 15-year-old colleagues, are you willing to at least engage in a few designs if you want to understand the background? And are you willing to engage in design? And with something, you visualize it in a simple way, but you may not be technically absolutely combat. Are you willing to work in those types of sorts of things? And to me, that's kind of what you're doing, what you're doing. Because to make it healthy, on this point, you're going to have to work with teachers to know quite understand what they're doing. And have willingness to learn, as well as to respect and appreciate the role for their students. Yeah, and there's a role for everybody, right? Sometimes you have these geniuses that can't interact with all their communities. You put them in a dark room, you turn on them, they do great things, right? You just have to have the right care for them. But they do the physical versus physical job. But yeah, storytelling is really important. And to me, they have something like that, that helps the way you do things inside of them. Before the others, I would probably agree that it's convincing statement. The thing is, it's also about looking at the different data, what data shows you, and then bringing it into the context, and more just to the context of what happens. It's been what we said about visualization before. You said that in the last year. And I think, for me, this is a story that you can continue to show them as pledges and enable it to date as obvious, you know? And nothing's been done. And I know we've been going down so well, but if you can put that into eyes, we're sure it looks, you know, in the download era that they really do the trick as well. And I would say you need to learn how to tell the failure stories as well. 
Because you've got to learn from the failures, and so I think we hesitate about telling the failure stories. Well, we should tell them, right? So that's a whole other subfield of who can tell the failure story. And I think setting expectations is hard as a whole. A lot of people come from an IT background, and they treat data science like building an IT system: you just deliver it, and they say, oh, what's the accuracy? Like, what's the F1 of the model? And I can't tell them the right amount of data. So we try to phase it over time. And say, here's the first phase, and I'll tell you what I can get in these two states. I'll tell you what we usually move to the top of the bottom, but I think a lot of people, again, from an IT background, don't really understand that it is open at the top level. You can't tell it by that. You need to give it some balance, and some reasonable expectations of what you're talking about. But that's, you know, you're not going to be able to answer to that, but I think in the second phase. I think that's why data science spawned in science. Because scientists are used to taking that experimental, hypothesis-driven, iterative approach. And that's where I see our challenges: moving data science out of science and into less scientific areas. Yeah, I mean, I think when you're going to present some results, you know, ask: what's the headline, right? If you're going to give some exciting talk, what would the headline be? There's a story on, you know, why the headline matters, right? So the teachers from District 128, they all went out to a teachers' conference. So they got an award for having the most group of students over the last two years. And on the way back, let's go down. What's the headline? No school tomorrow, right? I mean, you can pick the angle, let's say, that has the most impact on the audience. I have a lot of them. What's that? What was it? I don't know. Is there any questions? 
My, my, my, I would say is, I would say yes. So for, so for, I think, I mean, sometimes companies who are not using data science or worrying about data or, and if you think about the application of smarter words, smarter applications, that's very difficult. I think you, you definitely are having, having, you know, becoming independent, right? And I can, I can see that also my industry now, the banks are not, which are not investing. I mean, they're very, very, very specific use cases to the rest of the city from the calculations and stuff. Prevative models and so on. If you can't do that, then maybe it's going to, you know, maybe it's going to start, start some new shit. And, and so if you're not investing, it's, it's, now, now then what does it mean is, and can you be as good as, as some of the people know to do, right? Because they, they aren't, well, let's be honest here. I mean, all of the companies, could you honestly say that you are maybe as good as someone in the US who, who spent the most amount of money on the high? That's telling, that is extremely, I don't know, as far my friends are going to be there. You know, there's something in how the system, how the organization is working, wow, I mean, you just, you just, you know, help them. You can always work with it, but so I, I, I, I, I would agree with that then. If you even as an SME, if you're not able to do that, you, you have to change. And this is this party, so to set parties, and it all comes to be expensive to do that. This is the same thing, right? That, that you can put the data into one place. This is one extreme way to do it, right? You can start educating yourself, and, and learn about data science, and different technologies, and, and try them. It doesn't make you a Google, but it definitely makes it better than the next time. For the next business. So I, I, the set parties doesn't have to be expensive to, to become, become good at something. It takes attention. 
That reminds me about, you know, another global thing. So when you're running away from the bear, you don't have to be the fastest. You just don't want to be the slowest. Like, you know, there's a real round of things. Yeah, I don't like your question, because it's not just data science where you hear that double narrative, right? It's the concept of a digital company or a digital business model, right? And, you know, the newer companies, the Googles, they started out with a digital business model, which includes data, right? Digital gives you lots of cool data. And I think a lot of other companies, you know, we have a lot of legacy to pull with us, right? You know, my company's been around 125 years. We've always had a lot of data, but we're not a digitally native, digital data business model, right? And we have to decide how far we want to go down that path, and at what speed we want to do it, right? That's why I think all the companies need to make that choice. So what is their business model? Is it a digital data business model, or is it more of their current business model? All right, thank you. All right, so we want to talk a little bit about who's involved in this here. So, good question. How do you distinguish which developments in data science are of real value, and what's just hype? Or is it all just hype? Depends on your headline. I mean, the way we try and do it is don't believe the brochure where I've tried to get where you need it. Three years ago, people talked about Spark, something coming out of AMPLab at Berkeley, and we were like put together by one of these people I think about when we tried to, I mean, see if it worked. And there are lots of people who come and talk to us about the M-Land business model, and they're like, well, maybe it's just seven products. It was Spark, it was fire, it was a good experience, it was data-satisfying. Yeah, and I think that's just the only way to do that. 
And sometimes, you know, you're like, oh, you know, what are you gonna do with that? And what we're trying to do is, some projects that we do are really well-contact driven. Some projects I'm like, I'm looking for projects so that the guys can play around with it. And it's only my knowledge of projects, I can't do the set that we want to do that. But sometimes we just put it in a space that we never know what's gonna happen. But trying it, like, there's so much hype and so much people around there that it's definitely so good. I thought of the idea that there are certain things we're looking at first. So I fully subscribe to them, so it's actually the same, the same ones. I'm probably more of a guy who's crazy about the future and try some more, and you should, but I think it balances out what you could do. I think that's a good idea, I think that's a good idea. I would say, instead, I'm looking for, in terms of technology specifically, I'm looking more towards, because what we see is, and if you know, it's on the stage, in terms of innovation, that's a lot of problems. There may be one difference. There's not even so many. There are obviously also a lot of companies who provide products, but just in terms of, in terms of perspective. And I think it having, I'm not sure I would probably put a percentage to it, of what we're trying and how much we're trying. I think it depends a bit on what we see and we pick up technologies, look at that, try it out, as you said, maybe do a bit more projects and then just get it out. And I believe it's a cause of a good technology shock. I'm not saying that it wouldn't happen in the sciences, but I think in terms of knowledge and perspective, you also hire better people, you get better people, people are more excited because they're saying, hey, we need to do stuff there. They have a cool catch on trying things out, why won't we do it? I think there's a few dependencies there which make that kind of world a lot better. 
Then on the hype topic, you know what we were talking about earlier, that some people that you'll approach with the data science idea will be like, oh, I can do that, I'm doing it, it's nothing. So in some cases, the hype helps a little bit. I want a beer. The hype helps a little bit to open the doors, right? Because they see something cool in an article and then they'll approach us about that. But then you've got to set the expectations about whether that's possible or not. And certainly, because I run an IT organization and a data science organization, we always try to align our projects with 70% kind of focused on our core, 20% two to three years out, and then 10% five years out, right? And so we have flexibility to look across the whole spectrum of what's possible with data science. I would say, you know, when you're starting out, I would stick to low risk. And low risk, for me, is like: has somebody else done it, and have they been doing it for a while? Then it's possible, go for it. And as you get more sophisticated and understand technology more, your ability to evaluate a technology or a tool quickly and say, you know, whether that looks useful or not increases. So you can go more towards the cutting edge or towards the bleeding edge with less risk. Okay? All right. Let's, you know, I've been able to stretch this out. There's one question out a little bit more because again, we've got somebody that didn't ask the other question. Is natural language processing still in its infancy? What can we really do at an industrial level? Now, one of that, I mean, that also leads into the buzzword, deep learning. So, what are your thoughts on that? I hear an answer. What? I hear an answer. I'm gonna take me to listen to you all and see you soon. I mean, for me, like, you know, I think we use it quite a bit, right? It's very useful. That's what will establish technology or critical. Still, it doesn't do everything. 
It's still, it's difficult, but there are a lot of uses for it. When it comes to deep learning, I'm a little bit conservative, right? The process of those are good and now I'm seeing more and more people with that. And then there's more and more real-world results from it. So, yeah, I think we need to start on that point. So, generally, with natural language processing, one of the challenges about the company is how it's provided to the world. They've got lots of texts and lots of records and they've never been in the process. One of my colleagues used to do research at the university and it was an intersection of history and computer science. So, they found it, the historians who wrote the program found it really fascinating to use topic models. So, this is finding underlying clusters within the data and the structure. So, we tried some of the topic models, for those who use it, LDA, and talking about better things, whatever you call it. But, essentially, it allows any agency to track their data and it tells you, you know, the 600 light years. Now, how could this be used for that? And it's not the most sophisticated kind of a deep learning model. But, HDB, for example, so the Housing and Development Board gets loads of feedback and they have a taxonomy for the CRM systems. But, because this hasn't been updated over time, when you do the business analytics, about 70% of the feedback is categorized as "others". Which means you're only looking at data from 30% or something. So, we took about 60,000 pieces of user feedback, ran them through a topic model, and out came, you know, X number of topics. It turned out about 10% of the complaints were about key collection. Because, when you apply for an HDB, probably thousands of times, it's important, it's a fixed date, right? And most young people want an earlier date, so they need to reapply this. And most older people want a later date because they need to sell their house. 
So, younger people want an earlier date because they want to go out and try to stay on time, so that older people really realize it. So, when we were able to kind of do this topic modeling, we got an insight and they realized that it was 40% of the cases. So, as a result of that, they're now going to change their IT system so that you can figure it out. Really simple application. You would say, didn't they really know this already? At the ground level, they knew it, the operational level knew it, but management never knew it was 30% of the cases because 70% of them were "others". So, with this topic modeling, they were able to convince management that it is a big problem. So, I don't know if it's the most sophisticated use of natural language processing, but for a lot of government agencies, you know, with text that's unstructured, it turned out to be a really easy way to use NLP. So, we've built a simple web app and we've got a proper data, so it's quite a lot of possible data, it helps you process the data, that helps analysts understand it. It's not something where it's just pure machine, but using unsupervised techniques, you can use a machine to help analysts understand the data, because otherwise, the guy has 60,000 cases to go through by hand. So, this is a use case that you've seen that's pretty simple, but you can tell it's a lot of data. One thing I think, you know, for natural language processing, sometimes there are really good use cases for that. So, there are a lot of, you know, manual tasks, right, where somebody has to sit down and deal with processing an issue, a complaint, or you get products in and have to have them properly categorized, and this requires a lot of labor. 
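The feedback example above can be sketched in a few lines of Python. This is a hypothetical illustration, not the team's actual pipeline: the records, the category names, and the stopword list are all invented, and plain word counting stands in for the topic model (LDA) mentioned in the discussion. It only shows the two ideas from the story: measuring how much feedback sits in the "others" bucket, and surfacing the terms that dominate inside it.

```python
from collections import Counter

# Hypothetical feedback records: (assigned category, free text).
FEEDBACK = [
    ("others", "please change my key collection date to an earlier date"),
    ("others", "need a later key collection date because I am selling my house"),
    ("others", "key collection date is too late for us"),
    ("others", "lift in my block is noisy"),
    ("parking", "season parking renewal failed"),
    ("renovation", "renovation permit question"),
    ("others", "can I move my key collection date"),
    ("others", "corridor light not working"),
    ("others", "question about key collection appointment"),
    ("parking", "parking fine appeal"),
]

# Hypothetical stopword list, just enough for the toy data above.
STOPWORDS = {"a", "an", "the", "to", "my", "is", "for", "in", "i", "am",
             "can", "us", "too", "please", "because", "about", "not", "need"}


def share_of(category, records):
    """Fraction of records filed under `category`."""
    return sum(1 for cat, _ in records if cat == category) / len(records)


def top_terms(category, records, n=3):
    """Most frequent non-stopword terms, counting each term once per record."""
    counts = Counter()
    for cat, text in records:
        if cat == category:
            counts.update(set(text.lower().split()) - STOPWORDS)
    return [term for term, _ in counts.most_common(n)]


print(f"'others' share: {share_of('others', FEEDBACK):.0%}")
print("dominant terms in 'others':", top_terms("others", FEEDBACK))
```

In a real setting the word counting would be replaced by a proper topic model and an analyst would still review the output, which matches the point made above that the machine helps the analyst rather than replacing them.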
So, if we can use natural language processing, you can, say, narrow down the problem space. So, you know, take this one, product classification, right: you get a product, you have to put it in one of 4,000 different categories, you have to decide what category it should be in. Well, you know, natural language processing, in the science world, it's pretty, you know, it's able to come up with, you know, the top three, the highest probability, right? And so, just giving them the top three, you improve the productivity of the people that are doing this by 500%. So you don't have to get it exactly right, you just narrow down the problem space. You know, each one of the points, is it a key point, or is it a solution point, right? There are two problems, two different problems. Maybe to add, same as in the other thing, there's a contact-center use case: you get contacted, you get, you know, and it will justify somehow, maybe you want to link it to a chatbot, and find somehow, maybe you can do it. I think we have a new sketch, probably, it's in the image: we get all the documents, the images of the documents, which we have to process. Then usually not text-line, or a data-minder, you know, in the past, some of these can have, in fact, as a trend, which we're seeing now, is that these OCR vendors, they see me die, because we have measurement algorithm who are much better than OCR, and then criminalizing images in the OCR vendors. So you kind of see changes there, so you agree on all products and all. And this is also very interesting to us, right? And again, the other way of unformulating responses of any stores to staff, maybe a sense of data, which is not planned, I mean, what's going on. So I'd say, yes, a number of use cases in Washington and the New Jersey City. 
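The "top three" idea from the product classification example earlier in this answer can be sketched as follows. Everything here is an assumption for illustration: the category names and keyword sets are made up, and simple keyword overlap stands in for whatever trained classifier a real team would use. The point is only the shape of the output: a short ranked shortlist for a human to pick from, rather than a single automated decision.

```python
# Hypothetical category keyword sets; a real system would use a trained
# text classifier over thousands of categories.
CATEGORY_KEYWORDS = {
    "smartphones": {"phone", "smartphone", "android", "sim", "5g"},
    "phone cases": {"case", "cover", "phone", "silicone"},
    "laptops": {"laptop", "notebook", "ssd", "keyboard"},
    "headphones": {"headphones", "earbuds", "bluetooth", "wireless"},
    "chargers": {"charger", "usb", "cable", "wireless"},
}


def suggest_categories(title, k=3):
    """Return up to k categories ranked by keyword overlap with the title."""
    words = set(title.lower().split())
    scored = [(len(words & kws), name) for name, kws in CATEGORY_KEYWORDS.items()]
    # Sort by score descending, then by name so ties are deterministic.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    # Drop categories with no overlap at all.
    return [name for score, name in scored[:k] if score > 0]


print(suggest_categories("Silicone phone case cover for android smartphone"))
print(suggest_categories("wireless bluetooth earbuds with usb charger cable"))
```

The person doing the categorization then chooses from the shortlist, which is where the productivity gain described above would come from.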
Again, I think we have a team, which is kind of experimenting, that we use it in two areas. And I think, again, as we said before, to find out products, I think it's very important that you have the ability to look at these, look at the project's new frameworks, download them, and you don't pay a lot of them, and see how it can make sense of, that's a huge case to get, which makes sense. And that's what employees should be doing on their all-pollers. That's a lot of just, yeah, I know a lot of those others. Right, and particular in medical terminology, and everything, it's quite vast, but very valuable if we can get some of the cases done. Okay. All right, now, here's the question all of you are getting here for, right? How do you find value in data? Is it more working towards a specific goal, or doing what you do in hoping something awesome will appear on the ball of data? The scouts. I think we're doing it all. Actually, somebody added that at the end of the previous question. I can't resist it. Thank you. Well, let's just begin, and then I have a really boring, predefined set of problems, or I can do something awesome. Do you know where you're going, or do you discover something and follow where it leads you? Well, both, right? I mean, I think you have to have some direction of where you wanna go, but you have to keep your mind open on the journey for anything else that might catch your eye. It comes back to the curiosity, right? If you just stick to the path of exactly what you were gonna do when you don't look broader, I think you could miss that. I saw a lot of one two-cases scenario. I'd say a few cases where you're actually not having a direction. Just actually trying to have the data, understanding what's going on, having another, of course, you need to actually, you have to have the time to sort of get this equally. But the leaf is that we know what's in the data, so we're pretty sure, right? 
So, I think the balance, I think that's what we're saying all of it. It's kind of a balance question. If you're only there to do, you know, kind of a research type of project, of course, nothing there for the model. So, you need to spend the time wisely. But I would not say that only giving direction and only being checking on the coordinates would catch you in a good way. You need to have the time to do it. I look very much at the school and you need a kind of hypothesis of direction, but you need to understand the broader objective. So, they may ask you a specific question, but actually, underlying it can be something that is broader. So, you have to do a specific piece of analysis where you need to understand what the underlying information is. And then as you explore, you can kind of end the production. One of the things we spend a lot of time on is really what I call parsing the question from the business users. The business user would say, for example, they are from the Land Transport Authority and they say, oh, I want to calculate the utilization rate of buses. Now, why? So, what's the broader question? The broader question is, oh, I want to give people a better commuting experience. What does that mean, a better commuting experience? I'm looking at how crowded it is. You want buses to be on time and you want them to be, okay, what are the metrics around crowding and how well do you do? So, how do you measure the metrics specifically? You know, there are various metrics on that and okay, crowding means utilization, okay? Going down to utilization, what are you looking at, how do you average? And it's really forcing the user to think through kind of from a high-level objective of improving commuting experience into the current metrics, finding the right baselines and then parsing the problem down. 
And what I would say is, once you have a high-level question that you are able to bring down to the metrics, that's how we start the design problem. And because I understand the high-level question as well as the specific metrics, if I find something interesting, I can then kind of move it into a direction that might be useful. So they might not ask me to do that in a minute, but I know it depends on the functionality. If I find something interesting and the functionality data, I can go back to that. So, I really feel like it needs to be guided and I think it is now kind of an understanding of this. I don't think they're exclusive, right? So I look at our process, right? And we always, the goal is going for a goal, right? We're starting out a project with the goal. There's rarely a time when we just get down, and I'm just like, hey, guys, play with the data, have fun, you know, whatever. So we're going for a goal, but as we do that, we run into these things like, huh, that's strange. Okay, so usually it's like, you've got a funnel. You learn about this. There's a lot of impressions on a page, you know, like a catalog page, then a percentage of that will convert into clicks onto a product page, and a percentage of that will convert to purchases. But all of a sudden, you know, there's like no impressions, there's a few page views, and lots of purchases. How the hell would a person do that? You know, what's going on here? And put that aside, you know, maybe research a little bit, and then you call the stuff. So in the process of doing the data science, we also get these things going, huh, that's strange. We should look into that, and we put that over to a little pool, and then we start to, you know, sift through that. And we created tremendous value out of that, right? 
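The "huh, that's strange" check on the funnel described above can be written down as a simple consistency rule. This is a minimal sketch under stated assumptions: the field names, product names, and numbers are invented, and the rule (each stage of the funnel should be no larger than the stage before it) is just one plausible way to flag the kind of oddity the speaker mentions, not the team's actual process.

```python
from dataclasses import dataclass


@dataclass
class FunnelCounts:
    # Hypothetical per-product counts for one reporting period.
    product: str
    impressions: int
    page_views: int
    purchases: int


def funnel_anomalies(rows):
    """Return products whose counts grow instead of shrink along the funnel."""
    flagged = []
    for r in rows:
        if r.page_views > r.impressions or r.purchases > r.page_views:
            flagged.append(r.product)
    return flagged


ROWS = [
    FunnelCounts("phone-a", impressions=10_000, page_views=800, purchases=40),
    FunnelCounts("phone-b", impressions=0, page_views=5, purchases=120),
    FunnelCounts("case-c", impressions=3_000, page_views=150, purchases=9),
]

# Flagged items go into a side pool for later investigation.
print(funnel_anomalies(ROWS))
```

Flagged products are not necessarily errors; as described above, they are set aside and looked into later, which is where some of the unexpected value came from.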
You know, we didn't think it was, it wasn't what we were looking at, but once we looked at it, and started thinking about it, it's like, ah, that is really important. We should really act on that. So, you know, it's kind of on the goal, but you know, you can find something interesting, you know, don't just go away, keep it like that. Okay, so, let's see, let's see your great stories, your big stories, your big ones. What ways has data science effected decisions made in the organization, or has it? Well, I think I'm going to start with a couple of things I'm trying to come to. What is this specifically about? Specific enough that it's still general enough that we have the ability. We're really happy about what we were able to do with the kind of civilized experience that I just got. I did some quantification about the kind value of money and how many people were disrupted, and we did a lot of that. It was pretty gratifying to be able to help with this. I don't have to be stuck in trade, so I think we're really glad about that. We've done some other work on policy analysis. So, this is a ministry trying to change how about that policy development. And they came with certain benchmarks. So, they had a view that there was a certain group of people that were currently receiving these kinds of subsidies that were not deserved. So, again, the business users said, oh, that is not a very deserved group, and we should change the policy to a better group. Now, what you may or may not be observing is something you need to ask and define. And what we're going to use in the course of analysis was to actually show them that what we looked at was for one of the benefits. They were also correlated with medical episodes, and they were the kind of people who were kind of low-engined based on some data. And we actually were able to change the decision in response. 
So, in fact, rather than being more strict about these problems, we were able to actually change the policy to make it more generous. So, it was about a 180-degree turn from the original decision of the decision-makers, and the minister agreed. And we were able to change that by the model of showing them the link. And I kind of heard from kind of more specifics, but this is kind of a cut-off kind of success story. So, we were able to show them that the model that they proposed would cause damage and the other model would actually increase benefits. One of these things is it's not a simple case where there's an ROI, because it's a distribution crash, and there's no kind of... In some ways, it's a hard problem, because it's not kind of a right or wrong, and we cannot receive you from the right-hand side of people even if it's something that... I think what was really complicated was that it has a fundamental different way of working in the world, and by using data and analysis we're able to change that. So, can we maybe expand on that a little bit? Because you mentioned something that's like measuring ROI, right? And the ROI is usually what you put in to get that out. But not everything is always, you know, easily quantified even if my answer is equated. Right? So, in one way, there's ROI, there's fun. Could you answer that? Actually, you don't have to answer that. You would put back all of it. A common picture of the world: so a lot of agencies and companies, they don't have a God's-eye view of the way it works. So I used to work as a bus regulator at the Land Transport Authority. I tried one week after the first 17,000 people, and we were in charge of adding capacity to buses. And I had no view of which bus stop, there are 4,000 bus stops, 40,000 bus trips a day. I had no idea which was the most crowded bus. And after about six months of life, I just really had this Tableau visualization with dots. 
I had to slice my time, and so I had a complete view of which were the most crowded buses in the network. And it's just that view, the common language to discuss, so we could debate about what the causes were, whether it was bus operators' training or having certain spikes in the data. But that common view in order to have discussions with stakeholders was incredibly powerful. I mean, it also helped my decision-making because every week it was a simple decision to write them and use it as I. And then the member of parliament says, oh, my bus, my district is really crowded. He said, no. Let's look at the ranking. Your colleague is suffering even worse. It's a new way. It's completely data driven because otherwise every good bus is gone. So I think having conversations with stakeholders are a power of visualization in the US. And that brings up another thing that is really popular in science. I've been trying to preach that at Lazada. Data is a great equalizer. Right? It's Jim Barksdale. It's a famous quote. It's like, if we have data, let's look at data. If all we have are opinions, let's go with mine. You see, no. But if you have data, then your point of view is valid, right? So if you can challenge somebody more senior, you can challenge them. That's a challenge. That might not go down well. I don't fucking do that. But if you can present the case and have data supporting it, you know, good ideas can bubble up that way. That's one of the most exciting things for me. It's good ideas supported by data. I'm more viral. Your recommendations are always convenient. It shouldn't be that easy. I can't think of a good example which I can share, unfortunately. So I'd say that first of all, we have many examples where ROI was directly impacted. Also, in kind of understanding what happens in our scales, there are many examples there. So I think it goes back to very early in the beginning, we talked a bit about the point of data science and using data as part of the analysis. 
I think in the beginning, I was tired of hearing about it. To be honest. And the one term which we're using is probably which is a differentiator and we talk about data democratization. So there are some limitations. So I would say run and then come on and on. What I would say is that the ability to access to use data across where that is reasonably possible is probably a differentiator and if you allow that to happen, also put things behind the line. Not keep it but other than that, there was a step there to say we make some of these insights and we make some of this data as fast as possible. I think that's a big differentiator. People didn't like that in the beginning and it's a bit like if you're a technical techie, it's like at Facebook, when you start working at Facebook and checking the first code, you're going to get reading out the item and it says there's a trick for a new year, right? And everybody enjoys it actually. And everybody gets better at this code and submit some code section and then this starts beating up the other ones. So this is a bit the same thing that we did in the previous years and we went on to the reading one when we had a new year. But once that happens, it kind of makes sense, right? So I think that's one of the interesting things. Yeah, one more interesting thing. So this is data versus data. So when you look at it was a category of pages, right? In terms of smart phones. You've got a ranking algorithm and it's the most popular one. There's a few other things and it's not just about popularity or revenue. We actually have a few of the key guys that have a version that are beneficial to the customer and us in the long run. It's a very short time I think. But we have something. So you do the algorithm and then subject matter experts and both in the different departments can take in the real range those results. For the fair subject matter expertise to whom to diversify. What do you think happens overall? 
The more they rearrange, the worse the conversion rate goes. Usually they have some distinctive reasons. They might say, "Oh, this is a new product, Apple is going to be hot, let's put it on top." It's got no sales history, and it works. But after about two or three tweaks it just dies. In the end, almost every group that touched the rankings made it worse. So these people that are experts, they don't want to do this. We have been tasked when the algorithm events their ranking. We said, okay, you know, everywhere, when somebody touches it, it fails. So what we did was cap it: whether they can change three, or it's ten in the category, you just want to limit the amount of damage they can do. If you limit the amount that they can change, they do make their best decisions, and the algorithm does the rest.
I'm trying to give you an example I can share as well. Maybe one is just sort of a broad concept of how data can help improve healthcare overall. One of the big dreams of healthcare is to have more individualized or precision medicine, so that you know that the drug you're going to take is really going to work for you. But in order for that to happen, you often need some sort of diagnostic test. Maybe you need to check if you have a certain gene expressed or something like that. One of our products is like that; it's a precision medicine. But we really had no insight into how often that diagnostic test was being done, who was doing it, and who the potential prescribers of the drug were. So that's an example where you can look at external data sets and see whether there is even a chance that our product will be used, because this hospital already has a practice of doing this diagnostic test. I think those kinds of combined data sets will be really interesting in healthcare in particular moving forward.
And I've got a corollary to this whole thing.
So my corollary is: never trust a person who doesn't believe in A/B tests. Right? We get a lot of pushback sometimes on A/B testing. And it turned out that the people who refused A/B testing were people who weren't confident in their ideas, and it felt like they were covering things up. So, you know, A/B testing is great. If people are really resistant to it, they're probably hiding something. Okay. Just to see if anybody, you know, wanted to confess something. Is there any test result that you fudged? Okay.
So we can kind of hear the questions and open up to a lot of questions from before. But this is one that I want to ask, because I want to know. I've got my opinions but, you know, we need data, right? How would you organize data scientists in your organization? For example, having a central pool of data scientists from which each department would draw, or distributing them across departments so they have better contextual knowledge to solve department-specific problems?
Well, I think when you're first starting out there are advantages to some form of centralization, just a community with lots of ideas. But I think once you reach a certain size and you've got the science for the domain, the thing to do is to start embedding data science within the various teams, the operational teams. The danger lies in starting with small groups of individual data scientists or, you know, small teams in the operational units. So I think there's always a trade-off. I think central teams also benefit from the variety of problems. A central team is serving all sorts of different problems, and a number of data scientists will go back to the team. It does mean that we need to work very closely with domain experts within the agency who are already in the business, and train them as well, so they can make new clients. But I think what we're actually hoping to see is to also have communities of data scientists within specific industries.
I wouldn't recommend it if they don't have data-oriented problems. Sometimes the worst thing to do is to put lots of data scientists where there are no interesting problems. So I'm more inclined to go with the central team first, and then to...
It's really by department. I think the interesting part there is, you know, we don't actually... How do you further supply? That's an interesting question. I think the other question which hangs onto it is: who should, I would say, data report to? Is there a central function? Is it by department? We have multiple data teams in some organizations, but in one part of it we have data reporting to the CEO, and there I can see that the ability to impose and to do cool things is significant. And I think that's interesting. As for the cases which don't have enough problems, I couldn't imagine who that would be. I think it's just that they may not have the organization; in general, they don't have data at a stage where they really could actually start to think about this. Just thinking.
So, we don't have problems. We've got opportunities for improvement.
But you're not the government.
Yeah. So, we've had an evolution of how data science and analytics have been placed in the organization. I think each domain area has always had analytics people who can report on what's happening now and maybe give a little glimpse into the future. Then we started to do more sophisticated data science in R&D, and kind of proved out that that would be helpful for the manufacturing and commercial organizations. And so then my organization was created, which is a central group at the enterprise level for the whole organization to tap into. We're really looking to partner with those domain experts who really know their data. They know their business. They know how to do their dashboarding and their analytics.
But because they've been so focused on that, they haven't kept up with the industry and with newer techniques and newer things that can be done. So I see the purpose of my enterprise-wide team as being the people who stay sharp. Because sometimes, when you're sucked into the domain, the day-to-day activities of that domain can really dominate every minute of your life. And so I'm trying to carve out time for a group to continuously bring in concepts and then push them out to those domain teams.
Yeah, I think the point with the centralized model is that being centralized actually has a lot to do. We're centralized, and I actually feel kind of isolated. We didn't have to rely on other infrastructure. We had our own construction. That allowed us to quickly iterate and do things, whereas as we grew and evolved, our needs and our dependencies on other people and our working relationships with other people grew, and we were this isolated little island. So this is something that's been on my mind quite a bit lately. And we moved in, so it was a tricky question, centralized or decentralized. That's the question we got, but I think there's a model that works, a hybrid, and that's where I wanted to go. This is what we're doing this week, this morning. We still have a central data science team that serves people on a project basis. Data scientists and data engineers belong to the data group, although they might be funded by different horizontal businesses. So who's hired, how they work, what the process is: those follow consistent standards. And they try to actually what we want the results, let's say, of execution and nothing to that is really good execution in my opinion. Congratulations.
We get to be able to talk about data and standards across organizations. If you don't have the same data and you don't have the same standards, people can come up with the most opposing answers from the same line of data. But also, if you say scientists can really choose what side they want to be on, or if they want to be inside of it, and if they don't think it's quite right, they really don't want to improve themselves. So who do you think? That's the most important.
The other thing to discuss, on my mind, was this: I think in a data organization it doesn't always have to be everything. There's a way to have competition, and I think we had that a bit in the beginning, over who is going to be better at defining the style of data or whatever. I think it's a great conversation in terms of sharing sessions, you know, and it was certainly successful. It was always important that you have professionals around and you know what they're doing.
That's really interesting to me. We had, well, between 90 to 70 people, and they tried a solution that wasn't a great solution, but they picked the right problem. They followed where they were, and we were able to follow. So they had kind of an okay solution that worked for a while, but we were able to follow on, and it turned out to be a great solution. So sometimes, you said innovation, some pieces around the organization, you know, they might not do it well, but they understand the problem and they've got a good idea how to solve it, and they need a way to follow up. This kind of policy is actually something that's great to know. Many of you have conversations in general. They're called elections, I don't know.
I mean, there are lots of agencies who set up their own data teams. In Delta, traditionally we provide IT support to like 15 different agencies, so obviously we want to come in and support them on things like security and data science. So in a way they need to have a choice, so there has to be a bit of competition;
there are some that need to have a better science team. But again, there's a whole debate about competition, and again, we don't want to complicate it, because it's funny. Yeah.
So the other thing is, you know, when you look at data science, there are business problems that you solve and there are technical problems that you solve, and sometimes a technical problem can actually be about more than one business solution. It's like anomaly detection, when we have people that are doing odd things in their behavior. We can apply that to a lot of different areas. So when people order, this is a fraud thing. When people are responding to marketing programs, they're wasting marketing dollars by abusing programs that are meant to bring new customers in: instead of using it once per customer, somebody uses it a hundred times. So those are things where some success can be had. I don't know, that's just good news.
And let's open the floor to questions.
So the question is, everybody debates whether data science should be part of the IT department or part of a business department. I don't know, anybody reporting to... Sorry, what was the question?
I think we have both, actually. We have got multiple dimensions. There are data scientists and data analysts who sit in the technology organization, and, as I said before, we have data teams in the different business units who also work with data scientists and hire data scientists. One of our best ones actually is part of the IT department. So I think we have an issue. I think that's not a bad thing, because people in technology, they are always being involved in technology. They don't have to, but they couldn't make it work otherwise. Everything can work, but then in the end in that process. So I think there is an argument that you also want data scientists to be embedded in the business, because we have a different sense of urgency there. We have a different sense of focusing on the problems which people talk about when they go for lunch or
when they come together. So I would say, depending on the size of the team you have, you know, I still see data science as a business one. I don't even see it...
Let me pitch the opposite. I mentioned that we do have a lot of analytics and data science people embedded in the business, but our enterprise-wide team is in IT, and I'm comfortable with it being there, at least for now, in that our IT infrastructure and operations need to be heavily influenced on how to support data science activities in the company. I've seen the data science groups in the business kind of suffer with that. They want to use R or Python, but it's not supported by the IT organization, and they just kind of churn, lost about what to do. So I'm very happy to have a centralized data science group in IT, so that we can make sure that IT understands its role in enabling this technology and function for future sustainability. But I can see that if there is some point in the future at which that is truly stabilized and is just a routine activity, then I agree, it's much more about the business objectives.
So, completely agree, it's a different function. I have seen examples where IT teams don't afford they're on the water, and so we have that. So I think the best thing is that it's useful for it to be reporting to the CEO's office.
This will be your last two. Okay, thank you.
So data is everything; the technology is secondary. I know it's about supporting the business; it's not about the technology. There are many places where data could be important, you know, for some kinds of work in the organization, and frankly, it is important that the person you report to is the champion for you. And, you know, it's the gospel: data is everything.
A couple of questions. What is it? It's a contemporary modern art piece. It's lines and dots. Okay, next. Can you repeat that? What is the specific problem you're trying to solve? Or do you have a set of problems, and your specific problem is
which one do you want to tackle first? Is that the question? No?
So maybe then we can try. Whenever we look at a particular problem that comes to us, there are a number of key questions that we have been asking. I think the first is: in the wonderful data science world, what is the potential of that? How are you going to change the way you view this process? If you do this wonderful analysis, who are you going to present it to, and how would it change the outcome? So that is the potential. The second thing is whether you have interesting data. If you don't have interesting data, sometimes it's the last data you need to look at. And the third thing is buy-in from stakeholders: you can have buy-in from the business people, the people who are passionate about this problem, but you also want buy-in from the IT team and the people who host this data. So I'm answering it more generally, around how to decide whether it's a good data project.
But I'm going to say that the first time, you won't have the outcome strategy, because you're trying to solve a problem, but you're also building the organization, so that's a learning process as well. The outcomes of solutions will come after. You know, I think it helps to show some success stories first. And to your point about what you would ask for when you go to the IT team, it's always been something like what you can do, and then it also creates opportunities. My experience is those that I hired, they don't have a data or a service talks to the same voters. Maybe we should take this offline, and we're not going to get this unsafe.
But I'm not even sure if I understood your question. I would say, yeah, this is something where you have to guess around what you might have asked. So I think if you start, as in day one, I'd say if you have two people, you know, you have a technology
person and someone who has a sense of the science. Even then, I think it's a small team, actually, on day one, if it's not about the low low.
Yeah, so my father was a typesetter. He had machines that set the type. Then the Mac came out, and anybody could be a graphic designer. 1984: anybody could be a graphic designer. Do we still need graphic designers? Yes. Anybody can take a stock tool and do stock things, but if you look at data and analysis as a competitive advantage, it's always going to be the people with the best minds and the most innovative solutions. A lot of things about data science might become more and more automated, or might just become more and more basic, but, you know, we'll still need that function. I don't know, maybe there are two or three iterations of people, but I still think that there will be known experts in that. That will always be.
We have some, I wouldn't call it a scientist, I would just want to solve things on the mainframe, and I send data. I'm not sure this could be very future of a group set of skills in that way. And so, to John's point, I think what's very important is that you're all involved in our work, because these jobs do the work. And what we're doing today, maybe the scientist, you're saying, has something done, but it's true for many jobs.
So I'm just going to see. I'm hearing that's it. Thank you, ladies and gentlemen. Any final parting words of wisdom you would like to share?
Data is everything.
That's it. Stay curious, stay passionate. Thank you very much. Thank you.