Jeff Frick: Hey, Jeff Frick here with theCUBE. We're in downtown San Francisco at the Hilton at the Financial Center, at the Chief Data Scientist USA Conference. It's a relatively small conference, but a lot of heavy-hitting Chief Data Scientists talking about not only the data science itself but the role of the person, which turns out to require a lot more soft skills than pure data science skills. We're excited to be joined by Haile Owusu, the Chief Data Scientist at Mashable. Welcome.

Haile Owusu: Absolutely.

Jeff Frick: So the conference has been going on for a couple of days. Any surprises that have come out that you've seen?

Haile Owusu: Yeah, one very specific surprise: one of the conference speakers mentioned that they have a very large team of data scientists, and their philosophy, which I found really intriguing, was essentially an expectation that their data scientists were mature to the point where they would control entire project pipelines, from data collection to the establishment of the API associated with the project. That kind of full-stack data science expectation across the totality of their team was a bit of a revelation for me personally.

Jeff Frick: Yeah. The panel that you just led was really about how the Chief Data Scientist communicates with the business, how they engage with them, how they communicate what they want to do. What struck me is that it's very much a sales role, as any leadership role is: selling what you're doing, the value of what you're doing, and then specifically building trust by unwrapping the black box a little bit. It doesn't just go in and come out with "here's the answer to the question." You really have to sell it.

Haile Owusu: You really do have to sell it.
Haile Owusu: One speaker queried the audience and asked whether we, as audience members, thought the Chief Data Scientist role was primarily that of a scientist engaging the business, as opposed to a business executive coordinating the explanation of the science and extracting value from it. One of the things that has certainly crept up on me in the role is the vital importance of the business soft-skills component. It's not something that is natural and native to most scientists, but it really is where the value from machine learning algorithms gets extracted, because there is a communication gap that has to be bridged, and it gets bridged at the level of communication, not at the level of technology.

Jeff Frick: Right, right. And then simple things like data quality: do you have the right data to answer the questions? Is the question formulated in a way that you can actually go solve it? It's almost like a big system integration project: managing expectations while doing that engagement, building the trust, and still trying to deliver some value.

Haile Owusu: That's particularly key, because I think the most interesting data science work happens at the interface between practitioners and people who are not especially quantitative, who are expecting, and rightfully so, to extract real, concrete, revenue-based value, but are completely in the dark about the details.

Jeff Frick: Right. Being able to communicate that is extremely high value. The other thing that comes up all the time, and we do a lot of shows, is that there just aren't enough data scientists, period. We go back to the buggy whip: there weren't enough buggy whip manufacturers, or chauffeurs, when the car first came out. So we really need to change the discussion.
Jeff Frick: The chief data scientist has to help the business users and drive the tools, the engagement, the interaction down to people beyond just the chief data scientist to really get increased buy-in. I thought it was pretty interesting that one of the gentlemen talked about making the whole process much more interactive to build a feedback loop and trust with the business units. So it's really not a one-way "here's what I'm telling you"; it's much more of an engagement.

Haile Owusu: Yeah. In fact, if you try to push out the results of a data science group that way, as a kind of dictate, you will get immediate pushback and people will disengage. For a company, investing in data science is a high up-front commitment, and if one takes that approach, I think you'll find that key stakeholders will divest from the data science team. So it is far less a dictation and much more of an ambassadorship, actually. You are sitting on techniques and approaches that are probably relatively new to the industry in which you're housed, and you have to sell people on investing time, money, and effort in unearthing this functionality and embedding it in the company.

Jeff Frick: Now, you have a day job; you don't just run around to conferences all the time.

Haile Owusu: I don't, no.

Jeff Frick: You're in charge of the Velocity technology at Mashable, which tracks the viral life cycle of digital media. Really interesting. Everybody wants their content to go viral, right?

Haile Owusu: They do, yeah.

Jeff Frick: So, just curious, on that project, what are some of the things you've discovered? What are some of the surprises most people wouldn't think about? How do you continue to tweak your models as social media continues to evolve and adapt? It's Twitter, it's Facebook, it's Snapchat, and God knows what the kids are going to be on next week.
Haile Owusu: Yeah, I'll just say that we've invested a lot of time and effort in trying to understand and predict how much engagement an article or a video is likely to get after it's been published. We consider all sorts of signals, and we've gotten our accuracy quite high based on the really early behavior of a piece of content. One thing that is underappreciated is how hard it is to predict how well a piece of content is going to do before you publish it. There is a huge gap in explanatory power between the case where something has been published and you get a few early indications of how it's doing, and the case where you have no information whatsoever except what the content is: whether it has an image, whether it's talking about this topic or that topic. Explaining the latter scenario is very difficult. We've made some strides along those lines, but it turns out to be a fairly fundamentally difficult problem.

Jeff Frick: And are there triggers that you see as typical behavior for things that really fly, that hit that magic? I go back to the woman with the Star Wars mask in the car, right? She puts on her Chewbacca mask and laughs at herself in the car.

Haile Owusu: I have no idea what you're talking about.

Jeff Frick: You have to see this thing. It went crazy viral literally overnight. She went to bed, she woke up the next day, and she was an internet star. It's not a published piece of news, but still, she hit that magic, and everyone wants it. The other factor, of course, is that everybody is an overnight sensation, but you don't see the years and years they've been working. But knowing what you know, do you try to bake that in? Is it baked in with tags? Do you let the author write it as they would in their own voice and then come back and make adjustments?
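The idea Owusu describes, predicting eventual engagement from a piece of content's very early post-publication behavior, can be sketched in miniature. Mashable's actual Velocity models and signals are not public; the log-linear fit, toy share counts, and function names below are purely illustrative assumptions.

```python
import math

def fit_log_linear(early, final):
    """Fit final ~ a * early^b via least squares on log-transformed data."""
    xs = [math.log(e) for e in early]
    ys = [math.log(f) for f in final]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    # Ordinary least-squares slope and intercept in log space.
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
    log_a = my - b * mx
    return math.exp(log_a), b

def predict(a, b, early_shares):
    """Predicted eventual shares given early share count."""
    return a * early_shares ** b

# Toy training data (invented): shares in the first hour vs. after a week.
early = [10, 40, 90, 200, 500]
final = [150, 700, 1500, 3600, 9800]

a, b = fit_log_linear(early, final)
print(predict(a, b, 100))  # predicted final shares for 100 early shares
```

The contrast Owusu draws is that a model like this, which sees even an hour of real behavior, is far easier to make accurate than one that sees only pre-publication features such as topic or the presence of an image.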
Jeff Frick: How do you bring your skill set, your data, your knowledge of what's gone before, and help make sure that the great piece of content being published tomorrow gets the proper, if that's the right word, amount of uptake?

Haile Owusu: Just to be clear, with the work we do with Velocity, again, this idea of dictatorship versus ambassadorship, we really avoid attempting to impose a kind of data dictatorship on the content creation process. At Mashable, we take all the accumulated information in the Velocity suite and embed it in our CMS, essentially allowing writers to see a running history of viral hits, not only our own but across the publishing landscape. What we've found is that writers are able to distill lessons from a collection of greatest hits, filtered by topic, filtered by time window, filtered by language keywords, and incorporate that collected history into their writing. In general, that tends to yield better outcomes, both in terms of overall engagement and overall viewership, but also, I think, in terms of the depth and quality of the content, because it also lets you see where similar content has fallen short. So we don't make dictatorial recommendations about what folks should write about; that's not our way. Rather, we use the technology to distill what has worked into a historical compendium, present it to writers, and let them use their judgment about how to apply it to their writing.

Jeff Frick: Right, right. So if we come back a year from now, as this landscape continues to evolve, what are you tracking? What are you keeping an eye on? What are you excited about that's changing in this world?

Haile Owusu: Without question it's a popular buzzword, but it's actually revolutionizing how we're thinking about content, and that's namely improvements in the state of the deep learning art.
Haile Owusu: The use of long short-term memory networks and convolutional neural networks has allowed us to do feature extraction on images and text in a way we hadn't been able to before, and there has been a significant improvement in our ability to make predictions along these lines. The pace there is very fast. It's basically impossible to say what that's going to look like even a year from now, but there's no question it will be an impressive advance for us.

Jeff Frick: Very exciting. Well, thanks for taking a few minutes.

Haile Owusu: No, thank you. Pleasure.

Jeff Frick: Absolutely. He's Haile Owusu, the Chief Data Scientist at Mashable. I'm Jeff Frick. You're watching theCUBE.
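The LSTM-based text feature extraction Owusu mentions can be illustrated with a toy example. This is not Mashable's system; the weights are random and the dimensions are invented, purely to show the mechanics of how a recurrent cell turns a variable-length sequence of word vectors into one fixed-length feature vector for a downstream engagement model.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_hid = 8, 4  # toy sizes: word-vector and hidden dimensions

# One combined weight matrix for the input, forget, output, and cell gates.
W = rng.normal(scale=0.1, size=(4 * d_hid, d_in + d_hid))
b = np.zeros(4 * d_hid)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h, c):
    """One step of the standard LSTM recurrence."""
    z = W @ np.concatenate([x, h]) + b
    i, f, o, g = np.split(z, 4)
    i, f, o, g = sigmoid(i), sigmoid(f), sigmoid(o), np.tanh(g)
    c = f * c + i * g          # updated cell state
    h = o * np.tanh(c)         # new hidden state
    return h, c

# "Read" a toy sequence of five word vectors; h ends up as the feature vector.
h = np.zeros(d_hid)
c = np.zeros(d_hid)
for x in rng.normal(size=(5, d_in)):
    h, c = lstm_step(x, h, c)

print(h.shape)  # prints (4,): a fixed-length feature, whatever the text length
```

The same role is played for images by a convolutional network's final feature layer; both give the prediction model a dense numeric summary of raw content.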