Live from Stanford University, it's theCUBE, covering the Women in Data Science Conference 2017.

Welcome back to theCUBE. We are live at Stanford University at the second annual Women in Data Science Conference, a fantastic one-day technical conference, and we are so excited to be joined by Yael Garten, who was one of the career panelists. Yael, you are the director of Data Science at LinkedIn. Welcome to theCUBE.

Thank you, thanks for having me.

So exciting to have you here. Everybody knows LinkedIn. My parents probably even have multiple LinkedIn accounts. You serve, what, 400-plus million accounts. I'd love to understand, what is the data scientist's role in the business overall?

Yeah, so I guess when people ask me about data science, what I love to start with is that there are a couple of different types of data science. And so I would basically say that there are two main categories by which we use data science at LinkedIn. If you think about it, there's really data science where the product of your work is for a human to consume. So using data to help inform business or product strategy, to make better products, to make more informed decisions about how you're investing your resources. So that's one side, which is often called decision sciences or advanced analytics. Another type of data science is where the consumer of the output is really a machine, right? So a machine rather than a human; basically these are things like machine learning models and recommendation systems. So we really have both of those. The second category is what we call data products. And so we use those in virtually everything we do. So on the data products side, much of LinkedIn is a data product, it's really based on data, right? Our profiles, our connection graphs, the way that people are engaging with LinkedIn helps us improve the product for our members and clients.
And then we use that data internally to really make better decisions, to understand how we can better serve the world's professionals and make them more productive and successful.

Right, fantastic. So tell us a little bit about your team. Sounds like it's sort of broken into those two domains. So you must have quite a large team, or a lean team?

So yeah, the way we have our team is that we work really closely within all of our product verticals. And we embed closely with the business to really understand what the needs are. And then we work very cross-functionally. So we will typically have in any group a product manager, an engineer, a designer, a data scientist. Often it's both kinds of data scientists, so more on the analytics side and more on the machine learning side, right? Marketing, business operations, so really very cross-functional teams working together using this data.

Very smart, sounds very integrated from the beginning, really kind of by design. So that collaboration is really sort of natural within LinkedIn. That's fantastic, very progressive. Certainly something that everybody benefits from. Because whether you're on the advanced analytics side or on the machine learning side, you're getting exposure to the business side, and vice versa, which is really a great environment for success.

Yeah, and part of what I love about LinkedIn, I think, is actually our data culture and how data is infused in the culture of how we do things.

Right, which is really not always the case. It's not, and it's cultural shifts; we were talking about that with a number of guests today, and especially depending on the size of the organization, that's tough.

Yes.

So to have that built in, and that integration as part of "this is how we do business," really, you can imagine all the potential and possibilities there.
So we'd love to understand how LinkedIn is using data to recommend ways to evolve products and services to best serve all of its members.

Yeah, so maybe two different examples of how we do this. One is that every launch that we have, so every feature that we generate, we really do in an online experimentation setting. So we have a certain feature that we're about to roll out to our members. We want to make sure that it's a better experience for our members, and better as measured by the metrics that we've defined as measures of success, which are really aligned to the value we believe we're delivering to our members and customers. And so when we roll out features, we'll roll a feature out to a certain percentage of our users, test the downstream impacts of that, and then decide based on that whether we actually roll that feature out to 100% of members. And so that's one of the things that my team is heavily involved in: really helping to use that data to make sure that we are structuring things in a way that's statistically sound, so that we can correctly measure the impacts of rolling out certain features. So that's one category of work. And the other category is really to do opportunity identification and deep-dive insights into a certain product area, to understand where there are opportunities to improve the product. So let me give you one high-level example. One of the ways we might use data is to say, okay, are certain members in certain countries accessing via iOS or Android? And if so, should we be developing more differentiation between the iOS and Android apps? One simple example, right? Where we'll actually decide our R&D investments based on the data that we're seeing in terms of how people are using our products, and whether we think that's an important enough investment to make to improve the products and invest in that area.

Wow, very, very smart.
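
The staged rollout described above (expose a feature to a percentage of users, measure the impact, then decide whether to ramp to 100%) can be illustrated with a small sketch. This is only an assumption-laden example, not LinkedIn's actual method: it uses a standard two-proportion z-test on a conversion-style success metric, and the function names (`two_proportion_z_test`, `should_ramp`), the 0.05 threshold, and all counts are hypothetical.

```python
import math


def two_proportion_z_test(success_a, n_a, success_b, n_b):
    """Two-sided z-test for a difference between two proportions.

    a = control group, b = treatment group (hypothetical labels).
    Returns (z, p_value).
    """
    p_a = success_a / n_a
    p_b = success_b / n_b
    # Pooled rate under the null hypothesis of no difference.
    p_pool = (success_a + success_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


def should_ramp(z, p_value, alpha=0.05):
    """Ramp only if treatment is significantly better (assumed rule)."""
    return p_value < alpha and z > 0


# Made-up counts: 10,000 members in each arm of the experiment.
z, p = two_proportion_z_test(1000, 10_000, 1100, 10_000)
print(f"z = {z:.2f}, p = {p:.3f}, ramp = {should_ramp(z, p)}")
```

In practice a team would also fix the sample size and success metrics before launch, which matches the point above about defining measures of success up front.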
What are some of the basic ways that data scientists can deliver more value for their stakeholders, whether they're internal stakeholders across different functions within the organization or the members, the external stakeholders?

I think one of the most important things is to really embed closely into these functional or domain areas and understand qualitatively and quantitatively what's important, right? So understanding what the business context is and what problem you're trying to solve. And I think one of the most important ways that data scientists play a role is actually helping to ensure that we are even answering the right questions. So as an example, a product manager might ask a data scientist to pull certain data or to do a certain analysis. And part of the conversation and the culture has to be, what are you trying to get at? What are you trying to understand? And really thinking through, is that even the right question to be asking? Or could we ask it in a different way? Because that's going to inform what analysis you do, and really how you're delivering the results of this analysis to make better decisions. So I think a big part of it is having this iterative process of doing data science.

Really, it sounds like such an innovative culture, and you're right, looking at the data to determine, is this the right next step? Is it not? How do we maybe adapt and change based on what this data is really telling us? If we could look at collaboration for a second: you talked about the integrated teams, but I'm wondering, how do you scale collaboration within LinkedIn across so many business and engineering stakeholders?

So the way I kind of like to think about it is, you really have to invest in culture, process, and tools. So let me start from the bottom up. So on the tools or technology, one of the ways to do it is actually to create self-serve tools to really democratize the data.
So first of all, investing in foundations of really good data quality, whether you're creating that data yourself or you're collecting it externally, from different organizations. Once you have really good data quality, making sure that you have foundations that enable self-serve data, basically. So for example, some of the things that data scientists are used for today in various companies really don't need a data scientist if you've invested in ways where business partners, let's say, can query that data themselves. And they don't need a data scientist to be doing this role. So that's an important investment on the technology side. In addition, making data scientists really productive by investing in tools that will enable them to access the data is really important. So once you have that sort of technology, it enables your data scientists to be productive.

The process is really important. So just as an example, we have a sort of playbook in terms of how we launch features. And part of that is bringing data insights in terms of which features we should be building. And then once you've determined that, using the data and those insights, it's, okay, how are we going to launch this in terms of experimental design and setting? And then what are the success metrics? How are we going to know that this is actually a good feature? And then once we've launched the experiment, analyzing that, where all of the stakeholders are part of this, right? The product manager, the executive, the engineer, the data scientist, and then iterating on the results and deciding what the decision is. So actually having a process that the whole team or the company abides by really helps in having this collaboration, where it's clear what everyone is doing and what the process is by which we use data to develop and to innovate.

And then finally culture. I think that's such an important part, and that really needs to be bottom up, top down, everywhere.
It really needs to be a community and a culture where data is discussed and where data is expected and where decision making really is grounded in data. I fundamentally believe that any product being developed or any decision being made really should be data informed, if not data driven.

Right, absolutely. One of the things that I'm hearing in what you're doing is enabling some of the business users to be self-sufficient. So you're taking that feedback and that input from the business side to be able to determine what tools they need to have and how you need to enable them, so that you've got your resources aligned on certain products.

Yeah, just as an example, one of the things that we realized over time is that this isn't actually productive, and we asked, how do we make ourselves scale? So we started doing data boot camps, for example, where we'll actually train new people coming into the company on data and our self-serve tools and on how to run experiments. And so a variety of different aspects, and even how to work with data scientists productively. So we actually train that. So this data boot camp really helps us to instill a data culture and it really empowers the team.

So this is anybody coming in, whether they're coming in for a marketing role or a sales ops role, they get this data boot camp?

Yeah, and it's open to anyone. It typically is going to be a certain subset of those people, but it really is open to anyone, and we're talking about more ways to scale that and maybe put that on LinkedIn Learning and make that more broadly accessible.

So you have quite a big team. How do you keep all of the data scientists that you've got happy? What are the challenges that they face? How do you evaluate those challenges and move forward so that they have an opportunity to make an impact at LinkedIn?

So part of it is actually the things that I mentioned.
So a culture of data, it's really important, and when we see that this is not happening, actually addressing that. So data scientists are going to thrive in a community and a culture where data is valued and where data scientists are valued. So that's actually a really important aspect, and luckily people come to us because they know that we do value data, but I think that's very important for any company. I advise startups as well, and this is one of the things that I tell people who are founding companies: you have to have a culture which values data to attract data scientists, because otherwise they have other options. The other thing is having these foundations that enable them to be productive. So these tools and these systems that enable them to really do high-value work and invest in the right areas. So you start graduating from doing things that are maybe more repetitive or low level and figure out how to scale that, so that you can have data scientists really efficiently using their time for things that only they can do.

Right, I love that this culture is sort of grooming them. A couple of things I read recently: one was, I think it was Forbes, that said in 2017 the best job to apply for is data scientist, but from a trends perspective it's looking like by 2018 there's going to be demand so high that there's not going to be enough talent. What's your perspective at LinkedIn? It sounds like from a foundational perspective it is a data-driven company that really values data. Is that something that you see as a potential issue, or have you really built such a culture, not just of collaboration and innovation but of education, that LinkedIn is in a very good position?

Yeah, well, so one thing that I didn't mention in terms of the happiness factor is that data scientists actually look for a place where they can also grow and learn and be with other like-minded data scientists.
So I think that's something that we strongly support. Again, for companies or people who may be viewing this and are not in such environments, there are a lot of ways to do this. So keeping data scientists happy can also be facilitating meetups with data scientists from your local region, and so those are ways that people share information and share techniques and even share challenges, because this is a growing and evolving field. And so it's having that community, and one of the things that's amazing about this conference is that it's creating this community of data scientists that are all sharing successes and failures as data science is evolving. The other thing is that data science draws from so many different backgrounds. It's a broad field and there are so many different kinds of data science, and even that is getting both more specialized and at once more broad. So I think that part of it is also looking at different backgrounds, different educational backgrounds, and figuring out how you can expand the pool of people that you're looking at as data scientists, and how you augment what skills they may not have yet, on the job or through training or through online education. So we're looking at all these ways.

That's fantastic. We've heard a lot of that today. The fact that the core data science skills are still absolutely vital, but there are some other, sort of soft skills, because you talked about sharing. Communication has come up a number of times today. It's really key. Not only to be able to communicate, to understand and interpret the data from a creative perspective and communicate what the data say, but, to your point, to grow and learn and keep the data scientists happy, that social skill element is quite important. So that was an interesting learning that I heard today, and I'm sure you've heard many interesting things today that have inspired you as well.
Yeah, and creating this culture is something that even data science leaders around the world were discussing, talking about what the challenges are, and how we evolve this field, and how we help define and help groom the next generation of data scientists to be in a more stable and maybe better place than where we were, and help to continue to evolve it. And so it is, yeah.

Evolution is a great word. I think that's another theme that we've heard today. And as much as I'm sure you've inspired and educated these women that are here, not just in person today, but in all of the 70 cities in 25 countries where it's been live streamed.

Yes, it's growing, it's amazing.

And I'm sure that they've learned a ton from you, but just in the little bit of time that we've had to chat, I'm sure that you're probably gleaning a lot from them as well. And we're scratching the surface.

Yes, absolutely. And so there are many more years to come.

Exactly, yeah. Well, thank you so much for joining us on theCUBE. It's a pleasure talking to you. We wish you continued success at LinkedIn.

Thank you.

And we want to thank you for watching theCUBE. We've had a great day at the second annual Women in Data Science Conference at Stanford University. Join the conversation with the hashtag #WiDS2017. Thanks so much for watching. We'll see you next time.