Hello and welcome. My name is Shannon Kemp and I'm the Executive Editor of DataVersity. We'd like to thank you for joining the current installment of the monthly DataVersity Smart Data Webinar Series with Adrian Bowles. Today, Adrian will be joined by guest speaker Bob Hayes to discuss data science and business analysis. Just a couple of points to get us started. Due to the large number of people that attend these sessions, you will be muted during the webinar. For questions, we will be collecting them via the Q&A in the bottom right-hand corner of your screen. Or if you like to tweet, we encourage you to share highlights or questions via Twitter using the hashtag #SmartData. If you'd like to chat with us and with each other, we certainly encourage you to do so. Just click the chat icon in the top right-hand corner for that feature. As always, we will send a follow-up email within two business days containing links to the slides, the recording of the session, and additional information requested throughout the webinar. Now let me introduce to you Adrian. Adrian is an industry analyst and recovering academic, providing research and advisory services for buyers, sellers, and investors in emerging technology markets. His coverage areas include cognitive computing, big data, and...
Shannon, we lost you there. Shannon?
I can hear you, Bob.
And I can hear you.
...from Northwestern University. And with that, I will give the floor to Adrian to get today's webinar started and to introduce our speaker for today. Hello and welcome.
Thank you, Shannon. Just chatting with Bob. We lost you somewhere in there, but I'm glad you're back. So I'll go ahead and give his bio and we'll get started. And welcome, everybody. It's a pleasure to be here, as always. This is the third one of the year; we did a dozen last year. So those of you that have been with us for any of the events, the webinars, know that I guard the microphone pretty carefully.
So this is a special day that we have a guest speaker. I met Bob Hayes last year when IBM put us both on a panel at one of their events. And I was really impressed by the things he talked about regarding his own research on data science and data scientists. So this year when I was putting together the agenda for the webinars, I thought it would be great to have him on as my guest for the topic. So let me give you the short version of his bio. The full bio is on the DataVersity website now. So Bob Hayes is the Chief Research Officer at Appuri. He's responsible for directing best practices research and communicating the value of their platform. His research is focused on the intersection of analytics, the practice of data science, which got me very interested, and customer experience and success. You may know him from his talks, blogs, and books. And all of those are detailed on his bio at DataVersity and should be in the package that we get after the webinar. Bob received his PhD in industrial and organizational psychology from Bowling Green State University, specializing in survey research and quantitative methods. That was also something that impressed me because I can't tell you how much research I read these days that comes from people that obviously don't know what they're doing. So somebody that knows about quantitative methods is always of interest, particularly when we're talking about data science. Bob has over 20 years of consulting and research experience in enterprise and mid-sized companies, including Oracle, Agilent, Sappos, and a few others. And I'm really excited to share the platform today. So with that, I'm going to turn it over to you, Bob.
Adrian, thanks for that nice introduction. And thanks to DataVersity for allowing me to be here today. I really appreciate it. For the talk today, first of all, on the front page, you see my email address. So if you have any questions after this, please feel free to email me.
And also feel free to follow me on Twitter or tweet anything you want during this talk. So I've been interested in data science for a long time. And I started my work after grad school in the area of customer experience management and customer success. So I wrote a few books on that topic, focusing more on how companies can better utilize analytics and analytics-related topics to help improve how they manage their customer relationships. I blog regularly on the topic. I do research in the area. There's some information about me. I'm the number one blogger on the site called CustomerThink, where my blog is syndicated. I also focus on customer analytics. And I also blog and do research in the area of big data and data science, as Adrian said. So with that, let's dive right into data science.
So I've been interested in data my entire adult life, ever since my first stats course in college. So when this world of big data was coming into focus, all these new terms were coming out: big data, machine learning, data science. So I wanted to kind of put some rigor behind what those terms meant. So I focused right on this study. What does data science mean and what do data scientists do? What are their skill sets? So this is how I define data scientists and data science. It's a way of extracting insights from data using the powers of computer science and stats applied to a specific field of study. So it necessarily involves the collection, analysis, and interpretation of data to extract empirically based insights that augment human decisions and algorithms. So you may be a data scientist who focuses on helping people make better decisions, or maybe a data scientist who does activities to help create machine learning algorithms to help automate work processes. So given that as a broad definition of data science, I wanted to, you know, ask data scientists. Since I'm an I/O psychologist, I do survey research and quantitative methods.
I thought I'd apply data science techniques to the study of data scientists to see if we can get some insight as to who they are, what they do, and whether there are things we can tell companies to ensure that they're more successful in their data science projects. So hopefully throughout this talk, you'll get some insight as to how you may improve your data science skills, how you may implement data science more effectively in your company, and so forth.
So the study. I did a study a little over a year ago when I was with a company called AnalyticsWeek, and we had a huge newsletter. I wrote a blog post on the topic, and I tweeted and shared through social media an invitation to take the survey. And we got over 600 respondents for this particular presentation. To date we have, I think, over 1,200 respondents, and we're still looking at the data and trying to make sense of it. As for the survey itself, I'll go into the actual questions in the subsequent slide or two. But the survey basically asks respondents to self-assess on 25 skills related to data science. Those include business, technology, programming, math and modeling, and statistics. I also asked some other questions around demographic issues like their gender, their highest level of education attained, their job role, things like that. And I asked another question: what was their overall satisfaction with the outcome of the analytics projects on which they work? Those are pretty much the variables we're looking at today.
So here's a list of the 25 skills that we think make up data science. Now, maybe this isn't an exhaustive list, but it's pretty comprehensive, I think. And this list was kind of generated by these prior researchers at the bottom of this slide where you see the references, the article called Analyzing the Analyzers by Harlan Harris, Sean Patrick Murphy, and Marck Vaisman.
So the data science skills fall into five broad areas: business, technology, math and modeling, programming, and stats. Now, for the top one, business, I focus on business as the subject area expertise because I focused on asking data scientists who worked in the world of work. Now, if you want to study data science in other areas, you may want to ask questions about, say, oncology. If you have a data scientist trying to apply data science methods and big data methods to the field of predicting cancer, then you want subject matter expertise around knowledge of cancer. Okay, but here we're just going to focus on business. So for each of these 25 questions here, I had the respondents rate each of them on a scale from zero to 100, where zero means they have no knowledge of the topic and 100 means they're an expert on the topic. Okay, and the thing is, I got this rating scale format from the National Institutes of Health. I thought it was a really good way to hopefully make these ratings more objective in a way, because if you look at a rating of 60, it tells you that if you give yourself a 60, you can do that skill on your job without any help. Okay, and the higher the score you get, so 80 and 100, that means you can give more help to people around you. And below 60, 40 or below, that means you need help to do your job. So hopefully with these kinds of criteria, these ratings are meaningful. And based on the results I found, I think these results are reliable and valid. And I'd like to do future research to see if these ratings of proficiency in various skills are actually correlated with real objective assessments of these skills. Maybe we can talk about it in the Q&A section.
All right, so next slide. The sample is made up primarily of data scientists from the B2B space. About 80% of them were from B2B: 50% from companies that were only B2B, and 30% from companies that were both B2B and B2C.
And most of the respondents also came from North America, at about 64%, and most of the respondents were from the IT industry, education and science, consulting, and the healthcare and medical industry. Just to give you a sense of who I was surveying.
So the first thing I want to do is see what the proficiency is across all those 25 data science skills. It's color-coded: expert is green, all the way down to no knowledge, which is red. And I ranked these from the most proficient to the least proficient. And we see the top 10 data science skills on the right there. What's interesting, I don't know if it's interesting or not, is just the fact that communication was the number one skill. So if you ask any data scientists, they believe that they are the most proficient on average in communication compared to any other of the data science skills. The next skills are managing structured data, data mining and viz tools, science and the scientific method, and so forth. And further on in the presentation, I'll compare different kinds of data scientists to see if certain data scientists have different skill sets. What's interesting to note on this slide is that if you look down below, the 24th one is big and distributed data. So even though we are in this big data world, very few quote-unquote data scientists have expertise in that area, which I found surprising. And actually there was a study that was done by KDnuggets that looked at the size of databases data scientists typically use. And what they found was that data scientists typically look at data that can fit on your laptop. So it's not really big, big data as you would think.
Okay, in the survey I asked the respondents to identify their job role. They had four options. Actually they could choose more than one, but the four options were researcher, business management, creative, or developer. And we see that in this sample, most of the respondents identified as a researcher.
Next is business management, creative, and then developer. So I wanted to see how these different job roles differed in skill proficiency. And you see here on this chart, it's a spider chart. Red in the middle means you have no proficiency or low proficiency. Yellow is kind of a warning, and green is when you can at least do your job independently at work. So here if you look at business management, the orange one, they tend to be strong, which is no surprise, in business skills. See the top right part of this spider chart. They're above the 60-point criterion. Look at developers: they tend to be strong in technology and programming skills. And then researchers, not surprisingly, tend to be very strong in math and stats skills. There's some overlap among the different kinds of roles, but typically that's what you see. And the creatives, the people who identified as creatives, didn't tend to be strong in any one field. They're kind of average in all skill areas, in the middle. So I'm not sure if that's good or bad, but as we'll see later in the presentation, the more proficient you are in these skills, the happier you are with your outcomes on your analytics projects at work. So if you're a creative, you may want to dive deep in a given topic area and become an expert in that, whether it be stats, technology, or maybe even business management.
Well, next slide. So let's go back. So these questions here, you know, the list of 25, were generated by myself and these prior researchers, and we bucketed them in what we thought were five pretty logical buckets: business, technology, programming, math, and stats. Now I did a factor analysis of the proficiency ratings to understand how these items kind of clustered together. And this is a popular machine learning technique that's similar to principal component analysis.
So the factor analysis tells you how many underlying factors can describe the correlations among these 25 skill sets, and also which of the skills is correlated with which of the factors. So in this factor analysis, I found a very clear three-factor solution, right? And if you look at the loadings right here in the factor pattern matrix to the right, we see that the top five skill sets were for business. They loaded on factor three, and I labeled that business and highlighted those weightings in yellow. And down below, we've got technology, and typically the skills fell into the bucket that we thought they would, but there were a couple of them that really seem to be measuring something else. So for example, we thought machine learning fell into the technology bucket, when in actuality it loaded more highly with other items that related to math and statistics. And likewise, the area of big and distributed data we thought was more about technology. So it is, I'm sorry. Oh yeah, it's just machine learning, I guess. And also data management, which was down below in stats: we thought it would load highly on the stats factor, but actually it loads more highly on the technology and programming factor. So what this basically tells you is that even though we have 25 separate skill sets, they can actually be grouped into three separate, meaningful clusters. And if you look at these factor loadings right here, those three factor loadings, we can actually plot those in a factor space on X, Y, and Z axes. And I'll show you that on the next slide, where you get a better visual sense of what I mean by having these three factors. So the blue ones are, excuse me, the yellow ones are business; they load in a similar area. The items that load on tech and programming are kind of clustered together in this area, the lower right of the picture.
And the math and stats also load together in one kind of area. And the implication of this is that if you're strong in any one of the math skills and you want to learn something else about big data or data science, I think your best bet would be to focus on skill sets that are related to math and stats because you're already good at that. And people who are good at one tend to be good at other areas in math and stats. The same thing goes for technology: if you're good at maybe back-end programming, then if you want to learn more about data science, you may want to focus on more programming types of skill sets to develop in yourself.
So here are the three broad domains. This kind of supports what the data science and big data pundits talk about. They talk about the three data science skills. You've got the subject matter expertise or domain expertise up top, followed by technology and programming, which includes these skills. And then finally, you've got math and stats to actually analyze the data. So instead of thinking about that Venn diagram, those three circles that kind of overlap with data science in the middle, I like to think of data science more like the chart here on the upper right, where it's a variety of skills you can have. You may possess more than one or a few of them. But think of the skills as being independent. You can learn a lot about stats and not know anything about programming and technology, which is fine, and vice versa. And when we take those three factors and I do the same kind of plot, we see here again that of course business people on average have higher proficiency ratings for business-type skills. The developers, the blue, have higher proficiency in technology skills. And in the lower left you see that the researchers are much more proficient in math and stats skills. And again, the creatives are kind of mediocre in all three skill areas. They're not strong in any one particular skill.
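For readers who want to try the kind of clustering described above on their own ratings, here is a minimal sketch. It is not the speaker's analysis: it uses principal-component loadings from the correlation matrix followed by a varimax rotation as a simpler stand-in for factor analysis, and the data are synthetic (12 hypothetical skills generated from three latent abilities), so every name and number in it is an assumption made for illustration.

```python
import numpy as np

def varimax(loadings, n_iter=50, tol=1e-6):
    """Rotate a (skills x factors) loading matrix toward simple structure."""
    p, k = loadings.shape
    rotation = np.eye(k)
    criterion = 0.0
    for _ in range(n_iter):
        rotated = loadings @ rotation
        # Standard varimax update: SVD of the criterion's gradient
        target = rotated**3 - rotated @ np.diag((rotated**2).sum(axis=0)) / p
        u, s, vt = np.linalg.svd(loadings.T @ target)
        rotation = u @ vt
        new_criterion = s.sum()
        if criterion and new_criterion / criterion < 1 + tol:
            break
        criterion = new_criterion
    return loadings @ rotation

def skill_clusters(ratings, n_factors=3):
    """For each skill column, return the index of the rotated factor it loads on most."""
    corr = np.corrcoef(ratings, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)           # eigenvalues in ascending order
    top = np.argsort(eigvals)[::-1][:n_factors]       # keep the largest n_factors
    loadings = eigvecs[:, top] * np.sqrt(eigvals[top])
    return np.abs(varimax(loadings)).argmax(axis=1)

# Synthetic, standardized ratings (assumption): 600 respondents, 12 skills,
# four skills driven by each of three latent abilities.
rng = np.random.default_rng(0)
true_group = np.array([0] * 4 + [1] * 4 + [2] * 4)
latent = rng.normal(size=(600, 3))
ratings = 0.8 * latent[:, true_group] + 0.6 * rng.normal(size=(600, 12))

clusters = skill_clusters(ratings)
print(clusters)  # skills sharing a latent ability should share a factor index
```

The factor indices themselves are arbitrary; what matters is which skills end up grouped together, which is how items such as machine learning can turn out to sit with math and stats rather than with technology.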
So let's look at each of the skills that are in each of those three general buckets of data science skills. And the chart here shows, again, the proficiency rating on the left. I drew a dashed line at 60. So if you're above that, you give help. If you're below that, you need help to do your job. And these are business skills. And we see here that the business people tend to be above the level at which they can actually do their job. So that's good. Look at technology and math on the next page. The top one is technology. The blues are the developers. So the developers tend to be more proficient in the developer skills compared to the other kinds of data scientists. And down below, look at math and stats skills. We see here that again, researchers tend to be more proficient and above the 60 cutoff point across most of these skills. At least higher than the other types of data scientists. So the implication of this is that not everybody is good at everything. So pick your strengths, focus on that, and be an expert in that area.
And this is kind of a summary chart that looks at the various data science skills and how competent each of the four data science roles is in them. And the bigger the bullet, the more people are proficient in that area. So we see here for business managers, and even for the other three, developers, creatives, and researchers, the top skill again is communication. And what's interesting to note is that for the business manager, their top skills are communication and, that's hard to read, okay, project management and business development. And researchers, for example, are very competent in communication, math, not surprisingly, data mining and viz tools, and science and the scientific method. So it gives you a nice summary of where data scientists' strengths are. All right, let's go to the next slide. So, okay, here we go.
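The comparison against the 60-point line can be sketched in a few lines of code. This is only an illustration under stated assumptions: the role names, skill-area labels, and ratings below are made up, and the helper names are hypothetical rather than anything used in the study.

```python
from collections import defaultdict
from statistics import mean

CUTOFF = 60  # rating at which a respondent can do the skill without help

def mean_proficiency_by_role(responses):
    """responses: iterable of (role, skill_area, rating on the 0-100 scale)."""
    buckets = defaultdict(lambda: defaultdict(list))
    for role, area, rating in responses:
        buckets[role][area].append(rating)
    return {role: {area: mean(vals) for area, vals in areas.items()}
            for role, areas in buckets.items()}

def independent_areas(means, cutoff=CUTOFF):
    """Skill areas where a role's average rating is at or above the cutoff."""
    return {role: sorted(a for a, m in areas.items() if m >= cutoff)
            for role, areas in means.items()}

# Made-up ratings for illustration only
responses = [
    ("business manager", "business", 70), ("business manager", "business", 66),
    ("business manager", "math/stats", 45),
    ("developer", "tech/programming", 72), ("developer", "business", 40),
    ("researcher", "math/stats", 75), ("researcher", "math/stats", 65),
    ("researcher", "tech/programming", 50),
]

means = mean_proficiency_by_role(responses)
strong = independent_areas(means)
print(strong)
```

With real survey data, the same grouping gives the per-role averages that the bar charts and the summary chart display.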
So I also asked a question on how satisfied each of the participants was with the job that they're doing at work. And we see here that there are significant differences between researchers, developers, and business management folks. So researchers tend to be more satisfied with the outcome of their analytics projects compared to both business management and developers. And I'm kind of surprised at how low the satisfaction is for the developers. And we can talk about that in the Q&A. But one hunch as to why they're so low is that maybe they're not included in a broader look at a data science project, and they're kind of stuck just doing coding. So I want to study this this year to understand why, or how companies can improve the satisfaction of developers in the work that they do. And I think one thing to look at is team cohesiveness. If you build a nice strong data science team who collaborate well, who speak weekly or daily about projects and how the project's structured, I'm hoping that developers working in those settings will be more satisfied with their jobs than developers who are stuck alone just doing coding. But again, that's an empirical question that I'd like to answer.
All right. So people talk about how hard it is to find a unicorn. Well, here's data to show you how difficult it is to find somebody who's an expert in everything. So look at each colored bar. Focus on the intermediate, the dark blue bar here. So I wanted to see how many people had at least an intermediate level of proficiency in any of the skills, right, the five broad skills. 22% indicated they had no skill sets above intermediate, which I find kind of surprising. We see that there were 10% of people who were proficient in five or more skills at the intermediate level. And if you go further and further, from advanced to expert, we have 96% who weren't an expert in anything.
If you go to the right, we had 3% who were an expert in one skill, and 1% who were an expert in two skills. So the whole notion of finding somebody who knows everything about data science is an impossibility. So that's why I always talk about data science being a team sport: you have to work with people who have skills that you don't have, and you can complement each other's skills and drive your data science projects forward. And I'll show you how to do that coming up using the scientific method, or why it's useful to think in those terms.
So I wanted to see the impact of your teammates' skills on your work performance. So I asked, basically, if they work with a team or alone, and if they work with a team, I wanted them to indicate if they have an expert on their team in a given area. So on this one, I want to focus on the impact of the business expert. If you look at the left side, the three left bars, this is satisfaction for each of the job roles when there is no business expert on the team as your teammate. If you look at the right-hand side, this is with a business expert on the team. And we find that data scientists who have business experts on their team are more satisfied with their work outcome than when they don't have a business person on the team. And this held for researchers as well as developers. So developers and researchers can be happier about the work that they do when they're paired up with somebody who knows about the business. And that makes sense because the business person knows what kinds of questions to ask and what kinds of projects need to be done. And so the developer and the researcher can spend time focusing on addressing those particular business questions.
All right, let's look at this next slide. So this one is for the impact of a technology and programming expert. And for this, I didn't find any impact of your teammate being an expert in technology or programming.
So it didn't matter if your teammates were experts or not on programming. It didn't impact your level of satisfaction with your work outcome. And finally, we'll look at the impact of math and modeling and statistics on team performance, or on how the team functions. So on the left-hand side, the left chart, we see that the presence of a math expert on your team will increase the satisfaction of your business manager data scientist, as well as the researcher. And on the right, we see that the introduction of an expert on stats to your team will have a significant impact on the satisfaction of your business management data scientist. Without a stats person on the team, satisfaction is at about the midpoint of a 0 to 10 scale. Whereas if you have an expert on stats on your team, the business manager is significantly happier, at about a 7.0 rating on a 0 to 10 scale. That's a pretty big jump. So what this tells me is that you need to think about data science as a team sport and get people on your team who are experts on things that you're not an expert on. And that will necessarily enhance the outcome of your work.
So how do teams get together, and how do you encourage them to work together, from the business manager to the developer or technologist and the researcher? How do you get them to work together? I like to approach problems using the scientific method. It's a simple five-step process. And it starts with formulating the question. You start with the problem statement. If you're a customer success management company and you want to decrease churn, your problem statement would be, you know, how do I decrease churn? Okay. And then step two in the scientific method is you have to generate some hypotheses or some hunches. Basically, things you want to test to see if your ideas hold true or not. Step three is you gather the data. You could either do that with an experiment or with simple observation, where you just collect data and see what patterns emerge.
Also, I encourage clients to integrate their data files because the whole of your data is greater than the sum of its parts. So if you want to get, you know, better insights, the more data you have, the better, especially if you have the capability of machine learning. So in the fourth step, analyze data and test hypotheses, you have a lot of data. The data scientist doesn't have the time to sift through all the metrics. So if you have machine learning, that will easily surface some insights that you may have missed as a lone data scientist. And the final step is take action or communicate the results. Either you tell your executives the kinds of things that you found, or maybe you've developed a machine learning algorithm that you can put into a recommendation engine that will automate that process. And from step four, you can go back to generating new hypotheses, or from step five, you can generate new hypotheses based on what you find, and so forth. So it's a continuous cycle of testing ideas, implementing them, and relearning, going in a virtuous cycle.
So how does that work? If I have data scientists on my team, how do I get them to work on a project? So I developed this kind of heuristic to show that if you take these five scientific method steps, one through five on the left-hand side, and cross those with the three broad data science skill sets, then you can start seeing how the different data scientists can impact different steps of the scientific method. So in the first two, you've got to formulate the question and generate hypotheses. That's typically driven by the business manager of the team, who knows the business and who knows the kinds of questions to ask. And by the way, like I said, business is just the focus here because I've studied data science in business. If you're studying cancer treatment, you want to have somebody who's an expert in oncology, who knows about cancer proteins and things like that, who knows what kinds of questions to ask.
So the business person can touch primarily steps one, two, and five of the scientific method. The technology and programming data scientist has a big impact on gathering and generating the data. I know that when I work on big data projects, I always rely on technologists because I don't have that skill set. So without them, I'd be just managing small data sets my entire life. And for stats and math, we see here that they primarily impact the final three steps, gather and generate data, analyze data, and communicate the results, just based on their skill set. So this is kind of a nice heuristic that you can follow for how your team can get together. Maybe it'd be a nice way to introduce your data science team to this notion that it does take a team to have a successful data science project. And these are the reasons why.
So I wanted to see, of those 25 data science skills, which ones were really primarily driving satisfaction with the work outcome. So I took the ratings for each of those 25 skill sets and correlated each of those with the measure of satisfaction. And here, I just overlaid the top five and the bottom five data science skills. And we see here that for each of the data science roles, one of the top skills that drove satisfaction with the work outcome was data mining and viz tools. And that was true for business managers, where it was the number two driver of their success on their projects. For developers, it was number four; for creatives, it was number two; and for researchers, it was number two. I mean, this makes sense for a researcher: you need data mining and viz tools as part of your job. But I was surprised to see, for both developers and business managers, that just having that skill set made you more successful on your project. And down below, here are the bottom four drivers of success on projects.
And you see here that budgeting, across the board, had one of the lowest impacts on project success. And again, this is just a survey that we developed. So maybe the next time I run the survey, I'm going to leave that question out because it didn't add much to the predictive power for project success.
All right. So again, this is a similar kind of graph to the one I showed you earlier. But instead of competency, this is the impact that each skill set has on project outcome. So it's essentially looking at the correlation between proficiency ratings of each skill and how satisfied they are with the work that they do. And here's the key, the code down here. If the correlation is greater than .40, you have a big bullet, and it decreases down to a small bullet when the correlation is below .20. And again, interestingly, for business managers here, the top row on this chart, the big drivers of project success, again, were data mining and viz tools, followed by stats and statistical modeling. And you see here down below that some of those business skills are not really predictive of their project success. So for any business manager out there who thinks, you know, they don't need to be skillful in stats or data mining and viz tools, I would encourage you to learn about that stuff and try to adopt some of those tools in your current job, because if you do, you'll be more successful in your job than if you don't.
So next I looked at education. So again, I had a question about their education status, so they could pick a tech program or two-year degree, a four-year degree, a master's degree, or a PhD, right? So if you look here on the left-hand side, this is for the entire data set overall: if you have a PhD, you tend to be more proficient in science and math, which makes sense, or in stats and math. Although it didn't seem to have a big impact on other areas like programming or even business.
If you've got a PhD, you actually have lower proficiency in business skills, which I thought was interesting. That may be a function of the fact that maybe the PhDs were primarily focused on academic research jobs. That's something I just don't know. Next I looked at the three different kinds of data scientists that we had ample data on. Those were business management, developers, and researchers, and we see here that for developers the difference between a BS and a master's degree is negligible. There's no difference. So you're not more proficient in programming if you have a master's degree than if you have a four-year degree, on average. If you look at business management, you see here that there's a difference across all five skill sets if you have a master's degree compared to if you have a four-year degree. The ones that were statistically significant were actually stats, math, and programming. So you just have more proficiency in those areas. And down below, for researchers, we see that PhDs have the most proficiency in both stats and math and modeling compared to master's degree holders and four-year degree holders. So again, education does seem to have an impact on your proficiency, but only for certain kinds of data scientists. Again, for developers, it appears that getting a master's degree doesn't make you more proficient in data science skills that are related to big data projects.
And finally, let's look at gender diversity among data scientists. And I'm not surprised by these results. I looked at other technology companies and you get pretty much the same results. So typically, females make up a very small percentage of data scientists. On average, about 25 percent overall. You can see here that they appeared mostly as researchers. Of researchers, 32 percent were female and 68 percent were male. So it's predominantly a male-dominated profession, not surprisingly.
And I wanted to look at other occupations in the field of science to see what their makeup was in terms of gender diversity. And again, for these three on the right, you've got chemists, computer and mathematical occupations, and environmental scientists. You can see here that, again, they're primarily made up of men. Whereas these over here on the left-hand side, biological and medical scientists, tend to be roughly half women and half men. So if you're a recruiter looking for a researcher in data science and looking to build up the number of women in your company, you may want to focus on recruiting biology students or maybe medical students, because there are a lot of women in those fields.
And I looked at just comparing roles. So if you were a woman, you tended to be primarily a researcher. You self-identified as a researcher; about 72 percent indicated they were in a research role. Whereas men kind of cut across the board. The highest is still researcher, but there were a lot more developers compared to the females in the sample as well. So females tend to be more researchers. Men, even though they do research, tend to be more on the developer side and the business management side, not surprisingly.
So also one last thing, looking at educational attainment for men and women. The left-hand side is women and the right is men. They were roughly the same. Most of the sample had at least a master's degree. So something like 65 or 66 percent of the respondents had a master's degree or a PhD. And almost all of them had at least a four-year degree. So it's a very highly educated group, and there's no difference between men and women who are practicing data scientists right now. They have the same degrees and the same background. So I wanted to look at proficiency across gender, and I'm only presenting business managers and researchers on this slide because that's where we had ample data.
For the other job roles, we didn't have enough data to make any strong comparisons. So I wanted to see if proficiency in skills varied across gender. And we see here that men and women have roughly the same proficiency in the various data science skills if you're a business manager. Men may have a slight advantage in business knowledge, but the others are just roughly the same. For the researchers, again, men are slightly higher but not much higher, especially for math and stats. Women are on par with men in their proficiency in that skill set. So women are competent, so bring them on board.
So I'm ending with a few pieces of advice for data scientists. When you talk about data scientists, be specific. As I showed here, there are different kinds of data scientists, and each kind has its own skill set. So if you're a developer, you tend to be proficient in things like programming and technology. If you're a researcher, you tend to be proficient in things like stats and math. And if you're a business manager data scientist, you tend to be knowledgeable in business. So it's funny, I have a twin brother who's a computer scientist who works for a local company here in Seattle, and he's also a data scientist. They call him a data scientist, and people call me a data scientist. We have no overlapping skills. So again, I encourage you to be clear when you say, you know, "I'm a data scientist," and clarify what kind, or at least where your strengths lie. And because of that, I think you need to work with other data professionals who have skills complementary to your own. Nobody is an expert in everything. So find people who are experts in their particular skill sets, and work with them to drive your project forward.
And also, no matter what kind of data scientist you are, I encourage each one of you to learn some sort of data mining and visualization tool, because we showed that that was a big driver of whether or not the data scientists were happy with their work outcome. Okay, and you can use programs like R. If you're a data science purist, you use R and Python; I use SPSS. But there are a whole host of data mining tools out there that I encourage you to at least explore. And finally, be an advocate for women in the field of data science. As we showed here, women are just as competent. There was no difference in job satisfaction between men and women. So I encourage you, if you're running a conference, to invite women to speak. If you're hiring data scientists, I encourage you to look at certain industries that may have more women research data scientists than others. And I encourage young women from grade school, middle school, high school, and college to get into math, programming, technology, things like that. And that's it.
Great. Thanks, Bob. I had my microphone on mute.
No problem. Thank you.
Okay. I just love it when somebody actually has numbers to back up what they're talking about. Some interesting stuff there. Thanks. Okay, you just switched in and lost the questions. We're going to open it up to questions in one second. I want to start by asking you, was anything in particular a surprise to you as you went through it? A lot of this seems like it backed up some of your expectations.
Well, I was really surprised at how clear the results were. I'm always kind of afraid to do survey research, primarily because it's all self-report data. That's why you've got to be careful when you phrase the questions, especially for the 25 skill proficiencies. I needed to build the scale so that, I would hope, people aren't inflating their ratings, because people tend to say that they're better than they are.
With the results here, I found that most people didn't reach that level of 60, that intermediate level of proficiency. That leads me to believe that these ratings reflect their real skill sets. That was kind of encouraging. I was also surprised to see how clear those three factors were. I'm always skeptical when people talk without any data to back it up. That's one of the reasons why I did the study, to see whether there really are three underlying skills in data science. Through the factor analysis of those 25 skills, we see clearly that those three skill sets emerge: the quantitative skill set, the technology and programming skill set, and the subject matter expertise skill set. I was surprised at how clear that factor solution was. It's nice when you have hypotheses that you test and the results come out in support of them. I'm always surprised when that happens, because it's guesswork, it's science. You can be right, you can be wrong. I was also shocked to see the impact of the team on your performance. If you're a business manager, you're happier with your work if you're paired up with somebody who's good with stats. That makes sense to me, because when you look at data projects, they necessarily involve the analysis of data. If you're a business expert who doesn't have expertise in analyzing data, you need somebody with good quant skills. Otherwise, your project's going to fail.
I think your slide 24, which talked about that, was the thing that I had never really considered. I think it's probably one of the great lessons from what you've been doing here in terms of, right, what are the... Is that what you were talking about? There was a slide that talked about the impact of each category on the other categories. I think when we talk about analytics and putting together teams for data science, that's something that is so often overlooked, where it's a last-minute thing.
For me, the most surprising thing, and I say this almost tongue in cheek, was the developers rating themselves highest on communication skills, because that hasn't really been my experience, but it's an interesting thing. So I want to open it up to some of the participants' questions. You normally do this for me. I'm looking to see questions. You didn't leave me alone here?
I'm here. What do we have here? With my sound issues today, there are questions that are coming in. The most recent question is, would you recommend having at least one level of expertise in each of the three factor areas versus having high proficiency in all factor areas within your silo?
I would recommend finding at least somebody who has advanced knowledge in a given topic area and working with them. Remember the creatives, the creative data scientists; they identified as hackers or more creative people. They weren't strong in any one field, any one skill set. We found that, for them, nothing affected their performance or their attitude about the work they do, because I think in order to be successful, you have to have at least advanced knowledge of a given topic area. I think that's a general statement for any kind of skill set. If you're asked to do something in data analysis, for example, and you don't know stats, I think you're going to suffer and you're not going to be happy with the work that you do. So I think you need to be working with people who are the best that you can find in that area, who at least have advanced knowledge. And how do you determine that? I mean, you look at resumes, past work, job interview questions, but still focus on getting the best. And the point I'm trying to make is that if you do find somebody who's good in, say, stats, they are probably not going to be good in programming or technology. They might be, but the chances are they're not.
So data science is necessarily a team sport, because people are going to have their one or two strengths, and that's about it. So if you want to have a complete data science team, get experts in all three of those areas: technology, stats, and subject matter expertise.
Yeah, I think one of the other things that I really appreciate you bringing out there was using the scientific method as your approach to the questions that you're asking. What are you looking for in the data set? And as you pointed out, that starts with somebody who understands the business, because they know what you're going to need. And perhaps depending on the type of analysis you're doing, whether you're using data science in general business analysis or you're doing it for operational or long-term forecasting, if it's sort of a bigger project, the need for the different skills will come in phases. So you have to start out by understanding that.
Right.
One of the problems is, and as your slide pointed out, the folks with the technology expertise can come a little later, but right after the business, you need the stats ability. Frankly, business people may ask questions that can't be answered with the data that you have.
Exactly. Exactly. I do like the scientific method. And with other methods I've seen on the web, from other pundits and consultants, even though they make up their own terms to describe their particular method, it still roughly follows the scientific method. You've got to start out with a problem statement. Like, why are we even doing this project? If you don't even have that, you'll just end up analyzing data, which can be fun, because I've done that many times in my life. Like late at night, you've got a data set, you just start looking at stuff. But typically, you want to have a purpose for why you're doing this analysis. And that'll guide where you're going. And I'm not saying you have to do this every time, but I think it's a good approach.
It makes you think critically about not only the question you're asking, but how you actually operationalize it and define the metrics that are used in your analytics.
Great. So additional questions are coming in. How do you best promote a data science strategy in a non-data-centric enterprise?
Oh, that's a simple question. I'm doing a research project right now looking at the difference between analytical leaders and analytical laggards. Analytical leaders are companies that use analytics to get a competitive advantage, versus the laggards, who don't have the ability to do that. And I find that there are differences between leaders and laggards in a host of things around how they build up their customer program, whether it be customer experience, customer success, what have you. I find that analytical leaders tend to have a lot more executive support. So initially, you've got to have people at the top who believe in this, who actually use data to make decisions and can talk about it intelligently. One of the things I stress to my clients is that I encourage the top brass of companies to at least take an introductory course on statistics or math, just to be comfortable with numbers and what they can tell you and what they can't tell you. So, top executive support. And also, sharing the results company-wide is another big differentiator between analytical leaders and laggards. So make sure you share your results not only inside your company, but also outside, at conferences, so people know who you are. Maybe you'll get feedback at a conference on your project. Maybe it'll make you better. Also, I find that analytical leaders tend to use more of the latest technology, like machine learning capabilities. They have more access or better access to data scientists on their team than analytical laggards. And also, they tend to use customer data platforms that kind of automate insights.
So there are many things that differentiate leaders and laggards. But if you want to start with something, I think you start with leadership: get them on board and get them talking about the value of data and analytics.
Thanks. All right. So what are some of the best practices, Bob, in tying together data science and analytics and business analysis, and what are you seeing practically in terms of the benefits that it's bringing to businesses?
Well, I don't have any research on that particular question. I can just speak from experience. I think you need to be clear about the things you're measuring. That's why I like the scientific method, because it makes you be critical about the questions you're asking and how you define them. Because when you pose questions, you want to make sure that these questions are testable. You can't just look at a static average of something and draw some deep, meaningful conclusions from it. So be clear about what you're trying to predict, and develop measures that are good measures of that construct. One of the biggest problems I see in the area of customer experience and customer success management is that people just throw around terms like, for example, customer engagement. And it means different things to different people. And if you're measuring engagement from a survey, then I want you to look at the questions and tell me how that metric is different from other metrics you may have used or may have called something else. For example, I often hear the words customer loyalty, customer sat, and customer engagement used in the same article. When you look at the questions, they're roughly the same questions. So you're just measuring an overall attitude. So be clear about what you're measuring, define it, and be clear about what kind of outcomes you're looking for. And I think that's a big problem. If you can solve that problem, being clear with your measurement, I think that's a good first step in a great project.
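One rough way to check the overlap Bob describes is to correlate the survey scores that sit behind each label. The sketch below is illustrative only: the scores are made-up numbers, and the metric names (`loyalty`, `satisfaction`, `engagement`), the helper names, and the 0.8 cutoff are all assumptions, not figures or methods from the study.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def overlapping_metrics(scores, threshold=0.8):
    """Return metric pairs whose correlation meets the (assumed) threshold."""
    flagged = []
    for (name_a, a), (name_b, b) in combinations(scores.items(), 2):
        r = pearson(a, b)
        if r >= threshold:
            flagged.append((name_a, name_b, round(r, 2)))
    return flagged


# Hypothetical per-respondent scores on a 0-10 scale (not real survey data).
survey_scores = {
    "loyalty": [9, 7, 3, 8, 5, 10, 2, 6],
    "satisfaction": [8, 7, 4, 9, 5, 9, 3, 6],
    "engagement": [9, 6, 3, 8, 6, 10, 2, 5],
}

for name_a, name_b, r in overlapping_metrics(survey_scores):
    print(f"{name_a} and {name_b} move together (r = {r})")
```

If every pair is flagged, as it is with these invented numbers, the three labels are probably capturing one overall attitude rather than three distinct constructs.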
Great.
Can I just jump in there, Jen, because you were asking about sort of business analysis. And I see, as I'm trying to go through the thread here, that someone had asked about business analysis, which was in the original title, and how it maps to Bob's research. I think one of the reasons I wanted to cover this topic in this forum is that we talk about big data all the time. We talk about analytics. We talk about data scientists. To me, when we look at making business decisions, there are kind of two major categories. You can have a data-driven decision, but you can also have a data-driven strategy. And sometimes what happens is we're doing business analysis trying to figure out the strategy, and I've talked about this in a couple of the other webinars, going back to some of the stuff I did teaching in business school. You know, trying to figure out what you want to be as a business, those are the kind of questions that require different analysis from something where you're measuring customer satisfaction. You release something. You do A/B testing, whatever that may be. And you want to say, well, okay, what's the result? And I think both of those are amenable to the use of the scientific method that Bob outlined here. And both of those are things that you can improve. When I say both, I'm talking about tactical decision making versus overall strategy. If you take the time up front to think it through, if you want to be data-driven and really make your decisions based on data, this approach of building a team, making it a team sport, is so important. I used to do some management workshops with a fellow who said that he'd interviewed managers at companies all around the world, and the one that stuck in his mind was a manager who said that decisions in their company were always made by gut feel, and basically whoever had the biggest gut was the one that had the biggest impact.
And, you know, I'll just toss in one more data point, and then you can do with this as you like. I also worked with someone else who's a well-known name in the industry. And they were doing a TV commercial and showed up at the studio, and they hadn't had to wear a suit in a while, because they didn't need to, and realized that the suit jacket didn't fit. And so, sitting them down, they ripped the jacket right up the back, because you're only being shot from the front. And that's when I came up with my own law of success, which is Rutherford's Law of Success. If you ever look at Brooks Brothers, they generally have three lines of suits, and the more expensive the suit, the smaller the difference between the chest size and the waist size. And that goes with the gut feel. This is, you know, my own data, that as you reach a certain level, and that's backed up by empirical evidence. And the only reason I say that is because I think that some things are backed up by data and some things are backed up by intuition. And if you really want to be able to justify your business analysis, the way to do it is to follow the guidelines or the recommendations. It wasn't put that way in the presentation, but this whole team approach, and when you need which skill, I think is something that people really should be looking at. And I haven't seen them look at it anywhere else. Come on. Shannon, do we have any more?
No, that was it for questions. But we are just, you know, coming up right at the top of the hour. And to answer the most common question that we have had throughout the presentation, just a reminder that I will send a follow-up email by end of day Monday with links to the slides and the recording of this session, as well as the information to contact Bob. Bob, thank you so much for joining us this month. It's been a pleasure having you as guest speaker, and Adrian, thanks as always.
Thanks. Thank you.
I really appreciate the offer. Thank you very much, guys. Well, thank you. Well, thank you. Thanks. Take care, folks.