Live from the Mandalay Bay Convention Center in Las Vegas, Nevada, it's The Cube at IBM Insight 2014. Here are your hosts, John Furrier and Dave Vellante. Okay, welcome back everyone. We are here live in Las Vegas for IBM Insight. It's The Cube, our flagship program where we go out to the events and extract the signal from the noise. I'm John Furrier with my co-host Dave Vellante of wikibon.org. Our next guest is Amit Bulat, program director, master's in data science, from Istanbul City University. Welcome to The Cube. Thank you, John. We had a chat on opening night here inside the social lounge. It was a great chat and I got to tell you, I was fascinated. I could have talked another hour, so we got the live streaming here. So let's try to recreate the magic. Let's do it. First, data science, obviously, you know, there's a paper you wrote two years ago. Yes. That's now fashionable again, so we talked about that. But in general, the theme of big data is essentially good coding, good science, and real time. So let's talk about that analytics. What's your take on the current state of the union for data science from a practitioner standpoint, from a student perspective, research, and application in business? Where are we? Still early? What are the cool things that are going on? Okay, so John, I'm here for the BDA EdCon, which is the Big Data Analysis and Education Conference. So it's the second time we've been doing it. And basically, the academicians come together and discuss, are we ready in data science? I mean, how are the schools ready in data science? And I came here and I gave a talk on our graduate program, and I talked about the kinds of classes that we have. So it's very early. I mean, we just got our first batch of students in fall 2014, so it's quite new. And we've been discussing, should we add in more classes or is this class right? Is it going to answer the industry's needs? So we're still working and I think we're getting there, so we're progressing.
So I think we're still in the midst of discussing, is there a data scientist person specifically, or is data science just a concept? So one of the things that we see, and this is a good conversation to drill down on, is data science is obviously hot. You're seeing people talk about, oh, I want some more data scientists. There's no yellow pages. There's no directory. Give me the data science. It's usually they're growing up on their own, if you will, through either computer science or just some weird math skill, physics, and anthropology, I know guys who are really killer. It's just a different breed right now. So there's just not enough data science going around. So I've got to ask you, one, what are the things that are going on to automate data science? And two, what are the things that you see in software that'll make data science easier, for people to apply data science skills, not so much be coding Python or doing algorithmic machine learning code? Okay, so I think, as I said, I don't believe in the title data scientist, but I believe in data science as a discipline. And the reason is, for example, I see computer scientists playing a pivotal role in data science projects. But I mean, what about the other people? For example, computer scientists have to work with domain experts, sociologists, biologists, psychologists. So we have to bring them on board. If you call this computer science, no one is going to be on board. We can embrace everybody around data science because data can sort of embrace everybody. And then we can just sit together with a statistician, with a biologist, with a domain expert, and a computer scientist is going to be there with a specific skillset to drive it forward. So that's what I call data science. So data scientists, I don't think should exist, but data science should exist. It is the table that brings all those practitioners together and they work on the project.
Well, a lot of people are saying they're data scientists and it's actually weird because they might not even be. So you're making a distinction. There's data science. Not so much a data scientist. People call themselves data scientists so they can make more money. Look, that's what it is. I mean, if you're a statistician and you know how to code in R, well, you're a cool data scientist. And you just got a raise. And then there's a false expectation because people think there's magic juju in there, right? Dave's like, it's a study case. That's why I'm doing it. Well, but now in fairness, someone like Hilary Mason would say, well, take your data science disciplines. A data scientist would be somebody who has those skills, which is pretty rare. It would be. I mean, come on, I did my PhD and it is all about specializing. It's all about picking a topic, deep diving in it. So that's how you can be proficient in something. I mean, do we really expect one person to have data visualization, data analytics, scalable systems, business intelligence? No, come on. Well, but wait now. So it might take a while to get those skills, but you could study and practice. IBM can help. So we'll come back to that. But an individual could, in theory anyway, over some period of time, say a decade, study those different disciplines and maybe even apply them and come out with... They might change. So we've got to be agile. So your argument would be they have to be specialized to keep up with the changes. Exactly. So they have to be agile, they have to embrace the change. It's going to change. So let's not expect that business analytics is the only skill, the top skill there. No, no, it might change. So how do you embrace it? How do you change? How do you adapt? Those are the kinds of things, or social skills, that are really important, I think. So we had a question in the CrowdChat, but it's kind of related to more of the business part.
I want to apply it to our more technical conversation. The question is from Mark Bullander. What are the best traits of your business partners that have helped you achieve your business outcomes? I'll translate it for this conversation: it's a team sport, data science, and collaborative computing is very much what's going on in this world. So what traits do you see in partners or peers, if you will, to help achieve a result? I want them to be doers. Basically, I want them to, you know, lean in to the table, and I should do it too as well. Lean in, the Sheryl Sandberg. Leaning in is the key, I think. So that is because I'm going to be working with a statistician. So she or he should be willing to pitch in anytime that is needed. So I am the computer scientist. I need to massage the data. I should do it. So it should be a collaborative effort. So as I said, everybody should lean in. It's a project, it should be agile and we need to just, you know, step by step work towards the solution. That's... What's the state of data science and big data in Istanbul and your part of the world? I think it is, I would say, for example, the first master's in data science program that I saw was at UC Berkeley. And this was, I think, two, three years ago. And I may be mistaken, but that's when I saw it. So in Istanbul, it's the same. So I think we are the first program in Istanbul or in Turkey, probably in the region, in the EMEA region, that has a master's program in data science. So that's how early we are. And we just opened up and we got our first students in fall 2014, so this semester. And you do that based on demand from society? Demand or, I mean, I am always, you know, tapping into industry. And I was here last year, and actually, I lived in the States for 10 years. So I'm always, you know, up to date on what's happening in this industry, what are the demands, what is needed.
So I'm just trying to be visionary and trying to, you know, move on earlier than my... Did you go to school in the States? Yes, I did. And then we kicked you out? No. No, so we usually do it. I got my PhD at UCSB, you know, I think the loveliest in the States. My son goes there. Yeah, it is, it's a great place. It's a great place to live. He lives in IV. A lot of distractions, a lot of distractions. That's why it takes a while to finish your PhD. You don't want to rush. Which is good. I mean, you can just, you know, take as much time. So what's going on in Istanbul? So like, you're in a part of the world that's really exploding. You're seeing a lot of the Eastern Bloc countries, from a computer science standpoint, you know, do a lot of things. You also get a bad rap on the hacking side. So you also have, you know, you're close to the territory and there's a geopolitical scene going on, you know, south of Turkey. So what's the environment like? You got good guys, you got bad guys. Are the coders up to speed? I mean, they get a bad rap sometimes by some people in America saying, oh, those guys are hacks. What's the state of the computer science? I'm hearing mixed messages there. I think, I saw it was a tweet, I think. And it was saying there are a few languages that make people better at mathematics and Turkish is one of them. So you can see the computer scientists in Turkey as good in math, good in coding skills. Do they have the vision? Not yet. So I just want to be honest. So we don't have the vision, but we do have the mathematical analysis skills. So that is the, in a way, state. That's the DNA. That's built in. That's the DNA. So, and going back to your statement, I think I am very lucky to be in Istanbul because there are problems and they're complex. Because problems usually have many, many dimensions. It's not just a mathematical problem. It's not a computer science problem. It is a problem that has a socio-economic dimension.
It has a psychological dimension, a historical dimension. So the problems that we have, it's like a melting pot in Istanbul, big problems. And I think it's a good place to be in if you're going to do data science. And you have challenging problems there. I mean, there's a lot of activity there. Certainly we see it there. We've seen a lot of innovations coming out of that area. I got to ask you about the beaches in Turkey because everyone loves to talk about the beaches. They do, from Europe. You know, it's a favorite destination spot. And there's a... There's a city, Istanbul itself. If you like history, it is a must. It's a very old city. It is a bucket list city. Yeah, and people love going there. Just in terms of the beaches and the oceans, awesome. Let me get that out of the way. People want to know. I mean, you know, things going on. Internet of Things project going on, Dave? I've not been, but I have a good friend who's been and says it's a must-see. It's a must-see. I mean, how long is enough? Well, it's never enough, but spend at least a week to sightsee, to see all the historical places. And I think the historical peninsula is the place you want to go. That's where the Ottoman Empire was based. So I mean, I got to get your perspective on the computer science, back to that thread, and bring it back to the IBM event here. We were talking at the opening night about the programs. What is the modern computer science program structure out there? There's a variety of different tracks. We don't need to go into great detail, but there seems to be a mindset right now for this new generation. I'm old school. We did Assembler, we coded a lot of stuff. Now, computer science is so much more exciting. I mean, you've got the math angle, there's so much going on in computer science. It really is an exciting time. So the new kids coming out of college and the universities, they have an awesome opportunity in front of them. Startups, there's transformations going on.
What should be the curriculum? What is being taught? What are the highlights? What is good and what is not so good? Okay. I think this is also the topic of the presentation that I gave at IOD 2013. And this is also the kind of decision that we made in our school. So I was lucky to be at Istanbul City University, which is a startup university. And I was very hands on in all the decisions that were made, which was a good opportunity for me. And we decided to use Python, basically. And this was a key decision because I think my key design criterion when I was designing the curriculum was this. Whatever I do, I shouldn't lose the students' excitement or curiosity. I mean, this was the key design criterion. Make sure that you're not losing the student. He or she is basically progressing with some programming language, which is going to make it easy for him or her to kind of interact with the platform, with the data. So I think there Python is the key. So therefore we decided we're going to teach everything in Python. And this was very radical because all the old schools in Turkey and also in the U.S., they teach, for example, in C or Java, which are sort of mature languages. But I think, also coming from the education from this, let's say, from this channel, I know the difficulties because you get bored. It is easy to kind of get bored or to get sort of, you know, stuck with the challenges of the platform itself. So we have to... Well, they teach it because it's popular. It is popular, but... It's everywhere, but Python works for Google. Python works for Google, so I get a lot of feedback and also back pressure saying, everybody is using Java, why are we using Python? Wait, wait, we'll see, because I want people to stay active, agile. I want them to kind of get into it. If they want to learn Java, they will learn Java. They will have that self-esteem already built with a bit of Python, which is as easy as just open your terminal, type Python, and there you have Python.
So this is, I think, what I see or what I would like to see also in other programs, basically making sure that students are agile and they don't lose their creativity and curiosity. And I think we should ask, as the academicians, what is the right environment for that? And I think, as I said, Python is the key for that. Well, it's good to put them out of their comfort zone every now and then as well. They keep innovating. They have to be kept out of the comfort zone anyway, so that is a must. So I give you that. Yeah, okay, so... But independent of the programming language, what other requirements are there? Because you can learn a new language. I mean, relatively... I think we should teach them how to be social programmers. That's the first thing. What does that mean? Social programmers. So there is a concept in software engineering, or agile software methodologies, which is called pair programming. Usually, when you think about a programmer, you see someone who's lonely, who's kind of coding by himself, he has a task to finish. Get the hood on. Get the hood on. You have coffee and all that. But no, I think we have to make sure that we are basically raising social programmers. And one thing, how do you become a social programmer? Well, there is a concept called pair programming. Pair programming. And what it is, is, basically, we are in pairs. One of you is programming and the other one is sitting right next to you and reviewing the code, checking and also asking questions, verifying. Very collaborative. And very collaborative. Because, you know, pair programming. Paid programming? Is it called paid programming? No, pair programming. Pair. Pair. Oh, pair, pair. Got it. I thought you meant paid. Okay. It's like, yeah. Okay. They should get paid. Well, social programming. Sorry, okay. No, maybe it's free programming because it's social. So the buddy system, team. It is a, yeah, should be a team.
Exactly, buddy system would be, you know, I think it would be better, but anyway. Oh, no, it's good. Okay. So that is, as I said, I think we need to have social programmers. At the end of the day, as I said, going back to data science, they have to sit around the table with statisticians, with practitioners and do it right there. So, well, you should kind of give them that already, you know, right in school. So we should make them social programmers right off the bat. All right, so you guys started this program when? My team, you know, my department has been working on it, I think, a year and a half now, but we just started last September. So it took some time to get the curriculum. It took some time. And then, just last month? Last month, yes. Oh, okay. And how many students? We have 15 so far. And we are partnering with IBM, actually, also in this data science program. On the curriculum? So they're going to pitch in, maybe, as guest speakers, or, through the academic initiative program, we're going to have access to the products, like IBM's, you know, products. And we can teach them. Well, I want to ask you, what's next? So you write papers that two years later become mainstream, you're ahead of the curve. Okay. What's next? I mean, what are you working on now that you think is going to be kind of coming into the mainstream, that's going to be important? Okay, so I did my PhD on Data Stream Management Systems and I see now IoT gaining traction, which is the Internet of Things. And everything that is being discussed in the scope of IoT, I know it by heart, because I did my PhD on... That's what they call stream computing. Stream computing. So, or Data Stream Management. So, stream computing, yes. And also how do you manage the whole platform? So that's what I'm seeing. I mean, that's... So what do you think about the CrowdChat idea we were talking about, because that's a stream.
I mean, streaming is how people are talking now, online. So I want to get your take on how streaming can be not just for things, machines, but people. People are streams. I mean, we're streaming ourselves, right? So where do you see that going? Because it's very unstructured. Twitter's earnings were out today, talking about cohort analysis, essentially relationship mapping. That's all math, right? If I'm not mistaken, what you have shown me in CrowdChat, you were organizing the streams around hashtags. So basically you're helping the community to be organized. So in some way that's your biggest, I think, help. So I see CrowdChat actually growing under that. And what about this idea of cognitive computing, essentially reasoning? I mean, because now you have context, which is like the in-stream stuff, relationships across, and so database work, and then the streaming stuff gives you a real-time component. That's really compelling. I get that, right? So now let's go on top of it. Actions, how do I create more insights? The human component, a little bit of a cognitive, that's what they're calling it, but to me it's more of, okay, I got all this data, I got all the ingestion going on, I'm doing some analysis, some algorithmic stuff going on, automating, orchestrating. Now what do I do with it? Security prevention, it could be people relationships like CrowdChat, other things. Where do you see that going? I mean, I think it's still early for cognitive computing, and so, your statement about Twitter and Facebook. So yes, we are sensors, yes, we are basically spitting out information, but just think of it this way: if we're not too lousy in our usage of Twitter and Facebook, we are spitting out quality information.
Basically, you look at, for example, a sensor reading, you look at something else, and you basically combine a lot of information all together, and you spit out something in your Twitter message, and now it is digital, it is textual, and we can mine it, the machines can mine it. So basically that is how we people got into the whole game. So cognitive computing, I think it is too early on. I see people going back to neural networks and trying to revive them again. So I think it's still in a bit of space. It's still a bit early, but you can connect the dots, so you can say, okay, if we do the in-stream processing, if we do have some algorithmic coolness around machine learning, you can almost project, okay, the next evolution is some sort of personalized experience and/or value, you can get there. Or can you? Come on. One more statement about this is, I think we need to be careful though, John, because there's a lot of data, yes, but does that data also have our big biases in it as well? Yes, it does. So in a way, you need to be careful. It is us who spit out the data, but it is the system that drove us to kind of give those observations and outcomes, so we need to be careful. So it could be our biases that kind of gave rise to those kinds of observations. So even if you think, ah, ah, ah, you know, I discovered something, probably you have discovered a big bias there. And you can also game it as well. There's gamification, if you know the biases are built in. Exactly, so that's a great point. So exciting, it really is an exciting field. It really is. You're doing some great work, really appreciate you coming on The Cube. It really is, again, computer science is, to me, so exciting right now, and so much opportunity. You really can't bump into something that's not going to be a big opportunity with computer science, certainly around analytics and cloud. This is The Cube. We'll be right back after this short break, live in Las Vegas.
At IBM Insight, we'll be right back.