Live from Los Angeles, it's theCUBE, covering Open Source Summit North America 2017, brought to you by the Linux Foundation and Red Hat.
Hello everyone, welcome back to theCUBE's live coverage of the Open Source Summit North America, part of the Linux Foundation. I'm John Furrier, your host, with my co-host Stu Miniman. Our next guest is Tanmay Bakshi, who's an IBM Honorary Cloud Advisor, algorithmist, and CUBE alumni. Great to see you.
Thank you very much, glad to be here.
You get taller every year. It was about three years ago, two years ago?
I believe, yeah, two years ago, Interconnect 2016.
IBM show, doing a lot of great stuff. You're now an IBM VIP, you're doing a lot of work with them. IBM Champion, congratulations.
Thank you, absolutely great.
What's new? You're pushing code today?
Definitely. Today I'm getting ready for the BoF that I've got tonight. It's been absolutely great, and I've been working on a lot of new projects that I'm going to be talking about today and tomorrow at my keynote. I've been working on AskTanmay, of course. Interconnect 2016 was the very first time I presented AskTanmay, and since then a lot has changed. I've incorporated real deep learning algorithms, custom-built with TensorFlow, into AskTanmay. AskTanmay now thinks about what it's actually looking at, using Watson as well. It's really interesting. And of course, there are new projects that I'm working on, including DeepSPADE, which basically helps online communities detect, report, and flag spam on different websites, like, for example, Stack Overflow, which I'm working on right now.
So you did some deep learning stuff?
Yeah.
You did some Watson, everything else.
Exactly, yes.
What's the coolest thing you've worked on since we last talked?
Well, it would have to be a tie between, say, AskTanmay, DeepSPADE, and advancements with the Cognitive Story.
As you know from last time, I've been working on lots of interesting projects, like AskTanmay, with some great new updates that you'll hear about today. DeepSPADE itself, though, I'd like to get a little bit more into. I mean, of course, everyone listening right now has used Stack Overflow or Stack Exchange at some point in their lives. And so they've probably noticed that, here and there, you'd see a spam message on Stack Overflow in a comment or post. And of course, there are methods to try and prevent spam on Stack Overflow, but they aren't very effective. That's why a group of programmers known as Charcoal SE went ahead and started creating a suite to try and prevent spam on Stack Exchange. They call it SmokeDetector, and it helps them find and remove spam on Stack Exchange.
A smoke detector is so good until it goes off and the battery needs to be replaced and you've got to get on a chair. But this whole SmokeDetector, this is a real way to help create a good, healthy community.
Yes, exactly. So they try and basically find spam and report it to moderators, and if enough alarms are set off, they try and report it or flag it automatically via other people's accounts. And so, a few weeks ago, when I found out about what they're doing, I found out that they use regular expressions to try and find spam. They have years of experience, they're experts in this field, and they keep adding more regular expressions to try and find spam. And since I'm really, really passionate about deep learning, I thought, why not try and help them out, try and augment SmokeDetector with deep learning? And so they graciously donated their dataset to me, which has a good amount of training rows for me to actually train a deep learning system to classify a post as spam or non-spam.
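To make the regular-expression approach described above concrete, here is a minimal Python sketch. It assumes a handful of made-up patterns and a hypothetical alarm threshold; none of these are SmokeDetector's real rules, and the function names `count_alarms` and `is_flagged` are invented for illustration.

```python
import re

# Hypothetical patterns for illustration only; not SmokeDetector's actual rules.
SPAM_PATTERNS = [
    # Sales pressure phrases such as "buy now" or "order now".
    re.compile(r"\b(?:buy|order)\s+now\b", re.IGNORECASE),
    # "call (us) (at)" followed by something shaped like a phone number.
    re.compile(r"\bcall\s+(?:us\s+)?(?:at\s+)?\+?\d[\d\s().-]{7,}\d", re.IGNORECASE),
    # Links whose URL contains a promotional keyword.
    re.compile(r"https?://\S*(?:cheap|discount)\S*", re.IGNORECASE),
]


def count_alarms(post: str) -> int:
    """Return how many of the patterns match somewhere in the post."""
    return sum(1 for pattern in SPAM_PATTERNS if pattern.search(post))


def is_flagged(post: str, threshold: int = 2) -> bool:
    """Flag a post only when at least `threshold` patterns fire."""
    return count_alarms(post) >= threshold


spam = "Buy now! Call us at +1 (555) 123-4567 or visit http://cheap-pills.example"
ham = "How do I call a function at runtime in Python?"
print(count_alarms(spam), is_flagged(spam))
print(count_alarms(ham), is_flagged(ham))
```

Requiring several patterns to fire before flagging mirrors the "enough alarms" idea from the interview. The weakness is also visible: a spam post that avoids every listed phrase sets off no alarm at all, which is the gap a learned classifier is meant to cover.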
And you'll be hearing a lot more about the model architecture, the CNN plus GRU model that I've got running in Keras, tonight during my BoF.
Now, machine learning could be a real benefit to spam detection, because scammers tend to have their own patterns, as do bots.
Yes, exactly, exactly. And eventually, you realize that, hey, maybe they're not using the same words in every post, but there's a specific pattern of words, or a specific type of word, that always appears in the spam message, and machine learning would help us combat that. And of course, in this case, maybe we don't actually have a word or a specific website or a specific phone number that would trigger a regular expression alarm, but from the context in which this website appears, machine learning can tell us that, hey, yeah, this is probably a spam post. So there are lots of really interesting places where machine learning can tie in with this and help out with the accuracy. In fact, I've been able to reach around 98% accuracy on around 15,000 testing rows, so I'm very happy with the results so far, and of course, I'm continuing to do all this parameter tuning and everything.
All right, so how old are you this year? I can't keep the numbers straight. Are you 13, 14?
Well, originally, at Interconnect 2016, I was 12, but now I'm 13 years old and I'm going to be 14 in October, October 16.
Okay, so you're knocking on 14.
Not just yet, Mary, I will be 14 in June, all right.
So Tanmay, you're 14, your time's done at this point. But to be serious, one of your missions today is helping to inspire the next generation, especially here at the Open Source Summit. Why don't you talk a little bit, give us a preview of what we're going to see in your keynote.
Sure, definitely. As you mentioned, I actually have a goal, which is really to reach out to and help 100,000 aspiring coders along their journey of learning to code and, of course, applying that code in lots of different fields.
In fact, I'm already at around 4,500 people, which I'm very, very excited about. But today, during my BoF, as I mentioned, I'm going to be talking in depth about the DeepSPADE and AskTanmay projects I've been working on. And tomorrow, during my keynote, you'll be hearing more generally about all the projects that I've been working on and how they're impacting lots of different fields, like healthcare, utilities, and security, via artificial intelligence and machine learning.
Yeah, so when you first talked to us about AskTanmay, it's been, what, almost 18 months, I think? I assume. What's changed? What's accelerating? I hear you throw out things like TensorFlow, not something we were talking about two years ago. What have been some of the key learnings you've had as you've really done this?
Yeah, sure. In fact, this is actually something that I'm covering tonight. And that is that AskTanmay, you could say, gets its DNA from AskMSR, which was made in 2002. I took that, revived it, and basically made it into AskTanmay. In its DNA, there were specific elements. For example, it really relies on data redundancy. If there's no data redundancy, then AskTanmay doesn't do well. If you were to ask it where the Open Source Summit North America is going to be held, it wouldn't answer correctly, because the answer isn't redundant enough on the internet. It's mentioned once or twice, but not more than that. And so I learned that it's currently very, I guess you could say, naive in terms of how it actually understands the data that it's collecting. However, over the past, I'd say, six to seven months, I've been able to implement BiDAF, or Bi-Directional Attention Flow. That was created by Allen AI, and it's completely open source.
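The data-redundancy idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not AskTanmay's or AskMSR's actual code: the function names, the stopword list, and the sample snippets are all hypothetical, and real systems fetch their snippets from a search engine rather than a hard-coded list.

```python
import re
from collections import Counter

# Hypothetical stopword list, kept tiny for the example.
STOPWORDS = {"the", "a", "an", "of", "in", "is", "was", "to",
             "and", "at", "on", "be", "will", "held"}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def candidate_ngrams(snippet, max_n=2):
    """Yield 1- to max_n-word phrases that contain no stopwords."""
    tokens = tokenize(snippet)
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            gram = tokens[i:i + n]
            if not any(tok in STOPWORDS for tok in gram):
                yield " ".join(gram)


def rank_answers(snippets, question):
    """Rank candidate phrases by how many snippets mention them."""
    question_tokens = set(tokenize(question))
    counts = Counter()
    for snippet in snippets:
        # Count each phrase once per snippet so one long page cannot dominate.
        for gram in set(candidate_ngrams(snippet)):
            # Skip phrases that merely repeat words from the question.
            if not set(gram.split()) & question_tokens:
                counts[gram] += 1
    return counts.most_common()


redundant = [
    "The Eiffel Tower is located in Paris.",
    "Paris is home to the Eiffel Tower.",
    "Visit the Eiffel Tower in Paris, France.",
]
print(rank_answers(redundant, "Where is the Eiffel Tower located?")[0])

sparse = ["The summit will be held in Los Angeles."]
print(rank_answers(sparse, "Where will the summit be held?"))
```

With three snippets agreeing, one candidate clearly outvotes the rest. With a single snippet, every candidate has a count of one, so counting alone cannot pick a winner, which is the weakness the interview describes for rarely mentioned facts.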
And it uses something that's called the SQuAD dataset, or Stanford Question Answering Dataset, in order to take paragraphs and questions and try to return answers as snippets from the paragraphs. And so, again, integrated into AskTanmay, this allows me to really reduce the data redundancy requirement. I'm able to merge very similar answers to have better answers at the top of the list. And, of course, I'm able to make it smarter. It's not as naive. It actually understands the content that it's gathering from search engines, for example Google and Bing, which I've also added search support for. So, again, a lot has changed using deep learning, but the key points of AskTanmay are still there: it requires very little computational power; it's very, very cross-platform and runs on any operating system, including iOS, Android, et cetera; and, of course, it's completely open source on GitHub.
So, how has your life changed since you've been really in this spotlight, and well-deserved? I think it's been great to have you on theCUBE multiple times. Thanks for coming on.
Thank you. Definitely, of course.
Dave Vellante was just calling. He wants to ask you a few questions himself. Dave, if you're watching, we'll get you on. Just call right now.
Thank you.
What's going on? What are you going to do? Are you happy right now? Are you cool with everything? Or is there a point where you say, hey, I want to play a little bit with different tools? You want more freedom? What's going on?
Well, you see, right now I'm very, very excited. I'm very happy with what I'm doing. Because, of course, my life generally has changed quite a bit since last Interconnect, you could say, from Interconnect 2016 to 2017 to now. Since then, I've been able to go into lots of different fields. Not only am I working with general deep learning and IBM Watson, I'm working with lots of different tools.
I'm working especially in terms of, for example, Linux, and what I've been doing with open source and everything. For example, AskTanmay now integrates Keras and TensorFlow. DeepSPADE is actually built entirely on TensorFlow and Keras. Now I've also been able to venture into lots of different APIs as well, not just IBM Watson but also things like the Dandelion API. AskTanmay also relies on Dandelion, which provides text similarity services for semantic and syntactic text similarity, which, again, I'll be talking about tonight as well. So yeah, lots has changed. And of course, with all these new media through which I'm able to share my knowledge, like, for example, all these CUBE interviews I've been doing and, of course, all these keynotes, I'm able to really spread my message about AI and why I believe it's not only our future but also our present. For example, and I also mentioned this last time, if you were to just open up your phone right now, you'd already see that half of your phone is powered by AI. It's detecting that, hey, you're at home right now, you just drove back from work, and it's this time on this day, so you probably want to open up this application. It predicts that and provides you with that. Apart from that, things like Siri and Google Now are all powered by AI. They're already an integral part of our lives. And of course, what they're going to be doing in our lives to come is just absolutely great, like in healthcare, providing artificial communication ability to people who can't communicate naturally. I think it's going to be really, really interesting.
Tanmay, always great to have you on theCUBE. Congratulations. AskTanmay, good projects. Let's stay in touch as we start to do more collaboration. Love to keep promoting your work. Great job. And you're an inspiration to many.
Thank you very much. Glad to be here.
Live coverage from the Open Source Summit, theCUBE in Los Angeles.
I'm John Furrier, with Stu Miniman. We'll be back with more live coverage after this short break.