Hello and welcome. My name is Shannon Kemp and I'm the Chief Digital Manager of Data Diversity. We'd like to thank you for joining the current installment of the monthly Data Diversity Smart Data webinar series with Adrienne Poles. Today Adrienne will discuss a machine learning update and an overview of technology maturity and product vendors. Just a couple of points to get us started. Due to the large number of people attending these sessions, you will be muted during the webinar. We'll be collecting questions via the Q&A in the bottom right-hand corner of your screen. Or, if you'd like to tweet, we encourage you to share highlights or questions via Twitter using the hashtag #SmartData. If you'd like to chat with us and with each other, we certainly encourage you to do so; just click the chat icon in the top right-hand corner. And as always, we will send a follow-up email within two business days containing links to the slides, the recording of this session, and additional information requested throughout the webinar.

Now let me introduce our speaker for today, Adrienne Poles. Adrienne is an industry analyst and recovering academic providing research and advisory services for buyers, sellers, and investors in emerging technology markets. His coverage area includes cognitive computing, big data analytics, the Internet of Things, and cloud computing. Adrienne co-authored Cognitive Computing and Big Data Analytics, published by Wiley in 2015, and is currently writing a book on the business and societal impacts of these emerging technologies. Adrienne earned his BA in psychology and MS in computer science from SUNY Binghamton and his PhD in computer science from Northwestern University. And with that, I will give it to Adrienne to get us started.

Hello and welcome. Thank you, Shannon. As usual, it's an adventure just being here. Thanks, everybody, for joining us. If you missed Shannon's earlier message, I'm literally semi-in-the-dark here: we've been without power for about 14 hours, so it's been fun just trying to get online. Anyway, happy to be with you today to talk about machine learning. The plan today is to put machine learning in context, talk about the maturation of some of the technologies involved, and then do a brief overview of the market for commercial machine learning products and services.

With that in mind, I didn't originally plan to do any definitions of terms. But yesterday I was reading a report from a research firm that will go unnamed, and they so butchered the distinction between artificial intelligence, machine learning, and data science that I thought I'd better put a framework around it so that we're all talking about the same thing. The way I look at the world, artificial intelligence is the broadest of the categories here. AI, historically, going back to the 50s, has been a discipline that brings in ideas, techniques, and algorithms from a variety of disciplines. In general, it refers to an approach to solving the kinds of problems we once associated with the human brain. So when we talk about artificial intelligence, we're really talking about machine-oriented systems or solutions that perform human-like cognitive functions. I'll get into that in just a minute. Within that, machine learning refers to systems that improve their performance on some artificial intelligence task based on experience with data.
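To make that definition concrete, here is a minimal sketch, assuming scikit-learn (an editorial illustration, not code from the talk): the same model, given more "experience" in the form of data, measurably improves on a held-out task.

```python
# Sketch of "learning = performance improving with experience": same model,
# more training examples, better accuracy on data it has never seen.
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for n in (50, 200, 800):                     # increasing "experience"
    model = LogisticRegression(max_iter=2000)
    model.fit(X_train[:n], y_train[:n])      # train on the first n examples
    print(n, round(model.score(X_test, y_test), 3))  # held-out accuracy rises
```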
So machine learning is the part of artificial intelligence that deals with self-learning, if you will — self-improvement rather than additional input from the system developer. If a system is running, gets some experience, and improves its performance along some dimension, then we would say the system is learning. If the system is running and we get some experience with it and say, "oh, we need to change a rule," and a programmer, developer, or user changes that rule, then that's not machine learning. The system is still improving, but with a human in the loop. Within machine learning, deep learning — which we'll get into quite a bit today — is a biologically inspired approach that uses multiple layers of abstraction, modeling with something that looks like a neural net. Data science sits outside that; data science is really about analytics. The relationship between them is that you can use AI techniques for data science — certainly machine learning, and deep learning — but you don't need data science to do artificial intelligence, and for many things in data science, you don't need artificial intelligence. So I just want to set the stage here: what we want to focus on today is the big green box.

Looking at the evolution: the classic problems in AI, going back to the early days, involve perception, understanding, learning, and a lot of problem solving, with communication between human and machine done in natural language. Classic AI looks at all those problems. When I think of modern AI, it's the same problems, but we've also applied deep learning techniques and big data. Someone asked me recently, what's new? It's not that the types of problems we're dealing with are different; it's the approaches we're able to take, because we have available to us large data sets and the infrastructure to process that data. That enables deep learning, and that enables us to solve the same problems in a different way.

Wrapping up the context over the next couple of slides: when we think of AI as the overall picture, the area within AI that we generally refer to as cognitive computing involves understanding, reasoning, and learning. And the model — forgive me if you've been with me on some of the other webinars; I really focus on the idea of a model — includes your algorithms, but also your assumptions. They may be implicit or explicit, but any assumptions you make about the way the data is structured, or the forms of things, go into the model. The talk today is on machine learning, so you might think it's just that "learn" part in the center. But in fact, it's really just about impossible to separate learning from understanding, and to get from data to some representation, we need to be able to apply some reasoning.

So, the last slide in this section. For machine learning, we need to have a representation of the data — this is often what we think of as our knowledge representation. It's not just ones and zeros; it's identifying context and meaning. I've expanded that here into intent and emotions. It's a multi-dimensional representation of the domain we're working in. And the same data coming in from the outside — the same string of words or utterances —
may be interpreted differently based on the context — based on who's saying something. And we're certainly expanding that these days with perceptive input, so even the same words from the same person may be interpreted differently given additional information derived about their emotions or about the context.

Now, the reason this is important when we start to think about machine learning: if we're going to take a system and measure its performance based on how accurate it is — perhaps in identifying something within a larger data set, in categorizing or classifying data, or in finding patterns — and we want to improve that performance, then we have to have a way of objectively measuring it. And to do that, we have to have a good understanding of the representation of the data. When we get into the way we're doing things with deep learning, we'll see that the representation is really very important. The mnemonic I use for cognitive computing is URL, because it's easy to remember: understand, reason, and learn. But more and more, in talking to people about the actual learning part, the R can stand for representation: you have to understand something, and you do that by representing it. Then, as you change the data that's being represented, that, in fact, is learning. So that's where it fits.

Now let's look at a couple of the major design choices here. This is important not just historically — you may look at it and say, well, who does it that way now — but the fact is that there are multiple ways to approach the idea of machine learning. The example page here is from the Azure Machine Learning Studio, but there are any number of different approaches and platforms, which we'll talk about in a few minutes. The difference I'm trying to get people to think about is that we can either represent the data and the patterns explicitly, using symbolic logic or other types of representation, so that we can plug them into mechanical theorem proving — then you're dealing with formal logic. Or, as a starting point, we can treat the inputs as what we might think of as raw data, whether structured, semi-structured, or unstructured. I will say that all data is structured; it's just a question of how much effort you have to put in to find the structure. The idea is that if we treat words in natural language as binary representations, the system doesn't assign any meaning to the words as they come in; they're just assigned a numeric value. If you were to parse everything I'm saying right now and put it into computer memory, each individual word would have no value by itself until you analyzed it in the context of the entire statement. That's when you start to get into statistical models instead of symbolic representation. What we'll see is that most things are actually hybrids: even if you're using a statistical model, at some point you're assigning meaning. But early on we need to look at that decision, and it fits with the idea of the model. If we're going to do learning, we have to have a model. If we're going to have a model, we have to understand what assumptions we're making. At what point are we assuming that we understand the meaning of the input?
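A minimal sketch of those two starting points, as I read them (editorial illustration — the rule set and intent tag below are made up for the demo): a symbolic approach codifies meaning up front via explicit rules, while a statistical approach starts from raw tokens and only derives meaning from patterns across a corpus.

```python
import re
from collections import Counter

utterance = "please reset my password"

# Symbolic: meaning is codified by the developer before any data is seen.
RULES = {r"\breset\b.*\bpassword\b": "ACCOUNT_RECOVERY"}   # hypothetical rules
intent = next((tag for pat, tag in RULES.items()
               if re.search(pat, utterance)), "UNKNOWN")

# Statistical: each word is just a token with a count; "meaning" would emerge
# only by comparing these vectors across many documents.
bag_of_words = Counter(utterance.split())
print(intent, bag_of_words)
```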
To take that question further: it could be, let's say, a system that uses a natural language front end — something like a modern chat bot — where we constrain the input so that even if we allow somebody to give free-form English, we're only looking for certain keywords or phrases. Or we could build a statistical model: let them say anything they want, analyze it in context over a very large data set, and create our own representation. The simple model here just shows that even if we start with a logical model in mind — we've assigned some meaning to each word — there is going to be a logical representation in memory for each piece of data we're concerned with. For the purpose of this diagram, it doesn't matter whether we're dealing with sequential memory or associative memory, with bits or qubits. The idea of the diagram is that if the purpose of the system is autocorrect or autocomplete on your phone, for example, you may want to look at the words differently, and position them differently in this arbitrary N-space, than you would if you were mapping for meaning. If you're mapping for autocorrect and you don't want to have to look at a lot of context, part of your model — part of your assumption as you're learning — might be that a person is more likely to type the right number of letters than the right letters. Then you start to look for things like: the most common error is that the left thumb is off by one key to the left, and the right thumb off by one key to the right. So the way we represent things is going to be very different for simple autocomplete/autocorrect than for meaning, where we have to understand what each word means and what the intent is, and we'll store things differently. Here, on the right we're storing things by concept or intent, and on the left it's purely mechanical. That's important because, as we're trying to learn, we need to have that fairly well recognized in the model.

Now — excuse me; normally I have a mute or cough button, but since we're doing everything on the iPad today, I don't — the change over time has been away from focusing very heavily on the knowledge of the system designer. There used to be a lot of emphasis on the understanding that goes into the rules and the algorithms, and the data was — I don't want to say along for the ride, but the heavy lifting was done up front. Today, moving along from left to right, with the advent of big data architectures and solutions — everything from Hadoop to Spark and the hardware that goes with it — we're starting to see systems that are lighter weight in terms of rules, where the learning systems learn based on discovery of patterns in the data. So machine learning over the course of the last couple of decades has moved very much from the left to the right.
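Looping back to the autocorrect example above for a moment, here is a hedged sketch of the representation point (my illustration; the keyboard grid and words are assumptions for the demo): the same word can be positioned in a space by physical typing distance, useful for autocorrect, or by concept, useful for meaning — and the two layouts disagree.

```python
# Toy "N-space" for autocorrect: distance between letters on a QWERTY grid.
ROWS = ["qwertyuiop", "asdfghjkl", "zxcvbnm"]
POS = {c: (r, i) for r, row in enumerate(ROWS) for i, c in enumerate(row)}

def key_distance(a: str, b: str) -> int:
    """Total grid distance between two equal-length typed strings."""
    return sum(abs(POS[x][0] - POS[y][0]) + abs(POS[x][1] - POS[y][1])
               for x, y in zip(a, b))

# For autocorrect, "cat" typed as "cay" is close (t and y are adjacent keys)...
print(key_distance("cat", "cay"))   # small: a plausible typo
# ...but "dog" is far away on the keyboard, even though by *meaning* it sits
# near "cat". A meaning-oriented representation would store them by concept.
print(key_distance("cat", "dog"))   # large, despite the semantic closeness
```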
Take machine learning in the knowledge-based systems, or expert systems, of the late 80s and early 90s — they're still widely in use, not having gone away. In that case, you would try to capture the knowledge of an expert, either explicitly or by watching them, by self-reporting, that type of thing. And so in the structure of the system, when the system first launched, there would be a lot of information codified in the rules and the algorithms. Learning in a system like that — say, to the left of the center dot there — would consist of either machine learning or humans changing the rules. Machine learning in that case is certainly not what we would think of as deep learning; it's the case where the system could identify rules and change its own rules based on the difference between what it expected to see, in terms of results or perhaps feedback, and what it actually saw. So you can update rules based on experience.

As we move further to the right, we're now getting a lot of investment in deep learning; it has certainly become the de facto approach to machine learning. As a result, systems have a smaller and smaller upfront emphasis on identifying the rules and the specific algorithms for the application. The focus instead is on approaches to things like feature identification and extraction — I don't want to say that's easier, but it's certainly less human-intensive — and the emphasis is on getting sufficient data, training data, and labeled data. That I put at the bottom, in purple. So that's the shift. You'll notice this isn't labeled by date, but anything to the left of the first vertical line is the 50s, 60s, 70s, and the second vertical line basically represents where we were from the end of the last century to now. Right now, most of the investment, and most of the applications being built, sit to the right of that line; everything is focused on machine learning based on larger data sets. Not to backtrack too much: there are still cases where we don't want to be using deep learning, and where we have more information that can be codified early in the design phase.

I use this diagram a lot to show the distinction between discovering — or identifying, or classifying — and understanding. Because when we talk about learning, as I said, three things have to go together: you have to understand, you have to reason using representation, and then you can learn. But understanding — just as "artificial" in artificial intelligence (sometimes augmented intelligence) means, to me, simulating or emulating the process — can be treated as a black box. For those of you who have studied psychology, I'm talking about something like a behavioral approach: we don't need to understand the inside as long as we can map a set of inputs to a set of outputs and find the relationships. In this diagram, the system, developed by Loop AI Labs, identifies relationships based on statistical modeling — it goes back to that early design decision — without being concerned with which natural language is being used. It's purely mathematical modeling, if you will, in N-space, finding things that are related to each other based on frequency and usage. In this case, it happens to be something published in Al Jazeera, and the system went through and looked for relationships and patterns. Although I don't speak the language, I recognize the numbers: it happened to be something focused on aircraft.
So you can see a lot of Boeing designations there, Airbus designations. The reason that's important: when we talk about machine learning specifically, the working definition of learning is that the machine is improving its performance. It doesn't have to understand things or represent things the way a human does. Machine learning is a measure of performance improvement along a dimension that has to be specified. A lot of it today is based on recognizing concepts — representing and reasoning to find concepts that work together using a mathematical model. And again, it's absolutely irrelevant to most of these systems whether the relationships are stored the way humans would store them, or reasoned about the way humans would reason. The model does, however, have to be mathematically complete and consistent, and the rules have to be unambiguous, even if no human knows them. So this is just one example — given how much we do with natural language — pointing out that we don't really need to understand, or even consider, the language for certain types of learning applications.

Okay, let's take a look at the overall landscape. Now we're going to dive in, look at the considerations and how mature each of these is, and then look at some of the vendors. When I was laying out the framework for this talk, I wanted people to think about what is practical to use today. And I'll just state flat out that within this diagram, which represents the whole landscape of machine learning, everything here is above what I consider to be the threshold of usability — but some are more equal than others, in Orwellian terms. We want to break this world up into four quadrants: supervised and unsupervised learning, and general and deep learning. The idea is that supervised machine learning is learning by example, where we use training data, versus unsupervised, where we're trying to discover patterns based on experience with data without, if you will, prejudicing the algorithms by saying "this is what you're looking for." It's more like: look at things, then tell me what you're finding. It's a little more detailed than that, but for today's purposes those are the two big differences: with supervised, we have to have labeled training data, so there's more guidance up front — that's the supervision. And I think it's usually instructive, when we think of learning, to think about teaching: learning is one side of the equation, teaching is the other. If you're trying to teach children how to do things, supervised learning is giving examples and having them try to generalize from those examples; unsupervised is having them go out and play, see what they find, and try to assemble theories from that.
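A minimal sketch of those two quadrants, assuming scikit-learn (editorial illustration, not from the talk): the supervised learner generalizes from labeled examples, while the unsupervised one is told only to look and report what it finds.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier
from sklearn.cluster import KMeans

X, y = load_iris(return_X_y=True)

# Supervised: the labels y act as the "teacher" providing worked examples.
clf = DecisionTreeClassifier().fit(X, y)
print("supervised prediction:", clf.predict(X[:1]))

# Unsupervised: no labels at all; the algorithm groups what it sees and
# reports the patterns it discovered.
km = KMeans(n_clusters=3, n_init=10).fit(X)
print("discovered clusters:", km.labels_[:10])
```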
Within supervised learning, the idea of reinforcement learning is very much like reinforcement in psychology: the system has to develop strategies based on the feedback we give on its performance. If we're dealing with reinforcement for a human or another animal — even rats in a maze — reinforcement is generally a reward, and the absence of a reward is negative reinforcement. Negative reinforcement isn't punishment; it's taking away a reward. And so systems learn to improve their performance, develop strategies, and develop changes to their algorithms based, again, on experience — all of this is based on experience. The system can change its behavior based on the frequency with which it receives a positive reward. That would be useful, perhaps, for an application where there are just too many rules to specify all of the possible scenarios — things like specifying behavior for an autonomous helicopter: what not to do, that's the reinforcement. I sometimes use the example of training manuals for fighter pilots. This is where you get into prioritization, right? There are only half a dozen priority rules — there are a lot of little rules, but basically it's, you know, don't hit anything in the air or anything on the ground, and then you work down from that. Each of those has a value associated with it. You could also use this for a system that's playing a game. Chess, for example, is a two-person, perfect-information, zero-sum game: both players — or both systems — have perfect information about what the other is doing and what they could do, and each move has a resulting value associated with it. That value can reinforce or extinguish the behavior. If you do something in a particular context and it results in a bad outcome — one of your pieces is taken — that provides the negative reinforcement.

So that's supervised versus unsupervised. In both, the important part for machine learning is that the change we make to the system is based on the system evaluating data over time and changing its behavior — and by behavior, that also includes just changing the knowledge it has. It doesn't have to actually do something for its behavior to change.
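A hedged sketch of that reinforcement loop (editorial example; the states, actions, and rewards below are made up purely for illustration): the system keeps a value per state-action pair and nudges it toward the reward it actually received, so behavior is reinforced or extinguished by experience.

```python
# Minimal value-update loop in the spirit of reinforcement learning.
values = {("mid-game", "advance-pawn"): 0.0,
          ("mid-game", "expose-king"): 0.0}
LEARNING_RATE = 0.1

def update(state, action, reward):
    """Move the stored value a step toward the observed reward."""
    key = (state, action)
    values[key] += LEARNING_RATE * (reward - values[key])

update("mid-game", "advance-pawn", +1.0)   # good outcome: reinforce
update("mid-game", "expose-king", -1.0)    # piece lost: negative feedback
print(values)
```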
The distinction between general and deep learning: all of that could be done just with a set of business rules, but when we get into deep learning, that's more specifically based on models analogous to human neurosynaptic events. It's biologically inspired: we have layers of very simple processing units, and each unit within a layer has a limited range of behaviors, if you will, or responses to stimuli — but we have enough of them that the layers can collaborate and solve more complex problems. For that, I'm just going to do a couple of quick slides, and then we'll start to look at what's changing and then look at the market.

In an artificial neural net — and this is a pretty simplified view of the world — this is one layer. Each of the dots represents a neural processor, and you can think of it as something that is aware of a particular type of stimulus. If it were a natural neuron, the stimulus could be anything we perceive naturally — light, heat, humidity, whatever we have receptors, if you will, to detect. In an artificial neural network with just the one layer, the input is a representation of the state we're in, and that representation is digital — I think of it as a vector or a scalar that provides an input to each of those neural processors, or some subset of them. But what makes it interesting is the way they connect with each other. In this case we're not talking deep learning yet; we're talking about the simplest arrangement — I think of it as a matrix of these units, and perhaps we can wire them in different ways.

Let's just assume for the moment that every little processor is wired to three or four nearest neighbors. When it gets a signal from the outside, maybe it tells all of its neighbors, maybe it just tells one. That part is kind of the hard wiring, if you will, of the neurons. But where it gets deep is with that input — those are the observable variables. If we're thinking about this in a particular application — let's say we're doing image processing — that's where you're looking at the pixel level, right? With just one layer, you can't do that much. But if we start to say, okay, at the first layer we're dealing with something of low abstraction — we're looking at pixels: they may have one of ten possible color values, they may have different shades and hues, and each one maps to a unit asking, what am I seeing here? — that's not very abstract. What you're trying to do is identify features from that and pull them out to ask: what's the relationship? In this example, I'm looking for edges to begin with. So if I'm looking at a digital representation — a JPEG file, for example, that represents a picture of my family — the first layer may just be looking for edges. It'll be looking for changes in pixel color; it may be looking for borders, boundaries, changes in contrast. Once we've done that — once we've outlined that rough shape — we go from one layer to the next using rules of geometry: okay, I see there's a shape here, and now I'm trying to figure out what shape it is. If it's very structured — a crisp, in-focus picture of a building — then maybe it's easy to find the lines, and we can go from that to identifying what the lines represent. When it gets a little fuzzier, it's more difficult, and we may have to have more layers. But here, with this example where I've got five layers — that's the depth of the model — the output is something much more abstract: I'm looking at a picture of a cat, of a person. And again, this is very simplified. Obviously, for those of you familiar with convolutional neural networks, there's a lot of back propagation; there's data going back up between the layers, and we do a lot of checking and validating. But if you think about it as a general model, we're going from something very concrete at the pixel layer to something much more abstract: a representation of an object.
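Here is a toy forward pass capturing that concrete-to-abstract progression (my sketch under stated assumptions — random weights, no training or back propagation — not a production convnet): each layer turns a more concrete representation into a more abstract one, mirroring the pixels-to-edges-to-shapes-to-object story above.

```python
import numpy as np

rng = np.random.default_rng(0)

x = rng.random(64)                         # "pixel level": concrete input vector
layers = [rng.standard_normal((64, 32)),   # low abstraction (edge-like features)
          rng.standard_normal((32, 16)),   # mid abstraction (shape-like features)
          rng.standard_normal((16, 4))]    # high abstraction (object scores)

for W in layers:
    x = np.maximum(0, x @ W)               # linear map + ReLU: one layer of units
print("abstract output:", x)               # 4 values standing in for object classes
```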
Now, one of the problems is that when we get into systems with many, many layers — and we've seen systems with over a hundred hidden layers; I think one of the ones Microsoft has been talking about recently for translation has about 150 layers — it gets very difficult to go back and ask: what evidence did you use to come to the conclusion that the abstract item is X? Can we trace back? That is difficult. In some cases, it doesn't matter. But in some cases — let's say we're doing medical diagnostics — you may really want to know: what did you see in the X-ray, the very concrete digital representation of some slice of a human being, that led you to a particular diagnosis? So we do have kind of a gray area — no pun intended, dealing with X-rays — in terms of being able to go back. It's not as easy as it was when we were dealing with a set of rules and you could see which rule fired; when you're dealing with a system that's doing self-learning, that's more difficult. So as you start to look at which type of machine learning you're going to use, one of the design decisions you need to consider is: what level of explainability do we need? This is an area undergoing a lot of research at the moment, so I would say we're getting more mature in our ability to do it, but any time you add a requirement for an audit trail, if you will — or a potential audit trail — you're adding processing overhead.

So — I know that just showing a picture of my kids doesn't make them tax deductible, but it makes me laugh. In this picture, if you were to use machine learning to try to identify the objects, it would probably find three faces. The way it finds faces is by looking for edges, looking for things like eyes and the bridge of the nose, and asking whether those cluster together. With properly trained image recognition, you'd be able to detect that one of them is wearing a Yankees hat and two are wearing Red Sox hats — that may be the difference that matters. In a picture like this there could be different genders, all sorts of differences, but which one you want to put a value on depends on the context. So context is very real there — context is king. What isn't shown here is that if I were to take this picture and modify it — say, turn one of the heads upside down, turn it into a Picasso — the typical deep learning system wouldn't recognize that as any different from a picture where everything was in its place, because it's typically looking for features, not the geospatial relationships between those features. And that's important because, as we get into more complex problems that we're trying to solve with machine learning and deep learning, a limitation of going from concrete to abstract is that we're not giving the systems enough information about these spatial relationships. There's been quite a bit of work done recently that I think is going to have a big impact on the way we do deep learning.

For that, just take one more quick look here. What would a deep learning system today, with multiple layers, learn from this picture? This happens to be downtown Atlanta in a rainstorm. Shannon, I don't know if I've used this one before, but this was at EDW last year, when I got stuck in the hotel for two days because nobody could get out of Atlanta. You can find edges — people are very good at finding edges. You can recognize the park if you've been there before. But the raindrops on the glass are actually more challenging than anything else. The reason I put the slide in: if you were to take the same picture from a different angle, to a deep learning system it's going to look like a different place, even though it's just a different view. As long as nothing is subverting the scene itself, deep learning systems could figure out that it was the same place, but it would take a lot of effort — whereas for a human, it makes no difference if I move 10 feet to the left, 20 feet to the right, up or down. It's the same problem.
And so, even though we don't have to solve machine learning problems the way humans do, we should be able to learn from the way humans actually interpret these images — and that's where things are starting to go. So, in a roundabout way — and this is one of those slides I don't expect anybody to read until you get a copy of the deck — I wanted to show that there is an emerging alternative to conventional deep learning with neural nets and convolutional neural networks, and that's the idea of capsules. A lot of the work is coming from Geoffrey Hinton's group in Canada. Hinton, of course, is certainly one of the premier thinkers in the whole area of deep learning, but one of the problems his research team has addressed recently is that in neural networks as we've known them, each individual processing unit outputs a single value. In his new work with capsules, it's still a neural network idea, but the units are clustered, arranged, interpreted, and connected differently, so that a processing unit can output a complete vector. And I don't know if this will help, but if you're familiar at all with quantum computing and the idea of qubits rather than bits, it's similar in that you're getting more value out of the individual processor. So, "instead of aiming for viewpoint invariance" — and it's the last part that I think is important; the bolding is mine, not theirs — we're looking for replicated feature detectors using these capsules, encapsulating, if you will, a vector of informative outputs. The reason I bring it up today: we've reached the point where, even with some gray areas around interpretation, the performance of what we think of as conventional deep learning is well understood, relatively new in practice though it is. We understand going from the concrete to the abstract, and we understand building in the feedback loop that changes the values of the connections between these neurons. But there is this frontier right now of capsule-oriented networks that I think is going to have a huge impact on the field in the next five years, and so I wanted to include it here so that I could talk about how mature each of these technologies is.
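A sketch of the scalar-versus-vector contrast as I read that discussion (editorial illustration of the idea only — this is not Hinton's actual capsule architecture, and the weights here are random stand-ins): a conventional unit emits one number, while a capsule emits a whole vector whose length can encode whether a feature is present and whose direction can encode its parameters.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.random(8)                          # input features

w = rng.standard_normal(8)
scalar_unit = max(0.0, float(w @ x))       # conventional neuron: one value out

W = rng.standard_normal((4, 8))
capsule = W @ x                            # capsule: a 4-d "informative output"
presence = np.linalg.norm(capsule)         # vector length ~ feature present?
print(scalar_unit, presence, capsule)      # direction would carry pose/params
```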
And so, with that in mind, we go from ad hoc, where everything is hard coded in terms of machine learning, through the rules phase, through neural networks — which would include deep learning — and into capsules. I've used variants of this maturity model in the past. The idea is that with investment, research, and feedback, technologies typically follow an S-curve — I'm simplifying here — and at some point they reach the point where you're getting reasonable utility out of them; that's the horizontal bar there. I'm saying that the first three have all reached that point. Right now we know enough about machine learning, and enough about the alternatives, to build these ad hoc systems and rule-based systems, and we're still building those and still embedding them in applications. With neural networks, the width here is where I see the investment; relatively speaking, we're not going to do much more with rules or ad hoc systems — we'll still do more, but capsule networks are where I think we're going to see the most investment for the next several years, even though right now they're way down here. In fact, in terms of the height it should be lower, but I couldn't make it lower with the graphics tool or the arrow would have disappeared. So the idea is: we're mature, we're still going to see quite a bit of investment in conventional neural networks, but for the next five years I think the big effort is going to be on this other dimension, with capsules. (That's another paper from that group — they went backwards.)

Now let's get into the market; we'll spend the next ten minutes there. If you're convinced and say, "I'm going to build a system, I want to use machine learning," with or without deep learning, right now these are the big four: Amazon, Google, IBM, and Microsoft. The reason I put them together is that all four have cloud-native machine learning platforms, they're all very scalable, you can start with any one of them for a relatively low price, there's a lot of investment behind them, and they give you flexibility. I tried to break the comparison into three simple dimensions: ease of use, the breadth of services offered on each platform, and the depth of services from the company. On the ease-of-use scale, I think Amazon with AWS and Microsoft with the Azure Machine Learning Studio are probably the easiest to use and to get started with. In terms of breadth of services, IBM and Microsoft probably have the broadest offerings at the moment, with Amazon picking up the pace. And in terms of depth, frankly, I think there's a big gap between what IBM has done with Watson and the others, but Microsoft, with the Azure Machine Learning Studio, is really closing it. Just to reiterate something about Microsoft's AI strategy: AI is basically one of the three pillars of Microsoft's overall strategy, going along with — sorry — mixed reality and quantum computing. Machine learning, and AI overall, is going to be in everything they do, so I expect Microsoft to be on par very shortly and to lead Google and Amazon in those areas. Like many things in this space, the reason there are no tick marks on the chart is that a lot of this is based on interactions with the companies and with people using these products, so it's hard to actually quantify. But I will say that one of the things to look for in a platform provider is the types of data they can provide, not just the types
of services. I mean, on any one of these platforms you can get algorithms for supervised and unsupervised learning and deep learning, and you can assemble things via APIs. What's interesting is that some of the companies are making huge bets on data: IBM with the Weather Company data they acquired, Microsoft with the data that came with LinkedIn, and Google obviously with a lot of data from searches. The question is which of those data sets are available to you. In Google's case, most of that is available only to them — Google and Amazon have done more to leverage AI internally than to expose it to clients, whereas IBM and Microsoft, I think, have been more focused on creating tools for others to do it.

I'm going to go through a couple of firms in the market that I think are noteworthy at this point. These are intentionally organized in alphabetical order, because I don't want to try to rank them — it's too subjective — but I have two slides with companies that I think you should take a look at if you're interested in building a system: Alteryx, BigML, CognitiveScale, and H2O here. What I'm going to point out is just some thoughts on each one; if people have questions afterwards, I'm happy to discuss one-on-one what fits a particular application. Alteryx is interesting because of its focus on business users, or what's often known today as citizen data scientists — people whose primary job responsibility isn't the nitty-gritty of data science. BigML has a combination of a private deployment model and a subscription model: if you work with them to develop your own models, they create a virtual private cloud, and it then runs on Amazon or Azure or Google — and building the application to run on multiple platforms is something a lot of firms are doing now. CognitiveScale I thought was interesting because there was an original investment from the IBM Watson venture fund as they were ramping up their augmented intelligence platform, but recently they took an investment from Microsoft, and it's the first firm I've seen with both of those investments in a serious way — that's why I'm following them pretty closely. H2O — the H2O compute engine — is a pure open source platform, and to me they're, in spirit anyway, the Red Hat of this market today.

Next, Loop AI. I mentioned their Q platform, which aims to be a natural-language-independent reasoning engine, and I will say that in my role as an analyst for Aragon last year, I put Loop AI in for an innovation award for machine learning platforms — so I think quite highly of them and what they're doing. Skymind leverages Spark, again from the open source community, and I had to put this up there because it's their word — "productionize" — trying to create an enterprise-friendly platform for leveraging these open source products. SkyTree is the other side of the coin: rather than focusing on the citizen data scientist, it focuses on machine learning as a service for the high-end folks in data science. And lastly, SparkCognition is one of the leaders in developing custom solutions with deep learning.

I'll wrap up here — and we'll open for questions in just a second — by saying that there are two companies, two approaches, out there that are quite different from all the others. First, Intel Saffron: Intel acquired a company called Saffron whose machine learning — their reasoning and representation — is all based on an associative memory model. So the way things are stored is based on
associative memory, which in simplest terms means the storage location is assigned based on an algorithm for how closely related two items are. From what I've seen recently, they're trying to commercialize this, and I think the investment from Intel is going to pay some dividends, so I look forward to seeing what they do in the next couple of years. And lastly, I'm wrapping up with Numenta. Numenta is largely an IP portfolio company at this point, I would say, because they're not trying to productize things themselves, but Numenta has done some of the most interesting fundamental research into the human neocortex, attempting to organize a computing system — their model, called hierarchical temporal memory, the HTM — based on their understanding of the human cortex. They've done some amazing work at the fundamental level that I know is being incorporated into some products and is certainly being studied by a lot of folks. It's a different way of organizing data to optimize reasoning. And as I said right at the beginning, you don't have to do things the way humans do, but the move from conventional deep learning to capsule deep learning is trying to benefit from kind of tilting your head and looking at things differently, and the work they're doing at Numenta, I think, is going to pay off in a big way. So those are the two learning approaches that I would — I hesitate to say bet on, because I've occasionally had people say, "oh, I took your stock tip." This is not a stock tip; it's a technology tip. I think the capsule approach and the Numenta HTM approach are the two leading lights, conceptually, to take us to the next level in machine learning.

Having said that, I'm going to open it up for questions. Just note, as Shannon said at the beginning, one of my passions right now is working on a new book called The Age of Reasoning, where I'm looking at the intersection, if you will, of cognitive computing — the understanding, reasoning, and learning — with the IoT and cloud computing, to help people build distributed intelligence systems. So if you're interested in that, check out our new site; I would love to talk to you about what we're doing.

Adrienne, thank you so much, and thanks for all the work to get logged in while you're out of power. Just a reminder to all the attendees: I will be sending a follow-up email for this webinar by end of day Monday to all registrants, with links to the slides and links to the recording. So, diving right into questions here in the Q&A section: Adrienne, how do you evaluate how well unsupervised machine learning is doing — and can you do that?

Can I evaluate how well it's doing?
Yeah. Well, I would say one really good example of an application that's well suited to unsupervised learning is network monitoring — looking for intrusion. At that point you have, perhaps, a lot of sensors, and you're not telling the system to look for something specific; you're asking it to report on what it's seeing. A lot of times, what you get in an unsupervised learning situation is just a flag: basically, "I've seen something that I haven't seen before. You didn't tell me to look for this; you told me just to look," if that makes sense. So unsupervised learning is being used quite successfully in that type of system. It's also used in fraud detection, when you're looking for anomalies — though typically for something like fraud you're going to use a combination of supervised and unsupervised, because you're going to tell it to look for specific patterns, but you're also going to tell it to look for novel patterns.

That actually brings us to the top of the hour — and that was actually the biggest question so far. Adrienne, thank you so much for working so hard to make this webinar happen today despite being without power. We really appreciate it, and thanks to our attendees.

My pleasure.

And we hope to see you all on the flip side. As Adrienne is showing now, we've got Knowledge as a Service in April, so we hope you can — oh, we've got a couple of questions coming in, but I'll get those over to you, and maybe you can answer those in the follow-up email that goes out on Monday, as we are right at the top of the hour. Thank you. Thanks, all, and thanks, everybody. Again, Adrienne, thank you so much, and we will see you on the flip side.

Sounds good. Take care, everyone. Thank you.
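To make the "flag what you haven't seen before" idea from that last answer concrete, a minimal sketch assuming scikit-learn's IsolationForest as one common off-the-shelf choice (an editorial example — the talk doesn't name a specific algorithm, and the traffic data below is synthetic):

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(2)
normal_traffic = rng.normal(loc=0.0, scale=1.0, size=(500, 3))  # usual behavior
model = IsolationForest(random_state=0).fit(normal_traffic)     # no labels given

novel = np.array([[8.0, 8.0, 8.0]])        # something the system has never seen
print(model.predict(novel))                # -1 means "flag: anomaly"
print(model.predict(normal_traffic[:3]))   # +1 means "looks familiar"
```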