Hello and welcome. My name is Shannon Kemp and I'm the Executive Editor of DATAVERSITY. We'd like to thank you for joining the current installment of the monthly DATAVERSITY Smart Data Webinar Series with host Adrienne Bowles. Today Adrienne will discuss emerging hardware choices for modern AI data management.

Just a couple of points to get us started. Due to the large number of people attending these sessions, you will be muted during the webinar. For questions, we'll be collecting them via the Q&A in the bottom right-hand corner of your screen. Or, if you'd like to tweet, we encourage you to share highlights or questions via Twitter using the hashtag #smartdata. If you'd like to chat with us and with each other, we certainly encourage you to do so; just click the chat icon on the top right for that feature. And as always, we will send a follow-up email within two business days containing links to the slides, the recording of the session, and additional information requested throughout the webinar.

Now let me introduce our series speaker for today, Adrienne Bowles. Adrienne is an industry analyst and recovering academic, providing research and advisory services for buyers, sellers, and investors in emerging technology markets. His coverage area includes cognitive computing, big data analytics, the Internet of Things, and cloud computing. Adrienne co-authored Cognitive Computing and Big Data Analytics (Wiley, 2015), and is currently writing a book on the business and societal impact of these emerging technologies. Adrienne earned his BA in psychology and MS in computer science from SUNY Binghamton, and his PhD in computer science from Northwestern University. And with that, I will give the floor to Adrienne to get today's webinar started.

Great. Thank you, Shannon. It's always fun to be here every second Tuesday of the month. So, today's topic: some people might look at it and think it doesn't fit with the rest of the series, because most of the time we talk about what people think of as software-oriented topics. We've been looking over the course of this year at everything from machine learning to natural language processing, really at a fairly abstract level. But what I wanted to do as we get near the end of the year is do one session looking at some new trends in hardware that really support the solutions we've been talking about all year. And so I say emerging hardware choices for modern AI.

What we'll talk about today is, first of all, the rationale: why hardware is important and why you should care, even if you end up putting everything in the cloud and think that you're outsourcing it and it's no longer your responsibility. There are some important choices to be made when you're picking your providers in terms of what they're doing, because your choice of hardware is still going to have an impact on system performance.

Let me go right in and look at the agenda. First, a comment about hardware in general and why it is what I'm calling the final frontier. Then we're going to look at the specific challenges of modern AI (there will be a crisper definition of modern AI in a minute) and why those challenges aren't really well met by conventional hardware architectures. We'll look at the importance of doing things in parallel, and some of the limits to parallel execution of workloads.
Then we'll look at three different approaches that are in various stages of maturity and commercial readiness: neuromorphic architectures, GPUs and advanced memory solutions, and quantum computing. We're not going to spend a lot of time on quantum computing, but I did want to include it because, as I was saying to Shannon before we started, you have to have at least a good cocktail party knowledge of quantum to survive in the world of modern AI. And we'll wrap it up with an overview of what's out there in the market, and make some recommendations in terms of adoption strategies depending on the types of systems that you want to build.

So the first slide today, the first real slide: I left the old template and the old copyright on. This is something I've used off and on for many years; this particular slide I think I used in an MBA class I was teaching at Boston College. The issue here is that, over time, if we look at the stack from hardware generically at the bottom, through operating systems, through application software and delivery, as functions mature they tend to move from the top of the stack down. So you might see things moving from a specific application into the operating system, or into components you buy instead of buying a complete application. And at some point it moves all the way down into the hardware. I like to talk about this as a kind of value migration: as something becomes codified, perhaps it becomes a commodity function, things get standardized, and they tend to move away from the individual applications.

Just a simple example: back in the early days of PCs, we used to have a lot of word processing options. Today you don't have quite as many. But the function of doing grammar checks or spell checks or looking for style, that was a separate application. If you were writing a magazine column, for example, you would use your word processor, but you'd have another application that would look at your text and give you some feedback. Those things gradually moved down, and we'd have components. If you look at OS X, for example (I happen to be using a Mac here), there are functions in the operating system that are now available to all of the applications, like speech to text. So if you want to dictate into an application, it's not what we would think of as a very smart function, it doesn't do much interpretation, but it's the same interface in every application because it's in one place in the operating system. Some things that we may have had separately for security, for example, where you'd have an application doing that at the front end, a lot of those have migrated down too.

And there are a couple of reasons for that. One is that hardware is always going to be kind of the last frontier for optimizing. You can't run a system without hardware, and it's easier to prototype, get things organized, and test functionality in software, because changes are very simple. Relatively simple, I should say, compared to going out and burning a new chip. But as things become more stable and the volume of use goes up, a lot of times it will make sense economically to optimize, to commoditize and standardize, as I say here. So you move things down into hardware. The other thing is, as we look at emerging areas where we're building systems like AI, you'll find that there are just some problems that you can't optimize sufficiently in software alone.
So if the volume of data reaches a point where you can solve the problem, but not in time to be useful, let's say you're doing weather forecasting, for example: the forecast is no good if it comes the day after the weather. You could have just looked out the window. So you need this balance between software and hardware optimization.

With that, let's look at how this applies specifically to the area that we're calling modern AI. And I keep hammering home this word modern because of an issue we run into if you look at AI over the course of the last few decades. Artificial intelligence (and now we're looking at it as a set of functions that may be augmented intelligence, not just artificial intelligence, something that acts as an assistant, an assisted intelligence) has for decades looked at simulating either the function or the behavior of natural systems. So AI would include things like cognition; we'll look specifically at what we include in cognitive computing and how that fits with hardware in a minute. But it also includes things like vision processing and understanding, pattern matching, that type of thing.

For a very long time, though, really up until quite recently in the greater scheme of things, conventional AI did not include machine learning. Machine learning as we know it today is very focused on neural networks and on models that are what we call brain-inspired, and that has changed the way we look at AI. If you read 50 articles in the popular press this week, probably all of them will have some reference to the impact of machine learning. So when we look at AI functions supported by machine learning, the third component that makes up what I call modern AI is the availability of big data: not just the availability of the data set, but the availability of data management solutions that allow you to get access to what you need, improve it, if you will, through machine learning, and then do the kind of processing that we think of as more traditional AI. Those are the three things that, when I say modern AI, distinguish it from any other view of the world.

What we want to look at is, given that AI includes so many things, and within modern AI we're going to focus on the idea of cognition: what is the role for hardware, and where do we need this optimization? Within cognition, within input and output, and as machines talk to people but also talk to each other as we get into the IoT world. I'm not going to go through these point by point, and there's no single slide for some of them, but since I know everybody gets a copy of the slides next week, this will help refresh your memory in terms of how we're organizing these thoughts.

Putting this all together, this is, I promise (sorry, just lost my trackpad there... there we go), the busiest diagram that we have today. When we look at cognitive computing as a distinct part of modern AI, the definitions are all over the place and some of them seem to be pretty vendor-specific. I include four things under cognitive computing: learning, understanding, reasoning, and planning. This center here in the red circle is what we think of as a cognitive computing system. Probably the group that's doing the most to promote it as a commercial concept is IBM, and they use three out of the four: learn, understand, and reason. And within reason you've got things like inductive, deductive, and abductive reasoning.
But the reason I put it all together in this diagram is that that's kind of the core, and within it you have a model. Each of the four is a verb here: to plan, to learn, et cetera. The model is the noun: the data that you have internally, and the assumptions that you have about the world around you. So if we think of the center as the cognitive system, we'll see that there are a number of places where it can be optimized using new hardware. But it's important to recognize that a system like this really doesn't have much value if it doesn't communicate with the outside world.

Everything on the left side of the diagram refers to data coming into this cognitive system, either from people or from machines. Of course, if you're going to be technical, it's always coming from a machine, but it may be coming from a human first. So if we're looking at a system and we want to be able to accept natural language input, for example, or we want to be able to monitor emotions using sensors or gestures (and we've talked about that in a couple of the other webinars), we need to be able to get that data in. So I've got this layer on the outside, which is where we do the data management. And that's the key, because in modern AI we're dealing with a lot more data than we were just a few years ago. The data constraints, how we get this data in, and how we organize it are much more difficult to deal with, and the expectations are much higher than they were just a few years ago. At the bottom left, we're looking at machines and machine communications: data coming in either through sensors, whatever you think of as IoT, or from other systems, where you're going out and reading things in batch.

On the right side (I've done a version of this in the past), at the top, and this may be where it's most interesting, is where the system is producing output. That could be in natural language; we're talking about narrative generation, or natural language generation, NLG, as part of natural language processing. We're talking about visualization, data being presented for human consumption, versus what we have at the bottom: a machine that's intelligent in the cognitive sense, producing information that's going to be used by another system.

So if you look at this in context and ask, okay, where might we want to leverage emerging hardware architectures to improve performance, what are the bottlenecks we're dealing with here? Keep this diagram in mind for a second, because I'm going to compare it to how a person gets input. We're just going to look at the top half of the diagram for a minute and think: how do we get input as people before we process it, and then how do we get it out? We'll see what the human architecture looks like and then compare: what's the volume of data that we need to process, and how do we actually handle it?

Now, I know that in the description I wrote for this a month ago, I said you don't need to know anything about computer architecture. And I hope that's true; I'm not making too many assumptions. But I am hoping that you have a vague idea of how you get information naturally. We've got the five senses, and we have neurosynaptic receptors, if you will, for each of those. And in some modern biology or neuropsychology books, you'll see six or seven senses, talking about how balance and some other factors are processed.
If you've ever had vertigo, you know there's another issue here that isn't actually covered. But really, the sensing part comes in through one or more of these five senses. And just to give you an order of scale: with natural hearing, each ear has about 12,000 of these outer hair cells on the basilar membrane. They vibrate, and the signal gets down to about 3,500 inner hair cells. So that's a fair amount of data coming in that we don't even think about as we process it.

We tend to think more about the issues for vision, where we have photoreceptors. And here I always laugh, because it's one of the few things I remember from my perception class in college: we've got rods and cones that are being triggered by light, by the photons. You've got roughly 120 million rod cells per eye and 6 million cone cells, and cone cells require more light to trigger. You've got another 60,000 photosensitive ganglion cells. It's really almost miraculous, if you look at the eye and the whole visual perception system, that we can process this as quickly as we can. Think about what you have to do with a computer to match that: if you've got a camera and you're taking that input and trying to process it in time, even roughly approximating what you can do as a human, that's a lot of processing. We won't get into the details of how many receptors there are for taste and smell and touch. But basically, the idea here was to ask: which of these things can we do fairly easily with computers, and which require some extra hardware, if you will, or extra power?

If you've ever done audio recording with digital recorders, you know that the hearing part, capturing sound digitally with sufficient fidelity that you could use the recording, is actually pretty simple. For 100 bucks, you can buy a digital recorder that will capture at a similar level of accuracy, because it samples the incoming signal and has sufficient bandwidth. So the fact that we've got these 12,000 outer cells and 3,500 inner cells per ear, that's dead easy on very simple hardware. Accurate processing of visual information is much more difficult.

To put it all in perspective, go to the next slide: human cognition, where all this comes in and we interpret it. Overlay this on the cognitive computing model (learn, understand, reason, and plan): we have roughly 100 billion neurons in a functioning human brain, and somewhere between 100 and 500 trillion synapses. The synapses are the connections between the neurons, if you will. To process the same type of information that we process very naturally and very easily as human beings, looked at this way, there's a lot of data that needs to be processed, and a lot of it we do in parallel naturally, because we don't break things down into simple sequential steps. When you think about how the brain hardware works, if you will, without worrying too much about the neurobiology: you don't really have to stop and think, "I hear a noise and it's probably my garage door, and the next thing I'm going to hear is the dog barking." You have a lot of things in your model that help you, but you can also navigate and deal with novel situations because you can process so much information so quickly.

One more slide, and then we'll get into the actual architectures; it goes back to the idea of machine learning.
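Before moving on, here's a rough back-of-envelope in Python putting the numbers just quoted side by side. The receptor counts are the ones from the talk; the frame rate and the one-bit-per-receptor figure are purely illustrative assumptions to get an order of magnitude, not physiology.

```python
# Back-of-envelope scale of human sensory input, using the counts quoted above.
# The sampling assumptions below are illustrative, not physiological.

RECEPTORS = {
    "outer_hair_cells_per_ear": 12_000,
    "inner_hair_cells_per_ear": 3_500,
    "rod_cells_per_eye": 120_000_000,
    "cone_cells_per_eye": 6_000_000,
    "photosensitive_ganglion_cells_per_eye": 60_000,
}
NEURONS = 100e9                  # ~100 billion neurons
SYNAPSES = (100e12, 500e12)      # ~100-500 trillion synapses

visual = 2 * (RECEPTORS["rod_cells_per_eye"]
              + RECEPTORS["cone_cells_per_eye"]
              + RECEPTORS["photosensitive_ganglion_cells_per_eye"])
auditory = 2 * (RECEPTORS["outer_hair_cells_per_ear"]
                + RECEPTORS["inner_hair_cells_per_ear"])

print(f"visual receptors:   {visual:,}")      # ~252 million
print(f"auditory receptors: {auditory:,}")    # ~31 thousand
print(f"neurons: {NEURONS:.0e}  synapses: {SYNAPSES[0]:.0e} to {SYNAPSES[1]:.0e}")

# Assume 1 bit per receptor per "frame" at 10 frames/second (purely illustrative):
fps = 10
print(f"raw visual input: ~{visual * fps / 1e9:.1f} Gbit/s")
```

Even with those deliberately modest assumptions, raw vision runs to gigabits per second before any interpretation happens; that's the scale a computer vision front end has to keep up with, while the audio numbers stay trivially small by comparison.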
And again, if you're reading anything in the popular press today, you would think that all of AI is deep learning. The point I want to hammer home, before we get into the three types of architecture we can use to deal with this, is that when we're dealing with deep learning, we're dealing with a system that models the world as a set of neurosynaptic connections, if you will, which are individually very simple. Think about that: we've got, let's say, a hundred billion neurons, and then we've got trillions of synapses. Each of those is an electrochemical connection that by itself is very simple. It's not something like a computer CPU; it's the connections between them, and the way they work together, that allow you to solve these bigger problems. A modern neural network can work in any type of machine learning environment. But the key is that you have to somehow model a system that can take in all this data, these hundreds of trillions of synapses and connections, if you will, process them, learn from them, and then produce some output. And that is where things get tricky.

Now, I promised that you wouldn't have to know about architecture, but I'm going to give you one slide here that's sort of standard freshman computer science. This is the von Neumann architecture that everybody talks about. You've got a CPU, your central processing unit, which includes hardware for control and hardware for the ALU, the arithmetic logic unit. This is where the data comes in and is processed, with very simple instructions; at this level, you're dealing with a few to a few hundred different instructions. This is where we're dealing with the ones and zeros that are in memory, and in a von Neumann architecture the memory can hold both instructions and data. Put it all together: you've got your input coming from some device, perhaps your keyboard; you've got your output, maybe going to a printer; and you've got data in memory. It all communicates through the operating system.

Great. But the issue we run into is that there are all these places where you can get bogged down. You can be bound by computation, or bound by too much communication. The CPU has a theoretical limit: the number of cycles, the number of instructions it can process in a given amount of time. If you're buying a new laptop, you'll see what the clock speed is; that's what this is all about. Then there's the speed of the memory itself; again, you can look at your PC or your Mac or your telephone specifications and see what the speed of the memory is. And then you've got the speed of the channel between the memory and the CPU. With all of that, if this is the architecture you're using, the only way to make it faster when you're dealing with a problem like computer vision is to make the memory component faster, make the control and the ALU faster, and make the communication faster. But that's where we run into limits.

Everybody talks about this in terms of Moore's Law. Moore's Law, very simply, is about the number of transistors on a chip, the density. The speed of electrons going through a circuit is constant, so the more transistors you have and the closer together they are, the faster things are going to be, the faster the clock. But there comes a point where you just can't speed this up by itself anymore; it's prohibited by either cost or physics.
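To make that fetch-and-execute picture concrete, here's a toy von Neumann machine in Python. It's a minimal sketch with a made-up three-instruction set, not any real instruction set; the thing to notice is that instructions and data live in the same memory, and every step of the loop is a trip across the single memory-to-CPU channel, which is exactly the bottleneck just described.

```python
# A toy von Neumann machine: one memory holds both instructions and data,
# and every fetch or load is a trip over the single memory<->CPU channel.

def run(memory):
    acc, pc = 0, 0                  # accumulator and program counter live in the CPU
    while True:
        op, arg = memory[pc]        # fetch: one trip over the memory bus
        pc += 1
        if op == "LOAD":            # decode + execute in the control unit / ALU
            acc = memory[arg]       # another trip over the same bus
        elif op == "ADD":
            acc += memory[arg]
        elif op == "STORE":
            memory[arg] = acc
        elif op == "HALT":
            return memory

# Instructions at addresses 0-3, data at 4-6, all in the same address space.
mem = {0: ("LOAD", 4), 1: ("ADD", 5), 2: ("STORE", 6), 3: ("HALT", None),
       4: 2, 5: 3, 6: 0}
print(run(mem)[6])                  # -> 5; every value crossed the bus one word at a time
```

No matter how fast you make the ALU, that one-word-at-a-time channel caps the whole machine.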
So what we've done in the architectural world over the last several decades is change the hardware around so that we have multiple cores. A core is sort of a processing unit, if you will. So within one computer, within one chipset, we may have multiple cores. And that's great: if the problem itself can be partitioned into separate pieces that can run in parallel, the operating system or the application is going to take the data, split it up, and assign it to the right processor.

Think about it this way. If you had to lay floor tile in a 10 by 10 room and your floor tiles were one foot square, you've got 100 tiles to put down, and you could pretty well figure out how long it was going to take you. Now, if you had to do three of those rooms, it would take you three times as long. Or you could have three people working in parallel; you'd have a little bit of overhead there, handing out the work. If you just had to do the one room, having three people might be too much overhead, because you'd get in each other's way. But if you had to do 100 rooms, or a room that's a thousand by a thousand, you would want to partition it. And perhaps instead of having all the same color floor tiles, now you want to make it a checkerboard; again, that's more overhead, and this is where the operating system and the control system have to be able to handle it. But there are some limits. So what we're going to look at is: at what point do we just need many, many more cores, and at what point do we need to change the fundamental architecture?

Last month we talked about how IBM Watson handled DeepQA, and I pointed out then that they used 90 servers with 32 cores per server. So the system, which was pretty impressive, was 2,880 cores. Think about that: the problem they were solving could be partitioned into up to 2,880 pieces that could run in parallel.

So what are the limits we run into? We'll see that this is why we're changing the fundamental architecture. Amdahl's law is named after Gene Amdahl, an ex-IBMer who founded his own company that made Amdahl computers. He's credited with this insight: the theoretical performance improvement you can get from improving any one resource, for a fixed workload (a workload being a set of steps and the data that goes with it), is limited by the part of the workload that can't benefit from the improvement. It's a convoluted way of looking at the world, but to net it out: if you have a job like putting those floor tiles down, there's part of it that can be done in parallel and part of it that can't, the organizational part. The organizational part is the limiting factor; the time for the parallel part can theoretically be driven toward zero, but the serial piece remains.

Now, on the left we have my MacBook: 2.6 gigahertz, with a quad-core Intel i7. So the MacBook has four cores, which is pretty much run of the mill these days. When I'm running a video processing application, I can see that all those cores are being used and I'm really straining the machine. And if you look at the specs here, this is Milky Way 2. It happens to be one that I've been tracking for a few years; until recently it was the fastest supercomputer in the world. You've got three million cores; that's a couple of hundred thousand processors.
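To make Amdahl's law concrete, here's a minimal Python sketch. The core counts echo the examples in this talk: a quad-core laptop, Watson's 2,880 cores, and a Milky Way 2-scale machine.

```python
# Amdahl's law: for a fixed workload with parallelizable fraction p run on
# n cores, speedup = 1 / ((1 - p) + p / n). The serial part is the hard limit.

def amdahl_speedup(p, n):
    """Theoretical speedup for parallel fraction p on n cores."""
    return 1.0 / ((1.0 - p) + p / n)

for p in (0.50, 0.90, 0.99):
    for n in (4, 2_880, 3_000_000):     # laptop, Watson-scale, Milky Way 2-scale
        print(f"p={p:.0%}  n={n:>9,}  speedup={amdahl_speedup(p, n):8.1f}")
```

Even with three million cores, a job that's 90% parallelizable tops out near a 10x speedup, because the serial 10% never gets any faster. Now, back to those Milky Way 2 processors.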
These are all pretty much off-the-shelf Intel Xeon processors. If you look at the specs, each chip has 12 cores, and they run at 2.2 gigahertz. So if you can't parallelize the application you're running, my MacBook Pro is actually faster than the fastest supercomputer in the world, because I'm running at 2.6 gigahertz and they're only running at 2.2. But they can take a big problem and spread it out over three million cores. Remember that the Watson system was just under 3,000 cores. I don't have the specs up here today, but the fastest machine as of June of this year is, I think, running about 10 million cores.

So the issue is: how do we take advantage of this? Because if you're running three million cores divided by 12, you've got a couple hundred thousand processors that need to communicate with each other. For most applications, that's not going to work; it's going to be too much communication. So we need to leverage the power of parallelism, or find something else that will allow us to break up the problems and solve them simultaneously, or solve them in a different way. And that's where we get to the first of the three architectures.

So let's think about neuromorphic architectures, the so-called brain-inspired architectures. Here are the devices, or the components: we could have a neuromorphic computer or device, or a neuromorphic chipset that goes into another device. The real key is that these are modeled after biological systems like the brain, with neurons and synapses. They can be implemented in analog circuitry or digital circuitry; we'll stick with digital for the moment so everything's consistent. But the real key here is that each processor that operates in parallel is very simple. The data that you would have to handle for a processor within a neuromorphic architecture is similar to the data the brain handles at one of those trillions of synapses, so it's communication and cooperation as these systems learn. Think about things you may have read about, like deep learning algorithms for looking at images, which go through multiple layers: the first layer may just look for light and dark, then that information gets passed up to the next layer, where we're looking for edges, and then we're looking for features. At each place where there's a processor, only very simple things are being done, rather than needing hundreds or hundreds of thousands of full computers like an Intel Xeon or Motorola chip, whoever's making it.

These have been in the research phase for quite some time: the European Commission's FACETS and SpiNNaker projects, for example. Personally, I'm most familiar with DARPA's SyNAPSE (Systems of Neuromorphic Adaptive Plastic Scalable Electronics), because I've actually talked to some of the people developing these, and we've seen examples. In fact, on the next slide, on the left, is an IBM SyNAPSE board, next to my aging iPad that I happened to have with me when I was looking at the board, so you can see the scale. That's the SyNAPSE board as it was, I think, two or three years ago. The rest of the slide shows what they're doing today: they've already demonstrated a 16-chip board with 16 million neurons and 4 billion synapses, and for scale, I think it's approximately the same size as the board next to the iPad.
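Since the whole neuromorphic idea rests on each unit being trivially simple, a sketch may help. Below is a minimal leaky integrate-and-fire neuron in Python; the model and parameter values are a common textbook simplification, illustrative assumptions rather than IBM's actual circuit. The point is that each unit just accumulates input and occasionally spikes, and it only does work when it fires, which is why these chips can draw so little power.

```python
# A minimal leaky integrate-and-fire neuron: integrate incoming current,
# leak a little each step, fire a spike when the potential crosses threshold.
# Parameter values are illustrative, not TrueNorth's actual model.

def lif_neuron(currents, threshold=1.0, leak=0.9):
    """Yield 1 for a spike, 0 for silence, for each input current."""
    potential = 0.0
    for current in currents:
        potential = potential * leak + current   # integrate, with leakage
        if potential >= threshold:
            potential = 0.0                      # fire and reset
            yield 1                              # the only moment "work" happens
        else:
            yield 0                              # silent: essentially no power drawn

inputs = [0.3, 0.3, 0.3, 0.0, 0.0, 0.6, 0.6, 0.0]
print(list(lif_neuron(inputs)))                  # -> [0, 0, 0, 0, 0, 1, 0, 0]
```

A neuromorphic chip is, conceptually, millions of these wired to their nearest neighbors, with communication happening only when spikes occur.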
And then if you look (I don't know if you can see it on your screen) on the upper right, this is the new TrueNorth chip, which is a 64-by-64 grid of cores. Each of those cores (one is blown up on the far right) is doing that one simple thing: it's looking at its nearest neighbors and seeing if a neuron is firing. These systems are now at the point where they're being manufactured, primarily for research, and you can take the chip and tile it together with others. So if you think about the example of putting tiles on a floor, here you're putting tiles together to create a larger and larger brain out of smaller, simpler ones. What's important here is that they only draw power as they fire, if you will, like a neuron; it's almost like when you sleep, you're using less power in your brain. Their goal right now is 4K chips in a single rack, representing four billion neurons (you can do the math to figure out how many synapses that is), all running at around four kilowatts of power. That's much less than what you would need for comparable capability today.

So that's a quick overview of the neuromorphic architectures. What I think is really exciting is that at that scale, we're talking about something that isn't commercial; you can't just go out and buy it. But on the next slide, and I'm sorry, I know this looks like a Qualcomm ad (Qualcomm is not a client of mine, though we've spoken to them a number of times about what they're doing here): the Qualcomm Zeroth NPU, or Neural Processing Unit, has been out for about two years commercially. And then, a couple of months ago, they announced that you can now get an SDK to do embedded machine intelligence. For those of you who aren't familiar with Qualcomm or where they're going with this, this is something that fits in a telephone handset. The impressive thing is that it's using a neuro- or brain-inspired model, with these neurosynaptic connections in the hardware, instead of a von Neumann architecture where everything goes through that one channel and your limiting factor is getting data in and out between memory and the processor. Here you've got all of these units working in parallel, and each one is very simple. With the SDK, so that you can actually write your code for it, it's now giving you the ability to model behavior and to start to get information as people use their handsets. You can write applications based on this; it's commercially out there today. So neuromorphic is no longer the future; it's the present. It's going to scale up alongside the other things that are going on, but right now this is probably the most advanced architecture that you can literally put in your pocket.

Let's look quickly at the next category, which is GPUs, graphical processing units, and advanced memory architectures. I'll give a couple of examples here. Going back to the laptop: for many years now, most mid-to-high-end laptops and desktop machines have had, in addition to the CPU, which may be multicore, GPUs, graphical processing units, originally brought in for things like gaming or improving the visual display. But more recently, as people were building more and more AI or cognitive systems, we realized that instead of having the fundamental unit just be a CPU, you can have the CPU with a GPU.
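Here's a minimal sketch of what that division of labor means computationally. NumPy's vectorized operations stand in for the GPU's thousands of cores; a real deployment would push the work to the device with something like CUDA, but the partitioning idea is the same: one simple operation applied across a big block of data at once.

```python
# The GPU idea in miniature: apply the same simple operation to many data
# elements at once, instead of one element at a time through the CPU.

import numpy as np

pixels = np.random.rand(1_000_000)          # say, brightness values from a camera

# Scalar, one-at-a-time style: a single instruction stream walks the data.
dark_scalar = [1 if p < 0.5 else 0 for p in pixels]

# Data-parallel style: one operation over every element "at once".
dark_parallel = (pixels < 0.5).astype(int)

assert dark_parallel.tolist() == dark_scalar    # same answer, very different hardware fit
```

The operating system and application still have to hand the data over in GPU-sized blocks, which is exactly the limiting factor discussed next.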
GPUs generally have many, many more cores than you would find on your CPU. So here's an NVIDIA card as an example. NVIDIA, as we'll see at the end, is pretty much the market leader in this space, largely coming from their sales of GPUs to manufacturers building systems for people who want to do video editing or gaming, something that really stresses the visual display (similar to the idea of rods and cones; it's kind of a parallel system here). But this one, a unit that would fit in your desktop computer, has 3,000 cores. So one GPU has more cores than the entire Watson system. Now, that's a little misleading, because the Watson system had 2,880 CPU cores, if you want to think of it in those terms, and it also had GPUs going along with them. But it's the idea of scaling up. If you can scale up and partition the data that you're working with (and this is where we get to the limiting factor: the operating system or the application needs to be able to leverage these GPUs; the data doesn't just automatically get handed to the GPU), then for a much lower cost and much lower energy consumption than you would incur by adding more CPU cores, by dedicating the GPUs to vision handling or using them as a substitute for a neural processor, you can improve performance drastically over what we had just a couple of years ago.

I'll give a couple of quick examples. This is one of the leaders in the market. Just to look at the way these can be scaled, put together, and packaged: in this case, in the system stack there, you'll see that at the bottom you've got a GPU-optimized version of Linux. One thing I could have mentioned when I showed the specs for the Chinese supercomputer that was tops in the world for a couple of years: in virtually all of the top 500 supercomputer installations in the world, just about everybody is running an optimized version of Linux as the underlying operating system. So here's one where you can do your own deep neural network using GPUs that are packaged on top of Linux.

A couple of months ago I talked about tensor processing and mentioned that Google had a language and had open-sourced their tensor processing software, TensorFlow, which is pretty cool from a software perspective. But now, just recently, in May, they're talking about and making available custom hardware, chips that are specifically optimized to process TensorFlow instructions. If you're familiar with the recent competition where Google beat a master at the game of Go with AlphaGo, this is what was underneath it: optimized hardware built just to run TensorFlow, with massively parallel software. So this is not something you're going to go buy at your RadioShack, if you still have one in town, but we can see these things coming out of the lab and starting to become available.

Next, it's kind of interesting to me that Facebook has been building their own GPU hardware and is now open-sourcing the design. When you get the slides, you can take a look at some of these references; I'm going to put in links for each of these projects. In this case, they've built their own GPU hardware for the software that reads the stories, handles the questions, does the recommendations, et cetera, that you're all familiar with if you actually use Facebook.
At the back end, because of the volume of data and because of the latency requirements, they built hardware to do this. And now they're open-sourcing the actual design through the Open Compute Project, so other manufacturers will be able to build systems based on it.

The last couple here in terms of GPUs and advanced memory: Micron is another one of the top companies in this space, but they're also building what they call the Automata Processor, which kind of bridges the gap, if you will. Rather than substituting a GPU for a standard CPU, it arranges things so that the operations are happening in memory. As they put it, it's not a memory device, it's memory-based. If you look at RAM, at the way memory is organized, it's very fast; it's silicon-based, and things happen in parallel, the bits are being moved in parallel, if you will. So what they've done is create an architecture that's more memory-based, pushing some of the processing down into the memory rather than pulling the data from the memory into the processor. Technically it's still fast registers in a processor that do the work, but conceptually it's moving the processing to where the data is, to speed things up. This is still conventional design conceptually, in that you're still dealing with ones and zeros, but it gives you some huge performance advantages over the traditional von Neumann model, where everything has to come into the registers and the ALU before it gets acted on. It's distributing the processing, just as we saw in the neuromorphic case, where it's distributed to very simple machines; here it's distributed to fewer, but more complex, if you will, Automata processors. The last one in this section I wanted to point out, still in the early stages, is a deep learning processor. This is the same idea of pushing things into memory; you can think of it as the hardware integration, if you will, of memory and processing.

All right. I'm going to wrap up this section by looking at quantum computing. The thing I want people to understand up front is that quantum computing is much further out than the other two approaches I've just talked about, and I'll put it all together in a summary slide. The fundamental thing with quantum computers is that instead of dealing with bits, where a bit is a binary digit, a zero or a one, with two states (that's why we're dealing in base two, binary, where everything is a power of two), they use properties of quantum physics, quantum mechanics, where each bit, the qubit, can be in a state of on, off, or both. It's called superposition. And while you're getting your mind around that, if you haven't looked at it before, I'll tell you that the big issue here is the materials physics: the actual system has to be kept at a very cold operating temperature.

This slide summarizes the three levels of sophistication for quantum computers. If you hear somebody saying, oh yeah, we've built a quantum computer and it's got six qubits (which I think was the state of the art for a while), what they're talking about is a measure of computing power, in the very simplest way of thinking about it.
Think about it: every time you add a bit in a standard computer, you double the amount of information you can store, right? Because it becomes another meaningful part of your representation; it's times two, whatever you had before. Here, if you take the simplest view of the world, where you're dealing with three states, then every time you add a qubit, you're multiplying by another factor of three. There are a lot more interesting things that go along with this if you're working on algorithms that deal with quantum mechanics, but it has some general properties that make it of interest to people working in cognitive computing, because of the speed and because of the volume it can handle.

The reason I included this one: again, you can't just go out and buy it, but you can go out and use the IBM quantum simulator. You can just go to the address shown here, get an account, and start to learn about quantum computing, which requires a completely different model for instructions. They've made it available for free for research, and it's cloud-based. That brings up another point: for any meaningful application, you're still going to need a conventional computer to pre-process the information for the quantum computer. So there's a lot of training involved, and there's a lot of infrastructure involved because of the cold.

I'll just throw this up, not that I'm expecting anybody to go out and read the paper, but this is a fairly recent one, from July and revised since. What's interesting about this one to me is that they're looking at what's called quantum supremacy. I'll freely admit I never made it all the way through the article, because we're dealing with 42 qubits and chaos theory. But what's most interesting is that this is coming from our old friends at Google. Google has talked about this and is working very diligently to extend the state of the art in quantum computers. They started out working with a Canadian company called D-Wave, and now they're building their own. I think what we're seeing here is that, because of the almost intractable problems Google deals with, the volume, the speed, and the complexity of the data, they've had to get into hardware research and development, if you will. And it's the same thing with Facebook; it's all driven by this push from big data.

I'll show one last thing, just as a brain teaser. MIT is currently working on something they call the probabilistic computing project, and that gets into hardware and software. I don't believe they've actually demoed or built any of the hardware yet. But one of the things we see in cognitive systems is that a lot of the time we're dealing with probabilistic problem solving, probabilistic as contrasted with deterministic. One set of facts may lead you to multiple next steps, if you will. If you're thinking about this in terms of finite automata, that's probably not how you would represent it: if you're at one state and you get another input, that could send you off to multiple next states. You don't know from the beginning where you're going, and you don't necessarily know what the sequence of steps was that got you there. They're working at MIT on this probabilistic representation language and building hardware to go with it.

So with that, I'm going to pull it all together and ask: where are we today?
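Before the wrap-up, a quick numeric illustration of that scaling argument. The three-state counting below follows the talk's deliberately simplified view of a qubit; in the standard formulation, n qubits carry 2^n complex amplitudes in superposition, which is where the real power comes from. Either way, the growth is exponential.

```python
# State-count growth: each classical bit doubles the number of representable
# states; in the simplified three-state view, each added qubit triples it.
# (Standard view: n qubits hold 2**n complex amplitudes in superposition.)

for n in (1, 10, 42):       # 42 qubits, as in the quantum supremacy paper above
    print(f"n={n:>2}  classical: 2^{n} = {2**n:,}"
          f"   simplified qubit view: 3^{n} = {3**n:,}")
```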
The bottom line is that as we start to build these cognitive systems that will learn, reason, understand, and plan, we're dealing with volumes of data that are pretty much unprecedented for consumer applications, and even for many enterprise applications. That requires us to process the data in parallel, either by partitioning it using GPUs, or by going to a system like the neuromorphic architectures that represent everything in many more, much simpler cores, or by doing something completely different and going to a multi-state bit like the qubit.

So where are we today? The GPU approach is a natural evolution, if you will, from what we've done in architectures for gaming and video. It's a proven approach for parallelism, it's relatively easy, and it's built to be interoperable with conventional systems. Most of the people listening today probably have a GPU on your desk or in your device. If you're going to scale up and build a cognitive system, one of the things to look at is how much of the workload can be allocated to GPUs or to these memory accelerators, so that you don't need to scale up with multiple processors.

Neuromorphic I rate as very promising, and it's ready now at the handset level; it's been out there for a year or two. The key things to consider here are that the programming model is going to be a little different, but it's based on a behavioral, process-oriented model, which should be more natural to people than what we'll see as we get into quantum. And these things have very low power requirements. The downside is that there's going to be a new software model, and skills you didn't need before to build applications.

Quantum is very promising; Google is looking at having something out in a year or two. But if you're building systems now for your company, in a couple of the industries that are well represented in these webinars, financial services, insurance, pharma, healthcare: if you want to build something now, you're probably not looking at quantum. It's something that I think people should be watching, but not waiting for, because you can already do much of what you want with GPUs or, if you're working at the mobile level, with neuromorphic.

So let's close it out by looking at the state of the market. I've mentioned some of these companies; we'll go from right to left, starting with the things to just keep an eye on. For quantum, IBM, D-Wave, and Google are doing some of the most interesting work, and they've got the most money invested. With reasonable confidence, I can say the next generation is going to come from one of these companies or a spin-off. With neuromorphic, IBM, in the U.S. anyway, is certainly the leader at the high end with the SyNAPSE work, and Qualcomm, by everything I've seen, is the leader at the mobile end. The research projects I referenced on an earlier slide are representative of the work going on in Europe on brain-inspired hardware. Now, the column on the left is where the action is right now. NVIDIA is certainly the leader in GPUs, and Intel has done a couple of acquisitions.
In particular, they just announced one for Movidius, which is basically a co-processor optimized for vision processing (all of these are co-processors), and there's Google with their tensor processing hardware. So commercial, off the shelf, you're looking at 90% or more being NVIDIA, Intel, or AMD. Then I've listed some of the companies that are on the horizon, doing interesting things in the GPU space and a little work on neuromorphic.

And with that, I'm just about blowing my time budget. We'll open it up to questions, but I wanted to give you a peek at the schedule for the next 13 months, barring something unforeseen, or unless we have to change it because there's some radical new invention that we need to cover. So, Shannon, can I hand it back to you and see if anybody has any questions today?

Awesome. Everyone's pretty quiet today, but I love seeing these titles; I'm really excited about next year, and for next month as well. Just a reminder to everyone: if you want to submit a question, submit it in the Q&A in the bottom right-hand corner. We've got just a couple of minutes left if you've got a question. And to answer one of the most popular questions: I will be sending a follow-up email by end of day Monday with links to the slides and links to the recording of this session. And yeah, everyone's pretty quiet today. I think everyone's still recovering from the week's events.

My cognitive system didn't call it, so I'll just leave it at that.

I'm so excited to hear about next year's sessions; we're going to be working to build everything out pretty quickly here. And we've got a quick question in for you: which computing architecture is appropriate for high-velocity data, like autonomous vehicle data that sends multiple gigabytes of files every 10 seconds? Which architecture for autonomous vehicles, that was the question.

Well, right now you could rule out anything with quantum, unless you're only driving in the Arctic. I would say that for the vehicle itself, today you're talking about a parallel architecture with GPUs. If anyone is interested, just shoot me an email; there are a couple that are really being optimized for autonomous vehicles, but it's primarily a GPU-oriented approach.

Sweet, I love it. And we're right at the top of the hour, which is perfect. Again, just a reminder: I'll send the materials out to everyone in the email, and I will also include Adrienne's contact information there, so you have all that in addition to what's printed in the slides. Adrienne, thank you so much for another fantastic presentation and another great month.

Thank you. And thanks to our attendees for being engaged in everything we do. We hope to see you next month. Have a great day, everyone. Thanks.