From around the globe, it's theCUBE, with digital coverage of AWS re:Invent 2020, sponsored by Intel, AWS, and our community partners. Everyone, welcome back to theCUBE's virtual coverage of AWS re:Invent 2020. We are theCUBE virtual. I'm John Furrier, your host, with my co-host Dave Vellante, for analysis of Swami's machine learning keynote: all things data, a huge set of announcements, and the first-ever machine learning keynote at a re:Invent. Dave, great to see you. Thank you for joining me from Boston; I'm here in Palo Alto. We're doing theCUBE remote, theCUBE virtual. Great to see you. Yeah, good to be here, John. As always, wall to wall, love it. So, John, how about I give you my key highlights from the keynote today? I had four curated takeaways, if you will. The first is that AWS is really trying to simplify machine learning and infuse machine intelligence into all applications. And if you think about it, that's good news for organizations, because they don't have to become machine learning experts, they don't have to invent machine learning, they can buy it from Amazon. I think the second is they're trying to simplify the data pipeline. The data pipeline today is characterized by a series of hyper-specialized individuals: data engineers, data scientists, quality engineers, analysts, developers. These are folks largely living in their own swim lanes. And while they collaborate, there's still a fairly linear and complicated data pipeline that a business person or a data product builder has to go through. Amazon is making some moves to try to simplify that. The third, I think, is they're expanding data access to the line of business. That's a key point: increasingly, as people build data products and data services that they can monetize for their business, either to cut cost or generate revenue, they can expand that into the line of business, where there's domain context.
And I think the last thing is the theme we talked about the other day, John, of extending Amazon AWS to the edge. We saw that as well in a number of the machine learning tools that Swami talked about. Yeah, it was great. By the way, we're live here in Palo Alto and Boston, covering the analysis. We've got tons of content on theCUBE. Check out theCUBE.net, and also check out the re:Invent theCUBE section; there are some links to on-demand videos with all the content we've had. Dave, I've got to say, one of the things that's apparent to me, and this came out of my one-on-one with Andy Jassy, and Andy Jassy talked about it in his keynote, is he kind of teased out this idea of training versus more value-add machine learning. And you saw that in today's announcements. To me, the big revelation was that the training aspect of machine learning is what can be automated away. And there's a lot of controversy around it. Recently a Google paper came out, and the researcher was essentially let go over it. But with these training algorithms, some are saying they cause more harm to the environment than good because of all the compute power they take. So you're starting to see the positioning of training as something that can be automated away and served up with high-powered chips. And they consider that undifferentiated heavy lifting, in my opinion. They didn't say that, but that's clearly what I see coming out of this announcement. The other thing I saw, Dave, that's notable is you saw them clearly taking a three-lane approach to machine learning: the advanced builders, the advanced coders, the developers, and then database and data analysts. Three swim lanes of personas, of target audience, clearly in line with SageMaker and the embedded stuff. So two big revelations: one, more horsepower required to process training and modeling, and two, the expansion of the personas that are going to be using machine learning.
So clearly, to me, this is a big trend wave that we're seeing, one that validates some of the startups, and also their SageMaker and some of their products. Well, as I was saying at the top, I think Amazon's really working hard on simplifying the whole process. And you mentioned training. A lot of times people are starting from scratch when they have to train models and retrain models. And so what they're doing is trying to create reusable components and allow people, as you pointed out, to automate and streamline some of that heavy lifting. As well, they talked a lot about doing AI inferencing at the edge. Swami talked about several foundational premises, the first being a foundation of frameworks. And if you think about the lowest level of their ML stack, they've got GPUs, different processors, Inferentia, all these alternative processors, not just x86. These are very expensive resources. And Swami and his colleagues talked a lot about how, a lot of times, the alternative processor is sitting there waiting, waiting, waiting. So they're really trying to drive efficiency and speed. They talked a lot about compressing the time it takes to run these models, from sometimes weeks down to days, and days down to hours and minutes. Yeah, let's unpack these four areas. Let's stay on the firm foundation, because that's their core competency, infrastructure as a service. Clearly they're laying that down with the processors. But what's interesting is TensorFlow: 92% of cloud TensorFlow runs on Amazon. The other thing is that PyTorch, surprisingly, is back up there with massive adoption. The numbers on PyTorch are literally on fire. I was going to make a joke on Twitter. The adoption of PyTorch is telling, because it means that TensorFlow, which originally came out of Google, is getting a little bit diluted by other frameworks. And then you've got MXNet and some other things out there.
So the fact that you've got PyTorch at 91% and TensorFlow at 92% on AWS is a huge validation. That means the majority of machine learning and deep learning development is happening on AWS. That's cloud-based, by the way, just to clarify: 92% of cloud-based TensorFlow runs on AWS, and 91% of cloud-based PyTorch runs on AWS. These are amazing, massive numbers, dominant. And I think the processors show that it's not trivial to do machine learning, but that's where the Inferentia chip came in. That's kind of where they want to go: lay down that foundation. They had Trainium, they had Inferentia as the chips, and then distributed training on SageMaker. So you've got the chips, and then you've got SageMaker as the middleware, Dave. It's almost like a machine learning stack; that's what they're putting out there. And Habana Gaudi, which is also for training, which is an Intel-based chip. So that was kind of interesting. So a lot of new chips and specialized chips. We've been talking about this for a while: particularly as you get to the edge and do AI inferencing, you need a different approach than we're used to with the general-purpose microprocessor. So I'll get your take on tenet number two. Tenet number one was clearly infrastructure; a lot of announcements there, and we'll review them at the end. But tenet number two that Swami put out there was creating the shortest path to success for builders, for machine learning builders. And I think here he lays out the complexity, Dave, but it's mostly around methodology and the value activities required to execute. And again, this points to the complexity problem that they have. What's your take on this? Well, think about, again, what I was saying about the pipeline. You collect data, you ingest data, you prepare that data, you analyze that data, you make sure it's high quality. And then you start the training, and then you're iterating.
And so they're really trying to automate as much as possible and simplify as much as possible. What I really liked about that segment, foundation number two if you will, is the customer example, the speaker from the NFL, who talked about the AWS stats we see in the commercials, Next Gen Stats. And she talked about the ways in which, well, we all know they've re-architected helmets. It's really very much a database now, which was interesting to see. They had the spectrum of the helmets, from the most safe to the least safe, and how they've migrated everybody in the NFL to the safer ones. She cited a 24% reduction in reported concussions. You've got to give the benefit of the doubt and assume some of that's through the data. But some of that could be like Julian Edelman popping up off the ground when he had a concussion, because he doesn't want to come out of the game under the new protocol. But no doubt they're collecting more data on this stuff. And it's not just head injuries; they talked about ankle injuries, knee injuries. So all this comes from training models and reducing the time it takes to go from raw data to insights. Yeah, I mean, I think the NFL is a great example. You and I both know how hard it is to get the NFL to come on and do an interview. They're very coy; they don't really put their name on much, because of the value of the NFL brand. That's a meaningful partnership. You had the person on stage, virtually, really going into some real detail around the depth of the partnership. So to me, it's real. First of all, I love Next Gen Stats; what they do with the stats is phenomenal. But this points to a real-world example, Dave, with sports as one metaphor. Healthcare and others, you're going to see those coming in. To me, it's totally a tell sign that Amazon continues to lead.
The thing that got my attention was that it's an IoT problem. And there's no reason why they shouldn't get after it. Some say, oh, the NFL is just covering itself on concussions, but they don't have to do this; it's actually really working. So you've got the tech, why not use it? And they are. To me, that's impressive. And I think that's, again, a digital transformation sign: when the NFL is doing it, it's real. So let's see how that goes. Look, I think it's easy to criticize the NFL, but the reality is, in the old days it was like, hey, you got your bell rung, get back out there. That's just the way it was in football. Ted Johnson was one of the first to speak out; Bill Belichick was the guy who sent him back out there with a concussion, and Johnson was very outspoken about it. But you've got to give the NFL credit: it didn't just ignore the problem. Yeah, maybe it took a little while, and these things take some time, because it was generally accepted back in the day that, hey, you get right back out there. But the NFL has made big investments here, and like I say, you've got to give them props for that. Especially given that they're collecting all this data; to me the most interesting angle here is letting the data inform the action. And next up after the NFL, they had this data prep, Data Wrangler news: they're now integrating Snowflake, Databricks, and MongoDB into SageMaker, alongside Athena, Redshift, S3, and Lake Formation. Those sources come into SageMaker, not the other way around. So again, you've been following this pretty closely, especially with Snowflake's recent IPO and their success. This is an ecosystem play for Amazon. What does it mean? Well, a couple of things. As you well know, John, when you first called me up, I was in Dallas and flew into New York in an ice storm to get to one of the early Hadoop Worlds. And back then it was all batch; big data was this big batch job. And today you want to combine that batch.
There's still a lot of need for batch, but people want real-time inferencing, and AWS is bringing those together, and they're bringing in multiple data sources. You mentioned Databricks, Snowflake, Mongo; these are three platforms that are doing very well in the market and holding a lot of data, and AWS is saying, okay, hey, we want to be the brain in the middle. You can import data from any of those sources, and I'm sure they're going to add more over time. And they talked about 300 pre-configured data transformations that now come with SageMaker Studio. Essentially, and I've talked about this a lot, they're abstracting away the IT complexity, the whole IT operations piece. I mean, it's the same old theme: AWS is pointing its platform and its cloud at non-differentiated heavy lifting, and it's moving up the stack now into the data lifecycle and data pipeline, which is one of the biggest blockers to monetizing data. Expanding on that more, what does that actually mean? If I'm an IT person, translate that into IT speak. Yeah, so today, if you're a business person and you want answers, right, and you want to ingest a new data source, let's say you want to build a new product. To give an example, let's say you're like a Spotify, just making it up, and you do music today, but you want to add movies, or you want to add podcasts, and you want to start monetizing that. You want to identify who's watching what, you want to create new metadata; well, you need new data sources. So what you do as a business person that wants to create that new data product, let's say for podcasts, is you have to knock on the door and get to the front of the data pipeline line and say, okay, hey, can you please add this data source?
And then everybody else down the line has to get in line: hey, here comes a new data source. It's this linear process where very specialized individuals have to do their part, and then at the other end comes a self-serve capability that somebody can use to either build dashboards or build a data product. And a lot of that middle part is operational details around deploying infrastructure, deploying and training machine learning models, a lot of Python coding; yeah, there are SQL queries that have to be done, so a lot of very highly specialized activities. What Amazon is doing, my takeaway, is they're really streamlining a lot of those activities, removing what they always call the non-differentiated heavy lifting, abstracting away that IT complexity. To me, this is a real positive sign, because it's all about the technology serving the business, as opposed to historically, where it's the business begging the technology department to please help me, the technology department obviously evolving from the glass house, if you will, to this new data pipeline, data lifecycle. I mean, it's classic agility to take down those barriers. It's undifferentiated, I guess, but if it actually works, it creates a differentiated product. You can debate that aspect of it, but I hear what you're saying: just get rid of it and make it simpler. The impact of machine learning bias, Dave, is one thing that came out clearly in this SageMaker Clarify announcement, which detects bias in data and models. They had an expert, Nashlie Sephus, present essentially how they're dealing with the bias piece of it. I thought that was very interesting. What did you think? Well, humans are biased, and humans build models, so models are inherently biased. And so I thought, you know, this is a huge problem. There are two big problems in artificial intelligence: one is the inherent bias in the models, and the second is the lack of transparency, what they call the black box problem.
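The bias problem the two are circling can be made concrete with a simple check. Below is a minimal sketch, in plain Python, of a disparate-impact ratio, a standard fairness metric of the kind a tool like Clarify reports; the toy data and this implementation are illustrative assumptions, not Clarify's actual API:

```python
# Minimal sketch of a pre-training bias check: the disparate-impact
# ratio compares positive-outcome rates between two groups.
# Illustrative data only; SageMaker Clarify's real API differs.

def positive_rate(labels):
    """Fraction of samples with a positive (1) label."""
    return sum(labels) / len(labels)

def disparate_impact(group_a_labels, group_b_labels):
    """Ratio of positive rates; values far from 1.0 suggest bias."""
    return positive_rate(group_a_labels) / positive_rate(group_b_labels)

# Toy dataset: loan approvals (1) vs. denials (0) for two groups.
group_a = [1, 1, 0, 1, 0, 1, 1, 0]   # 5/8 approved
group_b = [1, 0, 0, 0, 1, 0, 0, 0]   # 2/8 approved

ratio = disparate_impact(group_a, group_b)
print(f"Disparate impact ratio: {ratio:.2f}")  # 2.50 -> group B disadvantaged
```

A tool in this space would flag a ratio that far from 1.0 and, as Dave notes below, suggest remediation; the black-box transparency problem is a separate, harder question.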
Like, okay, I know there was an answer there, but how did it get to that answer, and how do I trace it back? And so Amazon is really trying to attack those with Clarify. I wasn't sure if it was Clarity or Clarify; I think it's Clarify. I'm not entirely certain how it works; we really have to dig more into that, but it's essentially identifying situations where there is bias, flagging those, and then, I believe, making recommendations as to how it can be stamped out. Yeah, and also some other news: deep profiling for SageMaker Debugger, which does deep profiling of neural network training, which is very cool, again on that same theme of profiling. The other thing that I found... Don't remind me, John, or I may throw up; don't remind me of grammar corrections and, you know, when you're typing, code corrections and automated debugging. Try this: give us a better debugger. Come on, first of all it should be bug-free code, but, you know, the biases in the data, that's critical. The other news I thought was interesting: Amazon's claiming this as a first, SageMaker Pipelines, purpose-built CI/CD for machine learning, bringing machine learning into a developer construct. And they started bringing in this idea of Edge Manager, and they talked about SageMaker Feature Store, storing your features. This idea of managing and monitoring machine learning models effectively on the edge and through the development process is interesting, and really targeting that developer, Dave. Yeah, applying CI/CD to machine learning and machine intelligence has always been very challenging, because again, there are so many piece parts. And, you know, I said it the other day: a lot of the innovations that Amazon comes out with address problems that have come up given the pace of innovation that they're putting forth.
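The appeal of CI/CD for ML is that each stage of the pipeline, prep, train, evaluate, becomes a repeatable, composable step with the output of one feeding the next. As a toy illustration of that step-chaining pattern (a hypothetical sketch, not SageMaker Pipelines' real SDK):

```python
# Toy illustration of an ML pipeline as chained, repeatable steps,
# in the spirit of CI/CD for machine learning. This sketches the
# pattern only; SageMaker Pipelines' actual SDK is different.

class Pipeline:
    def __init__(self):
        self.steps = []          # ordered (name, function) pairs

    def step(self, name, fn):
        self.steps.append((name, fn))
        return self              # allow fluent chaining

    def run(self, artifact):
        for name, fn in self.steps:
            artifact = fn(artifact)   # each step's output feeds the next
        return artifact

# Hypothetical stage functions operating on a shared artifact dict.
def prepare(data):
    data["rows"] = [r for r in data["rows"] if r is not None]  # drop nulls
    return data

def train(data):
    data["model"] = sum(data["rows"]) / len(data["rows"])  # "model" = mean
    return data

def evaluate(data):
    data["passed"] = data["model"] > 0          # trivial quality gate
    return data

result = (Pipeline()
          .step("prepare", prepare)
          .step("train", train)
          .step("evaluate", evaluate)
          .run({"rows": [3, None, 5, 4]}))
print(result["model"], result["passed"])  # 4.0 True
```

Because each step is an isolated function, it can be versioned, tested, and re-run on new data, which is the property that makes the CI/CD framing fit.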
And it's like customers drinking from a fire hose. We've talked about this at previous re:Invents: can the customers keep up with the pace of Amazon? So I see this as Amazon trying to reduce friction across its entire stack. Here's another example. Well, Swami laid out a slide that had machine learning gurus, developers, and then database and data analysts. Clearly, database developers and data analysts are on their radar. This is not the first time we've heard that, but it is the first time we're starting to see products materialize, where you have machine learning for databases, data warehouses and data lakes, and then BI tools. So again, three different segments: the databases, the data warehouses and data lakes, and then the BI tools. Three areas of machine learning innovation where you're seeing some product news. Your take on this, natural evolution or... Well, it's what I was saying up front: the good news for customers is you don't have to be a Google or Amazon or Facebook to be a super expert at AI. Companies like Amazon are going to be providing products that you can then apply to your business, allowing you to infuse AI across your entire application portfolio. Amazon Redshift ML was another example of them abstracting complexity. They're taking S3, Redshift, and SageMaker complexity, abstracting that away, and presenting it to the data analyst so that that individual can worry about, again, getting to the insights. It's injecting ML into the database, much in the same way, frankly, that BigQuery has done it. And so that's a huge positive. When you talk to customers, they love it when ML can be embedded into the database and all that complexity is simplified; they absolutely love it, because they can focus on more important things.
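That "ML embedded in the database" pattern, where a trained model is exposed as an ordinary SQL function the analyst can call inline, can be illustrated locally with SQLite's user-defined functions. The model below is a hard-coded stand-in; the real Redshift ML workflow trains a model via SageMaker and exposes it through a `CREATE MODEL` statement, which this sketch does not reproduce:

```python
# Illustration of the "ML in the database" pattern: register a model
# as a SQL function so analysts can score rows inline in a query.
# The "model" here is a hard-coded stand-in, not a trained model.
import sqlite3

def predict_churn(monthly_spend):
    """Stand-in scoring function; a real model would be trained."""
    return 1 if monthly_spend < 20 else 0

conn = sqlite3.connect(":memory:")
conn.create_function("predict_churn", 1, predict_churn)

conn.execute("CREATE TABLE customers (name TEXT, monthly_spend REAL)")
conn.executemany("INSERT INTO customers VALUES (?, ?)",
                 [("alice", 55.0), ("bob", 12.5), ("carol", 31.0)])

# The analyst never leaves SQL: the model is just another function.
rows = conn.execute(
    "SELECT name, predict_churn(monthly_spend) FROM customers"
).fetchall()
print(rows)  # [('alice', 0), ('bob', 1), ('carol', 0)]
```

The point Dave makes holds in the sketch: the person writing the query never touches the model's deployment, only its output.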
Clearly in this tenet, in this part of the keynote, they were laying out all their announcements: QuickSight ML Insights out of the box, QuickSight Q available in preview, all the announcements. And then they moved on to the next, the fourth tenet, Dave: solving real problems end to end. Kind of reminds me of the theme we heard at Dell Technology World last year, end-to-end IT. We're starting to see the land grab, in my opinion, with Amazon really going way beyond IaaS and PaaS. They talked about contact centers, Kendra, Lookout for Metrics. And then Matt Wood came on and talked about all the massive disruption in the industries. He said, literally, machine learning will disrupt every industry. They spent a lot of time on that. And they went into computer vision at the edge, which I'm a big fan of; I just love that product. Clearly every vertical, Dave, is up for grabs. That's the key Dr. Matt Wood message. Yeah, I totally agree. I see machine intelligence as a top layer of the stack. And as I said, it's going to be infused into all areas. It's not some kind of separate thing. You know, like Kubernetes, we think of it as some separate thing; it's not, it's going to be embedded everywhere. And I really like Amazon's edge strategy. You were the first to sort of write about it in your keynote preview, where Andy Jassy said, we want to bring AWS to the edge, and we see the data center as just another edge node. And so what they're doing is bringing SDKs, they've got a package of sensors, they're bringing appliances. I've said many, many times, the developers are going to be the linchpin of the edge. And so Amazon's bringing its entire data plane, its control plane, its APIs to the edge, and giving builders, slash developers, the ability to innovate. And I really like that strategy versus, hey, here's a box, it's got an x86 processor inside.
I'm going to throw it over to the edge, give it a cool name that has "edge" in it, and here you go. Hyper Edge. That sounds good. Call it Hyper Edge, you know? I mean, the other thing is the data aspect at the edge. Everything's got a database; data warehouses and data lakes are involved in everything. And then some sort of BI or tools to get the data and work with the data, for the data analysts. Data feeds machine learning; that's a critical piece of all this, Dave. I mean, databases used to be a boring field. You know, I have a degree in database design, one of my computer science degrees. Back then, no one really cared if you were a database person. Now it's like, man, data is everything. This is a whole new field. This is an opportunity, but also, I mean, are there enough people out there to do all this? Well, it's a great point. And I think this is why Amazon is trying to abstract away some of the complexity. I sat in on a private session around databases today and listened to a number of customers. And I will say this: some of it, I think, was under NDA, so I can't say too much. But Amazon's philosophy on databases, and you addressed this in your conversation with Andy Jassy, across its entire portfolio, is to have really fine-grained access to the deep-level APIs across all their services. And he said this to you: we don't necessarily want to be the abstraction layer per se, because when the market changes, that's harder for us to change; we want to have that fine-grained access. And so you're seeing that with databases, whether it's NoSQL, SQL, Aurora, the different flavors of Aurora, DynamoDB, Redshift, RDS, on and on and on; there's just a number of data stores. And you're seeing, for instance, Oracle take a completely different approach.
Yes, they have MySQL, because they got that with the Sun acquisition, but they're really about putting as much capability into a single database as possible: oh, you only need one database. Totally different philosophy. Yeah. And then obviously you've got HealthLake, and that was pretty much the end of the announcements. Big impact on healthcare. Again, the theme of horizontal data and vertical specialization, with data science and software, playing out in real time. Yeah, well, I have asked this question many times on theCUBE: when is it that machines will be able to make better diagnoses than doctors? And, you know, that day is coming, if it's not here. I think HealthLake is really interesting. I've got an interview later on with one of the practitioners in that space. Healthcare is an industry that's ripe for disruption; it really hasn't been disrupted. It's a very high-risk industry, obviously. But look at healthcare: as we all know, it's too expensive, it's too slow, it's too cumbersome, it takes too long sometimes to get a diagnosis or be seen. Amazon is trying to attack all those problems with its partners. Well, Dave, let's summarize our take on Amazon's machine learning keynote. I'd say it's pretty historic in the sense that there was so much content. In the first keynote last year, Andy Jassy spent like 75 minutes, he told me, on machine learning; they had to kind of create its own category. Swami, who we've interviewed many times on theCUBE, was awesome. But there's still a lot more stuff: 215 machine learning announcements this year, more key people than ever before. Moving faster, solving real problems, targeting the builders, a broad platform set of things. This is the Amazon cadence. What's your analysis of the keynote? Well, I think a couple of things. One is, you know, we've said for a while now that the new innovation cocktail is cloud plus data plus AI.
That's really data, with machine intelligence or AI applied to that data, at the scale of cloud. Amazon obviously has nailed the cloud infrastructure. It's got the data; that's why database is so important. And it's got to be a leader in machine intelligence. You're seeing this in the spending data: with our partner ETR, you see that AI and ML spending momentum is at or near the highest, along with automation and containers. And why is that? It's because everybody is trying to infuse AI into their application portfolios. They're trying to automate as much as possible. They're trying to get insights that the systems can take action on. And really it's augmented intelligence in a big way, but it's driving insights, speeding that time to insight. And Amazon has to be a leader there. It's Amazon, it's Google, it's the Facebooks, it's obviously Microsoft. You know, IBM is trying to get in there; they were kind of first with Watson, but they're far behind, I think, the hyperscale guys. But I guess the key point is, you're going to be buying this. Most companies are going to be buying this, not building it. And that's good news for organizations, as we know. Yeah, I mean, you get 80% of the way there with the product. Why not go that way? The alternative is to try to find some machine learning people to build it, and they're hard to find. So you're seeing the scaling, kind of replicating machine learning expertise with SageMaker, then ultimately into databases and tools, and then ultimately built into applications. This is the thing, Dave: my opinion is that Amazon continues to move up the stack with their capabilities. I think machine learning is interesting because it's a whole new set of capabilities, kind of its own little monster building block. It's not just one thing. It's going to be super important, and I think it's going to have an impact on the startup scene and innovation.
It's going to have an impact on incumbent companies that are currently leaders and are under threat from new entrants entering the business. So I think it's going to be a very entrepreneurial opportunity, and what's going to be interesting to see is how machine learning plays that role. Is it a defining feature that's core to the intellectual property, or is it enabling new intellectual property? To me, I just don't see how that's going to fall yet. I would bet that, going forward, intellectual property will be built on top of Amazon's machine learning, while the new algorithms and the new things get built separately. If you compete head to head with that scale, you could be on the wrong side of history. Again, this is a bet that the startups and the venture capitalists will have to make: who's going to end up being on the right wave here? Because if you make the wrong design choice, you can have a very complex environment, with IoT or whatever your app is serving. If you can narrow it down and get a wedge in the marketplace as a company, I think that's going to be an advantage. It'll be very interesting to see what the impact on the ecosystem will be. Well, I think something you said just now gives a clue. You talked about the difficulty of finding the skills, and I think a big part of what Amazon and others innovating in machine learning are trying to do is close the gap between those that are qualified to actually do this stuff, the data scientists, the quality engineers, the data engineers, et cetera, and everyone else. Over the last 10 years, companies went out and tried to hire these people. They couldn't find them. They tried to train them; it was taking too long. And now I think they're looking toward machine intelligence to really solve that problem, because that scales. As we know, outsourcing to services companies and just hardcore heavy lifting doesn't scale that well. Well, you know what? Give me some machine learning. Give it to me faster.
I want to take the 80% and build on top of it, certainly on the media cloud and theCUBE virtual that we're doing. Again, every vertical is going to be impacted, Dave. Great to see you. Great stuff so far in week two. So, you know, we're theCUBE, live, covering the keynotes. Tomorrow we'll be covering the keynotes for the public sector day; that should be chock-full of action. That environment has been impacted the most by COVID. A lot of innovation, a lot of coverage. I'm John Furrier with Dave Vellante. Thanks for watching.