Live from Boston, Massachusetts, it's theCUBE, covering AWS re:Inforce 2019. Brought to you by Amazon Web Services and its ecosystem partners. Hey, welcome back, everyone. It's theCUBE's live coverage here in Boston, Massachusetts, for AWS re:Inforce, Amazon Web Services' inaugural event around cloud security. I'm John Furrier, with Dave Vellante. Two days of coverage, and we're winding down day two. We're excited to have here in theCUBE a special guest behind one of the big announcements. Well, I think it's been a nerdy announcement: automated reasoning. Byron Cook is director of the Automated Reasoning Group within AWS. Again, this is part of the team that's going to help figure out security and use automation to augment humans. Great to have you on. Big part of the show here. Welcome to theCUBE. Yeah, thanks very much for having me. All right, so explain the Automated Reasoning Group. Werner Vogels had a great blog post on All Things Distributed: it applies formal verification techniques in an innovative way to cloud security and compliance, for our customers and for our own AWS developers. What does that mean? A bunch of math? Yeah, let me try. I'll give you one explanation, and then if I puzzle you, I'll try to explain it in a different way. So, do you know the Pythagorean theorem? Yep, oh sure. Yeah, so the Pythagorean theorem is about all triangles, and it was proved in approximately 300 BC. The proof is a finite description in logic as to why it's true, and it holds for all possible triangles. So we're basically using the same approaches to prove properties of policies, of networks, of programs, so for example, crypto, virtualization, storage, et cetera. So we write software that finds proofs in mathematics, and the proofs are the same kind as what Euclid found for the Pythagorean theorem.
So you apply it to solve problems that become these mundane tasks of checking config files, making sure things are... It's a little bit more than that, so I'll give you an example. So, s2n, which is the TLS implementation used, for example, in S3, and in the large majority of AWS, has approximately 12,000 state-holding elements if you include the stack and the heap usage, so the number of possible states it could reach is two to the 12,000. And if you wanted to show that the TLS handshake implementation is correct, or the HMAC implementation is correct, or the deterministic random bit generator implementation is correct, which is what we do, using conventional methods, like trying to run tests on it, then even if you had, like, a million Haswell microprocessors, you would need many more lifetimes than the sun is going to emit light, at 3.4 billion dollars a year, to exhaustively test that system. So what we do is, rather than just running a bunch of inputs on the code, we represent it as a mathematical system, and then we use proof techniques to automatically search for a proof. And with our tools, in about 10 minutes, we are able to prove all of those properties of s2n. In 10 minutes? Yeah, in 10 minutes. And then we apply that to pieces of S3, pieces of the EC2 virtualization infrastructure, and then what we've done is we've realized that customers had a lot of questions about their networks and their policies. So, for example, they have a complicated network worldwide, different availability zones, different regions, and they want to ask, hey, does there exist a way for this machine to connect to this other machine? Or, you know, does all SSH traffic coming in that eventually gets to my web server go through a bastion host, which is a best practice? And then we can answer that question, again, using logic.
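The bastion-host question above can be illustrated with a small sketch. This is not how the AWS tooling works (the speaker describes encoding the question in logic and calling a theorem prover); it swaps in plain breadth-first graph search on a hypothetical three-node network, purely to show what the question means. All node names and the `EDGES` table are assumptions made up for illustration.

```python
from collections import deque

# Hypothetical toy network, not a real EC2 topology.
# An edge means "SSH traffic may flow from the key to each listed node".
EDGES = {
    "internet": ["bastion"],
    "bastion": ["web_server"],
    "web_server": [],
}

def reachable(edges, source, target, blocked=frozenset()):
    """Breadth-first search: can traffic get from source to target
    without passing through any node in `blocked`?"""
    if source in blocked or target in blocked:
        return False
    seen = {source}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            return True
        for nxt in edges.get(node, []):
            if nxt not in seen and nxt not in blocked:
                seen.add(nxt)
                queue.append(nxt)
    return False

def all_paths_use(edges, source, target, waypoint):
    """True when every path from source to target passes through waypoint,
    i.e. the target becomes unreachable once the waypoint is removed.
    Note: this is vacuously True when no path exists at all."""
    return not reachable(edges, source, target, blocked={waypoint})
```

On this toy network, `all_paths_use(EDGES, "internet", "web_server", "bastion")` holds; adding a direct `internet` to `web_server` edge makes it fail, which is the kind of misconfiguration the question is meant to catch.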
So we take the representation, the semantics of EC2 networking, the policy, the network from the customer, and then the question we're asking, express them in logic, call a theorem prover, and get the answer back, and then the same for policies. So you're analyzing policies. Policies, networks, programs. Networks, connections. Yeah. Right? Yeah. And the tooling is Zelkova and... So, yeah, so basically we come with a technique, we come with an approach, and then we have many tools that implement the approach on different problems. That's how you apply it. So Zelkova, underneath, it all uses a kind of tool called SAT and SMT solvers. So a SAT solver proves theorems about formulae in propositional logic, and SMT is satisfiability modulo theories. Those tools can prove properties of problems expressed in first-order logic. And so what we do is, for example, if you have a question about your policies, answering semantic-level questions about policies is actually a PSPACE problem, so that's harder than NP-complete. We express the question in logic, call the solver, get the answer back, and marshal it back, and that's what Zelkova does. So that's calling a tool called CVC4, which is an open source prover, and in Zelkova we take the policy and the question, encode them in logic, call the solver, and marshal the answer back. What's the root of this? I mean, presumably there's some academic research that was done, and you guys are applying it for your specific use case, but can you share with us the origination of this? So the first NP-complete problem was discovered by a Cook, not me, another Cook, in the early 70s. And so he proved that the propositional satisfiability problem is NP-complete. And meanwhile, there's been a lot of research from the 60s, so Davis and Putnam, for example, I think a paper from the mid-60s, where they were trying to answer the question of, can we efficiently solve this NP-complete problem, propositional satisfiability?
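To make "propositional satisfiability" concrete, here is a minimal sketch of a backtracking SAT procedure with unit propagation, in the spirit of the Davis and Putnam line of work mentioned above. It is a toy for illustration only: it is not Zelkova, not CVC4, and not how a modern competition solver is built. The clause encoding (signed integers, as in the common DIMACS convention) and the function name `dpll` are choices made here, not anything from the interview.

```python
def dpll(clauses, assignment=None):
    """Decide satisfiability of a formula in conjunctive normal form.

    clauses: list of clauses, each a list of nonzero ints;
             3 means variable x3 is true, -3 means x3 is false.
    Returns a dict {variable: bool} that satisfies every clause
    (variables left out may take either value), or None if unsatisfiable.
    """
    if assignment is None:
        assignment = {}

    # Simplify every clause under the current partial assignment.
    simplified = []
    for clause in clauses:
        remaining = []
        satisfied = False
        for lit in clause:
            var = abs(lit)
            if var in assignment:
                if assignment[var] == (lit > 0):
                    satisfied = True
                    break
            else:
                remaining.append(lit)
        if satisfied:
            continue
        if not remaining:
            return None  # every literal is false: conflict
        simplified.append(remaining)

    if not simplified:
        return assignment  # all clauses satisfied

    # Unit propagation: a one-literal clause forces that literal.
    for clause in simplified:
        if len(clause) == 1:
            lit = clause[0]
            return dpll(simplified, {**assignment, abs(lit): lit > 0})

    # Otherwise branch on the first unassigned variable.
    var = abs(simplified[0][0])
    for value in (True, False):
        result = dpll(simplified, {**assignment, var: value})
        if result is not None:
            return result
    return None
```

For example, `dpll([[1, 2], [-1], [-2, 3]])` finds an assignment with x1 false and x2, x3 true, while `dpll([[1], [-1]])` returns `None` because the two clauses contradict each other.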
And that research has continued. There have been a bunch of breakthroughs: there was a big breakthrough in 2001, and then some further breakthroughs in the 2005, 2008 range. So what we're seeing is that the solvers are getting better and better. So there's an international competition of, let's say, usually about 30 solvers, and there's a study recently where they took all of the winners from this competition each year, 2002 to 2011, and compared them on the same benchmarks and hardware. The 2002 solver was able to solve a quarter of the benchmarks, the 2011 solver solved practically all of them, and the 2019 solvers are even better. And so nowadays they can take problems in logic that have many tens of millions of variables and solve them very efficiently. So we're really using the power of those underlying solvers and marshaling the questions to those solvers. So you're codifying thinking in math, and that's the power of the math. You gave a talk in one of the sessions around provable security; that's kind of the title. What does that entail? Can you just explain that concept and the talk thesis? So, mathematical logic is 2,000 years old, right? And it has been refined, so Boole, for example, made logic less of a philosophical thing and more of a mathematical thing. And then automated reasoning was sort of developed in the 60s, where you apply algorithms to find proofs in mathematical logic, and then provable security is the application of automated reasoning to questions in security and compliance. So you want to prove absence of memory corruption errors in C code, or you want to prove termination of event handling routines that are supposed to handle security events. All of those questions are properties of your program, and you can use these tools to automatically find proofs, or to check proofs that have been found manually, and that's where provable security fits in.
What was the makeup of the attendee list? Were people grokking this? Were people excited? Was it all a bunch of math geeks? Because you have a cross-section of great security people here, and they're deep-dive conversations. It's not like re:Invent, this show. This is really deep security. What was some of the feedback, and the makeup of the attendees of these talks? So I'm going to give you two answers, because I actually gave two talks, and the answers are a little bit different depending on the subject of the talk. So there was one on provable security, which was basically the foundations of logic and how those tools work: Tiros, Zelkova, and our program verification tools, because we also prove correctness of crypto and so on and so forth. And so that was largely folks who had heard about it and were wanting to know more, wanting to know how we're using it, and trying to learn. There was a second talk, which was about the application of it to compliance. So that was with Tom McAndrew, who's the CEO of Coalfire, one of the third-party auditors that AWS uses and a lot of customers use, and also Chad Woolf, who's Vice President of Security, focused on compliance. And so the three of us spoke about how we're using it internally within AWS to automate certification, to complete certification. And so that crowd was a really interesting mixture of people interested in automated reasoning and people interested in compliance, which are two communities you wouldn't think normally hang together, but it's sort of like chocolate and peanut butter. It turns out to be a really great application. And they need to work together too, because this is where all the action is. They don't get stuck in the compliance and auditing; full engineering teams are merging with kind of old-school compliance nerds. So there's a really interesting sort of dynamic to proof that has the perfect use case in compliance. So the problem of proving termination of programs is undecidable.
Proving problems in propositional logic is NP-complete. It all sounds very difficult, and you use heuristics to solve those problems. But the thing is that once you've found a proof, replaying the proof is linear in the size of the proof. And so actually you can do it extremely efficiently. And that has application in compliance. So one could imagine that you have, for example, PCI, HIPAA, FedRAMP; you have certain controls and you want to prove that the property holds. Like, for example, within AWS, we have a control that all data at rest must be encrypted. So we are using program verification tools to show that of the code base. But now, once we've run that tool, that constructs a proof, like Euclid found for the Pythagorean theorem, that you can package up in a file and hand to an auditor, and then a very simple, easy-to-understand third-party open source tool can replay that proof. And so that becomes audit evidence. It's scale. A total example of scale. Yeah, so I would say the engineering problem you're solving is security at scale. The business problem you're solving is trust. Customers are struggling just implementing better security. There just aren't enough security professionals to hire. So the old, as the talk explains... they're all on YouTube, so people watching this show can go check it out. But by the way, I should make a plug: if you Google AWS Provable Security, there's a webpage on AWS that has papers and videos and lots of information, so you might want to check that out. I can't remember what I was answering now, so I lost track. It's got links to the academic channels as well, right? Oh yes, so that was the point that Tom McAndrew was pointing out: in the old days you would do an audit, you would come in, there'd be a couple of Linux boxes, there'd be a Windows box, you'd check a few things, it'd be a little network, great. But now you have machines across the world, extremely complex networks, interaction between policies, networks, crypto, et cetera.
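The point made earlier in this answer, that replaying a proof is cheap even when finding it was hard, can be sketched with a tiny checker for resolution refutations in propositional logic. This is an illustrative assumption about what a "simple third-party tool" might look like, not the actual tool or proof format used in AWS audits, which the interview does not name. The step format `(i, j, pivot)` and both function names are invented for this sketch.

```python
def resolve(c1, c2, pivot):
    """Resolvent of c1 (which contains pivot) and c2 (which contains -pivot)."""
    if pivot not in c1 or -pivot not in c2:
        raise ValueError("pivot does not occur with opposite signs")
    return (set(c1) - {pivot}) | (set(c2) - {-pivot})

def check_refutation(clauses, steps):
    """Replay a resolution proof that `clauses` are jointly unsatisfiable.

    clauses: the input clauses, each an iterable of nonzero ints
             (3 means x3, -3 means not x3).
    steps:   list of (i, j, pivot); i and j index the input clauses
             followed by the clauses derived by earlier steps.
    Returns True only if every step is a valid resolution and the
    final derived clause is empty (a contradiction).
    The checker makes one pass over the steps and never searches.
    """
    derived = [frozenset(c) for c in clauses]
    for i, j, pivot in steps:
        if not (0 <= i < len(derived) and 0 <= j < len(derived)):
            return False
        try:
            derived.append(frozenset(resolve(derived[i], derived[j], pivot)))
        except ValueError:
            return False
    return bool(steps) and len(derived[-1]) == 0
```

For the clauses `[[1, 2], [-1], [-2]]`, the two steps `[(0, 1, 1), (3, 2, 2)]` first derive the clause `{2}` and then the empty clause, so the checker accepts; a proof with a wrong pivot or a missing final step is rejected.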
And so there's no way a human, or even a team of humans, could come in and have any reasonable chance of actually deeply understanding the system. So they just sort of check some stuff and then they call it success, and these tools really allow you to actually understand the entire system. Well, Byron, you guys are doing some cutting-edge work. For folks watching who want to know how math translates into the real world, all you high school kids out there, parents, this is an example of how the stuff you learn in school can actually be applied. So great work. I think this is cutting edge. I think math and the confidence of math intersects with groups; the compliance example, the audit example, shows that worlds are going to come together with math. I think this is a big megatrend. It's not going to eliminate the human element, it's going to augment it, so great stuff. Final question, just randomly while you're here, since you're a math guru, and we're always covering our favorite topic, blockchain. We believe that a security conference is going to soon have a blockchain component because of the immutability of it; there's a lot of math behind it. So as that starts to mature, certainly Facebook entering in with their own currency, a whole other conversation, which we don't want to have here, is bringing a lot of attention. So we see the intersection of security being a supply chain problem in the future. Your thoughts on that, just generally? So the problem of proving programs correct is undecidable, and that means that you can't build a general solution. What you're going to have to do is look for niche areas like device drivers, networks, policies, API usage on crypto, et cetera, and then make the tools work for that area, and you will have to be comfortable with the idea that occasionally the tools aren't going to be able to find an answer.
And so the Amazon culture of being customer obsessed and working as closely as possible with the customer has been really helpful to my community of logic and formal methods practitioners, because they're really forced to work with the customer and understand the problem. So what I've been doing is listening to the customer, finding out what their problems and concerns are, and then focusing my attention on that. And I haven't yet heard of customers asking me for mathematical proof on cryptocurrency, blockchain sorts of stuff, but I await further instructions. You're intrigued? Yeah, I'm not sure. So I always like mathematics. But where we have been hearing customers ask for help is, for example, we're working on FreeRTOS, so IoT applications, understanding the networks that are connecting up IoT to the cloud, and understanding the correctness of machine learning. So, I've done some machine learning, I've constructed a model: how do I know what it does, and is it compliant? Does it respect HIPAA, FedRAMP, PCI, et cetera? And some other issues like that. There's a lot of talk in the industry about quantum computing creating nightmares for guys like you. How much thought have you given that? Do you have anything you can share with us? Yeah, so there's work in the AWS crypto team preparing for the post-quantum world. So imagine an adversary has a quantum computer, and so there are proposals. AWS has a number of proposals, and those proposals have been implemented, so they're standards. And our team has been doing proofs of the correctness of those. So actually, in one of my talks, I think the talk not with Chad and Tom, I show a demo of our work to prove the correctness of some post-quantum code. Cool. Byron, thank you for coming on and sharing the insight. Congratulations on the automated reasoning. Good to see it put into practice, and we appreciate the commentary. Thank you very much.
theCUBE here for the inaugural cloud security event, re:Inforce, that AWS is putting on. This is theCUBE's coverage. I'm John Furrier with Dave Vellante. Thanks for watching.