Hey everybody, my name is Mike Waite, and this is the OpenShift Commons Briefing. As I always like to say, I'm going to share my screen and introduce you to Hazelcast. This is the technology overview from Hazelcast today, and we're fortunate to have with us Scott McMahon, the Technical Director for the Americas. Scott, how are you today?

I'm good, Mike. Thanks for having me.

I'm doing pretty well. Excellent. So you're at Hazelcast. Tell me about you; let's find out who you are. You're the Technical Director for the Americas. What does that mean? What do you do at the company? Why are you so interesting? Why have I read so much terrific material about you? We need to know.

Now that's a setup. So I run our field engineering team in North America, and I'm the technical liaison who works with our partners, such as IBM and Red Hat and others, to make sure our technology is implemented correctly. We work with our customers and users to get the most value from the software they buy from us. A little bit about me: I started my career as a software engineer in Silicon Valley a long, long time ago, but I've always been involved in analytics, and I always saw the possibilities in the computer systems we were building, going way back to doing websites. I followed my career through ESBs and message-based middleware, through BPM and the analytics space. Now I've come full circle: we're back to doing what I think of as business processes, but at real-time speed, using all the in-memory technology, and also applying analytics to understand what's actually happening out there.
So that's kind of my full circle.

When you said way back, what does that mean? I'm 24 years old; when I think way back, I think Teletubbies and Blue's Clues. You're probably not 24 like me, of course, but are we talking way back? Are we talking VAX/VMS? How far back is way back for you?

A little bit like back when the internet was something for academics and nobody else had heard of it. We were all using Lynx, and this Tim Berners-Lee came up with this thing he called the World Wide Web. I followed it from newsgroups on UNIX systems all the way up to where we are now, doing e-commerce and shopping carts online. So from the beginning, really.

That's interesting that you brought that up. I remember sitting in a college class at some point, and the instructor said, okay, here's your assignment tonight. Then he wrote something on the blackboard in chalk, like eight, seven, nine, three, two, zero, something, something, and said, for any of you who know what that is, you can send your homework assignments there. I looked at it and thought, I have no idea what that is, so I'll just do it on paper and hand it to you tomorrow. That was when email was super rare, too. By the way, I'm not 24; I lied.

I remember hearing people say, oh, you're going to be able to surf the internet while you drive down the road with a device in your hand, and I thought, that's nuts, there's no way that's going to work. We really did a lot of good work. We enabled a lot of high-speed systems; we're all about performance, a lot of financial work. And that was fine. We went along with that for quite some time.
Starting about five years ago, maybe in that range, we, along with everybody else, really started seeing a shift in how the data our systems were ingesting arrived: how we get it, how we process it, and what we do with it. You couldn't just load it into storage and come back to do analytics later; you had to react in real time. When you start talking about mobile devices, you're talking about massively parallel systems, and you have to treat those as individual streams, while at the same time you want that macro layer over the top to see what's actually going on in your environment. So we started looking at stream processing. Going way back, we tried to do this 10 or 15 years ago with ESBs and BPM solutions, business process management and monitoring, whatever they wanted to call it, but the systems just couldn't keep up. With in-memory technology, we realized we actually did have a system that could do this: we can do a million events a second, and we can look at all these individual events and process them. So we started thinking, we have the perfect in-memory storage base, but we need to build an engine on top of it that can handle the kind of workload we're seeing. That's where we started building the streaming engine, which leverages the scalability of the underlying in-memory technology but can handle these new workloads. When you talk about sensors, industry, mobile devices, they're always on, always connected, always sending things like GPS, and you've got to deal with that. That's what we set about doing.

Yeah. So, 1,500 people, 800 people, 10,000 people: how many people are Hazelcast?
About 120 at this point, give or take.

No kidding. We're still pretty small.

No kidding. That's clear.

Yeah, just a couple of years ago it was 80. We're growing as our user base starts to experience these problems they need to solve: how am I going to handle this? We're growing fast, and it's really fun, because we get to work with some of the top companies in the world and enable what they're doing. I can't mention all the names, since I don't know who I can actually refer to, but most of the top companies are using us for something or other, and it's fun to be involved in all that.

When I started at Red Hat in 2002, there were 260 people in the company worldwide. Worldwide. I remember what it was like working for a small company like that: I was a solutions architect supporting salespeople, flying to Florida for a one-hour meeting, then flying to California, sleeping, and taking a meeting in the morning. I've said this before on the show, but I remember what it was like trying to sell back then, because we spent more time trying to convince people about open source. We'd go into a big bank and they'd say, well, you've got to explain this open source thing to me, and how come you guys don't have blue hair and skateboards, buzzing around the office? It was nearly impossible to sell our product, which at the time was Red Hat Linux 9. This was before we came out with our enterprise product, and we just spent our time defending open source: why should I pay you for something that's free?
I remember being at, not Goldman, Morgan Stanley, up on the 99th floor or whatever in New York City. We were in a boardroom, everyone sitting around the table with bottles of Poland Spring water, and Bermbaum says to me, okay, great, you've presented your technology; why should I pay for something that's free? I looked at him and said, what have you got in your hand? Poland Spring bottled water. Why didn't you just grab a couple of five-gallon buckets, go downstairs to the Hudson River, fill them up, bring them back up here, and serve that to people? That became our first multi-million-dollar deal, because they got it: yes, your technology is open source, but it's just like water. It needs to be refined, packaged, distributed, and marketed, and frankly, if someone gets sick drinking your water, you know you'll be taken care of. So that's my long diatribe about what it was like way back when.

What kind of challenges do your people see out there, given that you're a fairly young company? We're going to get into the whole AI/ML question, is AI real or is it a fad, what does it really mean, but what are you hearing today from your reps when they talk with customers about their challenges, and how do you folks respond?

Funny you say that about open source. I was using Red Hat back then. The first one I remember really getting into was 5, but yeah, it's been a while.

Hey, it's my show, I can talk about whatever I want. Red Hat Linux 5 was when I was working at DEC, Digital Equipment Corporation, and Red Hat sent some people.

What's that, Tru64, right?

Yeah, Tru64.
Red Hat sent a couple of engineers up to our office here in Massachusetts to make Red Hat Linux 5 work on the Alpha platform. So they ported Red Hat Linux 5 to run on Alpha, and it did, and SUSE did the same thing, and Debian did as well. But Red Hat Linux 6 was the big one. That was the one that really started to become mainstream, in boxes in stores like Best Buy. Anyway, I didn't mean to steal the microphone from you, but I remember that very fondly.

It's funny you say that, because I have a question for you. How do you say SUSE? Is it "SOO-suh" or "SOO-zuh"? My European colleagues say "SOO-zuh."

No, it's not, and I know exactly why. About a month after Red Hat came to get Red Hat Linux 5 running on Alpha, the folks from SUSE came over from Nuremberg, and among them was Jurgen Geck. You can look him up; he was the CTO of SUSE, and he was a rock star. I was there in a lab with machines everywhere and racks of storage, and Jurgen walks in with four or five engineers, and there was this electricity in the room, like John Travolta just showed up. I spent four or five days with them while they were porting it to Alpha, and it's absolutely pronounced "SOO-suh." That's the way I've always said it, and this is from the CTO of the company. I don't know where Jurgen Geck is these days, but he was the original CTO of SUSE, and I took him out to dinner, we had a beer or two, and I can tell you emphatically that it's pronounced "SOO-suh."

There you go; I have the definitive answer, straight from the horse's mouth. Now, going back to Hazelcast and what you were asking me: it's funny, because the in-memory data grid part was all about performance, and that was, in my mind, a tactical solution.
You were dealing with people who were already building their applications, their microservices, whatever they were doing, and they just needed to make them faster and better, but they already had the thing, so they understood the value. When we start talking about the newer stream processing and event processing we're doing, that's more of a strategic conversation, where you have to lay it out and make people understand the benefits: this is the business value, this is what you're going to get, new ways to do things you don't know about yet, advantages, or even just handling problems you don't yet know you have. So the conversation moves from the tactical, how do I make this thing faster, to the strategic: what is the value, and what will I be doing in the future as these things grow? And we're only going to see it grow. As 5G rolls out, what are they saying, 40 gigabytes per second? It's orders of magnitude more, and that gives developers of apps and sensors and all these different things more bandwidth to work with. Ultimately all of that comes back to somewhere that has to handle it, process it, and do something with it. So that's the conversation I tend to have more and more.

Yeah. So let's talk about Hazelcast. When we had our intro call with you and your team, was it last week or maybe the week before, I said, don't take this the wrong way, no offense, but I had no idea what Hazelcast is. My team is responsible for working with third-party software vendors to help them get their products tested and certified on the Red Hat portfolio, whether it's Linux or Ansible or OpenStack or our storage
platforms or OpenShift or whatever it is. But what is Hazelcast? Is the company name also the product name? I see a lot of companies these days where the company name is the product, unlike Red Hat, where we're a company and then we have products, like Red Hat Enterprise Linux or OpenShift. How do you folks deal with that? Are Hazelcast the product and Hazelcast the company the same thing?

When we were the data grid and the caching layer, when you said Hazelcast you could be talking about either the company or the product; it was just Hazelcast. When we started building the streaming engine on top of it, we originally saw these as two different things. So even though it's the same application, and the stream processing engine leverages all of the underlying in-memory data grid technology, we divided it: we called the streaming side Hazelcast Jet, and then, in a very creative fashion, we renamed the in-memory data grid to IMDG. So we had Hazelcast Jet and Hazelcast IMDG, very different capabilities but the same application, and it was a little confusing and a little inelegant, I guess, is the best way to put it. So we're actually pretty excited now: with our version 5, a major release coming in the fall, we're pushing it all back together into one application, which technically it always was, and going back to just saying Hazelcast. It will have the stream processing capabilities and the in-memory storage: what we think of as data at rest is the storage, and data in motion is the stream processing, and they'll all just be capabilities within Hazelcast. So we'll go back to just being able to say Hazelcast, and that makes it a lot easier, and it doesn't make people choose. When we divided it and we had the stream processing engine
Jet, or the in-memory data grid, it made people decide which one to go with, and I think it clouded the vision of it. Now it's just capabilities, same platform, one application. We're a Java application, so it's a single JAR with no dependencies. You just run it: if you start ten of them, they'll all join together and create one big cluster; if you start one, it can even run on a Raspberry Pi. It's about 15 megabytes.

So let me ask you this. If I described Red Hat's products, I'd say Red Hat makes Linux; Linux is an operating system that makes hardware work so people can do computing. OpenShift is a container platform. Red Hat storage is a storage subsystem. Red Hat's KVM is a virtualization technology. What do your products do, without the super deep detail? We're going to get there, but give me the one-sentence version of what it is, because you can't find that on most companies' websites; they dive right into the who, the what, the when, the where, the why.

Hazelcast is an all-in-memory platform for high-performance processing of data. In a nutshell, we're all about high performance, we're all in memory, we process events, we store data. We are an all-in-memory platform for extremely high-performance data processing, if you want to put it that way.

Good. Well, great, thanks. No more questions, your honor. I'm writing this down. How does all-in-memory work, though? I get it if you have one computer, a 3U server with a couple of sockets, and the memory is there, local. How do you do in-memory in multi-cloud? That sounds like magic to me.

So from the beginning, that was an original requirement of the design: it had to
be scalable. You start up one instance of the application and it runs on one Hazelcast node. There's a discovery mechanism, and it's flexible; you just configure it, whether you're on cloud or on premise or in containers and Kubernetes. When you start another one, they discover each other and pool their resources, both RAM and CPU. The memory is handled in a partitioning scheme, so as you add more instances of the application, more nodes from a cluster point of view, you grow your available storage and you grow your distributed processing. Over the years we've built the engine that handles all the hard parts of distributed processing: how do I run in parallel, how do I coordinate, how do I do my locking and synchronization? As a developer, you don't have to worry about it; you write it as if it's a single application, and the system handles the complexity. The idea from the beginning was that it had to scale: adding more resources gives you more storage and more CPU, and there's theoretically no limit. Ultimately you'll run into a network bottleneck or saturation issues, but we've got hundreds of nodes running with terabytes of data, massively distributed. When you write an executor or a processing service, the system runs it in parallel, understands how the data is distributed, and then, similar to MapReduce, though we're not really MapReduce, it runs various things in parallel, shrinks back, and ultimately gives you your answer. The stream processing engine does the same thing: you define the process of what you want to happen, how you want to handle these events, but the system takes it
and distributes it across all the available resources, because we want the extreme high performance, as much as we can get. We'll use all the physical resources to be as low-latency and high-throughput as possible, and the engine we've built handles the complexity of that for you.

By the way, we are live on YouTube and Twitch and Facebook and a bunch of other places right now, so if anyone has questions and wants to play what I like to call Stump Scott Wednesday, drop them into the chat down below, and our bots will magically make them show up here in our little bridge.

I like that.

But how do you maintain state? Before Red Hat, I worked for this company called Mission Critical Linux; no one's ever heard of it, and we ended up getting bought by Red Hat. We were building something called Convolo Data Guard, which was high-availability clustering for Linux. We required shared storage, and on the shared storage there was a quorum partition where the different nodes would vote about who has state, so you had to have at least three nodes to avoid a split brain. That's probably super old technology; anyone interested could probably Google Convolo Data Guard, though you probably won't find Mission Critical Linux anymore, because we became Red Hat. How do you do that in multi-cloud? Where is state maintained if you've got nodes all over the place and you're doing shared memory? How is it done, and isn't that a huge performance bottleneck?

That is a very interesting question, and it's something we definitely have to deal with.
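The partitioned-memory scheme Scott describes can be sketched, very roughly, in plain Java. This is a toy illustration, not Hazelcast's actual code; the member names and the `assign` strategy are made up for the example, though 271 is Hazelcast's real default partition count, and the key idea, that every member and smart client holds the same partition table so any key is one hop away, is the one from the conversation.

```java
import java.util.*;

// Toy sketch of a fixed-partition data grid, loosely inspired by the
// scheme described above. Hazelcast's real engine is far more involved
// (it hashes the serialized key, rebalances with backups, etc.).
public class PartitionSketch {
    static final int PARTITION_COUNT = 271; // Hazelcast's default

    // Every member (and every smart client) can compute this locally,
    // so a key lookup is a single hop to the owning member.
    static int partitionFor(Object key) {
        return Math.floorMod(key.hashCode(), PARTITION_COUNT);
    }

    // Naive round-robin assignment of partitions to members: adding a
    // member grows both storage and compute, because partitions move.
    static Map<Integer, String> assign(List<String> members) {
        Map<Integer, String> table = new HashMap<>();
        for (int p = 0; p < PARTITION_COUNT; p++) {
            table.put(p, members.get(p % members.size()));
        }
        return table;
    }

    public static void main(String[] args) {
        List<String> members = List.of("node-1", "node-2", "node-3");
        Map<Integer, String> table = assign(members);
        String owner = table.get(partitionFor("order-42"));
        System.out.println("key 'order-42' lives on " + owner);
    }
}
```

Note how the partition count stays fixed while the member list changes; that is what keeps access latency flat as the cluster grows, since the routing decision never depends on how many nodes there are.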
If you've heard of the CAP theorem, that's where you get into the trade-off between availability and consistency. So yes, we have to deal with that. We have simple quorums, depending on the SLA and the use case, and for most common use cases that's sufficient, because it prioritizes availability over consistency. Now, if you get into use cases where you require an ironclad level of consistency, we've actually implemented the Raft protocol, to get into those kinds of details, so we can have shared structures that guarantee consistency. So it is an issue, but we give you choices, and you can pick whichever fits your use case; consistency does come with a performance impact. I will say, too, the interesting part, and for us a strong suit, is that we were originally a storage layer, so we had already done all of the engineering for distributed storage and locking: we have fenced locks and different types of CP locks and so on. That put us, with our stream processing especially, in a pretty unique position, because we can store state. One of the things we're beginning to roll out now, which we've been working on for quite some time, is a whole new way of handling the contractual concepts for microservices when you link them together. If you've heard of the newer concept, instead of doing transactions we're actually doing sagas; that's the new way to do it, where you wrap things together and have automatic compensation for failures instead of rollback. What that's going to give us is a way to handle the microservices type of architecture, where you can put things together and wrap them in a
saga, and then if something goes wrong, the whole thing compensates for itself, but it gives you a sort of transactional nature across multiple, what is the word when you put a bunch of things together, I'm forgetting the word, and the ability to handle that. So it's a stateful-microservices type of architecture, and that's possible because from the beginning we've had that storage layer. A pure stream processing engine probably doesn't have the storage layer, and a pure storage layer probably doesn't have the stream processing engine, so we bring both of those together and you end up with the best of both worlds. And I'll hit it again: it's all in memory, high performance. Along the same lines, we were doing some scalability tests a few months ago, and we have a blog post out. We worked with some of the academic institutions that have a standard benchmark for throughput, a certain set of queries and transactions you have to handle, and we wanted to verify linear scalability, so that as you add more resources you see more performance and you don't hit that hockey-stick thing. We got to one billion events a second, and no one's ever done that before, that I know of.

That's pretty good. One million transactions a second, Scott?

No, billion. B, with a B. And we only stopped there; that was 45 nodes, running, I believe, on AWS or one of the clouds, I don't remember which. There was no limit; we could have gone further, but it seemed like a good round number, so we stopped and said, all right, we did it.

What does it do, though? What do you get? You're doing a billion a second of what? What does that mean? If I'm a consumer, how
does that benefit me, a customer? A billion what? A billion requests, a billion messages? What is it?

I was thinking the same thing when we called that out; I thought, well, nobody has a billion new events a second. We do a lot of credit card transaction processing and fraud detection, and most credit card transactions touch Hazelcast in some way. I don't know who I can name or not, but I know I can say Capital One; they use us for their fraud detection, and they might do 30,000 a second, something like that. So for those use cases I thought, it's just a fun, academic thing, really. But now that it's out, there are some streaming video sites that you go to to watch movies, and they really do have billions of events a second: requests, replies, searches for movies, the command to start the stream, that sort of thing. They're worldwide, and they are using us now, testing for use cases we've never seen. What it gets you as a consumer, whether it's a mobile app or a web browser or an application, is performance at what I think of as human scale: you click on something, you want an answer; you send a query, you don't want to wait. It shouldn't take 10 seconds and then finally come back while you're wondering, did I click it right, maybe I need to click it again, and then you've opened two things. You want real-time interaction, the ability to process these events in the timeframe that matters. As we say, events lose value with time, and when certain things happen, you have to react immediately at that point.
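The saga idea Scott described earlier, steps wrapped with compensations instead of a distributed rollback, can be sketched in a few lines. This is an illustrative toy, not Hazelcast's API; the step names and the `run` helper are invented for the example.

```java
import java.util.*;

// Minimal saga sketch: each step pairs an action with a compensation.
// If a later step fails, the steps that already completed are
// compensated in reverse order, rather than relying on a rollback.
public class SagaSketch {
    record Step(String name, Runnable action, Runnable compensation) {}

    // Runs steps in order; on failure, unwinds what already succeeded.
    // Returns true if the whole saga committed.
    static boolean run(List<Step> steps, List<String> log) {
        Deque<Step> done = new ArrayDeque<>();
        for (Step s : steps) {
            try {
                s.action().run();
                log.add("did " + s.name());
                done.push(s);
            } catch (RuntimeException e) {
                for (Step d : done) {            // stack gives reverse order
                    d.compensation().run();
                    log.add("undid " + d.name());
                }
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        boolean ok = run(List.of(
            new Step("reserve-inventory", () -> {}, () -> {}),
            new Step("charge-card",
                     () -> { throw new RuntimeException("declined"); },
                     () -> {})
        ), log);
        System.out.println(ok ? "saga committed" : "saga compensated: " + log);
    }
}
```

The point of the pattern is that each service only has to know how to undo its own work; no coordinator ever holds a distributed lock across all of them, which is what makes it fit the stateful-microservices picture from the conversation.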
And if you don't, you might as well not even bother. So from a user standpoint, it gives you the ability to handle the workloads we're seeing, massively parallel individual streams, and the ability to react to them, while also keeping the big picture: what's going on in my world, am I performing right, am I meeting the commitments I said I would? So it cuts across the board, whether you mean user satisfaction or the ability to react in a timeframe where I can make money, take advantage of the situation, or provide a service I wasn't able to provide before.

Hey, I didn't ask you: where is Hazelcast? Is it a California thing? Are you Israeli? Are you from Bogotá, Colombia? Where is the company headquartered?

We truly are a global company. The original founders were actually in Turkey, but they came over to the U.S. and incorporated in 2010 in Silicon Valley, so we were Palo Alto, now San Mateo. We moved from Palo Alto to San Mateo because we outgrew our office. But we're global: we still have a large engineering team in Turkey, some in London, a lot in the U.S.; we're everywhere. I'm located in Portland, Oregon, actually, so we're truly a global, virtual team, but our headquarters is in California.

So in Portland, Oregon, you're getting the same heatwave I'm getting here in Massachusetts.

It actually cooled off today. Yesterday was 115, and the day before that was 112. It's never been seen before; it's almost something out of a movie. But today it's only mid-80s; it's cooled off.

It's 103 degrees here, excuse me, 106, just north of Boston, and I went to turn on my air conditioner for the
first time in two years, because usually I don't need it, and it was out of freon; there had been a leak in the line. Thankfully they were able to come and fix it.

Well, I had the opposite experience there. I got recruited up there in Cambridge, and they brought me up in the spring when it was beautiful: oh look, the Charles River, they're out rowing, people are running, beautiful sunshine. Then the winter hit, and I didn't know that whole river froze over. My god, I was freezing cold.

Yeah, it freezes over now because it's actually getting pretty clean; it used to never freeze because of all the junk that was in it. Hey, you were talking about containers, and you used the word microservices. If you talk to any company that makes a service mesh tool or offering, it's microservices, microservices, microservices: this is where it's all going, we've got to have a service mesh to manage all these things, because the containers are getting so tiny and you've got 50,000 or 90,000 of them. But when I talk to some people, they're actually saying the containers are getting bigger. How do the number of containers and the size of the containers affect Hazelcast's ability to do distributed, stateful memory across nodes? I'm still on Stump Scott Wednesday.

That's actually a very common use case for us, and the reason is that, especially if you're a Java technology, since we're Java, you can embed Hazelcast directly in your service, and if not, you can use a client. We say we're polyglot: we have clients in all the major languages that talk to the cluster. Either way, what we're typically used for is providing a very easy, common data storage and access layer, and all of that is shared
among the microservices. If one microservice inserts or modifies data, it's immediately available to any other microservice in a different container that might call in and fetch that data. The way we do that is the memory partitioning scheme. We say our nodes and our clients are smart, and what that really means is that each one holds the entire memory-mapping partition table, so it knows exactly where every piece of data is when it goes to get it by key. That's important because as you scale to 100, 200, a thousand instances of these services, the data access latency stays flat: it's one network hop. You don't have to go from one node to another to find something; everyone knows exactly where the data is. So when a request comes in, whether it's 3 nodes or 300, it's still one network hop across the switch to fetch the data back. You can theoretically grow as big as you want and still have that consistent access time. Now, as far as containers getting bigger, yes, I agree, but that's just people not building their containers as lean as they should. I've seen some get to be gigabytes in size.

Not terabytes, right?

Right, not terabytes, just big, because everybody wants to get everything into a container, and it's too big; that's not what they're made for. We can do our part and give fast access, but a lot of that container bloat comes down to the discipline of your development process. What we do give you is an easier way to handle the whole back-end data plane, so you don't have to do it yourself, because that's the hard part: consistency, how do I access the data, how do I guarantee the changes are reflected
immediately all those types of things with with uh you know parallel and concurrent systems that's what we built so we built that engine to handle all that so all you really have to do is say get put and and that's it you don't have to worry about the details so what about what about public cloud there's a lot of companies out there i forget everyone's doing studies on resource allocation in public cloud and you know a lot of the a lot of the reports that i've seen from companies i don't know if i can name their names or that basically when when customers are running workloads in public cloud they're over allocating resources like in a big way to to like like does does hazelcast help with that in any way to allow customers to not have to over allocate resources in a public cloud or do you guys just exacerbate the problem well i hope we don't exacerbate the problem i think i think we do we would help because we offer that shared data plane and you can be much more concise in the data that you store and you don't have to have copies and make distributed types of things now on the other hand we do enable easily you know add more containers because we make it very easy to do so again i don't think we either contribute or exacerbate it but we do give an easy shared way to to store and access data in a high-performance way again i think you just go back to kind of your engineering discipline in that you need to have a good plan and make sure you control your resources but i think hazelcast would give you actually now that i think about it it would probably give you help you with using less resources because you wouldn't need to add a storage layer you would have hazelcast directly in your applications it's shared it can grow and shrink independently of what your applications are depending on your storage needs and your processing needs but it could be more concise and tighter on the storage but again you know you can have the best tools it really comes down to to 
adhering to your discipline and engineering best practices. But I think we could help.

Okay, well, that was as close as I can get to Stump Scott Wednesday. When we were talking before we went live — and by the way, we are live, hello everyone — you said something about the terms AI and ML. "AI and ML, AI and ML — oh, we do AI and ML." This has been the buzzword bingo for at least three years, and you said something before the camera went live: that AI is like science fiction. That's a pretty provocative thing to say, that AI is science fiction, when so many companies out there are all saying they do AI. How do you defend that statement?

True — and maybe that's not the right way to say it. It's not science fiction; it's actually really there. The question is how you put it into production, how you actually use it. When you start talking about AI and machine learning, these algorithms go all the way back to the '60s. In the labs at the colleges they were developing these algorithms and methods for intelligent decision making, but it was stuck in the lab; you couldn't get it out of there. In the last 10 or 15 years, companies have come out with tools that mean you can build these machine learning models without needing a PhD in data science — they're automated. I don't know who I can mention, but H2O and DataRobot and all these guys are giving the tooling to the average person, just an IT person, to build these models. But then you still have the problem of how do I use these things, how do I put them into production and make them valuable? It's like, yeah, it's cool technology and all that.

So that's where we are focusing. If I can go a little into the technical details: we built our streaming engine on the concept of a DAG, a directed acyclic graph — that's what Spark uses, so it's a proven way to control and manage the process you're building — but we want to enable the injection and usage of these models. We're not trying to be a data science tool; we're not trying to be a training tool or anything like that. We just say: you go build your models in whatever you want, whether it's Python, C++, or Java — those are the three main ones — you export that model, and we will ingest it, call it, and run it.

That's where I was coming from with the science fiction idea, and you're right, it is provocative. Everybody talks about AI; they just want it on their website, like blockchain — let's get it on the website so people think we do robotics or something. But how do you actually put it into production? It's there, but how do you use it? That's where we really saw ourselves: being able to implement this and run these things in production. Like fraud detection on credit card transactions — we're doing that. Like I've said, most credit card transactions touch Hazelcast in some way. A transaction comes in — the user swipes the card — and generally speaking you've got a one-second SLA from card swipe to either approval or decline. That covers the whole path: through the network, through the processor, all the way to the issuing bank, the bank approves the money, and the response comes back. Within that, your fraud detection generally has about 50 milliseconds to pull it off. What we've seen is that with our models — and not just data science models, you can use rules engines and some other things — we can apply more of them and get more accurate scoring results in less time, using the stream processing engine to enable these things in real time. We've seen tremendous benefits. I think I can use the Capital One example: they save something like a hundred million dollars a year, not only by detecting and declining fraudulent transactions, but also by avoiding false positives — because if you score something as fraud when it isn't and you decline it, the user just pulls out another card and uses that, and you've lost that revenue. So you've got to have a better way to detect fraud and also not make mistakes in declining. When I said it was science fiction, what I really meant was that everybody says they can do it, but nobody was really doing it yet — until now. And I think we're doing it now.

Okay. Now, you guys have been working with us — the reason you're on the show today is not because we think you're nice and we like the flower arrangement on your bureau in the background. Our engineers have been working with the Hazelcast team for quite some time to test and certify your solution on Red Hat OpenShift and other products. Let's be realistic: OpenShift is certainly one of the future technologies for us. You've built an operator — which is sort of like an intelligent agent for managing and orchestrating your technology along with OpenShift — and you folks are in the Red Hat Marketplace. People can go to the Marketplace and buy, and you can download your operator from our registry. I just wanted to throw a plug in there that this is all about improving the day-two supportability for customers when they want to use your products along with ours.
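The DAG model Scott describes — vertices that transform events, edges that define the data flow, run as a stream — can be sketched in a few lines. This is a toy illustration under assumed names (`Dag`, `vertex`, `edge`), not the Hazelcast Jet or Spark API, and the fraud "scoring" here is a made-up rule, not a real model:

```python
from collections import deque

# Toy directed-acyclic-graph pipeline: each vertex is a named transform,
# each edge says which vertex feeds which. NOT the Hazelcast API.
class Dag:
    def __init__(self):
        self.vertices = {}   # name -> transform function
        self.edges = {}      # name -> list of downstream vertex names

    def vertex(self, name, fn):
        self.vertices[name] = fn
        self.edges.setdefault(name, [])
        return self

    def edge(self, src, dst):
        self.edges[src].append(dst)
        return self

    def topo_order(self):
        # Kahn's algorithm: repeatedly take a vertex with no remaining
        # upstream edges. Fails loudly if the graph has a cycle.
        indegree = {v: 0 for v in self.vertices}
        for dsts in self.edges.values():
            for d in dsts:
                indegree[d] += 1
        queue = deque(v for v, deg in indegree.items() if deg == 0)
        order = []
        while queue:
            v = queue.popleft()
            order.append(v)
            for d in self.edges[v]:
                indegree[d] -= 1
                if indegree[d] == 0:
                    queue.append(d)
        if len(order) != len(self.vertices):
            raise ValueError("graph has a cycle; not a DAG")
        return order

    def run(self, event):
        # Push one event through the transforms in topological order.
        for name in self.topo_order():
            event = self.vertices[name](event)
        return event

# A miniature card-swipe scoring flow: enrich, score, decide.
dag = (Dag()
       .vertex("enrich", lambda e: {**e, "avg_spend": 120.0})
       .vertex("score",  lambda e: {**e, "fraud_score": e["amount"] / (e["avg_spend"] * 10)})
       .vertex("decide", lambda e: {**e, "approved": e["fraud_score"] < 1.0})
       .edge("enrich", "score")
       .edge("score", "decide"))

result = dag.run({"card": "4111...1111", "amount": 250.0})
print(result["approved"])  # a $250 swipe against a $120 average spend is approved
```

In a real streaming engine the vertices run distributed across the cluster and events flow continuously; the acyclic property is what lets the engine schedule and parallelize each stage safely.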
That was just a gratuitous plug for us. We have eleven minutes left, and I know people are probably wondering: what does the architecture look like, Scott? How does this work? You guys haven't been delivering death by PowerPoint, which frankly is really refreshing, but I didn't know if you wanted to put up that architecture chart we were talking about earlier.

Sure, I can. It's not really necessary, but I will put it up. Let me share my screen here.

Don't you wish all the various video chat applications had the same way to share a screen, so it was uniform?

Well, maybe one day, but as it stands I like competition — some are better than others. We have these applications competing and adding features: Zoom added the virtual background, and now everybody has virtual backgrounds. It's been pretty fun.

So, to talk about the architecture a little: this is a conceptual view. In essence, we are a cluster — or a grid, as we say — and the reason that's important is that we engineered it from the beginning for the storage. The nodes just have to have a way to find each other. Say you have a three-node cluster running and you want to add a fourth: when that fourth one comes up, it just needs a way to find the others. That's what we call the discovery mechanism, and it's configurable — we can run with direct IPs, or we interact with all the orchestration engines, Kubernetes and OpenShift and everything you'd expect. Once the new node finds the cluster, it just opens a direct socket and connects; they share information, the topology is distributed, and that's it — that's how it joins. And as a client, or however you want to access the cluster, you do the same: the client opens a socket and has direct access. Since this is the way we engineered it from the beginning, it really fits all of the containerization ideas that came along later. We didn't build this thinking about Kubernetes or OpenShift, because they didn't exist yet, but it fits the way those scalable resources work, because that's what we wanted our platform to be.

What you see on the screen is what we euphemistically refer to as the north-south, east-west diagram. We have that distributed data storage layer — data at rest, if you will, except we're actually putting it into memory, so it's highly available, easily accessed, and high speed, but it's stored. On top of that, the streaming and batch processing in this diagram is the streaming engine I was talking about, and it runs distributed across all the nodes, using all the CPU, all the memory, whatever resources it needs for that distributed processing. Our traditional use case is top to bottom: you have microservices, which may be using different clients — or Hazelcast could even be embedded, like I said before, if it's a Java application — and they're generally accessing that stored data. Then there's a configurable connection to the persistent storage layer, which is usually (but doesn't have to be) a database. Data is loaded and made available, and if there are synchronous needs, changes are propagated to the database, or vice versa, back and forth. That's your top-to-bottom microservices flow.

Let me jump in — tell me about that distributed memory-first database part. Is that something you offer, or are you saying you have a requirement for interfacing with a distributed database?

No, that's us; that's Hazelcast. That distributed storage — the IMDG, as I referred to it, the in-memory data grid — is the base foundation of our application. Everything you see in that middle green box is Hazelcast; that's the Hazelcast application.

How would you characterize what type of database it is? You know, there's NoSQL, NewSQL, MySQL — is it that kind of a database?

I would say it's mostly NoSQL — and by the way, people think NoSQL means "no SQL," but it actually stands for "not only SQL." We are an object store at our core, and we always have been, because we were about performance. If you have a normalized schema, then every object access has to be reassembled; that's why schema-based databases are better for reporting and analytics, while object stores are better for performance — there are trade-offs between the two. Now, we have actually added a SQL engine on the storage layer, where you can use ANSI SQL to interact with the objects; we built in a parser and everything you need for that kind of access, but underneath it's still an object store. We've also built what we call streaming SQL: that graph you see there is basically a representation of the DAG I was talking about, the directed acyclic graph, and you can attach a SQL query at any point on it and get a continuous update of the events — a SQL-style output of events streaming through. But to go back to it: that entire middle box is the Hazelcast application. It brings that storage layer — what we used to call the in-memory data grid, now just a distributed memory database — plus the streaming engine on top, and it makes all of those capabilities available for your storage and your processing.
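The "not only SQL" idea Scott describes — an object store for fast key access, with a SQL layer projected over the same data — can be shown side by side. This sketch stands in for the concept only: a plain dict plays the object store and Python's built-in sqlite3 plays the SQL layer; it is not Hazelcast's distributed SQL engine:

```python
import sqlite3

# The "object store": values are whole objects, fetched in one get by key.
object_store = {
    "t1": {"card": "4111", "amount": 250.0, "approved": True},
    "t2": {"card": "5500", "amount": 9800.0, "approved": False},
    "t3": {"card": "4111", "amount": 40.0, "approved": True},
}

# Key-value access: no joins, no reassembly -- the performance path.
amount_t1 = object_store["t1"]["amount"]

# The "SQL layer": project the same objects into a relational view and
# query them with ordinary ANSI SQL (sqlite3 here as a stand-in engine).
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE txns (key TEXT, card TEXT, amount REAL, approved INTEGER)")
db.executemany(
    "INSERT INTO txns VALUES (?, ?, ?, ?)",
    [(k, o["card"], o["amount"], int(o["approved"])) for k, o in object_store.items()],
)

total_approved = db.execute(
    "SELECT SUM(amount) FROM txns WHERE approved = 1"
).fetchone()[0]
print(total_approved)  # 290.0 -- the two approved transactions, 250 + 40
```

The trade-off Scott names shows up even in this toy: the keyed get touches exactly one entry, while the SQL query scans and aggregates — convenient for reporting, but not the latency-critical path.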
Well, thank you. You can drop your slide now if you want. Your marketing folks have been sending me emails — "hey, got to make sure you talk about..." — and I'm going to read this, because I was trying to be the good person here: real-time stream processing, production use of ML, and changing data processing needs. Have we talked about that? We have a few minutes left, so we need to pick and choose what to cover to placate your greater marketing.

I think we have. What they're really referring to is that the landscape is changing — the landscape of how we process data, how we do things. You can't store something in persistent storage and then come back overnight and run your queries or analytics the way we used to. You have to do it in situ, as it's happening. If you don't have the performance to keep up in real time, then storing it and trying to process it later means you're never going to catch up. For us, the way things work has fundamentally shifted over the last four or five years, and it's just going to keep moving in that direction: you have to have systems that interact and work in a real-time manner. You can't store and wait — you just can't do that anymore. That's the fundamental shift in how things operate, and it's got to be distributed.

If you go way back, there was a woman named Grace Hopper — she was a computer scientist in the Navy, back in the '60s — and she had a quote that I really love: back in pioneer times, when you had to pull up stumps, the pioneers didn't keep trying to grow a bigger ox — they hooked more of them together and used a team of oxen to pull the stump out. You don't keep trying to make a bigger computer; you team them up. You have to distribute. That's where we see it going, and that's what we've been built to do. It's only going to get more complex and more parallel — you're going to have to take advantage of parallel, concurrent processing to reach the scales we're seeing now, with the internet and the new things happening with these devices. I always point out that the iPhone came out in 2007 — you think these things have been around forever, but it really hasn't been that long — and I can't even imagine what it'll look like five years from now. Maybe they'll just stick them right in our heads, I don't know.

I had a Treo — do you remember the Treo? It was a phone, but it had a full keyboard, kind of like the BlackBerrys did, I guess. Anyway, you're right, the phones certainly have changed. I just hope they don't keep getting bigger; I don't want to have to wear a hip holster to carry a phone. I work with hundreds of propeller-heads, and they walk around the building saying, "Oh my God, I can't believe you haven't rooted your phone." I'm like, dude, I don't want to root my phone; I want to be able to text and make phone calls. I don't care about rooting my phone.

Anyway, we are out of time. We'd love to have you back again — I don't know if you folks have been part of our podcast series, but you should. I think this is a really interesting topic, and if you wanted to come on to our podcast show sometime and talk at any level you want — bring some of your engineers on and talk about hashing files or whatever — you're more than welcome.

Sure, that sounds like fun. Always willing to talk about it.

Yeah, I gathered that. When we were doing the dry run I said, "All right, Scott, now listen: no one-word answers, we've got an hour." And you said, "Okay." That's two words. All right, great. Well, Scott McMahon from Hazelcast, thanks so much for joining us. If anyone wants to get in touch with Scott, how do they do that? What's your home phone?

My email is scott@hazelcast.com — you can reach me there.

Scott@hazelcast.com, and I'm waite@redhat.com. So thanks for coming — thanks for being our victim today.

It was enjoyable. Yeah, it was fun. Thanks for having me. Appreciate it.
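As a technical coda: earlier in the conversation Scott described "smart" clients, where every member and client holds the full partition map so any key is exactly one network hop away. That routing idea can be sketched with a toy in-process model — the class and names below are illustrative, not Hazelcast internals, though 271 is Hazelcast's documented default partition count:

```python
import zlib

PARTITION_COUNT = 271  # Hazelcast's default partition count

def partition_for(key: str) -> int:
    # Every client hashes the key the same way, so no central lookup
    # service is needed to find the owning partition.
    return zlib.crc32(key.encode()) % PARTITION_COUNT

class Cluster:
    """Toy cluster: partitions spread round-robin across members."""
    def __init__(self, members):
        self.members = list(members)
        # The partition table that every smart client holds a copy of.
        self.partition_table = {
            p: self.members[p % len(self.members)] for p in range(PARTITION_COUNT)
        }
        self.storage = {m: {} for m in self.members}

    def owner_of(self, key):
        return self.partition_table[partition_for(key)]

    def put(self, key, value):
        # "One network hop": go straight to the owning member.
        self.storage[self.owner_of(key)][key] = value

    def get(self, key):
        return self.storage[self.owner_of(key)].get(key)

cluster = Cluster(["node-a", "node-b", "node-c"])
cluster.put("user:42", {"name": "Ada"})
print(cluster.get("user:42"))
# Whether the cluster has 3 members or 300, a lookup is one local table
# read plus one hop to the owner -- access time stays flat as you scale.
```

This is the property Scott calls out in the interview: because the client computes the owner itself, latency doesn't grow with cluster size.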