Okay, hello. Apologies if I look a little bit jaded: I had to get up at 4 a.m. this morning to get my flight here. I'm here for a few hours, and then I fly back home. That's the nature of the role that I have. I have the world's best job: GridGain pays me to talk about open source. How cool is that? Okay, great. Thank you.

I'm an old guy, as you can see: gray hair, white hair, no hair. I've been around for a long time and worked in a variety of different environments. I started life as a developer and have pretty much done everything over the years.

As was said before, GridGain is the company behind Apache Ignite. Any Ignite users here? Any Igniters? Great, a few of you. Hopefully a few more will be interested by the end of this talk.

One of the things that we have today is this whole open source movement. GridGain donated Ignite to the Apache Software Foundation a few years ago. It's a full top-level project, very active in terms of the dev list and the user list, so I encourage you to have a look at it, download it, and try it. It started life as an in-memory data grid, and there are some commercial products out there in that space: you may be familiar with Hazelcast, and you may know Oracle Coherence, for example. Ignite is a key-value store as well. Again, you may be familiar with some other products in this space; Redis, for example, is a fabulous, super-fast product.

But building distributed systems imposes all sorts of issues and problems for us, and there are some architectural choices that we need to make, because building distributed systems is hard. Ignite can do a lot for you. It can take intelligent decisions: it can partition data and distribute it. But there are some intelligent decisions that you can make as well, and some useful features that you might want to use, for example co-location, which is one of the things we'll talk about.

Okay, so just three issues. Because of time pressure there are really only three main issues that we can discuss. The first two I'll talk about; the third one I'll leave for you. If you want to discuss that, feel free to drop me an email: it's just my first name dot last name at gridgain.com.

First, partitioning. We'll look at the pitfalls of even distribution, which is one of the things that you might think about doing, and which is not necessarily useful in all circumstances, although it could work very well in some situations.

Second, affinity co-location: this idea of co-locating data. It's an architectural decision that you have to make, because you can't beat physics. There are only so many ways that you can co-locate data, and if it's co-located in one particular way, you can't do it another way. So again it's about intelligent choices in terms of what you want to do and how you want to run your queries.

The last one is fairly technical, and what I'd encourage you to do is have a look at the webinar by my colleague Denis Magda. He's actually my boss. Denis is on the PMC for Ignite at the Apache Foundation, and he did a wonderful webinar recently where he drilled down into that in a lot of detail. So if you're really into low-level stuff, NIO and I/O, that kind of thing, have a look at that webinar.

Okay, so quickly then, let's move on. Ignite is a hash-partitioned map; that's really what it is. When we want to do an operation such as putting a key and its value, and we've got a cluster, what actually goes on behind the scenes in terms of how the data are distributed? We just apply some algorithm to the key, and generally that's all we need to do. The data are partitioned for us. It's taken care of for us.
We don't need to worry about that.

The other thing, though, is that we might think that balanced partitioning would be a good thing. We use some kind of naive function, and it distributes things evenly across our cluster. Here we've got a very simple two-node cluster, and we can see that we've got six partitions on one node and six partitions on the other. That seems fairly reasonable. However, if we add more nodes (in this case we've added a third node), then there are some issues in terms of data distribution: there's a bit of shuffling going on. This is not necessarily what we want, because what we try to do within a distributed environment is reduce the amount of data that gets shuffled around, reduce the amount of network traffic, bring the processing to the data, and limit the amount of noise, if you like, that's going on. So this is one naive way we might do this, but not a very good one.

Okay, so two approaches. Possibly you're very familiar with these, those of you who are very technical and have a deep understanding of distributed systems and hashing algorithms. There's consistent hashing, which is the fairly standard approach, and then rendezvous hashing. Thank you. That's the approach used by Ignite. In this approach what we're really doing is leaving it up to Ignite to manage that for us. When we added a third node in this particular scenario, you can see that there's actually far less being moved around. It's unbalanced, but that's okay. Eventually, as new partitions are created, nodes go down, and nodes are added, the thing will take care of itself. I mean, that's the key thing here.
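
The contrast between the naive placement and rendezvous hashing can be sketched in a few lines of Java. This is an illustration only, not Ignite's implementation: the class name, the `weight` mixing function, and the node names A, B, and C are assumptions made up for the example. It uses 12 partitions, as in the two-node picture with six partitions per node.

```java
import java.util.List;

/** Illustrative sketch only: naive modulo placement versus rendezvous hashing. */
final class PartitionPlacementSketch {

    /** Mixes a partition id and a node id into a pseudo-random score (stand-in hash, not Ignite's). */
    static long weight(int partition, String node) {
        long h = partition * 0x9E3779B97F4A7C15L ^ node.hashCode();
        h ^= h >>> 33; h *= 0xff51afd7ed558ccdL;
        h ^= h >>> 33; h *= 0xc4ceb9fe1a85ec53L;
        h ^= h >>> 33;
        return h;
    }

    /** Naive placement: partition number modulo the cluster size. */
    static String moduloOwner(int partition, List<String> nodes) {
        return nodes.get(partition % nodes.size());
    }

    /** Rendezvous placement: score every node for the partition; the highest score wins. */
    static String rendezvousOwner(int partition, List<String> nodes) {
        String best = null;
        long bestWeight = Long.MIN_VALUE;
        for (String node : nodes) {
            long w = weight(partition, node);
            if (best == null || w > bestWeight) {
                best = node;
                bestWeight = w;
            }
        }
        return best;
    }

    /** Counts partitions whose owner changes when the cluster changes from 'before' to 'after'. */
    static int moved(int partitions, List<String> before, List<String> after, boolean rendezvous) {
        int count = 0;
        for (int p = 0; p < partitions; p++) {
            String oldOwner = rendezvous ? rendezvousOwner(p, before) : moduloOwner(p, before);
            String newOwner = rendezvous ? rendezvousOwner(p, after) : moduloOwner(p, after);
            if (!oldOwner.equals(newOwner)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        List<String> two = List.of("A", "B");
        List<String> three = List.of("A", "B", "C");
        System.out.println("modulo moved:     " + moved(12, two, three, false));
        System.out.println("rendezvous moved: " + moved(12, two, three, true));
    }
}
```

With modulo placement, 8 of the 12 partitions change owner when the third node joins, and half of those move between the two original nodes. With rendezvous hashing a partition changes owner only when the new node scores highest for it, so data only ever moves to the new node. How many partitions move depends on the hash, and the result is not perfectly balanced.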
It isn't necessarily the case that we want an entirely balanced network.

Okay, very quickly then, in terms of co-location. One very useful feature of Ignite is that it allows you to co-locate data, so that related data can be queried together. A great example: if you think back about five or six years, to the hurricane that hit New York, there's the weather alert that we could have sent out. If we co-locate the people that live in the city with the city itself on a particular node, then sending out a weather alert is very easy to do, because we know exactly where all the data are. That's a useful feature.

Here we've got an example of a debit and a credit across two bank accounts, where we're taking an amount from one account and adding it to another. These might be distributed across these two nodes. That involves an awful lot of messages: up to 16 network operations, particularly if we've got things like two-phase commit as well, where we do prepare and commit messages and acknowledgements, and if we've got both primary partitions and backup partitions. Even for a two-node cluster, that's quite heavy.

So co-location helps here: it reduces the amount of traffic that we need to worry about, and we can bring this back down to four network operations in this simple case. So we've got one node, and we've got this one-phase commit as well.
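
The co-location idea can be sketched in the same spirit. This is a hedged illustration, not Ignite's API: the class and method names are hypothetical, the city id is invented, and the hash is a stand-in. The only point is that an entry is placed by its affinity key (here the city) rather than by its own id, so a person always lands in the same partition as their city.

```java
/** Illustrative sketch only; names are hypothetical and this is not Ignite's API. */
final class AffinityColocationSketch {

    static final int PARTITIONS = 12;

    /** Stand-in hash-to-partition function (an assumption, not Ignite's). */
    static int partitionOf(int affinityKey) {
        return Math.floorMod(affinityKey * 0x9E3779B9, PARTITIONS);
    }

    /** A city is placed by its own id. */
    static int cityPartition(int cityId) {
        return partitionOf(cityId);
    }

    /** Without co-location, a person is placed by their own id and may land anywhere. */
    static int personPartitionById(int personId) {
        return partitionOf(personId);
    }

    /** With co-location, the city id is the affinity key; the person id plays no part in placement. */
    static int personPartitionByAffinity(int personId, int cityId) {
        return partitionOf(cityId);
    }

    public static void main(String[] args) {
        int newYork = 7; // hypothetical city id
        for (int personId = 1; personId <= 5; personId++) {
            System.out.println("person " + personId
                + " by id -> partition " + personPartitionById(personId)
                + ", by affinity -> partition " + personPartitionByAffinity(personId, newYork)
                + " (city is in partition " + cityPartition(newYork) + ")");
        }
    }
}
```

Once everything a query or transaction touches sits in one partition on one node, the work no longer has to be coordinated across nodes, which is what allows the one-phase commit in the transfer example.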
Some of you might be familiar with one-phase commit. We can take advantage of it because of the particular architecture that we've got in this approach.

Okay, so the upshot. As they say with lightning talks, there's only so much that we can cover, so, key points here. Partitioning: it's not just about even distribution. There are other distributions that can work quite effectively, and we don't need to worry too much about imbalance, because over time things will balance out, and that's okay.

Affinity co-location: a golden concept of distributed systems, and a very useful feature for particular types of processing. But again, there are architectural choices and the business model that we need to think about from that perspective. Ignite can help us in certain respects, and this is a general principle for a lot of other distributed systems as well.

And the third point, as I mentioned before, goes into a lot of detail and is pretty deep, and unfortunately, because of the time pressure today, we're just unable to cover that.

Okay, time for a question, I think. All right, as usual, time for one question.
So, anybody? Yes. Okay.

Okay, so the question was: what kind of real-world workloads is Ignite used for? I think my best answer there is pretty much everything. If you check the Apache website, ignite.apache.org, there are case studies from all sorts of verticals, everything from finance to health care to IoT, many different types of applications.

One of the ones that I had the privilege to look at late last year was a company called e-therapeutics in the UK. They're doing genomic sequencing and looking at a lot of these medical types of applications. They're running Ignite in parallel because they've got certain types of processing where using this parallelism gives them almost a two-orders-of-magnitude performance boost. And because it's a competitive area, they need to be able to run their jobs and do that type of processing very fast.

So I would say pretty much anything and everything. There's a whole range of applications. Feel free to drop me an email and I can follow up with you, or have a look at the Ignite website, where there are lots of case studies. Top-name companies as well, Apple, Microsoft, and others, are using it for a whole range of different applications.

All right. Thank you so much. Thank you.