Okay, hello everyone. Thanks for coming. As with every presenter here, I tried to figure out something that would be interesting in 15 minutes and to make some sense out of it. Thanks for the introduction, but we're not really about monitoring. What Terracotta does is provide a solution for JVM-level clustering, or more generally clustering of Java applications.

So that's me. I've been a developer at Terracotta for about a year now. I started a Belgian consultancy company called Uwyn; we still do stuff once in a while. I created the RIFE project, which is an open-source Java web framework that uses continuations, and I did a whole bunch of other open-source things over the years. I was one of the people that started Gentoo, together with Daniel. I'm a Sun Java Champion, and I researched native Java continuations. If you want to know more about that you can always talk to me.

But today we're going to talk about Terracotta. This is weird, because you've got this ticking going all the time. You guys don't hear it, right? At other conferences you've got this huge screen with a minute counting down, and here it's like...

So Terracotta is an open-source clustering solution for Java. It really focuses on the JVM; that's the important thing to remember when looking at these slides.

So let's look at the overview. Today, when people want to cluster something, when they want to scale out, they've got several app servers, several JVMs, several web applications, and they've got state inside those web applications. Then they start scratching their heads because one system isn't sufficient anymore, and oh my god, we need to scale out.
What do we have to do? Usually they resort to some custom development, or they use existing solutions that also have to be plugged in through custom development, like JMS or grid-based solutions, things that are quite invasive. You can't really take a Java application that has been written for a single machine and say, okay, let's take it to several machines and make sure that the state is correctly clustered across all those nodes. Well, we think that should be possible. So what Terracotta does, instead of plugging in at your app layer, is go one level down and plug into the JVM. We don't need a custom JVM. We support Sun's JVM, we support IBM's JVM, and we're working on JRockit. The real aim is to eliminate as much custom code as possible. It's really aimed to be transparent, to be plugged in with the least possible effort. That's a very nice ideal, of course. It's not always possible, but for most applications we succeed in doing that.

Now, why is this possible? I actually ripped out a slide too many, so I'm going to explain it like this. The JVM has got a contract about memory management: the Java memory model. It guarantees how multiple threads access any given data structure that is shared by those threads. What does this consist of? For example, you've got monitor locks. You start a monitor on the data structure, you end the monitor on it, and while that monitor is held no other thread can enter a block guarded by that same monitor. You've got synchronized statements in Java for that.
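As a minimal single-JVM sketch of the monitor contract just described (plain Java, no Terracotta involved): several threads append to one shared list, and every access goes through a `synchronized` block on that list. The class and method names (`MonitorSketch`, `addFromThreads`, `size`) are made up for illustration and do not come from the talk.

```java
import java.util.ArrayList;
import java.util.List;

class MonitorSketch {
    // The data structure shared by all worker threads.
    private final List<Integer> shared = new ArrayList<>();

    // Starts threadCount threads that each add itemsPerThread entries,
    // then waits for all of them to finish.
    void addFromThreads(int threadCount, int itemsPerThread) {
        List<Thread> workers = new ArrayList<>();
        for (int t = 0; t < threadCount; t++) {
            Thread worker = new Thread(() -> {
                for (int i = 0; i < itemsPerThread; i++) {
                    // Only one thread at a time can hold the monitor of 'shared'.
                    synchronized (shared) {
                        shared.add(i);
                    }
                }
            });
            workers.add(worker);
            worker.start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
    }

    int size() {
        synchronized (shared) {
            return shared.size();
        }
    }

    public static void main(String[] args) {
        MonitorSketch sketch = new MonitorSketch();
        sketch.addFromThreads(4, 1000);
        System.out.println(sketch.size());
    }
}
```

Without the `synchronized` block, concurrent `add` calls on an `ArrayList` can lose entries or throw; with it, every entry is accounted for.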
You've got other coordination tools like wait and notify between threads, and with all those tools Java on a single JVM allows people to share data across multiple threads. This contract has been written down in a spec. So basically what we've done is take that contract, scale a thread out to another machine, and keep the same kind of guarantees. Basically nothing changes: you still use synchronized statements, you still use wait and notify. But instead of having multiple threads in one VM on one machine, you can have multiple threads in multiple VMs on multiple machines, using the same contract.

First of all, it's declarative, so we generally don't require that you modify your code. Now, there are people that prefer to work with annotations instead of XML. That's still declarative, and we support annotations too.

One other major feature: to be able to cluster, we don't require your data structures to be serializable, since we plug in inside the heap, inside the Java memory model.
We can actually detect each individual change, each byte that has been modified, and just marshal those across the wire, instead of requiring you to implement a specific serialization and deserialization protocol. This has various advantages. One of the most important, in my opinion, is maintenance: when you've got complex data structures, you don't have to make sure that the serialization stays correct over time. Another advantage is that we can really optimize a lot. If you've got a huge data structure, a big tree with lots of branches, and one modification is made somewhere very deep down in that structure, we don't require everything to be serialized and sent across. Just that one little piece that has been modified will be shared with the other nodes.

The other side effect is that across the cluster each instance maintains its identity. With a serialization model, which is what most other clustering solutions use, you serialize the data structure, it goes across the wire, you deserialize it, and it becomes another object. Let's say you should never do it, but if you do an == comparison it will never work, because on the other node it will logically be another instance of that object. With Terracotta it will work. This means that you don't have to start working with surrogate keys, like a string identifier that identifies a particular entity.
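The identity point can be illustrated with plain Java serialization on a single JVM. This is only a sketch of the serialize-and-deserialize model the talk contrasts itself with, not of Terracotta; the class name `IdentitySketch` and the helper `roundTrip` are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;

class IdentitySketch {
    // Serializes an object to bytes and reads it back, as a
    // serialization-based clustering layer would do across the wire.
    static Object roundTrip(Serializable original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        ArrayList<String> original = new ArrayList<>();
        original.add("entry");
        Object copy = roundTrip(original);
        System.out.println(copy == original);      // false: a distinct instance
        System.out.println(copy.equals(original)); // true: same contents
    }
}
```

The copy has the same contents but is a different object, which is why code built on serialized copies tends to fall back on surrogate keys rather than object identity.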
No, you can just use object identity and use your collections to get the specific data structure: you call get, it goes through the collection and returns whatever result is appropriate, instead of having to use surrogate keys for that.

One of the other nice benefits: because we have one centralized server that is able to work in a fine-grained fashion, we can optimize things a lot. We can batch things up, and we can also allow you to have a virtual heap that is larger than what a single JVM can hold. So you can actually create a heap that spans, I don't know how many gigabytes, 40, 50, 100 gigabytes, as long as you've got storage space on the Terracotta server.

So let's see how this works in practice. I've got an application server that starts up. I've got a JVM that has been augmented with the Terracotta libraries, simply by registering a jar file in the boot class path, for people that know a little bit about this in Java. So it's not invasive at all; it works with your existing JVMs. This app server starts up, your web application starts up with a data structure, and a piece of that data structure is actually shared. You declared that this particular part of the data structure is what you want to cluster across multiple machines. Because you set that up in a configuration file, the Terracotta server will actually plug in there, see what the data is, and pull it back to the Terracotta server.
So it arrives there; it is in the Terracotta server. Now another application server starts up that has got an entirely different data structure at one location, but it starts using the application and at a certain point in time it hits that particular shared data structure, which has also been augmented with Terracotta's functionality. At that point, instead of creating a new object from scratch, it will actually go to the Terracotta server, fetch that object, and use the data that was put in by the first application server.

Now let's say there's a third application server. Again I've got a shared object, but instead of using the whole data structure that is stored on the server, I can just use a little bit of it and say, okay, I'm only interested in that part and I'm not accessing anything else. So it will not fault the rest in. Once you start traversing down the data structure, it will lazily fault that in and make sure that it's there whenever it's accessed.

And then of course we've got functionality that allows you to monitor all these things, but it's very specific to Terracotta and to clustering. It does have certain advantages, though. One of the things that has to be done properly is that your application has to be thread safe: you have to have synchronized statements around shared data structures, and a lot of the time that is not the case. So we give you meaningful error messages when that happens, and you can actually improve your application just by running it through Terracotta. Well, I did with many applications that I wrote, because when you write it on a single node, it's always wrong.

This is a quick example, a typical hello world. So what does this do? I've got a list that contains dates. I create a new date.
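The slide itself is not part of the transcript, so the following is only a guess at what such a hello world could look like in plain Java. The class name `HelloWorld` and the field name `dates` follow the talk's description; the method name `addAndList` is made up here. It creates a new date, adds it to the list under a synchronized block, and prints every entry.

```java
import java.util.ArrayList;
import java.util.Date;
import java.util.List;

class HelloWorld {
    // The list of dates. According to the talk, this is the field that gets
    // declared as shared in Terracotta's configuration file (not shown here).
    private final List<Date> dates = new ArrayList<>();

    // Adds the current date and returns a snapshot of everything in the list.
    List<Date> addAndList() {
        synchronized (dates) {
            dates.add(new Date());
            return new ArrayList<>(dates);
        }
    }

    public static void main(String[] args) {
        for (Date date : new HelloWorld().addAndList()) {
            System.out.println(date);
        }
    }
}
```

Run on its own in a single JVM, each start creates a fresh list, so exactly one date is printed.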
I add it to the list, and then I print out everything that's inside the list. That's what this application does. Now if I run this application by itself, there will always be just one entry in the list and it will always print out just that one entry. With Terracotta, I can point to that dates data structure by creating a configuration file. So I'm saying: the hello world class has a dates data structure, and I want to lock automatically on it so that each synchronized statement becomes clustered. When I do that, I can start the Terracotta server and I can start the client as many times as I want. The first time I get an output of a single line, the second time I get two lines, and so on. Why? Because that dates data structure is stored in the server, and each time I start a client it will actually be faulted in, and this list will grow bigger and bigger.

Now let's look at another very simple example, about coordination. Let's say that in this application I only want to start adding information to that list when all the nodes are at a certain point. I can use JDK 1.5's CyclicBarrier for that. I can say, okay, I've got a barrier and I want to wait until all the nodes have reached that point. So again I create a field in that class, which I call barrier, and I add it as a root to Terracotta's configuration file. Then I start the server and I start the client. You'll notice here that I added a JVM property called nodes.
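A rough sketch of the barrier idea in plain Java, assuming the number of parties is read from a `nodes` system property as the talk describes. The class name `BarrierSketch`, the method `awaitAll`, and the default of 1 are assumptions for illustration; the root declaration in Terracotta's configuration file is not shown.

```java
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CyclicBarrier;

class BarrierSketch {
    // The field the talk calls "barrier"; under Terracotta it would be
    // declared as a root so that all client JVMs share one barrier.
    private final CyclicBarrier barrier;

    BarrierSketch(int nodes) {
        barrier = new CyclicBarrier(nodes);
    }

    // Blocks until 'nodes' parties have called await; returns the arrival
    // index reported by CyclicBarrier (0 for the last party to arrive).
    int awaitAll() {
        try {
            return barrier.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (BrokenBarrierException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Read with -Dnodes=N on the command line; defaults to 1 (assumption).
        int nodes = Integer.getInteger("nodes", 1);
        BarrierSketch sketch = new BarrierSketch(nodes);
        System.out.println("Waiting for " + nodes + " node(s)");
        sketch.awaitAll();
        System.out.println("All nodes arrived");
    }
}
```

In a single JVM without clustering, only `-Dnodes=1` gets past the barrier from `main`; with a higher value the one thread waits indefinitely, since no other party ever arrives.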
So this has to be correct, of course: it has to be the same number of nodes as the clients that I'm starting. Let's say that I start two clients. It will hang until those two clients have joined, and only then will it execute the rest of the program. And this works in a clustered fashion across multiple nodes.

So let's wrap up. Basically Terracotta works under the covers. It follows the Java memory model to guarantee correctness. This is not crazy voodoo magic that you don't understand; it's been widely documented and it's been out there for several years. We just implemented another version of the Java memory model that works across multiple nodes.

Another thing to remember is that you don't code for it, you architect for it. You think about data structures and how they will migrate across your nodes. You don't get a clustered hash map that you put your stuff in and then get back from the other side. If you want to, you can just create a hash map, make it a root, and then it becomes a shared data structure and you get a clustered hash map. So you have to think a little bit differently. You don't get an API that you can code against. You really have to identify which parts of your data structures have to become shared, and those get declared inside Terracotta's configuration file.

Okay, so that's about it. No serialization; fine-grained, field-level changes. It allows you to traverse your object graphs in little parts, and things are faulted in lazily, only when you access them. The JVM's coordination is maintained across the cluster, and you can have a large virtual heap. And for the next version we're also working on a cluster-wide statistics monitoring tool that allows you to gather data while your system is running and then analyze it in real time or afterwards, to see how your shared data is actually behaving and how it impacts system behavior.
So this is a little bit what Zabbix also does, but for individual operating systems, and Hyperic does that also. We try to really drill down into the data structures. That's for the next release.

Thanks for coming. If you've got any additional questions, you can always email me. I'm not sure that we still have time. We still have one minute. I put some papers here that you can take away; they've got this, but much better detailed and explained, with source code. And then my business cards, if you want to email me.

You have a question, sir? Okay, the question is: can you achieve high availability with Terracotta? Of course. That's one of the main reasons to use Terracotta. Basically the architecture is: you've got a Terracotta server, and you can set up several other Terracotta servers that are passive and that start up in hundreds of milliseconds, so they can take over easily. Then you have your nodes that connect to the Terracotta server. If any one of your client nodes goes down, you can start up another client node and it will take off where you were before.

I don't think we've got time; we can do it in private. Thanks a lot.