My name is Olym Mahon. I spend my days at the Royal Institute of Technology, and one of the things I do there is participate in a couple of research collaborations, the best known of which is what we call the Globus Alliance. What we do in the Globus Alliance is foster the development of something called the Globus Toolkit, and that is what I am going to talk about: what we deal with in the Globus Alliance, namely grid computing. This is largely a distributed-systems story. How many of you are acquainted with grid computing? I will cover it in some detail and give you a picture of what grid computing is and where Globus fits in.

So let's go back two steps and explain why this has happened. Grid computing originates in the research and scientific space where, just as in any other business, you no longer stand alone at the blackboard with your chalk and figure out new theories. Instead, you participate in a distributed, multi-disciplinary, collaborative, fast-paced effort. You have experts from various fields trying to explain to each other what kinds of phenomena they see. You have large-scale, unique instruments placed at different sites, and thousands of researchers who want to work with them. These instruments tend to create enormous amounts of data, and you want to enable all the researchers to exploit that data effectively.

This was first called meta-computing, which in turn evolved from plain distributed computing, and what it really is about is crossing organizational borders to enable resource sharing. The idea comes from the analogy of how the power grid works. Once upon a time, each village got electrified on its own and used its own electricity. Over time, small parts of cities got connected, a couple of blocks at a time, and eventually whole cities consumed power from a common grid. Nowadays we don't even think about where the power comes from. What we are trying to do is something similar with a computing grid: we want to decouple production from consumption. Where are the resources? Where is the storage? Where are the free CPUs? The consumer should just be able to run 10,000 different processes to analyze the contents of 10,000 different files, without caring where they run.

You have heard the business talk as well; some vendors have this dollar-per-CPU-hour offering coming out now. It's all the same concept: computing as a utility. For this to happen, we need to standardize on interfaces, and this is where we have come a fair way already. We need to be able to talk across operating systems, across organizations, across different software stacks, and to reason about what a resource is, how I communicate with it, and a common security model. The concepts are much the same at every scale: you can do this on your university campus, or in an enterprise at large scale.

Let me give you a couple of examples of where grid computing is used today. Actually, the two at the bottom that you know, SETI@home and distributed.net, already existed when this came about, so they still have their own software stacks to do the same kind of thing. But those are the ones most of you are probably aware of.
You have these screen savers that, when you are not using your computer, participate by contributing some small chunk of work to a large effort: searching for extraterrestrial intelligence, cracking crypto algorithms, doing stock-market analysis, or whatever it may be.

Another example is CERN in Switzerland. They are building a new accelerator that will be even more powerful than the previous one, and it should go online fairly soon. It has four large detectors, and as you can see here, with a full-grown man in the picture for scale, these are huge things, massive iron. These detectors spit out gigabytes per second of data that has to be stored and analyzed. When they asked for the money to build the facility, they started to think about what kind of computing and storage capacity they would need to handle all the data, and they realized that it would be just as expensive as building the accelerator itself. They couldn't go back to their funding partners and ask for that, so they needed another solution. The solution is that everyone chips in with whatever resources they can gather locally, and in the end some 2,000 physicists across lots of institutes share all of these resources and all of this data. They are looking for traces of a particular elementary particle that hasn't been found yet, and there is presumably a nice prize waiting for whoever finds it. So here we have a unique instrument with lots of data, and a distributed analysis of that data.

We can also turn the whole thing around. There is an example from the United States, an earthquake engineering project. This is a borrowed slide, but essentially what they have is sensor networks out in the field, sensing seismic activity. They feed that as input to large shake tables, which you can see in the picture; those are huge things that you can put a three-story building on and shake around to see whether it holds up. And of course you combine this with data repositories, analysis, computing, and simulation. Eventually this all becomes so complex that you want a browser-based interface to the whole thing. This is some seven or eight different institutes participating, and the researchers share data, instruments, and hardware.

We can also go to the commercial world. SAP, which you may know, showcased a demo quite recently where they build on top of utility offerings from various providers, where you rent hardware somewhere on the net, and based on current demand they allocate more or fewer resources to handle, for instance, the peaks that come when a product is offered cheaply.

This illustrates a little of what happens when we share resources, and we can do it in several different ways. Instead of organizations A and B each having nine computers to themselves, sized for their peaks, they can keep two computers each and share seven common ones. They will still each have access to nine computers during the parts of the day when they need them. Or they can keep the total amount of hardware and effectively scale up to something like sixteen servers each while the other is idle. A tiny simulation below makes the arithmetic concrete.
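Here is a back-of-the-envelope simulation of the sharing idea just described. All the demand numbers are made up for illustration; the point is only that complementary peaks let a smaller shared pool cover the same load:

    # Two organizations with complementary daily demand, in machines per hour.
    demand_a = [2, 2, 9, 9, 2, 2]   # org A peaks in the middle of the day
    demand_b = [9, 9, 2, 2, 9, 9]   # org B peaks at the other hours

    def unmet_hours(demands, dedicated, shared):
        """Count hours where demand cannot be met from dedicated machines plus the pool."""
        misses = 0
        for a, b in zip(*demands):
            spare = shared
            # each org first uses its own machines, then draws on the common pool
            spare -= max(0, a - dedicated)
            spare -= max(0, b - dedicated)
            if spare < 0:
                misses += 1
        return misses

    # 18 machines total, no sharing: nine dedicated each, empty pool
    print(unmet_hours((demand_a, demand_b), dedicated=9, shared=0))  # 0 misses
    # 11 machines total, shared: two dedicated each, seven in a common pool
    print(unmet_hours((demand_a, demand_b), dedicated=2, shared=7))  # still 0 misses

Both schemes satisfy every hour of demand, but the shared one does it with eleven machines instead of eighteen.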
The point is that you can almost always find someone whose usage pattern is not exactly the same as yours, so by sharing on demand you can scale capacity with actual consumption.

So how do we model this? It quickly becomes a matter of trust. If someone is charging customers one dollar per CPU-hour, they don't want somebody stealing cycles and having someone else pay for it. Likewise, if you offer these kinds of services, you wouldn't offer them to just anyone: if I am offering storage, I may want to know it isn't being used for military research or something else I am opposed to. If you step back and look at it, there is a pattern of collaboration here. Across all these organizations, you have people in each organization who join a club, and that club is what we call a virtual organization. Inside this club they pitch in resources, and they self-organize around the problem they are trying to solve. (There is a small sketch of this idea below.)

So what is it that this club, the Globus Alliance, provides? Most of us have spent many years trying to help exactly these kinds of scientific applications, and we realized that we were reinventing the wheel over and over and over again; the same problems arise all the time. So the toolkit is really a collection of solutions to pieces of the puzzle. I can't say it provides the fully complete puzzle. We try to introduce a common security model. We try to introduce common access mechanisms as well. And we try to hide all the gory details of "oops, am I running on Solaris? Am I running on Linux? Am I running on Windows?" by keeping those small details out of our protocols and staying as generic as we can. We try to follow standards to as great an extent as possible. But we can't build the one single solution, because everyone else has either already been doing things in this space and has some legacy, or they have their own way of looking at the whole problem and want to solve it differently.

If you look at how the Internet is built, you have a narrow waist in the middle, the IP protocol; below it you have great diversity in link technologies, and on top of that you have all the different applications. That is precisely what we are doing as well. We provide middleware, some core services in the middle for resource description and connectivity, and we never jump all the way up to actual applications, because this is a toolkit. If we did, it would only ever be a solution for that one particular application.

This grew organically in the beginning. We were a small bunch of crazy people hacking day and night, and finally we had something we could call 1.0. It was based on whatever we could find out there that fit our purposes, so we ended up with a flora of different protocols, and then we tried to shoehorn in a common security model.
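Before going on to security: to make the virtual-organization idea from above a bit more concrete, here is a minimal sketch. All the names are invented, and in the real toolkit this trust structure is expressed with certificates and signed assertions rather than in-memory sets; the sketch only shows who decides what:

    # Each virtual organization maintains its own membership;
    # the resource owner never manages individual users.
    vo_members = {
        "particle-hunters": {"alice", "bob"},
        "quake-engineers": {"carol"},
    }

    # The resource owner decides which communities to serve,
    # and retains a local veto over everything.
    trusted_vos = {"particle-hunters"}
    locally_banned = {"mallory"}

    def authorize(user, vo):
        """The resource always has the final say on what runs here."""
        if user in locally_banned:
            return False                                # local policy wins
        if vo not in trusted_vos:
            return False                                # we don't serve this community
        return user in vo_members.get(vo, set())        # the VO vouches for its members

    print(authorize("alice", "particle-hunters"))   # True
    print(authorize("carol", "quake-engineers"))    # False: VO not trusted here

The scaling win is that the resource tracks a handful of communities instead of thousands of users.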
The security model had to be better than username and password, because username/password doesn't fly if you are going to manage it across a dozen different organizations; you need a better model. So we have a grid security infrastructure which is PKI-based, with certificates and so on. We have a single sign-on model where you log in once, and then the various processes and applications that you start can authenticate on your behalf throughout the system.

So those were our proprietary protocols, and many of them are still in use today. But the problem we encountered was that with each new protocol, with each new thing we wanted to add, it got more and more complicated, for instance to maintain the common security layer across them all. So we had to rethink that strategy. In the meantime, the whole web services movement came about, and we looked at it and realized that it actually fit our purposes quite well. We also got a push from industry, who said: hey, you are doing exactly the stuff we want to use, and we would rather use your stuff than build everything ourselves, but we have to be able to sell it to customers, so you have to follow standards.

So standards become more and more important the more successful we get, and that is also because it is not a monoculture anymore, where there is a single de facto standard because there is just a single toolkit. Now we are one little piece of a big ecosystem. Of course, all the standards don't exist yet. You have to go around and travel to the IETF and the other bodies and watch the standards churn until they settle down, and that's a pain. It is a very hard trade-off between implementing something and pushing it out the door, versus turning the big grinder and getting something standardized in a large context so that someone else can do the same thing. So grid computing is evolving from de facto standards, really our own invented protocols and protocol extensions, into a larger setting where we now have something called the Open Grid Services Architecture, with industry and academia and lots of intelligent people discussing things in big meetings. We are only at the beginning of that era right now.

In terms of what the toolkit looks like: if you download it right now, you will notice that it is a multi-megabyte bundle of software, and when you start to look at the various components you may get even more scared, because over time a lot of stuff has accumulated there, and because of backwards compatibility we can't drop anything. This whole evolution has produced a plurality of components, but they are orthogonal and modular, and for your particular solution you can pick and choose among them. It is a bag of components, but they have common structure, in that they are all part of the same build and packaging philosophy, and they share a common security infrastructure. We focus on four or five particular areas: security, data management, execution management, and information services. I am going to talk about each of these.
But of course we also need to hide, as I said earlier, the differences you see when running the same software on AIX and Windows and what have you. So we have common libraries and common runtimes with utility functions; for instance we have our own portability layer, because things like threading and linking do not behave the same on all platforms. That is the fifth area.

Last week we shipped the final build of version 4.0 of the toolkit. What's new there is that we have made the transition to web services for real. We introduced something called the Web Services Resource Framework, where all the various grid resources I have talked about, the instruments, the computers, the storage, are modeled as web service resources. That means they all carry a little element called resource properties that explains what they are and what they do, and that you can query; you can also subscribe to notifications about changes to these properties. This has helped a lot in the implementation of these distributed systems. There are several implementations of this framework already: Java, C, Python, .NET. It is still early days, as I said, but things are much better than in the previous version. We had version 3, which was more or less the first approach to the web services world, and that was frankly something of a disaster for performance and usability.

So we provide not only a toolkit; we also provide a development and hosting environment, and for web services we offer both a Java and a C environment. The Java side is more or less pure Apache Axis, so you can use the related Apache products, and much of what we have done has been worked straight back into Apache, done as part of the Apache projects. Some of these modules are quite complex to use, so we also put them into a common framework with simplified wrappers on top, so that you don't go completely gray-haired, and it should be reasonably easy to create your own grid services.

Now let me switch and talk about the various focus areas. The most important thing, when you want to build these cross-organizational distributed platforms, is security. If you don't have a good security model, no administrator will install your software. You also have the problem I mentioned: you cannot use something as simple as username/password. First, trying to organize a namespace of usernames across organizations is hard, essentially impossible, and the more you want this to scale, the worse it gets. We have thousands of users, but perhaps only 10 or 20 of them access your particular resource at any one time. That means you shouldn't have to reserve UID space for several thousand people; you want a more dynamic mapping into your local account space, and some sandboxing.

And if a user has 10,000 jobs to run, of course he cannot shepherd them all himself; he needs machinery to do that for him. Typically a user will submit something from his laptop and then go offline, so somewhere there needs to be something running, a broker or some other kind of task queue, that monitors the work and is empowered to act on behalf of the user, so that when those processes go to a resource somewhere, they get the same access the user himself would, even though the user may not be online. As I said before, we use PKI for this, and we use delegation to say that a certain process is allowed to act on behalf of the user under certain restricted circumstances.
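Here is a toy model of that delegation chain. Real GSI uses X.509 proxy certificates with actual cryptographic signatures; in this sketch a "certificate" is just a dict and signing is represented by a reference, which is enough to show the chain-of-trust idea:

    def issue(subject, issuer, lifetime_hours):
        """Issue a toy credential; issuer=None marks a long-lived identity credential."""
        return {"subject": subject, "issuer": issuer, "lifetime": lifetime_hours}

    # The user signs in once with a long-lived identity credential ...
    user_cert = issue("C=SE/O=KTH/CN=Alice", issuer=None, lifetime_hours=24 * 365)
    # ... then creates a short-lived proxy a broker may act with,
    # and the broker can delegate further to a job manager, and so on.
    proxy1 = issue("proxy of Alice", issuer=user_cert, lifetime_hours=12)
    proxy2 = issue("proxy of proxy of Alice", issuer=proxy1, lifetime_hours=2)

    def end_entity(cert):
        """Walk the delegation chain back to the identity that anchors it."""
        while cert["issuer"] is not None:
            cert = cert["issuer"]
        return cert["subject"]

    def effective_lifetime(cert):
        """A delegated credential is never stronger than the chain it hangs on."""
        hours = cert["lifetime"]
        while cert["issuer"] is not None:
            cert = cert["issuer"]
            hours = min(hours, cert["lifetime"])
        return hours

    # A resource seeing only proxy2 can still tell whose rights it carries:
    print(end_entity(proxy2))          # C=SE/O=KTH/CN=Alice
    print(effective_lifetime(proxy2))  # 2 hours: the weakest link bounds it

The short lifetimes are the safety valve: a stolen proxy is only useful for hours, not for the year the identity credential lives.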
Beyond that, we use the normal standards. But, as I said before, a resource owner doesn't want to grant access to just anyone. Rather, the owner agrees to serve these various communities, virtual organizations being the other word for it, and those communities in turn decide who belongs to their particular community. This is all about how you enable scale: the resource only has to trust some 10 or 15 or whatever different communities, and inside each community you might have a hundred users who come and go without the resource owner tracking them individually. This is not easy, and I will spare you the details, but the important thing is that the resource always stays in control; the resource always has the ultimate say in what is allowed to run on it.

Another area is execution management. The problem there is that if you have clusters with batch queuing systems, you will notice that there are many different schedulers out there, and they all have their own way of submitting jobs. So what we have done is provide a common interface to these various schedulers. It also defines an execution environment for a job, so you can stage files in, start the job, manage its whole life cycle, stage results out, or kill it, all through the same interface. If you were at the clustering workshop, you saw more or less the first version of what this is all about; it is now semi-stable while we await an emerging standard in this area. But it is the same idea: you define which executable to run and where, you say "fetch this input file from over there and put it here", and you say which output files to stage out to where. I am not going to cover the details.

The next area is monitoring and discovery. Monitoring and discovery are really two different things, but you can solve them both with the same machinery, namely subscription and notification. And this is where our work on the common runtime fits in nicely: all services get this for free. You can query them all, and you can set up subscriptions and notifications on them all. So what we have is an aggregator framework, either polling for information or subscribing to it. The aggregator looks just the same in both cases, but by virtue of different implementations you get either the monitoring or the discovery functionality. We have an index service, which remembers where resources are, because they will come and go as various organizations decide to offer their free cycles to some community. Or you can have a trigger aggregator, which listens to notifications, and when some particular notification arrives that you want to alert on, it causes actions to happen. There is also, not part of the toolkit yet but on the roadmap, a third aggregator, an archiver, which will let you go back and see what has happened in the system. And, as I said, every service is discoverable this way.
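Here is a minimal sketch, with invented class names, of that aggregator idea: the same notification plumbing gives you discovery or monitoring depending on what the sink does with incoming resource-property updates:

    class Index:
        """Discovery: remember the latest properties of every registered service."""
        def __init__(self):
            self.entries = {}
        def notify(self, service, properties):
            self.entries[service] = properties
        def query(self, key, value):
            return [s for s, p in self.entries.items() if p.get(key) == value]

    class Trigger:
        """Monitoring: run an action whenever a property matches a condition."""
        def __init__(self, condition, action):
            self.condition, self.action = condition, action
        def notify(self, service, properties):
            if self.condition(properties):
                self.action(service, properties)

    index = Index()
    alarm = Trigger(lambda p: p.get("free_cpus", 1) == 0,
                    lambda s, p: print("alert:", s, "is full"))

    # the same stream of notifications feeds both sinks
    for sink in (index, alarm):
        sink.notify("cluster-a", {"free_cpus": 12, "os": "Linux"})
        sink.notify("cluster-b", {"free_cpus": 0, "os": "AIX"})

    print(index.query("os", "Linux"))   # ['cluster-a']  <- discovery
    # the trigger already printed its alert for cluster-b  <- monitoring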
This picture gives you an example of exactly that. In each web services container there is one of these index services, so more or less every host can give you a single point of entry to all the various services running on that host: perhaps the reliable file transfer service that we use to move terabytes of data around, perhaps one of the execution management services. There is still a case for not running everything on top of web services, and there is an adapter into systems like Ganglia, so hopefully we just blend into the normal monitoring systems that people are used to.

The third pillar is data management. This has to do with everything from keeping track of where your data is to actually pushing the bits across the network. If you want to use the various high-bandwidth fiber networks that exist out there, plain TCP doesn't keep up. So what we have done is define some extensions to FTP, a whole reliable protocol that we call GridFTP, with additional machinery for setting TCP buffer sizes, opening multiple parallel streams, and striping across multiple servers that each push as much data as they can, which is usually around 100 megabytes per second. Thanks to this you can actually fill a 30-gigabit-per-second fiber across the wide area if you want to. Of course, you have to be a little bit careful. The people at CERN were trying things out against a big storage facility in the UK, and the UK incident response unit thought they were under a huge denial-of-service attack: all of a sudden the whole fiber to the UK was saturated, and it was all going to one single host. So sometimes you should let the incident response people know ahead of time that you are going to do things like this.

Similarly, databases are another concern. You have MySQL, you have PostgreSQL, you have Oracle and so on; you have many different representations; you have plain files. And you may need to tie together several different sources. It could, for instance, be medical journal data, where some doctors want to take parts of many different medical records and do an epidemiological study. For legal and various other reasons, they are not allowed to assemble all of this data into one central database, so you have to support distributed queries that also take into account where the sources are.

That is access, and pushing bits around. But once you have pushed your gigabyte data sets around, you need to remember it: I copied that data set from there to there, so now there are two copies, replicas, of my data set. You want to be able to remember where they are, and to associate attributes with these data sets that applications can use. You can also have logical data sets, collections of hundreds of files, and whenever you replicate the logical data set, all the member files follow along with it. Then, as I said, there is GridFTP for the transport itself, and there is another service we call Reliable File Transfer, which means you don't have to stay online while moving these large data sets around; the service does it for you. You can also do third-party transfers directly between two remote sites. And this is all tightly coupled to execution management, to let you coordinate data movement with computation.
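To make the replica bookkeeping concrete, here is a minimal sketch. The function names and URLs are invented; the toolkit's real component for this is a replica location service, and this only mirrors the idea of mapping logical names to physical copies:

    replicas = {}   # logical file name -> set of physical locations

    def register(lfn, url):
        """Remember that a physical copy of a logical file exists at url."""
        replicas.setdefault(lfn, set()).add(url)

    def locate(lfn):
        """Return every known physical copy of a logical file."""
        return sorted(replicas.get(lfn, set()))

    # after copying a data set from one site to another, remember both copies:
    register("run42/events.dat", "gsiftp://site-a.example.org/data/events.dat")
    register("run42/events.dat", "gsiftp://site-b.example.org/mirror/events.dat")
    register("run42/calib.dat",  "gsiftp://site-a.example.org/data/calib.dat")

    # a logical collection is itself just a list of logical names,
    # so replicating the collection means following all its members
    collection = ["run42/events.dat", "run42/calib.dat"]
    for lfn in collection:
        print(lfn, "->", locate(lfn))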
As I said before, even though we have all these things, it is often not enough to build a full-scale application. So there is an ecosystem that has developed and evolved over the years, a whole set of tools and packaging efforts that build on top of the toolkit, and that has been quite successful, I have to say.

Now you may ask yourselves: this is all good and fine for those doing these thousand-user, distributed-analysis kinds of things, but what's in it for me? You don't have to scale to those sizes for it to be useful, especially if you are writing any kind of distributed application. For instance, the service development and hosting environments we offer are there for anyone to use. There is something in this for many people doing distributed work. So just be aware that we exist, and when the day comes that you are tasked with doing something similar, remember that we exist and that we are always looking for new collaborators.

We have an online community. We have discussion lists: there is a developer-discuss list with more than a hundred members, and there are lists for early adopters, friendly users who follow the latest releases. And we have, very recently, something that people rarely think of: documentation. It doesn't matter how much you plan for it, documentation is the last thing a developer writes. So there is a documentation project in place, and the idea is that system administrators write for system administrators, instead of having the developers write for them, and so on. There is a really good tutorial on developing your own services, written by a Spanish guy; it is quite new, and the toolkit itself has gained a lot of documentation over the last few months.

As I was reminded last night, I should add a small warning here: if this sounds interesting to you and you go and download the toolkit, be clear that this is more than a 15-minute exercise. It is not just make-install and go; for starters you have to get those certificates. We try to make this simpler, so we have a facility where you can easily get the certificates you need to play around. We have collaborators providing binary RPMs and things like that. Things can always get better; don't give up, and plan for more than 15 minutes. We always welcome feedback in the form of bug reports and improvement requests. We welcome uncomfortable feedback as well, hopefully with some suggestion of how to solve the problem. And we always welcome collaboration. We know that you are out there and that you have ideas of your own; we are really happy to talk to you about how to make this a bigger effort. That is about all I have. Are there any questions?

[Question from the audience about running an existing application on the grid.] In that case, you usually just wrap the application. The application can stay standalone, and you can ship the application as-is; you don't need to compile it against any particular libraries. All you need to do is provide the executable for each environment in which it is going to run. So if you want to use both Solaris and Linux machines, you provide a Solaris executable and a Linux executable, and then you use the ordinary staging tools to handle the actual staging. A sketch of such a job description follows below.
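As a rough illustration of that answer, here is what such a job description could look like. The field names are invented, and the toolkit's real job descriptions are XML documents; this only shows the shape of the information you supply:

    job = {
        "executable": {            # one binary per target environment
            "Linux":   "bin/analyze-linux",
            "Solaris": "bin/analyze-solaris",
        },
        "arguments": ["--input", "events.dat"],
        "stage_in":  [("gsiftp://site-a.example.org/data/events.dat", "events.dat")],
        "stage_out": [("results.out", "gsiftp://myhost.example.org/out/")],
    }

    def submit(job, platform):
        """Pretend scheduler adapter: pick the right binary, then stage/run/stage."""
        exe = job["executable"][platform]
        print("stage in: ", job["stage_in"])
        print("run:      ", exe, *job["arguments"])
        print("stage out:", job["stage_out"])

    submit(job, platform="Linux")

The adapter behind submit is what differs per scheduler; the description you write stays the same.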
[Question: are these tools firewall-friendly?] Firewalls are always tough when you want to do distributed systems. First of all, if there is a site policy that says no inbound or outbound connections, then we shouldn't try to circumvent it. But we do have some help. For one thing, more and more of the traffic now goes across a single port, with the transition to web services. We also still have the older mechanism, where you can set an environment variable that restricts the services to only use a certain port range, so the firewall only needs to open that range (there is a small sketch of this below). And you can define things so that all services advertise the proper external name of the host: my IP address may be 192.168-something on the inside, but the services should still be findable from outside. Ultimately, though, if you have a completely closed environment, you need to open something up.

In collaboration with some other groups, there is also work on a dynamic connectivity provisioning service, where a program regularly asks something that is in control of the firewall, programmatically, saying "here I am, here are my credentials", so that it can open up on a per-need basis. Now, if you ask the system administrators about that, you get two camps: one says this is exactly what I want, and the other says there is no chance whatsoever that I will hand over control of the devices around me. But once you can play such tricks, yes, it is iptables or the equivalent that actually does the job. It is a good question; our focus has mostly been elsewhere, but you do have to think about the means by which you can offer services to the outside in a secure way.

[Question: could you establish a single network, a VPN, between the different parties?] You could establish a VPN between all these different players, but that won't fly, because the same resource may be consumed by several different communities, so all the different communities would have to establish VPNs with each other. That doesn't really scale. It works in a business environment where you have more or less static, private partnerships, but not here.
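Going back to the port-range control mentioned in the firewall answer: it is driven by an environment variable. A minimal sketch of using it, where the variable names are as documented for the toolkit but the launched command is just a placeholder:

    import os
    import subprocess

    env = dict(os.environ)
    # Restrict listening ports to a small range the firewall has opened.
    env["GLOBUS_TCP_PORT_RANGE"] = "50000,50100"
    # On multi-homed or NATed hosts, pin the externally visible host name
    # (check your toolkit version's documentation for the exact behavior).
    env["GLOBUS_HOSTNAME"] = "grid.example.org"

    # Start whatever service you run, inheriting the restricted environment;
    # "echo" here stands in for the real service container command.
    subprocess.run(["echo", "starting service container here"], env=env)

With this, the firewall administrator only has to open ports 50000 to 50100 inbound instead of the whole ephemeral range.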
[Question about deployments.] I could also add that this is a toolkit: these deployments are all different choices of how to implement things and how to use the tools in different ways, and not all of them are identical. Let me show you a slide that gives you an idea; it is all over the world. In Europe, the largest project is EGEE, Enabling Grids for E-sciencE, which covers some 70 different partners that all participate with hardware; it is primarily for high-energy physics and biomedical applications, but others are invited as well. In Sweden we have a national grid that covers much of the country. There are corresponding efforts to EGEE in the US, and there is the Japanese National Research Grid Initiative. These are all public efforts. In Scandinavia we also have a Nordic consortium, which grew out of what used to be NorduGrid and is nowadays run more as a standing collaboration. Things like this exist across the planet, and the ones I have on display here are more or less only the scientific and open research environments; of course there are commercial platforms too, vendors selling supported solutions, and we have made a conscious choice not to compete with them.

[Question about integration with local authentication.] Increasingly, there are infrastructure projects where we are collaborating to integrate a solution called Shibboleth into this, and support for Shibboleth is on its way; it is currently being worked on. With Shibboleth you authenticate locally, and then your home organization issues a small token saying "the bearer of this token is okay, he is a user of mine", and it can also state what role you have in the organization. So in that sense you get integration with local authentication, but in a more federated model. The model we have today does not rely on those organizational federations: you have an end entity and then a third party, a certificate authority. So the credential does not have to come from your organization at all; it is simply a digital certificate, so it could, for instance, be issued nationally.
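As a closing illustration, here is a toy model of that federated flow: authenticate at home, and let your home organization vouch for you and your role. Real Shibboleth uses signed SAML assertions; the HMAC below merely stands in for a signature so the sketch is self-contained, and all names are invented:

    import hashlib
    import hmac

    # Illustrative only: a key the home organization shares with the federation.
    HOME_ORG_KEY = b"secret shared with the federation"

    def issue_token(user, role):
        """The home organization vouches: this user is mine, with this role."""
        msg = f"{user}|{role}".encode()
        sig = hmac.new(HOME_ORG_KEY, msg, hashlib.sha256).hexdigest()
        return msg, sig

    def resource_accepts(msg, sig):
        """The resource never sees a password, only the home org's vouching."""
        expected = hmac.new(HOME_ORG_KEY, msg, hashlib.sha256).hexdigest()
        if not hmac.compare_digest(sig, expected):
            return False
        user, role = msg.decode().split("|")
        return role in {"researcher", "student"}   # local policy on roles

    msg, sig = issue_token("alice@example.se", "researcher")
    print(resource_accepts(msg, sig))   # True

The resource trusts the home organization's statement about the bearer rather than holding its own account database, which is the federated counterpart of the certificate model described in the talk.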