Thanks a lot. Thank you for inviting me today. I'm going to talk about clouds in high energy physics. I thought I would first make a definition, or explain what high energy physics is; I'll leave the cloud definition to this group, I'd rather not put that one up. What we do as particle physicists, or high energy physicists, is study the fundamental particles of nature and their interactions. We use a variety of facilities. We use accelerators; the picture on the far left is the linear accelerator at Stanford University, and I've been part of an experiment there for a number of years. The middle one is an underground laboratory in northern Ontario, in Sudbury. It's 8,000 feet underground, where we put detectors to look at particles from the sun: the Sudbury Neutrino Observatory. I've visited there. And then we have detectors on the space station, such as the Alpha Magnetic Spectrometer; some of this was in the news in the last week. I'm hoping to go there, but I'm not optimistic.

With these facilities, we try to address a number of questions that are important for our understanding of the universe. The one that you may have heard most about, and I'll explain a little more as I go along, is the Higgs boson. I'm a member of the ATLAS experiment at CERN, and we see evidence for the Higgs boson, which is supposed to be the source of mass for all the particles that we know about. It's been coined the God particle; that's not our name, but it's sort of stuck in the media that way. I'll go over a bit of the detail of this. We're also asking other questions. Maybe you saw Angels & Demons, about how we can make antimatter. We can make antimatter, but one of the questions we're wondering about is why the universe today is made only of matter, and not matter and antimatter. We expect that at the Big Bang both were created equally, so why don't we see a galaxy of antimatter for every galaxy of matter? But we don't.
We only see matter. So something has happened during the evolution of the universe so that we see only matter in the world. The last question I'll highlight is that our understanding of the universe is actually pretty limited. This pie chart shows that atoms, or ordinary matter, make up 5% of the universe; dark matter is 24% and dark energy is 71%. And "dark" just means we don't know what it is. We find this out from projects such as the WMAP satellite, which measures the temperature of the universe, and we can also see it in astronomical data. The bottom line is we don't understand what 95% of the universe is made of, which is kind of embarrassing. So I work on these projects.

Astronomers use telescopes; in a sense, particle physics accelerators are basically big microscopes. We've evolved from simple instruments to more complex ones. The astronomers have their telescopes in space; we have large detectors. This is the ATLAS detector that I'm involved in, during its construction at CERN. Our collaborations are now massive; ATLAS has 3,000 scientists. These are decades-long, large-scale projects.

I work at the Large Hadron Collider at CERN, in Geneva. I think it's the largest instrument on the planet. It's 27 kilometers in circumference, and it's 100 meters underground. Protons circulate in this ring, one beam in one direction and one in the other, and we collide them in certain areas. It's basically a microscope, and we're probing distances of the order of 10 to the minus 20 meters. By going to small distances, we're going back in time and studying the Big Bang. So it's analogous to what the astronomers do: they look deep into space and go back in time; we look at small distances and go back in time. We're studying the same things in complementary ways. The Large Hadron Collider, as I say, is a tunnel, 27 kilometers around.
You need transportation to get around. The blue pipes contain the magnets and the two beam pipes in which the protons circulate opposite each other. It's also at 1.8 kelvin, so I think it's the biggest refrigerator in the world. At certain points along the ring, we have detectors. This is the ATLAS detector; there is another large detector called CMS, and there are two other specialized detectors as well. You can see the scale from the people at the bottom: it's 44 meters across and 25 meters high. It's basically a camera; I forget how many megapixels.

This is the picture looking down the proton beam line in 2005 as the ATLAS detector was being constructed. The center is hollow at this point, and they're sliding the pieces in. It gives you the scale of the facility, and it took probably five to ten years to construct. Today we're recording data. We've just finished a period of data taking, and we're now in an upgrade period of two years, with no beam for that time.

This is a picture of a proton-proton event. The protons are coming in and out of the page. The yellow lines are the charged particles; they bend one way or the other depending on their charge, because there's a magnetic field, and the colored boxes around the outside represent the energy deposits. We collect roughly 200 of these per second. I'll explain a little more now. The protons are colliding every 25 nanoseconds, so we get 40 million collisions per second. The ATLAS detector then selects 100,000 per second, and we pass those to event filter computers to select 200 per second, at two megabytes per event. So we're collecting 400 megabytes per second for roughly half the year. This means we need a large computing system.
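As a back-of-the-envelope check, the trigger chain rates quoted above can be reproduced with a few lines of arithmetic. The 50% duty cycle used for the yearly volume is a deliberate oversimplification of "roughly half the year":

```python
# Sketch of the ATLAS trigger-chain rates described in the talk.
BUNCH_SPACING_NS = 25                          # collisions every 25 ns
collision_rate_hz = 1e9 / BUNCH_SPACING_NS     # -> 40 million per second

l1_accept_hz = 100_000                         # detector selects 100,000/s
event_filter_hz = 200                          # event filters keep 200/s
event_size_mb = 2                              # ~2 MB per recorded event

recorded_mb_per_s = event_filter_hz * event_size_mb   # -> 400 MB/s

# Rough yearly volume, assuming a flat 50% duty cycle (a simplification)
seconds_half_year = 365 / 2 * 86_400
petabytes_per_year = recorded_mb_per_s * 1e6 * seconds_half_year / 1e15

print(collision_rate_hz)          # 40000000.0
print(recorded_mb_per_s)          # 400
print(round(petabytes_per_year, 1))  # ~6.3 PB of raw data per year
```

Even at a 200,000-fold reduction from the collision rate, the output is petabytes per year, which is why the distributed computing system described next is needed.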
What we've constructed over the last decade is called the WLCG, the Worldwide LHC Computing Grid. It's based on a hierarchical model: there's a large center at CERN, which we call the Tier-0, and there are 10 Tier-1s distributed around the world. The data from the detector comes to the Tier-0 and then gets distributed to the 10 Tier-1s in near real time, within a few hours. Then there are about 60 other sites, which we call Tier-2s, also distributed around the world, for the creation of simulated data and for analysis. At the moment we have on the order of 140 petabytes of data, and we have a privately routed network for most of these facilities, all at 10 or 100 gig now.

Last year we found evidence for the Higgs: both the ATLAS and CMS experiments see evidence for a Higgs-like particle. The little circular picture is the same view as before, cutting along the direction of the proton beam, and the picture on the left is a more three-dimensional one. It shows a candidate particle decaying to four electron-like particles, each either an electron or a muon; the muon is just a heavier version of an electron. Here's the only science plot you'll get. It's a frequency plot: the number of candidates is on the vertical axis, and the mass of those four leptons is on the horizontal axis. We see an excess of events at about 125 proton masses. The points are data, and the colored histograms are the predictions of our simulation: the red and purple are what we would get with no Higgs, and the blue box is what we expect for the Higgs particle. We see it in this channel, we see it in other channels, and we see it in two experiments, so I think we're pretty confident that it's real. As a result, this made the mainstream media last summer, all across the world. Okay, so I'm going to change direction for the last part of my talk.
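The tiered fan-out just described can be sketched as a small tree walk. The site names, the even split of Tier-2s under Tier-1s, and the replicate-everywhere policy are illustrative only (in the real WLCG, different tiers hold different data products):

```python
# Toy model of the WLCG hierarchy: one Tier-0, 10 Tier-1s,
# ~60 Tier-2s. Names and replication policy are made up for the sketch.
from collections import defaultdict

topology = {"Tier-0:CERN": [f"Tier-1:{i}" for i in range(10)]}
for n, t1 in enumerate(topology["Tier-0:CERN"]):
    # 6 Tier-2 sites per Tier-1 in this sketch -> 60 in total
    topology[t1] = [f"Tier-2:{n}.{j}" for j in range(6)]

def replicate(dataset, site, replicas=None):
    """Record a copy at `site`, then push it to every child site."""
    if replicas is None:
        replicas = defaultdict(list)
    replicas[dataset].append(site)
    for child in topology.get(site, []):
        replicate(dataset, child, replicas)
    return replicas

copies = replicate("run-raw", "Tier-0:CERN")
print(len(copies["run-raw"]))  # 71 sites: 1 Tier-0 + 10 Tier-1 + 60 Tier-2
```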
How are we using clouds in high energy physics? One of our first uses was to help us preserve and archive high energy physics data so that we can continue to use it in the future. We're trying to use clouds to take advantage of the special computing resources, which I'll describe, that are attached to our detectors. A more common use is to help us simplify the management of in-house resources. We've used commercial clouds to some extent for exceptional needs. And one of the projects that I'm interested in, and that some of my group and colleagues are working on, is how we can use, if you want, a grid of clouds, where we can utilize both high energy physics and non-high energy physics resources to meet some of our computing needs.

I mentioned the SLAC accelerator before. There was a detector experiment there called BaBar. It studied B mesons, the B and the B-bar; that's why it's BaBar. We were allowed to use the Babar elephant as long as we didn't change its color or shape. You see an event picture; it's a lower-energy accelerator, so you see simpler events, again viewed along the direction of the beam line. This experiment stopped taking data in 2008. The problem we face is that once the experiment ends, the funding stops, but we still want to analyze data. We still have graduate students looking at some of this data, and we may want to look at it in five years. But we have nobody maintaining the system. The software will not be maintained beyond Scientific Linux 5, a Red Hat 5-based operating system, so it will only run on that old operating system, and we need to maintain it. So we had to build what we call a long-term data access system. BaBar is the first experiment that's done this; it did it a few years ago. It's basically a static cloud: they just build SL5 VMs that have access to the software and the data. It's transparent to the users, and it's been in operation for a while.
As I mentioned with the ATLAS trigger, we have racks of processors that are used in real time to filter the data. But now we're down for two years, so we're converting these HLT (high-level trigger) farms to clouds. We don't need those 50,000 cores for triggering; they'd sit idle if we didn't use them. So they're being converted to OpenStack clouds, and I think some of them are in operation today. There's actually a talk today by Tony Perez at 11, so I encourage you to go see it; I won't say any more.

We're also converting our internal resources to clouds, so that they're either transparent to the users or appear as cloud resources. The Ibex cloud is one example; Jan van Eldik and Tim Bell are in the audience, and you can find out more from them. That's one example of how we're starting to convert our facilities into cloud-based platforms.

We've also used commercial resources in a limited way: Amazon, Google, Rackspace, and others. There have been a number of reports by the STAR experiment at Brookhaven, and the Belle experiment in Japan, ATLAS, and others have used them. Typically, they've been used for exceptional demands, where we don't have enough capacity to process data in time for a particular conference. We tend to use them for low-I/O work, say, generating simulated data. We face challenges using them. Particle physicists use X.509 certificates for identity management, so we have to figure out different ways to handle identity. There's network connectivity: for example, if I want to get to Amazon from Canada, I have to go over the commodity network, which is a pain. And the costs are a bit higher than our private resources. So we're using commercial clouds, but, as I say, for exceptional needs.

The area where I have a particular interest is trying to take advantage of our distributed computing infrastructure, because particle physics is a global collaboration.
Everyone has their own resources, and we would like to use them in some kind of seamless way. We have built a grid, and it's working, but one of the ideas is to try to use it as a grid of clouds. This has also been coined "sky computing" by Kate Keahey of Chicago. So what we're trying to do is use both dedicated and non-dedicated high energy physics resources. We want to use them independent of the cloud type, we want to remove any application dependence from the actual site, and we also want to support multiple projects, as I'll show you in a minute.

I just want to show you a very simple workflow for what we're doing, and then you can see how some of my requests for OpenStack development might be useful. We have a system where a user submits a batch job to HTCondor. We have these clouds out there, but we have a customized service that we call Cloud Scheduler. The user submits a job, Cloud Scheduler discovers the job, and Cloud Scheduler then boots a virtual machine on one of the clouds. That VM registers itself with Condor, and then the job gets dispatched to that VM. That's how we're running across, I think, 10 or 12 distributed clouds right now, using a combination of Nimbus clouds, OpenStack clouds, and some commercial clouds. The VM images for some of them are pulled in from a remote repository, and the data is remote as well; for OpenStack clouds, we upload the images.

We're using this for ATLAS in production, and we're also using it in the Canadian astronomical community, where they use it for user analysis. We have a variety of clouds. We've had ties with the Nimbus group for a number of years, and that's proved to be a fruitful collaboration. So we have clouds that we've used in Victoria and Ottawa, and FutureGrid clouds in Chicago, San Diego, and Florida. In terms of OpenStack, we've used the Melbourne NeCTAR cloud and the CERN Ibex cloud.
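The submit / discover / boot / dispatch loop above can be modeled in a few lines. This is only a toy: the cloud names and capacities are invented, and the real Cloud Scheduler drives HTCondor and actual cloud APIs rather than Python lists:

```python
# Toy model of the Cloud Scheduler workflow: queued jobs trigger VM
# boots on the first cloud with free capacity, then each job is
# dispatched to its VM. All names and numbers here are illustrative.
from dataclasses import dataclass, field

@dataclass
class Cloud:
    name: str
    capacity: int                     # max VMs this cloud can host
    vms: list = field(default_factory=list)

def schedule(jobs, clouds):
    """Boot one VM per queued job on the first cloud with room."""
    placements = {}
    for job in jobs:                  # scheduler "discovers" queued jobs
        for cloud in clouds:
            if len(cloud.vms) < cloud.capacity:
                vm = f"vm-{cloud.name}-{len(cloud.vms)}"
                cloud.vms.append(vm)  # "boot" a VM on this cloud
                placements[job] = vm  # VM registers; job is dispatched
                break
    return placements

clouds = [Cloud("nimbus-victoria", 2), Cloud("openstack-cern", 3)]
placed = schedule([f"job-{i}" for i in range(4)], clouds)
print(placed["job-0"])  # vm-nimbus-victoria-0
```

Once the first cloud fills its two slots, jobs spill over to the second, which is the essence of running one job queue across many heterogeneous clouds.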
There are two CANARIE clouds that we brought online last week, and we're starting to talk with the Oxford people. So we've been in operation for about a year and a half. Astronomy has run maybe 500,000 jobs; we're over 300,000 jobs now, using 10 clouds. Yesterday we were running 1,000 simultaneous jobs. It's not large, but we're just starting to get going. The jobs are roughly 12 hours long, and they're all submitted, at least the ATLAS jobs, from CERN. We can, in principle, use Amazon and Google as well. This plot shows, for this week, the number of virtual machines we boot up as a function of the day of the week; each color is a different cloud. Some clouds go up and down depending on their internal needs, but generally we're running about 100 to 150 whole-node VMs, so on the order of 1,000 jobs. And, as I say, we've run at least nine clouds over three continents. It seems to work reasonably well, and we're hoping to scale it up further.

I guess I'm a particle physicist who's interested in computing; I see myself and my team as integrators rather than developers of cloud technology. What we would like to see, what would help our lives, is at least these key things: common authentication; maybe a centralized VM image store that we could download from, rather than having to upload to every cloud; maybe consistent metadata; and don't use Nova as your cloud name. So there are some things that could help simplify our lives.

Okay, I'm reaching the end, so I just want to summarize. The goal of high energy physics is to understand the universe. We believe we've discovered the Higgs. The next step at the LHC is to search for the particle that's the source of dark matter, and maybe we'll understand the difference between matter and antimatter with our studies. We do have an impact on society.
It's always useful to remind you that the World Wide Web was developed by particle physicists at CERN; the first two sites were CERN and SLAC. There are now accelerators in hospitals for imaging and cancer treatment. Building these large instruments generates technology development, just look at the cryogenics involved in the LHC, and we're training highly qualified people. In terms of computing, we've had large distributed systems, initially grid, and now we're looking more towards cloud. We have a very big global network that consumes a lot of bandwidth, and we're trying to use the novel computing technologies that this community is developing.

That's all I have to say. Clouds are a collaborative endeavor, and many people have helped us in our work, so I need to acknowledge them. If you want to contact me, this is my email; I'm around for today, and that's our website. Thank you.