I would like to talk to you about my CPU addiction. Back at uni, when I first started, I did a lot of work with POV-Ray, rendering and animating models, which basically meant I spent a lot of time crunching data. Then I would go into the student labs and crunch data on lots of machines, and you end up spending a lot of time automating things. After a while I sat down with a guy called Yasek, and we started building hypercube machines to do specific kinds of number crunching, and then we started doing work for the engineering department. Over a long period I did various admin and crunching jobs in different places, and over the next ten years I wound up working all over.

(Oh, is that not showing? It probably just didn't keep the settings. I've hit the wrong thing. Apply. Right, I'm going to go back a slide.)

So Yasek and I did a lot of work together, and he ended up writing that; that would have been about '94 or '95. Those guys went on to do a lot of different work. I ended up doing systems and network admin at various other places, ISPs, number crunching and so on. And I ended up here; this is my employer. In 2010, my job was to migrate from old servers to new servers. So what do you do when you've got a dozen new servers and you want to make sure they're working before you bring them in? In all of my roles before this one, the answer was to burn in your hardware: throw some jobs at it. Traditionally, RC5 cracking was the most offensive thing you could do to your CPU, burning as much CPU as possible and heat-stressing everything ("stress relief" being my favourite engineering term). So I started running dnetc, the distributed.net client, because that was the thing.
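A burn-in job like that is easy to improvise. Here's a minimal sketch of the idea in Python, a hypothetical stand-in for something like dnetc rather than anything the projects ship: spin one busy-loop worker per core for a fixed time, which is enough to load and heat-soak every core.

```python
import multiprocessing
import time

def burn(seconds: float) -> int:
    """Busy-loop for the given duration; return how many iterations ran."""
    deadline = time.monotonic() + seconds
    n = 0
    while time.monotonic() < deadline:
        n += 1
    return n

def stress_all_cores(seconds: float) -> list[int]:
    """Run one burner process per CPU core, like a crude burn-in job."""
    cores = multiprocessing.cpu_count()
    with multiprocessing.Pool(cores) as pool:
        return pool.map(burn, [seconds] * cores)

if __name__ == "__main__":
    counts = stress_all_cores(1.0)
    print(f"{len(counts)} cores, {sum(counts):,} iterations total")
```

A real burn-in would run this for hours while watching temperatures and error logs; the structure (one compute-bound worker per core) is the same whatever the workload.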
And then someone put me on to this thing I thought was pretty cool, called theSkyNet. theSkyNet is a citizen science project run by ICRAR, the International Centre for Radio Astronomy Research, and its job is to crunch data for the Square Kilometre Array effort. That was at the very end of 2010. We had a long process of software automation and acceptance testing, and the machines ran for quite some time, just crunching numbers for hours. That was kind of cool. By the end of 2011 (that's not the right screen; it doesn't matter), I had actually managed to work my way all the way up the table. I wasn't really paying attention; I just let it go. So I did a lot of crunching for those guys, I ended up winning the little competition, and I got a trip out to see the Square Kilometre Array site and a radio astronomy installation. I spoke a little bit about that last year.

What I'm keen to show you today is the system they have moved to. They used to use Nereus, which was a Java-based system; they have now moved to BOINC. (I don't know if we can see that better.) There's a bunch of things here for managing the instances. This is what I've currently got. I used to have 250 nodes, 250 cores, crunching before; now I've just got a couple of 8-core machines and some 16-core machines, a couple of desktop boxes, nothing flash. But this is so much simpler than basically any other form of citizen science you can find. With BOINC, you go to the project's site and create an account there, then you enter your credentials into the client (projects, add project, where the hell is it, there), tell it what times of day you want it to run, and it goes away and does it.
It's the easiest way to contribute CPU cycles to real science projects. The guys at ICRAR are now crunching data in the lead-up to the Square Kilometre Array. For the last couple of years they've been re-crunching and confirming data from the Parkes telescope and various other sites in Australia, comparing it with data they've previously crunched and making sure it all lines up, so that the output from the BOINC project is confirmed. Basically, they know they've got good data coming out because they can compare. Like I said, I've got a couple of machines here doing stuff, and it's a really good feeling to be able to say you're doing something useful for that project. Any questions?

"What do you do with your credit? I see a big number with 'credit' underneath it. What is that?"

A good feeling.

"Oh, it's a big number that gives you a good feeling?"

Bragging rights. Those points can be exchanged for geek cred at any LCA conference. Let me show you something. I'm doing about 80,000 work units a day; that's about 32 CPU cores. There are some folks with a whole lot more CPU running than I've got. Back when I was doing this with my work servers, I had a couple of hundred cores. But there are also folks here with a single CPU just chucking in a couple of work units a day, and they're all doing useful stuff. That's what it's about. There's nothing in it other than "I've done useful stuff". But the thing that's really, really useful is that the guys over at ICRAR have put together their C libraries and pushed them into a BOINC client, with AWS as their back end and thousands and thousands of users crunching data for them.
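As an aside on what that credit number means: BOINC's credit unit, the "cobblestone", is defined so that a machine sustaining about 1 GFLOPS earns roughly 200 credits per day. Assuming the 80,000-a-day figure above is daily credit rather than raw work units (and granted that individual projects grant credit unevenly), the implied throughput is easy to back out:

```python
# BOINC credit ("cobblestones"): roughly 200 credits per day
# for each GFLOPS of sustained throughput.
CREDITS_PER_DAY_PER_GFLOPS = 200

daily_credit = 80_000   # the figure quoted above, read as credit per day
gflops = daily_credit / CREDITS_PER_DAY_PER_GFLOPS
print(f"~{gflops:.0f} GFLOPS sustained")   # ~400 GFLOPS sustained

cores = 32
print(f"~{gflops / cores:.1f} GFLOPS per core across {cores} cores")
```

The per-core number is only a rough sanity check; claimed credit depends on the project's benchmarking, not just raw FLOPS.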
Someone was saying before that the citizen science approach isn't so reliable in the big scheme of things for crunching data. Well, every work unit is done three times, by multiple clients from multiple users, to confirm that the result is right. And BOINC is basically a platform that allows you to run a number of different projects. One of them is theSkyNet POGS, which Kim is crunching for. Another is SETI@home, which you'd probably be aware of, and which uses the BOINC platform as well. There are others that are less astronomy-related, things like Folding@home, which uses spare CPU cycles to do analysis of protein folding candidates and all those sorts of things. Question?

"I have two completely contradictory questions. The first one is: how do we help researchers package their research in a format that allows them to distribute the effort like this? Does BOINC have some kind of program for helping to on-board researchers?"

Okay, so most of the projects are done by specific research groups. They have a programmer; they're research scientists who do this. They obtain the data, package it up, and create the client that consumes the content, crunches it and sends it back again.

"Right. So, as we heard in (the name totally escapes me) Nick's talk, we've seen that if we can provide operational technical expertise, we can help researchers massively accelerate what they can achieve. So how can we distribute that effort so that you don't need a whole research team with a coder?"

This is hard, because researchers want to do things their way; they have a way of working. The main issue is that academics work in a box, and the community outside doesn't see their work until they release. They release the paper, and no one in the community sees the data until that paper has been released.
It's all about being first to publish, or whatever the term is. The academic world wants to say: these are the results; these are the meaningful things this data points towards. So the data won't be released until some time after that paper has come out. I've spoken to various researchers about this, and it's kind of like: if we could see the data, we could look at other angles on it. And they go, yes, but you're not qualified to understand it or look at it, so we're not going to give it to you. It's one of those things where you could do more with more eyeballs on the problem, but it's all about the qualifications and the funding tied to releasing the paper and the results. So you can understand the dichotomy there: those guys want to release the results, and they won't share the data until some time after.

"And of course there is a lot of work being done with open source tech, as Nick was saying. The university has a department that provides and supports HPC computing internally, so there are plenty of us working within academia already supporting that kind of stuff."

Yeah, exactly. The other angle is, you saw Sage on the screen there before: better-integrated tools that smooth the workflow make these guys' lives easier. That's perhaps where we can really push. If our skills are around that organisation and the software tools, then these guys, for want of a better description, need to write a spec for the thing they think they need, then go and play with that and iterate as much as they can.

"And there are some projects where the data is effectively embargoed for the team who paid for the time on the instrument, for a period of six or twelve months, so the researchers paying for it get exclusive use of the data for that period, after which it is automatically released to the rest of the community."
"That's not just an astronomy thing; that happens more widely as well, at which point it's: go for it, take it away. Some of it has embargoes for much longer. Some of the guys I was talking to said they were handed a reel of tape from Parkes that's 20 years old and has never been seen outside the community before."

Yeah, that's unfortunate. I've been asked that too: here's the tape, can you get the data off it for us? Reel tape, right, quarter inch, and what drive do you own? That's all right, we got the data off, but how often do you get asked that question: can you pull data off tape for us?

"All right, thanks Kim. Just before you finish: we had somebody who wanted to pull up a website but they don't have a laptop. Do you mind if they do that on yours? Did you want to come down and present on that?"

Hi guys, my name's William, and I'm going to show you the IPython notebook. Someone just showed you Sage, which is very similar. You just type in 'ipython'. This is it here. So let me show some examples. This is what it looks like. It's mostly Python, but you can also run other languages with it. It's got Matplotlib and NumPy and SciPy all integrated inside it. You just type code, execute it, and you see the results straight away. It saves the notebook as a JSON object, which is really awesome. Here's an example collection of notebooks. You can run bash commands inside it; there's one importing NumPy. They really need developers, so if you're a developer and want something to hack on, have a look. There you go. I hope you found that interesting.

Thanks William, I appreciate it. So, if we don't have any other last-minute additions, addenda or errata... No? General comments? Keep it G-rated.

Thank you. This applies somewhat to science in general and research in general, but specifically to astrophysics.
There's a wide variety of tools out there, some of which are packaged with a permissive licence and can easily be distributed. With others, such as Zendu and FTOOLS, you still need to jump through rather bizarre hoops to get the source installed on a system and out to different users. What can be done to make life easier, whether you're writing software as an academic or, at the opposite end, working on Linux or other distributions, to properly package new academic software, specifically for astronomy, and get it into potential users' and researchers' hands a lot more easily?

I can actually speak a little to that. Last year, as part of the miniconf, I did a presentation on a thing called Distro Astro, which is exactly that: an astronomy-focused, Ubuntu-based distribution. We still need to do a little bit of work to get it mirrored properly through AARNet, but hopefully we should be able to do that. It's basically an ISO you can spin up with a lot of things pre-installed (not everything, but a lot) for both amateur and professional astronomers. Or you can just load it up in a VM; one of the UK astronomers is actually packaging up a VM, a VMDK file, that you can run out of the box as well.

Part of what I want to try to accomplish with this forum is that we, as the doers and the engineers, can actually connect and interact with the academics and say: we have this talent pool, and we may have some spare cycles we can help you with. If you have stuff that you need done by a bunch of people who can do stuff, then we can do stuff, and we would like to do stuff, and maybe our stuff can help your stuff. So that's one example of that. But absolutely, I think: reach out to the academic community. And perhaps you might have something to add to this as well.
And maybe there are limitations around the publish-or-perish data culture that are a barrier to that happening. Do you guys want to talk to that? Yes? No?

"I mean, just to say, the move is towards open data. Proprietary periods are a thing, and for good reason: you want to be the first to announce that you've found this really great thing. But the archives are all free; even with NASA, the data from all the telescopes is normally free in the end. In terms of giving people things to do, that's something that takes a lot more thought from our side, and we never really have time to sit down and think about what we'd like people to do. There's a long list of things we need to do, but trying to work out exactly how to do it, and who should do what in what language, is the problem. Does that make sense?"

Well, maybe that's something you can take back to the community as well: rather than necessarily reinventing the wheel or building everything from scratch yourselves, if you have parts of projects that can be farmed out, turned into an open source project before release (something that doesn't give the data away, but is about building the tool set), you can reach out to the people here. The Twitter stream is going to be a great place to find people who can build all those sorts of things. It's not always a free labour pool, but some people will certainly be keen to participate and contribute in their spare time. Sorry, Kim.

Part of my role at the uni is that we work next to the media section and marketing and all that fun stuff, and there often are outreach programs for various things. While I didn't participate in it, there was recently a NASA challenge for doing...
There was a whole bunch of challenges for software, for making useful apps with the data they had, and they had a lot of data to play with and share with people. So I'm sure that universities, engaging with people and with the wider community, could put those kinds of challenges together. The outreach is good for the uni, it's good for the community in general if they can participate, and it gives direction to how you want to do it. You see where I'm going with this: if you set the goals for those things, your engagement people can then deal with the workload of engaging with people and making those things happen.

OK, so I think we might be at the point where we can wrap this up. I wanted to thank all of you for coming, and especially our presenters, both the main-program presenters and everyone who got up and talked about their various things. I'm declaring the Astronomy Miniconf officially closed. Thank you all, and see you next year.