Right, so I'm going to be talking to you today about ShapePipe, which is a modular weak-lensing processing and analysis pipeline. I imagine that most of the people in the room understand each of those words individually, but might be struggling to understand what that combination of words means in total. So by the end of this talk I'm very much hoping, at the very minimum, that you will understand the title, and could therefore say: okay, I learned nothing else today, but I understood the title of this talk. You'll also notice that I've gone for a very dark background, so I apologise to the people in the room if they can't see it very clearly. There is a reason for this, and I'm hoping it will become apparent as the talk progresses. But before we get into the juicy stuff, I have a little prelude to the presentation. When preparing talks for EuroPython, the speakers were asked to read through some suggestions and tips on how to prepare their presentation. One tip in particular stood out to me, which is to tell a story. Now, I think this is a particularly useful tip for any presentation in any context, because you're trying to reward the attention that people are investing in your presentation, and you want to hold that attention and make things interesting. So I appreciate this sentiment. Also, being Irish and in Ireland at the moment: Ireland has a long tradition of storytelling, so this seems like the right time and place. Okay, so what kind of stories can we tell? Well, if you're a Python developer, you might want to go back in time and talk about the very beginning, when Guido started putting together this amazing language. If you were an Irish astronomer, or a historian of astronomy, you might want to go back another hundred years and talk about someone like Agnes Mary Clerke, who published various books about astronomy.
But if you're an even deeper historian and want to go back all the way to the Stone Age, you might go back another 5,000 years and talk about Newgrange, and how Neolithic farmers were able to build structures that capture the sunlight of the winter solstice, which is incredibly impressive for the time. But as cosmologists, when we start talking about history we go all the way back to the beginning, 13.8 billion years ago, and we start with the Big Bang. Now, obviously, in 40 minutes I'm not going to tell you the entire history of the universe; we don't have enough time. But I'm going to give you a little taste of one little part of the universe that we're trying to explore and better understand, in order to piece together that full history. So that's the story for today, and that's what I'm going to try to tell you. Okay, so now we can come back to the actual presentation. As I mentioned, there was a reason for this dark theme, and that's because one of the main topics we're going to be talking about today is dark matter. What I did say at the beginning is that what I wanted by the end was for you to understand the actual title of this presentation, so I'm hoping that everything will become illuminated by the end, and a lot more clear. Okay, so let's get started. I'm going to divide this talk into two halves. In the first half I'm going to really focus on the science, motivating what we're trying to do, and in the second half I'm going to talk more about the package and the Python side of things. So, weak lensing. This is in the title, and you need to understand it to understand what we're talking about, so I'm going to give you a little bit of cosmological context. Our current model for the universe in terms of energy density, which is essentially everything that makes up the universe, has this kind of distribution: over two-thirds of the energy density of the universe is in a form we call dark energy.
Another quarter of the universe is dark matter, and then only 5% of the universe is what we call ordinary matter. Now, dark energy is sublimely fascinating; it's what drives the accelerated expansion of the universe. But it's a little bit tricky to talk about in 40 minutes, and I don't want to derail this conversation too much, so we're going to focus on the matter side of things and really just look at this matter distribution. So what we have are ordinary matter and dark matter, and a lot of this talk is really focused on dark matter, so I want to make sure the picture is clear of what those two things are. When we say ordinary matter, as cosmologists, what we're talking about are galaxies, stars, planets, all the way down to Python programmers and the computers you're using; all of that constitutes ordinary, baryonic matter. And the things we know about ordinary baryonic matter are, first, that it interacts gravitationally. We know that if you pick up your laptop and drop it, it's going to fall, and you're going to be very sad. We also know that it interacts electromagnetically. Now, there are many ways of interpreting this, but when we're talking about big astrophysical structures like stars and galaxies, what we really mean is that they can emit light. The reason we can see stars and galaxies in the universe is that they're emitting photons, which are electromagnetic, that we can then detect and observe. By contrast, what's different about dark matter is that, first, we don't really know what it's made of. There are plenty of theories you could read about, but there isn't any consensus and no really compelling answer to that question as it stands. But we do know that it interacts gravitationally, and this is what we're going to talk about a lot today. And the reason we call it dark matter is precisely because it does not emit light.
It does not have an electromagnetic interaction, or if it does, it is very, very weak, such that we cannot observe it directly. So the question we want to ask ourselves is: how can we observe dark matter, given that it does not emit light? To answer that question, we're going to take another little tangent and talk for just a few minutes about general relativity. I'm pretty sure every single person in the room is familiar with the gentleman on the right, and I'm pretty sure everyone has heard that term at least once before. So whether you've fully immersed yourself in the theory itself, or have only heard of it in passing, you're probably aware of its existence. We're not going to do a whole course on general relativity, and don't worry, there's no test at the end either. But what I did want to show are these kinds of diagrams, which are very useful conceptual images for understanding the one aspect of general relativity that's really pertinent to this presentation, which is the distortion of spacetime. So from this theory, we predict that large structures like the Earth will distort the spacetime around them, and a larger structure, like the Sun, distorts spacetime even more. The impact this has is that it's what actually produces gravitational attraction, and what allows objects like the Earth to stay in orbit around the Sun. Again, this is just a conceptual image, and what you'll have to try to do, and it's not easy, is to project this in your head into three dimensions: this effect, this distortion, is happening in every possible direction. Now, when we're talking about spacetime, this is really the space that you inhabit and exist in. So there are some interesting consequences of this distortion of spacetime.
For example, if we were to take a star that was behind the Sun, the photons being emitted by this star are going to travel out towards us, and in doing so, the path that the light travels is going to be distorted. It's not that the light itself is being distorted; the light is simply travelling in a straight line, but that line is curved through the space it's travelling through. This is a phenomenally interesting thing, and the consequence is that, when observing the star, it will appear to be at a different position than it physically is. So the actual star is here, but from the Earth we look at it as if the light had come to us in a completely straight line in flat space, so it looks like the star is actually at this other position. Now, if we know the actual mass of the Sun, and we have a pretty good estimate of that, we can ask from general relativity: how much distortion would we expect from an object of this mass? Then you can measure the angle and say: we predict that this star's apparent position would have shifted by this amount if the Sun has this mass. This is actually an experiment that was carried out in 1919 by Eddington and Dyson. It was one of the first real tests of general relativity, and one of the first pieces of confirmation showing that this is a very good description of gravity. Right, so that's very interesting. Now, what we can ask ourselves is: what if we were to try to do the opposite? What if we saw this shift but didn't know the mass of the object? In other words, the inverse problem. In a very similar case, we have a big blob of dark matter that we cannot observe, so we cannot directly measure the mass of this object. But we do have some background objects that are emitting light, which is going to be distorted as it travels towards us.
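To put a number on the deflection mentioned above: for light passing a point mass $M$ at impact parameter $b$, general relativity predicts a deflection angle given by the standard textbook result (this is general background, not something specific to ShapePipe):

```latex
\alpha = \frac{4GM}{c^{2}b}
```

For light grazing the limb of the Sun this works out to about 1.75 arcseconds, which is the shift that the 1919 Eddington and Dyson expedition set out to measure.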
Right, so the impact will be the same, but it can be even more dramatic, in the sense that this distortion can produce multiple images of the background source, because we see it from multiple angles relative to where it is actually physically located. It will also squish the objects, what we call shearing them, in a direction that is tangential to the dark matter potential. So basically the image gets magnified, squished, and can appear in multiple locations, and so on. And the thing is, if we knew what the properties of this galaxy were, and we could measure how much it has been distorted, then we could infer how much matter would be needed in order to cause that distortion, according to general relativity. Therefore we can map out how much dark matter needs to be there in order to explain what we're observing. This process is called gravitational lensing, because this object here behaves exactly as if you had a giant magnifying glass floating in space: the way it distorts the light paths makes the whole thing act like a lensed optical system. All right, so hopefully I haven't lost anyone so far. I think this is pretty straightforward, right? Okay, so let's go a little bit further. This is just a graphic representation of what I just explained: we have a background galaxy and a foreground structure, in this case a cluster of galaxies, which are the largest gravitationally bound systems in the universe. And as we saw in our little pie chart, we expect that about 80% of the matter in that galaxy cluster is actually dark matter rather than ordinary baryonic matter. So we can observe that there is a cluster there from the baryonic matter, and then we can infer the amount of dark matter from the distortion it causes to background sources. Okay?
In the case where we can directly observe this distortion, the effect is called strong gravitational lensing, and the reason is simply that it's a strong effect. What you're seeing in this image here is actually one single galaxy being distorted into an arc all the way around some foreground dark matter. And this is a real image; this is not an illustration, this is not a cartoon. So this is what we observe in strong gravitational lensing: we can have multiple images, like in the diagram I showed before; we can have arcs, which we're going to look at in just a second; and in some very dramatic and beautiful cases, we can have what's called an Einstein ring, like this one. Just to tie in with yesterday's fantastic keynote presentation by Patrick Kavanagh about the JWST first deep field: this is the image that he showed, which was released in the press recently. I assume most of you have seen it by now; it's a really absolutely stunning image. But I just want to highlight that if we zoom in on that region there, what you can see is an absolutely astonishing arc, a strong-lens system, in real data that we just got. It's absolutely incredible and really breathtaking. So anyway, now everybody in the room is familiar with strong gravitational lensing and what it looks like, and it's very cool, right? But as you can see, the section title is weak gravitational lensing. So what are we doing differently here? Well, the process is exactly the same. We're talking about the exact same physics as in this strong-lensing system, but we're thinking about it in a slightly more statistical way. First, we're not going to focus on one galaxy and one blob of dark matter and the beautiful arcs and things it can create; we're going to think about all the galaxies that we can observe.
We're going to think about the full line of sight: the dark matter, the ordinary matter, and everything the light interacts with. So we think about some distant galaxy, the entire path the light has to travel between where it was emitted and where we observe it, and all the dark matter it has to pass on the way. Okay? Secondly, the distortion we're going to observe is a roughly 1% effect. The thing is, if I were to take an original galaxy and the weakly lensed galaxy and put the two side by side, you would say: oh, that's just the same picture twice. So for a one-to-one system this isn't very interesting, because honestly, with the statistics and the noise we would get, it would be essentially impossible to measure any effect whatsoever. However, if we measure this effect on millions or billions of galaxies, the statistic starts to pop out, particularly because, as I mentioned previously, this effect is tangential to the source of the lensing: the galaxies get squished tangentially to where the lens is. So we see these little ticks on this plot here. What they're showing is the direction in which each galaxy has been squished with respect to its original shape, and in blue in the background is some simulated dark matter that we've put into the simulation. What you can see is that those little ticks trace out that distribution of dark matter pretty well. Then finally, what we want to do is this: now that we can measure the shapes of all the galaxies in the universe, very accurately and with lots and lots of statistics, we can use this to constrain our cosmological parameters. So this is a plot of Omega_m, the matter density of the universe, against sigma_8, which we don't need to talk about right now. What it allows us to do is fix the values in that little pie chart I showed at the beginning.
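To see why the statistics matter here, this is a toy numpy sketch (my own illustration, not ShapePipe code): the intrinsic ellipticity scatter of galaxies is around 0.3, roughly thirty times the ~1% shear signal, so a single galaxy tells you nothing, but averaging a million of them beats the noise down far below the signal.

```python
import numpy as np

rng = np.random.default_rng(42)
true_shear = 0.01                  # the ~1% coherent weak-lensing distortion
n_gal = 1_000_000

# Each observed ellipticity = large random intrinsic shape + tiny shear.
intrinsic = rng.normal(0.0, 0.3, n_gal)   # intrinsic scatter ~0.3
observed = intrinsic + true_shear

# One galaxy is hopeless (noise ~30x the signal), but the mean of a
# million galaxies has noise ~0.3 / sqrt(1e6) = 0.0003, well below 0.01.
print(observed.mean())   # close to 0.01
```

This is exactly the sense in which weak lensing is "statistical": the signal only emerges from large ensembles of galaxy shapes.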
So when we want to know what fraction of the universe is made up of dark matter, this is how we do it, and this weak-lensing process is what allows us to do it. Weak lensing is a way that we can observe dark matter and also infer the actual matter content of the universe. So have I lost anyone so far? Fantastic. Okay, so we go on. Now, to add an extra word to the title: we know what weak lensing means, and I think everyone knows what a pipeline is, so if I put those two together I can explain what a weak-lensing pipeline is. What we're going to do is start with raw images obtained by a telescope, either a telescope on the Earth or one in space, observing lots of galaxies over lots of fields of the sky. Then we need to reduce those images; mask them; detect sources in them; select sources; estimate the point spread function (we'll come to that in just a second); validate that point spread function; measure the shapes of the galaxies; calibrate those shape measurements; and then finally we can do our cosmology and find out how much dark matter there is in the universe. Sounds simple. Lots of steps, so let's look at the steps in a little more detail. Obviously I'm not going to take up your entire afternoon going through every detail, but I want to give you an idea of the challenges we're facing in this problem. When we get the original raw data from the telescope, it often looks something like this. As you saw from Patrick's fantastic talk, there are a lot of beautiful astronomical images out there, but when they come from the telescope they don't usually look like that; they look something like this. So if you start out as an astrophysics PhD student and your supervisor sends you that, you're a little bit dismayed, thinking: oh, this isn't what I signed up for. But fortunately we have ways of dealing with this.
We can correct for the bias, we can correct for dark current, we can correct for flat fielding, we can remove cosmic rays that land on the detector, and things start to look a little more familiar. But then, when we're trying to do the statistics, we have to take account of bad pixels, diffraction spikes, and bright stars that might saturate the image. So we want to mask those out so they don't affect our measurements, and then we end up with something that looks a bit more familiar. Okay, so now that we've done that, what you can see here are a bunch of white dots. Again, when people talk about stars and galaxies, they usually show you pictures of objects that are pretty nearby in astronomical terms, where you have beautifully resolved structure: clouds and rings and arcs and all sorts of things. But when we're talking about distant galaxies, and for weak lensing we need distant objects, because we need foreground dark matter for the light to travel through in order to measure this effect, what we usually get are just blobs. And the tricky part is to figure out which of the blobs are stars and which are galaxies. We have ways of dealing with this in principle: stars tend to be unresolved point sources at these distances, while galaxies tend to be somewhat elongated, and there are also colour properties and other discriminators. So now that we've identified the stars and identified the galaxies, what's the next problem we have to deal with? The next thing is what's called the point spread function. Some of you in certain domains might be familiar with this as the instrumental response, if you've ever worked in imaging. If not, don't worry, I'm going to explain the whole process.
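The reduction steps just listed follow a standard recipe, which can be sketched in a few lines of numpy (a toy illustration with hypothetical array names, not the actual ShapePipe code):

```python
import numpy as np

def reduce_exposure(raw, bias, dark, flat, exptime):
    """Standard CCD calibration sketch: subtract the readout bias,
    subtract the dark current scaled by exposure time, and divide by
    the normalised flat field to correct pixel-to-pixel response.

    raw, bias, dark, flat: 2-D arrays of the same shape;
    dark is in counts per second, exptime in seconds.
    """
    science = raw - bias                  # remove readout bias
    science = science - dark * exptime    # remove accumulated dark current
    flat_norm = flat / np.median(flat)    # normalise the flat field
    return science / flat_norm            # correct pixel response

# Toy data: a uniform 110-count exposure with known calibration frames.
raw = np.full((4, 4), 110.0)
bias = np.full((4, 4), 5.0)
dark = np.full((4, 4), 0.1)   # counts per second
flat = np.full((4, 4), 2.0)
out = reduce_exposure(raw, bias, dark, flat, exptime=50.0)
print(out[0, 0])   # 110 - 5 - 0.1*50 = 100.0
```

Cosmic-ray rejection and masking are separate steps on top of this, but the arithmetic above is the core of going from a raw frame to a science frame.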
So firstly, if we were to take a telescope, point it at a star, and take a picture, what would we expect that to look like on our detector? This is my toy detector here. If we were looking at a star that's really far away, like I said, we would expect it to be a single point source, so what we would hope to get on our detector is a single pixel lighting up, and we'd say: cool, we've got a star. That's not what you see. You see this. And the reason is that light acts like a wave and a telescope is an aperture: when waves enter an aperture, you get diffraction, and that creates an interference pattern. This interference pattern is what we call an Airy disk, and it's essentially described by a Bessel function. If what we actually observed were this, we'd be fine, because we know what a Bessel function looks like; we can describe it completely analytically, so we'd know exactly how much our object had been distorted, no problem. But what we actually observe is this. And this is because telescopes are not perfect: we have imperfections in the mirrors and in the entire optical system. For a ground-based observatory, we also have to deal with the atmosphere, because the light has to travel through all the gas and clouds and so on. And there are various other effects: for a space-based telescope there's jitter and things like that, which also introduce distortions. Globally, what you end up with is a kind of Gaussian blur of your single point, so our little pixel ends up as a big blob. Now, as I mentioned, what we're trying to do is measure the shapes of galaxies, and we're trying to measure a 1% effect. So obviously, if you're trying to measure a 1% effect and this is what's happening to your images, you know you're going to have some issues.
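The Airy pattern mentioned above has a simple closed form: the normalised intensity is $(2 J_1(x)/x)^2$, where $J_1$ is the first-order Bessel function of the first kind. A small self-contained sketch (my own illustration, computing $J_1$ from its integral representation so no extra libraries are needed):

```python
import numpy as np

def bessel_j1(x, n=4001):
    """First-order Bessel function of the first kind, evaluated from the
    integral representation J1(x) = (1/pi) * int_0^pi cos(t - x sin t) dt
    with a simple trapezoidal rule."""
    t = np.linspace(0.0, np.pi, n)
    f = np.cos(t - x * np.sin(t))
    h = t[1] - t[0]
    return (f[:-1] + f[1:]).sum() * h / 2.0 / np.pi

def airy_intensity(x):
    """Normalised Airy-disk intensity (2 J1(x) / x)^2; equals 1 at x = 0,
    where x is a dimensionless radial coordinate on the detector."""
    if x == 0.0:
        return 1.0
    return (2.0 * bessel_j1(x) / x) ** 2

print(airy_intensity(0.0))      # 1.0: the central peak
print(airy_intensity(3.8317))   # ~0: the first dark ring (first zero of J1)
```

This is the "perfect telescope" pattern; the point of the slide is that real optics, atmosphere, and jitter smear it into something messier that has to be modelled empirically.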
So what we need to do is correct for this. As I mentioned, if we point the telescope at a single star, we know that star is supposed to be a single pixel in this toy diagram, so if we measure this blob, we know how much the image has been distorted at that point. And if we did this for all the stars in our field, we could work out how much distortion there is across the sky, and therefore estimate the distortion at the positions of the galaxies. So that's our next step: take the point spread function measured at the positions of the stars, interpolate it to the position of each galaxy, and then, when we measure the shape of that galaxy, correct for the point spread function as part of the shape measurement. Okay? Everybody still on board? Fantastic. So this is pretty much the process we're dealing with in a weak-lensing pipeline: starting with raw images and going all the way to calibrated shape measurements, from which we can do the statistics we need to determine how much dark matter there is in the universe. Okay, so here's a cheat sheet; I'm trying to make life easy for you. This is all you need to remember from the first half of the presentation. Matter, and especially dark matter, curves spacetime. That curvature of spacetime distorts the path that light travels. That deflection of light creates distortions in images, which results in a kind of squishing effect. And therefore, if we can measure the shapes of galaxies very accurately, we can infer how much distortion was induced, and from that how much matter is there. But it's a challenging process and there are a lot of steps involved. Okay? So hopefully that was clear for everyone, and we can move on to the second half of the presentation, where I'm going to talk to you a bit about the actual ShapePipe package.
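The star-to-galaxy interpolation step described above can be sketched very simply. Real pipelines fit polynomial or more sophisticated spatial models of the PSF across the field; this toy version (my own illustration, not ShapePipe's method) uses inverse-distance weighting just to show the idea of carrying a PSF property from star positions to a galaxy position:

```python
import numpy as np

def interpolate_psf(star_pos, star_psf, gal_pos, power=2.0):
    """Inverse-distance-weighted interpolation of a PSF property
    (e.g. one ellipticity component) measured at star positions
    to the position of a galaxy. Toy sketch only.

    star_pos: (N, 2) array of star sky positions
    star_psf: (N,) array of the PSF property at each star
    gal_pos:  (2,) position of the galaxy
    """
    d = np.linalg.norm(star_pos - gal_pos, axis=1)
    if np.any(d == 0):                 # galaxy coincides with a star
        return star_psf[np.argmin(d)]
    w = 1.0 / d ** power               # nearer stars count for more
    return np.sum(w * star_psf) / np.sum(w)

stars = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
e1 = np.array([0.01, 0.03, 0.01, 0.03])    # PSF ellipticity at each star
val = interpolate_psf(stars, e1, np.array([0.5, 0.5]))
print(val)   # symmetric configuration: the mean, ~0.02
```

The interpolated PSF at the galaxy position is then what gets deconvolved or forward-modelled during the shape measurement.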
So my hope is that the subtitle is now clear for everyone, and I just have to explain the top part. Okay, so why did we start working on ShapePipe? I and my group are heavily involved in a space telescope called Euclid. Euclid is going to be launched hopefully next year, if not in a couple of years, and it aims to study dark matter, dark energy, and gravity very accurately. Hopefully, over the course of at least the first half of this presentation, the interaction between those things has become roughly clear to you. One of the big focuses for Euclid is weak lensing: the telescope has been designed to measure the shapes of galaxies extremely accurately, so that we can pin down this 1% distortion to the highest possible precision. But as I said, Euclid hasn't launched yet, so we don't have that data now. What we do have is data from the Ultraviolet Near-Infrared Optical Northern Survey, or UNIONS, which is a collaboration between CFHT, Pan-STARRS, and Subaru, all observatories in Hawaii, which are providing what we call ground-based photometry for Euclid. Basically, once these surveys are completed, we're going to combine all of the data so that we have a really comprehensive view of the electromagnetic spectrum in the northern hemisphere: Euclid will mainly cover the infrared bands, and these surveys cover the optical bands. Okay, so that's the motivation for what we're working on. We then asked ourselves: could we build a shape measurement pipeline for UNIONS, since we have that data now, so that we're better prepared for what we'll have to deal with when Euclid actually flies? So, what did we want from a pipeline?
Well, we wanted it to be modular, because this field is moving quickly: a lot of the processing techniques that are on the cutting edge today will be old news two weeks from now, and we want the flexibility to change things up. Also, we don't know the answer a priori. This isn't user data or something like that, where we can go and check whether it's working or not; we're trying to understand the universe, so sometimes we have to try a few things out to see what works best. We're also not professional software developers. A lot of researchers and students come in, and their first day of Python is the first day of their PhD. We can't set the bar too high in terms of the complexity of developing tools, so we want it to be easy to develop for. It also needs to be fast enough, in the sense that we can't spend the rest of our lives processing a single image; we need results on a reasonable timescale. And finally, we want it to be robust: we want people to trust the results we publish from this pipeline, so that we can say it is actually working. Myself, being a strong advocate for Python, said: well, we're doing this in Python. I feel that Python satisfies all of those criteria, and that's why we went this way. Like any package, nothing comes out of the ether; we need people. The core team developing ShapePipe were myself, Martin Kilbinger, and Axel Guinot, a former PhD student from our lab. But we've also benefited from a lot of interns, PhD students, staff members, and postdocs who have come in and out of the department and contributed in different ways: some directly in the code, some suggesting ideas, some simply running the code, and some just looking at the documentation and saying, none of this makes sense, you guys need to do better. But every contribution always helps.
What I wanted to highlight in this slide is that we have a diverse team. We can always do better on diversity, but it is an international team, it's an open-source project, and we welcome contributions from around the world. Okay, so if you're interested, and like I said, I'm not trying to sell you anything here, because I don't imagine many people in the room are in the market for a weak-lensing pipeline, but if you're just interested in what we're up to, you can find the package on GitHub. I have written some extensive documentation, which is mainly for ourselves, because I don't know about you, but I find that when I go back to code I wrote a year ago, I forget what half of it does, and I'm often surprised that I even wrote some piece of it: oh, that's cool, I don't remember doing that. So it's very helpful to have documentation. And again, we have students and others coming in, and we need to get them up to speed as fast as possible, so having something they can go straight to, read, and understand is very helpful. The other thing we've done is publish a paper describing the software. This is not common practice, I would say, but it's definitely something I advocate for in academia: publishing software papers, because we have a tendency to publish the results from software without giving any real description of the software behind those results. There's a lot of effort, time, and energy that goes into the software itself, and it deserves credit not only for the results it produces, but in its own right. So again, if you happen to be interested in reading a software paper, you're very welcome to go check it out. Okay, so in terms of architecture, the way we designed ShapePipe was the following.
I divided it into three sub-packages: one called Pipeline, one called Modules (following the usual Python naming conventions), and one called Utilities. In Pipeline, we have the meat behind the pipeline side of things: it handles all of the arguments, configuration, file handling, job handling, exception handling, dependency handling, logging, all of that sort of stuff. In Modules, we have all of those processing steps I mentioned for getting from the images down to the shapes. And Utilities contains some extra tools that we find useful for various things. Okay, so I'm just going to talk in a tiny bit of detail about some aspects of the package, because I don't have time to go through everything, just to mention some of the things in Python that have made the job possible for us. In terms of job handling, we've implemented joblib and MPI for Python (mpi4py), because most of this processing can be done in an embarrassingly parallel way: we can take individual images, process them on separate CPUs, and then bring the data back together to do the statistics. One of the things that was really helpful for us was being able to do that distribution on big clusters in an easy way. So what I've done is set up ShapePipe so that this is all handled in the core, so that the students doing module development don't have to worry about it. They simply write their module, whatever it does, taking in A and putting out B, and on the other side ShapePipe handles the distribution and management of all those tasks. Right, so this is, I think, the only bit of Python I've actually put in the presentation. For some hardcore developers it might seem incredibly simplistic; for some people, maybe it's something new.
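The slide's code isn't captured in the transcript, but the module-runner pattern being described can be sketched as a small decorator like this (a hypothetical reconstruction: the names and metadata fields are mine, not ShapePipe's actual API):

```python
# A registry the pipeline core can use to discover modules by name.
MODULE_REGISTRY = {}

def module_runner(version="1.0", input_module=None):
    """Sketch of a module-runner decorator: it sticks metadata onto the
    function as attributes (functions are objects in Python, so they can
    carry attributes) and registers it so the pipeline can chain modules
    in any configured order."""
    def decorator(func):
        func.version = version
        func.input_module = input_module
        MODULE_REGISTRY[func.__name__] = func
        return func
    return decorator

@module_runner(version="1.1")
def mask_runner(data):
    return f"masked({data})"

@module_runner(input_module="mask_runner")
def shape_runner(data):
    return f"shapes({data})"

# The pipeline core resolves and runs modules in the order the
# configuration file asks for; module authors never see this part.
result = "image"
for name in ("mask_runner", "shape_runner"):
    result = MODULE_REGISTRY[name](result)
print(result)   # shapes(masked(image))
```

A module author only writes the plain function and adds the decorator; ordering, configuration, and distribution stay in the core.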
But this is how I managed the module development, to try to make things easy, especially for students. What I did was design a module-runner decorator that can be added to essentially any function in the code. You simply write a function that does X, takes in some inputs, and provides some outputs; all you need to do is stick this decorator on, and then ShapePipe can integrate that function into the pipeline and run it in any order you want, with any configuration inputs you want, and so on. All I wanted to highlight here is that it was really only a few lines of code that made this possible, which is one of the beauties of Python: the flexibility of everything being an object, being able to pass objects to objects, and the fact that even functions can have attributes you didn't know were there. So, stepping back to look at which part of the pipeline ShapePipe covers: we assume that somebody has already handled the raw image processing, which is pretty standard in most astrophysical projects, and we assume that most of the cosmology is going to be done separately, because that step is a lot of work in itself. What we handle are all the steps from the reduced images to the calibrated shape measurements, so all of those steps I showed you. And we use a lot of community packages. I guess most of you will be familiar with these (oh, that's a bit dark): Astropy is a huge package for the astro community, and all of these other standard packages are exceptionally helpful for various tasks. Now, for robustness, one of the things that was very important for us was reproducibility of the results.
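One concrete way to get that reproducibility, which the next part discusses, is a Conda environment file that pins both the Python and non-Python dependencies. This is an illustrative, made-up example, not the actual ShapePipe environment file:

```yaml
# environment.yml -- illustrative only, not the real ShapePipe file
name: shapepipe
channels:
  - conda-forge
dependencies:
  - python=3.9
  - numpy=1.21
  - astropy=5.0
  - cmake            # non-Python build tools can be pinned too
  - compilers        # conda-forge meta-package providing C/C++ compilers
```

Running `conda env create -f environment.yml` and then `conda activate shapepipe` gives every user the same pinned toolchain, isolated from whatever is already installed on their system.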
And the thing is, when you have lots of dependencies on other Python packages, and also on non-Python packages, which we have to use, managing the versions of all of those things is really important, so that when somebody runs the pipeline again with the same data, they get the same results out. I found that conda was exceptionally helpful for this. Now, of course, if you were doing something for deployment and production, you would probably want some kind of Docker container or similar, but for on-the-fly, human-level development, having an environment with its own bin directory, where we can install different packages, and which can be activated, deactivated, and kept separate from packages that might already exist on the user's system, was really, really helpful. I assume this is probably common knowledge for everyone, but being able to install your own version of CMake, or even a compiler, into that bin directory, meaning you can compile all of these packages in exactly the same way in that environment, was extremely helpful. Okay, so now that we've talked about what ShapePipe can do, what have we actually done with it? Well, we've run it on some UNIONS data: so far, 1,700 square degrees of the sky in the r band, which is just an optical photometric band, and we've measured the shapes of 40 million galaxies. We've done a detailed analysis of the point spread function and found very good levels of systematics. What you're seeing in this figure, which is probably not very familiar to most people, is what's called a convergence map: effectively, the cumulative mass along the line of sight. What we've done is stack that around the positions of known clusters from Planck.
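One small ingredient of the reproducibility described above can be sketched in Python: recording the exact versions of every dependency alongside the results, so that a later rerun can be checked against the same environment. This is an illustrative sketch only, not ShapePipe's actual mechanism; `environment_snapshot` is an invented helper.

```python
# Sketch: capture the interpreter and dependency versions so a rerun
# can verify it is using the same environment (illustrative only).
import platform
from importlib import metadata

def environment_snapshot(packages):
    snap = {"python": platform.python_version()}
    for name in packages:
        try:
            snap[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            snap[name] = "not installed"
    return snap

print(environment_snapshot(["numpy", "astropy"]))
```

In practice a conda `environment.yml` pins the same information declaratively, including non-Python tools like CMake and compilers, which a snapshot like this cannot cover.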
Right, so what you see here is a little black dot, which is the position of those clusters, and that bright blob is the stacked convergence. We see a nice consistency, showing that we have a large concentration of matter at the positions of known clusters, meaning we have probably detected a good amount of dark matter. I can get into more detail in the coffee break if anyone is interested, but this is just to show you that ShapePipe is working. Okay, so what's coming in the future: we're now working on 3,500 square degrees of UNIONS data, which is pretty much the whole survey, and which will actually be one of the largest weak-lensing catalogs produced to date. This will be published in a paper; I don't assume it's a paper that you will all be reading, but just to let you know that it will happen, and if this is something you find interesting, do stay tuned. My cheat sheet for this section is just this: now you know what all of the words in the title mean, and if you've learned nothing else from this presentation, I'll be completely happy with that. This is an open-source project with a diverse team of people who are very happy for outside contributions. I don't assume this will be attractive to everyone, but if you're interested in a hobby project and want to get involved, don't hesitate to send me a message. And, as you've seen, we've run this on real data, got some nice results, and there's plenty more coming from ShapePipe in the future. So I'm just going to end there. I have more information about myself and about ShapePipe in the slides that are online; you can click on all of these links. So thank you very much, and have a lovely day.
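The stacking idea behind that convergence figure can be illustrated with a toy simulation. This is not ShapePipe code and the numbers are invented: the point is only that each patch around a cluster is noisy, but averaging many patches suppresses the random noise while the coherent signal at the cluster position survives.

```python
# Toy illustration of stacking: the noise of an average of n independent
# measurements shrinks like 1/sqrt(n), so a weak coherent signal emerges.
import random

random.seed(1)

def noisy_patch(signal=0.1, noise=1.0):
    """One measurement at a cluster position: a weak signal buried in noise."""
    return signal + random.gauss(0.0, noise)

n = 10_000
stacked = sum(noisy_patch() for _ in range(n)) / n

# A single patch has noise ~1.0; the stack of 10,000 has noise ~0.01,
# so the 0.1 signal stands out clearly.
print(abs(stacked - 0.1) < 0.05)  # True
```

The same averaging logic is why weak lensing is inherently statistical: no single galaxy or cluster patch is informative on its own.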
Well, that was a very fun talk. I definitely learned a lot, and I don't even know anything about astrophysics. So, do we have questions in the audience? Yes, please.

Hello, I have a very basic non-Python question. You say you're measuring shape distortions of galaxies. How do you know that what you see is a shape distorted by lensing, not the actual shape?

Fantastic question. So, one of the issues is that on a one-to-one basis we don't, but statistically we assume that the orientations of galaxies should be roughly random. Therefore, if there were no lensing whatsoever, you would expect the shapes, or the semi-major axes of the ellipses, to be random: there's no preferred direction in the universe. So when you see a correlation of those shapes around some structure, it's a strong indication that it has to be caused by this lensing effect. What we're effectively measuring is the correlation rather than the shapes themselves. But it's an excellent question.

Thank you very much for the amazing talk. A question about the future of ShapePipe: it's going to be used for this UNIONS data, but how do you see it progressing, how will it develop for Euclid? Is it mainly these modules, the different steps of the pipeline, that can be chosen?

That's a very good question as well. One thing we wanted to do is make it, like I said, modular for future development, so that if we find a better way of measuring the shapes, or a better way of handling the PSF, or a better way of detecting stars and galaxies, we can stick that straight into the pipeline and swap it in, or compare it with the existing method.
Scaling is a huge issue, because we're talking about 3,500 square degrees, but the whole sky is about 40,000 square degrees, so being able to handle the data for the whole sky is a huge challenge for us, and one we don't have as much experience with as some people. But also, we want to remove anything that's hard-coded to the properties of UNIONS data and make it a bit more flexible, so that if you had another survey you could stick it in there. Right now it's not immediate that you would get results: there are so many little details that need to be fine-tuned. So finding a better way to manage that would, I think, be one of the big steps for the future. But good question, thank you.

Thank you. It looks like the number of galaxies is almost infinite, so measuring the shapes of galaxies seems like infinite work, right? What is the general goal you're following? If you knew the shapes of all the galaxies in the universe, how would that help?

That's a very good question; I'm going to give you a bigger answer than the question asked for. One of the biggest challenges in cosmology right now is reconciling the discrepancies we see between the early universe and the late universe. When we talk about the early universe, we're talking about the cosmic microwave background, which you've probably heard of before: the earliest light in the universe, emitted shortly after the Big Bang, and "shortly" on cosmological scales still means hundreds of thousands of years.
So what we can see is that we have a cosmological model that we can derive from the cosmic microwave background on its own. When we look at things like weak lensing, things we derive from galaxies and clusters and so on, we're talking about what we call the late universe, which is a much more recent timescale, because there was a long period in the universe with no structure: it took a long time for the first stars to form, then galaxies, then clusters, and so on. When we measure the cosmological parameters from galaxies versus from the CMB, they don't match up perfectly. So we have some big questions. One: is our model wrong? Two: is there some systematic in our processing of the CMB, or some systematic in our processing of the galaxies at late times? Getting more accurate information about the late-time universe, really, really accurate shape measurements for all of these galaxies, will help us better answer that question. If we still see the discrepancy, maybe there's a bigger physics problem to be answered; if the discrepancy goes away, then we can say the model is probably about right, and we can be more confident that we understand it. Does that answer your question? Cool.

We can still take a couple more questions, or we can also end right here and... okay, we have another question, cool. How big is the data you're processing, like how many terabytes?
That's a good question. I think at the moment we're talking about a couple of terabytes, but like I said, this is only a bite-sized piece of what we're expecting in the future. For comparison, Euclid data will be many, many petabytes, and that's an order-of-magnitude different processing problem. At the moment, on a small patch of sky, you can run ShapePipe on your laptop and get some results: not enough to do proper statistics, but enough to make sure everything's working. For the processing you saw there, we ran it for two weeks on a cluster with around a hundred CPUs. But of course, when we're talking about measuring the whole sky, we need giant supercomputing clusters and tons of storage space, and we have to be very careful about how we manage the memory and all of those sorts of things. If it takes us 20 years to build a satellite, we can't spend another 100 years running some code to measure the shapes of the galaxies, so we have to find a way to manage those things as best we can.

Well, right, that was a wonderful session and a great talk, Sam. Let's give him a huge round of applause. Thanks a lot.