What I wanted to do today is first give you my overview of how astrophysical surveys have been evolving, and then tell you why that implies we need these alert broker systems.

Every astrophysical survey I've been part of likes to highlight how it is bigger, better, goes deeper, and finds more sources. For Rubin, which I've been part of for a very long time now, we show people things like this: if you look at early photographic plates, this is roughly what you might see. Going from that to something like the digitized sky, you can immediately see how CCDs have changed things and revolutionized how deep you can go and how much resolution you get on sources, while still looking at the relatively nearby universe. As you go deeper with larger-aperture telescopes, but potentially smaller fields of view, you're seeing richer fields, more sources, more distant sources in the universe, and potentially fainter nearby sources as well. LSST's big trick is to do this with a much wider field of view than any comparable instrument: roughly nine and a half square degrees in a single exposure, and the ability to scan the entire southern sky every four nights. You'll have something like 37 billion stars and galaxies over the full 10-year survey, and every single night it will produce something like 10 million alerts.

What's an alert, you ask? If you image these galaxies, come back at a later time and take another picture of the same field, you can subtract the two images (there's a little bit of convolution trickery involved), and you'll find the sources that have changed between them. A significant detection, either positive or negative, in that difference image is what leads to an alert. That's the core of what you receive.

Now, to give you a sense of just how large LSST is, I'm going to play this animation by my grad student Alex Gagliano. What we're going to do is show you all of the historical supernovae ever recorded, up until around 2020. It runs fairly quickly through time, and you'll see little flashes here and there on your screen. As time goes on you start finding more and more of these objects; as you go from photographic plates to CCDs, you see the Sloan Digital Sky Survey turning on, and by sometime around last year we were at around 20,000 supernovae. This is really great, right? But this is the accumulated total of all spectroscopically confirmed supernovae from a whole bunch of surveys over the decades. LSST will do more than this in six months.

In other words, LSST's alert rate outstrips all of our follow-up resources combined. You could give me every hour on every spectroscopic facility on the planet, and there would still not be enough time to follow up every LSST transient. It's simply not possible. The core reason is that the field of view of all of these 8-meter-class telescopes is miniscule compared to LSST's enormous field of view, which is something like seven full moons across.

The other challenge we have is that the time-domain and transient sky is extremely diverse: many people are interested in very different sorts of objects.
For example, I am personally interested in things like supernovae, particularly Type Ia supernovae, because they measure the expansion history of the universe; I'm also interested in things like kilonovae. But my colleague Professor Decker French, who is just down the corridor from me, works on tidal disruption events, as do some of you. And there are folks here who work on things like active galactic nuclei, I'm sure, and are interested in finding things like changing-look quasars. So we have very different interests in what we want out of the alert stream from LSST. The bulk of the alerts from LSST will fall into three broad categories, and they'll serve three large scientific communities: DESC, the dark energy science collaboration, which I'm part of; TVS, the transient and variable star science collaboration; and of course the AGN science collaboration. You'll see a whole bunch of these sources in the LSST alert stream, and the kinds of science people want to do, and the timescales they want to study, are very different across these kinds of alerts.

And there's a whole category of sources I haven't talked about: multi-messenger sources. Zooming into this image, you saw the blue contour of the LIGO/Virgo localization region for GW170817. As you zoom in, you're able to actually localize this one multi-messenger source and go get a whole bunch of follow-up spectroscopy of it, which was critical because it faded very rapidly; by around 11 days it was much harder to detect and much harder to study without Hubble, for example. So one of the things we have to do in the coming decade is not just look at big surveys individually; to make optimum use of them, you have to combine data from different sources. We have one example of doing this successfully in GW170817, but it was extraordinarily hard and took a huge, coordinated community effort with a whole lot of manual intervention to schedule observations, point telescopes, and get data in time. That's simply not sustainable when you get to the point of having 10 million alerts every single night. How do you find something like GW170817 in that mess? It's really hard, because it's not a question of finding a needle in a haystack; it's finding a needle within a stack of other needles, things that all look very, very similar. Needles in a haystack are actually an easy problem: if you want to find a needle in a haystack, just burn the haystack.

The traditional way we do data releases is completely unsuitable for this kind of science. With a traditional archive, and I know groups like CTA are thinking of things like this, you can go in and search for M100 or some other source, once the proprietary period is over or for your own proprietary proposal, and get a list of all of the raw images you're allowed to have; maybe you get some image processing with it. But there's no catalog-level data for any of these sources. In general, you have to download them, take them to your own laptop or high-performance cluster, do your own analysis, and identify sources yourself. This process has a huge amount of latency. And if you tell me, well, that's just how we did things for Hubble, I'll point out that this is the exact same technology being used for JWST. It's the same technology being used for TESS.
It's the same technology being used for a whole bunch of other surveys, and things haven't changed a huge amount since then. The update, effectively, was SDSS CasJobs, which gave you access to catalog-level data instead of just raw images: useful data products derived from processing those images, which you could query with your own SQL. Every now and then SDSS would do a massive data release, and then you'd have better data on more sources going deeper. This is still the mode of operation of ongoing experiments like Gaia; there is no easy way to get real-time Gaia alerts. So going forward we need a better way to filter.

This is a cartoon that Pete Marenfeld drew when I was a postdoc at NOAO. The real issue, of course, is volume and rate: you are trying to find objects that are common, but also be able to identify extremely rare sources. For those of you who are amused by this: yes, that's me on the left-hand side, and you may recognize my colleagues Professor Hanae Inami and Colette Salyk, who were postdocs at the same time at NOAO. It's even a direct picture of the whiteboard that was in my office at the time; they just cartoonified us.

The key challenge, then, is not just building a survey that gets better data, with bigger instruments and facilities that go deeper. It's figuring out how to get that data to the community quickly enough to use it. The time domain implies something: it means you have to deal with real-time data streams, you can't just wait for periodic data releases every year, and you have to process data from heterogeneous sources. It's not a question of what I can do with just CTA; it's what I can do with CTA and Rubin Observatory and CMB-S4 and IceCube and all of these other experiments together at the same time. That is a much harder challenge, because none of that data currently comes to us in any kind of standard format. How do you process it?

To answer that question, we came up with the concept of a middleman. Much like a stockbroker takes your requirements and tries to find a portfolio that matches your needs, alert brokers sit in the middle and do the same thing, except not with stocks but with real-time alerts from ongoing surveys. I should mention, at any point if you have questions, please feel free to interrupt me and I'll happily take them.

So what do we actually need, based on the last few slides? We need something to sift through these heterogeneous alert streams in real time. We want to characterize and classify the events in the stream. We want to identify things that are potentially rare, so outlier characterization in particular. We want to prioritize events for follow-up automatically, rather than have a human, like some grad student, decide "oh, I want this object"; that simply does not scale when you have 10 million alerts, and there's no way a human can even look through all of them. And you want to actively learn from that follow-up: you want a system that improves over time and gets better with every classified spectrum. The core functionality, then, is to do all of the above and provide a search and filtering service to the entire community. So the core of what a broker does, I think, is this boxed region over here. There are a few different brokers out there.
It's not a huge number, it's a handful, something on the order of six to eight, and that will also be about the number of LSST brokers during operations. How LSST will actually interact with this is that LSST is not going to run a community broker on its own; rather, as part of its own processing, it will create postage stamps and alerts from the difference images, those will go into a message queue implemented with a technology called Kafka, and they get broadcast to these community alert brokers, including the one that I work on, ANTARES. You, the user, or you, another science collaboration, connect to the brokers rather than to LSST directly. The other method of access, in the same vein as what I showed you with SDSS CasJobs, is through things like the Rubin Science Platform: every year or so there will be a data release that goes into the science platform, and it gives you the same sort of SQL tools to query that database, plus something like a JupyterHub environment for more complex analysis.

The alerts themselves right now come through what is called an Apache Kafka queue. Kafka is a technology developed under the Apache Software Foundation, the same group that, for example, built the web servers you're familiar with for most everything. It can have several different producers; for "producer" here, fill in groups like LSST or CTA or IceCube or whatever else. Producers write records into topics; topics can be very generally defined, for example by class of source. Those topics in turn are read by consumers. Here the consumers we're really talking about are the broker systems that will be listening to these streams, but you can also imagine the brokers running an instance of this exact same technology and using it to talk directly to the users themselves.

The nice thing about Kafka is that it's extremely fault tolerant: consumers can disconnect, new messages can be pushed into the queue while they're away, and once they reconnect they simply pick up from where they left off, with a buffer of up to something like a week. It guarantees at-least-once message delivery: you may get more than one copy of a message, but you'll get at least one copy, so hopefully nothing gets dropped. It's also very efficient in the way it serializes data and sends it out, particularly when you combine how you send the data, Kafka, with what data you send, the format. The format being used for alerts right now is Apache Avro, which is effectively a serialized, JSON-like schema that carries a whole bunch of data. I've put LSST's example alert packet and schema in the slides for folks here in CTA to start thinking about.

I think of this entire ecosystem of Kafka plus Avro as very much a successor to VOEvent. VOEvent has of course been around a lot longer, but the mere fact that big surveys like the Zwicky Transient Facility, and soon LSST and LIGO/Virgo/KAGRA, as well as groups like SNEWS, are starting to use Kafka directly, or Hopskotch, which I'll talk about in a second, means that in the couple of years ZTF has been running, vastly more alerts have been sent out over Kafka as Avro packets than VOEvents have been sent in their entire existence.
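To make that concrete, here is a minimal sketch of what consuming one of these streams looks like in Python, using the confluent-kafka and fastavro libraries. The broker address, topic name, and group ID are placeholders I made up; ZTF- and LSST-style packets embed their schema in each message, which is why fastavro can decode them directly, but check the conventions of whichever stream you actually connect to.

# Sketch: consume Avro alert packets from a Kafka topic.
# Broker host, topic, and group ID below are hypothetical placeholders.
import io
from confluent_kafka import Consumer
import fastavro

consumer = Consumer({
    "bootstrap.servers": "alerts.example.org:9092",   # placeholder broker
    "group.id": "my-science-group",
    "auto.offset.reset": "earliest",                   # replay the buffered backlog
})
consumer.subscribe(["ztf-alerts"])                     # placeholder topic name

try:
    while True:
        msg = consumer.poll(timeout=1.0)
        if msg is None or msg.error():
            continue
        # ZTF/LSST-style Avro packets carry their schema in the payload,
        # so fastavro can decode them without an external schema file.
        for alert in fastavro.reader(io.BytesIO(msg.value())):
            cand = alert.get("candidate", {})
            print(alert.get("objectId"), cand.get("ra"), cand.get("dec"))
finally:
    consumer.close()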
Currently, access management for these consumers is very basic; it's just an IP whitelist. But a group called SCiMMA, the Scalable Cyberinfrastructure for Multi-messenger Astrophysics, which I'm a part of, is adding an identity and access management layer to this Kafka infrastructure, and they call that product Hopskotch. I think Hopskotch will become very much the standard over the next few years for how alerts are sent from big astrophysical surveys to broker systems and eventually on to consumers.

So my first suggestion for how CTA can engage in this ecosystem, which is what Albert told me I needed to emphasize, is to start working with the groups that have already paved the way to getting alerts out in the time domain: Rubin Observatory, ZTF, and similar. There are worked examples of the data distribution system, Kafka with Avro packets, out there; you now have multiple years of ZTF operations to draw on for sample data, and it's pretty easily adaptable, as I'll show you in a few slides, to any other survey.

Brokers, then, are the things receiving the alerts from big surveys like LSST or ZTF, or from other groups like AMON, the multi-messenger monitoring network that I think you heard about at the last CTA colloquium or CTA webinar. What brokers do is manage these alert streams: they add contextual information, they help you characterize sources, and they annotate, rank, classify, and distribute them to the community. The community can listen to this directly, but you can also have completely automated pipelines listening, things like Target and Observation Managers, or TOMs, which connect directly to telescopes for automatic follow-up. So you can imagine a pipeline where a survey like LSST issues an alert, a broker like ANTARES characterizes it and says, hey, this matches your criteria for an interesting, rising, young supernova, and because you've also asked us to connect it up to a TOM, we now issue an alert that triggers your observing program and points a telescope at the source. The important thing here is that the latency for multi-messenger astrophysics really drops, from a few days, to a few hours, to now a few minutes or potentially seconds.

The core capability of brokers is enabled by not defining the science we want to do ourselves, but rather letting you write your own filters. If you want to do any kind of complex, targeted processing of a large data stream, you write a little bit of code in Python, and we will run your code for you. This really lets you correlate optical, gravitational-wave, and particle streams in real time for multi-messenger astrophysics. And it's already up and running: you can sign up and ANTARES will start processing ZTF data for you with your own filters, should you want to do any kind of experimental analysis.

I'd like to credit the people who are really crucial here. ANTARES, I should clarify since I'm talking to a bunch of particle astrophysicists, has nothing to do with the particle astrophysics experiment of the same name. The acronym stands for the Arizona-NOAO Temporal Analysis and Response to Events System, which is also no longer accurate, because NOAO has since been renamed NOIRLab, and the people who were at NOAO are now spread
all over the country: Monika Soraisam, shown here, is with me at Illinois, and I know folks in Hawaii, for example, who are working with it. These are the core folks who have developed ANTARES: Monika Soraisam, Chien-Hsiu Lee, and Tom Matheson are the core science people, along with me, and Nick Wolf, Adam Scott, and Carl Stubens are the core development people.

Rather than just showing you static slides, I figured what I'll do is stop sharing my slideshow and instead share my web browser, and give you a walkthrough of how the system looks in honest-to-goodness real time with a completely vanilla, generic user account. This is my own account, with no special privileges; nothing I'll show you is something you cannot do yourself. If you're looking at this presentation later in just the slideshow format, there are slides that walk you through what I'm doing and give you links to things like example notebooks, and so on.

Now, the alerts in ANTARES come in the form of these objects from ZTF. You can decide, for example, that you only want to look at objects where the latest alert was within the last week. Maybe you want to require a minimum number of measurements. Maybe, if people here are particularly interested in galaxies, we can demand that the object is cross-associated with SDSS catalogs, or the NYU Value-Added Galaxy Catalog. Maybe you're also particularly interested in AGN, so let's see if we can find something that satisfies all of those criteria. Sure enough, after a couple of seconds of processing, and my Zoom window is updating a little more slowly than my actual screen, this takes essentially no time at all, you get all of the latest alerts. I have no idea what any of these things are, so I'm just going to click on one of these sources completely at random. Let's look at one that has just a few alerts.

This one, for example, has two alerts at a location. You can see the light curve you get from ZTF, and you can see the last couple of detections, where it has apparently risen. You can get a postage stamp, you can see where the detections are, and you can see the associations for this particular object; for example, I required that it be in the NYU Value-Added Galaxy Catalog, so everything matching that catalog shows up here. You can do all of this through just the web page.

You can also require alerts that meet other people's filtering criteria. Let me clear some of these criteria off, because if I use a filter I'm applying a much more selective cut, and who knows exactly what you'll find. You can see that we have a bunch of criteria that are effectively tags; you can also create your own tags, and those tags define whether an object is interesting or not. These tags already have particular sources identified, most of the names are reasonably self-explanatory, and otherwise there's a little table that explains what they are. Let's just say we're looking at nuclear transients, why not. You can get your list of objects with that tag associated and look at just the nuclear transients. You can also get at this through the filters directly, so you can go in here and create your own, for example, or look at any of the tags; for example, I can look at Monika Soraisam's tag for subluminous supernovae.
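As a quick aside before continuing the walkthrough: a tag like this can also be queried programmatically with the ANTARES Python client, which I'll come back to later. Here is a rough sketch; it assumes the client exposes an Elasticsearch-style search() call, and the tag and property names are made up for illustration, so check the client documentation before relying on any of them.

# Rough sketch: query ANTARES loci by tag with an Elasticsearch-style query.
# The tag and property names here are illustrative assumptions.
from antares_client.search import search

query = {
    "query": {
        "bool": {
            "must": [
                {"term": {"tags": "nuclear_transient"}},              # assumed tag name
                {"range": {"properties.num_mag_values": {"gte": 5}}}, # assumed property key
            ]
        }
    }
}

for locus in search(query):
    print(locus.locus_id, sorted(locus.tags))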
Back in the web interface, you can sort this list however you like. I'm going to sort by things that are really recent in time, and maybe by brightness; those seem like sensible things to sort on. We'll keep this as ascending order, and again I'm going to pick one completely at random. Okay, that's a cool one. That looks like a regular supernova to me. Again, this is completely random, I'm just picking whatever I find. You can see the galaxy it's in and the catalog information that it has; for example, this is in an SDSS galaxy, so all of its magnitudes are here, and this is really just using Monika's code to find it. I'm going to bet that this particular source is a classified supernova, because it's gotten to 18.5, and since I'm a supernova person I can tell by roughly the rise time that this is probably a core-collapse supernova, probably a Type II of some sort. So let's go to the Transient Name Server and check whether I did that reasonably well. And you can see there is in fact an entry, I can see the SN designation, and there is a spectrum with a hydrogen line, so it is for sure a core-collapse supernova. In fact, it was classified by Francisco Förster's group at the Universidad de Chile, ALeRCE, which is one of the other broker systems out there.

So that's a quick example of how this infrastructure can very quickly let you find and discover sources in these large alert streams. ZTF is sending out something like a million alerts per night, but with a few clicks of my mouse on this web interface I was able to very quickly identify this interesting source.

Of course, a tag like Monika's subluminous-supernova tag may not be exactly what you want for your science, but you can seed your own science code by checking out how our tags are created. For example, if you want to see how our high-amplitude tag is created, simply expand that code, and you'll get the Python code our system runs on the ZTF alert stream. This is user-contributed code, and you can write your own version of it.

Then you go to the JupyterLab environment that NOIRLab's Data Lab hosts for ANTARES. One of the notebooks you'll find there is the ANTARES filter development kit: a simple Jupyter notebook that walks you through how to connect to our server and how to write your own filter from scratch. It's a bit of Python code, and you have access to any of the properties created by any of the other filters, as well as all of the properties in the alerts themselves. So again, here is where I quickly show you the Avro schema, what's in an alert. Each detection, each source, comes with whatever is in that one particular image; this is the LSST format, but it's very close to what ZTF is using anyway. You get the obvious things like RA and declination, of course, but you also get things like the aperture flux, the PSF flux, and so on. There are quantities that are useful if you're looking for solar system sources, and there's a real/bogus score, for example, to help distinguish between real and bogus detections. You have all of these things, so you can write any filter you want that uses any of these properties, plus properties we compute ourselves within ANTARES. It's an extraordinarily rich ecosystem that you become part of when you start sending out your alerts in real time rather than doing an annual data release.
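To give a feel for what such a filter looks like, here is a minimal sketch in the spirit of the ANTARES filter development kit: a class with a run() method that receives one object (a "locus" with its alert history) at a time and tags it if it passes a cut. The class structure, property keys, tag name, and threshold are all illustrative assumptions on my part; the real notebook in the Data Lab environment shows the current interface.

# Sketch of a user-contributed broker filter, modeled loosely on the ANTARES
# devkit pattern. Attribute and property names below are illustrative only.
class FastRiserFilter:
    """Tag objects whose latest alerts show a rapid rise in brightness."""

    REQUIRED_PROPERTIES = ["ant_mjd", "ant_mag"]   # assumed per-alert property keys
    OUTPUT_TAG = "fast_riser_candidate"            # hypothetical tag name

    def run(self, locus):
        # Collect (time, magnitude) pairs from the alert history.
        history = sorted(
            (a.properties["ant_mjd"], a.properties["ant_mag"])
            for a in locus.alerts
            if "ant_mag" in a.properties
        )
        if len(history) < 2:
            return  # not enough points to measure a rise

        (t0, m0), (t1, m1) = history[-2], history[-1]
        if t1 <= t0:
            return
        rise_rate = (m0 - m1) / (t1 - t0)   # mag/day; positive means brightening

        if rise_rate > 0.3:                 # arbitrary illustrative threshold
            locus.tag(self.OUTPUT_TAG)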
It gives you the ability as a scientist to look at sources very, very quickly and do all sorts of analysis in essentially no time.

In addition to typical filters, maybe you have a list of known sources, your own little catalog. You can create a watch list for that catalog, and we'll give you a Slack notification, or otherwise let you know. It's very simple, just a CSV file with an RA, a Dec, and a radius, and every time one of those sources has a new alert from any survey, not just LSST or ZTF but potentially CTA itself, or, say, IceCube's alerts via AMON if we connect those up, you get your alert. It can go to a Slack channel, so your phone can ding at you as much or as little as you like.

Filters are the more complicated version of watch lists: again, you write your own code and submit it however you like, and the Jupyter notebook environment at the Data Lab, for example, lets you develop your own filters. These can be quite complicated things; I'm showing you here a filter I wrote for OGLE's microlensing project that will, for example, do a full lens-model fit to a microlensing event and try to identify, in real time, events that look achromatic and have the characteristic rise timescale of a microlensing source. It's very generic, fairly efficient Python code that's run in parallel, currently on a cluster at the University of Arizona; we're shifting all of this infrastructure very quickly to Google's cloud platform.

So I've shown you how to create filters and how to create a watch list. If you have your own catalogs, we can also add them; you can see all of our current catalog holdings here in ANTARES, along with each of these features. You don't have to read through all of this right now, but there is a lot to see.

Of course, that's not good enough for some folks; you might want to do a little more. You might want to build your own real-time pipeline around this thing, rather than going to websites in your browser. So there's a Python client that lets you do everything you can do on the website, but now programmatically, with an API, on your own computer. This means you can do all of the things I showed you, find interesting events in a large stream and run Python code on them to identify particularly outlying objects, but you can also connect it up with the broader ecosystem. I'm going to quickly stop sharing here and go back to my slides to show you what I mean by that.

Roughly around here: the API lets you query everything you would have gotten from the website, but now in a Python client. Any of the lookups I did through the browser you can now do in a Jupyter notebook. For example, give it an RA and Dec with whatever cone-search radius you like and you'll get back a whole bunch of objects. That, of course, lets you look at an ensemble of sources all at once rather than one by one on a webpage. If you want to create a color-magnitude diagram, you can do that; you can find outlying sources that are non-stellar; you can pull a whole bunch of real objects from this real-time stream that match your criteria, without having waited a year for a data release. That is a really powerful thing. There's a link to this example notebook right here.
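That notebook walks through lookups like the one sketched below with the ANTARES Python client (installable as antares-client). The coordinates and radius are arbitrary, and the attribute names on the returned loci are my assumptions, so treat them as things to check against the client documentation.

# Sketch: cone search against the ANTARES alert database with the Python client.
# Position and radius are arbitrary; locus attribute names are assumptions.
from astropy.coordinates import Angle, SkyCoord
from antares_client.search import cone_search

center = SkyCoord(ra=285.0, dec=-12.5, unit="deg")   # arbitrary position
radius = Angle("2 arcmin")

for locus in cone_search(center, radius):
    # Each "locus" bundles an object with its alert history and catalog matches.
    print(locus.locus_id, locus.ra, locus.dec, sorted(locus.tags))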
So you can go and play with it, and use it to identify your own variable sources. This example is done for variable stars, but you could do it for any other kind of source you like: changing-look AGN if that's your interest, or quasars, or lensed systems where you know the locations of each of the multiple images. Basically anything you can imagine and can write in Python, you can implement quite easily. You can also run queries that are pure SQL against our database; that's actually how this one was implemented and returned. So this is very much a combination of what was possible with things like CasJobs, but evolved into these real-time systems.

The really nifty trick is when you go beyond analyzing sources and finding interesting objects, and decide to do something with them: you decide to follow up. Say you've convinced a TAC to give you some telescope time on whatever resource you like, maybe Swift or something else, and you find an object of interest in the stream. If that facility is connected up to something like a telescope and observatory management system, then you can create a follow-up observation request completely programmatically, within ANTARES or within your own notebook using the API client to query ANTARES. You can run it through your exposure-time calculator, set the exposure time the right way, and then submit that observation request. There's an entire way to do that with the AEON and TOM systems that currently exist and support Gemini as well as the Las Cumbres facilities. Sure enough, for that particular object you can submit a real, honest-to-goodness request on the observing portal you get from LCO, and it will go off, trigger one of their facilities, and get you an image, all in real time, potentially with no human in the loop. You don't have to have people vetting each of these requests; you can define stringent enough criteria that the observation follows automatically whenever your filter triggers, all in real time.

That's cool, and it's a good way to deal with an alert stream when you have 10 million alerts every single night, because you could put every grad student you have on trying to find interesting objects, and they won't be consistent, and they cannot possibly sustain that level of effort over ten years of LSST in any case. So you'll have to move to this kind of environment. And you can see how this is really powerful for multi-messenger astrophysics: if we'd had this sort of thing back in 2017, instead of waiting hours to find the counterpart of GW170817, we could potentially have done it within minutes.

There are all sorts of science use cases this enables. I, for example, am particularly interested in supernova cosmology. I don't need spectroscopy for all of those objects, but I can get targeted spectroscopy for a few that fit particular constraints: maybe they were discovered really early, or maybe they're in a particular redshift range, or maybe I want a random sample to assess my selection bias. That's the sort of transients-on-demand science case that matches up with this. And because we've made this entire connection to the TOM systems, we can trigger follow-up of real-time sources.
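As an illustration of what "completely programmatic follow-up" can look like, here is a sketch of submitting a request in the style of the LCO observation portal's request-group API, which is the sort of thing the TOM toolkit wraps for you. The proposal ID, token, dates, instrument and filter names, and most field values are placeholders to adapt from the current portal documentation, not something to copy verbatim.

# Sketch: programmatic follow-up request in the style of the LCO observation
# portal. All IDs, tokens, dates, and instrument/filter names are placeholders.
import requests

TARGET = {"name": "ANT2020xyz", "type": "ICRS", "ra": 285.0, "dec": -12.5}  # hypothetical

request_group = {
    "name": "broker_triggered_followup",
    "proposal": "MY-PROPOSAL-ID",          # placeholder proposal code
    "operator": "SINGLE",
    "observation_type": "NORMAL",
    "ipp_value": 1.0,
    "requests": [{
        "location": {"telescope_class": "1m0"},
        "windows": [{"start": "2024-01-01 00:00", "end": "2024-01-03 00:00"}],
        "configurations": [{
            "type": "EXPOSE",
            "instrument_type": "1M0-SCICAM-SINISTRO",   # placeholder instrument
            "target": TARGET,
            "constraints": {"max_airmass": 1.6},
            "instrument_configs": [{
                "exposure_time": 300,                    # seconds, from your ETC
                "exposure_count": 1,
                "optical_elements": {"filter": "rp"},
            }],
            "acquisition_config": {},
            "guiding_config": {},
        }],
    }],
}

resp = requests.post(
    "https://observe.lco.global/api/requestgroups/",
    headers={"Authorization": "Token YOUR-API-TOKEN"},   # placeholder token
    json=request_group,
    timeout=30,
)
resp.raise_for_status()
print("Submitted request group:", resp.json().get("id"))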
You can, for example, find an interesting source within CTA data, send it out in your own CTA alert stream to these brokers, use the ANTARES client to query anything associated with it, report it to TNS automatically, and then trigger the kind of follow-up I just showed you. And this isn't science fiction, we're doing this right now. This was all done by a grad student, Patrick Aleo, along with a postdoc, Konstantin Malanchev, working together. This is one of the objects we have submitted to TNS completely automatically: Patrick's anomaly-detection filter said this looks like a supernova and flagged it, it was reported to TNS with no human in the loop, and then we went off and got a spectrum of it right after that, and you can see it's a regular Type Ia supernova. So this is cool, and it's not science fiction anymore; we can do this stuff, and it's happening more and more often. All of these systems are still under development and there are changes happening, but it's a really good time to engage and get involved, because they will only become more sophisticated going forward. And it really enables you to write not just stronger science papers but more interesting proposals to funding agencies, or telescope proposals, to convince groups to give you time to do this sort of coordinated follow-up with CTA together with Rubin, ZTF, or any of the other facilities. So it's a really interesting future that we have going forward.

That brings me to how CTA can start to engage already. You folks are in the middle of survey planning and design, but you must already have a catalog of sources that you want to monitor at other wavebands, for example, or watch in case anything interesting happens in, say, IceCube or something of that nature. When I was a grad student I'd go down to Mount Hopkins all the time and take selfies with the VERITAS telescopes and Whipple, and I'm sure those facilities have given you a really good catalog of sources that you could already upload to ANTARES, for example, and start monitoring. Pathfinder facilities like those at Whipple are also a great source of alerts for testing: you don't have to jump straight to millions of alerts every night, you can start by dealing with just a few. So if the CTA group wants to get into this sort of real-time work, connecting a pathfinder telescope with a broker system, please reach out to us; we will be happy to work with you to publish your alerts (there's a small sketch of the producer side below). We can even put little riders on them for the community, saying we're still in a test phase, so don't necessarily go off and worry that someone has identified a Galactic supernova or whatever. There are caveats, but it's all already doable.

And the science use cases you folks are defining implicitly define the kinds of sources you're trying to find. You can implement those as filters, and we can help you develop them: there's already the dev kit I showed you, and the ability to run your own Python code, or watch lists, or whatever else. And if the learning curve seems too steep, don't worry, just get in touch with us. We have a Slack space, we're very friendly, and we will happily work with you and implement what we think you need after talking with you.
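On the publishing side, the pattern is the mirror image of the consumer sketch from earlier: define an Avro schema for your packet, serialize each event, and produce it to a Kafka topic. This is a minimal sketch, and the schema fields, topic name, and broker address are made-up placeholders for whatever a CTA or pathfinder alert would actually carry.

# Sketch: publishing survey alerts as Avro packets over Kafka.
# Schema fields, topic, and broker address are illustrative placeholders.
import io
import fastavro
from confluent_kafka import Producer

ALERT_SCHEMA = fastavro.parse_schema({
    "name": "cta.sketch.Alert",
    "type": "record",
    "fields": [
        {"name": "alertId",   "type": "long"},
        {"name": "ra",        "type": "double"},
        {"name": "dec",       "type": "double"},
        {"name": "mjd",       "type": "double"},
        {"name": "fluxSigma", "type": "double"},   # significance of the detection
    ],
})

producer = Producer({"bootstrap.servers": "alerts.example.org:9092"})  # placeholder

def publish(alert: dict, topic: str = "cta-test-alerts") -> None:
    """Serialize one alert with its schema embedded and send it to Kafka."""
    buf = io.BytesIO()
    fastavro.writer(buf, ALERT_SCHEMA, [alert])   # Avro container, schema included
    producer.produce(topic, value=buf.getvalue())

publish({"alertId": 1, "ra": 83.63, "dec": 22.01, "mjd": 60310.5, "fluxSigma": 7.2})
producer.flush()   # make sure the message actually goes out before exiting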
So there's a real way for this group to engage already with what's happening in these other wavebands, and to have CTA become part of this entire cyber-ecosystem for time-domain science going forward.

But I think there's actually even more you can do with these systems, and that's to optimize the survey you're already planning. It's been known for a while, for example, that you can use machine learning to separate variables and transient sources. This is an interactive demo that Carlos Scheidegger and I developed when I was at NOAO. What you're seeing here are the first two principal components for a whole bunch of variables from the LINEAR and OGLE catalogs, and I can move around with my mouse; the same way as in ANTARES, if you click on a particular object you get a postage stamp as well as all of the other information on it. What was particularly interesting for us is that even in two dimensions, and I'm only looking at two PCA dimensions here, you can see typical classes of objects start to separate from each other. Pulsating variables, for example, look very different from Algol-type eclipsing binaries; they have different light curve shapes, and those features put them in different parts of principal-component space. You can also use this sort of technique to identify outliers and see whether anomalous objects lie far away from everything else. So, for a second, I'm going to upload a known anomalous object that is definitely going to be far from the rest of the sample, and sure enough, it lands reasonably far away.

So we knew that machine learning, if you have all of the data on a particular source, can be used to separate things out. But how do you actually use that to design your survey and make it better, and use machine learning in real time? Part of how we approached this within Rubin was to run something called the Photometric LSST Astronomical Time-series Classification Challenge, or PLAsTiCC. PLAsTiCC ran a couple of years ago on the Kaggle platform; it is to date the biggest astronomy data challenge ever run on Kaggle, which is operated by Google. We had $25,000 in prize money for the general public to get engaged and do photometric classification of the time-domain sky using simulated LSST alerts that we created, except they were in the form of full light curves. We made this as realistic as we reasonably could, with different kinds of time-domain sources: things like tidal disruption events, M-dwarf flares, supernovae of different kinds, Mira variables, and whatever else you like. We simulated something like three million full light curves in ugrizy, and we really wanted to jumpstart photometric classification efforts.

In that respect, PLAsTiCC has been tremendously successful. We've seen successful approaches that do Gaussian-process interpolation, approaches that use deep neural networks with gated recurrent units, and approaches that identify anomalous objects. We've seen classifiers come out of the general public. The challenge was actually won by Kyle Boone, who is an astronomer and is now a postdoc at the University of Washington. He got really lucky and won right at the end, and had our victory conditions been slightly different, an astronomer would not have won the challenge, a member of the general public would have, which is very, very cool.
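Before getting to what came out of PLAsTiCC, a quick aside on the principal-component idea from the demo above, since the pattern is easy to sketch with scikit-learn: project per-object light-curve features onto two principal components and flag the object that sits farthest from the bulk of the sample. The feature array here is random stand-in data, not a real catalog.

# Sketch: project light-curve features onto 2 principal components and flag
# the most distant object as a candidate outlier. Features here are random
# stand-ins for real quantities (period, amplitude, skewness, colors, ...).
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(42)
features = rng.normal(size=(500, 8))        # 500 objects x 8 light-curve features
features[0] += 12.0                         # plant one obvious anomaly

# Standardize, then reduce to the first two principal components.
pcs = PCA(n_components=2).fit_transform(StandardScaler().fit_transform(features))

# Distance from the centroid in PC space as a crude anomaly score.
dist = np.linalg.norm(pcs - pcs.mean(axis=0), axis=1)
outlier = int(np.argmax(dist))
print(f"most anomalous object: index {outlier}, distance {dist[outlier]:.1f}")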
So what comes out of PLAsTiCC? Say you care about kilonova discovery: is that going to be a problem? Kilonovae were strongly and well identified by pretty much every one of the major winning entries on Kaggle; all of the top three had no trouble separating kilonovae from the other classes, whereas they had much more trouble classifying the regular supernova types against each other. But what we know is that kilonovae will in fact be a problem for LSST, and the reason they weren't a problem in PLAsTiCC comes down to the limitations of the challenge itself. LSST's cadence has a median inter-night gap of roughly four days across all filters, or about ten days in a single filter. That means that if you're trying to find something like a kilonova, which fades very rapidly over a few days, you'll only get one or two points. The classifiers were seeing these one or two points versus other objects with many more points, and very quickly identified which things were kilonovae simply from the paucity of the data. Classifiers are really only as good as the training data you give them and how well that matches reality.

If you look at an object in ANTARES, on the other hand, something like the source I showed you earlier, what you get is not the full light curve but rather an evolving stream: a slow rise. What tells you this is a real object is that you see it evolve over time; you don't get everything at once. You see one alert one night, it's brighter the next time, and there's another detection by the third night. You see it's associated with a galaxy. You want to make all of those connections and use that information for your classification.

So PLAsTiCC version 2, which will happen this year, will connect the LSST science community with the brokers. Instead of sending out full light curves as we did before, this time we will send out alert packets, the actual alert stream you'd get from LSST itself. We will preserve environmental correlations, we'll have a representative sample of alerts for training, and the broker teams, including ANTARES as well as ALeRCE, Lasair, and some of the big teams like AMPEL, will be working with us to process and classify these alerts in real time. We can define many different metrics, and we can have the community involved in LSST science become part of a verification and validation loop within DESC, to ensure compatibility between these broker systems and the LSST science collaborations. And this model works not just in LSST land; it'll work just as well for CTA. Surely folks in CTA are creating simulations of the objects you expect to see on the sky; you can issue alerts on those simulations themselves through these broker teams, we will happily work with you to do that, and you can see how well the community can actually identify them.

All of this has spurred interesting developments. For example, if we want to get away from simple light-curve classification, where you see the full light curve at once, you can now look at what happens with an evolving stream of alerts, where you see the light curve build up. I had a grad student, Daniel Muthukrishna, who just got his PhD a couple of weeks ago, work on this, and he came up with a system called RAPID, for Real-time Automated Photometric IDentification.
You can see that as more and more alerts arrive for a particular object, this classifier, a deep neural network with gated recurrent units, evolves its classification over time; this particular source, for example, looks more and more like a tidal disruption event.

We're also figuring out how to do our simulations with hosts. We've assembled the largest library of supernovae and other transients we could find, so we can capture the correlations between those transients and their host environments and include that in our alert simulations. So now, when alerts come out for this PLAsTiCC challenge, they are not just the light curves of the sources themselves; they also carry the host information, including a postage stamp of the potentially associated host, so you get colors, redshifts, radial moments, and so on, to go along with the machine-learning classification. That makes the entire process more realistic and gives us a lot more faith that we're using machine learning correctly and identifying the right sources.

And with all of this, we're now not just including LSST data: as part of PLAsTiCC version 2 we're simulating data from another experiment, in this case LIGO/Virgo/KAGRA sky maps. Our entire machinery in LSST land is over here, but now we have this extra stream coming in, in the form of gravitational-wave alerts, which could just as easily be CTA if a few folks are interested in working with us. These will go to a whole bunch of brokers, which will rank and classify and do all of the things I've been talking about in these slides, and let you, the science community, actually see the output, see what is useful, tune your algorithms appropriately, and really engage with the project as a whole.

As part of that, within my own group I'm working with a postdoc, Deep Chatterjee, shown over here. What we figured out is how to modify RAPID to take not just the light curve but also the simulated sky map from LIGO/Virgo/KAGRA, and use the sky map effectively as a prior to identify the subset of sources in a survey like ZTF that are potential kilonova counterparts, the electromagnetic counterparts to these binary neutron star mergers. This sort of system works really well. We only have one real data set to evaluate it on, which is the actual GW170817 data, but applying this machine-learning code to the one real kilonova we have, it works: it identifies the kilonova, given the sky map and light curve, without any trouble. So this really means that the next time we go into O4 and O5 and beyond, we have the tools to find the counterpart much more quickly than we were able to for GW170817.
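To make the sky-map-as-prior idea concrete, here is a minimal sketch using healpy and a LIGO/Virgo-style probability sky map: read the HEALPix map, look up the credible level at each candidate's position, and keep only candidates that fall inside, say, the 90% region. The file name and candidate list are placeholders, and a real pipeline, including the RAPID-based one described above, folds this in as a prior on the classifier rather than a hard cut.

# Sketch: use a gravitational-wave probability sky map to select optical
# candidates. File name and candidate positions are placeholders.
import numpy as np
import healpy as hp

prob = hp.read_map("bayestar.fits.gz")        # placeholder GW sky map (per-pixel probability)
nside = hp.npix2nside(len(prob))

# Rank pixels by probability and compute the cumulative credible level of each.
order = np.argsort(prob)[::-1]
credible_level = np.empty_like(prob)
credible_level[order] = np.cumsum(prob[order])

def in_credible_region(ra_deg, dec_deg, level=0.9):
    """True if the position lies inside the given credible region."""
    theta = np.radians(90.0 - dec_deg)         # colatitude
    phi = np.radians(ra_deg)
    ipix = hp.ang2pix(nside, theta, phi)
    return credible_level[ipix] <= level

# Hypothetical broker candidates: (name, RA, Dec) in degrees.
candidates = [("cand-1", 197.45, -23.38), ("cand-2", 10.0, 40.0)]
keep = [name for name, ra, dec in candidates if in_credible_region(ra, dec)]
print("candidates inside the 90% region:", keep)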
That lets me wrap up with one more way I think CTA can engage. Every survey is creating these large simulations of what it expects to see, for survey design, and this is typically infrastructure work, so it's hard to get people to really be part of it. But you can engage the community scientifically: you just have to do the extra work of taking those simulations, converting them into alerts, and sending them out. That gives you a way to get real science papers out of what would otherwise have been considered pure infrastructure, to take this core software effort and turn it into something that feeds science. We are already, in PLAsTiCC version 2, including LIGO/Virgo/KAGRA together with LSST, and for version 3 I would be thrilled to have simulated CTA alerts in the next data release, or at least the next version of our data, so we'd love to work with you folks on getting this done.

I want to end with just one other thing: this ecosystem is constantly evolving. What I showed you is real-time analysis of data, but you might also want to do archival reprocessing, or analysis of large samples, and brokers themselves are evolving into what we call research platforms, which do all of the things brokers do but also give you cloud-based computing. You can reanalyze whole data sets, you can do your data releases in the cloud. I really think this is where the community is going. I don't think we're going to have individual universities or projects running their own HPC servers and doing annual data releases; instead, I think we'll have ongoing, living data releases that are constantly updated alongside the real-time analysis. We're moving away from traditional archives, where you download FITS images to your laptop and do your analysis there, toward cloud-based environments where you do all of your analysis and visualization in the cloud. There's an identity and access management layer, so you can share data with groups, either privately or publicly, all at once. And you can have real interfaces not just to the surveys themselves but between other facilities too: you can have follow-up with things like Swift or the Las Cumbres observatories, and potentially even the entire amateur astronomy community can become part of the science that you do. It's really a very cool new world for what we're trying to do.

I'll end with a few takeaways. CTA, building on the current generation of surveys, will discover many, many new sources. I really think that maximizing the scientific utility of these experiments requires that we move away from periodic data releases toward providing our data in real time, so that people can start doing science immediately; this is particularly crucial if you're working on time-domain science. Alert brokers are the key piece of middleware that gives communities ways to search the data, characterize and filter events, and follow them up. And they let you use machine learning, which is otherwise this very abstract thing that gets done in papers but isn't used as part of your real-time survey operations. As CTA builds its pipelines, creates these simulations for survey design, and attracts teams in the scientific community, you can engage with this entire ecosystem already.
You can define interesting questions that spur people to go and develop filters and machine-learning codes for the CTA group and do that analysis. The best science is not going to come from individual surveys acting separately from each other, but from surveys acting coherently, in concert with each other. That is what will really allow us to do multi-messenger astrophysics, by combining real-time streams across experiments. It's very challenging, but I think it's a tremendous opportunity for the community, and I will end there.