Our next speaker is Jim Davenport, also from DiRAC. Oh, it's already up. Thank you. So when Vina asked me if I wanted to come talk about machine learning, I was initially very excited and then very hesitant. I said, I am not an expert in machine learning. And he said, that's OK. We want someone to stand up and talk about what a normal person might experience trying to apply machine learning to their workflow, to demonstrate how machine learning is applicable to normal problems. And I thought, OK, I can consider myself a normal person, I hope. So that's what I'm seeking to do. Mario talked a lot about the very exciting opportunities for complicated things like neural nets, which are not a big part of my workflow. Instead, I want to show how machine learning is becoming more democratized. First, let me show a figure that I pulled from ADS yesterday: just search for the words "machine learning" in the abstract. It's going up. Most things in astronomy probably look like a straight line if you take the log of everything, but this is going up very rapidly, something like exponential. And my takeaway from this figure, perhaps controversially (I'll make this controversial statement in hopes that somebody will scold me and we can have a discussion about it), is that machine learning is now boring. We have entered an era where machine learning is pedestrian and boring. And that's good, because that's the kind of thing that I can do. Fitting a line is my bread and butter as an astronomer, and so I'm looking for reliable, boring tools that I can use. I'm not a machine learning researcher. I'm not a computer scientist. I'm an astronomer. I need a tool that very smart computer scientists have developed that I can then use. I need a very good hammer. And so my takeaway from this figure, and from the whole talk, is: don't be scared.
Machine learning is ready for you to adopt as part of your regular workflow. It doesn't require a second PhD in computer science. There are various straightforward applications, as Mario said. You can fit lines very nicely in very elegant and interesting ways. More importantly, these are very flexible algorithms, and I think that's the big takeaway: it's better than just fitting a line. There are some very flexible algorithms we can use. Another takeaway, again from the quote-unquote normal user's ground-level perspective: most of our work isn't big data. And if Moore's law continues, a lot of what we do will still not be big data. LSST may still be quote-unquote big data, but I'm probably not going to download all of it. I'm only going to download the pieces that I care about, and if my laptop grows sufficiently fast, then it won't quite be big data. I'll still be able to read it into Python and operate like a normal person. So again, I don't really need to be a computer scientist; I can be just a normal astronomer. This is a good thing. It means that most of the algorithms are performant enough to run on your laptop, and you can experiment, which is pedagogically useful. You can try a lot of things and not have to wait months and months for them to run. This is, again, good for somebody like me who likes using a well-developed hammer. Our data is also becoming better for machine learning. Mario gave good examples of data not just getting large, but getting more complex. We get lots more features, right? We're not just getting photographic plates; instead, we have high-resolution images in multiple wavelengths, taken over time. This becomes a very high-dimensional, interesting data set, which is often nicely rectangular, so you can feed it into machine learning algorithms pretty straightforwardly. Data is also easier to get.
If you haven't heard of GitHub, of Zenodo, of VizieR and its new X-Match tool: there are so many easy ways to amass gigabyte-sized data sets that you can read on your laptop and play with. You can build value-added data sets trivially from surveys like Sloan and 2MASS, eventually LSST, and many others. When you combine them, you get value-added data sets that make discovery much more interesting, and that makes machine learning ever easier to use. Libraries are generally open source and very robust. This is not the wild west anymore. There are lots of very good algorithms that work very well, and, importantly, the documentation is excellent. There are a lot of domain-level (i.e., astronomy-level) workshops and experts, people who have recently left the field to work for, name your big tech company, who love coming back to AAS. There are many of them right here this week, so go talk to them. There are workshops on various algorithms at meetings. The libraries are available for R, for Python, for IDL in some cases. You can use your language. So, again, machine learning is boring; the time is now to use it. This is a very good thing. I'm going to focus for the next few minutes on the implementation I've been using, which is Python's scikit-learn. This library is very robust and very well developed, and many people in the astronomy community have helped develop it. I would recommend diving into its examples page. It's really good because it's really long. When you're trying to find a good algorithm for your problem, I literally just scrolled through it. It took a while. You can go through and look for something that looks like your problem, right? There are other libraries worth talking about, too.
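To make that "find something that looks like your problem" workflow concrete, here is a minimal sketch of the common scikit-learn pattern: import an estimator, fit on labeled data, predict on new data. The data and the choice of RandomForestRegressor are made up for illustration; this is not a model from the talk.

```python
# Common scikit-learn pattern: pick an estimator, fit, predict.
# The model choice and data here are illustrative only.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(42)
X = rng.uniform(0, 10, size=(200, 2))                # two made-up features
y = 3.0 * X[:, 0] - 0.5 * X[:, 1] + rng.normal(0, 0.1, 200)

model = RandomForestRegressor(n_estimators=50, random_state=0)
model.fit(X, y)                                      # train on labeled data
y_pred = model.predict(X[:5])                        # predict for new points
```

Nearly every estimator in the library follows this same fit/predict interface, which is why scrolling the examples page and swapping algorithms in and out is so cheap.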
Of course, I want to highlight astroML, a very useful, very educational library that I would recommend looking through, especially the example code. Brigitte is going to talk about astroML later, so I won't spend any time on it other than to highlight her talk. Stick around for that. As Mario nicely outlined, there are different kinds of problems that machine learning can do. There are supervised and unsupervised methods, and various ways to frame the problems. There are classification problems: are we going to identify dogs versus cats? There are regression problems: are we going to fit lines? There are clustering problems: are we going to group things into similar clumps, into the clusters that best suit them? There are other problems too, which I'm not going to talk about. I have an example of each of these top-level three, which are the most basic problems. My very favorite figures from the scikit-learn documentation are these two, especially this clustering figure. When you have a bunch of data and you think, I would like to cluster this somehow and make groups, say I want to pick out clumps, I love going to this. When you click on clustering, you get this figure, and you ask: which one of these little sub-panels looks like my data? This is very observationally driven science. Does my data kind of look like this? Am I dealing with concentric features, or little snakes, or nice little balls, maybe Gaussian balls? Or am I trying to segment things across a uniform field by some other parameter, or is it many-dimensional? These kinds of documentation examples (and there's the same sort of figure for classification: which kinds of classes am I trying to build?) make this incredibly accessible, I think.
And this is how I have come to use it, also with the help of very smart people in the field. Okay, so now that I've undersold machine learning and everything about it, and you can just go and Google everything I'm about to talk about, let me spend the next few minutes on three real examples drawn from my research workflow, just to give, again, normal, everyday examples of how machine learning has worked its way into it. My goal is to demonstrate three problems for which I chose machine learning algorithms. They may not even be the optimal algorithms, but they worked well for me and allowed me to move on and do science. These were three examples where I spent about a day, dug into some documentation, asked some people on Twitter for help, and then moved on with my science. All right, the first comes from the last chapter of my thesis, when I was working on a transiting exoplanet system where the planet goes in front of the star and occults not only the star but also its spots. What you get in the transit are these little secondary bumps, these little humps right here, when the planet passes over a spot. We wanted to track these. In this system you get many transits per rotation, so you get multiple passes over these spots, which lets you trace the spots and their evolution. It was a great problem, and we had years and years of Kepler data to do it with. From each of these example transits, we were able to stack up four years' worth of data on the longitude and the size, where I've encoded star-spot size as color and point size here. We were able to trace the locations and sizes of star spots over time using lots and lots of transits. So one transit would give you the locations: if you pass over this chunk of the star and there are spots, you get bumps that correspond to, maybe, these three right here.
And then we do it again, and again, and again. You have hundreds of transits. Okay, so this is a nice data set. It's very interesting. By eye, you can see lots of really cool features. There's maybe a long evolution of a big spot over here. This shift in longitude means that these spots are moving on the star; this is differential rotation, for those who paid attention in stellar astrophysics class. So there's cool astrophysics going on here, but it's noisy. There's all this other stuff. And I need some way to cluster. The question is: which of these are tracks whose evolution I can measure? How do they emerge? How do they decay? My thesis advisor said: get a pencil and go draw a bunch of circles. And that actually is a very performant algorithm for this amount of data. You can get a literal pencil, or maybe a computer, and draw circles. Let's not undersell this; it's important even in modern machine learning. There is an outstanding problem: where do we get training data? When we have LSST, you need somehow to train LSST algorithms to identify things, and we don't have a spare LSST running where we know the answer to everything. So circling things with a pencil is actually a pretty good algorithm. But I wanted to play around. I wanted to learn some machine learning; I thought this was cool. So I went back to this clustering figure and asked: is there anything here that can pick out clumps? Clumps that are sort of oblate. I looked at this row in the documentation and said, OK, these kind of look like what I'm looking for, these oblate smears of data, and I need something that's going to clump them together pretty reliably, segmenting them along the feature. So this one is out because it splits the features. This one's out; it splits the features, and there's a lot of noise here.
That left two algorithms: one called DBSCAN, whatever that is, and one called Gaussian Mixtures. And I know what Gaussians are, so that seemed interesting. So I read the documentation for these two algorithms and tried them. Again, these are performant: that amount of data took, you know, five seconds for these algorithms to run. Now, there was a problem: Gaussian Mixtures was actually a pretty good algorithm, but you had to pre-define how many Gaussians you wanted to fit. You had to say, OK, I want 100 Gaussians or something, and that slowed me down a little bit. Whereas DBSCAN would figure out how many clusters there were and which points belong to which cluster. And I thought: ah, that's my algorithm right there. And so on we move. DBSCAN stands for density-based spatial clustering of applications with noise. That was the other nice parameter: you can give it an error-bar scale, which is important. A lot of machine learning algorithms not only don't produce error bars, they don't take error bars into the calculation at all. That's another outstanding problem, though new algorithms are getting better. And here is the example of just naively letting it run with essentially the default parameters: import DBSCAN, run DBSCAN, go. This is what it produced. And, OK, it's not perfect, right? There's some wacky junk. It left out some features; I don't know what's going on here. But the point is that a lot of these clusters were real, and I could then do science on them. As an example, here was one (I don't remember which, maybe this one) with a very nice emergence and decay. And while this is a little ratty, and it's not big data by any means, it's comparable to the best star-spot evolution measurements we typically make on the Sun. So, science was had. Papers were moved forward.
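As a hedged sketch of "import DBSCAN, run DBSCAN, go" (the actual star-spot data isn't reproduced here, so these are synthetic oblate clumps plus noise), the eps parameter plays the role of the error-bar distance scale mentioned above:

```python
# Synthetic version of the star-spot clustering: two oblate clumps plus
# scattered noise points. DBSCAN picks the number of clusters itself and
# labels leftover points as noise (-1); eps sets the distance scale.
import numpy as np
from sklearn.cluster import DBSCAN

rng = np.random.default_rng(0)
clump1 = rng.normal([0.0, 0.0], [0.5, 0.1], size=(100, 2))  # oblate smear
clump2 = rng.normal([3.0, 1.0], [0.5, 0.1], size=(100, 2))  # another one
noise = rng.uniform(-2, 5, size=(20, 2))                    # wacky junk
X = np.vstack([clump1, clump2, noise])

labels = DBSCAN(eps=0.3, min_samples=5).fit_predict(X)
n_clusters = len(set(labels)) - (1 if -1 in labels else 0)
# Unlike GaussianMixture, we never told it how many clumps to expect.
```

The contrast with Gaussian Mixtures is exactly the point made above: GaussianMixture requires n_components up front, while DBSCAN discovers the cluster count and flags noise on its own.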
I graduated with my PhD. Thank you. OK, example number two. Another example that came out of my thesis work was looking at flares, another form of magnetic activity on the surface of stars. This is a flare from the Kepler mission. The details aren't particularly important other than that the star is variable (that's this blue line, this broad variability going on), and then all of a sudden there's this explosive event called a flare. We can all identify the flare by eye pretty straightforwardly. The science question, which we can talk about if you're interested, is the quasi-periodic behavior, which might have something to do with MHD waves coming off the surface and interesting astrophysics, blah, blah, blah: is there a quasi-periodic signal in this decay phase? There's definitely structure, but is it just a superposition of little spiky features, or is it actually a flare plus some kind of damped sinusoid? There is science behind this question, and that's what we were after. Now, there was a previous paper that went after this by fitting lots of big flares with a simple harmonic oscillator function, and if it fit pretty well, they called it a damped sinusoid. There are some problems with that approach: lots of things can kind of fit well that aren't actually damped harmonic oscillators, and it's difficult to decide what counts as a good enough fit. Is there a real period? Is the period stable? Is it actually just stochastic noise that happens to look like a sinusoidal signal? And it can't deal with things that are quasi-periodic, like a chirp, something that changes its frequency over time. Instead, we can use an algorithm that is very popular, and I would recommend every person, especially every student, Google this phrase: a Gaussian process.
This is a way of representing, say, a light curve, even a noisy one. It can model a light curve using a very flexible, sort of semi-parametric function that you train based on a generalized shape you're looking for; in this case, a damped sinusoid. It goes through and tries to find a flexible fit to the data. So the blue is the actual damped-sinusoid fit. You can see it's not a wonderful fit; it actually totally disagrees over here, but it's noisy, so who cares. And the orange is the suspiciously good Gaussian process fit to the same data. Again, it's a very, very flexible algorithm, and that's not a bad thing. It all comes down to what's called the kernel, which for a Gaussian process is where you prescribe the general shape of the thing you're looking for. We subtracted the flare off, so we didn't have to model the exponential part; instead, we just looked for a damped sinusoid and fit it with a simple harmonic oscillator kernel. Again, this comes out of the box. And we get this very nice, very straightforward fit. This was actually a project I had an undergrad work on. He spent a couple of weeks on it, learning the documentation. He shot a couple of messages to the author of the Gaussian process code celerite, Dan Foreman-Mackey, who was very responsive; I would recommend talking to him, or to anyone who has worked with celerite, if you want to learn more about it. But the point is, it was so simple that I could describe the problem and let the student take it, and not only did he fit this one event, he was able to classify hundreds of other events and do a sort of objective search. This one had no clear period; this one had some candidate periods that were interesting.
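The work described above used celerite's simple-harmonic-oscillator kernel; as a stand-in sketch (my substitution, not the actual code), scikit-learn's Gaussian process with a periodic ExpSineSquared kernel damped by an RBF factor plays a similar quasi-periodic role. All data here are synthetic:

```python
# Stand-in quasi-periodic GP fit with scikit-learn (the talk used
# celerite's SHO kernel): a periodic kernel times a slowly varying RBF
# envelope, plus a white-noise term, fit to a synthetic damped sinusoid.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ExpSineSquared, WhiteKernel

rng = np.random.default_rng(1)
t = np.linspace(0, 10, 200)[:, None]
signal = np.exp(-t.ravel() / 4.0) * np.sin(2 * np.pi * t.ravel() / 2.0)
y = signal + rng.normal(0, 0.05, t.shape[0])          # noisy "flare decay"

kernel = (1.0 * ExpSineSquared(length_scale=1.0, periodicity=2.0)
          * RBF(length_scale=5.0)                     # damping envelope
          + WhiteKernel(noise_level=0.05**2))         # photometric noise
gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True,
                              n_restarts_optimizer=2, random_state=0)
gp.fit(t, y)
mu, sigma = gp.predict(t, return_std=True)            # fit + uncertainty
```

The kernel is where the "generalized shape" lives: the periodic term encodes the oscillation, the RBF term lets the amplitude and phase drift, and the white-noise term keeps the model honest about the scatter.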
And then you can couple the Gaussian process fit with MCMC, which I'm not going to talk about but which you can also Google, and get robust errors. That's a nice thing: you can explore parameter space pretty efficiently, again on a laptop, and get uncertainties on these periods and on the coupling between the decay and the period. So this, to me, was a great success. We had this idea and could run with it in a few weeks. Machine learning: boring, easy, and reliable. Perfect. My third and final example is something I did during a hack day, or a hack week, recently, when the Gaia data came out. A friend of mine and I had observed, as others have as well, that when you make these color-color diagrams for stars, so this axis is an infrared color, and this is the Gaia G band, roughly your visible band, minus 2MASS J, so optical-infrared here and near-to-mid-infrared there, and when you color these dots (30,000 stars from APOGEE) by their APOGEE-measured metallicities, there's a nice little gradient. You can see that something in W1 minus W2, somewhere in the 3.4 or 4.6 micron bands, some little opacity feature or something, tells you, very coarsely, the metallicity of stars. That's kind of cool, and broadly useful, again, for galactic structure and stellar astrophysics. But how can we use it? If you handed me a star, I could drop a dot on this diagram and say, well, it's sort of bluish, so, I don't know, minus 0.2 or something, right? We need a way to characterize this surface. You could build some sort of polynomial fit; I've just drawn a curly line here in PowerPoint, some sort of eigenfunction, I don't know. You could build some complicated polynomial to fit the surface. But I don't know what the shape of that surface is, and I could spend days just monkeying around with different shapes.
And it's tedious, and it's difficult to add another dimension: what if you wanted a four-dimensional thing? My brain doesn't think in four dimensions very easily. Instead, you can appeal to a very simple, flexible machine learning model. In this case, we chose something called k-nearest neighbors. This is, again, sort of like clustering: it takes each point and looks at its neighbors, you tell it how many neighbors to consider, and it basically builds a flexible surface of averages. It's something like a big spline fit, really. It's not a very complicated algorithm, and it's extremely fast. (There are some very terrible, hacky ways to estimate an uncertainty from it, but they're not very good.) In this case, we fed it the colors as the X data, a two-dimensional array, though we could also have used three colors, or color-magnitude if we'd wanted. The Y data we're training it to fit is the [Fe/H] metallicity. So this is a supervised learning method: we know the answer for 30,000 stars, here's the input versus the output. You can tell this is not a great fit, but again, this was a hack day; we were just playing around. Here is one of those figures where I just show how easy it is: there's an import; I imported it, then I ran it. It's very simple. This is, again, just using the stuff out of the documentation. And the point is, you can then feed it a million new stars with no spectra, and it takes the surface it figured out and applies it to the new data. Okay, the fact that the result looks kind of like the other figure is by construction; I told it it has to look the same. But this million new stars was taken from all over the sky, once Gaia came out.
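A minimal sketch of the k-nearest-neighbors regression described above, with entirely synthetic colors and metallicities standing in for the APOGEE training set:

```python
# k-nearest-neighbors regression: learn a flexible "surface of averages"
# mapping colors -> metallicity, then predict for stars with no spectra.
# All numbers here are synthetic stand-ins for the real training set.
import numpy as np
from sklearn.neighbors import KNeighborsRegressor

rng = np.random.default_rng(2)
n_train = 30_000
colors = rng.uniform(-0.5, 2.0, size=(n_train, 2))    # two fake colors
feh = (-1.0 + 0.6 * colors[:, 0] - 0.4 * colors[:, 1]
       + rng.normal(0, 0.1, n_train))                 # fake [Fe/H] labels

knn = KNeighborsRegressor(n_neighbors=50)             # average 50 neighbors
knn.fit(colors, feh)                                  # supervised training

new_colors = rng.uniform(-0.5, 2.0, size=(1_000, 2))  # photometry only
feh_pred = knn.predict(new_colors)                    # [Fe/H] from colors
```

Adding another color or a magnitude is just another column in the X array, which is exactly the advantage over hand-tuning a polynomial surface in each dimension.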
And what's nice is that if we then color the sky positions (here's RA and Dec; excuse me for not labeling my axes), you can clearly see the plane of the Milky Way with higher metallicity; these are younger stars. The LMC actually shows up down here as well. There are some interesting artifacts and structures here. But what you get is something that has astrophysics baked into it, where we just trained it naively on a color-color diagram. Okay, so this is, again, an example of a very flexible, very easy to use, well-documented model that you can import and run, and with a little bit of help, you can make some intelligent decisions about it. So, in conclusion, machine learning is easier, and thus more boring, than ever to use, and therefore you should all use it. It's not something to be scared of. It's not something that needs vast expertise to learn a little bit about. Now, I am totally underselling machine learning, because it is a cutting-edge, amazing topic that people are investing their entire careers in. So don't let me undersell the value, the interest, and the potential for machine learning to do amazing and subtle things with huge data sets. My point is that for many of us who think in modest-sized classes of problems, machine learning is very accessible to learn and use. I've given you examples of a clustering problem, a simple regression problem, and a very popular new tool, Gaussian processes. That last one I didn't use from astroML or scikit-learn; I used celerite. But there are several implementations of Gaussian processes now, useful for many, many kinds of problems. And so, with that, I will take any questions you have, provided, again, that I'm just a normal person.