Okay. Kia ora everybody. Welcome to the afternoon session of the astronomy miniconf. I'm Jessica Smith, the convener of the miniconf here today. We have three slightly shorter presentations for you this afternoon; you can see them listed up there. Alex is first up. He's going to be talking about searching for planets in eclipsing binaries in the MOA database. The MOA database is one that is yielding lots and lots of very useful results, even things it was not originally intended for, and we have three different takes on that data. After Alex, we're going to have Martin talking about data mining the MOA catalogue and the machine learning algorithms he's working on. And then, to finish up the formal part of our program, we're going to have Ashna, who is up here, talking about GPU-accelerated modelling of micro-lensing events. We heard a bit about micro-lensing earlier today from some of our other presenters. After that, starting at five o'clock, we will have a short 20-minute session of lightning talks. So if anybody has ideas about what they might want to present, five or ten minutes on anything astronomy and Linux or open source related, away you go. That would be awesome, and it's a great opportunity for all of you to participate as well. So I'm just going to hand you over to Alex, who will take it from here. Thanks, Alex. Sorry, it's not on the same screen. So you just need it mirrored? Can you see it? Yep, good. Hello, everyone. My name is Alex Lee. I'm a PhD student in astronomy at the University of Auckland under Dr. Nicholas Rattenbury, and I'm currently working in the MOA group, a micro-lensing research group which aims to detect extrasolar planets using micro-lensing techniques. Though, as you will see later, my project has nothing to do with micro-lensing anyway. So what else can be studied with MOA, other than micro-lensing? Actually, many things.
In fact, micro-lensing events are very rare, so optimizing the chance of observing such events requires long-term observation with a wide-field camera towards a dense region of stars like the Galactic centre. As a result, a huge data set of stellar objects has been collected by MOA, over almost eight years now. So there must be something very interesting in the MOA database just waiting for us to be discovered. I'm a PhD student and I have to write a thesis for my degree in the end, so I have to pick a specific topic. So I plan to design a project on detecting planets in eclipsing binaries, and meanwhile I may also try to search for new astrophysical phenomena. So what is an eclipsing binary? Just look at the image in the background. An eclipsing binary is a class of variable star in which a secondary star passes in front of the primary star repeatedly. So how do we detect planets in an eclipsing binary? Well, there are a few popular methods like radial velocity and the transit method, but none of them are suitable for MOA because of several technical limitations of the MOA telescope. So I'm looking for another method, and the promising one is the eclipse timing variation method. The idea is that every time the secondary star passes in front of the primary star, it creates a drop in brightness; this phenomenon is called an eclipse, and it usually repeats with a regular period. But if any planet exists in the system, it will create a periodic variation in the eclipse times, like the graph shown here. So this is the kind of timing variation I am trying to look for, to detect the existence of a planet. And in fact, there are already five planets in binary systems claimed to be discovered by this method by other groups. So it seems a promising way for us to have a go. Now we come to period-finding methods.
Before doing any exciting planet detection, I have to first identify eclipsing binary candidates in the MOA database. Unfortunately, the light curves in the MOA database are very ugly, nothing like the smooth, regularly sampled ones generated by a computer. So I had to look for algorithms capable of dealing with irregular sampling and big gaps in the data set. I also want to take the whole light curve into account, not just a segment. And I want the algorithm to be able to run the analysis on a large sample, over several thousand light curves, at a reasonable speed. So here are the methods I recognize as the most suitable for our purpose. All three methods, called the string-length method, the phase dispersion minimization (PDM) method and the conditional entropy method, require folding the light curve to see how the different parts of the data set overlap each other, and then minimizing some measure: the string length, which is the total distance between consecutive points; the theta statistic, which is essentially the in-bin variance relative to the overall variance; and the conditional entropy, respectively. The basic algorithm of these three methods is quite simple. You set the frequency range for the period search; then, for each trial frequency, you fold the light curve, which just means calculating the phase of each data point; then you bin or sort the data points of the folded light curve depending on the method you use; then you compute the measure, which is the signal value; and you let the process iterate over the frequency range. In the end you output the frequency with the minimum signal value. Here I take an example from the MOA database to demonstrate how the methods work. This is a light curve from the MOA database of an eclipsing binary. You can see there are many gaps in between.
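The loop just described can be sketched in a few lines. This is my own illustrative Python, not the speaker's C code; the PDM-style theta statistic (mean in-bin variance over total variance), the bin count, the frequency grid and the toy light curve are all assumptions for demonstration.

```python
import numpy as np

def pdm_theta(t, mag, frequency, n_bins=20):
    """PDM-style measure: mean in-bin variance / overall variance.
    A small value means the folded light curve lines up coherently."""
    phase = (t * frequency) % 1.0                 # fold: phase of each point
    bins = np.minimum((phase * n_bins).astype(int), n_bins - 1)
    in_bin = [mag[bins == b].var() for b in range(n_bins) if (bins == b).sum() > 1]
    return np.mean(in_bin) / mag.var()

def period_search(t, mag, freqs, measure=pdm_theta):
    """Scan a frequency grid; return the frequency minimising the measure."""
    scores = np.array([measure(t, mag, f) for f in freqs])
    return freqs[np.argmin(scores)], scores

# Toy light curve: a noisy 2.0-day sinusoid, irregularly sampled over 100 days.
rng = np.random.default_rng(0)
t = np.sort(rng.uniform(0.0, 100.0, 400))
mag = np.sin(2 * np.pi * t / 2.0) + 0.1 * rng.normal(size=t.size)

freqs = np.linspace(0.05, 1.0, 2000)   # trial frequencies, cycles/day
best_f, scores = period_search(t, mag, freqs)
```

Swapping `measure` for a string-length or conditional entropy function gives the other two methods with the same outer loop.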
So if I fold the light curve at a wrong period, the string of the light curve, formed by connecting the data points, just looks like a mess. It doesn't make any sense. And if you plot the variance against the phase, you can see the variance in each bin is comparable to the overall variance of the light curve. You can also see the data points spread over a large width, as shown by the two-dimensional histogram of the phased light curve. If we fold at another wrong period, the situation is quite similar to the previous one. But if we fold the light curve at the true period, things change dramatically. You can see the sharp shape of the string appears, and the variance in each bin is much smaller than the overall variance of the light curve. Meanwhile, the distribution of the points becomes concentrated, leaving the majority of bins with no points. Sorry, I haven't explained what the entropy is. Entropy is a concept from information theory. You can think of it as a measure of how compactly the data points are distributed over the space: the higher the entropy, the less compact the distribution. So here I show the result of the period analysis as a periodogram, which is just a plot of frequency against the signal values. You can see there is a sharp signal at the true period for all three methods. But actually, the minimum signal for PDM and conditional entropy is not at the true period but at half the true period. So PDM and conditional entropy find a sub-harmonic period instead of the true period. This may be a real problem if you want to develop an automatic algorithm for, say, classifying variable stars. But since I just want to identify candidates, it's still okay for me at the moment.
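The speaker's conditional entropy code is in C; here is an illustrative Python sketch of the measure as I understand it from the talk (the binning choices and magnitude normalization are my assumptions, following what I believe is the 2013 conditional entropy method of Graham et al.). A wrong trial period spreads the points over the (phase, magnitude) plane and gives high entropy; the true period concentrates them and gives low entropy.

```python
import numpy as np

def conditional_entropy(t, mag, frequency, phase_bins=10, mag_bins=10):
    """Conditional entropy H(m|phi) of the folded light curve."""
    phase = (t * frequency) % 1.0
    m = (mag - mag.min()) / (mag.max() - mag.min() + 1e-12)  # scale to [0, 1)
    i = np.minimum((phase * phase_bins).astype(int), phase_bins - 1)
    j = np.minimum((m * mag_bins).astype(int), mag_bins - 1)
    counts = np.zeros((phase_bins, mag_bins))
    np.add.at(counts, (i, j), 1.0)            # joint occupancy of the 2-D grid
    p = counts / counts.sum()                 # p(phi_i, m_j)
    p_phase = p.sum(axis=1, keepdims=True)    # marginal p(phi_i)
    with np.errstate(divide="ignore", invalid="ignore"):
        terms = p * np.log(p_phase / p)       # empty cells give NaN, dropped below
    return np.nansum(terms)

# Demo: a clean 1.7-day sinusoid, irregularly sampled (illustrative data).
rng = np.random.default_rng(1)
t = np.sort(rng.uniform(0.0, 50.0, 500))
mag = np.sin(2 * np.pi * t / 1.7)
h_true = conditional_entropy(t, mag, 1 / 1.7)    # low: points concentrated
h_wrong = conditional_entropy(t, mag, 1 / 1.3)   # high: points smeared out
```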
I am also interested in whether these methods are able to find weak signals in the light curve. So here is another example. I performed the period analysis on this sample with the three methods respectively, and I found that PDM and conditional entropy are able to find the true period of 2.25 days, while the string-length method failed to find it. So what happened to the string-length method? If we fold the light curve at that period, you can see a little bump in the noise. And if you look at the string, the length measure fails to give anything sensible in this case, while the conditional entropy keeps doing its good job. So that covers all the cases. I also wanted to study whether these methods can find periodic variation of any particular shape, and I was actually able to find this example with the conditional entropy method. It looks really strange, and I have no idea what it corresponds to; it may be an eclipsing binary with some kind of peculiar variability. So based on my experience, the best method for my purpose is the conditional entropy method. It is fast and simple to code, it is not as sensitive to noise as the string-length method, and it is shape independent, so you can use it to find a periodic signal of any shape. Here is part of my conditional entropy code. If you are familiar with C, you can probably appreciate the simplicity of this method; this is all you need to write to perform the whole period analysis. But if you're interested in the other methods, like PDM, there is open source code available on the web. I also wrote my own code for the string-length method based on the paper here, and of course I also wrote my own conditional entropy code. They are all written in C, but there is no particular reason for C.
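For completeness, here is an illustrative Python sketch of the string-length measure as well (the speaker's version is in C, based on a paper cited on his slide; the classic string-length reference is Dworetsky 1983, and the magnitude scaling below is one common convention, my assumption rather than his exact code).

```python
import numpy as np

def string_length(t, mag, frequency):
    """Sum of point-to-point distances of the folded, phase-sorted light curve.
    A correct period orders points along the underlying curve, so the 'string'
    is short; a wrong period shuffles them and the string is long."""
    phase = (t * frequency) % 1.0
    order = np.argsort(phase)
    p, m = phase[order], mag[order]
    m = (m - m.min()) / (m.max() - m.min() + 1e-12)  # comparable axis scales
    dp = np.append(np.diff(p), p[0] + 1.0 - p[-1])   # wrap last point to first
    dm = np.append(np.diff(m), m[0] - m[-1])
    return np.hypot(dp, dm).sum()

# Demo: clean 3.1-day cosine, irregular sampling (illustrative data).
rng = np.random.default_rng(2)
t = np.sort(rng.uniform(0.0, 40.0, 300))
mag = np.cos(2 * np.pi * t / 3.1)
sl_true = string_length(t, mag, 1 / 3.1)   # short string
sl_wrong = string_length(t, mag, 1 / 2.0)  # long, tangled string
```

As in the talk, this measure works well for clean curves but degrades for weak signals, where noise dominates the point-to-point jumps regardless of the trial period.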
You can use any higher-level programming language, like Python; they still work very well. But for Python I have one piece of advice: don't use histogram2d to bin the phase. From what I have tested, it slows the program down anyway, so I highly recommend you write your own binning code. In fact, especially in the case of the conditional entropy method, it's quite simple. Am I talking too fast? Okay. So in the end, I show some eclipsing binaries I found in the MOA database. My future work in this project is to detect any shift of the eclipses that may be associated with the existence of a planet in the binary system. You can see there are many kinds of eclipsing binary: this one is a contact binary, and this is some kind of detached eclipsing binary. I realized that the most promising candidates will come from the detached eclipsing binaries, because they have very sharp eclipses, which may make it easier to detect any shift associated with a planet. So, thank you. Thank you, Alex; we really appreciate you coming along and giving us this insight into your work. We do have a few minutes. Martin, if you want to come down and start setting up. Does anyone have any questions for Alex? Let's just get you the microphone. Yeah, I'll try to answer your question. I'm just curious: I noticed the method you settled on was released in 2013. Were you already working on this before that method was published? Sorry? The method you chose, the conditional entropy method, was published in 2013. Did you do any work before it was published, or were you already doing tests? No, I didn't make any contribution to that paper; I just used the method to write my code. That's the reason: sometimes the latest thing is the better thing to try, and this turned out to be the best method I found for my purposes. Okay.
Do we have any more questions for Alex? Okay. We'll just let Martin finish setting up. Martin Donicky is another PhD candidate working with Dr. Rattenbury, and he's going to be talking about data mining using the MOA database. The presentations this afternoon are all about using that data set in various different ways. We'll just let David finish up the miking. Just flick it on, Dave. Is that loud enough? Can everyone hear me? All right. All good? Okay, I'll pass you over to Martin. Okay.