All right, so sorry for all the technical stuff, but it seems to be working now. Okay, so I'm going to be talking about my new R package called kinfitr, which does PET kinetic modeling in R, along with several packages that work alongside it to make kinetic modeling of PET data a little bit easier. So what I'm going to talk about today: first a quick introduction to PET imaging itself, then a little introduction to PET quantification for those of you who don't know much about PET. Then I'll talk about the package, its aims and its contents. And for those of you who don't care about either PET or the package, but who do care about R at least, I've got a little section at the end about new tools for working with nested data frames, which are a really powerful set of tools from several packages released by Hadley Wickham.

So I studied cognitive neuroscience, and we were always shown images like this. We were always told: fMRI is that really cool thing with amazing spatial resolution, but it hasn't got the best temporal resolution. If you want temporal resolution, you move to EEG and MEG, which have awesome temporal resolution but terrible spatial resolution. Oh yeah, and there's this other thing called PET, and it sucks at spatial resolution, it also sucks at temporal resolution, it's also invasive, and it's basically old, sucky fMRI. So the question framing this slide is: does PET suck? And my answer is going to be that no, PET does in fact not suck.

First, just to settle a score on resolution: PET doesn't actually have such bad resolution as they say. The HRRT camera that we have at our PET center has a 1.5 millimeter full width at half maximum at the center of the field of view. But that all misses the point of what PET does. PET has specificity. With PET we can make different tracers for different targets and then specifically look at that target in the brain: specific receptors, specific transporters, specific enzymes. These are all individual PET measurements with individual PET radioligands, but we can specifically examine that chemical in the brain. And sensitivity is basically PET's trump card. MR, where you can see gray matter and white matter, works at approximately millimolar concentrations, while with PET we're talking about picomolar sensitivity. So on the third dimension of that graph on the previous slide, we're winning by roughly a billion times. What this means is that we can conduct studies like this, where we look at proteins that are present in the brain in very, very low concentrations at baseline, then give people some kind of medication that blocks those same targets, also in small amounts, and see dramatic differences in the signal itself. So this one, for example, is specifically looking at the serotonin 1A receptor, and this one at the serotonin 1B receptor. Basically, specificity and sensitivity are what PET is about, rather than spatial or temporal resolution.

So, to really briefly explain what PET is: we have physicists who work with a cyclotron and basically create radionuclides.
They then send the radionuclides, which are radioactive atoms, to the chemists, who attach them to ligands that bind to proteins in the brain. The chemists create the radioligand and send it over to the camera, where it is injected into participants. It travels up into the brain, and as the radionuclides decay, they each release a positron. That positron travels a short distance and collides with an electron, releasing two gamma photons at approximately 180 degrees, which we then detect in the detector ring. From that we can reconstruct the image, frame by frame, and get PET images that we can actually quantify.

So these are the kinds of images we get. I'm cheating a little bit here, because this is an average of several people, but nonetheless it's quite different from what fMRI shows you: we can actually see the different regions, and we can see the concentration of different proteins in those regions. What we typically do is define a region of interest, say the dorsal striatum here. Then we look at each frame of our PET image, define the ROI on each frame, and extract what are called time activity curves (TACs): how much radioactivity there is in a specified part of the brain at each point in time over the course of the PET measurement. And from that, we can get information about how much of the protein of interest was there in the first place.

In terms of PET quantification, the classic outcome measure is the volume of distribution, VT. This asks: what is the concentration of the radiotracer in a specified region of the brain relative to the concentration of radiotracer in the blood at equilibrium? That's what defines VT, the volume of distribution. So let me work your intuition pumps a little here. Imagine that at time zero we were theoretically able to stick all of the radiotracer into the brain: it's all in the brain and none of it is in the blood, although it's free to pass into the blood. We fix it all there, stop time, and then start time again and let everything happen. What happens next: does the amount in the brain increase or decrease? Nothing can come in, so it can only leave. It decreases, all right? And does that decrease get faster or slower with time? Slower, exactly. So we basically have exponential decay: at time zero we have 100%, a bit later we have, say, half of that, then half of that, then half of that again. So the idea is simply that we have exponential decay.

The only problem is that the tracer is not all concentrated in the brain at time zero. So what we have to do is measure the amount of radioactivity in the arterial plasma using arterial cannulation. This is pretty invasive: we actually have to measure from the artery, the blood being pumped out from the heart, rather than from a vein. To give you an idea of what this means for quantification, imagine we have a bathtub. This comes from a colleague of ours, and it's a really great thought experiment. Say we dump a bucket of water into the bathtub and then we watch how quickly that water leaves the bathtub through the plughole, all right?
That's like the first example, where we have everything concentrated in the brain and then it starts leaving, all right? But then at time two we dump in a big bucket of water, that's as we reach the peak, then at time three a smaller bucket, and then gradually smaller and smaller buckets of water, and we watch as they go out through the plughole. Basically that gives us the exponential decay, but convolved with the input function. So we have this input function over here, and we convolve it with the exponential decay function, the impulse response function. That's the function describing what would happen if everything were there at the beginning and simply left; we convolve it with the input function to create a TAC, and that's what we've measured. What we do to fit this is non-linear fitting: we guess values, in this case for K1 and k2, the rate constants, iteratively, until the model output matches the TAC. So in this case we have the one-tissue compartment model in blue and the two-tissue compartment model in red; the two-tissue model fits really nicely, and the one-tissue did not. Is this feeling relatively clear? So you have to do this for every single place? For every single region, exactly. And then, once we have these rate constants, the model is just a set of differential equations, and we can calculate the outcome measure from them: for the one-tissue compartment model, VT, the distribution volume, is K1 over k2.

The situation we have in PET, though, is that there are loads of different models. We have compartmental models, as I've just shown you, the one-tissue and two-tissue compartment models, and different models work well for different tracers. Here's the one-tissue compartment model, which is pretty simple. The two-tissue compartment model assumes that the tracer is actually binding to the specific protein: this compartment is the specific binding, and this one is the free ligand along with the ligand bound to non-specific stuff. So the more complex model has four rate constants, for example. For different tracers, different models fit better. We then also have linearized models: linear approximations of the compartmental models, largely independent of the compartmental configuration, and sometimes these work better than the compartment models, especially since with lots of parameters we can get noisy estimates. We then also have reference region models. When you're able to use these, a little angel of research has come to you and blessed you with a nice tracer: there's a region of the brain with no specific binding, so CS, the specifically bound concentration, is zero there. Assuming that the non-displaceable binding is similar between the target and the reference region, we can calculate the specific binding relative to the non-displaceable compartment. In this case we also don't need blood sampling, which is better for patients, better for everyone involved, actually. And we calculate what's called binding potential.

So the problem, really, is that there are many different models. There are models for the TACs, as I've shown you, but there are also models for the blood, and models for the metabolized fraction, because while the tracer is in the blood some of it is slowly being metabolized, so we also measure that and correct for it. And there are different quantification decisions.
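Before going on, just to make that fitting procedure concrete for the R-minded: here's a minimal, self-contained sketch with entirely made-up numbers, in plain R rather than kinfitr, of simulating a TAC as the input function convolved with the impulse response function K1 * exp(-k2 * t), and then recovering K1 and k2 by non-linear least squares.

```r
# Minimal sketch: simulate and fit a one-tissue compartment model.
# All numbers are made up; this is plain R, not kinfitr.

set.seed(42)
t_min <- seq(0, 90, by = 0.5)              # frame times (minutes)
cp    <- 100 * t_min * exp(-0.3 * t_min)   # toy arterial input function

# Tissue curve = input function convolved with the impulse response K1 * exp(-k2 * t)
onetcm_tac <- function(K1, k2, t, input) {
  irf <- K1 * exp(-k2 * t)
  dt  <- t[2] - t[1]
  convolve(input, rev(irf), type = "open")[seq_along(t)] * dt
}

# "Measured" TAC: the model curve plus a little noise
tac <- onetcm_tac(K1 = 0.1, k2 = 0.05, t_min, cp) + rnorm(length(t_min), sd = 0.5)

# Non-linear fitting: iteratively adjust K1 and k2 until the model output matches the TAC
fit <- nls(tac ~ onetcm_tac(K1, k2, t_min, cp),
           start = list(K1 = 0.05, k2 = 0.1))

VT <- coef(fit)[["K1"]] / coef(fit)[["k2"]]   # distribution volume: VT = K1 / k2
VT
```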
So, as I've mentioned, different models fit better for different tracers, but then there are also the assumptions of the fit: we can make different assumptions for different tracers, and we can also modify the weighting scheme if a particular scheme doesn't work for a particular tracer. So the story at the end of it is that there's a lot of analytical flexibility, which makes it extremely important to actually demonstrate the steps that have been taken, and it makes reproducibility really important.

On reproducibility, this is a slide from Kirstie Whitaker in Cambridge. It describes the sort of minimal requirement that the same code and the same data should produce the same output. And if there are clicky buttons involved, then you don't even have that. It's a really minimal requirement, and that requirement is lacking for a lot of the software out there. At present, the software that's available for this kind of analysis includes a lot of in-house solutions in many groups, which makes their analyses impossible for other groups to reproduce, because they just say, well, trust us, our model works. There's also a commercial graphical user interface, but aside from being extremely expensive, it is a graphical user interface, which is a problem in and of itself, because there's no scripting: it's only reproducible if you have that software yourself, so everyone else has to buy the software to be able to check the analysis or to run it. And to be fair to them, there is an audit trail option, but you double the price to pay for that. Even in our group, which is a big group, we only have a few keys for the software because it's so expensive, so people have to hand their key over to someone else and then can't run their own analysis. Very recently a new system has also been released called MIAKAT, which is MATLAB-based. So it's scriptable, and it has a really pretty GUI, but we don't need to talk about the GUI. But this also means it's only reproducible with MATLAB, and as soon as you're talking about MATLAB you're missing out on all the other open-source goodies, all the cool solutions people have developed around open-source code.

Speaking of reproducible research, that's one of the aims of kinfitr: to make this a little bit easier for the field. The basic idea is that data and code should be reproducible, so that the results and the figures can be recreated by yourself as well as by another group. I'd really recommend Karl Broman's steps toward reproducible research, where he shares an excellent email he received. This is about being reproducible for your own sake. The email goes something like: hey Karl, this is very interesting, however you've used an old version of the data, N equals 143 instead of 226, I'm really sorry you did all that work on the incomplete data set. But with a reproducible workflow, everything is within the script, everything is automated, the reports are all scripted, so you can simply put in the new data, press go, and you regenerate all your figures, all your tables, all the values in the text, automatically. That's the idea with reproducible research: you can recreate your own work, retrace your own steps, and adapt to changes as they come.
So kinfitr itself aims to be a lightweight R package for PET quantification. Being in R, it promotes reproducible analysis, because R makes it much easier to share code, and by being lightweight it promotes reproducible reporting through R Markdown: you can write your reports and just include little chunks of R code. It should also hopefully promote transparent research practices. Using GitHub, for example, the first step I'm hoping for is that people will start sharing their full analysis reports as webpages. We're actually currently preparing a poster for a conference, and we're going to include a little QR code at the bottom so that people can go to the webpage hosted by GitHub and see the entire analysis, step by step by step, and see exactly what we did. Later, hopefully, there will also be sharing of data alongside the analysis. The other idea is that this should promote data sharing, because the tool starts at the extracted time activity curves, which gets around all the excuses people have had: oh, the image files are too big, we can't share those with you; or, oh, there are anonymity issues because you can see people's faces in the images. Starting from little text files containing time activity curve values, which can just be vectors of values, minimizes the anonymity issues. But it must be said, the field is lagging far behind on open data practices, primarily, I think, because a measurement costs in the region of 80 to 120 thousand crowns. People have collected data, they've spent a lot on that data, and they really don't want to release it unless everyone else starts releasing theirs too. So hopefully this will make it a little bit easier.

The package is hosted on my GitHub page, with a README describing exactly how it works. Installation is really simple: if you have devtools, it's just a one-liner with devtools::install_github(). At present I have implemented quite a few models, but several are still to come: a pile of reference region models, a bunch of arterial input models, models for reversible and for irreversible ligands, and also a model for quantifying VND, for those who know what that is. There are still lots more models to implement, which are gradually being added as I have the time.

kinfitr takes simple input: vectors of times and radioactivity concentrations. I've tried as far as possible to avoid the sort of enormous, complex data structures that many other tools use; our in-house MATLAB tools have these enormous structures that contain all of the data in different bits and pieces, and it becomes very difficult to know what's going on, which in my opinion is the nature of MATLAB, but whatever. Model fitting then occurs with a single line of code for each model. The parameters include all the things you might want, as well as optional settings such as parameter starting values and limits. The output comes as a model output object containing all input parameters, all fitted parameters and all the details, so you can trace exactly what was done just from the output. It contains all the inputted and all the fitted data.
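Roughly, using a reference region model as an example, usage looks something like the sketch below. Treat the repository path and the function and argument names as my approximation of the interface described in the README rather than verbatim documentation.

```r
# Sketch only: the repository path and function/argument names below
# approximate the kinfitr interface; check the README for the real ones.
# install.packages("devtools")
devtools::install_github("mathesong/kinfitr")
library(kinfitr)

# times, cerebellum and striatum are assumed to be plain numeric vectors:
# frame times and radioactivity concentrations for two regions
srtmfit <- srtm(t_tac = times, reftac = cerebellum, roitac = striatum)

srtmfit$par            # fitted parameters, e.g. binding potential
plot_kinfit(srtmfit)   # a ggplot2 object, so you can add "+ theme_bw()" etc.
```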
The output also contains a fit object, so you can just run, say, the AIC or BIC commands on it and get back everything you want to know about the fit itself. And in terms of plotting, you just run plot_kinfit on a model output and you get your plot. That's a ggplot2 object, so it's customizable in place by just adding a plus and whatever you want.

So now is the point for those of you who don't care about PET and don't care about kinfitr but are interested in R, which I hope is at least some of you. The consideration for doing this in R is that fit objects are lists containing all the information, which means there's lots of information of different lengths: there's blood data, which is really long; there are TACs, which are much shorter; and then there are output parameters, which are extremely short. R itself works really nicely with data frames; R loves data frames, and tibbles, which are a different implementation of them, but they require all columns to be the same length, which makes this all pretty difficult to implement. However, another talk I highly recommend is Jenny Bryan's talk from Plotcon, which is on YouTube. She talks about how a lot of what we call data wrangling is in fact data rectangling, and that's basically what nested data frames allow you to do: the problem here is rectangling the data, and we can rectangle it using dplyr, tidyr and purrr.

The basic idea, and this is from Jenny Bryan's talk, is that normally you have a data frame where each cell is a little unitary number or character string or whatever. And you can have, say, three blues and two oranges and four whites, and it becomes really difficult to put everything together. With nested data frames, you instead simply have each thing as one row, with the nested structure inside it, so we can put our entire list, with all of our stuff, into one of those nested cells. So let me give an example. I have an enormous list called tacdat, where each PET measurement is an item, and within each PET measurement is an enormous data frame with the TACs for all the regions of that particular measurement. We take that and make a data frame: we use map, which is from purrr, and can either pull things out of lists or perform operations on lists. From tacdat we pull out the TAC data, we bind_rows to stick it all together, we make an ID column from the names of the first level of the list, we group by PET measurement, we nest, and there we have it: the PET names, and inside each element of the list-column a table containing a 14 by 330 data frame. We've basically made chunks that are PET measurements, one chunk per row, and that's the theme of all of this: we create chunks that are the size of the bites we want to take. So now we take that data frame and group it by PET. What we want to do here is fit one model, called MRTM1, take one of its output parameters from one region, and use that for fitting all the other regions.
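Sketched roughly in code it looks something like this; the column names and the exact kinfitr argument names here are assumptions rather than the code on the slide.

```r
# Sketch: column names (PET, tacs) and the kinfitr argument names are assumed.
library(dplyr)
library(purrr)

dat.df <- dat.df %>%
  group_by(PET) %>%
  mutate(
    # fit MRTM1 to the nested TAC data of each PET measurement
    fit_mrtm1 = map(tacs, ~ mrtm1(t_tac  = .x$times,
                                  reftac = .x$cerebellum,
                                  roitac = .x$striatum)),
    # pull k2prime out of each fit object as a plain number
    k2prime   = map_dbl(fit_mrtm1, ~ .x$par$k2prime)
  )

plot_kinfit(dat.df$fit_mrtm1[[5]])   # inspect the fifth fit
```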
So we can mutate, which means adding an extra column, MRTM1fit, and we map; I guess it's at the edge of the screen, but the tilde means that we perform that operation on each item, so we perform the fit and send it into the MRTM1fit nested list-column. Then we mutate with map_dbl, which is map that pulls something out of a list and returns it in numeric format: we go into each fit object, take the parameters, take k2prime, and pull it out. And just like that, we've made all the fits into the MRTM1fit nested column and pulled all the k2prime values out of it. Then if we want to plot one, we just run plot_kinfit on, say, the fifth element of MRTM1fit, and if we want to modify the aesthetics we just add a plus something, so if we want an old-school Excel-looking plot we can just add theme_excel and we have an Excel plot. That package is really cool, actually.

So what we can then do, since the idea is to use this k2prime for all the other regions: normally we would create a long data frame, but here we want to make it long while keeping bite-sized pieces the size of the bites we want to take. So we say long_dat is dat.df, we select the columns we want, we unnest them so the data goes on for miles out to the side, we gather so it becomes an incredibly long data frame, and then we group by the things we want and nest. Now we have a 9810-row nested data frame with the PET measurement, the region, so we can fit for all the regions, the k2prime we fitted for that particular measurement, and the TACs containing the data we want. So these tools are incredibly powerful: they give real convenience for the kind of model fitting operations we want to do, moving things in and out of chunks, where otherwise we'd have really enormous data frames that become difficult and unwieldy.

In terms of next steps, I'm looking for students, so if anyone is interested, potentially a master's student, to validate kinfitr against other software; we've already taken some preliminary steps and found extremely high correspondence. Also, if anyone happens to know of any grants for this kind of thing, basically open-source development rather than just comparisons between small groups of people, do let me know. And if anyone is interested in PET after this, there's a PET course happening towards the end of the month, so you can contact Anton Forsberg, and if you want to contact me in general, that's my name and my contact details up there. So I think there are about five more minutes, so we can either take questions or I can show you a demonstration. Maybe we just see how many questions there are and go from there. Questions? Nobody's waving, I think we can do the demo.

Okay, I'll just quickly go through a couple of things. So here is the basic document, the notebook we're talking about. We load in the data, we have everything here. We put in these little R chunks, so we just insert a chunk like this and we have a new chunk, and in those chunks we write our code and we get all our outputs inline; you can turn that off, but I think it's kind of a cool way to work. And at the end, what we do is perform a little test-retest analysis using this data set and look at the output.
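For anyone who hasn't used R Markdown before, a chunk plus an inline value looks roughly like this; icc_value and participants are just placeholders for objects computed further up the document.

````markdown
```{r icc}
# computed from the data earlier in the real document; hard-coded here as a placeholder
icc_value <- 0.91
```

The ICC value in the caudate for `r nrow(participants)` participants was `r icc_value`.
````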
And then, to talk about the reproducible reports: we write "the ICC value in the caudate for", then a mini inline chunk with r and nrow of the participants data frame, "participants was", then just the ICC value, "and the absolute variability was", then the absolute percentage difference value. And if we do that, all of our values go directly into the page, into the document, into the article, without our having to worry about anything. If we just change the data, all our figures and all the tables just arrive, everything arrives. So here, if we just knit this, and I hope it works, we can choose our format, but I'm going to go for HTML. While this runs, it's fitting, how many is it, I think 150 regions, and there, it's all generated, it's just done everything: we fit models to the TACs, we try things out, we arrange the data, we fit the k2prime.

Okay, so for those who are really interested in purrr, I've also included a little demonstration of pmap. map was the function we used before, but map2 lets you iterate in parallel over the elements of two lists, and pmap over the elements of arbitrarily many lists. Here I'm just going to use pmap; I don't actually need to, but it's a really nice example. We've done MRTM1 and got the k2prime value, which we need for the MRTM2 fitting. So we create a function taking the TACs and k2prime, we say what all the pieces are, we take the long data that we created before, we add an extra column called fit_mrtm2, and we just pmap over a list containing the TACs and the k2prime values. I've used this before in a case where we have our input data, a big 4096-row data frame containing all the blood data, we have all the TACs in another nested element, and we have the delay between the brain measurements and the blood measurements fitted in another fit object; pmap cycles through all three lists at the same time and does all the fits at the same time to produce the outputs. So then we can just do our MRTM2 fit, we go in, we get the parameters, we get BP, we get everything there, and we perform our little test-retest analysis. This uses another little package of mine with convenience functions, called granviller. We just send it all through in about six lines, we do the test-retest analysis, we produce all the values in a table, and then, as we've seen, the ICC value in the caudate for 30 participants was 0.91 and the absolute variability was 5.44%. And we're now writing our articles like this, doing our analyses like this; it's a little work at the beginning to get used to it, but after a while it makes everything so much nicer. Thank you. I think that's everything I've got to show. You have time for one question if somebody came up with something. Did anyone come up with anything? Okay, cool. Thanks.