So this is nice, both because some of the previous speakers have clarified a lot of things I won't have to talk about, and also because some of the stuff I'll show you directly addresses a couple of the questions. So I think it flows nicely. The general theme here is reproducibility in neuroimaging meta-analysis, and to me there are really two reproducibility challenges. You can carve the space up different ways, but one take on it is this. One challenge is what you could call standardizing and representing experimental procedures, which is what the first two talks were mostly about. The idea is that if we don't know what other researchers did, then of course we can't reproduce their results. You have to be able to extract, ideally in some machine-readable way, what someone else did before you can go out, try to do the same thing, and evaluate the results. The other side of this is synthesizing the results of many studies. It turns out that reproducible methods don't entail reproducible results. Even if we know what other researchers did, much of the time we still can't reproduce their results. Even if you did exactly what someone told you they did — with different subjects and other very minor differences, but essentially the same experiment — much of the time you still won't get the same result. And this is something we don't always appreciate. So the question is why? One answer, a big one, is sampling error: there's natural variability in the observations you actually get in any given run of data collection. Some of you may have seen this paper, which elicited mixed reactions from the community. I think it's a beautiful paper, and it should be required reading for everyone in neuroscience. It argues that in many domains of neuroscience, certainly including fMRI and neuroimaging, studies are very often underpowered — in fact, maybe the majority of the time. This is not a new argument; many people have made it throughout the social and life sciences. I've made it myself. Here's a figure from a couple of years ago, in response to what some of you may recall as the voodoo correlations controversy. The argument I made was that there really is a problem with fMRI studies — results look better than they should and effect sizes are inflated — but it mostly comes down to the fact that we're underpowered. Just to give you a sense, here's a typical correlation analysis, thresholded at a corrected 0.01, which is modal for an fMRI study, because it's a big brain and we do lots of comparisons. If your effect size is really large — say 0.5, which, coming from a psychology background, is an astronomical effect that doesn't happen very often — then you'll notice that if you have 100 subjects, this line over here, you're OK: your power is nearly one, and almost every time you'll detect the effect. But the modal sample size is more like 20, and then you're down here on this blue line, where you have maybe 10% power. If you allow for smaller effects — still quite large by the standards of psychology, and probably neuroscience as well — then you might be talking 2% or 3% power in a typical study. So this seems very strange.
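Just to make the power claim concrete, here's a minimal sketch of the kind of calculation behind a figure like this, using the standard Fisher z approximation for a test of a correlation. The threshold and effect sizes below are illustrative assumptions, not the exact values from the figure.

```python
import numpy as np
from scipy.stats import norm

def correlation_power(r, n, alpha):
    """Approximate power for a two-sided test of H0: rho = 0,
    using the Fisher z transform (SE = 1 / sqrt(n - 3))."""
    z_effect = np.arctanh(r) * np.sqrt(n - 3)   # noncentrality of the test statistic
    z_crit = norm.ppf(1 - alpha / 2)            # two-sided critical value
    return norm.sf(z_crit - z_effect) + norm.cdf(-z_crit - z_effect)

# Illustrative: a stringent voxelwise threshold, a "large" and a more modest effect
for r in (0.5, 0.3):
    for n in (20, 100):
        print(f"r={r}, n={n}: power ~ {correlation_power(r, n, alpha=0.001):.2f}")
```

The point of the sketch is simply that with a stringent threshold and a modest effect, power at n = 20 collapses, while at n = 100 it approaches one.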
How could we have this enormous literature — thousands of studies, really interesting results — and yet, when people actually sit down and simulate these things, it looks like power is extremely low? There's some kind of paradox there. But it actually turns out to be pretty easy to reconcile. Here's a simulation showing what's probably happening in many of these cases, and this is true not just in neuroimaging but in many other domains, certainly in genetics with GWAS studies, for instance. Let's say this is your two-dimensional brain, and every voxel has some effect size in relation to some behavioral measure. Nothing is ever truly zero at the limit, so much of this is potentially interesting, but the effects are not astronomically big. I've circled in white the regions that are somewhat stronger, just for illustration. This is the truth in the population. Now you come along and run a sample: you collect some amount of data, say 20 subjects, and you threshold very conservatively, because you're doing fMRI and you don't want any false positives. This is what you end up with in the typical case: FDR-corrected, 20 subjects, you get this one blob, and you say, oh great, here's the visual-spatial manipulation region of the brain, or whatever. It's not that this is a false positive — there is something going on there — but notice you've knocked out everything else, most of which was highly interesting. If you had more subjects, as you move this way, or if you threshold more liberally, you start to end up with something that looks like the truth. What's really insidious is that it's going to look to you like you have a really interesting result: you have this one brain activation that tells a clean story, and you miss everything else. So in practice, very low power doesn't always hurt you — in terms of getting a publication it can actually help you, because it makes for an easier story — but in terms of reproducibility, what happens the next time someone runs the same thing with exactly the same methods? They're not going to end up with this one cluster; they might get this one, or this one, or this one. So you have a literature where people are using more or less the same methods, assuming we resolve some of these other issues, and you still don't have any consensus from individual studies. That's a problem. Just to illustrate — and you've already heard about meta-analysis — here's a meta-analysis from Tor Wager's group from a few years ago. This is about 150 studies of emotion, and each point is an activation from a study. The point is that when you meta-analyze all of this, you get this nice representation, which is probably close to the ground truth for emotion, broadly speaking; but any individual study is not going to give you all of that. It's going to miss most of it, especially the smaller studies. So even if we could perfectly reproduce the methods, we'd still have uncertainty, and by aggregating a lot of data we can actually start to get to a consensus. There is still another problem, though. Let's say you could do this — and I'll talk a little bit about how; you've already seen some of these approaches — let's say you could do this on a large scale.
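As an aside, here's a minimal sketch of the sampling simulation just described. The "true" effect map, sample size, and threshold are made-up illustrative values, not the ones from the actual figure.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# "True" population effect sizes on a 50x50 voxel grid: mostly small,
# with a couple of moderately stronger patches.
truth = rng.normal(0.1, 0.05, (50, 50))
truth[10:18, 10:18] += 0.3
truth[30:40, 25:33] += 0.25

def one_study(truth, n_subjects, alpha=0.001):
    """Simulate one study: each subject contributes a noisy observation per
    voxel; return the map of voxels surviving a one-sample t-test."""
    data = truth + rng.normal(0, 1, (n_subjects,) + truth.shape)
    t, p = stats.ttest_1samp(data, 0, axis=0)
    return (p < alpha) & (t > 0)

for n in (20, 100):
    hits = one_study(truth, n)
    print(f"n={n}: {hits.sum()} voxels survive out of {truth.size}")
```

The qualitative pattern is the point: at n = 20, typically only a scattered handful of the truly non-zero voxels survive, and which handful you get changes from run to run.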
But even if you could do this for everything you could possibly care about — automatically synthesize all of this data — another problem comes up, which is one of interpretation. If you look at the literature — and most of these are ALE meta-analyses from all sorts of different domains published in the last couple of years: task switching, subsequent memory, smoking cue reactivity, rectal distention, empathy — I'm not going to go through them in detail, but take my word for it, these maps are not all that different. There are regions that show up time and again, and that presents a problem. You think of this as, say, the empathy network, but many of your regions are also showing up in meta-analyses of go/no-go, stop-signal, and other tasks. Almost by definition, if a brain region or network shows up all over the place, it cannot be doing something very specific. Just to reflect that, here's a plot from David Van Essen's SumsDB from a few years ago: a density plot of where the coordinates reported in studies fall. It's not uniformly distributed. The assumption that activation is equally likely to occur anywhere in the brain is simply not true; some regions are much more likely to activate, for example the anterior insula. By definition, these regions must be doing something that's not specific to a given paradigm if they show up in many, many different paradigms. Formally, this has been described as the problem of reverse inference: the fact that a task consistently activates a region does not mean that region is specifically involved in that task. There could be some function that is recruited across many, many different paradigms. A pain researcher will look at a region and say, aha, that's pain, because it shows up all the time; but someone in working memory would look at it and say, oh, that's a working-memory region. You wouldn't know unless you could actually compare the activation across many, many different domains — which is something you cannot do in an individual study. No matter how many resources you have, you cannot get every subject in your study to do a hundred different tasks, compare them, and see what the common denominator really is. So it's a big problem. How do we address it? There are various things you might do. The approach I've taken is to try to do large-scale, automated synthesis of the literature. Hopefully I've convinced you that results are going to diverge no matter how well you represent the methods. That's not to say we shouldn't standardize things — of course we should — but that's necessary, not sufficient. Accurate interpretation of our results is a very difficult proposition, and it requires the ability to aggregate and synthesize data across many, many different studies. That's hard to do. There is a way to make it easier, which is to give up the pretense that we're going to do a really stellar job of it, and just see how well we can do with really simple approaches and go from there. That's been the approach of a framework I call Neurosynth, which is a platform for large-scale automated meta-analysis of fMRI data. And it's complementary to other approaches.
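As a brief aside on the reverse-inference point: here's a toy Bayes-rule calculation, with entirely made-up numbers, showing how a high forward probability P(activation | task) can coexist with a low reverse probability P(task | activation) when a region activates across many paradigms.

```python
# Toy illustration of reverse inference (all numbers are hypothetical).
p_act_given_task = 0.80   # forward inference: region activates in 80% of pain studies
p_task = 0.05             # base rate: 5% of studies in the literature are pain studies
p_act = 0.50              # the region is reported in half of all studies, pain or not

# Bayes' rule: P(task | activation)
p_task_given_act = p_act_given_task * p_task / p_act
print(f"P(pain study | region active) = {p_task_given_act:.2f}")  # -> 0.08
```

So even though the region "lights up for pain" almost every time, seeing it active barely moves your belief that you're looking at a pain study, because it lights up for everything else too.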
So, as you'll see, all the stuff that BrainMap does well, this does poorly, and some of the stuff BrainMap doesn't do well — like scaling, which, as you heard, is very effortful — is not a problem at all with an automated approach. So they're nicely complementary. Really, we're trading quality for quantity, at least right now: we give up the illusion that we'll get it just right and see how far we can get. We're trying to extract and process as much of the published fMRI literature as we can, and the preliminary framework is based on two very simple assumptions, which to many people in this room will probably seem mind-bogglingly dumb, because they really are very simple. The first assumption is: if it looks like a duck, it's a duck. Which is to say, how do we extract brain activations from a paper in an automated way? We just look for anything that looks like XYZ coordinates. We find tables and say, look, there are three columns, they're labeled X, Y, Z, they contain numbers — that's brain activation. We don't worry about labeling what it means. This is potentially heresy to some people: how do you know if it's an activation or a deactivation? What experiment did it come from? We don't. It all goes into the database, tied to a particular article, banking on the fact that, if nothing else, this is already reasonably well standardized — most people report their activation coordinates in XYZ space. So that's one assumption. The other is to treat articles as just a bag of words. This is a very common approach in lots of domains. We take the full text of an article, extract the high-frequency words, and say, look, if a study uses the word "emotion" a lot, it's probably an emotion study, at least to a first approximation. Then we can do more clever things, like topic modeling and various other things we've done. But the bedrock assumption is just that you can tell what a study is about from the words it uses. Which of course will be wrong much of the time, and it doesn't give you the level of detail you just heard about: you don't actually know what the paradigm is; you know it's something to do with faces, but you don't know exactly what was involved. But that's the assumption, and it actually turns out to work quite well. What you get out of this is a database of several thousand studies, where you can now say, for example, give me all the studies of pain, pull them out, and — because you have a parser — pull out all the activations associated with those studies. And now, just as you would with ALE or any other package, you can meta-analyze those and generate a nice image. You can query for any term or combination of terms you want. The database now contains automated meta-analysis images for hundreds of concepts — essentially any word that occurs frequently, whether it seems sensible to you or not, you can run a meta-analysis on in seconds. There's activation data from over 6,000 neuroimaging studies, so it's two to three times the size of BrainMap, and probably half to a third of the quality; there's that trade-off again. And we have lots of other things: almost 200,000 functional connectivity and meta-analytic co-activation maps, which I'll talk about in a second. And the database is completely open.
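Here's a minimal sketch of those two assumptions in code — a naive coordinate-row detector and a crude term-frequency tagger. This is just an illustration of the idea, not the actual Neurosynth extraction pipeline; the regex and the frequency threshold are arbitrary.

```python
import re
from collections import Counter

COORD_ROW = re.compile(r'(-?\d{1,3})\s+(-?\d{1,3})\s+(-?\d{1,3})')

def extract_coordinates(table_text):
    """If it looks like a duck: pull anything resembling an x/y/z triple
    from whitespace-separated table text, with a crude plausibility bound."""
    coords = []
    for m in COORD_ROW.finditer(table_text):
        x, y, z = (int(g) for g in m.groups())
        if all(abs(v) <= 100 for v in (x, y, z)):   # roughly within MNI bounds
            coords.append((x, y, z))
    return coords

def tag_study(full_text, vocabulary, min_freq=0.001):
    """Bag of words: tag a study with any vocabulary term whose frequency
    in the article text exceeds an arbitrary threshold."""
    words = re.findall(r'[a-z]+', full_text.lower())
    counts = Counter(words)
    total = max(len(words), 1)
    return {term for term in vocabulary if counts[term] / total >= min_freq}
```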
So this is all available on the web. Everything you see — all the tools — you can just grab and do whatever you want with; it's all MIT-licensed, and you don't need to email us for permission or anything. There's a web portal, neurosynth.org, which is continually growing but still experiencing some growing pains, and I'll show you some of the things you can access through it. Actually, before that, let me mention that the software is on GitHub, as I said, and there are really several packages. One is for automated extraction of the data from published articles, which is very boring stuff — but if anybody wants to help write parsers for new journals, that would be much appreciated. There's a viewer, which is brand new: a JavaScript library for 2D visualization. It's not very sophisticated, but it's pretty modular, so you can drop it into your own page and display images pretty easily. And then there are the core tools, which are in Python and which generate all of these meta-analysis images and various other things. Again, all of this is freely available. We have a RESTful API now, so if you want data on, say, study 280 — you can also query by DOI — you can pull out all the information we have, see which images are available, pull those down, and so on. So you can build applications that interface nicely with Neurosynth. The focus is on simple, large-scale analysis and ease of use. The niche here is to do things that are admittedly pretty crude but useful: everything runs quickly, you get a result, and it's not the final word by any stretch, but it's potentially a useful product along the way. Just to give you an example, here's the meta-analysis of "person". What does that mean? Strictly speaking, it's just the studies in the literature that use the term "person" with high frequency — probably reflecting person perception, mostly. But then you can do some of the things Angie was talking about: go back, look at those studies, and try to separate out the different components. You can get all the studies that went into this meta-analysis; it's visualized for you on the web, and you can download it. And the nice thing, coming back to the problem of reverse inference I talked about: because we have this enormous dataset, what you're seeing here is not just what's consistently activated in studies of, say, person perception — it's a comparison of all the studies that use the term "person" against all the ones that don't. So it does give you some measure of specificity: each voxel that's active here is relatively more active in, and more strongly associated with, person processing than with other things. That helps overcome the reverse-inference problem to some degree: a region that shows up for lots of different functions is not going to show up in this image unless it's more strongly associated with this particular function. We also have functional connectivity and meta-analytic co-activation maps, which is something BrainMap and others pioneered. The idea is that, just as in a functional connectivity analysis you take a seed and see what other regions covary with it over time, you can do the same thing meta-analytically.
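Here's roughly what that query-then-meta-analyze workflow looks like with the Python core tools. I'm writing this from memory of the package's documented examples, so treat the exact module and method names (Dataset, get_studies, meta.MetaAnalysis) as approximate — check the GitHub README for the current interface.

```python
# Sketch of a term-based meta-analysis with the Neurosynth core tools.
# Method names follow the package's published examples and may differ by version.
from neurosynth.base.dataset import Dataset
from neurosynth.analysis import meta

dataset = Dataset('database.txt')        # activation coordinates scraped from articles
dataset.add_features('features.txt')     # term frequencies per study

# All studies that use the term "pain" frequently, then a meta-analysis over them
ids = dataset.get_studies(features='pain')
ma = meta.MetaAnalysis(dataset, ids)
ma.save_results('.', 'pain')             # writes the meta-analysis images to disk
```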
You're just asking: in a study that reports activation here, what else tends to be reported? That's not real functional connectivity, of course, so we also partnered with Randy Buckner's Brain Genomics Superstruct Project, and now you have actual functional connectivity maps, based on over 1,000 subjects, for every voxel, and you can download all of those as well. The idea is to keep adding these kinds of resources to the platform. Just to illustrate one thing you can do with this: here's a paper with Luke Chang from last year where we parcellated the insula. We took an anatomical mask of the human insula, did a parcellation — which I won't go into — and then looked at meta-analytic co-activation patterns as well as a time-series-based parcellation, and we get networks that are broadly consistent. The idea is that you have these three — there may be more, but at least three — insula networks that are spatially segregated. And because we can do this kind of reverse inference on a large scale, we can ask: what do these networks do? In the interest of time I won't dwell on this, but I'll just note that this panel is forward inference: if you look at each domain individually, you'd probably always conclude that the anterior insula network — the blue one — is most active; it does the most stuff. But again, that's the base-rate issue: these are regions that are active in general, across lots of different things. When you actually do reverse inference — asking what each insula network is more specifically involved in — you get this really nice tripartite division: a cognitive insula network, an affective one, and a sensory/pain one. So it's a nice tool. Again, it's crude; it's not going to tell you exactly what's going on, but it's informative. Of course there are problems — many problems. The data quality is poor: if you actually go and look at the data, you'll find studies where something is clearly not a real activation — a table got mislabeled, say. It doesn't happen often, but there are problems. As I mentioned, we don't know what's an activation versus a deactivation, and we don't know what population a result came from. So in some ways it's surprising it works at all — but it does seem to work. The data are also not automatically centralized — and I use "centralized" loosely here; the data could be represented in many different places. The problem is that when you publish a study, we don't get the data. (Was that five minutes, or am I out of time? Five, okay.) We don't get the data, and it would be great if there were a push approach: as soon as you publish, we get the data automatically. Arguably it's also the wrong kind of data, by which I mean coordinates are great, and they're a decent approximation of whole-brain images, but if we could get whole-brain maps with an effect size at every voxel, that would obviously be better. And most of the functionality is not readily accessible: there's a web portal and you can download the software, but it would be great if we could do as much of this as possible online. The nice thing about meta-analysis is that it doesn't really require local resources — most of it could potentially run on a remote server. So what we're pushing toward now is an online infrastructure for meta-analysis.
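Here's a minimal sketch of that meta-analytic co-activation idea: split the database into studies that do and don't report a focus near a seed, then ask how often some other location is reported in each group. The data structures and the 10 mm radius are made-up illustrative choices, not the actual Neurosynth implementation.

```python
import numpy as np

def split_by_seed(study_coords, seed, radius=10.0):
    """study_coords: dict mapping study id -> (k, 3) array of reported foci in mm.
    Split studies by whether they report any focus within `radius` of `seed`."""
    seed = np.asarray(seed, float)
    near, far = [], []
    for sid, coords in study_coords.items():
        d = np.linalg.norm(np.asarray(coords, float) - seed, axis=1)
        (near if (d <= radius).any() else far).append(sid)
    return near, far

def report_rate(study_coords, ids, point, radius=10.0):
    """Fraction of the given studies that report a focus within `radius` of `point`."""
    point = np.asarray(point, float)
    hits = sum(
        (np.linalg.norm(np.asarray(study_coords[i], float) - point, axis=1) <= radius).any()
        for i in ids
    )
    return hits / len(ids) if ids else 0.0

# Co-activation with a seed: compare report_rate(near, target) against report_rate(far, target)
```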
Right now almost everything is conducted locally: the data may be served remotely, but you run the analysis on your own computer. We want to move everything online, and I think that brings a lot of benefits. One is crowdsourced validation and annotation — ideally, anyone can help curate and validate the database. Now, this sounds easier than it is, because, as Angie will tell you, motivating people to actually do this is very, very difficult; there have to be incentives built in, and figuring out how to do that is a hard problem. But if the editing is done online, then everything is inherently centralized: rather than downloading a set of studies, editing it yourself, and then maybe or maybe not pushing it back, if you're doing everything in a browser, it's already on the server. You don't have to do any extra work — you've already centralized it without even realizing it. And you can have alternative representations of the same space. You could come along and say, well, I think this study is actually this — and this came up in a question earlier. Now there's a record of that in the system, and how to model it is, I think, a question that other folks here are quite interested in as well. We want some way of tracking these different representations of the space and allowing for disagreement. Here's one example showing there are efforts in this direction: Roberto Toro, who unfortunately isn't here, has brainspell.org, which takes the entire Neurosynth database and lets people tag it with CogPO terms and Cognitive Atlas terms — Russ Poldrack's platform — going table by table and telling us what they think each one is. I don't know how many people are using it, and it may or may not take off, but something like this, structured the right way, hopefully should. It could also help with ontology development. Neurosynth is deliberately ontology-agnostic, but we will soon have an online feature set that you can use to tag any study, and ultimately any contrast, with different ontologies. Then you can ask questions like: which ontology carves up brain activity more sensitively? You can use the brain to tell you which one you prefer, or how different ontologies are related — maybe a term in this ontology doesn't actually map onto that one, and there's a different way of relating them. Then there's image-based meta-analysis. As I mentioned, we're still relying on discrete coordinates, and we would like to use images. Thanks to Chris Gorgolewski, there is now a prototype — very rough, but functional — at neurovault.org. You can upload your maps, and using our viewer they'll be instantly rendered and available for anyone else to download, and you can give us as much metadata as you have the energy for. Hopefully this will scale up, and eventually we'll be able to do things with the images: you can come along and say, I want that image and that image, I'll tag them this way, or I'll pick out the ones with this ontology category assigned and run an analysis online in real time. I'll skip this — there's a screenshot — and this is all on GitHub too, so if you find bugs or want to contribute, please do.
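Once you have whole-brain statistical maps rather than coordinates, even a very simple image-based meta-analysis becomes possible. Here's a minimal sketch of one standard approach — a voxelwise Stouffer combination of z maps — using nibabel; this is just one generic method, with made-up file names, not whatever NeuroVault or Neurosynth ultimately implements.

```python
import numpy as np
import nibabel as nib

def stouffer_meta(z_map_paths, out_path='meta_z.nii.gz'):
    """Combine z-statistic maps (all assumed to be in the same space) across
    studies with Stouffer's method: z_meta = sum(z_i) / sqrt(k), voxelwise."""
    imgs = [nib.load(p) for p in z_map_paths]
    data = np.stack([img.get_fdata() for img in imgs])   # shape (k, x, y, z)
    z_meta = data.sum(axis=0) / np.sqrt(len(imgs))
    out = nib.Nifti1Image(z_meta, imgs[0].affine)
    nib.save(out, out_path)
    return out

# e.g. stouffer_meta(['study1_z.nii.gz', 'study2_z.nii.gz', 'study3_z.nii.gz'])
```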
And the last thing I'll talk about — and this gets at a question that was asked after Andy's talk — is that you can now start to do things like real-time decoding. You run your study, you get a map, and you don't know what it means, or it looks unusual — it's not what you expected. Well, now you can upload your maps and decode them in real time by comparing them to Neurosynth meta-analyses. This is already available; it's hidden because we're still testing it, but if you really want to try it out, it's at neurosynth.org/decode. You'll get back something like this. Keeping all the data-quality issues in mind, it's still quite useful: you upload your image, and it tells you — this is just a correlation coefficient computed across all voxels — that your map looks like something related to sentence comprehension or language, for example. And if you don't get anything sensible back, that probably tells you something about the quality of your image. Either that, or you have an amazing result that nobody's ever seen before. Either way, it's useful. To come back to the general theme of reproducibility: once meta-analysis is online — and we can't do this equally for everything; you can't run an fMRI study online, obviously, but meta-analysis we potentially can do in a fairly centralized way — documentation and evaluation become much easier. All procedures can be inspected. I can come along and say, here's your meta-analysis, here's the paper trail: which studies did you use? What happens if I remove these, or include a couple of new ones? Does anything change? Results can be visualized immediately — you can put them in a paper, or stick a link in an email. You could potentially annotate them; we're not there yet, but ideally you'll get to the point where you can comment right on the image: here's what I think, see this, and so on. So I think this fits nicely with the general theme. We probably don't have to convince many people at this meeting that openness is good, that we have all this data and want to make it accessible and useful. And if we can do that, at least in the domain of meta-analysis, the world will be a very, very slightly happier place. So thanks to all these people, some of whom are in this room, and to my funders, and I'll take any questions.

Audience member: How do you suggest versioning? If you use Neurosynth, how do you reference it, given the amount of data that's there?

Yeah, that's a big problem. What I've done right now, which is really not a good solution, is that when you download an image, it's time-stamped and also stamped with the number of studies in the database at that moment. That's not because I think it's a good solution; it's just that the alternative of formal version control is a lot of work, and there are other priorities — but it is something that has to happen. Ideally, there would be metadata attached that says: here was the state of the database on this date, that state is still accessible online, and you can go look it up. But I think that is a big problem. And there have been cases where people have emailed me and said, I did this six months ago and it's different now. Typically, "different" means it's better now — the dataset is bigger — but it's hard to tell someone that the result they were happy with has kind of vanished.
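Here's a minimal sketch of what that kind of decoding amounts to under the hood: a voxelwise Pearson correlation between your map and each term-based meta-analysis map, ranked. The file names are placeholders, and this is just the generic computation, not the decoder's actual code.

```python
import numpy as np
import nibabel as nib

def decode(my_map_path, term_map_paths):
    """Correlate an uploaded map with a set of term-based meta-analysis maps
    (all assumed to be in the same space) and rank terms by similarity."""
    target = nib.load(my_map_path).get_fdata().ravel()
    results = {}
    for term, path in term_map_paths.items():
        ref = nib.load(path).get_fdata().ravel()
        mask = np.isfinite(target) & np.isfinite(ref)
        results[term] = np.corrcoef(target[mask], ref[mask])[0, 1]
    return sorted(results.items(), key=lambda kv: kv[1], reverse=True)

# e.g. decode('my_zstat.nii.gz', {'language': 'language_meta.nii.gz',
#                                 'pain': 'pain_meta.nii.gz'})
```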
So there has to be some way to track that.

Audience member: Okay, so this is fully automated, and I'm curious whether there's any documentation attached to the results that come out of the meta-analysis. For instance, one retrieval may mostly return GLM contrast maps, another may mostly return means, and there are now pattern-recognition analyses that also produce SPM-style maps. Everything is mixed together, so you need at least a summary of where these maps came from.

Absolutely. And that's why I really hope that Satra and Gully and the other people developing tools that will let you codify that information succeed, because from my perspective, every one of those pieces is hard. I would love to be able to quantify sample size, population, and of course the imaging methods in an automated way, but each of those is a difficult problem in its own right. If anyone has a way of quantifying that — if there are any NLP experts who know how to extract sample sizes from neuroimaging articles — we can put that straight in, tag it, and you'll have that metadata. But my view coming into this was: we don't have that now, and it will take a very long time to get there. So this is a very crude approach, and you have to keep the limitations in mind — it is blending everything together. As soon as I have a way of pulling out that information, I would love to put it up there, so you'd know exactly what each study is and be able to filter on it. I should say there is some work in that direction: Josh Carp has done really nice work tagging a lot of the same studies with the methods people used — FSL versus SPM, scanner field strength, different motion-correction approaches. That's not online yet, but it will be shortly. The idea then is at least to provide some filtering, so you can say, for example, exclude studies that used a particular method. But right now, unfortunately, everything is bundled together.