Thanks very much for the opportunity to speak. I'm going to tell you about two things in this presentation. By the end of these 10 minutes, I would like you to, one, use more hive plots, and two, do more open science. So consider for a moment a scatter plot. The nice thing about a scatter plot is that every data point is laid out visually based on the content of the data: the reason a given point is where it is is that its contents are an x value and a y value. So we have the nice property that it's easy to lay down points, and there's a common reference system for us to use. That common reference system is really useful because it lets us do things like compare scatter plots to each other, so we can do analyses across different data sets. However, increasingly, the data sets we're working with in biology look more like this: hairballs. This is an example of a protein-protein interaction network. Hairballs don't have that property of a point being positioned based on its contents, because the way we visualize them is through layout algorithms, like force-directed algorithms, which position nodes based on the topology of the network. That's a problem, because depending on which algorithm you use to lay out the network, even the same network laid out with four different algorithms can look very different, leaving you unable to compare it against other networks that actually have different content. We have this problem in connectomics. Here's an example of the C. elegans connectome with motor neurons, interneurons, and sensory neurons laid out. This was one choice this author used to lay out the connections between the neurons that are there.
But a different paper might show it differently, with sensory neurons on a line at the top, motor neurons on a line at the bottom, and the interneurons arranged by some sort of clustering. Of course, every layout is trying to tell a different story, but it would be nice to have a standardized means of comparing these against each other. So Krzywinski, in 2011, wrote a really important paper proposing hive plots, and this is how they work. Imagine you have a network with three layers: NSRR on the first layer, IHFA on the second, and MICF and NRFD on the bottom. Now you want to do something equivalent to the scatter plot, so you want to be able to put it on an axis. For a network, a sensible metric to use is the degree of a node — the number of edges going into or out of it. So take three different axes for the three different layers, and rotate them. On the first one, we put NSRR here at three because it has three edges coming out. IHFA goes up here at six because it has six edges. And MICF down here has only got one. So that's the first trick of hive plots: you position each node along an axis based on its degree, and then you connect nodes the same way the edges are connected in the original network. The second trick, which engages your spatial reasoning, is that you take those three axes and put them on a radial basis, like this: x1 goes over here, x2 over here, x3 over here, and you can connect them together with edges. The choices of what to put on x1, x2, or x3 are up to you, but there are some really useful conventions that we'll talk about. So that's the idea of hive plots. You can find out more at hiveplot.com.
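A minimal sketch of the layout rule just described — position on an axis by degree, axis chosen by layer. The node names follow the toy example above, but this edge list and its degrees are invented for illustration:

```python
import math

# Toy 3-layer network: undirected edges as a set of node pairs.
edges = {("NSRR", "IHFA"), ("NSRR", "MICF"), ("NSRR", "NRFD"),
         ("IHFA", "MICF"), ("IHFA", "NRFD")}
layer = {"NSRR": 0, "IHFA": 1, "MICF": 2, "NRFD": 2}  # axis assignment

def degree(node):
    """Number of edges touching a node -- its position along the axis."""
    return sum(node in e for e in edges)

def hive_position(node, n_axes=3):
    """Place a node on the radial axis chosen by its layer,
    at a distance from the center equal to its degree."""
    angle = 2 * math.pi * layer[node] / n_axes
    r = degree(node)
    return (r * math.cos(angle), r * math.sin(angle))

positions = {n: hive_position(n) for n in layer}
```

Edges of the original network would then be drawn as curves between these axis positions; the layout is fully determined by the data, which is what makes two hive plots of different networks comparable.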
That is not my invention and not my website; I don't get any commission, but it's really good stuff, and you should check it out. So, to review: you can take a network that looks like this, with nodes A, B, C, D, E, F — it works with undirected networks and with directed networks — and you get a hive plot like this. If we choose to put A and B on this axis, C and D on this axis, and E and F on this axis, then we can connect them together like this. Now you'll notice that by doing this, you don't render the connections between, say, E and F, or A and B, or C and D, because they're on the same axis. You can choose to show those with one more extension of a hive plot: you can clone an axis. We take this middle axis and split it, putting exactly the same nodes at exactly the same positions on both copies, but now we can also render the connections between them. You'll see why this is important in a moment. So the question we asked with the OpenWorm project is: can we apply this to the C. elegans connectome? Let me tell you a little bit about the OpenWorm project really quickly. It's an international open science community with nine core members and 23 contributors. A lot of it is on GitHub, with 24 different repositories and 70 GitHub followers, and there are 88 folks on the high-traffic mailing list, which means they're subjected to all the content that goes through the project. It doesn't have a headquarters; it's an open science community, and it's online. It's inspired by books like Reinventing Discovery by Michael Nielsen, which I highly recommend you read. The goal of the project, in the long term, is to create a full-scale simulation of C. elegans; in the medium term, it's working with a specific behavioral data set and trying to predict it using a 3D neuromechanical model. You can find out more about that at openworm.org. But I wanted to say a little bit about this connectome.
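The axis-cloning trick above can be sketched in a few lines: the same nodes, at the same radii, appear on two axes a small angle apart, so within-axis edges can be drawn between the two copies. The nodes, radii, and split angle here are all made up for illustration:

```python
import math

def cloned_axes(nodes_with_radius, base_angle, split=0.3):
    """Return two position dicts, one per copy of a cloned axis.
    nodes_with_radius: {node: radius along the axis}."""
    copies = []
    for angle in (base_angle - split / 2, base_angle + split / 2):
        copies.append({n: (r * math.cos(angle), r * math.sin(angle))
                       for n, r in nodes_with_radius.items()})
    return copies

# Clone the "middle" axis holding C and D (toy radii).
left, right = cloned_axes({"C": 2.0, "D": 4.0}, base_angle=math.pi / 2)
```

An edge C–D can now be rendered from `left["C"]` to `right["D"]` instead of being hidden along a single axis.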
You've probably heard about it. It was originally published in 1986 by John White, after 13 years of work poring through electron micrographs, counting synapses under the electron microscope. The paper was a tour de force: a 400-some-page article that showed and proved that all the synapses were where they were supposed to be. It also had the effect of consolidating the naming scheme, so that all 302 neurons had a name of four or five letters. And, very coolly, in 2013 John White connected with the OpenWorm project and did an online hangout, talking about the paper and going through all those details, which you can watch via openworm.org — that's actually a live link on this slide. By the way, this entire presentation is online right now; I tweeted it about 45 minutes ago, so you can click on the links that are here. What the OpenWorm project has done to update this is convert the connectome into an anatomical framework, with all of the neurons represented in NeuroML, using the connection statements inside NeuroML to embed the contents of the connectome in a spatial framework. If you go to the Open Source Brain site, you can see this for yourself: you can click on a neuron and it will change the colors of its neighbors based on whether a neighboring neuron is ingoing or outgoing. It's actually quite a nice little feature. So that's the C. elegans connectome brought into modern frameworks and things you can do online — and again, live links are on the slides. Okay. I'm getting to the results of this process in a moment, but first I want to say a little more about the process itself. To make this work, we started by posting a description of what we wanted to do — apply hive plots to the C. elegans connectome — as a GitHub issue. And here's the GitHub issue itself.
Link at the bottom. We explained roughly how to do it, and on the mailing list we got an introduction from a gentleman named Pedro Tabacoff, the first author on this presentation, who said, "I'd like to volunteer. What can I do?" We pointed him at the issue. Here's Pedro on GitHub; here's his GitHub account. He's from Brazil — I've actually never met him in person, and that's kind of how open science works. Three weeks later, having looked at that issue and worked on it, he committed a folder of goodies implementing hive plots. So let's see what he created. Before we do that, though, I should point out that there have been other structural explorations of the C. elegans connectome. Here are two articles that have looked at fractal properties and at topological clustering of the C. elegans connectome. None of them used hive plots; to the best of our knowledge, this has never been done before. So here's an example of a hive plot of the C. elegans connectome. On this axis, only sensory neurons; on these two axes — remember the split I showed you — only interneurons; and on this axis, only motor neurons. You can already start to see some interesting things, and we'll drill into one of them in a moment. But note that this is a rendering of just the chemical synapses, because the connectome has both chemical synapses and gap junctions. And this is what it looks like with just the gap junctions. Flipping back and forth between those two pictures, you can already see that we're laying out the network in exactly the same way, but we're seeing different features — very different networks, very different edges. You can already start to see, for example, that there's an asymmetry here, among many other things.
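Rendering the two edge types separately over one fixed layout comes down to filtering the edge list by type. The record format and neuron pairs below are invented for illustration, not the actual OpenWorm data schema:

```python
# Each edge carries its synapse type; the node layout stays fixed,
# and we swap which edge subset gets drawn.
edges = [
    ("AVAL", "AVAR", "gap"),
    ("AVAL", "PVCL", "chemical"),
    ("PVCL", "DB01", "chemical"),
]

def edges_of_type(edge_list, kind):
    """Keep only edges of one synapse type, dropping the type tag."""
    return [(a, b) for a, b, k in edge_list if k == kind]

chemical = edges_of_type(edges, "chemical")
gap = edges_of_type(edges, "gap")
```

Because the node positions never change between the two renderings, any visual difference is purely a difference in the edges — which is exactly what makes the chemical-versus-gap comparison meaningful.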
What we focused in on was the fact that these nodes up here stand head and shoulders above the ones below — something that hadn't really been pointed out before. We wanted to drill in and see whether that was actually true, whether it went beyond the visualization. So we switched tools: we went away from the hive plot and plotted the histogram of the degree of every interneuron. Sure enough, the ones at the top are actually quite high in degree. We zoomed in on those, looked a little closer, and yes: the interneurons up there are statistically significantly above the average degree across all the neurons of the connectome. So this is an interesting feature. Interestingly, something like this has been observed in a different kind of connectome. Van den Heuvel and Sporns, looking at large-scale projections in the human connectome, proposed the concept of a "rich club": a core network of highly connected nodes that appear to be driving the rest. We'd love to compare that to this, but they haven't used any hive plots yet — we'd really love to see the hive plot version of that. I should also say there's a lot more interesting stuff that I don't have time to go into in the 10 minutes allotted to me; there are a lot of other interesting patterns when you start playing with these hive plots. This one, for example, shows only the edges with a weight above five. You can explore more for yourself online — you can click a bunch of links and check this stuff out. I'd just like to acknowledge Tim Busbice, who was responsible for producing the connectome data set that was used, and Pedro Tabacoff, who, as I mentioned, did the work. So, really quickly, the things I want you to take away from this: complex networks are currently hard to draw insight from and hard to compare.
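The check described above — do the top nodes really sit significantly above the average degree? — can be sketched as a degree histogram plus a z-score cutoff. The degree values and the 1.5-sigma threshold here are made up for illustration, not the real connectome counts:

```python
import math

# Hypothetical node degrees, with a few hub-like values at the top.
all_degrees = [2, 3, 3, 4, 4, 5, 5, 6, 7, 30, 34, 38]
mean = sum(all_degrees) / len(all_degrees)
sd = math.sqrt(sum((d - mean) ** 2 for d in all_degrees) / len(all_degrees))

def z_score(d):
    """How many standard deviations a degree sits above the mean."""
    return (d - mean) / sd

# Candidate hubs: degrees more than 1.5 SD above the population mean.
hubs = [d for d in all_degrees if z_score(d) > 1.5]
```

In practice one would follow this screening step with a proper significance test, but the point is the same: the visual "heads and shoulders" impression from the hive plot translates into an outlying tail of the degree distribution.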
If you use hive plots, you can do a lot more interesting comparisons; an open science approach to doing this established the connection between hive plots and the C. elegans connectome; and analysis of the C. elegans connectome shows a striking rich club phenomenon. But again, as I said at the beginning, the two things I really want you to take away are, one, use more hive plots, and two, do more open science. Thank you. [Moderator] Thank you. We have time for one quick question for Stephen. [Question] What can you do if there aren't any clear labels on the nodes? The method requires that you divide them up by labels, right — the nodes have to get assigned to an axis. If there's no labeling like motor neurons, do you have to start with a clustering or something? [Answer] Yeah — motor neuron, interneuron, sensory neuron is just a property of those nodes. If you have nothing at all, you can use the structure of the network itself. On one axis, you could put only nodes that have outgoing connections but no incoming connections; on a second axis, the ones that have both incoming and outgoing; and on the third, only the ones that have incoming connections. That would be a standard way to do it on a directed network. [Question, partly inaudible] Did you look into those hot spots...? [Answer] Did you notice that the feature I pointed out was that the nodes at the top stood a lot above the others? Let's talk about it offline. [Moderator] Okay, let's thank Stephen again. Our next speaker is Giorgio Innocenti from the Karolinska.
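The label-free axis assignment suggested in that answer — pure sources, pure sinks, and everything in between — can be sketched directly from a directed edge list. The toy edges here are invented for illustration:

```python
# A small directed network as (source, destination) pairs.
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d"), ("e", "d")]

def axis_of(node):
    """Assign a node to one of three structural axes by its in/out degree."""
    out_deg = sum(src == node for src, _ in edges)
    in_deg = sum(dst == node for _, dst in edges)
    if out_deg and not in_deg:
        return "source"   # outgoing edges only
    if in_deg and not out_deg:
        return "sink"     # incoming edges only
    return "mixed"        # both incoming and outgoing

axes = {n: axis_of(n) for n in {x for e in edges for x in e}}
```

This gives a standardized three-axis split for any directed network, with no node labels required.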
So, well, thank you for allowing me to talk here. In 1971, a person who later became my colleague here at Karolinska, Krister Kristensson, published a paper which was, in my opinion, one of the most important papers in the history of neuroanatomy, showing that a neuron will pick up horseradish peroxidase and transport it along the axon to the cell body. This allowed us to trace connections in the brain in a way which had only been dreamt of before. Later on, it was found that neurons actually like to pick up and transport all kinds of junk, and so all kinds of interesting networks were discovered and presented in the literature. One of these is the very famous visual system network of Felleman and Van Essen. More recently, in vivo tract tracing has also allowed us to see connectivity in the human brain with non-invasive techniques. These studies gave a very complete image of the connectivity in the brain of all kinds of species, but one element was missing, and that element was time. I came to this issue of time a little bit by accident, looking at the axons going into the corpus callosum in the monkey brain, comparing animals which had been injected with biotinylated dextran in different areas. When I go into the corpus callosum and look at the axon diameters — these are cut longitudinally and transversally — you see very clearly that axons coming from prefrontal cortex are thin, and axons coming from motor cortex are thick. And you can trace these axons. That is the basis of what I will be talking about: axons labeled with biotinylated dextran, transported, and then measured in the optical microscope using high-resolution lenses. I only have time to give you the flavor, or perhaps the stink, of this story, which is the following.
What you see here is the corpus callosum, and a bunch of structures which are projecting into it. This image is in the abstract that I submitted. You see areas 17 and 18, mid-temporal, V4, et cetera, and then a certain number of subcortical structures like the thalamus, the nucleus caudatus, and the internal capsule. The thickness of each of these arrows is proportional to the median diameter of the axons going to that structure — for example, from area 9 to the internal capsule. And the length of the arrow is proportional to the length of the pathway between the area of origin and the site of termination, the corpus callosum in this case. The numbers you see are the delays generated by the combination of the two: the thickness of the axon, which is proportional to its conduction velocity, and the length of the pathway. When you look at this, you see that practically each area generates different delays to its target structures, which range, for the corpus callosum, between 2.4 milliseconds and something like five-point-something milliseconds — and that is just half of the conduction time to the contralateral hemisphere. So it seems that with this kind of approach, we can introduce time into connectivity graphs. The first thing that emerges when we look at something like this is that there are families of structures which are faster in conducting to their targets than others. It really stands out that somatosensory area 2, motor area 4, and premotor area 6 are faster at conducting into the corpus callosum, and also faster into the thalamus and the internal capsule, than other areas.
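The delay arithmetic described above — conduction velocity proportional to axon diameter, delay equal to path length over velocity — can be sketched in a few lines. The proportionality constant and the example lengths and diameters below are assumptions chosen for illustration, not the measured values from this study:

```python
# Delay from axon geometry: for myelinated axons, velocity scales
# roughly linearly with diameter (v = K * d). K here is an assumed
# constant, not a measured one.
K = 6.0  # m/s per micron of diameter (illustrative)

def delay_ms(path_length_mm, diameter_um):
    """Conduction delay in milliseconds for a pathway of given length
    carried by axons of given diameter."""
    velocity_m_per_s = K * diameter_um
    return (path_length_mm / 1000.0) / velocity_m_per_s * 1000.0

# Same 30 mm pathway, thin prefrontal-like vs thick motor-like axon:
thin = delay_ms(path_length_mm=30, diameter_um=0.5)
thick = delay_ms(path_length_mm=30, diameter_um=2.0)
```

A four-fold difference in diameter translates directly into a four-fold difference in delay over the same pathway, which is why thin prefrontal axons and thick motor axons produce such different conduction times.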
So that is the whole story, and I could really stop here, but I just wanted to give you a little bit of an idea of what the data look like. Here you see histograms of the velocities measured in these studies, and you see that there are areas whose axons have higher conduction velocity, in meters per second — particularly the motor and somatosensory areas — while other areas are slower. The other important parameter is the length of the pathway, which was measured in this histological material by tracking, from section to section, the trajectory of the axons. What you see here are axons originating in primary visual areas and going toward the corpus callosum. They take this very strange trajectory: they go backward first, and then forward and toward the midline. The length of the pathway is an important element in the dynamics — together with the conduction velocity, it determines the time it takes to go from one point to another. More recently — this is work which is not yet published — we have started to compare the anatomical data to in vivo data obtained with DTI, and you see here examples of DTI tract tracing in the monkey brain done by my colleagues in Rome. It is nice that the two assessments actually converge. You see here that whether you measure histologically or with DTI, the lengths of the pathways in the monkey brain look rather the same, and then the delay times you can calculate look rather the same. The one difference is the visual projection, where the DTI technique does not identify the kind of backward-and-forward loop that we see in the histology.
This is the callosal midline delay, which you have seen in another way in the previous sections; it really tells you that there are shorter delays and faster conduction to the callosal midline from motor and somatosensory areas than from mid-temporal, anterior temporal, and posterior temporal cortex. Now, you can say: okay, this is anatomy, but does it relate in any way to electrophysiology? Do we have electrophysiological data to validate it against, or to compare with? To some extent we do. It is interesting, for example, to compare our estimates of conduction velocity to the internal capsule with the work of Humphrey and Corrie in 1978, who studied antidromic invasion, and the delays of antidromic invasion, in area 4 of the monkey. What they had is the range of conduction velocities, which they then corrected, taking into account the bias that the electrophysiological analysis introduces; and here are our estimates of conduction velocity based on the histology. What is somewhat surprising to us is that we do not see those very fast-conducting axons in our study. This is the only difference I could say we see with the electrophysiological data. The reason might be that the electrophysiology was done in the hand representation of motor cortex, while our injections were actually in the trunk representation. Good, thank you — I'm nearly done. The next slide simply shows another way of comparing data: electrophysiological data obtained with antidromic invasion in the monkey by the group of Swadlow, work from the 70s, and you see that our estimates and their measurements correspond completely. You also see, in these studies of the visual cortex of the macaque and of the human, that, as we predicted, motor cortex is faster at transmitting information to the other side than visual cortex.
So it looks as if there is at least some correspondence between the anatomical data we produced and the electrophysiological data which are available. I could continue this story in several different directions, but obviously I don't have time. I'm glad to have you shoot me down with some questions. [Moderator] I have a quick question for Giorgio while we switch over: what is the functional implication? [Answer] The functional implication is rather far-fetched, and it is the following. I believe that the primary sketch of ourselves is essentially motor and somatosensory. The information which is processed in these areas is faster, and everything which comes later has to match this sort of temporal template. It also turns out that these connections, at least between the two hemispheres, between motor and somatosensory areas, develop earlier than the other connections of the brain, particularly the prefrontal and association connections. So this is the speculation I can offer you. [Moderator] OK, we should move on. Let's thank Giorgio one more time. Our next speaker is Gaël Varoquaux, who will be talking about mining fMRI databases. [Varoquaux] Thank you very much. Is this on? OK, excellent. Thank you. So if you think about how we accumulate data across publications, how we accumulate knowledge on brain areas, the way it's done is through experts who know the literature very well and come to conclusions. There have been more systematic ways of doing this — some were actually presented earlier today — which rely on mining coordinate databases: basically accumulating coordinates across papers and then either defining activated regions for specific questions or looking at co-activated networks. What I'm interested in is mining the brain images themselves. If you think of it, each year thousands of brain images are accumulated, amounting to terabytes of data.
There are a variety of projects that nowadays share this data, so the data is there. The challenge is that it's extremely heterogeneous: you have all kinds of different paradigms — I'm sticking to fMRI for this talk — and things like resting state. So the questions I'd like to address are: how do we summarize this data, and specifically, how can we map functionally distinct brain units? I'll first talk a bit about how we can learn parcellations from resting-state data. The specific problem of resting-state data, in my opinion, is that it has no salient feature. If you look at the data, you can see some structures — say, the ventricles — but the data doesn't separate out different functional systems by itself. The way we can think of it is that it's actually displaying a mixture of different cognitive networks: what we're observing is different networks added up with random time courses. So the challenge is going to be to un-mix the networks. This is not new — people have been doing it with ICA for years — but we like to tackle it with sparse dictionary learning. The idea is that we do joint learning of time courses and networks, and for this learning we use sparsity, because it's a good way of encoding functional segregation: the maps are sparse, with only a small number of active voxels at the whole-brain level. One thing that we've done is to add a two-layer model on top of this and say, well, the subject-specific networks are actually derived from group-level networks. When we do this, the nice thing is that at the level of the population we get an atlas of brain networks or brain regions, while at the level of the subject we get the subject-specific versions. And this is from real data.
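The un-mixing idea above — data modeled as random time courses times sparse spatial maps, learned jointly with a sparsity penalty — can be sketched with a simple alternating scheme on synthetic data. This is a toy stand-in for the actual two-layer method described in the talk; the dimensions, the soft-thresholding penalty, and the synthetic "fMRI" are all assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "resting state": 200 time points x 50 voxels, mixing 3
# sparse spatial maps with random time courses plus a little noise.
n_t, n_v, k = 200, 50, 3
true_maps = np.zeros((k, n_v))
for i in range(k):
    true_maps[i, i * 10:(i + 1) * 10] = 1.0   # disjoint sparse supports
time_courses = rng.normal(size=(n_t, k))
data = time_courses @ true_maps + 0.05 * rng.normal(size=(n_t, n_v))

def soft_threshold(x, lam):
    """L1 proximal step: shrink toward zero, zeroing small entries."""
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

# Alternate: least-squares fit of time courses, then soft-thresholded
# least-squares fit of the spatial maps (sparsity on the spatial side).
maps = rng.normal(size=(k, n_v)) * 0.1
for _ in range(50):
    tc, *_ = np.linalg.lstsq(maps.T, data.T, rcond=None)
    tc = tc.T                                  # (n_t, k) time courses
    ls_maps, *_ = np.linalg.lstsq(tc, data, rcond=None)
    maps = soft_threshold(ls_maps, 0.05)       # sparse spatial maps

sparsity = np.mean(maps == 0.0)                # fraction of zeroed voxels
```

The sparsity penalty is what pushes each learned map toward a compact set of voxels, which is the encoding of functional segregation mentioned in the talk; ICA would instead rely on statistical independence of the maps.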
What you can see is that the outline of the corresponding subject-level region actually follows the cortex that you can see in the background better. If we look at the different maps that we get: we get a segmentation of the primary areas, which is not surprising. We get things like the default mode network, which is well known. More interestingly, we get something that looks pretty much like a probabilistic segmentation of the white matter — we're looking at EPI signal, and white matter has a different noise structure. And something that in the beginning I used to think was noise, those small dots, things I didn't want to see: if you look at them closer, they're actually a segmentation of the vascular system. It's not a neural structure, so you may not be interested in it, but it's in our signal and we're separating it out really well. What's really hard to get out of average-quality data is the basal ganglia, and you can see here that we're getting them out pretty well. If we look at what I like to call a hard assignment — affecting each voxel to a region — what we get is a parcellation, and we can easily compare it to other approaches. The most interesting place to look is the basal ganglia because, as I said, they're amongst the hardest things to separate out, and this kind of convinces us that on average-quality data, our method is actually really interesting. All right. So a lot of the data that we have, and that we think is interesting, actually consists of activation maps, not resting state. One question we wanted to ask is: given a large multi-subject experiment, can we get more information than the mean effect across the group? Because if we scan a lot of subjects, there is probably a lot more information in this data.
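The hard assignment mentioned above is just an argmax over the component maps: each voxel goes to whichever map loads on it most strongly. The maps below are a tiny made-up example, three regions by six voxels:

```python
import numpy as np

# Illustrative soft maps: rows are components, columns are voxels.
maps = np.array([[0.9, 0.8, 0.0, 0.1, 0.0, 0.0],
                 [0.0, 0.1, 0.7, 0.9, 0.0, 0.2],
                 [0.1, 0.0, 0.0, 0.0, 0.8, 0.6]])

# Hard parcellation: one region label per voxel.
labels = np.argmax(np.abs(maps), axis=0)
```

Turning the soft maps into a single label image is what allows the direct, region-by-region comparison with other parcellation approaches.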
So the model that I'd like to put forward is that the response to the different stimuli — the brain maps — is actually a composition of different atomic cognitive units, and, because of functional degeneracy across subjects, the way each subject composes a response, and therefore creates the corresponding activation, varies across subjects. So we have a set of different loadings across subjects, and from a statistical standpoint, the challenge of learning these loadings gives us a new un-mixing problem that can be cast as a specific dictionary-learning problem. I won't go into the details; I'll just show the results on a specific data set that we call the localizer data set. The specificity of this data set is that it's a very short acquisition — five minutes, with only five contrasts — that we ran on many subjects in my institute. So it's not very rich in terms of cognitive content: we have very simple contrasts like some visual tasks, computation or reading tasks, and some motor tasks. With only five contrasts, what we're going to try to do is extract 50 cognitive atoms and the corresponding maps — ten times more maps than we have contrasts. So we get 50 of them; I'm not going to display all of them, but these are a few of the interesting ones. Some were completely uninteresting, some more interesting than others. This is the language network — we all know it — and what's nice is the way it's loaded on the original experiments: it loads more on auditory than on visual, and more on words than on checkerboards, so it makes sense. The dorsal attentional network, a fairly well-known network: I found it really interesting that it's loaded on computation. We actually know from psychology that computation is a visuospatial task.
So it's not surprising if we know our psychology, but it's not something that I would personally have come up with. We also have a very clear salience network here, and it's loaded on computation, so this all makes sense. Then we do a hard assignment and we get an atlas of brain regions, and the nice thing about this atlas is that every region is linked to a cognitive profile. So we're breaking down systems like the visual system with different loadings; we find, for instance, that the horizontal-versus-vertical checkerboard contrasts load more in what I believe is V2. So this sort of makes sense. All right, before I conclude, I'd like to talk a bit about software. I've shown you that we're making progress in developing complex machine learning algorithms — that's the algorithmic part. The problem is that we want to tackle terabytes of data, and we also want to get these ideas out of our lab. So we're developing different open source projects. One, which is really new — it's not even available yet, but the webpage is there — is nilearn, which is going to be machine learning for neuroimaging; it leverages scikit-learn, one of the reference toolkits for machine learning, all open source and optimized. So, to conclude, I think that specially crafted dictionary-learning techniques are a great tool for brain atlasing by mining big image databases. We've adapted them to resting state and to activation maps, and the good news is that the data sets we've used so far are fairly small and these are very preliminary results, so I think we're going to be getting much better parcellations than we have now. Thank you. [Moderator] Any quick questions for Gaël? [Question] In the sparse dictionary, is that sparsity in space? [Answer] It is sparsity in space. [Question] Is that the magic of it? Because you're getting these nice results on only 30 subjects — is that the special sauce?
[Answer] Well, it's not just that — we actually do much more — but my personal opinion is that moving from independence to sparsity helps for a very technical reason: sparsity is a well-defined prior, whereas independence is not, and you can combine it with other things. You also get rid of the PCA step. That is the secret sauce, in my opinion. [Moderator] All right, let's thank Gaël again. Next up is Cameron Craddock from the Child Mind Institute, the Nathan Kline Institute for Psychiatric Research, and the Neuro Bureau, who will be telling us about the Neuro Bureau Preprocessing Initiative. [Craddock] Hello — is this thing on? Hi, I'm Cameron Craddock, a co-founder and member of an organization called the Neuro Bureau. For the uninitiated, the Neuro Bureau is an international organization made up mostly of young neuroscience researchers, and our goal is to foster interdisciplinary collaboration and open science in the neurosciences. Although we try not to be too neuroscience-specific, we are mostly neuroimagers currently — but if any of y'all are interested, please join up; everybody can be a member of the Neuro Bureau. The Neuro Bureau has been working on a preprocessing initiative, and our goal is to develop high-quality, well-characterized preprocessed data sets that we can put out into the world for people to do research with. Our hope is that people will be able to use these as benchmark data sets for tool development; as resources for non-neuroimaging brain enthusiasts — for example, machine learners who would like to use neuroimaging data but don't care about specifics like whether or not you should do global signal regression; and as a way to compare and evaluate different processing strategies.
So the core of this is data sets that are being openly generated and shared. A lot of these are initiatives by the Child Mind Institute, and many of my colleagues, and maybe many of the individuals in this room, have taken part in some of these in the past. I'll talk about three here. The first preprocessing initiative was the ADHD-200, which is a data set of 375 individuals with ADHD as well as 598 typically developing controls. When the ADHD-200 data set came out, aggregated across eight sites, there was a competition to encourage people to use it, with two challenges. One challenge was for machine learners to try to come up with the best classifier that could distinguish ADHD from healthy controls as well as subtype ADHD. The other was a more purely neuroscientific prize: somebody uses the data for a neuroscientific exploration, tests a hypothesis on the data. But what quickly became apparent is that there was a barrier to entry into these competitions: a lot of the people who would be interested in competing in such a challenge don't know how to use resting-state fMRI data, don't know how to preprocess it, don't know how to do these things. So in our first initiative, three different groups went together to preprocess this data and make it available to competition participants. Since then, through the INDI initiative, there's a very large data set of DTI data that has been processed by one of my colleagues in Hungary and made available. And currently we're working on the ABIDE data set, which is 539 individuals with autism and 574 controls. When we're doing this, we're taking a multi-pipeline approach.
For the first initiative we did, the ADHD-200, there were actually three pipelines: one was based on FSL and AFNI; another was based on a variety of MINC tools, using Pierre Bellec's pipelining system, PSOM; and one of my colleagues, Carlton Chu, put together a VBM data set using SPM. For ABIDE, we've extended the number of preprocessing pipelines we're including. So now we have the AFNI and FSL pipeline using C-PAC, which is built on Satra's Nipype, as I've been showing here; Pierre's going to use his MINC tools again; and we have another colleague who built DPARSF and REST, which are based on SPM, who's going to provide a pipeline. We're also going to have more cortical measures this year: Alan Evans and a group at the MNI are using CIVET to extract the cortex for these individuals, and we have a group out of China doing FreeSurfer. So this data, the ABIDE data, will be available in various derivatives. From the functional data, we're computing the sort of things you would expect; here are some examples of the data we'll be generating: ALFF, fALFF, ReHo, voxel-mirrored homotopic connectivity, other maps that are commonly made available, as well as time courses for various parcellations of the brain. And using those parcellations, you can visualize the connectome this way, which is obviously insufficient compared to the hive plots. We also have the DTI data in various formats, including full probabilistic tractography data from FSL, so you can look at the tracts if you like and come up with another hive plot. So that's basically the idea. One of the initiatives for this year, for the next release, is that we're going to try to do extremely careful quality control.
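The parcellation time courses just mentioned can be turned into a simple connectome, and, echoing the hive-plot theme of this session, nodes can then be positioned along radial axes by their degree. A toy sketch on synthetic data; the correlation threshold and the three-axis grouping here are arbitrary illustrative choices, not part of the actual derivatives:

```python
import numpy as np

rng = np.random.RandomState(42)
# Stand-in for parcellated time courses: 120 time points x 10 regions.
ts = rng.randn(120, 10)

# Pearson correlation between region time courses = a simple connectome.
conn = np.corrcoef(ts.T)                              # (10, 10)

# Threshold into a binary graph (no self-loops) and compute node degree,
# the quantity a hive plot uses to place each node along its axis.
adj = (np.abs(conn) > 0.2) & ~np.eye(10, dtype=bool)
degree = adj.sum(axis=1)

# Place nodes on 3 radial axes: radius = degree, angle = the node's axis.
# The grouping of nodes onto axes is arbitrary here, purely for illustration.
angles = np.array([0.0, 2 * np.pi / 3, 4 * np.pi / 3])[np.arange(10) % 3]
xy = np.c_[degree * np.cos(angles), degree * np.sin(angles)]
```

Drawing the edges between these axis positions would complete the hive plot; the point is that the node coordinates are determined by the data (degree), not by a layout algorithm.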
One of the things that we punted on with our first initiative was QC; our justification was that we would allow people to do quality control themselves. But if we're presuming people don't know how to work with this data, they probably don't know how to quality control it either. So what we're working on is very careful inspection, so that we can come up with quality control scores and essentially hand-label the quality control. Hopefully this will also be a valuable resource for other researchers to develop automatic procedures for doing quality control, or to come up with better quality control metrics for neuroimaging data, which we sorely need.

As for the availability of this data: we've been in a pretty good partnership with NITRC. They've been handling a lot of our data storage requests, even though I know we frustrate them quite a bit. The ADHD-200 and the DTI data are currently available at the NeuroBureau project on NITRC, and we're working on getting it into the cloud so that people can access the data directly from there. To give you a sense of how successful this project has been: the winning team of the ADHD-200 competition was a group of biostatisticians who had never really worked with neuroscience or neuroimaging data before, and they used our data and won the competition. So far our count is that there have been at least 10 peer-reviewed publications. One of them, there's one guy, João Sato (I'm not good at his name, he's Brazilian), and I think he's got three or four publications on it. He contacted me, and from his perspective, he's just using this resource as a way to quickly test his algorithms and get them out. So he's been a really successful user of it. We've had quite a few downloads; to give you an idea, here are some of them. There's actually a dot in Cuba, which we're pretty excited about. We got to Cuba.
We haven't gotten into Africa yet, though; I'm really disappointed about that. But anyway, there's a future, right? No penguins in Antarctica, unfortunately, are using our data. As I mentioned, the FSL and AFNI part of what we've done is based on an open source project that I've been working on. It's a pipeline based on Nipype, but it's a specific pipeline for doing connectome analysis, and it's available. I had a demo earlier; it's late now, but I can still show you stuff if you like. Anyway, check it out. It's developed on GitHub, so you can fork it and do what you will with it. Also, I'd like to bring your attention to Brainhack: the NeuroBureau hosts Brainhack every year. This is for interdisciplinary brain enthusiasts to get together and work on actual neuroscience projects. We try to make some data available for people to work on, and we have some organized projects if people would like those, but mostly it's a workshop with very little structure: there are very few talks, and most of the time is open discourse with colleagues. This year it's going to be in Paris, in what I believe is an old royal castle. It's a fantastic setting, I know. So check it out; if you're available during those weeks, come to Paris and hack a few brains with us. And I think that's everything I have.

Cameron has left ample time for questions. Do we have any questions? So the natural one is: you've now analyzed the same data in a gazillion different ways, and... You know, this is why we need you, Tom. This is why we need you: to come up with multiple comparison corrections for everybody using the same data over and over and over again. Well, I'll just ask: are you scared by how different the results are, or do you generally get the same things out? Well, honestly, we haven't systematically done it, and that's what needs to be done.
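A systematic comparison of the kind being asked about might start with something as simple as the spatial correlation, within a brain mask, of the same derivative map produced by two pipelines. A hedged sketch on synthetic volumes; the function name, mask, and noise levels are invented for illustration:

```python
import numpy as np

def spatial_agreement(map_a, map_b, mask):
    """Pearson correlation of two derivative maps within a brain mask:
    one simple way to quantify how well two pipelines agree."""
    a, b = map_a[mask], map_b[mask]
    return np.corrcoef(a, b)[0, 1]

rng = np.random.RandomState(1)
shared = rng.randn(20, 20, 20)            # signal both pipelines recover
mask = np.ones((20, 20, 20), dtype=bool)  # trivial whole-volume mask

# Each "pipeline" output = shared signal + its own small processing noise.
pipeline_1 = shared + 0.1 * rng.randn(20, 20, 20)
pipeline_2 = shared + 0.1 * rng.randn(20, 20, 20)
r = spatial_agreement(pipeline_1, pipeline_2, mask)
```

With noise this small, r lands near 1; disagreement between real pipelines would show up as lower correlations, and the interesting work is in localizing where they diverge.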
So with the ABIDE data, that is what we're planning on doing when we release it: a systematic comparison of the different pipelines to get a sense of how different they are. I know, preliminarily, for two of the pipelines, DPARSF and C-PAC, that at least our data derivatives (I mean, not the results from the group analysis, but the inputs to the group-level analysis) actually agree very, very well. And we're still waiting on that comparison for the MINC tools. Great, let's thank Cameron one more time.

Okay, and our next speaker is Kit Cheung from Imperial College London, and he'll be talking to us about NeuroFlow.

Hi everybody, I'm Kit from Imperial College. I'm in a custom computing group, which mainly uses FPGAs to speed up various kinds of applications, and also in a neural coding lab, which does various types of biological experiments on mice. So one very interesting question is: what if we had enough computing power in a single electronic chip that we could just build a brain chip into our head? Just remove our cerebral cortex and replace it with one, or maybe two, of the brain chips, an electronic cortex, maybe. But it won't be out for 100 years, I think. That is just science fiction, but more recently, researchers doing large-scale simulations of spiking neural networks have found some very interesting applications. Spaun, which came out in the last year, uses 2.5 million spiking neurons and is able to carry out eight cognitive tasks in a single neural network model, like counting or remembering different numbers, etc. And Izhikevich has been building a biologically detailed model to investigate the dynamics of different brain states. There's a genuine motivation for building a large network, because in order to understand something, you need to build one; if you don't build a brain, you can't really understand how it works.
A larger network also supports functional networks, and you can explore the dynamics of larger networks. There's also biological plausibility: the connection probability is not going to be the same in smaller networks. But now the problem is this. Previously people used CPUs; if you don't get enough parallelization, you use multicore, and if you still don't get enough parallelization, you use GPUs. But seldom do people talk about FPGAs as a possible candidate for this type of processing challenge. Basically, FPGAs have a number of advantages over GPUs and CPUs: they allow reconfiguration and low-level customization, and they're easy to scale up. But the major problem is programming difficulty, which you can see on the right-hand side: the x-axis is programming difficulty, the y-axis is performance. Conventionally it's very hard to program an FPGA; you use very low-level languages like Verilog or VHDL. But now different tools have evolved, and some people can use an FPGA almost as easily as a multicore CPU or a GPU. The particular system that I've been using takes Java code to describe the dataflow, the computation flow of the different computations in the FPGA, which is then compiled into a configuration for the FPGA. It allows faster development time, so it becomes a more attractive candidate for this type of computation. We've purchased the Maxeler node shown in the middle of this slide, which consists of four Maxeler cards, shown on the left-hand side; each of these cards contains one FPGA and 24 gigabytes of memory. It's specialized for high-performance, data-intensive tasks with low power consumption compared to its CPU or GPU counterparts. And it has a very special streaming programming model, which enables a deep pipeline for the computation.
Also, unlike some customized platforms, it's off the shelf: you can just buy it and install it in your lab, which saves the time of building a new system. It also provides a fast network, so you can use a multi-node implementation to build your FPGA-based neural network accelerator. To introduce my platform, this is an overview of it. It's a fairly standard time-driven simulation of point neurons, like Izhikevich or integrate-and-fire models. One cycle corresponds to a delta-t, usually one millisecond: you calculate the delta-V, which you add to the previous value to update the neuronal state. Like other types of accelerators, it provides parallelization, such as parallelizing this differential equation, and I've done some pre-formatting of the memory content so that memory access is faster. The hardware, as I've just mentioned, is four FPGAs and 96 gigabytes of RAM. The most important function I've included in this platform is nearest-neighbour STDP, which many people find hard to include in real-time systems. It also supports various postsynaptic kernels, like the exponential or alpha function, or custom kernels which you can build for your network. This is the major computation flow for my platform. In a single cycle of one millisecond, there are two phases. The first is the calculation phase, where you parallelize the differential equation: you fetch the neuronal states from memory, update them, and you get a list of fired neurons, the indices of the neurons that fired. Then, with that list of fired neurons, you go to memory, fetch the associated synaptic weights, and store them in the on-chip memory.
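As a plain-software reference for the two-phase cycle just described, here is a minimal time-driven loop for leaky integrate-and-fire point neurons: phase one updates the neuronal states with an Euler step, phase two fetches the synaptic weight rows of the fired neurons. This is a software analogue of the computation, not the hardware implementation, and all parameter values are illustrative:

```python
import numpy as np

rng = np.random.RandomState(0)
n = 100
dt, tau, v_thresh, v_reset = 1.0, 20.0, 1.0, 0.0  # ms, ms, arbitrary units
weights = 0.02 * rng.rand(n, n)  # dense all-to-all synaptic weight matrix
v = np.zeros(n)                  # membrane potentials
i_syn = np.zeros(n)              # synaptic input accumulated last cycle
total_spikes = 0

for step in range(200):          # 200 cycles of 1 ms each
    # Phase 1: update neuronal states (one Euler step of the LIF equation)
    # and collect the indices of the neurons that fired.
    v = v + dt / tau * (-v + i_syn + 1.2)  # 1.2 is a constant drive current
    fired = np.where(v >= v_thresh)[0]
    total_spikes += fired.size
    v[fired] = v_reset
    # Phase 2: fetch the synaptic weight rows of the fired neurons and
    # accumulate them into the input for the next cycle.
    i_syn = weights[fired].sum(axis=0)
```

On the FPGA, phase one is the deeply pipelined streaming part, and phase two is the memory-bound fetch that the pre-formatted memory layout accelerates.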
For a multi-FPGA implementation, I've used a very simple state machine, because our platform has FPGA-to-FPGA connections that link them in a ring, so I can just use a simple state machine to pass the neuron indices around between the FPGAs. The STDP also makes use of the respective advantages of the FPGA and the CPU in accessing memory. For instance, the CPU has a lot more cache, so it's faster at accessing memory randomly, while the FPGA is faster at processing memory linearly. So I combined these two advantages and arrived at this one-two-three-four flow of computation for STDP. We've made some hardware customizations, because unlike CPUs or GPUs, FPGAs have limited hardware resources, so we have to make the most of them. And now that we've got the hardware platform going, we can add an extra layer, PyNN, on top, so that it supports different neuronal populations which share neuronal parameters, as well as current injectors or different sources of random input functions. And this is the compilation flow: you feed in the PyNN description here, and it produces a memory file and a hardware description. The hardware description is compiled into the configuration of the FPGA, and the memory file is loaded into the off-chip memory of the system. Currently, for 98,000 neurons, NeuroFlow achieves about a 3.5x speedup, which is around double that of a GPU built on the same 40-nanometre process, so it's relatively attractive to use FPGAs. There are other customized platforms, like SpiNNaker from Manchester, which is quite famous; they've got a very wide range of different configurations for real-time computation, and there's also a 16-FPGA platform simulating 256,000 neurons in real time.
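The nearest-neighbour STDP mentioned above pairs each spike only with the most recent spike of the partner neuron, rather than with the whole spike history. A minimal sketch of the resulting weight update; the parameter values and function name are illustrative, not NeuroFlow's:

```python
import numpy as np

# Nearest-neighbour STDP: each spike is paired only with the most recent
# spike of the partner neuron (illustrative parameter values).
a_plus, a_minus = 0.01, 0.012
tau_plus, tau_minus = 20.0, 20.0   # ms

def stdp_update(w, t_pre, t_post):
    """Weight change from the nearest pre/post spike pair."""
    delta = t_post - t_pre
    if delta >= 0:  # pre before post: potentiation
        return w + a_plus * np.exp(-delta / tau_plus)
    else:           # post before pre: depression
        return w - a_minus * np.exp(delta / tau_minus)

w = 0.5
w = stdp_update(w, t_pre=10.0, t_post=15.0)   # pre leads post: w goes up
w = stdp_update(w, t_pre=30.0, t_post=22.0)   # post leads pre: w goes down
```

Keeping only the most recent spike time per neuron is what makes this variant cheap enough for the CPU/FPGA split in the talk: one timestamp per neuron instead of a growing spike list.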
The performance figure for my platform is 400,000 neurons calculated in real time, and it supports up to 800,000 neurons, which corresponds to around 16 cubic millimetres of cortex, though that's just an approximation. As the size grows larger, the performance drops, of course, but it's around 0.5x real time for 1,000 random synapses on four FPGAs. The major drawback is that it's pretty hard to convert an algorithm from CPU-based to FPGA-based: you need to do some parallelization or streaming work to avoid loops in your program, and it's harder to debug. And the drawback that matters most to most people is that it costs around 10 hours of compilation time; but if you use the same design and just change the data set, you don't need to recompile. Now we are migrating to a new six-FPGA platform and are going to do some modeling work on it. In summary, I've introduced the FPGA as a very attractive platform for neural simulation. We've introduced low-level customization and fine-grained parallelization, but we can also do high-level computation, the things that the other high-level platforms are able to do. Thank you.

Okay, thank you. Okay, one quick question up here. So you're saying that you're using Java? It's not regular Java code; it's Java code to describe the computation flow inside the FPGA, a custom build by the vendor themselves. How good is it? Oh, no, it's not a direct translation. You can treat it as a different language, but it uses Java syntax and objects. You can treat it that way. I'm sorry, I'll follow up with Kit later, because we have to move on for time. So let's thank him one more time.

Our next speaker is Krishnan Padmanabhan from the Salk Institute, talking about large-scale, whole-brain mapping of inputs to the olfactory bulb. Hello, can everybody hear me?
Great. I want to thank the organizers for this generous invitation to come talk to you about some of our work. I'm going to begin with a neurobiological question, use that to motivate a neuroinformatics question, and then talk about some of the work that we've been doing. My chief interest in neuroscience is really about understanding sensory coding. Think of sensory coding as the problem of taking a complex stimulus, in this case this cheese in the natural world, which is filtered through a neural circuit made up of a number of neurons and a number of synapses; those neurons generate some complex pattern of activity, which we can analyze and think about in terms of computations, whether in terms of the information it provides about the stimulus, the pairwise correlations among the spike trains, or some other metric we choose, and hopefully use all of this to try to understand the behavior of the animal. But what I want to do is focus specifically on the neural circuit itself. Many of the initiatives that people have discussed here today are focused on trying to understand the anatomical connectivity of the brain, whether at the level of EM reconstructions of small regions of a circuit, or whole-brain imaging, or whole-brain mapping. There's an intermediate problem, which is to understand the connectivity between neurons, individual cells, from one brain area to another. So what we've been doing is developing both biological tools and computational methods to get at this problem. And this challenge of circuit mapping can be summarized in a very simple way using this cartoon.
Imagine two brain areas with neurons in each, and we want to understand the connectivity of this neuron to the population of cells in the local area, something that may be possible or tractable using EM. But we also want to understand the connectivity of this cell to these other areas. This cell is unlikely to be connected to all the neurons here, and unlikely to be connected to all of these neurons; what we really need is a tool that will allow us to investigate this wiring diagram. So I want to begin by talking about some of the biological tools, then some of the computational methods we're developing, and hopefully bring those together to talk about the kinds of research questions that are motivating their integration, and some really exciting work that I think is going to take this project forward. In terms of the biological tools, we need something that can label circuits, something that can actually label synapses. On the computational end, we need something that can image, in principle, an entire mouse brain, including all of the neurons and the structures, perhaps at the level of dendrites and possibly even individual synapses. The technique we've been using at the biological end is a tool that was developed in Ed Callaway's lab here at the Salk, where I am: the G-deleted rabies virus. I'll talk about it in just a moment, but it has a number of advantages. First of all, it's transsynaptic: when we inject this virus into a population of neurons, the virus jumps one synapse and labels all the connected cells at that synapse, so it's highly specific for actual connections. It also fills the entire neuron, so the virus essentially labels the structure, the morphology, of the cell. And the virus can be pseudotyped for high specificity.
So not only can we study the connections of a cell in a given area, we can study the connections to subtypes of cells within an area. On the computational end, what we really need is a way to capture all this data: to do high-resolution imaging of the entire mouse brain, and then to automate things like the alignment of that imaging, automated cell finding, and indexing, hopefully registering it all to some universal or generic mouse brain, and then use this, together with some analysis, to drive questions that are interesting to us. I'm going to give you one example of the way we're using this, in a system that's near and dear to my heart: the olfactory system. What you're looking at here is a cartoon, a schematic of a mouse brain. The front of the mouse brain is here, and this is the olfactory bulb, which is the first synaptic processing area for olfactory information, odor information, from the natural world; this is the back of the brain. What we do is make an injection of the G-deleted rabies virus into the mouse olfactory bulb, and we target a subpopulation of cells within that area. What you're looking at here is a coronal section: we've basically chopped a chunk of the brain here in the front, and you're looking in on the brain. In blue are the nuclei of individual labeled neurons, and in red is the injection site, the place where we put the rabies virus. What we're trying to do is use this technology to find all the neurons that project to, or connect to, the population of cells that we've labeled. And if we look further back in our sections, and I apologize for this being a little washed out, hopefully you can see this here.
There are a number of red cells that have been labeled, and these cells are millimeters away from the injection site, but they represent synaptic partners of this population of neurons. If we zoom in a little more, what we can see, in an idealized image, is not only the neurons but also their dendritic processes. So not only do we get the connectivity, the population of cells linking to this population in the olfactory bulb, we also get their morphology; we get two pieces of data. Now, the real challenge is that what you're looking at is a single slice in one part of the brain, another slice in another part of the brain, and a zoom-in at higher resolution in yet another part. We're looking at little chunks of a complete picture, and what we'd really like to do is synthesize all of these chunks and put them together into a full representation of this circuit. So what we've been doing is reconstructing these entire mouse brains at fairly high resolution, at the resolution of individual dendrites. What I'm going to show you is a movie of one of those reconstructions. We made a very large injection here in the olfactory bulb; you can see one hemisphere of the bulb and the other, and you can see that the virus spilled over and labeled the population. If we spin this brain around, we can see that the virus travels over very large domains, almost all the way to the caudal end of the mouse brain, as well as to the contralateral hemisphere. So this viral labeling technique allows us to get at circuits and the connections between cells over great distances. We can label all the way to the caudal part of the entorhinal cortex, and we can also look at populations of neurons in regions like the piriform cortex and the amygdala.
These are the areas you saw in those sections, but what we're doing now is amassing the information over the entire data set. To give you a feel for how big these data sets are, one of these mouse brains fills up about a terabyte of data, and what we've done is take these mouse brains and figure out a way to analyze them. Without going into details, and glossing over some things, what you're looking at now is a representation of that mouse brain. Each of the individual sections is represented here in gray lines; the front of the mouse is here, the back is here, and this is just the convex hull of those individual sections. Each of these little red points corresponds to one neuron that has been identified with an algorithm, and all of these neurons connect to this location here, which is the injection site. What we can then do is classify where in the brain these cells belong, and you can find the olfactory bulb, the two anterior olfactory nuclei in the ipsilateral and contralateral hemispheres that you saw in the previous image, as well as the piriform cortex, the amygdala, and other regions of the brain. So what we can do is use this technology to motivate scientific questions, which is what I'm really interested in. For instance, we can change the size or the location of our injections. Here we've made a big injection in blue and a small injection in red, using two different colors of rabies virus. If we advance this all the way back into the piriform cortex and zoom out, you start to see the cells up here, the cells for the blue injection here and for the red injection here. And when we spin this brain and look at it from the top, you'll already start to see differences in the distribution of cells.
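The classification step just described, assigning each algorithmically detected cell to a brain region, can be sketched as a nearest-centroid assignment. This is a deliberately crude stand-in for real atlas registration, and the region names, centroids, and coordinates below are invented for illustration:

```python
import numpy as np

# Hypothetical atlas: a few region centroids in mm (values illustrative).
regions = {
    "olfactory_bulb": np.array([1.0, 0.0, 0.0]),
    "piriform_cortex": np.array([4.0, 2.0, -1.0]),
    "amygdala": np.array([6.0, 3.0, -2.0]),
}

def classify_cells(points, regions):
    """Assign each detected cell to the nearest region centroid and
    return per-region counts: a crude stand-in for atlas registration."""
    names = list(regions)
    centroids = np.stack([regions[n] for n in names])
    d = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
    labels = d.argmin(axis=1)
    return {n: int((labels == i).sum()) for i, n in enumerate(names)}

# 50 synthetic "detected cells" scattered around one region's centroid.
rng = np.random.RandomState(0)
cells = regions["piriform_cortex"] + 0.3 * rng.randn(50, 3)
counts = classify_cells(cells, regions)
```

Per-region counts like these are exactly the kind of summary that makes the blue-versus-red injection comparison in the talk quantitative rather than visual.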
Those differences can only be captured if we have all the data. So in a sense, we're extracting structure by pulling out all of the information about this connectivity. You can see here that there's a clustering of cells in the front, and the blue injection labels neurons much farther into the caudal parts of the brain. What we can do is represent this by looking at the individual distributions of cells, shown here as individual little points. You can see that there is some structure, and I won't go into the details of how we've quantified that structure. But what we're doing now is using this complete reconstruction, this tool for looking at all of the presynaptic partners, in a sense the connectome of this circuit, to try to extract principles about how olfactory computation may be occurring. So I want to conclude by saying that we have this great tool to map the connectivity of cells. It allows us to identify and classify the individual elements, including the neurons; we can reconstruct their morphology; and we can blend this with other techniques, including electrophysiology and imaging, in hopes of getting spatial and temporal data to complement the anatomical methods. I want to talk really briefly about where we're going. One of the areas I am particularly excited about is trying to model human diseases. We've had a collaboration with Fred Gage's lab at the Salk in which we're reprogramming human stem cells and transplanting them into the mouse as a way of studying, or modeling, psychiatric disorders. You can imagine the complexity of this problem. So what I'm going to do is advance this movie. What you're looking at here in blue is the edges of a mouse brain.
These two green spots represent neural precursor cells that were transplanted into this mouse brain. As we spin around, we go from about a centimeter of resolution down to individual processes being extended from these neural precursors. I think understanding both the anatomical connectivity of these transplanted neurons and their physiological properties will really give us the kinds of tools we need to make some headway into understanding the neuroanatomical and neurophysiological underpinnings of psychiatric disorders. None of this would be possible without an incredible group of collaborators. I'm a Crick-Jacobs fellow at the Salk; I work with Terry Sejnowski, Ed Callaway, and Fred Gage. Fumitaka Osakada is a postdoc I work with in Ed's lab, and he has done all the virus generation. Carol Marchetto and Bilal are two postdocs in Fred Gage's lab with whom I work on some of these transplants. And these are all of the past, current, and future funding sources who have been kind enough to let me do this work. So thank you so much, and I'm happy to take some questions.

Okay, we're running a bit behind, but if we have one question, go ahead. Yes. So the rabies tracing method is retrograde, so it's presynaptic? Yes, and I forgot to mention that we can actually control it so that we can limit it to jumping only one presynaptic partner back, and that's what allows us the specificity of knowing all the presynaptic partners. If we let the rabies go indefinitely, we would, in principle, label the entire brain at some point. But yes, it's a G-deleted variant, and I can talk to you more about the details of that offline. Great, thank you. Let's thank Krishnan.

Okay, our next speaker is Michele Migliore. He's at Yale University, and he's going to be talking to us about a 3D model of the mitral-granule cell network of the olfactory bulb. Okay, thank you.
We are interested in building a 3D model of the mitral-granule cell network as a way to understand olfactory bulb function. The olfactory bulb is one of the most studied systems, for many reasons. One of these is that it's apparently very simple in its organization: odor input activates olfactory sensory neurons, which then activate the tufts of the mitral cells, which send their axons to the cortex for odor recognition. The output of the mitral cells is modulated by a large population of interneurons, the granule cells, which, through dendrodendritic synapses with the lateral dendrites of the mitral cells, set up the connectivity between granule and mitral cells in the olfactory bulb. And it has been shown with viral tracing that this connectivity is not random, not uniform, but is composed of a set of distributed synaptic clusters that are probably formed by activity-dependent mechanisms, and they can be as fine as the size of a single granule cell. We have also shown in a model that the main mechanism responsible for this kind of connectivity is the interaction between the backpropagation of action potentials along the lateral dendrites of the mitral cell and the local activity of the granule cells. Using this mechanism, in a larger but still one-dimensional system, and taking data from experiments done on the dorsal part of the bulb for about 70 odors and about 70 individual glomeruli, we have been able to explain several experimental findings and to predict new results: the effect of lateral inhibition on the network self-organization during odor presentation, the formation of synaptic clusters as observed in the experiments, and the spike-time distribution in single mitral cells following single sniffs during odor presentation. But the problem is that, as in many other systems, if you want to study microcircuits, this requires a new generation of 3D computational models.
So in this case, the olfactory bulb is an excellent model because its investigation for natural odors requires a full 3D implementation, because natural odors activate a large portion of the bulb. So in order to build this model, we need basically two kinds of input data, two kinds of experimental data. One is the input that drives the network to self-organize, and we got this from the Alan Carleton lab for 128 glomeruli and about 20 natural odors. And we also need full reconstructions of mitral cells. And this is not an easy task; the Kensaku Mori lab gave us a few mitral cells, and it takes two months to process a single cell in its entire dendritic extent. So we got these cells and we extract statistical parameters such as the growing direction of the dendrites, the path length, the branch length, the bifurcation probability. And we think this method is quite general because it does not depend on the kind of population that you want to synthesize; as long as you have experimental constraints, you can apply it, as I will show you in a few slides, to Purkinje cells for example, or you can grow a network for building a 3D model of cell development. So with these data, collected in such a way that the cells will grow according to the system that you want to model, in this case an ellipsoid for the bulb, you can get mitral cells which are statistically indistinguishable from the real cells, and you can grow basically an unlimited number of mitral cells in 3D and build your bulb. So the model is a fully integrated NEURON plus Python parallel implementation, and a typical 20-second simulation for about 200 mitral cells, 40,000 granule cells and 1.5 million synapses takes about three hours on 2,000 processors on a Blue Gene with 98% efficiency. Let me switch for a couple of minutes to show you an interactive view of the model. So this is the full 3D model, using public-domain software, MayaVi2. And so let's start with the...
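The growth procedure described here, sampling branch length, bifurcation probability and growth direction from statistics of reconstructed cells, can be sketched roughly as below. This is a minimal illustration, not the authors' actual algorithm; all parameter names and values are invented.

```python
import math
import random

# Illustrative statistics one might extract from reconstructed mitral cells.
BRANCH_LENGTH_MEAN, BRANCH_LENGTH_SD = 60.0, 20.0  # microns (made up)
BIFURCATION_PROB = 0.35                            # chance a branch splits (made up)
MAX_DEPTH = 6

def grow_branch(origin, direction, depth, rng):
    """Recursively grow one dendrite as a list of (start, end) segments."""
    if depth >= MAX_DEPTH:
        return []
    length = max(5.0, rng.gauss(BRANCH_LENGTH_MEAN, BRANCH_LENGTH_SD))
    end = tuple(o + length * d for o, d in zip(origin, direction))
    segments = [(origin, end)]
    if rng.random() < BIFURCATION_PROB:
        # Split into two daughters, each rotated slightly in the xy-plane.
        for sign in (+1, -1):
            angle = sign * rng.uniform(0.2, 0.6)
            dx, dy, dz = direction
            new_dir = (dx * math.cos(angle) - dy * math.sin(angle),
                       dx * math.sin(angle) + dy * math.cos(angle),
                       dz)
            segments += grow_branch(end, new_dir, depth + 1, rng)
    else:
        # Otherwise continue growing in the same direction.
        segments += grow_branch(end, direction, depth + 1, rng)
    return segments

rng = random.Random(42)
tree = grow_branch((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), 0, rng)
print(len(tree), "segments grown")
```

Repeating this from each soma, with the growth constrained to an ellipsoid, would give the kind of synthetic population described in the talk.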
Of course, we are also modeling the full set of 1,800 glomeruli, but we are waiting for experimental data to include them in the model. At this point, we are using only the 128 for which we have the raw data. If we look at a single mitral cell... Okay, so this is a single mitral cell and the set of granule cells that is connected to it, and we can do other things with this nice graphical interface. If we connect a simulation history file with this interface, we can ask what's going on at this granule cell, for example, which is connected to this dendritic segment, and this is the time evolution of the 20 seconds in this case, and the firing rate of the granule and the mitral cell at this position. Or we can ask what's going on at this dendritic segment; in this case, there are two granule cells, and these are the history files. So if we look at the network, I hope to be within the one-minute warning flash. Yeah, good. So this again is not the entire system, of course. It's just a couple of glomeruli and a few mitral cells, just to show you how the system is organized. And of course we can also do some nice things to compare our simulations directly with the experimental data. We can do, for example, a slice. And in doing this, we can realize that in the experiments, when you do a slice, you are losing most of the dendrites of the mitral cells that you are studying. Okay, so if we want to look at our system, our model, our simulations in another way, we can, again using only public-domain software, do a simulation of the sniffing for a larger system, but still not the full system, for visualization purposes. In this case, there are just a few glomeruli and one mitral cell per glomerulus, and a couple of sniffs will give us a way to look at the network activity and the signal propagation with time during odor presentation. By applying exactly the same algorithm to Purkinje cells, we get this.
On the left are real cells; on the right is our implementation of just two sheets of Purkinje cells. What? Yeah, okay. Less than one minute to show the conclusions: we have developed a method using experimental constraints to implement computational models of olfactory bulb microcircuits. And the most important part is that, in this case, the realistic neural elements interact in a 3D space. So we think that we are implementing, and can study in a much better way, the real system. The simulation runs very efficiently on supercomputers, and the model can be easily expanded, of course, to further refine the system by adding more kinds of neurons or mechanisms or ion channels and so on. And the method is general enough to be used for implementation of other brain systems. Thank you. Quick questions for Michele? I think for the sake of time, we'll just go on to the next speaker. Let's thank Michele one more time. Okay, our next speaker is Shreejoy Tripathy from Carnegie Mellon University. He'll be talking about reusable experiments. Hi. Thanks for having me come to talk to you guys. So I'm gonna talk about making the results of small data reusable. So, you know, as neuroinformaticians, we've built a lot of great tools for data sharing. Like we have the CRCNS website. We have this data space thing we've heard a lot about. But I think that these resources are really great, but they're really underutilized. You know, of all the terabytes of data that get produced across all the labs that do neuroscience, maybe like 0.1 or 0.001% of all that data are in these really great resources. And that's because there are legit barriers to data sharing. Like for example, there are social barriers. Like someone will say, well, why should I share my data with you? What's in it for me? What if I get scooped? Or, hey, it's my data, get your own data. So these are real barriers and yeah, they're real.
And then there are also methodological barriers, in that right now it's really hard to share data even though we have those great resources. There are questions that experimentalists have, like: how do I share data? What do I share? Or: going back and annotating my experiments in a way that they'd be useful to someone else is really time consuming, so it's not really worth my time investment to do that. I think as informaticians we could really work on these methodological issues, and then, you know, yell at our funding agencies and work on the social issues. But we need to do both. So I wanna talk about some of the methodological issues. So this is the idea for the project we have. It's basically an experiment: what can we do to make a standard, run-of-the-mill neuroscience lab more data-sharing savvy? Okay, so the idea is we're gonna go into my lab, the Urban lab at Carnegie Mellon. We do slice electrophysiology, and we're gonna incorporate structured workflows and informatics and all these really cool things. And we're gonna try and make it to a point where it's really easy for my lab to share the data openly on the web. And we wanna know: what does it take? And where are the points of conflict? So this is again this sort of social experiment. So the insights and motivations for this are that if you wanna share data, you can't just share the raw MATLAB file with, for example, the voltage traces. You have to also share the metadata. And for us, that's in literally physical lab notebooks. So if I wanna share my data with Stephen or something, I have to send not just my data file but also maybe images of the lab notebook. That's just a fact of life: the metadata is in these lab notebooks. The second motivation is that you know the most about an experiment as you're performing it.
Right now, the plan for data sharing, I guess, is that people are like, oh hey, I'll share my data after my paper's accepted. But by that time, they've sort of moved on and they've more or less forgotten what they did or how their data is stored. And so if you're gonna do something to make data sharing possible, the earlier you do it in the experiment's lifetime, the better off you are. And then the last motivation is that a lab that practices best practices for data sharing should be more productive. So the way most wet-lab neuroscience goes is that a single investigator collects some data, maybe like 50 or so neurons, and they publish a paper, and that's great. So for example, in the Journal of Neurophysiology, which is an okay journal in my field. But I argue that if you practice these best practices, then you can work with more people. So you can work with, say, your lab mates. You can pool your data across your lab mates, or pool data across your collaborators down the street, maybe across the world. And then these people working together can maybe get a better paper, like in Nature. And so this is good; everyone's going for more productivity. We have to sell this data sharing argument to wet-lab scientists, and I think this is the way to do it: we will make each scientist more productive. So anyway, those are the ideas. So our project schematic is this: this is the standard electrophysiology data acquisition pipeline. There's a neuron, you have a computer, you record data from it. Okay, and this is what we've added to this project. So our key innovation is the idea of storing metadata on electronic lab notebooks, like tablets. Okay, so rather than storing metadata in a physical lab notebook with pen and paper, you're going to store it electronically on a tablet.
And so now, when you store metadata on your tablet, you have a data file that's the experimental file plus your metadata file. You put those together in some cloud-based data server. And when we do this, we can also incorporate some semantic technologies into our electronic lab notebook. And when you have all this data in one central place, you can do cool things with it, like visualize it in interesting ways, list the experiments, and mash up the data in ways you couldn't before when you just had the data from a single investigator. Okay, so this is our metadata app that I'm showing. Again, it's an electronic lab notebook running on tablets. And the way we engineered this app is that it matches the workflow of slice electrophysiology, where you first prepare your animal, then you cut slices, then you pull electrodes and then you record from neurons. And each of those parts of the experiment is captured in the app. And the reason why you want to use an electronic lab notebook versus a pen-and-paper lab notebook is that it allows structured data entry. So in my lab, we use transgenic mouse strains where individual cells can be labeled with GFP. And so rather than scribbling down into your lab notebook what your animal strain was, you enter that by pushing buttons on this app. So for example, they would push one of the things on the app, and that would register that they were using that strain of mouse. And doing it in this structured way allows us to easily incorporate semantic technologies. So for example, when you tag mouse strains, we can tie that in to the Mouse Genome Informatics database, which is just a listing of mouse strains and unique identifiers. So this is great. And we sort of do this with all the attributes in the app.
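The structured-entry idea just described, a button press resolving to a persistent database identifier, might look something like this in miniature. The strain labels and MGI accession strings below are placeholders, not real registry entries, and `tag_strain` is a hypothetical helper, not the app's actual code.

```python
import json

# Hypothetical local registry mapping app button labels to database identifiers.
# The MGI accession strings are placeholders for illustration only.
STRAIN_REGISTRY = {
    "GAD67-GFP": "MGI:0000001",
    "Thy1-YFP":  "MGI:0000002",
}

def tag_strain(label):
    """Return a structured metadata record for a strain button press."""
    mgi_id = STRAIN_REGISTRY.get(label)
    if mgi_id is None:
        raise KeyError(f"unknown strain label: {label}")
    return {"attribute": "mouse_strain", "value": label, "identifier": mgi_id}

record = tag_strain("GAD67-GFP")
print(json.dumps(record))
```

The point is that a controlled identifier, rather than free-text scribbling, is what makes later cross-lab aggregation possible.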
We can directly tie them onto their semantic equivalents. But in making this lab notebook structured, it's important to strike a balance between flexibility and rigidity. If it's too rigid, then the experimentalist is gonna be like, hey, screw you, I'm gonna go back to my lab notebooks. But by making it structured, we can use that structure later on. One nice thing about having an electronic lab notebook is it allows you to add new content. So for example, this is an image of a mouse brain atlas. And when the experimentalist is recording from a particular neuron in the brain slice, they can just say, oh hey, my neuron is about here on the atlas. And then we can aggregate across these and do cool analyses across neurons, across atlases. Okay, right. And so to make the metadata app work, we have to synchronize it with our electrophysiology data acquisition tools. We just have APIs that allow them to communicate with each other. But the end result is that each trace of data that's collected, so each voltage trace of the neuron, each spike, is registered to a bit of unique metadata. And so we can go back and forth between them, which is great. Oh, cool. Oh, right. So now we're inserting ways of visualizing this data. Right now we have a simple listing of experiments, basically what you'd see in a lab notebook page, and just a web-based way of viewing the actual data that's collected. Down the line, we wanna use the metadata that we're collecting via the tablet app to sort the experiments in smart ways, like sort them by who did the experiment or what animal strain was used. And ultimately, we'd like to enable in-browser analyses. And we're really interested in tracking the provenance of data.
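The trace-to-metadata registration just described could be sketched like this: the acquisition side asks the notebook side for the currently active metadata record and stamps every saved trace with its id. All class and function names here are hypothetical; the real app and the ephys software communicate over their own APIs.

```python
import hashlib

class NotebookSession:
    """Toy stand-in for the tablet notebook's metadata store."""
    def __init__(self):
        self._records = {}
        self._current = None

    def start_record(self, metadata):
        rec_id = f"rec-{len(self._records) + 1}"
        self._records[rec_id] = dict(metadata)
        self._current = rec_id
        return rec_id

    def current_record(self):
        return self._current

def save_trace(voltage_samples, notebook):
    """Bundle a voltage trace with the id of the active metadata record."""
    return {
        "metadata_id": notebook.current_record(),  # link back to experimental context
        "n_samples": len(voltage_samples),
        "checksum": hashlib.sha1(str(voltage_samples).encode("utf8")).hexdigest(),
    }

nb = NotebookSession()
nb.start_record({"strain": "GAD67-GFP", "slice": 3})
trace = save_trace([0.0, -65.1, -64.8], nb)
print(trace["metadata_id"])
```

With every trace carrying a metadata id, going "back and forth between them", and later provenance tracking, reduces to following these links.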
So if you have a voltage trace and you're doing some analysis on it, we wanna track that back to the original voltage trace. That's the end goal. So our next steps are that we wanna actually use these tools that we're developing in the lab. We've used them a few times, but only for testing. We wanna use them for long enough that we have a data set where we can ask questions like: what are the properties of neurons throughout the brain that we weren't really intending to look at, but can look at now that all this data is structured in one place? We wanna know how easy it is to expose these data sets to other databases, like the INCF database or NIF. You know, this solution is custom to my lab, but lots of labs do electrophysiology, so perhaps it's adaptable to other people. And I think this is the way to go, but who wants to pay for this? I don't know. So let me just acknowledge my lab at Carnegie Mellon, and also say that, you know, not all of Elsevier is evil. Some of them are, I don't know, or whatever, but thankfully they paid for this project. It was really cool. So me as a grad student didn't have to code up this app or code up this data server. They paid people to do that, which is awesome. So yeah, thanks. Thanks. Question back here. Yeah, not many. Like 200? Yeah. What has been the uptake? Do people resist it or embrace it? I think it's back and forth; it's different from the standard way of doing things, and it's sort of a different workflow. And right now we're striking this balance in our app between making sure it's not too time consuming versus actually capturing the data that we want. So the first time we made the app, you had to sort of click everything, and that was too much work.
And so now we're redesigning the app so that you capture less information, but it may be more likely to be used. There's a balance. Okay, let's thank Shreejoy again. Okay, our next speaker is Fan Meng from the University of Michigan, and he'll be talking about PubAnatomy 3D, integrating MEDLINE exploration with the Allen Mouse Brain Atlas. Can we switch to Anita first? We need to work on your setup. Okay. So Anita, can you come first? So, Anita Bandrowski from the University of California, San Diego, and she'll be talking about a unified research resource layer and experiences from the NIF. So, hi, my name is Dr. Anita Bandrowski. I work for the Neuroscience Information Framework, and today I wanna shift gears a little bit in terms of what everybody's been talking about. They've been talking about a lot of really great projects. And what I wanna do is look across a lot of those different projects. Certainly I don't need to tell anyone in this room that we're changing the way that we communicate with each other as a scientific community. We are no longer restricted by what goes on pen and paper. So we are actually creating a lot of web-enabled resources, and just like PubMed had to go in and say these are all the great things that have been published in the biomedical sciences, we're trying to do a similar thing with a catalog of all of the things that PubMed doesn't really cover. Databases, tissue banks, software tools, services: these are all intellectual outputs of the scientific community. And you can imagine that there are some kinds of differences in the way that we want to catalog these things. So we have a resource catalog. It contains a lot of the projects that we heard about today. There are things like image repositories. We label these things with structured metadata, and it's our curators that go in and actually structure these projects.
You've already seen some of these, but I swear I put these slides together before today. So there's the 1000 Functional Connectomes, there's the 1000 Genomes, et cetera, 3D BAR. We have structured data such as where these things are from. There are a lot of different arrows on this slide, and the only thing I want you to notice here is that a lot of these steps are machine steps. So there are machine processing steps that we take in order to extract the fact that these resources exist out of the web and out of the scientific literature, but there are a lot of human steps involved here too, and those are all the red arrows. Whoops, whoops. Ooh, yeah, it's not working too well. All right, here we go. All right, red arrows. Okay, these are human steps, and these are the steps where curators are actually hard at work trying to assign these various metadata. So can we actually look at some of the relationships between these different resources that we have? And actually, the answer turns out to be yes. So we've done a bunch of different types of analyses on who's related to whom and how. And these human annotations really tell us that, for example, here the PDB, this is the Protein Data Bank, is related to and spawned a whole lot of extra resources. I'm sorry, this is not one of your fantastic plots; this is straight out of one of those hairballs. But anyway, what it does tell you is that a lot of resources are related to the PDB via these human annotations, but then there are also a lot of annotations that come in from a text mining pipeline, which extracts out the fact that the Protein Data Bank is mentioned in all these various papers. And so we have different kinds of data about each of these resources once they enter into the registry, which I implore you to do with all of your resources, if you wouldn't mind. I swear it only takes five minutes. All right, so there is this shared resource registry idea.
So, we have a big registry, and we're actually asking other people to use it in different ways. And what a lot of people have been starting to do, including these projects here, like the Gene Ontology, the Monarch Initiative, One Mind for Research, is that they've actually used our registry, tagged the resources within it as their resources, added additional fields of metadata, and now we know something additional about what is really exciting about those particular resources. We know, for example, how many tools are in common between the 3D VC, this is the three-dimensional Virtual Cell community, and the Gene Ontology tools. So there are some tools that are gonna be interesting to both communities, and we can actually know that. Some of the text mining steps that we take can be represented here in the blue line. So what this is: these are resources, added in red, I know that's not really red, and here is the last time that each resource was updated. So we have a little crawler written, and it extracts out all the dates from the websites that we actually look at. And this blue line here represents the plot of the date that we find on the website, which is the last time that a web page was updated, against the number of resources where we find that date. And what you notice, when we ran this at the beginning of 2013, is that most of the sites, so about half, a little more than half, were actually updated within the last year, but the other half were updated somewhere all the way back to the year 2000. So there are some websites that appear to have died. We've actually got an analysis now of how many have changed their URL, moved from one university to another, so that's an interesting bit of information. And we had an idea, in terms of the last five years, of how many of these wonderful projects that you've seen here today have just flat out disappeared. So about 3% have disappeared, and about 8% change where they are within a year or two.
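The crawler step described above, scanning each resource's page for a "last updated" date and bucketing resources by year, can be sketched like this. The HTML snippets, resource names and date formats are invented for illustration; the real pipeline has to cope with far messier pages.

```python
import re
from datetime import datetime

# Match an ISO-style date following "last updated" (one illustrative format).
DATE_RE = re.compile(r"[Ll]ast\s+updated[:\s]+(\d{4}-\d{2}-\d{2})")

def last_updated(html):
    """Return the page's last-updated date, or None if no date is found."""
    m = DATE_RE.search(html)
    return datetime.strptime(m.group(1), "%Y-%m-%d") if m else None

# Stand-in for fetched pages keyed by resource name.
pages = {
    "resourceA": "<footer>Last updated: 2012-11-03</footer>",
    "resourceB": "<footer>last updated 2001-05-20</footer>",
    "resourceC": "<footer>no date here</footer>",
}

by_year = {}
for name, html in pages.items():
    d = last_updated(html)
    if d:
        by_year.setdefault(d.year, []).append(name)
print(by_year)
```

Plotting the year counts from such a scan is what produces the "half updated last year, half back to 2000" picture the speaker describes.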
So we assume that these are postdocs that go on to their faculty positions and take their databases with them. Now, any time a paper is published, you would imagine it points to a particular URL. Within a couple of years, that URL is no longer going to work. So papers aren't necessarily very good ways to get back to the information about where some of these nice web resources actually live. Okay, so databases can have new data every day. How can our curators possibly keep up with that information? And the answer is: we can't. So here are some wonderful connectome resources, some of them represented in this room. I swear I won't point the laser pointer at you, Rembrandt, but I will point it at your lovely database. Now, these are all statements about connectivity, and these are wonderful statements about connectivity. We want to have more of them, but we understand that they will update. And so what we have done, and this is our wonderful collaborators at Yale, what they've been able to do is write a lot of these files and standardize how those things are written, so that we actually grab the data from all of these different databases and then give the curators push-button control to start servers, stop servers, deploy crawlers and other tools. And so what we've got now in the Neuroscience Information Framework are about 200 data sources that are this deeply crawled, constituting about 390 million records, with about 1.5 million direct links to articles now sitting inside of this data.
They are divided by curators into types, so we have all kinds of types of data: there are fMRI images, there are phenotypes, statements about drugs, connectivity obviously, animals, different models (we've seen a lot of wonderful models), so those are the kinds of data that there are. And because of the uniform nature of this, we actually have a uniform search that is executed across all of the data sets that we have deeply registered. And here I'm searching for a particular structure, the subthalamus. Our system recognizes that the subthalamus actually has all of these synonyms, so you can read them off, usually on the right-hand side. The data types are listed on the left; you can take a look at any search and look at what kind of data come back. And then the databases are listed in the middle; you can click into each one of those and actually find the information. So the subthalamus actually turns out to have a lot of connectivity data and a lot of genetic data. You can always click on these little bar icons here to get the category graph or heat maps. I'll show you one of those in a second. So this is what's called the heat map. You actually need to register as a user for that, but that's just giving us your email address, and then you can run this for any term. Here I've run "brain", and what we have is basically the number of data records per data source, which is across the top, and in this case per brain region. So this is an ontology that's relatively complete. Dr. Shepherd is in the back there, and he and others have listed out all of the possible brain regions that you could possibly have. One minute? Okay, so what I wanted to do very quickly is to not make you stand up and sit down, because I don't have enough time, but I wanted to ask you all to identify this particular entity. This is the homunculus, very good. All right, so we've got the homunculus.
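The synonym-aware search just described can be illustrated in a few lines: a query term is expanded to all of its ontology synonyms before matching records across sources. The synonym set and records below are made up; NIF's actual vocabularies and indexing are far richer.

```python
# Toy ontology: one preferred term mapped to its synonyms (illustrative only).
SYNONYMS = {
    "subthalamus": {"subthalamus", "ventral thalamus"},
}

# Stand-ins for records harvested from different registered databases.
RECORDS = [
    {"db": "connectivity-db", "text": "projections from the ventral thalamus"},
    {"db": "gene-db", "text": "expression in the subthalamus"},
    {"db": "other-db", "text": "cerebellar cortex recording"},
]

def search(term):
    """Return records matching the term or any of its known synonyms."""
    terms = SYNONYMS.get(term.lower(), {term.lower()})
    return [r for r in RECORDS if any(t in r["text"].lower() for t in terms)]

hits = search("subthalamus")
print([r["db"] for r in hits])
```

Without the synonym expansion, a literal match for "subthalamus" would miss the connectivity record, which is exactly the failure mode a uniform, ontology-backed search avoids.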
I'm not going to describe this because you already know what it is, but I want you to understand that we actually have a data homunculus, and we can get at what this data homunculus looks like. The data homunculus looks kind of like this. Because we have these wonderful cool things like a search across 200 databases, we can actually figure out: what is the most popular brain region, and what is the least popular brain region? So you notice that, whoops, the Simpsons comic book guy and the superior cerebellar peduncle of the midbrain actually happen to be quite unpopular. All right, so we can do a lot of other things in terms of checking popularity, seeing where we actually have a lot of data and where we don't have a lot of data. I'll try to finish up quickly here, but I just want to implore all of you to look at the midbrain, because the midbrain is a very sad structure right now in terms of how much data we have for it. The midbrain is sort of like potpourri. We're not really sure what it does, apparently. We're not sure why we have it. So is it purely decorative? I don't know. But someone should probably look at that, because it's there, potentially, for some reason. But apparently we have absolutely no data to tell us why. Okay, so this, very quickly, is just that we can actually see the different databases and how many brain regions they actually cover. So I think, I'm not sure if Mihail is in the audience, but he wins. He's got over 90% of the brain regions in BAMS. And then you can take a look at this as you like; I will have it available. So what I wanted to end with is to answer all of your questions about: why don't you have my database of commissural nucleus of the vagus nerve images in the NIF database? And the answer is: please recommend a resource if you have one. I would love to get it. If you would like to talk to me, there is my email address. You can also email info at neuinfo. Also get ahold of me.
I was theoretically at this poster earlier today, and I'll be over at the demo session, if that's going on, tomorrow. You can also click on the feedback button. Always a good idea. Let us know what you have that we don't have, that we don't know about. All right. Thank you very much. A quick question? Everyone seems to know so much about the NIF already. Okay, for the sake of time, I think we'll move on. Let's thank Anita one more time. Could you lower the resolution a bit? I think it's your resolution. Okay. Hi again, this is for Fan Meng from the University of Michigan. Is that gonna work? Good afternoon. My name is Gadry Yang. I'm not Dr. Fan; Dr. Fan is on vacation. I'm an intermediate programmer from the Molecular and Behavioral Neuroscience Institute, which is a part of the University of Michigan. I will introduce you to PubAnatomy 3D, integrating MEDLINE exploration with the Allen Mouse Brain Atlas. I will make this very quick, since there's the time limit, so if you have any questions, you can just stop me and ask. And this is the outline; I will go over all of this as quickly as possible. So, why PubAnatomy 3D? The data from the Allen Institute for Brain Science gives a huge opportunity for learning the functional implications of genes in the brain. But many molecular biologists don't know brain structures or their functions very well. So linking the Allen Institute's gene expression data to the MEDLINE literature will facilitate the integrated exploration of data and literature related to genes. So the essential function of PubAnatomy 3D is to let the user search the MEDLINE literature and then visualize brain structures and gene expression from the search results. So there are two types of modules, the search part and the visualization part. The search part has three modules: the query builder, the results and the results summary. And okay, this is the query builder. The query builder is where the user inputs MEDLINE search terms.
In here, the user can search on a lot of fields, as you can see here. And the user can also nest search terms with AND, OR or NOT; there's no limit, there can be many levels. A user can drag and drop terms to copy or cut; all these terms, you can actually drag and drop. I will show it in a little while. And then the results. The results view is a big data grid, and it lists all the results. It has a lot of columns here, as you can see, and the PMID column will link the user to the full-text literature. The results summary will scan all the result records and list how many times each term is found in each field. So it's like a top-N list for the selected fields. And the user can also filter the list with a prefix. The user can drag one of these field terms back to the query builder to refine the search. And among these fields, genes and brain structures can be imported into the visualization modules; the user can just drag one of these genes into the visualization modules. It's all about drag and drop, as we'll see. And here are the visualization modules; there are three. We have the tree grid, the 3D stage and the section view. And the tree grid, well, this is the snapshot. The tree grid is this grid; it lists brain structures, and it has a tree column for parent-child relations. Which brain structures are listed is determined by what type of view the tree grid is in. So this one is all structures; more than 860 brain structures are listed here. There's only one instance of this tree grid. And this is the 3D view, and the 3D view only lists the structures on the 3D stage. And this is the section view, which is dynamically created by the user, and the section view's tree grid lists structures from that section view. And here is the tree grid with a gene column. So the user drags a gene from the search result summary into any tree grid, and the gene becomes a column.
And all the tree grids always sort from left to right. So this gene is on the left, so they will sort by this gene, and you can see the gene expression level; the highest is on the top. The user can also drag a gene from here back to the query builder. Okay, then this is the 3D stage. When the user drags a gene into the tree grid, it becomes a column, and the user can select whatever brain structures they are interested in and show the gene expression here. You can see the yellow dots; those are the gene expression, and the size is the expression level. And this is the section view; you can see there is the ISH image in the background, and in the front are the structure annotations from the Allen Institute. And, well, PubAnatomy 3D supports multiple section views, so the user can compare between these views. So that's all for the slide presentation; I'm going to jump to the live demonstration very quickly. So, okay. So let's just search anything. So whatever you search, it will get... No? Okay. So any... Yeah. Right, you type here and it will give you a lot of suggestions; there will be a lot of terms you can use. Let's use brain first, and we get more than a million results. Then we can go to the summary. The summary, it's too small. And there's a lot... Here is the gene symbols summary. There's a lot, but it's just not big enough to show. And there are the diseases. Let me close this one first. And it can be bigger. Okay, let's try some things. Okay, filter by a prefix. We can put this back to the query builder; this is a quick shortcut. And click filter, and this will refine the search. And I get back to the gene symbols. And this one, this gene has 60 counts, so it might be interesting. Let me just drag it into the all brain structures view, and then close this one. And you can see it becomes a column. And let me find it here, and find the most interesting one.
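The tree-grid behavior described, dropping a gene onto the grid turns its expression values into a column and the rows sort by it with the highest on top, can be sketched as below. Structure names, expression values and the gene name are invented for illustration, not data from the Allen atlas.

```python
# Illustrative flat rows standing in for the tree grid's brain structures.
structures = [
    {"name": "olfactory bulb"},
    {"name": "hippocampus"},
    {"name": "cerebellum"},
]

# Hypothetical per-structure expression values for a dragged-in gene.
expression = {"olfactory bulb": 7.2, "hippocampus": 3.1, "cerebellum": 5.8}

def add_gene_column(rows, gene_values, gene_name):
    """Attach a gene-expression column and sort rows by it, highest first."""
    out = []
    for row in rows:
        r = dict(row)
        r[gene_name] = gene_values.get(row["name"], 0.0)  # default for missing data
        out.append(r)
    out.sort(key=lambda r: r[gene_name], reverse=True)
    return out

grid = add_gene_column(structures, expression, "GeneX")
print([r["name"] for r in grid])
```

A real implementation would also re-sort descendants within each parent to preserve the tree column's parent-child structure.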
Now I'm going to open the 3D structures view. Oh, we need the gene too. So let's go. Open all structures and drag the gene column onto the 3D stage. And we have it here. And we can just, you know, enable the 3D. It's just a little too small to see the real thing. Make it horizontal, and make all the models invisible. And there is the expression. And so make this one a little visible. So here you can see the expression levels. But also let me show you something interesting. I think we will use this one. Yeah. So this is as far as we can get. And there are a lot of functions in this application, and if you go to this URL, you can actually play with it right now. So, it's out of time. That's all. Thank you.