BIDS and Boutiques seem to be trying to solve essentially the same problem, but maybe that's just my misunderstanding of some of their pieces. So I'm curious; maybe at least two of you could speak to that. I was thinking initially of a question for Tristan, but maybe JB also has something to say.

They are similar in philosophy, but they target different objects. BIDS is really there to structure your data, and Boutiques is there to describe your applications, the programs. So they are complementary; there is no real overlap between them.

Nothing to add to that. As I understand Boutiques, it's really about describing the inputs and outputs of those tools, and BIDS is really about describing the data: what kind of data it is and what to call it. They are different objects.

There was a question there that we had to cut off at the beginning. Do you want to follow up with that question?

Okay, thank you. As you said, you applied this model to the sparse network, but as we know, many computational models now are quite complicated. And when the model is complicated and the structure is also quite complicated, quite heterogeneous, then if we try to do a larger-scale simulation, the network size is huge and the computational burden is huge. So in that case, what do you think is the key for applying the GPU to your simulations?

The key to what?

I mean, you did it successfully; what experience can we take from that? This is quite challenging when the network is huge and the model is huge, so you have to sacrifice some part for these large-scale simulations, right? Either you use a simple model with heterogeneous structure, or you use a complicated model without heterogeneity. Am I right?

I'm not sure I understand the question.

Matthew, it's simple. The HH model is complicated compared to the simpler model you mentioned earlier, right? And for the HH model you used all-to-all connections, that is, a homogeneous network structure. But in the other case, you used the simpler model, and there you applied the sparse network, that is, a heterogeneous structure but a simple model.

Oh, I think I see what you're saying. Sorry. So it doesn't have to be all-to-all. You have a two-dimensional conductance matrix, but not all of its entries need to be conductive; some of them can be set to zero. So you don't have to choose between all-to-all and sparse, if that's the substance of your question.

Can I comment on that? Hi, I'm Thomas. It's been a bit long with the back and forth about who understands what, but I think the simple answer is that one benchmark was the worst case for the GPU: not much computation in the neuron, and sparse random connections. That's the very worst you can do on a GPU. And the other one was the best case for the GPU, where the neurons need a lot of computation, Hodgkin-Huxley, and you have connections that are quite dense and regular. So you see the full range here: the speedup in the worst case was two to four times, which isn't very impressive if you have 256 cores, whereas in the best case you get the 200-fold speedup, which looks very impressive. We are showing these two because they are the extreme cases. What you're concerned about is in the middle, but we can't really predict how it will behave before you try. Does that answer your question?

Yeah. Thank you.
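To illustrate the point about the conductance matrix, here is a minimal NumPy sketch; all sizes and names are illustrative placeholders, not taken from the talk. A full two-dimensional conductance matrix can represent a sparse network simply by holding zeros for unconnected pairs, keeping the regular memory-access pattern that GPUs favor, at the cost of storing and multiplying the zeros:

```python
import numpy as np

# Hypothetical sizes, for illustration only.
n = 1024                      # number of neurons
p_connect = 0.02              # connection probability for the sparse case

rng = np.random.default_rng(0)

# Dense representation: a full n x n conductance matrix.
# Unconnected pairs simply hold 0.0, so the same regular
# matrix-vector product covers all-to-all AND sparse networks.
g_dense = rng.random((n, n)) * (rng.random((n, n)) < p_connect)

v = rng.random(n)             # presynaptic activations (placeholder values)
i_syn = g_dense @ v           # regular, GPU-friendly computation

# Sparse representation: store only the nonzero entries.
# This uses far less memory, but the irregular gather/scatter
# access pattern is the worst case for a GPU, as noted above.
rows, cols = np.nonzero(g_dense)
weights = g_dense[rows, cols]
i_syn_sparse = np.zeros(n)
np.add.at(i_syn_sparse, rows, weights * v[cols])

assert np.allclose(i_syn, i_syn_sparse)
```

The two branches compute the same synaptic input; which one is faster on a given device is exactly the dense-versus-sparse trade-off described above.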
So I have a question for everybody at the most societal level. In terms of replication and reproducibility, both of those issues: there's not a lot of incentive for scientists to do this kind of work, the funding agencies are not really minded to fund either replication or reproducibility, and the journals don't want to publish papers which are second or third repeats of the same original findings. So how do we get momentum behind these very important issues when those three groups are not really interested in this kind of work?

I don't know; it's a tough question. I think the momentum could come if funding agencies started to say: you're going to do an experiment, and you're going to base it on this and this previous result; we would like you to first replicate that result before you go on. Because nobody can be sure otherwise. Whenever people start to work on the FFA or something like that, they don't just run their new fMRI experiment; they first replicate the original finding and localize the FFA in that subject. They do it because they base their new work on some previous result, so they have to replicate it somehow. And replicability is a gradient of things: do you replicate with another method, with other data, with another parameterization? But the funding agencies could have the will to push that. Hopefully preregistration will help as well: say I want to do a replication, I submit it as a preregistration, and even if I don't replicate, it will be published. That would at least help a bit on the publication side. But there's no simple answer to that question, I think. Again, it's a bit of a cultural change, and maybe the journals themselves could say: you're basing your result on that finding, but we see it's been published only once, in one paper; maybe you have to replicate it first. Those are a few ideas.

We need the Journal of Reproducible Results.

It exists. That's the problem: it exists. There's also a journal run on GitHub for that. If I remember correctly, it has four publications under review at the moment. It's just very slow, and the incentive structure is not there.

I guess the funding agencies can build it into their grants.

That's what I was suggesting: a reproducibility component. Rather than it being a separate grant, which is hard to get off the ground, it's a fundamental component of an original grant. You have to have a reproducibility component before you do your new experiment.

Sorry, if I can add one more thing: the review process can also be changed, in the sense that you should maybe submit your data and your code along with the manuscript, and then the reviewer just replicates what you've done. And here, again, the wrappers and these pipeline tools can help a lot, because they simplify what you have to run. Here we have BIDS.
Here we have Boutiques, with which you can just give the reviewer a link and say: here is the pipeline, this is the Docker image, this is the descriptor, this is the dataset, and you can replicate what was done.

That would be a very dedicated reviewer.

Yeah, that's what I was going to say as well. Some of the journals are doing that, asking you to provide the code or the data, which are made available along with the published paper so that someone else can check the results. That would be one way to start. And if you do it that way, you might have a bigger pool of people who actually have the time to do it, rather than a harassed reviewer.

And there are also platforms that allow you to make comments on publications, a kind of post-publication review. That would be another model for it.

On reproducibility: I use the word in a general sense, but rerunning the same analysis is, I think, attainable. If you provide data and code, people can try to rerun it and check whether things work. It's going to be tough on reviewers; errors stay in code because it's just hard to read someone else's code and check things. But if you're really interested in the result, that's what you would do. Replicability is another beast: getting new data, especially for imaging, a new set of 50 subjects or whatever. That's where the data-sharing aspect is so important, and the documentation of the data is so important, because then you can say: I can't really publish this unless I've run it on a dataset with some common elements, to see whether the result generalizes. That's where the whole business of documenting the data, and making sure you can find and access it, matters.

Any other questions? Go ahead, Chen.

I was about to ask the same question Alan raised, actually, but now I'm thinking about something a little different. The question I was going to ask is how to prevent this reproducibility testing, given the very limited resources, from becoming political, and I think Alan gave an excellent answer about how funding agencies and related resources could be directed to that purpose. But now I have something different. After hearing this entire set of talks, which I think is really excellent, you clearly have a lot of very useful open-source tools for this kind of simulation and analysis work. So I have two questions. The first is technical: can you, for example, reuse each other's tools on each other's platforms for this kind of reproducibility testing? The second goes a bit beyond that: can we imagine some sort of automatic testing platform being developed in the future, one that would apply to any author in this domain? After they publish something, their result would automatically be run through such an engine to perform an automatic reproducibility test, based on datasets they can also submit. If that were the case, it could offer a very low-cost way for future researchers to reuse all of this, because both the source code and the dataset have been provided by the authors, and we would only need to develop some infrastructure to make sure such reproducibility tests can be executed again and again.
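As a rough sketch of the kind of automated engine being proposed here: given an author-supplied container image, a dataset, and declared output checksums, a service could re-run the analysis and compare results. Every name, path, and command below is invented for illustration; it assumes Docker is available and does not describe any existing platform:

```python
import hashlib
import subprocess
from pathlib import Path

# Hypothetical submission; all names are illustrative placeholders.
IMAGE = "example.org/lab/analysis:v1.0"        # author-supplied container
DATA_DIR = Path("dataset")                      # author-supplied input data
OUT_DIR = Path("outputs")
EXPECTED = {"stat_map.nii.gz": "<published-digest>"}  # declared checksums

def sha256(path: Path) -> str:
    """Hash a file so outputs can be compared across re-runs."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def rerun_and_check() -> bool:
    OUT_DIR.mkdir(exist_ok=True)
    # Re-execute the pipeline exactly as the authors packaged it.
    subprocess.run(
        ["docker", "run", "--rm",
         "-v", f"{DATA_DIR.resolve()}:/data:ro",
         "-v", f"{OUT_DIR.resolve()}:/out",
         IMAGE],
        check=True,
    )
    # Compare each declared output against the published checksum.
    # Real pipelines are often not bit-reproducible, so a production
    # service would likely need tolerance-based comparisons instead.
    return all(
        sha256(OUT_DIR / name) == digest for name, digest in EXPECTED.items()
    )

if __name__ == "__main__":
    print("reproduced" if rerun_and_check() else "MISMATCH")
```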
Yeah, it certainly seems as though INCF could play a very important mediating role in bringing these various efforts to the same table to discuss exactly the kind of interoperability we've been hearing about here.

Maybe just a comment on that. I totally agree with you that in the long term reproducibility studies should be automated; I'm not talking about replication here, but this should definitely happen. Today it's already possible to do this with what's called literate programming: you have tools that can basically re-execute a paper completely, building the figures embedded in the paper from the actual data. Of course, there is a problem when this takes time; when you need a computing cluster to do it, then you would need platforms like the one you described. But I definitely agree with you on that. I also wanted to add something about the funding of these studies. If we think more about the methodological aspects of research, there could be some funding angles. I could imagine that someone working, for instance, on registration would be interested in first replicating a study and then improving the registration step in that study, to make it faster or more accurate. In that case, it would be useful to start from a reproduced study.
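As a toy version of the literate-programming idea just described, the script below regenerates a "paper figure" directly from a raw data file, so the figure can never drift out of sync with the data. The file names and the analysis itself are placeholders, not taken from any paper discussed here:

```python
import numpy as np
import matplotlib

matplotlib.use("Agg")  # render without a display, e.g. on a build server
import matplotlib.pyplot as plt

# Placeholder inputs/outputs; a real paper would ship its actual data file.
DATA_FILE = "measurements.csv"   # CSV with header columns: condition, value
FIGURE_FILE = "figure1.pdf"

def make_figure1():
    """Rebuild Figure 1 from the raw data every time the paper is built."""
    data = np.genfromtxt(DATA_FILE, delimiter=",", names=True,
                         dtype=None, encoding="utf-8")
    conditions = np.unique(data["condition"])
    means = [data["value"][data["condition"] == c].mean() for c in conditions]
    sems = [
        data["value"][data["condition"] == c].std(ddof=1)
        / np.sqrt((data["condition"] == c).sum())
        for c in conditions
    ]
    fig, ax = plt.subplots()
    ax.bar(conditions, means, yerr=sems)
    ax.set_xlabel("condition")
    ax.set_ylabel("mean value")
    fig.savefig(FIGURE_FILE)

if __name__ == "__main__":
    make_figure1()
```

Running this script as part of the paper's build is what makes the result cheap to re-execute, whether by a reviewer or by an automated platform.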
Hi. I think it's very interesting that we have several different architecture models being worked on, for instance the Bode DB and SensoriMotorDB. I thought it was really interesting that you have a combination of private and public servers, and then when we look at the gateways, we have centralized and decentralized gateways. I'm wondering if I can get a comment on how you think we should proceed as a community along that continuum of centralization versus decentralization. Thank you very much.

Actually, I had a quick comment on the last question first. Someone was suggesting that if you have a result published from one gateway or one tool, you reproduce it on another gateway or tool. I was thinking those could be good student projects; it could be education. If INCF or someone funded students who would, either randomly or selectively, pick published results obtained with one tool or gateway and redo them with another, maybe undergraduate students, that would generate interest in research ideas and topics, and it could be part of their education.

On the question: the reason we decided to deploy our systems that way for SensoriMotorDB was based on a meeting we held in San Diego a few years back. We said, all right, let's just try to get neurophysiologists who work on a common paradigm, say a reach-to-grasp task, together, and try to come up with a way for them to share data. And we were astonished by the resistance that some of these groups had to sharing any of their data. Most of the systems we've seen today involve human brain imaging, where people seem more willing to share. In monkey neurophysiology, you train an animal for a year or two just to do the task and then spend another year recording, and people just don't want to let that go. And so our compromise was to say: okay, we'll make the system useful to you for organizing and running your own analyses, and that lowers the entry barrier to then sharing on a common system.

Can I comment on the reproducibility and publication aspect? I think eventually the good journals will ask, as far as possible, for reproducible papers. I've seen a couple of those; there's Michael Baxon's Neuron paper that was entirely reproducible: you can download the data, reproduce the figures, and so on. It is really good, and it probably didn't take him that much time to do.

I think it's at least a factor of two. It depends on how technical the people are, but it can take three, four, maybe five times longer to write the paper that way rather than just writing the paper and throwing the materials away. But there's the reusability of it, and the funding agency can say: when you publish a paper, we want you to publish it reproducibly, with a reproducibility tag. Putting the requirement on the publishing side rather than on the grant side is also a possibility.

Last question, Steven.

There's a sense in which we're all talking amongst the converted here, and what I wonder about is how we're going to get this out to cognitive neuroscientists, who publish massively more in my field, neuroimaging, than any of those of us who worry about the methodologies do. There's a massive education problem here. If I go to them and say, here are three new tools I'd like you to learn, well, they're already struggling with FSL, SPM and AFNI; they go to their two-week course somewhere, then spend a year getting really familiar with it, and they're going to resist like nothing on Earth having to learn yet another set of tools, or even swapping. So how are you going to break through that barrier? I think that's a huge hurdle.

It's a very good question, and one I have in my mind almost always: how do you make sure people do this when the incentive is so small? It's a counter-incentive at the moment, because it takes you time and it doesn't add to your CV, and the publication list is the most important thing on your CV right now. So our system is actually preventing us, and the students, from learning these things and doing them. My answer is that the funding agencies have a real responsibility here: they have to enforce that your research is reusable and reproducible, because that's public money. If you're paying taxes, if NIH is funding research, you have a say that this research should not be thrown away. That's my angle on it. And if you're a scientist, you really want to get the answers, and if you really want to get the answers, you know you have to get there. There are other possible ways: NeuroMorpho, for instance, runs a public site showing which people have shared their data and which have not, and that becomes a bit of a shaming thing, which I don't like, but they have had some success in getting data out. I think one of the key communities will be the clinical community. That's going to be really tough, and that's where a different approach may be needed.
They certainly don't have the time, they certainly don't have the money, and they have the most incredible and useful data for neuroscience. So that will be a tough one.

I would add that the importance of this issue is growing all the time. We all saw what happened with the PNAS paper; that gave everybody in the imaging community a black eye for a while, and there had to be a very robust response to put it in context. It was overdone, and we were up against it for a long time, saying things are not anything like what was portrayed, or perceived, from that article. Well, it's up to us to do our due diligence before it gets to that point. So the kinds of issues we're discussing here in terms of reproducibility and replication are becoming ever more important as data is analyzed faster and faster.

Maybe just a comment on that. I think integrated web platforms and gateways also have a say in this problem. I think we are at the point now where running an analysis on your own laptop with FSL might be more complicated than going, for instance, to a web platform where you can just click around and launch your analysis. And I think we are close to the point where we could export things from these platforms to help reproducibility. We could imagine a button saying: get me the reproducibility link; and from this link external people could download the data, get access to the tools, and actually reproduce the analysis. It's already possible in SPM, and I think in some FSL tools, to export provenance in the NIDM format at no cost: you just click a button and you have the NIDM provenance that you can then attach to a paper. So integrating things in platforms may make things easier.
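On the provenance-export point: NIDM builds on the W3C PROV model, and as a rough illustration of what such an export contains, here is a minimal sketch using the `prov` Python package. The entity, activity, and agent names are invented; this is generic PROV, not an actual NIDM document:

```python
from prov.model import ProvDocument

# Minimal provenance trace: which activity, with which inputs,
# produced which result. All identifiers are illustrative only.
doc = ProvDocument()
doc.add_namespace("ex", "http://example.org/analysis/")

raw = doc.entity("ex:raw_bold_data")        # input dataset
statmap = doc.entity("ex:statistical_map")  # published result
glm = doc.activity("ex:glm_analysis")       # the analysis step
analyst = doc.agent("ex:researcher")        # who ran it

doc.used(glm, raw)                   # the analysis consumed the raw data
doc.wasGeneratedBy(statmap, glm)     # ...and produced the statistical map
doc.wasAssociatedWith(glm, analyst)  # ...under this researcher's control

# A serialization like this is the kind of record that could
# accompany a paper; PROV-JSON is the default format.
print(doc.serialize(indent=2))
```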
I have a comment that builds on Alan's comments as well. For the big picture, if you think back 30 years to the beginning of neuroimaging, the computer scientists who were embedded with the neuroscientists built the initial platforms, and Alan was certainly one of the leaders in that. As an end user, as a neuroscientist, finding the tools to test a question was a fairly step-wise process. What's happening now is that neuroscience is thinking in terms of a Manhattan Project kind of vision, and individual scientists have much less traction. So what neuroscientists must do is figure out how to fit into a model that's bigger than any one individual, which is something the computer science and informatics worlds already realize, and we're in a period of dyssynchrony in that communication. We're being attacked with questions about data that doesn't appear reproducible, when people aren't even doing the same experiment over and over again. So I would just argue, and I'm going to talk about this a little tomorrow: how does one create a platform where the teams work together? What will drive this is big neuroscience questions where everyone agrees they're contributing to a bigger community effort. You have already done that; the neuroscience world is still figuring out how to do it when it's not incentivized well, in terms of how you get a paper out or how you get promoted. Whereas if you're the Allen Institute, you say: come work for us, we have a big mission, and that incentivizes how everyone plans these things differently. We have to figure out how to communicate that, and I think it's just a process.

I think the slide in the second talk about what's coming between 2020 and 2050 is where this is going, but there will be a lot of casualties of science as it's done now along the way. We've got to figure out how to anticipate that. One of the difficulties is whether every neuroscience question fits that framework. For the genetic work, for instance, if you have a genome to sequence, you can cut it up and push every little bit out to different labs; that's a perfect community process. But what are the questions that fall into that category in neuroscience? It's not entirely clear. It's a difficult business. So to me it's more about how we communicate the results, and what our job is, such that those questions can emerge.

Okay, one last question. I think Giorgio?

Just a quick clarification, since NeuroMorpho was mentioned: we do post the results of data requests, but purely as a matter of transparency. Whether people like it or not is obviously a matter of opinion, but we do request data, and whether the data is then shared or not, we post the information on whether it is available. It's part of the responsibility of data curators.

Sorry, I didn't mean to suggest that you were shaming anyone; that's not my point. But when you started framing this in terms of transparency, I don't know how to say it, but people felt that they had to respond. You yourself said that more people responded once that happened; if I remember the paper, I was one of its reviewers. So I was asking, is that a little bit of shaming? We have to be careful not to turn sharing into shaming. I think that's what you're saying, and I really like the transparency aspect. It's also almost an education aspect: if you're a scientist, that's the normal thing to do. And that may change in the future if we build it into education. I don't know how much of this sort of thing is actually taught at universities: what are the basic principles of being a scientist, what is your job when you're trying to be a scientist? But that's a different question.

I think that's a good point to stop at. I'd like to thank the speakers for a very interesting discussion. We'll break now for 55 minutes, reconvening in parallel sessions at 1:30. Thank you very much, everybody.