And it's my really great pleasure to introduce Sharon Crook. She's at Arizona State University, in both the School of Mathematical and Statistical Sciences and the School of Life Sciences. She has made absolutely fantastic contributions to the field in terms of modeling, both at the cell level and the network level, with all sorts of mathematical and statistical models. She's also one of the lead developers of NeuroML, which is a standard for describing such models. So it is really a great pleasure to have you, Sharon, and I'm very much looking forward to your talk. Thanks so much.

Okay. Well, thank you also to the program committee and the organizers. I'm very happy to be here. It's such a great meeting, so many great topics, and I'm really excited to hear from everyone. So I'm going to talk about reproducibility, something that many of you in this audience work on, so it's not new to you. But I'll be talking about reproducibility in the context of computational models. We all know that reproducibility is really the cornerstone of science. It's really important, and it's something that many of us work on through resources, software packages, all kinds of things. With computational models, just as with data, in order to reproduce models we need those models to be accessible. We might want to use different kinds of simulation packages to run them, so we want them to be interoperable; we want them to be transparent; and we want to be able to access all aspects of the models. And obviously, if we can reproduce a model, then it's available for reuse. These should be things that sound familiar to this audience.

So really what we want are models that are FAIR. Just like we talk about FAIR data, we want FAIR models. If you go to the INCF website and look at what they say about FAIR, about these aspects of sharing data, and, I would argue, also sharing models, you'll see that standards play a key role. If we want interoperability, if we want to be able to access all aspects of these models, then we need standards for how they're shared. So I'm going to talk about that today, and about the different resources that are out there, this whole ecosystem within computational neuroscience that supports reproducibility.

On top of these aspects of models being FAIR, the title of my talk also includes the word rigor. In modeling, when we talk about rigor, what we mean, or at least what I mean, is that we want to be really transparent about what data we're using to constrain and optimize models, and what the features of those data are. If they're higher-level models, what are the features that are important in creating them? Similarly, after we create a model, we want to validate it with different data. What are those data? What kinds of data are we using to validate models, and what aspects of the models are important? I won't get into this a lot today because there isn't much time, but I will touch a little on this idea of testing and evaluating models, and how important it is, especially when we're talking about reuse.
You know, there are now thousands of models out there, and if we make more and more of them available, then we want people to be able to evaluate them well and decide, based on their own criteria, whatever it is they want to do with a model, how to evaluate it and whether they want to reuse parts of existing models. So these are some of the things I will touch on.

Okay, so when we talk about standards for data-driven models at multiple scales: I've been working for quite some time on a resource for describing models. What do we mean by describing models? You might just think, well, we just need to share the equations. Of course, that has limited use when we're talking about very complex models that are modular and large scale, like the kinds of models we heard about this morning when Yvonne gave that great talk. So we need to do more than just share the equations. We want to share what the variables represent, what the parameters represent, what the units of the parameters are, and what biological entities are being represented in our models. And this is what NeuroML is designed to do.

I'm not going to go into a lot of the technical aspects of NeuroML. There are plenty of resources out there: you can read the papers, you can look at the website, you can contact me and find out more. But I want you to understand a little about it so that when I talk about resources, you'll see why we're relying on NeuroML. So I'll just give a simple example. Suppose we have a relatively simple model that can be represented with a couple of equations and a handful, maybe two handfuls, of parameters. For this particular model, first introduced by Brette and Gerstner, if you choose a particular point in parameter space, a particular instantiation of the model, you can get different kinds of behavior. So the model is nice for representing a lot of the different kinds of spiking activity that you see in neurons, as shown in this figure.

In NeuroML, we have this concept of the adaptive exponential integrate-and-fire model. To represent a particular instantiation of this model, all you need is the list of parameter values, the numerical values for these parameters, and then you have what you need for the model. But of course, this is just a high-level concept. Underneath that, the components used in NeuroML are defined in a supporting language called LEMS. LEMS is very flexible: it allows us to specify the equations underneath these concepts, and we can add information about units. So, for example, if you have a simulation package that works with LEMS and NeuroML models, it knows what the units are and can do unit checking. All of these sorts of things are added value from using a standard like NeuroML, because you get more out of it. The main takeaway from this slide is that we have these high-level concepts around models, formalisms like Hodgkin-Huxley conductances that are very commonly used in neuroscience, built on top of a description language that is machine-readable and gives us all the information about the equations, plus a lot of other information as well.
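To make that concrete, here is a minimal sketch of declaring such a component with the libNeuroML Python library, following its documented usage; exact class and attribute names may differ slightly between versions. For brevity it uses a simple leaky integrate-and-fire cell rather than the adaptive exponential one, but an adaptive exponential cell is declared the same way through the schema's corresponding element: a named concept plus a list of parameter values with explicit units.

```python
# A minimal sketch, assuming the libNeuroML Python API roughly as documented
# (pip install libNeuroML); class and attribute names may differ slightly.
from neuroml import NeuroMLDocument, IafCell
import neuroml.writers as writers

nml_doc = NeuroMLDocument(id="ExampleCells")

# The "instantiation" of the high-level concept is just the parameter values,
# each carrying explicit physical units that tools can check automatically.
iaf = IafCell(id="iaf_example",
              C="1.0 nF",
              thresh="-50 mV",
              reset="-65 mV",
              leak_conductance="10 nS",
              leak_reversal="-65 mV")
nml_doc.iaf_cells.append(iaf)

# Write a machine-readable NeuroML file that any supporting tool can consume.
writers.NeuroMLWriter.write(nml_doc, "example_cells.cell.nml")
```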
And one of the things I want to point out is that NeuroML is quite modular. If I have, for instance, a network model, again going back to the plenary talk this morning, a network model with a large number of cells, then I can think about specifying the models for each cell and then putting those together to create the network. So with NeuroML, you can grab just the model for a cell, or even just the model for a particular channel in that cell, quite easily, without having to go through a bunch of code to find where that channel is specified. It's meant to be simulator-independent and modular, making it easy to access and reuse models and components of models.

All right, so there are quite a few tools that have some support for import and export of NeuroML or parts of NeuroML. You can find out more about them on our website; only a few are mentioned here. Of course, one of the most important things is that if the standard is going to be useful at all, we need to have models in NeuroML so that we can share them, evaluate them, and reuse them going forward. There are several databases, like, for instance, NeuroMorpho. Many of you might be familiar with that morphology database from Giorgio Ascoli's lab. You can get the cells there in the format NeuroML uses to describe morphologies. That's just one example.

Another place where you can get NeuroML models is NeuroML-DB. This is a database specifically for models in NeuroML format, and I think it will become clear why that is useful in a minute, when I show you more about the database. It's complementary to other resources that I'll mention in a minute, which should also make NeuroML models available. One thing to keep in mind is that the models shared through NeuroML-DB are snapshots: maybe a model that's already been published, that's at a stable configuration, a model that somebody thinks is useful to share with the community. If you go to the website and you want to search for a model, you can search by keyword, you can put an author name into the bar, basically anything you can think of, because all of these models have a lot of semantic information associated with them. If it's a published model, for instance, we have all the information about the publication, including the PubMed ID. All of that is associated with the model, in the NeuroML file, so that when you get the model you have all that metadata as well. I also wanted to mention that we use ontological information in one of our search routines. For instance, for a lot of these models, we have NeuroLex IDs associated with them that give information like what cell type is being modeled or what brain region it comes from. We can use this information to help you find models you might be interested in. In the example I show here, specifically searching for a granule cell, the granule cell comes up because that's a keyword in the name of a model. But then, because we know that granule cells are in the olfactory bulb, the ontological part of the search engine will show other cells from the olfactory bulb. So that's the idea.
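Before going into what a model page looks like, the modularity point above can be illustrated by how a network description simply references cell components by id and pulls in whole component files with an include. This is a hedged sketch that continues the previous one and again assumes libNeuroML roughly as documented; names may differ slightly.

```python
# A hedged sketch of NeuroML's modularity: a network document includes a
# separate cell file and reuses the cell component by id, so the cell (or a
# channel) can be swapped or reused without touching simulator code.
from neuroml import NeuroMLDocument, Network, Population, IncludeType
import neuroml.writers as writers

net_doc = NeuroMLDocument(id="ExampleNetwork")

# Reuse the cell defined in the file written in the previous sketch.
net_doc.includes.append(IncludeType(href="example_cells.cell.nml"))

net = Network(id="net0")
net_doc.networks.append(net)

# A population of 100 instances of the included cell component.
pop = Population(id="pop0", component="iaf_example", size=100)
net.populations.append(pop)

writers.NeuroMLWriter.write(net_doc, "example_network.net.nml")
```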
So if I click on one of these cell types, just to give you an idea of what we can do: because all of these models are described in NeuroML, we have access to all the metadata associated with the model. Like I said, we link out to PubMed. We link out to other resources where you can find the same model. For instance, this model comes from Open Source Brain, so there's a link to go directly to the project at Open Source Brain, where you can interact with the model, for instance run it, as I'll show you in a minute. It also links to the model in ModelDB. Some of you might be familiar with ModelDB, which is a large database of models, but those models can be in any format whatsoever: a lot of them are in the NEURON simulator's coding language, some are in MATLAB, some are in Python. For the models that are in ModelDB, we link back to ModelDB. We also link directly to the GitHub repository underneath Open Source Brain, where the models are directly available. So we're able to do things like that.

You can, of course, download the files in NeuroML format to use them however you want. You can also download code that has been automatically generated to run the model in the NEURON simulator, for example, and we're open to adding other simulators to this list. We have libraries associated with NeuroML that generate some of this code, and that's a very useful thing to have.

Another thing we're doing at the database is trying to make information available that helps people evaluate these models. For instance, for all the cell models in the database, we're running the same protocols that are run at the Allen Cell Types Database, the ones they run on the cells whose data they share there. We run those on the models, and you can go and see how the model responds to those different protocols in an interactive view. Again, this is pre-run, a whole bunch of simulations, and we're just sharing the information. But the point is that even as the database grows, it's easy for us to do this. We can automatically run all these simulations because the models are in NeuroML; we don't have to worry about whether a given model will run on a given simulator and all that kind of thing. Similarly, if you're looking at a cell model with a complex morphology, we run it through L-Measure, a package out of Ascoli's lab, and share all the aspects of the morphology. We're also running these models in a way that lets us compare run times and share that with you, in case there's some issue with a model. Let's say a model needs a really small step size and that matters to you because you're looking for models that aren't too computationally intensive. You can compare models, you can look at a model's behavior, and you can also learn something about how long it takes to run. All of this because of NeuroML.

I also wanted to mention that these same models are searchable at NIF, the Neuroscience Information Framework. If you go to NIF, you can search, find the models, and it will point you towards NeuroML-DB. And for the models that are in both NeuroML-DB and ModelDB, you can access them from ModelDB: if you search for models at ModelDB and they're in our database, there's a little button you can click at the bottom to get a menu that will take you to NeuroML-DB. I just wanted to mention that.
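As a hedged illustration of what you can do once you have downloaded such a file, here is a rough sketch using the pyNeuroML helper library. The file names are hypothetical, and the exact function names and signatures may differ between versions.

```python
# A hedged sketch of working with a NeuroML file downloaded from NeuroML-DB,
# assuming pyNeuroML (pip install pyneuroml) roughly as documented.
from pyneuroml import pynml

# Check that the downloaded file is valid NeuroML 2 before doing anything else.
pynml.validate_neuroml2("granule_cell.cell.nml")  # hypothetical file name

# Load it to inspect its components and metadata programmatically.
doc = pynml.read_neuroml2_file("granule_cell.cell.nml")
print([cell.id for cell in doc.cells])  # morphologically detailed cells, if any

# Given a LEMS simulation file that includes the model, run it with the
# jNeuroML reference simulator, or via automatically generated NEURON code.
results = pynml.run_lems_with_jneuroml("LEMS_GranuleCell.xml",
                                       nogui=True, load_saved_data=True)
# results = pynml.run_lems_with_jneuroml_neuron("LEMS_GranuleCell.xml",
#                                               nogui=True, load_saved_data=True)
```

In practice the NeuroML-DB download already gives you the generated NEURON code directly, so a conversion step like this is only needed if you want to regenerate it yourself.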
So that's another way to find models that are available in NeuroML format. I've mentioned this resource, Open Source Brain, several times, and I want to say a little more about it. Unlike the NeuroML database, where we have stable snapshots of models to share, Open Source Brain is meant to be a collaborative platform. You can think of it as a place to share your models, but it's really a place where you can share models in a way that lets you work on them with other people, and where you can simulate models. Open Source Brain is built on top of GitHub. If you want to develop models on Open Source Brain, you create a project; it lives on GitHub; you can have a wiki with information about the project; you can create issues just like with anything else on GitHub; and multiple people can access the project if that's what you want.

There are many NeuroML models at Open Source Brain. Open Source Brain doesn't require that the models shared or worked on there are in NeuroML; you can use whatever simulator and whatever kind of code you want. But the advantage of making your models there available in NeuroML format is that there are extra tools at Open Source Brain that you can use if the models are in NeuroML. I should say that all of the models on this slide are also available at NeuroML-DB, and there are many more models at Open Source Brain than the ones listed here. The point of this slide is to make it clear that there are models ranging from relatively simple single-compartment models, up to single-cell models that are quite complex with very detailed dendritic structures, on up to networks of relatively simple cells and networks of very complicated cells. For instance, the cell models from the Blue Brain Project that were released in 2015 are available in a project here; you can also get them at NeuroML-DB. You can look at all of those cell models, how they behave, their morphologies, anything like that. The models mentioned in the talk this morning are also available. I urge you to have a look at these resources.

At Open Source Brain, if your model is in NeuroML, there are tools for actually simulating, visualizing, and analyzing the models, all available online through the browser. For instance, this is a little screenshot of a network model showing a visualization of the network, a visualization of voltage responses in some of the cells when the network is run, and, up here, the connectivity matrix for the network. There are many more visualization tools there than you can see in this one image. One other thing I wanted to mention: this is a little screenshot of one of the panels that comes up when you want to run a model there through the browser. You can run smaller models on Open Source Brain itself, basically on Amazon cloud resources. But there's also an option, for larger models, to send them to a queue and run them at the Neuroscience Gateway. For those of you who may not know, the Neuroscience Gateway is a supercomputing platform that's available to the community for free, and you can access it directly from Open Source Brain without having to worry about how to configure all of your simulations on the Gateway itself.
Okay, so there are lots of things that I haven't told you about Open Source Brain, lots of other great aspects, tools you can use there, analysis tools, and things like that. I just want to point you to this publication on bioRxiv that makes all of that information available, including video tutorials about how to use it and what you can do with Open Source Brain. There are also some tutorials that might be helpful in education, for any of you who are teaching classes and think Open Source Brain could fit your training activities. So I urge you to have a look. Clearly, across the resources I've mentioned so far, what we're seeing as these become available is that it's easier to reproduce models. The models at these resources are fully accessible, with the metadata needed to evaluate them properly and find out more about what each model is meant to represent. And the use of standards is what makes this easy. Well, maybe not easy, but possible.

Okay, so I want to spend just a few minutes talking about this idea of model testing. Another project I work on, which is very closely related to what I've talked about so far, is a project called NeuronUnit. The lead on NeuronUnit is a collaborator of mine at Arizona State University, Richard Gerkin. And there's a team of people working on it, of course, like everything I've mentioned; I'll say more about that at the end. But I just want to make it clear that, like Open Source Brain, this is not my project: NeuronUnit is really Rick's project. NeuronUnit is built on top of a completely domain-agnostic resource called SciUnit, which is meant to be much broader than just neuroscience; NeuronUnit is the neuroscience-oriented part of SciUnit. The idea is to borrow the idea of unit testing from software development. When we're creating models, we really should be thinking about what tests we require of the models. We should make those tests transparent and available to the community. If we're going to publish a model, those tests should be quite clear: what was this model designed to do, and what does it actually do? Those are the tests for, perhaps, model optimization and model validation that would be important to share in order to have more rigorous development of models.

So if you look at the screen here, the idea is that you could have just one model that you're trying to evaluate, or you could have many models that are meant to model roughly the same thing, and maybe you want to compare them. And you would test them across what you might think of as experimental protocols: the kinds of experiments you would do when you're doing electrophysiology, when you're recording from a cell, or when you're getting information across an entire network. Those are the experiments you want to run on your model as well. So this is a way of making that formal and rigorous. And then the idea is that when you choose models and you choose tests, maybe from a repository of tests that we have, or tests you create yourself, you can share the information about how the models do on those tests, either numerically or with the visualization tools we have for doing that.
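To give a feel for the pattern, here is a minimal sketch of the SciUnit idea that NeuronUnit builds on. The capability, test, observation values, and model names below are hypothetical illustrations, not NeuronUnit's own classes, and exact interfaces may differ by version.

```python
# A minimal sketch of the unit-testing pattern behind SciUnit/NeuronUnit.
# Everything named here is illustrative; NeuronUnit ships its own
# electrophysiology capabilities and tests.
import sciunit
from sciunit.scores import ZScore

class ProducesRestingPotential(sciunit.Capability):
    """Capability: the model can report a resting membrane potential in mV."""
    def get_resting_potential(self):
        raise NotImplementedError("Models must implement this method.")

class RestingPotentialTest(sciunit.Test):
    """Compare a model's resting potential against experimental data."""
    required_capabilities = (ProducesRestingPotential,)
    score_type = ZScore

    def generate_prediction(self, model):
        # Run the "experimental protocol" on the model.
        return {"mean": model.get_resting_potential()}

    def compute_score(self, observation, prediction):
        # How far is the model from the data, in standard deviations?
        z = (prediction["mean"] - observation["mean"]) / observation["std"]
        return ZScore(z)

# Observation taken from (hypothetical) experimental data, in mV.
test = RestingPotentialTest(observation={"mean": -68.0, "std": 2.5},
                            name="Resting potential")

# Judging one model, or a suite of tests against many models, produces the
# kind of score matrix shown on the next slide:
# score = test.judge(my_model)
# suite = sciunit.TestSuite([test, other_test])
# matrix = suite.judge([model_a, model_b, model_c])
```

NeuronUnit provides ready-made capabilities and electrophysiology tests along these lines, so in practice you mostly choose tests from the repository and supply observations rather than writing them from scratch.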
Here I'm just showing a matrix of quantitative scores that's meant to represent how well each model does on each test. As a very simple example, for a cell model, when you're developing the model you have certain things in mind that you want it to do: a certain spike width, spike height, maybe certain membrane properties, or maybe something more complicated than that, like statistical properties at certain frequencies of stimulation. So you could create tests that check these on your model and compare them to experimental data.

As a quick example of that, NeuronUnit is being used in the OpenWorm project, which some of you may have heard of. OpenWorm is a really cool project that is community-driven and all open source. Anybody can contribute, and I'm sure some of the people here in the audience who are involved with OpenWorm would be happy to talk about it if you have questions; I can direct you to them. Also not my project. The idea is to create a model that incorporates all aspects of the worm, C. elegans: all of the cells in its nervous system, its musculature, how it moves in the environment. It's a truly multi-scale model that's being put together, and unit tests from NeuronUnit are being used to drive the development of the model. For instance, tests are being used to develop models of channels and models of cells, the 302 neuron models needed for the C. elegans network, and tests are also being used to evaluate the motion model based on videos of moving worms. I should say quickly that this is still under development; these models aren't complete by any means. But the idea is that the tests can make the development of the model more transparent and more rigorous. And these tests can be shared on a portal we have called SciDash. It's like a dashboard where you indicate what your models are, what the tests are, and what the results of the tests are, and anybody can use it. Again, I'm happy to answer more questions about that if you want to contact me.

So I hope I've managed to convince you that there are a lot of really cool tools out there that support reproducibility in the computational neuroscience community. There are many, many more tools that I didn't mention at all, and maybe we'll hear about some of them in the other sessions. If this is something that interests you, or maybe you want to share some of these ideas in your training, I want to point you to this URL. Padraig Gleeson and, I think, Andrew Davison compiled this information. If you go here, you'll find a repository that has slides with introductions to these resources, so you can go to one place and quickly find out about them. There are also some exercises there for linking some of these resources, to help people understand what kinds of tool chains are out there for using them. And as I said, many, many people were involved in the resources I mentioned today, specifically my long-term collaborators on NeuroML, Angus Silver and Padraig Gleeson, who are here.
And then many, many other people have contributed over the years to NeuroML by coming to workshops and interacting with us. Some of our current editors on the board of editors are here as well, and I want to thank them for continuing to push NeuroML forward. And then, of course, there are the groups of people working on NeuroML-DB and NeuronUnit. Thank you very much.

Great. Thank you. And there are so many parallels with the other communities represented here; the things that are common across those communities are really interesting to see and talk about today. We have time for maybe one or two quick questions. Are there any burning questions? No? Actually, I do have one. With those projects, you do have adoption and sustainability challenges. Can you tell us a little more about adoption and sustainability, and how you get people to use these tools?

Well, you know, I think it takes a little bit of momentum. I don't know how this magically happens, but I think if you just show people enough times how useful this can be to them, then some people will start to adopt it, and then it seems to spread that way. But certainly a key thing is having good documentation and good tutorials and things of that nature. I think that's really helpful.

Is there any link with journals or publications recommending that models be shared this way? Well, not yet, but I hear that maybe this is something that will happen soon.

Great. Any other questions? No? If there are no burning questions: Sharon, you'll be here the next two days? Yes, today and tomorrow. And feel free to email me. Right. Thank you again so much. That was great. Thank you. Great.