OK, I'm just going to make this full screen and make it a little bit bigger. Can everybody hear me? OK, awesome. I'm really, really excited to be here. It's my first time at BIDS, and it feels awesome to be part of this workshop. There's an incredible group of people here, people I've really admired from the community. It was an awesome morning, I'm really excited about the rest of the meeting, and I'm really excited to talk to you now about the stuff that we do. And the "we" here is really my group. I run a group at Janelia Research Campus, which is a really unique non-profit research institute. It's sort of in the middle of nowhere in Virginia, like a big spaceship that just crash-landed in the middle of the forest. I should say at the beginning that all these slides are on a GitHub repo that you can get later, and there are also some links in there as well.

My group really wants to understand how the brain works. That's a big, hard problem, and we're trying to attack it with everything we can. The effort is incredibly interdisciplinary. This is a graph of all the people involved in some portion of what we do, and there are a lot of people who aren't even on here. The people in the middle are my group: a combination of scientists, neuroscientists, and also engineers, developers, and designers doing a variety of technology development and computational work. But there are also people developing crazy microscopes, people developing new lasers and new ways of making measurements from the brain. It's a really, really exciting ecosystem, and it's been really exciting to be a part of it.

In this talk, I'm going to spend the first half telling you about the cool stuff we can do in neuroscience right now. It'll be a little bit superficial, but I'll try to give you a sense of the kinds of measurements we can make, the kinds of questions we're asking, and how we're asking them. In the second part, I'm going to talk about some of the computational technologies we're developing, which are really focused on data analysis and data visualization, and also sharing and reproducibility. Those are things this community is awesome at developing, thinking about, and advancing, so I'm really excited to talk to you.

We're going to jump right in, and I'm going to describe work we do in one of a couple of model systems. In trying to understand the brain, we don't do so much in humans, though I used to. There are a lot of reasons that model systems, meaning animals, are advantageous for answering neuroscientific questions. One animal we do a lot of work in is the larval zebrafish, which is probably not something a lot of people here are familiar with or have seen, though maybe you have. What's really cool about these animals is that in the larval form they're transparent, which means we can use special kinds of microscopes, in particular something called light-sheet imaging, to basically see through the animal, record neural activity from almost the entire brain at reasonably fast time resolution, and generate movies that look like this. What we're watching here, in the flashing bright colors, is activity across the brain of this animal.
We're able to see neural activity with light because these animals are genetically engineered so that their neurons express proteins whose fluorescence changes when the neurons are active, which we can see under the microscope. What we're seeing here is the animal: this is the front, this is the middle, and this is the back. It's a little weird to look at, almost alien-looking. This is a visual stimulus being presented to the animal; we're interested in how these creatures behave, how they respond to stimuli. The animal is actually paralyzed, but we detect its intention to move by recording electrically from its tail. It's all a little bit ghoulish, more or less, but this here is an indication of the animal's behavior: the size of that circle indicates how strongly the fish is trying to swim. So what you'll see just watching the movie is that every time the stimulus comes on, it tries to swim, and then there's this complex, dynamic pattern of neural activity across the brain.

On the analytic side, what we're really trying to do is make sense of these volumes: trying to understand how populations of neurons, networks of neurons, encode information and allow this animal to see what it's looking at and then make decisions, or at least respond, based on that. The kinds of analyses we do generate maps like this one. What we're seeing here is a decomposition of the kind of movie you just saw, where we try to identify populations of neurons, groups or regions of the brain, indicated by different colors in the map on the left, that then have corresponding associated time courses, shown on the right. This is one of many kinds of exploratory analyses we do to really just try to understand and recover structure in these data.
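To make that concrete, here's a minimal, purely illustrative sketch of one such decomposition: factorizing a pixels-by-time movie into spatial components and associated time courses with scikit-learn's NMF. The random data and the component count are stand-ins, and this is not our actual pipeline.

```python
# Illustrative only: decompose a (pixels x time) calcium-imaging movie into
# spatial maps (which could be colored as in the figure) and time courses.
import numpy as np
from sklearn.decomposition import NMF

rng = np.random.default_rng(0)
movie = rng.random((5000, 600))        # stand-in: 5000 pixels x 600 timepoints

model = NMF(n_components=10, init='random', random_state=0, max_iter=300)
spatial = model.fit_transform(movie)   # (pixels x components): the spatial maps
temporal = model.components_           # (components x time): the time courses
```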
So this is an amazing system because of the ability to measure everything, which has really changed the way we think about a lot of problems. But the complexity of the behavior these animals are capable of is sort of limited. They swim in response to stimuli, but that's about as much as the larval zebrafish cares about in the world. We also do a lot of work in the mouse. The mouse is a fantastic system for answering questions that really start to get at the kinds of things that we do, the kinds of intelligence that we possess as humans. The mouse also happens to be a system where we have incredible abilities to record neural activity, manipulate it, and monitor it in a large variety of ways. One of those technologies is two-photon imaging. This does not let you record the activity of the entire brain; unfortunately, the mouse brain is not transparent. There are people thinking about how to do that, but it's pretty hard. What we are able to do is record from small fractions of the brain simultaneously. We also spend a lot of time thinking about mouse whiskers, which is not particularly intuitive maybe, but it is a really important sensory modality for these animals, and we can also measure electrical activity. So we do a lot of work on pretty simple behaviors in these animals. For example, there are a lot of interesting questions to ask just about how a really simple stimulus is encoded by the brain.

So we do things like take a mouse where all of her whiskers except one have been removed, and then we can flick that whisker, or have the mouse flick that whisker against a pole, and ask how different sensory variables carried by that whisker are represented by the brain. For example, we might have a couple of neurons like these two here, where we're looking at fluorescence signals using a technology like the one I showed you in the fish, and for each one we're seeing neurons that are related to different aspects of what happens when the whisker touches the pole. This is the nitty-gritty of sensory encoding, but it's also where everything starts. For us, it's probably more about vision; for them, a lot of it is about their whiskers. We can now do this over pretty large regions of the brain. This is a map where every circle represents a single neuron, and the size of the circle reflects how strongly that neuron encodes one of the two variables I just showed, two properties of the whisker, and we're seeing that in a pretty comprehensive way across about 10,000 neurons. To put that in perspective: even five years ago, being able to record activity from even 100 neurons in the mouse was a really significant achievement. So we've advanced a lot, and we're advancing really quickly. On the other hand, the mouse brain has about 80 million neurons total, so we're still looking at a really small fraction. But it's a lot bigger than what we were looking at before, and the trend in the degree to which we can make comprehensive measurements is definitely on the up.

That's really about increasing the amount of the brain we can look at, but we're also interested in more complex behaviors. Ultimately, these animals do a lot more than detect the presence of a pole with a whisker, and we do a lot more than that too. One whole class of behaviors we're particularly interested in involves spatial representation and navigation. This, I think, gets really close to things we deeply care about and use. Think about this morning, when we were trying to make that diagram of where everybody should go: you were doing this crazy mental arithmetic of, okay, there's this physical space, and then there's this map, and this is there, and that's there. We're solving problems that are probably not so different from what the mouse has solved, and mice are really good navigators. I think there's very likely a lot of commonality in the neural architectures that implement solutions to these problems.

Here's an example I love showing of the importance of navigation, I mean the ability of these animals to navigate. This is from the dissertation of Stella Vincent in 1912, so a long time ago. Stella did these really cool experiments where she took rats; in this case, this was rat number six. Number six was put in this environment, and it was a little bit of an intense experiment for the rat, because this is not actually a maze, it's more like an elevated platform. So six is running around this elevated platform, and if she falls off, it's kind of a long fall. So she doesn't want to fall off.
And in particular, in this experiment, this rat had all of the whiskers on one side trimmed away, and what you can see by looking at the trace of the rat through this maze is that she's really closely tracking the wall on her right-hand side. This is, I think, a beautiful demonstration. We don't really need to analyze it; it's obvious that the animal is using the whiskers she has left to track this wall and ultimately navigate this environment and find her way to the food box. And there have been decades of studying rats and mice and all the cool things they can do spatially: they can learn shortcuts, they can maybe build representations of environments. We want to know how they're doing all of that.

This kind of behavior is really hard to study while making neural recordings, because when we use these microscopes, it's kind of hard to strap one to the head of an animal that's trying to freely move around, though people have found ways to maybe do that. Our approach instead is to find ways of simulating these kinds of environments while the animal is stationary. A phenomenally talented postdoc, Nick Sofroniew, working both with me and with Karel Svoboda's group, a very close collaborator, has developed an incredibly cool tactile virtual reality system. The way it works is that the mouse is on a ball, and she can run on it kind of like a treadmill. There are two physical walls on either side, and the walls move in and out in a way that's locked to her motion on the ball. So she's basically having a simulation of the experience of moving through a winding corridor, but she's totally stationary, which means we can monitor neural activity using the kind of imaging methods I was describing before.

More recently, Nick and I have been playing around with adding not only walls on either side but also a wall in front. So now you have three walls, which is really crazy to watch. They're all on motors, and they're all moving around the animal to simulate not only winding corridors but basically arbitrary mazes. This is super cool, because we can start looking at behaviors that are, yeah, approaching the complexity of problems that our brains, at least sometimes, are able to solve. Here's an example, where one of our mice, who we called Champion, was exploring one of these environments. To be clear, she's exploring a virtual environment, in the sense that she's not really running through a maze; we simulate one using the combination of all these walls on all these motors. This is showing the first time she had been in this particular configuration, and we're seeing the sequence of trials. She always starts here, and there's a reward location here. What's really cool is that if you look across the first few trials, she's sort of finding her way around. She goes down a dead end the first couple of times, gets kind of stuck in the dead end for a fair bit, finds her way to the reward, and then after a couple of trials she suddenly just starts nailing it, and she gets to that reward location basically three times in a row.
Now, what happens after this is kind of interesting, and this is very anecdotal; we're just starting to do these experiments. But we think we have the opportunity to really watch the brain in action while animals solve pretty hard problems. How to do this kind of learning with so few trials is a really interesting problem, both in this domain and in machine learning applications. Of course, while we're doing this, we can record activity in the brain. So this is an example of a movie not unlike the one you saw before. On the left are flashing neurons, in a small portion of the brain; with this method we're able to look at somewhere between 500 and 1,000 neurons simultaneously. On the right is the result of, in this case, a pretty simple analysis, where we're basically trying to identify how individual neural responses are related to different positions of the walls relative to the animal. If you wanna start navigating, the first thing is to figure out where you are. Here we're seeing a map where color indicates tuning to different wall positions: blue is tuning to wall positions that are really close, and reds and oranges are wall positions that are farther away. There's almost a labeled-line representation of wall position that this animal can now use to understand where it is in its environment. This gives you a flavor of the kinds of analyses we do, where we're relating the statistical structure of something like the world to the statistical structure of the neural responses in this animal's brain.
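As a rough illustration of that kind of tuning analysis, and this is a toy sketch on random stand-in data, not our actual code, you can bin a variable like wall distance and ask, for each neuron, which bin drives the largest mean response:

```python
# Toy sketch: estimate each neuron's preferred wall position by binning
# wall distance and averaging the fluorescence response within each bin.
import numpy as np

rng = np.random.default_rng(0)
n_neurons, n_timepoints = 500, 2000
responses = rng.standard_normal((n_neurons, n_timepoints))  # stand-in traces
wall_distance = rng.uniform(0, 30, n_timepoints)            # stand-in behavior

edges = np.linspace(0, 30, 16)                 # 15 wall-distance bins
labels = np.digitize(wall_distance, edges[1:-1])

# mean response per bin, per neuron -> (n_neurons x 15) tuning curves
tuning = np.stack([responses[:, labels == b].mean(axis=1)
                   for b in range(len(edges) - 1)], axis=1)

# "preferred" distance = bin center with the largest mean response;
# coloring each neuron by this value gives a map like the one described
centers = (edges[:-1] + edges[1:]) / 2
preferred = centers[tuning.argmax(axis=1)]
```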
Another tool we have at our disposal, which I think is gonna be increasingly important for asking questions that go beyond correlations, is the ability to perturb. What we're seeing here is an image, again, of neural activity, except now what we've done is target a subset of neurons that we're gonna go in and kill. Because we can do some of these analyses on the fly during the experiments, we can now ask questions like: if I take this animal and characterize what all these neurons are doing, and now I knock out a subset of neurons that had a particular property, what effect does that have on the rest of the network, and what effect does that have on the animal's behavior? Does it have any effect at all? These are the kinds of questions we're just starting to answer by combining the analysis of the data with the ability to do these online, real-time perturbations. In this case it's kind of a brutal one, because once we kill a neuron, it basically goes away. There are a variety of technologies being developed to more selectively or subtly either stimulate or inactivate neurons in a temporally precise way. Doing that in a way that targets specific populations is really complicated, but people are developing really cool ways of doing it, so that will surely become an option moving forward.

Alongside this, we also do modeling. It's really important to be able to ask, when we do these manipulations, when we show that killing a subset of neurons has this effect on the animal or this effect on the network: can we understand where those effects are coming from? So we do a lot of work that's broadly neural-network modeling, where we build models that have units, or neurons, with different properties, and then do little experiments on them, like knocking out a subset and asking what effect that has on the rest of the network. We can't really look at the behavior of the model network in the sense that we can look at the behavior of the animal, but we can start trying to forge links between the neural characterization we do and our ability to theoretically understand how these systems are organized and how they work.
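Here's a toy version of that kind of in-silico knockout experiment. It's purely illustrative, not one of our models: a small random recurrent network simulated before and after silencing a subset of units.

```python
# Toy knockout experiment on a random recurrent network (illustrative only).
import numpy as np

rng = np.random.default_rng(1)
n = 100
W = rng.standard_normal((n, n)) / np.sqrt(n)   # random recurrent weights

def simulate(W, steps=200, kill=None):
    """Run the network; optionally silence the units listed in `kill`."""
    if kill is not None:
        W = W.copy()
        W[kill, :] = 0                         # killed units receive no input
        W[:, kill] = 0                         # ...and drive no one
    x = rng.standard_normal(n)
    traj = []
    for _ in range(steps):
        x = np.tanh(W @ x)
        if kill is not None:
            x[kill] = 0
        traj.append(x.copy())
    return np.array(traj)

baseline = simulate(W)
lesioned = simulate(W, kill=np.arange(10))     # knock out the first 10 units

# ask what the perturbation did to activity in the rest of the network
rest = np.arange(10, n)
print(np.abs(baseline[:, rest]).mean(), np.abs(lesioned[:, rest]).mean())
```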
So that hopefully gives you a sense of the really incredible things that neuroscience has developed. A lot of this stuff is really recent; we have technologies now that we didn't have five years ago, and definitely didn't have ten years ago. It's been incredibly exciting to be involved in these efforts. I think right now, given what we can do to measure and manipulate the brain, a lot of the bottlenecks we're experiencing are really about the data, and it's every aspect of the data. It's how we analyze the data. It's how we visualize the data. It's how we manage and version, or generally don't version, the data. And it's how we share it: not just how we share our data, but also how we share the computations we're doing. I know these are things near and dear to the hearts of everyone here. A lot of what my group works on is basically trying to take the incredible technologies being developed to solve these problems and find ways to connect them together, or connect them to people, to solve the kinds of problems we face on a daily basis. So we're very much driven by solving the practical problems we face in trying to answer scientific questions. On the other hand, we really make an effort to build things that are hopefully usable by other people, or that at least integrate with the open ecosystems being developed. And everything we do as part of this effort is open source. What I want to do in the last part of the talk is take you through a couple of examples of the kinds of things we're developing. I think some are relevant to topics that have come up or will come up in the workshop, and others are relevant to things I've talked to a bunch of people here about. For each project, I'll tell you what problem we were trying to solve, how it solved that problem, or hopefully solved it, and where we think we can go in the future. Before I do that, are there any questions about the brain stuff? Yeah. Oh, how do we do reward? So basically there's a little lick port in front of her, and we can control delivery of water or juice, or things like quinine; they don't like quinine. We haven't done a ton with that, but bitter stuff they don't like. Because we know where she is in this virtual environment, we can say that only when she enters a sort of reward box does she get that little spurt of sugar water, for example. Other questions? Okay, cool. We'll proceed with part two.

So the first project, actually the first two projects, that I'll talk about really started from the problem of how to work with and analyze the kinds of data I was just showing you. These experiments generate really large volumes of spatiotemporal data: usually images that vary over time, or large collections of time series. The size is anywhere from gigabytes to terabytes, depending on the system, the scale, and the kind of experiment that was done. And perhaps even more than the size, the issue is that these recordings are just happening all the time. We do an hour-long recording every day, more or less, and multiple labs are doing that every day, so it really builds up over time. What we wanted was a way to do a variety of different kinds of analytics on these data. And at least in the regime we're in, getting an answer really quickly is really important: if we wanna make some characterization or some map of the brain and then go in and immediately change the experiment, we need to get that analysis, the entire process, down to a few minutes, because otherwise it affects our ability to keep going with the experiment. So we were really pushing for speed, and for leveraging compute resources where we have them available. Pretty early on, we started playing around with Spark to solve this problem and found it really useful, in particular PySpark, the Python API to Spark. We originally built a library called Thunder, and now, basically due to a suggestion by Stephan Hoyer, a new library called Bolt. Thunder is really about distributed spatial and temporal analysis: about constructing pipelines that work in parallel either across space or across time, to implement the kinds of distributed array computations we just found ourselves needing to do for these data. And I think there are at least a few other use cases now, well outside our domain, that people have been playing with this stuff for. Bolt basically takes out all the chunks of that project that had to do with the general problem of representing n-dimensional arrays in a distributed fashion, or at least one way of solving that problem, and at the time it was targeting Spark. These two libraries are both open source, developed by us, people in my group, and people in other groups as well.
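To give a flavor of the parallel-across-time-series pattern, here's a minimal sketch in plain PySpark. It assumes a live SparkContext `sc`, and the data and statistic are stand-ins; Thunder wraps this kind of thing in an array-style API rather than raw RDD calls.

```python
# Minimal sketch of a parallel-across-time-series computation in PySpark.
# Assumes a running SparkContext `sc`; data and statistic are stand-ins.
import numpy as np

def stat(trace):
    # toy per-neuron statistic: peak of the z-scored trace
    z = (trace - trace.mean()) / (trace.std() + 1e-6)
    return float(z.max())

# stand-in data: (neuron id, time series) pairs
records = [(i, np.random.randn(1000)) for i in range(10000)]

rdd = sc.parallelize(records)               # distribute across the cluster
stats = rdd.mapValues(stat).collectAsMap()  # run `stat` on every series in parallel
```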
I wanna highlight that there's a lot of interesting commonality between these packages and also dask.array, which I think you'll hear about later, and xray, which you heard about this morning. These are things I'm really excited about, and I'm really interested in figuring out how they can be related to each other, or used together more synergistically. There are particular challenges in this space that I think are interesting. Rob Story gave a fantastic talk on this at PyData Seattle that I watched online, and one thing he pointed out that would be really useful is putting all these things on the same page: there are a lot of approaches now for various kinds of distributed ndarray-like things, and as Stefan pointed out, having unified interfaces would be really useful. But it's also about understanding exactly how these things perform in different domains. There are obviously a lot of constraints. I said that in our problems, it really matters that we can get an answer in a couple of minutes. But if you don't need an answer in a couple of minutes, and half an hour is okay, then the equation might change completely in terms of what you should use, or what you can use. We're excited about doing a variety of different kinds of benchmarking to understand how these tools relate to each other, and we'd be happy to talk about that with other people here. And here are links to these projects on GitHub.

The second project I want to talk about is something called Lightning. This was done entirely in collaboration with Matt Conlen, who's an absolutely incredible JavaScript and Node developer who works with us. Lightning started from a problem we found ourselves facing all the time in our work: we really wanted to leverage all the incredible web technologies for visualization that have been and are being developed. Things like D3, but not just D3; also Leaflet and three.js. We wanted a way to use those things easily from within the environments in which we do our computing, so Python, but maybe other languages as well, to really minimize any barrier between those two worlds, and also to make it easier to share visualizations. The idea behind Lightning is that it's a Node.js server that comes with a bunch of visualizations and can manage a bunch of custom visualizations. All visualizations are stored as npm modules, and the server provides API-based access to generate visualizations from external libraries, with clients written in lots of different languages. So I'll jump into a very quick demo; I recorded a movie in case the demo doesn't work, which has happened a couple of times, entirely due to Wi-Fi problems. All right, I'm gonna show at least one way of launching Lightning. It's a Node.js server, but we have a lot of different ways to deploy it. There's a public server that we make available for free, and all of this is open source and free, and there's also a standalone OS X app that we built, which is what I'm gonna open right here. It uses something called Electron, a really cool new library for building standalone cross-platform applications, and in particular the menubar component developed by Max Ogden, an awesome project that was super fun to use. So what I just did is launch our server; it's running up here in the menu bar, and I can open it, which takes me to the server here. What I'm gonna do is show you the visualizations. Lightning comes preloaded with a bunch of them, and the really cool, core thing is that each one of these visualizations is just a different npm module. If people are comfortable with front-end development, you can make an npm module; each one of these visualizations is one, and you can have tons of them in the community. That gives us a way of having an ecosystem of visualizations: on your Lightning server, you can import new ones just by going to "import from npm". And now that my server is running, I can interact with it directly from within Python. For example, inside this notebook, I can connect to it here, make some random data, and generate a visualization of my random data, and that's gonna show up here.
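Those notebook cells look roughly like this with the lightning-python client. The host and port here are assumptions; point it at whatever your own server reports.

```python
# Sketch of the demo's notebook cells, using the lightning-python client.
# The host URL is an assumption; use your own Lightning server's address.
import numpy as np
from lightning import Lightning

lgn = Lightning(host='http://localhost:3000')

x = np.random.randn(100)     # some random data
y = np.random.randn(100)
viz = lgn.scatter(x, y)      # the server renders an interactive scatter plot
viz                          # in Jupyter, this displays the plot inline
```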
All right, and this is an interactive plot. I can do things like brushing, and I can even access the attributes of the things I selected. The important thing to understand is that what I'm generating here is just the scatter plot that's down in that list. I can make any of these visualizations, and if somebody makes a new one, as long as they publish it to npm, anybody can grab it into their server and start using it, and they can use it from any language. So we have client libraries, and this is my movie just in case, already in Python, Scala, R, and JavaScript. I'm working on a Julia one because I thought it'd be fun to play around with Julia. A lot of the idea behind Lightning is that if we solve this basic problem once, of developing a cool interactive visualization, then it should be available to all of these different languages, as opposed to each language reinventing the way to do things. I don't think this applies to all aspects of visualization, for sure; we're really targeting those almost widget-like visualizations: I wanna render a Leaflet map, or I wanna do some kind of cool tabular thing. But I think it's hopefully a useful way to start thinking about building visualizations that can target multiple languages and be used across languages. I also think there are interesting discussions to be had about where and how we can adopt common standards. There are really cool projects like Vega that try to standardize the description of a visualization, and we're definitely interested in adopting and using that. But I hope Lightning starts to provide an ecosystem for developing and using these custom visualizations. Those are a couple of links to it, and I'll be happy to talk to everybody about it more.

The very last thing I wanna talk about is a very recent project, which we really just did in the last couple of months, called Binder. The problem we were trying to face here, and Travis really nicely alluded to this in his talk as well, is fundamentally a problem of reproducibility: in particular, how to take the computational analysis we do, which for us, like for a lot of people here, now happens in Jupyter notebooks, and make it really, really easy to take, say, a GitHub repo with a bunch of Jupyter notebooks and maybe some dependencies, and turn it into an executable environment that anybody can use. That was the problem we set out to solve, and the way we did it was by building a system that leverages both Docker and something called Kubernetes, a platform recently open-sourced by Google for deploying and managing Docker containers across a cloud. I'm gonna do this one as a demo again, and I wanna highlight an absolutely incredible person in my group who's built so much of this, Andrew Osheroff, and also Kyle Kelley, who's doing a lot of really exciting stuff in this space. We were very inspired by something Kyle and others in the Jupyter community did called tmpnb, and we've really built off that. I wanna quickly show you how it works, at least by way of example. All right, this is the main Binder website, and the way it works is that we just put in a GitHub repo, so I'll do binder-project/example-requirements, and then we specify a set of dependencies. We're trying really hard not to invent a new dependency system.
We're trying to use specifications that already exist. We know a bunch of people have GitHub repos where they already have a requirements.txt file, for example, so with Binder you can specify a requirements.txt file. You can also add conda environment files. You can also just give us a Dockerfile, but that's pretty intense, and a lot of people don't like to use Docker; I don't blame them. So I'll just show an example with requirements. You can also attach services: you can say, I wanna deploy this notebook and have people be able to run it, and in order to do that they need to have Postgres, so we can just add a Postgres service, or add something like Spark (not like a full cluster or anything). I should say we're doing this as a public service that we're making available, so there are limits to how much compute we can just give away, but we really believe it's important that this be a public service. So in this case I'm gonna "make it"; it's like make. And I'm running on the dev server. This is a feature Matt Rocklin really wanted, because he didn't wanna write Dockerfiles and didn't wanna set up Docker on his machine: it's actually building a Docker image for this repo right now, and we get to watch it happen. We have the little Domino's Pizza Tracker over here; we ordered Domino's just to see the UI while we were trying to put this together. Andrew had never ordered Domino's, I guess. I've done it a lot. So what it's doing is essentially building a Docker container for you. Under the hood, when you made that request to take your repo and build it, we took the things you specified, like the requirements.txt file, put them into a Dockerfile, built that Docker image, and then pre-deployed it across all of the nodes running in the Kubernetes cluster. So that just finished. And now I wanna show you: we give you this little badge that you can embed in your repo, and we already made one for this guy. So here is the repo, and here is the Binder badge. Now anyone who comes to your GitHub repo can click this button and be dropped into the environment that we just created. So open that up. We had some port issues on CalVisitor that we just fixed about an hour ago, so this should now be working. Here I am in an environment that has the requirements I specified; it would have had the conda environment if I had done it that way. And this is totally executable, so I can go in and work with it and generate figures and do whatever we want. This is something that we're really excited about putting together.
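So, roughly, a Binder-ready repo at the time looked something like the sketch below. The badge and launch URL format match what the demo generated, but treat the details as approximate.

```
example-requirements/
├── index.ipynb          # the notebooks you want people to run
└── requirements.txt     # pip-style dependencies, one per line, e.g.
                         #   numpy
                         #   scipy

# README badge (markdown), linking to the built environment:
[![Binder](http://mybinder.org/badge.svg)](http://mybinder.org/repo/binder-project/example-requirements)
```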
We've had really cool use cases already, a lot in the teaching domain; people wanna make repos available to all their students. Matt Rocklin has an awesome Binder repo that he made for a bunch of Dask examples. We then made it so the machines didn't have as many cores, which I think threw him off a little bit. But we're working on that; we wanna have at least, like, two to four cores. I think that's reasonable. And... well, that was sort of a mistake. I know, I know. We didn't intend them to all be available, but I noticed from your notebooks that that was the case. How many people on one machine? Yeah, we've been playing with that. We originally had really beefy nodes and let tons of people on them at once, but then we hit some Kubernetes resource-limit problems. So now we do at max about 20 or 30 per machine, and we have a pretty small number of machines. Running this right now, we're handling it totally fine, and we can have a few hundred at least at once. Do people share an environment? No. Every time someone clicks that link, they get their own private, isolated environment. So it's reasonably protected: if they do something crazy, more or less, that should just affect them and not affect anybody else. Can you update it? Yeah. Right now, you'd have to go back to the mybinder.org website, put your repo in, and rebuild it. There are, I think, some really exciting ideas coming for how to do that in a saner way. But right now it will be pretty fast, because at least some of the components are cached in the way that Docker tries to do. If you update a dependency or update your notebooks, right now it basically has to rebuild the container, and we've gotten that down to a couple of minutes. So yeah, that happened the first day: we posted it to the Jupyter mailing list just to get a sense of the community, and then somebody tweeted it, and then we had some trouble. I mean, I don't think so. Right now we have about 400 slots available; I don't know how many followers you have. Yeah, don't tweet it, Olivier. No, no, no, you can. The thing that definitely has a bigger consequence is when people tweet the direct link to the repo, or sorry, the link that actually launches the binder, because anybody who's on Twitter will click on it, more or less. We have a status page; we don't try to hide any of this. If you go to mybinder.org/status, you'll see there are 379 of the 400 containers available. If Olivier tweets it and then we have zero, that's fine; somebody coming to the site will see that we only have zero. I mean, yeah, you're gonna give Andrew a hernia, but I would say go for it. It'll just go down to zero, and everything should be fine. Oh, we'd have to refresh; we haven't connected a web socket yet. We should. That's a good idea, Jake. Interesting, interesting experiment. We're still doing fine. Yeah, the Blaze one is real. I love the Dask examples one. Yeah, some combination of those things and other things as well.

I mean, I was very, very superficial in talking about that, so, sorry, let me repeat the question: this was a question of why we started using, or are interested in, Spark for solving some of these computational problems with our data. For us, early on, it was really the combination of a couple of things. One is that we often were working with data sets that didn't fit in memory (they would fit on disk, but they couldn't be loaded all at once), and we were using various machine learning algorithms that involved multiple iterations over the data set. But that, I think, is only a small fraction of what has been valuable for us. The real thing is that because of PySpark, we can use Spark alongside the entire ecosystem of PyData tools, so we can use SciPy and scikit-learn within our Spark jobs. And the fact that it has lazy execution means we can construct these pretty complicated pipelines and have execution be reasonably performant without writing lots of intermediate things to disk, running through these entire execution graphs. So we do things like a bunch of processing that's parallel across images, then restructure the data and do something that's parallel across time series, then save out a result. We can express the entire thing in just a few lines of Python code, and the internals of each of those steps can include a lot of NumPy, SciPy, scikit-learn style stuff. For us, it was the ability to express that and then, if we had enough nodes in our cluster, get really, really fast performance.
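Here's a hedged sketch of that shape of pipeline: parallel over images, reshuffled to be parallel over time series, with nothing executing until the final action. Again this is plain PySpark with a live `sc` assumed; in practice Thunder hides this plumbing.

```python
# Sketch: parallel-across-images, then reshuffle to parallel-across-pixels.
# Assumes a SparkContext `sc`; data, filtering, and statistics are stand-ins.
import numpy as np

n_time, shape = 100, (64, 64)
frames = [(t, np.random.rand(*shape)) for t in range(n_time)]  # (time, image)

images = sc.parallelize(frames)

# stage 1: per-image processing, parallel across time points
smoothed = images.mapValues(lambda im: im - im.mean())

# stage 2: restructure into per-pixel time series
def to_pixels(item):
    t, im = item
    for idx, val in np.ndenumerate(im):
        yield (idx, (t, float(val)))

series = (smoothed.flatMap(to_pixels)
                  .groupByKey()
                  .mapValues(lambda tv: np.array([v for _, v in sorted(tv)])))

# stage 3: per-series statistic; everything above is lazy until this action
result = series.mapValues(lambda ts: float(ts.std())).collectAsMap()
```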
What kinds of speedups? Yeah, I mentioned the thing about benchmarking: exactly where the speedups happen is, I think, really interesting and not fully explored. For any kind of embarrassingly parallel thing, we've had not quite strong scaling but reasonably close; we just throw more cores at it, and it gets faster, up to a point. But there are a lot of things where it's not faster, and a lot of things where, even with a few nodes, it's much faster to do locally. And this, I think, is really interesting in the context of things like Dask; it's just in general really interesting. How do we think about these workflows, the workflows that happen in real science, where you're doing one kind of computation that needs to be parallelized, or can benefit from parallelization, but then in the same workflow you're doing something local, and then you're going back to parallel and back to local, jumping around, and you always want everything to be as fast as possible? It's a really interesting API-design problem and also an engineering problem: how to build systems that can handle moving around that way. I think Dask is doing really cool things in that space, and I think there's a lot left to figure out about how to make it work. Okay, cool. Totally; I think I have that right here. So if I select this, for example, I can now do viz.selected and get out the indices of all the points that I just selected. I could even color the points that I selected, but I won't bother with live coding.
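For completeness, that round trip looks like this. Here `viz` is the object returned by the scatter call earlier, and the brushing happens in the browser before you read the attribute back.

```python
# Pull the browser-side selection back into Python; `viz` is the scatter
# object from lgn.scatter(...) above. Brush some points in the plot first.
indices = viz.selected   # indices of the brushed points, as shown in the demo
print(indices)
```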
Yeah, we've definitely tried to; I mean, one part of this project has been standardizing what goes into a visualization. I said every visualization in Lightning is an npm module, and we have a Yeoman generator just to make one from scratch. We did come up with primitives for what it means to render a visualization, append to it, and update it. So these are live; they can be updated with streaming data, and the GIFs on that visualizations page were all streaming versions. And then, in this particular case, there's the concept of having user data that's the result of a selection and making that available on the server. One thing about this that people might be interested in: Lightning is built on this Node.js server, but we've recently figured out a way to make it also run headless. So you can just pip install lightning-python and then generate all of these visualizations. The only thing you can't do is what we just did, which is grabbing something back from the visualization. At least right now, that requires running a server, but that's sort of reasonable, I think. And the fact that we have this standalone OS X app means that if you're running locally, there's very little barrier; you don't even have to install Node. We also have the public server: here I put in localhost, but I could have put in public.lightning-viz.org and then used all the same functionality and features. Absolutely, yeah. I think that would be a really cool thing to talk about, and it definitely dovetails with some of the things that were coming up this morning; it'd be really cool to talk through.

Okay, I guess one or two more questions; I'm basically done with the slides. I wanted to highlight one last thing, which is CodeNeuro. If anyone here found this kind of stuff cool, we started this group a little while ago. It's kind of an interdisciplinary conference-y, meetup-y thing. We did our first one about a year ago in San Francisco, and we did one in April in New York. These meetups are very interdisciplinary; it's always been about 50-50 between neuroscientists and biologists on one side and people doing coding, visualization, data infrastructure, and data engineering on the other. And we're having one here in SF, November 20th to 21st. They're a combination of talks that people give (Max and Chris are gonna be there talking; that's our first talk lined up, and maybe I'll talk to some of you about coming to speak) and also coding projects. Last year we built a web-based benchmarking platform for doing certain kinds of neuroscience analyses. We'd love it if you wanted to come and talk about all this kind of stuff. And really, just thank you for listening; I'm really excited for the next couple of days.