We now still have the panel discussion, so all four speakers are up. If there are questions, please raise them; otherwise I do have a question to start with, maybe a bit heretical, regarding the initial point made by Carol: that we live in this diverse ecosystem of so many tools. Having now heard about so many different types of workflows, each composed of different aspects of different tools, on the one hand it's probably great and enriches our lives to have this diversity. But do you see a problem with these tools in that it becomes difficult to compare the outcomes of these workflows, because you cannot describe the results accurately enough to really make an inference?

It's a challenge. It depends on the level at which you want to describe your workflows. We've been attempting to harmonize through shared descriptions at least, and shared steps, so that you can see what the basic steps being executed are, but also to encapsulate through containers the actual tools that you were using, the versions, this kind of thing, and to annotate the workflows, and I'm talking about our workflows in Elixir now, with a shared ontology which we've been developing that describes inputs, outputs, types and so on. So you can at least get to the point of asking: within a particular domain, can we produce libraries of workflows, and can we set up comparisons? Also in Elixir, the tools platform has set up a workflow workbench in order to benchmark workflows.
So for particular areas, for particular domains, like high-throughput sequencing and assembly and this kind of thing, you can do some comparisons, because you have standard formats and standard protocols, you have exposed everything that you're doing and you're collecting all the provenance. But here we're talking about quite radically different things: here the workflows are actually encoded experimental ideas, they're not high-throughput pipelines, and that is different. So I think you have to separate: are your workflows well-understood, well-characterised approaches, or are they really the early dawnings of research ideas, where describing them is very difficult to correlate with somebody else's work because you're actually doing innovation in those workflows? That was a very long-winded answer to a very simple question, sorry.

Okay, well, my take on it is that we're dealing with a much smaller problem than Carol's vast ecosystem of workflows. But I think the magic word was said by Carol again, which is that if you have standards, things become far easier. So to the extent that a given model is well defined, say in a standard format like SBML, it's immediately accessible to everybody: no matter how you arrived at it, you can get to it. We have in fact worked with some of the FAIRDOMHub people to figure out how best to specify the experiments that we use in our workflow, the experiment codification.
As you say, it's at an early stage; there is no current standard for doing this which does everything. SED-ML does some things, other formats do other things, so one has to try out new things, and in due course, hopefully, when enough people are interested, a standard will emerge. But the format is available, the database data are available for anyone who wants to play with them, and at this stage of the game I think that's as far as one can go.

Could I make a follow-up point? Of course. The word "workflow" is seriously overloaded, can I just say that? We talked about data pipelines, which are computational pipelines; there's workflow in the sense of the whole design-build-test-learn life cycle; there are models and the whole process around them, as you were saying, SBML; and there are other related activities. My last talk was at COMBINE, which is the standard-setting organization for those. I think it would be useful within the community to really crisp up what people mean when they say "workflow", because it does actually matter. Earlier on we had this question: does terminology matter? Yes, it does. Good point.

I absolutely agree. Maybe I have one little thing to add, which can be at least part of the solution: when developing workflows or software tools, you can always keep in mind that there is already a range of stuff out there which you can use, and it's really helpful, as far as possible, to build new workflows and new software on existing tools. I think this alone can help with comparability across these tools.

Okay, I think I have my microphone here. Yes, we find that the biggest obstacle to data sharing outside of projects is not the lack of standards, although that is an important issue, but the lack of organization and structure within the project itself.
The biggest hurdle for people to share data is that they feel they have to spend an enormous amount of time cleaning it up for publishing. It's like the sausage-making analogy: you don't want to show how the sausage is made, and if you want to publish something you feel you have to clean up the code and restructure the data so that people can make sense of it, because you just have notes and stickies and things like that. So what we focus on is structuring the project while it's running, so neatly that data sharing becomes much, much simpler. Then you can hire a company like ours to convert it into any format: if your data are organized and follow a model that you designed, that you know and other people can understand, then the process of converting to other formats for sharing becomes, if not automated, then greatly simplified. That's what we see as the greatest obstacle, and that's what prevents people from sharing: just the cost, the effort required.

Yeah, I would say that the way to do this is to spin it around and not think about how do I make my things shareable and how do I make things reproducible, but how do I improve my project productivity? As soon as you begin to cast it in those terms, it becomes much easier to sell, but you also begin to respect the fact that there will be a cost, because it does take a cost: you can't just throw a PhD student at it with a notebook anymore, you're going to have to go through a discipline and use systems. That, I think, is part of the cultural socialization of the community: to understand that if you want long-term downstream benefits, you're going to have to bear upstream costs at some point.
Well, for a project of any significant size, the actual payoff from organizing and structuring within the project will come within the project, not afterwards. So when we talk to people, we don't motivate them by data sharing and open science, but by this: your project will run more smoothly. If you scale to more than two or three people working together, once you organize it, it pays off very quickly.

And retention, that's the other thing; retention is the big selling point to PIs. Retention of people or of data? Retention of information when people move, because in research, people are churning all the time, they're leaving all the time, and my experience is that investigators get it when they understand the retention issue, whereas if you try to explain the sharing issue, that's much less compelling.

So, unless there's a question from the audience, maybe just one related to this point: Dmitri, you had a very interesting Venn diagram in your talk about experimentalists, data scientists and data engineers. From my perspective, I would completely agree, and it was basically evident in all the talks, that it's very difficult for this data management work to be done by a lab. So do you think this is a model that needs to be incorporated in neuroscience in general, to have these dedicated data engineer positions, and how could we convince the people who give us money that this is required? What do the panelists think about this? Is this required for the future, or will the tools become good enough at some point that we don't need it?

In Elixir, I could say, we say you need data engineers and you need research software engineers. So it isn't just that you need the data engineers and some professionalization around data stewardship; you also need professionalization in software engineering, if you're going to have software environments and tools that will be used by more than the person who just built them.
I think the points that you and Carol made actually resonate well with the funding agencies: if you want data retention, they certainly believe in data reuse. They really like big groups to be able to use each other's results and make something bigger and better. So to the extent that you can get them to buy into the idea, you can hopefully get them to fund a position. But unfortunately, this may only be for a finite period, the length of the grant, so we still have the longer-term problems.

Yeah, I think a little bit of everything will happen. As the tools improve, doing the current level of science will require fewer people. But it's the Red Queen situation: to stay in the same place you have to keep running. So as tools improve, a data scientist who knows a few tools and has access to a network that supports the infrastructure they're using, whatever it is, can become more efficient. But as they become more efficient, and as we invent new tools for electrophysiology and all kinds of neurophysiology experimental modalities, the data are becoming much more multimodal, much more massive, and that will require a next level of complexity. And once we adapt to that, a little bit of everything will happen again. So people will get better: you will not need a dedicated data engineer or data scientist for the type of project we currently do, but then you'll step up to the next level.

Maybe one more point. Having a data engineer or data manager, of course, is great, but I think it's not at this point very viable for very small labs of up to five people. And before we arrive at full automation of these issues, I think a necessary step is also to create further incentives for postdocs and PhD students to invest time in writing software to deal with these issues.
As of now, it's very difficult to include contributions to software projects in your thesis or in a high-impact journal, so the measures of success should also adapt to these issues. Yes. So in light of the time, I think we need to come to a close, although everything was very much on time. The good thing is that the workflow we will do now is very linear and very simple. First of all, I'd like to really thank all of our speakers for this wonderful session and the wonderful talks. So please give them a hand. Thank you.