Welcome everyone to our webinar today. There are still people just joining us, so we'll give them a couple of minutes to come online. Thank you to everyone who's here already. I'm Naomi and I've got with me Michael and Oliver from Substance.
Hello everyone.
Hello.
So today we're here to give an update on the Reproducible Document Stack project that Substance and eLife started back in September, following some work earlier in the year. Very quickly, I'm going to give an introduction to eLife and the project, then I'll run through the agenda for the call today, and we can get on to the more interesting, juicy bits from Michael and Oliver. So eLife, in case anyone isn't aware: we're a nonprofit organization, and we're funded by these funders to help improve research communication in the life sciences. As part of this we have an innovation initiative where we invest in open-source technologies, tools and processes to improve the way that cutting-edge research is discovered, shared, consumed and evaluated. We very much care about encouraging responsible behaviors in science, and this includes reproducibility, and we develop our own open-source technologies in-house as well as supporting external projects. The idea for this project came from the fact that we're working with a system that prioritizes sharing research results as a flat manuscript. That's the narrative, and then often the resources like data, code, other materials and methods used to actually derive those results are shared as separate assets. And we are aware of demand from researchers, and also a wish among publishers, to share a much more embedded, richer version of the research results story.
So our vision for the reproducible document is something that would encapsulate usable code and data within the flow of the manuscript, so that the reader would have the option to progress from the static research article that you see today to something richer, with more detail, and even more interactive. We very much intend this project to be platform, tool and language agnostic. It's not about guiding researchers to use specific products, and we'd like whatever the output is to be usable by anyone, depending on their own work preferences. We would also like it to be accessible for everyone, so that includes supporting people who are very computationally literate as well as supporting researchers who prefer to use guided tools. And the ultimate aim is to encourage the reuse of published research, encourage the use of open data, encourage the reuse of code and methods, and to help the research community to build on the excellent work that's out there already. So we did some work last year to start to map out what this process might be and the kind of tools we might need, and to identify some gaps and some concerns, involving lots of stakeholders. This was from a workshop back in June last year. And the project that started in September is to produce an authoring platform, a format for the reproducible document, which we're calling the Reproducible Document Archive, and also tools for the publisher to actually publish these documents as they come in. For us, we're very much trying to innovate openly. This is not us trying to produce something to win a tools race. We would definitely like anything produced to be future-proof, and to be something that's very easy for researchers to use but also for publishers to adopt. And it's important for us that any dependencies are minimized. It was a clear concern from the outset that a lot of people have tried similar things before, and often things can break over time as libraries change and as functions change.
And so we want to minimize that. This is a project in collaboration with Substance, and Michael and Oliver are from Substance. They're joining us here today. They're the main developers on this project. They're also working with Nokome Bentley on Stencila, which is one of the platforms, the tools, that's very much related to this project. And if you'd like more information, there are several links here. The PDF of these slides is in the handouts. There's a handouts tab in your GoToWebinar control panel, so you can easily access all of the slides today, including these links. So today we will start with Michael going through a user interface for how researchers might start to put together these more reproducible documents, including a live demo of the work on Stencila so far. This is of particular interest to researchers and people who work closely with researchers, and we welcome your feedback about what's been done so far, any elements you really like, any features you'd really like to see. There's a link to a Google Doc in the chat, and there are some questions there already for you to contribute to. Please feel free. We will then move on to Oliver, who will take us through the specifications for the format that we've got already, so the reproducible document format. And there are some questions we'd like to ask about that specification. This might be of great interest to people working in the technical infrastructure around reproducibility, and I'm aware that we've got a few of you online with us today. Thank you for coming. Again, there are questions about this in the Google Doc. Please feel free to contribute during the call or later. And we'll also have time at the end for general discussion, so please feel free to bring up any points during that time at the end. Throughout, there will be a couple of polls where we're going to ask you to identify yourself.
That's if you'd like to have deeper conversations with us, perhaps one-to-one later, on specific points, very detailed technical points. We're very open to having those discussions with you. We'd like this to be developed very much as a collaboration with the wider community. So please answer those polls when they come along, if that's something you'd like to do. We'll collect that information, and we can be in touch with you later to arrange those calls. You can also get involved today through GoToWebinar, so I am able to unmute you if you'd like to ask a question yourself. You may see a control panel similar to this one, and there are various options for you to run through. You can use the chat at the very bottom, and that's simple. It will be sent to everyone on the webinar, but you can ask a question there if you'd like. There's also a section called Questions where you can put in a specific question, and that will notify me that you've asked a question. And I can either unmute you so you can ask it yourself, or I can ask the question for you to Michael and Oliver, and they can answer it for everyone on the call. And as I said, there's the link to the Google Doc in the chat right now, so please do feel free to jump on to that and start to contribute. There's a section at the top that asks you to put your name and your organization. It's not required, but it'll be really useful for us to know who has been contributing on the doc. And please respond to those polls if they are of interest to you as we get to them. So, kicking off, we've got Michael talking about the Stencila interface. We just need to briefly switch who's presenting here, so bear with us for 10 seconds. Michael, I'm going to make you the presenter.
Okay.
Okay, so we're just switching.
If anyone's got any kind of questions, even if your webinar is not working well, please feel free to stick them in the chat or the questions and I'll respond as I can.
So does everyone see the browser window and hear me?
I can see it.
Okay. So yeah, hi everyone, and thanks Naomi for the introduction. My name is Michael, from Substance. We are a company based in Linz in Austria and we basically work on open-source, web-based editing software. We've been doing that for the past seven years or so, supporting different projects. And one of those is Stencila, so we've been involved a lot in its development. And just to give you some background on Nokome Bentley: he's from New Zealand and he's a former fisheries scientist. He was kind of dissatisfied with the tools and workflows that were available, and that motivated him to create his own tools for himself and his colleagues. And that's basically the beginnings of Stencila. In the scope of the Reproducible Document Stack, as Naomi explained, Stencila is taking the authoring piece. With Stencila, you create and edit so-called reproducible document archives, which means you're basically able to take such an archive that you produced and submit it to a journal. And the journal is then able to publish not only a static view of the manuscript, as is done currently, but also an enhanced, reproducible view right on the journal page. So that's the general vision. Okay, so looking at the user interface, what you see here is a publication that consists of two documents. One is the manuscript and the other one is a data sheet, so these are kind of bundled together. And looking at the article, you will notice that there's some metadata. This is specifically modeled after JATS, so behind the scenes here there's the JATS XML format working.
So we chose that because that's what journals require, and we just want to make the transition from authoring to the journal as seamless as possible. Just to give you an example, you can add a new author, and this would update the metadata. And the tool also supports you in citing references, and you don't need to worry about citation styles, etc. So that's it for a general scientific authoring tool, and now the novelty of Stencila is adding reproducibility into those kinds of manuscripts. For that we have introduced cells, and cells are essentially expressions. So you can really compare this to Microsoft Excel, for instance: you can do some arithmetic computations and call functions. And we are evaluating that on every keystroke, so basically you see the result immediately. You also get some aids, like documentation about the function, that you can bring up. Okay, so that's basically how a cell works and what it can do. We can have multiple cells, so we can have one like numbers one and essentially assign it to a variable. We do another one, and we will then be able to plot these cells. And as you can see, this cell here depends on these two other cells, and if I change them, I immediately see the output. Okay, so given that we have a full function library, with a lot of functions available, this would be enough for many users. But there are still things that we haven't covered and that are not part of the library. It's like the point where Excel is not enough and you start programming. And in our case you can switch the language from what we call Mini. This is just a name for an Excel-like expression; we call it Mini because it's not doing much. So we're going to go into JavaScript, and now we are able to just write a piece of JavaScript, which also gets evaluated immediately.
So here's a result, this array here, and we're going to expose that again as numbers one, and then we should see the chart update as I change things here as well. Right, that's the basics about cells, and you can imagine you can implement different ones: some are displaying information, others are computing information. Here's another. Basically, what we wanted to do here is to actually transclude the data, so this is our source data that we want to analyze and use from the manuscript. I didn't manage to implement that in time for today, so we're just making up that information again within the document. Such a table data structure is very handy, as we can just filter on that information. So, for instance, change it here. And, yeah, we can do more interesting things, like polyglot programming. For filtering and grouping data, SQL is a pretty nice tool, so we can say, okay, let's take this table that we defined here, manipulate it with SQL and store it as a new table. And let's just see the output here. This is basically the sum, and we could switch to, say, minimum or average. And this just takes place in the regular flow of the document. And lastly you see a figure. People who know JATS know that a figure consists of a label, a title and a caption. So we have all that, but the figure itself is not just a static image. In this case, it's a figure produced in R, and that's also updating according to the inputs. If I change the data here, it would also change the chart. Yeah, that's mostly it. Here you see the references, which you can also manipulate. I just wanted to switch really quickly to the sheet, which is another type of interface. The whole idea behind it is to replicate familiar user interfaces and not to get in the way, so not to actually require programming. You can just use this as you would Excel: you can call functions, and you may want to plot things here.
Yeah, and if you change the data, the chart reflects it. And then we kind of opened a bigger discussion: Nokome was traveling and visiting lots of research organizations and universities to discuss what could be done better in a spreadsheet interface. And one of the ideas was to introduce a source mode, for instance, to be able to see the data, so the values, the results and the formulas, at once. That's just one small thing. We've also introduced first-class column names, so that you can assign a name to a column, but you can also assign a type. And this is useful later on to check if a cell contains the wrong type. For instance, if I typed text, I would get a warning. And that's also useful to avoid common errors. Yeah, that's it from my side.
Good timing. Brilliant. Thank you very much. Okay, so we've got 10 minutes or so for discussion specifically about this element of the project. If anyone would like to ask a question, please feel free to stick it in the question box in GoToWebinar. Michael, I wonder if you could switch to the Google Doc on your screen, as we're showing your screen already. We have some questions coming in. JV Polin, you've got a couple of questions already, actually. I will try to unmute you now if that's okay. You can send in a response as a question if it's easier for you to say no, don't unmute me. I'll give you a couple of seconds to do that. Okay, I'm going to unmute you, JV, and you can ask your questions.
Can you hear me now?
Yes, we can. Thank you.
Okay. Yeah, I was wondering about the relation with the Jupyter Notebook project, because it sounds so close to it. It looks like merging the two projects would be a good idea if it's an open-source community. But I'm sure there are technical reasons for having a different project, and I would just like to have your perspective on that.
One of the other questions I had was how the changes are versioned. If I change some stuff in the spreadsheet, how are those changes versioned in some way, and can I go back? I was also wondering if you could use broadly in this in the document. And I had one more question, but I need to just look at the GitHub repo first. I don't want to bother people with that.
Okay. I mean, the first question is about why Stencila is trying to provide another implementation of something; probably you mean something like Jupyter or similar projects, right? So there are notebooks and software out there providing a mixture of text blocks and code cells, and the question is whether there's a possibility to join the two efforts. There's one key difference from implementations like Jupyter: Stencila is trying to approach the problem from the side of, you know, more of a point-and-click user, so a simple, non-tech-savvy user, a reader who is not familiar with programming. And Jupyter is very code intensive; it's code first. You see the code blocks as first-class objects, and the same goes for the interaction. So we thought Excel is a good example where people actually are programming, but they don't realize that they're actually programming. And to introduce something which is very similar to use, where you just use functions and combine them with simple operators, has a good chance of being adopted by non-tech-savvy users.
The other angle that we're coming from is the collaboration with eLife to make a manuscript reproducible. And that's something that hasn't been covered by any of these other projects. We need to fulfill all the criteria that the manuscript needs to fulfill, and that's why we're settling on JATS. And besides the user interface, and Oliver will talk about that soon, there is the data format, essentially this Reproducible Document Archive.
And this is meant to be a generic solution, not specific to Stencila. We want to be able to integrate all the other projects, and we don't see ourselves as a competitor to those projects, because they're really strong in different areas. We just want to establish a standard together with eLife and other journals that eventually makes it possible to submit a manuscript to a journal and have it published as is, because currently the only way to publish reproducible research is as an asset, as an attachment to a Word manuscript, which has the main article in it. And we really want to solve that on this higher level, which has not been done. And that's basically why we're doing this work. The second part of the question was whether Stencila allows versioning. And the answer is yes. Currently we are still working on the application, but it will do that. So even real-time collaboration, but also offline editing and a classical type of versioning.
Okay, thank you. So we've got some questions that have come in from Konrad Hinsen. Konrad, I'm going to unmute you now. Afterwards we've got Stephen Eglen, and then we might move to some questions on the document. So Konrad, I am just unmuting you.
Okay, yes, I am unmuted. Yeah, I was wondering, in the presentation, about this polyglot stuff. How do you handle the exchange between different languages in a way that will still work five years from now?
I mean, there is a kind of serialization format in Stencila, which we define. So there's a specific set of primitive types, and an implementation in every language to map those data types to native ones. And I think the plain old data types won't change in the next years in these languages. When running in JavaScript, we have objects and arrays and numbers and strings. And in Python, they are mapped to ints and arrays as well, and data tables, and the same with R.
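As a rough illustration of the mapping just described, the following minimal Python sketch wraps native values in a tagged, language-neutral envelope and serializes it as JSON. The function names (`pack`, `unpack`, `to_wire`, `from_wire`) and the tag names are hypothetical and chosen only for this example; the talk does not spell out the actual Stencila type system or wire format.

```python
import json


def pack(value):
    """Wrap a native Python value in a tagged envelope (hypothetical tag names)."""
    if isinstance(value, bool):  # test bool first: bool is a subclass of int
        return {"type": "boolean", "data": value}
    if isinstance(value, (int, float)):
        return {"type": "number", "data": value}
    if isinstance(value, str):
        return {"type": "string", "data": value}
    if isinstance(value, list):
        return {"type": "array", "data": [pack(item) for item in value]}
    if isinstance(value, dict):
        return {"type": "object", "data": {k: pack(v) for k, v in value.items()}}
    raise TypeError(f"no neutral type for {type(value).__name__}")


def unpack(packed):
    """Map a tagged envelope back to the native Python value."""
    kind, data = packed["type"], packed["data"]
    if kind in ("boolean", "number", "string"):
        return data
    if kind == "array":
        return [unpack(item) for item in data]
    if kind == "object":
        return {k: unpack(v) for k, v in data.items()}
    raise ValueError(f"unknown neutral type: {kind}")


def to_wire(value):
    """Serialize a value so another language's runtime could read it."""
    return json.dumps(pack(value))


def from_wire(text):
    """Read a value produced by to_wire (or by an equivalent in another language)."""
    return unpack(json.loads(text))
```

Because only a handful of primitive and container types cross the language boundary, each language needs just one small `pack`/`unpack` pair, which is the property the answer relies on for long-term stability.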
So there's basically a kind of application-specific data type system, which is mapped to native plain old data types.
Thank you. And Stephen, I'm just going to unmute you now.
Okay, thank you, Naomi. And thank you for the presentation. I've got two quick questions. First of all, I'm curious if you know how these documents might scale beyond toy examples. I've seen lots of interfaces like this before, and they really don't scale when you do real-world research computation within them.
So honestly, we haven't had a good stress test yet. What we know is that a huge document in general, so just a long document, is not a problem. Beyond that, we don't know yet. The internal representation is a kind of dependency graph, and it is executed very locally. But if you had a sheet where every piece of data depends on the other data, so every change affects the whole sheet, we'd probably see some slowdown, definitely. So I would say it depends on the dependency graph: if you have a logarithmic kind of distance of dependencies, then the complexity is like that, logarithmic. But if you have a kind of full dependency graph, where every cell depends on the others (there are no cycles allowed, obviously), then it's probably linear at most.
Yes, and generally we're trying to get this thing as self-contained as possible. So one aim is to put as much as possible into this archive, so that we can later run it without any outside dependencies, such as grid computing systems, etc. And of course we thought of these long-running tasks that you cannot perform on your local computer. That's something that's not realized yet, but we have an idea to be able to connect to a remote resource, which does the heavy number crunching and returns an aggregated result that we can then store in the document, in order to at least provide some reproducibility without the external dependency.
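The point above about recomputation cost depending on the shape of the dependency graph can be sketched in a few lines of Python. This is a toy model, not Stencila's engine: the class `CellGraph` and its methods are hypothetical names, and the cell names `numbers1`, `numbers2` and `total` merely echo the demo. When a cell changes, only that cell and the cells downstream of it are re-evaluated, in topological order, and cycles are rejected.

```python
from collections import defaultdict, deque


class CellGraph:
    """Toy reactive document: named cells that recompute when inputs change."""

    def __init__(self):
        self.formulas = {}                  # cell name -> (function, input names)
        self.values = {}                    # cell name -> last computed value
        self.dependents = defaultdict(set)  # cell name -> cells that read it

    def define(self, name, func, inputs=()):
        """Create or redefine a cell, then refresh it and everything downstream."""
        if name in self.formulas:
            for old_input in self.formulas[name][1]:
                self.dependents[old_input].discard(name)
        self.formulas[name] = (func, tuple(inputs))
        for input_name in inputs:
            self.dependents[input_name].add(name)
        return self.recompute(name)

    def affected(self, name):
        """Return `name` and all cells downstream of it, in topological order."""
        reachable, stack = set(), [name]
        while stack:
            current = stack.pop()
            if current not in reachable:
                reachable.add(current)
                stack.extend(self.dependents[current])
        # Kahn's algorithm restricted to the affected sub-graph.
        indegree = {
            cell: sum(1 for i in set(self.formulas[cell][1]) if i in reachable)
            for cell in reachable
        }
        queue = deque(cell for cell in reachable if indegree[cell] == 0)
        order = []
        while queue:
            current = queue.popleft()
            order.append(current)
            for dependent in self.dependents[current]:
                indegree[dependent] -= 1
                if indegree[dependent] == 0:
                    queue.append(dependent)
        if len(order) != len(reachable):
            raise ValueError("cyclic dependency between cells")
        return order

    def recompute(self, name):
        """Re-evaluate only the affected cells; return the evaluation order."""
        order = self.affected(name)
        for cell in order:
            func, inputs = self.formulas[cell]
            self.values[cell] = func(*(self.values[i] for i in inputs))
        return order


graph = CellGraph()
graph.define("numbers1", lambda: [1, 2, 3])
graph.define("numbers2", lambda: [4, 5, 6])
graph.define("total", lambda a, b: sum(a) + sum(b), ["numbers1", "numbers2"])
order = graph.define("numbers1", lambda: [10, 20, 30])  # numbers2 is left untouched
```

In this model the cost of one edit is proportional to the number of affected cells, so a sheet where everything depends on everything re-evaluates every cell on each change, while a mostly independent document touches only a few.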
So the general rule is yes, we want to connect to larger systems, but only if there is also a snapshot, so that we can basically run this without them. So that's one goal. And a lot of the computation is actually done in those execution contexts, like it is happening in Jupyter, where the actual number crunching is done in R or in Python. So if you manage to organize your overall program so that the document is just glue, aggregating the data on a higher level and in a simple way, and you do the hard computing in the core, then I think this will work pretty well.
Okay. Thanks very much for answering that, Michael. I'm going to make myself presenter again so that I can run through all of the slides, if that's okay.
Perfect.
Thanks for everyone's questions. There are a lot more questions in the Google document. Some specific feedback is asked for, and there's also an area for general feedback. If we've got time at the end, we can take some more questions on this element. And also, if you would like to get in touch with us directly about this particular part of the project, I'm just running a poll right now. If you'd like to talk to us directly, outside the Google document, about this, please click yes or maybe now. I'll give you 10 more seconds. Okay. The numbers are pretty stable, so I'm going to close that poll. Thanks very much for doing that. And hopefully, can you see my screen? Can you see the Google document?
Yes.
Excellent. So, as I said, there are some questions here. Thanks to everyone who's already started to put in their names and some feedback. That's brilliant. If you put your name against your comment and would like a response, then we can get in touch and respond to you on that. Otherwise, we'll use this as feedback. So thank you very much. And then we're going to move on. Can you see my slides?
Yeah.
I'm going to move on to Oliver. If you just say next, I'll move your slides across, if that's okay.
Okay. Yeah.
So I'm going to talk about the Reproducible Document Archive. Since September, we have been discussing how to approach that and looking at several existing solutions, like Research Objects and other things like CWL, and also thinking about the containerization problems. And finally, we came up with something that is more like a set of decisions right now. Actually, the solution we propose is pretty simple from the file format point of view. Basically it's all about a document archive, which is not very surprising. It should be as self-contained as possible, so it should also pack in all the data that can possibly be packed. Maybe there are some situations where you have terabytes of image data or something like that. There will probably be problems then, but in many cases this will be possible. There will be a manifest file that basically describes what kind of data is in the archive. There will be resources of different types. And there is metadata specifying how the environment should be set up. In a later phase of a publication or a document, there will be a read-only version, like an HTML version, as you know it from EPUB or something. So next, please. There's a link to the Git repo if you're interested in clicking on that at some point. Okay, now next. The resources we currently consider are, first, articles. That's basically the narrative, the manuscript, so what you would edit in Microsoft Word, for example. Then you have spreadsheets to work on data. And there will be assets like images, audio, etc. And, to be able to extend the whole system with user code, there is custom source code, which will be user functions or user function libraries. Then there's always this discussion of whether notebooks are an extra type of resource or just a special type of article. We think a notebook is more like an article without title and abstract, so we will probably just cover that use case using articles.
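To make the idea of a manifest that lists typed resources concrete, here is a minimal Python sketch. Everything specific in it is an assumption for illustration: the JSON layout, the file name `manifest.json`, the type names in `RESOURCE_TYPES`, and the use of a ZIP container are not taken from the talk, and the actual format is whatever the specification in the project's repository defines.

```python
import io
import json
import zipfile

# Hypothetical type names, loosely following the resource kinds listed in the talk.
RESOURCE_TYPES = {"article", "sheet", "asset", "code"}


def build_archive(resources):
    """Bundle (path, type, content) triples plus a manifest into one archive."""
    manifest = {"resources": [{"path": p, "type": t} for p, t, _ in resources]}
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("manifest.json", json.dumps(manifest, indent=2))
        for path, _, content in resources:
            archive.writestr(path, content)
    return buffer.getvalue()


def check_archive(data):
    """Return a list of problems: unknown types or files the manifest promises but lacks."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        names = set(archive.namelist())
        manifest = json.loads(archive.read("manifest.json"))
    problems = []
    for entry in manifest["resources"]:
        if entry["type"] not in RESOURCE_TYPES:
            problems.append(f"unknown type: {entry['type']}")
        if entry["path"] not in names:
            problems.append(f"missing file: {entry['path']}")
    return problems


archive_bytes = build_archive([
    ("manuscript.xml", "article", b"<article/>"),
    ("data.xml", "sheet", b"<sheet/>"),
    ("figure1.png", "asset", b"\x89PNG"),
])
problems = check_archive(archive_bytes)
```

The design point is that the manifest is only the top-level index: a publisher's tooling can discover every resource and its type from one file without opening the resources themselves.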
And in a future version, we want to add slides as well, so that it's possible to add the presentation to the publication at some point. Okay, next, please. Talking about reproducibility: reproducibility doesn't come per se from the document archive. It's more defined by the content bundled in the archive. Basically, the idea is that if the manuscript, data and methods are there, it's probably possible to reproduce the described results. But there's also a technical precondition: you need to be able to replicate the runtime environment used to process the data and run the methods. And that is still not all. You also need some kind of best practices for how to create such content. It will probably not be enough just to pack all the source code; we probably also want to somehow mediate between the reader and the programmer. What we want to achieve is to provide best practices and show the authors how to increase the comprehensibility and how to make the code more reusable. So, for example, introducing functions instead of creating so-called spaghetti code in notebooks is probably more useful for other authors than just source code hidden in a notebook. There are going to be limitations, like very large data, or special hardware, or data with access restrictions. For some of the problems there will be solutions, and for some of them there probably won't be. Very large data can possibly be solved by providing at least a snapshot of preprocessed data. So, for example, if you have a set of terabyte images and do some feature extraction, you could then store the result of the feature extraction in a CSV file and allow the readers to at least follow or re-run the computation after that first aggregation step. Optionally, the data could be available for download, so you could include that URL, for example. It could be a decentralized repository that interested users could replicate and then do the full processing.
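The snapshot idea described above can be sketched as follows, under stated assumptions: the "images" are tiny in-memory pixel lists standing in for terabyte-scale data, and the function names and feature columns (`extract_features`, `write_snapshot`, `rerun_from_snapshot`, `mean`, `max`) are hypothetical. The author runs the expensive step once and bundles a small CSV; a reader re-runs everything after that aggregation step from the CSV alone.

```python
import csv
import io
import statistics

FIELDS = ["image_id", "mean", "max"]


def extract_features(image_id, pixels):
    """Stand-in for the expensive step that needs the full raw data."""
    return {"image_id": image_id, "mean": statistics.fmean(pixels), "max": max(pixels)}


def write_snapshot(images, handle):
    """Author side: reduce the raw images to a small CSV to bundle in the archive."""
    writer = csv.DictWriter(handle, fieldnames=FIELDS)
    writer.writeheader()
    for image_id, pixels in images:
        writer.writerow(extract_features(image_id, pixels))


def rerun_from_snapshot(handle):
    """Reader side: repeat the downstream analysis using only the snapshot."""
    rows = list(csv.DictReader(handle))
    return statistics.fmean([float(row["mean"]) for row in rows])


# Tiny in-memory example; a real archive would store the CSV as a file.
raw_images = [("a", [0, 10]), ("b", [20, 40])]
snapshot = io.StringIO()
write_snapshot(raw_images, snapshot)
snapshot.seek(0)
overall_mean = rerun_from_snapshot(snapshot)
```

The trade-off is the one named in the talk: the reader can verify everything downstream of the snapshot, while the feature extraction itself remains reproducible only for those who fetch the full data.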
For the data access problem, if that is the situation, then there is probably only one solution, which is probably also what people do right now: to aggregate and obfuscate the data. Okay, thanks. Next slide. So talking about the articles, we went for the JATS format, because JATS is just a very expressive, open document format for scientific content. And what we have been working on is trying to disambiguate JATS. We call it JATS for Machines, JATS4M, basically providing a strict guideline on how to use JATS to avoid ambiguities. We added guidelines on how to encode figures backed by source code, which we call reproducible figures, and blocks of source code, like cells, and there's another thing which we want to add, which is inputs. For example, the author could add a slider for a certain variable so that the reader can play with settings and change the outcome, to understand and explore the methods. Transclusion is another mechanism we want to add, so that it's possible to reference areas or ranges in a sheet from within an article, or just to write a plot command using data from a sheet, or maybe even to use an image created in a sheet. If you want, you can see the current state of the JATS4M specs behind that link. Thank you very much. The next thing is spreadsheets. We are developing a document format which we currently just call SheetML or sheet XML. We could not use existing formats, such as the OpenDocument format, because they are too much about presentation and not semantic enough for us, so we basically started to come up with a custom prototype of a document format. The spreadsheet supports custom functions in pretty much every language supported by Stencila, so it's basically Excel which can be extended via custom functions. And we support best practices like typing of columns and automatic validation. Okay, next slide, please. The runtime environment is a hard topic, which nowadays can probably be solved pretty easily.
So our take is that the author should choose from an existing set of runtime images. They are container images like the ones you use in Docker, and, as we saw, the Open Container Initiative is trying to standardize these types of containers. These could be provided in a decentralized or centralized way, so that people can choose and share such images. Together with functions included in the archive, this should give authors the capability to go further from these shared images and add custom functions. There will need to be a unified service interface for the running runtime environment. We are considering HTTP endpoints for that. Basically, the greatest common divisor is just to provide a function to execute code blocks, so blocks of source code. If that is possible, then we can probably attach it to something like Stencila. There might be edge cases, which we have not yet discussed in detail, where the author could include a recipe for creating environments, for instance if it's about special hardware or if it cannot be covered by Docker images. Okay, thank you. Next slide, please. I think I'm already at the end.
You are. Yes. Okay, thank you. So there's a lot of detail there and we anticipate a lot of questions. I'm going to run a poll quickly again, just asking whether any of you would be interested in speaking about any of those specific points in greater detail beyond this call today. If you'd just like to click your options, please. I'll give you another 10 seconds or so. And then we'll move into some questions that you might have about any of those elements. There are also questions on the Google Doc. I'm aware that some people couldn't see the Google Doc link earlier. So if you're having a problem, just drop me a question to say, please can I have the link, and I'll send you that link. Okay, closing that poll. Right. So moving on to questions. Does anyone have any questions to ask about any of these elements?
One thing you can do is to put your hand up as an attendee. There's a little hand logo. If you do that, I can come to you and unmute you to take your question. Okay, I'm not seeing any hands up. So what I'm going to do is switch to the Google Doc, and we can run through that and use it as a way to think about questions. So can you see my Google Doc screen?
Yep, we can see it.
Oliver, would you like to run through this and take control?
Yeah, I can. With the questions, you mean like going through the questions?
Yeah. What would you like to hear from people?
I mean, yeah, maybe we scroll. There have been questions like... okay, I need to scan first. These are general questions. But yeah, we could talk about this, like the R Markdown integration. I mean, just generally, we would use this opportunity to get some initial thoughts in from any of you. And then we're completely open to talking individually afterwards with different organizations. We've already talked with Research Objects, for instance, and Code Ocean. And we really want to figure out where the common parts are and whether we can establish something that's really interoperable. We don't want to create an island. We're really open to any input from the outside.
I mean, Stencila itself might be a bit opinionated about the way it wants to communicate with the backend. But in general, the Reproducible Document Stack is agnostic, or should be agnostic. We have already played around with using Jupyter as a backend for Stencila, so that is definitely possible. On the other hand, you lose a bit of the unique selling points of Jupyter, you know, the multi-line REPL kind of thing where you get errors on every line and stuff like that. And what we thought could be interesting is to have a dedicated integration of Jupyter into that editor, into Stencila.
That might mean reducing some of the features that Stencila uses, like transclusions; we need to discuss how to do that. So, for example, it would be possible to include a Jupyter notebook, have Jupyter create some images, and then use those images in the manuscript. That's really, I think, straightforward. Or maybe use some results in the spreadsheets, or something like that. So that's really these two directions. Oh, sorry.
I was just saying we had a hand up from Stephen Eglen, who I can unmute now if he's got a question to ask. I'm trying to unmute you. Okay, Stephen, you're unmuted.
Thank you. Yes. So I'm kind of slightly worried about a new format turning up. And I'm really heartened to see that you're thinking seriously about the Jupyter and the R Markdown communities. From my experience, R Markdown does pretty much everything that I ever need, with knitr and the caching. It's fantastic. From what I've heard today, I think the only reason you wouldn't want to adopt R Markdown is the problem with translating to JATS at the end of the day. And I thought that was actually already a solved problem through Pandoc. I'm slightly cautious about that, but putting that to one side: I think the problem is that Stencila, when I gave feedback on it a few months ago, was really aimed at making things easy to use. And probably what you're talking about is a different set of users. If I look around, and I know several other people on the call, your primary users at the start are going to be people who are already very familiar with these kinds of technologies. And so the number one thing I would urge you to do is make this transition to JATS seamless behind the scenes.
Then the people who have already got Jupyter documents and R Markdown documents, who are writing their full papers that way, can just run a little script and convert to the JATS format, and then it can get published immediately. I think the problem that you're going to face is that the power users are not going to switch unless they see a good reason to. So the more that you can reach out to that community first, the more you'll build up a big user base.
Yeah, no, we are definitely aware of that. I mean, the archive format is pretty much open. It's more or less just a manifest file that points to different types, and each of these types, like JATS, is just a standard. So we don't introduce something new, really; it's just this top-level thing that pulls things together, because that doesn't exist at the moment. With regard to conversion: if there was a way to convert a Jupyter notebook or an R Markdown file to JATS4M, including the reproducible elements, you would be able to submit that. You wouldn't even need Stencila for that; you could just have a command-line tool that does it. The thing that is missing then is, of course, metadata, but the journal could add that. And additionally, you could even open it in Stencila and continue working there. So basically you are just using your preferred tools. And the final phase is important, because closing the loop, being able to edit an R Markdown file in Stencila, is also something we are thinking about. But that's going to be the hard part, to really close the loop and to have the converter that stable. With each feature that gets added to R Markdown or to Stencila, this needs to be synchronized, so this could get a bit hard.
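The "manifest file that points to different types" described in this answer could be sketched roughly as follows. This is only an illustration under stated assumptions: the field names (`documents`, `assets`, `format`, `path`), the file names, and the use of JSON are all hypothetical and are not taken from the actual archive format discussed in the call.

```python
import json

# Hypothetical manifest: every field name and file name here is illustrative,
# not the actual archive format discussed in the call.
EXAMPLE_MANIFEST = """
{
  "documents": [
    {"id": "manuscript", "format": "jats", "path": "manuscript.xml"},
    {"id": "analysis", "format": "ipynb", "path": "analysis.ipynb"}
  ],
  "assets": [
    {"id": "fig1", "path": "fig1.png"}
  ]
}
"""


def load_manifest(text):
    """Parse the manifest and check that every entry has an id and a path."""
    manifest = json.loads(text)
    for section in ("documents", "assets"):
        for entry in manifest.get(section, []):
            if "id" not in entry or "path" not in entry:
                raise ValueError(f"incomplete entry in {section}: {entry}")
    return manifest


def paths_by_format(manifest, fmt):
    """Return the paths of all documents stored in the given format."""
    return [d["path"] for d in manifest["documents"] if d.get("format") == fmt]


manifest = load_manifest(EXAMPLE_MANIFEST)
print(paths_by_format(manifest, "jats"))
```

The point of the sketch is that the manifest itself carries no content; it only names the parts, each of which stays in its own standard format, so a notebook and a JATS manuscript can sit side by side in one bundle.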
So the third option that Oliver just mentioned was really including Jupyter notebooks as they are in the publication, and publishing them as they are currently, as an attachment, but embedded in one archive, so these are really bundled together, and there's a standardized way of accessing and rendering them, etc. So I think there are many options open. I don't know today which is really the best one. I'm really curious: what do people want? What would be their preferred workflow? Because the general question is: I am a data scientist and I use the tools X and Y; how can I send something to, for instance, eLife and get that published in the nicest and most integrated way on their page? And these are the things in between that we want to solve and help solve, but ideally together with all those communities, and not in competition.
We've had a similar question on that point. Conrad, I'm struggling to unmute you, so I'll ask it for you: if neither Stencila nor Jupyter nor RStudio are good enough for my needs, what are my options?
So basically the question is how to solve this problem in a general way, right? So, how to create reproducible documents without Stencila and Jupyter. I mean, you need to think about what a reproducible document is. If it's just providing the source code and some script, there are probably solutions already, we think; that's just writing a manuscript using images generated from running some source code. And that's probably okay; you can still create something like this archive and pack in all the data. Now the question is how much you increase the reproducibility, or how easy the author made it for readers to reuse their sources. That's the best practices I was talking about.
So I think the tools could foster certain workflows. You could definitely come up with a different toolchain as well, using other custom tools to achieve a similar result. Does that answer the question?
So we have some questions from earlier on as well about Stencila, which we can move to now. I am having problems with GoToWebinar with unmuting people. Ben, if you are there... No, I can't, I'm sorry; the program is trying to crash. But I will ask the question for you. So his question is: what about dependencies and versioning? If my code uses a particular R package, for example, how do you guarantee the same package version will be used for reproducibility?
I mean, this is all about these shared and maintained images, which we think are the right way. So if there's a community-maintained image containing all of the necessary packages in a certain field, the user should just use that and restrict themselves to that specific version. So we are moving away from allowing people to specify every detail, every dependency, in a custom way; rather, you choose an existing image or provide a shared one. I think 99% of the users are fine with a maintained one, and the one percent of users could contribute their own kind of image. And it's not that you specify this package in this version; you say, I want to use this image in that version.
But that's generally the challenge at the moment, also with other projects like Jupyter. If you customize some things and you spin up a custom Python session with, I don't know, some native modules included, how do you preserve that and run it in the future? That's also something that the two of us don't want to take on. That's not our expertise, actually, but we want to find a good abstraction to say we can identify an image by an identifier, like image name and version. And then the application or the system just knows how to retrieve that image and, yeah, start it and run it.
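The abstraction described here, identifying an image by name and version and letting the system work out how to retrieve it, might look something like the sketch below. Everything in it is an assumption made for illustration: the `name:version` syntax, the `ImageRef` type, the registry contents, and the `registry.example.org` location are hypothetical and were not specified in the call.

```python
from typing import NamedTuple


class ImageRef(NamedTuple):
    """A pinned runtime image: both the name and the version are required."""
    name: str
    version: str


def parse_image_ref(ref: str) -> ImageRef:
    """Split 'name:version'; a missing name or version is rejected so that
    the environment is always pinned explicitly."""
    name, sep, version = ref.rpartition(":")
    if not sep or not name or not version:
        raise ValueError(f"expected 'name:version', got {ref!r}")
    return ImageRef(name, version)


# Hypothetical registry of community-maintained images. In practice this
# lookup would be a shared service, not an in-memory dictionary.
REGISTRY = {
    ImageRef("r-neuroscience", "1.0"): "registry.example.org/r-neuroscience@1.0",
}


def resolve(ref: str) -> str:
    """Return the location of the image, or fail if it is not known."""
    image = parse_image_ref(ref)
    try:
        return REGISTRY[image]
    except KeyError:
        raise LookupError(
            f"unknown image {image.name} version {image.version}"
        ) from None


print(resolve("r-neuroscience:1.0"))
```

Rejecting a reference without a version is a deliberate choice in this sketch: the author names one image in one version, rather than listing individual packages and versions.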
And everything works without user intervention and manual resolving of package versions. I mean, you really have to get rid of these kinds of package managers, because they are unreliable dependencies. After a long-term support phase, you might not even be able to get your own module anymore, so there might be real problems when you depend on them. So it's better to have an image around somewhere and maintain that for the long term. Who knows if the Node modules, the NPM packages, will be available in 10 years' time. In JavaScript, with NPM, there was one example where somebody removed their module and this broke a lot of projects. So you'd better not rely on this registry.
Okay, so we've got some time for general questions if anyone would like to go ahead and ask those. We did have a clarification from Conrad about his question: if he chose to package up his code and data in any way that suits him and write a top-level document referencing all of this, is there a top-level format that he should use, without using any of the ones we know about?
So I think that could be our archive, just including them, as we talked about before, as separate document types. So using Jupyter or using another language, just including that, and the manuscript could be written in JATS and pieced together like that. I mean, there are many options, but I'm not completely sure I understood it, like what the exact use case is. Just shoot us an email or put this into the document so we can follow up later.
There's also a question: you mentioned Research Object earlier, and an attendee [name unclear] has asked whether you're able to elaborate on that, and whether there's any idea of collaborating with them on this format.
Yeah, definitely. We're in touch with Stian and Carole from Research Object, and we had a talk a couple of months ago, and we hope to use their expertise to make the right decisions. They're working on ontologies, and we can pull in stuff that is working. We're completely open; it's just about the argument: does it make sense to introduce a thing? And if yes, definitely. The same goes for Code Ocean, for instance. I mean, they're basically providing the execution environment, so we would hope that platforms like Code Ocean would support our DAR format, for instance, and then be able to run such archives in addition to the traditional code repositories which are published currently. So, yeah, we're just getting your feedback: does it make sense to do it in this particular way? And until September we're just trying to finish the picture. Currently we've done the authoring part and the modeling of the format with JATS4M; it's still a work in progress, but we have all the pieces in place that we wanted. Next, there will be tools to convert such an archive to a static web page, and also to a reproducible, runnable version. What we want to do is publish that on the experimental part of the eLife site, and do that with a real-world article: we're going to replicate an existing publication which has reproducible elements as, I think, an R Markdown file, and just put that out, see what the responses are, and iterate on that.
Okay, thank you. That brings us to the end of this call. Thank you very much to everyone who joined us and who contributed, and please feel free to continue to add feedback to the Google document. You will receive an email after this call with a link to that document in case you couldn't get on it during the call.
That email will also include a copy of the recording; please feel free to share it with any colleagues you think would be interested. And there'll be a link to opt in to further email updates about the project. We really do want to keep you informed and able to contribute as this project progresses. If you offered to talk further about a particular element, we have that data and we'll be in touch with you. And if you have any questions, my email is on the screen right there. I'll be happy to connect you through to Michael and Oliver, and I'm sure they'd be willing to share their addresses too if they wanted. So, just to say thank you, thank you Michael and Oliver. Thank you everyone for coming, and we hope you have a good rest of your day, whichever time zone you are in. Okay, I will be ending the webinar now. Do you want to say goodbye?
Yeah, goodbye, and let's talk more.
Brilliant. Thank you everybody.