Hmm, okay. So I guess that's it. Can you hear me correctly? Okay, so thank you everyone for being here today. I'm very happy to be here to speak about one of my projects from several years ago.
First, a quick presentation about me. I'm a Python and testing fan, as you can imagine. I'm a former Mercurial reviewer and I currently work for Comet.ml, which is doing a monitoring solution for machine learning, but that's not the subject today. You can find me under my pseudo, Lothiraldan, almost everywhere on the web.
Let's start with a question: who in this room is writing unit tests? Okay, so pretty much everyone. And who in this room is not writing unit tests but executing them? Okay, so that's pretty much the majority. So I have good news: we are not alone. In fact, more than 70% of the respondents to the JetBrains Developer Ecosystem survey answered that they are at least running unit tests, and most of them are writing and running unit tests.
A bit of background: I actually do both, because I'm writing Python most of the time and sometimes JS. So of course when I'm writing some code I'm writing some tests, and when I write tests I run them. With experience, I've learned how to run all of my tests, a subset of my tests or a single test; how to check why my new test is not running, or why it's not failing when I know it should; and where to find the data I need in the output of the test runner to understand why it's failing or not. And I know how to debug a failing test efficiently, either by launching a debugger or by adding some prints and knowing exactly where they should appear.
But I got this knowledge from many years of actually writing tests, running tests and debugging them, and you don't have it when you first come to run tests in Python.
So for example, with pytest, this is the interface you can use. You have pretty much everything, but you need to know where to find it. And if you move to JavaScript with, for example, Jest, it's a totally different interface. So for newcomers, for beginners, it's very hard to be able to use both tools efficiently in the beginning.
So that's for the languages I'm the most able to write, because that's what I'm using from day to day. But I'm also, unfortunately, the CI expert in almost all of my jobs. I am the one guy, in the office or remotely, to whom people say, "Hey, my release is failing, but I don't know why." And I'm like, "Okay, send me a link. Okay, you have this broken dependency," or "Here you have an assertion failure: you are expecting true but it's false. Let's try to debug that."
So I have to understand failures from random tooling that I didn't know existed in the first place. I have to find the data in the output, and maybe add it, because by default you don't have logs, or you have logs but at info level and not at debug level, stuff like that. From time to time I have to run those tests, either on another server to see if the server configuration is impacting the tests, or locally, and then it's, "Okay, so you need to install this package in this way, then run this weird command with those flags; if not, that won't work." So it's always an unfamiliar language, tool and interface.
I was unhappy about this situation, so I decided that we need some common tooling. We are engineers, we love to write tooling. So let's write a tool that can bring all of this information into a single interface. So what do we want in an interface?
So, my Christmas list; yours can be different, but I think most of these will be on it. First, colors, because when you are reading thousands and thousands of lines, colors definitely help. A progress bar, of course. Relaunching only failing tests, because if you have a build that takes three hours, you don't want to add another three hours to your debugging session just because you added a print. Launching specific tests: you might know exactly which test you want, whether it's a failing one or one that you know should be failing but is not. And finally, a web interface.
What I did with this Christmas list is turn it into an interface. Let's meet Balto. Balto is a language-independent test orchestrator. Balto checks off the whole wish list: it has colors, a progress bar, relaunching failing tests only, launching specific tests. And if you don't trust me, I will show you with a live demo. Wish me luck.
Okay. So here is Balto. Can you see correctly? Okay. So you can collect all of the tests, which will ask the underlying test runner to give you all the tests you have in your test suite. You can select a single test, which will give you some very basic information, because you didn't run anything yet. So let's try to run this one specifically. Here I get some new data: I know that the test is passing, which is good, and I get some additional tooling. Now let's launch those two files to begin with. We can see that we get some failures with some tracebacks. And now we can launch everything. Actually, it's not a good idea to not be mirroring my screen, but you can see that you have stdout, you have stderr, you have logs, you have everything I need when I debug a test suite.
So, okay, but that's not why I'm here. Let's go back. Yeah, okay. So Balto: what is Balto doing under the hood to get all of your data? Balto is launching subprocesses. Okay, but what is the big secret of Balto?
It's reading the subprocess stdout. That's it. But there is still one little piece needed: the plugin which is running in the test runner, the web server and the UI need to speak the same language. So what are the possible languages for talking with each other? Because here the example was in Python with pytest, and the UI is in JavaScript.
There are already a couple of output formats, and you might know some or all of them: you have JUnit, TAP, mozlog and subunit.
JUnit, very quickly: it's based on XML. It's well known and used in the Java community. It's one big XML file at the end of the build. Its format is tied to JUnit; there's no independent definition of it. And it's not streamable: as it's one big file at the end, you need to wait for the end of the build before being able to consume it.
You also have TAP, which is mostly famous in the Perl community. It's simple, but hard to extend. I didn't put an example here, but you can take a look online. Its format is also tied to the TAP Perl implementation; there's no independent definition of it. And it would have needed an independent parser in both Python and JS. So it was not good. I mean that the format is deeply tied to the default implementation: if you want to discuss the TAP format, you need to create an issue on the TAP Perl implementation.
For TAP, okay. Oh, I didn't know, so okay.
I will update the slide then. Okay. Yeah. Okay, so the question was what a non-independent definition is, and you got the answer. So, sorry. Okay, hopefully I will offer another solution for that. The comment was that there are already several producers and consumers in several languages, so as a starting point he would start with that.
There's also mozlog, which is used internally at Mozilla. One particular design choice there was that you have one message at the beginning of the execution of a test and one message at the end, which means that we need to keep some kind of state. I decided that one test equals one message; it's easier to consume.
The last one is subunit, which is actually the closest in design to what I have designed. It's a binary format, while the other ones are text-based. An effort has been made to try merging subunit and LITF, which I will present just after, and the biggest issue was that subunit doesn't have an input format.
So what was, again, my Christmas list? It was a format which is easy to write and easy to read. Which is streamable, because here, with Balto, it's a desktop application, so you get all the data in real time through a WebSocket connection; but you can also have it on the CI, and as soon as you get a failing test you can mark the CI build as failed without needing to wait for the full build to be finished. And finally, a format defined outside of an implementation. Code dies with age, while formats defined independently can continue to evolve and, hopefully, get more traction and more tooling if they are not tied to a specific implementation and a specific usage.
So I made a new format. What do we need in this format? We need a test name, we need a test status, and we need an error message. So let's use JSON. Can you read correctly? Okay, okay. Okay, sorry. So there are two examples, one with a passing test and one with a failing one. There's actually a missing piece.
We want one message at the beginning which tells us how many tests we are going to have in the test suite, and one at the end with the total duration and the number of failing and passing tests. But we could add more data, and I actually showed you more data in the Balto demo. We can have timing, log messages, stdout, stderr, text and image diffs, for example if you've done some snapshot testing, either on the front end or with some specific test runner on the command line. Snapshot testing is when you expect a full HTML page or a full text and you get something else, so you want to show a diff between what was expected and what you got. And finally file, line and more.
There was actually also one thing missing: how do you launch a specific test? That's a good question, because it depends on the language and on the test runner. Even within the same language, different test runners might have different ways to run a specific test, if they have one in the first place.
So we talked about output formats, just like JUnit, mozlog and subunit, but we were missing something: we also need an input format. Why? Because when you want to run a specific test, you don't want every tool that needs to do that to have to know: okay, for pytest the command line is this way, for Jest the command line is that way, for yet another runner the command line is another way. You would duplicate everything. Whereas if you define an input format, it can be implemented in the plugin which is already needed for the output format. So if you also have an input format, any tool can use this format to talk with the test runner.
There are two main cases. One is: I just want to collect all the different tests in the test suite. The other one is: I want to run specific files or specific node IDs. But for node IDs, that's a bit hard, because how do you format node IDs?
We have the same issue. So let's ask the different test runners to provide it, and let's create an ID field. Whatever unique ID a test runner can generate, it sends to whatever consumer, for example Balto, and we can send it back to say, "Hey, this specific test, I want to run only it, so I give it back to you. I don't know anything about how it was formatted or created, or whether it includes the test hash or whatever. I don't care. Just run it again."
So if we take a full example, here is a valid LITF output, with the name, the unique ID, which again is opaque for Balto, the outcome, and so on.
So that's the format I am trying to define. It's called LITF, for Language Independent Test Format. It's defined in its own independent repository, which lists both producers and consumers. That means we can have discussions and efforts independently from any implementation. Each message that I gave as an example is defined on its own with JSON Schema, because it's quite easy to say what is required and what kind of types you are expecting.
One thing that I think was very, very important is that in this repository you have helpers and tools which can help you validate the streams that you are creating, or input streams. So when you are developing a plugin, for example for a test runner in Rust, you don't need a consumer to tell you, "Okay, it's valid." You have an independent format with an independent test suite which tells you, "Okay, this is valid; any consumer should accept it and understand it." There might be some edge cases in consumers, but at least the stream is valid.
So what data is missing? It's currently working for me.
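The ID round trip described above can be sketched as follows. This is only an illustration under stated assumptions: the field names (`id`, `name`, `outcome`) and the shape of the requests (`action`, `ids`) are hypothetical and are not taken from the LITF repository.

```python
import json

# Output stream as a consumer might receive it, one JSON message per test.
# The two IDs deliberately use different shapes: the consumer must not care.
OUTPUT_STREAM = [
    '{"id": "tests/test_math.py::test_add", "name": "test_add", "outcome": "passed"}',
    '{"id": "a91f3c", "name": "test_divide", "outcome": "failed"}',
]


def failed_test_ids(lines):
    """Return the IDs of failed tests, treating each ID as an opaque string."""
    ids = []
    for line in lines:
        message = json.loads(line)
        if message["outcome"] == "failed":
            ids.append(message["id"])
    return ids


def collect_request():
    """Hypothetical input message: list the tests without running them."""
    return json.dumps({"action": "collect"})


def run_request(ids):
    """Hypothetical input message: run only the tests with these IDs."""
    return json.dumps({"action": "run", "ids": ids})


# Relaunch failing tests only: hand the IDs back exactly as they were received.
request = run_request(failed_test_ids(OUTPUT_STREAM))
print(request)
```

The consumer never parses an ID, so each test runner stays free to generate it however it likes.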
I'm using Balto with Python in my day-to-day job. The two main things I see missing right now are a version number, because hopefully the format will evolve, and binary data. As it's based on JSON, binary is kind of hard to send: if you want to send images, for instance, or any output which might not be valid Unicode, you have an issue. So I know what is missing, but it works for me; it works for the test runner I'm using.
I'm trying to create a plugin for Jest in JavaScript, but I realized that in Jest the logs are not grouped per test; they are for the whole test file. So you need to dispatch them, and you don't have the lines for a specific test, for example. So there are some limitations. I'm trying to get more languages to support it, to see what kind of assumptions I made about what a test runner has and can send me. Some test runners actually cannot give me what I expected: I expected to get a line and a file for every test, but apparently that's not the case. So I'm looking specifically for those languages. But if you want to add support for your own test runner in another language, I will be very happy to have it too and, more importantly, to get your feedback on the format itself, to be able to support everything.
One thing which is also on the to-do list: currently I'm launching subprocesses and reading stdout, but that's only a stream. So I could also load a stream from an SSH connection, or from a Docker container, remotely. I had it working in the past; I have a bug right now that I need to fix, but it's working.
It's working in Python, so thanks to the new capabilities in Python 3, it's asynchronous. That also means that if you have a project with a back end and a front end written in two different languages, which is a good possibility, you could run both of your test suites in a single tool, in parallel. That would be awesome. I don't have the case myself.
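As a minimal sketch of that idea, and not Balto's actual code, the standard `asyncio` subprocess API can read line-delimited JSON from several child processes at once. The fake runner script, the suite names and the message fields below are all made up for the example.

```python
import asyncio
import json
import sys

# Stand-in for a test runner plugin: a child process that prints
# one JSON message per test on stdout (hypothetical fields).
FAKE_RUNNER = (
    "import json, sys\n"
    "for name in sys.argv[2:]:\n"
    "    message = {'suite': sys.argv[1], 'name': name, 'status': 'passed'}\n"
    "    print(json.dumps(message), flush=True)\n"
)


async def read_messages(suite, test_names, results):
    """Launch one fake runner and consume its stdout line by line."""
    process = await asyncio.create_subprocess_exec(
        sys.executable, "-c", FAKE_RUNNER, suite, *test_names,
        stdout=asyncio.subprocess.PIPE,
    )
    # Each message is handled as soon as the child writes it,
    # without waiting for the child to exit.
    async for raw_line in process.stdout:
        results.append(json.loads(raw_line))
    await process.wait()


async def run_all():
    """Run a 'backend' and a 'frontend' suite concurrently in one event loop."""
    results = []
    await asyncio.gather(
        read_messages("backend", ["test_api", "test_db"], results),
        read_messages("frontend", ["test_button"], results),
    )
    return results


if __name__ == "__main__":
    for message in asyncio.run(run_all()):
        print(message)
```

Since the consumer only depends on a stream of lines, the same reading loop could in principle be fed from something other than a local subprocess.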
I'm only doing Python, but that's definitely possible.
So the architecture, just as a reminder: Balto speaks only to LITF-compatible plugins or processes. If any test runner implements LITF directly, there will be no need for a plugin, but until that's the case a plugin will do, and it may actually be helpful for capturing all the stdout and stderr, and for being sure that if the test runner fails we get something to catch it and send something back to Balto.
So, in conclusion: LITF is a new protocol, I think, because it's both input and output, similar to HTTP, or to LSP if you know about it. LSP is the Language Server Protocol. It's a new protocol pushed by Microsoft so that your IDE can talk to any LSP-compatible server. So you have one LSP server for Python which can be used from Emacs, maybe not nano, VS Code, Sublime Text. It's exactly like LITF: it's cutting down the compatibility matrix to one format, so everything that talks LSP, or LITF, will be able to talk to each other.
I hope that it will be a foundation that can be used for building more tools around testing. For example, if you want a custom tool to run only new tests, you can, and if it speaks LITF it will be compatible with any LITF test runner. And if you want a detailed HTML report about your test timing, you also can, and again, if it speaks LITF it will be compatible with all LITF test runners.
If you want to test Balto, the installation line starts with pipx, because packaging in Python is hard for now. I hope that in the future I will have a better solution to avoid installing something else. Then, once you have Balto on your system, install the LITF plugin for your test runner, and then you can enjoy.
So how can you help? Any feedback about LITF is good feedback. It's young, so nothing is set in stone yet. You can create new LITF producers, any plugin for whatever test tool you are using or designing. You can create new LITF consumers.
I will be very happy to get both, an ecosystem on both ends of the format. And use Balto, or speak about it. Of course, both Balto and LITF are open source on my GitHub account, so feel free to open issues or, best of all, send pull requests. And that's it for me. I have time for questions. Thank you.
Yeah. Yeah. Yeah, so in the input format... yeah, sorry. So the question was: in my presentation I talked both about an input format and an output format. When I speak about the input format, the first example is the input that Balto is sending to the plugin, which is just "collect only". The test runner itself already has all the code to just collect the tests without running them, so it's sending back a specific message for test collection, without the status, without the timing, stuff like that. Does that answer your question?
The question was also: how do you discover the tests in the first place? There is a config file for Balto at the root of the test directory. So I'm just detecting the config file, and then I'm pointing the test runner to that directory.
So, if I understood the question correctly, it is that test runners can actually have colors in the data they send back to whatever consumer, and do I plan to support it in Balto? Yes, sure. We need to find a way to encode it and to be able to transmit it, but yeah, definitely.
Definitely. So the question was: when you are testing against a database, for example, you might have your tests working with one version of the database and not another, or different databases might have different failures. So do I plan to support that? Yes, I plan to support that in Balto, for now likely with both a matrix definition and Docker Compose support, which can bring up your database automatically and then shut it down. But I'm not sure how it will involve the LITF format, apart from sending information about the environment and which test matrix combination you are testing.
Yeah, so the question was: I said that subunit is a binary format while LITF is a text format, and he was asking me what the differences are. Binary versus text: the fact that LITF is a text format is mostly just for ease of writing and reading, because JSON is easy to write and easy to read, and you have libraries and everything. Subunit is more designed to handle higher loads, and it has capabilities to filter streams, split streams and merge streams. So if we have to support binary output in LITF, for example, we will likely have to move to a binary format anyway, but for now it's text because it is easier.
Okay. Yeah. Okay. The question was: how do I store test results? For now, I don't. Balto is only a tool for the desktop. But with LITF, I hope that we can get CI systems which are much smarter and which will have to store that. As for now, I don't.