Welcome back, and now we're going to continue with Marc. Hello, Marc. Hi, Santiago, how are you? Pretty good, how are you? Good. Where are you streaming from? I'm currently based in Hamburg, Germany, on the fourth floor of my apartment, so there's a nice view out into the green. With a nice standing desk? Yes, exactly. Okay, so you're going to talk about how to speed up the deep learning development life cycle for cancer diagnostics. If you're ready, we can start. Yeah, sure, I'm ready. Okay, good luck. Thanks.

Alright. Hi, everyone. My name is Marc. I have a background in computational neuroscience, and currently I'm the CTO at a company called Mindpeak. My talk is about speeding up the deep learning development life cycle for cancer diagnostics. What do I mean by that? I want to tell you a little bit about how we handle things at Mindpeak, what we've learned over the years, and how we manage to run deep learning experiments quickly.

First, a little bit about Mindpeak. Our mission is to increase cancer diagnostic accuracy and make it accessible to everyone in need. That's quite an ambitious goal. So, pathology: what is it about? You might think pathology is about dead people, but in fact it's mostly about living people. Pathologists are crucial in the chain of cancer treatment. If someone goes to the doctor and tissue is taken from, say, the breast or the lung, it's sent to a pathology lab, where they prepare that tissue, the biopsy you see here. The pathologist looks at it under the microscope and, with their years of experience, makes the diagnosis. That diagnosis is sent back to the clinician who sent in the biopsy in the first place, and the clinician then decides on the treatment. So the diagnosis of the pathologist is really what determines the treatment in the end. It's super important.

Now here's the problem: there is rising demand for pathology diagnostics, but the number of pathologists stays roughly the same. More and more diagnoses need to be made by more or less the same number of people, and that's a problem, because I'm sure none of you wants to be in the position of needing a cancer diagnosis and having to wait for months. That's where we want to step in and help reduce that gap.

This is how we envision the cancer diagnostics flow of tomorrow. Instead of pathologists looking through the microscope, digital scanners scan these tissue samples. Immediately after scanning, an AI can run a pre-analysis and make a diagnosis, so when the pathologist opens the case, they can check on the screen what the AI did, take a look, and say "yes, I agree" or "I don't agree so much". If there is disagreement, we can even use that to improve the model further.

In total, you can say that Mindpeak builds automation tools for visual diagnosis in pathology. That means we always work on images taken from body samples, and we apply state-of-the-art deep learning tools to them. We don't want to replace the pathologist; we build tools that we give to them so they can make more reliable, more reproducible, or faster diagnoses. Here's an excerpt of our partners: we have great partners, many nice labs, industry partners, and so forth. We have a fantastic team of very cool people.
We're here in Hamburg, Germany. This is us on a nice boat trip we did some time ago, and some of our advisors and investors are on the slide.

One example application we developed is cancer cell detection for immunohistochemistry. Here you see an image of what is called a Ki-67 staining. It's a staining that marks cells brownish if they're in the process of proliferation, that is, if they're dividing, which means the tumor is growing. This is one key tool used by pathologists to determine cancer growth. The task of the pathologist here is to count the number of tumor cells and then look at the ratio of positive tumor cells to negative tumor cells, the ones that are activated by this marker versus the ones that are not. This is of course a manual, tedious task, where you need to count each individual cell. What I've plotted here is the result of our AI: you can see little dots marking the cells it found, yellow dots for negative tumor cells and red dots for positive tumor cells. This way the pathologist gets a much faster result and doesn't have to do the manual counting.

Now, this might look easy to you; I mean, it's easy to detect some brown, right? But in fact the variation across labs is really huge. There are many different hues here. This image, for example, was taken with a microscope camera, and all of these images are Ki-67 stainings. So you can see it's not an easy problem we're dealing with. This happens because each lab uses slightly different chemicals and works at different temperatures, and that's how these different appearances come about. Our model needs to be able to deal with all of this variability. That's also one big thing that differentiates us from some competitors: we try to develop one deep learning model that covers all this variability, one model for all labs, while others build one model per lab. We think it's better to have a more general model, because then you're not so prone to overfitting and you will generalize better. If you just fine-tune to one specific lab, there's always the risk that you overfit a bit.

Now to the focus of the talk; that was a bit of background on Mindpeak. I want to talk about the deep learning training cycle. Usually you start off with some idea, then you collect some data and have it annotated, and then you move on to the implementation phase: you write some code, you develop your models and your losses. Then you go ahead and train the model, usually on a GPU, and evaluate it on a test set. If your results on the test set look promising, you might deploy it, and then you need to monitor whether it's actually doing what you think it's doing in production. Often the cycle then continues: you collect data again from the things you've seen in production, you annotate, maybe you change some things. The question today is: you always need to run these experiments, so how can you make that as fast as possible? The focus is on this end here: how do I get from an idea
to an implementation to a fully conclusive result, so that I can tell whether it's actually improving on my previous model? So let's dive in. The goal is to test ideas as quickly as possible.

We'll start with the idea stage. In the idea stage you need to generate ideas, and you can either do that without data, when you start fully from scratch, or you can do it data-driven. You should also think about how to annotate efficiently, and you should define some metrics: how do you measure that you've reached the targets you want to reach?

Some say brainstorming is fantastic and that some of the best ideas are found through it, and I agree it's good, but I think it matters how you apply it. What I don't think is so good is if you just throw ideas into the room with no limits and write them down, because that often leads to what's called groupthink, where you don't generate many distinct ideas; rather, they're all very similar. What we usually do instead is that every person on the team sits down by themselves, takes maybe 20 minutes to generate their own ideas, and thinks about the problem: how would I tackle this? Then the team gets together and everyone has one to three minutes to present their idea, on one slide maybe. After that, you look at the similarities, see how ideas overlap, organize and group them together, and have a short time-boxed discussion. Then it's crucial that you vote: every member gets a few voting points and votes on the best ideas. If the points are limited, you really have to decide; otherwise you often end up with many cool ideas and it's not clear what you should work on. You can also use something called the business model canvas to evaluate ideas further in a business context, but that's usually more on the business end, not the engineering part.

Alright, what if you have data? Well, if you already have data and a first trained model, then you should use your data to drive your ideas. That's much more telling than just thinking up ideas out of the blue. What you can do is take a subset of your validation set, look at the errors your model makes on it, and categorize those errors. Then you can see where most errors are happening, and that is probably the category you should focus on, because that's where you can get the biggest improvement.

Let's take an example. Remember our problem: we need to detect these cells and classify them. Now I might say: I'm worried we have a lot of stroma/immune misclassification (these are different types of cells, different classes). If I just act on this intuition, I could try to build something to fix that problem. But instead, I sit down, make a small list, go through 100 images, and mark with an X each time this problem occurs on an image. I do the same for other categories, noting other errors along the way. If I then come up with the counts shown here, I see that stroma/immune misclassification only happened in 4 out of 100 images, while something like scan artifacts happened in maybe 35. So my time is probably best spent focusing there rather than on the original problem. Makes sense, right?
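As a tiny sketch of this kind of error bookkeeping (the category names here are hypothetical, just for illustration):

```python
from collections import Counter

# Hypothetical error categories noted while reviewing validation images.
error_counts = Counter()

# One entry per reviewed image: the set of error categories observed on it.
review_notes = [
    {"scan_artifact"},
    {"stroma_immune_confusion", "scan_artifact"},
    set(),  # an image with no errors
    # ... one entry for each of the ~100 reviewed images
]

for categories in review_notes:
    error_counts.update(categories)

# The highest-count categories are the most promising ones to work on.
for category, count in error_counts.most_common():
    print(f"{category}: {count}")
```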
Once you have your idea, you often need to collect data, so you should think about how to do that efficiently. In our case, you could let pathologists mark the segments of each cell, that is, draw the borders of each cell. That's one possibility. But instead you could also just have them mark the center of each cell with a dot; that way we still know where the cells are, because for our problem it's important to count the cells, not to find their exact segmentations. That way is much more efficient, and you get your data faster.

Now let's quickly talk about metrics. Metrics define your goals, and often you have several goals and several different metrics, especially if you work on real-life problems, and in pathology in particular. Still, I would recommend trying to narrow it down to a single metric that combines everything, because without that, there's always the question of how you compare. You have one model and another model: how do you tell which one is better? Maybe one is better on one metric and the other is better on another, and then it's really hard for the team to judge when a model is considered better. You might even disagree: one person says this metric is more important, another says no, that one is. So, up front, decide on a clean single metric.

How can you do that? For example, you can build aggregates: if you have precision/recall problems, you can aggregate them as the F1 score, which is the harmonic mean of precision and recall. You could take a weighted average of different submetrics: you decide on weights reflecting the importance of each metric and compute the weighted average. If you think some metrics are equally important, the min operator could be something to look into: that way, if one metric gets much better but the other doesn't improve or even suffers, the combined score doesn't improve, which can be beneficial under some constraints. Once you have the single metric, you use it to validate your ideas, because you always use it to check whether your idea helped relative to the baseline. The submetrics can still help you figure out where to improve: you see that, say, the classification part is not so good, and then you look at your validation set for errors there and generate the next ideas.

Here's an example from Mindpeak. We have cell detection as one task: we need to find where the cells are, and for that we use precision and recall, combined as an F1 score. Then we also need to classify the cells: we feed the detected cells and the original image to a classifier, which determines the class of each cell, and there we potentially have many classes. So we again use precision and recall, but for each class separately, and then take a weighted combination of these per-class F1 scores as the overall classification F1 score. Now you might say: that's still two metrics. That's true, so the target metric in the end is a combination of both, weighted or something like that, and that's the goal you measure towards.
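A minimal sketch of such a combination, with made-up weights and class names (not Mindpeak's actual configuration):

```python
# Combine detection and per-class classification F1 scores into one
# target metric, along the lines described above.

def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

def combined_metric(detection_f1: float,
                    per_class_f1: dict[str, float],
                    class_weights: dict[str, float],
                    detection_weight: float = 0.5) -> float:
    # Weighted average of the per-class classification F1 scores.
    total_weight = sum(class_weights.values())
    classification_f1 = sum(
        class_weights[c] * per_class_f1[c] for c in per_class_f1
    ) / total_weight
    # Weighted combination of detection and classification into one number.
    return (detection_weight * detection_f1
            + (1 - detection_weight) * classification_f1)

score = combined_metric(
    detection_f1=f1(precision=0.92, recall=0.88),
    per_class_f1={"tumor_pos": 0.85, "tumor_neg": 0.80, "stroma": 0.70},
    class_weights={"tumor_pos": 2.0, "tumor_neg": 2.0, "stroma": 1.0},
)
print(f"target metric: {score:.3f}")
```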
Okay, that's it for the idea stage. To recap: you should think about efficient annotations; you should have a single metric that guides your experimentation; you should use your data to drive your ideas; and you should track the errors your current model makes by category, so you can figure out where to improve, in the high-error categories.

Let's move on to the second phase, the implementation. Now I have my nice idea and I want to build it out in code. For the implementation stage, the thing that's very important to me is code quality. If you have high code quality, you can usually iterate fast: you look at your code and you know exactly where you need to change something to try out a new idea. If your code is messy and you need to touch a lot of different moving parts, it's much more complicated, you'll have a much harder time, it will take longer, and you'll introduce more errors. My rule of thumb is: whenever you touch code, think about whether you can make it a little bit better with some small refactoring. Done that way, it feels effortless, just a tiny improvement every time. And of course use things like typing and linters, the typical software engineering quality tools.

Furthermore, use automation: take advantage of CI pipelines. I'll talk more about that soon. Then, use profiling: you want to make sure your code runs as fast as possible, and in deep learning it's particularly important to figure out the interplay of CPU and GPU usage. For example, we had it at some point that the calculation of validation metrics took a lot of CPU resources, so the GPU idled around in some phases, and that really wastes time; you want to make sure your GPU is always maxed out. The NVIDIA Nsight profiler can give you a nice graphical visualization of your code running, and you can see exactly when the CPU and the GPU were used and when data was transferred back and forth, because that's another potential pitfall. Lastly, in the implementation stage, reproducibility is super important. You want to know, whenever you run something, that you can reproduce it and that you know exactly what data was used.

Now, some examples for code quality, since, as I said, I think it's very important. One thing I really like is comments as code. Say you have this loss function where you do something, namely filter out the background targets, and someone wrote that as a comment. Now, this is simple code; I just made it up as an example. In real life you'll probably have much more complicated code, and maybe a longer comment explaining what happens. What I like to do instead is not write the comment at all and have a small, well-named function that expresses it. Why is this better? First, it's in code, so if someone changes or refactors the code, this will change with it, because they have to consider it, whereas comments often stay around and become outdated at some point. Second, it's easier to read: I have less code, I just read "filter out the background targets", I know exactly what happens, and I can look deeper if I want to, but maybe I don't even need to. That helps me to be faster, and that's our goal here: we want to iterate fast.
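Here's a minimal sketch of that idea with hypothetical names (the slide's actual code isn't reproduced here); the point is that the comment becomes a function name:

```python
import torch
import torch.nn.functional as F

BACKGROUND_CLASS = 0  # hypothetical label id for background

# Before: the intent lives in a comment.
def loss_with_comment(logits: torch.Tensor, targets: torch.Tensor) -> torch.Tensor:
    # filter out the background targets
    mask = targets != BACKGROUND_CLASS
    return F.cross_entropy(logits[mask], targets[mask])

# After: the intent lives in the code itself.
def filter_out_background_targets(logits, targets):
    mask = targets != BACKGROUND_CLASS
    return logits[mask], targets[mask]

def loss(logits: torch.Tensor, targets: torch.Tensor) -> torch.Tensor:
    logits, targets = filter_out_background_targets(logits, targets)
    return F.cross_entropy(logits, targets)
```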
A second side note on code quality: a library I really like is the einops library. It's fantastic, because in deep learning you often need to shift dimensions around. For example, here we have a prediction that consists of a batch, a channel, a height, and a width dimension, and there's some permutation happening. You need to figure out from the indices what exactly has been done: it seems the channel dimension is moved to the back, then you see some contiguous view with some multiplications, so you can work out what it does, but it's not super intuitive. In contrast, with einops there is rearrange, a function where you can say: I want my prediction, consisting of batch, channel, height, and width, to be reshaped as a combination of batch, height, and width (that's why they're in the bracket) with the channel at the end. Super easy to read, you know exactly what happens, you don't need to puzzle over it. What's also cool: you can add dimension sizes, so you can say channel equals three, and that enforces that the channel dimension is always three in this code; otherwise it fails with an error. (There's a runnable sketch of this at the end of this section.) I regularly blog about such things, so if you want, check out my blog at paepper.com/blog.

Then, CI pipelines. I really want to take advantage of automation as much as possible. Of course, for automatic checks: code formatting, PEP 8, linting, and so on. Automated unit tests are not yet as popular in deep learning as in software engineering, but I think that is taking shape more and more, and they're super important. Of course, in deep learning you cannot test every single thing; training is too dynamic for that. But what you can test for sure are your metrics, your losses, and all the data loading and transformation aspects, because those are essential. If anything goes wrong there, you are just totally lost later on: if your loss miscalculates something, you might waste a lot of GPU power in training and have to repeat everything. So that is really essential. We also have automated Docker builds for our Docker images, so you can easily share and reuse them in other environments. And one thing that can also be nice is automatic deployment of a demo or dev model, so you can have visualizations and share them between team members. Bugs caught early are the best, you know; they save you time. If you see that you just committed some code and the pipeline fails, you know it, you fix it, and everything is solved. Here's an example; we use GitLab CI for that, but of course you can use GitHub or anything else, it doesn't matter.
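To make the einops comparison from above concrete, here's a minimal sketch (with a made-up tensor shape):

```python
import torch
from einops import rearrange

pred = torch.randn(8, 3, 64, 64)  # batch, channel, height, width

# Plain PyTorch: the intent hides in index gymnastics.
flat_torch = pred.permute(0, 2, 3, 1).contiguous().view(-1, 3)

# einops: the pattern states the intent, and c=3 additionally
# enforces that the channel dimension really is 3, failing otherwise.
flat_einops = rearrange(pred, "b c h w -> (b h w) c", c=3)

assert torch.equal(flat_torch, flat_einops)
```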
Okay, now let's say we have written nice code, we've optimized the performance, we've used the profiler and everything. Now we also want to make sure it's reproducible, and in deep learning this is a bit different from normal software engineering: you have code, but you also have data. One best practice is that you really track the data with your code. You need to know which experiment ran with which data; otherwise, how will you be able to compare? If you compare one experiment with another and they used different data, how can you tell whether the idea or the model helped, or whether it was the data? You can't, so you always need to compare on the same data.

What we use is called DVC, short for Data Version Control, and it takes care of tracking your data together with your code. You can keep all your big data (we have many large files) on a NAS or cloud storage, and inside git there are just small metafiles that are tracked, and they determine exactly the state of your data. It's very similar to git: you can do dvc pull, dvc checkout, dvc add, and so forth. That means if you check something out in git, you do a git checkout plus a dvc checkout, and you know your data is in the state it was in for that code at that point in time.

The other cool thing about DVC is pipelines. I made an example here: you can define a pipeline where you say, okay, my data generation script depends on my raw data, my data statistics script in turn depends on the data generation, and so forth. If you do it this way, you can have one single command that builds all your data, runs the training, runs the evaluation, everything in one pipeline, which is really nice (a sketch of such a pipeline file follows below). And if you change, for example, the run-training script, DVC will recognize that nothing changed earlier in the pipeline, so it only reruns from that stage onward and everything before is cached; whereas if you change your data, of course, the whole pipeline needs to be rerun. We also use it to track our metrics and TensorBoard logs, so you always have the most recent baseline at hand and know exactly what you're comparing against. That really helps: you don't always have to ask someone else how their run went or what the metrics were; you always have the best run checked in and ready to be compared against. So if you don't use something like this, really check it out.

To sum up the implementation stage: you should push for high code quality, making sure that whenever you touch code you do a small refactoring if applicable; you should use linting, typing, and tests, and run them through continuous integration; a profiler will help you avoid easy bottlenecks, as we saw with Nsight; and you have to achieve reproducibility and track your data, for which I recommend DVC.
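As a rough illustration, a DVC pipeline like the one described might be declared in a dvc.yaml along these lines (the stage names and scripts are hypothetical, not our actual setup):

```yaml
stages:
  generate_data:
    cmd: python generate_data.py
    deps:
      - generate_data.py
      - raw_data/
    outs:
      - data/
  data_statistics:
    cmd: python compute_statistics.py
    deps:
      - compute_statistics.py
      - data/
    outs:
      - stats.json
  train:
    cmd: python run_training.py
    deps:
      - run_training.py
      - data/
    outs:
      - model.pt
  evaluate:
    cmd: python evaluate.py
    deps:
      - evaluate.py
      - model.pt
    metrics:
      - metrics.json
```

Running `dvc repro` then executes the pipeline and re-runs only the stages whose dependencies changed; everything else comes from the cache.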
Let's move on to the next stage, training and evaluation. So you've set all of this up, and now the question is: let's train it. I have my idea, I implemented it, go for it. And now this happens: usually, you have to wait. Why do you have to wait? Well, you run the training on a GPU, and typically deep learning is very data-hungry, so you have a lot of data and your training takes a while. Of course, that's not so nice: if you've implemented your cool idea, you want to get feedback quickly, because you want to iterate fast, right? You want to see whether it works. Can I tweak it a little, or does it not work at all and should I move in a different direction? The longer you need to wait, the more hassle you have, because you're stuck and can't really decide how to improve from that state. So let's fix that. What can you do? You can, for example, use multi-GPU training to gain speedups; that's an obvious way, because you just parallelize the training, and I'll get to that in a minute. You can also reduce your dataset: look at how you can train on less data, because with less data, training is of course faster. And, as we defined in the idea stage, you should have your single metric as guidance here: it's what you evaluate in the end, and it should tell you whether your idea is working or not.

So let's take a look at multi-GPU training. For multi-GPU training in PyTorch, which is what we use at Mindpeak, there is this thing called data parallelization. Assume we have this bunch of training data; I chose 1,632 images here because that conveniently splits into 102 batches of 16 images, and let's assume we can run with a batch size of 16 on one GPU. That means we run batch 0, then batch 1, then batch 2, and so forth, every batch sequentially. To make that parallel, you can say: okay, let's use three GPUs instead. The idea is that GPU 0 runs batch 0, GPU 1 runs batch 1, and GPU 2 runs batch 2, all at the same time, while the single GPU would only have finished batch 0. So in theory you should get something like a 3x speedup, right? Well, not quite. There is communication overhead: you need to organize this somehow, and you also need to make sure you end up with just one trained model, so you have to take care that the model stays synchronized.

Let's see how PyTorch's DataParallel solves this. In DataParallel, one GPU acts as the master GPU, and it synchronizes with the others. In this case we have three GPUs: two workers and one master. The master sends over the batch data and the model weights, to make sure every GPU runs with the same model weights. Then each GPU runs independently on its own batch: GPU 0 on batch 0, GPU 1 on batch 1, GPU 2 on batch 2. So far, so good. Now they send their outputs back to the master, the master computes the loss, then the master scatters the loss back to the workers, and they compute the gradients on each GPU. Finally, they share the gradients with the master again; the master puts all the gradients together and makes the model update. At this point the models are out of sync: the master has a different model than the other GPUs. That's why, back in the first stage, the master sends the model weights, so they are synchronized again, and then the cycle repeats and everything is good. But as you can see, there's a lot of communication here, at several points in the cycle. So, as you can imagine, this will not be a 3x speedup; it's more like 2x or 2.2x, something like that.
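Still, DataParallel is attractive because it's nearly a one-line change, which is why it comes up again at the end of the talk. A minimal sketch with a placeholder model:

```python
import torch
import torch.nn as nn

# Placeholder model; any nn.Module works the same way.
model = nn.Sequential(nn.Linear(512, 256), nn.ReLU(), nn.Linear(256, 2))

if torch.cuda.device_count() > 1:
    # Single process: the master GPU scatters input chunks and model
    # replicas, gathers outputs, and keeps the replicas in sync.
    model = nn.DataParallel(model)
model = model.to("cuda")

x = torch.randn(16, 512, device="cuda")
out = model(x)  # the batch is split across the available GPUs
```

That convenience is exactly where the extra communication hides.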
Of course, we don't like that overhead, and we can do better. That's what PyTorch's DistributedDataParallel (DDP) can do for us. In distributed data parallel, you load and process batches fully separately on each GPU. I'll talk in a minute about how exactly that works, because you might wonder: before, the master organized this, so how do the GPUs take care of it now? But let's assume for now that we've somehow split up the data so that the same batch isn't run on separate GPUs (that would be bad); we have different batches here. Now each GPU computes the loss and the gradients separately, totally independently of the others. Then comes the only communication stage in distributed data parallel: a gradient all-reduce. The gradients are shared between all GPUs, so each GPU has all the gradients and can compute the average gradient, and then each GPU also updates the model independently. They stay synchronized, because they started with the same model and applied the same gradients, so after the update they all end up with the same model again. That way there is much less communication, and you get close to the 3x speedup you want.

Now, what did I mean by "you need to take care of the data splitting"? To do this, you use a special sampler called a distributed sampler. PyTorch ships an implementation that you can use if you have a default setup; we needed to adjust it a bit, so you inherit from it and adapt it to your problem, your data loading, and all that. Basically, it makes sure each GPU knows which data to look at in each batch and each epoch, so they don't train on the same samples, which would not be nice, because you want them training on different data.

Another thing to mention is the nice paper by Goyal et al. (2017): they looked at how to train large models with large batch sizes across many GPUs and trained ImageNet in one hour. I would really recommend this paper if you do multi-GPU training, because you should also take care of learning rate warm-up and learning rate scaling. I won't go into the details now, but it's worth checking out.
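Putting the DDP pieces together, a minimal sketch might look like this (toy model and data, everything here hypothetical), launched with torchrun so that each GPU gets its own process:

```python
import os

import torch
import torch.distributed as dist
import torch.nn as nn
from torch.nn.parallel import DistributedDataParallel as DDP
from torch.utils.data import DataLoader, TensorDataset
from torch.utils.data.distributed import DistributedSampler

def main():
    # torchrun sets RANK, LOCAL_RANK and WORLD_SIZE for every process.
    dist.init_process_group(backend="nccl")
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    # Toy data: 1,632 samples, as in the batch example above.
    dataset = TensorDataset(torch.randn(1632, 512),
                            torch.randint(0, 2, (1632,)))
    # The DistributedSampler hands each process a disjoint slice of the data.
    sampler = DistributedSampler(dataset)
    loader = DataLoader(dataset, batch_size=16, sampler=sampler)

    model = DDP(nn.Linear(512, 2).cuda(local_rank), device_ids=[local_rank])
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
    loss_fn = nn.CrossEntropyLoss()

    for epoch in range(2):
        sampler.set_epoch(epoch)  # reshuffle the split each epoch
        for x, y in loader:
            x, y = x.cuda(local_rank), y.cuda(local_rank)
            loss = loss_fn(model(x), y)
            optimizer.zero_grad()
            loss.backward()   # gradients are all-reduced across GPUs here
            optimizer.step()  # every process applies the same update
    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```

You would start it with something like `torchrun --nproc_per_node=3 train.py`, one process per GPU.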
If you have already optimized everything and you have multi-GPU training, but it still takes too long, you might consider dataset reduction techniques. What do I mean by that? One option is to shrink your own data: take your large dataset and make it smaller, a representative small set, so instead of training on 1 million images you only train on 100,000. But "representative" is not so easy to define. You could take a random sample, but in our use case, for example, there are many different variables, like labs, scanners, stainings, and all that, so with a random sample it's easy to end up with a distribution that is not really representative of your problem, and you risk losing information. Still, it's easy to do, so if it's possible for your problem, just go ahead: you can run fast experiments on the small set, and when something looks promising, run it on the large set.

Another possibility is to create a toy dataset. That's more work, of course, because you need to build it programmatically, but it's highly customizable: you can parameterize it so you can define exactly the setup you want. With a toy dataset there's always the risk that you don't know how well it translates to your real data, but it's very good for idea prototyping (a tiny illustrative sketch follows at the end of this section). For example, this is a toy dataset we created for the cell detection problem. It's super simplified and doesn't translate to real data at all, but it's a good proxy for experiments. We used it, for instance, to experiment with label noise: in pathology, pathologists don't always agree on the exact single cells, so there is definitely some label noise in the data. We tried out many different approaches to dealing with that, and it really helps to run a training in one hour instead of one or two days. That was really helpful in finding the right solutions.

Alright, the summary for the training and evaluation stage. You should take care of multi-GPU training. You want fast training, because if you need to wait a week or even two for your results, then by the time you get them you've probably almost forgotten about the experiment, and it's super hard to keep track when you run many different experiments and the results come in very late; it's a pain. Initially, a few years ago, we only had single-GPU training and sometimes needed to wait a week for results, and that was really painful. Then we implemented multi-GPU training, and now we can train in a day or two. That's much more convenient: you implement something, let it run overnight, and the next day you already see whether the trend looks promising and whether your idea makes sense to look into further. As I said, you can also reduce your dataset to a small representative subset, or build a toy dataset to simulate your data problem, to make sure you can iterate as fast as possible. And just to reiterate: the single metric defined in the idea stage should be your guide; it determines whether your idea is a winner. If it's better on that metric, your idea was hopefully good; if not, you probably need to look into why that's the case.
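For flavor, a toy "cell image" generator along the lines described might look like this; it's purely illustrative, not Mindpeak's actual generator:

```python
import numpy as np

def make_toy_cell_image(n_cells=20, size=128, label_noise=0.1, rng=None):
    """Toy 'cell detection' sample: bright blobs on a gray canvas.

    label_noise is the fraction of cells whose class label gets flipped,
    which lets you experiment with noisy-annotation strategies cheaply.
    """
    if rng is None:
        rng = np.random.default_rng()
    image = np.full((size, size), 0.5, dtype=np.float32)
    centers = rng.integers(5, size - 5, size=(n_cells, 2))
    classes = rng.integers(0, 2, size=n_cells)  # 0 = negative, 1 = positive

    # Draw each cell as a small Gaussian blob; positives are brighter.
    yy, xx = np.mgrid[0:size, 0:size]
    for (cy, cx), cls in zip(centers, classes):
        blob = np.exp(-((yy - cy) ** 2 + (xx - cx) ** 2) / (2 * 2.0 ** 2))
        image += (0.3 if cls == 0 else 0.5) * blob

    # Flip a controllable fraction of the labels to simulate annotator noise.
    noisy = classes.copy()
    flip = rng.random(n_cells) < label_noise
    noisy[flip] = 1 - noisy[flip]
    return image.clip(0, 1), centers, noisy

image, centers, labels = make_toy_cell_image(label_noise=0.2)
```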
So now we have everything, right? We have our idea, we implemented it, we trained and evaluated it. And what usually happens is this: it doesn't really improve over the baseline. Of course, that's sometimes a bit frustrating or disappointing, but in machine learning it's normal; you need to iterate a lot, and it's normal that ideas don't work out. The good thing is, if you've applied all these tricks and tweaks and you have a fast iteration cycle, it's really not so bad, because you know you can generate the next idea, and you know you didn't waste months of work; you just spent maybe a few hours or a day trying it out, and that really helps a lot. Otherwise, I think, it's much harder to move towards a really good deep learning model.

The important thing for me is that you always learn, also from the failed experiments; they're very beneficial. You should try to ask why it didn't work. Often you take a research paper, implement it, let it run, and you don't get the results the paper promised. Then ask yourself: why did it not work? What's different here compared to the paper? Maybe my problem is a bit different, and maybe this doesn't really apply. What can I learn about my data, about my problem, from this experiment? Maybe I can even generate new ideas, something that will help me in the future.

What we often do is write a diary entry about our experiments, no matter whether they failed or worked out. That's a very good method, I think. We use Confluence pages for that, but of course you can use anything you like. You just note: what was my idea, how did I implement it, what were the results, what other learnings were there, did it improve over the baseline or not? If yes, nice. If not, why not? Why did it fail, what did you learn, and what could be the next steps? That helps not only yourself, because often, a few months later, you have a similar problem and you've forgotten exactly how it went; then you can look it up and think, ah, this is exactly how I did it, okay, makes sense, maybe I'll try this way or that way. It's also much easier for team members: if someone asks you about something you did before, you just point them to the diary entry and they know exactly what was going on. So that's awesome.

Yeah, to sum up my talk. What can you do to improve the deep learning development cycle? First, use data-driven idea generation; that helps you figure out the categories where you can improve the most, and improving a lot is, you know, what you want to do in deep learning. Second, have a single evaluation metric to target, so you don't get confused: it's easy to communicate in the team and it gives you a direct comparison between ideas. Third, strive to automate everything: that means CI pipelines with automated tests, especially for the really important things like losses and data processing. And track your data with your code, using something like DVC; there are probably plenty of other tools you could use as well, but we use this one and it's really nice.

Then, take advantage of high code quality. I gave some examples: comments as code, and the einops library. But the most important thing is: if you touch code and it doesn't look good to you, do a refactoring and make it a bit better. That way you help out your whole team, you strive towards much higher-quality code, you can iterate faster, and it's more fun to work with. Also, take advantage of multi-GPU training, and if you do, don't waste your time on DataParallel; use DistributedDataParallel. Even the PyTorch documentation nowadays recommends that. I mean, there is some merit to plain DataParallel: it's super easy, you don't need to set up anything, there's no distributed sampler or anything like that, so if you just want low-effort multi-GPU training, you can use it; but if speed is important, really consider DDP. Lastly, take advantage of learning opportunities: regardless of whether your experiments worked or not, always think about what they can teach you about your problem, and generate new ideas from there. If you do all of this, you can iterate quickly and have fun doing it. That's it from my side; I'm looking forward to your questions. Thanks.

Hey, Marc, thank you so much. Very interesting. We have some questions for you. Sure. Let's see if we can cover all of them. The third one says: how do you validate a deep learning model as a diagnostic lab technique? So, yeah, it's a complicated procedure, actually. What we do is obtain a CE mark,
which is a conformity mark of the European Union. To get it, you need to cover a lot of things. You need to have quality control systems in place, and you usually need a study: you have a couple of pathologists do the analysis with and without the AI, and you show that you get at least similar results. If you've done a nice study like that, you write all of it up in a lot of nice regulatory documents (it's a lot of fun), and with that you go to the regulatory body and say: look, this is our medical device description, this is what we did, here are the study results, and so on. Then they give you the CE mark, and you can sell the product as a diagnostic technique. Besides that, because getting this label is a longer process and a lot of effort, what you can do in the meantime is use lab-developed tests: the labs themselves can validate the method internally and then use it in diagnostics without the label. But of course, that's more work for the lab, and it depends on the customer; some of them are fine with it, but mostly the larger labs want to see a label.

Okay, the next one: how would you define a good amount of time to spend on hyperparameter optimization? Yeah, that's a good question. I think it really depends on your hyperparameters. What we usually do is try out some small ranges, and if you see that a parameter has a large effect on the overall results, then of course it might make sense to spend more time there. But if you see that the impact is small, well, there's only so much you can do; this costs a lot of compute, and you usually have many hyperparameters. If you do want to fine-tune all of them, I recommend random search over hyperparameters instead of grid search, because with random search you get better coverage of the parameter space. But yeah, it's definitely not easy to give a concrete answer for a "good amount"; it's always a trade-off between compute, the time it takes you, and the time you have available.

Thank you. And I think this is the last one we can read: are you using a traditional CI, like GitLab CI, for all the machine learning integration and delivery stages, or are you also using, or thinking about using, more specialized platforms like Kubeflow? Currently we're well covered with GitLab CI. That said, as I've shown, we use DVC for the more machine-learning-aligned pipelines, for code dependencies and steps that run one after another; for everything else we use GitLab CI. It might be worth thinking about something more specialized in the future, but so far this combination really serves us well.

Okay, there is one more question, but the discussion can continue in the breakout room. Thank you so much, Marc. Thanks a lot. Thank you. Bye. Bye.