We are going to talk to you today about how you can do less work, or rather, how your Argo Workflows can do less work. My name is Julie Vogelman. I'm an Argo Workflows maintainer, and I'm also a staff software engineer at Intuit, which is the company that founded Argo, and Argo Workflows specifically.

Hi, I'm Alan Clucas. I'm an engineer at Pipekit, and I'm also an Argo Workflows maintainer.

What we're going to try and achieve with this talk is saving some time when we're running workflows, and by saving time, we're going to save cost by running fewer pods. The way we're going to do that is by not repeating work we've already done: if we have already calculated something, we know the answer, so we can reuse it. If you're familiar with workflows, you can do this with individual steps or tasks in your DAGs, using the result from one workflow run to skip that step in another workflow run. You can also do this for entire templates, so a whole DAG can be skipped. Whether it's a single step that takes a bunch of inputs and creates an output, or multiple steps, or a whole DAG, you can use the techniques we're going to talk about here.

To go over a bit of terminology: workflows and steps can take inputs. A workflow can take parameters as inputs, and individual steps can have inputs and outputs. Those inputs and outputs can be parameters, which are strings, so you can pass information between the various steps in your workflow as strings; or they can be artifacts, which are files stored in blob storage, normally S3 or something like it, such as MinIO. Hopefully all of this is a bit familiar if you've used workflows before.

The thing we're going to introduce to you today is memoization, which is a very weird word. Memoization is a feature of Argo Workflows.
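To make that terminology concrete, here is a minimal sketch of a template with a string parameter as input and both a parameter and an artifact as outputs. The names, paths, and the container command are illustrative, not taken from the talk:

```yaml
# Illustrative template: one input parameter, one output parameter,
# and one output artifact written to the configured blob storage.
- name: sequence-dna
  inputs:
    parameters:
      - name: person              # a plain string input
  outputs:
    parameters:
      - name: sequence-id         # a string output, read from a file the container wrote
        valueFrom:
          path: /tmp/sequence-id.txt
    artifacts:
      - name: sequence            # a file output, stored in S3/MinIO by the artifact repository
        path: /tmp/sequence.fasta
  container:
    image: alpine:3.19
    command: [sh, -c]
    args: ["echo seq-{{inputs.parameters.person}} > /tmp/sequence-id.txt && echo ACGT > /tmp/sequence.fasta"]
```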
So Argo Workflows has code that will help you with skipping steps, and the magic word is memoization, if you need to search for it. Work avoidance is a technique; it's documented in the Argo Workflows documentation, but there's no explicit code behind work avoidance. They both achieve the same goal. Memoization is more efficient, and we'll go over that in a moment.

So what do you do to memoize something? You add a little memoize block to your template, as shown here. It's got three sections to it. It's got a cache, which is where we're going to store the information from one workflow run so it can be picked up in another workflow run. The only option you have here is config maps; we'll talk about config maps in more detail in a moment. If config maps are a limitation for you, come and visit us on GitHub or Slack and talk about your use cases, so that we can design something more flexible than a config map.

You've got two other parameters. The maxAge is how long a cache entry is valid for: once we've stored the result of a step, how long are we allowed to keep using it until we have to rerun that step? In this case it's only 10 seconds, a ridiculously short time for any real workload.

The most important part of the memoization is the key. The key is what we use to say that this run is the same as the last run, and therefore we are allowed to skip the step and use the outputs we determined last time, instead of having to rerun the whole step. In this case we're using one of the parameters to the template, and that will be a very normal pattern when you do this.

When you memoize a step, the controller looks up the key in the cache to see whether we've already run it and whether the entry is still valid. You'll go from the left-hand side, where we've got a duration of 17 seconds to run whalesay (I've no idea why my laptop was so slow at that point), down to zero seconds.
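The memoize block being described might look like the following. The template and parameter names here are illustrative; `key`, `maxAge`, and `cache` are the three sections mentioned above:

```yaml
- name: whalesay
  inputs:
    parameters:
      - name: message
  # Memoize: skip this step when the key was cached less than maxAge ago.
  memoize:
    key: "{{inputs.parameters.message}}"   # same key means same run, so it is safe to skip
    maxAge: "10s"                          # ridiculously short; real workloads would use much longer
    cache:
      configMap:
        name: whalesay-cache               # cache entries are stored in this ConfigMap
  container:
    image: docker/whalesay
    command: [cowsay]
    args: ["{{inputs.parameters.message}}"]
```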
The pod was not spun up at all; the workflow controller decided that this was a valid cache entry. You can see down at the bottom that there's a cache hit going from "no" to "yes", and it's completely skipped the step. That's the main advantage of memoization: there's just no interaction with Kubernetes at that point, no need to wait for a pod to spin up just to say we don't need to do any work at all.

This also works in a slightly more sophisticated, but still very toy, example. Here we've got an output artifact, so the output artifact is a file. We're pretending to do DNA sequencing here, so we've taken some input data, sequenced the DNA, and put it into S3. This is the second time we've run this workflow, and we're able to pull the data we created the first time we ran it, through a fairly invisible line, into the use-DNA step below. As you can see, the highlighted step with the yellow ring is what's shown on the right-hand side: a duration of zero seconds, and again a memoization hit of "yes".

You can do the same thing for an entire DAG. The first run executes our classic DAG diamond, to show off DAGs, and the second time we've memoized the inputs to that DAG, skipped the entire thing, and provided the outputs from the DAG.

This cache lives in a config map, and the config map is created for you automatically by the workflow controller. This is interesting, because many installations of Argo Workflows won't have the capability of writing config maps: the RBAC rules will not allow it. So if you find memoization is not working for you, that's the first place I would start to look: have you given appropriate RBAC permissions to the workflow controller so that it can write config maps? It will write them into the same namespace as the workflow controller. Not the same namespace as the workflow; it really is the workflow controller's namespace, which again is something to talk to us about if that doesn't work for you.
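If memoization silently isn't working, the RBAC to check is whether the controller's service account can manage ConfigMaps in its own namespace. A sketch of what that might look like, assuming the controller runs as service account `argo` in namespace `argo` (both deployment-specific):

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: argo-memoization-cache
  namespace: argo                          # the workflow controller's namespace
rules:
  - apiGroups: [""]
    resources: ["configmaps"]
    verbs: ["get", "create", "update"]     # read, create, and refresh cache entries
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: argo-memoization-cache
  namespace: argo
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: argo-memoization-cache
subjects:
  - kind: ServiceAccount
    name: argo                             # the controller's service account (deployment-specific)
    namespace: argo
```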
We need use cases for how to fix that, so that you could have caches in the same namespace the workflow is running in. (Yes, it genuinely is in the workflow controller's namespace.)

Your config maps are usually limited to a megabyte, because they're stored in etcd, and that means there's a limited number of cache entries you can have. The controller does understand this and starts to evict things for you, but it may be that you just have too much data, and it's not useful as it is. Again, talk to us on GitHub or Slack about your use cases so that we can make them work.

The config map is a set of key-value pairs, in the normal config map way. The key is the key from the memoize block, so that's nice and easy to read, and the value is a chunk of JSON which contains the time it was run and the outputs from that run, so that we can take those outputs and use them in the event that we're skipping the step entirely. We're faking all those outputs, except they're not really faked, because we've done them already; we've calculated the answer you wanted.

The advantage of that JSON being human readable is that you can go in there and edit it. You can look at it and work out why your workflow's memoization is perhaps not working, and if you want to edit it, removing a line so that it reruns, you can do that as well.

Okay, so I'm going to walk through an example here of a workflow template whose job is to sequence DNA and maybe do something with the result. You can imagine that sequencing DNA is a very time-intensive and compute-intensive task, so if we can do it fewer times, that's better. Specifically, step A here takes an input parameter, which is basically the person whose DNA is being sequenced (obviously, we'd need to pass all of that blood information to it as well), and it sequences the DNA and, in this case, uses an output artifact to store it in S3. The step was configured with memoization, so
basically, behind the scenes, the workflow controller will create that config map, and maybe for the key you've indicated who the person is. So you could imagine many different keys living in this config map, one for each of those people. Now, when the workflow template gets run a second time, down here, then assuming this key's entry has not expired, the workflow controller will not run this step at all, because it doesn't need to.

All right. So we talked about how the step might output a parameter, or it might output an artifact. Parameters and artifacts are sort of the capital-O Outputs of Argo Workflows, but really you can have other types of outputs. For example, you might output to a database, and so the question is: will it work if you memoize that step? The answer is that if you use the very latest Argo Workflows, this will work. Unfortunately, prior to the very latest version, you need to use an alternative technique if you want to do this kind of thing, and that technique is called work avoidance.

It's actually a fairly simple concept. Basically, it just means that your container logic can go and check first, before it does anything: does this data exist? If it does, and if it's of a recent enough timestamp, then it does nothing. Of course, in this case the workflow controller had to go and actually deploy the pod, which it did not have to do in the memoization case.

If you go and look up work avoidance in the Argo Workflows documentation, you'll also see that it mentions you can use a marker file to indicate whether the data was written. So basically, your container would write the marker file to say "hey, the data is here", and before it runs at all, it would check: does that marker file
exist already? I'm not sure offhand of the cases where you'd want to use a marker file rather than just checking for the data itself, but it's sort of the same concept.

Okay, is there any other time that you can't use memoization? Well, Alan mentioned that the config map is going to have a size limit, and it's one megabyte, by etcd's limitations. In that case, you could also employ the work avoidance technique instead. Just to look at how you'd reach that limit: if we go back and look at this, here we're basically saying, use this whalesay cache config map to store the values of any potential key. So if you have enough of those keys, and if the values are big enough, then you're potentially going to hit that limitation.

Now, that's specifically with parameters, because with parameters the workflow controller actually embeds the full values into the config map itself, whereas if you have an artifact, it will instead include the paths to the artifacts, not the values themselves. So that data, if you're using artifacts, is going to be smaller and less likely to hit that limit.

Okay, and I will turn it back.

This quote, that there are only two hard things in computer science, cache invalidation and naming things, is usually attributed to Phil Karlton. What I'm trying to get at here is that it's quite hard to know when you should or shouldn't memoize things, and what the bounds are of valid things to skip. There is a rule that allows you to guarantee a safe memoized step. Computer science talks about pure functions quite a lot in functional programming, and if you have what I'm going to call pure steps, where the outputs of a step (and it could be any kind of step, or a whole DAG, or whatever) are derived only from the inputs of the step.
They are manipulating the data that's coming in through the inputs. If you are using all of those inputs, or perhaps a hash of those inputs, as the memoization key, then that's a pure step. It's not interacting with anything in the outside world, so skipping it will not miss updating a database, and skipping it will not depend upon the weather today versus tomorrow. You can guarantee that the information derived from the inputs will be valid whatever the time of day is and however long it is since you last ran it. That should let you think through your workflows, and perhaps refactor some of them to produce pure steps that you can guarantee are safe to memoize. If you can't do that, you're going to have to be careful. Workflows will do what you tell it to do: it will memoize things that really shouldn't be memoized. It will skip a step whenever the keys match and the time since you stored that memoized information is less than the maxAge you've defined. That may be good or that may be bad; you're going to have to work that out for yourself.

That's the end of our talk. You can come and find me, I'll be around all week on our Pipekit stand, to talk about memoization or anything else Argo related. There are a couple of links there for the documentation on memoization and work avoidance, and there's an example of a memoized step in the examples folder in Argo Workflows. If you'd like to book a meeting, there's a QR code on the right-hand side. Has anybody got any questions?

And by the way, feel free to talk to me too; I'm not on the slide. Go to the microphone, please.

Q: I have a quick question: how will the database support multi-tenancy? Suppose the cluster has, like, ten namespaces.

A: It's challenging at the moment, because there are no multi-tenancy capabilities there. The RBAC is either that your workflow controller can write to config maps in its own namespace, or it can't, and that's your only choice.
That's why I was saying we need some use cases for how it should work. There are already issues in Argo Workflows about this problem: how would you like it to work for you? Contribute there and we'll try and fix it.

Q: But you said all the inputs are the key. Are you using something like namespace/workflow, or namespace/template-name, as the key?

A: No, the key is whatever you put in that memoize block. So you could have two completely independent teams writing to the same config map, and there's nothing to help you there, I'm afraid.

Q: Because the original memoization is based on that multi-tenancy; that's why the config map question comes up.

A: Yeah, it's a problem that we're aware of that needs fixing.

Q: Okay, thank you.

[inaudible]

Q: Hey, just curious about work avoidance for managing a key-value store. Is there any advice you have for lifecycle management when it comes to creating that database? Have you done anything like providing or provisioning your own Redis pod along with the workflows, so you avoid some of that key-value pair management?

A: Not personally, no.

Q: Okay, I was just curious. I've done something similar with an etcd, but I'm not sure that's necessarily useful. Okay. Yeah, thank you.

Q: Hi, on the definition of the workflow, you had the config map name there, right at the bottom. Does that imply you can have multiple config maps?

A: You can. You can use parameters in there.
I checked that this morning. So if you put your key into your config map name, you get a config map per key, and therefore one megabyte per thing. But there's no real lifecycle management or garbage collection that will properly happen in that case, as far as I'm aware. You could have a config map per tenant, say; you'd need to somehow get the tenancy information in as a parameter. You could do it, but you're then relying on, well, it depends how you're deploying templates. If you've got control over your template deployment, perhaps you can inject parameters between the template being written by somebody and ending up in the cluster, to add further things. I don't believe there's a way of using workflow defaults for that sort of thing, though.

Q: Now it seems like it may be a bug that the config maps aren't always being garbage collected?

A: I guess, right. The config maps never disappear completely; entries within them can get garbage collected.
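To close with a concrete picture of the work avoidance technique discussed in the talk: the container checks for a marker file before doing its expensive work, and writes the marker only after the output exists. This is a minimal sketch with made-up paths and a placeholder for the expensive step, not code from the talk or the Argo documentation:

```shell
#!/bin/sh
# Work avoidance sketch: skip the expensive step if a marker file says
# the output was already produced. All paths here are illustrative.
OUTDIR="${OUTDIR:-/tmp/workavoid-demo}"
MARKER="$OUTDIR/.done"
OUTPUT="$OUTDIR/result.txt"

run_step() {
    mkdir -p "$OUTDIR"
    # Check first: if the marker is present, avoid the work entirely.
    if [ -f "$MARKER" ]; then
        echo "marker found: skipping expensive work"
        return 0
    fi
    # ...expensive work (a stand-in for DNA sequencing) goes here...
    echo "expensive result" > "$OUTPUT"
    # Write the marker only after the output is safely written.
    touch "$MARKER"
    echo "work done"
}
```

A real step would usually also compare the marker's timestamp against a freshness window, much like maxAge does for memoization, rather than trusting the marker forever.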