Hello everyone, thanks for joining this session. My name is Tommy, I'm a senior software developer at IBM, and I was supposed to present with my colleague Animesh Singh, who leads a lot of the open source projects and has driven a lot of open source work on our teams. Unfortunately Animesh cannot join today's session, so I'm going to present by myself. Today we're going to go over how to bring MLOps to the hybrid cloud-native machine learning platform by using intermediate representations.

To begin with, let's look at how enterprises are actually using AI and ML pipelines in production. In our research we see that a lot of enterprises are still struggling to get AI into the production stage: many teams manage to get AI working in a test stage but have difficulty moving it into production. When enterprises picture how AI and machine learning work, they tend to see three aspects: data scientists and data engineers collect data, they use that data to build models, and finally the model makes predictions and creates business value. In reality, building a machine learning pipeline involves many more steps. Data preparation alone involves data cleansing, ingestion, analysis, transformation, validation, and splitting. The same goes for model training, where you have to optimize models and train at scale. And finally, when you deploy, you have to target different environments, such as edge and cloud, collect feedback, and improve over time. So this process is very iterative and involves cross-team effort, where the model team works on the model aspects of the machine learning pipeline. Having a platform to share these different kinds of data processing and model creation steps is very useful.

Because of that, we investigated which open source technologies fit, and we found that Kubeflow Pipelines is a very good machine learning platform running on top of Kubernetes. Kubeflow Pipelines is a great platform for containerized ML tasks: you can use any framework and any language inside your containerized workflows, and you can also leverage Kubernetes objects such as volumes and secrets. On top of that, it's very easy for data scientists to use because it has a Python DSL that lets you create workflows using plain Python definitions. And because the pipeline is just a Python definition, you can parameterize the inputs, so you can run the same pipeline with different parameters. Once you have created these automated pipelines, you can also schedule them to run on a recurring basis to automate your machine learning workflows.

Now let's go over some of the big advantages of using Kubeflow Pipelines. One of the big advantages is the Python SDK. With the Python SDK, it's very easy for data scientists to create a machine learning workflow using ordinary programming syntax: you just define different tasks and their dependencies, and you get a pipeline that has both sequential dependencies and parallel execution, as in the sketch below.
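As a minimal sketch of that DSL (KFP v2-style syntax; the component and pipeline names here are made up for illustration):

```python
from kfp import dsl, compiler

@dsl.component
def preprocess(msg: str) -> str:
    # lightweight Python function; KFP runs it inside a container at runtime
    return msg.upper()

@dsl.component
def train(data: str, lr: float) -> str:
    return f"model(data={data}, lr={lr})"

@dsl.pipeline(name="demo-pipeline")
def demo_pipeline(msg: str = "hello"):
    prep = preprocess(msg=msg)        # runs first
    train(data=prep.output, lr=0.1)   # both training tasks depend only on
    train(data=prep.output, lr=0.01)  # prep, so they can run in parallel

compiler.Compiler().compile(demo_pipeline, "demo_pipeline.yaml")
```

Because the dependencies are expressed through `prep.output`, the compiler infers the sequential and parallel structure of the workflow automatically.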
Behind the scenes, once you create these pipelines, there are multiple workflow engines able to run them. By default, Kubeflow Pipelines uses Argo Workflows, a Kubernetes container-native workflow engine that runs parallel jobs. Whatever component you define in Kubeflow Pipelines, Argo generally runs as a pod, and within that pod multiple containers capture the workflow and its artifacts and push them out. In our teams we saw that Argo had some limitations at the time: we could not run it on top of OpenShift, and we could not run it securely. So we introduced another runtime behind Kubeflow Pipelines. When we investigated, Tekton turned out to be a very good platform: it is certified for OpenShift and has good OpenShift support. And Tekton uses similar concepts, where a Kubeflow Pipelines component is wrapped inside a pod in Tekton: a Tekton pipeline corresponds to a workflow in Kubeflow Pipelines, and a Tekton task usually runs as a pod and corresponds to a Kubeflow Pipelines component. With this we leveraged Tekton behind the scenes and hit our 1.0 goal, where we modified the Python SDK to generate Tekton-specific definitions, submitted them to the Kubeflow Pipelines API server, and used that as the runtime source of truth. Once the Tekton pipeline is defined, Tekton runs it exactly the way Kubeflow Pipelines runs on Argo.

With the basic implementation complete, we wanted to leverage additional Kubeflow Pipelines features. One nice feature of Kubeflow Pipelines that Argo and Tekton don't provide is metadata and artifact tracking. The main benefit of tracking metadata and artifacts, and how they've been produced and consumed, is that you can easily find out which data a model was trained on, so you can go back and check that version of the data. You can also compare against previous model runs: when you modify, say, a hyperparameter, you can easily compare the two runs. Whenever you build on multiple stages from previous models, you can track which stage was used. And you can reuse any previously completed outputs: say you have trained a model for a long time and you want to reuse those results in new pipelines that just consume that model in a different way. With Kubeflow Pipelines artifact tracking, you can simply see which objects were consumed as artifacts and which components produced them, along with the versions and timestamps at which they were consumed and produced. With that information you can build lineage tracking, where you can see that these components produced these artifacts at these timestamps, and at the next stage, which components consumed the same artifacts.
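Kubeflow Pipelines records this lineage in ML Metadata (MLMD). As a rough illustration of the underlying model, here is a sketch that walks artifact-to-execution events with the ml-metadata Python client against a local store (KFP populates the store for you; the local SQLite setup here is just for demonstration):

```python
from ml_metadata.metadata_store import metadata_store
from ml_metadata.proto import metadata_store_pb2

# Local SQLite-backed store for illustration; in a KFP deployment the client
# would point at the cluster's MLMD gRPC service instead.
config = metadata_store_pb2.ConnectionConfig()
config.sqlite.filename_uri = "/tmp/mlmd.db"
config.sqlite.connection_mode = (
    metadata_store_pb2.SqliteMetadataSourceConfig.READWRITE_OPENCREATE
)
store = metadata_store.MetadataStore(config)

# Lineage walk: events link artifacts to the executions that touched them.
# INPUT-side events mark consumers, OUTPUT-side events mark producers.
for artifact in store.get_artifacts():
    for event in store.get_events_by_artifact_ids([artifact.id]):
        kind = metadata_store_pb2.Event.Type.Name(event.type)
        print(f"{artifact.uri}: execution {event.execution_id} ({kind})")
```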
And if you have multiple components consuming the same artifact, you can use this lineage tracking to see the dependencies not just at the parameter level but at the actual artifact object level. We made sure Tekton supports all of these functions too: we implemented the Tekton backend so it reproduces exactly the same functionality as Kubeflow Pipelines on Argo. Even with the Tekton backend you can view lineage tracking and artifact tracking seamlessly, and the user experience is exactly the same. The way we did this was to modify the Tekton compilation so it produces artifacts the same way Argo does, and we worked with the Tekton community to land multiple features that match the exact functionality on Argo. In addition, we had to create a new way of tracking metadata, because the way Kubeflow Pipelines tracks metadata on Argo is very different from Tekton. So we had to recreate the metadata concept for Tekton as well, and build a new metadata writer to track all that metadata and surface it in Kubeflow Pipelines.

Because of this, you can see that to support just one new runtime, Tekton, on Kubeflow Pipelines, we had to first modify the SDK to produce a new kind of YAML, then update both the UI and the server to consume the new YAML definitions. And behind the scenes, because Tekton runs pipelines differently from Argo, we had to modify how Tekton runs the pipeline and also how metadata is tracked and consumed within the new Tekton structure. With all this effort, it's very difficult for any particular user to add a new backend if they want to. That's why we wanted to introduce KFP v2.

In addition, we also saw other use cases, for example TensorFlow Extended. TensorFlow Extended is very much driven by metadata, and it needs that metadata to be strongly consistent. In the v1 implementation, metadata is written asynchronously, so for the TensorFlow Extended use case they actually have to produce the metadata as part of the component itself, which is not ideal.

With these scenarios in mind, let's go over what Kubeflow Pipelines v2 provides and what its benefits are. The main objective of Kubeflow Pipelines v2 is to re-architect the pipeline compilation. First, we want to build a pipeline platform that puts metadata support first and is driven by metadata. Second, we want the new v2 pipeline definition to be backend and platform agnostic, so anyone who wants to add a new backend doesn't have to change much on the client or server side. In addition, Kubeflow Pipelines itself gets more control over how the runtime works: right now a lot of features, artifacts for example, are very dependent on the underlying runtime, so whenever a new platform is introduced it has to support all of these features. V2 aims to reduce the number of features that depend on the backend, and instead have Kubeflow Pipelines handle metadata tracking and artifact production on the platform itself.
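For contrast with that backend-agnostic goal, here is how visible the backend choice is in the v1 world: targeting Tekton instead of Argo means swapping in a different compiler on the client side. This sketch uses the kfp-tekton SDK on the hypothetical `demo_pipeline` from the first example (assuming a kfp-tekton release compatible with that DSL version):

```python
# kfp-tekton ships a drop-in compiler that emits Tekton PipelineRun YAML
# instead of Argo Workflow YAML. The pipeline code stays the same, but every
# downstream tool now has to understand a second, backend-specific spec.
from kfp_tekton.compiler import TektonCompiler

TektonCompiler().compile(demo_pipeline, "demo_pipeline_tekton.yaml")
```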
Let's rewind a little bit to how metadata works in v1 of Kubeflow Pipelines. In v1, most metadata is tracked by a service called the MLMD metadata writer. There are some exceptions where components write to the MLMD service directly, but most of the time the metadata writer just asynchronously watches all the pods, collects the metadata, and writes it into the MLMD server. So all the metadata does get tracked; however, because this happens asynchronously, it doesn't give you strong consistency, so TensorFlow Extended cannot use this approach. Furthermore, because this approach works by watching pod events and pod annotations to capture the metadata, you cannot use it with new features that run tasks outside a pod, such as Argo's HTTP templates or Tekton's Custom Task controllers. So the other disadvantage is that we cannot leverage new features from the backend providers.

This is why, in the new version, we moved to having MLMD integrated as part of the pipeline execution. As part of running the components, we have the capability to read and write metadata while running the workflow, so it happens synchronously. For the TensorFlow Extended use case, it can rely on that metadata to drive when an artifact is used, and what it consumes is consistent at that point in time. Furthermore, because this approach gives strong consistency, we can also use it for caching: if we see that an artifact was already produced, is still the same version, and is strongly consistent, we can use that as a cache key and skip computing the main execution entirely. This can improve execution runtime significantly.

As part of v2 we also want to optimize how we consume the pipeline spec. In v1, the pipeline spec is the backend's own YAML: with an Argo or a Tekton backend, you use their definitions as the source of truth. And because the source of truth is backend-specific, whenever you consume it in the SDK, in a client, or when you bring your own engine or your own interface, you need an interpreter that understands each of these specs. The challenge is that introducing a new backend for Kubeflow Pipelines means changing a lot: the UI, the backend, the SDK, and all of these services. With v2, what we aim for is to create an intermediate representation: all interaction within the Kubeflow Pipelines platform uses this new IR, and we only deal with the orchestration-engine-specific spec of a given backend when we need to create or patch an object. That reduces the code you need to modify for a new platform to a single package per backend.
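To make the IR tangible: compiling the earlier hypothetical `demo_pipeline` with the v2 SDK produces a backend-agnostic spec you can inspect directly (the exact top-level keys may vary by SDK version):

```python
import json
from kfp import compiler

# reuse demo_pipeline from the first sketch; .json output keeps it inspectable
compiler.Compiler().compile(demo_pipeline, "demo_pipeline.json")

with open("demo_pipeline.json") as f:
    ir = json.load(f)

# The IR describes components, a DAG, and I/O schemas with no Argo- or
# Tekton-specific fields; each backend compiles it into its own objects.
print(list(ir.keys()))                # e.g. pipelineInfo, components, deploymentSpec, root
print(list(ir["components"].keys()))  # one entry per component in the pipeline
```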
The benefit of this approach in v2 is that we can create a new UX that is purely based on the information from MLMD and from the spec in this new intermediate representation. With this new approach, the metadata can capture more information, such as artifacts, sub-DAGs, and lineage, and display it all in the same UI. And because it all goes through a common intermediate representation, whenever a backend needs to be updated to a new version, or a new backend is introduced, the UI doesn't have to change: it just consumes exactly the same information no matter which backend and which version of the execution engine you are using.

Another benefit of the intermediate representation is that there is one common way to communicate between Kubeflow Pipelines and its clients. In Kubeflow Pipelines we have the SDK and the UI, but say a cloud provider wants to offer a new interface that consumes the platform, for example a local interaction UI. It can be built on this IR without worrying about which runtime is used behind the scenes. In the previous version, to introduce a feature like that you had to know whether it would run on OpenShift, you needed an interpreter built specifically for Tekton, and if someone was using Argo you needed another interpreter built specifically for Argo. That is very difficult to maintain whenever someone introduces a new interface in the v1 world.

With v2, when we run a task, as mentioned, we record all the metadata as part of running it. The concept we use here is based on TensorFlow Extended's concepts of driver, executor, and publisher. Whenever a component runs, a driver collects the metadata information and checks whether this task has already been executed: do we need to use the cache, or do we need to run it and produce a new artifact? Once that context information is set, the main execution runs to completion, and then a publisher records all the artifacts and parameters and pushes them back to the metadata service. So all the information is strongly consistent, and any pipeline that depends on this artifact information can be sure there will be no delay and no modification coming in from a different pipeline. A simplified sketch of this flow appears at the end of this section.

And because of this enhancement, we can do the same with components themselves. The new intermediate representation is very pluggable, down to swapping individual components. In v2 it's very easy to use the SDK to compile your code into a component representation in YAML or JSON. In addition, in open source we also have another project called Machine Learning Exchange, and Google Cloud has provided a new service for registering components, so anyone who wants to pull components from a public host can do so.
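Here is that driver/executor/publisher sequencing as a deliberately simplified, hypothetical Python sketch (the real v2 launcher does this against MLMD inside the task; the dictionary store and function names are made up):

```python
import hashlib
import json

_metadata_store = {}  # stand-in for MLMD, keyed by a deterministic cache key

def _cache_key(spec: dict, inputs: dict) -> str:
    # canonical ordering so identical (spec, inputs) pairs hash identically
    payload = json.dumps({"spec": spec, "inputs": inputs}, sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()

def run_component(spec: dict, inputs: dict, execute_fn):
    # Driver: consult the metadata store and decide whether to run at all.
    key = _cache_key(spec, inputs)
    if key in _metadata_store:
        return _metadata_store[key]   # cache hit: reuse recorded outputs

    # Executor: run the user code with the resolved inputs.
    outputs = execute_fn(inputs)

    # Publisher: record outputs synchronously, before any downstream task
    # starts, so consumers always observe a strongly consistent view.
    _metadata_store[key] = outputs
    return outputs

# The second call is served from the "metadata store" without re-executing.
run_component({"image": "python:3.9"}, {"x": 2}, lambda i: i["x"] ** 2)
run_component({"image": "python:3.9"}, {"x": 2}, lambda i: i["x"] ** 2)
```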
So whenever someone wants to introduce, say, a local interface, they can easily rely on this registry and just pull new components from a public endpoint, instead of copy-pasting code into their own interface.

And lastly: even with this intermediate representation, the backend still had a lot of code spread across different microservices, all using the specific specs of the execution runtimes. This is why we need an abstraction layer, so we created an abstraction interface that can understand the specs from Argo, Tekton, and any future engine that comes to the Kubeflow Pipelines platform. The benefit of this abstraction is clear when you look at v1: without it, the whole source of truth was the workflow spec and pipeline spec for Argo and Tekton, so whenever a version changed, say Argo upgrading to a v3 release with a major spec change, you had to touch every single point where the metadata consumed that spec. It was very difficult to track down, everything was spread across different packages, and you had to know your way around the code just to upgrade Argo or switch to a different backend.

With v2, because everything on top of Kubeflow Pipelines communicates through the intermediate representation, nothing needs to change on the frontend, and at the backend level we abstract all the backend-specific specs into one common package. So whenever you need to upgrade a Tekton or Argo version, or modify any spec, you only have to modify one single place, and this package is consumed and inherited by all the pipeline services. When the pipeline services use it, they don't have to be aware of what is running behind the scenes; only the spec implementer has to make sure it is compatible with whatever version is being deployed.

The main features of the abstraction layer break down into three parts, as sketched below. First, obviously, we have the compiler: when you bring in an intermediate representation and want to apply it to a backend, you need to compile it into that backend's specification so its API can understand it. Second, we have the execution client: once the spec is compiled, you need a client to run the basic API operations: create, get, patch, delete, list. And last but not least, once the spec is running and you need to patch it, say modify some attribute, you want an abstracted layer: for example, to update the service account on a particular pipeline, you just tell the abstracted execution spec to update the service account name, and whatever actually has to be updated for that backend implementation is determined by the execution spec and the runtime you are running on.
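A hypothetical Python rendering of that three-part abstraction (the names below are made up for illustration; the actual Kubeflow Pipelines backend is written in Go):

```python
from abc import ABC, abstractmethod
from typing import Any

class ExecutionSpec(ABC):
    """One implementation per backend (Argo, Tekton, ...), in one package."""

    # Part 1 -- compiler: IR in, backend-specific spec out.
    @abstractmethod
    def compile(self, ir_spec: dict) -> Any: ...

    # Part 2 -- execution client: basic CRUD against the backend API.
    @abstractmethod
    def create(self, backend_spec: Any) -> str: ...
    @abstractmethod
    def get(self, run_id: str) -> Any: ...
    @abstractmethod
    def patch(self, run_id: str, patch: dict) -> None: ...
    @abstractmethod
    def delete(self, run_id: str) -> None: ...

    # Part 3 -- common mutations: each backend decides how to apply them
    # to its own spec (service accounts, labels, default parameters, ...).
    @abstractmethod
    def update_service_account(self, backend_spec: Any, name: str) -> None: ...
    @abstractmethod
    def update_parameters(self, backend_spec: Any, params: dict) -> None: ...
```

Pipeline services program against this interface only, so supporting a new engine means implementing it in one new package rather than touching every service.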
So the execution client supports the basic CRUD operations: create, update, delete, get, list, patch. These are common Kubernetes actions, and we abstracted all of them into this interface. In theory, even if you don't have a Kubernetes-based runtime, you can still implement these actions and use that as the entry point for a Kubeflow Pipelines backend. So this opens the door to backends that are not Kubernetes-based being able to run Kubeflow Pipelines as well, which is another motivation for these abstraction layers.

But just being able to create and patch with the client is not enough. We still need various common functions: say you need to update default input parameters, or update certain information like annotations or labels on a particular pipeline, you need to know how to do that for each particular backend. So we abstracted all of these common functions into the new abstraction interface as well. When you want to retry a pipeline, for instance, you only need to modify a certain set of parameters; the pipeline service only needs to call update parameters, and the execution spec determines which backend is in use and updates the pipeline parameters for that backend engine, as specified on the platform. There is a short usage sketch of this after the demo.

With this, let me show you how Kubeflow Pipelines v2 is used, and you can see that in the UI there is no dependency on, and you don't have to be aware of, which backend engine is being used. In Kubeflow Pipelines v2, when we click on a pipeline (let me make it larger), you can see all the components being rendered, extracted from the pipeline spec, and this is the new pipeline spec that we defined as the intermediate representation. It's very component-driven: each component is defined nicely here, so it's easy to plug in, extract, upload, and register on a common registry. And once we run it (this pipeline takes some time to run, so I ran it beforehand), you can see all this information is fetched fairly fast, and it's all based on the one common pipeline spec. No matter which backend you use, Argo or Tekton, the UI doesn't care, because all the information is stored in the metadata server, and the metadata server shares the common metadata definitions. Whatever backend and execution engine you use, the UI is always rendered from the same pipeline spec and metadata information, so you don't need to worry about testing and updating the UI for each new backend you introduce.
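As the promised usage sketch: a pipeline service retrying a run through the hypothetical `ExecutionSpec` interface above might look like this (illustrative only):

```python
def retry_with_new_params(spec_impl: ExecutionSpec, run_id: str,
                          new_params: dict) -> str:
    """Retry a run with updated parameters. The service never touches any
    Argo- or Tekton-specific fields; the backend implementation decides
    how the parameter update maps onto its own spec."""
    backend_spec = spec_impl.get(run_id)
    spec_impl.update_parameters(backend_spec, new_params)
    return spec_impl.create(backend_spec)  # submit as a new run
```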
So to summarize this session: we emphasized embracing metadata as the first step towards MLOps. All the information driven by individual components is centered on metadata, so it is backend-agnostic: everything can be viewed as metadata and doesn't have to be modified based on whatever runtime you used. In addition, we emphasized the new concept of the IR and made sure the IR is component-based, so components are easy to register and consume in this new ML platform: when a new provider wants to consume them, or create, say, a local interface, it's easy to make that pluggable. And lastly, we showed how an abstraction layer makes introducing a new orchestration engine very easy. This new abstraction layer is available in the v2 alpha 4 release, so when you want to bring in a new client or a new backend execution, you can use this version to extend Kubeflow Pipelines to run on your own platform.

Finally, here are the references for Kubeflow Pipelines and Kubeflow Pipelines on Tekton; if you're interested in running Kubeflow Pipelines on a different backend, you can go to this link. We also have a v2 design doc that shows in detail how the metadata is used and produced, and the design concepts behind v2.

That concludes the session. Thank you very much for joining. Are there any questions in the room? Yes, please.

Let me repeat the question quickly: you're asking whether there is any computational optimization for this IR, right? The IR is optimized to be pluggable, so all the components are easy to swap in and out. When you say performance optimization, do you mean runtime performance or compilation performance?

Right, so the new IR abstracts away a lot of concepts and offloads them to the Kubeflow Pipelines platform itself. If you're asking how to optimize runtime performance, pipeline runtime, look at the new construction where we record metadata using drivers and publishers. Before, we had to rely on another service to capture metadata asynchronously, which is difficult to scale when you have a lot of pipelines. With the new pipeline construction we leverage the driver and publisher concepts: the driver is just a concept that gets the metadata and then hands off to the executor and the publisher. In the Argo community there is a concept called the HTTP template, and in Tekton a new concept called the Custom Task controller, and we leverage those concepts; we have some experimental implementations as well. You don't have to run these tasks as containers, you can run them just as requests to a server, and that server is responsible for fetching the metadata and making sure everything is consistent. You can scale much more easily by relying on this new concept, rather than managing your own metadata-writer service and your own caching server to cache your components. That's the benefit: you have more control over how you optimize and scale your own services. Thanks.

Are there any other questions in the room? Any questions online? If not, thank you very much for joining, and this is the end of the session.