I'm Prasant, and I'm a senior engineer slash data scientist slash sometimes-I-don't-know-what-I'm-doing engineer on the AI Center of Excellence at Red Hat. As you see from the slide, the talk is machine learning for developers and QE. From the title you can tell I'm going to ramble on about machine learning, but why developers and QE? Why disturb those poor souls and not let them live in peace?

So what do I mean by machine learning? If you walk down the hall, not that hall, nobody's there, and ask someone what machine learning is, you'll get n number of definitions, and God forbid you ask a data scientist, they'll confuse you even more. Here I'll use machine learning as an overarching term for anything related to statistics and analytics: anything you do with data to get meaningful information out of it.

Coming back to why developers and QE: there are several personas out there, users, data scientists, rocket scientists, the mighty Thor, even Ant-Man, and they all have data. What do you do with it to get meaningful insights? Machine learning is one popular technique for that, and it lets you get insights not only into the data you have but also into the data you will have, which in more technical terms is prediction. So I'm going to focus specifically on developers and QE and walk through use cases specific to development and QE. Let's see, I forgot what I practiced.

When you want to learn a new language, or when you're introduced to something new, you always look for a hello-world example. Is there a hello-world example for machine learning? The field is too broad to ask for that, but think of this as a hello-world presentation, or a hello-world template, where I'll show a tool, a framework, that turns prebuilt machine learning algorithms into a service and provides an easy interface for you to access these models through that service and play with your data. If you want to go further and tailor the code to your own use case, you can do that, and then follow the same pattern to turn your version into a service and make it more user friendly.
Let me explain the tool, or framework, I'll be using for this. It's called AI Library, and it's part of a bigger project called Open Data Hub. There was a session yesterday about Open Data Hub; it's a machine-learning-as-a-service platform that has all the bits and pieces, ranging from streaming services like Kafka, which you saw in the previous talk, to a bunch of other tools like JupyterHub and data-handling services and plenty of other stuff. AI Library is just one part of it. It's an open source collection of AI components, ranging from simple statistical algorithms like regression and correlation analysis all the way to natural language processing models. It also provides support for bringing up the infrastructure around machine learning models, things like data handling, and lets you turn a model into a service. With this environment set up, you can quickly play with the algorithms without developing them from scratch, which lets you do rapid prototyping.

Let me jump into the architecture. Once deployed, AI Library sits on top of the container platform, in our case OpenShift, and on top of that is Seldon. Seldon is an open source platform for serving models: any time you have a model, Python code, Java code, anything that does machine learning, you want to package it into a container and serve it through a REST API or gRPC, and Seldon lets you do that. We wrap all of this in a nice package so it's easy for you to do. For the data-handling part, it's compatible with any S3 storage; here we've used Ceph, but any other S3-compatible backend, like AWS or MinIO, will work too.

The topmost part of the diagram shows which models are in the library right now: association rule learning, correlation analysis, and so on. I'm going to focus on four of them, because I have data sets specific to a QE or development environment: association rule learning, correlation analysis, flake analysis, and duplicate bug detection. I'll explain those four in detail as we move on, but to give you an overview of the rest, let me start with fraud detection. Fraud detection was the one showcased in the last talk, where they were streaming data through Kafka and trying to predict whether a certain transaction was fraudulent or not. Typically this is used in a financial setup, detecting fraud in credit card transactions; I think it's based on random forest regression. Sentiment analysis and topic modeling are pretty common NLP use cases. With sentiment analysis, you classify the sentiment of a given text: positive, negative, or neutral. Topic modeling is related: you condense a given natural language text into much shorter information. If you have a huge document and just want a handful of topics that tell you what the whole document is about, that's a simple use case for topic modeling; or if there's a bunch of tweets and you want to know what they're talking about, you run topic modeling on the whole set and get the condensed information.
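To tie this back to the architecture: each of these models gets served by wrapping it in a plain Python class whose predict() method Seldon exposes over REST or gRPC. Here's a minimal sketch of that convention, using fraud detection as the example; the class name and loading a joblib file from local disk are my illustrative assumptions, not the actual AI Library code, which fetches trained models from the S3 backend.

```python
# Minimal sketch of the Seldon Python model-wrapper convention.
# Class name and local joblib path are illustrative assumptions.
import joblib


class FraudDetection:
    def __init__(self):
        # Load a pre-trained model once at container start-up; AI Library
        # would fetch this from the Ceph/S3 backend rather than local disk.
        self.model = joblib.load("model.joblib")

    def predict(self, X, features_names=None):
        # Seldon calls this for every prediction request and serializes
        # the return value into the REST/gRPC response.
        return self.model.predict_proba(X)
```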
Back to the model overview: regression is plain, simple linear regression, and it can be simple or multivariate. Matrix factorization is a use case that came from the DevOps team. You might remember this algorithm from the Netflix Prize challenge, where they were trying to build a recommendation system; that's the algorithm under the hood, but in our IT environment it's being used to recommend packages for a given build environment. And anomaly detection is basically where you look for anomalous records in data; in this case it was used on build log data, trying to find anomalous entries, and it's based on an unsupervised learning technique called isolation forest.

Here's a typical workflow for how you'd interact with AI Library. You start by deploying it, and there are two ways to do that: it's available through the Open Data Hub operator, or if you want to get hands-on, there are Ansible roles, and you use the Ansible playbook to install it; I'll show that in a bit. Once you deploy, the models show up as services on the OpenShift platform. The models themselves exist as Python code in a Git repo, so what does the deployment do? It pulls down the code, builds the container, pushes it to Docker Hub or whatever registry you have, and spins up the container on OpenShift. Seldon then exposes a REST interface, so any further interaction with the model happens through REST API calls. Once everything is there, you need a trained model. You save trained models to your backend; there are pre-trained models available, but if you want to do your own training, at this moment you need to use JupyterHub, or any external environment that comes with Open Data Hub, train the model, and push it to the Ceph, that is, S3, backend. Once the model is stored, you start interacting with the prediction endpoint through a REST call: when you send the call, Seldon forwards it to the specific pod hosting the model, which in turn fetches the trained model, does the prediction, and sends you the result.

Now I'll show how the deployment is done through Ansible. Can you guys see the font, or do you want me to increase it? Okay. This is an OpenShift environment we have, and I've pre-deployed all the models in this namespace. The last three components you see here belong to Seldon Core: the Redis server, the API server, and the cluster manager. Redis is for in-memory data management, the API server is the one that takes in your REST calls and forwards them to the pod hosting your model, and the Seldon cluster manager keeps a watch on all your deployments and manages them. Now let me go back. I've cloned the repo, and I'm going to deploy one model. By default, when you run this Ansible playbook it deploys the entire set of models, all the libraries and everything, but I'm going to restrict the deployment to one specific model; as you can see, I've turned on just fraud detection.
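For reference, the prediction call I described a moment ago looks roughly like the sketch below. The OpenShift route hostname and the feature values are placeholders I made up, and the payload follows Seldon's standard data message format.

```python
# Hedged sketch of a prediction request against a Seldon-served model.
# The route URL and feature values are invented for illustration.
import requests

ROUTE = "http://fraud-detection-myproject.apps.example.com"  # hypothetical

payload = {"data": {"ndarray": [[0.2, 1.4, 0.0, 3.1]]}}
resp = requests.post(f"{ROUTE}/api/v0.1/predictions", json=payload)
print(resp.json())  # Seldon wraps the model's output in the same format
```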
Before you run it, you may want to configure certain parameters the model expects: the namespace you want to deploy to, the base image used to build the containers, where the code lives, and where to push the built containers. For model-specific settings, you give it a name and resource constraints, like how much memory and CPU it can use. All of this is available through the operator too, but I'm going to show the Ansible version. The other thing you define is the S3 backend details: the endpoint, the access key, and the secret key. That's defined further down; I forgot to delete it, so I'm not going to show my secret key here. Now, watch: because I already had the fraud detection model deployed, it cleans that up and then creates the pod that hosts the fraud detection model. Let's go in and see. The parameters you defined, the CPU and memory, are reflected here, and if you go into the logs, you can see it's running and ready to receive requests. If you look at the Seldon pod, because that's the one managing things, you can see it added the deployment for the fraud detection model and is watching for its requests. Same for the cluster manager: it watches all these deployments and manages them. Let me briefly explain what happens here: first it takes everything in the configuration and clones the repo, then builds the template to spin up the container, creates the container, pushes it up, and hosts the model. About this specific step that says "update graph details": Seldon lets you host not just single models but multiple models together, so you can do things like A/B testing and pipelines. At this moment AI Library supports hosting a model standalone, but if you modify your own deployment file that way, it would definitely work.

Now let me jump into the models themselves, the ones specific to the development or QE environment, starting with correlation analysis. What is correlation analysis? In very simple terms, it tries to find the relationship between variables. You have X and Y, and it asks: is there a possible connection between the two? Does one influence the other? If one increases, does the other increase or decrease? Depending on which direction it goes, the influence is positive or negative, and it also tells you the strength of the association, whether it's strong or weak, as a value between zero and one: where you fall between zero and one tells you how strong the relationship is. So how do you use this in a dev/QE environment? The data set I use focuses on bugs. In a typical dev/QE environment there are bugs reported and bugs fixed; think of it like an incoming queue and an outgoing queue. The number of bugs that come in and the number that go out, that get fixed, need to stay roughly equal, so that you stay balanced and don't build up a backlog.
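As a toy illustration of what the correlation model computes, here's the same idea in a few lines of Python. The weekly counts are invented; the real service runs on whatever data you POST to it.

```python
# Toy version of the correlation analysis: Pearson's r between weekly
# counts of bugs reported and bugs fixed (numbers made up for illustration).
from scipy.stats import pearsonr

bugs_reported = [34, 41, 29, 50, 47, 38, 55, 60]  # incoming, per week
bugs_fixed    = [30, 28, 31, 29, 33, 27, 30, 32]  # outgoing, per week

r, p = pearsonr(bugs_reported, bugs_fixed)
print(f"correlation={r:.2f}, p-value={p:.3f}")
# A weak r with a large p-value means intake and fix rates don't track
# each other -- exactly the backlog situation described above.
```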
And this is not made-up data; it's from the OpenStack Neutron team, and I'm going to show the data from that team. You plug in the data in the regular format and send in the curl request, this one here. What happens is that you send the data to the model that's already hosted on OpenShift, and it gives you back the result: what can it say about the data? In this case, the two data sets have a small positive correlation, and it's statistically insignificant, which brings us back to the point: the OpenStack Neutron team is not keeping up with the bugs that come in, which definitely leads to a backlog. In our setup we don't save the plot by default, but we can save the plots and look at them: you can see the red line is the incoming bugs, and it starts creeping up while fixing doesn't keep pace. A backlog is definitely building, and you can go on to analyze the data further.

The next data set is more from the QE standpoint. Once a bug is fixed, it goes through multiple stages: it goes to ON_QA, where you ask QE to verify that the fix really works, and once it's verified, it moves from verified to released, where you actually ship the fix to the customer. That's a critical process: once a fix hits the QE pool it needs to be verified quickly, and it also needs to be released quickly. So here we're weighing bugs verified against bugs released: do they tally, does the team keep up in that sense? This gives more insight into the QE process; typically a QE manager would want to know whether the QE team is keeping up with verifying bugs and whether fixes are being released. Here's a query we ran, and it came back saying the result is statistically significant and there's a strong correlation. But if you look at the coefficient, it's just over 0.5, about 0.56, so it's a little better than a 50/50 chance that they're keeping up with the process. You can dive deeper, and if you want to look at specific trends, you move on to actually modeling the behavior; you can use regression if there's a strong correlation.

Now let's move to the next model, association rule learning. This is another model that lets you find relations between variables, but it does so in the form of rules. It's widely used in market basket analysis: when you do grocery shopping, you want to see whether a certain item is associated with another. An example would be: any time people bought butter, they ended up buying bread. That's an association, a rule discovered within the transactions. What measures does it come with? They're confidence, support, and lift. To explain in plain terms with the example above: confidence tells you the chances of someone buying bread given that there's butter in the basket, so you're looking at the chance of one thing happening given the other. Support tells you how frequently this association appears in the transactions.
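Here's a from-scratch sketch of those measures, plus lift, which I'll get to next, computed for the butter-implies-bread rule over a handful of made-up transactions:

```python
# Support, confidence, and lift for the rule {butter} => {bread},
# computed over toy transactions (all data invented for illustration).
transactions = [
    {"butter", "bread", "milk"},
    {"butter", "bread"},
    {"bread", "eggs"},
    {"butter", "milk"},
    {"butter", "bread", "eggs"},
]

def rule_metrics(antecedent, consequent, txns):
    n = len(txns)
    both = sum(1 for t in txns if antecedent <= t and consequent <= t)
    ante = sum(1 for t in txns if antecedent <= t)
    cons = sum(1 for t in txns if consequent <= t)
    support = both / n              # how often the whole rule appears
    confidence = both / ante        # P(bread | butter)
    lift = confidence / (cons / n)  # vs. the baseline chance of bread
    return support, confidence, lift

print(rule_metrics({"butter"}, {"bread"}, transactions))
# -> (0.6, 0.75, 0.9375)
```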
Lift, the third measure, is a bit more technical: confidence is already a conditional probability, and lift compares it against the baseline probability of the consequent happening anyway. In simple terms, lift tells you the effect of the rule body, butter in this case, on bread: is the effect positive, negative, or is there no relevance at all? Does buying butter mean you buy more bread, is it the other way around, or is there no relationship here?

Now, how do you tie this to a software development or QE environment? Let's see. Here's some data I took from the same OpenStack Neutron team. Does anyone work on OpenStack Neutron here? Okay, please don't sue me; I have a mortgage to pay. This is real data. You look at the developers and the QE pool and ask: what was the severity of the bug they worked on, urgent, high, and so on, this is just sample data, there are more like medium and low; how much time did it take them to fix the bug; and how much time did it take to verify it. If you look at it, there's definitely an association here, because when you hit a bug, it needs to be explained in a way that both the developers and QE understand, and once the developer provides the fix, he needs to give the steps to reproduce or verify it, and that flows into how quickly QE can verify. So if you think about it, you can derive an association between the developers, the QE, and the time it takes to go through this whole process. When you run this, let me show you, you get a bunch of data that makes no sense at first; it's separated out in columns, but once you export it to a spreadsheet and start looking at the different columns, you can start asking questions. As I showed in the previous slide, it comes with those metrics, confidence, support, and lift, which tell you the chances of one thing happening with another, how frequent they are, and so on.

Here are some of the questions I asked; I'm certain I'm going to end up in trouble, but I'll still present it. You take a developer and try to associate the time taken to fix bugs. You look at the confidence level: you have the time taken to fix bugs, you have the developers, and you look at different time intervals, say a week. So you can ask whether a developer will be able to fix an urgent-severity bug within one week's time, and see what the analysis comes out with. All these names... I think this would be most useful on the management side, when you're assigning bugs. Here, Tim Roger has a hundred percent chance of fixing such a bug within a week, and Steve Hillman has about an 80% chance; if any of you are at the bottom there, please erase that. It tells you how confident you can be in a developer fixing a bug within a single week. And you can ask more questions: say it's not an urgent-severity bug but a medium-severity one, and your process allows a two-week interval; you can study that relation too.
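To show how this kind of question can be posed against mined rules, here's a hedged sketch using mlxtend's Apriori implementation; the AI Library model may use a different implementation, and the developer names and time buckets below are invented, not the Neutron data:

```python
# Hedged sketch: mine rules from bug records encoded as "transactions" of
# categorical items, then query them. All names and values are invented.
import pandas as pd
from mlxtend.frequent_patterns import apriori, association_rules

bugs = [
    ["dev=alice", "severity=urgent", "fix<=1w"],
    ["dev=alice", "severity=urgent", "fix<=1w"],
    ["dev=bob",   "severity=urgent", "fix>1w"],
    ["dev=alice", "severity=medium", "fix<=2w"],
]
items = sorted({i for bug in bugs for i in bug})
onehot = pd.DataFrame([[i in bug for i in items] for bug in bugs],
                      columns=items)

rules = association_rules(
    apriori(onehot, min_support=0.2, use_colnames=True),
    metric="confidence", min_threshold=0.6)

# "Will alice fix an urgent-severity bug within a week?"
print(rules[rules["antecedents"] == frozenset({"dev=alice",
                                               "severity=urgent"})])
```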
Now move one step up the management chain: you want to look at the development team as a whole and ask how the team is doing with respect to addressing urgent-severity bugs in a single week's time. There are different percentages for the different time intervals, and you can ask the same about the QE teams: will they be able to verify the bug fixes, will they be able to reproduce them? There are so many ways to ask questions, and beyond the association rules themselves, you can also look at the individual metrics that are thrown out there and get meaningful insights.

Next is flake analysis. This came out as a use case from the Fedora Cockpit team: they're trying to find test flakes, tests that failed but shouldn't have. How do we do it? We use clustering and classification. You take a bunch of test logs and group them into clusters, logs that are pretty similar to each other. Within these clusters you'd have test cases that failed, some of which were flakes, so you compute the probability of a test being a flake within each group. Now, within a big pool of test logs, you have these smaller groups where you can narrow your focus. When you get a new test failure, you can push this into a classification problem: you take the test log and ask which group it should go into, or, in our case, use k-nearest neighbors and ask which is the nearest cluster it can associate itself with. Once you've done that, you go back to the probability you computed: what's the chance of being a flake given that I belong to this group? And that tells you the probability of a certain test being a flake.

This is a sample execution. What goes into the request is the pre-trained model and a sample log, the whole log, and you ask: hey, tell me if this is a flake or not. It came back saying there's a 33% chance this is a flake. Based on your experience, you might set a threshold and decide whether to report it as a flake; typically anything over 50 or 75% is a good chance it's a flake. And that's the result you saw in the demo.
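A rough sketch of that flow, under heavy assumptions: TF-IDF features and k-means clusters standing in for whatever the real model uses, with invented logs and labels.

```python
# Sketch of the flake-analysis idea: cluster known test logs, record each
# cluster's historical flake rate, then assign a new failure to the nearest
# cluster and report that rate. All data and parameters are illustrative.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

logs = ["timeout waiting for dbus", "assertion failed in storage test",
        "timeout waiting for network", "assertion failed in ui test"]
was_flake = np.array([1, 0, 1, 0])  # historical flake labels

vec = TfidfVectorizer()
X = vec.fit_transform(logs)
km = KMeans(n_clusters=2, random_state=0).fit(X)

# Probability of a failure being a flake, per cluster.
flake_rate = {c: was_flake[km.labels_ == c].mean() for c in set(km.labels_)}

new_log = vec.transform(["timeout waiting for dbus socket"])
cluster = km.predict(new_log)[0]
print(f"P(flake) ~ {flake_rate[cluster]:.2f}")
```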
Next we have duplicate bug detection. That's a classical case you'd see in any development/QE environment: when you submit a bug, there's a chance it has already been submitted by someone else, that it already exists in your bug database, so you want to detect duplicates. There are plenty of techniques, and the one we have here is based on topic modeling. It runs through your existing pool of bugs and condenses each bug and all the information related to it into much shorter information. When a new bug comes in, it goes through the same condensing step and then runs through similarity measures against the previous set of condensed information, asking: who do I match most closely? It shows you the top few matches, and that's configurable: if you want the top five matches instead of ten, you can do that. With both flake analysis and duplicate bug detection, you could eventually turn this into a software bot that does it automatically for every request that comes in.
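As a simplified stand-in for that detection flow, here's a sketch that condenses bugs with TF-IDF instead of full topic modeling and ranks existing bugs by cosine similarity to a newly filed one; the bug texts are invented:

```python
# Simplified duplicate-bug detection: vectorize bug summaries (TF-IDF here,
# standing in for topic modeling) and rank existing bugs by cosine
# similarity to a new one. All bug texts invented for illustration.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

existing = ["crash when attaching volume to instance",
            "instance loses network connectivity after migration",
            "volume attach fails with timeout"]
new_bug = ["attaching a volume to an instance crashes"]

vec = TfidfVectorizer().fit(existing + new_bug)
sims = cosine_similarity(vec.transform(new_bug), vec.transform(existing))[0]

top_n = 2  # configurable, like the top-five-of-ten mentioned above
for idx in sims.argsort()[::-1][:top_n]:
    print(f"{sims[idx]:.2f}  {existing[idx]}")
```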
Okay, so finally we come to the conclusion. I introduced you to AI Library, how to deploy it and how to use it, and I showed you the repo and its contents. So what do we have planned for the future? Right now, all the model serving happens through Seldon, and if you want to train a model, as I explained in the earlier slide, you have to train it elsewhere and save the trained model to the Ceph backend. But what if you want to train a model using this framework? We're going to start incorporating the Argo project into it. Argo, which I think is part of the Kubeflow community, lets you manage workflows, and a workflow is basically nothing but a set of containers that get executed. We're going to use Argo to handle training of models, and the reason is that Seldon is synchronous: once you send in a request, it expects to come back immediately, which suits real-time processing and really fast models, whereas training a model typically takes a long time, hours or days. Argo lets you submit workflows asynchronously and come back later for the results. That's one thing we're looking into.

The other one is the model runner. Right now, with the Ansible playbook I showed you, you deploy models from the code that's already in the repo; but what if, once you push a model to the Ceph backend, you want it turned into a service automatically, without going through the ODH operator or the Ansible playbook? The model runner is already in the library, though still in the testing phase: once you save a model to your Ceph or S3 backend, it detects that a new model has come into the pool and spins up a service for it automatically.

Last is the dashboard, the user interface for AI Library. I've been showing you a bunch of curl commands, and it's really hard to start forming those; once you get the hang of it, it's easier, but you'd still want a much easier interface to interact with AI Library. So we've started putting together a dashboard: everything I showed you, you could do through a graphical interface. Let me show a slight demo. We started testing it out with linear regression; the page tells you what the model is about and what the API is. This is an interesting model, because for once we get to look into the health of developers and QE as well, just to have some fun here. We took a data set of people in various professions with their height, weight, and a health index, and it turned out to be a simple linear regression model: you input your height and your weight, and it throws back an index between zero and five. Zero means extremely weak, one is weak, three is normal, four means you need help, and five, oh god, you need help right now. So let's try here... there, it came back with 2.2, which is within the normal range. Does anyone want to try it live? I pretty much annoyed my other colleagues by sending this to them; they don't like me anymore. Zac is still my friend, though. Okay, so that's the UI, and that's a simple model to play with, but we're going to start putting in all the other models we showed. And yeah, that's the thank-you slide, so I'm done. Any questions? No questions from the OpenStack Neutron team?

Q: Thank you, good presentation. I just had a question: all the model serving is going to happen through Seldon, but what if I have a machine learning service that doesn't have a UI, something like a back-end function? Do you have plans for something like a function-as-a-service type of thing?

A: When you say it doesn't have a back end, or a function, is it...

Q: No UI; it just runs as a back-office job at a particular interval, and there's no serving. It runs its predictions and stores them in a data store.

A: The model behavior is something decided by you. What we show here is: you have the model in your back end, you send the data you want to run prediction on, and it sends you back the results. But if you want to alter the behavior, if you don't want to send results back through the REST API and instead store them in some back end, that's certainly possible too. Seldon just submits the request and sends back whatever the model returns; if you're not going to return anything, it can just send back a status string, while your results go to whatever back end you want to store them in. In that case, for monitoring, you'd probably want to put it into an Argo workflow, where you can get the logs from the container and see what happened to the workflow, so you'd move to an asynchronous rather than a synchronous request. The synchronous path, interacting with Seldon, is for really fast predictions done on the fly; in the previous talk, they had data streaming in through Kafka and wanted to run predictions instantaneously, so they could hit the Seldon REST endpoint and get back the result. But in some cases you want to run the analysis on, say, a weekly basis, and you don't care about getting an answer back immediately; you just run it as a daily job or the like, and that's possible to do.

Q: Thank you, it's a great talk. I'll give you a beer for that... sorry, I'll give you a beer. So, we discussed four models; are those already-trained models that you send the prediction to? That's the part I missed.
A: So, correlation analysis and association rule learning aren't really pre-trained models as such; they do the analysis live on the data you send. But flake analysis is a pre-trained model, something trained on your data set and stored in the Ceph backend. I didn't show the training part because training takes a little while. When you move into more complex territory, clustering, classification, or neural networks, you want to store the model ahead of time. Typically the process goes like this: you take the data up to, say, a week-long window, train the model, run predictions for a week and see if it performs better, then retrain on a plus-one-week window. The retraining part is something that will be handled as part of the Argo inclusion that's coming up.

Q: Okay, thank you. If you start with "a great talk," I'll give you a cake.

A: Yes, sure.

Q: Thank you, Mr. Embaligan, for your presentation, I appreciate it. Thinking back, what we've done so far is the evaluation of developers and QE using machine learning techniques, that's what I understood. And a data scientist, I think, falls under both categories, a developer as well as a QE person, right? In terms of QE, what do they do with the model part?

A: Let me rephrase the question: you're asking how this is relevant to the QE space versus the development space, as against how a typical data scientist would do it, or what the role of QE is in data science, am I right? Okay. As I said, traditionally it all started with data scientist being the role associated with anyone who wanted to do analysis on data, but we've moved away from that point, with so many tools and techniques available. Everyone has data, you have data and I have data, so we can start using the tools and doing the analysis; you don't have to develop all the algorithms from scratch. As long as you understand what the model throws back at you, the metrics and the results, the differentiation starts to disappear. So in this case, QE, devs, or anyone else can just start using machine learning algorithms and tools rather than trying to develop them.

Q: Thank you.