Good morning everyone. My name is Grigory Fursin, and just a few things about myself. Some years ago I founded the non-profit cTuning foundation, where we develop open-source tools to help researchers automate their experiments, share them, reproduce them and so on. Some years ago I also created a consultancy company, dividiti, in Cambridge: we help companies use all those open-source tools to accelerate their research. This year I am also the Reproducibility Vice-Chair for Supercomputing '19, where, again, we discuss how to improve the reproducibility of our research and how to automate it. And I am very glad to be here, because I think one of my high-level goals, helping researchers simplify their life in HPC experimentation, correlates well with the EasyBuild vision, and I think the tool I will present today, called Collective Knowledge (CK), can be complementary to EasyBuild and similar tools. I am also here to get your feedback and to discuss how we can automate and reproduce experiments.

Why does this matter? I spent more than ten years working on machine learning and systems research, and recently we started looking at quantum computing. Research in these areas is booming: more than a thousand papers are published every year in machine learning alone. That sounds great. However, if, as an end user, you want to take some of the techniques published in those papers and move them into practice, you start having trouble: you need to work through many different tools before you get from your idea to production.
There are of course many nice tools that supposedly help you along this route. You spend your time learning all of them; if you are a student, you spend two or three years on this, and by the time you are ready to implement your fantastic idea, everything has changed: the software, the hardware, the ideas, the models. You get depressed, and you either quit or search for yet more tools. That was my path, and the path of many other people here. In the end we are stuck with all this complexity, and there is no universal solution for it.

So, before getting completely depressed, I tried to initiate what we call artifact evaluation at some conferences, just to start talking with the community and looking at how people share their experiments and how we, as end users, can reproduce them. I have now organized many artifact evaluations at different conferences over the past five years, and I think I have reviewed more than a hundred artifacts from more than a hundred papers. What is interesting is that practically none of them are automated, yet most of them have very similar workflows. You get an archive (zip or tar), a container, or some Jupyter notebooks; then some ad-hoc scripts you have to learn to work with; you download some extra source code, and if a URL has changed, you are stuck; you try to get the datasets, which sometimes have also disappeared; and if you want to try your own, different dataset, you start hacking all those scripts and get lost. Of course, you also have to install lots of software dependencies, which is exactly the problem EasyBuild is trying to solve.
Then you start compiling and running the program, and by that point, as an evaluator, you have usually already spent one or two weeks hacking on all this. Often the program compiles fine, but when you run it you get results that do not correlate with the paper, so you start hacking again and eventually realize that maybe some library has changed, and so on. It is a total mess. Evaluating one artifact nowadays takes a couple of weeks, and if you have thousands of papers, what will you do? This is not sustainable.

Discussing with the community, you start thinking about how to solve this. Nowadays everyone says: let's use virtual machines and containers. They are really good tools, but they are final snapshots that hide all this mess rather than solving it. They are extremely useful for capturing a snapshot and running it somewhere else, but if you are a researcher who wants to change the dataset, the software and so on, they do not provide a solution.

So I kept discussing with the community how we could solve these problems, and I came to think that what is missing is APIs and common meta-descriptions. Everyone performs the same ad-hoc steps, so why not provide simple APIs for them, so that everyone can use the same API to perform the same task? Very quickly, at a high, conceptual level: I wanted a tool that would let me, and everyone else, take a repository (or several repositories) of shared components and, instead of scripts, access each shared artifact through an API: a repository of modules, data and so on.
Here was my original wish list, collected together with colleagues:

- An API that installs the software for a given program. You have EasyBuild, you have Spack, you have SCons, you have many different tools out there; I want the same API on top of them so that I am not changing my workflow all the time.
- The same API for datasets. I do not want to treat a dataset differently from code; in fact, you can install data and software in the same way.
- The same for models and packages.
- More importantly, if we move a shared workflow to a new platform, we do not necessarily want fixed software versions. I want to adapt the workflow to the software already installed; I do not want to ship pre-installed old software, I really want to be able to use the latest software out there.
- Automated compilation: trying different compilers, flags and so on.
- Running code with multiple datasets: if someone publishes a new paper with a new repository and new data, I do not want to do manual hacks; I want to immediately see all those datasets and run against them with one click.
- Finally, I would like to automatically validate results, compare them with the paper, and show discrepancies. Our evaluators currently spend their time staring at graphs, checking manually what is wrong and what is not; it is crazy, and the more papers and artifacts we have, the less sustainable it becomes.

Now I will spend maybe five minutes on the low-level details of the Collective Knowledge framework we came up with. I am sorry if it is low-level; I just want to give you a feeling for what it does, and if something is unclear, I am here to discuss it further.
Over the last four or five years I discussed all this with both universities and companies, thinking about how to solve these API issues, and we decided to try creating a very simple, small, very portable program in Python that lets you convert your code-and-data mess into APIs. It had to be simple: no downloading thousands of dependencies, just pure APIs, wrappers around your existing code and data. So we created this framework; you can install CK with pip (you may need sudo for a system-wide install, since it provides the ck command-line front end). You can run it on Windows, Linux and macOS, on different systems, and it is basically a tiny Python library.

I will show you what each command does, and hopefully you will get a feeling for how it works. My first idea was to have a directory on my machine with a structure that I can later share through GitHub, so that everyone can reuse it, with human-readable APIs. The ck command is the front end, so let's say: ck add repo:my-paper, as if I am trying to share some artifacts for my paper. Whenever you install the CK framework for the first time, it creates a set of repositories under a CK directory in your home directory; whenever you pull other people's repositories in the CK format, they also go there. This command simply creates your dummy repository with just one file. You can go through these steps yourself; it should work on your machine.
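The repository idea above (a plain directory tree plus small JSON meta files, shareable through Git) can be sketched in a few lines. This is a conceptual illustration only; the file name `.repo.json` and the `create_repo` helper are ours, not CK's actual on-disk layout.

```python
import json
import tempfile
from pathlib import Path

def create_repo(root: Path, name: str) -> Path:
    """Create a dummy repository: a directory with a JSON description."""
    repo = root / name
    repo.mkdir(parents=True, exist_ok=True)
    meta = {"name": name, "deps": []}  # deps: other repos this one reuses
    (repo / ".repo.json").write_text(json.dumps(meta, indent=2))
    return repo

root = Path(tempfile.mkdtemp())          # stand-in for $HOME/CK
repo = create_repo(root, "my-paper")
print(json.loads((repo / ".repo.json").read_text())["name"])  # my-paper
```

Because the repository is just files plus JSON, sharing it is nothing more than pushing the directory to GitHub.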
The description of this repository contains its dependencies on other repositories, because the idea is that when you share APIs, or the community shares APIs, you do not copy-paste modules: you just declare a dependency on another repository, from another paper, and so on. There are also commands, similar to Unix cp and ls, that show you the structure of the repositories; later I will show you how to add APIs. If you list your repositories now, you have a few, such as a scratch pad and your my-paper repository, and you can search for them.

Now, adding APIs. Talking to our users, the message was: make it as simple as possible. You do not want to write C code or anything like that; you want to create a wrapper and an API for something straight from the command line. So let's create a module called hello (all our APIs are shared as Python modules), with an action called say inside it, so that from the command line we can run: ck say hello. We use two commands. The first, ck add my-paper:module:hello, creates a dummy module hello inside your my-paper repository; it generates a simple Python module automatically. The second adds an action called say inside this module, generating a dummy function inside your Python module. What is important to see here is the input: the input is always a single dictionary that can be serialized to JSON, which means all our APIs can also be exposed through web services; later I will show you why that is important.
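The convention described here, every action taking one JSON-serializable dictionary and returning one, is easy to sketch. The `access` dispatcher and the module table below are a minimal illustration of the idea, not CK's real kernel code:

```python
import json

def say(i: dict) -> dict:
    """Dummy action: print its input as JSON and echo it back."""
    print(json.dumps(i, sort_keys=True))
    return {"return": 0, **i}          # 'return' == 0 means success

MODULES = {("hello", "say"): say}       # (module, action) -> function

def access(i: dict) -> dict:
    """Single entry point: dispatch on the 'module' and 'action' keys."""
    func = MODULES.get((i.get("module"), i.get("action")))
    if func is None:
        return {"return": 1, "error": "action not found"}
    return func({k: v for k, v in i.items() if k not in ("module", "action")})

r = access({"module": "hello", "action": "say", "msg": "world"})
```

Because everything that crosses the entry point is a JSON-serializable dictionary, the same call can come from the command line, another Python script, or a web service.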
So you have immediately created an API, and you can start using it right away: you can run ck say hello with whatever command-line parameters and input JSON you want, and CK will find the module, call it, and dump your input; you will see that information printed. What is important is that you can also use it from Python, from other scripts or from other modules: in a Python session or Jupyter notebook you just import the CK kernel and access the same function, with the same action say from module hello, and it returns, as always, a dictionary. It sounds very simple, but this is enough to let users create and share APIs for anything. For now it is very abstract, but I will show you afterwards how this translates to HPC research.

Just a couple more things. For every module you create, you can also create data entries associated with that module, so that you abstract your data. For example, we can create, in the my-paper repository, under the module hello, a data entry called world, with some tags. In your local my-paper CK repository, under the module hello, this creates the data entry world with a meta JSON file where you will see those tags, and inside it you can keep any files you want to share with users. Afterwards anyone can find both your modules (APIs) and the data that anyone shares: you can search by tags, load an entry, and get its meta-information. And, as I said, you can finally run ck say hello world with some command-line arguments, and it will print the meta-information. You do not see all of it here, but I hope you grasp the idea.
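The data-entry-plus-tags idea can be sketched the same way; the helper names and the `meta.json` layout below are illustrative, not CK's real format:

```python
import json
import tempfile
from pathlib import Path

def add_data(repo: Path, module: str, name: str, tags: list) -> Path:
    """Create a data entry (a directory holding files) with tagged meta."""
    entry = repo / module / name
    entry.mkdir(parents=True, exist_ok=True)
    (entry / "meta.json").write_text(json.dumps({"tags": tags}))
    return entry

def search(repo: Path, module: str, tags: list) -> list:
    """Return entry names whose meta contains all requested tags."""
    hits = []
    for meta in (repo / module).glob("*/meta.json"):
        if set(tags) <= set(json.loads(meta.read_text())["tags"]):
            hits.append(meta.parent.name)
    return hits

repo = Path(tempfile.mkdtemp())
add_data(repo, "hello", "world", ["demo", "greeting"])
print(search(repo, "hello", ["demo"]))  # ['world']
```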
Again, it may seem like an obvious thing, but the point is that we can now convert all our code and data into this format and share both through GitHub, and that way we can start abstracting our whole system gradually. Instead of all those ad-hoc steps I showed you, you implement these APIs, and then everyone can reuse them, so everyone is, hopefully, on the same page when performing experiments. More importantly, as soon as you have an API, you can really enable DevOps, because you can connect these components to Jenkins, Travis and so on.

Creating this framework took maybe three or four months; the idea was then to start talking to the community and adding APIs for our common tasks. If you follow these links, you will see that there are already about a hundred repositories shared by different partners, with a few hundred modules providing APIs for different tasks. Which tasks did we start with? We worked with Arm first, and they pointed out something basic: we have to run across many different operating systems, so why not start sharing descriptions of operating systems, whether Linux, Windows, Android and so on. So we created an abstraction, a module called os; you can see it online on GitHub. If you pull this repository and run ck ls os, you will see about a hundred descriptions of different operating systems (Arm contributed mostly the Android ones). This is just meta-information, and the dictionary is very easy to extend, so the point is that someone else can say: oh, I want some extra
information in this JSON; I add it, and when you pull, you suddenly see it, while keeping backwards compatibility.

Then we said: everyone needs to collect information about the platform, and it is never unified, so why not provide an API called platform. The ck detect platform API goes through the different tools installed on the system, collects information about the platform, and saves it in a unified JSON format, again behind an API. The third step: everyone complains that our workflows do not adapt to newly installed software, so we prepared a module called soft with an action called search, which can detect a given piece of software, a dataset or a model on your machine and return information about paths, versions, libraries and so on. Again, this is conceptual; I am just trying to give you an idea of how these pieces work. The last thing missing for adaptable workflows was packages: we wanted a unified API to install a specific package, regardless of whether EasyBuild, Spack or SCons is used underneath; it does not matter, we want a common API, with dependencies, that can install a given package. You will see about 600 to 700 shared packages that install not only code but also datasets and models; it does not matter what we install.

Finally, I have worked a lot on benchmarking and auto-tuning, so our next module was, obviously, called program. We created a fairly complex workflow (we spent about a year on it) that can compile and run programs: you describe your software, dataset and model dependencies, and it automatically adapts to your software and your platform. It picks up your libraries and datasets if they are installed, and if they are not, it automatically installs them.
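The software-detection idea mentioned above, a plugin that finds a tool on the current machine and returns unified JSON-style metadata about it, can be sketched with the standard library. This is a conceptual stand-in for CK's much richer soft plugins; the function name and fields are ours:

```python
import shutil
import subprocess

def detect_soft(tool: str) -> dict:
    """Detect a tool on this machine; return unified JSON-style metadata."""
    path = shutil.which(tool)
    if path is None:
        return {"return": 1, "error": tool + " not found"}
    info = {"return": 0, "tool": tool, "path": path}
    try:
        out = subprocess.run([tool, "--version"], capture_output=True,
                             text=True, timeout=5)
        lines = out.stdout.splitlines()
        info["version_line"] = lines[0] if lines else ""
    except Exception:
        info["version_line"] = ""     # tool exists but has no --version
    return info

print(detect_soft("python3"))
```

A workflow can then branch on the `return` field: use what is found, or fall back to installing a package.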
For missing packages it can go through Spack, EasyBuild or whatever else. At the same time, because we now have APIs, we can record experiments together with all this information, because we now monitor everything that flows through the API, and we can replay experiments and even find differences automatically. This is a really powerful tool, and many companies are using it now. Why? Because it is not just a script any more; it has an API, so there has been integration with Jenkins and Travis, and that is now very easy, because you have a standard JSON API.

Just a couple more things. I was very glad when I got this program workflow, because I spent about ten years working on auto-tuning, and each time I was spending maybe 10% of my time on the actual auto-tuning or machine learning (Kenneth probably knows this situation) and 90% on things like: oh, my library changed, the compiler changed, the flags changed. That was a depressing time, and you probably know it very well. Once we had these workflows, it took me maybe one month to implement a universal auto-tuning module on top of this pipeline: we expose information about the available choices and the characteristics you want to tune, and the rest is automatic. I do not think about it anymore: if someone shares a compiler with me, they expose its flags, and I do not have to care. I was almost crying, because in that one month I collected as many experiments as in the five years of my PhD, and I do not have to work on that plumbing anymore; it is almost sad. The Raspberry Pi Foundation found this very interesting, so we worked with them to replicate all this compiler auto-tuning work, and we even generated interactive articles, fully through CK.
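The auto-tuning module described above, exposing choices and recording each experiment as JSON, boils down to a loop like the following sketch. Here `run_experiment` is a deterministic stand-in for a real compile-and-run pipeline, and the flag names are made up:

```python
import itertools
import json
import random

# Choices exposed by the pipeline (hypothetical compiler knobs).
CHOICES = {"opt": ["-O1", "-O2", "-O3"], "unroll": [True, False]}

def run_experiment(cfg: dict) -> dict:
    """Stand-in for compile-and-run: returns a recorded JSON result."""
    random.seed(json.dumps(cfg, sort_keys=True))   # deterministic stub
    return {"choices": cfg, "time_s": random.uniform(0.5, 2.0)}

# Exhaustively explore the choice space and record every experiment.
results = [run_experiment(dict(zip(CHOICES, combo)))
           for combo in itertools.product(*CHOICES.values())]
best = min(results, key=lambda r: r["time_s"])
print(len(results), "configurations explored")
```

Because every result is a plain dictionary, the recorded experiments can be replayed, compared, or fed to a smarter search than exhaustive enumeration.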
You can see those articles here if you are interested. And, as I said, there are some fun parts: other colleagues shared modules that do analytics on the collected data, for example reduction of compiler flags, or full software/hardware auto-tuning of machine learning models, such as image classifiers; if you know AutoML, we cover that whole spectrum, and we can now do whole software/hardware auto-tuning.

OK, that was just to give you an idea of the CK framework. I have about seven minutes left, so I want to quickly go through some real, practical examples with our partners; in case some of them are interesting, you can go to the website, where you will see links to our partners and, usually, the shared repositories for each project, in case you want to reuse them. The first fun example, from last year: we have all these components and programs, so why not organize tournaments? Rather than publishing your paper and hoping that someone will read it in a month or a year and reproduce the results, you share the paper together with workflows for a specific task, for example improving image classification, and then reviewing those papers actually means reproducing the results and publishing them on a scoreboard, where everyone can check them and find the best solution. The first tournament we ran was at ASPLOS last year, and many colleagues were interested. We said: anyone can submit standard image classification, using any framework you want, on any hardware you want, and we will look at the Pareto frontier in a multi-dimensional space of accuracy, execution time, power, whatever metric is published; we place everything on the scoreboard, and then look at the Pareto frontier.
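The scoreboard's Pareto filtering can be sketched as follows. The submissions and numbers here are made up; we simply keep every point that is not dominated in both accuracy and throughput:

```python
def pareto(points):
    """Keep points not dominated by any other (higher is better in both)."""
    front = []
    for p in points:
        dominated = any(q["acc"] >= p["acc"] and q["fps"] >= p["fps"]
                        and q != p for q in points)
        if not dominated:
            front.append(p)
    return front

# Hypothetical submissions: accuracy vs inference throughput (FPS).
subs = [{"name": "A", "acc": 0.76, "fps": 80},
        {"name": "B", "acc": 0.71, "fps": 400},
        {"name": "C", "acc": 0.70, "fps": 120}]   # dominated by B

print([p["name"] for p in pareto(subs)])  # ['A', 'B']
```

With more metrics (power, cost, memory) the same dominance test simply compares more keys per point.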
It was a fun exercise: we originally had eight submissions, but three dropped out because they were scared of showing bad results, that people would name and shame them, which was not the point at all, because we do not care about a single winner. So eventually we had five submissions, but they were very interesting and very diverse: for example, an Intel submission using Caffe with int8 inference on the Amazon cloud, and a grid of Raspberry Pi machines showing that with five Raspberry Pi devices you can do image classification faster than an Nvidia Tegra board. All of this is published now, and it was really fun. We unified the workflows of all submissions through CK, and if you follow this link you will see the live results, which were updated as reviewers reproduced the submissions. We collected a lot of metrics, but here you can see, I think, inference throughput versus accuracy on image classification, with very different models: ResNet, GoogleNet, MobileNet and so on. You can see that there is a Pareto frontier, but these are fairly obvious observations, and I do not want to get into discussing the results themselves, because you have probably worked a lot on these topics and know them.

What I do want to mention afterwards is a very interesting reproducibility case. A few months before this workshop, Intel published a paper claiming about 400 FPS, a record number at the time, on one of the Amazon instances, and some people had problems reproducing it. They made a submission to our workshop, and we started reproducing it, and it is true that it took us a week, because at first we were getting only about 80 FPS;
then you realize: ah, the library was not exactly the same as the one they used, because they had to use a specific sub-library; we changed to that version and got about 200 FPS, still not the same. We kept checking and realized they used int8 inference, but the model had a bug in its conversion, so it was still running in FP32 and not giving the speed-up; we fixed that and got the 400 FPS. So we reproduced it, but we also shared the workflow, and now all those bugs are fixed, something you would never get from just reading the paper (I do not even want to read such a paper anymore; I do not trust it). The workflow was shared here, and what is really cool is that two months later Amazon contacted us and said: wow, we took this workflow, ran it on AWS, and reproduced the result in one minute; it is correct. Furthermore, we tried it with a different dataset and it worked, and they shared it as a Docker image inside AWS; underneath that Docker image you have the CK framework, so you can run it and test it with a different dataset, a different model, a different framework. It was a really great example of how sharing your artifact in a common format allows other people to reproduce it quickly, rather than having thousands of papers and trying to figure out, by reading them, whether they are usable. Amazon gave a presentation about our work at a conference a few months ago.

As a consultancy company, we were then contacted by General Motors (again, this is public information; you can see their talk about us on YouTube from last year). They said: we spend years on this. They build self-driving cars, obviously, so they are trying to build a software/hardware stack that has to hit a specific price point (it cannot be, say, $2,000; that is too expensive to put into a car), has to fit a power budget of, I do not know, 10 watts, and has
to have a specific accuracy and so on, because they cannot afford to miss pedestrians. And it is really complicated, because they read papers, they check them, and they do not know what to trust. So they looked at these reproducible workflows and started using them, because you can immediately take those five or six workflows, run them, and test them on different hardware and software; it is maybe a hundred times faster to perform this validation, so it really accelerates their research.

And, as I said, I am the Reproducibility Vice-Chair for Supercomputing, and this year, if I am correct, it was just announced that if you submit a paper to Supercomputing, you will have to submit an artifact appendix. You still do not have to provide full validation, but you have to describe which experiments you did, and we are also discussing, not for this year, how to automate it. It is not necessarily about CK; there are many interesting tools out there, like EasyBuild, Spack and so on, and the question is how to connect them all, which is why I am interested in talking to you and hearing what you think about this. One example: this year there was the Student Cluster Competition reproducibility tournament, with a very complex scientific application, and we shared CK workflows with components for this tournament. Months later, the authors of this workflow actually validated the result on a different supercomputer with a different software environment. Why did that work? Because we had all those packages with software-detection plugins that let the workflow adapt, and we keep discussing and extending this workflow, so if you are interested, follow this link.

And yesterday, the Supercomputing folks (I am on this committee) asked me to put this slide into
all the presentations, just to remind everyone that there are many open calls right now that not everyone knows about: tutorials, posters and so on. Look at the website, because many deadlines are coming up.

One last interesting thing, from just two or three days ago. Quantum computing is obviously the buzzword now. Is it really cool, is it really fast? We know there is not yet a task on which it is faster than traditional computing, but apparently, maybe in one, two or three years, quantum computers will be faster; yet HPC is also moving, so you want to keep comparing these systems over time. Several colleagues came to us from IBM, Rigetti, Riverlane and so on, and we started discussing why not create workflows where you can take algorithms that are typical candidates for quantum computing (cryptography, machine learning and so on) and run them both on traditional machines and on quantum computers. We actually ran some of these workflows on a real IBM quantum machine as well as on simulators, and we started sharing results. In Paris we had a hackathon where everyone ran this workflow and tried to tune parameters for machine learning with Qiskit from IBM; the results and winners are online. If you look at them, you will see that some participants, in about one hour, produced a circuit with gates (if you know some quantum computing) that really gets you very good accuracy on a quantum computer simulator. It is an interesting result: collaboratively, we can start searching for better solutions and sharing them.

And I am done, actually mostly on time. Just to conclude: you could say, oh, it is too simple, why do we need it; or you could say, oh, everything is solved. Neither is true. I think we have a really interesting prototype that can
show that we can actually deal with this complexity now and bring the community on board, so that you can share these APIs and greatly abstract the system. But we are really at the beginning of the journey, because now we need to standardize all these APIs; they are still changing, and I would not call it a mess, but it still needs standardization, and this community has to play a role in that. Obviously, we also need to work on installation, documentation and sharing more components, and I am very open to collaboration on projects, particularly related to this automation and to sharing experiments. If you are interested, as I said, there are a few websites where you can look at all the components that are already out there; you can reuse them, and please contact me if you are interested in some future projects together. Thank you very much; I am mostly on time, actually, strange, so maybe there is time for a question.

Q: It looks like CK, the framework, and all the modules around it are spread across multiple repositories; there is a whole bunch of them. How do all these things fit together?

A: Good question. For now we have this website where everyone looks for the closest module they want to use, and then, whenever you write your new repository, you declare dependencies inside it: you just give the URL of the repository you want to pull, that is all, and CK will pull it. One important thing: we do not have a central index, so when someone creates their own repository, it is all manual at the moment; you have to tell us that your repository exists, because it is not indexed. One of the big things we want to do next is an automatic index, a bit like a package index, so that when you share a module or repository, everyone becomes aware of it automatically. It is not there yet; that still must be done.

Q: What
about the quantum computing workflows, if I want to play around with them? Is there a single repository where we can start?

A: Yes: ck-quantum, under ctuning (ctuning/ck-quantum). These workflows work with several simulators, so you do not need a supercomputer, but we also have APIs for real hardware: if you get access to IBM or Rigetti machines, you just need to arrange a login and get tokens, and it is pretty automated. Again, if you search those repositories for "quantum", you will see them.

OK, any more questions? And you have two days. That is right; thank you.