So, first of all, thank you everyone for coming out. I know this is almost lunchtime, when you could be doing better things, so cheers to you. We're going to talk about mixed workloads, that's basically high-performance computing workloads and container workloads, and the experiment that I've been doing, we've been doing, at IHME with Univa. I'm Tyrone Grandison, if that wasn't clear. I'm the CIO for IHME; I started there, well, a year and a half ago. This is Jason. Hey, it works. Hey, I'm Jason Smith. I've been with Univa for, I don't know, probably seven or eight months now, as Principal Solutions Architect. And prior to that, I did quite a few consulting gigs with Kubernetes and all the other fun stuff, so thanks. Okay, so the flow of this talk is going to be pretty simple. I'm going to talk about IHME, then have Jason come up and talk about Univa, then I'm going to come back up and talk about the IHME environment, and then Jason's going to come back up and talk about Univa and IHME, the collaboration we've actually been doing. So by show of hands, before you actually read the abstract for this talk, how many people knew what IHME, the Institute for Health Metrics and Evaluation, was? Two hands. That's fantastic. That's awesome. That's two more than I ever get anywhere else, so that's perfect. So this is going to be a quick intro. IHME is a non-profit, based in the School of Medicine at the University of Washington. We focus on population health, sometimes called public health, and we look at it across the globe, every single country in the world. We collect data from censuses, surveys, and scientific papers, and we have data sharing agreements with many major companies and organizations. And our one mission, our one goal, is: how do we put out the best data that tells people, again, this is a fun topic, how long they're going to live based upon the factors around them. We normally produce, one, a data set; two, some visualizations. We provide training, and we do special analyses for the World Bank and the World Health Organization. We do county-by-county analysis for different countries, for example, India and China. We also do some work on the Sustainable Development Goals, that always catches me. And that is with help from the Gates Foundation, who is a big funder of ours. Our customers tend to be researchers, internal to IHME and external; we have a huge community of epidemiological researchers that depend on us. We want to, and we actually do currently, work with advocates, people that have a special interest in particular topics. Say you're interested in the impact of smoking on the population of Peru, or you may be concerned about the effect of diarrhea on the lives of kids. I mean, all fun topics. Policymakers, people that actually make laws and want to see the communities they serve improve, we work with them too. And if you've seen a story in the news on financing global health, on the Sustainable Development Goals, on any topic that touches on a data aspect of public health, the source for that data is normally us. It's normally IHME. The academics and the researchers tend to be our largest segment right now.
And the reason why I'm here, and why I'm going around to developer conferences, is to get the word out that we're looking for people to actually use our data for multiple different things at a local level, right? All this data is free for non-commercial use at this point. So IHME's process is pretty simple. I mentioned before that we have multiple different data sources. Each year we look at somewhere from 150,000 to around 300,000 data sources. We transform those multiple ways, standardizing them according to ICD-9 and ICD-10, which are, sorry, disease codes for healthcare. We then do some checks and remove some garbage codes. For example, you don't expect men to have a high risk of something based upon pregnancy, so those associations have to go; a toy sketch of that kind of check follows.
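Something like the following, a minimal sketch assuming hypothetical column names and ICD blocks, since the actual rules and schema aren't described in the talk:

```python
# Toy sketch of the sex-plausibility garbage-code check described above.
# The column names and ICD blocks are assumptions for illustration,
# not IHME's actual pipeline schema or rules.
import pandas as pd

# ICD-10 blocks that are only plausible for one sex: pregnancy-related
# codes should never be attributed to men, prostate cancer to women.
SEX_RESTRICTED = {
    "O00-O99": "Female",  # pregnancy, childbirth and the puerperium
    "C61": "Male",        # malignant neoplasm of prostate
}

def drop_garbage_rows(df: pd.DataFrame) -> pd.DataFrame:
    """Drop rows whose ICD block is implausible for the recorded sex."""
    required = df["icd_block"].map(SEX_RESTRICTED)
    keep = required.isna() | (required == df["sex"])
    return df[keep]

records = pd.DataFrame([
    {"icd_block": "O00-O99", "sex": "Male"},    # garbage: dropped
    {"icd_block": "O00-O99", "sex": "Female"},  # kept
    {"icd_block": "C61",     "sex": "Male"},    # kept
])
print(drop_garbage_rows(records))
```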
Once it's actually transformed, the data then becomes input to our analysis process. And that's where we have, at this point, around 250 researchers internally that run statistical models on the data itself. They, one, use existing off-the-shelf models; two, modify them; three, figure out what gaps we currently have in the data, because we have to cover all the countries in the world. And we run models on a continual basis. For the entire Global Burden of Disease process, which is the data set we produce, we probably run the models at least six to seven times; our environments in the back end are always going. Once we have what people call an intermediate result, we then go out to a collaborative network, which is around 3,000 researchers across the world who vet that the numbers actually look good and help us modify the models themselves. So it's a highly collaborative process. At one point we say this is good enough, and then we call those the outputs. And the outputs we publish in everything from Science to, well, essentially every major journal in the public health space. We also produce policy papers for different countries, and our global engagement team within IHME actively works on stories with the media to highlight the issues that are in the data we produce. So in terms of exact artifacts, I just mentioned the Global Burden of Disease, and what this is is a study of everything, of the injuries, risk factors, and diseases that cause people to lose years of their life, right? We have a metric called the disability-adjusted life year that we use to compare multiple different phenomena across the entire spectrum. So it could be car crashes in Lima versus the effect of smoking in Greece versus the number of murders in, you know, bigger countries, Zambia. And we do that across age, sex, geography, and over time, over time meaning from 1990, because we're not that good, right? And this leads to something that we call the GHDx, the Global Health Data Exchange, where we put every single source that we've actually used in one place. We make it available for everyone to use, because we want to be transparent and show people what we've done, why we did it, and what decisions we actually made. The current work, which is far more exciting for me, is what we're calling the Local Burden of Disease, the geospatial analysis in our current work. And that is taking those metrics, that rubric of risks, injuries, and diseases across different dimensions, across time, and looking at it right now at a 5x5 kilometer level, and then further at 1x1 kilometer. So being able to, at the very finest grain, tell people what it is they can actually do to live healthier and longer lives. Added to that, what we're focusing on now is what we call future health scenarios. So how do we tell a country, an individual, or a community member that, for your community in 60 years, these are going to be the biggest factors for you, these are the things you should be worrying about. That's the forecasting work. For that, of course, we have to do multiple different scenarios: one optimistic, one pessimistic, and one that says, given the trajectory we're on, this is what's going to happen. And I mentioned before, we do spatial analysis on specific geographies. We did one on India just two weeks ago, the India launch, and we do specific ones for China, and we have a roster of 11 per year that are coming out. So that's in terms of what we produce. That is it. Quick example: we just launched in February, on February 8th, the Global Burden of Disease 2016 version, and we do this every single year. We cover 335 different diseases. We cover around a thousand different sequelae of disease. Who here knows what sequela means? Awesome. Good. A sequela is a consequence of a disease, something that happens as a result of it, right? I won't go into this. Risk factors are things that cause diseases and injuries. So we have multiple different facets to the data we produce, and we have it over a number of years. We had around three billion data points that we ingested. The output data itself was only 30 terabytes; the input data and intermediate results, as you'll see later, are in the petabytes. We have 3,000 points of metadata. And, you know, last year, this is rough, we probably used around 175,000 data sources to get there. In terms of what we produced from it, it was very simple. We produced the reports in The Lancet, that's the main publication, and we produced the visualization tools. If anyone is online right now, go to vizhub, v-i-z-h-u-b, dot healthdata dot org. We have, at this point, around 15 to 20 different visualizations; these are the six main ones that we update annually. An example of one of the areas of impact we have is in the policy space. We collaborate, like I said before, with the World Bank and the WHO, and we have agreements with all the governments there. We work really well with the China collaborative center. A good example of how we normally have impact in the policy space would be our work with the Rwandan Ministry of Health: having them look at the data and analyze it, and seeing that household air pollution was the number one factor keeping their people from living as long as they should. The average life expectancy back in 1990 was around 44, 45. The government started a program; they developed clean stoves that burned more efficiently and didn't release so many toxins. Within the space of 16 years, they basically got their life expectancy up to around 63. So that's them seeing the data, figuring out what it takes to solve that problem, and just implementing it and watching how it works over time. So with that, I'm going to pass it over to Jason to talk about Univa.
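A quick reference on the DALY metric mentioned above, since the comparisons lean on it: a DALY combines years of life lost to premature mortality (YLL) and years lived with disability (YLD). In the GBD's standard formulation:

```latex
\mathrm{DALY} = \mathrm{YLL} + \mathrm{YLD}, \qquad
\mathrm{YLL} = N \times L, \qquad
\mathrm{YLD} = P \times DW
```

where N is the number of deaths, L the standard life expectancy at the age of death, P the prevalence of the condition, and DW its disability weight.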
Okay, cool. Can you hear me? Yeah, awesome. So who is Univa, right? We're a leading innovator, and we work with many different partner organizations to help customers really figure out the best approach for running their applications on top of Kubernetes. We have many different features that come along with our product, and we're based in Chicago, with offices also in Canada and Germany. We have many different customers, right? You guys probably don't care too much, but there are a few different verticals there as far as customers and what they do. A lot of these enterprise customers run many different analytical batch jobs, which is what Tyrone will talk about in a little bit here. And then finally, the product that we're talking about: NavOps. NavOps enables many different features within Kubernetes. One of these is virtual multi-tenancy. We also offer mixed workloads, right? Being able to run your batch jobs on top of Kubernetes, which is kind of the whole point of why you guys are in this room, so we're going to get into that in just a little bit here. And then also other things, right? You can set priorities for your applications, and then, depending on the resources that are available and what the priorities are on those applications, Kubernetes will now be able to run your higher-priority apps before your lower-priority apps, which is not currently built into the default scheduler. Okay, so back to Tyrone for the interesting stuff. Cool. It's not bad. Can you hear me? Oh, perfect. Not that interesting, but we'll see. So let's start talking about IHME and the reason why we decided to go on this path, this journey, with Univa. I'm going to start off with the boring stuff and go through it quickly. The team itself is here to make sure that both the collaborators and the researchers within IHME can do their jobs, and can do them with a mindset toward innovation, right? Right now the team is 61 people, and we basically cover everything. We run our own cluster environment; I'll tell you more about that later on. We control all the data, so we have over 100 different database servers in-house. We do our own visualizations. We have data science and engineering teams and kind of a desktop and help team, right? When it comes to our users, we have two main groups, right? The first set is the researchers. And the researchers have one, well, several things. So, backgrounds. They have different technical backgrounds, right? They may have learned, you know, Stata or C in grad school and want to continue using that. They may have just left school actually knowing R and want to use that. They have their own particular stack that they're comfortable with, and when they come to IHME, they don't really want to change to one standard stack; they want to use what they already used before. They tend to want to write a lot of statistical methods from scratch, which is a blessing and a curse all at the same time. And they want to customize what they run, because some of them are on the cutting edge and will use whatever the latest thing out there is, even though it may not be supported by my team, right? So the task here is that you have a bunch of really smart people who are, one, academics, right? Who may or may not have formal training as programmers or computer scientists.
And who are running jobs on the order of, you know, one job that spawns off 10,000 to 100,000 other jobs, that runs on, I don't know, probably 100 to 200 different data points, and does a lot of I/O, right? That's the high-performance computing need from the research org. We also have a particular kind of researcher who, after they've done their processing and have their models to a point where they're happy, wants a web server and a way to interact with the outside world, and a stable environment where they can do that, right? So two needs there. One, obviously we have to support the high-performance computing, and two, we need to support them having services that stand up by themselves and don't fail, although they're not really too upset if those do fail. The second group of users we have is the support functions, right? So the first set is what I call the business-critical, mission-critical user group, and the second is, you know, finance, HR, and global engagement, which means marketing here. What we produce for them is the storage, right? Document management, collaboration management, and customer relationship management, because we do have to interact, like I said before, with 3,000 collaborators in some sort of seamless fashion, and that is not easy in the current environment. So what are we talking about in terms of bare bones? We have four rooms right now in Seattle that we actively manage every single day. It's a curse. This number changes every single month; we now have, at last count, 554 HPC nodes, so sorry, the number here is wrong. They're a mix of Intel and AMD; I have a description of one of the newer Intel nodes that we just got. We run vSphere, with, at this point, probably over 400 VMs. We have Docker and Rancher in-house, around 300 containers. In terms of active, daily-available storage that can be used, we have 5.8 petabytes. And tape storage, because tape is never going to die, ever, ever, because it's so cheap. We back everything up, and we basically have 9.2 petabytes in-house and access to God knows how many more. All right, so a little bit more. The HPC cluster itself has two main purposes: we do primary modeling, and we do machine learning. We have storage tiers that use StorNext and NetApp. We are a big Qumulo client; Qumulo loves us. And we use InfiniBand and Fibre Channel to connect everything. In terms of software, you know, before I started at IHME, it was the Wild Wild West and everyone was using everything. After, well, it's still going on, but the path is that we want to take everyone down our Python path. So right now we're saying, for primary modeling, we want to support R, so RStudio and R Shiny. We want people to not have to mess with submitting jobs manually at a command line, which is the current de facto standard, right? So we use Jupyter notebooks for that. And of course Python, NumPy, pandas, yada, yada, yada. Underpinning all this is Univa Grid Engine, because the researchers, they don't love it, but they understand well how to submit jobs to their different queues and how to get the status of those jobs. And we're working on how to give them real-time analytics on what's going on with their jobs. A sketch of what that submission flow looks like is below.
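As a hedged illustration of that Grid Engine workflow: one logical job fanning out into thousands of array tasks looks roughly like this. The script name, parallel environment, and resource values are invented for the example, and the exact resource names vary by site:

```bash
# Hypothetical Univa Grid Engine submission; "smp" and the memory
# request are placeholder values, not IHME's actual configuration.
# Fan one logical job out into 10,000 array tasks, 4 slots each:
qsub -N gbd_model -t 1-10000 -pe smp 4 -l m_mem_free=8G run_model.sh

# Check the status of your jobs in the queues:
qstat -u "$USER"
```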
In terms of CI/CD, we use Jenkins primarily, and Luigi to a small extent. And we're also building our own custom workflow management software, which we're going to release back to the open source community in the next year, simply because our environment is kind of different from, and harsher than, what the existing tools out there handle right now. I can talk more about that offline, that's fine. For databases, it's Percona MySQL and MariaDB ColumnStore, and we're looking at what the future is, how we move off these environments, because they don't support all the scenarios we need in terms of the queries we have to ask. And in terms of web, we are, you know, standard HTML and CSS. We're a big React.js house, and we like to develop stuff, grow stuff, inside. We have something called IHME-UI on GitHub; again, it's open source and it's free, go check it out. Most of the components that are online at vizhub.healthdata.org are built using IHME-UI. All right, so the current architecture, in one horrible diagram, is 25,000 cores. Most of the users use a command line to access our shared storage, we have a really, really fast connection to that storage, and we mount everything on the particular nodes themselves. All right, so that was the boring part. The part that is interesting is this: we have this need to move to a new environment, but we have an investment, I have an investment, in UGE, because we've been using it for 10 years now. Actually, we've been using it since before it got rebranded to UGE, back when it was SGE; some form of it we've been using. So the researchers are familiar with it, right? When we got in there and started looking at how we were using the environment, we realized that we were underutilizing a lot of resources, and at peak times people were complaining and not getting resources. When we looked at the stats, we realized that we just weren't using everything efficiently. So the path here was: back in February, we got the opportunity to start fresh. We got the opportunity to make a huge purchase of HPC nodes and use this as a forcing factor to start talking about how we move to an environment that supports both containerized workloads and HPC. So we bought the nodes, and we started a plan where we said, okay, we're going to put Kubernetes in from scratch, which is a luxury for most companies. And let's figure out how we tie in Kubernetes, Rancher, Univa Grid Engine, the scheduler, and NavOps to create one environment we can manage from one pane. And that is where I turn it back over to Jason. All right. So I mentioned a few of the different features that NavOps has. One of those is the virtual multi-tenancy, which allows you to share your clusters across teams and applications, so you have more control over what they have access to and what resources are available to them, as well as mixed workloads. Mixed workloads, as he mentioned, means running your HPC workloads also on top of Kubernetes and taking advantage of all the features built into Kubernetes; if you try to run those on a default Kubernetes environment, there just isn't that functionality built into it. This is kind of how we integrate with Kubernetes. We've got the regular Kubernetes architecture, we come in and we just replace the default scheduler, and then in your YAML files, using annotations or the scheduler name, you tell your pods to use our scheduler rather than the default Kubernetes scheduler; a minimal sketch of that is below.
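A minimal sketch of what that pod spec looks like; the scheduler name here is an assumption, since the actual value depends on how NavOps registers its scheduler in a given deployment:

```yaml
# Route one pod to a custom scheduler instead of the default.
# "navops-scheduler" is an assumed name, not necessarily what a
# real NavOps install registers.
apiVersion: v1
kind: Pod
metadata:
  name: batch-model-run
spec:
  schedulerName: navops-scheduler   # omit to fall back to the default scheduler
  containers:
  - name: model
    image: python:3.9
    command: ["python", "run_model.py"]
```

Because the scheduler is selected per pod, batch jobs can target the custom scheduler while everything else keeps using the default one.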
But that default Kubernetes scheduler still runs in the background, right? So if anything does happen, those jobs get passed to that default scheduler as well, so there's no single point of failure or anything there. This is what our pod looks like; we have a couple of different pods that we deploy. The way you can communicate with our application is either through a web UI, a CLI tool, or a REST API, which makes it fairly easy to work with. You can create new application profiles, set the priorities, and do lots of other things to make sure you're configuring our tool correctly. And then it's just a simple pod installation: kubectl apply, point it at your YAML file, done. What it gives you is advanced policies for Kubernetes. So here we can set up our workload priorities; we can have different proportions, and we can adjust those on the fly. If you're a UI person, you can go in and drag this around, right? And it'll automatically take resources from, for example, batch workloads and move those over to development. Or, if you don't want to use a UI, you can do that easily through the CLI as well, and you can also automate it within your workflow or anything else. And then once you enable and activate Command, you're going to have some type of solution like this, which is what we provided for IHME. Here you can see that we've got Command running on top of Kubernetes, and then we've got our Grid Engine stuff running the batch workloads on top of that, as well as, you can see, container A and container B, right? Different colors there. And we can run all of those on top of the same shared cluster, which then leads to much better resource usage and utilization. Prior to setting this all up, they were actually below 20% utilization: lots of resources that weren't being used and were just being wasted. And after getting everything set up, we're actually over 50% utilization, which is fairly good for an always-on cluster. Okay. So this is very simple. We've been really happy to work with Univa on rolling this out, and I'll just comment that this is still a pilot in my eyes, that we're still working through. And we love the fact that it's simplified our administration and made our environment really more flexible. And with that, I'm going to end it there and give people time for questions, if we have time for questions. Well, so NavOps runs on any Kubernetes deployment, right? It's just a pod that you deploy. So as far as VMware goes, we don't do anything on the VMware side, right? If you can run Kubernetes on top of it, whether it's Rancher, like IHME is using, or CoreOS, or some type of OpenShift, or something you've rolled yourself even, you just deploy our application on top of it, and it enables those additional features. Oh, that'd be a question for you, I guess. Sorry, could you repeat the question? Traditional workloads being the HPC workloads? Okay, cool. They are bare metal right now. Yeah, no, not there. The VMs are basically for the database servers and web servers and... Sorry, yeah. Exactly, yeah. Yeah, so as far as Command goes, it's actually not open source.
You can download it and run it on up to five nodes for free, right, and give it a shot and see what you think about it. We have some other open source projects out there, right? So if you're interested, you can go to GitHub and see what we've got. One of them is a broker for running Mesos frameworks on top of Kubernetes. So if you're using Hadoop or Kafka or anything like that, applications written against those frameworks, you can use URB, which is our broker, and just run those straight on Kubernetes, right? So you don't have to have multiple clusters. But as far as Command goes, it's not currently open source, though we're looking into something along those lines. Yes. Yeah, it does have preemption. Cameron, do you want to answer the question on the details around that? I work at Univa. I've been there quite a while, and I've done a lot of work on the Grid Engine part. And so, yeah, we do have preemption in NavOps Command, and it's following some of the policies and work that we've been doing in our HPC scheduler. In general, what we always say is: preemption is a great thing, but it's always really good to try to get to environments where you don't have to do it, and that's what we really try for with Command. We understand, if you look at how Kubernetes clusters are growing, that there are container events happening all day, and each one gives the scheduler a chance to fit the cluster utilization to the policy without having to preempt work. That being said, we do have the "well, I really need my thing to run" case, where we'll kick something out and let that really important thing get going. On the point of preemption, that is actually really critical for us, because at certain points in time we have certain analytical tools that just have to run and have to consume all the resources. So we do need it, and we do use it quite heavily. So right now, because it's a pilot, we allowed them to run their own containers. We're looking into ways to simplify this, because I firmly believe that the researchers should not have to be as technical as they are right now. Right. Yeah, we do right now. Yes. Okay. That would be awesome. Yes. I would love to. Right now we are experimenting with Kafka as the primary source; we're trying to re-architect everything at this point so that the data doesn't move as much and the compute moves instead. So as an interim, the simplest thing is what we do, but I don't think that's going to survive long-term. So, well, right now, in the pilot phase, we've only done jobs that are under 100 processes. So it's not that large, it's not like the 10,000 we're used to in main jobs, so it hasn't been an issue that we've seen pop up. Yeah. So it's definitely, I know it's an alpha feature that was made part of 1.8. I think one of the big things that Command is really doing from a prioritization perspective that's still not in Kubernetes is, you know, Kubernetes has the ability for you to set a priority on a pod and say this is a high-priority pod, but that's, you know, somebody setting that. You know, so you create your classes, and you could have some policy that sets them. What Command is really about is dynamically calculating the priorities of your pods over time, and those can change based on reconfiguration of the scheduler or additions of new workloads. So it's redone every 15 seconds: all of the pods are re-prioritized based on your current policy. So yes, there's definitely convergence, I think, in the Kubernetes community.
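For comparison, the Kubernetes-native mechanism being discussed, which was alpha in 1.8 at the time of this talk, is a static priority class referenced from the pod spec. Roughly, in the stable form it later reached:

```yaml
# Kubernetes-native pod priority: a class a user attaches to a pod.
# Shown in the scheduling.k8s.io/v1 form (stable since 1.14); the
# names and value are placeholders.
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: critical-batch
value: 1000000              # higher values win when resources are scarce
description: "For jobs that must run even if others get preempted."
---
apiVersion: v1
kind: Pod
metadata:
  name: urgent-model-run
spec:
  priorityClassName: critical-batch
  containers:
  - name: model
    image: python:3.9
```

The contrast drawn above is that this value is set once by a person, whereas Command recomputes pod priorities from policy every 15 seconds.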
We've actually been thinking about maybe doing something that's much closer to the Kubernetes scheduler, pretty much just manipulating the priority field, you know, with a controller or something. So there's definitely that. And then, yeah, with preemption, we've been looking at that as well. It's a little rough right now; you know, I know it's not respecting the pod disruption budget, which we do. Okay, yeah, so, yeah, I mean, it's coming along, right? I appreciate it. Okay, well, if you guys have any other questions, please come to our booth, it's S56. And thank you very much for attending. Really appreciate it. All right, thank you.