Live from Vancouver, Canada, it's theCUBE, covering OpenStack Summit North America 2018, brought to you by Red Hat, the OpenStack Foundation, and its ecosystem partners.
The sun has come out, but we're still talking about a lot of clouds here at the OpenStack Summit 2018 in Vancouver. I'm Stu Miniman, with my co-host, John Troyer. Happy to welcome to the program the 2018 Super User Award winner, George Mihaiescu, who's the senior cloud architect with the Ontario Institute for Cancer Research, OICR. First of all, congratulations, and thank you so much for joining us.
Thank you.
So, cancer research is obviously one of those things we talk about when we ask how technology can really help people at a global level. Tell us a little about the organization first, before we get into the tech of it.
So, OICR is the largest cancer research institution in Canada and is funded by the government of Ontario. Located in Toronto, we support about 1,700 researchers, trainees, clinicians, and staff, and it's focused entirely on cancer research. It's located in a hub of cancer research in downtown Toronto, with Princess Margaret Hospital, SickKids Hospital, and Mount Sinai, very, very powerful research centers. OICR basically interconnects all these research centers and tries to bring them together to advance cancer research in the province, in Canada, and globally.
That's fantastic, George. So, with that, sketch out for us a little bit your role, the purview that you have and the scope of what you cover.
So, I was hired four years ago by OICR to design and build a cloud environment, based on a research grant that was awarded to a number of principal investigators in Canada, to build this cloud computing infrastructure that can be used by cancer researchers to do large-scale analysis.
What happens with cancer is that, because of the variety of mutations in cancer patients, researchers found that they cannot just analyze a few samples and draw a conclusion, because the conclusion wouldn't actually be valid. So, they needed to do large-scale research. And ICGC, the International Cancer Genome Consortium, is an organization made up of 17 countries that are collecting, donating, and analyzing data from cancer patients. They decided to put all this data together, align it uniformly using the same algorithm, and then analyze it using the same workflows, in order to draw conclusions that are valid across multiple data sets. They are focusing on the 50 most common types of cancer that affect most people in the world, and for each type of cancer, at least two countries collect and provide data. So, for brain cancer, let's say, you have data sets from two countries, and the same for melanoma, for skin cancer, and this basically gives you better confidence that the conclusion you draw is valid. The more pieces of the puzzle you throw on the table, the easier it is to see the big picture that is this cancer.
Yeah, George, I'm a former academic, and the more data you get, the more infrastructure you're going to have to have. I'm just reading off the announcement: 2,600 cores, 18 terabytes of RAM, 7.3 petabytes of storage. That's a lot of data, and it's accessed by a lot of different researchers. When you came in, was the decision to use OpenStack already made, or did you make that decision? And how was the cloud architected?
So, the decision was basically made to use open source. We wanted to spend the money on capacity, on hardware, on research, not on licensing and support.
A good use of everybody's tax dollars, yeah?
Exactly. If you have to spend money on licensing, then you probably have only half of the capacity you could have had, and that means fewer large analyses, and they take longer and cost more. So, Ceph for storing the data sets and OpenStack for the infrastructure-as-a-service offering was a no-brainer. My specialty was in OpenStack and Ceph; I started with OpenStack seven years ago. So I was hired to design and build it, and I had the chance to actually do alignment and mutation calling for some of the data sets, so I was able to monitor the kind of stress these workflows put on the system. When I designed it, I knew what was important and what to focus on. So, it's a cloud environment that's customized for cancer research. We have a very good ratio of RAM per CPU. We have very large local disks for the virtual machines, so they can download very large data sets. We built it so that if one compute node fails, you only impact the few workflows running there. We avoid single points of failure, and there are other tunings that we apply to the system too.
George, can you walk us through a little bit of the stack? What do you use? Do you build your own OpenStack, or do you get it from someone?
Yeah, so basically we use commodity hardware. We use high-density chassis, currently from Supermicro. Ubuntu for the operating system, no licensing there, and OpenStack from Debian packages. We focus more on stability, scalability, and internal support cost, because it's just myself and a colleague, Jared Baker, who's a cloud engineer, and we have to support this whole environment. So we tried to focus on the features that are most useful to our users and that put less strain on our time and support resources.
Yeah, let's talk about the scalability. The team is mostly you and a colleague, right?
And in the olden days, you would be taking care of maybe a handful of machines and maybe some disk arrays in the lab, but now you're basically servicing an entire infrastructure for all of Canada, right? At how many universities?
Well, basically it's global. We have 40 research projects from four continents: from Australia, from Israel, from China, from Europe, the US, Canada. Approved cancer researchers who can access the data open an account with us, they get a quota, and they start their virtual machines. They download the data sets from the S3 API of Ceph to the VMs and do their analysis, and we charge them for the time used. Because everything we use is open source and we don't pay any licensing fees, and because we are not for profit, we charge them just what it costs us to be able to replace the hardware when it fails.
Nice, nice. And these are actually very large machines, right? Because you've got big data sets that you have to compare all at once.
Yeah. A BAM file, as it's called, holds the normal DNA of the patient, and they also need the tumor DNA from the biopsy. An average whole genome sequence is about 150 gigabytes, so they need at least 300 gigabytes. Depending on the analysis, if they are finding the mutations, then the output is usually 5 to 10 gigabytes, so much smaller. For other workflows, you actually have to align the data, so you input 150 gigabytes and the output is 150 or a bit more with metadata. Nevertheless, you need very large storage for the virtual machines, and these are virtual machines that run very hard: you cannot do CPU oversubscription, and you cannot do memory oversubscription, when you have a workflow that runs for four days at 100% CPU.
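The sizing figures quoted in this part of the conversation can be turned into a small back-of-the-envelope calculation. The sketch below is only illustrative and is not OICR's tooling: the `Workflow` class and `schedulable_vcpus` helper are hypothetical names, the `cpu_allocation_ratio` parameter merely borrows the name of the Nova setting for oversubscription, and the 10 GB and 150 GB output sizes are the rough upper figures mentioned in the interview.

```python
from dataclasses import dataclass

GENOME_GB = 150  # average whole-genome BAM size quoted in the interview


@dataclass(frozen=True)
class Workflow:
    """Hypothetical description of one analysis job's disk footprint."""
    name: str
    input_gb: int
    output_gb: int

    def scratch_gb(self) -> int:
        # Local disk must hold the downloaded inputs plus the results.
        return self.input_gb + self.output_gb


def schedulable_vcpus(physical_cores: int, cpu_allocation_ratio: float = 1.0) -> int:
    """vCPUs that can be handed out for a given oversubscription ratio.

    A ratio of 1.0 means no oversubscription: one vCPU per physical core.
    """
    return int(physical_cores * cpu_allocation_ratio)


# Mutation calling reads the normal and tumor genomes, writes up to ~10 GB.
mutation_calling = Workflow("mutation calling", input_gb=2 * GENOME_GB, output_gb=10)
# Alignment reads one genome and writes roughly the same amount back.
alignment = Workflow("alignment", input_gb=GENOME_GB, output_gb=GENOME_GB)

if __name__ == "__main__":
    for wf in (mutation_calling, alignment):
        print(f"{wf.name}: at least {wf.scratch_gb()} GB of local disk")
    print(f"vCPUs without oversubscription: {schedulable_vcpus(2600)}")
```

With a ratio of 1.0, the 2,600 physical cores from the award announcement translate into at most 2,600 vCPUs, which is why workloads pinned at 100% CPU for days cannot be packed as densely as lightly loaded servers.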
So it's different from web-scale environments, where you have web servers running at 10% and you can do 10-to-1 oversubscription and go with much cheaper solutions. Here, you can only provide what you physically have.
That's great. Yeah, George, you said you've participated in the OpenStack community for about seven years now.
Yes.
Do you actually contribute code? What pieces are you active in within the community?
Yeah, so I'm not a developer. My background is networking, system administration, and security, but I've been involved in OpenStack since the beginning, before it was a foundation. I went to the first OpenStack public conference, in Boston seven years ago at the InterContinental Hotel, and over time I was involved in discussions on the IRC channel and the mailing lists, in support, and in reporting bugs. Even recently, we had a very interesting bug that affected us, where the cloud-init package that is supposed to resize the disk of the VM as it boots was not using more than two terabytes. So we reported this, and Scott Moser, who's the maintainer of the cloud-utils package, worked on the bug, and two days later we had a fix and Canonical built the package. It's in the latest Ubuntu cloud image, and Red Hat and everybody else is going to use the same version of the package. So somebody who now has VMs larger than two terabytes will be able to resize and use the entire disk when they boot. And that's just an example of how with open source you can achieve things that take much longer in a commercial distribution, where even if you pay, it doesn't necessarily mean you get that kind of response.
Yeah, sure. Well, so George, any lessons learned? You've been at this a long time, right?
And, like with Ceph, one thing we noticed today in the keynote is that a lot of the storage, the networking, the virtual storage and networking, and the compute weren't really talked about; those projects were maybe de-emphasized a bit as they talked about all the connectivity to everything else. So my point is that the OpenStack infrastructure is stable, but are there any lessons learned along the journey?
I think the lessons are that you can definitely build very affordable, useful, and scalable infrastructure, but you have to set your expectations right. We only use the OpenStack projects that we consider stable enough that we can support them confidently without spending too much time. If a project adds 5% of value to your offering but takes 80% of your time debugging and trying to get it working, and it doesn't have packages and is missing documentation and so on, that's maybe not a good fit for your environment if you don't have the manpower and it's not absolutely needed. Another very important lesson is that you have to really stay up to date. Go to the conferences, read emails on the mailing list, be active in the community. We host the OpenStack meetups in Toronto for 2018. We present there, we talk to other members. In these seven years, I've read thousands, tens of thousands of emails, so I learn from other users' experiences, and I try to help where I can. You have to be involved with the developers. I know the Ceph core developers, Sage and other people. You can't do this just by staying on the side and looking; you have to be involved.
Yeah, George, what are you looking for next from this community? You talked about the stability. Are there pieces that you're hoping reach that maturity threshold for yourselves, or new functionality that you're looking for down the road?
I think it's about what we want to provide to our researchers, because they don't run web-scale applications, so their needs are a little bit different.
We want to add Magnum to our environment to let them deploy Kubernetes clusters easily. We want to add Octavia to expose their services; even though they don't run many web services, you have to have a way to expose them when they do. Maybe Trove, database as a service. We'll see if we can deploy it safely and if it's stable enough. Anything that OpenStack comes up with, we basically look at it: Is it useful? Is it stable? Can we do it? And we try it.
So George, last thing: your group is the Super User of the Year. Can you just walk us through that journey? What led to the nomination? What does it mean to your team to win?
I think we are a bit surprised, because we are a very small team and our scale is not as large as T-Mobile's or the other members', but I think it shows something. For a big company to be able to deploy OpenStack at scale and make it work is maybe not very surprising, because they have the resources and a lot of manpower. But for a small institution or a smaller company to be able to do it without involving a vendor, without extra costs, I think that's what was appreciated by the community and by the OpenStack Foundation. And yeah, we are pretty excited to have won it.
All right, George, let me give you the final word. As somebody that's been involved with the community for a while, what would you say to people if they're still maybe looking from the outside, or have played with it a little bit? What tips would you give them?
I think that we are living proof that it can be done. If you wait until things are perfect, they never will be. Even Google has services in beta, Amazon has services in beta. You have to install it. OpenStack is much more performant and stable than when I started, when there were just a few projects, and you will definitely get help from the community, and the documentation is much better. Just go and do it; you won't regret it.
George, as we know, software will eventually work and hardware will eventually fail. So George Mihaiescu, congratulations to OICR on the Super User of the Year award. For John Troyer, I'm Stu Miniman. We're getting towards the end of day one of three days of wall-to-wall coverage here at OpenStack Summit 2018 in Vancouver. Thanks so much for watching theCUBE.
Thank you.