Live from Washington D.C., it's theCUBE, covering AWS Public Sector Summit 2018, brought to you by Amazon Web Services and its ecosystem partners.
Okay, welcome back everyone, we're here live in Washington D.C. for the Amazon Web Services Public Sector Summit. I'm John Furrier, with Stu Miniman. Our next guest is Dr. Swaine Chen, Senior Research Scientist, Infectious Diseases, at the Genome Institute of Singapore, and also an Assistant Professor of Medicine at the National University of Singapore. Great to have you on. I know you've been super busy, you were on stage yesterday, we tried to get you on today. Thanks for coming in and kind of bringing in to our two days of coverage here.
Thank you for having me, I'm very excited to be here.
So we were in between breaks here, and we were talking about some of the work around DNA sequencing, which is certainly fascinating. I know you've done some work there, but I want to talk first about your presence here at the Public Sector Summit. You were on stage; tell your story, because you have an interesting presentation around some of the cool things you're doing in the cloud. Take a minute to explain.
That's right. So one of the big things that's happening in genomics is that the rate of data acquisition is outstripping Moore's law. So for a single institute to try to keep up with compute for that, we really can't do it. And so that really is the big driver for us to move to cloud and why we're on AWS. And so then, of course, once we can do that, once we can sort of have this capacity, there are lots of things in my research, which is mostly on infectious diseases. So one of the things where really all of a sudden you've got a huge amount of data you need to process will be a case like an outbreak. And that just happens, it happens unexpectedly. So we had one of these that I talked about in the keynote yesterday, which was on Group B Streptococcus. This was a totally unexpected disease.
And so all of a sudden we had all this data we had to process and try to figure out what was going on with that outbreak. And unfortunately, we're pretty sure that there are going to be other outbreaks coming up in the future as well, and it's about just being able to be prepared for that. AWS helps us provide some of that capacity, and we're continuously trying to upgrade our analytics for that as well.
So give an example of kind of where this hits home for you, where it works. What is it doing for you specifically? Is it changing the timeframe? Is it changing the analysis? Where's the impact for you?
Yeah, so it's all of this, right? So it's all the sort of standard things that AWS is providing all the other companies, right? So it's cheaper for us to just pay for what we use, especially when we have super spiky workloads, like in the case of an outbreak, right? If all of a sudden we need to sort of take over the cluster internally, well, there's going to be a lot of people screaming about that, right? So we can kick that out to the cloud, just pay for what we use, and we don't have to sort of requisition all the hardware to do that. And so it really helps us along with these things. And it also gives us the capacity to think about, as data just comes in more and more, we start to think, you know, let's just increase our scale. This is something that's been happening sort of incessantly in science, incessantly in genomics. So just an example from my work in my lab: we're studying infectious diseases, we're studying mostly bacterial genomics, so the genomes of bacteria that cause infections. We've increased our scale 100x in the last four years in terms of the data sets that we're processing. And as we see the samples coming in, we're going to do another 10x in the next two years. We just really wouldn't have been able to do that on our current hardware.
Yeah, Dr. Chen, fascinating space.
For years there was a discussion of how much the cost to be able to do everything had gone down. But what's been fascinating is, you talked about that data outstripping Moore's law, and not only what you can do, but what you can do in collaboration with others now, because there are many others around the globe that are doing this. Just talk about that value of data and how the cloud enables that.
Yeah, so that's actually another great point. So genomics is very strongly into open source, especially in the academic community. Whenever we publish a paper, all the genomic data that's in that paper, it gets, uh-oh, whenever we publish...
Three minutes, cloud down. You got three minutes, go ahead.
Whenever we publish a paper, that data goes up and gets submitted to these public databases. So when I talk about 100x scale, that's really incorporating worldwide, globally, all the data that's present for that species. So as an example, I talked about Group B Streptococcus. Another bacteria we study a lot is E. coli, Escherichia coli. So that causes diarrhea, it causes urinary tract infections, bloodstream infections. When we pull down a data set locally in Singapore with 100, 200, 300 strains, we can now integrate that with a global database of 10,000, 20,000 strains and just gain a global perspective on that; we get higher resolution. And really, AWS helps us to pull in from these public databases and gives us the scale to burst out that processing of that many more strains.
So the DNA piece in your work, does that tie into this at all? I mean, obviously you've done a lot of work on the DNA side. Was that playing into this as well, the DNA work you've done in the past?
Yeah, so all the stuff that we're doing is DNA, basically. So there are other frontiers that have been explored quite a lot, so looking at RNA and looking at proteins and carbohydrates and lipids.
But at the Genome Institute of Singapore, we're very focused on the genetics and are mostly doing DNA.
How has the culture changed in academic communities with cloud computing? We're seeing sharing, certainly data sharing, as a key part. Can you talk about that dynamic and what's different now than it was, say, five to even 10 years ago?
Huh, I'd say that the academic community has always been pretty open. It's always been a very strong, open-source-compatible kind of community. So data was always supposed to be submitted to public databases. It didn't always happen, but I think as the data scale goes up and we see the value of sort of having a global perspective on infectious diseases and looking for the source of an outbreak, the imperative to share data... Right, looking at outbreaks like Ebola, where in the past people might try to hold data back because they wanted to publish that. But from a public health point of view, the imperative to share that data immediately is much stronger now that we see the value of having that out there. I would say that's one of the biggest changes: the imperative is there more.
Yeah, I agree. The academic people I talk to, they always want to share. It might not have been uploaded fast enough, so time is key. But I've got to ask you a personal question. Of all the work you've done, you've seen a lot of outbreaks, and this is kind of scary stuff. Have you had those aha moments or mind-blowing moments where you go, oh my God, we did that because of the cloud? I mean, can you point to some examples where it's like, that is awesome, that's great stuff?
Well, so we certainly have quite a few examples. I mean, outbreaks are just unexpected. Figuring out any of them and being able to have an impact, or sort of say, yeah, this is how this transmission is, or this is what the source is, this is how we should try to control this outbreak. I mean, all of those are great stories.
I would say that, you know, to be honest, we're still early in our transition to the cloud, and we're kind of running a hybrid environment right now. Like really, when we need to burst out, then we'll do that with the cloud. But, you know, for most of our examples so far, we're still in early days for our cloud.
So the spiking is the key value for you: when it hits, you burst out. So what excites you about the future of the technology, about what you believe we'll be able to do as we just accelerate, prices go down, access to more information, access to more? What do you think we're going to see in this field in the next, you know, one to three years?
Oh, I think one of the biggest changes that's going to happen is we're going to shift completely how we do, for example, outbreaks, right? We're going to shift completely how we do outbreak detection. It's already happening in the US and Europe. We're trying to implement this in Singapore as well. Basically, the way we detect outbreaks right now is we see a rise in the number of cases. You see it at the hospitals. You see a cluster of cases of people getting sick. And what defines a cluster? You kind of need enough of these cases that it sort of statistically goes above your baseline. But actually, when we look at genomic data, we can find clusters of outbreaks that are buried in the baseline, because we just have higher resolution. We can see the same bacteria causing infections in groups of people. It might be a small outbreak. It might be self-limited, but we can see this stuff happening and it's buried below the baseline. So this is really what's going to happen: you know, instead of waiting until a bunch of people get sick before you know that there's an outbreak, we're going to see that in the baseline, or as it's coming up with two, three, five cases, and we can save hundreds of infections.
And that's one of the things that's super exciting about moving towards the future, where sequencing is just going to be a lot cheaper. Sequencing will be faster. Yeah, it's a super exciting time.
And more research; again, a flywheel. More research should come over the top.
Exactly, exactly.
That's great work. Thanks for coming on theCUBE, really appreciate your time. Congratulations, great talk on the keynote yesterday. Really appreciate it. This is theCUBE bringing you all the action here as we close down our reporting. They're going to shut us down. theCUBE will go until they pull the plug, literally. Thanks for watching. I'm John Furrier, with Stu Miniman and Dave Vellante, at the Amazon Web Services Public Sector Summit. Thanks for watching.