Thank you. What a way to end your day, talking about cancer. It doesn't quite do well at cocktail parties. Let me start out by telling you how I got interested, and then I'm going to go into what it's all about. When I was at Dell, there was a project at Arizona State University called TGen. TGen was basically doing molecular analysis of genomics in order to cure, or at least identify, some of the problems with pediatric cancer. There's nothing more important for open networking than to enable this kind of application. Everybody in this room has cancer. The reason you're not dying of it is that your killer cell, which is part of your immune system, is killing the cancer. But somewhere along the line, the killer cell either goes to sleep or cannot see the cancer. The cancer is cloaking itself. What we are going to be talking about is precision medicine. When we look back at this year in medicine, we're going to look at the way we practice oncology almost the way we look at bloodletting. And I'm not saying that negatively. Up until just recently, we did not understand how to work on solving this problem in our lifetime. We want to improve patient outcomes. I don't have to read it for you. But what's important is what I'm going to show you in the next couple of slides: how it works. It requires a network in order to create enough data to give the oncologist a view, and to let the cancer tumors speak to us and tell us what they need. And eventually, we'll be able to deal with the information. But the problem is that no one (I know LinkedIn was just here) realizes that genomic data is the second biggest big data issue that we all have. Number one is YouTube. Genomics is number two. But what we need to do is take this thing up here, and you can see it. That's the genome. You can look at it. You can see the helix. You see the ACGTs, the way that it's created.
That's really a blueprint that generates proteins. And from that helix and the 23 chromosomes, thus the name 23andMe, you're able to determine what a person is like. Everybody in this room, male and female: 99.9% of our genomic data is identical. It's that 0.01% that makes the difference in who we are, whether you're tall, whether you're female, whether you're going to be fat like I am, or short or tall. But we also need to take a look at it because, if you look at it, it's 3.3 billion nucleotide pairs. I'm going to show you how much, and it's coding 500,000 proteins. What this does is generate a lot of data. And from any one of these proteins, you can rebuild the human being that came out of that DNA. And the way it gets down there is something called RNA, or nucleic acids. And I'm not going to give you a cancer lesson. But what's important is that we need to give the oncologist, through all our technology, the best evidence-based information available. And what we found is that pharma genetics companies need to accelerate the drugs. At NantHealth, one of my heroes, Dr. Patrick Soon-Shiong, has just gotten into phase one with a cancer vaccine. We never thought we'd see that. He's able to build that because he can look at the unique biology and the specific health conditions of a patient. And it enables the patient to also participate in the curing of their disease. We have to speed up the analysis. And that means we have to take all this data and move it through the network extremely fast. And one of the things being used, which you just saw from LinkedIn, is Kafka. What that is, basically, is a producer-broker-consumer model where you can create virtual speeds that we never thought would be in the network. And you've got to be able to do the data flows right. And I'll talk about that in a second. So one of the things that is used is NiFi.
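As an aside for readers of the transcript, the producer-broker-consumer model mentioned above can be sketched in a few lines. This is a toy in-memory illustration only, not the Kafka client API: the class names `Broker`, `Producer`, and `Consumer`, the topic name, and the sample read strings are all hypothetical. It shows the one property that matters for the talk's argument: the broker keeps an append-only log per topic, and each consumer pulls batches at its own pace by tracking its own offset.

```python
from collections import defaultdict


class Broker:
    """Toy broker: one append-only log per topic (hypothetical, not Kafka)."""

    def __init__(self):
        self._logs = defaultdict(list)

    def append(self, topic, message):
        # Store the message and return its offset in the topic log.
        self._logs[topic].append(message)
        return len(self._logs[topic]) - 1

    def read(self, topic, offset, max_messages):
        # Return up to max_messages starting at the given offset.
        return self._logs[topic][offset:offset + max_messages]


class Producer:
    """Pushes messages to the broker."""

    def __init__(self, broker):
        self.broker = broker

    def send(self, topic, message):
        return self.broker.append(topic, message)


class Consumer:
    """Pulls messages from the broker and remembers how far it has read."""

    def __init__(self, broker, topic):
        self.broker = broker
        self.topic = topic
        self.offset = 0

    def poll(self, max_messages=2):
        batch = self.broker.read(self.topic, self.offset, max_messages)
        self.offset += len(batch)
        return batch


if __name__ == "__main__":
    broker = Broker()
    producer = Producer(broker)
    for chunk in ["ACGT", "TTGA", "CCAT"]:  # made-up sequence fragments
        producer.send("reads", chunk)

    consumer = Consumer(broker, "reads")
    print(consumer.poll())  # first batch of up to two messages
    print(consumer.poll())  # remaining message
```

Because the consumer owns its offset, a slow analysis stage never blocks the sequencer side; it simply polls again when it is ready.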
And people look at NiFi and say, well, it dupes running around talking about it. But what it does is enable you to do data flows. Anybody who thinks of the network as a plane and not as doing data flows has to start rethinking some of these applications. Cassandra is in the mix for key databases. Apache Spark is there so you can do in-memory manipulation of the data. And one of the things we want to do, once we get down to that 0.01%, which we will call the variant, is analyze it against a healthy gene, which is called a germline. And when you see the differences, you start to see where the discrepancy is. So what happens is you have sequencer machines. As you know, we've gone from $10,000 to sequence a DNA sequence 10 years ago to under $100. You have everything from Illumina all the way down to Oxford Nanopore, which just came out. But take a look at the sample size for a 50x. What that means is you take the DNA and you basically cut it into 50 slices. And the reason you're doing that is so that you can start the variant analysis, and I'm not going to go through how that happens. But it's 150 gigs of data per person sitting in this room. The BAM file is another output, and it's 150 gigabytes. And then you have a CRAM, which takes you down to 94 gigabytes, and it has something called lossy compression. And the reason lossy compression matters is that there can be many errors. So we don't want to use CRAM files. We'd rather get it down to a more usable mechanism that we can move across the network. And we want to do it in a pull model, not a push model. We then want to take it and do a mapping alignment. So once you get these 50 pieces, the first thing you do is apply what's called a hidden Markovian matrix and put it all back together. And when you do, what you want to look at is the variants, the DNA that is not identical. And then if you look at it, this says 3.3 billion, of which only 3 million are variants. So the thing you're fishing for is the variant.
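For readers, the idea of comparing a sample against a germline reference to find variants can be illustrated with a deliberately simplified sketch. It assumes the two sequences are already aligned and of equal length, which real variant callers do not assume; the function names `find_variants` and `variant_fraction` and the example strings are hypothetical. The second helper just divides the two figures quoted in the talk (3 million variants out of 3.3 billion pairs).

```python
def find_variants(germline, sample):
    """Return (position, germline_base, sample_base) for every mismatch.

    Simplifying assumption: both sequences are already aligned and have
    the same length, so only substitutions are detected.
    """
    if len(germline) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (i, g, s)
        for i, (g, s) in enumerate(zip(germline, sample))
        if g != s
    ]


def variant_fraction(variant_count, total_pairs):
    """Fraction of base pairs that differ from the reference."""
    return variant_count / total_pairs


if __name__ == "__main__":
    germline = "ACGTACGT"  # made-up reference fragment
    sample = "ACGAACGT"    # made-up sample with one substitution
    print(find_variants(germline, sample))
    # Using the figures quoted in the talk.
    print(f"{variant_fraction(3_000_000, 3_300_000_000):.4%}")
```

The point of the sketch is the data reduction: once the variants are isolated, the payload that has to cross the network is a tiny fraction of the raw 150-gigabyte output.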
But once you find the variant, you also want to look at sequencing the RNA, which is the acids that move down to the proteins. And you want to look for what is going on. And you also want to use images of the tumor. And you want to take biometric data, such as electronic medical records. And all this stuff has to move at the speed of real time. We don't have time to wait. The sequencing used to take three days. We're lucky if we can get it down to three hours. Some of the new processes get it down under an hour. And then once you get it, you've got to get it in line with the radiology, with the imaging, with the germlines, and put this all together. And it's moving over a big network. This is not just an HPC problem. This is an open networking problem. And the one thing about open source, and you heard Arpit say it, is that it came from the days of OSDL, when Linux first came out. The open source community can work hand in hand to make this a reality. So what you need is a data pipeline. You've heard of, and you see, Kafka and NiFi. You want to transmit these sections to a server. And the reason we went to NiFi is that it's lossless. You can't get lousy reception. And I hate the fact they call it lossy, because it sounds like lousy. And then you want to deal with data provenance, because you're moving all kinds of different types of data as fast as you can. The key is parallelism. We always knew it. But we didn't know what applications would have to drive it. So one of the companies using an FPGA is Edico Genome, down in San Diego. It has a chip called DRAGEN. And it takes the four hours down to under an hour. And then, because you can't afford hours, you've got to use Smith-Waterman to basically do quality control. And at the same time, you want to look at the fact that IBM just announced the first commercial quantum computer.
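For readers unfamiliar with Smith-Waterman, named above, here is a minimal textbook sketch of the local alignment algorithm. It is not the implementation used by any product mentioned in the talk. The scoring values (match 2, mismatch -1, linear gap penalty -2) and the function name `smith_waterman` are assumptions chosen for illustration; production tools typically use affine gap penalties and hardware acceleration.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return (best_score, aligned_a, aligned_b) for the best local alignment.

    Textbook dynamic programming with a linear gap penalty; the scoring
    parameters are illustrative assumptions.
    """
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    # Fill the scoring matrix; scores never drop below zero (local alignment).
    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            H[i][j] = max(0, diag, H[i - 1][j] + gap, H[i][j - 1] + gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the best cell until a zero score is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
        if H[i][j] == diag:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1

    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    print(smith_waterman("ACGT", "ACGT"))
```

The matrix fill is quadratic in the read lengths, and cells along each anti-diagonal do not depend on one another, so they can be computed in parallel. That dependency structure is why the algorithm is a common target for FPGA and other parallel hardware.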
Qubits are beautiful things, because the parallelism in the quantum entanglement really takes away the fabric, which you just saw. And by doing that and being able to create all this parallelism, all of a sudden you're speeding up to no end. When Ginni Rometty, the CEO of IBM, announced the IBM Q, which is a quantum computer, one of the things they talked about was using it for genomics. We actually believe there are great needs. The networking needs to be distributed. And one of the things that we're starting to look at, though we haven't yet, is blockchain fabricating so that you get more security. NiFi gives you great security with encryption. But you need speed, and that's where you guys come in. The open networking communities have to look at it not just as moving around general ledgers or any other accounting data. You've got to see that you're moving valuable data that saves people's lives. I can go through the molecular profiling. I'm sure this is going to be available to you. DNA is 3 billion base pairs, 20,000 genes. By the way, 14 million people in the United States today have cancer. And we get 7 million every month. So you've got to ask, is that growing exponentially? No, we're losing people. So we've got to win this war. The recent publication is the Genome Atlas, and I'm not going to go through that. So here's the problem: 20,000 genes and 3 billion DNA base pairs. Until recently, the scientists have focused on only 2%. And incorporating something called an a priori assumption captures the most common genes. We have to get everyone in the country sequenced. Right now, we have about 32 million people sequenced, in a country of over 350 million. So you can imagine what kind of data we're going to get and what kind of network needs we're going to have. And it's really also increasing because we've got to take the radiology, the imaging, and the electronic medical records, and put them all together. We've got to create something called longitudinal record access across multiple cases.
And we have to know the outcomes. So when you get a variant, you can see whether or not it affects people. A classic example: the guy who invented the concept of sequencing was a young guy. And when he did his own sequence, he found that he had a variant that really matched a lot of people who had prostate cancer. So he went to his doctor. And he said, I want to be checked for prostate cancer. And the guy says, we don't do PSA tests unless you're over 35 or 40 years old. This is the old medicine. Go to the prescription. Go to the concept. And believe it, oncologic doctors get very uptight when you want to use a new protocol. So all of a sudden, by looking at this, he got his PSA back. And guess what? He had a very aggressive cancer. And he had to get operated on, but it was caught very early. That is the beauty in the magic. This is not magic. This is a lot of computation, a lot of cognitive and machine learning, a lot of different sources being brought together. There are a couple of software companies that say, hey, we can build our databases. Databases are worthless unless they're proactive. So what are the technology issues? Transport, messaging and data graphing, and real-time transport and security. As of today, and you never know in this country, HIPAA is still in effect. And we want to make sure that patient confidentiality is there. SDN is required for aggregating the bandwidth, and also for dynamically routing and bringing up more bandwidth when it's needed through multiple ports and dynamic adjustments. Cognitive systems are inadequate today for analysis. I know IBM is going around saying Watson can do it. And Cassandra is the fastest mechanism we've found for data storage. Problems with the existing data? We've been collecting data for years. 4%, 6% of the data did not provide the proper DNA and RNA to be accurate without a germline, which is a cancer-free concept.
We have 69% false positives in the data. We're getting smarter and smarter in letting the tumors speak to us. We had 26% false negatives. If I took this to any one of my old bosses and said, I'm going to be wrong 69% of the time, and 26% of the time I'm going to tell you it's not there when it's there, I'd be shot. So we've got to create a bridge, and that's where you guys come in. We've got to take the open source technology and make it able to deal with the traffic demands of this amount of data. I looked at NFV. Blockchain is fascinating, if we can get it performing properly. Open SDN is absolutely important. What's happening is we're getting closer and closer to developing a cancer vaccine. It is in a phase one trial at the FDA right now, and it needs FDA approval to go through all four phases. But unless we get the technology to move the data (and if Dr. Soon-Shiong says, I've got it, that's great), you have 26 cancer centers in the United States, all getting samples, all getting real-time data. We need this group to really think about open sourcing things, not just for the businesses and trying to make networks better, but for actually solving a major problem in this world. Cancer 2020, if you don't know what that is, is the National Immunotherapy Coalition started by Joe Biden because his son Beau died of cancer. And we're seeking to accelerate the potential of combination immunotherapy, where most of this data comes from. So my whole message to you is this: when you come to this conference, whether you want to talk about OpenDaylight, NFV, whatever it is, you really should talk about the whole stack, because these are the kinds of applications that are going to change the world. My old friend Steve Jobs had a saying: those people who think they can change the world really do. Dr. Soon-Shiong is doing that in cancer. And I think this group working together, not just the carriers, but everybody else, can change the world.
So I could have gone very deep. I could have explained the problems with payers and pharmacies and doctors and patients. But I'd rather just show you what problem the world is facing. And with that, I want to thank you for your time, unless you have a question. If you want to talk about immunotherapy, we'll be glad to talk about that. But we'll take one or two questions.
Well, thanks very much for this inspiring presentation. I think curing cancer is something that's really exciting. Perhaps it motivates people more than, I don't know, helping the phone company save money. Although if you're the phone company, saving money is great. And cheaper phone calls and more services are good too. But I have a couple of questions for you, about a couple of things that I find very interesting in genomics. One of them is that there are at least two flavors of humans, XX and XY humans. There are differences at a cellular level. And actually male humans are maybe closer genetically to chimps than to female humans. That's sort of question number one. Question two is this idea of privacy of data. As you know, whenever you send something out over the network, well, the cat's out of the bag. You no longer control it. And so do you think that open networking and open source can help improve the situation in both of these areas? The first is considering all the different types, both types and other types of human cells, and possibly their differences, which we don't necessarily do enough of. And the other is dealing with this data privacy issue.
Data privacy is important. The reason you see NiFi splattered all over data flows is, you know, that it came from the NSA. They donated it to Apache. We can't unmask who donated it. But we can tell you that it was donated by the NSA. That's a joke, by the way, before I end up on TV. The one thing that we do know is that networking and privacy is going to be a big hot topic, especially in this.
As far as looking at chromosomes, proteins, RNAs, and DNA, and looking at the differences: your germline, if you're a female, is totally different from the germline of a male. But believe it or not, you still have 99.9% similarity in the way your genome is formed. And by the way, the genome is literally determined at the point of conception, not at the time of birth. What you're going to be and what you're going to grow to be was determined when the egg was fertilized. So I don't know how open source can help with that, but I know it can help with privacy.
Thank you.
OK. Thank you.