Great, thank you. It's wonderful to be here, and I'm happy to be speaking to the Data Science Institute. My academic background is in experimental physics, and what I found is that the data analysis techniques I learned in physics are highly applicable to so many of the situations we face in the real world. Data science wasn't really a field when I went to school, but it's emerged as a key driver of so much of our technology industry. Within my company we now have a data science department, about nine people, that covers computer vision, management of large-scale data structures, and a lot of data analysis on specific experiments, and I'll show you a little of that today. In this talk I'm going to give you a broad overview of Vium, the company, and I'll try to weave the data science pieces through it, and I'm happy to take this wherever you as a group think we should go. I wasn't quite sure how to target this, so there are no equations in this lecture, but there are a lot of charts and some data. So let's get into it.
First, a little background on the pharmaceutical industry, which is where I decided I wanted to start my next company. On the left, most of you are probably familiar with this chart: it's an illustration of Moore's law, and on this beautiful log scale you can see computing power, what we're actually plotting is MIPS per thousand dollars, going up and to the right. This is one of the most remarkable technological phenomena, even one of the most remarkable natural phenomena, that we know of: for something to grow at this rate, at this scale, over this many orders of magnitude. We can just look around and see how it's transformed our society and the things it's enabled. On the right is the corresponding chart for the pharmaceutical industry. It's again a log scale, and we're looking at how many drugs get produced per billion dollars. You can see that going back to 1950 it's down and to the right: every year it gets more and more expensive to develop new drugs. There are various hypothesized reasons for this. Perhaps it's that regulation is getting stronger. One of the most compelling is simply that we already have a lot of good drugs, so it gets increasingly harder to find the next good one; they call it the "better than the Beatles" problem. How do you make something that's better than the Beatles? It takes more and more money. There's actually some interesting structure here: if you look at the little uptick around 2000 where the slope changes, that's the biotechnology revolution, where companies like Genentech and Amgen gave us a new way to find drugs. Instead of just using small molecules, molecules with maybe a hundred atoms or fewer, we're now using proteins and antibodies and all those great things. But from a sustainability standpoint this is a terrible story. In fact it's been called Eroom's law, which is "Moore"
backwards, because it's going exactly the wrong way. One of the ideas behind founding the company was: how can we use all the great advances that have come out of Moore's law, on the left, to bend the curve on Eroom's law? That was the big space we stepped into when we were looking to start a new company. Now a little more about the drug discovery process. One way of thinking about it is this funnel view. You start on the left with, say, 10,000 compounds and do molecular tests on them: you have a hypothesis that this compound will bind to this target and modulate this pathway, and you test that in a test tube. Then you move on to cells and do what's called high-throughput screening, where big robots will take a single cell, put a drop of the drug on it, and look at what happens to the cell. Is the drug getting through the cell wall? Is it changing something in the cell? Is it killing the cell? That all feeds into the information, and in that way you screen out things that are just not interesting for various reasons.
Out of that you come up with a set of compounds, and then you go into animal studies. Here you have a hypothesis that a compound will affect some disease, and you want to test whether it's safe and effective in various organisms, because you're walking up the phylogenetic chain, closer and closer to humans. Then there's a big step where you do your first-in-man studies: you've built enough belief that these compounds are promising, and safe enough, that you can put them into people at a low dose. From there you advance through the phases of human clinical trials, testing on progressively larger populations and being more rigorous: is this drug not just safe, does it actually do something, and does it do something better than the drugs already on the market? If you get through all of that, you submit a big data package to the FDA, they review it, and they say go sell it, and that's a huge success. This whole funnel takes about 15 years to run and about two and a half billion dollars per drug, so it's a tremendously enormous endeavor. To give you an indication, last year I think there were 19 new drugs approved, so each one that makes it through is really a gem. There's an enormous R&D apparatus built up around doing this: about 200 billion dollars is spent every year running this pipeline, basically because the payoffs are so big; pharmaceuticals are a trillion-dollar industry. We looked at this, and our analysis was that there was a real opportunity in animal studies. It's a really high-leverage point, because the really expensive stuff is in human clinical trials. Because you're dealing with people, with safety and privacy and everything associated with that, the costs go through the roof; that part alone can cost more than a billion dollars. And so we
looked at it and said, what if we could make better decisions at the animal stage? One other thing I should add: compounds are failing all the way through here. Only about 10% of the compounds that go into human clinical trials end up getting approved, so it's mostly a story of failures, and you benefit if you can fail earlier, because then you aren't investing all that subsequent money. So we said: if we can make better decisions at the animal level and move that 10% to 11% or 12%, that would be pretty incredible from a financial standpoint, not to mention the human benefit. Many people who participate in clinical trials are in pretty desperate circumstances and are pinning a lot of hope on these drugs, and unfortunately most of the time they don't work. So if we can get better drugs into human clinical trials, there's a societal benefit, and there's a good business there. So we started digging into how animal studies work. People don't like animal studies very much: they're hard to do, the results aren't reproducible, and they're often corrupted by various things. On the right here, Nature has been on a tear about this; these are three headlines from Nature magazine, and it's just acknowledged that if you do a mouse study in one lab and someone else reruns it, it's pretty unlikely you'll get the same results. They're really hard to do. So we asked: why is that, what can we do to make it better, and how can we bring technology into it? We started touring these facilities to see how they run, and we teamed up with Cliff Roberts, the vet here at Berkeley (he was over at UCSF at the time), who took us through. I don't know how many of you have ever been in a mouse lab: you've got racks and racks, basically stainless-steel racks with plastic cages (I'll show you a picture in a bit), and oftentimes the data collection is really
really manual. It can be at the level of someone walking up to the cage, looking in, and saying, okay, that animal looks a little lethargic, and noting that down in a notebook, or scoring the animal a two and a half on a five-point scale. Maybe there are some rules that go with that, but they're pretty flexible, and we've actually seen that if you ask three people to grade the same animal, they'll all give different answers. So we thought this was a big opportunity for technology. To put it in context a little more: if you look across the drug development process, beginning on the left at the molecular and cellular stages, there have been huge technology investments and data activities that have come out of them. You're all familiar with the genome and the omics revolution, and whole industries have been built up out of that. Think about a company like Illumina, which has made better gene-sequencing technology, next-generation sequencing, they call it. That has allowed scientists to better understand the effects of genes on biological processes, and that feeds back to Illumina, who are then challenged to make better instruments. You get this virtuous cycle where the technology pushes the biology, which then pushes the technology back, and it just gets better and better. And because there's so much money involved, a lot gets invested, and you get these incredible tools; I think one of the few things that has exceeded Moore's law is the law governing gene sequencing, where the cost is plummeting even faster than Moore's law. If we look on the right side, a tremendous amount of data is also collected there. In human clinical trials, because they're so expensive and because you're putting people into situations where their safety has to be considered, you collect all the data you can. Increasingly, there are all kinds of
blood work that comes out of that, biomarkers, imaging; increasingly people are using digital health tools, activity monitors, things like that. That has led to a better understanding of patient segmentation: maybe my drug doesn't work in the whole population, but in this group it works really well, and if you can find that group, you've got something that helps people a lot. Finding the right drug for the patient is what's sometimes called precision medicine, and there's been a tremendous amount of advancement around it. But if you look in the middle, at this animal stage, it just hasn't gotten any love from anyone. There's been no technology investment in it. It's a field most people don't know about; they don't realize how important it is, how much money is spent on it, and that it's still really pretty archaic in the way things are run. So this is where we focused our efforts, and that was the founding idea behind Vium. We founded the company back in 2013, under a kind of cute name, Mousera, when we started, and we spent about three years in stealth. I should update this slide, because we're in 2017 now; we launched publicly last year, and I'll show you a lot from that in a bit. Over lunch we were talking about the interdisciplinary nature of a lot of the work that happens here. One of the amazing things that happens when you come into an adjacent field is that you often see things you can transfer, new ways of doing things. So when Joe Betts-LaCroix and I founded the company, we spent the first summer just writing IP, writing patents, because we'd look at something and say, well, why don't they do it like they do over in the semiconductor industry? Or geez, we can do that better; a lot of the time I couldn't believe they still did it like that. With this we built up a big invention portfolio, and what I'm going to show you today is really the leading edge
of that. These are the things we've spent the past three years developing; they're at a scientific quality where they're actually delivering value to scientists. But this is part of my pitch: if any of you are interested in coming to Vium, come talk to me. There's a lot more behind this, and we've probably got a decade of work to fill out the whole thing. So what have we done? We decided the first area that really needed transformation was data collection: doing away with subjective manual measurements and getting people out of the loop, because people stress out animals, they interfere with the experiment, and you get a different answer depending on which person is doing the measurement. Our approach was to take every low-cost sensor we could get our hands on, sensors that are cheap because they're in your phone or your computer or your car, consumer electronics. We really just went through the Digi-Key catalog and ordered everything we could, pointed it at a mouse cage, started collecting data, and looked for interesting things; I'll show you some of that. What we learned very quickly is that the most powerful low-cost sensor today is the HD video camera, which costs about $10 because it's in your cell phone. There are all kinds of capabilities around it; it's really a remarkable device. We have a number of other sensors, air sensors to measure temperature and water vapor, various gas sensors, and we've just developed an in-cage scale, but far and away our most powerful data stream is that HD video. One of the principles here was that we wanted to design this for scale. We didn't want a specialized piece of equipment that you put on your bench and have one or two of; there are 8 million mouse cages in the world, and we wanted something that could take over that entire space and become the new standard for data
collection. So we've got all this data from video cameras and other sensors going up into the cloud, and we run all of this in Amazon Web Services, which again is one of those technologies that makes it possible to do things you couldn't do five years ago. At any given time we've got five to ten thousand CPUs running, and our corpus is about four petabytes right now. If we had started this company earlier, we'd essentially be in the data center business; with the great APIs and everything, we just abstract that away and focus on our work. So the data goes up into the cloud, where we have an image-processing pipeline, which I'll tell you a little about, pulling features out of the video. We run various algorithms looking for patterns in the data, collecting and cleaning it, and then we present it back to the scientists through what we call our Research Suite. This is an online application where, from your laptop, you can design a study, run it, and analyze it, all from a web-based app, and I'll take a minute to show you how that works now. So this is our Research Suite, all browser-based over HTTPS. There's a study designer in here, which I won't spend a lot of time on: you select your study type, say I want to run a lupus study, I can give it a name, I've got the protocol, I can select the model, and there's a little wizard where I design the experiment and put it all together. When I get it right, we're in business: we quote you the price, I just submit it, and it starts the study right away. Once the study is underway it shows up on your dashboard, and here are a few different studies we have: an arthritis model, a lupus model, multiple sclerosis, a lung injury model. I can go into any one of these; let's look at this lung injury model. I've got a nice overview here of how the study is running, so I
can see exactly where my animals are: they're all through the acclimation phase and in the induction phase right now. What we're bringing here is transparency to researchers. Normally, if you wanted to know what was going on with a study, you'd have to call someone up or go down to the lab and look at it; here we're putting that information right in front of you. Below this we have two charts, what we call our hero charts. We customize them for every study type, but they're the two charts that most concisely show you what's going on with the study. What we're doing here: we've got a control group in black, and this blue group is being given a chemical called paraquat, which causes an injury to the lung that simulates many of the conditions you get in, say, chronic obstructive pulmonary disease or asthma, and also industrial-accident-type situations. What you can see here is the breathing rate: after we give them the paraquat, which happens right here at the dotted line, the breathing rate goes up and stays high, and you can see these nice tight error bars and beautiful separation between the groups. This is all collected automatically, in real time, so as the study goes on you see the new data points coming in along the way. That's something we've found is really important here: immediacy. Go back to thinking about that 15-year timeline, where we as an industry are spending $200 billion a year. If you can make decisions faster, if you can get the data faster and make your next choice, do I continue working on this compound or do I switch to another one, that has tremendous value, because it shortens up the whole timeline. On the right you can see motion, also collected from computer vision. You can see the night time in these dark bands, when they're active: active and inactive, active and inactive. Then we give them the paraquat, and the blue group is inactive for about three
days and then starts to recover, while the black group has the same control pattern all the way through. I won't spend a lot of time on the biology with this group, but from a physiology standpoint this gives you really important insights; then you give drugs and ask, can I change those effects one way or another? So that's the overview. Now I can dive down: these are the individual cages, so let's go into this group here. I click on this, and this is the in-depth view. Across the top I've got a timeline, and I can pull out my selector here; let's look at a little more of the data. Below this I'm looking at a zoom-in on that. Here is motion, which we're extracting from the video, and you can see the video over here on the left; I can click on it at any time. So I might say, what's that little blip right there? Let's see if I can get on top of it. And this is actually a rat; there, we can see the rat coming out of its little house and playing around, and you can blow that up if you want a good look. This allows a researcher to actually go back in time and say, hey, I noticed this thing; where did that come from, and what was the course of events leading up to it? Oftentimes in these experiments something unexpected will happen, and if you don't have this kind of record of it, you have no idea what it is, you speculate, and then you have to run another experiment to isolate it. Here, because we're capturing everything at the same time, you can go back and learn that, and then you don't have to run another experiment. That means you go faster, you use fewer animals; it's great for everyone. So here we're seeing the motion, and you can see these nice regular circadian patterns, active, less active, active, less active, and then the period of inactivity here in recovery. Here we're looking at breathing rate, which is also extracted from computer
vision, and I'll show you a little about how we do this: we essentially see the heaving of the chest walls, and we have ways of detecting that. Here you can see the baseline, with the occasional outlier, like when the rat is really active its breathing rate goes up, and then the change to a new state after it's given the paraquat. This observations panel is where we pull together all the other data around the experiment. Every one of these dots: if I hover over one, so here, it's hard to read, but that's an intratracheal dose administration, 0.3 milliliters, and the rat was put under anesthesia. Each dot represents either something that was done in the lab or something done on this interface; here you can see that someone went online and checked that there was enough water and enough food and that the animal looked all right. That's what all these dots are, or, increasingly, they're outputs from algorithms. Our general strategy is that first we use people, "artificial artificial intelligence" so to speak, to go and watch the animals and label things, and that serves as a training set to eventually develop the computer vision. So we've got a ton of labeled data, and really one of our bottlenecks is making sense of it. We bring together all sorts of data sources here. Let me see if I can pull it up: these are from the necropsy after the study was done; you can see the liver and the lung and the heart. So we've got a complete record of everything that happened around the study, all in one place, whereas previously it was in a notebook here, maybe a spreadsheet there, and this probably didn't get recorded at all. Effectively, where the data volume created in a typical experiment used to be on the order of kilobytes, we're collecting terabytes per study: just a huge
increase in the amount of data, which means a lot of different tools you can put to work on it. I'll show you a few other interesting things. This one is a really interesting observation we haven't had time to work on. Here we're looking at the temperature at the input of the cage and at the output, and you can see that the black line, the output, is higher: the output air is warmer. This is what you'd expect; these animals are little thermal generators, heating up the air. In fact there's a lot of structure here, and if you look at these little bumps, you'll find they correspond to periods of activity of the animal: there, you can see the animal is active, and it made that little bump. So we're fairly sure this is a metabolic readout of some sort, that by looking at this temperature difference we can get an indication of the energy expenditure of the animal. But I don't have anyone to do that project right now, so if any of you are looking for a project, this is a good one. One other thing we've brought into this is quality control, looking at, say, the illumination. This draws on techniques from the semiconductor industry, where you measure everything and you control-chart it to make sure it's right on target. We're doing that with these kinds of studies, bringing that kind of rigor to them so you eliminate sources of variability. One of the biggest sources of variability is that you're running your study and someone comes in during the night and turns on the lights; that can completely wreck a study. We measure that and make sure it never happens, or if it does, we can detect it. So in this way we can dive down deep into what's happening with a very particular animal in a particular cage. But what you often want to do in a research situation is compare: you want to understand, is my drug working better than my control group?
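Just to make that metabolic idea concrete: if you knew the cage ventilation rate, the inlet/outlet temperature difference would give a rough heat-output estimate via P = mass flow × specific heat × ΔT. This is a back-of-the-envelope sketch, not Vium's actual analysis; the flow rate and temperatures below are hypothetical illustrative values.

```python
# Rough heat-output estimate from the cage air temperature difference:
# P = mass_flow * c_p * dT. Constants are standard air properties;
# the ventilation rate is a made-up illustrative value.
AIR_DENSITY_KG_M3 = 1.2   # approximate density of room-temperature air
CP_AIR_J_KG_K = 1005.0    # specific heat of air at constant pressure

def heat_output_watts(flow_lpm, t_in_c, t_out_c):
    """Heat carried out of the cage by ventilation air, in watts."""
    flow_m3_s = flow_lpm / 1000.0 / 60.0          # L/min -> m^3/s
    mass_flow_kg_s = flow_m3_s * AIR_DENSITY_KG_M3
    return mass_flow_kg_s * CP_AIR_J_KG_K * (t_out_c - t_in_c)

# e.g. 30 L/min ventilation with outlet air 0.5 C warmer than inlet:
print(heat_output_watts(30, 22.0, 22.5))  # roughly 0.3 W
```

Integrated over a day, an estimate like this would give an energy-expenditure curve, which is presumably what makes those little activity bumps a metabolic readout.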
And that's where our Analytics Studio comes in. What we've done here is build a way to configure chart types really rapidly, on a per-study basis. Behind the scenes this is just writing Python code; I wish I could tell you exactly which library we're using, given the people here. We set this up, and then a researcher can come in and say, I want to look at the average daily breathing rates, the average nighttime motion, and the brain weight; I'll add those to my dashboard, and then every day come in, see how the study is going, and learn from that. Maybe in the course of the study they learn it's going well and want to extend it or change it in some way, or they've learned all they're going to and it's time to work on something else. It allows really rapid feedback, because this is all updating as the data comes in. So that is our app in a nutshell. We've taken a process that has traditionally been very hands-on, low-data, without very good tools around it; we've put it on the web, collected a lot more data, and given the end user tools to make sense of it. And this is what it actually looks like. You can see our lab here: we're under red light, because rodents can't see red light, it looks dark to them, and then we've got white LEDs at every cage position illuminating each cage. That helps the computer vision, because we get uniform illumination, and it's also much better for the animals. Traditionally you just illuminate with the room light, so the animals at the top get 10x more light than the animals at the bottom, and people have convinced themselves that doesn't matter. We say, let's just try to control it, let's do it a little better. All the electronics are in this little slab here, with cameras looking down and gas sensors going in and out. There's a little scale that
sits inside the cage and communicates wirelessly up to the slab, and then all of these are networked together to a little box, which you can kind of see in the back here. Each of these little boxes has a Raspberry Pi or two in it, attached to the cameras; they're networked into a switch box here, they all go into that switch, they're consolidated up through the roof, and they go up to the cloud, and I'll show you a little more about how that works in a moment. So we are what you'd call a full-stack startup: we're not just doing hardware, or just algorithms, or just services; we're doing the whole thing. That makes for a really interesting company and a really interesting experience, because if you look at the type of people involved, we've got our data science group, our hardware group, our software development group; we've got scientists with backgrounds in all kinds of different fields of biology; we've got people who are expert in animal handling, and veterinarians; and of course you need business people and salespeople and so on. It's really through the intersections and interactions between all these people that the great stuff happens. It's not very often that you get someone who's an expert in TensorFlow talking to someone who's an expert in multiple sclerosis, discussing, hey, how can we work together and do something neat? That's a big part of what's behind what we do. Since I unfortunately don't get to do much technical work anymore, I'll just give you a quick gloss on how some of this works. Here's our basic system architecture. On the left are all those cages and racks that I showed you, and they all come together at what we call a base station. The base station collects all the data, and it serves as a buffer: we're streaming about 15 gigabytes per cage per day, and we've got thousands of these things, so you can do the math. On this
base station we have a big, super-redundant storage device, and all the data gets written to it; it's basically an NFS mount. Then a set of processes on that storage device copies the data up to AWS. The reason we did it that way is that this is important data: if the fiber gets cut or the network goes down, we don't want to lose anything. That storage device is sized to hold about a week's worth of data, so we can buffer it up and then spool it out once we restore things after a loss of network connectivity. In AWS we have an architecture with a series of workers that process the data. The biggest data set we have is video, and we run a set of algorithms on it. We make heavy use of the spot instances on AWS. For those who aren't familiar: you can say, I want this particular machine for the next month, you'll have it locked down, and they'll charge you some rate for it. But if you don't need a particular machine at a particular time, they have a market where you can say, I'll bid this much for that machine, and if no one bids higher, you get it. The downside is they might, with just a minute's notice, tell you someone else has bid more and take that machine away. We use spot instances to run our computer vision algorithms, because it takes roughly one core per cage, and it would be cost-prohibitive otherwise. So there's a whole bunch of infrastructure around spinning up those instances, finding where they should be getting their data from, making sure the jobs get done, and finishing them up.
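The buffering arithmetic is easy to sanity-check. A minimal sketch, using the 15 GB/cage/day and one-week buffer figures from above; the cage count is a hypothetical example, not Vium's actual fleet size:

```python
GB_PER_CAGE_PER_DAY = 15  # video + sensor stream per cage (figure from the talk)

def buffer_size_tb(n_cages, days=7):
    """Local base-station storage needed to ride out `days` of lost uplink."""
    return n_cages * GB_PER_CAGE_PER_DAY * days / 1000.0

# A hypothetical facility with 2,000 cages:
print(buffer_size_tb(2000))  # -> 210.0 (TB for a one-week buffer)
```

The same arithmetic explains the spot-instance choice: at roughly one CPU core per cage for the vision algorithms, thousands of cages means thousands of cores running around the clock, which is exactly the workload where interruptible capacity pays off.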
Out of that come various data stores. We have a lot of time series data, and this is a place where, maybe afterwards, I'd really love to talk with people in the group about the state of the art in processing time series data, because it's a field I think is really important to us, and there may be some big opportunities to improve it. Then we make the data available through the Research Suite, and we have various ways to download it. As for the size of this whole thing: we were at 2.4 petabytes not too long ago, and it's growing really fast. One of my most nail-biting moments as a CEO is seeing the Amazon Web Services bill every month, because it just keeps going up. A typical study, here's one, 18 cages over 11 days, is about a terabyte and a half. So let me tell you a little about how we do the activity measure, which has evolved over time. Our very first way of measuring activity was this: the cameras had a setting for constant video quality, so we set that and then looked at the bit rate coming out, and that actually gave a pretty good indication of motion, basically how many pixels are changing on the screen. What we do today is use optical flow. You can see a typical image on the left: we run an optical flow algorithm, not dense optical flow, just on a grid like this, store the results, and do various processing on them afterwards. Out of that we get motion in centimeters per second, active periods and inactive periods. When you do this kind of thing, of course, you have to validate it. Here's some validation data where we took a little whirligig, a little motor that goes around in circles; we knew its exact speed, and we compared that to the speeds given by our optical flow. Nice correlations there, good high r-squared values. So that's one thing you can do with the video. With this optical flow map, another
thing we do is line the flow maps up into a time-space structure and look for regions with periodic motion. It's essentially a time-space Fourier transform: we do some filtering around it and find the right frequency, and with that we can see the breathing rate of the animals. As the chest heaves in and out, it causes little fluctuations in pixels, and we can detect those. It's amazing how sensitive this can be: you can look with your eye and not see anything happening, but with just a few pixels this can pick it up. Of course we had to validate this as well, and you can see a typical validation data set. To validate, you take the animal, put it in a tube, and measure the pressure in the tube as it breathes. Of course, the first thing that happens is that you've taken the animal out of its cage and put it in a tube, so it's scared out of its wits and breathing at a really high rate, which isn't good for science, because you've perturbed the experiment; you've made the observation process part of the experiment. With our approach we get all of this naturally, in the home cage, which is much more representative. I alluded to this a bit before: we're now bringing in TensorFlow, and using networks trained on ImageNet, to move from this optical-flow-based method to using neural nets as classifiers. Some of the things we're classifying are behaviors like: are they climbing on the ladder, are they at the running wheel, are they eating, are they drinking? Our general approach is that we begin by defining the metrics, then we label a corpus. We'll often hire high school students to come in for an afternoon and sit in front of a terminal clicking "that, that, that"; we built some tools for them. We take that corpus, train models on it, and iterate and iterate and iterate, and we're now getting
We're at the point now where we're getting some pretty good results out of that, and we've got it running on our big central infrastructure at AWS. Those pieces just haven't been released yet, so you'll have to come to Vium if you want to see them in detail, but there's a lot of really interesting work going on there.

To bring this all together, let me show you a few scientific results that come out of this, so you can understand exactly how it translates all the way through. Here's an example, a model of rheumatoid arthritis. This is a disease where you get inflammation and degradation of the joints, resulting in pain and loss of motion; a terrible disease that many people get at various stages of their lives. There's a model where you take a rat and inject a protein, collagen, and it causes an inflammatory response that mimics arthritis. Conventionally, you would take a pair of calipers like this and measure how thick the rat's paw is, maybe three times a week. Every time you do that you stress out the animal, and depending on how tightly you squeeze the calipers and whether you measure here or here, like this or like that, you get variable answers. What we said is, wouldn't it be better to look at how the animals are moving? After all, that's what you care about: if I have arthritis and my joints are swelling but I can still move around, that's better than the reverse. So we developed metrics that track almost exactly with the caliper measurement, and we tried a number of different statistics. We did train a network on this, and we got a network that matched really well, but it's not what we use, because we couldn't explain why it worked; when you're talking to scientists, and people are making decisions that affect human safety, you want to be sure you can explain what's going on. Instead we found a statistic: we looked at the maximum velocity and took the top 25 maximum-velocity events during the day.
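That top-25 statistic can be sketched in a few lines. Averaging the k largest velocity samples per day is my reading of the aggregation, and the data below are synthetic, not from any study:

```python
import numpy as np

def top_k_velocity_by_day(velocities, day_index, k=25):
    """Daily statistic: the mean of the k largest velocity samples each day,
    a proxy for peak mobility that is robust to long rest periods."""
    out = {}
    for day in np.unique(day_index):
        v = np.sort(velocities[day_index == day])
        out[int(day)] = v[-k:].mean()
    return out

# Synthetic demo: on day 0 a healthy animal has fast bursts; on day 1
# swollen joints cap its speed.
rng = np.random.default_rng(1)
velocities = np.concatenate([rng.uniform(0, 30, 1000),   # day 0, cm/s
                             rng.uniform(0, 10, 1000)])  # day 1, cm/s
day_index = np.repeat([0, 1], 1000)

stats = top_k_velocity_by_day(velocities, day_index)
print(stats)  # day 0 mean near 30, day 1 mean near 10
```

Averaging the top events rather than taking the single maximum makes the statistic less sensitive to one-off tracking glitches.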
When we charted that, it's what you see in orange, and it lays over the blue caliper curve very well; if anything it's a little more sensitive, detecting the onset of arthritis a day or so earlier. Then you run all kinds of standard-of-care drugs. These are several different drugs used to treat arthritis, dexamethasone, Enbrel, ibuprofen, methotrexate, and you can see we're getting good correlations between them, which validates that what we're measuring actually matters. On the right, we compare to the gold standard: when the experiment is over, you dissect the animals, look at the joints under a microscope, and have a trained pathologist look for wear, inflammation, and things like that. You can see that our motion metric actually has better concordance with the conventional histopathology: if you compare the orange to the light gray, they often line up better than the dark gray does with the light gray, which is the conventional way of doing it. So we're giving a deeper and more accurate view into things.

And this is all done automatically, which means I no longer need people making those measurements every day, measurements which weren't very good anyway. It means I can run experiments in a more hands-off way, and larger experiments too: previously, to get consistent results, you'd need one person to make all the measurements, and automating that removes the roadblock. That means you can test more drugs, make decisions faster, and move them through the pipeline faster.

Maybe just to close, and I've got a lot of these examples we could go into, let me talk about another application: humane endpoint prediction, which is a neat piece of predictive analytics.
One of the biggest problems you have in an animal study is that you come in in the morning and find an animal dead at the bottom of the cage. That's bad for a couple of reasons. First, when an animal dies you usually want to do a necropsy, take blood, look at the tissue, because that tells you a lot about the biology that's going on; if you find the animal already dead you can't, because a cellular breakdown process occurs, and unless you get to it within about 30 minutes the data isn't very good. Second, if you're giving a drug to this animal, you now have a problem because you don't know whether your drug caused the death, and that's a big safety issue: how are you going to feel confident taking this drug into a person if you found an animal dead? So we looked at the various signals we collect, and we were able to develop ways to predict when an animal was going to be found dead, the night before, even two nights before. Here's a little experiment on the right, actually a brain cancer study, glioblastoma, a terrible disease. Fifteen animals were found dead, and for ten of those our analytics showed a very clear signature, so we could have predicted it, sacrificed those animals properly, and gotten all the additional data that's useful for understanding the disease. In the middle are animals that reached a humane endpoint; typically a veterinarian will look and say this animal looks healthy, this animal is sick and really needs to be euthanized. We detected those the same way, and on the healthy animals we were 100 percent correct. So we're developing a tool that allows you to care for the animals better, and that's important for a number of reasons.
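The talk doesn't say what model sits behind the endpoint prediction, so this is only a hedged illustration: a plain logistic regression on two hypothetical nightly features (activity level and breathing rate), trained on synthetic data, to show the shape of such a predictor:

```python
import numpy as np

def train_logistic(X, y, lr=0.1, steps=2000):
    """Gradient-descent logistic regression: nightly features -> P(endpoint)."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
        w -= lr * (X.T @ (p - y)) / len(y)
        b -= lr * (p - y).mean()
    return w, b

def predict(X, w, b):
    return (X @ w + b > 0).astype(int)  # threshold at P = 0.5

# Synthetic demo: hypothetical features [activity cm/s, breathing Hz];
# declining animals move less and breathe faster.
rng = np.random.default_rng(2)
n = 200
healthy = np.column_stack([rng.normal(12, 2, n), rng.normal(3.0, 0.5, n)])
declining = np.column_stack([rng.normal(4, 2, n), rng.normal(5.0, 0.5, n)])
X = np.vstack([healthy, declining])
X = (X - X.mean(axis=0)) / X.std(axis=0)   # standardize features
y = np.concatenate([np.zeros(n), np.ones(n)])

w, b = train_logistic(X, y)
accuracy = (predict(X, w, b) == y).mean()
print(accuracy)  # training accuracy, near 1.0
```

A linear model like this also keeps the explainability property discussed earlier: the sign and size of each weight say which signal drives the prediction.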
I'll just close here by talking a little about animal welfare, because I could see people squirming when I was talking about animals dying and necropsies. For me, getting into this was a big step, and I did it because I believe this work is important; it's behind pretty much every medical advance that we all benefit from. But that means it's incumbent on us, as a society and as researchers, to do the best we can. There's a set of principles around that called the three Rs: replacement, reduction, and refinement. Replacement means that wherever you can, you should replace higher species with lower species: if you can learn it from a rat, you shouldn't do it in a dog, and if you can learn it from a computer, you shouldn't do it in an animal at all. Refinement means you should refine how you do experiments, to care for the animals better and get as much information as you can out of each one. And reduction means you should reduce the number of animals you use. We're able to do all of those things. Because we have a much more sensitive measure, we can often do experiments with smaller group sizes; to give you an example, in a case where conventionally you'd need seven animals to get statistically significant results, we can do it with two, and that's a huge reduction in animals. By watching the animals much more closely, we can tell when they need veterinary care, when they're in distress; rather than waiting until the next day, the system sends an alert, the veterinarian pulls it up on her cell phone, looks at it, and says this is what needs to be done, and we have a very tight loop around that. And we're getting more information out of rodents, which sometimes makes it less necessary to do studies in higher species.

So to close out, the net of all this: I showed you at the beginning that 15 years and two and a half billion dollars per approved drug. Because we're able to run studies more effectively at scale, much of the cellular work we can do earlier.
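On the seven-versus-two group-size point: smaller groups follow from a larger standardized effect size (same biology, less measurement noise). Here is the standard two-sample sample-size formula, using only the standard library; the effect sizes 1.5 and 3.0 are illustrative assumptions chosen to reproduce the 7-versus-2 arithmetic, not measured values:

```python
from math import ceil
from statistics import NormalDist

def n_per_group(effect_size_d, alpha=0.05, power=0.8):
    """Normal-approximation sample size for a two-sample comparison:
    n = 2 * (z_{1-alpha/2} + z_{power})^2 / d^2 animals per group."""
    z_a = NormalDist().inv_cdf(1 - alpha / 2)
    z_b = NormalDist().inv_cdf(power)
    return ceil(2 * (z_a + z_b) ** 2 / effect_size_d ** 2)

print(n_per_group(1.5))  # noisier manual metric -> 7 per group
print(n_per_group(3.0))  # more sensitive automated metric -> 2 per group
```

Because required n scales as 1 over the square of the effect size, halving the measurement noise roughly quarters the number of animals needed.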
We can test more compounds, which gives you better information earlier; we're able to do the animal studies more efficiently and make better judgments about whether a particular drug is working. So we shorten up that whole part of the pipeline, and then we get better compounds into the human clinical trials. The result of all this is that we're trying to pull in that 15-year timeline so that we can get better therapies to market faster. I'll just leave it there, happy to answer any questions. Thank you.