Well, I'm delighted to be here today. The reason you haven't heard about this very much before, if at all, is that this is a new system and it just got launched in November. We had done six months of solid testing. We had worked with Dr. Hegman, but it's just being rolled out, and of course with the holidays and everything else it is only now catching on and gathering a lot of steam. We are teaching all of the medical students to use this as part of one of their third-year rotations. Speed was in one of the first cohorts of medical students we taught it to, and he's really a huge proponent, and we have other folks who are also huge proponents as we go further.
The FURTHeR system stands for Federated Utah Research and Translational Health Electronic Repository, and its purpose is to support clinical and translational research. It joins data from multiple sources. It is for retrospective study, although people are saying they could use it in prospective studies. For example, when they have a clinical trial going, they can look on an ongoing basis to see how many more patients have come in that they could then contact to enroll in the trial. So it's got a lot more capability, depending upon your curiosity and your ability to search it.
We also have ongoing work, and it is all being done not under my hat as department chair in Biomedical Informatics, which is a department in the School of Medicine, but under my associate vice president hat. Dr. Betts asked me about four years ago, would I please take on this role and work, as one of my very top priorities, to make clinical data accessible to the research community. So there is actually a lot of hard work under the hood to make all of this match up, to make sure we're querying the right sources, and to make sure it matches up with the electronic medical record folks and the UPDB. But it's a fairly simple front end. It doesn't do everything you want. It gets you started, and I think that's the big thing.
So the current state is this: aggregate counts across data sources for pre-research questions, plus unique and in-common patients across these data sources. It is new, as I said, and really the reason that you do cohort searching, as Barbara was saying, is that it's the way you start asking, do I have a big enough patient population to support my clinical study? Actually, there have been studies which show that about 60% of clinical trials do not get enough patients to complete what they anticipated would be an adequate patient number to finish their study. And we did a study here at the University of Utah, looking over five years of retrospective data at the IRB, and 75% of the studies here were either delayed or did not fulfill their requirements for getting enough patients in their cohort. Some portion of that is because people have not had a way to figure out what the right criteria are to ask for.
So we did this originally because of the CTSA, the Clinical and Translational Science Awards. That's a large national award. We have one here at the University of Utah. There are about 16 medical schools who have them. And that award sits in the middle of our Center for Clinical and Translational Studies. As part of this, we envisioned being able to search not only University of Utah hospitals and clinics patients, but also Huntsman Cancer Institute patients, Intermountain Healthcare, the VA, the Utah Department of Health, and other partners as we go forward. We said, we have such data resources here in the Utah area, let's search across all of them. And we are in the process of doing that, across more than just clinical data: also genotypes, phenotypes, public health data, and pedigree data. We probably have more of all of this combined in Utah than anywhere else in the United States. But it's hard to get access to. So right at the moment, we have two data sources that you can search from your desktop, and I'll show you some of these today.
So first of all, there's the Enterprise Data Warehouse, which sits at the University of Utah. We have data back to 1994: 1.5 million patients, lab results, medical orders, diagnoses, and procedures. So when you search, you're searching across all of these records of 1.5 million patients. And then sometime in the month of March, we're adding encounters. And because of all of the new Epic systems, we're bringing in all the clinic data in the encounter. So there will be a whole host of new things. It changes all the time and will be bigger.
And then we're searching across the UPDB Limited database. That's the Utah Population Database. That's got 6.5 million individuals. It goes into hospital claims data across the state, 1996 to present, and gets you into cause of death. These sit in the Vital Statistics Registry at the State of Utah in their Department of Health. It has birth certificates and pedigree quality, and it has a master subject index that they've developed in that group over the years, which is pretty amazing. As they link all of these records together, you want to know: are these unique patients, do these records belong to the same patient, or are these different patients? So they've worked pretty hard to develop that. And we use that in FURTHeR when searching across these sources to say who is a unique patient and which ones are duplicates. Right now, we use it particularly across the University and Intermountain to help make sure we know which is a unique patient. And we're working to bring Intermountain live, but it's not live yet.
So what happens? You can get online. Has anybody in here searched the UPDBL in person? Two people. I think that it's a very useful resource. And especially if you're looking at these data sources and only these data sources, then I certainly recommend you get on there. It's very useful for cohort finding.
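The role the talk gives the master subject index, deciding whether records from different sources belong to the same person, can be pictured with a toy lookup. This is only a sketch under assumptions: the names `count_unique_patients` and `master_index`, the `(source, local_id)` record shape, and the treatment of unindexed records are all hypothetical, and the real UPDB linkage is far more involved than a dictionary.

```python
def count_unique_patients(records, master_index):
    """Count distinct people among (source, local_id) records.

    master_index maps (source, local_id) to a shared master ID.
    Assumption: a record missing from the index counts as its own person.
    """
    master_ids = set()
    for source, local_id in records:
        key = (source, local_id)
        # Fall back to the record's own key when no link is known.
        master_ids.add(master_index.get(key, key))
    return len(master_ids)


# Hypothetical toy data: two records link to the same person "M1".
toy_index = {
    ("EDW", "e1"): "M1",
    ("UPDB", "u7"): "M1",
    ("EDW", "e2"): "M2",
}
toy_records = [("EDW", "e1"), ("UPDB", "u7"), ("EDW", "e2")]
print(count_unique_patients(toy_records, toy_index))
```

Three records collapse to two people here because the EDW and UPDB entries for "M1" are recognized as duplicates.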
But what FURTHeR does is search all of the UPDBL and the EDW data on top of that and get you cohort statistics. It does not right now get you record-level patient data. It may or may not in the future. And the reason it may or may not is that the IRB is watching us pretty closely and making sure what we're doing is totally HIPAA compliant and in line with regulations. So right now what you have to do is take these patient counts that you see, and when you have finished fiddling with all of these parameters to say these are the inclusion and exclusion criteria for my cohort, you take it, you write an IRB, and you get permission to do it. Then you pull that data. Right now you have to go back to the UPDBL and the EDW to jointly pull that data, get yourself a set of data, and work with it from that point on, because at that point it's identified and the IRB has approved it.
For the environment, we use a system, and I'll show you the front-end query tool. It's called i2b2. It's a query tool which has been developed by Harvard. They have a National Center for Biomedical Computing serving the country, and they've developed this front-end query tool, which is pretty zippy. Actually, it doesn't do everything you want, but that's where we're starting. As I said, we have two data sources, we federate by patient ID, and we do aggregate queries with some union and intersection across these data types.
So here's what it looks like, and I put the URL right up there so you can get online at any time you want to. You start over here where it says register for an account. That's the very first thing you do, and anybody who has a UID can register for an account.
After you register, you have to wait to get an email back saying we confirm your registration. It doesn't take too long, but you can't just immediately jump in. You have to wait a little while to get an email confirmation, and then you can start querying. I will do some querying in just a moment, but I have to tell you, once you pull it up, you need to read this data use agreement. The data use agreement is there because the IRB has said please make sure this is on there, and you can't get an account until you agree to it. Now, all the time on my laptop I'm downloading things and saying I agree, I agree, I agree, but this one is pretty important: it confirms that it is really you doing the searches, and they want you to use this data to provide information for proposals or project development preparatory to conducting your research. So it's not meant to be frivolous, but it is a pre-research tool.
The biggest thing is "I will not attempt to identify any individual represented in the query system," and we have gone through all kinds of statistical rules and regulations to make sure that what you see are large numbers. You'll see asterisks in various places. That means if you have a cell which is five or fewer, it won't tell you the number. It gives you an asterisk, because they feel that if you do a series of queries you can start identifying people, and the IRB and our regulation folks don't like that at all.
So I'm going to get online here and I'm going to do diabetic cataract. I talked to Barb ahead of time and she gave me some clues as to what might make sense, so I've done several of them. This is where you do your query, and this is how you find your terms. You also have a list of previous queries and a workplace where you can save everything you do. The previous queries will be kept for maybe a week or two, but then you'll lose them. If you want to keep them, you have to drag them up and save them.
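The asterisk rule described above, where a cell of five or fewer patients is masked, can be sketched as a small display function. This is an illustration, not the system's actual code: the names `SUPPRESSION_THRESHOLD`, `display_count`, and `display_table` are hypothetical, and masking a count of zero is an assumption drawn from reading "five or fewer" literally.

```python
SUPPRESSION_THRESHOLD = 5  # from the talk: cells of five or fewer are masked


def display_count(count: int) -> str:
    """Return the count as text, or '*' when the cell is too small to show."""
    if count < 0:
        raise ValueError("count must be non-negative")
    # Assumption: zero is also masked, since 0 <= 5.
    return "*" if count <= SUPPRESSION_THRESHOLD else str(count)


def display_table(cells: dict) -> dict:
    """Apply the masking rule to every cell of a demographics breakdown."""
    return {label: display_count(n) for label, n in cells.items()}


# Hypothetical breakdown: the small "other" cell is hidden.
print(display_table({"male": 412, "female": 398, "other": 3}))
```

Masking each cell separately does not by itself stop someone from differencing a series of queries, which is the concern the talk raises; that would need additional safeguards not shown here.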
I'm going to do it by finding terms, and this is simply something which says search by names, and this says contains. Now you could do other than contains, but this is pretty good. There's diabetic cataract. It is searching now by ICD-9 codes, and you can take that and drag it over here. You could add more terms: if you put more and more in this same group, it would be diabetic cataract OR macular degeneration or something, and you would look for all of those. If you do it across these groups, it's AND. But I'm just going to run this query with the single term, and I say, tell me who in the Enterprise Data Warehouse or the UPDB Limited has diabetic cataract. Now it's in. It doesn't fit all on one screen as well as it did before, but maybe it does on that one.
Down here it's executing the query, and it'll tell you how many seconds. I do have to say that you have to be careful when you do this. Sometimes you'll launch a query which takes a long time. This is not a system like Epic or Cerner, and you're not seeing patients when you do this. You're sitting at your desk and you're thinking, I wonder how many diabetic cataract patients we have. You're not in a patient care situation. You're contemplating your research situation.
While it's doing that, I want to show you some other things, if I can move my screen around. There is a help screen which will give you some assistance and an introduction, but the help videos, I think, are quite useful. I would start there: the overview, or how you select terms, or building a query. I think they're really quite useful for helping you get started. This tells you about the analysis tools. I'll show you how to do that in just a moment, and how you use previous queries in your workplace. It definitely says that i2b2 comes from Harvard Medical School, but we did some modifications and we're using it here. This is the only one I'm really going to run live.
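The grouping rule mentioned above, OR for terms placed in the same group and AND across groups, can be pictured with sets of patient IDs. This is a toy sketch, not how i2b2 is implemented: `run_query`, `patients_by_term`, and the sample IDs are all hypothetical.

```python
def run_query(groups, patients_by_term):
    """Evaluate a query given as a list of groups, each a list of terms.

    Terms inside one group are ORed (set union); the groups themselves
    are ANDed (set intersection). An empty query matches nobody.
    """
    result = None
    for group in groups:
        matched = set()
        for term in group:
            matched |= patients_by_term.get(term, set())
        result = matched if result is None else result & matched
    return result if result is not None else set()


# Hypothetical patient IDs per diagnosis term.
toy_terms = {
    "diabetic cataract": {1, 2, 3},
    "macular degeneration": {3, 4},
    "glaucoma": {2, 4, 5},
}

# One group with two terms: diabetic cataract OR macular degeneration.
either = run_query([["diabetic cataract", "macular degeneration"]], toy_terms)

# Two groups: (diabetic cataract OR macular degeneration) AND glaucoma.
both = run_query(
    [["diabetic cataract", "macular degeneration"], ["glaucoma"]], toy_terms
)
print(sorted(either), sorted(both))
```

Adding a term to an existing group can only widen the cohort, while adding a new group can only narrow it.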
I have some saved ones so that you can see them a little bit better and blow them up. Down here, it has executed your query. It took 120 seconds to do that, so it took two minutes. It searched across six and a half million records in all of the ICD codes. It found that you've got 1,223 patients in the EDW, which is most of them. You've got diabetic cataract in that many patients, and you have to say, is that enough to do a study or not? It's a pretty big number. It only found 34 of them in the UPDB, and there are nine patients in common between the two, so for unique patients, there are 1,248.
Let me show you how you could say, tell me a little bit more about those patients. It says load a plug-in, but we only have one plug-in, and it's a demographics plug-in right now. First of all, I decided to compare diabetic cataract. We found 1,020. Wait a minute. Well, I don't know what it was. Anyway, we found some. But if, instead of that, you were going to do diabetes and cataract together, so you search diabetes mellitus and cataract, then what you find is 10,000. The suggestion, if you were going to do a study, is that not only would you want to find those people with a definite diagnosis of diabetic cataract, but you would also want to look at that other group. Maybe there were some with diabetic cataract who weren't actually given that specific diagnosis, but they've got both of those diagnoses together, and maybe you've got a larger group than you saw before.
So what you see over here is that all I did was to choose cataract. I chose the entire group of cataracts, 366, dragged it over there, and did this particular query. And then you can see how many unique patients you have along the way and how many patients in common, which is coded over here. So you could add that to your query list, and if in fact you have a new group, you could keep adding things to your query list.
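The diabetic cataract numbers quoted above are consistent with simple inclusion-exclusion: patients found in both sources are counted once. A minimal sketch, reading the 1,223 as the EDW count and using a hypothetical helper name `unique_patients`:

```python
def unique_patients(count_a: int, count_b: int, in_common: int) -> int:
    """Unique patients across two sources: |A or B| = |A| + |B| - |A and B|."""
    if in_common > min(count_a, count_b):
        raise ValueError("overlap cannot exceed the smaller source count")
    return count_a + count_b - in_common


# Figures from the talk: 1,223 in the EDW, 34 in the UPDB, 9 in common.
print(unique_patients(1223, 34, 9))  # 1248, matching the figure quoted above
```

The same identity can be rearranged to recover the overlap when only the per-source and unique counts are shown.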
So it doesn't just have to be three. It could be five, six, or seven, and you can search for other things as well, and I've got some examples of that. I think what I did was to load your... well, here's a glaucoma example that goes on for several of them. So I chose glaucoma over here, queried that, and then I added to that: how many do you have with organic sleep apnea and glaucoma? And what you find here is that you've got 160 patients in the UPDB, 465 in the EDW, and for unique patients, you've got 605 of them. After submitting that and going forward, I added some more things to it. I said, there are other kinds of sleep apnea: unspecified sleep apnea, hypersomnia with sleep apnea, insomnia with sleep apnea. So you can find other diagnoses to add, and then you can ask for age categories. In this particular case I said between the ages of 30 and 89. Those are demographics age categories. And it took longer to run, but what you found is 308 in the UPDBL, 806 here in the data warehouse, and 1,075 patients overall.
Now, if the demographics plug-in had worked, this is kind of what you would see. First of all, it looks at your patients: sum total, unique patients, patients common to all sources in this particular query. So that was the glaucoma and sleep apnea query. Two sources responded to that, et cetera. And here's your age distribution. You do have some younger patients, going down to 35, with that. It will also tell you the pedigree quality. This is, of course, being pulled from the UPDB. So if you want to ask, does this run across generations, you've got 109 multi-generation pedigrees that have that in there, and you can certainly use that to think about this a little bit more. I mean, there is missing data along the way. And then here's an "other," and you'll see the asterisk, which says it's five or fewer in the other category, and therefore it can't show you the number.
It tells you race, religion, male, female, deceased. Let me go back up and say, deceased comes from the vital records, vital status deceased. So that comes from the state vital records. And then this looks at patients in common over all data sources. So that's the UPDB and the EDW both, and it gives you once again the data related to that. Now, in this particular case, unless you're interested in those pedigree data, I would suggest that you just go to the EDW and find it, because you're not going to find that much in the UPDB.
Here's another one, on macular degeneration and antihyperlipidemic drugs. Barbara said, how about statins? And what I decided to do was two queries, one of which was looking at the drug. So I was over here looking at medication orders, pulled antihyperlipidemic agents, and this gave me 1,067 patients. But I want you to note particularly, it says negative one in the UPDB. Negative one is our symbol which says, I can't find anything because I can't search this data source. The UPDB doesn't have medications. It doesn't have drugs. So it simply says, if you want a query to find this kind of patient in the UPDBL, you can't search it this way. You'll have to search it without the drugs.
But then I decided I would compare, if you just looked at statins, how many you'd find. Well, let me go back to this one. With antihyperlipidemic agents, it found 1,067 patients. And if you look at searching it with just statins, here's your breakdown. Within the antihyperlipidemic agents, you've got a bunch of them which come in along the way, all of these different kinds. And here are all your statins, the HMG-CoA reductase inhibitors, and you find 1,015 patients. So most of the patients who are on antihyperlipidemic drugs are on statins. Once again you can ask, what's the age range?
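The negative one described above works as a sentinel: it means the source could not be searched with the given criteria, not that it holds zero matching patients. A minimal sketch of keeping the two cases apart; `NOT_SEARCHABLE`, `describe_result`, and `answering_sources` are hypothetical names, and only the value -1 and the 1,067 count come from the talk.

```python
NOT_SEARCHABLE = -1  # from the talk: the source cannot run this query


def describe_result(source: str, count: int) -> str:
    """Render one source's result, keeping 'cannot search' apart from a count."""
    if count == NOT_SEARCHABLE:
        return f"{source}: cannot be searched with these criteria"
    return f"{source}: {count} patients"


def answering_sources(counts: dict) -> dict:
    """Keep only the sources that actually answered the query."""
    return {s: n for s, n in counts.items() if n != NOT_SEARCHABLE}


# The medication query from the talk: the UPDB holds no drug data.
results = {"EDW": 1067, "UPDB": NOT_SEARCHABLE}
for source, count in results.items():
    print(describe_result(source, count))
```

Filtering the sentinel out before any arithmetic avoids accidentally subtracting one from a total.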
Barbara was saying you think that there might be a protective effect with statins, and this tells you how many patients you would have, and then you would have to figure out what to compare it to. This doesn't do your whole experiment for you. It lets you say, now let me think about what I compare it to, to see if the incidence is higher or lower. And I think I've almost run out of time. I've got two minutes. But this is what you do next: register for an account, start querying, check out the help videos, email further@utah.edu for assistance, send me comments, and use it. It really can help you with your research. And it'll grow as we go forward. So thank you very much.
We are working on specific electronic communication with the IRB, but that isn't here yet. That'll come soon. Well, I don't know about soon. It'll come in several months, probably. For the IRB, what you can do is print out those FURTHeR screens so you can talk specifically about your query: when you queried the EDW and the UPDBL you found the following cohort numbers. Then you write your IRB around it. And then you have to get both IRB and RGE approval, RGE if you want any data coming from the UPDBL. And then that has to be pulled for you by the EDW team or the UPDBL team. We hope in the future the IRB will be comfortable with our system pulling the data once we hear that you've got an IRB approval, but that's not here yet either. The IRB is just taking it slow, and we're still developing the system. What we've had people say is that it cuts about three months off the front end of their cohort development cycle to figure all of this stuff out at their desk, nights and weekends, and to get their research assistants working with it along the way. If you're just searching the EDW, then you don't need RGE approval. You go to the EDW team and pull data from them.