Good afternoon. Welcome to our monthly Health Policy and Bioethics Consortium. I'm Christine Mitchell, the executive director of the Center for Bioethics at Harvard Medical School. I want to take just a moment to remind those of you who might not know that the Center for Bioethics now has four monthly ethics consortia during the academic year. On first Fridays, there is the Clinical Ethics Consortium for members and leadership of the clinical ethics committees around the Harvard teaching hospitals. This one is usually on second Fridays, although it's complex to organize, and sometimes, for a variety of reasons, it ends up on a different day, like today. It is generally open to the public, according to their interest in the topic of the month. On third Fridays, we do one with Harvard Catalyst on research ethics. These first three all have tutorials attached to them after the consortium that our master's students take. And then on fourth Fridays, we also have one on organizational ethics. This monthly consortium on health policy and bioethics is organized by Aaron Kesselheim. I think you know that we do this in collaboration with PORTAL, which is the Program On Regulation, Therapeutics, And Law at the Brigham, which Aaron leads, and also in collaboration with the Petrie-Flom Center for Health Law Policy, Biotechnology, and Bioethics at Harvard Law School. And we want to acknowledge the generous financial support of the Oswald DeN. Cammann Fund at Harvard that makes this consortium possible. So let me just give an inadequately brief introduction of Aaron. I think most of you know that he's an internal medicine physician at the Brigham and is also an associate professor at Harvard Medical School. He also has a law degree and a master's degree in public health. And his research, at least some of it, focuses on regulatory policies that affect pharmaceutical development and the drug approval process, among other things. 
He leads PORTAL, as I said, and is also on our faculty at the Center for Bioethics, where in addition to organizing this consortium and leading the tutorial that is associated with it, he also co-teaches a course on health law policy, which will be offered again this spring, with Holly Lynch. So, Aaron.

So thanks. I am only going to be up here for a brief amount of time. I want to see these yours here, right there. OK, yeah, so just a brief introduction on PORTAL. PORTAL is within the Division of Pharmacoepidemiology and Pharmacoeconomics at the Brigham. We're an interdisciplinary research core focusing on intersections between law, policy, science, and drug and device development. And it is therefore a great honor to be able to welcome two of the leading lights in some of these areas to talk with us today. If you're interested in PORTAL, and you should make sure to follow us, you can follow us on Twitter or send me an email. And we'll be happy to get you involved or let you know about stuff that's coming out of PORTAL. In the last month, there's been a lot of talk about regulatory policy and other issues, and we in PORTAL have been able to be involved in some of the national discussions relating to the trial deaths in the Juno trial, the 21st Century Cures legislation, which is going to be voted on in the Senate in a couple of hours, and other issues related to insulin and other factors. And finally, I wanted to give you guys a little bit of a preview of the health policy and ethics sessions that we're organizing in conjunction with the Center for Bioethics in the spring: in February, about returning results to participants. 
I would point out that most of these are at Harvard Medical School, although that one is going to be conducted at Harvard Law School. Then there is promoting accurate and useful public reporting of physicians' medical errors, and the ethical implications of new funding, in which we're going to ask questions like, what are the implications of different economic models of funding biomedical sciences for the types of products that emerge? And we're thrilled to have a bunch of other leading lights in the area come by in the spring. So please do mark your calendars. We're looking forward to seeing all of you back here. And finally, let me introduce my colleague, Ameet Sarpatwari, who's the associate director of PORTAL, to talk a little bit about today's topic and to introduce our featured speakers. Thank you.

Hi, welcome, everyone. We are honored today to have two acclaimed experts to discuss clinical trial data sharing, Dr. Jeffrey Drazen and Dr. Harlan Krumholz. Dr. Drazen joined the New England Journal of Medicine as editor-in-chief in July of 2000. He is the Distinguished Parker B. Francis Professor of Medicine at Harvard Medical School, professor of physiology at the Harvard Chan School of Public Health, and a senior physician at Brigham and Women's Hospital. A specialist in pulmonology, Dr. Drazen maintains an active research program and has published more than 300 articles. He currently serves as co-chair of the Institute of Medicine's Forum on Drug Discovery, Development, and Translation and on the World Health Organization's Scientific Advisory Group on Clinical Trials Registration. Dr. Krumholz is a cardiologist, health care scientist, and health care improvement expert at Yale University, where he's the Harold H. Hines Jr. Professor of Medicine. 
He serves as the co-director of the Robert Wood Johnson Foundation Clinical Scholars Program at Yale, the director of the Yale New Haven Hospital Center for Outcomes Research and Evaluation, and the principal investigator of the Yale Open Data Access (YODA) Project, which we'll hopefully hear more about today. He has published more than 800 articles and two books and is a frequent commentator on health policy and the national. My goal here is just to give you a brief background to set the stage for our two experts to begin discussing clinical trial data sharing. So to begin with, clinical trials remain at the heart of medical research. Notwithstanding their limitations, randomized controlled trials remain the gold standard of medical evidence, but they really haven't been around that long. It was really the 1940s when the seminal Bradford Hill trial of streptomycin began and sort of ushered in a new era of medical evidence. RCTs are the principal basis for regulatory approval. The FDA requires substantial evidence of safety and efficacy, which is at least one adequate and well-controlled investigation, and in a review led by researchers at Yale of approvals from 2005 to 2012, more than 85% of approvals were based on at least one RCT. RCTs represent a significant investment of money as well. The average cost of a phase three RCT at a US site between 2002 and 2012 ranged from $11.5 million in dermatology all the way to $52.1 million in pain and anesthesia. And that funding, if you look at this figure, is significantly supported by public health money. So that makes clinical trial data sharing that much more important, specifically sharing of participant-level information. It provides public utility, which is important in the sense of reproduction of clinical trial findings. Just to highlight the sort of crisis of reproducibility in medicine, consider 37 reanalyses of clinical trials examined by John Ioannidis's group. 
They found that only 35, well, sorry, that 35% resulted in a different interpretation. The data can also be used for individual participant data meta-analyses and for secondary analysis. And importantly, and what we're here to talk about a bit today, is that rapid and early sharing of clinical trial data fulfills an ethical obligation to patients, who really participate in research with the understanding that it won't benefit them, but it will benefit people with their condition in the future. So in terms of ushering in or promoting policy developments, several steps were taken by stakeholders before 2014. The FDA required in 1997 registration of at least a limited number of trials and expanded that in 2007. The ICMJE required trial registration as a precondition for publication consideration in 2004. And then when you talk about clinical trial data sharing, the FDA Amendments Act of 2007 required at least summary result reporting for non-phase one trials, and journal members of the ICMJE have required, at varying stages, different degrees of clinical trial data sharing already. In 2007, the Annals of Internal Medicine required that authors of original articles state their willingness to share study protocol, statistical code, and data, and the BMJ in 2009 required that authors disclose what data are available, to whom, and how. And really the leaders of clinical trial data sharing have oftentimes been industry, not the academic medical centers. And so despite this push, there are still continuing data silos. In a review recently featured in the New England Journal of Medicine, from the independent review panel for access to GSK clinical trial data, covering requests for that data from 2013 to 2015, there were 177 requests. Only three of those, or 1.7%, were for confirmation of research results. And only 27 survey responses were returned, but Dr. 
Krumholz will tell you that he only received it the day of the publication, I believe, if you follow him on Twitter. Of the 24 survey responses, only three, 12.5%, said that they had completed the data analysis. And then when you take a look at academic medical centers, again coming from Dr. Krumholz's group, of all 4,347 trials completed between 2007 and 2010, two years after reported completion of the trial, only 1,560 had actually published aggregate results, not even shared participant-level information, but published aggregate results. So in response, earlier this year, the ICMJE put forward a proposal for consideration. As a condition of consideration for publication of a clinical trial report in its member journals, the ICMJE proposed to require authors to share with others the de-identified individual patient data underlying the results presented in the article, including tables, figures, and appendices or supplementary material, no later than six months after publication, and that this policy would take effect one year after adoption. And what they did was solicit reactions on the mandatory sharing, the timeframe, the data sharing plan, and how you could facilitate credit for the primary trialists, the people actually conducting the research. So when we think about clinical trial data sharing, there are really competing forces in terms of progressing rapidly and pushing forward with it. You've got the advantage of it really pushing forward scientific advancement. You've got the ethical duty to patients and oftentimes to taxpayers too, who are a major funder of this research. But there are also reasons for caution, and those include privacy, the knowledge that even de-identified data can potentially be re-identified. 
There is also fairness to the trialists and the sponsor, who have oftentimes spent upwards of five years of their lives collecting this information, and the unintended consequences that can happen if you cut into the incentive of having the ability to analyze the data: whether or not you would actually have the same amount of interest and motivation to conduct these trials. And finally, although it's good to push forward clinical trial data sharing, we've also got to have the ability to analyze the data, and there are serious concerns about how usable the shared data are. So without further ado, I would just like to highlight the two initiatives here, the YODA Project and the SPRINT Data Analysis Challenge, which I think you will hear more about in these future talks. And speaking of the next talk, I'd like to introduce Dr. Drazen. Thank you.

Thank you, Ameet. Good afternoon, everyone. So I think that sometimes it's pretty clear that data ought to be shared. And I'd like to share with you an interesting scientific experiment that was done a few years ago. So this is some basic immunology, right? I think you might have learned some of that in this room. You have a resting APC, an antigen-presenting cell, and it's talking to a naive T-cell. And that T-cell, it's like trying to pick up somebody and not getting any response, right? You need to be able to connect with that person. And the way that T-cells do that is through CD28. And if that gets stimulated, that T-cell wakes up and says, hey, yeah, let's talk. And that's a critical part of getting antigen-presenting cells to be activated. So these guys had this idea that there are diseases like rheumatoid arthritis and psoriasis and other autoimmune conditions which are characterized by T-cells that have gone a little bit nuts. And if I could calm them down by capping CD28, I'd have a bunch of quiescent T-cells. 
Now, if you were in the middle of a smallpox epidemic, it might be a problem, but for a lot of autoimmune diseases, it might be a good thing, right? So a group of scientists developed what they thought was a non-activating anti-CD28 antibody. The idea was that the antibody was gonna hang around here on the CD28 receptor. It was gonna prevent the B7-CD28 interaction. You'd never get a lymphocyte turned on. They'd all be just asleep. Pretty good idea. So they did studies in lower animals and they did studies in human cells and it seemed like it worked. So then they went out and they put an ad in the paper. They did this in London and they said, okay, we'd like some volunteers, otherwise healthy, come on in and get an infusion of this antibody. We're not looking for people with disease, we just wanna check. We're doing a study to see if it makes a difference. So we'll pay you for your time, but it's probably pretty low risk. So they did a phase one study with six healthy male volunteers who got active treatment, and they did it with reasonable scientific rigour, so there were a couple of placebos thrown in there. They show up at the center. Check them out, they're pretty much okay. And then they infuse them all. Six with active treatment, two with placebo, and they were expecting absolutely nothing to happen. And they were surprised.

They fainted, coming back to consciousness. Actually, again, I assume that headaches was a lot of them, kind of like holding the heads, but the general left, they were going through like, the biggest bit of pain was horrible because he was screaming and saying his back was hurting. All of the men who received the real drug reacted quickly and badly. 
Myfanwy Marshall is the girlfriend of one of them. He's a young, fit, healthy, gorgeous cubite, you know, 28 years old. Even there, the cubite, 45-year-old cardiac prescription. His face is bloated out, like Elephant Man, like this, with a lot of support. And they're telling me he couldn't die anyway. A U.S. firm, Parexel, was running the experiment for the German company that developed the drug. Officials said it had been extensively tested before it was given to humans for the first time this week. It said the adverse effects were completely unexpected. Doctors are trying to save the volunteers' lives and determine why the experiment went so horribly wrong. Richard Roth, CBS News, Monday.

Now, the follow-up, for those of you who are on the clinical side of it, is that all the people survived, but some of them lost fingers and toes because they were so hypotensive when they showed up in the intensive care unit that they were given vasopressin in addition to catecholamines, and they had massive peripheral constriction. They survived, but some of their fingers and toes didn't. So this study was conducted by a private company, and we never would have heard about it. It would have been totally off the radar screen, but something forced the data to be shared. And in this particular instance, I heard about this because the patients had been admitted to a public hospital, and the public hospital made it public. It was Northwick Park Hospital outside of London. If that hadn't happened, this would have been a secret. Now, how many people here think it would have been ethical to keep that a secret, right? There were other drug companies who were developing anti-CD28 antibodies. Now, maybe they wouldn't have had this effect. Maybe they actually would have been non-activating. 
But from the data gathered from these patients, we knew that the antibody that was non-activating in lower animals and in human cells in vitro turned out to be activating in vivo, and the programs were stopped. So we don't know how many people were spared having very severe negative reactions because of this data-sharing event. We made it public. Once it was on the news, I called up Ganesh Suntharalingam, who I knew from the ICU mafia, and I said, you know, write this up. We want to know about it. And the company fought them hand and foot. They didn't want them to have the data at all. And so none of the data we published were gathered by the company. It was all the stuff that came in with the patients. So we don't know whether the people were normal when they came in. The white counts were a little off. We assumed that they were normal to start. But they all had sky-high cytokine levels. These people were really sick. So here's an example of data sharing. The world knew about this example. And it's one of the reasons why we think that there's an ethical desire, an ethical need, to do clinical trial data sharing. When you're in a clinical trial, and you're normal, and you end up with an x-ray that looks like this, and even in the lighting here, you can see this is not good. Things end up like that. Somebody needs to know that you've put yourself at risk and what we've learned from it. I don't think that anybody anywhere is going to argue that it's safe or appropriate to keep those data secret. So where do you draw the line? At what point do you say that when somebody has put themselves at risk... How many of you have actually ever been in a clinical trial? When I was in grade school, I was in the polio clinical trial. You guys know what polio is? The good news is it's mostly gone, but it's not all gone. My parents enrolled me in the trial, like everyone else. But you're giving up an aspect of your care to the trial. 
Something is not under your control or your physician's control. It's under the trial's control, to answer a question. And the statisticians say that you've put yourself at risk. Even if you're studying the dermatologic effects of a new cosmetic preparation, you're at risk. And so that period of being at risk is a sacrifice for the world at large. And that's something that we have to honor. If you've ever tried to do a clinical trial, the biggest problem is enrollment. Getting people to walk in the door, roll up their sleeve, and say, go ahead, use me as your guinea pig. It's a hard threshold to cross. And we need to honor that commitment. To me, the thing that's driven me to think that sharing the clinical trial data is an important thing to do is that commitment. I thank all of you who raised your hands, and I encourage those of you who didn't, when you have the opportunity to do so, to participate. Without your help, we would know nothing. We would be using observational data all the time, and we'd be wrong a lot of the time. Not all the time, but a lot of the time. And we need these people to do that. So to me, it all comes down to that simple thing. We're making a commitment to people who put themselves at risk. Now, exactly how we're going to share data, the specifics, it's a very long and complicated thing. And I don't think in our discussion, Harlan and I, and then with you, we can discuss some of those things. But I want to point out that one of the things Ameet talked about is the idea behind making clinical trials something that's played on a public stage. You all can think back to October 28th of this year. It looked like it was curtains for the Cubs. They're playing the World Series. They're way behind. Didn't look like anything was going to happen. Let's say they had thrown a blanket over Wrigley Field and just told you the score. Would you have believed it? 
No, you want to see it played out in public. You want to see the trials played out in public. This is a public game. And I use the term game to mean that it's something we don't know the answer to. We want to see it come to fruition publicly. So all of us in the clinical trial publishing field in the early 2000s, and Ameet talked about clinical trial registration, faced the issue of what to believe. And the key study, the one that's been cited over and over and over again, almost ad nauseam, is a study done by Glaxo, and that is the study that forced us to do clinical trial registration. Have you guys heard of Study 329? All right. You probably have and you don't know it. So this is Study 329. It was a study of paroxetine in the treatment of adolescent major depression. It was funded by GSK. And at this point, GSK has been picked on a lot. They didn't behave very well at this time. They admit it. They did this trial. And it was, just like this title says, they gave kids with major depression paroxetine. And anybody that's studied depression knows it's a hard thing to measure. It's not like blood pressure. Blood pressure is sort of easy to measure, though anybody that's really thought about it realizes how hard it is. Depression is really hard to measure. There are all these scales. You give the same scale to somebody a couple of times, you get different answers. It's not like your bank account. You've got $85 in the bank. You can't say that with depression. So they did a bunch of scales. And they analyzed the data six different ways, one of which turned out to be positive. The other five weren't. So they published this article with the one that was positive. And they didn't mention the other five. And the drug was on its way to being approved at an FDA advisory committee. Of course, the FDA gets all the data. 
And the FDA said, well, we know about this study that was published, but you didn't tell everybody that there were five other ways of analyzing the data that didn't show an effect. We won't give you the claim for your label. You're not going to be able to take a bottle of paroxetine and read the label and say that you can use this for adolescent depression. We're not going to allow it. Suddenly, this became public information, because the FDA, which had kept stuff secret, that's the way the FDA is structured, suddenly made this public. And some people got annoyed. One of the guys who got annoyed was Spitzer, who was at that point the attorney general for the state of New York. And he sued Glaxo for fraud. He said, how can you be selling drugs? We in the state of New York are buying drugs for treating depression in adolescents. And it probably doesn't work. This is fraud. So they go back and have a big discussion with GSK. And GSK says, OK, uncle, we agree. We screwed up. They're going to put all their data on a public website for drugs that have been approved. Spitzer walks away happy. We know what he did then. And if you don't, you can Google Spitzer and find out. And GSK is off the hook, sort of. But this happened in June of 2004. Medical editors were meeting. And every one of us had had an experience where someone had sent us a manuscript that said, this is a study to show X. It was designed to show X. But when we looked into it, we discovered it was actually designed to show Y. But when it got to the end of it, it didn't show Y. It showed X. So they kind of repurposed it. Said, oh, well, we really meant it to show X. So they changed the rules after the game had started. And every editor, every single one of us, there were 12 of us in the room, we all had examples of this happening. The fence had been moved after the ball had been hit. Might have helped the Cubbies early on. But in this particular case, it was happening in clinical trials. 
So we required trial registration. We said that when you start a clinical trial, you have to tell people what your primary and secondary outcomes are, who you're going to enroll, what your inclusion and exclusion criteria are going to be, when you're going to start. We originally said, you need to tell us when you're going to finish, too. But we listened to the investigators, who said, no, don't make us tell you that because we never know. It's going to depend on enrollment. It's like the world's most fickle thing. So we agreed to all this. And starting in 2005, we said we're going to require clinical trial registration. And the clinical trialists, especially industry, fought us, the editors, tooth and nail. They said, you're going to put us out of business with all these trade secrets. Academic investigators said, the big guys are going to put us out of business. The little guys are going to die. And we stood our ground. A lot of trials got registered. We're still only running around 90%. But it went up a lot from 15% or so. And now it's the standard of behavior. People register their clinical trials when they do them. So that was our first step. The second step has been getting people to report aggregated data. And journal editors have agreed not to consider it publication when you report the results of a clinical trial on a website with the three minimalist criteria: a table of who was enrolled, a table of the primary and key secondary outcomes, and a table of adverse events. We don't consider that publication, or pre-publication. It's perfectly OK. That law went into effect in '08. But we didn't get the rules written until earlier this year. And now those rules are in effect. And it's the law that when you do a clinical trial that's been registered, you need to report the results. So now we're working on this third step, which is getting the individual participant data. And this is getting people to change how they behave, which is never easy. 
And we're beginning to go down that road. We haven't yet succeeded in a big way, but we're continuing on our journey. Because we're going somewhere that we've never been before. And we need to work with each other to figure out ways to do this. There have been some pioneers in the field. And at Yale, the YODA, I guess I'll call it experiment, such an idea was. Is that how Yoda would speak? Putting the verb in at the end? But for the general world, this hasn't happened yet. But we're still headed that way. We need some things to make it happen. And we're working on getting those things put together. But our goal is to honor the sacrifice made by people who are in clinical trials. First, we had trial registration. Now we're asking for aggregated data. Our next step is getting individual patient data so they can be shared in a responsible fashion. Exactly how we're going to get there? We think we know, but we don't know for sure because we haven't done it. And what we're hoping to get from today is to round out that idea so we'll know how to do it. Thanks for your attention.

Thanks, Dr. Drazen. It's my pleasure to introduce Dr. Krumholz.

So thanks very much, and it's a great pleasure to be here. This place looks so much different than when I was a medical student, and it's a beautiful auditorium. It's nice that everyone showed up. You know, when you hear Jeff talk about this, it's a compelling case. And I'm also going to go through, at least for me, what makes this so important. But you've got to realize that this is a challenge to the status quo, in that there is substantial resistance to a lot of the views that you heard here. And so hearing the editor of arguably the leading clinical medical journal in the world take a principled stand on this issue and bring together the editors from the other journals is quite an accomplishment and quite a demonstration of leadership. 
I can tell you that if the other journals want to move in this direction and the New England Journal is not part of that group, the entire effort falls apart. You can't do it, because people, I believe, prefer to publish in the New England Journal of Medicine. Lancet and JAMA, excellent journals. They are competitive. They're a good one. As the editors are listening to this, I think you're wonderful. And please be kind to my submissions. But I mean, if you look at the number of trials that are published, a disproportionate number are published in NEJM. There's a strong desire for people to publish in NEJM in general, as evidenced by those numbers. And if they're not part of this group, the group falls apart. I pulled this issue of the New England Journal from September 22nd. And I just point you to this part here, the importance and complexities of data sharing, where, as you just heard, the issue of implementation is complex and one that merits our attention, and we need to solve this problem. But for the first sentence to say, and Steve Morrissey is the second author in here, and Steve's here with us, and so are Debra Malina and Mary Beth Hamel and Ted Campion. And I assume you've got everyone on board with this. And they start by saying, we at the Journal are committed to making the sharing of clinical trial data an effective, efficient, and sustainable part of biomedical research. I just don't want to, there's no way I can underplay this, because you can hear this presentation, but it's hard to appreciate what those words mean. Now it means they're committed to creating a means of implementation that's effective, efficient, and sustainable. But more than that, it means that they are committed to this idea, which is not a widespread belief among investigators. I can tell you, I know that. I don't even have to do a survey or study of that. I know that. I've felt the resistance. 
I've heard the resistance from influential people in our field, and so has the editor, and so has the team. And at the end of this, as they introduced the SPRINT competition, they said, the Journal is committed to making data sharing part of our everyday business. Just as we introduced the inclusion of clinical trial protocols with the publication of all clinical trial research reports, we are working in the same spirit of transparency toward the goal of making data sharing a reality. This is just in September. And it represents an arc of work that has gone on at the Journal. And I just have deep respect for what you've done. With the registration, it's the same thing. It's a very interesting exertion of power by these powerful actors within our biomedical research ecosystem, the journals. And the journals, for better or worse, have a lot of power concentrated in a very small number of individuals. And the question is, how do they exercise that power? Because it's quite considerable. And in this case, you're seeing an example of a focus on what is a potential issue and flaw within our current biomedical ecosystem in an attempt to move it. And I can tell you, there is also an attempt to bring people along. It wasn't mandated and dropped down. It was put into place as a public comment and a discussion in an attempt to understand this. And I don't mean to speak for him, and you just heard him. But I'm trying to put this in context for you, because it sounds like just another talk. This is an ethics group, and you're hearing an editor talking about an initiative. But it's much more than that. It's got to be understood in the context of the history of this. And I can tell you, a decade ago, there was no one talking about this. This represents a progression of thought, along with action, and now proposals and meaningful pathways toward actually changing the way in which work gets done. 
So I want to just go through for you my own thinking about why this is so important and why it deserves your support and interest. I'm also just going to highlight for people that there is a hashtag for this meeting. Policy ETHX, hashtag policy ETHX. And that's my Twitter handle, @hmkyale. So I'd be happy to follow whatever you're writing. But the reason I say that, aside from the self-promotional part of it, is because I want this to reach beyond these walls. You just heard an extraordinary talk from the editor of the New England Journal of Medicine in which he's making clear his commitment and also how he's motivated by thinking about the people who are enrolling in the trials. People throughout the world should be hearing that message. That the first case he's making about this is about honoring and respecting the people who are participating in the research. That that's the principal driver. Now there are others, but that's the principal driver. And what I like about Twitter is the notion that you can reach out to people around the world and spread that word, because it has the potential to influence others. My own potential conflicts that I think are relevant here: at Yale, we do receive funding from Johnson & Johnson. We have received funding from Medtronic around these issues. We also work with the FDA and Medtronic on device surveillance. And I think my biggest bias is my writings and public positions and my beliefs, in that I favor study registration, results reporting, and data sharing. So why share? Well, you hear from me and you'll hear from many others that there's this idea of promoting research as a public good, honoring the contributions of participants, enabling a deeper understanding of the research. These are all critically important. And actually I made a mistake. I should have put it at the top, honoring the participation of those who are part of the study. 
That is and should be the number one piece about this. But I go back to this. Something like this was on my sixth grade wall. And there's something fundamental here about, are we scientists? Is this really science, or is it something else? Because you ask a question, you develop a hypothesis, you conduct the experiment, you observe and record it, you analyze, and then there's this really interesting sixth step here in the scientific method, from the poster on the sixth grade wall, which is you share the results. You actually tell people what you found. You disseminate the information. And I think a compelling case can be made that, aside from all of these other issues, we have to wonder whether or not we can make the correct inferences, and whether we're using our resources properly, if we don't share. And I would suggest to you that it's a fundamental part of our learning and of the efficiency with which the research takes place. And so it's because of bias, it's because of error, and it's because of waste, as you get down into this, that it needs to become an intrinsic part of the way in which we conduct ourselves. So let's think about bias first. This is an email that I just pulled out from the Vioxx litigation. And the thing about the Vioxx litigation that was so useful was that all the emails came forward, the discussions among the scientists. And I want to tell you, I was involved on behalf of plaintiffs in the Vioxx litigation, in one of the trials, two of the trials. And one of the reasons I got involved was because I was promised that they would release all the data as a result and we could take a look at what the data showed. But also at that time there were a lot of emails between scientists. And I will tell you, at the time that I dug into this I thought there was a problem with industry. I now think that what we discovered in Vioxx is endemic to the scientific enterprise. 
And this is from one person at Merck talking to another person at Merck, and they're just discussing a paper. And they're saying, I'm okay with the first two points; however, for point three we can either say this is 52% versus 52% and give the numbers that you have in this table. It's a matter of emphasis. I'm trying to give the best face possible that's compatible with the truth of the data. One can emphasize what one wants. It's not clear from a clinical perspective what any of this means for erosions. They're talking about stomach issues. And why not try for some more appealing message. And what I'm saying is, I've heard this discussion. This isn't an issue just with industry. I've heard this discussion many times. This is what we got. We're sending it to the New England Journal. If we don't make this thing look amazing they're not gonna be interested in it. I mean, they're interested in incredible studies, so this thing better look amazing. And so how do we, compatible with the truth, present it in a way that's gonna be most compelling according to what we believe? So in fact I'm not at all critical of this company. I'm saying that this is a common conversation, where people walk in with their own ideas about how things should go. This is actually an interesting chart from the Vioxx studies of Alzheimer's disease. The middle line shows you the number of cases that were sent for adjudication that were confirmed. And the top line shows you the percent that were confirmed for placebo and the bottom shows you the percent that were confirmed for Vioxx. You can see that in the end many more of the cases are confirmed for placebo than for Vioxx. Somehow bias, it would appear, leaked into this, because of the cases that are suspected of an endpoint and sent to blinded adjudicators, there should be an equal rate of confirmation on both sides. 
But as you get into the metadata in these particular studies you say, why is it that more of the cases are being confirmed on one side than the other? What is it that's going on? These are hard to see in this slide, but I'm just saying there are tons of cognitive biases. The Nobel Prize has been awarded for work on cognitive bias. It is intrinsic to the human condition that we take shortcuts. It's the way our brain works. We approach things with our own points of view, our own ideas about what's right. And one of my favorites with regard to studies is confirmation bias. We tend to listen to only the information that confirms our preconceptions. And this one says it's one main reason it's hard to have intelligent conversations about climate change. But in studies I know this to be true. It isn't that I'm purposely ignoring things, that I walk in and say, okay, I'm discounting this because I want to promote my own pet theory. It's that a part of the human condition is I'm tending to notice things in the data that tend to be more supportive of what my prior beliefs are. And I'm taking out the whole thing about anyone doing anything that's unethical or in any way improper. I'm just saying that we do this naturally. It's natural that we notice things. We draw conclusions. We come into it with a point of view and it influences what we do. There's some beautiful work that's been done by Brian Nosek at the Center for Open Science. You can go online and actually play with this. It's an interactive website, and you can look up the Center for Open Science. FiveThirtyEight actually wrote a story about this, and it basically says, I'm trying to ask the question, is there a relationship between political party and the economy? And they start showing you all the decisions you can make. Should I include all people in government, presidents, governors, senators, representatives, or some of them? 
What should I do for how I measure a better economy: employment, inflation, GDP, stock prices? What about other factors, like who's in power overall? Should I exclude recessions? And what you'll see is that you can manipulate this result to find anything you want, anything from non-significant to highly significant in either direction. And again, this is an observational study, but I can tell you that within trials we have this sense that the protocols are written in stone and that anyone given the same thing would come up with the same findings, but it's just not true. Brian goes on to show this situation where they're trying to make an inference about whether the color of your skin is related to your risk of getting a red card in soccer. And so they collected extensive data on this and they gave it to expert groups. Now, they didn't crowdsource this and say anybody who wants this data can have it. They purposely identified experts with deep experience who are highly respected and considered to be at the top of their field. And then they said, okay, what's the answer? And these represent the answers that they got: the point estimates and the uncertainty around what they came up with. Each one of these circles is a team. The answers range from equally likely or not, and all of these are non-significant results. So these teams are all saying there's nothing here, and all of these teams are starting to say that referees are more likely to give red cards to dark-skinned players. And you start to go, each one of these is a research team, all the way up here to the end, and the estimates are different and the inference is different. There were 29 teams, all with the same data. We had the same situation in our first YODA project. We did this on behalf of Medtronic, and they had a controversy about one of their products. We said, solve the controversy by giving us all your clinical trial data. 
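As a rough illustration of the point about analytic choices (which office-holders to include, how to measure the economy, whether to exclude recessions), here is a minimal Python sketch. It is not the FiveThirtyEight tool or the Center for Open Science analysis. The data are synthetic, generated with no true party effect, and every name in it (`make_dataset`, `effect`, `all_specifications`, the field names) is a hypothetical chosen for this example.

```python
import itertools
import random
import statistics

OFFICES = ["president", "governor", "senator", "representative"]
OUTCOMES = ["employment", "inflation", "gdp", "stocks"]


def make_dataset(n=400, seed=0):
    """Synthetic records in which party has NO true effect (assumption of this sketch)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        row = {
            "party": rng.choice(["A", "B"]),
            "office": rng.choice(OFFICES),
            "recession": rng.random() < 0.2,
        }
        # Four noisy "economy" measures, all drawn independently of party.
        for outcome in OUTCOMES:
            row[outcome] = rng.gauss(0, 1)
        rows.append(row)
    return rows


def effect(rows, outcome, offices, drop_recessions):
    """Difference in mean outcome (party A minus party B) and a rough z statistic."""
    kept = [
        r for r in rows
        if r["office"] in offices and not (drop_recessions and r["recession"])
    ]
    a = [r[outcome] for r in kept if r["party"] == "A"]
    b = [r[outcome] for r in kept if r["party"] == "B"]
    diff = statistics.mean(a) - statistics.mean(b)
    se = (statistics.variance(a) / len(a) + statistics.variance(b) / len(b)) ** 0.5
    return diff, diff / se


def all_specifications(rows):
    """Run every combination of outcome, office subset, and recession handling."""
    office_sets = [
        set(combo)
        for k in range(1, len(OFFICES) + 1)
        for combo in itertools.combinations(OFFICES, k)
    ]
    results = []
    for outcome, offices, drop in itertools.product(OUTCOMES, office_sets, [False, True]):
        diff, z = effect(rows, outcome, offices, drop)
        results.append((outcome, sorted(offices), drop, diff, z))
    return results


if __name__ == "__main__":
    results = all_specifications(make_dataset())
    zs = [r[4] for r in results]
    print(f"{len(results)} specifications on the same data")
    print(f"z statistic ranges from {min(zs):.2f} to {max(zs):.2f}")
    print(f"{sum(abs(z) > 1.96 for z in zs)} specifications cross |z| > 1.96")
```

Even with no effect built in, the same dataset yields a spread of estimates across specifications, which is the sense in which equally defensible choices can lead different teams to different inferences.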
And we said, to reassure you, what we'll do first is we'll give it to two of the world's leading systematic review groups. We first put out an RFA, and we saw who applied. We identified York and Oregon, two of the leading ones in the world, and we gave them the same data, the same money, the same charge, and we said, tell us what you can learn from these trials. Largely the same results, but they have different numbers, actually even of patients, because of the way they dealt with exclusions. They were published in the Annals of Internal Medicine side by side, and you can see that they are not coming up with the exact same inferences. And again, that shows the way in which they're thinking about it: two expert groups, same money, same data, different ways of looking at it. I wrote a piece about this where I said, we have this idea sometimes in science where we have this large ladder, and one person gets to climb up to the top and look through this large telescope and see the universe, and they climb down from the ladder and they tell us what the universe looks like, and then everyone else is scribbling down what they say. And we know that that's just not the path toward good science. And in this case we're saying there is room for discussion and dialogue among the experts, and they may come away with different ideas that may reflect bias. Now, error. And again, I'm not going to suggest that there are problems, but you look at what's going on around here. I mean, most of the studies are built at this point not only on large-scale data collections but on many, many decisions within them. And then a lot of computer code goes into this. Do you think it's possible that in some of these studies, someone might have miscoded something, written a line of code that might have actually led to some change, or made some decision that was a mistake? 
I mean, we rarely see it, because all of these things are beyond public view. Error is a big problem. This is a study that appeared in the New England Journal of Medicine. I'll give you another Vioxx example just because I have it at hand. This is a study that was published called the VIGOR study. And, you know, it's a high profile study. It's a comparison of upper gastrointestinal toxicity of rofecoxib and naproxen in patients with rheumatoid arthritis. It gets a lot of attention, actually. A million dollars' worth of reprints are purchased from this publication. I mean, it's being handed out to virtually every doctor in the country. This is a very popular study. And one of the issues in this study was, did Vioxx cause harm? And one of those issues was, did it cause heart attacks? And there was a very controversial piece about this. One part was whether naproxen was protective. There was an argument about placebo, but there was also the question of whether the number of heart attacks in the two groups was close enough to not really be that big a deal, or whether it was a big deal. Well, it turned out in this study that there was a little mistake that only became evident in the litigation, when other people could look at the data. And the mistake that they made was this. You remember, Vioxx is intended to be better for the stomach, maybe worse for the heart, they're not sure. Better for the stomach, worse for the heart. So the more stomach events, that's probably gonna favor Vioxx. The more heart attacks, if it's bad for the heart, it may show the other way. So what they did was, well, I'm not gonna say accidentally, but somehow it turns out that they kept counting GI events longer than they were counting MI events. They stopped counting the heart events before they stopped counting the GI events. So it turns out that three heart attacks, which all occurred in the Vioxx group, end up not being counted in the article. 
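To make the mechanics of that kind of mistake concrete, here is a small Python sketch of counting events under one shared cutoff date versus separate cutoffs per event type. The event log, the dates, and the names `EVENTS` and `count_events` are all hypothetical, made up for this illustration. This is not the VIGOR dataset or its actual cutoff dates.

```python
from datetime import date

# Hypothetical event log: (arm, event_type, event_date). Not real trial data.
EVENTS = [
    ("drug", "MI", date(2000, 1, 10)),
    ("drug", "MI", date(2000, 2, 15)),
    ("drug", "MI", date(2000, 2, 20)),
    ("drug", "MI", date(2000, 3, 5)),
    ("comparator", "MI", date(2000, 1, 15)),
    ("drug", "GI", date(2000, 1, 20)),
    ("comparator", "GI", date(2000, 2, 25)),
    ("comparator", "GI", date(2000, 3, 1)),
]


def count_events(events, cutoffs):
    """Count events per (arm, event_type), keeping only those on or before
    the cutoff date that applies to that event type."""
    counts = {}
    for arm, kind, when in events:
        if when <= cutoffs[kind]:
            counts[(arm, kind)] = counts.get((arm, kind), 0) + 1
    return counts


# One shared cutoff for both event types.
shared = count_events(EVENTS, {"MI": date(2000, 3, 9), "GI": date(2000, 3, 9)})

# Heart events stop being counted earlier than stomach events.
mismatched = count_events(EVENTS, {"MI": date(2000, 2, 10), "GI": date(2000, 3, 9)})

if __name__ == "__main__":
    for label, counts in [("shared cutoff", shared), ("mismatched cutoffs", mismatched)]:
        print(label)
        for key in sorted(counts):
            print(f"  {key[0]:<10} {key[1]}: {counts[key]}")
```

In this made-up log, the heart attack comparison is 4 versus 1 under the shared cutoff and 1 versus 1 under the mismatched cutoffs, while the GI counts do not change. Nothing in the published counts alone would reveal the difference. It only shows up when someone can rerun the count against the underlying event dates.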
And they didn't have that many heart attacks, and those three actually make a considerable difference in your interpretation of whether or not Vioxx is associated with a greater number of heart attacks. Now, the reason I'm showing this to you is because, again, these kinds of errors are likely to be occurring from time to time. The idea is that there's extra scrutiny, people are able to take a look, they're able to examine it. The New England Journal then put out an expression of concern. They did address this, and I don't think it was ever retracted, but it was highlighted as an issue in this paper. But it would never have come to light, as in the same situation Jeff was describing, unless there was an independent group that was scrutinizing the data and able to understand that there was an issue here that's worthy of discussion. Long story short, within four years of this, Vioxx is taken off the market, and there's a general consensus, even today, that the heart attack risk is real. Now, retractions are on the rise. This actually only goes to 2009. I saw something in Retraction Watch that maybe it's leveling off a little bit now. But I think it's only by serendipity that some of this stuff is found. In general, most of the data is out of sight, out of mind, unable to be evaluated. And then there's this waste issue, and I wanted to show one other example of waste. So, one part of this waste is non-publication. We're not spending a lot of time talking about that today. The journals actually have been very generous and thoughtful and socially conscious in saying you can report the results, and it will not interfere with your opportunity to publish in the journal, for example. And so they're fully endorsing this idea of a results database. It would take you about 15 minutes to report your results. And we're saying, you know, if you do a study, report your results. 
So even if you're having trouble publishing, the main results can be put out. And we started looking at this, and we started seeing that, you know, 46% of trials, experiments on people, are published within two years; 68% even if you went out 100 months. It starts to top off at two thirds. These are NIH trials, by the way. NIH trials, publicly funded studies, experiments on people, and only two thirds of them are published. A lot of these, admittedly, are smaller studies. We published this in BMJ. NIH didn't believe us. We got calls from NIH. I heard about behind-closed-doors meetings at NIH where they were scathing about our article in BMJ. And then they wanted to repeat it. And then they published it in the New England Journal of Medicine, a confirmation of what we published. Virtually the same numbers. Virtually the same numbers. Now, they extended that study to show, and this was their thing, that a lot of them were smaller. But essentially they find the same thing. Two years later, NIH ends up publishing it. Mike Lauer, Mike Lauer here. But you look all across, and you're seeing these things. Is it academic medical centers? I just wanted to show you this. We did a report card. Industry is ahead of the academic medical centers. We can't get the academic medical centers to show the same kind of leadership that the New England Journal is showing. Couldn't get the AAMC, anyone listening? Couldn't get the AAMC to say, this is an emergency in American medical centers, that experiments are being conducted and not being published. Couldn't get them to move. Couldn't get them to call this a crisis. We created a report card, published the report card. You can see the best places are still only about 50%, the worst places are down around 25%. But these are experiments being conducted at American academic institutions that are not being published within two years. Within two years. Oh, let me just say one other thing. Reported or published? 
We combined reported or published. So it's not like, oh, I'm having trouble, oh, I'm getting a lot of rejections, you know, I can't quite get it out. We can ask how important it is if, two years later, you still can't get it published. Fine. You know, I've been there. Sometimes it's hard. But this is reported or published. You haven't told anyone. You haven't put it on the government website. These are trials. And these are all registered trials. Registered trials where someone took the time to say that the study is done. So we actually had a starting time, when they say it's completed. Can you imagine? I mean, a lot of people probably don't even put on clinicaltrials.gov that they're done. So these are the people who said they were done, and then nothing ever happened. But this is another kind of waste: The Effect of Digoxin on Mortality and Morbidity in Patients with Heart Failure. It's published in 1997. It's an NIH-sponsored trial. I've seen the database on this. It's a remarkable database. They collected information about depression and function and six-minute walk tests. There are hundreds and hundreds of variables in this database that patients were subjected to and funds were committed to. This is table one, where they're just showing that the two groups are equal, and then they show the results in the two groups. And then nothing ever happens after that. The study's data is not shared for years. Then eventually the NIH actually posts it on their BioLINCC site and people start publishing. We published two papers, several more even, but this one was published in the Journal, on sex-based differences, where we show there was an interaction. It was hypothesis generating, but it said, you know, women who got dig died more often. And there was a statistically significant interaction, enough so that the New England Journal accepted it. I mean, there was a concern about this. 
We published the information about levels and said, previously we've been thinking you should treat people to a level of one to two. We showed that in this trial, the benefit was somewhere between 0.5 and one. All of this stuff was unavailable, and no one was moving on it. Once the government started giving it away, then people could leverage it and take advantage of it. I'm just gonna go on. This is Ioannidis's paper, and I think I'll just skip through this, but it's about how the people who do do replication are often having trouble replicating results. We started with this idea of saying, you know, it needs an implementation platform. There are a lot of people who are interested in sharing, particularly in industry. We went to industry and said, this will be a reputational boost if you take the position that you should be sharing your data. You know, you should put it out there. We worked first with Medtronic, then with J&J. We have 180 trials. I urge you, if you wanna work with trial data, we have 180 trials and the metadata associated with them that you can ask us for, and we can give you access. And here's the website, yoda.yale.edu. Others are doing this, but I'll just say that we see ourselves as an independent third party without interest. We don't own the data, we have no interest. With J&J, we said the only way we'll do this is if you sign a contract and give us full authority over the data. You have no decision making in this. You're not gonna see the requests, you're not gonna make recommendations. We have full authority over it. And by the way, we want to share. We really want to share. So we wanna work with you to make it happen. We wanna promote the sharing of research. You can come and see this, product info. It's a fairly easy application, but, one, it shouldn't be associated with litigation, and it shouldn't be associated with commercial interest. And you've got to be able to formulate a research question. And we will post that research question. 
We will be transparent about your request. And then we'll have you sign a DUA. You'll get access to a SAS enclave and be able to work on it there, which prevents distribution and protects patient privacy. And we're trying to work to make sure that you have the proper software and ability to do work within that enclave. We've had 55, now slightly more, requests. We have the first papers that have come out of this, and we're seeing more interest all the time. I beg you, this is only gonna be successful if we can demonstrate that people actually make use of these opportunities, and find ways to spread it and scale it. Anyway, so then I took the Star Wars thing. Try not. Do or do not. There is no try. I'm not sure what this means, but, you know, if George Lucas is listening, we're not really trying to co-opt Yoda or Star Wars, but we are trying to say that it's time to make this shift in the culture of medicine. We need to find the tools and strategies that are gonna be successful. We are fortunate to have leaders in medical journals now who are allies in this, but who recognize the responsibility of doing it at a pace and in a way that's acceptable and workable, and it's just up to us now to find the way. Thank you. Thank you very much, Dr. Drazen and Dr. Krumholz, and we're in for a treat. We've got a decent amount of time for discussion and for Q&A. I'm gonna take a little bit of that, though, with some initial questions before we open it up to the audience. And so to begin, let me just load a slide. We heard a little bit about the YODA project from Dr. Krumholz. We haven't heard about the SPRINT Data Analysis Challenge, and I'm wondering if Dr. Drazen can give people in the audience who are not familiar with it a little summary, and if Dr. Krumholz can follow up with what his experience has been submitting applications through that process. So who's heard of the SPRINT Data Analysis Challenge? Well, I can give a shorter form. 
A year ago, actually 11 months ago, 12 and a half months ago, we published the results of the SPRINT trial. That was a trial of systolic blood pressure control in older Americans. The study compared a target systolic blood pressure of 140 to a target systolic blood pressure of 120. There were thousands of patients enrolled in the study. When the study was conceived, there was concern that the group assigned to the lower systolic pressure would do poorly. They were older people, they had hardened arteries, they were gonna pass out more, they were gonna have more kidney failure. So it was to everybody's surprise in September of 2015 when the study was stopped early because of a benefit in mortality for the people assigned to the lower blood pressure group. That announcement was made, and within days Harlan and others said, we wanna see the data. At that point, the investigators hadn't actually seen the data. It had been seen by the data safety monitoring board. But then they had a chance to go over the data, and in pretty much record time they wrote up the paper, and it was published six weeks later. Now, at the time they published the paper, they didn't have the entire data set. You've got a study with thousands of patients that was stopped abruptly in the beginning of September. You didn't have closeout visits for many, many people. So they took what they had and they wrote it up and we published it. So then we issued a challenge, and that is for people to reanalyze the SPRINT data to teach us something that we didn't know. You can use the SPRINT data and anything else that you can find in the public domain. But to get into this contest, you have to do three things. First, you have to get permission to get the data. 
Which means you have to go to an IRB and say, I wanna reanalyze the SPRINT data. The SPRINT data had been anonymized, and the IRB here at Harvard, when we asked to reanalyze the data, said this doesn't constitute human subjects research and therefore we don't need to get special permission for this. They said it was okay. We took that down to the NIH, and with that information, the NIH released the SPRINT data to us, and they've released it to about 150 other people. Now, to enter the contest, though, you have to jump through one more hoop. We asked people who got the data a couple of questions. We asked them what the blood pressure was at the closeout visit, and we asked them what the odds ratio was for the primary outcome. And so far we've received somewhere around 35 responses to those two questions, and about half the people have gotten them exactly right. Now, one of the things that we really learned in the process was that we thought we had asked a precise question to which there was one answer. We started getting answers that were different from ours, and when we looked into them, we realized that we had written the question for people who are clinical trialists. And you know, you get a couple of clinical trialists in the room and pretty soon they're talking in clinical trial jargon, and you're going, what, what, what? He understands it entirely, right? He talks that way all the time. And the clinical trialists, they got it right the first time. But for people who were data analysts, we weren't so clear. And it turned out, when we went back and looked at the question we'd actually asked, there were eight correct answers. They didn't differ by very much, but you know, this is one of the things where we should be able to get a single correct answer. And so we've had to accept that. We've had to revise our data definition. So it was a bit of a learning process for everybody, but once you get the right answer, then you can submit whatever you want. 
It's still open right now. You can get IRB approval probably in a week. The contest closes on Valentine's Day. You've got plenty of time. Use these data to find something and teach us something new. We've gotten a bunch of submissions from Harlan's group. I think it's terrific. There are cash prizes, a $5,000 first prize, but more importantly, you're gonna get international bragging rights, because you're gonna be able to take a data set and find something in it that people think is really cool. Now, to judge this, we've got four different groups of people. First, we have clinical trialists. We have a bunch of people like that. We have a bunch of data analysts. We have people from industry, because they do a lot of clinical trials. But more importantly, we have three people who are actually participants in the SPRINT trial who agreed to be judges. And we're asking them two questions. People are gonna write a 700-word, one-figure abstract. We're asking the judges to rate it on a zero to 10 scale, zero being boring and 10 being really cool. And then the second thing is, okay, on a zero to 10 scale, how likely do you think this is to be true? And so to help with that aspect of the judging, we have a bunch of biostatisticians. But we're also asking the other judges to take a look at it and see whether they think it's true. And so based on those two scores, we're gonna be picking the winners of the contest. Now, in the spirit of 2016, 2017, 10% of the final score is gonna be from crowdsourcing. So if you submit an entry, you can get your mother and your sisters and brothers and all their friends, and they can vote for you too. That's gonna be part of the total score that we're gonna use to determine the winners, and the winners are gonna be able to present their data at a conference we're having in the beginning of April. 
So I encourage all of you, if you have any interest at all, get ahold of the data, find something interesting and neat in it, and show us that when you go over data, you can find stuff that really makes a difference. Well, I mean, there's so much in that. First, I think the fact that there were eight answers is bingo. That's what we're talking about here, that people looking at the same question with the same data are approaching it in somewhat different ways. And there's not one singular way that's right. Even with this statistical question. Now, you could argue, I think I take from what you were saying, that it's maybe a de minimis difference, but it's part of what this is about, that there may be different ways of looking at this. And you asked me what our experience is with this. It's been terrific. Actually, I thought it was a very interesting approach. In our own work with YODA, what we did was think, you've got to have an IRB, you've got to be able to write something that makes sense. And by the way, because we're oriented toward wanting you to be able to get it, we'll go back to you and say, what you wrote doesn't make any sense. I mean, you need to get another collaborator or work with someone. We'll give you a little bit of feedback, but we're worried that you don't quite know what you're doing. So we're trying to help you, but Jeff actually has taken that to another level, which is to say, let's have an entry question to see how people do. Our group has found this to be very energizing. I again recommend you all engage. It's a fun thing to do. It's interesting, but more than that, I believe it's important. The SPRINT trial is one of the most important trials that's been conducted recently. It has the potential to dramatically change the way in which we think about treating the very large number of people who have elevated blood pressure and to change what our notion of elevated blood pressure means. 
So the idea that we could potentially add additional insight is a very important one. I just wanna highlight one other thing, which was his second point here, which is, how likely is this to be real? It means you have to think about validation. You're gonna give this to a bunch of people. The number of people who are doing a number of things is quite large. So the bar here is quite high, to say, what evidence do you have that this is real, as opposed to having occurred by chance because there are so many people working in this area? And I think that that's a terrific thing too, because as these things go out and lots of people do lots of things, most of it's gonna be hypothesis generating. Oh, I observed something, I wonder if it's real. Can someone else replicate it? What is it? I'm not gonna purport that it's true. But for it to win this contest, you'll wanna identify something that is substantial and important enough, and while we can never know 100%, it's telling us this is likely not a quirky finding that occurred because there are a million people trying to do this. And so I think that's an interesting challenge. But I urge you all to engage. This is, I think, a terrific opportunity, and I'll say one other thing. I think the editors are watching. They're trying to understand whether or not this is gonna work. And so we wanna demonstrate to them that there's quite a lot of enthusiasm for this and that there are some people who can come up with things that are quite interesting and accelerate our learning from an investment, a substantial federal investment, that was already made in trying to ask a principal question. I will say, when I've talked to Richard Peto about this, he's very much of the mind that you do a study for one question, you answer the question, and nobody should touch the data again. And that's how they conducted ISIS-2, for example. There was never a second study of ISIS-2. And I have tremendous respect for Richard. 
I admire him and I have affection for him. He's someone I like quite a lot. For those of you who don't know, Richard Peto is sometimes thought to be the best statistician in the world, while others argue he's the best statistician in the universe. So again, that's just showing that there are various views among people who are in very influential positions. But anyway, in our interacting with the site and everything, we found this to be a very easy thing to interact with. I think people, if they take the time to learn about this, will find it to have been set up very well. Thanks very much. One question. I think you've seen very strong voices on either side of the spectrum in terms of, should we push forward cautiously? Should we push forward very, very aggressively? And something that I think has been brought up as an argument for caution is the lack of familiarity with the data in the clinical trial and how that could lead to misinterpreted results. And I guess one question I had, and something that I've seen irk data analysts, secondary data analysts and data scientists, is this notion of why is this particularly different than observational studies? So how are the concerns of data familiarity different for clinical trials than for observational studies? And what are the ethical implications of these differences? Well, I've heard this a lot. I've had prominent people in the trial world tell me only we can understand our data. And I've thought this is the strongest argument possible for data sharing, because if you have made decisions such that your metadata is in a shape that no one could understand, essentially you're saying it's unauditable, that there are decisions that were made that weren't documented. If people were to look at the books, they wouldn't be able to figure it out. And I've heard this said: no, we did things that we know about, we have the institutional memory, but they aren't documented sufficiently. So somebody's gonna get into trouble if they start using our data in that way. 
And I find that a very compelling case, because I believe if you knew that your data was gonna be shared, that would also, from the very outset, affect the way in which you were gonna document and clarify each of the decisions. The question is, is that inefficiency or is that quality? I mean, I guess it's debatable, but from my perspective, people who do trials well, I think, have metadata that would explain each definition and each major decision. And if somebody came in and tried to understand what the data set meant, they could do it. That being said, there are people who don't understand science or don't understand the topic. And whether it's an observational study or a clinical trial, that's a different problem. But saying that it's so idiosyncratic that only the inner circle could really touch and understand and appropriately use these data, I think, is a red flag. But here's Jeff, or anyone here who's worked with trials. And we should bring in others, but I don't know, Jeff, what's your view of this? Well, it's interesting. David DeMets, who is maybe the third best statistician in the world, was the head of a data shop that did a clinical trial, and they reported it. And he was on this IOM committee with me to look at clinical trial data sharing. So after the committee meetings were over, he went back, just for the fun of it, he said, to start, and looked at a trial that they had published six years before. And this is a professional outfit, a professional data shop. They went back and tried to reproduce exactly what they published. It took them months before they were able to do it, and with their own data. And what it turns out is that our data husbandry has not been very good. We don't really curate the data the way they should be curated. And it's a difficult thing to do. I like to ask people here, how many people have ever balanced a checkbook? 
You know, you're off by three cents, you say to hell with it, I'm never gonna find that three cents, right? But imagine something that's a million times more complicated, and that's what these data sets are like. And so you have to document everything you do, but we've been doing it for so long without the level of curation that we need. So what we're asking by sharing data is we're asking people to behave differently than they've behaved in the past. And it's more work for them. And they don't see the benefit. And one of the reasons we're doing this SPRINT challenge is that the only way we're gonna pull people forward is to show them that we can learn something that they didn't see, and that the thing that we're learning is useful. We haven't had those examples. We haven't had as many as we'd like. We have individual patient meta-analyses. We've got genetic analyses that can be done. But the big examples, the ones that have made the newspapers, have been where someone's re-analyzed a drug company study and shown that those results really weren't true. Well, they don't think they're true, but who wants their data to be put down like that? No one, because it's just a lot of work. And from my perspective, we need to change the way we take care of data. It's gonna be a lot more work. I acknowledge that. But we're living in an era of transparency. And things that we were okay to do 20 years ago, we can't get away with now. So if you're gonna try to change clinical practice like the SPRINT investigators, you need to put your data out there, and the people who re-analyze it are going to first need to show that they understand it and then teach you something new from it. So we're in the process. We're in an interesting time. We're learning how to do it, but I see the value, and the major value is to get people who are enrolling in clinical trials to believe in the result because the data are transparent. It's a game played in public. 
Everybody can re-analyze those data and there's nothing being hidden. And it'll take us a while to get there, but I think it's a worthwhile goal to get to. And we've listened carefully to the trial community. It's not something that we can implement tomorrow. Although it's interesting to say that at the National Institute of Allergy and Infectious Diseases, they've been letting contracts for clinical trials that require data sharing from the beginning. So when the data are gathered, they're gathered with uniform methodology and uniform data dictionaries, and then they're put in the database. So for example, in February of 2015, we published a study that showed if you take a three month old and feed him or her peanuts, they're much less likely to develop peanut allergy. One of the big scourges of elementary school parents is peanut allergy. We can make this go away by feeding these kids peanuts, but the kids that enrolled in that trial, they really did put themselves at risk. It wasn't substantial risk, but it turned out that we, the doctors, had created peanut allergy by withholding peanuts from kids. So that data set was made public the minute we published it. And David Harrington, who teaches first year statistics at Harvard College, uses that data set. And people are able to take the data and see in it that you can actually prevent peanut allergy. So it is just a way of changing the way that investigators think. And I give kudos to the National Institute of Allergy and Infectious Diseases here in the US and to Gideon Lack, the guy in London, who agreed when he started enrolling these kids to make the data public. It took a little more effort, but it's really paid off in the long run. I'd like to hear actually from Paul Ridker, a big clinical trialist who's here. How much extra work is it gonna be to do clinical trials so that you are gonna share the data? So first, let me begin by saying, I think it's been a very, very interesting session. 
And I greatly appreciate, Aaron, that you've put this together and the fact that it's being moderated. And I appreciate Harlan coming from Yale, and Jeff has been in the center of this from many perspectives. Yeah, so for those of you who don't know, I'm a data generator. I'm one of those people who spends seven to 10 years trying to get funding, organize the study, find sites, and travel all over the world to get people to do the research with you. And I ultimately feel responsible for the tens of thousands of people who enter into our trials. And I think everybody in my shoes would agree with the fundamental principles that both Jeff and Harlan laid out, which is this is all about patient care, it's about doing the right thing. People don't argue about that. I think where the rubber hits the road is issues of trust and issues of what the actual intent is. So it's really interesting to me that the title of the session was sharing and reproducibility. Personally, I don't think reproducibility has much to do with this. There are lots of other ways to achieve reproducibility. And for the students who are here, the FDA basically gets the database and does the analysis themselves, and if they don't match up, the FDA says so. And we've offered, Jeff knows this, to just put our SAS printouts online. If you wanna check our analyses, well, there they are, if you have the sophistication to do that. So I think reproducibility is actually not the issue. The issue, and Harlan's right about this part, is non-publication. That's a tragedy. So as a clinical trialist, when I ask a patient to take drug A or drug B, I always say to the patient, you have a wonderful physician who is willing to admit his or her ignorance, in the best sense of what that means, to allow a computer to decide how you're gonna be treated. And that's a phenomenal bond we have with people. And that's what Jeff and Harlan are getting at. That is what we all wanna greatly, greatly respect. 
So the tension becomes, how do we do this? So I don't wanna represent the data generating community in this one discussion, because I'm someone in that community who, A, came to this session because I kind of believe in sharing. But many of my colleagues feel very mixed and reluctant about it. So I speak as someone in that community who believes we should do this, but how we get there is tricky. Let's be clear about some things. There are many fields where the data generators already readily share. The clinical trialist collaborative, you were talking about Professor Peto and Professor Collins, they run a big collaborative. All the cholesterol trialists put all their data in there willingly because we trust these people to do a superb job. And the quality that comes back is very high, and those are the ones that drive our guidelines. I happen to do genetic epidemiology. In the genetics community, we all readily share our data because the net value of sharing is enormous. We get much better data, much better results, by having hundreds of thousands of data points instead of the 10 or 20,000 that any one institution might generate. So there are lots of examples where it already happens. The fear is, and Jeff goes back to the SPRINT challenge, the fear is that those investigators in SPRINT who put in nine years of time and effort, they should be the ones who write those secondary papers about gender and digoxin. Now, I don't know why that was never written. It's great that Harlan did it. We learned a lot clinically. But if you were an investigator, you'd say, well, wait a minute, that's part of what we hope our people will do. And just for the students, these secondary and tertiary papers are very important clinically, but more importantly, they're the mechanism where you learn to be a clinical trialist. They're the mechanism by which you learn how to handle data and do the work that's involved here. 
And so even within SPRINT, I know there's already anxiety about, well, gee, all these other people are getting access to our data before we've even had access to it, and how are we gonna do that? And do we have a long list of what our preferred secondary hypotheses are? And are those outside the competition? I don't know how that's being handled. I'm not a SPRINT investigator. But these are the kinds of things that people worry about on the application side. All that being said, I represent someone in that community who wants this to go forward. Because I do think we have an obligation to the fundamental altruism of the participants. That's what we're trying to honor. But I have anxiety. You had a slide up about what's the difference between observational and randomized trials. I'll tell you what's really different. And Jerry, I say this with all love. The real difference is the trials matter in terms of policy. They matter in terms of what people believe works. The observational studies, as beautifully done as they can ever be, will ultimately still be usually hypothesis generating, and they don't change things quite as dramatically as a trial. So trialists are very worried. The errors will be the same. But when there's an error in an observational database, there's lots of noise, particularly among researchers. But the public doesn't get two different opinions about does this work or does that not work? And that's a third level concern. All that being said, it's great to have this discussion, and both Harlan and Jeffrey are to be congratulated because they're both right. None of this discussion would have happened 10 years ago. And the fact that we're having this discussion is a huge step forward. And Jeff's right, getting from grouped data to individual data has not been figured out yet. But I am optimistic. 
And I'm optimistic because at the end of the day, what both have said is that if we just did a better job curating all this, which might be the de facto consequence of this, most of this is gonna go away. So that's a broad view. Thanks very much, Dr. Ridker. If I could give Dr. Avorn an opportunity to respond, that'd be great. Dr. Avorn's the chief of the Division of Pharmacoepidemiology and Pharmacoeconomics. I know some people have to go at two o'clock. If you do, please do. But our experts have kindly stated their willingness to stay until 2:15 to answer questions. So you will have an opportunity to pose some questions as well. Thank you, Amit. I will rise above the two slurs against observational studies that I've heard today, because it has been such a great session and I really wanna thank both of you for not only coming but also for your leadership in this area. There's one really key issue that I know we're all concerned about that has not yet been mentioned in this context, and I'd love your thoughts about it. And that is what happens to data about adverse events in trials which do not result in a drug being marketed. My understanding is that, as in the kind of awful example that Jeff started out with in that amazing videotape, that information remains the property of the company that was conducting the study, and that they do have the legal right still to suppress that if the drug does not make it to market, thus potentially putting patients at risk and setting back the science on that question, because nobody will ever know the details of those adverse effects. Could you guys comment on that? Well, I think that this goes along with the reporting requirements. And right now, I think they need to be strengthened in that area. I mean, I don't think there's any question that if experiments are run, we need to learn about the results. 
And I think for a company, sometimes, if they're not gonna follow through, it's a question whether it's even worth the resources to continue this. And I think we have to say, in my view, there's an ethical obligation to be able to share that information, because other people are acting on the same hypothesis or, had this particular thing never come to light, someone else could make the same mistake and another bunch of people could be harmed. To me, it's not controversial. I mean, we just need to, I think, articulate clearly what those expectations are. And I agree with Paul. I mean, the first level thing is this thing about reporting. And I think, in this case, you're making a really good point. It's not just about the efficacy endpoints, but it's also about what did we learn about safety? And that it's in a different legal status in terms of being able to oblige anyone to disclose the data. Jeff? The problem is practical. The German company that developed TGN1412 went out of business. Recently, we've had the, I think it was a U.S. company, but the trial done in France with the fatty acid amide hydrolase inhibitor, in which somebody actually died in a phase one trial. The companies disappear and there's no one left holding the gun or the bag, depending on your perspective. And so it's really sad. So the law now says that if you've registered your trial, then one year after the last patient last visit, whether it's the last scheduled last patient last visit or the last one that actually happened, you have an obligation to report. But the fine is $10,000 a day. Now the company's out of business, you know, nothing's gonna happen. So it becomes an ethical issue. You know, I think we have the right laws on the books. We just don't have a way to enforce them, and what we would need would be to have someone like the Flom Center say that, you know, this is in the public interest. We're going to take these data. 
The law is going to possess the data and give them to some third party, some YODA-like organization, and have them analyze it and put the data up for the ones that don't do it themselves. And if there was that kind of threat, maybe the companies that were still in business would do something. But it's hard to know. It's a tough area. I do want to respond to one issue Paul brought up, when he said that he preferred that the papers that we published, for example, be done by the investigators of the trial. So what happened in that trial was, as you know, Tom Smith was one of the principal people leading it. Tom passed away, unfortunately. There were other leaders of that trial who had made the decision, in the same spirit as, I think, what Richard believes, that we're done, you know. And I don't know why they collected all that data. I mean, maybe there were some people in the trial who wanted to collect a lot of data. Some people didn't. In the end, it was the people who had control over the data. There were people. I know Mike Lauer came up to me. Mike was part of that trial and said, I wanted to do that. And he was shut down by the investigators. I think he didn't realize that it had been posted publicly. I didn't realize that he had that interest, or we certainly would have invited him or would have told him about it. But the point was nothing was happening with that, and that they had been stopped by the leadership of the trial. So it wasn't gonna happen otherwise. We didn't preempt them. I think it's an interesting issue. You know, someone like Paul works nine years to produce a trial. What do you think in that case? I mean, if I could tell you that by opening it up we could make more progress in the world, that more people's eyeballs would be looking at it, that the quality of the research would be improved, that the degree of discovery could be surpassed. 
To what extent should we say that we should hold that back because we'll create a disincentive for people to spend the nine years that you're spending? I mean, you're still getting the principal publication and you still have the advantage of knowing the data faster and better than anyone. But what's your advice to us about this? Well, she's getting in common. And for people in the group, I mean, to be fair, Paul's one of the world's preeminent trialists and has done many of the leading and important clinical trials. He doesn't speak for all the trialists, but you are an influential trialist. Yeah, so I guess, again, with the caveat that I'm actually in favor of doing this. And you're here, I mean, you should have. Many of my colleagues are nervous that the cure is gonna be worse than the disease. That's the fundamental issue here, right? That we have a fundamental problem in non-publication that's atrocious, but this doesn't really address that. The investigators should simply be punished somehow. They should be banned from future NIH grants, whatever. There's gotta be some consequence that goes along with that. So this isn't about that. And as I said earlier, I don't believe it's about reproducibility. I think those are important issues, but they're not really what this is all about. So if it's about more and better quality data for people, then I'm all in favor of it. But I do think we have to absolutely respect the effort and time and money, frankly, and human capital and sweat equity that people put into these things. And I know that we're already running up against that as a problem. So again, we're here at the medical school, we have a lot of young people here. Would you want to enter large-scale clinical trials as your avocation if you thought that your data would be freely available to anybody else? And the answer is no. If you spent four years getting your degree and someone else picked it up, you wouldn't be very happy about that. 
So that analogy is real. And so we're trying very hard to figure out a way to find a middle ground so that these awful examples that are egregious don't happen. And as I say, we don't throw the baby out with the bathwater, because if there's not another generation of people who want to spend seven to 10 years doing these things down the road, they're not going to get done or they'll get done poorly, which would be a much bigger public health catastrophe. So these are the challenges that we're trying to face. Dr. Ridker, I think you've got a comment or question behind you. If you could just introduce yourself to the audience, that'd be great, thanks. Mark Goldberg. I wonder if there might be an intermediate ground, the same way that when you write a clinical trial, you prospectively define your primary, secondary, tertiary, exploratory endpoints. There might be very novel ways for an outside scientist, a physician scientist, to review, to analyze your data, and you wouldn't want to lose that. However, would it be appropriate perhaps, when you write your trial, that along with that you report prospectively what secondary analyses you plan to do in your group that won't be part of that first paper, but will be things that you've already decided you want to do? They're pretty clear, they may be obvious. Those, at least for some period of time beyond the first publication, would be within the domain of the original trialists, to protect all the sweat they've put into this. But if there was a novel analysis that they hadn't thought about and they hadn't defined prospectively, that would be fair game for outside reviewers. And then after, say, some two year period of time after the initial publication, it would be open to anybody. What about that? To me, it's a good idea that you have a data sharing statement, and part of the data sharing statement has to do with future plans. 
If you've worked very hard, and before I took this job I did clinical trials, and I like to tell the story of two people, one in a basic science lab and one in a clinical trials lab, who started at the same time. Six years later, the guy on the clinical trial side had yet to publish a paper, and the person on the basic science side was working on his 12th paper, and he's now a professor somewhere. That was 20 years ago. So it's a very different reward system. And I think that we need to allow clinical trialists the opportunity to stake out ground, stake out a claim like you did in the gold rush. This is what I'm planning to do next and after that. But there needs to be some time limit. The NIH has said that's two years, but they're thinking of shortening it. But I think even if they shortened it, we ought to give clinical trialists the opportunity to stake out a claim for longer than that, because I know it takes time to do it. We don't want to destroy the incentive for people to be in clinical trials and to do clinical trials. So I'm strongly in favor of that idea. Exactly how we're gonna implement it is not clear, but I think it's part of the way forward here, and that is to give people the opportunity to stake a claim. I'm Dan Grammer from the Beth Israel. And this is a question for both of you, but the examples that Dr. Krumholz showed from the FiveThirtyEight blog, the soccer example and the political economic manipulation example, were both chilling in a way, I think, for the concept of data sharing, more so than reporting and some of the reproducibility issues. And the message that I took from those was, one, that for the soccer example, it shows that there's some art as well as science to statistics, and that different smart, well-trained statisticians can approach a problem in a different way and get a different answer. 
But the other example shows, I think, that statistical software and statistical tools are broadly enough distributed, and that the general research ecosystem that you've described knows just enough statistics to be dangerous, and that people can use R or SAS so quickly and so easily now on easily available observational data sets or secondary analyses of large clinical trials, like the JUPITER data set. This is 15,000 people, 18,000 people. You can find a lot of appealing P values in that data set very, very quickly without all that much work. And how do we safeguard against that? Everybody says that, well, these are hypothesis generating, but they can be hypothesis generating subgroups with 8,000 people in them. I mean, there are a lot of people who will find ways to publish those kinds of data, and how do we safeguard against that, even if it's well-intentioned, that misattribution of causality? How do we safeguard the integrity of the kind of research that's being done with those sorts of data spread more broadly? Well, I'll take a shot at that. I mean, again, this is my opinion, but for instance, Collins in one of his books said, science is progressive and self-correcting. And if the data are out and available, then it should foster a dialogue about what's the difference. I mean, what I'm interested in when I see that soccer example is, I mean, I stopped at just showing you that there are expert groups that disagree with each other, but I'm actually more interested in what happened. Like, what did they actually do? And does that help us in a dialogue about understanding what the truth is likely to be? Truth is elusive, but if we're gonna try to study that, it begins opening a discussion about what is it that's leading to one thing versus the other, same as when I was showing the vaccine adjudications. 
I mean, I don't know why that looks like that exactly, but it starts opening a question about what's going on and why does it look like that? And I don't think we should be afraid of that. I think that the issues, and now I'm just separating it out, because I think Paul's got a lot of good points about these issues about respecting people's contributions and how you figure this out. And I think we have to listen carefully to those who have done good work and have created the data sets. But on this issue about what the truth is and how you get to it, I think that we have to be humble about that and willing to go into uncomfortable spaces. And I guarantee you the same issue exists with the trialists in terms of what's primary, what's secondary, when was it declared, how was it pre-specified, how exactly did they approach it? There's a lot of variation in the way in which that is done. And when you're a group, I mean, I like the idea that the cholesterol folks are sharing with each other, but I also know that's made up of a lot of cholesterol advocates. Now, we're not looking for cholesterol deniers, but are they truly independent, or would different eyes see something else in there that's worthy of discussion and dialogue? What we did with the Medtronic example was say, we're gonna get two groups with this data and then we're gonna give the data away, because if people accuse one group of interpreting it with an agenda, people will be able to get the data and take a look. And I think somehow we have to get more comfort with uncertainty. We have to recognize the degrees of differences. That's why I love the fact that the New England Journal thought they had an answer that anyone could, I mean, this is like a great tale that should last for hundreds of years. There were eight answers to what the New England Journal thought was a right answer. 
And that's almost a more important lesson than anything, because you think that there's only one right answer, but it may depend. I do think we've got to inch our way towards what truth is, and we've got to understand, when people get different answers, why and what does it mean? There will be crackpots out there, but there's no protection against that except that the data are out there and people can inch toward it, yeah. Last comment, and then I'll let Dr. Drazen and Dr. Krumholz give a concluding thought. Thanks. Dr. Winter, go ahead. Yeah, I just have one comment. My name is Hesu Kim from the Dana-Farber Cancer Institute. We are talking about data sharing, but I think we should also consider outcomes from patients who are not on clinical trials. So based on my experience, for patients who are not on clinical trials, their outcome is much worse than for patients who are on clinical trials. I mean, for example, like ibrutinib in CLL, chronic lymphocytic leukemia, you know, when the clinical trial outcome came out, the response rate was nice and the toxicity rate was reasonable, but when it was applied to the general population of CLL patients, the response rate was much worse and the toxicity was much higher. So how can we capture this real world data in addition to data sharing from clinical trials? Paul, do you want to just say something quickly? Yeah, so I just want to make one more comment. The last interchange was quite interesting to me, and it makes me a little nervous, Harlan, because it's Jon Stewart who put truthiness into all of our concepts, right? This idea that what the truth is is highly variable. And intellectually, of course, I totally agree. The difficulty is, and for the students in the room who are becoming physicians, this is medicine. This is not physics, where we can share all kinds of data, have all kinds of opinions, and no one understands it anyway, except for the six people in the world who are going to act on it. 
In medicine, and Jeff has to deal with this every single day, I'm sure, people act on this information, and they're going to give drug X, they're going to give drug Y, they're going to do therapy X, and people's lives depend on this. So truthiness here makes me very nervous. So I'll give you an example. I sit on oversight for the NHLBI for its clinical trials, and we've argued that to get funded by the federal government you have to have an extraordinarily rigorous process of grant writing and grant review that goes on for literally years. Then you have to convince them to still pay for the study, even if you get a great score. Then a protocol review committee steps in and lots of experts get involved. And every step of the way, it's rigor, rigor, rigor, rigor, rigor, rigor. And then I'm just going to hand the data off at the end. And that makes me very nervous. Not that we can't have 12 opinions, but if they're diametrically opposed, we're going to undermine the whole system, and they shouldn't be diametrically opposed. And the tension here is that people do have agendas, and we all have lived through people having agendas, and those agendas can be matched up to people with excellent analytic skills to look at the data, and you may or may not get different degrees of truthiness. So my hesitancy here, which is real, is this is a field where physicians act upon these things, and our patients depend upon us as investigators, on journal editors as the arbiters of this, and on the outcomes community to do as good a job as they can. But there is a lot of nervousness out there about the slippery slope of truthiness, and I think we just have to acknowledge that. Dr. Krumholz, Dr. Drazen, a last comment and response. So we're not really in totally unexplored territory. The National Heart, Lung and Blood Institute has had BioLINCC out for a decade. They have hundreds of studies. 
There's only been a very small fraction of people requesting the data to reproduce the results. And I agree that it only takes one terrorist to blow up a lot of people, but most of the time people are using data to answer questions that are ancillary to what was being asked in the clinical trial, which is getting the most value from it. So do we worry about the devil of what can go wrong enough to stop us? And I think that if we have a code of good conduct among investigators and, more importantly, a code of good conduct among journal editors, that we don't publish crazy stuff, or that if somebody does an alternative analysis and comes up with a different answer, they have to tell you what they did differently to come up with a different answer, then the argument has to do with the validity of those assumptions. That's a fair argument. But when they find an adverse event, say, people with no hair who take this drug are likely to lose a foot, that's what they found in the database. It probably happened by chance alone. So I think we're moving forward in a new area. We have to acknowledge it's a new area. We don't know exactly what we're doing, but there have been explorers out there and they've come back unharmed. There have been some explorers that have gotten hurt, but it doesn't mean that we don't explore. And I think that the richness of the territory is such that it makes the exploration worthwhile and that we should do it. And I'll just say, as I started, deep respect for the leadership of the journals. I want to say deep respect to the trialists who are generating data. I want to see them be able to get the credit. Every paper that derives from their work, I think it should. It's like the Human Genome Project: everyone who's worked on the reagents that were developed as a result of that reflects well on the people who presented that. I think medicine's becoming an information science.
I think these are just the different views you're seeing. In my view, it's exactly because of the consequence of the trials that they have to be able to withstand public scrutiny in a way that is higher than what we do for almost anything else. And it's gonna be up to us. There will be people who try to corrupt that system, but the fact is that, since everyone can touch data, no one is privileged beyond anyone else to be able to take a look. But I want to honor the view of trialists too. And I think there's gonna have to be a lot of public dialogue, and we have to study this carefully. I think it's already being done and we'll continue to make progress. The last word here is that a lot of this is gonna be in the implementation too. How is this implemented? To what extent is it made open, and to what extent are we able to protect privacy, promote science, and ultimately get better translation of real clinical trial results that are meaningful than we're even doing now? Thank you, and thank you, audience, for attending. I hope you've enjoyed the event. Stay tuned for more events from the Center for Bioethics, from Portal, and from Petrie-Flom. Thanks a lot. Thank you. Thank you.