 All right, I guess we're going to try to get started. Good morning, everybody. First, my name is Steve Engelberg. I am the editor-in-chief of ProPublica. And I'd like to thank the Kaiser Family Foundation first of all for this magnificent space and breakfast. Really amazing. In the conversation before we started, I was asked, and I think it's perhaps useful, how did ProPublica get into this business in the first place? And like so many of these stories, it's a slightly long one. Many years ago, it seems, we brought into our ranks Marshall Allen, who had done some fine work on the question of patient safety and patient harm in Las Vegas. And he had this tiny ambition of saying, what could we do nationwide? And it sounded actually kind of impossible. There are 50 states, after all, but we started looking into what data existed. And we began to think we could do a project that examined quality of care and patient safety issues at a hospital-by-hospital level. And so we started a Facebook group. We ultimately got thousands of people to give us anecdotal stories about their experiences. And then we began crunching data with Olga Pierce, our data guru, working hard on this question. And several years into the process, Medicare, in both a good and a bad way, changed all the rules on us and suddenly released surgeon-level data. And so we said to ourselves, well, gee, how hard could that really be, to do this at the surgeon level? All you'd have to do is risk adjust and figure out some formulas and just run a few regressions. We could probably have it done by noon. No, it was not like that at all. There's an iron rule of investigative reporting. And I've been both an investigative reporter and editor for much of my career. I always say to people that investigative projects take twice as long as the most optimistic estimate. And I now have a corollary to that rule, which is that data projects take four times as long as the most optimistic estimate. 
Anyway, all of this led to the creation of Surgeon Scorecard and a very vigorous conversation, which I look forward to continuing today, about its pros and cons and its value. As we get into this, I do think it's worth keeping in mind something that I've certainly kept on my desk throughout this process. I have to confess, I'm so old I was a journalist in Washington in 1999 when To Err Is Human came out. At the time I was covering the CIA, so it was just a sort of side interest of mine. And I recall reading it and thinking, wow, this is an incredibly important thing. And I'd just like to quote a couple of lines from it because I think it's useful. We're not that far from the 20-year anniversary of To Err Is Human, which is a very good report. And this is what they said of this question of patient harm and patient safety in 1999. They said, the status quo is not acceptable and cannot be tolerated any longer. Despite the cost pressures, liability constraints, resistance to change, and other seemingly insurmountable barriers, it is simply not acceptable for patients to be harmed by the same healthcare system that is supposed to offer healing and comfort. And they continued: to err is human, but errors can be prevented. Safety is a critical first step in improving the quality of care. The Harvard Medical Practice Study, a seminal research study on this issue, was published almost 10 years ago, that would be 1989. Other studies have corroborated its findings. Yet few tangible actions to improve patient safety can be found. Must we wait another decade to be safe in our health system? I think those are some words to consider, and to even ask ourselves the question, have we made as much progress since 1999 as we would hope? Our group today is an incredibly accomplished group. Our moderator, Scott Hensley, has covered healthcare for more than 15 years as a reporter, and since 2009 has been the host of Shots, NPR's online health channel. 
We have Nancy Foster, who is the Vice President for Quality and Patient Safety Policy at the American Hospital Association. Mark Friedberg, a senior natural scientist at the RAND Corporation; his research focuses on performance improvement by healthcare providers and methods for measuring health system quality. Dr. Martin Makary, Professor of Surgery at Johns Hopkins University and the author of Unaccountable, a book about physician-led efforts to increase transparency and improve healthcare quality. Olga Pierce, the Deputy Data Editor at ProPublica and a member of the team of data journalists that created Surgeon Scorecard. Ashish Jha, a Professor of Health Policy at the Harvard School of Public Health, Professor of Medicine at Harvard Medical School, and Director of the Harvard Global Health Institute. And of course, Marshall Allen, who is going to give us a very brief overview of the subject. Thank you. I hope we have a spirited exchange, and I know there's gonna be a lot of Q and A and a lot of exchange. So here we go. Good morning. Thank you for joining us today. My name is Marshall Allen. And I'm Olga Pierce. And we have been covering patient safety for ProPublica for quite a few years, and we're really honored that you all came to join us today to talk about some of our work. As Steve mentioned, there's a major problem in American healthcare for patients and for providers. Patients keep getting hurt when they undergo healthcare. If you were to total the hundreds of thousands of people who die every year because of medical errors, infections, and injuries, patient harm would be the third leading cause of death in the United States. This problem is largely kept secret. It's even kept secret from the patients who suffer the harm themselves. And unfortunately, the self-policing of the medical community just doesn't work. Patients, even when they undergo a common elective surgery, face a huge dilemma. How is someone supposed to choose a surgeon? 
The medical community does a lot of internal quality improvement work. The medical community refers patients to doctors, but essentially the message that the medical community sends is: just trust us. Unfortunately, though, patient safety is estimated to be worse today than it was 15 years ago when To Err Is Human was published. So has this trust really been earned? Is the best way forward more of the same? So coming out of these questions, we developed a tool called Surgeon Scorecard, which reports data about eight common elective procedures for about 17,000 surgeons. And because we're journalists, we started by trying to answer two basic questions about these surgeons. How many operations did a surgeon perform? And how many of those patients died in the hospital or were so ill or in so much pain that they had to return within 30 days? Being that we are journalists, we relied on a lot of experts and a lot of sources to help us with this project. We have gathered the stories of and heard from thousands of patients who have been harmed in the course of our reporting. And they greatly informed our perspective, and certainly the moral force that we bring to this argument and this debate. And we also talked to a lot of providers. We talked to frontline surgeons who did each of the procedures that we focused on in the Scorecard. We talked to doctors who treat these patients in the hospital. We talked to experts who work with this data on a daily basis. And every step of the way, as we made these key decisions, we treated them as journalistic questions, where we went to the experts and asked them: what decisions should we make? What's the most responsible way to do this analysis? Of course, to undertake something like this, there are many challenges. So we just wanted to walk you through some of these challenges, some of these forks in the road, and explain why we made the decisions that we did. The first challenge was that perfect data is unavailable. 
Even inside the medical industry, data is patchy and incomplete, and most of that data is not public. So what we were able to use is Medicare billing data. It does include basic information about the patient, like race and age. It includes basic information about a hospital stay, like how long it was, and basic information about the procedure that was performed, but it absolutely is not complete clinical records. The second challenge was that data can be unreliable. We learned from looking at the data that some hospitals just code better than others. And so with the guidance of experts, we identified two reliable data points to identify acute patient safety events. The first was deaths in the hospital. Most of the time, hospitals are able to tell if a patient left the hospital alive. And the other measure was readmissions within 30 days for a complication related to the surgery. And we had dozens of doctors and frontline surgeons helping us identify those instances. Another challenge we faced was accounting for high-risk patients. We all know that all patients are not equally risky. And so we tried to tackle this in two ways. The first was to eliminate as much as possible the variation in patient risk in the pool of patients included in our analysis, which is why we chose low-risk elective procedures. And then further, we excluded anyone who came in through the emergency room. We excluded anyone who transferred in from another medical facility. And we also made sure that anyone with an unusual diagnosis was excluded from our analysis as well. But we actually didn't feel like that was enough to be fair to surgeons. So we went a step further and built a risk adjustment model. We were guided in doing this by a biostatistician at Harvard named Sebastien Haneuse. He's very smart and also lovely to have tea with if anyone's in Boston. And basically what this model does is account for patient factors like age, for example. 
And also, through the vagaries of statistics, it attempts to account for the fact that different hospitals overall can be more or less safe. And finally, an important goal for us was to make sure that the tool we built in the end would be parsable by normal human beings. CMS uses measures like infections per thousand ventilator hours. It's not always clear to patients how they should use that information. Different measures might conflict. So we wanted a single measure, which we called the adjusted complication rate for simplicity's sake, where the units were people. So how many people went in? How many people came out? So we ran this analysis on five years of Medicare data. And what we found was a lot of variation. I mean, as you would expect, the average complication rates were quite low for these procedures. We're looking at hip replacements, knee replacements, spinal fusions, prostate removals. These are very low-risk procedures. And so the average complication rates were, as you would expect, around two to four percent for these types of patient safety events. But even on these low-risk elective procedures, there was still quite a bit of variation, even among surgeons performing the same procedure in the same hospital. We also found, and this gentleman on the right here is just one example, a lot of surgeons who do very well on this measure, even working in places you would not expect. So Dr. Aaron Joyner here is from Muscle Shoals, Alabama. It's a community with an extremely high diabetes rate, very high smoking rate, very high obesity rate, and yet his performance by this measure was one of the best in the nation. So we'd like to take you on just a quick tour of Surgeon Scorecard. We understand not everyone visits it every day. So this is what you'd see if you went to a hospital page, either by searching for the hospital or by searching for hospitals in your zip code. Each of the bars you see at the top represents a procedure performed at that hospital. 
Each of the little peg-shaped figures on the bars represents a single surgeon. And the yellow, green, and red zones just put that in context with surgeons nationally. Then if you zoom in on a particular procedure, you can see side by side the surgeons who perform that procedure at that hospital. In this case, for example, you can see that Dr. James Ronzo performed the procedure 297 times in Medicare and had an adjusted complication rate of 2.8%. And Dr. Constantine Toumbis performed the procedure 246 times, had 27 complications, and has an adjusted complication rate north of 7%. And I wanna take a minute to talk about the way we tried to be transparent in portraying our data. What you see on each doctor's result is both the point estimate that comes out of our model and also a confidence interval that goes around it. We felt that in order to be transparent, it was really important to also include information about the precision of our results. On the other hand, we knew that there was a trade-off between that transparency, which we felt was the right thing to do, and patients necessarily being able to understand what we were showing them. And so we did everything we could in our visualization to try to ameliorate that problem. When you hover over a surgeon's result, a box like this pops up. There's a link to learn more about what we're measuring in the adjusted complication rate. There are some additional graphical things that pop up. And also, when you hover over certain parts of what you see here, text boxes pop up with additional language helping patients understand what they're looking at. So there are some obvious limitations to this analysis. We've always disclosed the limitations because they are important. For one thing, we're looking at Medicare fee-for-service cases. So we're not looking at every single one of a surgeon's cases. We're also counting only what we would call the more acute patient safety events. 
So we're not actually capturing every complication that might occur that might be important to a patient. Obviously, it would be ideal in this kind of analysis if every surgeon in the country performed 5,000 of each of these procedures. But in the real world, the volume counts just are not that high. Past performance doesn't guarantee future results. So again, we're looking at previous data. And the data does not say who's to blame or who caused these complications. But the American College of Surgeons and others agree that surgeons are still responsible for the complications that occur to their patients. And our mindset is that surgeons are also the best-placed people in hospitals, with the most influence and the most information, to actually intervene and see what might have happened in these cases so that future patients can be protected. There is tremendous power in transparency, and we've been very pleased with the feedback we've gotten from the Scorecard. The National Patient Safety Foundation put out this report last year called Shining a Light, where they call transparency a magic pill that can cure the patient safety problems that are ailing America's healthcare system. The Scorecard has been used more than 2 million times. We've had an enthusiastic response from all different stakeholders. I talked to a hospital CEO who had been dealing with a surgeon who had a very high complication rate in his facility, and that surgeon wasn't being compliant with the peer review or the interventions that the hospital was trying to make. Once we made that surgeon's complication rate public, the CEO told me, that surgeon immediately changed his tune and came into compliance with the efforts to actually improve his performance. We've had meetings with specialty societies. The specialty societies realize the need to develop their own quality metrics, and this has spurred them along to do that. That's what they tell us. 
We got this email from a quality improvement expert from Palo Alto. This is Dr. John Cooper, who's with a specialist group that has over a thousand doctors in it. He congratulated us on the Scorecard. He said he wants to help us as we continue this work. He talked about how difficult it was for him to get even the most basic quality information from the six private hospitals where all of his doctors work. And he said, I'm hoping to leverage what you have to help us support our ongoing patient safety and quality improvement efforts. And as we said, Dr. Cooper and other providers on the front lines have volunteered to help us as we continue this work. So in the roughly eight months since Surgeon Scorecard first published, we've received lots of amazing feedback, and we're using it to help us put out what we're calling Scorecard 2.0. We think it will be an improvement. From the beginning we imagined this as an iterative process, and so we're really happy that people have helped us move it along. Here are some of the changes that we have in mind. First, we're working with the medical community to try to capture more inpatient harm, which happens before patients leave the hospital. Next, no more exclamation points. Nobody was really a fan of the exclamation points that appeared next to some hospital names. We're also tweaking our visualization to put less emphasis on the red, yellow, and green categories, mostly so that we don't overstate the precision of our work. We're also adding clarifying language about what types of complications we're measuring, and also what we're not measuring, to make sure that it's clear. We're also working with surgeons and others to create a risk adjustment that's more precise and more specific to each procedure. And we've also been working with, and will continue to work with, hospitals where there are severe coding problems. 
At the moment, we've had to switch off their profiles in our data, and we wanna get them back in, and we've had some success with that. So a few final thoughts. I mean, for us, as we've explained, this is an iterative process. So we feel like we're kind of at the starting point, not the end point. But this is also a starting point, we like to say, for patients. As they try and decide where they're gonna go for one of these eight procedures we've looked at, this is a good place for them to start having informed conversations with their doctors as they choose where they go. It's a starting point for the medical community to look at the surgeons who have the lowest complication rates and see what they might be doing that others could learn from, and then look at the surgeons who have the highest complication rates and investigate and see if something might be done to intervene or to coach or to help those surgeons improve. We also have a challenge for the medical community, and the challenge is this: we would love for you to put us out of business. We got into this as a journalistic enterprise, and we have learned a ton in the process, but we would love for the medical community to start gathering the type of data that can be used for this type of analysis and publicly reporting it, for the sake of patients and also for the sake of the improvement of the medical community. We appreciate you all being here, and we invite you to help us. You can reach us at these email addresses and find us easily online, and we look forward to the conversation. Thank you. Anywhere you guys want. Thanks a lot to everyone who came in person, and also to the folks who are online. There is a hashtag for this event if you are interested in tweeting, and it is #ProPubLive. So tweet away. 
I was gonna tweet, but I had to turn off my phone because of the mics. But Cynthia, if you see something during the open discussion part, please let us know, and we can incorporate that into the questions the public can ask after we do our moderated discussion now. I'd like to start with Olga. You anticipated some of my questions in the presentation. Marshall, thanks, and Olga also. Can you talk a little bit, beyond the impressive two million uses of the Scorecard so far, about how you would assess the impact of the debut of Surgeon Scorecard? What's happened as a result of getting it out there? I love the anecdotes, but I'm thinking about conversations you might have had with some of the stakeholders, or where it might have surprised you. Yeah, so basically since Scorecard launched, Marshall and I have been on kind of the Marshall and Olga traveling road show. We've met with surgical societies. We've met with hospitals. We've met with providers. We've met with patients as well, and we're in an ongoing conversation with them. And I think if I had to choose two things to highlight in terms of impact, one would be just the fact that patients feel like the power imbalance between patients and doctors has been shifted in some way, that you have something you can go in and point to and say, Dr. XYZ, explain this to me. And Dr. XYZ might have a perfectly good explanation, but it still gives you a starting point, so that it's not a case where the doctor holds all the cards and the patient has nothing. So there's that, which I think is a tremendous impact that's difficult to measure, but I think is real. The other thing I think that's happened is that a conversation has been jump-started. Some of it is in the patient safety community, which is amazing. 
Some of it is jump-started in a way that maybe some people wouldn't prefer, but the surgical societies are having conversations now along the lines of: we hate Scorecard, what can we do to replace it? Clearly, patients want this information. Clearly, the future is here. What can we do to inform patients better? And honestly, that to me is an excellent outcome. Thanks. Steve pointed out that since To Err Is Human, things have gotten worse, not better. I know, Mark, that you've had some criticisms about the way that Scorecard works, but I think Olga brings up the point that ratings aren't gonna go away. I know a couple of people here in this audience who are proceeding apace with their own rating systems. Could you talk a little bit about some of the things that you and some of your colleagues have written about, which is this balance of imperfect data, incomplete data, and yet the desire that Olga points out for patients to have some information to make better choices and to have a better conversation with the person who's going to operate on them? Sure. First, I just want to thank ProPublica and everybody here for coming and for inviting me. I want to start by saying a couple of very nice things about the Surgeon Scorecard. I think the intent behind the Surgeon Scorecard is widely shared by people in my line of work in health services research and many, many people who do quality improvement in hospitals and outpatient clinics across the country. There's clearly a need for more transparency, especially regarding individual surgeon and other provider performance in this country. So I agree completely with the need and the intent of the Scorecard. Where I and my colleagues had some concerns was about the validity and reliability of the Scorecard, and I'll explain what those mean. Validity is really the holy grail of performance reporting. 
It's whether the information in a performance report is true or not. And you're never going to get perfect truth, but you can start to approximate the truth if you do a good job, check all your boxes along the way, and do your due diligence in checking your data before publishing what we would consider a high-stakes report that people might use to choose their surgeons and that surgeons might use to decide whether or not they need to improve. The concern is that if you get it wrong, you could of course have some patients, maybe a lot of patients, being misled toward surgeons who are in fact not as good as other surgeons at producing whatever health outcome the patient wants to have, and who may not be as safe. Reliability is a related concept, which is just the signal-to-noise ratio in a performance report. But validity comes first: if you don't have validity to begin with, don't even worry about the confidence intervals or the signal-to-noise ratio. The important thing is to be accurate and correct on average. When you produce a new performance measure, and I would argue that the adjusted complication rate that ProPublica has produced is novel, it's not something that's been reviewed by NQF, it's not a measure that's been used by anybody else, then for that reason, I think a little due diligence would have been advisable before publishing the Scorecard. Namely, and there are two really important things here. The first is just to make sure that the measure being reported, the one you've calculated and spent all this time producing, tracks pretty well with some other kind of measure of the same underlying construct. And I would argue the construct being measured here is whether the surgeon's a good surgeon. Is the surgeon gonna produce a good outcome? Or is the patient gonna be harmed? 
If you don't do that step, you never really know whether your measure, coming from these claims data, which are available but, as Marshall and Olga rightly pointed out, have their limitations, is actually leading you in the right direction. There's also the issue of: are you even assigning the right cases to the right surgeons? And this is a really tough thing to deal with, and actually to detect. And unfortunately, in Medicare Part A claims data, these are the hospital claims which were available to the public to create this Scorecard, there is an error rate. This manifested to some extent in really obvious ways in the Scorecard, where there were some cases in some hospitals, one right down the street from us, Mass General, where you had primary care physicians and other kinds of physicians who don't do surgery listed initially as being surgeons and performing things like hip replacements. And there are still a couple of those folks listed on the Scorecard at Mass General. These are physicians, cardiologists for example, who just don't do hip surgery. The issue there is that you're seeing the tip of the iceberg. You're seeing smoke, but there's probably a fire that you're not seeing, which is: if you have that kind of error, which is very obvious to detect by looking at specialty codes, there's an unknown amount of misassignment between surgeons in the exact same department in the same hospital. And this is kryptonite to actually having an accurate measure, on an individual surgeon level, of the risk or benefit of surgery. There was a group at ResDAC, this is a CMS contractor at the University of Minnesota, very well-regarded researchers, who in 2012 actually looked at this nationally. 
So we sort of had this hint in the Scorecard, but they looked at this a few years ago and found, for one of the orthopedic procedures reported in the Scorecard, they didn't look at everything, that the rate of mismatch between Part A claims in Medicare, which are the hospital claims available to ProPublica, and Part B claims, which is what the surgeons submit to be paid by Medicare for their part of the operation, was 28%. You have no idea whether that's random disagreement, or whether it's systematic in some way that could bias the Scorecard in unknown ways. And the only way to really figure that out is to actually do a credible validation study, to just make sure these cases are actually assigned to the right surgeon. So those were our concerns. There were a few others that are in some critiques we've written. Marty, given what Mark said, what's the action to take? Not do a quality measurement system at all? Or, as I think Marshall and Olga suggest, do it iteratively: come up with something, get the feedback, and look to improve it? What's your view on taking action and making ratings versus waiting until you get more validation, or getting the information sussed out a bit more? Well, first of all, I wanna thank ProPublica for doing this. I think this is a great conversation among really strong leaders in the field of quality and safety. So it's a privilege to be a part of this. I'm always amused at how much people are worried about harm to surgeons. We're doing just fine. And I don't know anybody in the Surgeon Scorecard who came out so badly that they are in jail or no longer practicing or whose practice has taken a big hit. So I think we're doing okay. What I'm worried about are the patients. And that's a big deal. Just as there are milestone events in healthcare, there was just one with Leslie McCarthy at St. Mary's Hospital in Florida. 
A baby who died at the hands of a cardiac surgeon who probably should not have been doing cardiac surgery. And it took a team of investigative journalists to discover that she was really one in a string of seven different babies who died. It was a Johns Hopkins surgeon who went down as part of an independent review commissioned by the state to discover what was happening, and who in a very strong report wrote that the surgical care was horrendous, that the program should be shut down, and that people were dying needlessly. Now that's real harm. There is nothing out there now that detects patterns of egregious complications. The malpractice system doesn't pick it up, because every patient signs a consent form agreeing that a complication may occur. But when you have a pattern of complications, what do we have out there? We've got nothing. These are serious things. And as a surgeon, you see stuff and you just wanna be able to say, do we have a duty to do something? And if you leave it up to our own professional associations, things generally move at a pace of about one inch per decade. It took the cardiac surgical community about 30 years to perfect or master a system. And it turns out that hospital in Florida didn't even participate in their registry. So we need some greater transparency, if we believe, and this is the primary question we have as a society: does the public have a right to know about the quality of their hospitals? And I think they do. Well, we're gonna hear from Ashish Jha a little later, but an even younger Ashish Jha, still in knee pants in 2006, did write on one of the issues that you brought up, looking at the New York State coronary artery bypass surgery reporting system, and found that the scorecards there, the registry and the ratings, had not led to changes in market share. People weren't acting on the information, even though that was the gold standard. 
But he and a colleague did find a suggestion that the poorest-performing doctors in the registry moved on to do other things. So maybe he'll address that later, but I did wanna bring that up as one potential downside of the system. In that case, I think Ashish points out, well, maybe that was a good result. Maybe that registry worked, and it helped incentivize people to move on to things that they were better at, or less bad at. But anyway, I just wanted to throw that in there, and I hope that maybe Ashish will give us an update in the final bit. Nancy, one of the issues that I hoped you might be able to talk with us about is the limitations on the data that are available now. I think Olga and Marshall laid out why the Scorecard relies on the Medicare claims data even though, as Mark points out, the Part B data wasn't included, at least I don't think so, and it's only a slice of all the data that anybody would wanna have. What is the role of hospitals in collaborating in this, or in helping make available data that would get to a better result? Great, thank you. Thank you for the opportunity to be here, and I too wanna join my colleagues in congratulating ProPublica on having this forum and on being open to the idea of people talking about how to make the system better, even if, in the eventuality, it might mean ProPublica getting out of it and somebody else taking over. I appreciate, Marshall, your raising that question, that the field might put you all out of business. That's an interesting notion, or at least out of this business, not totally out of business at all. So it's interesting to see that most people tend to use Medicare fee-for-service data because it is the most readily available. It also has all of the problems Mark was just talking about, and some other critical problems. 
For example, many of our clinicians tell us that you just can't get enough rich and appropriate clinical information out of claims data to effectively risk adjust for differences in patients. We've seen that with cardiac patients, we've seen it with a variety of patients, and it plays out in a number of ways. In addition, I really appreciated the fact that you all attempted to adjust for socio-demographics. Again, not something that the hospital holds; you have to look to other data sources that are available to identify whether there are factors in the community that are affecting, in this case I would think, the readmission rates, and Ashish and others have written eloquently about the impact of socio-demographics on readmission rates. So maybe we'll come back to that in a little bit, or maybe he can deal with it when he talks. But for a hospital to just turn over data is an interesting question. There is this little law, HIPAA, that does protect the confidentiality of patient data. And as you might imagine, hospitals are getting requests for data from multiple, multiple sources these days. And they're having to choose, quite frankly, with whom to partner, because we need to walk that line between protecting the data, knowing that the organizations with whom hospitals are partnering are effectively protecting the data from unwanted disclosure, and generating something useful, so that the value of the investment in giving the data out provides information back to the hospital to use. So lots of different sources. Hospitals have constructed useful partnerships, for instance, with the Society of Thoracic Surgeons to share data, because that partnership can lead to important information to affect practice. And it's that that the hospitals are looking for. And on a related question, you know, To Err Is Human looked really at the system, saying the system is broken.
And I think other people, including some of the folks on the panel here, have said yes, but there are also the individuals; we need to know about these outliers. What about the hospital factor in the quality of the patient experience, what happens to the patient in the procedure versus what the doctor does? Is it possible to come up with a composite, super easy to use score that melds the doctor and the hospital, or is that a foolish goal? Should we be educating people to look yes at the hospital and yes at the doctor? What do you guys think about the sort of interplay between these factors and what people might wanna know? Morning. Well, there's a special place in my heart for risk adjustment, not only as a scientist, but also because I do pancreatic surgery and our baseline complication rate is 30%. And we do the most in the world and we're a center of excellence. We're proud of our results. Our baseline readmission rate is 25%. Risk adjustment is critical. And that's where I really appreciate these comments. But these are very low-risk procedures. I mean, these are procedures where there should be almost no complications. I mean, it's hard to have a complication from some of these procedures. None of them were urgent. They're all extremely low-risk procedures. So I am a little amused when we selectively criticize. We do this all the time in medicine. We selectively have this outrage. Where's the outrage about all-cause readmissions, which most people in health services research have celebrated as a great success story, even though the metric is far more crude? Where's the outrage about the AHRQ patient safety indicators? Once again, a noble step, but not perfect. Is the methodology here more or less sound than the AHRQ PSIs, for example? The selective outrage tells me this is emotional. This is territorial. This is the first time we're looking at ourselves and doctor behavior.
And that's where I've noticed some of this conversation really pivot from a scientific one to a territorial one. And that's what I've seen at our meetings. At the surgical meeting, there was a lot of outrage, and I said, what do you think of the Surgeon Scorecard? And they said, well, you know, they don't adjust for colon infection rates. And I tell them, well, you know, there's no colon data in there. Oh, okay. Well, they looked at 90-day mortality. You know, they don't have 90-day mortality in there. Oh, well. And there's this emotional reaction. And that's what I think we've been seeing a lot of here. And about the doctor bills part: if the doctor sends a bill to Medicare, that's the name on the bill. So maybe they should be sending different bills, and I don't know if they're doing too much concurrent surgery where they've got other doctors' names on the bills at MGH. Well, that's a whole other panel. Mark, it's a great mystery, and worthy of much research, trying to figure out what's going on with Part A and Part B Medicare. But just a couple of things on your question, and also a little bit on what Dr. Makary mentioned. These surgeries are actually not that low-risk. I would take issue with that characterization. So if you look at registry data where all surgical complications are captured, let's say for one of these hip surgeries that was reported in a certain scorecard, if you look at surgical registry data like NSQIP, that's the National Surgical Quality Improvement Program, you find that the 30-day readmission rate only captures about 10% of all the complications that actually occur within the 30-day period. So the complication rate being reported by ProPublica is about one-tenth of the actual complication rate that patients should expect over a 30-day period, let alone long-term complications, which of course are a real problem for procedures like radical prostatectomy, which was also reported by ProPublica.
My concern would be, when you call something an adjusted complication rate, a casual reader, someone who's not very methodologically sophisticated, which I think describes a lot of people, is not going to know to read all the technical documentation and see, okay, here's what they're actually reporting. It's a readmission rate for conditions plausibly related to the surgery. That's 93% of these events, and then 7% are deaths, so it's hard to know what contribution those make. But in the case of prostatectomy, it doesn't capture erectile dysfunction or... No, no, and I think that's right. So that falls outside the 30-day category, absolutely, but it doesn't even capture what's going on within 30 days. So there was just a paper last week in JAMA Surgery looking at a comparison. It wasn't that big a sample, it was just the state of Michigan, about 40-odd urologists, looking at their complication rates according to their registry data and mapping those onto the scorecard data. And there was unfortunately no relationship between the two that was statistically detectable. But one thing that was noted was that, boy, those complication rates within 30 days are really high relative to what's reported in the scorecard. And I would actually be very concerned if patients get the idea that they can sail through a radical prostatectomy with only a 3% chance of a complication in 30 days. Many of them might be having a procedure that, if they were better informed, they shouldn't be having. And I would hate to have, and this is hard to know, but if you anchor to something that's in a scorecard, you might not be asking, well, what's my real rate of complication here? It might be much higher, maybe in the 40% range. Well, I know there were trade-offs, and you guys were very thoughtful about how you could use the data that were available to come up with something that was useful. What are your thoughts about these issues? Yeah, well, I think there are two important things to clarify.
And some of this is something that we can phrase better, I think. So some of this is on us and we're working on it going forward. I think the first thing is people are making the logical error of conflating surgical quality with patient safety, right? I think that's the fundamental mistake that that University of Michigan study made, right? Measuring acute patient safety events that happen within 30 days of surgery is not necessarily the same as measuring what happens six months out in terms of erectile function or not, right? And so what our analysis is intended to do is find extreme outliers in terms of patient safety, and not to correctly rank all surgeons across all metrics from one to 1,012. That might be a really useful piece of information to know, for example for bragging rights in the locker room, but it's not really what our analysis is intended to do. We know we don't have the data to do it. I think the other thing that is important to keep in mind is that at some point you have to choose to draw a bar and say something to the left of this is a complication, something to the right is not, right? We chose to be extremely conservative about where we drew the bar. We wanted something that would have a serious impact on the life of the patient. An extra pint of blood while you're in the hospital, if you leave the hospital on time and you don't have to come back, could be counted as a complication; we chose not to count it because we wanted to be extremely conservative. We understand it's a place where reasonable people can disagree. Dr. Friedberg and his colleagues come down in a different place in terms of where the bar should be. We chose the most conservative bar possible, and the conversation should continue about where the appropriate place is to put it. We had a really interesting conversation a few weeks ago with someone who said, maybe you should convene patients for a conversation and ask them where the bar should be. What is a meaningful complication to a patient?
If you have a thrombosis that's not painful, that doesn't impact your life in a serious way, should it be counted or not? And so I think that's an important ongoing conversation to have. Have you gotten complaints from doctors that have led you to change things in the scorecard? We, you know, it's funny. One of the number one complaints we've gotten from doctors is, why am I not in your scorecard? Because a lot of people who do high volume don't necessarily do high volume in Medicare, and they understand that it's better to be a known quantity than to be sort of a triple-question-mark mystery. So that's one complaint we've received. Some surgeons want to verify their numbers: a number of surgeons, possibly in violation of numerous laws, have sent us redacted medical records from their patients to try to sort of negotiate with us. You know, you say I have 12, I think I have 11, whatever. And for the most part, actually, it's been very affirming. We felt really comfortable calling most of those things complications. One weird issue we did realize was that sometimes surgeons do a knee replacement on one knee and then do a revision on the other knee within 30 days. And so we had to slightly tweak our algorithm to screen for that, though that happened about 14 times in all of our data. You know, but I think other things we're hearing from surgeons... I mean, one other thing that we're thinking about is that right now, if a surgeon did any case at a hospital, they are attached to that hospital, even if it's five cases out of 4,000 or something. And that's something we've heard about from both hospitals and surgeons. And we've listened, we heard you guys. And so we're gonna tweak the way that we do that as well. Great. We're about to turn it open to the audience, so get your questions ready. I had one little thing before we go to that, which was just a comment that, you know, when I want physician anger or patient anger, I often go to Twitter.
And one of them, the urologic surgeon Ben Davies at UPMC in Pittsburgh, asked, will there be a discussion on non-validated scorecards defaming doctors? And I think we've covered most of this, but I also wanted to say for the record that the scorecard gave Davies a medium rating for prostatectomy after adjustment, even though his raw score showed no complications for 31 procedures. So I think he was angry, but his scorecard inclusion appears to be benign. He doesn't feel defamed. One thing I do want to point out really quickly, I know it's question time, is that sort of the way our model functions is that the less information we know about you, the more you get pulled in toward the mean. It's kind of a way of saying, you know, we don't know a lot about you, so we're gonna call you average. And so, you know, it's not necessarily, I realize all surgeons want to be green, sort of the Lake Wobegon kind of situation, right? But to be yellow in our data doesn't necessarily mean you're, you know, the butcher of Baltimore or something. It just means that we didn't know enough about you. We understood that to call somebody red was a big deal. And we also understood that to call somebody green in our data would be perceived as an endorsement, and we wanted to be very conservative with that as well. So if I could respond to that for just a moment. You know, we have the same struggle with a number of the hospital quality measures. Medicare does the same thing. If you have insufficient amounts of data, they pull you toward the mean. But when you have worked for perfection and you have no errors, no marks against you, to be called average does feel defaming. And I understand that. I also understand why the statistics work that way. But it is, in a sense, and very much to the point of the doctor's question, misleading the public to say I have a worse performance than I think I actually do. All right, Mark, real quick, and then we'll open it up. Two observations.
I think we had a lot of concerns about the scorecard that actually had not that much to do with defaming surgeons. We're just as concerned about surgeons being inappropriately categorized as totally safe. And that can easily happen. We're much more worried about misdirecting patients and misdirecting improvement efforts. And just one thing, if anyone's interested in really following up at this point about the shrinkage: Dr. Davies actually has a pretty sophisticated statistical point, which is that there's a thing called Stein's paradox that I recommend looking up if you're into this, ask your physician about it. As it turns out, there is a gain in efficiency for the entire population from doing the shrinkage, but it results in biased estimates for each individual surgeon. So everybody's off a little bit, and they're off by a little more if they're out toward the tails. Okay, so do we have some brave souls out there who would like to come to a microphone, identify themselves, and take advantage of our terrific panel? Sure, thanks, Scott. I'm Ben Harder. I oversee the public reporting program at US News and World Report, and I have talked to many of you in the past about these issues. We've been publishing public reporting on hospitals, as Nancy knows, for many years, based on the same data set, or a very similar data set, to the one that ProPublica used. And for many years, we have resisted dropping in the one extra variable that would enable us to publish scorecards for surgeons and other doctors. I want to thank ProPublica for, as, you know, the World War I infantrymen said, going over the top first on this. You've paved the way for us to do it, and we will, but not necessarily in exactly the same way. So this sort of discussion here is very helpful for us. So just so we're clear, US News and World Report will rate doctors? We've been talking publicly about that for so many years. Okay. Yes.
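The shrinkage the panelists are describing, where less data means a stronger pull toward the population mean, can be sketched as a simple empirical-Bayes-style blend. This is an illustration only; the function, the prior strength `k`, and the rates below are assumptions for the sketch, not ProPublica's actual model or parameters.

```python
# A minimal sketch of the "shrinkage toward the mean" idea: a surgeon's raw
# complication rate is blended with the population rate, and the fewer cases
# we have for that surgeon, the more weight the population rate gets.

def shrunk_rate(complications, cases, overall_rate, k=50):
    """Weighted blend of the surgeon's raw rate and the overall rate.

    k acts like a prior sample size: it is the number of "phantom" cases at
    the population rate mixed into the surgeon's record.
    """
    return (complications + k * overall_rate) / (cases + k)

overall = 0.03  # assume a 3% population-wide complication rate

# A surgeon with zero complications in 31 cases (like the example in the
# discussion) is still pulled most of the way back toward "average":
low_volume = shrunk_rate(0, 31, overall)     # about 0.0185, not 0.0

# The same raw rate over 1,000 cases stays much closer to the raw value:
high_volume = shrunk_rate(0, 1000, overall)  # about 0.0014

print(round(low_volume, 4), round(high_volume, 4))
```

The sketch also shows the tension the panel is circling: the zero-complication surgeon gets a nonzero, near-average estimate, which improves accuracy across the whole population of surgeons at the cost of a biased estimate for that individual, which is exactly what the reference to Stein's paradox is about.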
A couple of things that I wanted to probe a little deeper on. One, and I would invite Dr. Makary and Olga particularly to respond to this, is the potential unintended consequences of public reporting, particularly where you may incorrectly assign a tier or a score to a surgeon, or where, out of fear that a surgeon might not be adequately adjusted for in the risk adjustment, there may be aversion to taking on certain cases, and sicker patients and so on may have a harder time finding a surgeon who's willing to take care of them. I think that's one thing that we think a lot about in our public reporting program, and so I'd be interested in how you think about addressing that going forward, from the public reporting standpoint and the surgeon standpoint. And then the other piece that I would like to get a little bit of feedback on from the group, and perhaps particularly from Mark Friedberg, is that from the patient perspective, the adverse consequences of a false negative, which is to say saying that all surgeons are equal, or saying that this particular surgeon is average, are potentially much more harmful than the patient consequence of inaccurately assigning someone as an outlier. And I think that's very counterintuitive for researchers and for clinicians, because they're used to requiring very high statistical confidence: you don't say anyone is different than average, or any different than anyone else, unless you're absolutely certain of it. But from a patient's perspective, if I'm told these two surgeons are equal, and I flip a coin, and I happen to go to the one that's worse, there's harm in that. Whereas if I'm told this surgeon appears to be better than that one, and I go to her, and she's actually just as good as the other one, no harm has occurred. I was just inconvenienced. All right, who wants to chip away at these questions? All right? Well, good to see you, Ben.
I know you do a lot of good work in this area, and you wanna be careful, but at the same time, I feel sometimes in academics we're debating what bathing suit to wear before the tsunami hits. And right now the building is on fire. 10% of all deaths in the United States are from medical errors or from medical care itself. It's the third leading cause of death in the United States. And in general, about 11% of all operations are unnecessary. We could move people towards more conservative surgical care, delegating cases to colleagues, teamwork. We do a great job triaging you if you walk into our emergency room. We don't do a great job triaging you if you call for an appointment or need an elective operation. Those are the... One thing we're finding is a huge benefit from what all of you are doing in the public reporting space. And we all agree it's not perfect, right? But with a Robert Wood Johnson Foundation grant at Johns Hopkins, we have partnered with the doctors' associations to say, give us one metric of quality, and we're gonna show everybody where they stand and give them a confidential, non-punitive, friendly report: this is where you are, this is where the rest of the doctors in your specialty are on that metric. And if the doctors' associations endorse the metric ahead of time, then there's none of this sort of classic argument, my patients are sicker, or the data's bad, all this other outrage. And what we're finding is that because there's a general feeling that all of you are gonna make data publicly available, this is something doctors wanna work on ahead of public reporting. And even the cover letter from the surgeons' association says, as a courtesy, ahead of what's anticipated to be publicly reported metrics, we're sharing your data with you. And as you know, doctors are the most competitive creatures on the face of the planet, and surgeons may be the most competitive subgroup within that group.
So we're finding that this movement towards transparency is not only helping unveil dangerous spots like St. Mary's Hospital, which took journalists beating on the door and Freedom of Information Act requests and all kinds of stuff to try to get the data, but it's also helping the medical community say, hey, we can't do this over 50 years, we need to start doing this now. So those are my thoughts. I wanted to respond a little bit to Ben's question about sort of type one versus type two error. So the concern everyone is raising is, what if we are calling surgeons bad who are not bad, or good who are not good? We like to think we're not calling surgeons bad or good, just merely putting them on our scale. But what esteemed colleagues like Dr. Friedberg I think forget about sometimes, right, is that we're comparing a somewhat known risk, in the form of a confidence interval around our data, to the completely unknown risk that currently exists, right? When patients use their nephew's wife's recommendation to find a surgeon, or a referral, we know that referral networks can be very problematic. I even read an article recently that apparently assholes refer to other assholes. So... That's a technical term, yeah. Yes. I wanna see that article. Well, okay, so in fairness, it said that people refer to surgeons with the same personality type. Okay, I toned it down for you. Anyway, and so if you're relying on that sort of thing, if you're relying on sort of word of mouth, or who has the best-looking billboard on I-95, there's a risk there too, right? And we can't necessarily measure that risk, but it's real. And it's something that we need to weigh when, on the other hand, we're looking at the uncertainty that comes with using this type of data as well. Go ahead, Mark. Just briefly, there are two angles to Ben's question and Olga's response. One is, you have these wide confidence intervals and low reliability for a large number of surgeons in the scorecard.
And this is a consequence of the data. There's not really anything you can do about this. This is just the nature of the universe as captured by Medicare billing data. That's where we get this type one, type two error thing, and you can make an argument, and Ashish has made a good argument, I think, that 95% certainty is probably too much certainty. Maybe we lower it even to 50.0001%. That's better than the 50% of a pure coin flip, right? The problem with that argument is that it presupposes the validity is there. And that's what we just don't know. We don't know if the confidence intervals are accurate. We don't know if the point estimates are accurate. We don't know if they're inaccurate in a way that's random or in a way that's systematic. We don't know whether the net misdirected patients actually outnumber the net correctly directed patients. That's where that validation step has to be applied. Just one comment on that, though: unfortunately the MUSIC registry, the one in the JAMA Surgery paper you referred to, is not publicly reporting. And I don't know of other surgical registries reporting at the surgeon level. Many of them are reporting at the hospital level now. There is one. The STS, the Society of Thoracic Surgeons. Surgeon level? Yes. I don't think it's reporting at the hospital level yet. We're reporting at the hospital level, though, and we compared our CABG results against the publicly reported data available for them, and there's an incredible correspondence. So I think your point about sort of convergent validity is a very important one, and it is very helpful, but it requires a registry partner willing to come forward with that data in order to actually do that comparison. And you want to do that at the surgeon level if you're going to report at the surgeon level. I agree.
I think one other issue that everyone who wants to create a report card is going to have to think about very carefully is that, going forward, more and more physicians are likely to pursue alternative payment models, Medicare Advantage. It is a consequence of the MACRA legislation and the incentives built in there for moving towards these more advanced payment models. The good news about that is that it really promotes this team look at care. You asked a few moments ago, is there a hospital factor versus a doctor factor? I think there's both, and we need to understand both, and absolutely the American Hospital Association, at the request of our member hospitals, has long supported public transparency. We want good measures. We want them used appropriately, but we want public transparency around quality. In truth, it helps us identify opportunities for improvement in more effective ways than other strategies have ever done. That said, how do we move this forward if we're losing half the data? The number of physicians, the number of surgeons who have Medicare fee-for-service data may dramatically drop over the next couple of years. How does that affect it? And what are the implications? Because, you know, I don't know who the two million people were that looked at your data. I think I was about 500,000 of those. But when you look at those hits, you don't know who's looking at them or how they're using the data. So it's Medicare fee-for-service-only data around specific surgeries. What are the implications for the middle-aged Blue Cross Blue Shield patient who might be looking at the data? How do they interpret that? Unknown. For some of these relatively simple surgeries, nothing is ever simple, but for some of these relatively simple surgeries, it may not have big implications. But for others, as you get more complex and the patient population diverges in terms of complexity, it may have huge implications.
So I think there are a lot of things to be thinking about. How do we get to good data? And what are the real decision points that patients will have in the future? Will they actually be selecting their surgeon? Or will their network sort of dictate what goes on? It certainly will constrain their choice of surgeons. If I can ask a question, sorry. Sure, go ahead. So one of the things, when we started our reporting, we assumed that all hospitals had amazing data about everything going on with surgeons in the hospital, and that they were just sort of hiding it from us. And one thing we realized as time went on was that often hospitals don't actually know. We would get to surgeons, and the first question they would ask us was, how did I do? Did I have any complications? And so I guess my question would be, what are hospitals doing to... I mean, obviously there are centers of excellence where they probably follow every surgeon around with a clipboard and write down everything. There are hospitals we know, community hospitals, that do almost no tracking whatsoever of what happens at their hospital. What's going on on the hospital side to improve the data that's available? Well, I wouldn't say hospitals anywhere are doing no tracking whatsoever. Even our smallest critical access hospitals are tracking really important data for them. Are they breaking it down by clinician? Not necessarily. They have even smaller numbers there. So the questions of small sample size loom large for them, but understanding what's happening inside the hospital, particularly around critically important metrics, is exactly what hospitals wanna do. And in fact, they have been asking us to help policymakers understand that this blizzard of measures is actually detracting from their ability to identify what's important, and to prioritize that and move forward to making care safer. To Marty's point, I mean, we really need to be focused on our greatest opportunities for improving safety.
That's where hospital leaders wanna be. And I haven't interviewed a lot of surgeons, but I suspect that's where they wanna be too. They're not in this business to harm patients. They really wanna make care safer. And they need that view to understand where their opportunities are. Thanks. Hi, Carol Cronin with the Informed Patient Institute. And I again wanna congratulate you. I'm gonna ask two questions. My first question is just if any of you, and I would include Marshall in this too, can talk a little bit more about what is happening with the specialty societies, and particularly access to the registry data, besides the Society of Thoracic Surgeons. I mean, Dr. Makary's comment that they move an inch in a decade is not terrifically promising. So that was number one. And number two, I'm just wondering, to Mark's point, I know there was a pretty robust debate between ProPublica and some of the folks that were critiquing it. That was all public, which I thought was really wonderful. But I wondered to what extent ProPublica is thinking about this issue of literally submitting this as a measure, as things traditionally happen. So I'm just wondering what your attitude is on that. And I will reveal that I am on the board, as a consumer member, of the National Quality Forum. So it would be very interesting for me to hear kind of where you think that is in terms of your future thinking. Thanks. Great, so specialty societies, we heard a little bit about that, but any light you could shed on what they're up to and whether that might be part of the answer? There's a group within every doctors' association that gets it, like the members of this panel get it, and cares deeply about patient safety. And they are trying to move the field towards more transparency. And it's really a revolution and it's exciting and it's great, but right now the pace is still a quarter of a mile an hour.
And things like the Surgeon Scorecard help jumpstart all of that, because we need to answer: why haven't we done this? Certain data and registries that we've been reporting into have been around for 15, 20 years. Does the public have a right to know about what happened at St. Mary's Hospital or not? So this is moving forward, and I think people just wanna do it carefully and properly, but I think that the time has come to say, look, the public has a... taxpayers have funded the Medicare billing payment system. Do they have a right to see the data or not? And right now it's moving forward. Olga or Marshall? Oh, sorry, could I just add very quickly to that? I think those of us who've been following Medicare's implementation of the MACRA legislation are keenly aware that in the next couple of weeks, maybe a month or so, they will lay out in proposed regulation the requirements for data reporting, which in large measure, based on their conversations with the specialty societies, will be derived from some of those specialty society measures. So you won't see it in the specialty society database, but doctors may be able to report into their specialty society and have that society transmit it to CMS for public reporting, which I believe will be the track for most of that information that then becomes part of their incentive payment. But that's my hypothesis; we'll see in a month if I'm right. Olga or Marshall, any comment on submitting the measure for review and blessing? Yeah, it's interesting. I was actually at a data journalism conference about a week ago, and there was a lot of conversation on this topic, right? You know, to what extent are journalists responsible for undergoing peer review, submitting to kind of the traditional processes, things like the National Quality Forum? The range of opinions went from, you know, you can't peer review the First Amendment, to... I know those guys.
You should actually, you know, absolutely try to have everything reviewed in the most traditional way possible. I think we're definitely open to input from the National Quality Forum, for example. We haven't ruled out submitting what we're measuring. I think we would argue that something like readmissions has been submitted many times. So, you know, a lot of it has to do with what Marty was saying. You know, there's a pretty serious epidemic going on, right? And so journalists operate on a different timetable than maybe other folks. And so whereas we make sure that everything we do is extremely vetted, maybe vetting it in such a formal way is not a thing that will work, but that doesn't mean that we've ruled it out. Okay, could you share with us? Yeah, so I think that it was vetted sort of in two ways. One of the ways was just to make sure that we executed the analysis correctly. So in our data department, we always have, we call it two pilots in the cockpit: somebody else has to be able to recreate your analysis soup to nuts to make sure it's correct. In addition, because we had thousands of lines of code, we brought in a professional programmer to recreate our work as well and make sure it was all up to standard. The second way you want to vet your stuff, obviously, is to make sure that what you attempted to do was sound, as opposed to just correctly executed. So we had advisors who helped us develop our methodology from the beginning. Then we sort of had a concentric-circle approach: we had people who work in this space, but who weren't involved in the execution of our analysis, review what we did. And then we also made sure that people who weren't going to like what we did, like the American College of Surgeons, saw our work before we published as well. We didn't take everyone's advice, but we did weigh all of it and consider it very carefully. We also went into the field.
Anybody who was going to be named in a story, or might have been named, we shared our data with: dozens of hospitals, including hundreds of providers. We asked them for feedback, asked them to correct us if we were wrong. And there's been remarkable consistency: every time a hospital or a provider has compared their internal clinical data with what we've been reporting, it's been very consistent. Yeah, I will say, as journalists there are things we do that the research community probably wouldn't do: interview somebody's coworkers, figure out how to get inside the peer review at that hospital. And I think there is value to both ways of gathering information about whether something measures what you want it to measure. I mean, we've had very strange conversations with surgeons. One surgeon claimed that he had never had a complication, and then we pulled his data, and two of his patients had died and others had had really serious problems. And he was like, oh, okay, you're right. So I think peer review accomplishes some things and asks some questions. I think journalistically reporting out what you have found in your data also generates useful information. Great, if you'll bear with us just one second. I've got two questions from the Twitterverse that are kind of related, and they're coming from a voice that's not represented on our panel, and that's nursing. We have a question from Patricia Davidson, who's the dean of the Hopkins School of Nursing, and another question from Kathy Day, who identifies herself as an RN, and I'll put them together. The first question is, how do we measure the quality of nursing care? Healthcare is a team sport. So, getting back a little bit to the hospital and the team versus the individual surgeon. And then Kathy Day asks, what can be done to shut down the small percentage of very dangerous surgeons?
So, could someone quickly talk about the role of nursing and how to assess this team-sport aspect, and then what do we do about those identified outliers, or really bad apples, to get them moved out? You want me to take the nursing question? Sure, I'll take the first one. The nurses have been very active in this space. The organizations that represent nursing, particularly the operating room nurses, have been very interested in identifying appropriate measures of the care and quality that they provide as part of this team. I haven't had a recent conversation with their leadership to be able to represent accurately where they are in that process, but I know a number of us are looking forward to having really good, effective measures of nursing care quality incorporated into public reporting going forward. Right now in the public reporting, you'll see things like fall prevention, pressure ulcers, and a few other measures of what would logically be called nursing care. But we need better measures. We need a better understanding of what the key inflection points in nursing care are, where their care makes a difference in the patient outcome. And that needs the professional community to work on it and generate those measures, with some help from measure developers or RAND or anybody who can really help them manipulate the data. I think we are still waiting, but there is work underway. Bernie, or anybody else, on the bad apples? These men or women are just not good at their jobs, and everybody who sees them doing their work knows it, and yet they get passed from hospital to hospital, or they stay at the hospital where they are. What's the answer? Well, if you go to the professional associations, they tell you it's the job of the chief of surgery at the hospital to address the problem. If you go to the chief of surgery, they often tell you it's the state medical board that deals with the problem.
You go to the state medical board, they say it's the association's issue. So there's this Bermuda triangle that these surgeons live in. The reality is that we need to measure quality in America, and we shouldn't fool ourselves as to where we are right now. Ninety-nine percent plus of all medical care in the United States is unmeasured. When I do an operation tomorrow, that patient's gonna go home, and that patient's not gonna be measured. No one's gonna be coming in at six weeks or three months and saying, are you back on your feet? Can you mow the lawn, or can you hold your granddaughter again? Until we start measuring care with real measures, we're kind of fooling ourselves that we're gonna be able to address the problem. And because this issue has been vastly underfunded, the people doing this good work are underappreciated, and for the few technologies out there measuring real patient harm, the financial case, beyond the moral case, hasn't been made for adopting them. We're kind of fooling ourselves thinking that we're measuring quality when we measure patient satisfaction and all-cause readmission. We're probably really measuring whether or not patients are getting antibiotics when they demand them and whether or not hospitals are moving people into observation beds. That's why I think we need better metrics, like surgery-specific readmissions, like the ones in this scorecard. They're not perfect, and I would even suggest the data the methodology is being applied to is not perfect. So certainly when you get all your data, like in Michigan, yeah, it's gonna be better. It's not the methodology there; it's the data you're applying it to. So that's my thought. Okay, Mark. Just a quick reaction to something Dr. Makary mentioned, just to also show that I'm not against the idea of reporting at all. As I said, patient experience surveys have never been claimed by any of their users or developers or advocates to measure technical quality of care.
That's a separate dimension. And of course, the ProPublica scorecard gets after technical quality, not patient experience. The other thing is, there's absolutely no evidence that this antibiotic thing is actually true. It's a little bit of a myth among a lot of doctors. So I just feel like I should do my duty there and say that. But getting back to the question about what to do in the system to deal with outlier physicians: there are a few different ways you can handle this. Some of this could be education or remediation; some may just need to be doing something different. The system needs to have incentives to produce that kind of result. I think what we see is a system that doesn't have an incentive right now to really measure and act on quality information. Until those incentives are in place, I don't think we're gonna see too much that's different. Great. Please tell us who you are and what you're interested in asking about. And this will be our last question before Ashish wraps it up. I'm Bernadette, and I'm from Mount Sinai. I give chemotherapy. I've been fired from a major Ivy League hospital for reporting a dangerous situation. I'm an excellent nurse, and I audit charts for quality, for JCAHO, for instance. I have problems sometimes auditing these charts because it's reporting on coworkers. I feel like it should be an independent body. I don't feel like I can go to an unbiased person if I needed to. I don't feel like it would go anywhere, based on my experience at another hospital. So I would say nurses are fundamental to quality, but a big challenge nationwide is safer staffing. And I think if we have safer staffing, these numbers will greatly improve. And I'm curious to hear your thoughts on safer staffing. Thank you.
So your comments remind us of one of the critical factors that isn't measured in this particular report card but is absolutely essential to moving safety forward in the country, which is having the appropriate culture within the organization: one that allows people to raise questions, to point out errors, to engage in conversation with everyone, to stop someone from making an error even if that person is at a higher rank. I hate to use that word because we're not in the military, but in a hierarchy there is a higher rank. We have to have that culture. I have seen it in some of our member hospitals, and when you walk in the door of those institutions, you know it exists. People feel free and safe to talk about what's going on and to raise a question, to say to the doctor, did you forget to wash your hands? That's the atmosphere we need, regardless. And that is fundamental to safety. There are some safety culture measures, but the data on those are not yet publicly reported because we don't have one standard. We need that. We need to understand what's going on in our institutions to move this forward. Thanks. Ashish, we're ready for you. Stay on the stage, and we're gonna let Ashish synthesize this and make our closing remarks. Thank you, panelists, and thank you to the audience in advance. Great. Well, thank you for having me. Thank you, Marshall, for inviting me. This has been, I think, a very helpful and robust debate, and I'll thank the panelists at the end again. But let me start off by speaking on a personal level for a second: let's talk about how I would evaluate the surgeon report card. I'm a physician. I practice medicine, and if I were to look at the surgeon report card de novo, I would say it has problems, right? It's got problems with risk adjustment. The risk adjustment isn't very good. It's not as good as it should be.
The sample sizes are not always as big as they should be, and that gives us large error bars. They adjust for socioeconomic status, but the adjustment is certainly not as robust as I would like it to be. It focuses on Medicare patients only, and there are questions about the validity of what they're doing. All of this makes us think, boy, this thing is hardly perfect. It's hardly perfect. And so the question is, would I use it, as a practicing internist, if I needed it? Would Marty use it? If Marty needed a surgeon, would he go to the surgeon scorecard and pull it out? And I'm not even gonna make you answer that question, because I think the answer is probably not, right? Probably not. I wrote a blog about this, and I said, a couple of years ago I had a shoulder injury. I needed a shoulder surgeon. What did I do? I called my friends. I called my orthopedic friends. They all pointed to two surgeons that they thought were really terrific. I talked to people who operated with them, got a whole lot of inside information, and picked one surgeon who was really, really good, and that's how I picked. So in light of that, you look at the surgeon report card and say, boy, it really falls short compared to that. But all of you are probably thinking, what if you're part of that 99.7% of Americans who aren't physicians, who don't have a network of colleagues you can pick up the phone and call? What do you do then? And in that light, the question is, is the surgeon scorecard a step forward? Is it neutral, or is it a step back? To answer that question, I actually think about a patient, and again, I talked about this a bit in a blog that I wrote: a real patient of mine, named Bobby, who was being admitted for the third time in nine months for a complication after lung surgery. His previous complications had led him to miss his daughter's wedding. He had missed other really important family events. He was very frustrated.
And when I asked him how he picked a surgeon, because he was very frustrated with his surgeon, I said, why'd you go with this guy? He said, I asked a couple of friends, and they said, that's the guy to go with. And they steered me wrong. So if you're asking the question, is this an innovation? Is this moving the ball forward? The question is not, is it good from an expert perspective? One question is, is this good for somebody whose alternative is a sample size of one, an anecdote? The second way to think about this question, and I think this is also very important, because there have been a lot of critiques of the ProPublica report card, and I think a lot of those critiques are valid, is: how does this compare to everything else we're doing nationally? If I were to collapse here right now, and I promise I won't, I'll try not to, but if I were to collapse here right now with a heart attack, hopefully somebody would pick up the phone and call 911, and I'd be whisked off to a hospital. Would it matter which hospital I went to? It would, right? What we know is that across the country, your chances of dying of a heart attack vary two-fold, sometimes three-fold, based on which hospital you end up at. Is that information available to everybody in the room? Could you direct the ambulance driver? Or, let's say something not quite as dramatic as a heart attack, imagine something where you have a little more time: do you have the information you need to say, take me to hospital A, don't take me to hospital B? Now you can say, I could look it up on Hospital Compare, but the way we do it on Hospital Compare, the issues with risk adjustment are real. We've talked about that. There is a shrinkage method applied that basically results in only 3 to 4% of hospitals showing up as worse than expected. It's extremely conservative. And so I would argue that the public reporting efforts we are doing right now are not working. And that's for mortality.
Mortality is actually what we think we're doing better than most other measures. You know, we report on readmissions, and what we know is that a large part of what drives readmissions is how sick your patients are and how poor your patients are. The risk adjustment on readmissions is not very good, and we don't even bother adjusting for socioeconomic status at all. So that is the other alternative, even for the consumer who's armed with really good information from public reporting. And in that light, when you look at the ProPublica report card and ask, is this an innovation? Is this a step forward? I would ask you to take one other consideration. It's been 17 years since the IOM report that Steve started us off with. And they said, must we wait another decade before the system is safer? That was 17 years ago. And so the question is, how much further along are we in 17 years? I would say that we are better off in a couple of ways. Infection rates in hospitals have dropped pretty substantially. I would argue the culture of patient safety has gotten better. We've had lots of issues, but 17 years ago it was all about the individual: you screwed up, the doctor screwed up, the nurse screwed up. I think more generally we now understand that it's about systems. But at the end of the day, if the question is, are we measurably and meaningfully safer today than we were 17 years ago when the IOM report came out? Personally, I would argue that the data suggest we are not. We are not meaningfully safer than we were. And why not? Well, a major part of it is that we have not had the data to know. The fact that when Marshall or Olga goes to a hospital and the surgeon says, tell me about my own complication rate, wow, I had no idea. I find that shocking. In this system, where we spend $3 trillion a year on healthcare, most surgeons do not know their own complication rate. That should disturb all of us.
And there's the lack of good data. We have a long list of reasons why the data are problematic: the claims data aren't terrific, the risk adjustment isn't great, all the stuff I started off with. And the way we were gonna solve this problem was once we got electronic health records in place. Well, guess what? Eighty percent of American hospitals now have an electronic health record. We're there. So we should be getting all the right data. We should be doing terrific risk adjustment. And the truth is, that isn't happening. The other important context here is that we know how to do this. You might argue, well, we just don't know how to do this, it's complicated. But we've been doing surgical quality measurement for 20 years, since before the IOM report. The NSQIP program, the National Surgical Quality Improvement Program, started collecting data on complications, readmissions, and mortality in the mid-1990s. And today, if you wanna use NSQIP to find a great surgeon, good luck with that. It's not out there. It's not available. It's available to the surgeons. I could probably get it. But unfortunately, none of you can. And I think that's fundamentally a major part of where we are. And the one other point I wanna make on that is, by the way, that's a choice that we have made. It isn't inherent in the system that we can't make NSQIP data mandatory, that every hospital has to collect it, and that it needs to be publicly available. Could we do that tomorrow? We could. So we have chosen not to make it available to you. It's mostly available to people like Marty and Mark and me. So that's where we are, and that's the context in which we have to think about what ProPublica has done. Are there problems with validity? Are there problems with sample size? Absolutely. Absolutely. Are there statistical problems? Yes. Are there easy fixes? I don't think so. I think these guys did about as good a job as could have been done under the circumstances.
And I wanna finish off with a notion, Clay Christensen's notion of disruptive innovation. I've asked the question, is this an innovation? Disruptive innovation is something that everybody loves talking about, but most people don't understand, because the whole point of disruptive innovation is a couple of things. One, it's usually done by outsiders. Insiders almost never do disruptive innovation. These guys are outsiders. I've been in the quality business for a dozen years, and we've never had journalists who thought they could show up with data and hold surgeons and the healthcare system accountable. It takes a certain amount of audacity. The other important part is not only that it's done by outsiders, but that it's not as good as what the experts have. That's the most important point about disruptive innovation. It's never as good. Actually, I shouldn't say never as good: initially, it is not as good as what the experts have. Experts know how to pick good surgeons. What these guys have created is not as good as that. But what happens with disruptive innovation is that it's meant for the 99.7% of people who don't have access to surgical buddies they can pick up the phone and call. And once it takes hold, it starts getting better. We see that with version 2.0 that's coming. This will get better. It will drive everybody else to get better. And the truth is that what this reminds us, what Marshall and Olga remind us, is that we are no longer living in an era where I get to decide who gets to see my performance. We are no longer living in an era where the medical community gets to hold this information in. That is just no longer tenable. And so the question for us in the medical community is, do we get on board? Do we help these guys get better? Or, if we don't want to work with these guys, do we do our own job of making this better? I like them, they're great folks. But you may choose you don't want to work with Marshall. Who wants to work with Marshall?
Not Olga. That's fine. But then we have to make something better than what they have. And the standard line has been, we can't, we have to wait: we gotta wait until we have electronic health records, we have to wait until we have better risk adjustment, we have to wait until we have better data on socioeconomic status. And I would argue 17 years is long enough. We've waited, and we've not made as much progress. I am enormously grateful for what these guys have done. I've also been critical of some of the things they've done. They asked my advice; they didn't always listen to it. I wish they had. No, I'm kidding. But the bottom line is that this is a very important step forward, because it pushes all of us to up our game, and to no longer sit around and say... sorry, I lost my train of thought. But the bottom line here is that what these guys have done, I think, is change the game, by no longer letting us get away with waiting for the next set of best data. And so I think this discussion has been enormously helpful. I think it's important to talk about the problems with the data. I think it's important for them to make their data better, their methodology better. But I think we would be much worse off if we didn't have the Surgeon Scorecard. And so I wanna first and foremost thank Olga and Marshall for sticking through a bunch of years of hard work in getting us here. And then I actually really wanna thank people like Mark and Nancy, who are gonna hold them, and all of us, accountable for making sure that the work gets better. Because if we stop here, we will not have done what they're hoping we do. So I think this is important innovation. We gotta keep working on making it better. Thank you for hosting this event today. I think it's a step in keeping that conversation going. And thank you for letting me come and speak. Thank you. How are we? Steve, do you wanna say something at the end here?
I guess what I would say is, the first rule of medical conferences is don't follow Ashish Jha. In all seriousness, thank you all for coming. Thank the panelists. I think we can all agree that was a really challenging and robust and fascinating discussion. And as we work on Surgeon Scorecard 2.0, I feel confident that we'll be thinking about these and other issues as we try to go forward. And thank you all. Thank you. Thank you. Thank you so much.