In the last few lectures, you have seen how a technology platform like NAPPA (Nucleic Acid Programmable Protein Arrays) can be used to perform high-throughput assays on proteins without having the purified proteins available to you. Just by taking simple cDNA, you can express the proteins on the chip and use them for different types of applications. In today's lecture, Dr. Josh LaBaer is going to continue the discussion of NAPPA technology, and the main emphasis will be on one of its applications: how to use these arrays for research on tuberculosis. As you know, Mycobacterium tuberculosis affects a large population across the whole world, and it is especially relevant in the Indian context, where we have seen the advent of several resistant strains of mycobacterium, especially MDR strains of TB. So the question is how to use these array platforms for novel biomarker-based projects using NAPPA technology. So let's welcome Dr. Josh LaBaer to discuss applications of NAPPA for screening Mycobacterium tuberculosis.
Okay, so I showed you this. So this is how we analyzed the data. So let me remind you again that this little array up here is multiplexed, and this array down here is all the same proteins, but all as individual spots. And so we spent a lot of time trying to figure out how you represent these data. It's always fun when you're doing an experiment for the first time and you realize that no one has ever come up with a way to show what you're looking at, because it's so new. So what we ended up deciding to do is this kind of ball-and-stick model, where this ball is representative of the mixed spot and each of these individual circles here represents one of the five proteins that are expressed in that spot. And then the color here means that we detected signal at the mixed spot, and color at any one of these spots means that when we tested the individual spot, it also gave signal.
So if you look at something like this one right here, that suggests that we detected M6, we detected this spot, and probably it was the adenovirus E1A protein that was responsible for that signal. That would be our best guess, right? Here's another one, and probably that protein is responsible for that. Here's one that was kind of weak, and notice that we don't see any of these five spots lighting up. So that becomes a little bit of a question mark. So what would you call that? Thinking back to our statistics from yesterday, it's a possible false positive, right? The mixed spot says there's a signal there, but when we went to confirm it we couldn't find it. So that might be a false positive. We don't know. It could be that we just didn't get good detection here, all right? Here's another one. Also, just a little bit esoteric here, but we actually look at two qualities of our spots on the arrays. We look at the spot on the array itself, the blue, and we also look at the signal intensity of the area around the spot, which we call the ring. And a lot of our features have that sort of ring intensity, which is another sign of a very strong response. So if we see a ring, we usually think it's a very strong positive. So we went through and we did some protein interaction studies using this approach. Also, this array comprises a variety of viral proteins from common viruses that people are infected with, and we probed it with serum, and that, for each person, would tell us which viruses that person has had before, right? So that's shown here. And yes, how do we get the value for that? Yeah, that's a little complicated.
What we do currently is that we have a software application that pulls up the images, and the investigator doesn't know what the spots are, but we have a five-star scale for scoring a spot from one to five based on essentially the size and the look, and we have reference images for each of the five. The investigator looks at the image and then he or she scores it from one to five. And you do it unbiased, so you don't know which proteins are which. You just score the ring size. It's not as precise as an instrument doing it, but it's not bad. And so a very narrow ring would be like a one, bigger would be a two, three would be a pretty good size, four would be like spilling into the neighboring spots, and five would be, you know, huge. So it's kind of the size and intensity of the spot and of the ring. Yeah, all right, so I think here you can see that would be like a five, that ring right there, whereas this one probably would be a one or a two. So there you can see the spot, and you can see a little bit of the ring right there. This one is like, that's a five, it's spilling all over, you know, this might be a three up here, that sort of thing. So you get used to it after you do a lot of these. So here you can see some examples. Each column is a serum sample. Each row is a multiplexed spot, and then the little nodes around it indicate which proteins were individually detected down here, right? If you look at this guy here, it's sort of the standard, what you'd expect. You've got a strong signal in the mixed spot and you also have a signal for one of the proteins in the five, right? So presumably this is the one that gave that signal. Then here's another example where we had a strong spot signal and there were two positives in that mix. So that's something that you have to keep in mind: up until now I've been acting like within every mixed spot there's only one target. It's certainly possible that there may be more than one.
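The way a mixed spot is read against its individual spots can be sketched in a few lines of Python. This is only an illustration of the logic behind the ball-and-stick picture, not the software used in the study; the names `Call`, `classify`, and `responsible_proteins` are hypothetical, and each spot is reduced to a simple detected / not detected flag rather than a one-to-five ring score.

```python
from enum import Enum
from typing import List, Sequence


class Call(Enum):
    """Possible readings of one mixed spot versus its individual spots."""
    CONFIRMED = "confirmed"                               # mixed spot and >= 1 individual spot
    POSSIBLE_FALSE_POSITIVE = "possible false positive"   # mixed spot only
    FALSE_NEGATIVE = "false negative"                     # individual spot(s) only
    NEGATIVE = "negative"                                 # nothing detected


def classify(mixed_positive: bool, individual_positive: Sequence[bool]) -> Call:
    """Compare the mixed-spot result with the individual-spot results."""
    any_individual = any(individual_positive)
    if mixed_positive and any_individual:
        return Call.CONFIRMED
    if mixed_positive:
        return Call.POSSIBLE_FALSE_POSITIVE
    if any_individual:
        return Call.FALSE_NEGATIVE
    return Call.NEGATIVE


def responsible_proteins(names: Sequence[str],
                         individual_positive: Sequence[bool]) -> List[str]:
    """Return the proteins whose individual spots lit up (there may be more than one)."""
    return [name for name, hit in zip(names, individual_positive) if hit]


if __name__ == "__main__":
    # Hypothetical five-protein mix in which two individual spots are positive.
    names = ["P1", "P2", "P3", "P4", "P5"]
    hits = [False, True, False, True, False]
    print(classify(True, hits).value, responsible_proteins(names, hits))
```

Returning a list from `responsible_proteins` reflects the point that a mixed spot may contain more than one target.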
And obviously you hope that the mixed spot will give you that signal. Here's another one. So what's the concern on this one? It's a false positive, right? So this one, the master spot gave us a signal, and it's interesting that the false positives often are weaker than these guys. It's a little bit weaker. And then what's that? False negative, right? So the individual spot gave us a signal, but the master spot, the mixed spot, did not, right? And so this is the one that we worried about the most. And we looked for these, and I think we found a few percent where that happened. But the vast majority of the time, the mixed spot was sufficient to find whatever was present in the individual spots. And that matters, because if the two methods agree well, then you feel a lot more comfortable using the mixed NAPPA, the multiplexed NAPPA, over the standard NAPPA, because it's so much less expensive. And so this is kind of the strategy. You produce a mixed NAPPA array up here, you get good signal for all these proteins, you show that you have good correlation. You screen for antigens that are detected in the patient, let's say, and you see a number of spots that light up. You compare array to array to make sure that you're getting consistent results. This is all sort of quality control stuff. And then after you've got these hits, you take the mixed spots that lit up and you break them down into individual proteins the next day to verify that those signals are real and to deconvolute which spots were positive, okay? And so this is kind of the summary of how the mathematics works out, right? So in the old way, to do 10,000 proteins, we would have to use five slides. And then after we did those five slides, we would have to come back the next day and use a sixth slide to confirm the hits and make sure that they were real, right? Using the mixed NAPPA, we can do all of the spots on one slide, and then we just have to come back the next day and do a second slide to confirm that it's real.
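The slide arithmetic can be written out as a small Python sketch. The figure of 2,000 spots per slide is an assumption inferred from "10,000 proteins on five slides"; five proteins per mixed spot comes from the ball-and-stick description, and one confirmation slide is assumed in both workflows. The function names are hypothetical.

```python
import math


def slides_standard(n_proteins: int, spots_per_slide: int,
                    confirm_slides: int = 1) -> int:
    """Slides needed when every protein gets its own spot, plus confirmation."""
    return math.ceil(n_proteins / spots_per_slide) + confirm_slides


def slides_multiplexed(n_proteins: int, proteins_per_spot: int,
                       spots_per_slide: int, confirm_slides: int = 1) -> int:
    """Slides needed when proteins are pooled into mixed spots, plus confirmation."""
    mixed_spots = math.ceil(n_proteins / proteins_per_spot)
    return math.ceil(mixed_spots / spots_per_slide) + confirm_slides


# Assumption: 2,000 spots per slide, inferred from 10,000 proteins on five slides.
SPOTS_PER_SLIDE = 2000

if __name__ == "__main__":
    print("standard NAPPA slides:", slides_standard(10_000, SPOTS_PER_SLIDE))
    print("multiplexed NAPPA slides:", slides_multiplexed(10_000, 5, SPOTS_PER_SLIDE))
```

Under these assumptions the standard workflow comes to six slides and the multiplexed workflow to two.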
So here we have to do a total of six, here we have to do only two. So there's a lot more work on this side than this side. And of course, if you're trying to save money, this is definitely a cheaper way to go, and yet you can get roughly the same numbers, right?
Okay, so I don't have to tell you guys this, right? TB is a big issue; it's a huge health issue all over the world, including in this country. One of the big challenges in some parts of the world is the co-infection of HIV with TB. And the diagnostic methods that are available for TB are already limited in standard TB, but they're particularly limited in the context of HIV, where the common symptoms and the common molecular studies don't always apply. And so we were interested in studying whether or not we could identify good biomarkers for the detection of active TB, particularly in an HIV-positive population. Okay, so these are some of the things that I think you probably already know, those of you who are aware of the TB issues. Antibody profiles can be very different from patient to patient. Not that many antigens have been reported. Sensitivity and specificity are not ideal. And even of those that have been reported, many have not been confirmed in other studies. And almost all the studies so far have been in HIV-negative individuals. And of course, ultimately we'd like to get a point-of-care diagnostic. So we had done a study together on a funded grant to look at the whole TB proteome, screened with serum and all that. And for a variety of complicated reasons, that particular study could not be used. There were some issues with mixed-up spots on the array. It's just that, you know, normally this doesn't happen to us, but this was one case where it was a problem. And so we reached the end of the study and we couldn't use the data. And yet we really wanted to do this study, and we felt responsible to the agency for publishing it.
And so we needed to repeat it, but we were out of money. And so we were really operating on a shoestring. We had like no funds at all. And we had to figure out a way to study the TB proteome with no money. And so that was where the idea for multiplexed NAPPA came up. I was racking my brain. How can I do this inexpensively? And so it was for that reason that we decided to try the multiplexed NAPPA. Because that way we could get the entire proteome of TB in one quarter of an array. In fact, one array would be able to do four proteomes at once. So that really lowered the cost. And of course, the other problem, besides the lack of money, was that we were almost out of serum. So we had very little serum to test. And of course, we needed that to do the study. All right, so this is how we outlined the study. For complicated reasons, we decided not to use five proteins per spot. We used three proteins per spot. Part of that was, if you remember when I started, I showed you that the number of proteins that you can put in a spot depends a bit on the expected hit rate, right? And we thought that with TB, particularly in the population from South Africa, where we were getting the samples, the hit rate was likely to be higher than just 5%. In part because that population is also infected by other mycobacteria, and antibodies against those would cross-react with the mycobacterial proteins from TB. So assuming that there was a higher hit rate, mathematically it made more sense to do three proteins per spot rather than five, okay? So we had three proteins per spot, right? And then we screened that with sera from TB-positive patients and from healthy individuals, and compared them. When we got a hit, we took that hit and we divided it into three individual spots in that secondary verification and deconvolution array. We did this separate array, and then from that, we got individual proteins that were positive. And then we did a third level of validation by testing different samples on a verification array.
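The earlier point about choosing three proteins per spot when the expected hit rate is higher can be illustrated with a simple two-stage pooling model. This is a sketch under stated assumptions, not the calculation used in the study: proteins are assumed to be positive independently at a fixed hit rate, every positive mixed spot is assumed to be broken down into one individual spot per protein, and the 15% hit rate is a hypothetical value chosen only for comparison with the 5% mentioned above. The function name is hypothetical.

```python
def expected_spots_per_protein(hit_rate: float, pool_size: int) -> float:
    """Expected spots printed per protein across screening plus deconvolution.

    Stage 1 costs 1/pool_size of a spot per protein (one mixed spot per pool).
    Stage 2 costs one individual spot per protein, but only for pools that
    lit up, which happens with probability 1 - (1 - hit_rate) ** pool_size.
    """
    p_pool_positive = 1.0 - (1.0 - hit_rate) ** pool_size
    return 1.0 / pool_size + p_pool_positive


if __name__ == "__main__":
    # 0.05 is the hit rate mentioned in the lecture; 0.15 is a hypothetical higher rate.
    for hit_rate in (0.05, 0.15):
        for pool_size in (3, 5):
            cost = expected_spots_per_protein(hit_rate, pool_size)
            print(f"hit rate {hit_rate:.2f}, {pool_size} per spot: "
                  f"{cost:.3f} spots per protein")
```

In this model five proteins per spot needs fewer spots at a 5% hit rate, while three per spot needs fewer at the hypothetical 15%, which matches the direction of the argument. The model counts only spots; it ignores signal dilution and serum consumption.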
And then finally, we took those individual spots to ELISA. So once again, a multi-tiered set of experiments to make sure that whatever hits we observe really make sense, okay? And so this is kind of a summary of that. In phase one, we screened 4,000 antigens, which is the entire proteome of TB. In phase two, we did a deconvolution on what turned out to be about 870 antigens. We then ran those 870 antigens on an independent set of samples. And then we did an ELISA verification on the best 40 of all those, all right? And then this kind of gives you a breakdown. Our samples, I think I have a better slide for that part. Yeah, this is a breakdown of the samples that we had. So we were looking at samples that came from the US and samples that came from South Africa. We had TB patients and controls. Of these 66, a fraction had HIV. Of these 68, again, a fraction had HIV. So we had patients who were either HIV positive or HIV negative, and who did or did not have TB, if that all makes sense. So it's kind of a complicated clinical design, because we were looking at two factors, TB and HIV, and two different countries. And this is how we broke the samples into all the different studies. This was published last year in MCP. So I won't go through all the numbers here, but you generally get the idea. So this is what the array looked like when we printed it, the total protein. So again, we're using the HD NAPPA, and multiplexed. So we've combined the two technologies I told you about today into one experiment. And so these are all the proteins. We also had a subset down here of individual proteins and viral genes. That did two things for us. It gave us reference spots for individual proteins. It also gave us some positive controls, so that we could make sure that everything was working, because we knew that most people would have a response to some of these common viruses. Okay, and then this is just to show you that we got good expression on the arrays.
So the dotted line is the cutoff line. Everything below that line was considered to be absent on the array. There was a signal there, but it wasn't real. All of this up here shows that, by and large, the vast majority of proteins were well expressed on the array and easily detectable. And of course that means that we have a good chance of detecting an immune response. Okay, and then this is just once again to show you that, okay, I should have mentioned this back here. What you're looking at right here is one quarter of an array. So every array had this set four times. And what that meant is that we could screen four patients per array, using special chambers that isolated each of the four subarrays. So what we did here is we compared one of those subarrays to another subarray to make sure that, from subarray to subarray, they were reproducible. And that's what this is. This is within a slide, and then this is looking at two different slides. Again, to make sure everything is aligned right. And then when we did our screening experiment, once again, we wanted to make sure that we were getting the same result day to day. And so here you see two different arrays on two different days. And again, you're getting very good alignment there. So this is kind of what you end up with. Here were three spots that were detected by this particular patient. So here's M1, here's M3, here's M2, right? Then we come over to M1, and on the next day, we break it down into individual proteins. And what we could see is that it was protein number three here that lit up for M1. For M2, it was protein number four, this protein here, that lit up. And for M3, it was this protein here that lit up. So we were able to deconvolute the results the next day. And then when all was said and done, and we had analyzed all the data for all the patients, there were some trends that were interesting.
Here's a protein that's clearly showing a preponderance of signal in TB positive over TB negative in an HIV-negative population. That's exactly the kind of marker we were looking for. Here's another marker, again showing a strong signal, and even a difference in overall median signal, in TB positive, in this case for HIV positive. You can sort of see a difference even in the HIV-negative population, which is what we were really trying to get to. And then by combining those markers together, we were able to get this ROC curve, whose AUC is a measure of biomarker quality. And you can see the curve is not at its strongest in this South African HIV-positive population. But we were really aiming for this HIV-negative population. And that looks pretty good. So I think that is what I've got here today. Maybe I can stop and take questions. So if I were to develop a clinical test right now, I would probably take the top antigens, the best overall performing antigens, and make more robust assays. Because these assays are academic-lab-grade assays. They're not commercial grade. Make them more robust. And then I would probably test that product in each population separately in order to derive what the cutoff value should be. That's the approach I would take if I were going to commercialize it. All 40 were tested with ELISA. Yeah, and these results are ELISA results. But are you pretty confident with these? Yeah, we're pretty highly confident with these. But, I mean, if you look at the sensitivity, the sensitivity here is not outrageous. Because there are negatives, TB negatives, that have signal. Now, as you know, this is challenging. Because what we call a TB-negative patient might not be a TB-negative patient. A lot of these patients have had TB before, and they don't have it right now. Or they have other mycobacteria, or all kinds of other things could be going on. Not as true for the US population as it is for this population.
But nonetheless, it can still be true. And so you do the best you can. But yeah, so I would probably include multiple antigens, just because of this issue that there is a little bit of overlap.
In today's lecture, you have seen that a new technology platform like NAPPA can make research into new areas much easier and more productive. Imagine how difficult the expression and purification of all the MTB clones would have been. But now, because Dr. LaBaer's group has access to all these clones for mycobacterium strains, it is very easy for them to prepare a chip that contains all the genes of a mycobacterium strain. And therefore, one can now use this technology platform to screen serum samples from patients who are affected by mycobacterium. Of course, you have seen that there is a need to look into various types of controls: people who have never had the infection, people who have latent infection, and people with active TB. And then also, if you could add sample populations affected by the various resistant strains, those could provide very novel information and probably new insights for clinicians treating these deadly diseases. We'll continue our discussion of the use of NAPPA technology and other protein array platforms, especially for drug screening, in the next lecture. Thank you.