Lita, you're setting a terrible example. We need to be using them. Pardon me. Where did it go? There he is. OK, I'll go up there with you. So Ed Yong, who's a science writer, obviously a very well-known blogger too, a science writer for Nature and National Geographic, kindly offered to moderate our open floor sessions. So the goal here is to really get the group talking, basically based on Owen's charge of what are the current gaps, needs, and challenges for this field, considering how diverse the population is. So I think Ed wants to maybe kick-start the conversation. And I'm going to just sort of wander down here.
Yep. OK, thanks. Hi, everyone.
I'm sorry, one more thing. Chris and Nick are going to be taking notes from this open floor, so we'll have a record of all the open floor discussions.
OK, hi, everyone. It's a pleasure to be here and to finally get to put so many faces to citations. I've written about a lot of your work before, and it's a fascinating field of science. Please do ask questions. We are relying on you to be piercing and insightful to your colleagues. If you do not do that, then I, the journalist, will be forced to ask ridiculous and dumb-ass questions in your place. So it is in your interest to speak up. Our four panelists here are meant to be lightning rods for that discussion. And a lot of our other panelists who you've heard from today are in the front, and they will answer questions as well. But we want this to be a very open discussion, so feel free to answer each other's questions. And let's make it a very open, free-form chat.
However, unlike what I just did, please make sure to use the mics. There are mics peppered throughout the room. I'm also willing to share mine. And I think there's a couple of others floating around. So if you want to ask a question or comment, please make sure to line up at the mic.
Thanks. And also, this is the first of three sessions.
So we obviously have a lot of time to cover a lot of the areas that we're talking about in the next few days, like translational aspects. So let's talk about big picture stuff, tools, resources, all the issues that you've heard about today already. So I'm going to open with a question that one of your microbiologist colleagues who isn't here sent me via email. He couldn't make it. I'm sure he's cackling with glee. And it's this. Many different groups have recently said that they know and understand the key to various complex human phenotypes and the rise of various human ailments. Some think it's epigenetics. Others have touted copy number variation. And now we're seeing that a lot of people are saying it's the microbiome. So the question is this. Is that hype? Or is there truly something special here? Would any of the panelists like to?
I mean, to me, it's not complete hype. I think that anybody who claims they have the key and there's only one, I think that's nonsense. I think the microbiome is a component of a complex system that involves, I think we've seen it here today, not just the microbes, but the host, the immune system of the host, and so on. And I think that without understanding the microbiome component of that complex system, we're not gonna get the big picture of all those phenotypes. So that's my view of it.
And does anyone else have any views on that too? Yeah.
Microphone's just bad. Going on, can I ask something? So at least in America on TV, we see more and more advertisements for probiotics that'll make you feel better. So is that part of the hype here? Or do any of the speakers feel that, yes, you can manipulate through some organisms the health and wellbeing of patients with regard to specific diseases?
So one thing I thought was that we could leave the probiotic discussions for the third day, which is focused on the translational aspects of the work.
So I'm gonna write that down and we'll lead off with that for the third of these discussions. But Martin, do you wanna go?
Yeah, so I wanna respond to this question about whether it's hyped or not. The short answer is nobody knows yet. But we have a lot of diseases that have increased dramatically in a relatively short period of time. Within 50 years of human existence, certain diseases have gone up 100%, 200%, 500%. Enormous changes. Changes in the human genome can't account for that. On the other hand, we have lots of data emerging that the microbiome is quite plastic and is subject, as David mentioned, to lots of different kinds of perturbations. And so it's at least a testable series of hypotheses that perturbation in the microbiome is of sufficient magnitude to account for many of these changes in disease incidence.
Great, thank you. So do we have any questions from the audience?
So we hear lots and lots and lots about DNA sequencing, right? And there's hardly any other techniques out there. Is there room for any other techniques given the power of DNA sequencing and data analysis, or is the future just gonna be DNA sequencing? Both in the clinic, analysis, studies and so on.
Is that one for me? I'll take it. So I think that DNA sequencing is foundational, and so we need that as a platform for many of the other techniques. But I think that other techniques are highly complementary. They're not as well developed, perhaps, as the DNA sequencing. We're starting to see a trend, for example, with proteomics, where the technologies are lagging considerably behind in being able to produce as much information as the DNA sequencing, but there is an upward trajectory there. So I predict that in maybe 10 years' time or so, we will have sufficient data from other types of data sets like proteomics to complement the sequencing. But right now we're very sequencing-centric.
And I'd just like to add to that that a lot of the analysis tools that were originally developed for DNA sequence analysis, including ones that Greg and Curtis and I have put together, are equally applicable to other kinds of data sets. So all the stuff that I showed you, for example, there's absolutely no reason we can't do that with metatranscriptomics or metaproteomics, except you're not going to get thousands of samples any time soon, which is the volume of data that you need to use those kinds of visualizations. But all of that type of data, including both taxonomic and functional data, can slot right into a lot of the analysis methods that you're familiar with from DNA studies.
So if I could just comment, I like Sarkis Mazmanian's line. I think it was that he was doing experimentomics, the whole idea of doing hypothesis-driven experiments based on genomic data and then following up on aspects of the hypothesis with biochemistry, structural biology, whatever, and getting to new studies you might then do with omics methods, and then back again to develop a richer picture. And I think already we're seeing that many of the high-impact papers with microbiome-type data also have an array of other kinds of stuff, immunology, whatever, that fleshes it out and turns it from being descriptive to mechanistic. And I think that's a pretty strong theme in the last couple of years.
Thanks. That's awesome. Ah, yeah.
So one of my questions, when I looked at Rob's discussion about the American Gut, and I think that's an absolutely wonderful project. It's the project where you're starting to really get an idea of human biodiversity. It's an opportunity to look at the biodiversity of the human gut, but then it makes me think about what we often affectionately refer to as the normal cohort. And this normal cohort is 80% white and affluent, largely students from universities in urban areas. And I wonder, is that okay? I mean, maybe it is.
Maybe the design is fine and it's gonna really help us move forward. But do we need to start representing a more diversity of people, even within our own country? And if we're gonna do that, how are we gonna get it done? Because these smaller pockets of money or side projects like the American gut, which are, again, absolutely wonderful, is that really enough to get the job done? And so I guess that's my question, is if one of our charges was to evaluate the core human microbiome, which implies a human biological diversity project, that implies a biodiversity project, did we really even touch that? So thank you. So it would be absolutely spectacular if NIH were to fund a human microbiome diversity project on the scale of the human genome diversity project that was done as part of the human genome project. We just don't have a good stratified sample that covers diversity. That would be a far, far more efficient way to do it than self-participation. But at the same time, it has been very difficult to get a project like that funded, at least to date. And it's not entirely clear to me why reviewers hate it so much. Maybe a bunch of them are in the audience and would like to comment on that. But getting that kind of project going would really expand what we know about the human microbiome, what we know about how it can vary, and what the major factors are underlying that variation. And so crowdsourced and crowdfunded projects are going to go some way towards that. But it's going to be a long time before that type of project can get the infrastructure that you would need to do the sampling properly if you were going to use it for epidemiology, for example. So actually along the same theme, I really wanted to applaud Rob, your paper about studying the meta of studies. And personally, I'm beginning to get a little bit of an anxiety disorder about facility effects in animal studies. 
We're seeing a lot of good animal studies, but as we saw, different mouse providers, different bacterial backgrounds. And I wonder, I think we may need a real Manhattan project in a sense to address some of our key large animal models, some of our key small animal models across multiple vendors, multiple strains, multiple facilities, and do that sort of ahead of time, rather than post facto, because it might be a little bit hard to disentangle after the fact. Yeah, again, if I could just comment on that briefly. So pharma has a huge amount of data, but unfortunately, you can't have it. And in addition to the facility effect, there are also huge cage effects, and so doing things like looking at genotype effects in the context of litter mates, replicating them across multiple cages, and additionally, replicating them across multiple facilities can be really critical. In the interest of time, I dropped a bunch of data that we have on that specifically. So in that case, I'm not sure we need, so that is one of the cases where I think we can do that by a distributed project rather than a centralized project, though, because there's enough cases where the same mutation has been rederived multiple times, or where you have mice that have been shipped between facilities and so forth, that as long as people are able to very accurately record what they did, where they got the mice from, what the mutation was, and so forth, how they've been fed in the way that Owen was describing very nicely in his talk, that's exactly the kind of distributed project where it would be a little bit of administrative burden for everyone, rather than a giant, centralized $100 million project to even get off the ground. But I think collecting that data would be absolutely, would be absolutely invaluable for making a lot of progress on figuring out what you can reproduce. If I could make a comment on that. 
We recently completed a study where we studied fungi as well as bacteria, and we saw wave after wave of fungal colonization in a cage-specific way, even in our control. So I think it's probably even worse with fungi than it is with bacteria. But one thing you can do, at least with bacteria, just an element of the design, is use a lot of cages for your controls, use a lot of cages for your experimental, and then permute by cage, and try to treat that as a variable from the start in your design, and then ask if you have a signal over the cage effect. If I can, if I can speak here, just to add a little bit to that. I mean, I think certainly for those studying the immune system, I think for the last 30 or 40 years, there's been enormous variability in the kinds of results that have been reported for various autoimmune diseases, and I think now it's appreciated much of it has to do with the source of the colony. Now, it's becoming obvious that it goes way beyond the immune system, and of course, we've heard some other things. Recently, people in the autism field have found that they cannot reproduce behavioral defects that they have with particular mutations in autism genes, and it hasn't yet been shown that that's related to microbiota, could be related to feeding, to other conditions, but I think many of us would not be surprised if that's what it's going to be. So, I think we do need some kind of an appreciation of what constitutes a basic microbiota, certainly for rodent studies, but you can definitely see with something like SFB, maybe SFB is an outlier, but it affects a lot of different disease processes in mouse models, and I do think that we need to have some kind of better monitoring so we can do comparisons across the board. Okay, there's a question there on the standing mic. I think she's first. Okay, sure. 
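The "permute by cage" suggestion made above can be sketched as a permutation test in which the cage, not the individual mouse, is the unit that gets relabelled. This is a minimal illustration under stated assumptions, not anything presented by the speakers: the function name `cage_permutation_test`, the input layout (one list of measurements per cage), and the choice of a difference in cage means as the statistic are all hypothetical.

```python
import random
from statistics import mean


def cage_permutation_test(cage_values, cage_groups, n_perm=10000, seed=0):
    """Permutation test that treats each cage as one experimental unit.

    cage_values: dict mapping cage id -> list of per-animal measurements
    cage_groups: dict mapping cage id -> "control" or "treated"
    Returns (observed statistic, two-sided permutation p-value).
    """
    cages = sorted(cage_values)
    # Collapse each cage to a single summary so cage mates are not
    # counted as independent replicates.
    cage_means = [mean(cage_values[c]) for c in cages]
    labels = [cage_groups[c] for c in cages]

    def stat(lbls):
        treated = [m for m, g in zip(cage_means, lbls) if g == "treated"]
        control = [m for m, g in zip(cage_means, lbls) if g == "control"]
        return abs(mean(treated) - mean(control))

    observed = stat(labels)
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_perm):
        shuffled = labels[:]
        rng.shuffle(shuffled)  # reassign group labels across whole cages
        if stat(shuffled) >= observed:
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return observed, (hits + 1) / (n_perm + 1)
```

With only a handful of cages per arm the smallest achievable p-value is bounded by the number of distinct cage labelings, which is one way to see why "use a lot of cages" matters for this design.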
So, I just want to go back to Owen's charge for a second about potential gaps in knowledge, and move a little bit away from computational gaps and more into the concept of study design gaps. I'm interested in people's feedback on the issue of how we actually power metagenomic studies. You know, from the WashU group, we have some Dirichlet multinomials as to how we can adequately power metagenomic studies, but I don't think that we have adequately described, depending upon which type of metagenomic analysis we're going to do, when we have actually adequately powered a study to detect the differences that we do. And the second kind of gap in terms of study design that I think is important to think about is that we've all published in various forms about all kinds of different stratification that we see, whether we're talking about the HMP reference data set, which, I will comment back, was actually very well representative of the US population at large. We were just shy of 30% non-Caucasians and we had good race and ethnicity representation looking across the US as a whole. It was very representative actually. But when we're starting to think about these stratified analyses, we're looking into all of our different data sets. We know that males and females have very different profiles across multiple body sites, and we don't always go ahead and stratify our analysis into male and female or stratify our analysis by race or ethnicity. And the third kind of gap that I potentially see in terms of study design is really thinking about, when we're doing different types of stratified analysis and deciding how we're going to look at our data, whether what we call healthy or reference at any given moment in time is always going to be amenable to change, especially if we're working in younger reproductive-age populations, simply because disease manifestations haven't taken full effect.
And so it's important to remember that a fixed phenotype at a moment in time will not necessarily remain true as time goes on. And the importance of actually looking at that in a broader kind of longitudinal dbGaP data set, that becomes a little bit problematic because it means that investigators actually have to update their phenotypic data. But there have been some of us that are associated with CTSA and other clinical trial networks in which we actually do update our phenotypic data over time so people know how those disease states change. So I'm curious about other people's feedback and thoughts on study design and whether or not these are good knowledge gaps for us to be approaching and thinking about moving forward.
Does anyone have any response?
Sounds good to me then. I agree that those are gaps.
Okay, R. A. Fisher wrote a lot about good study design in the 1920s, and it's still true and worth reading that literature. But I think especially as the sample sizes start to increase, people who have done large cohort studies, especially epidemiologists who have had to deal with large multivariate data collections before, really have a lot to offer in figuring out how to standardize large studies and how to figure out batch effects, to either reduce them or de-trend for them. And as Rick said, having multifactorial study designs where you can explicitly take things like cage or sampling plate or that kind of thing into account is really useful. So for example, if you're running all of your time point zero on one plate with one sequencing technology and then waiting two years and sequencing your experimental cases with a different technology, you'll probably find some differences, but you probably don't want to attribute them to the phenotype rather than the technology unless you've done the appropriate controls. Although you'd be surprised by how often that happens.
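The plate-and-technology example above describes a design in which batch is completely confounded with the comparison of interest. A minimal sketch of a sanity check that could be run on a sample sheet before sequencing, assuming hypothetical records given as `(group, batch)` pairs; the function name `batch_confounding` is illustrative and does not come from any tool mentioned in the discussion.

```python
from collections import Counter, defaultdict


def batch_confounding(samples):
    """Cross-tabulate study group against processing batch.

    samples: iterable of (group, batch) pairs, e.g. ("case", "plate2").
    Returns (table, fully_confounded), where table maps
    batch -> Counter of groups, and fully_confounded is True when there
    is more than one group and every batch contains only a single group.
    """
    samples = list(samples)
    table = defaultdict(Counter)
    for group, batch in samples:
        table[batch][group] += 1
    groups = {group for group, _ in samples}
    fully_confounded = len(groups) > 1 and all(
        len(counts) == 1 for counts in table.values()
    )
    return dict(table), fully_confounded


if __name__ == "__main__":
    # Hypothetical sample sheet: all controls on one plate, all cases on another.
    sheet = [("control", "plate1")] * 3 + [("case", "plate2")] * 3
    table, flag = batch_confounding(sheet)
    print(table, flag)
```

A `True` flag means any group difference cannot be separated from a batch difference with these samples alone; a `False` flag does not rule out partial confounding, which needs a proper model with batch as a covariate.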
And I think building on that, I mean, a multifactorial study design is incredibly difficult to power in clinical research. It's very difficult in general, but in this type of study it's even worse. And most of the power analysis that we're trying to do is based upon very high biomass samples. As we move into regions of low biomass samples it becomes even more convoluted, because you're getting far fewer microbial reads per sample to try to actually power to. So I think those are huge gaps in our knowledge, and part of trying to design good studies is even thinking about whether it is a high or low biomass sample.
Yeah, with respect to power, I think one of the key challenges has been that until very recently there just haven't been enough studies to even guess what the effect size is going to be for a new study, which I think has led to a lot of people having the following very frustrating discussion with either their IRBs or their reviewers: if we knew what the effect size was, we wouldn't have to do the experiment to find out what the effect size is, and that's why we don't have a power analysis. But at this point there's enough data sets out there with varying numbers of subjects, varying treatments and so forth, where if you can guess which study might have an effect size that's similar to the study that you're trying to do, what you can do is take the data from that study and do permutation tests or subsampling or that kind of thing and ask how few sequences or how few subjects in that other study could I have got away with and still see the main effect claimed in the paper? And so we have a tool called Evident, primarily developed by Anthony Gonzalez, that addresses that problem. We're preparing that for publication at the moment, but like all our other tools you can get it off GitHub in advance of publication and try it out.
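The subsampling idea described here can be sketched in a few lines. This is not the Evident implementation, whose internals are not described in the discussion; it is a hypothetical illustration that assumes one numeric measurement per subject (for example a diversity value) in each of two groups from an existing study, and uses a simple permutation test on the difference in means as the stand-in for "the main effect claimed in the paper". The names `perm_pvalue` and `detection_rate` are made up for this example.

```python
import random
from statistics import mean


def perm_pvalue(a, b, rng, n_perm=200):
    """Two-sided permutation p-value for a difference in group means."""
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:len(a)]) - mean(pooled[len(a):]))
        if diff >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


def detection_rate(group_a, group_b, n_per_group,
                   n_draws=200, alpha=0.05, seed=0):
    """Fraction of random subsamples in which the effect is still detected.

    group_a, group_b: per-subject values from an existing, analogous study.
    n_per_group: number of subjects to keep from each group per draw.
    """
    rng = random.Random(seed)
    detected = 0
    for _ in range(n_draws):
        a = rng.sample(group_a, n_per_group)  # subsample without replacement
        b = rng.sample(group_b, n_per_group)
        if perm_pvalue(a, b, rng) <= alpha:
            detected += 1
    return detected / n_draws
```

Calling `detection_rate` over a range of `n_per_group` values gives a rough curve of how often the published effect would have been recovered at each sample size, which is only as informative as the guess that the old study's effect size resembles the new one.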
But basically what it lets you do is feed it an existing study and then ask: if I had had just a subset of the data from that study, what fraction of the time would I have seen the result that's claimed in the paper? And so that kind of thing could be really useful if you have some analog of the experiment that you're trying to do. So suppose you guess that your effect size is going to be as big as the effect of obesity, or as big as the difference between a one-year-old and a five-year-old, or as big as the difference between, say, the hand and the gut. If you can guess some analogous physical situation, you can get a long way toward powering the study that way. But then you have to pay very careful attention to what your outcome measures are, because there are so many different analyses that you could do. If your plan was to basically get your data set and do all of them, well, you'll always find something significant.
Okay, let's take a couple of questions at once so our intrepid questioners can sit down. Over there, and then followed by that one.
Okay, so I'm Linda Duffy from NIH, and my question is about the intriguing concept of keystone species. Perhaps one of the values of large-scale genetic screening is if we can really identify key genes in both the eukaryotic intestinal epithelial cells and in the prokaryotic microbial communities. And so I would just like to pose a question to the panel as to what model systems you think could enable us to understand how a specific bacterial gene could be responsible for a specific metabolite and how a specific host gene recognizes this metabolite.
Okay, great, and let's take one more question and then we'll get answers.
Oh, I just had a comment about what Rob was saying earlier about doing population-based studies on the gut.
You could, I just wanted to suggest perhaps accessing the National Health and Nutrition Examination Survey. Maybe the CDC can work together with you, because that's a population-based representative sample, a diverse sample based on race and whatnot. And I also just wanted to add one other comment about your talk on causation and association and differences. One of the hurdles I face as an epidemiologist is a sort of structural hurdle with applying for funding: clinical trials include behavioral interventions as well, and there are extra hurdles for accessing those funding sources if you're doing an intervention, which would be the best evidence for causation. So just an additional comment. Thank you.
Okay, so to this question about key genes and model systems for linking specific genes to metabolites.
I can try to tackle that one. So if I was trying to link a specific organism or gene to a metabolite, there are several ways I would want to go about that. First of all, if you have a metagenome that is sequenced to sufficient depth that you can try to bin out that organism's genome, that would be great, because then you could analyze the genome itself and try to understand some of the functional genes in the genome. But often you can't do that, and so what you're normally left with is the correlation between a species abundance or gene abundance and the abundance of the metabolite. If you do know what the organism is, you could always introduce it into a mouse model, a gnotobiotic mouse model, and then look for some kind of causation there. If you don't have the organism isolated, then of course it's much more difficult. Those are the two ideas I had.
But I think if you think in the context of longitudinal sampling, then you can start teasing out the association of the abundance of the metabolite and its correlation with the abundance of the organism, because you can have the same metabolite made by several different bacteria.
So it's another way to approach the problem.
Just while we're waiting for that, longitudinal studies give you dramatically more power to figure out causative associations because you can use things like distance encoding and look for changes rather than absolute state, which is a lot better. It's been really effective in looking at associations between different taxonomic levels in the ocean, for example.
Yeah, and also if you have longitudinal data sets you can fit mechanistic models that explicitly model the links, the biological processes that lead to the patterns in the changes in the substances and the abundances that you're following. So it's much more effective. Thank you.
I think some of what you're asking, again, could be old-fashioned experimentomics: gene hunting, looking for enzymes that act on particular substrates, stuff like that, the old-fashioned sort of analysis that I think has a lot of potential to augment the metagenomic kind of studies and help interpret them and understand them.
Yeah, I was really specifically addressing the niche specialization concept, and if you really do look for keystone species in different niches with a very specific metabolite, how do you set up the model systems? I was just getting your input. There are various ways to consider that, but thank you, that was great.
So you raised something I was gonna bring up eventually, which is that there's a dirty little secret, and that's that about 30% of the genes we're talking about we can't annotate. We don't know what they do and so we don't know what metabolites they might produce. And so that's a problem. The other problem that we have is, Janet mentioned this, that it's very nice how you go from the amplicon sequencing to the metagenome to the metatranscriptome, and yes, the proteome is wonderful because it actually looks at the activity that's present, but even that's not quite correct because it's not the flux. So nobody's looking at the enzyme kinetics, right?
So you could have an enzyme that's present at low abundance but it could be very, very active and account for a lot of the flow of nutrients through the system. And so when you look at the pathways and genes that people are talking about, they are the pathways where we know what they do. Glycolysis, central metabolism, amino acid metabolism, purine metabolism and so on. When you look at the constancy of the functionality that's present in these things that we've seen in the gut, the genes that are being reported, the pathways that are being reported, are those of central metabolism, which you would expect to be common and invariant among living organisms, because you would expect DNA polymerase to be there somewhere and purine biosynthesis to be there. And that would be true for all the bacteria that are present. So a gap would be to figure out what to do with the 30% of the genes where we don't know what the heck they are, and to try to get, and I don't know how to do this at all, some idea of how that affects energy and nutrient flux through the system by an estimation of kinetics. Because if we're gonna be talking about ecosystem dynamics, then it's going to be carbon and energy flow through that system that's gonna be driving it, unless it violates ecology at lots of different levels. So there's still a huge gap. We're making progress and getting smarter. There's a question about the omics. We have a bigger toolbox than just sequencing, as Janet pointed out, but we still have more tools we need, and it has to do with actually looking at things that tell us something about the flow of energy and carbon through systems. And maybe getting to the signaling processes, which I would argue are not gonna be the major pathways. You're gonna find those in the genes where you don't know what the names of them are and what they do. The ecology is in the part we don't know yet. There's an assertion, you can...
Let's just have one quick response to that and then we're going to go to more questions. Did you want to?
Larry brought up something very important, which is that from macroecology there's another very important concept. For a long time people looked at these keystone species because they were visible, because they had people funding research for these highly visible animals, but then suddenly a lot of theoreticians realized that the indirect interactions, the weak interactions, were also very determinant in these webs. And so these weak interactions from micro systems ecology can be very important, and we might miss them just because they're not very visible.
Okay, a gentleman in the back there had a question.
Thanks. Actually I wanted to comment on a previous question or comment about the gaps that we have. Being someone who works on the front end, thinking about clinical design and how to sample patients, a lot of the low-technology types of things, I think it's really critically important that we have a discussion and some consensus about how we do those things in the front end, because they can profoundly affect the interpretation of results that come out in the back end. Let me give you an example. So my area of interest is in the study of inflammatory bowel diseases. Well, these are typically thought of as two types of diseases, Crohn's disease and ulcerative colitis. But I think that with the information that we've gotten from GWAS studies, it's clear that there are many ways to the same end. That is, the description of these two clinical phenotypes represents the end of many potential pathways that eventually lead to it. So if you consider that, IBD may actually be not two diseases, but a dozen or even a hundred diseases. And if you consider that, then the issue is how do you stratify people so you're not comparing apples to oranges? And that becomes very difficult when you're trying to run your data analysis, let's say, on the microbiome.
The other consideration is that inflammatory bowel diseases has different stages. And what you see in the early stage represents a completely different set of potential pathogens or causative agents and pathophysiology than what happens in later stages. And these are considerations I think we have to bring into the equation in order to correctly interpret the data that we get from the microbiome. Thank you. Can we get comments from over there? Yeah, I just wanted to return to the previous discussion about how to identify keystone species or keystone elements in a community. And this may have already been highlighted previously, but aside from the idea of using synthetic communities and deliberate knockouts of the sort of the methods that Andy Goodman and others have used previously and aside from the use of, I think, very valuable insights from epidemiologists of decades ago in looking at cancer and environmental factor associations and possible causation links that one of the really great needs right now would be a suite of reagents that would allow for a deliberate knockdown of specific features of a complex system inside you. And so small molecules, for example, that might inhibit a specific enzyme or knockdown a specific transcript or an organism. And I think there are lots of great leads on how each of those things might be devised and standardized and ramped up for higher throughput production. But one example being this paper from a couple of years ago and I'm gonna forget on the authors and they're probably here, but the deliberate design of an inhibitor for the glucuronidase that acts upon CPT-11, this anti-cancer agent, rendering the conversion of an otherwise relatively inactive form to an active form diminished and thereby allowing greater use of this drug in cancer settings simply by targeting microbial and microbiome enzymes, so just a few comments. So it's the microbial version of shooting all the wolves that Rob talked about. 
Right, right, the targeted shooting, right. Curtis?
I just wanted to comment quickly on something that Larry said earlier, since a question we get a lot about the functional stability is whether it's just the boring stuff and whether it really is central carbon metabolism and the ribosome and DNA replication and whatnot. And that is a big component of it, but it's by no means all of it. I was checking for examples on my phone in line here. Chaperones, type two and type six secretion, host interactions, stress response are all among the microbial pathways that show that same metagenomic selective effect. So two things that I perhaps didn't emphasize enough to go along with that are, one, they are differentially regulated transcriptionally. And two, that's one of the potential parallels with our own genome. Just because most of it's the same doesn't mean that the smaller parts that are different are uninteresting. And I think that's where it gets back to that edge of functions that we have not characterized yet. 30% is perhaps an underestimate, if anything, and those small differences between us are going to be what manifest in phenotypes in the same way that small differences in our own genomes do.
Two comments, if anyone has a response please chime in.
I could just mention that also at the metabolite level, I mean speaking of unknown things, it's vastly more than 30% of the metabolites that we don't know. So definitely a lot of things. And I also want to reply to the question about the IBD studies and the fluctuation of disease phenotypes. So we just did this one longitudinal study of different IBD patients over time. And the thing that's really interesting and that I didn't present is to get the clinical data as much as possible, because these patients are taking drugs, they're taking antibiotics, they have flare-ups, some of them have gone through resection surgery.
And that is the really meaty, interesting data to match onto these wild fluctuations that we see. So that's the goal. And Curtis, following up on your response to Larry's comment: I know you did a lot of controls for the HMP paper, and we also did a bunch of controls for the Turnbaugh et al. 2009 Nature paper and the Muegge et al. 2011 Science paper, precisely to make sure that we weren't just seeing effects from the fact that every microbe has to replicate DNA, make amino acids, and so forth. So those controls have certainly been done, and that's not all that's going on. Now, at the same time though, as Curtis and Owen and I all mentioned in our talks, if only briefly, there's a huge amount of information out there in people's heads about the functions of specific genes and gene families that's not currently in the databases in a form that we can use to automatically annotate all of the data that all of you are producing. It would be fantastic if we could figure out a crowdsourced effort to get all of those annotations into a standardized format, so that we could provide precisely that kind of service back to the community, get the amount of unannotated stuff down from 30%, or whatever it is currently, to less than 1%, and really see at a much more sensitive level a lot of those differences in rarer pathways and rarer functions that we know for sure are important in some cases and are probably important in a lot more cases than we know. And Curtis, just correct me if I'm wrong, but in the HMP paper, the functional profile that you observed was actually least stable in the vagina, if I remember well. There was a subgroup of subjects where the function is changing, at least the potential function. Yes, definitely that was also the case.
It's also important to recognize that although the functional profiles are a lot more stable than the taxonomic profiles, the variation that there is in the functional profiles is highly correlated with the variation we see in the taxonomic profiles, even though the variation is a lot lower. So they're a lot more stable, but the instability is highly correlated with the instability in the taxonomic profile. Again, that suggests that it's not just that the assay is very insensitive and you see the same thing no matter where you look. And to turn around and restate that as someone who sits in a biostats department, it's also more statistically significant because it is more significant, or excuse me, more consistent, even though the magnitude of that change is smaller. Okay, another question from the middle. So we've talked a lot about metabolomic analyses and analyzing metabolites, but I think it's a good thing to keep in mind that it's really a snapshot of what's going on, because what we're measuring is metabolites that aren't being utilized by the community at that given moment in time. So I guess a challenge to the field, in my mind, is more real-time sampling, and also sampling within an animal or human model over time, to get a more accurate picture of how these metabolites are being utilized or not being utilized by a specific community. So that's a challenge in my mind. I agree. That would be great. Okay, and another question from over there. I have a very general question about the use of macroecology to address the relationships in our bodies. In particular, there was the discussion of wolves: what are the wolves in our bodies, and are these macroecological analogies actually appropriate? Is the wolf the immune system or the phage? Or perhaps the communities in our bodies are just structured in a rather different way from macroecological systems, and these analogies break down rather sooner than we'd like to think?
Well, I'll answer the second question first, which is that a lot of those analogies don't even hold between different macroecosystems, and a lot of the things that often get thrown out as uncontroversial in these kinds of meetings, applied to microbial ecosystems, are also controversial in macroecology. So it is worth being a little bit careful about that. As far as what the keystone species are in human-associated body habitats, to a large extent we don't really know, because we lack the ability, as has been mentioned several times, to completely remove a particular microbe from a human-associated ecosystem without causing much larger-scale disruptions, like you do with antibiotics. And as Pete Turnbaugh showed in his really elegant Cell paper earlier this year, that response to antibiotics also seems to be individual-specific. So you can't guarantee that the same antibiotic is killing the same bug when you administer it to different patients, or even to the same patient at different times. It's really difficult to do some of the specific kinds of species removal that have been possible in macroecosystems, or that have been accidentally done in macroecosystems. The other thing that's been really interesting there is the invasive species literature, where you have accidental introductions of species into an ecosystem that hasn't seen them before. But there, because they're on about our own scale, you can go in and see the interactions and count them, and sit there with your binoculars and see how often the lion's taking down the gazelle or whatever. It's a lot harder to do that kind of thing in the gut, and basically no techniques exist for doing that in situ yet. So that's really a major challenge, right?
How can we observe ecological interactions directly in the gut, as opposed to ex situ? Say you dissect out the fibers from feces and see what is associated with them: by the time you get them separated from the substrate, that takes a long time, and you've disrupted a lot of the interactions that you're looking for. One principle that's very transferable from macroecology is issues about bias in censuses. A lot of people spend a lot of time worrying that DNA extraction is going to be biased, PCR is going to be biased, amplification of whole genomes is going to be biased, and so forth. That's something that ecologists have dealt with for a long time. If you put out traps for insects, for example, exactly how you baited the trap, how you set up the cover, how far it's sticking out of the ground or inset into the ground, where you put it in relation to vegetation, and so on, are all going to bias what you see. But if the methodology is consistent, the differences that you see across sites are still interesting and relevant. So I think we can definitely take a lesson from that aspect of macroecological studies: a lot of the time what you care about is the differences with consistent methodology, rather than necessarily needing an absolute count or an absolute census to get very useful data. If I could follow up on what Rob said in one regard: with all the NIH people here, I worry we may be leaving you with the impression that things like all the methods, all the analytical tools, wet side and dry side, are a lot better figured out than they are, talking about the biases and stuff.
I mean, we all put our best foot forward when we give talks; we don't like talking about the dirty linen and stuff like that. But there are huge biases in how we sample, how we deal with reagent contamination, and how we do the analysis. Different ways of aligning metagenomic data and different tools can lead to very different pictures. Ideally you'll do it a few different ways and make sure you get the same answer, or for the most important thing you find, you test it some other way. But boy, there's a lot that's still in flux and getting developed, and I hope the NIH people are sort of responsive to that. I think a grant that has a piece where they're testing out how the methods work is better than a grant that doesn't, and so I don't want to leave you with the idea that this is all finished and everything works perfectly and that's that. Although, if I could add to that, it's really important that the methods development piece has enough to it that you can actually find something out that's generalizable. One thing that's a huge problem with methods papers that are submitted, for example, to the ISME Journal, still on a regular basis, is that it's basically an analysis of one sample, where that one sample has been run through a bunch of different methods and, amazingly, the results are significantly different depending on which method you used. The reason why that's not useful is that there tends to be a lot of difference between samples in terms of their interactions with the different methods that you try out. So if you do a lot of tech development on one sample, very frequently it doesn't generalize even to other samples you collected from the same body site. One thing that's really critical, if there's going to be a methods development piece, is that you need to run it on enough samples that you can draw useful conclusions that are going to generalize to other studies, rather than finding out a lot about one sample in a way that's not going to generalize.
I think we're going to take one more question, because that clock's going to turn red in a second. So, the gentleman in the middle. So, on the theme of gaps and challenges, I guess something that I would raise to the community as a whole is that a big gap we have is the samples themselves. Namely, we don't have the access to, or the ability to analyze, large cohorts that are well characterized, with good phenotypic characterization, so that you can map phenotypes tightly and understand differences in genomic data. The reason I say that's a big challenge is that even though there are various cohorts out there, they're not as big as they need to be. In addition, getting funding for cohorts and such things is extremely difficult, given the commitment required of both private and public funding sources. So I guess something I'd put out there as a large challenge is, as a community, thinking about how we can actually create these large cohorts. Things like the American Gut are obviously a great attempt to do that. One concern I have in those types of systems goes back, I think, to what Owen was talking about earlier: in the end it's also the sample characterization itself, and so unless we have really strong standards around how we characterize those samples as well, it's an additional challenge. Would it be better to have well-characterized populations, or would it be better just to have access to more people? What are you rate-limited by? Can you just expand? Unfortunately both, I think. Okay. Yeah, I mean, as a community, if we could have access to, say, stool samples from everyone in NHANES, which was brought up earlier, everyone in the National Children's Study, everyone in the Framingham Study, and so forth, that would be an incredible resource. At the same time, I think it's fair to say that it's been challenging to try to get that kind of thing together, at least to date.
Well, I mean, it's possible. I know for a fact that, for example, in England they have a study similar to the National Children's Study, and they are collecting stool both on the moms and the babies, mostly. So I don't think, you know, we should give up. I think there are still possibilities. Yeah, I didn't really want to bring that up, but it is true, and especially Scandinavia is doing an amazing job on this, with, you know, stool samples collected at birth from kids who are now eight years old. Thousands of them in the freezer, and that kind of thing. So we're doing a lot in that direction at the moment, which is very exciting, as are a number of other people at the meeting, like Ruth for example. The Life Study is 90,000 mother-child pairs. That's incredible, yeah. Okay, and on that optimistic note, let's call this discussion to a close. We'll see you all tomorrow, and please enjoy the bar. Thank you to our panelists and to all of you for the last question. Thank you.