Hello everyone, we are delighted to present the first talk in this session and would like to thank the organizing committee, the scientific committee and the chair of this session for giving us an opportunity to present our work. I am Pratik Jagtap, research assistant professor at the University of Minnesota and co-lead of the Galaxy-P team. Along with my colleague Subina Mehta, I'd like to present our work on metaproteomics analysis of data acquired using two different mass spectrometry acquisition modes. This work is the result of a collaboration with multiple labs. In particular, the biological data set is from Dr. Jacqueline Padilla-Gamiño's lab at the University of Washington. The mass spectrometry acquisition of this data set was performed in Dr. Brook Nunn's lab, and we worked with Dr. Brian Searle to analyze this data set using the EncyclopeDIA software. Microbiome research has gained importance over the last few years, especially in the area of clinical research. Multiple studies have highlighted the relevance of the gut microbiome in human health and in pathophysiological conditions such as inflammatory diseases, obesity and cancer. On the other hand, environmental microbiologists have been studying the role of the microbiome in diverse ecosystems such as freshwater lakes, permafrost soils and deep-sea oil plumes, to name just a few. One such area of research is the health of coral systems within oceans worldwide. Coral reefs occupy less than one percent of the ocean yet are home to nearly one quarter of all ocean species. Rising seawater temperatures are causing coral bleaching worldwide. As a result, over 50 percent of the world's coral reefs have died in the last 30 years, and up to 90 percent may die within the next century. Coral bleaching results in the uncoupling of the symbiotic relationship that the coral host has with its symbiont algae, thus making the coral vulnerable to diseases. In this study, Dr. 
Padilla-Gamiño's lab, which is interested in the effect of bleaching on gamete generation and release, studied Montipora capitata, a main reef builder in Hawaii. The lab subjected colonies of M. capitata to thermal bleaching for a few weeks before returning them to the ocean ecosystem so that they could undergo recovery. In June of the following year, on a new-moon day, the corals spawned, releasing egg-sperm packets, which were collected. Shown here is a microscopic view of one such egg-sperm packet. The egg-sperm packets collected from bleached and control coral samples were subjected to proteomics sample preparation, in which they were digested, and mass spectrometric data were acquired in two modes: the data-dependent acquisition mode and the data-independent acquisition mode. These methods will be discussed later. The study of microbiomes uses various methods, such as metagenomics, which measures the correlation of taxonomy with the phenotype using DNA sequencing data. Microbiome studies that use the expression of microbial RNA and proteins, such as metatranscriptomics and metaproteomics, have the advantage that they can measure microbial functions and thus help in understanding the mechanism by which the microbiome interacts with its immediate ecosystem. In metaproteomics, which uses mass spectrometry, proteins from a clinical or environmental sample are digested using enzymes such as trypsin to generate peptides. These peptides are separated using high-performance liquid chromatography and then ionized within the mass spectrometer. The ionized peptides are analyzed for their intensity, their retention time (which depends on the liquid chromatographic conditions) and their mass-to-charge ratio, also called the m/z ratio. Peptides with precursor intensities above the noise threshold are selected for fragmentation, generating tandem mass spectra, termed MS/MS spectra. 
These MS/MS spectra can be matched to peptide sequences in a database. This method of data acquisition is called data-dependent acquisition, or DDA. Although this approach is powerful, data acquisition is stochastic: the mass spectrometer samples peptides for fragmentation with a bias towards those with the strongest signal. Thus DDA presents a challenge in reproducibly quantifying low-abundance peptides. As you can see in this figure, data-independent acquisition mass spectrometry, or DIA-MS, continuously collects fragment ion intensities for all eluting peptides by using a wider isolation window, such as 10 daltons. The simultaneous isolation and fragmentation of multiple peptides results in a complex MS2 spectrum consisting of ions from all isolated peptides. As a result, data-independent acquisition provides a broader dynamic range in quantification and improved reproducibility for identification and quantification, resulting in fewer missing values. Because of the complexity of the data, in particular the mixing of ion information from multiple peptides, many bioinformatics approaches have been developed. One such software is EncyclopeDIA, which was developed in Mike MacCoss's lab by Brian Searle. This software takes DIA data from pooled gas-phase fractionated samples and from experimental samples, along with the protein FASTA file, as inputs. The protein FASTA file can be submitted to the deep-learning-based Prosit software, which generates a predicted peptide library that includes predicted fragmentation and retention times, among other features. The Prosit-generated spectral library can serve as an input for EncyclopeDIA analysis. The workflow generates a chromatogram library, which is subsequently used as a template for the experimental DIA data to produce an output of proteins and peptides and their associated quantitative data. 
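As a rough illustration of the contrast between the two acquisition modes described above, here is a minimal Python sketch. It is a toy model, not how an instrument or EncyclopeDIA works internally: the precursor list, the function names `dda_select` and `dia_windows`, the noise threshold, the top-N value and the 400 m/z starting point are all assumptions made up for this example. Only the 10-dalton window width comes from the talk.

```python
from collections import defaultdict

# Hypothetical precursors seen in one MS1 scan: (label, m/z, intensity).
# These values are invented for illustration; they are not study data.
precursors = [
    ("pepA", 500.3, 9.0e6),
    ("pepB", 503.8, 4.0e5),
    ("pepC", 612.4, 2.5e6),
    ("pepD", 615.1, 8.0e3),
    ("pepE", 731.9, 6.0e4),
]


def dda_select(precursors, noise_threshold, top_n):
    """DDA-style selection: keep precursors above the noise threshold,
    then pick only the top_n most intense ones for fragmentation."""
    above_noise = [p for p in precursors if p[2] > noise_threshold]
    ranked = sorted(above_noise, key=lambda p: p[2], reverse=True)
    return [label for label, _mz, _intensity in ranked[:top_n]]


def dia_windows(precursors, start_mz, width):
    """DIA-style isolation: every precursor falls into a fixed-width m/z
    window and is fragmented together with the others in that window."""
    windows = defaultdict(list)
    for label, mz, _intensity in precursors:
        index = int((mz - start_mz) // width)
        lower = start_mz + index * width
        windows[(lower, lower + width)].append(label)
    return dict(windows)


dda = dda_select(precursors, noise_threshold=1.0e4, top_n=2)
dia = dia_windows(precursors, start_mz=400.0, width=10.0)

print("DDA fragments:", dda)
for window, labels in sorted(dia.items()):
    print("DIA window", window, "co-fragments", labels)
```

In this toy scan, the DDA-style selection fragments only the two most intense precursors, so the weaker ones go unsampled, while the DIA-style windows include every precursor at the cost of mixed spectra that need a library to untangle.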
The quantitative peptide information is later used by the metaQuantome software to generate statistical and visualization outputs for biological interpretation. Attendees are encouraged to visit the GTN sites for the EncyclopeDIA and metaQuantome software for a better understanding of how they work. As part of this analysis, we started by generating a reduced database from a large database of over 7 million protein sequences that included the coral host, its symbionts and other bacteria that have been reported, or were detected in this data set, to be associated with corals. To detect organisms within this sample, we used the ComPIL 2.0 software developed at the Scripps Research Institute, along with a literature survey. To reduce the database, we used the MetaNovo software, which uses de novo tagging methods to detect proteins that are likely present in the sample. The compact reduced database was then used to search the DDA data set, and this led to the detection of about 15,000 to 16,000 peptides in the bleached and control data sets respectively. The detected peptides were subjected to quantitative, functional and taxonomic analysis so that the results could be used as inputs for metaQuantome analysis. For the DIA data, the Prosit-generated spectral library, along with the FASTA file, the GPF data and the experimental data, was used to detect peptides and their associated quantitation. As you can see here, DIA analysis detected fewer peptides and proteins than DDA analysis. At this point, I will hand over the mic to Subina Mehta from the Galaxy-P team, who will take you through the data analysis of this data set. Thank you, Pratik. Hello all, this is Subina Mehta, and I am going to go over the data analysis that we performed on the coral egg-sperm data sets using the Galaxy platform. As you can see, the results from both the DIA and DDA workflows were a list of identified peptides along with their quantitative values. We decided to analyze the data with metaQuantome to look for statistically significant peptides present in the samples. 
We performed our analysis using metaQuantome, which offers differential abundance analysis, principal component analysis and clustered heat map visualizations across multiple experimental conditions. It is an open-source tool that is available via the command line and is also accessible via the Galaxy platform for reproducible analysis. The three main inputs are the peptide quantitation report, the functional annotation and the taxonomic annotation. This is the flow chart of the DIA and DDA analyses that we performed via Galaxy. Just to recap, in the DDA search we used SearchGUI and PeptideShaker to perform the database search and FlashLFQ for quantitation. In DIA, we used a Prosit-based spectral library and EncyclopeDIA to quantify the identified peptides. Both DIA and DDA peptides were subjected to Unipept and BLASTP for functional and taxonomic annotation. Finally, metaQuantome enabled quantitative and statistical analysis and visualization of functional and taxonomic expression. metaQuantome has several modules. The four required inputs are the peptide intensities, the taxonomic and functional annotations, the metaQuantome sample file and the metaQuantome databases. The first step is the expand module, in which we expand the set of original annotations to include all the ancestors of the direct annotations. Next, we filter the peptides according to user-defined thresholds. Next, we perform data analysis. Lastly comes visualization, in which we get a bar chart, heat map, PCA plot and volcano plot as outputs. Let's go back and look at our data. We had peptides with quantitative values in both DDA and DIA. Given that there were fewer peptides from the DIA analysis, we decided to look at the peptide overlap between the DDA control and bleached samples and the DIA output. The overlap analysis showed that DIA detected fewer peptides than DDA. 
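The overlap comparison just described can be sketched with plain Python sets. This is only a minimal illustration under stated assumptions: the peptide sequences are placeholders rather than study data, and `overlap_summary` is a hypothetical helper written for this example, not part of metaQuantome or Galaxy.

```python
# Placeholder peptide identifications; these are NOT the study's peptides.
dda_control = {"PEPTIDEA", "PEPTIDEB", "PEPTIDEC"}
dda_bleached = {"PEPTIDEB", "PEPTIDED", "PEPTIDEE"}
dia_all = {"PEPTIDEA", "PEPTIDEB", "PEPTIDEE"}


def overlap_summary(dda_control, dda_bleached, dia):
    """Summarize how DDA identifications overlap with DIA identifications."""
    dda_all = dda_control | dda_bleached
    # Peptides seen in only one DDA condition.
    only_control = dda_control - dda_bleached
    only_bleached = dda_bleached - dda_control
    return {
        "dda_total": len(dda_all),
        "dia_total": len(dia),
        "shared_dda_dia": len(dda_all & dia),
        # Condition-exclusive DDA peptides that DIA also detected.
        "control_only_in_dia": sorted(only_control & dia),
        "bleached_only_in_dia": sorted(only_bleached & dia),
    }


summary = overlap_summary(dda_control, dda_bleached, dia_all)
for key, value in summary.items():
    print(key, value)
```

With real data, the three sets would be read from the peptide reports of the DDA and DIA workflows, and the condition-exclusive lists are the peptides whose quantitation can then be compared across the two acquisition modes.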
In order to compare the quantitation values between DIA and DDA, we looked at the peptides that were exclusively detected in control or in bleached samples in DDA to see how their quantitation values showed up in the DIA data. Here there are 173 peptides from the DDA control data set that were detected in DIA and 125 peptides from the DDA bleached data set that were detected in the DIA data. This panel shows the quantitative values for one peptide that was exclusively detected in the control DDA sample. As you can see, the DDA intensity values for this peptide were present only in control and not in bleached samples. When we looked at the same peptide in the DIA run, it showed better quantitative values not only in the control sample but also in the bleached sample. This is not unexpected, given that a DIA data set is expected to have better quantitation and fewer missing values. Here is a set of peptides that behave similarly in DDA and DIA. We also looked for peptides in the bleached sample that are not present in control, which again showed a similar trend. This made us realize that it is important to get extensive DIA data to draw better biological conclusions. Now that we had seen variations, we looked at peptide-level differences for different peptides. Currently we have a situation where the DDA data have a lot of identifications but many missing values, and the DIA data have fewer missing values but also fewer identifications. Ideally, we would like to have data with more identifications and fewer missing values. Even so, we continued with our analysis. Here, as you can see, as expected, Anthozoa, which is the coral, and Dinophyceae, the symbiont, are the two major classes in both DDA and DIA. However, if you notice, we do find Mammalia in DIA. This could be due to our DIA data not being complete, so we need to improve on this. Next, we looked at PCA in both DDA and DIA; these are the function PCA plots. 
We didn't see any clear separation between bleached and control samples because these changes were very subtle, in DDA as well as in DIA with its fewer missing values. We saw a similar trend with the taxonomy, heat map clustering and volcano plots. This could be a result of the subtle changes, as we mentioned, and of the fact that the bleaching stimulus was applied in the fall season while we were sampling in the summer season, so gamete formation was not affected by bleaching. This could also be the result of missing values in DDA or incomplete coverage in DIA. As metaQuantome did not provide us with any statistically significant results, we looked at the peptides associated with the genera identified, and we saw that there was not much difference in genera between control and bleached samples. Thus, we extracted those genera that were elevated in either control or bleached samples. Here is a bar plot of a few of the genera that we looked at. We tried to look for the same in DIA. However, here we saw an opposite trend in Aspergillus and missing values in Sphaeroforma, Vibrio and Apostichopus. As a side note, we did find Vibrio at the family and order levels of the hierarchy. Next, we examined peptides associated with function and saw a similar trend, with not many changes between control and bleached samples in either DDA or DIA. However, when we looked at a few functions in DDA that were elevated in either control or bleached samples, surprisingly, all were related to DNA damage, response to stress or signal transduction. This led us to conclude that, as the samples are showing very minute changes, we need to analyze the data with other statistical tools or our own scripts that can capture these small changes, and to make these available via the Galaxy platform. In conclusion, we analyzed a coral egg-sperm packet metaproteomic data set acquired with two MS acquisition methods. Since our workflow detected fewer peptides with DIA analysis, we plan to use alternative spectral library generation methods to improve its coverage. 
We believe that this will help us detect differences between control and bleached samples. We also plan to use alternative software tools to compare DDA and DIA analysis. Lastly, we would like to acknowledge our collaborators and funding agencies. Apart from the co-authors on this talk, highlighted in blue, we would like to thank Peter Thuy-Boun from the Scripps Research Institute. Thank you for listening; we look forward to your questions.