Phyloseminar #60: Andrew Roger (Dalhousie)





Streamed live on Oct 7, 2016

Combating phylogenetic artefacts by modeling site-specific substitution processes with mixture models and approximations

The most widely used phylogenetic models of amino acid substitution combine a single reversible empirical substitution matrix (e.g. LG, WAG, or JTT) with a mixture model of rate heterogeneity across sites, such as a discretized gamma distribution. However, these models fail to capture important constraints on protein sequence evolution, heterogeneity in the substitution process across the tree, and heterogeneity across multiple proteins in a concatenated data matrix. Failure to model these features of the data can lead to artefacts in phylogenetic reconstructions, especially for "deep" phylogenetic problems. Here I focus on the importance of modeling site-specific heterogeneity in the substitution process.

The structural and functional roles of residues in proteins constrain the kinds of amino acids that may be substituted at each position over time, a feature that is not captured by single-matrix models. Site-heterogeneous mixture models have been developed to address this issue. For example, the "CAT" mixture models (CAT-Poisson or CAT-GTR), implemented in the PhyloBayes program, have been shown to successfully avoid long-branch attraction problems associated with single-matrix analyses in a number of published cases. However, the utility of these and other mixture models is severely limited for very large phylogenomic analyses because of their computational time cost and memory usage. I will discuss several simple, rapid, and efficient approximations to these full profile mixture models. Our simulation and empirical data analyses demonstrate that these approximations ameliorate long-branch attraction artefacts and, in several cases, provide more accurate estimates of phylogenies than the mixture models from which they derive.
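
To make the idea of a site-heterogeneous profile mixture concrete, here is a minimal Python sketch, not the PhyloBayes implementation and not the approximations discussed in the talk. It assumes a toy four-letter alphabet (D, E, L, V), two hypothetical frequency profiles with made-up values, a Poisson (F81-style) exchange process within each class, and just two sequences separated by a branch of length t. The site likelihood is the weighted sum over classes of the per-class likelihood; a single-profile model is the special case with one class.

```python
import math

def poisson_transition(profile, a, b, t):
    """P(a -> b | t) under a Poisson (F81-style) process with the given
    stationary frequency profile (dict: state -> frequency)."""
    # Rescale so that a branch of length 1 means one expected substitution.
    mu = 1.0 / (1.0 - sum(p * p for p in profile.values()))
    stay = math.exp(-mu * t)
    return stay * (1.0 if a == b else 0.0) + (1.0 - stay) * profile[b]

def site_likelihood(weights, profiles, a, b, t):
    """Likelihood of observing states a and b at one site in two sequences
    separated by branch length t, summed over the mixture classes."""
    return sum(
        w * prof[a] * poisson_transition(prof, a, b, t)
        for w, prof in zip(weights, profiles)
    )

# Hypothetical profiles (illustrative values only, not estimated from data).
acidic = {"D": 0.45, "E": 0.45, "L": 0.05, "V": 0.05}
hydrophobic = {"D": 0.05, "E": 0.05, "L": 0.45, "V": 0.45}
# Averaging the two profiles gives the uniform "single-profile" model.
averaged = {s: 0.5 * (acidic[s] + hydrophobic[s]) for s in acidic}

mixture = ([0.5, 0.5], [acidic, hydrophobic])
single = ([1.0], [averaged])

for a, b in [("D", "E"), ("D", "L")]:
    print(a, b,
          "mixture:", site_likelihood(*mixture, a, b, 0.5),
          "single:", site_likelihood(*single, a, b, 0.5))
```

Under these toy settings the mixture assigns a higher likelihood than the averaged single profile to a D/E pair (both states favored by the same class) and a lower one to a D/L pair, which is the kind of site-specific constraint a single-matrix model cannot express.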
