All authors


Szánthó Lénárd L

is listed among the authors of the following abstracts:

Szánthó Lénárd
Compositionally Constrained Sites Drive Long-Branch Attraction Artefact in Deep Phylogenomic Inferences

Aug 31 - Thursday

12:10 – 12:30

Theoretical biophysics

E44

Compositionally Constrained Sites Drive Long-Branch Attraction Artefact in Deep Phylogenomic Inferences

Lénárd L Szánthó (1,2,3), Nicolas Lartillot (4), Gergely J Szöllősi (1,2,3), Dominik Schrempf (1)

1 Department of Biological Physics, Eötvös University, Budapest, Hungary

2 ELTE-MTA “Lendület” Evolutionary Genomics Research Group, Budapest, Hungary

3 Institute of Evolution, Centre for Ecological Research, Budapest, Hungary

4 Laboratoire de Biométrie et Biologie Evolutive UMR 5558, CNRS, Université de Lyon, Villeurbanne, France

Accurate phylogenies are fundamental to our understanding of the pattern and process of evolution. Yet phylogenies at deep evolutionary timescales, with correspondingly long branches, have been fraught with controversy because models of varying complexity and goodness of fit yield conflicting estimates. Analyses of historical as well as current empirical datasets, such as alignments including Microsporidia, Nematoda, or Platyhelminthes, have demonstrated that inadequate modeling of across-site compositional heterogeneity can lead to erroneous topologies that are strongly supported. This heterogeneity results from biochemical constraints that lead to varying patterns of accepted amino acids along sequences. Unfortunately, models that adequately account for across-site compositional heterogeneity remain computationally challenging for an increasing fraction of datasets. Here we introduce “compositional constraint analysis”, a method to investigate how site-specific constraints on amino acid composition affect phylogenetic inference. We show that more constrained sites with lower diversity and less constrained sites with higher diversity exhibit ostensibly conflicting signals under models that ignore across-site compositional heterogeneity, leading to long-branch attraction artifacts, and we demonstrate that more complex models accounting for across-site compositional heterogeneity can ameliorate this bias. We present CAT-PMSF (CAT posterior mean site frequencies), a pipeline based on the CAT model for diagnosing and resolving phylogenetic bias that results from inadequate modeling of across-site compositional heterogeneity. CAT-PMSF is robust against long-branch attraction in all alignments we have examined. We suggest using CAT-PMSF when convergence of the CAT model cannot be assured. We find evidence that compositionally constrained sites are driving long-branch attraction in two metazoan datasets and recover evidence for Porifera as the sister group to all other animals.
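To make the idea of more and less constrained sites concrete, the following is a minimal Python sketch of how alignment columns could be ranked by amino acid diversity and split into two groups. It is an illustration under stated assumptions, not the authors' implementation: diversity is taken here as the effective number of amino acids, exp(Shannon entropy), computed from the observed column frequencies, whereas the abstract does not specify how diversity is measured. The function names and the threshold parameter are hypothetical.

```python
import math
from collections import Counter

# The 20 standard amino acids; gaps and ambiguity codes are ignored.
AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def effective_number_of_amino_acids(column):
    """Return exp(Shannon entropy) of the amino acid frequencies in a column.

    Assumption: empirical column frequencies stand in for site-specific
    frequencies; a model-based analysis would estimate these differently.
    """
    counts = Counter(c for c in column.upper() if c in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        return 0.0  # column contains only gaps or unknown characters
    entropy = -sum((n / total) * math.log(n / total) for n in counts.values())
    return math.exp(entropy)


def split_sites_by_diversity(alignment, threshold):
    """Split site indices into (more constrained, less constrained).

    `alignment` is a list of equal-length aligned sequences. Sites whose
    effective number of amino acids is <= `threshold` (a hypothetical
    cutoff) are treated as more constrained.
    """
    n_sites = len(alignment[0])
    if any(len(seq) != n_sites for seq in alignment):
        raise ValueError("all sequences must have the same length")
    constrained, relaxed = [], []
    for i in range(n_sites):
        column = "".join(seq[i] for seq in alignment)
        diversity = effective_number_of_amino_acids(column)
        (constrained if diversity <= threshold else relaxed).append(i)
    return constrained, relaxed


if __name__ == "__main__":
    toy_alignment = ["AAKL", "AARL", "AGHL", "A-DL"]
    print(split_sites_by_diversity(toy_alignment, threshold=2.0))
```

Trees inferred separately from the two site groups could then be compared to check whether they support conflicting topologies, which is the kind of diagnostic the abstract describes.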