292

Factors influencing the accuracy and precision in dating single gene trees
Guillaume Louvel and Hugues Roest Crollius
2024
<p>Molecular dating is the inference of divergence times from genetic sequences. Knowing the time of appearance of a taxon sets the evolutionary context by connecting it with past ecosystems and species. Knowing the divergence times of gene lineages would provide a context to understand adaptation at the genomic level. However, molecular clock inference faces uncertainty due to the variability of the rate of substitution between species, between genes and between sites within genes. When dating speciations, per-lineage rate variability can be informed by fossil calibrations, and gene-specific rates can be either averaged out or modeled by concatenating multiple genes. By contrast, when dating gene-specific events, fossil calibrations only inform about speciation nodes and concatenation does not apply to divergences other than speciations.</p> <p>This study aims to benchmark the accuracy of molecular dating applied to single gene trees and to identify how it is affected by gene tree characteristics. We analyzed 5205 alignments of genes from 21 Primates in which no duplication or loss is observed. We also simulated alignments under a relaxed clock model, based on characteristics of the Primates data, to analyze the dating accuracy. Divergence times were estimated with the Bayesian program BEAST 2.</p> <p>In the empirical dataset, we find that date estimates deviate more from the median age when alignments are shorter, when rate heterogeneity between branches is high and when the average rate is low. These features determine the amount of dating information in an alignment, and hence the statistical power. The smallest deviation is associated with core biological functions such as ATP binding, cellular organization and anatomical development, categories that are expected to be under strong negative selection. We then investigated the accuracy of dating with simulated alignments, controlling the three parameters above separately. These simulations confirmed the factors affecting precision, but also revealed biases when branch rates are highly heterogeneous. This suggests that in the case of the relaxed uncorrelated molecular clock, biases arise from the tree prior when calibrations are lacking and rate heterogeneity is high. Finally, our study reports the scale of the gene tree features that influence the consistency of dating with median ages, so that comparisons can be made with other genes and taxa. To tackle the molecular dating of events only observed in single gene trees, like deep coalescence, horizontal gene transfers and gene duplications, future models should overcome the lack of power due to limited information from single genes.</p>
https://doi.org/10.5281/zenodo.14000603
https://doi.org/10.5281/zenodo.14000603
https://doi.org/10.5281/zenodo.14000603
molecular clock, molecular dating, uncertainty, gene tree, primates, phylogenetics, phylogenomics
None
Bioinformatics, Evolutionary genomics, Vertebrates
Nicolas Galtier suggested: Marc Robinson-Rechavi (Lausanne)
Nicolas Galtier suggested: Bastien Boussau (Lyon)
Nicolas Galtier suggested: Nicolas Salamin (Lausanne)
Nicolas Galtier suggested: Maria Anisimova (Zurich)
Bastien Boussau suggested: I know the authors quite well and prefer not to review their work. I think Benoit Morel (https://www.h-its.org/people/benoit-morel/) or Adrian Davin (https://scmb.uq.edu.au/profile/5621/adrian-arellano-davin) could be appropriate reviewers for this manuscript.
Marc Robinson-Rechavi [marc.robinson-rechavi@unil.ch] suggested: Conflict of interest because of collaborations with Roest Crollius.
Marc Robinson-Rechavi [marc.robinson-rechavi@unil.ch] suggested: I suggest Daniele Silvestro daniele.silvestro@unifr.ch
Joelle Barido-Sottani [joelle.barido-sottani@m4x.org] suggested: Simon Ho simon.ho@sydney.edu.au
Joelle Barido-Sottani [joelle.barido-sottani@m4x.org] suggested: David Duchêne david.duchene@sund.ku.dk
David Duchêne suggested: The authors have made an extensive effort to address reviewers' concerns. There is still some confusion regarding the use of the term 'precision', which is likely being mixed up with accuracy. To avoid this, the authors should replace all instances of both - particularly in the abstract - for their actual definition (accuracy: divergence from the true value; precision: width of the uncertainty/credible interval). Once this is addressed, the article will be a useful contribution to the field.
2023-08-15 12:06:09
Federico Hoffmann