233

MacSyFinder v2: Improved modelling and search engine to identify molecular systems in genomes
Bertrand Néron, Rémi Denise, Charles Coluzzi, Marie Touchon, Eduardo P. C. Rocha, Sophie S. Abby
2023
<p style="text-align: justify;">Complex cellular functions are usually encoded by a set of genes in one or a few organized genetic loci in microbial genomes. Macromolecular System Finder (MacSyFinder) is a program that uses these properties to model and then annotate cellular functions in microbial genomes. It does so by integrating the identification of each individual gene at the level of the molecular system. We present a major release of MacSyFinder (version 2), coded in Python 3. The code was improved and rationalized to facilitate future maintenance. Several new features were added to allow more flexible modelling of the systems. We introduce a more intuitive and comprehensive search engine that identifies all the best candidate systems, as well as sub-optimal ones, that respect the models’ constraints. We also introduce the novel macsydata companion tool, which enables the easy installation and broad distribution of the models developed for MacSyFinder (macsy-models) from GitHub repositories. Finally, we have updated and improved MacSyFinder's popular models: TXSScan to identify protein secretion systems, TFFscan to identify type IV filaments, CONJscan to identify conjugative systems, and CasFinder to identify CRISPR-associated proteins. MacSyFinder and the updated models are available at: <a href="https://github.com/gem-pasteur/macsyfinder" target="_blank" rel="noopener">https://github.com/gem-pasteur/macsyfinder</a> and <a href="https://github.com/macsy-models" target="_blank" rel="noopener">https://github.com/macsy-models</a>.</p>
https://github.com/macsy-models, https://doi.org/10.6084/m9.figshare.21936992
https://github.com/gem-pasteur/macsyfinder/tree/master/macsypy/scripts
https://github.com/gem-pasteur/macsyfinder
genome annotation; functional annotation; microbial genomes; comparative genomics; bioinformatics; software
None
Bacteria and archaea, Bioinformatics, Functional genomics
David A Baltrus [baltrus@arizona.edu], Jesse B Shapiro [jesse.shapiro@mcgill.ca], Laura Hug [laura.audrey.hug@gmail.com], Lionel Guy [lionel.guy@imbim.uu.se], A Murat Eren [meren@uchicago.edu], Chris Greening [Chris.Greening@monash.edu], Cameron Thrash [thrash@usc.edu], Toni Gabaldon [toni.gabaldon@irbbarcelona.org]
Christopher T. Brown, for competition reasons
2022-09-09 10:30:31
Gavin Douglas
Kwee Boon Brandon Seah, Max Emil Schön