PCI Genomics

Title *

Semi-artificial datasets as a resource for validation of bioinformatics pipelines for plant virus detection

Authors *

Lucie Tamisier, Annelies Haegeman, Yoika Foucart, Nicolas Fouillien, Maher Al Rwahnih, Nihal Buzkan, Thierry Candresse, Michela Chiumenti, Kris De Jonghe, Marie Lefebvre, Paolo Margaria, Jean Sébastien Reynard, Kristian Stevens, Denis Kutnjak, Sébastien Massart

Year *

2021

Abstract *

The widespread use of High-Throughput Sequencing (HTS) to detect plant viruses and sequence their genomes has generated large amounts of data, along with bioinformatics challenges in processing them. Many bioinformatics pipelines for virus detection are available, which makes choosing a suitable one difficult. Robust benchmarking is needed for an unbiased comparison of these pipelines, but reference datasets suitable for this purpose are currently lacking. We present 7 semi-artificial datasets composed of real RNA-seq datasets from virus-infected plants spiked with artificial virus reads. Each dataset addresses challenges that could prevent virus detection. We also present 3 real datasets with a challenging virus composition, as well as 8 completely artificial datasets for testing haplotype reconstruction software.

Indicate the full web address (DOI or URL) giving public access to these data *

https://gitlab.com/ilvo/VIROMOCKchallenge

Indicate the full web address (DOI or URL) giving public access to these scripts *

https://zenodo.org/record/4584967#.YFIwONzjJPY

Indicate the full web address (DOI, SWHID or URL) giving public access to these codes *

Keywords (optional)

High-Throughput Sequencing, Reference data, Semi-artificial dataset, Plant virus detection, Bioinformatics pipelines, Haplotype reconstruction

Methods that require specific expertise (optional)

None

Thematic fields *

Bioinformatics, Plants, Viruses and transposable elements

Submission date

2020-11-27 14:31:47

Recommender

Hadi Quesneville

Reviewers
