Gavin M. Douglas and Morgan G. I. Langille
<p style="text-align: justify;">The past decade has seen a surge of interest in profiling microbiomes through DNA sequencing. The resulting investigations have yielded myriad insights and attracted an influx of researchers to the field. Many newcomers need primers on the fundamentals of microbiome sequencing data types and the methods used to analyze them. Accordingly, we aim to provide a detailed but accessible introduction to these topics. We first present background on marker-gene and shotgun metagenomic sequencing and then discuss the unique characteristics of microbiome data in general. We highlight several important caveats arising from these characteristics that should be appreciated when analyzing these data. We then introduce the many-faceted concept of microbial functions and several controversies in this area. One controversy in particular concerns whether metagenome prediction methods (i.e., those based on marker-gene sequences) are sufficiently accurate to support reliable biological inferences. We next highlight several underappreciated developments in the integration of taxonomic and functional data types. This topic is highly pertinent because, although these data types are inherently connected, they are often analyzed independently and linked only anecdotally in the literature. We close by providing our perspective on data integration and on reproducibility in microbiome research, both of which are crucial data analysis challenges facing microbiome researchers.</p>
bioinformatics, microbiome, data integration, metagenomics, reproducibility