To avoid biases and to be FAIR, we need to CARE and share biodiversity metadata

Francois Sabot, based on reviews by Julian Osuji and 1 anonymous reviewer
A recommendation of:

Contextualising samples: Supporting reference genomes of European biodiversity through sample and associated metadata collection



Submission: posted 03 July 2023, validated 04 July 2023
Recommendation: posted 28 May 2024, validated 01 July 2024
Cite this recommendation as:
Sabot, F. (2024) To avoid biases and to be FAIR, we need to CARE and share biodiversity metadata. Peer Community in Genomics, 100255. 10.24072/pci.genomics.100255


Böhne et al. (2024) do not present a classical scientific paper per se but a report on how the European Reference Genome Atlas (ERGA) aims to deal with sampling and sample information, i.e. metadata.

As the goal of ERGA is to provide an almost fully representative set of reference genomes of European biodiversity to serve many research areas in biology, their sampling has to be truly exhaustive. In this regard, in addition to providing guidelines for recording sample metadata, the authors also discuss the biases that exist in sampling and sequencing projects.

The first task for such a project is to be sure that the data they generate will be usable and available in the future (“[in] perpetuity”, Böhne et al. 2024). The authors deployed a very efficient pipeline for conserving information on sampling: location, physical information, copies of tissues and of DNA, shipping, legal/ethical aspects regarding the Nagoya Protocol, etc., alongside a best-practice manual. This effort is linked to practical guides for the DNA extraction of specific taxa. More generally, these details enable the “Findable, Accessible, Interoperable, and Reusable” (FAIR) principles (Wilkinson et al. 2016) to be followed.

An important aspect of this paper, in addition to practical points, is the reflection upon the different biases inherent to the choice of sequenced samples. Acknowledging their own biases with regard to DNA extraction protocol efficiency, small genome size choice, as well as the availability of material (Nagoya Protocol aspects) and material transfer efficiency, the authors recommend that future biodiversity surveys not only select one’s favorite samples or species but also consider "orphan" taxa. Some of these "orphan" taxonomic groups belong to non-arthropod invertebrates, but internal disparities are also prominent within other taxa. Finally, the implementation of the "Collective benefit, Authority to control, Responsibility, and Ethics" (CARE) principles (Carroll et al. 2021) will allow Indigenous rights to be considered when prioritizing samples, and will enable their "knowledge systems to permeate throughout the process of reference genome production and beyond" (Böhne et al. 2024).

Last, but not least, as ERGA, including its Sampling and Sample Processing committee, is a large collective effort, it is very refreshing to read a paper starting with the acknowledgements and the roles of each member.



Böhne A, Fernández R, Leonard JA, McCartney AM, McTaggart S, Melo-Ferreira J, Monteiro R, Oomen RA, Pettersson OV, Struck TH (2024) Contextualising samples: Supporting reference genomes of European biodiversity through sample and associated metadata collection. bioRxiv, ver. 3 peer-reviewed and recommended by Peer Community in Genomics.

Carroll SR, Herczog E, Hudson M, Russell K, Stall S (2021) Operationalizing the CARE and FAIR Principles for Indigenous data futures. Scientific Data, 8, 108.

Wilkinson MD, Dumontier M, Aalbersberg IjJ, Appleton G, Axton M, Baak A, Blomberg N, Boiten J-W, da Silva Santos LB, Bourne PE, Bouwman J, Brookes AJ, Clark T, Crosas M, Dillo I, Dumon O, Edmunds S, Evelo CT, Finkers R, Gonzalez-Beltran A, Gray AJG, Groth P, Goble C, Grethe JS, Heringa J, ’t Hoen PAC, Hooft R, Kuhn T, Kok R, Kok J, Lusher SJ, Martone ME, Mons A, Packer AL, Persson B, Rocca-Serra P, Roos M, van Schaik R, Sansone S-A, Schultes E, Sengstag T, Slater T, Strawn G, Swertz MA, Thompson M, van der Lei J, van Mulligen E, Velterop J, Waagmeester A, Wittenburg P, Wolstencroft K, Zhao J, Mons B (2016) The FAIR Guiding Principles for scientific data management and stewardship. Scientific Data, 3, 160018.

Conflict of interest:
The recommender in charge of the evaluation of the article and the reviewers declared that they have no conflict of interest (as defined in the code of conduct of PCI) with the authors or with the content of the article. The authors declared that they comply with the PCI rule of having no financial conflicts of interest in relation to the content of the article.
All funding credits are listed in the acknowledgements section of the manuscript. The entire list could not be pasted here due to character limitations.

Reviewed by anonymous reviewer 1, 15 Apr 2024

I am happy with the edits made by the authors. Many thanks for the work and the contribution made to the community.
Best wishes

Evaluation round #1

DOI or URL of the preprint:

Version of the preprint: 1

Author's Reply, 14 Feb 2024

Decision by Francois Sabot, posted 27 Oct 2023, validated 27 Oct 2023

Dear Dr Böhne,

Your paper has finally been reviewed by two independent reviewers (I am sorry for the delay in finding suitable reviewers), and I have read it thoroughly myself.

It is a very interesting and fundamental "white paper" on this excellent project that is ERGA. However, for it to have an even greater impact, particularly with regard to non-core ERGA members, I agree with the reviewers that some parts need a better explanation (besides some minor syntax suggestions).

Indeed, one may be afraid of the big machine that is ERGA, as previous large initiatives provided data but with "no goals": insisting on the possible usage of these data for the whole scientific community as well as the public would be a great improvement to your manuscript.

If you agree to revise the manuscript in this regard, I would be pleased to accept it.


Sincerely yours


Francois Sabot

Reviewed by anonymous reviewer 1, 10 Oct 2023

Reviewed by Julian Osuji, 05 Sep 2023

Review Report



The title clearly reflects the content of the article. However, I suggest replacing “for” in the title with “of”.



The abstract is concise and captures the major points in the article.

55 “SSP serves as the sample provider’s entry point…”: I suggest “providers’”. The reason is that the SSP ought to receive several samples, not one sample.


I. The Sampling and Sample Processing committee of ERGA

Introduction clearly demonstrates the motivation for the study.

The introduction builds on relevant recent and past reference research.

76-77 Delete the phrase “one of which is the Sampling and Sample Processing committee (SSP)”. It seems to appear slightly early. It can come in the opening sentence of the next paragraph, as follows:

88 The Sampling and Sample Processing committee (SSP) is a working group of volunteer expert ERGA members tasked with developing guidelines to support sampling and sample processing.


Materials and Methods

This section contains sufficient information for the work to be replicated in similar research.



Data presented in the article are correct and unambiguously presented.

174 “…Widening countries with 44% and 50% of…”: I suggest “…Widening countries with 44 and 50 % of…”, and

175 “However, only 36% or 42% of the…”: I suggest “However, only 36 or 42 % of the…”


The tables and figures (charts) are clear and self-explanatory. However, the text in Figure 3 could be made more legible for easier reading.

IV. Sample provision: connecting genome teams with sequencing centres

324 “arising from three main categories: biological, logistic, and legal issues.” I rather think it should be:

324 arising from four main categories: biological, logistic, administrative/policy and legal issues.


364 Future taxon-specific best-practice guidelines

The approach of having different sampling procedures for different taxa is very commendable, as it would eliminate complications arising from structural and functional variations between the taxa.


490 References

The listed references are appropriate.


General Comment

The article captured very important details associated with an active reference genome community of practice and vividly explained the challenges faced by such a consortium.
