Informed Choices, Cohesive Future: Decisions and Recommendations for ERGA

Jitendra Narayan

doi:10.24072/pci.genomics.100298

Recommendation

Jitendra Narayan based on reviews by Justin Ideozu and Eric Crandall

A recommendation of:

The European Reference Genome Atlas: piloting a decentralised approach to equitable biodiversity genomics

Ann M Mc Cartney, Giulio Formenti, Alice Mouton, Claudio Ciofi, Robert M Waterhouse, Camila J Mazzoni, Diego De Panis, Luisa S Schlude Marins, Henrique G Leitao, Genevieve Diedericks, Joseph Kirangwa, Marco Morselli, Judit Salces, Nuria Escudero, Alessio Iannucci, Chiara Natali, Hannes Svardal, Rosa Fernandez, Tim De Pooter, Geert Joris, Mojca Strazisar, Jo Wood, Katie E Herron, Ole Seehausen, Phillip C Watts, Felix Shaw, Robert P Davey, Alice Minotto, Jose Maria Fernandez Gonzalez, Astrid Bohne, Carla Alegria, Tyler Alioto, Paulo C Alves, Isabel R Amorim, Jean-Marc Aury, Niclas Backstrom, Petr Baldrian, Loriano Ballarin, Laima Baltrunaite, Endre Barta, Bertrand BedHom, Caroline Belser, Johannes Bergsten, Laurie Bertrand, Helena Bilandija, Mahesh Binzer-Panchal, Iliana Bista, Mark Blaxter, Paulo AV Borges, Guilherme Borges Dias, Mirte Bosse, Tom Brown, Remy Bruggmann, Elena Buena-Atienza, Josephine Burgin, Elena Buzan, Nicolas Casadei, Matteo Chiara, Sergio Chozas, Fedor F Ciampor, Angelica Crottini, Corinne Cruaud, Fernando Cruz, Love Dalen, Alessio De Biase, Javier del Campo, Teo Delic, Alice B Dennis, Martijn FL Derks, Maria Angela Diroma, Mihajla Djan, Simone Duprat, Klara Eleftheriadi, Philine GD Feulner, Jean-Francois Flot, Giobbe Forni, Bruno Fosso, Pascal Fournier, Christine Fournier-Chambrillon, Toni Gabaldon, Shilpa Garg, Carmela Gissi, Luca Giupponi, Jessica Gomez-Garrido, Josefa Gonzalez, Miguel L Grilo, Bjoern Gruening, Thomas Guerin, Nadege Guiglielmoni, Marta Gut, Marcel P Haesler, Christoph Hahn, Balint Halpern, Peter Harrison, Julia Heintz, Maris Hindrikson, Jacob Hoglund, Kerstin Howe, Graham Hughes, Benjamin Istace, Mark J. Cock, Franc Jancekovic, Zophonias O Jonsson, Sagane Joye-Dind, Janne J. 
Koskimaki, Boris Krystufek, Justyna Kubacka, Heiner Kuhl, Szilvia Kusza, Karine Labadie, Meri Lahteenaro, Henrik Lantz, Anton Lavrinienko, Lucas Leclere, Ricardo Jorge Lopes, Ole Madsen, Ghislaine Magdelenat, Giulia Magoga, Tereza Manousaki, Tapio Mappes, Joao Pedro Marques, Gemma I Martinez Redondo, Florian Maumus, Hendrik-Jan Megens, Shane A McCarthy, Jose Melo-Ferreira, Sofia L Mendes, Matteo Montagna, Joao Moreno, Mai-Britt Mosbech, Monica Moura, Zuzana Musilova, Eugene Myers, Will J. Nash, Alexander Nater, Pamela Nicholson, Manuel Niell, Reindert Nijland, Benjamin Noel, Karin Noren, Pedro H Oliveira, Remi-Andre Olsen, Lino Ometto, Stephan Ossowski, Vaidas Palinauskas, Snaebjorn Palsson, Jerome P Panibe, Joana Pauperio, Martina Pavlek, Emilie Payen, Julia Pawlowska, Jaume Pellicer, Graziano Pesole, Joao Pimenta, Martin Pippel, Anna Maria Pirttila, Nikos Poulakakis, Jeena Rajan, Ruben MC Rego, Roberto Resendes, Philipp Resl, Ana Riesgo, Patrik Rodin-Morch, Andre ER Soares, Carlos Rodriguez Fernandes, Maria M. Romeiras, Guilherme Roxo, Lukas Ruber, Maria Jose Ruiz-Lopez, Urmas Saarma, Luis P Silva, Manuela Sim-Sim, Lucile Soler, Vitor C Sousa, Carla Sousa Santos, Alberto Spada, Milomir Stefanovic, Viktor Steger, Josefin Stiller, Matthias Stock, Torsten Hugo H Struck, Hiranya Sudasinghe, Riikka Tapanainen, Christian Tellgren-Roth, Helena Trindade, Yevhen Tukalenko, Ilenia Urso, Benoit Vacherie, Steven M Van Belleghem, Kees van Oers, Carlos Vargas-Chavez, Nevena Velickovic, Noel Vella, Adriana Vella, Cristiano Vernesi, Sara Vicente, Sara Villa, Olga Vinnere Pettersson, Filip AM Volckaert, Judit Voros, Patrick Wincker, Sylke Winkler (2024), bioRxiv, ver.4, peer-reviewed and recommended by PCI Genomics https://doi.org/10.1101/2023.09.25.559365

Read preprint in preprint server Now published in a journal

Abstract

The European Reference Genome Atlas: piloting a decentralised approach to equitable biodiversity genomics

English: A global genome database of all of Earth's species diversity could be a treasure trove of scientific discoveries. However, regardless of the major advances in genome sequencing technologies, only a tiny fraction of species have genomic information available. To contribute to a more complete planetary genomic database, scientists and institutions across the world have united under the Earth BioGenome Project (EBP), which plans to sequence and assemble high-quality reference genomes for all ~1.5 million recognized eukaryotic species through a stepwise phased approach. As the initiative transitions into Phase II, where 150,000 species are to be sequenced in just four years, worldwide participation in the project will be fundamental to success. As the European node of the EBP, the European Reference Genome Atlas (ERGA) seeks to implement a new decentralised, accessible, equitable and inclusive model for producing high-quality reference genomes, which will inform EBP as it scales. To embark on this mission, ERGA launched a Pilot Project to establish a network across Europe to develop and test the first infrastructure of its kind for the coordinated and distributed reference genome production on 98 European eukaryotic species from sample providers across 34 European countries. Here we outline the process and challenges faced during the development of a pilot infrastructure for the production of reference genome resources, and explore the effectiveness of this approach in terms of high-quality reference genome production, considering also equity and inclusion. The outcomes and lessons learned during this pilot provide a solid foundation for ERGA while offering key learnings to other transnational and national genomic resource projects.

French: Une base de données génomiques mondiale regroupant toute la diversité des espèces de la Terre pourrait constituer un trésor de découvertes scientifiques. Cependant, malgré les avancées majeures des technologies de séquençage du génome, seule une infime partie des espèces dispose d'informations génomiques. Afin de contribuer à la constitution d'une base de données génomiques planétaires plus complète, des scientifiques et des institutions du monde entier se sont unis dans le cadre du Earth BioGenome Project (Projet BioGénome de la Terre, EBP), qui prévoit de séquencer et d'assembler des génomes de référence de haute qualité pour l'ensemble des quelque 1,5 million d'espèces eucaryotes connues. Alors que l'initiative passe à la phase II, au cours de laquelle 150 000 espèces doivent être séquencées en seulement quatre ans, la participation mondiale au projet sera essentielle à sa réussite. Branche européenne de l'EBP, l'European Reference Genome Atlas (Atlas Européen de Génomes de Référence, ERGA) cherche à mettre en œuvre un nouveau modèle décentralisé, accessible, équitable et inclusif de production de génomes de référence de haute qualité, et transmettra les informations à l'EBP au fur et à mesure de sa progression. Pour se lancer dans cette mission, l'ERGA a lancé un projet pilote visant à établir un réseau à travers l'Europe afin de développer et de tester la première infrastructure de ce type pour la production coordonnée et distribuée de génomes de référence sur 98 espèces eucaryotes européennes à partir d'échantillons provenant de 34 pays européens. Nous décrivons ici le processus et les défis rencontrés lors du développement d'une infrastructure pilote pour la production de ressources génomiques de référence, et explorons l'efficacité de cette approche en termes de production de génomes de référence de haute qualité, en tenant compte également de l'équité et de l'inclusion. Les résultats et les enseignements tirés de ce projet pilote constituent une base solide pour l'ERGA, tout en offrant des enseignements clés à d'autres projets transnationaux et nationaux visant à établir de nouvelles ressources génomiques.

German: Eine globale Genomdatenbank fur die gesamte Artenvielfalt der Erde konnte eine Schatzkiste fur wissenschaftliche Entdeckungen darstellen. Trotz grosser Fortschritte bei den Technologien zur Genomsequenzierung liegen bislang allerdings nur fur einen winzigen Bruchteil der Arten Informationen ihres gesamten Erbgutes vor. Um zu einer umfassenderen weltweiten Genomdatenbank beizutragen, haben sich Wissenschaftler*innen und Institutionen aus aller Welt im Earth BioGenome Project (EBP) zusammengeschlossen, das schrittweise die Sequenzierung und Assemblierung hochwertiger Referenzgenome fur alle ca. 1,5 Millionen bekannten eukaryontischen Arten plant. Wahrend die Initiative in Phase II ubergeht, in der innerhalb von nur vier Jahren 150.000 Arten sequenziert werden sollen, wird eine weltweite Beteiligung am Projekt von grundlegender Bedeutung fur den Erfolg sein. Der Europaische Referenzgenom-Atlas (ERGA) stellt den europaischen Knotenpunkt des EBP dar und soll ein neues dezentrales, leicht zugangliches, faires und integratives Modell fur die Erstellung hochwertiger Referenzgenome zur Verfugung stellen, welches das EBP bei seiner Ausweitung inhaltlich unterstutzen wird. Zu diesem Zweck hat ERGA ein Pilotprojekt fur ein europaweites Netzwerk gestartet und die erste Infrastruktur ihrer Art fur eine koordinierte und dezentrale Produktion von Referenzgenomen fur 98 eukaryontische europaische Arten entwickelt und getestet, wobei Proben durch Projektbeteiligte aus 34 europaischen Landern geliefert wurden. In diesem Artikel werden Prozesse und Herausforderungen beschrieben, die sich bei der Entwicklung einer Pilotinfrastruktur zur Erstellung von Referenzgenomressourcen ergeben haben, sowie die Wirksamkeit dieses Ansatzes fur eine qualitativ hochwertige Referenzgenom-Erstellung - unter Berucksichtigung von Fairness und Einbindung - untersucht. 
Die Ergebnisse und Erfahrungen aus diesem Pilotprojekt bilden eine solide Grundlage fur ERGA und stellen gleichzeitig wichtige Erkenntnisse fur andere transnationale und nationale Projekte zur Erarbeitung genomischer Ressourcen dar. Greek: Η δημιουργία μιας Bάσης γονιδιωματικών δεδομένων για το σύνολο των ειδών του πλανήτη μας θα αποτελέσει μοναδικό θησαυρό από τον οποίο θα προκύψει πλήθος επιστημονικών ανακαλύψεων. Ωστόσο, παρά τη σημαντική πρόοδο στις τεχνολογίες προσδιορισμού της αλληλουχίας των γονιδιωμάτων, τα διαθέσιμα γονιδιώματα προέρχονται από πολύ μικρό ποσοστό των ειδών του πλανήτη μας. Για τη δημιουργία μιας πληρέστερης Bάσης γονδιωματικών δεδομένων σε παγκόσμιο επίπεδο, επιστήμονες και ιδρύματα από όλο τον κόσμο έχουν ενώσει τις δυνάμεις τους στο πλαίσιο του Earth BioGenome Project (EBP), το οποίο σχεδιάζει σταδιακά να αλληλουχήσει και να συγκεντρώσει υψηλής ποιότητας γονιδιώματα αναφοράς για το σύνολο των περίπου 1,5 εκατομμύριων αναγνωρισμένων ευκαρυωτικών ειδών της Γης. Καθώς το έργο αυτό εισέρχεται στη δεύτερη φάση του (ΦΑΣΗ ΙΙ), κατά την οποία πρόκειται να αλληλουχηθούν τα γονιδιώματα από 150.000 είδη σε χρονικό διάστημα μόλις τέσσερα χρόνια, η συμμετοχή ερευνητών από όλο τον κόσμο κρίνεται καταλυτική για την επιτυχία του εγχειρήματος. O Ευρωπαϊκός Άτλας Γονιδιωμάτων Αναφοράς (European Reference Genome Atlas, ERGA) που αποτελεί τον ευρωπαϊκό κόμBο του EBP, επιδιώκει να εφαρμόσει ένα νέο αποκεντρωμένο, προσBάσιμο, δίκαιο και περιεκτικό μοντέλο για την παραγωγή υψηλής ποιότητας γονιδιωμάτων αναφοράς, το οποίο, όσο προχωράει, θα επικαιροποιεί το EBP. Για τον σκοπό αυτό, το ERGA ξεκίνησε ένα πιλοτικό έργο με στόχο τη δημιουργία ευρωπαϊκού δικτύου για την ανάπτυξη και την εφαρμογή της πρώτης, στο είδος της, υποδομής με στόχο τη συντονισμένη παραγωγή γονιδιωμάτων αναφοράς από 98 ευρωπαϊκά ευκαρυωτικά είδη. Τα δείγματα των ειδών αυτών προέρχονται από φορείς συλλογών που εδρεύουν σε 34 διαφορετικές ευρωπαϊκές χώρες. 
Στο άρθρο αυτό περιγράφουμε τη διαδικασία και τις προκλήσεις που αντιμετωπίσαμε κατά την ανάπτυξη αυτής της πιλοτικής υποδομής για την παραγωγή υψηλής ποιότητας γονιδιωμάτων αναφοράς, και διερευνούμε την αποτελεσματικότητα αυτής της προσέγγισης, λαμBάνοντας υπόψη επίσης τη δίκαιη συμμετοχή και την ενσωμάτωση. Τα αποτελέσματα και τα διδάγματα που αντλήθηκαν κατά τη διάρκεια αυτού του πιλοτικού προγράμματος παρέχουν μια σταθερή Bάση για το ERGA ενώ παράλληλα προσφέρουν τις Bασικές γνώσεις για άλλα διακρατικά και εθνικά έργα παραγωγής γονιδιωματικών υποδομών. Irish: D'fheadfadh bunachar sonrai d'eagsulacht speiceas domhanda a bheith ina thaisce d'fhionnachtana eolaiochta. Ce go bhfuil dul chun cinn ollmhor deanta i dteicneolaiocht sheiceamhu geanom nil eolas geanomaioch ar fail ach do lion fiorbheag de speiceas. Ta eolaithe agus institiuid ar fud na cruinne ag comhoibriu faoi bhratach an tionscnaimh EarthBioGenome Project (EBP) ar mhaithe le bheith ag cur le bunachar sonrai geanomaioch domhanda ata nios iomlaine. Ta se mar aidhm ag an EBP geanoim thagartha d'ardchaighdean a sheicheamhu agus a chur le cheile do gach ceann de na ~1.5 milliun speiceas eocarotach aitheanta tri phroiseas ceim ar cheim. De reir mar a bhogann an tionscnamh ar aghaidh go Ceim II beidh se riachtanach do rath an tionscnaimh go mbeidh rannphairtiocht domhanda toisc go ndeanfar seicheamhu ar 150,000 speiceas taobh istigh de cheithre bliana. Mar larphointe Eorpach an EBP, ta se mar aidhm ag Atlas Geanom Tagartha na hEorpa (ERGA) samhail ata dilaraithe, inrochtana, cothromasacha agus ionchuimsitheach chur i bhfeidhm maidir le geanoim thagartha a chur ar fail. Deanfaidh se seo eolas a thabhairt don EBP de reir mar a mheadaionn se. Le tus a chur leis an aistear seo, chuir an ERGA tus le treoirthionscadal ar mhaithe le lionra a bhunu fud fad na hEorpa. 
Sprioc an treoirthionscadail na chun an chead bhonneagar da leitheid a fhorbairt agus a thastail le haghaidh tairgeadh geanoim tagartha comhordaithe ar 98 speiceas eocarotach Eorpach o sholathrai samplacha thar 34 tir Eorpach. Deanann muid cur sios anseo ar an bproiseas agus na dushlain a bhaineann le bonneagar piolotach a fhorbairt ar mhaithe le hacmhainni geanoim thagartha a thairgeadh. Anuas air sin deanfar cioradh ar eifeachtacht an cur chuige seo maidir le tairgeadh geanoim thagartha d'ardchaighdean le tracht deanta do chothromas agus ionchuimsitheacht. Tugann na torthai agus na ceachtanna a d'fhoghlaimiodh i rith an phiolota seo bunchloch laidir d'ERGA, ag an am ceanna tugann se eochairphointi foghlama go tionscadail acmhainni geanomaiochta naisiunta agus trasnaisiunta. Italian: Un database di genomi che rappresenti tutta la biodiversita globale potrebbe essere un tesoro di scoperte scientifiche. Tuttavia, nonostante gli enormi progressi nelle tecnologie di sequenziamento del genoma, solo una piccola frazione delle specie note dispone di informazioni genomiche. Per contribuire a un database genomico planetario piu completo, scienziati e istituzioni di tutto il mondo si sono uniti sotto l'egida dell'Earth BioGenome Project (EBP), che prevede di sequenziare e assemblare genomi di riferimento di alta qualita per tutte le circa 1,5 milioni di specie eucariotiche note. La partecipazione mondiale al progetto sara fondamentale per il successo della Fase II, in cui si prevede di sequenziare 150.000 specie nei prossimi quattro anni. In quanto nodo europeo dell'EBP, l'European Reference Genome Atlas (ERGA) cerca di implementare un nuovo modello decentralizzato, accessibile, equo e inclusivo per la produzione di genomi di riferimento di alta qualita per le specie europee, che aiutera a informare gli sforzi dell'EBP. 
A questo scopo, ERGA ha lanciato un progetto pilota per sviluppare e testare la prima infrastruttura per la produzione coordinata e distribuita di genomi di riferimento su 98 specie eucariotiche europee da 34 paesi. Descriviamo il processo e le sfide affrontate durante lo sviluppo dell'infrastruttura ed esploriamo l'efficacia di questo approccio in termini di produzione di genoma di riferimento di alta qualita. I risultati e le lezioni apprese durante questo progetto pilota forniscono una solida base per ERGA, offrendo allo stesso tempo insegnamenti chiave ad altri progetti dedicati alla produzione di risorse genomiche nazionali e transnazionali. Latvian: Globāla genoma datu bāze ar informāciju par visu sugu daudzveidību, vecinātu zinātniskos atklājumus un pētniecību. Tomēr, neskatoties uz genoma sekvenēsanas tehnoloģiju ievērojamo attīstību, tikai nelielai daļai sugu ir pieejama genomiskās sekvences informācija. Lai veicinātu pilnīgāku globālo genomisko datu bāzi, zinātnieki un iestādes visā pasaulē ir apvienojusies Pasaules BioGenoma projekta (Earth BioGenome Project - EBP) ietvaros, kas, izmantojot pakāpenisku pieeju, plāno izveidot un apvienot augstas kvalitātes references genomus visām ~ 1,5 miljoniem eikariotiskām sugām. Pasreiz sākās EBP iniciatīvas otrā posma, kurā paredzēts sekvencējot 150 000 sugu genomus tikai četru gadu laikā, un visas pasaules dalība projektā būs būtiska, lai gūtu panākumus. Eiropas references genoma atlants (European Reference Genome Atlas - ERGA), kā EBP Eiropas pārstāvis plāno ieviest jaunu decentralizētu, pieejamu, taisnīgu un atvērtu modeli augstas kvalitātes references genomu razosanai. Lai izpildītu so misiju, ERGA izveidoja Eiropas pētniecības tīklu, izstrādāja un pārbaudīja pirmo sāda veida koordinētu un sadalītu genoma sekvenēsanas infrastruktūru un uzsāka izmēģinājuma projektu, sekvencējot 98 Eiropas eikariotisko sugu paraugus no 34 Eiropas valstīm. 
Seit mēs aprakstam references genoma sekvenēsanas procesu un problēmas, kas radusās infrastruktūras izstrādes laikā, un pētām sīs pieejas efektivitāti augstas kvalitātes genoma razosanā, ņemot vērā arī taisnīgumu atvērtību un iesaisti. Sā izmēģinājuma gaitā gūtie rezultāti un gūtā pieredze ir stabils pamats ERGA, vienlaikus piedāvājot pamatzināsanas citiem starptautiskiem un valstu genoma sekvencēsanas pētījumiem. Lithuanian: Pasaulinė visų Zemės rūsių įvairovės genomų duomenų bazė galėtų tapti mokslinių atradimų lobiu. Tačiau, nepaisant didziulės genomų sekų nustatymo technologijų pazangos, siuo metu nuskaityta tik nedidelės dalies rūsių genominė informacija. Siekdami prisidėti prie visų planetoje esančių organizmų genomų duomenų bazės kūrimo, mokslininkai ir institucijos visame pasaulyje susivienijo į Zemės biogenomo projektą (angl. Earth BioGenome Project, EBP), kuriuo planuojama susekvenuoti ir palaipsniui sukaupti aukstos kokybės etaloninius visų ~1,5 mln. pripazintų eukariotų rūsių genomus. Iniciatyvai pereinant į II etapą, kuriame per ketverius metus turi būti susekvenuota 150 000 rūsių genomų, įvairių pasaulio salių atstovų dalyvavimas projekte bus labai svarbus sėkmei uztikrinti. Europos etaloninių genomų atlaso (ERGA) iniciatyva, kuri yra vienas pagrindinių EBP Europos centrų, siekia įgyvendinti naują decentralizuotą, prieinamą, teisingą ir įtraukų aukstos kokybės etaloninių genomų kūrimo modelį, kuriuo bus remiamasi plečiant EBP. Siai misijai pradėti ERGA inicijavo bandomąjį projektą, kurio tikslas - sukurti tinklą visoje Europoje ir isbandyti pirmąją tokio pobūdzio infrastruktūrą, skirtą koordinuoti ir paskirstyti 98 Europos eukariotų rūsių etaloninių genomų nuskaitymą is mėginių surinktų 34 Europos salyse. 
Čia aprasome procesą ir issūkius su kuriais susidurta kuriant bandomąją etaloninių genomų isteklių kūrimo infrastruktūrą, bei nagrinėjame sio metodo veiksmingumą siekiant aukstos kokybės etaloninių genomų kūrimo, atsizvelgdami ir į lygiateisiskumą bei įtrauktį. Sio bandomojo projekto rezultatai ir ismoktos pamokos suteikia tvirtą pagrindą ERGA ir kartu suteikia svarbios patirties kitiems tarptautiniams ir nacionaliniams genomų isteklių projektams. Dutch: Een wereldwijde genoom database gevuld met de complete diversiteit aan soorten op aarde kan een schatkamer voor wetenschappelijke ontdekkingen vormen. Ondanks de grote vooruitgang in genoom sequencing-technieken is er momenteel slechts genoom data beschikbaar voor een minuscule fractie van alle soorten op aarde. Om bij te dragen aan een meer complete genoom database van deze planeet zijn wetenschappers en instituten uit de hele wereld samengekomen in het Earth BioGenome Project (EBP). Dit project heeft als doel het sequensen en samenvoegen van hoge kwaliteit referentie genomen voor alle ~1,5 miljoen bekende eukaryote soorten in een stapsgewijze aanpak. Het initiatief gaat momenteel over naar fase II, waarbij in slechts 4 jaar tijd de genoomsequentie van 150.000 soorten moet worden bepaald. Hierbij is wereldwijde deelname cruciaal voor succes. De Europese tak van het EBP, de Europese Referentie Genoom Atlas (ERGA), heeft tot doel het implementeren van een nieuw, gedecentraliseerd, toegankelijk, rechtvaardig en inclusief model voor het produceren van hoge kwaliteit referentie genomen, wat zal bijdragen aan de EBP wanneer het opschaalt. Om dit te realiseren heeft ERGA een proefproject gelanceerd. Hierin is een Europees netwerk ingericht voor het ontwikkelen en testen van de eerste infrastructuur van zijn soort voor de gecoordineerde en gedecentraliseerde productie van referentie genomen van 98 Europese eukaryote soorten verzameld in 34 Europese landen. 
Hier schetsen we het proces en de uitdagingen die we tegenkwamen tijdens het ontwikkelen van deze proef-infrastructuur voor het produceren van referentie genomen en -databases, en onderzoeken we de effectiviteit van deze aanpak aangaande de productie van hoge kwaliteit referentie genomen, waarbij de rechtvaardigheid en inclusie ook zijn meegenomen. De uitkomsten en geleerde lessen tijdens het proefproject vormen een solide onderbouwing voor ERGA en bieden tegelijkertijd een aantal belangrijke lessen die ook van toepassing zijn op andere transnationale en nationale projecten rond het beschikbaar maken van genoom data. Polish: Ogolnoświatowa baza danych zawierająca w sobie dane genomowe wszystkich gatunkow żyjących na Ziemi, byłaby skarbnicą wiedzy dla przyszłych badaczy. Pomimo dużych postępow w rozwoju technologii sekwencjonowania, dane genomowe są ogolnodostępne tylko dla niewielkiej część gatunkow. W celu stworzenia bardziej kompletnej międzynarodowej bazy danych genomicznych, naukowcy i instytucje z całego świata zjednoczyli się w ramach projektu Earth BioGenome Project (EBP), ktory planuje stopniowe sekwencjonowanie i składanie wysokiej jakości genomow referencyjnych dla wszystkich ok. 1,5 miliona znanych gatunkow eukariotycznych. W najbliższym czasie inicjatywa przechodzi do fazy II, w ktorej w ciągu zaledwie czterech lat mają zostać zsekwencjonowane genomy 150 000 gatunkow. W związku z tym międzynarodowe zaangażowanie będzie miało fundamentalne znaczenie. Europejski węzeł EBP, o nazwie Europejski Atlas Genomow Referencyjnych (ang. European Reference Genome Atlas; ERGA) ma na celu wdrożenie nowego, zdecentralizowanego, sprawiedliwego i dostępnego dla wszystkich modelu sekwencjonowania i składania wysokiej jakości genomow referencyjnych, a także stopniowe przekazywanie tych informacji do EBP. W celu realizacji przyjętej misji, ERGA uruchomiła projekt pilotażowy (ang. pilot project) zmierzający do utworzenia w całej Europie sieci wspołpracy. 
Projekt ten będzie polegał na opracowaniu i przetestowaniu zastosowania rozproszonej infrastruktury do skoordynowanego zsekwencjonowania i składania genomow referencyjnych dla 98 europejskich gatunkow eukariotycznych, zebranych przez badaczy z 34 europejskich krajow. W niniejszej pracy przedstawiamy proces i wyzwania przed ktorymi stoimy podczas rozwijania tego projektu. Analizujemy także jego skuteczność do generowania wysokiej jakości genomow referencyjnych, mając na uwadze także rowność szans i inkluzywność. ERGA ma nadzieję, że wyniki i wnioski wyciągniete z realizacji projektu pilotażowego będą cenne nie tylko dla tej inicjatywy, ale także że będą cennymi wskazowkami w realizacji podobnych projektow o zasięgu krajpowym i międzynarodowym w przyszłości. Portuguese: Uma base de dados genomica de toda a diversidade de especies da Terra podera ser um tesouro de descobertas cientificas. No entanto, independentemente dos grandes avancos nas tecnologias de sequenciacao de genomas, apenas uma pequena fracao das especies tem informacao genomica disponivel. Para contribuir para uma base de dados genomica planetaria mais completa, cientistas e instituicoes de todo o mundo uniram-se no Earth BioGenome Project (EBP), que planeia sequenciar e gerar genomas de referencia de alta qualidade para todas as cerca de 1,5 milhoes de especies eucarioticas conhecidas, atraves de uma abordagem gradual e faseada. A medida que a iniciativa transita para a Fase II, na qual 150.000 especies serao sequenciadas em apenas quatro anos, a participacao de cientistas e instituicoes de todo o mundo sera fundamental para o seu sucesso. Como no Europeu do EBP, o European Reference Genome Atlas (ERGA) procura implementar um novo modelo descentralizado, acessivel, equitativo e inclusivo para a producao de genomas de referencia de alta qualidade, que informara o EBP enquanto este cresce. 
Para embarcar nesta missao, o ERGA lancou um Projeto Piloto para estabelecer uma rede atraves da Europa para desenvolver e testar a primeira infraestrutura deste tipo, para a producao coordenada e distribuida de genomas de referencia de 98 especies eucarioticas europeias, a partir de doadores de amostras de 34 paises europeus. Aqui descrevemos o processo e os desafios enfrentados durante o desenvolvimento de uma infraestrutura piloto para a producao de recursos genomicos de referencia e exploramos a eficacia desta abordagem em termos de producao de genomas de referencia de alta qualidade, considerando tambem a equidade e a inclusao. Os resultados e licoes aprendidas durante este piloto fornecem uma base solida para o ERGA, e conhecimento importante para a implementacao de outros projetos de recursos genomicos transnacionais e nacionais. Romanian: O bază de date genomică globală, cu toată diversitatea speciilor de pe Pămant ar putea fi o comoară de descoperiri științifice. Cu toate acestea, in ciuda progreselor majore in tehnologiile de secvențiere genomică, doar o mică parte din specii au informații genomice disponibile. Pentru a contribui la completarea bazei de date genomice planetare, oamenii de știință și instituțiile din intreaga lume s-au unit in cadrul Proiectului Earth BioGenome (EBP), care intenționează să secvențieze și să asambleze genomuri de referință de inaltă calitate pentru toate ~1,5 milioane de specii de eucariote recunoscute printr-o abordare treptată. Pe măsură ce inițiativa trece la Faza II, unde 150.000 de specii urmează să fie secvențiate in doar patru ani, participarea la nivel mondial la acest proiect va fi fundamentală pentru succes. In calitate de nod european al EBP, Atlasul European al Genomurilor de Referință (ERGA) incearcă să implementeze un nou model descentralizat, accesibil, echitabil și incluziv pentru producerea de genomuri de referință de inaltă calitate. 
Pentru a se angaja in această misiune, ERGA a lansat un proiect pilot pentru a stabili o rețea in intreaga Europă pentru a dezvolta și testa prima infrastructură de acest gen pentru producția coordonată și distribuită de genomuri de referință pe 98 de specii eucariote europene colectate de furnizorii de mostre din 34 de țări europene. Aici descriem procesul și dificultățile cu care s-a confruntat dezvoltarea infrastructurii pilot necesară pentru obținerea de genomuri de referință, și explorăm eficacitatea acestei abordări in ceea ce privește producția de genomuri de referință de inaltă calitate, luand in considerare atat echitatea, cat și incluziunea. Rezultatele și lecțiile invățate in timpul acestui proiect pilot oferă o bază solidă pentru ERGA, oferind in același timp lecții cheie altor proiecte transnaționale și naționale de resurse genomice. Slovakian: Globalna databaza genomov vsetkych druhov na Zemi by mohla byť mimoriadne vyznamnym zdrojom mnohych vedeckych objavov. Napriek veľkemu pokroku v technologiach sekvenovania, informacie o celych genomoch su k dispozicii stale len u veľmi malej časti druhov. S cieľom prispieť ku kompletnejsej databaze genomov sa vedci a institucie z celeho sveta spojili v ramci projektu Earth BioGenome Project (EBP), ktory planuje sekvenovať a zostaviť vysokokvalitne referenčne genomy pre vsetkych ~1,5 miliona znamych eukaryotickych druhov prostrednictvom postupneho, fazoveho pristupu. Keďze iniciativa prechadza do Fazy II, v ktorej sa maju sekvenovať genomy 150 000 druhov v priebehu 4 rokov, zasadna bude pre uspech projektu celosvetova učasť. Konzorcium European Reference Genome Atlas (ERGA) ako europsky uzol EBP sa snazi zaviesť novy, decentralizovany, dostupny, spravodlivy a inkluzivny model produkcie vysokokvalitnych referenčnych genomov, ktory poskytne informacie EBP, ako cieľ Fazy II efektivne dosiahnuť. 
Aby sa konzorcium ERGA mohlo pustiť do tejto misie, spustili sme pilotny projekt na vytvorenie europskej siete s cieľom vyvinuť a otestovať prvu infrastrukturu svojho druhu na koordinovanu a distribuovanu produkciu referenčnych genomov 98 europskych druhov od poskytovateľov vzoriek z 34 europskych krajin. V tejto studii uvadzame proces a vyzvy, ktorym sme čelili počas vyvoja pilotnej infrastruktury a hodnotime učinnosť tohto pristupu z hľadiska produkcie vysokokvalitnych referenčnych genomov, pričom zohľadňujeme aj spravodlivosť a inkluziu. Vysledky a skusenosti ziskane počas tohto pilotneho projektu poskytuju dolezity zaklad pre fungovanie konzorcia a zaroveň ponukaju kľučove poznatky pre ine nadnarodne a narodne projekty vyuzivajuce genomicke data. Slovenian: Globalna zbirka podatkov o genomih vseh vrst na Zemlji bi lahko bila zakladnica znanstvenih odkritij. Kljub velikemu napredku na področju tehnologij sekvenciranja genomov pa imamo trenutno na razpolago genomske podatke le za majhno stevilo vrst. Da bi prispevali k popolnejsi planetarni podatkovni bazi genomov, so se znanstveniki in institucije po vsem svetu zdruzili v projektu Earth BioGenome Project (EBP), katerega cilj je postopno sekvenciranje in sestavljanje visokokakovostnih referenčnih genomov za priblizno 1,5 milijona priznanih evkariontskih vrst. Ker pobuda prehaja v drugo fazo, v kateri naj bi v stirih zaporednih letih sekvencirali 150.000 vrst, bo za njen uspeh ključno globalno sodelovanje. Evropski referenčni genomski atlas (ERGA) kot evropsko sredisče projekta EBP zeli vzpostaviti nov decentraliziran, dostopen, pravičen in vključujoč model za zagotavljanje visokokakovostnih referenčnih genomov, ki bo podpiral nadaljnje razsirjene pobude projekta EBP. 
Konzorcij ERGA je začel izvajati pilotni projekt za vzpostavitev vseevropskega omrezja za razvoj in prvo testiranje dostopne infrastrukture za usklajeno in porazdeljeno sekvenciranje referenčnih genomov za 98 evropskih evkariontskih vrst, katerih vzorce so predlozili raziskovalci iz 34 evropskih drzav. V nadaljevanju opisujemo postopek in izzive, s katerimi smo se soočili med pilotnim preizkusom infrastrukture za referenčne genome, hkrati pa ocenjujemo učinkovitost tega pristopa pri določanju visokokakovostnih referenčnih genomov ob upostevanju pravičnosti in vključenosti. Rezultati in izkusnje, pridobljene pri tem pilotnem projektu, so trdna podlaga za nadaljnje delo konzorcija ERGA, hkrati pa ponujajo ključne izkusnje za druge mednarodne in nacionalne projekte na področju analize genomov. Spanish: Una base de datos global de genomas de toda la diversidad de especies de la Tierra podria ser un tesoro de descubrimientos cientificos. Sin embargo, independientemente de los grandes avances en las tecnologias de secuenciacion, tan solo una pequena fraccion de las especies tiene informacion genomica disponible. Para contribuir a una base de datos genomica planetaria mas completa, cientificos e instituciones de todo el mundo se han unido bajo el Proyecto Earth BioGenome (EBP), el cual planea secuenciar y ensamblar genomas de referencia de alta calidad para las ~1,5 millones de especies eucariotas reconocidas a traves de una aproximacion por fases. A medida que esta iniciativa entre en la Fase II, en la que se secuenciaran 150.000 especies en tan solo cuatro anos, la participacion mundial en el proyecto sera fundamental para su exito. Como nodo europeo de la EBP, el Atlas Europeo de Genomas de Referencia (ERGA) busca implementar un nuevo modelo descentralizado, accesible, equitativo e inclusivo para producir genomas de referencia de alta calidad, el cual informara a la EBP a medida que vaya escalando su produccion. 
Para embarcarse en esta mision, ERGA lanzo un proyecto piloto con la intencion de establecer una red en toda Europa con el fin de desarrollar y probar la primera infraestructura de este tipo destinada a la produccion coordinada y distribuida de genomas de referencia en 98 especies eucariotas europeas, procedentes de 34 paises europeos proveedores. Aqui describimos el proceso y los desafios a los que nos hemos enfrentado durante el desarrollo de una infraestructura piloto para la produccion de recursos genomicos de referencia, y exploramos la efectividad de este enfoque en terminos de produccion de genomas de referencia de alta calidad, considerando tambien la equidad y la inclusion. Los resultados y las lecciones aprendidas durante este piloto constituyen una base solida para ERGA, al tiempo que ofrecen aprendizajes clave para otros proyectos de recursos genomicos transnacionales y nacionales. Swedish: En global databas over jordens alla arters hela genom (arvsmassa) skulle utgora en veritabel skattkista for vetenskapliga upptackter. Men trots stora teknologiska framsteg inom DNA-sekvensering sa har hittills bara en brakdel av alla arters hela genom sekvenserats. For att bidra till en mer komplett planetar genomdatabas har forskare och institutioner varlden over gatt samman i Earth Biogenome Project (EBP), ett projekt som med en stegvis strategi planerar att sekvensera och satta samman hogkvalitativa referensgenom for alla ~1.5 miljoner kanda arter av Eukaryoter. I nasta fas ar malet att sekvensera 150 000 arter pa bara fyra ar, och for att na dit kravs storskaligt globalt engagemang. European Reference Genome Atlas (ERGA), Europas nod av EBP, utvecklar nu en ny decentraliserad, oppen, rattvis och inkluderande modell for produktion av hogkvalitativa referensgenom, en modell med stor relevans for EBPs malsattning. 
Som start lanserade ERGA ett pilotprojekt med syfte att etablera ett europeiskt natverk for att utveckla och testa denna forsta infrastruktur av sitt slag. Piloten innefattade koordinering av en distribuerad produktion av referensgenom for 98 europeiska arter fran 34 olika europeiska lander. Har beskriver vi processen och utmaningarna for utvecklingen av pilot-infrastrukturen och utvarderar effektiviteten vad galler produktion av hogkvalitativa referensgenom, beaktande saval inklusivitet som rattviseaspekter. Resultaten och lardomarna fran pilotprojektet utgor en stabil grund for ERGA med stor nytta aven for andra nationella och internationella genomresursprojekt. Icelandic: Almennur gagnagrunnur erfeamengjasem spannar liffraeeilegan fjolbreytileika jarear vaeri sannkollue fjarsjoeskista visindalegra upplysinga. pratt fyrir miklar framfarir i raegreiningum erfeamengja eingongu erfeamengaraeir litils brots af ollum tegundum aegengilegar. Til ae fa heilsteyptari gagnagrunn yfir erfeamengi lifvera a joreinni hafa visindamenn og stofnanir vieaum heim sameinast i Earth BioGenome Project (EBP), sem stefnir ae pvi ae raegreina og kortleggja hagaeea-viemieunarerfeamengi fyrir allar paer ~1,5 miljonir tegundir heilkjornunga sem lyst hefur verie, meie skipuloeum haetti. par sem annar afangi verkefnisins (e.Phase II), par sem 150.000 tegundir verea raegreindar a aeeins fjorum arum, er nu ae hefjast er almenn og alpjoeleg pattaka mikilvaeg til ae arangur naist . Evropuhluta EBP verkefnisins, evropska viemieunarerfeamengja-atlasinum (e. the European Reference Genome Atlas (ERGA)), er aetlae ae utfaera nyja aefere til ae setja saman hagaeea-viemieunarerfeamengi mee dreiferi patttoku og almennu aegengi par sem jafnraeei er tryggt, sem mun miela upplysingum til EBP jafnoeum. Til ae framfylgja pessu hefur ERGA sett af stae forverkefni (e. 
a Pilot Project) sem byggir a samstarfsneti sem spannar alla Evropu, til ae proa og profa slika innviei fyrir greiningu og mielun viemieunarerfeamengja fyrir 98 tegundir evropskra heilkjornunga sem safnae var af patttakendum fra 34 Evropulondum. Her greinum vie fra aefereafraeeinni og askorunum sem vie stoeum frammi fyrir vie ae proa pessa innviei og hvernig vie fundum leieir til ae safna pessum viemieunarerfeamengjum. Auk pess metum vie hversu vel pae gekk m.t.t. jafnraeeis og virkrar patttoku allra patttakenda. paer nieurstoeur og sa laerdomur sem vie hofum aflae i pessu forverkefni leggur goean grunn ae ERGA auk pess ae miela grunnpekkingu til annarra alpjoelegra og landsbundinna erfeamengjaverkefna. Norwegian: En global genomdatabase over hele jordens artsmangfold kunne vaert en skattkiste for vitenskapelige oppdagelser. Men til tross for store fremskritt innen genomsekvenseringsteknologi er det kun en svaert liten del av artene som har genomisk informasjon tilgjengelig. For a bidra til en mer komplett verdensomspennende genomdatabase har forskere og institusjoner over hele verden gatt sammen i Earth BioGenome prosjektet (EBP), som planlegger a sekvensere og assemblere referansegenomer av hoyeste kvalitet for alle ~1,5 millioner anerkjente eukaryote artene, gjennom en trinnvis tilnaerming. Initiativet gar na over i den andre fasen, hvor 150 000 arter skal sekvenseres i lopet av kun fire ar, og verdensomfattende deltakelse i prosjektet er derfor avgjorende for a lykkes. European Reference Genome Atlas (ERGA), som er den europeiske grenen i EBP, har som mal a implementere en ny desentralisert, tilgjengelig, rettferdig og inkluderende modell for produksjonen av referansegenomer av hoyeste kvalitet, som vil informere EBP samtidig som den utarbeides. 
For a ta fatt pa dette oppdraget lanserte ERGA et pilotprosjekt for a etablere et nettverk over hele Europa som skal utvikle og teste den forste infrastrukturen i sitt slag for koordinert og distribuert referansegenomproduksjon for 98 europeiske eukaryote arter. Disse provene kommer fra samarbeid med partnere i 34 europeiske land. Her skisserer vi prosessen og utfordringene under utviklingen av en pilotinfrastruktur for produksjonen av referansegenomressurser, og undersoker hvor effektiv denne tilnaermingen er nar det gjelder produksjon av referansegenomer av hoyeste kvalitet, samtidig som vi tar hensyn til rettferdighet og inkludering. Resultatene og erfaringene fra dette pilotprosjektet danner et solid grunnlag for ERGA, samtidig som det gir viktige erfaringer til andre transnasjonale og nasjonale genomiske ressursprosjekter. Faroese: Ein heimsfevnandi genomdatugrunnur vie ollum livveru slogum a joreini, kundi verie ein dyrgripur av visindaligum gjognumbrotum. Men tiverri er hetta ikki veruleiki enn. Hoast stora framgongd innan genom tokni, so er tae bert ein litil brotpartur av ollum livveru slogum, ie eru genom kanna. Fyri at faa gongd a ein slikan genomdatugrunn hava visindafolk og stovnar runt allan heimin skipa seg undir heitinum Earth BioGenome Project (EBP). Hetta er ein verkaetlan ie hevur til endamals at framleiea hagoesku tilvisingargenom, fyri tey aleie 1.5 millionir kendu livveru slogini. Verkaetlanin naerkast nu oerum stigi, har tilvisingargenom fyri 150,000 livveru slog skulu framleieast uppa fyra ar. Fyri at klara hesa storu uppgavu, er neyeugt at allur heimurin tekur lut i hesi verkaetlan. European Reference Genome Atlas (ERGA), ie er tann europeiski parturin av verkaetlanin, er i holt vie at gera eina forskrift fyri hvussu vit framleiea hagoesku tilvisingargenom. Hesin frymil tekur haedd fyri miespjaean, atkomuligheit, raettvisi og inklusjon. 
Til hesa uppgavu, hevur ERGA skapa eitt europeiskt netverk, til at menna og royna eitt undirstoeukervi, ie samskipar framleiesluna av tilvisingargenomum fyri 98 europeisk livveru slog fra 34 europeiskum londum. I hesi grein utgreina vit mannagongdir og avbjoeingar, ie toku seg upp ta tilvisingargenom undirstoeukervie var ment, og kanna hvussu effektivt hetta hevur verie i mun til framleieslu av hagoesku tilvisingargenomum, vie raettvisi og inklusjon i huganum. Urtokurnar og vitanin ie er funnin i hesum fyrsta partinum av verkaetlanini, er goeur grundsteinur til vieari menning av ERGA, og gevur tydningarmiklar leiereglur til aerar liknandi verkaetlanir. Hebrew: למאגר מידע גנומי גלובלי של מגוון המינים על פני כדור הארץ יש פוטנציאל להוות אוצר של תגליות מדעיות. עם זאת, למרות ההתקדמות המשמעותית בטכנולוגיות הריצוף הגנומי, קיים מידע גנומי זמין רק לחלק זעיר ממגוון המינים בעולם. על מנת לתרום ליצירת מאגר מידע גנומי מלא של כל כדור הארץ, מדענים ומרכזי מחקר מכל קצוות תבל חברו תחת פרויקט הביוגנום העולמי - EBP - אשר שם לו כמטרה ריצוף והרכבה של גנום יחוס באיכות גבוהה של כל אחד ממיליון וחצי המינים האאוקריוטים המוכרים. מטרה זו תושג באמצעות גישה רב-שלבית מדורגת ומתואמת. עם המעבר לשלב השני של היוזמה, שבמהלכו ירוצף הגנום של 150,000 מינים על פני ארבע שנים בלבד, החשיבות של מעורבות עולמית בפרויקט הופכת להיות מרכזית. אטלס הייחוס הגנומי האירופי (ERGA) מתוכנן להיות הצומת האירופאי של EBP. הוא שואף ליישם גישה מבוזרת, נגישה, שוויונית ומכילה לייצור ריצופי ייחוס גנומיים באיכות גבוהה, שניתן יהיה להגדילה עם הזמן ותוך תיאום עם EBP. כנקודת זינוק למשימה זו, EGRA השיק פרויקט פיילוט שמטרתו לבסס רשת כלל ארופאית שתפתח ותבדוק את התשתית הראשונה מסוגה להפקת רצפי גנום יחוס באופן מתואם ומבוזר. במסגרת הפיילוט הופקו גנומים מלאים עבור 98 מינים אאוקריוטים מאירופה, על בסיס דגימות שהגיעו מ-34 מדינות אירופאיות. אנו מציגים כאן בראשי פרקים גם את התהליך אותו עבר הפרויקט וגם את האתגרים עימם התמודד במהלך פיתוח התשתיות לייצור משאבים גנומיים. 
אנו בוחנים את יעילות הגישה בה נקט ERGA לייצור רצפי ייחוס גנומיים ברמה הגבוהה תוך שאנו גם לוקחים בחשבון הוגנות והכללה. התוצרים שהופקו והלקחים שנלמדו במהלך פרויקט הפיילוט מספקים תשתית יציבה להמשך עבור ERGA, ויכולים גם לשמש כבסיס ידע לפרויקטים גנומיים לאומיים ובינלאומיים אחרים. Serbian: Globalna baza podataka genoma svih vrsta na Zemlji mogla bi predstavljati veoma značajno naučno otkriće. Međutim, bez obzira na veliki napredak u tehnologijama sekvenciranja genoma, samo mali deo vrsta ima dostupne genomske informacije. Da bi doprineli potpunijoj planetarnoj genomskoj bazi podataka, naučnici i institucije sirom sveta su se ujedinili u okviru projekta Earth BioGenome Project (EBP), u okviru kog se planira sekvenciranje i sakupljanje visokokvalitetnih referentnih genoma za svih ~1,5 miliona poznatih eukariotskih vrsta kroz visefazni proces. Kako inicijativa prelazi u fazu II, gde će 150.000 vrsta biti sekvencionirano za samo četiri godine, siroko učesće u projektu biće od sustinskog značaja za uspeh. Kao evropski čvor EBP-a, Evropski referentni atlas genoma (ERGA) nastoji da implementira novi decentralizovan, pristupačan, pravičan i inkluzivan model za generisanje visokokvalitetnih referentnih genoma, koji će doprineti EBP-u. Da bi se upustila u ovu misiju, ERGA je pokrenula Pilot projekat za uspostavljanje mreze sirom Evrope za razvoj i testiranje prve infrastrukture te vrste za koordinisano generisanje i distribuciju referentnih genoma 98 evropskih eukariotskih vrsta obezbeđenih iz 34 evropske zemlje. Ovde prikazujemo proces i izazove sa kojima se suočavamo tokom razvoja pilot infrastrukture za generisanje referentnih genoma, i istrazujemo efikasnost ovog pristupa u smislu određivanje referentnih genoma visokog kvaliteta, uzimajući u obzir i jednakost i inkluziju. Ishodi i lekcije naučene tokom ovog pilot-projekta pruzaju solidnu osnovu za ERGA dok nude ključna znanja drugim transnacionalnim i nacionalnim projektima genoma. 
Ukrainian: Глобальна база даних геномів усього різноманіття видів Землі може стати скарбницею наукових відкриттів. Однак, незважаючи на значні досягнення в технологіях секвенування геному, лише незначна частка видів має наявну геномну інформацію. Щоб зробити свій внесок у створення більш повної планетарної геномної бази даних, вчені та інституції з усього світу об'єдналися в рамках проекту Earth BioGenome Project (EBP), який планує секвенувати та зібрати високоякісні референсні геноми для всіх ~1,5 мільйонів визнаних еукаріотичних видів шляхом підходу поетапного дослідження. Оскільки ця ініціатива переходить у другу фазу, де 150 000 видів мають бути секвеновані всього за чотири роки, залучення учасників з усього світу до проекту буде фундаментальним для його успіху. Як європейський вузол EBP, European Reference Genome Atlas (ERGA) прагне запровадити нову децентралізовану, доступну, справедливу та інклюзивну модель для створення високоякісних референсних геномів, яка інформуватиме EBP у міру його масштабування. Щоб розпочати цю місію, ERGA запустила пілотний проект для створення мережі по всій Європі для розробки та тестування першої інфраструктури такого роду для скоординованого та розподіленого створення референсних геномів 98 європейських еукаріотичних видів від постачальників зразків з 34 європейських країн. Тут ми окреслюємо процес та виклики, з якими ми зіткнулися під час розробки пілотної інфраструктури для створення референсних геномних ресурсів, і досліджуємо ефективність цього підходу з точки зору створення високоякісних референсних геномів, враховуючи також справедливість та інклюзивність. Результати та уроки, отримані під час цього пілотного проекту, створюють фундаментальні підвалини для ERGA, водночас пропонуючи ключові знання для інших транснаціональних та національних проектів з геномних ресурсів. 
Catalan: Una base de dades global que contingui el genoma de la diversitat de totes les especies del planeta Terra podria ser un tresor de descobriments cientifics. Tanmateix, malgrat els grans avencos en la tecnologia de sequenciacio del genoma, nomes es disposa informacio genomica d'una petita fraccio del conjunt de totes les especies. Per contribuir a una base de dades genomica planetaria mes completa, cientifics i institucions de tot el mon s'han unit sota el Earth BioGenome Project (EBP). Aquest projecte, mitjancant un enfoc gradual dividit en diferents fases, te com objectiu sequenciar i fer l'assemblatge de genomes de referencia d'alta qualitat per aproximadament l'1,5 milions de totes les especies eucariotes conegudes. A mesura que la iniciativa passa a la Fase II, on es te previst sequenciar 150.000 especies en nomes quatre anys, la participacio en el projecte de tots els actors a nivell mundial sera fonamental per assolir els objectius. Com a node europeu de l'EBP, l'Atles de Genomes de Referencia Europeu (European Reference Genome Atlas, ERGA) busca implementar un nou model descentralitzat, accessible, equitatiu i inclusiu per produir genomes de referencia d'alta qualitat, que a mesura que s'avanci, anira informant l'EBP. En iniciar-se aquesta missio, l'ERGA va llancar un Projecte Pilot que va establir una xarxa europea per desenvolupar i provar una primera infraestructura d'aquest tipus. En un primer moment es va realitzar la produccio coordinada i la distribucio dels genomes de referencia de 98 especies eucariotes europees, amb proveidors de mostres de 34 paisos europeus. A continuacio es descriuen els processos i els reptes que s'han hagut d'afrontar durant el desenvolupament de la infraestructura del Projecte Pilot per a la produccio de recursos de genomes de referencia, i s'explora l'eficacia d'aquest enfocament en la produccio de genomes de referencia d'alta qualitat, considerant tambe els principis d'equitat i inclusio. 
Els resultats i les llicons apreses durant aquest Projecte Pilot proporcionen una base solida per a l'ERGA, alhora que ofereixen aprenentatges clau per altres projectes de recursos genomics transnacionals i nacionals. Croatian: Globalna baza podataka s genomima svih vrsta na Zemlji biti će riznica znanstvenih otkrića. Međutim, bez obzira na veliki napredak u tehnologiji sekvenciranja genoma, samo mali dio vrsta ima dostupne genomske podatke. Kako bi pridonijeli u stvaranju sto potpunije globale baze genoma, znanstvenici i institucije diljem svijeta ujedinili su se u okviru inicijative Earth BioGenome Project (EBP), kojoj je cilj postepeno sekvencirati i sastaviti visokokvalitetne referentne genome za ~1,5 milijun poznatih eukariotskih vrsta. Kako inicijativa ulazi u drugu fazu, kroz koju će se u samo četiri godine sekvencirati genomi 150 000 vrsta, sirenje sudjelovanja u projektu na cijeli svijet biti će ključno za njegov uspjeh. Kao europski predstavnik EBP-a, inicijativa Europski atlas referentnih genoma (European Reference Genome Atlas, ERGA) nastoji implementirati novi decentralizirani, pristupačan, pravičan i uključiv model za proizvodnju visokokvalitetnih referentnih genoma. Kao prvi korak ove misije, ERGA je pokrenula pilot projekt za uspostavljanje europske mreze s ciljem razvoja i testiranja infrastrukture, prve te vrste, za koordinirano i distribuirano sekvenciranje referentnih genoma na 98 europskih eukariotskih vrsta odabranih od predstavnika iz 34 europske zemlje. Ovdje opisujemo proces i izazove s kojima smo se suočili u pilot projektu tijekom razvoja infrastrukture potrebne za produkciju referentnih genoma, i istrazujemo učinkovitost ovog pristupa za proizvodnju visokokvalitetnih referentnih genoma, uzimajući u obzir jednakost i uključenost. Ishodi i lekcije naučene tijekom ovog pilot-projekta daju čvrstu osnovu za ERGA-u, a istovremeno nude ključna znanja drugim internacionalnim i nacionalnim genomskim projektima. 
Czech: Globalni databaze genomů vsech druhů organismů, zijicich na zemi by byla neocenitelnou pomůckou na cestě ke vědeckym objevům. Bohuzel, i přes vyznamne pokroky v technologiich sekvenace genomů jsou genomicke informace dostupne pouze pro zanedbatelne mnozstvi druhů. Proto, aby bylo mozne vytvořit uplnějsi globalni databazi genomů, se spojili vědci a vyzkumne instituce v ramci projektu Earth BioGenome Project (EBP), jehoz cilem je postupně sekvenovat a zkompletovat vysoce kvalitni referenčni genomy pro vsech ~ 1.5 milionu znamych druhů eukaryotnich organismů. Protoze tato iniciativa nyni vstupuje do druhe faze, během niz by mělo byt v průběhu čtyř let sekvenovano 150,000 druhů, je pro jeji uspěch nezbytna participace vědců z celeho světa. European Reference Genome Atlas (ERGA), jako evropsky uzel EBP, usiluje o vytvořeni decentralizovaneho, dostupneho, spravedliveho a inkluzivniho postupu tvorby kvalitnich referenčnich genomů, ktery by k tomuto usili přispěl. Jako počatek teto mise spustila ERGA pilotni projekt, ktery měl za cil vytvořit celoevropskou siť pro vyvoj a testovani infrastruktury pro koordinovanou a distribuovanou produkci genomů 98 evropskych druhů eukaryot, pochazejicich ze vzorků, dodanych spolupracovniky ze 34 evropskych statů. Zde popisujeme postupy a vyzvy, se kterymi se tato iniciativa setkala při tvorbě pilotni infrastruktury pro produkci referenčnich genomů a posuzujeme efektivitu tohoto přistupu, přičemz bereme v uvahu rovněz spravedlnost a inkluzi. Prakticke vystupy i zkusenosti, ziskane v průběhu teto pilotni studie tvoři solidni zaklad pro dalsi činnost ERGA a nabizi rovněz ziskane zkusenosti dalsim genomickym projektům narodnich i nadnarodnich urovni. Estonian: Ulemaailmne, kogu maakera liigilist mitmekesisust holmav genoomne andmebaas voib kujuneda teaduslike avastuste aardelaekaks. Hoolimata genoomi sekveneerimis- ehk jarjestamistehnoloogia suurtest edusammudest on genoomne teave olemas vaid vaga vaikesel osal liikidest. 
Selleks, et kaasa aidata uha terviklikuma, kogu maakera genoomiandmebaasi loomisele, on teadlased ja institutsioonid ule kogu maailma koondunud Maa biogenenoomimiprojekti (Earth BioGenome Project, EBP) alla, mille raames on kavas jark-jargult jarjestada koigi ~1,5 miljoni teadaoleva eukaruoodi ehk paristuumse liigi kvaliteetsed referentsgenoomid. Kuna algatus on liikumas ule II etappi, milles nelja aasta jooksul jarjestatakse 150 000 liigi genoomid, on ulemaailmne osalus projekti onnestumiseks vaga oluline. EBP Euroopa solmena puuab Euroopa referentsgenoomide atlas (ERGA) rakendada uut, detsentraliseeritud, juurdepaasetavat, oiglast ja kaasavat mudelit kvaliteetsete referentsgenoomide loomiseks. Selle missiooni alustamiseks kaivitas ERGA pilootprojekti, mille raames loodi ule-Euroopaline vorgustik, et arendada ja katsetada esimest omataolist taristut ning nii jarjestati koordineeritud viisil 34st Euroopa riigist parit proovimaterjali pohjal 98 eukaruootse liigi referentsgenoomid. Siinkohal kirjeldame referentsgenoomi ressursside loomise piloottaristu valjatootamist ning sellega kaasnenud probleeme ning uurime selle lahenemisviisi tohusust kvaliteetsete referentsgenoomide loomise seisukohast, vottes sealjuures arvesse ka vordsust ja kaasatust. Selle pilootprojekti tulemused ja oppetunnid on ERGA jaoks heaks vundamendiks, pakkudes samal ajal olulisi oppetunde teistele riikidevahelistele ja -sisestele genoomiprojektidele. Finnish: Maailmanlaajuinen genomitietokanta koko planeetan lajien monimuotoisuudesta voisi olla tieteellisten loytojen aarreaitta. Siita huolimatta, etta genomin sekvensointitekniikoissa on otettu suuria edistysaskelia, genomitietoa on saatavilla vain pienesta osasta lajeja. 
Taydentaakseen genomitietokantaa tutkijat ja instituutiot ympari maailman ovat liittyneet Earth BioGenome Project (EBP) -projektiin, jonka tavoitteena on vaiheittain sekvensoida ja koota korkealaatuisia referenssigenomeja jokaiselle noin 1,5 miljoonasta tunnetusta eukaryootti lajista. Aloitteen siirtyessa vaiheeseen II, jossa 150 000 lajia on maara sekvensoida neljassa vuodessa, maailmanlaajuinen osallistuminen hankkeeseen on edellytys sen menestykselle. EBP:n eurooppalaisena osana, European Reference Genome Atlas (ERGA) pyrkii luomaan uuden hajautetun, esteettoman, tasapuolisen ja osallistavan mallin korkealaatuisten referenssigenomien tuottamiseksi, joka pitaa EBP:n ajan tasalla prosessin etenemisesta. Ryhtyessaan tahan tehtavaan, ERGA kaynnisti pilottihankkeen luodakseen Euroopan poikki kulkevan verkoston, jonka tarkoituksena on kehittaa seka testata uudenlaista infrastruktuuria, jota kaytetaan koordinoituun ja hajautettuun referenssi genomien tuotantoon 98:n eurooppalaisen lajin kohdalla, joiden naytteet tulevat 34:sta eri Euroopan maasta. Tassa hahmottelemme prosessia ja haasteita, joita kohtasimme kehittaessamme pilotti-infrastruktuuria referenssigenomien tuotantoa varten, ja tutkimme taman lahestymistavan tehokkuutta korkealaatuisen referenssigenomi tuotannon kannalta, ottaen huomioon myos tasa-arvon ja inklusiivisuuden. Taman pilotin aikana saadut tulokset ja opetukset antavat vankan perustan ERGA:lle ja tarjoavat samalla keskeisia oppitunteja muille monikansallisille ja kansallisille genomi resursseja koskeville projekteille. Bulgarian: Световната геномна база данни за видовото разнообразие на Земята може да бъде съкровищница за научни открития. Въпреки това, независимо от големия напредък в технологиите за секвениране на генома, само за малка част от видовете има налична геномна информация. 
За да допринесат за създаване на по-пълна планетарна геномна база данни, учени и институции от целия свят се обединиха в рамките на Earth BioGenome Project (EBP), който планира да секвенира и сглоби висококачествени референтни геноми за всички ~ 1,5 милиона установени еукариотни вида, чрез стъпаловиден поетапен подход. Тъй като инициативата преминава във фаза II, където 150 000 вида трябва да бъдат секвенирани само за четири години, световното участие в проекта ще бъде определящо за успеха. Като европейска възлова точка на EBP, Европейският референтен геномен атлас (ERGA) се стреми да приложи нов децентрализиран, достъпен, справедлив и приобщаващ модел за получаване на висококачествени референтни геноми, информирайки EBP. За да започне тази мисия, ERGA стартира пилотен проект за създаване на мрежа в цяла Европа за разработване и тестване на първата по рода си инфраструктура за координирано и разпределено получаване на референтни геноми на 98 европейски еукариотни вида от 34 европейски държави. Тук очертаваме процеса и предизвикателствата, с които се сблъскваме по време на разработването на пилотната инфраструктура за получаване на референтни геномни ресурси, и изследваме ефективността на този подход по отношение на висококачественото референтно геномно производство, като се има предвид също справедливостта и обхвата. Резултатите и уроците, научени по време на този пилотен проект, осигуряват солидна основа за ERGA, като същевременно предлагат ключови знания за други транснационални и национални проекти за геномни ресурси. Hungarian A Fold teljes faji sokfelesegenek globalis genom-adatbazisa a tudomanyos felfedezesek kincsesbanyaja lehetne. A genomszekvenalasi technologiak jelentős fejlődesetől fuggetlenul azonban a fajoknak csak egy kis toredeke rendelkezik genomi informaciokkal. 
A teljesebb bolygoszintű genomikai adatbazis letrehozasahoz valo hozzajarulas erdekeben a tudosok es intezmenyek vilagszerte osszefogtak az Earth BioGenome Project (EBP) kereteben, amely fokozatos megkozelitessel tervezi a ~1,5 millio elismert eukariota faj jo minősegű referencia genomjanak szekvenalasat es osszerakasat. Mivel a kezdemenyezes a II. fazisba lep, ahol mindossze negy ev alatt 150 000 faj szekvenalasat tervezik elvegezni, a projektben valo vilagmeretű reszvetel alapvető fontossagu a sikerhez. Az EBP europai csomopontjakent az Europai Referencia Genom Atlasz (ERGA) egy uj, decentralizalt, hozzaferhető, meltanyos es inkluziv modellt kivan megvalositani a kivalo minősegű referencia genomok előallitasara, amely az EBP novekedesehez hozzajarulando, az EBP-t is tajekoztatni fogja. E kuldetes megvalositasa erdekeben az ERGA kiserleti projektet inditott egy europai halozat letrehozasara, amelynek celja a maga nemeben első olyan infrastruktura kifejlesztese es tesztelese, amely 98 europai eukariota faj koordinalt es elosztott referencia-genomjanak előallitasat teszi lehetőve 34 europai orszag mintaadoitol. A kovetkezőkben felvazoljuk a referencia-genomforrasok előallitasara szolgalo kiserleti infrastruktura fejlesztese soran felmerult folyamatokat es kihivasokat, es megvizsgaljuk e megkozelites hatekonysagat a magas szinvonalu referencia-genom osszerakas szempontjabol, figyelembe veve az egyenlőseget es az integraciot is. A kiserleti projekt soran elert eredmenyek es tanulsagok szilard alapot biztositanak az ERGA szamara, mikozben kulcsfontossagu ismereteket kinalnak mas transznacionalis es nemzeti genomikai erőforras-projektek szamara.

Biodiversity, Genomics, Genome Assembly, Genome Annotation, Metadata

Submission: posted 01 October 2023, validated 03 October 2023
Recommendation: posted 18 April 2024, validated 11 May 2024

Cite this recommendation as:
Narayan, J. (2024) Informed Choices, Cohesive Future: Decisions and Recommendations for ERGA. Peer Community in Genomics, 100298. https://doi.org/10.24072/pci.genomics.100298

Recommendation

The European Reference Genome Atlas (ERGA) (Mc Cartney et al., 2024; Mazzoni et al., 2023) demonstrates the collaborative spirit and intellectual abilities of researchers from 33 European countries. This ambitious project is the European node of the Earth BioGenome Project (Lewin et al., 2018), which in its Phase II has embarked on an unprecedented mission: to sequence 150,000 species over a span of four years. At the heart of ERGA is a decentralized pilot infrastructure built specifically to support the production of high-quality reference genomes. This infrastructure acts as a scaffold for the massive task of genome sequencing, providing the framework needed to manage the complexity of genomic research. The manuscript under consideration offers a comprehensive narrative of ERGA's evolution, outlining both the successes and the challenges encountered along the way.

One of the most significant issues addressed in the manuscript is the equitable distribution of resources and expertise among participating laboratories and countries. In a project of this magnitude, it is critical to leverage the pooled talents and capacities of researchers from across Europe. ERGA's pan-European network promotes communication and collaboration, creating an environment in which knowledge flows freely and barriers are overcome. Strong coordination and communication will be essential to ERGA's success: scientific collaboration depends on efficient communication channels because they allow researchers to share resources, collaborate on new initiatives, and exchange ideas. Through a diverse range of meetings, courses, and virtual discussion boards, ERGA fosters transparency and cooperation among its members, enabling scientists to overcome challenges and make significant discoveries. Training and knowledge transfer are a pillar of ERGA's strategy. Recognizing the importance of capacity development, ERGA invests in giving researchers the knowledge and skills they need to navigate the complicated terrain of genomic research. Its training programmes (Larivière et al., 2024) cover a wide range of subjects, from sample collection and preparation to sequencing technology and data processing methods. By developing a community of highly qualified experts, ERGA lays the foundation for continued advancement and creativity in genomics.

The manuscript also covers in detail the technological workflows and sequencing techniques used in ERGA's pilot infrastructure. Using state-of-the-art long-read and short-read sequencing technologies, the consortium is working to resolve the complex structure of genomes with a high level of accuracy and precision. Quality control procedures are in place to safeguard the accuracy of the genomic data and to prevent errors that could jeopardize the integrity of the findings. Although the focus is on genome sequencing because of its technological complexity, ERGA also remains firmly committed to metadata collection and sample validation. Metadata serve as a critical link between raw genomic data and useful scientific insights, providing the necessary context and allowing researchers to draw practical conclusions from their investigations. Sample validation improves the reliability and reproducibility of the results, giving users confidence in the quality of the genomic data provided by ERGA.

Looking ahead, ERGA envisions its decentralized infrastructure serving as a model for global collaborative research efforts. By embracing diversity, encouraging cooperation, and pushing for open access to data and resources, ERGA hopes to catalyze scientific discovery and generate positive change in the field of biodiversity genomics. ERGA aims to promote a more equitable and sustainable future for all through ongoing interaction with stakeholders, intensive outreach and education activities, and advocacy for policy change. Beyond its immediate goals, ERGA also considers the long-term implications of its work. As genomic technology progresses, the potential applications of high-quality reference genomes will continue to grow. From informing conservation efforts and illuminating evolutionary histories to supporting advances in healthcare and agriculture, ERGA's contributions are likely to have far-reaching consequences for people and the planet as a whole.

Furthermore, ERGA understands the importance of interdisciplinary collaboration in addressing the difficult challenges of the twenty-first century. By forming relationships with stakeholders from other areas, such as policymakers, conservationists, and indigenous groups, ERGA aims to integrate genomic research into larger initiatives that promote sustainability and biodiversity conservation. Through shared knowledge and community action, ERGA seeks to create a future in which humankind coexists peacefully with the natural world, guided by a thorough grasp of its genetic legacy and ecological interconnectivity.

Finally, the manuscript exemplifies ERGA's collaborative ambitions and achievements, capturing the spirit of creativity and cooperation that defines this groundbreaking effort. As ERGA continues to push the boundaries of genomic research, it remains dedicated to scientific excellence, inclusivity, and the pursuit of knowledge for the benefit of society. I wholeheartedly recommend this manuscript and offer my enthusiastic endorsement of its valuable contribution to the scientific community.

References
Larivière, D., Abueg, L., Brajuka, N. et al. (2024). Scalable, accessible and reproducible reference genome assembly and evaluation in Galaxy. Nature Biotechnology 42, 367-370. https://doi.org/10.1038/s41587-023-02100-3

Lewin, H. A., Robinson, G. E., Kress, W. J., Baker, W. J., Coddington, J., Crandall, K. A., Durbin, R., Edwards, S. V., Forest, F., Gilbert, M. T. P., Goldstein, M. M., Grigoriev, I. V., Hackett, K. J., Haussler, D., Jarvis, E. D., Johnson, W. E., Patrinos, A., Richards, S., Castilla-Rubio, J. C., … Zhang, G. (2018). Earth BioGenome Project: Sequencing life for the future of life. Proceedings of the National Academy of Sciences, 115(17), 4325–4333. https://doi.org/10.1073/pnas.1720115115

Mazzoni, C. J., Ciofi, C., Waterhouse, R. M. (2023). Biodiversity: an atlas of European reference genomes. Nature 619, 252. https://doi.org/10.1038/d41586-023-02229-w

Mc Cartney, A. M., Formenti, G., Mouton, A., Panis, D. de, Marins, L. S., Leitão, H. G., Diedericks, G., Kirangwa, J., Morselli, M., Salces-Ortiz, J., Escudero, N., Iannucci, A., Natali, C., Svardal, H., Fernández, R., Pooter, T. de, Joris, G., Strazisar, M., Wood, J., … Mazzoni, C. J. (2024). The European Reference Genome Atlas: piloting a decentralised approach to equitable biodiversity genomics. bioRxiv, ver. 4 peer-reviewed and recommended by Peer Community in Genomics. https://doi.org/10.1101/2023.09.25.559365

Conflict of interest:
The recommender in charge of the evaluation of the article and the reviewers declared that they have no conflict of interest (as defined in the code of conduct of PCI) with the authors or with the content of the article. The authors declared that they comply with the PCI rule of having no financial conflicts of interest in relation to the content of the article.

Funding:
The funding statement is too long to insert here; it can be found at the bottom of the bioRxiv submission.

Reviews

Reviewed by Eric Crandall, 01 Apr 2024

The authors have done well to address both reviewers' comments. Just a few very minor comments below. Congratulations!

Specific comments:

P4 "are limited in scope due in large PART to a current lack of standardisation"

P8 the percentages for self-reported gender are reversed and differ from what is in Figure 2b.

P13 "To support this RECOMMENDATION we issued supporting guidance for bioanking"

Finally, a lot of text was moved around, so I may have missed something, but I can't find the text that was reported to have been added in Review 2 Response 12 to clarify recommendations. I've looked in the tracked-changes Word document as well as the posted preprint. I felt that the text was a good addition, and would like to see that it makes it into the final version.

Evaluation round #1

DOI or URL of the preprint: https://doi.org/10.1101/2023.09.25.559365

Version of the preprint: 2

Author's Reply, 07 Mar 2024

Decision by Jitendra Narayan, posted 04 Jan 2024, validated 08 Jan 2024

I strongly urge the authors to carefully consider the constructive criticisms and comments made by the discerning reviewers. When writing responses, please explain the changes made in response to each critique, elaborate on any additional data or analyses performed, and provide thorough clarifications where necessary.

Reviewed by Justin Ideozu, 11 Dec 2023

Title: The European Reference Genome Atlas: piloting a decentralized approach to equitable biodiversity genomics

The article details the procedures and challenges encountered while developing a pilot infrastructure for the production of reference genome resources. The authors mentioned that the results and insights gained from the pilot lay a strong foundation for ERGA and offer valuable knowledge to other national and transnational genomic resource initiatives.

Overall, the manuscript was well-written, with nice figures and rich references. However, the structure could use some improvement to enhance readability. One way to achieve this, if it aligns with the workflows ERGA implemented, would be to reorganize the sections into four parts: 1) Background, 2) Development of a Decentralized Infrastructure, 3) Challenges, and 4) Future Directions.

Section 2 can also be restructured into five subsections:

1. Genome Team Establishment

2. Building a Representative Species List

3. Developing a Communications and Coordination Strategy

4. Developing a Capacity Building and Knowledge Transfer Strategy

5. Technical Workflows

Section 2.5, Technical Workflows: These are well described in Steps 2-9 and should be reassigned accordingly. Step 5 should be renamed Sample Preparation or similar, since it describes not only HMW DNA isolation but also library prep considerations for each of the platforms.

Section 3, Challenges: The authors could group the challenges already described into broad themes or subsections, for example Social, Administrative and Technical Challenges, or other relevant titles. The authors could also restructure this section to describe the challenges encountered in specific parts of Section 2. The authors should avoid repeating titles in subsections; for example, Training and Knowledge Transfer appeared twice.

Reviewed by Eric Crandall, 07 Nov 2023

Summary

The authors describe, at length, the pilot program for the European Reference Genome Atlas, which is the European node of the Earth BioGenome Project (EBP). EBP aspires to sequence the genomes of every eukaryotic species on our planet. The authors describe in detail the selection of species, development of infrastructure, and then nine steps toward the eventual sharing of completed reference genomes, from selection of genome teams, through sample collection and storage, DNA extraction, sequencing, assembly, annotation, analysis and sharing of the data. They conclude with a discussion of the challenges of creating a decentralized network and ways to address these in the future.

Major Comments

This is a well-written description of the pilot version of a gigantic undertaking, which is itself large in scope. While 98 reference genomes is nothing to sneeze at, the larger importance is that the authors have provided a template, which can be modified and applied around the world, towards the "moon-shot" goal of the EBP. I'm therefore glad that the authors have gone with Peer Community In, and I would suggest that they resist pressure from reviewers or editors to shorten this methods paper. It is full of important details that will be useful to others who try to replicate their success! The authors also clearly appreciate that it is at least as important *who* is doing science as *how* the science is being done, and have taken major steps to be inclusive in their science.

I did want to raise one important issue. The authors clearly understand the importance of ERGA's role in the global biodiversity community, as indicated by Case Study 4. For this reason, I strongly suggest that they use the relevant, established metadata standards and definitions whenever possible, to ensure that ERGA's hard-won data are findable, accessible, interoperable (especially) and reusable (FAIR). Reviewing the ERGA Sample Manifest v2.4.3 that was linked in the article, the terms used are not from either Darwin Core (DwC), which is the relevant standard for biodiversity data, or MIxS, which is the relevant genomic metadata standard. This will be important if ERGA wants to share their metadata into GBIF, which uses Darwin Core, and I'd be surprised if they haven't already had issues with uploading to INSDC. Thoughtful people have put a lot of time into developing MIxS and DwC terms and definitions, and even if they are imperfect (for example neither has a term for permit information), the principles of precedence and standardization should be operative here. I don't know that addressing this issue should be a condition for PCI recommendation, as it will probably take some work and time to make changes in COPO's code. But that is also why it is important to address this issue now, rather than later.
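
To illustrate the kind of crosswalk this comment has in mind, here is a minimal Python sketch that renames manifest columns to Darwin Core terms. The manifest column names on the left are hypothetical placeholders, not copied from the ERGA Sample Manifest v2.4.3, and the function name is invented for this example. The terms on the right (scientificName, recordedBy, eventDate, decimalLatitude, decimalLongitude, lifeStage, sex) are existing Darwin Core terms. Fields with no counterpart, such as permit information, are returned separately rather than dropped.

```python
from typing import Dict, Tuple

# Hypothetical manifest column names (left) mapped to Darwin Core terms (right).
# The left-hand names are assumptions made for illustration only.
MANIFEST_TO_DWC: Dict[str, str] = {
    "SCIENTIFIC_NAME": "scientificName",
    "COLLECTED_BY": "recordedBy",
    "DATE_OF_COLLECTION": "eventDate",
    "DECIMAL_LATITUDE": "decimalLatitude",
    "DECIMAL_LONGITUDE": "decimalLongitude",
    "LIFESTAGE": "lifeStage",
    "SEX": "sex",
}


def to_darwin_core(record: Dict[str, str]) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Split one manifest record into (Darwin Core fields, unmapped fields)."""
    mapped: Dict[str, str] = {}
    unmapped: Dict[str, str] = {}
    for column, value in record.items():
        term = MANIFEST_TO_DWC.get(column)
        if term is None:
            # No standard term known for this column; keep it visible.
            unmapped[column] = value
        else:
            mapped[term] = value
    return mapped, unmapped


# Example record; PERMIT_ID is a hypothetical field with no Darwin Core term.
example = {
    "SCIENTIFIC_NAME": "Salmo trutta",
    "DECIMAL_LATITUDE": "46.95",
    "PERMIT_ID": "X-1",
}
dwc_fields, leftover = to_darwin_core(example)
```

Keeping the unmapped fields in a separate dictionary makes the gaps in the standards explicit, which is where discussion with the standards bodies would be needed.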

Specific Comments

P3 Incorrect quantifier "Biodiversity and ecosystem decline, loss and degradation raise the prospect that *MUCH*, if not most, of the Earth’s biodiversity will be lost forever before they can be genomically explored..."

Also, I fully understand the intention of this sentence but it could be construed to mean that the only value in a species is found in its genomic resources. I know this is not the authors' intent but I suggest rewording.

P4 "However, the scientific enquiries that can be actualised from reference resources15 are limited in scope due in large to a current lack of standardisation across the multitude of actors involved throughout the production of complete reference resources."

Great sentence! I suggest replacing "actualised" with "realised"

P6 "In other cases, *partnering sequencing* contributed their own grant funds" - sequencing partners?

P7 "Building a representative species list" - I have wondered how to go about prioritization of species. This seems like a reasonable process, but surely phylogenetic representation could be considered. I'm curious about how target categories were selected though. (I note from page 26 that phylogenetic representation will be considered going forward)

Figure 2b. I am having trouble interpreting the "International Genome Team Composition". Are the bins the number of countries represented on a genome team? The text on page 8 clarifies that this is the number of international members, where "international" is defined as coming from a different country from the sample. But the figure legend should be clearer. Expressing it in terms of number of countries would be clearer still.

P9 GDPR should be added to the glossary. As a US citizen, I'm aware of GDPR, but other readers might not be.

P9 Step 2: Pre-sampling requirements: Taking all of this into account requires a lot of effort and I congratulate the authors for making it a part of their infrastructure from the beginning.

P10 Thanks for making the sample manifest publicly available. Great that you are using validation rules. I would strongly urge ERGA and COPO to adopt the Darwin Core and/or MIxS metadata standards for their metadata to ensure their FAIRness. See major comments above.

P10 "Unique to ERGA, fields were developed to mandate important information disclosure..."

These fields are not unique to ERGA. At GEOME we have developed similar fields to accept globally unique and persistent identifiers (EZIDs), as well as information about permits and TK/BC notices and labels. See Riginos et al. 2020. These fields are not covered by MIxS or Darwin Core - it might be a good time to meet to discuss standardization of this information.

P12 "All 98 of genome teams" -- All 98 genome teams

P13 "To initialise these partnerships, a sequencing platform landscape assessment was conducted across all of the countries that ERGA had council representation" -- across all of the countries that had ERGA council representation.

P14 "Here, we recommended the following data-type volumes for assembly generation: 30X HiFi or 60X ONT, 25X Hi-C (per haplotype) and 25X (per haplotype) Illumina (in cases where ONT data was used), and the following data-type volumes for annotation: total of 100 million reads if >five tissue types are available, or 30 million reads if tissue samples are pooled."

I am not an expert in genome sequencing as I work more at the population level, so I can't comment on the suitability of these recommendations. However, if these are official ERGA recommendations, there is a lot of room for misunderstanding here. I would spend the space to make them more clear, either in a table, or using several very clear sentences, with AND and OR statements.
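
As an illustration of how explicit AND/OR statements remove the ambiguity, the quoted sentence can be written out as logic. The Python sketch below encodes one possible reading only and is not an official ERGA rule. It assumes that the figures are minimum coverages, that Illumina data is required only when the long-read data is ONT, and that the annotation targets apply only to the two cases named in the quote. The class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Coverage:
    """Sequencing depth per data type, in fold coverage (X)."""
    hifi_x: float = 0.0
    ont_x: float = 0.0
    hic_x_per_haplotype: float = 0.0
    illumina_x_per_haplotype: float = 0.0


def meets_assembly_recommendation(c: Coverage) -> bool:
    """(30X HiFi OR (60X ONT AND 25X Illumina)) AND 25X Hi-C.

    Assumption: Illumina is required only on the ONT route.
    """
    hifi_route = c.hifi_x >= 30
    ont_route = c.ont_x >= 60 and c.illumina_x_per_haplotype >= 25
    return (hifi_route or ont_route) and c.hic_x_per_haplotype >= 25


def annotation_read_target(tissue_types: int, pooled: bool) -> Optional[int]:
    """Total RNA read target, or None where the quoted text gives no figure."""
    if pooled:
        return 30_000_000
    if tissue_types > 5:
        return 100_000_000
    # Five or fewer unpooled tissue types: not covered by the quoted sentence.
    return None
```

Writing the rule this way exposes the uncovered case (five or fewer unpooled tissue types), which a table in the manuscript would also need to address.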

P16 I've done a little work to try to understand figure 3A, but haven't made much progress. How can annotation data be at the permitting stage?

P18 I quite like this figure, with lots of information content. While I understand the utility of ToLIDs, I wonder if they are helpful here as I'd have to go to the supplemental table to decipher them. Just flagging a potential issue - handle as you see fit.

Literature Cited

Riginos C, Crandall ED, Liggins L, Gaither MR, Ewing RB, Meyer C, Andrews KR, Euclide PT, Titus BM, Therkildsen NO, Salces‐Castellano A, Stewart LC, Toonen RJ, Deck J. 2020. Building a global genomics observatory: Using GEOME (the Genomic Observatories Metadatabase) to expedite and improve deposition and retrieval of genetic data and metadata for biodiversity research. Molecular Ecology Resources 20:1458–1469. DOI: 10.1111/1755-0998.13269.