The genome is one of the constitutive elements of the human being, so it caused a sensation when the first human genome was published a good twenty years ago. However, the reference sequence of the International Human Genome Project, which was launched in 1990 and published its result in 2003, had gaps: it covered just over 90 percent of the human genome and was assembled from the DNA of only about 20 people, with most of the sequence coming from a single person. Following the publication in "Science" last year of the first genome of a single human being that is considered complete, an international team of researchers is now preparing to take the logical next step: developing a reference for the human "pangenome".

Hinnerk Feldwisch-Drentrup

Editor in the "Nature and Science" department.

This Wednesday, the "Human Pangenome Reference Consortium" published a first draft in the journal "Nature" that contains the genome sequences of 47 genetically diverse individuals. The aim of the collection is to "represent as many DNA sequences as possible that can be found within our species," according to a press release. It is estimated that the genomes of all people are 99.9 percent identical; the remaining 0.1 percent thus accounts for all genetic differences. So far, however, research has relied mainly on the genomes of people of European descent. As a result, medical decisions for people from other regions of the world may be based on data that do not apply to them.

Genomes of 350 people

The draft improves on the previous human reference genome, with its approximately three billion base pairs, in two ways. First, gaps totaling 119 million base pairs are now closed and numerous structural variants of the genome are covered. Second, the new project makes it possible to capture and map variation between individuals. The results are only an intermediate step: in just over a year, the pangenome is expected to map the genetic diversity of 350 people.

The new reference data are intended to reduce the risk of exacerbating health inequalities, explains Eric Green, director of the U.S. government-funded National Human Genome Research Institute. This is also in line with the institute's goal of increasing diversity in all aspects of genome research, which it considers crucial to making genome-based medicine usable in an equitable way.

The project is made possible by new sequencing technologies that can read longer sections of the genome. Previously, only short sections could be read at a time, and the genetic information then had to be assembled into a genome using algorithms. It is now also possible to read genetic material that is not easily accessible because of the cellular packaging of DNA, as well as regions in which the same sequences are repeated many times. The cost of the current project is estimated at around 40 million US dollars, while that of the Human Genome Project was estimated at around three billion US dollars.

"First step towards democratization of the genome"

The current publication and the one in "Science" last year "close virtually all the gaps that were still present in the first version of the genome," explains geneticist André Reis, who is not involved in the pangenome project, in a statement. He is Director of the Institute of Human Genetics at the University Hospital Erlangen. Structural differences in the genome could have a major impact and be relevant to health, he says; the genome's regulatory elements, which "control the orchestra of genes", so to speak, are less well understood. The new project now captures a significant part of global diversity. "It is an important first step in the democratization of the genome and the participation of people of non-European descent in the achievements of genome research."

The project is a start, says Stefan Mundlos, Director of the Institute of Medical Genetics and Human Genetics at Charité, but: "It will never be possible to fully map complete diversity, because many variants are individual." The project will help to uncover further associations between diseases and the genome. About a third of patients with a genetic disease can already be diagnosed with the help of the existing reference genome. "With the complete genome, this situation should improve," says Mundlos. At the same time, the pangenome project aims to develop new standards, and not only for the benefit of humans. "The methods we are developing should prove valuable to other species as well," the authors write in their article.