Detailed human pangenome reference captures human diversity

Scientists reveal more complete, diverse collection of genome sequences


The Human Genome Project, funded by the National Institutes of Health (NIH), ended in April 2003 and produced a human genome sequence made up of a patchwork of data from a small number of individuals. This lack of diversity limited its usefulness as a research tool for understanding human health and disease. Now, researchers have published a new set of reference human genome sequences that reveals far more genomic diversity from different populations of people than was available previously.

Led by the international Human Pangenome Reference Consortium, the work is funded by the National Human Genome Research Institute (NHGRI) of the NIH and appears in a set of papers published May 10 in the journal Nature.

Washington University School of Medicine in St. Louis serves as the national coordinating center for the consortium. Ting Wang, the Sanford C. and Karen P. Loewentheil Distinguished Professor of Medicine at Washington University, leads the coordinating center and is a co-senior author on one of the papers. Washington University also is a key player in the national data production center, led by the University of California, Santa Cruz. Wang leads Washington University’s data production, which contributed almost one-third of the data for this first set of studies.

“These new pangenome reference sequences will serve as an important tool in understanding the diversity of human genetics and its role in determining how human health is maintained and what can go wrong in various diseases,” Wang said. “We look forward to sharing this resource with the research community around the world.”

The new pangenome reference includes genome sequences from 47 people of diverse backgrounds. The study recruited participants from communities around the globe, including people of African Caribbean ancestry in Barbados; people of African ancestry in the southwest U.S.; people of Peruvian ancestry in Lima, Peru; members of the Punjabi community in Lahore, Pakistan; and Han Chinese people in southern China, among others.

The work is ongoing, and the researchers hope to have sequenced the genomes of 350 people by mid-2024. This larger sample size will offer a more complete view of the full diversity of human populations globally.

A human genome is the DNA blueprint guiding the embryonic development and daily bodily functions of a person. In general, any two individuals’ genomes are about 99.6% identical. The 0.4% difference makes each person unique and can reveal information about a person’s health and risk of diseases such as cancer, Alzheimer’s disease and heart disease.

The current reference human genome sequence has gaps, especially in areas that are repetitive and hard to read. Recent technological advances such as long-read DNA sequencing, which reads longer stretches of the DNA at a time, helped researchers fill in those gaps to create the first complete human genome sequence. This complete human genome sequence was released last year by the NIH-funded Telomere-to-Telomere (T2T) consortium and is incorporated into the current pangenome reference.

“The human pangenome reference will enable us to represent tens of thousands of novel genomic variants in regions of the genome that were previously inaccessible,” said Wen-Wei Liao, a doctoral student in Washington University School of Medicine’s Division of Biology & Biomedical Sciences and a co-first author of one of the Nature papers. He has two academic affiliations and is currently conducting his research at Yale University. “With a pangenome reference, we can accelerate clinical research by improving our understanding of the link between genes and disease traits in diverse populations.”

Other researchers contributing to the new human pangenome reference include those from the University of California, Santa Cruz; Harvard Medical School; Yale University; Heinrich Heine University in Germany; and the University of Tennessee.


Liao et al. A draft human pangenome reference. Nature. May 10, 2023. DOI: 10.1038/s41586-023-05896-x.

This work was funded in part by the National Institutes of Health (NIH) grant numbers U41HG010972, 1U01HG010973, U41HG007234, 1R01HG011274, R01HG010485, U24HG010262, U01HG010963, U24HG007497, U01HG010961, OT2OD033761, U24HG011853, R01-HG006677, R35-GM130151, R01HG002385, R01HG010169, HG007497, U01HG01973, R01HG011649, 5U01HG010971, R01GM123489, U24HG009081 and 1ZIAHG200398. Funding also was provided by the Intramural Research Program of the National Human Genome Research Institute of the NIH; the National Center for Biotechnology Information of the National Library of Medicine (NLM) of the NIH; the USDA National Institute of Food and Agriculture, grant number 2018-67015-28199; the National Science Foundation (NSF), grant IOS-1744309 and NSF PPoSS Award #2118709; the Natural Sciences and Engineering Research Council of Canada (NSERC); a Canada Research Chair Tier 1 award; a FRQ-S Distinguished Research Scholar award; the World Premier International Research Center Initiative (WPI), MEXT, Japan; the Carlsberg Foundation; the National Institute of Standards and Technology; the Howard Hughes Medical Institute; an Oxford Nanopore Research Grant SC20130149; the Wellcome Trust, award numbers WT104947/Z/14/Z, WT222155/Z/20/Z and WT108749/Z/15/Z; a Juan de la Cierva fellowship grant, number IJC2020-045916-I funded by MCIN/AEI/10.13039/501100011033; the European Union NextGenerationEU/PRTR; the Novo Nordisk Foundation, grant number NNF21OC0069089; the Central Innovation Programme (ZIM) for SMEs of the Federal Ministry for Economic Affairs and Energy of Germany; the BMBF-funded de.NBI Cloud within the German Network for Bioinformatics Infrastructure (de.NBI), grant numbers 031A537B, 031A533A, 031A538A, 031A533B, 031A535A, 031A537C, 031A534A and 031A532B; the German Federal Ministry of Education and Research (BMBF): 031L0184A; the European Commission Innovative training network (ITN): 956229; and the Taiwan Ministry of Education: Government Scholarship to Study Abroad (GSSA).

About Washington University School of Medicine

WashU Medicine is a global leader in academic medicine, including biomedical research, patient care and educational programs with 2,800 faculty. Its National Institutes of Health (NIH) research funding portfolio is the third largest among U.S. medical schools and has grown 52% in the last six years; together with institutional investment, WashU Medicine commits well over $1 billion annually to basic and clinical research innovation and training. Its faculty practice is consistently within the top five in the country, with more than 1,800 faculty physicians who practice at 65 locations and also serve as the medical staffs of Barnes-Jewish and St. Louis Children’s hospitals of BJC HealthCare. WashU Medicine has a storied history in MD/PhD training, recently dedicated $100 million to scholarships and curriculum renewal for its medical students, and is home to top-notch training programs in every medical subspecialty as well as physical therapy, occupational therapy, and audiology and communications sciences.

Originally published by the School of Medicine