GraphTyper2 enables population-scale genotyping of structural variation using pangenome graphs

Title: GraphTyper2 enables population-scale genotyping of structural variation using pangenome graphs
Author: Eggertsson, Hannes   orcid.org/0000-0002-1674-9978
Kristmundsdóttir, Snædís   orcid.org/0000-0001-8981-0883
Beyter, Doruk   orcid.org/0000-0002-3644-5760
Jónsson, Hákon   orcid.org/0000-0001-6197-494X
Skúladóttir, Ástrós
Hardarson, Marteinn   orcid.org/0000-0003-1130-8601
Gudbjartsson, Daniel   orcid.org/0000-0002-5222-9857
Stefansson, Kari   orcid.org/0000-0003-1676-864X
Halldórsson, Bjarni   orcid.org/0000-0003-0756-0767
Melsted, Páll   orcid.org/0000-0002-8418-6724
Date: 2019-11-27
Language: English
Scope: 5402
University/Institute: Háskóli Íslands
University of Iceland
Reykjavik University
Háskólinn í Reykjavík
School: School of Engineering and Natural Sciences (UI)
Verkfræði- og náttúruvísindasvið (HÍ)
Heilbrigðisvísindasvið (HÍ)
School of Health Sciences (UI)
School of Science and Engineering (RU)
Tækni- og verkfræðideild (HR)
Series: Nature Communications;10(1)
ISSN: 2041-1723
DOI: 10.1038/s41467-019-13341-9
Subject: Genetic research; Computer science; DNA research
URI: https://hdl.handle.net/20.500.11815/1519

Eggertsson, H.P., Kristmundsdottir, S., Beyter, D. et al. GraphTyper2 enables population-scale genotyping of structural variation using pangenome graphs. Nat Commun 10, 5402 (2019). https://doi.org/10.1038/s41467-019-13341-9


Analysis of sequence diversity in the human genome is fundamental for genetic studies. Structural variants (SVs) are frequently omitted in sequence analysis studies, although each has a relatively large impact on the genome. Here, we present GraphTyper2, which uses pangenome graphs to genotype SVs and small variants using short reads. Comparison to the syndip benchmark dataset shows that our SV genotyping is sensitive, and variant segregation in families demonstrates the accuracy of our approach. We demonstrate that incorporating public assembly data into our pipeline greatly improves sensitivity, particularly for large insertions. We validate 6,812 SVs on average per genome using long-read data from 41 Icelanders. We show that GraphTyper2 can simultaneously genotype tens of thousands of whole genomes by characterizing 60 million small variants and half a million SVs in 49,962 Icelanders, including 80 thousand SVs with high confidence.


Publisher's version (published article).


Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

