Opin vísindi

Whole genome characterization of sequence diversity of 15,220 Icelanders

Title: Whole genome characterization of sequence diversity of 15,220 Icelanders
Author: Jónsson, Hákon   orcid.org/0000-0001-6197-494X
Sulem, Patrick   orcid.org/0000-0001-7123-6123
Kehr, Birte
Kristmundsdóttir, Snædís   orcid.org/0000-0001-8981-0883
Zink, Florian
Hjartarson, Eiríkur
Hardarson, Marteinn   orcid.org/0000-0003-1130-8601
Hjorleifsson, Kristjan   orcid.org/0000-0002-7851-1818
Eggertsson, Hannes   orcid.org/0000-0002-1674-9978
Guðjónsson, Sigurjón Axel
... 19 more authors
Date: 2017-09-21
Language: English
Scope: 170115
University/Institute: Háskóli Íslands
University of Iceland
Háskólinn í Reykjavík
Reykjavik University
School: Heilbrigðisvísindasvið (HÍ)
School of Health Sciences (UI)
Verkfræði- og náttúruvísindasvið (HÍ)
School of Engineering and Natural Sciences (UI)
Félagsvísindasvið (HÍ)
School of Social Sciences (UI)
Tækni- og verkfræðideild (HR)
School of Science and Engineering (RU)
Department: Læknadeild (HÍ)
Faculty of Medicine (UI)
Félags- og mannvísindadeild (HÍ)
Faculty of Social and Human Sciences (UI)
Series: Scientific Data;4
ISSN: 2052-4463
DOI: 10.1038/sdata.2017.115
Subject: DNA sequencing; Genetic variation; Haplotypes; Rare variants; DNA research; Genetics
URI: https://hdl.handle.net/20.500.11815/441


Jónsson, H., Sulem, P., Kehr, B., Kristmundsdottir, S., Zink, F., Hjartarson, E., . . . Stefansson, K. (2017). Whole genome characterization of sequence diversity of 15,220 Icelanders. Scientific Data, 4, 170115. doi:10.1038/sdata.2017.115


Understanding sequence diversity is the cornerstone of the analysis of genetic disorders, population genetics, and evolutionary biology. Here, we present an update of our sequencing set to 15,220 Icelanders, whom we sequenced to an average genome-wide coverage of 34X. We identified 39,020,168 autosomal variants passing GATK filters: 31,079,378 SNPs and 7,940,790 indels. Calling de novo mutations (DNMs) is a formidable challenge given the high false positive rate in sequencing datasets relative to the mutation rate. We addressed this issue by using segregation of alleles in three-generation families. Using this transmission assay, we controlled the false positive rate and identified 108,778 high quality DNMs. Furthermore, we used our extended family structure and read-pair tracing of DNMs to a panel of phased SNPs to determine the parent of origin of 42,961 DNMs.
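
The idea of checking a candidate DNM by segregation in a three-generation family can be illustrated with a minimal sketch. This is not the authors' pipeline: the genotype coding (number of alternate alleles: 0, 1 or 2), the function names, and the requirement that the proband's partner be homozygous reference are all assumptions made for illustration.

```python
from typing import Sequence


def is_candidate_dnm(father: int, mother: int, proband: int) -> bool:
    """Candidate de novo: proband heterozygous, both parents homozygous reference.

    Genotypes are coded as alternate-allele counts (0, 1, 2); this coding is
    an assumption of the sketch.
    """
    return father == 0 and mother == 0 and proband == 1


def supported_by_transmission(
    father: int,
    mother: int,
    proband: int,
    partner: int,
    offspring: Sequence[int],
) -> bool:
    """Return True if a candidate DNM in the proband is seen in the next generation.

    Hypothetical rule: the proband's partner must be homozygous reference, so
    any alternate allele in the offspring can only have come from the proband.
    """
    if not is_candidate_dnm(father, mother, proband):
        return False
    if partner != 0:
        # The allele in a child could have been inherited from the partner.
        return False
    return any(genotype >= 1 for genotype in offspring)


# Example: the allele is absent in both grandparents, present in the proband,
# and passed on to one of two children.
print(supported_by_transmission(0, 0, 1, 0, [0, 1]))
```

A sequencing artifact in the proband would not be expected to reappear in an independently sequenced child, which is why transmission is useful evidence that a call is real. A candidate that is not transmitted is simply uninformative under this rule, since each child inherits only one of the proband's two alleles.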


This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/
