Modeling Population Structure Under Hierarchical Dirichlet Processes

Elliott, Lloyd T.; De Iorio, Maria; Favaro, Stefano; Adhikari, Kaustubh and Teh, Yee Whye (2019). Modeling Population Structure Under Hierarchical Dirichlet Processes. Bayesian Analysis, 14(2) pp. 313–339.

Full text available as: PDF (Version of Record)
DOI (Digital Object Identifier) Link: https://doi.org/10.1214/17-BA1093

Abstract

We propose a Bayesian nonparametric model to infer population admixture, extending the hierarchical Dirichlet process to allow for correlation between loci due to linkage disequilibrium. Given multilocus genotype data from a sample of individuals, the proposed model allows inferring and classifying individuals as unadmixed or admixed, inferring the number of subpopulations ancestral to an admixed population and the population of origin of chromosomal regions. Our model does not assume any specific mutation process, and can be applied to most of the commonly used genetic markers. We present a Markov chain Monte Carlo (MCMC) algorithm to perform posterior inference from the model and we discuss some methods to summarize the MCMC output for the analysis of population admixture. Finally, we demonstrate the performance of the proposed model in a real application, using genetic data from the ectodysplasin-A receptor (EDAR) gene, which is considered to be ancestry-informative due to well-known variations in allele frequency as well as phenotypic effects across ancestry. The structure analysis of this dataset leads to the identification of a rare haplotype in Europeans. We also conduct a simulated experiment and show that our algorithm outperforms parametric methods.

Item Type: Journal Item
Copyright Holders: 2019 International Society for Bayesian Analysis
ISSN: 1936-0975
Keywords: admixture modelling; Bayesian nonparametrics; hierarchical Dirichlet process; linkage disequilibrium; population stratification; single nucleotide polymorphism data; MCMC algorithm
Academic Unit/School: Faculty of Science, Technology, Engineering and Mathematics (STEM) > Mathematics and Statistics; Faculty of Science, Technology, Engineering and Mathematics (STEM)
Item ID: 68445
Depositing User: Kaustubh Adhikari
Date Deposited: 20 Dec 2019 15:27
Last Modified: 06 Jan 2020 20:56
URI: http://oro.open.ac.uk/id/eprint/68445