The interesting research presented at the AAPA 2017 meeting may still be a long way from publication. The list of interesting studies now grows with SMBE 2017, which will be held at the end of this week in Austin, Texas. Some of the studies that I find interesting are listed below.
The most interesting result is the genome data from the Rampasasa population, whose individuals carry on average 48 Mb of Neandertal sequence. That is similar to the average individual in the Bismarck Archipelago (48.9 Mb) and roughly double the average in European populations (23 Mb), and well above the average in East Asian populations (28 Mb). Can we conclude that the Rampasasa population inherited as much Neandertal DNA as Melanesian populations did? Yes. According to Vernot and Akey (2016), Melanesian populations inherited on average 2.7% Neandertal DNA. However, this does not explain why the Rampasasa pygmies are relatively short compared with European or East Asian populations.
Another fact worth noting is that the Rampasasa population has an average height of 148 cm, with an average cranial capacity comparable to that of the Jebel Irhoud humans (1350 cc), who date to 315 thousand years ago. The facial morphology of the Rampasasa population is arguably not far from that of the Jebel Irhoud humans either, except for the forehead, which is close to modern (vertical, without a brow ridge). The Rampasasa population has a negative chin, a trait common in archaic humans and Homo erectus (if time permits, I will try to discuss the evolution of the human chin in a later post). This means it could not be classified as Homo sapiens if we follow the criteria for modern humans used in the analysis of the Jebel Irhoud humans. Of course, the Rampasasa population is biologically classified as Homo sapiens (based on the available uniparental markers). Morphologically, however, things are not as simple as currently assumed. Are they hybrids? They carry Denisovan sequence, on average 4.4 Mb (the average Melanesian individual carries 42.9 Mb), though not as much as Neandertal sequence. For more details, we will have to wait for the publication by Serena Tucci et al.
The genetic history of the Indonesian Pygmies of Flores
Serena Tucci et al.
Modern human pygmy populations are distributed globally, and their short stature is hypothesized to represent one aspect of a complex eco-geographic adaptation to rainforest or island environments. Although numerous genetic studies have been conducted on pygmies in Africa and Southeast Asia, to date, there have been no genome-scale analyses of the pygmy population living on the island of Flores, Indonesia. Intriguingly, this population lives in a village near the cave where remains of a small-bodied human species, Homo floresiensis, were recently found. Here, we describe whole-genome sequences (>40x) from 10 Flores pygmy individuals, as well as genome-wide SNP data from 35 individuals. The Flores genomes harbor on average 48 Mb and 4.4 Mb of Neandertal and Denisovan sequence, respectively. Height-associated loci identified in European populations are significantly differentiated in the Flores pygmies, who possess an excess of height-decreasing alleles and a deficiency of height-increasing alleles. This result is consistent with a hypothesis of polygenic selection acting on standing variation for reduced stature in Flores. Finally, we identify a strong signature of recent positive selection encompassing the FADS gene cluster on chromosome 11, encoding for fatty acid desaturases that regulate the metabolism of long-chain polyunsaturated fatty acids (LC-PUFA). Flores individuals are nearly fixed for an ancestral haplotype that is predicted to confer reduced capacity to synthesize LC-PUFA from plant-based precursors. Our results add to emerging evidence that the FADS region has been a recurrent target of selection in diverse human populations, possibly in response to changing diets.
Population genetics of the agricultural transition in Papua New Guinea
Anders Bergström et al.
Abstract: In the last 10ky, humans in different parts of the world have transitioned from hunter-gatherer to farming lifestyles, and genetic studies are increasingly indicating that the spread of farming, culture and languages during this period has primarily been driven by the spread of people and thus genes. Papua New Guinea (PNG) underwent its own independent agricultural transition, but it’s not known if the population genetic consequences here were similar. We investigated this using genome-wide array genotypes from 381 individuals across 85 language groups, and 39 whole-genome sequences. We find that population structure in the highlands region has mostly formed only in the last 10ky and is characterized by a striking genetic divide to lowland populations and major increases in effective population size, consistent with a reshaping of genetic structure following the adoption of farming here. However, PNG differs from other parts of the world by having very strong genetic differentiation between groups, with many FST values exceeding those between major populations within continents. Ancient DNA suggests that at least in the case of west Eurasia, the current genetic homogeneity has actually been established only in the last few thousand years. The independent history of PNG then demonstrates that an agricultural transition does not necessarily lead to such a collapse of population structure. PNG, with its 850 languages and immense cultural diversity, might thus better reflect the population genetic structures that would have characterized most human societies until the very recent past.
Expanded summary*: Papua New Guinea (PNG) represents a key region in human population history, containing some of the oldest evidence of human occupation outside of Africa dating back to ~50 kya and today being the linguistically most diverse place in the world with approximately 850 languages (more than 10% of the world’s total). It was also one of the handful of places in the world where humans developed agriculture and left behind the hunter-gatherer lifestyle. Genetic studies are increasingly indicating that PNG, and the whole continent of Sahul which also included Australia and Tasmania, has been isolated from the rest of world from the initial settlement until at least the last few thousand years. Its history therefore constitutes a second, independent ‘replicate’ of human evolution over ~50ky, allowing us to ask if the population genetic processes that unfolded here were similar to those in the rest of the world. In particular, there is an opportunity to ask if the transition from a hunter-gatherer to a farming lifestyle in PNG had the same effects on human population structure as it did elsewhere.
Most genetic studies in PNG to date have however been limited to small numbers of population samples and/or genetic markers. We have generated genotype array data (1.7 million markers) on 381 individuals from 85 different language groups from PNG, and whole genome sequences for 39 individuals. This represents the first large-scale study of the population genetic history of this part of the world.
We confirm the genetic independence of PNG, especially its interior highlands region, from the rest of the world. We find evidence for a population expansion in the highlands within the timeframe of the spread of agriculture, suggesting that similarly to other parts of the world, agriculture here spread through the movement of people, rather than just the spread of ideas. However, PNG differs in a major way from other parts of the world that have also undergone agricultural transitions, in that genetic differentiation is remarkably strong, with FST values exceeding those within e.g. all of Europe. This study thus demonstrates that while both Europe and PNG transitioned to agriculture, the former saw dramatic genetic homogenization while the latter did not. As such it is an important contribution to the emerging picture on the role of lifestyle and culture in shaping the evolutionary trajectories of human populations.
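To make the FST comparison above more concrete, here is a minimal sketch of how pairwise FST can be estimated from allele frequencies. It assumes Hudson's estimator computed as a ratio of sums across SNPs; the abstract does not say which estimator the authors used, and the function name `hudson_fst` and the toy frequencies are purely illustrative.

```python
def hudson_fst(p1, p2, n1, n2):
    """Hudson's FST between two populations (ratio of sums over SNPs).

    p1, p2: per-SNP allele frequencies in population 1 and 2 (assumed inputs).
    n1, n2: number of sampled chromosomes in each population (must be > 1).
    """
    num = 0.0  # between-population component, corrected for sample size
    den = 0.0  # probability that two alleles drawn from different populations differ
    for a, b in zip(p1, p2):
        num += (a - b) ** 2 - a * (1 - a) / (n1 - 1) - b * (1 - b) / (n2 - 1)
        den += a * (1 - b) + b * (1 - a)
    return num / den


# Toy example: two SNPs that are fixed for different alleles in the two groups.
print(hudson_fst([1.0, 1.0], [0.0, 0.0], n1=20, n2=20))  # 1.0
```

Values near 0 mean the two groups share similar allele frequencies, and values near 1 mean they are strongly differentiated. With small samples and identical frequencies the estimate can come out slightly negative, which is expected for this estimator.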
Genomic Insights into the Ancestry and Human Demography of Remote Polynesia
Alexander Ioannidis et al.
Abstract: Beginning some three thousand years ago, the settling of Polynesia represents the final chapter in the expansion of humans across the globe. Although settled relatively late in historical terms, with occupation of the most remote islands occurring as recently as one thousand years ago, many questions remain about the peopling of this vast oceanic region. These questions include the sequence of island settlement, the dates of settlement, and the role of more recent admixture events in creating the modern island populations. Using dense genome-wide array genotyping of 445 modern samples from across the Polynesian archipelago, we attempt to answer some of these outstanding questions. In particular we investigate patterns of local ancestry within individuals, as well as patterns of relatedness within and across islands, to help elucidate historical settlement patterns. The widely separated Polynesian islands provide a uniquely structured canvas on which to implement novel variants of ancestry de-convolution techniques. We will describe the application of those techniques to human populations ranging from Near Oceania to Easter Island. Our results demonstrate the important role that both recent and ancient admixture events have played in creating the diversity pattern of modern Polynesian island populations.
Simultaneous Estimates of Archaic Admixture and Ancient Population Sizes
Abstract: To estimate archaic admixture, one must control for the sizes and separation times of ancient populations. We describe a new method that provides simultaneous estimates of these parameters in complex models of population history. Preliminary results confirm several previous results, but indicate that (1) Papuans have more Neanderthal admixture and less Denisovan admixture than previously thought; and (2) the archaic populations that contributed genes to modern humans were much larger than previous estimates.
This work was supported by grant BCS-638840 from the National Science Foundation.
The genomic health of ancient hominins
Ali Berens et al.
Abstract: The genomes of ancient humans, Neandertals, and Denisovans contain many alleles that influence disease risks. Using genotypes at 3180 disease-associated loci, we estimated the disease burden of 147 ancient genomes. After correcting for missing data, genetic risk scores were generated for nine disease categories and the set of all combined diseases. These genetic risk scores were used to examine the effects of different types of subsistence, geography, and sample age on the genomic health of ancient individuals. On a broad scale, hereditary disease risks are similar for ancient hominins and modern-day humans, and the genomic health of ancient individuals spans the full range of what is observed in present day individuals. In addition, there is evidence that ancient pastoralists may have had healthier genomes than hunter-gatherers and agriculturalists. We also observed a temporal trend whereby genomes from the recent past are more likely to be healthier than genomes from the deep past. This calls into question the idea that modern lifestyles have caused genetic load to increase over time. Focusing on individual genomes, we find that the overall genomic health of the Altai Neandertal is worse than 97% of present day humans and that Ötzi the Tyrolean Iceman had a genetic predisposition to gastrointestinal and cardiovascular diseases. As demonstrated by this work, ancient genomes afford us new opportunities to diagnose past human health, which has previously been limited by the quality and completeness of remains.
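As a rough illustration of what a genetic risk score with a missing-data correction might look like, the sketch below sums weighted risk-allele counts and divides by the maximum score attainable at the loci that were actually genotyped. This is an assumption for illustration only: the abstract does not describe the authors' scoring or correction scheme, and the function name, weights and genotypes here are hypothetical.

```python
def genetic_risk_score(genotypes, weights):
    """Normalized risk score in [0, 1], skipping missing genotypes.

    genotypes: risk-allele counts per locus (0, 1, 2) or None when missing.
    weights:   hypothetical non-negative effect sizes, one per locus.
    Returns None when no locus with a positive weight was genotyped.
    """
    total = 0.0
    max_total = 0.0
    for g, w in zip(genotypes, weights):
        if g is None:  # missing data: this locus contributes nothing
            continue
        total += g * w
        max_total += 2 * w  # highest possible contribution at this locus
    if max_total == 0:
        return None
    return total / max_total


# Toy example: the second locus is missing in this low-coverage genome.
print(genetic_risk_score([2, None, 0], [1.0, 0.5, 1.0]))  # 0.5
```

Normalizing by the attainable maximum keeps scores comparable between genomes with different amounts of missing data, which matters for ancient samples where coverage varies widely.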
The effects of selection and demography on Neanderthal ancestry in modern humans
Martin Petr et al.
Abstract: Advances in ancient genomics have brought many insights into the evolutionary histories of anatomically modern humans (AMH) and Neanderthals, and we now know that all non-Africans today derive at least 1-2% of their ancestry from the Neanderthals. It is often suggested that purifying selection has acted against introgressed Neanderthal alleles, and such selection has been invoked to explain the depletion of introgression around genes and in conserved regions, large “deserts” depleted of Neanderthal introgression, and a decrease of Neanderthal ancestry over time observed from ancient and present-day AMH samples. It has been recently shown that such depletions in conserved regions are consistent with selection on weakly deleterious variants that had drifted to high frequencies in Neanderthals due to their small population size, but that were more efficiently selected against once they entered the larger AMH population. However, the extent and timing of this depletion are difficult to reproduce using standard models. In this study, we used population genetic simulations to fit a set of selection parameters and demographic models that can produce the dynamics of Neanderthal ancestry changes observed in early modern humans and present-day Europeans.
Deep Learning for Reference-Free Inference of Archaic Local Ancestry
Arun Durvasula & Sriram Sankararaman
Statistical analyses of genomic data from diverse human populations have demonstrated that modern human populations trace a small proportion of their genetic ancestry to archaic hominins such as Neanderthals and Denisovans. These analyses were enabled by the availability of archaic genome sequences. Several studies have suggested that archaic admixture has been common in the history of populations even though the ancestral archaic populations have not been identified. This observation motivates the problem of reference free archaic local ancestry inference, i.e., inferring segments of the genome that trace their ancestry to an archaic population even in the absence of reference archaic sequences.
Previous attempts at reference-free archaic local ancestry inference have relied on a limited number of features or summary statistics (such as S*), which have limited power and a high false positive rate. Recent advances in deep learning permit the learning of complex, non-linear features that can be useful in a number of inferential tasks.
Here, we present a deep neural network (DNN) for archaic local ancestry inference. The DNN learns a number of features from patterns of genetic variation across a number of human genomes that allow accurate inference of archaic local ancestry, with an overall accuracy of 93% and an Area Under the Receiver Operating Characteristic curve (AUROC) of 0.98. The baseline AUROC for S* is 0.77. Preliminary analyses of a sub-Saharan African population find that an average of 2.03% (SD: 0.38) of their genomes is labelled as archaic, in line with previous estimates.
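For readers unfamiliar with the AUROC figures quoted above (0.98 versus 0.77), here is a small sketch of how the metric can be computed. It uses the pairwise-ranking definition: the probability that a randomly chosen archaic segment receives a higher score than a randomly chosen non-archaic one, with ties counted as half. The scores and labels are made-up toy values, not data from the study.

```python
def auroc(scores, labels):
    """AUROC via pairwise comparison of positive and negative scores.

    scores: classifier outputs, higher meaning "more likely archaic".
    labels: 1 for truly archaic segments, 0 otherwise.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


# Toy example: one archaic segment is scored below one non-archaic segment.
print(auroc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0]))  # 0.75
```

An AUROC of 0.5 corresponds to random guessing and 1.0 to perfect ranking. The quadratic loop is fine for a toy example; a rank-based implementation would be preferable for genome-scale data.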
Identification of large structural variants in archaic hominins
Laurits Skov et al.
Abstract: Introgression of archaic variants into human populations is an already known phenomenon, with some variants even providing a selective advantage such as adaptation to living in high altitudes or haplotypes carrying alleles of genes involved in the immune-system.
However, analysis of introgressed variants is restricted to SNVs (single nucleotide variants) or small indels due to degradation of ancient DNA. Here we apply a novel k-mer based approach to genotype large indels found in modern humans in both Neanderthal and Denisova individuals. We test the method on present day modern humans with known genotypes (1000 genomes) and show that we find high concordance.
We genotype large indels in high-coverage data from the Altai Neanderthal, a newly sequenced Vindija Neanderthal and the Altai Denisova individual and identify > 1000 variants greater than 31 bp in each archaic individual that could not be found previously. We find that shared large indels are not evenly distributed across the genome, with the highest density of variants in the HLA region (Human leukocyte antigen region). We also find that regions of the genome share more structural variants with the Vindija Neanderthal than with the Altai Neanderthal or the Denisova.
Interpreting Human Genomic Regions Depleted of Archaic Hominin Ancestry
Aaron Wolf & Joshua Akey
Abstract: Recent studies have identified archaic sequences in modern human genomes that were inherited from archaic hominin ancestors, such as Neandertals and Denisovans. Strikingly, the distribution of archaic sequence in the modern human genome is heterogeneous, with some large regions depleted of it. Regions that are depleted of archaic sequence may represent loci where archaic sequence was strongly deleterious and rapidly purged from modern human populations. However, alternative mechanisms, such as stochastic loss of archaic sequences due to drift could also contribute to “archaic deserts”. To this end, we performed extensive coalescent simulations under a wide variety of demographic models. We find that modern humans are significantly more enriched for large depletions than expected under neutral models. We show that the largest regions depleted of archaic sequence differ from the rest of the genome in several key characteristics, such as being significantly enriched for genes expressed in regions of the brain and differing in their levels of sequence diversity. The largest region depleted of archaic sequence contains the FOXP2 gene, which is associated with speech and language and carries a regulatory change unique to modern humans. Finally, we leveraged large-scale functional genomics data sets to map putatively deleterious sites Neandertals carried in these regions that may have contributed to the generation of deserts. Understanding the formation and characteristics of regions depleted of archaic introgressed sequence in the modern human genome will help interpret how archaic admixture influenced human evolution and, possibly, what genes may play a role in unique human behaviors.
Expanded summary*: Anatomically modern humans overlapped in time and space with archaic humans like Neandertals and Denisovans. The recent sequencing of the Neandertal and Denisovan genomes has provided insights into human evolution, such as the finding that for individuals of non-African ancestry ~2% of their nuclear genome is Neandertal introgressed sequence and Melanesians carry an additional ~2-4% Denisovan sequence.
We identified introgressed archaic sequence in 503 European, 504 East Asian, and 27 Melanesian individuals using the S* pipeline and the Altai Neandertal and Altai Denisovan reference sequences. Strikingly, the distribution of surviving archaic introgressed sequence across the modern human genome was heterogeneous. While introgressed sequence appeared throughout the genome, several large regions were significantly depleted of it. We also found the overlap of those regions depleted of Neandertal and Denisovan sequence to be significantly greater than expected due to chance.
The size, consistency, and gene content of depleted regions suggest that they arise from common processes. Specifically, we hypothesize that these depletions are products of selection against archaic sequence at these loci. Alternatively, mechanisms such as genetic drift may be responsible for the formation of these features. In previous work, we have examined the probability of depletions of archaic sequence in the modern human genome. We simulated data using well-established models of human demographic history and found that the empirical data have a significantly greater proportion of large (>8Mb) depleted regions than was found in simulated data.
We have since tested further complex demographic models, varying a larger number of parameters in these models. We find that certain parameter sets and model structures are capable of capturing a portion of the empirical distribution. However, no models and no parameter sets examined thus far have been able to fully reproduce the distribution of depleted windows found in the empirical data. The results of these simulations suggest that the empirical data may represent a mixture distribution, and that the formation of these depleted regions is a complex process involving a combination of mechanisms including genetic drift and selection.
We have begun to characterize the largest regions depleted of archaic sequence using large-scale functional genomics data sets. We find that regions depleted of archaic sequence differ from the rest of the human genome in several key measures, such as being significantly enriched for genes expressed in regions of the brain, and differing in their levels of sequence diversity. The largest region depleted of archaic sequence contains the FOXP2 gene, which is associated with speech and language and carries a regulatory change unique to modern humans. As well, we have identified several loci in these regions that are enriched for fixed differences between human and Neandertal sequence. Some of these loci contain enhancers expressed in neuronal cell lines. We have also identified several genes that contain non-synonymous fixed differences between Neandertal and human with predicted damaging or deleterious effects.
These depletions of introgressed sequence are unexplored and uncharacterized phenomena that hold important insights into human evolution and the biological differences between modern and archaic humans. Studying these loci, identifying the mechanisms of their formation, and characterizing the features contained within, may be informative in addressing fundamental questions about the evolution of uniquely modern human traits. More generally, our understanding of how modern humans came to flourish in Europe and Asia while other archaic humans perished, remains incomplete. New genetic data and analyses can complement archaeological data, providing estimates of population size, diversity, structure, and migration. Analyzing the genomic remains of archaic-modern human admixture adds to the story of human history, demonstrating the complexity of the interactions between modern and archaic humans.
Patterns of deleterious variation within and between geographically diverse populations
Abstract: The deluge of genome-scale sequence data now available in geographically diverse populations has provided considerable insights into patterns of genomic variation within and between populations. However, a coherent narrative about the characteristics, patterns, and consequences of deleterious mutations among individuals that have experienced different demographic histories has yet to emerge and many questions about deleterious variation remain. Here, I will present new analyses of deleterious protein-coding and regulatory mutations in tens of thousands of geographically diverse individuals. We show that demographic history and natural selection both influence patterns of deleterious variation, often in complicated ways, but whether such differences in patterns of deleterious variation influence mutational load depends on the particular fitness model assumed. Moreover, using exome data from over 60,000 individuals we show a marked decrease in the average strength of selection acting on deleterious protein-coding variation over the past millennia. Finally, we show how archaic admixture influences the burden of deleterious mutations carried by individuals.
African ROH Drive Enrichment of Deleterious Alleles in a Sample of Admixed Individuals
Zachary Szpiech et al.
Abstract: Runs of homozygosity (ROH) are important genomic features that manifest when identical-by-descent haplotypes are inherited from parents. Their length distributions are informative about population history, and their genomic locations are useful for mapping recessive loci contributing to both Mendelian and complex disease risk. We have previously shown that ROH, and especially long ROH that are likely the result of recent parental relatedness, are enriched for homozygous deleterious coding variation in a worldwide sample of outbred individuals (Szpiech, et al. 2013). However, the distribution of ROH in admixed populations and their relationship to deleterious homozygous genotypes is understudied.
Here we analyze whole genome sequencing data from 1,484 individuals from African American, Puerto Rican, and Mexican American populations. These populations are three-way admixed between European, African, and Native American ancestries and provide an opportunity to study the distribution of deleterious alleles partitioned by local ancestry and ROH. We recapitulate previous findings that long ROH are enriched for deleterious variation genome-wide. Then, partitioning by local ancestry, we compare the proportion of deleterious homozygotes in ROH composed of single-ancestry haplotypes to the proportion of benign homozygotes in those ROH. We find that ROH falling in African ancestry tracts are enriched the most, followed by European and Native American ROH.
These results suggest that, while ROH on any haplotype background are associated with an inflation of deleterious homozygous variation, African haplotype backgrounds may play a particularly important role in the genetic architecture of complex diseases for admixed individuals, highlighting the need for further population genetic study of these populations.
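To illustrate what a run of homozygosity is in practice, the sketch below scans one individual's genotypes along a chromosome and reports stretches of consecutive homozygous calls. It is a deliberately simplified toy: the coding (0 and 2 homozygous, 1 heterozygous), the function name `find_roh` and the `min_snps` threshold are assumptions, and real ROH callers work with physical or genetic length and tolerate occasional heterozygous or missing calls.

```python
def find_roh(genotypes, min_snps=3):
    """Return (start, end) index pairs of homozygous runs, end inclusive.

    genotypes: per-SNP calls along one chromosome, coded 0/2 for homozygous
               and 1 for heterozygous (assumed coding).
    min_snps:  minimum number of consecutive homozygous SNPs to report.
    """
    runs = []
    start = None  # index where the current homozygous run began
    for i, g in enumerate(genotypes):
        if g in (0, 2):
            if start is None:
                start = i
        else:
            # a heterozygous call ends the current run
            if start is not None and i - start >= min_snps:
                runs.append((start, i - 1))
            start = None
    # close a run that extends to the end of the chromosome
    if start is not None and len(genotypes) - start >= min_snps:
        runs.append((start, len(genotypes) - 1))
    return runs


print(find_roh([0, 2, 2, 1, 0, 0, 1, 2, 2, 2, 2]))  # [(0, 2), (7, 10)]
```

Once runs are found, partitioning them by local ancestry as described in the abstract would amount to intersecting these intervals with ancestry tract intervals, which this sketch does not attempt.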
The Genetic Identity of the Bangande People “The secret ones”
Hiba Babiker & Russell Gray
Abstract: Understanding human evolutionary history and past demographic events is becoming ripe for interdisciplinary research that complements genetics with linguistic, anthropological and historical studies. Its importance stems from previous research findings that Africa is the birthplace of modern humans. However, details of human prehistory in Africa remain largely obscure owing to the complex histories of hundreds of distinct populations. Therefore, focusing on modern populations from Africa, where ancient DNA is not accessible, is key to answering big questions in the history of our species. This project explores the genetic identity of the Bangande people who speak the Bangime language (Bangime is a language isolate spoken in the extreme Northwest of the Bandiagara Escarpment in Central Eastern Mali). We also aim at finding matches/mismatches between genetics and linguistics. We investigate 250 individuals from 12 West African populations representing different ethnicities and linguistic affiliations. Samples are genotyped with the Axiom® GenomeWide Human Origins Array. Moreover, for the purpose of comparison, the final dataset combines publicly available datasets of African and non-African populations. Our analysis applies the most advanced statistical methods in population genetics. The outcomes of this project are critical for reconstructing African demographic history and also offer insights into the coevolution of languages and genes. It highlights the importance of interdisciplinary research in answering open questions in human history.
The Genetic History of Northern Europe
Alissa Mittnik et al.
Abstract: Recent genetic studies of ancient human genomes have revealed a complex population history of modern Europeans involving at least three major prehistoric migrations that were influenced by climatic conditions, the spread of technological and cultural innovations and possibly diseases. To what extent these dynamics also affected the very North of the European continent surrounding today’s Baltic Sea is less well understood.
Here we report novel genome-wide DNA data from 24 ancient North Europeans ranging from ~7,500 to 200 BCE spanning the transition from a mobile hunter-gatherer to a sedentary agricultural lifestyle, as well as the adoption of bronze metallurgy. We coanalyze our data with over 300 available ancient genomes and data from around 3,800 modern individuals to show that the settlement of Scandinavia occurred via a southern and a northern route, and that the first Scandinavian Neolithic farmers derive their ancestry from Anatolia 1000 years earlier than previously demonstrated. We reveal that the range of Western European Mesolithic hunter-gatherers extended to the east of the Baltic Sea, where this population persisted without gene-flow from Central European farmers until around 2,900 BCE when the arrival of steppe pastoralists introduced a major shift in economy and established wide-reaching networks of contact within Europe during the Late Neolithic and Bronze Age. These, together with continued gene-flow from the local hunter-gatherer population led to the genetic makeup of today’s Lithuanian populations, while additional admixture related to Siberian and East Asian populations is needed to explain modern Estonians’ genetic composition.
40,000-year-old individual from Asia provides insight into early population structure in Eurasia
Melinda Yang et al.
Abstract: To date, very few ancient genomic studies have been conducted in Asia. Genome-wide studies using ancient individuals from Europe have revealed complex ancestry and genetic structure in ancient populations that could not be observed studying only present-day populations, suggesting similar approaches may also aid in elucidating the demographic history in Asia. Here, we present genome-wide data for a 40,000-year-old individual from Tianyuan Cave near Beijing, China. We show that he is more related to present-day Asians than present-day and ancient Europeans. However, unlike present-day Asians, he shows potential relationships with some present-day South Americans and a 35,000-year-old European individual. Our results suggest that there was extensive population structure in Asia by 40,000 years ago that persisted over an extended period of time.
Reconstructing prehistoric African population structure and adaptation
Pontus Skoglund et al.
Abstract: The population genomic landscape of Africa prior to its transformation by expansions of farmers and pastoralists is poorly understood, partly due to poor ancient DNA preservation and partly due to the deep time scale of human population history on the continent. We assembled genome-wide data from ten sub-Saharan Africans who lived in the last 4,500 years, and show that one of the most deeply divergent present-day human lineages that is today found almost exclusively in people living in southern Africa, was in the past 2,000 years also present in populations much farther north in Malawi and the Zanzibar archipelago. These results highlight the existence of an ancient genetic cline stretched over thousands of kilometers along a south-north axis. By leveraging data from ancient African genomes without ancestry from more recent into-Africa migrations, we show that western Africans today may harbor ancestry from a lineage that separated from other modern human lineages earlier than any other, including the Khoe-San of southern Africa. Finally, we use the availability of time-stratified southern African genomes to document evidence of both selective sweeps and polygenic selection that might have conferred adaptations to desert environments.
Expanded summary*: Africa is the homeland of our species, and contains within it more human genetic diversity than the rest of the world combined. However, far less is known about the prehistory of Africa than the prehistory of other parts of the world, both because of the poor preservation of ancient DNA in Africa’s hot climate, and because of the disruptions of African population structure that occurred with the expansion of farming populations. Here we increase the amount of ancient DNA from Africa by a factor of 10 by taking advantage of recent advances for extracting DNA from ancient individuals. Using this first view of prehistoric African population structure, we provide evidence for a previously unknown hunter-gatherer population that once dominated East Africa, and the existence of an admixture gradient in which ancient East African foragers were in contact with southern African foragers as far north as Tanzania. In contrast, today such ancestry is restricted to the southern tip of Africa. We also show evidence that West Africans today harbor substantial ancestry from a lineage that split from other modern humans before the lineage currently viewed as oldest (the Khoe-San of southern Africa). Finally, we reveal recent natural selection in the Khoe-San of southern Africa today that may have provided key adaptations to life in the open Kalahari desert, including genes affecting response to radiation and taste receptor loci. These results provide the first view of prehistoric African population structure, and represent a first ancient genomic step into the deep past of humans in Africa.
The Role of Migration in Cultural Changes during the Chalcolithic period in the Levant
Eadaoin Harney et al.
Abstract: A major controversy is whether cultural change evident in the archaeological record is typically achieved through movements of people or cultural infiltration. The Chalcolithic period in the southern Levant (4th-5th millennium BCE) contains artifacts that are not detected in earlier archaeological sites of the region, yet have strong affinities to contemporary and earlier cultures from Anatolia and Iran. In order to test the hypothesis that the Chalcolithic culture of this region may have been formed through migration from the North, we analyzed new genome-wide ancient DNA data from 22 individuals from the Peqi’in cave site in Upper Galilee, Israel that are associated with the Late Chalcolithic culture of the southern Levant, thereby approximately doubling the number of samples with genome-wide ancient human DNA from the Levant. We report that these individuals derive approximately 58% of their ancestry from populations related to those of the local Levant Neolithic, approximately 17% from populations related to the Iran Chalcolithic, and approximately 25% from populations related to the Anatolian Neolithic, supporting the hypothesis that this population was formed in part by migration from the North. We show that population turnover continued after the Chalcolithic, as the population that the Peqi’in Cave group was a part of did not contribute to later Levantine populations from the Bronze Age, which had little or no Anatolian-related ancestry.
Ancient DNA from Two Pre-Columbian Mummies from Sierra Tarahumara
Viridiana Villa-Islas et al.
Abstract: The Tarahumara are an indigenous population also known as Rarámuri, who inhabits the Sierra Tarahumara, mainly in the state of Chihuahua, Mexico. This millenary group is recognized for their incredible physical endurance and ability to run long distances. Genomic studies of this population and its ancestors may give us insights into their population history. Here we report the retrieval and sequencing of aDNA from two ca. 900 year-old pre-Columbian mummies found in a cave from Sierra Tarahumara. We performed Whole-Genome-Capture on the aDNA libraries to enrich the amount of endogenous DNA and sequenced until we reached saturation. We obtained enough data to cover two-thirds of the genome for one of the mummies and less than 10% for the second. This allowed us to perform PCA to compare the ancient individuals with modern Native Mexican groups. Preliminary analyses show that the mummies cluster with different present-day populations, respectively. This opens up questions of historical and anthropological interest; specifically regarding past genetic structure and migration between different indigenous groups.
Expanded summary*: The Tarahumara are an indigenous population also known as Rarámuri, who inhabits the Sierra Tarahumara, mainly in the state of Chihuahua, Mexico. Long-distance running is a cultural practice that men, women and children have practiced for centuries through the rugged landscape of the Sierra Tarahumara. Motivated by the interest in investigating a possible genetic basis for this extraordinary capacity, and to learn more about their population history, we launched a genomic study of this population combining genomic information from past and present individuals.
Ancient DNA (aDNA) offers an unparalleled source of information to better understand the evolutionary processes that have generated the genetic diversity of today’s populations and to detect possible genes that could have been subject to selection in “real time”. To gain insights into the population history of the Rarámuri and to explore possible signatures of adaptive evolution, we increased the sequencing depth of two ancient genomes belonging to two ca. 900-year old pre-Columbian mummies initially reported in Raghavan et al, 2015. Both mummies were found in a cave from Sierra Tarahumara; their DNA was extracted, built into Illumina libraries and sequenced at low depth. We increased the coverage of their genomes by generating additional libraries and performing Whole-Genome Capture (WGC) to enrich their endogenous content.
We obtained enough data to cover two-thirds of the genome for one of the mummies and less than 10% for the second. This allowed us to determine the mitochondrial haplogroup of the two individuals as C, and C1c1a, respectively; both are typical mitochondrial haplogroups of Native Americans. Also, we performed PCA to compare the ancient individuals with a reference panel of modern Native Mexican groups. Interestingly, these preliminary analyses show that one mummy clusters closely with present-day Tarahumaras, while the second clusters with a geographically distant population. This result opens up questions of historical and anthropological interest; specifically regarding past genetic structure and migration between different indigenous groups and demands further and more detailed analyses of the genomic data in a population context.
In the next phase of the project we will increase the depth of coverage of the genomes and combine with knowledge generated from modern Rarámuri. We are in parallel characterizing functional variation in present-day Rarámuri and identifying targets of adaptive evolution. The combination of both sources of genetic information — ancient and modern — might help characterize the temporality of the variants associated with adaptive evolution, which might, or might not, be related to physical endurance.
This study, not only complements our knowledge on the genetic component of the Tarahumara population, but also contributes to a better understanding of the pre-Columbian Mexican native populations, which have been little studied from the point of view of genetic diversity. Few studies to date have focused on the paleogenomic study of ancient human samples in the Americas and none has characterized complete ancient genomes of samples from Mexico despite its rich historical and cultural heritage as reflected in the vast archaeological record. Consequently, paleogenomic studies in Mexico have a great potential.
Furthermore, another important contribution of this study is the implementation of the WGC enrichment method, which has been proposed to increase the endogenous DNA content of ancient human samples, yet has not been tested thoroughly. This work allows testing new parameters of the protocol and gaining further insights into its performance on mummified samples.
Complete mitochondrial genomes provide additional evidence on the geographical origin of the indigenous people of the Canary Islands
Abstract: Deciphering the geographic origin of the Canary Islands’ indigenous inhabitants has fascinated both scholars and the general public. Ancient DNA (aDNA) evidence, based on PCR techniques, has confirmed the presence of North African mitochondrial DNA (mtDNA) lineages in the indigenous people, including the North African U6 haplogroup. In fact, one striking result was the discovery of the U6b1a sub-haplogroup, which is exclusively observed in ancient and modern populations of the Canary Islands, and it is absent in North Africa. Classical aDNA techniques have provided valuable information, but results have been always hindered by the risk of modern contamination. Moreover, PCR-based analyses are limited to a small portion of the mtDNA genome and important information from the coding region is unavailable.
In this study, we apply for the first time next-generation sequencing to the recovery of whole mtDNA genomes of indigenous people of the Canary Islands (n=44). Most of the lineages observed in the ancient population of the Canary Islands belong to West Eurasian and North African haplogroups, confirming previous results. As expected from archaeological, anthropological and linguistic studies, the majority of indigenous mtDNA lineages are present in the Maghreb. Phylogenetic analysis indicates the presence of additional autochthonous lineages that mimic the distribution observed for U6b1a. Coalescence ages for those Canarian-autochthonous subhaplogroups are mostly in agreement with the colonization time proposed by radiocarbon dates and archaeological criteria. However, an older autochthonous lineage as U6b1a is unlikely to have developed in the Canary Islands based on currently available archaeological records.
Expanded summary*: The goal of this project is to apply paleogenomic techniques to the study of the Canary Islands’ prehistory for the first time. During the 13th-14th centuries, European sailors eagerly traveled the oceans searching for new worlds. The subsequent expansion of European colonies across the world, triggered the European dominance of the global economy, but also had important cultural and ecological consequences, because it brought together, for the first time, distant civilizations and environments. Portuguese sailors discovered several groups of islands in the Atlantic Ocean in the 13th century. Around this time, the Portuguese and Castilians began to settle the Atlantic archipelagos, including the Azores, Madeira and Cape Verde, but only the Canary Islands were found to be inhabited by an indigenous population, generally known as Guanches. During the 15th century, the Canary Islands were gradually conquered, directly or indirectly, by the Spanish kingdom of Castile, beginning with the island of Lanzarote in 1402 and finishing with Tenerife in 1496. In general terms, the Conquest was exceptionally violent, due in part to the fierce resistance of the indigenous people against the invaders. The crushing of the resistance, and the subsequent European colonization, had a great impact on the indigenous way of life. In spite of the indigenous protective policy of Queen Isabel ‘La Católica’, who legally abolished slavery on the Islands in 1498, a large number of Guanches were deported during and after the Conquest, and some of them were introduced into the 16th century European slave trade. Those that survived and stayed within the islands progressively mixed with the European colonizers, leading to the loss of indigenous culture and language.
Most archaeological, anthropological and linguistic researchers point to a North African origin for the Canary indigenous people, more precisely related to the proto-Berber and Berber world. Ancient DNA analyses on the Guanche population using classical PCR-based methods have confirmed the presence of North African lineages in the indigenous people, including different sublineages of the characteristic North African U6 haplogroup. One important result was the characterization of the U6b1a sub-haplogroup, which is exclusively observed in ancient and modern populations of the Canary Islands, and not in North Africa. However, due to the lack of samples from the eastern islands (Gran Canaria, Lanzarote and Fuerteventura) and to limitations associated with the use of only a small portion of the mtDNA genome, we were unable to identify a specific geographical origin for the Guanche people.
In this project, we used next-generation sequencing to generate complete mtDNA genomes from the indigenous population for the first time. We obtained high-coverage mtDNA genomes for 44 human remains excavated from 23 different archaeological sites distributed across the entire Canarian archipelago. Most of the lineages observed in the ancient population of the Canary Islands belong to West Eurasian (H, J and T) and North African (U6) haplogroups, confirming previous results using PCR techniques. As expected from archaeological, anthropological and linguistic studies, our results indicate that the first inhabitants of the Canary Islands are related to modern populations of North Africa. However, the absence or low frequency of some key haplogroups in North Africa indicates that the continental genetic composition has been modified by later human migrations. More strikingly, by using whole mtDNA genome sequences of indigenous samples, phylogenetic analysis identifies additional autochthonous lineages that mimic the distribution observed for U6b1a. Those Canarian-specific haplogroups are sublineages of the Eurasian H, J and T, and the African L3 macrohaplogroups. With this refined phylogeographic information we will be able to unequivocally assess the indigenous origin of maternal lineages observed in the modern Canarian population.
Apart from its clear significance for understanding the demographic history of the Guanche population, the results obtained in this project are also of paramount importance for Canarian society. The success of projects and companies providing ancestry information indicates how people crave knowledge about their origin. This is especially true for the Canary Islands. The indigenous population plays an important role in the identity of the modern inhabitants of the Canary Islands, who are very interested in knowing as much as possible about this part of their history. However, due to the mystical aura that has surrounded the Guanche population ever since the first European chroniclers started writing about them, misleading and pseudo-scientific information is sometimes fed to the public and accepted as fact. It is the responsibility of scientists to provide society with evidence and to help distinguish fact from myth. This project will allow us to keep answering those questions with state-of-the-art methods in the field.
A pre-existing isolation by distance gradient in West Eurasia may partly account for the observed “steppe” component in Europe
Luca Pagani et al.
Abstract: It has been proposed that modern European populations can be modelled, by and large, as a three-way mixture of Hunter-Gatherer, Anatolian Neolithic and Steppe components that took place after 6 kya (Haak et al. 2015, Allentoft et al. 2015). In particular, the pre-existing Hunter-Gatherers are thought to have admixed with incoming Early Neolithic people from Anatolia and, subsequently, with people carrying a “Steppe” component from the East. These people were likely bearers of the so-called Yamnaya and/or Corded-Ware cultures, and their initial impact on the European gene pool was estimated to be as high as 75% (Haak et al. 2015).
However, ancient DNA samples from East European and Caucasian Hunter-Gatherers as well as from the Early Iranian Neolithic, dating from before the Yamnaya expansion, already show signs of this so-called “Steppe” component (Lazaridis et al. 2016). Such an observation is compatible with the presence of a pre-existing genetic gradient ranging from the Caucasus/Iran all the way to Europe, which likely formed through isolation by distance over thousands of years.
Here we show that such a gradient, defined as a decrease of the “steppe” component with distance from Iran, can be inferred from ancient samples pre-dating the Yamnaya expansion (r² = 0.93). When analysed in the light of this gradient, later ancient and modern samples from Europe still display an excess of the Steppe component; however, this excess is less pronounced than previously estimated. Additionally we found that, of the analysed samples, modern South Asians show the highest excess of the “steppe” component, pointing to the documented, recent links between the Caucasus/Iran populations and the South Asian peninsula.
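As a rough illustration of the approach described above (not the authors' code), the sketch below fits a straight line of “steppe” fraction against distance from Iran using pre-Yamnaya samples only, and then measures how far a later sample sits above that line. The function names and all sample values are hypothetical placeholders, and ordinary least squares is an assumption; the abstract does not say how the gradient was fitted.

```python
def fit_gradient(distances_km, steppe_fractions):
    """Ordinary least-squares line through (distance, steppe fraction) points.

    Returns (slope, intercept, r_squared).
    """
    n = len(distances_km)
    mean_x = sum(distances_km) / n
    mean_y = sum(steppe_fractions) / n
    sxx = sum((x - mean_x) ** 2 for x in distances_km)
    syy = sum((y - mean_y) ** 2 for y in steppe_fractions)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(distances_km, steppe_fractions))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


def steppe_excess(distance_km, observed_fraction, slope, intercept):
    """Observed steppe fraction minus the value predicted by the gradient."""
    return observed_fraction - (intercept + slope * distance_km)


# Hypothetical pre-Yamnaya samples: (distance from Iran in km, steppe fraction).
pre_yamnaya = [(0, 0.60), (1000, 0.47), (2000, 0.29), (3000, 0.16)]
slope, intercept, r2 = fit_gradient([d for d, _ in pre_yamnaya],
                                    [f for _, f in pre_yamnaya])

# Hypothetical later European sample, 2500 km away with 45% steppe component.
print("r^2 of gradient:", round(r2, 3))
print("excess over gradient:", round(steppe_excess(2500, 0.45, slope, intercept), 3))
```

Under this reading, only the residual above the fitted line would be attributed to later migration, which is why the excess comes out smaller than an estimate that assumes no pre-existing gradient.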
7,000 years of change: Migration and admixture in the population history of the Caribbean
Maria A Nieves-Colon et al.
Abstract: Although the Caribbean has been continuously inhabited for the last 7,000 years, European contact in the last 500 years dramatically reshaped the cultural and genetic makeup of island populations. Several recent studies have explored the genetic diversity of Caribbean Latinos, and have characterized Native American variation present within their genomes. However, the difficulty of obtaining ancient DNA from pre-contact populations and the underrepresentation of non-Latino Caribbean islanders in genetic research, have prevented a complete understanding of genetic variation over time and space in the Caribbean basin. Here we discuss research that takes two approaches towards characterizing migration and admixture in Caribbean populations: an ancient DNA analysis of 139 individuals from three pre-contact archaeological sites in Puerto Rico (A.D. 500–1300), and an analysis of whole genome variants from 55 Afro-Caribbeans in five Lesser Antillean populations. Our ancient DNA analysis traces the origin and number of pre-contact migrations to Puerto Rico and examines the extent of genetic continuity between ancient and modern populations. In contrast, our modern DNA work analyzes autosomal SNP genotypes to characterize complex patterns of admixture since European contact among Lesser Antillean Afro-Caribbeans. Our findings characterize how ancient indigenous groups, European colonial regimes, the African Slave Trade and modern labor movements have shaped the genomic diversity of Caribbean islanders. In addition to its anthropological or historical importance, such knowledge is also essential for informing the identification of medically relevant genetic variation in these populations.
Expanded summary*: Characterizing how migration and admixture shape human genetic diversity is vital for understanding human evolution, history and health. This is especially true in world regions that have undergone recent and dramatic demographic shifts, such as the Caribbean. Previous research with admixed Caribbean populations has shown that many islanders retain genomic variation from pre-Columbian indigenous groups, but also carry signatures of more recent admixture events fostered by European colonization and the African Slave Trade. However, a complete understanding of human genomic diversity across the Caribbean region is hampered by sampling gaps of both past and present populations. Due to the difficulties of obtaining ancient DNA (aDNA) from the tropics, the genetic diversity of pre-Columbian Caribbean groups is not well characterized. Efforts have been made to address this problem by studying Native American fragments in the genomes of admixed islanders. But, because modern populations do not retain all the genomic diversity of ancient groups, this approach provides limited resolution for reconstructing ancient demographic events. Further, many Caribbean populations remain underrepresented in large catalogs of genomic variation. Except for Barbadian Afro-Caribbeans, recently included in 1000 Genomes Phase 3, genetics research on Lesser Antillean populations has been limited to uni-parental loci and low-density ancestry informative markers. The present research seeks to fill in these gaps through two approaches: an aDNA analysis of 139 individuals from three archaeological sites in Puerto Rico (A.D. 500–1300), and an analysis of genome-wide SNP variants from 55 Afro-Caribbeans in five Lesser Antillean (LA) populations.
The aDNA investigation characterizes patterns of migration and genetic admixture in pre-Columbian Puerto Rico, and examines the extent of genetic continuity between ancient groups and modern islanders. In-solution capture and next-generation sequencing were used to obtain ancient DNA from 139 human skeletal remains (dated between A.D. 500–1300), from the sites of Tibes (n=52), Paso del Indio (n=50) and Punta Candelero (n=37). Preliminary data obtained from 24 complete mitochondrial genomes (mean read depth: 9.8x) suggest that pre-Columbian communities in Puerto Rico share genetic affinity with several extant South American and Mesoamerican indigenous populations. We also find that most pre-Columbian mtDNA lineages are not present in the Americas today, except for one, which is found almost exclusively in modern Puerto Ricans. These data support an origins scenario of complex and continuous admixture for ancient Caribbean groups but also underscore the large effect that contact-era population declines had on indigenous communities. Autosomal genotypes currently being generated from these remains will further inform these issues.
The second part of our project analyzes autosomal SNP genotypes in 55 self-identified Afro-Caribbeans from St. Kitts (n=5), St. Lucia (n=10), St. Vincent (n=15), Grenada (n=6), and Trinidad (n=19). We characterize patterns of genome-wide variation and ancestry in these individuals and compare them to existing data from other recently admixed American populations. We observe a complex pattern of admixture among the LA Afro-Caribbeans with inputs from up to five continental sources and strong signatures of sex-biased mating. African ancestry proportions are high, but Native American ancestry is extremely low. This pattern contrasts sharply with that observed in Caribbean Latinos and is more similar to that observed in Haitians and Barbadians. We further observe that Trinidadian Afro-Caribbeans have the highest proportion of admixture with East and South Asian populations of all Caribbean populations studied to date.
Overall, our findings underscore the large impact of post-contact demographic shifts on Caribbean population history and illustrate how genomic diversity has changed in this region over the last 7,000 years. In addition, this work increases the representation of admixed and diverse populations in available genomic datasets and has the potential to inform future functional and clinical genetics research with admixed Caribbean islanders.
Genome wide data from the Iron Age provides insights into the population history of Finland.
Thiseas Christos Lamnidis et al.
Abstract: The population history of Finland is the subject of an ongoing debate, in particular with respect to the relationship and origins of modern Finnish and Saami people. Here we analyse genome-wide data, extracted from three teeth found in the archaeological site of Levänluhta, in southern Ostrobothnia. The site dates back to the Iron Age between 550-800 AD, according to the artefacts recovered, while radiocarbon dates on scattered femurs from the site span 350-730 AD. When analysed together with previously published ancient European samples and with modern European populations, the ancient Finnish samples lack a genetic component found in early Neolithic Farmers and all modern European populations today. Instead, we find that they are more closely related to modern Siberian and East Asian populations than modern Finnish people are, a pattern also observed in genetic data from modern Saami. Our results suggest that the ancestral Saami population 1500 years ago inhabited a larger region than today, extending as far south as Levänluhta. Such a scenario is also supported by linguistic evidence suggesting that Saami languages were spoken across most of Finland before 1000 AD. We also observe genetic differences between modern Saami and our ancient samples, which are likely to have arisen due to admixture with Finnish people during the last 1500 years.
Rapid Evolution of Lighter Skin Pigmentation in Southern Africa
Meng Lin et al.
Abstract: Skin pigmentation is under strong directional selection, with lighter skin in northern European and East Asian populations, and darker skin in equatorial populations. However, selection on skin color and its mechanisms have only rarely been elucidated in studies of other populations worldwide. KhoeSan populations in far southern Africa, who are among the earliest diverged human populations, possess lightened skin pigmentation. We sequenced pigmentation genes to high coverage in over 400 KhoeSan individuals and demonstrate that a canonical skin pigmentation gene, SLC24A5, experienced recent adaptive evolution in the KhoeSan. The functionally causative skin lightening allele is present at a high frequency of 24% in the KhoeSan, after controlling for the recent European gene flow. The effect size of the allele is slightly larger than the mean pigmentation difference between Europeans and East Asians, explaining 11.9% of the variance in pigmentation in the KhoeSan. Haplotype analysis indicates that the derived haplotypes in these populations are identical to those fixed in Europeans. Using a hidden Markov model, we estimate the age of the ancestral haplotype carrying the derived allele in KhoeSan to be 18 kya [12 kya – 42 kya], somewhat older than the age of the allele in Europeans at 13 kya [6 kya – 41 kya]. We hypothesize that the allele was only introduced into the KhoeSan within the past 3,000 years, likely by pastoralists moving from eastern Africa to southern Africa while retaining non-African admixture. We test this hypothesis using an approximate Bayesian computation (ABC) approach, incorporating demographic models and selection. The SLC24A5 locus is a rare example of strong, ongoing, parallel adaptation adopted through gene flow in recent human history. 
We demonstrate a novel strategy of tracing the selection on both the genotype and corresponding phenotype, by modeling the signal from the genetic association and its selection through demographic history.
Expanded summary*: Human skin pigmentation is among the most notably diverse phenotypes across populations. It’s also one of the most strongly selected phenotypes in recent human adaptive history, mirroring the migration paths to different latitudes where ultraviolet radiation (UVR) varies. To date, the genetic basis and evolution of skin pigmentation have been primarily studied in light-skinned northern Europeans and East Asians, mostly discovered through strong signals in selection scans. The evolutionary mechanism of skin color in other parts of the world remains a huge mystery.
In this study, we explore a rapid adaptation scenario of a skin lightening allele with large effect in the KhoeSan population of far southern Africa. Among the earliest diverged human populations, the KhoeSan possess relatively light skin as compared to their Bantu-speaking neighbors. Through our association study in two KhoeSan communities, we found a large effect variant in the canonical pigmentation gene SLC24A5 that lightens skin pigmentation by 4 melanin units, and explains 12% of the phenotypic variance. This non-synonymous variant is present at a high allele frequency of 38% in our cohort, which cannot be explained by the proportion of recent gene flow from Europeans. After targeted sequencing of this region at high coverage in 430 individuals from this cohort, we demonstrate that the derived haplotypes in KhoeSan are identical to those fixed in Europeans, forming a strong starburst pattern in the network. The estimated times of origin of the derived allele in the two populations overlap with each other (18 kya [12 kya – 42 kya] in KhoeSan vs. 13 kya [6 kya – 41 kya] in Europeans). The possible introduction of this allele to KhoeSan is about 2–3 kya via gene flow from eastern African pastoralists, who carried non-African admixture. We test this hypothesis using an approximate Bayesian computation (ABC) approach, incorporating demographic models and selection.
The broader implications and significance include that 1) our finding shows a rare case of strong selection on an allele introduced through introgression in human history; 2) methodologically, we demonstrate a novel strategy of tracing the selection on both the genotype and corresponding phenotype, by modeling the signal from the genetic association and its selection through demographic history; 3) as an example of exploring selection in an understudied population, our finding enriches the understanding of the story of convergent evolution of light skin pigmentation.
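To see how the three figures quoted in this summary (allele frequency 38%, effect of 4 melanin units, 12% of variance explained) relate to one another, here is a minimal worked sketch. It assumes a purely additive per-allele effect and Hardy-Weinberg genotype proportions, under which a biallelic locus contributes 2p(1 − p)β² to the trait variance. Those assumptions, and the function names, are mine for illustration; the summary does not state the model the authors used.

```python
def additive_locus_variance(allele_freq, effect_per_allele):
    """Trait variance contributed by one biallelic locus.

    Assumes additive effects and Hardy-Weinberg proportions:
    variance = 2 * p * (1 - p) * beta^2.
    """
    return 2.0 * allele_freq * (1.0 - allele_freq) * effect_per_allele ** 2


def implied_total_variance(locus_variance, fraction_explained):
    """Total phenotypic variance implied by a locus explaining a given fraction."""
    return locus_variance / fraction_explained


# Figures taken from the summary above.
p = 0.38          # frequency of the skin-lightening allele in the cohort
beta = 4.0        # melanin units per allele (per-allele reading is an assumption)
explained = 0.12  # fraction of phenotypic variance explained

locus_var = additive_locus_variance(p, beta)
total_var = implied_total_variance(locus_var, explained)
print("variance from locus:", locus_var)
print("implied total variance:", total_var)
print("implied phenotypic SD:", total_var ** 0.5)
```

Under these assumptions the locus contributes roughly 7.5 squared melanin units, so a 12% share would imply a phenotypic standard deviation of about 8 melanin units. That figure is only a consistency check derived from the quoted numbers, not a value reported by the authors.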
Dissecting Historical Changes of Selective Pressures in the Evolution of Human Pigmentation
Xin Huang et al.
Abstract: Human pigmentation is a highly diverse trait among populations, and has drawn particular attention from both academic and non-academic investigators for thousands of years. To explain the diversity of human pigmentation, researchers have proposed that human pigmentation is adapted for ultraviolet radiation (UVR) and driven by natural selection. Although studies have detected signals of natural selection in several human pigmentation genes, none have quantitatively investigated the historical selective pressures on pigmentation genes during different epochs and thoroughly compared the differences in selective pressures between different populations. In the present study, we developed a new method to dissect historical changes of selective pressures in a multiple population model by summarizing selective pressures on multiple genes. We collected genotypes of 16 critical genes in human pigmentation from 15 public datasets, and obtained data for 3399 individuals of five representative populations worldwide. Our results suggest (1) that a significant historical increase of selective pressure on light pigmentation was shared by all non-Africans at the early stage of the out-of-Africa event (1.78 × 10⁻² per generation); (2) that diversifying selection, rather than the relaxation of selective pressures, is the cause of light pigmentation in low UVR areas; and (3) that epistasis plays important roles in the evolution of human pigmentation.
Expanded summary*: Human pigmentation – the color of the skin, hair, and eye – is one of the most diverse traits among populations. Its obvious diversity has attracted particular attention from both academic and non-academic investigators for thousands of years, as noted by Charles Darwin one century ago and as noticed by ancient Egyptians more than 4000 years ago. Why human pigmentation diverges, however, remains a central puzzle in human biology. Human pigmentation may be adapted for UVR and driven by natural selection. Natural selection may favor dark skin for effectively absorbing sunlight, and light skin for efficiently producing vitamin D. Dark skin may protect individuals against sunburn and skin cancer in low latitude areas with high UVR, while light skin may prevent rickets in infants in high latitude areas with low UVR. A better understanding of how natural selection shapes the diversity of human pigmentation could provide relevant and beneficial information for public health.
During the last 10 years, studies have applied methods to detect signals of natural selection in several human pigmentation genes. These genes encode different proteins, such as signal regulators, possible enhancers, important enzymes, and putative exchangers. Although previous studies have been devoted to understanding the evolution of separate pigmentation genes, fewer studies have examined how multiple genes contributed to the evolution of human pigmentation. Moreover, none have quantitatively investigated the historical selective pressures of pigmentation genes during different epochs, and thoroughly compared the differences of selective pressures between different populations.
In the present study, we developed a new method to dissect historical changes of selective pressures for different periods of human evolution. Our results showed not only independent selective pressures in Europeans and Asians, but also a significant historical increase of selective pressure on light pigmentation in all non-Africans at the early stage of the out-of-Africa event. Further, our results excluded the relaxation of selective pressures, and favored diversifying selection as a single explanation for the cause of light pigmentation in low UVR areas, a long-standing puzzle in the evolution of human pigmentation. Finally, our results indicate epistasis plays important roles in the evolution of human pigmentation, partially explaining diversifying selection on human pigmentation among populations.
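For a sense of scale, the sketch below treats the selective pressure of 1.78 × 10⁻² per generation quoted in the abstract as a simple genic selection coefficient s. That interpretation is an assumption made for illustration only; the authors' multi-population method is not described in enough detail here to reproduce. Under genic selection the odds p/(1 − p) of the favored allele are multiplied by (1 + s) every generation, which gives both a one-step recursion and a closed form for the time needed to move between two frequencies. All function names are hypothetical.

```python
import math


def next_frequency(p, s):
    """Allele frequency after one generation of genic selection with advantage s."""
    return p * (1.0 + s) / (1.0 + p * s)


def log_odds(p):
    """Natural log of the odds p / (1 - p)."""
    return math.log(p / (1.0 - p))


def generations_to_reach(p_start, p_end, s):
    """Generations for the favored allele to go from p_start to p_end.

    Uses the fact that the log-odds rise by log(1 + s) each generation.
    """
    return (log_odds(p_end) - log_odds(p_start)) / math.log(1.0 + s)


s = 1.78e-2  # value quoted in the abstract, read here as a genic selection coefficient
print("generations from 5% to 95%:", round(generations_to_reach(0.05, 0.95, s)))
```

With a coefficient of this size, a sweep from rare to common takes on the order of a few hundred generations in this deterministic model, which ignores drift, dominance, and the epistasis the authors highlight.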
New regions underlying pigmentation of skin, hair, and iris using quantitative phenotypic traits in individuals of European ancestry
Frida Lona-Durazo et al.
Abstract: A moderate number of genes have been associated with human pigmentary traits (skin, hair and iris color) using genome-wide association studies (GWAS). Most of these GWAS have been carried out in European populations. However, the majority of these efforts have relied on qualitative assessments of eye and hair color, which fail to capture the underlying quantitative distribution of these traits. We performed a GWAS of pigmentary traits in 575 individuals of European ancestry. All traits were evaluated with quantitative methods: skin and hair pigmentation were measured with a reflectometer, and eye color was measured from high-resolution photographs using the CIELab color space. The samples were genotyped with Illumina’s Multi-Ethnic Global Array (MEGA), and untyped genotypes were imputed using as a reference the Phase 3 samples of the 1000 Genomes project. We identified signals within well-established genes associated with pigmentation of skin, hair, or iris, such as IRF4, OCA2, and HERC2. We also identified new regions associated with these phenotypes. For skin pigmentation, we identified genome-wide signals (p < 5 × 10⁻⁸) within or near the genes WASF1/CDC40 and EGFR. For hair pigmentation, genome-wide signals were identified within or near the genes PRKAA2, FHIT, MATN2, KCNT1 and DENND5B. For eye color, we observed a very strong signal within the gene HERC2, which has been extensively associated with blue iris color. This region also shows genome-wide significance for central heterochromia. We are currently carrying out replication analyses in independent European samples to confirm the genomic regions identified in our study.
Adaptation to nine thousand years of diet in Asia
Srilakshmi Raj et al.
Abstract: Agriculture and domestication created dietary changes that influenced human genomic variation. Shifts creating the strongest impacts have been explored extensively. The process of agriculture and domestication, however, was complex and spread over several thousands of years across different global origins. We hypothesize that the diet shifts introduced by plant and animal domestication must have rendered a greater impact on the human genome than previously explored, and these impacts varied among human populations. We combined detailed archaeological evidence of domestication with modern-day dietary information to reconstruct high-resolution models of diet across Asian populations for the last 9000 years, incorporating mode of subsistence, food preparation techniques, time since domestication of the crop, micro- and macronutrient composition, and percent of diet. We used three Bayesian methods, BayeScan, BayeScEnv, and BayPass, to identify correlations between common genetic variants in 29 modern-day Asian populations and 9 dietary variables. We found an enrichment for genetic pathways associated with salivary gland morphology, insulin secretion, taste perception, olfaction and kidney development in the top 1% of gene regions correlated with our dietary variables. Many of these genes and gene families have not been previously reported to be under selection, especially due to diet. We present a case for archaeobotanical evidence as a powerful tool for understanding how historical human niche construction influenced modern human genetic variation. As our knowledge of the timing and spread of agricultural domestication increases, we can use similar techniques to more accurately measure the impact of subtle environmental changes on the human genome.
Expanded summary*: One of the most dramatic animal domestication events occurred some 8000 years ago when humans domesticated themselves to a diet of domesticated plants and animals. The major changes in the genome resulting from the shift from a hunter-gatherer lifestyle to an agrarian one suggest that 1) dramatic and sharp lifestyle changes resulted in equally dramatic changes in the genome, and 2) many of these changes resulted specifically from sharp dietary changes. Milk-drinking cultures have a markedly higher frequency of LCT polymorphisms compared with non-milk-drinking cultures, and ADH1B polymorphism frequencies correlate with time of rice domestication [1,2]. Studies of ancient human DNA from 8000 years ago also suggest that European genomes may have undergone selection for loci involved in decreased height, in vitamin D levels, in fatty acid metabolism, in pigmentation and in hair thickness over this time period [3].
These and other studies on the relationship between domestication, diet and human evolution have used what is known about current dietary habits and lifestyles of populations, and focused on dramatic changes that influenced the genome in selective sweeps. In contrast to these studies, I treated diet as a dynamic variable that changed in composition and through time, using high-resolution archaeological data to construct the model. I worked with a nutritionist to identify the optimal way to combine the dietary variables in terms of glycemic index and glycemic load. I also quantified diet through its macronutrient and micronutrient composition, to understand how dietary changes influenced nutrition. I also used three different, but complementary, statistical techniques to measure the correlation between these subtle environmental changes and modern genetic variation.
Construction of the dietary model and application to genetic data was exhaustive and required a truly interdisciplinary collaboration and approach. This work showed correlations between long-term dietary habits and gene ontologies associated with salivary gland development, insulin signaling, kidney development, and sensory perception. Many of these associations were found in pathways that have never before been suggested to evolve due to domestication. This work also demonstrates that archaeological data can be successfully combined with nutritional and genomic data to better understand how humans evolved in a changing environment.
Ohana: detecting selection in multiple populations by modelling ancestral admixture components
Jade Cheng & Rasmus Nielsen
One of the most powerful and commonly used methods for detecting local adaptation in the genome is the identification of extreme allele frequency differences between populations. In this paper we present a new maximum likelihood method for finding regions under positive selection. The method is based on a Gaussian approximation to allele frequency changes, which allows it to account for interbreeding between populations and retains high power when the test populations are admixed. It can also simultaneously and efficiently compare multiple populations. We evaluate the method using simulated data and compare it to methods using summary statistics. We also apply it to a human genomic data set and identify loci with extreme genetic differentiation between major geographic groups. Most of the genes identified are previously known selected loci relating to hair pigmentation and morphology, skin and eye pigmentation, including the top two genes: EDAR and SLC25A5. In contrast to previous genomic scans, we include data from Aboriginal Australians, which provide us with additional power to detect selection specific to East Asians. Using this method, we can identify new candidate loci – like CASC15, involved in melanoma suppression – and narrow down the likely causal SNPs in previously reported candidate regions – like KCNB2, involved in various neurological functions.
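The "extreme allele frequency differences" that this abstract starts from are commonly screened with a per-SNP summary statistic such as FST. The sketch below is not the Ohana likelihood method; it is only an illustration of the summary-statistic baseline, using Hudson's FST estimator for two populations. The SNP ids, frequencies and sample sizes are hypothetical.

```python
def hudson_fst(p1, n1, p2, n2):
    """Hudson's per-SNP FST estimator for two populations.

    p1, p2: sample allele frequencies; n1, n2: number of sampled chromosomes.
    Returns NaN when the denominator is zero (allele fixed in both samples).
    """
    numerator = (
        (p1 - p2) ** 2
        - p1 * (1 - p1) / (n1 - 1)
        - p2 * (1 - p2) / (n2 - 1)
    )
    denominator = p1 * (1 - p2) + p2 * (1 - p1)
    return numerator / denominator if denominator > 0 else float("nan")


# Hypothetical allele frequencies in two populations, 200 chromosomes each.
snps = {"snp_x": (0.90, 0.10), "snp_y": (0.50, 0.45), "snp_z": (0.20, 0.25)}
ranked = sorted(
    ((snp, hudson_fst(p1, 200, p2, 200)) for snp, (p1, p2) in snps.items()),
    key=lambda item: item[1],
    reverse=True,
)
for snp, fst in ranked:
    print(f"{snp}\t{fst:.4f}")
```

A scan of this kind ranks SNPs by differentiation and inspects the extreme tail. Unlike the model-based approach the abstract describes, it compares only two populations at a time and does not account for admixture.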
Human X and Y chromosome co-evolution, ampliconic gene evolution, selective sweeps and speciation
Elise Lucotte et al.
Abstract: The X chromosome is disproportionately involved in speciation in humans and other great apes. We recently reported that the X chromosome has been the target of independent very strong selective sweeps in several great ape species targeting overlapping regions. These regions associate with the location of multi-copy, testis-expressed genes (so-called ampliconic genes) and also with genomic deserts of Neanderthal introgression into humans from interbreeding around 50,000 years ago. This suggests that these regions contain reproductive incompatibilities between human and Neanderthal, possibly due to the ampliconic genes. We speculated that competition between X and Y in male meioses, i.e. meiotic drive, by these ampliconic genes and their non-homologous counterparts on the Y chromosome is responsible for these sweeps, and that such drive may be a major contributor to speciation. We present results on the variation in ampliconic gene copy number within and among human populations based on a new mapping approach of short read sequences from the Simons Genome Diversity Project and the Danish Pan-Genome project. We report extensive variation in ampliconic gene number for 7 X-linked and 7 Y-linked ampliconic regions and find that this variation is geographically structured around the globe. For the Y chromosome, many duplications and deletions of ampliconic genes occur recurrently among different haplogroups. We relate this variation to our inference of very strong X-linked selective sweeps targeting specific human populations in order to identify potential drivers. Finally, we present preliminary results on ampliconic gene expression through male meiosis studied from micro-dissection of testes, and how this expression relates to the copy number of ampliconic genes and the ratio of X and Y chromosomes in spermatozoa.
The immune system was a major target of natural selection during the European Neolithic
Abstract: The European Neolithic, starting around 6,400 BCE, marked a dramatic change in terms of both population and lifestyle. Over several thousand years hunter-gatherer populations merged with migrating farmers who spread their settled agricultural lifestyle across the continent. Improvements in the efficiency of ancient DNA (aDNA) sequencing mean that aDNA from this period can be used, not only to track population movements, but also to detect natural selection – revealing how these populations adapted to changes in environment, diet, and social organization.
We analyze data generated in the Reich lab from more than 300 individuals who lived between 10,000 and 1,500 BCE and show that the immune system, along with diet- and pigmentation-related traits, was a major target of natural selection. Several genes involved in pathogen response were targeted, including haplotypes at the OAS and TLR gene clusters that had originally introgressed from Neanderthals. We detect at least six independent targets of selection at the Major Histocompatibility Complex (MHC) and show that, in general, both the MHC and immune-associated loci are significantly enriched for evidence of selection. Finally, we show that higher MHC diversity in farmers compared to hunter-gatherers is proportional to genome-wide diversity, which argues against selection for increased diversity.
This project demonstrates the power of aDNA for learning about selection, but also reveals its limitation. In particular, it is difficult to determine what specific factors are driving the selective events that we observe. Linking human and pathogen aDNA with archaeological and isotopic data provides a promising path for future work.