Eurogenes Blog: R1b


Tuesday, November 1, 2022

The story of R-V1636

Who wants to bet against this map? Keep in mind that ART038 (~3000 calBCE) remains the oldest sample with the V1636 and R1b Y-chromosome mutations in the West Asian ancient DNA record. Ergo, there's nothing to suggest that V1636 or R1b entered Eastern Europe from West Asia.

Wednesday, October 27, 2021

Local origins of the earliest Tarim Basin mummies (Zhang et al. 2021)

Over at Nature at this LINK. It's nice to see yet another huge surprise courtesy of ancient DNA. Please note that most of the ancients from this paper are already in the Global25 datasheets. Here's the abstract:

The identity of the earliest inhabitants of Xinjiang, in the heart of Inner Asia, and the languages that they spoke have long been debated and remain contentious [1]. Here we present genomic data from 5 individuals dating to around 3000–2800 BC from the Dzungarian Basin and 13 individuals dating to around 2100–1700 BC from the Tarim Basin, representing the earliest yet discovered human remains from North and South Xinjiang, respectively. We find that the Early Bronze Age Dzungarian individuals exhibit a predominantly Afanasievo ancestry with an additional local contribution, and the Early–Middle Bronze Age Tarim individuals contain only a local ancestry. The Tarim individuals from the site of Xiaohe further exhibit strong evidence of milk proteins in their dental calculus, indicating a reliance on dairy pastoralism at the site since its founding. Our results do not support previous hypotheses for the origin of the Tarim mummies, who were argued to be Proto-Tocharian-speaking pastoralists descended from the Afanasievo [1,2] or to have originated among the Bactria–Margiana Archaeological Complex [3] or Inner Asian Mountain Corridor cultures [4]. Instead, although Tocharian may have been plausibly introduced to the Dzungarian Basin by Afanasievo migrants during the Early Bronze Age, we find that the earliest Tarim Basin cultures appear to have arisen from a genetically isolated local population that adopted neighbouring pastoralist and agriculturalist practices, which allowed them to settle and thrive along the shifting riverine oases of the Taklamakan Desert.

Zhang, F., Ning, C., Scott, A. et al. The genomic origins of the Bronze Age Tarim Basin mummies. Nature (2021). https://doi.org/10.1038/s41586-021-04052-7

See also...

How the Shirenzigou nomads became Proto-Tocharians

Wednesday, January 27, 2021

The great shift

Here's a Principal Component Analysis (PCA) featuring some of the ancients from the recent Saag et al. paper at Science Advances. To see an interactive version of the plot paste the Global25 coordinates here into the relevant field here.

Note that the Fatyanovo culture agropastoralists, who are rich in Y-haplogroup R1a and steppe ancestry, cluster with present-day Eastern Europeans. On the other hand, the Volosovo culture singleton sits near the European hunter-gatherer cline that no longer exists.

This Volosovo individual belongs to Y-haplogroup Q1a. However, most of the Volosovo males whose genomes are soon to be published belong to Y-haplogroup R1b.

Thus, in much of Eastern Europe during the Bronze Age, agropastoralists rich in R1a and steppe ancestry replaced hunter-gatherers rich in R1b and with no steppe ancestry. Of course, that's not where the story ends, but I'll get back to that later this year.

By the way, the relatively high coverage Fatyanovo Y-chromosome sequences are being analyzed at YFull. You can check out the results here.


Sunday, January 17, 2021

That old chestnut: Northeast vs Northwest Euros

In the last comment thread reader Greg put forth this question:

David, when are you going to explain the genetic discrepancy between Northeastern and Northwestern Europeans? You know, the one that people believe is due to Baltic Hunter-Gatherer admixture, whereas you believe it is due to genetic drift? You ought to make a post about this issue at some point, because a lot of people are wondering what's causing the differences.

Well, Greg, this issue has been discussed to the proverbial death here and elsewhere. In fact, there were two posts and rather lengthy comment threads on the same topic at this blog just a few months ago. See here and here.

Nevertheless, it seems that a fair number of people are still befuddled, so I'm going to try to explain this one last time, as briefly as I can, using just a handful of f4-stats.

Admittedly, Northeast Europeans generally do pack higher levels of indigenous European hunter-gatherer ancestry than Northwest Europeans. This is especially true of Balts, who show more of this type of ancestry than even Scandinavians in practically every type of analysis.

The f4-stats below back this up unambiguously. Note the significantly positive (>3) Z scores, which suggest that Latvians and Lithuanians harbor more Baltic hunter-gatherer-related ancestry than Norwegians and Swedes.

Chimp Baltic_HG Norwegian Latvian 0.001301 7.114
Chimp Baltic_HG Swedish Latvian 0.001017 4.205
Chimp Baltic_HG Norwegian Lithuanian 0.001023 7.341
Chimp Baltic_HG Swedish Lithuanian 0.000763 3.408
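For anyone unsure what the sign of these statistics encodes, here's a minimal Python sketch of an f4-stat computed from allele frequencies. The frequencies are made-up toy values, not real data, and the function name `f4` is just for illustration. The standard errors behind the Z scores need a resampling step (typically a block jackknife), which this sketch leaves out.

```python
from statistics import mean

def f4(a, b, c, d):
    """Average of (a - b) * (c - d) across SNPs, given per-SNP allele frequencies."""
    return mean((ai - bi) * (ci - di) for ai, bi, ci, di in zip(a, b, c, d))

# Toy allele frequencies at four SNPs (hypothetical values, for illustration only).
outgroup = [0.0, 0.0, 1.0, 1.0]  # plays the role of Chimp
hg       = [0.8, 0.6, 0.2, 0.4]  # plays the role of Baltic_HG
pop_c    = [0.3, 0.3, 0.7, 0.6]  # plays the role of Norwegian
pop_d    = [0.5, 0.4, 0.5, 0.5]  # plays the role of Latvian

stat = f4(outgroup, hg, pop_c, pop_d)
print(f"f4 = {stat:.4f}")
```

In this toy setup the fourth population sits closer to the hunter-gatherer frequencies than the third does, so the statistic comes out positive. Swapping the last two populations flips the sign.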

Greg, I know what you're thinking: the naysayers are right! But wait, because there's a twist to this tale. Check out these f4-stats:

Chimp Baltic_HG Norwegian Belarusian 0.000265 1.934
Chimp Baltic_HG Swedish Belarusian 0.000152 0.7
Chimp Baltic_HG Norwegian Polish 6.4E-05 0.519
Chimp Baltic_HG Swedish Polish -0.000235 -1.074

Please note, Greg, that none of the Z scores reach significance, which means that these Northwest Europeans and Slavs are symmetrically related to Baltic_HG. They're also symmetrically related to other relevant ancient groups such as the Yamnaya steppe herders. This, of course, suggests that they harbor very similar levels of basically the same ancient genetic components.

Chimp Karelia_HG Norwegian Belarusian 0.000136 0.844
Chimp Karelia_HG Swedish Belarusian 7.9E-05 0.32
Chimp Karelia_HG Norwegian Polish -4.7E-05 -0.304
Chimp Karelia_HG Swedish Polish -0.000134 -0.54

Chimp Yamnaya_Samara Norwegian Belarusian -0.000134 -1.085
Chimp Yamnaya_Samara Swedish Belarusian -6.6E-05 -0.34
Chimp Yamnaya_Samara Norwegian Polish -0.000225 -1.995
Chimp Yamnaya_Samara Swedish Polish -0.000311 -1.574

Chimp Barcin_N Norwegian Belarusian -0.000335 -2.809
Chimp Barcin_N Swedish Belarusian -0.000284 -1.491
Chimp Barcin_N Norwegian Polish -0.000222 -2.057
Chimp Barcin_N Swedish Polish -0.000318 -1.662

Chimp Baikal_N Norwegian Belarusian 0.000186 1.3
Chimp Baikal_N Swedish Belarusian -7E-05 -0.33
Chimp Baikal_N Norwegian Polish -4.6E-05 -0.351
Chimp Baikal_N Swedish Polish -0.000477 -2.277
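If you'd like to sort rows like these yourself, here's a small Python sketch that parses the six-column layout used above and flags anything with |Z| > 3. The helper names are hypothetical, and the rows are a handful copied from the lists above.

```python
def parse_f4_rows(text):
    """Parse whitespace-separated rows: pop1 pop2 pop3 pop4 f4 Z."""
    rows = []
    for line in text.strip().splitlines():
        parts = line.split()
        if len(parts) != 6:
            continue  # skip blank or malformed lines
        *pops, f4_value, z = parts
        rows.append((tuple(pops), float(f4_value), float(z)))
    return rows

def is_significant(z, threshold=3.0):
    """True when the absolute Z score exceeds the threshold."""
    return abs(z) > threshold

ROWS = """
Chimp Baltic_HG Norwegian Latvian 0.001301 7.114
Chimp Baltic_HG Swedish Lithuanian 0.000763 3.408
Chimp Baltic_HG Norwegian Belarusian 0.000265 1.934
Chimp Baltic_HG Swedish Polish -0.000235 -1.074
Chimp Barcin_N Norwegian Belarusian -0.000335 -2.809
"""

for pops, f4_value, z in parse_f4_rows(ROWS):
    verdict = "significant" if is_significant(z) else "consistent with symmetry"
    print(" ".join(pops), f"f4={f4_value:+.6f} Z={z:+.3f} -> {verdict}")
```

Only the two Baltic rows clear the threshold. The Belarusian and Polish comparisons don't.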

Interestingly, pairing up Ukrainians with English samples from Cornwall and Kent produces similar outcomes. But that's because most ancient ancestry proportions in Europe show a closer correlation with latitude than longitude.

Chimp Baltic_HG English_Cornwall Ukrainian 0.000282 2.242
Chimp Baltic_HG English_Kent Ukrainian 0.000225 1.748

Chimp Karelia_HG English_Cornwall Ukrainian 0.000323 2.175
Chimp Karelia_HG English_Kent Ukrainian 0.000239 1.634

Chimp Yamnaya_Samara English_Cornwall Ukrainian -6.6E-05 -0.569
Chimp Yamnaya_Samara English_Kent Ukrainian -0.000112 -0.977

Chimp Barcin_N English_Cornwall Ukrainian -0.000519 -4.641
Chimp Barcin_N English_Kent Ukrainian -0.000598 -5.232

Chimp Baikal_N English_Cornwall Ukrainian 0.000385 2.874
Chimp Baikal_N English_Kent Ukrainian 0.00036 2.836

Now, Greg, if at least in terms of genetic ancestry, Latvians, Lithuanians, Belarusians, Poles and Ukrainians all qualify as Northeast Europeans, then what makes them different, as a group, from Northwest Europeans? Do you believe that the key factor is admixture from Baltic hunter-gatherers? Or is it genetic drift?

Of course, considering all of the f4-stats above, logic dictates that it must be relatively recent genetic drift.

Keep in mind, however, that this only applies to Balto-Slavic speaking Northeast Europeans without significant Uralian ancestry. Overall, Uralic speakers have a more complex population history, and indeed genetic differences between them and Northwest Europeans are in large part due to somewhat different ancestry proportions and also Siberian admixture.

See also...

So who's the most (indigenous) European of us all?

Tuesday, September 8, 2020

Warriors from at least two different populations fought in the Tollense Valley battle

I can't get the genotype data from the Burger et al. paper. The lead authors, Joachim Burger and Daniel Wegmann, aren't replying to my emails.

But they were gracious enough to release the BAM files for each of their samples, and these files can be converted to genotype data. So I've included ten of the Tollense Valley warriors (DEU_Tollense_BA) in the Global25 datasheets (see here).

The claim in the paper that these warriors "represent an unstructured population" is absolutely false and extremely naive.

Below are a couple of Principal Component Analysis (PCA) plots produced with Vahaduo Global25 views. The samples are labeled according to their Y-chromosome haplogroups. To see interactive versions of the same plots, paste the Global25 coordinates from the text file here into the relevant fields here.

These warriors are not a single unstructured population, because they cover too much ground in the above plots for that to be possible. It's clear to me that they represent at least two different groups from Central Europe and surrounds.

Of course, this would be a lot easier to work out if Burger et al. cared to supply more information about each of the warriors, such as their attire, weapons, circumstances of death, and so on. It's a complete mystery to me why this wasn't included in the paper, and the authors are refusing to talk to me, so it's unlikely that I'll ever be able to get it from them.

In the absence of such crucial archeological and anthropological data, I don't want to speculate too much, and get overly creative, but here are a couple of possible scenarios to explain the ancient DNA results:

- this may have been a battle between two Central European armies, one rich in Y-haplogroup R1b and the other rich in Y-haplogroup I2a, as well as their allies or hired help, including warriors from Eastern Europe belonging to Y-haplogroup R1a

- or perhaps it was an invasion from the east by warriors rich in Y-haplogroup R1a, and it was a success, with the local armies, rich in Y-haplogroups R1b and I2a, losing the battle and suffering most of the casualties.

I'm sure that one day someone will attempt to undertake a decent multidisciplinary study of this epic battle, and we'll at least have a rough idea about what happened. Or not.

Citation...

Burger et al., Low Prevalence of Lactase Persistence in Bronze Age Europe Indicates Ongoing Strong Selection over the Last 3,000 Years, Current Biology, Available online 3 September 2020, https://doi.org/10.1016/j.cub.2020.08.033

See also...

Genetic and linguistic structure across space and time in Northern Europe

Sunday, September 6, 2020

Low prevalence of lactase persistence in Bronze Age Europe (Burger et al. 2020)

Over at Current Biology at this LINK. Unfortunately, this is the long-awaited Tollense Valley battle paper. Despite the obvious presence of some very interesting genetic substructures among the Tollense Valley warriors (see here), the authors have the audacity to claim that these individuals represent a "single unstructured Central/Northern European population".

One of the warriors, labeled WEZ56, belongs to Y-haplogroup R1a and shows an exceedingly Balto-Slavic-like genome-wide genetic structure. But none of this is even mentioned in passing in the paper. Indeed, according to Burger et al., WEZ56 is best classified as belonging to R1, even though the R1a classification is quite secure based on the raw data that the authors posted online.

Be extremely wary of what you read in this paper, and anything else that these scientists have published in the past and will publish in the future. Below is the paper summary:

Lactase persistence (LP), the continued expression of lactase into adulthood, is the most strongly selected single gene trait over the last 10,000 years in multiple human populations. It has been posited that the primary allele causing LP among Eurasians, rs4988235-A [1], only rose to appreciable frequencies during the Bronze and Iron Ages [2, 3], long after humans started consuming milk from domesticated animals. This rapid rise has been attributed to an influx of people from the Pontic-Caspian steppe that began around 5,000 years ago [4, 5]. We investigate the spatiotemporal spread of LP through an analysis of 14 warriors from the Tollense Bronze Age battlefield in northern Germany (∼3,200 before present, BP), the oldest large-scale conflict site north of the Alps. Genetic data indicate that these individuals represent a single unstructured Central/Northern European population. We complemented these data with genotypes of 18 individuals from the Bronze Age site Mokrin in Serbia (∼4,100 to ∼3,700 BP) and 37 individuals from Eastern Europe and the Pontic-Caspian Steppe region, predating both Bronze Age sites (∼5,980 to ∼3,980 BP). We infer low LP in all three regions, i.e., in northern Germany and South-eastern and Eastern Europe, suggesting that the surge of rs4988235 in Central and Northern Europe was unlikely caused by Steppe expansions. We estimate a selection coefficient of 0.06 and conclude that the selection was ongoing in various parts of Europe over the last 3,000 years.

Burger et al., Low Prevalence of Lactase Persistence in Bronze Age Europe Indicates Ongoing Strong Selection over the Last 3,000 Years, Current Biology, Available online 3 September 2020, https://doi.org/10.1016/j.cub.2020.08.033

See also...

Warriors from at least two different populations fought in the Tollense Valley battle

Monday, September 2, 2019

Commoner or elite?

I recently started looking at the correlations between Y-chromosome haplogroups and social standing in ancient Europe, and was surprised by what I learned about the five currently sampled prehistoric Scandinavians belonging to Y-haplogroup R1b. I certainly wasn't expecting to uncover these stories about a mass human sacrifice, a bog body, and an Arctic Circle warrior:

- The earliest Scandinavian in the ancient DNA record belonging to R1b comes from a grave site in what is now northern Norway (VK531, Margaryan et al. 2019). This individual has a genome-wide profile similar to that of local Mesolithic hunter-gatherers, but is dated to just ~2,400 BCE. During this time, Scandinavia was dominated by a "new" population associated with the Battle-Axe culture (BAC), with high levels of ancestry from the steppes of Eastern Europe. Since VK531 wasn't buried with any BAC grave goods, and indeed with no grave goods at all, it's possible that he may have been from a remnant forager population that was displaced and ultimately forced into extinction.

- R1b-U106 is today by far the most common R1b subclade in Scandinavia, but it's not yet clear how it managed to attain this status. Was it perhaps through elite dominance? The earliest ancient individual belonging to R1b-U106 is dated to 2275-2032 calBCE and comes from a Late Neolithic, likely post-BAC burial ground in what is now Sweden (RISE98, Lilla Beddinge, grave 49, southern skeleton, Allentoft et al. 2015). However, RISE98 wasn't buried in any way that would suggest he was an individual of high social standing. In fact, he was found in a mass grave, along with two other adults and two infants, possibly representing a human sacrifice. The only artefact in the grave was a bone needle. More details are available here.

- During the Nordic Bronze Age it became customary for Scandinavian elites to be laid to rest in richly furnished barrows, while commoners were buried in flat graves with few or no offerings. Human remains recovered from a "commoner" flat grave cemetery dated to the Early Bronze Age near the present-day city of Aalborg, northern Denmark, included the skeleton of a male belonging to Y-haplogroup R1b-M269 (RISE47, grave 3, skeleton 8, Allentoft et al. 2015). Keep in mind, however, that this might have been another case of an ancient Scandinavian R1b-U106 if not for missing data. A flint dagger was found alongside one of the skeletons in this cemetery, but RISE47 wasn't accompanied by any grave goods (see here).

- One of the most amazing archeological discoveries made in Scandinavia is the Trundholm Sun Chariot. Found in a peat bog on the island of Zealand, Denmark, in 1902, it's thought to be an Indo-European religious artefact dating back to the Nordic Bronze Age; a representation of a horse pulling the sun and perhaps also the moon in a spoked wheel chariot. Another important discovery in a peat bog near Trundholm dating to the Nordic Bronze Age was the body of a man belonging to R1b-M269 (RISE276, Trundholm mose II, bog find 1940, Allentoft et al. 2015). However, chances are slim that RISE276 was a charioteer or, say, a spiritual guru who accidentally drowned in the bog. Most Danish bog bodies are thought to have belonged to sacrificial victims or executed criminals.

- Interestingly, the earliest likely Scandinavian warrior belonging to R1b, and also R1b-U106, is from an early Iron Age burial in present-day northwestern Norway (VK418, Margaryan et al. 2019). This site isn't quite as far north as the grave of the above-mentioned VK531, but it's still well within the Arctic Circle. Apparently, VK418 was buried with some impressive weapons, potentially of "eastern origin", including a shield, spearheads and a sword. Who knows, he may even have been an elite warrior for his time and place?

The other two main Scandinavian Y-haplogroups, I1a and R1a, haven't yet been found in prehistoric Nordic remains from such, shall we say, depressing burials. That's not to say, of course, that they won't be sooner or later. RISE175, from Allentoft et al. 2015, is currently the only individual who fits the bill as a representative of the Nordic Bronze Age elite. He was buried in a barrow grave in what is now southwest Sweden and probably belongs to Y-haplogroup I1a. That's not much to go on, but perhaps it's a sign of things to come?

Saturday, August 17, 2019

A surprising twist to the Shirenzigou nomads story

Remember those potentially Afanasievo-derived and Tocharian-related Shirenzigou nomads from the Ning et al. paper? Well, in my opinion, they're probably neither. The genotypes and other data for these Iron Age individuals from the eastern Tian Shan are available here.

Below are a few successful and not so successful qpAdm mixture models for them. Note that I tried to use a wide range of relevant "right pops", but also retain a lot of markers, specifically to be able to discriminate between different types of steppe and steppe-derived sources of gene flow (refer to the full output). Admittedly, the Shirenzigou nomads can be modeled with Afanasievo-related ancestry, but...

CHN_Shirenzigou_IA
KAZ_Botai 0.161±0.023
KAZ_Wusun 0.490±0.023
NPL_Mebrak_2125BP 0.349±0.019
chisq 5.793
tail prob 0.926172
Full output

CHN_Shirenzigou_IA
KAZ_Botai 0.143±0.022
NPL_Mebrak_2125BP 0.295±0.019
Saka_Tian_Shan 0.562±0.024
chisq 6.796
tail prob 0.870794
Full output

CHN_Shirenzigou_IA
KAZ_Botai 0.185±0.023
NPL_Mebrak_2125BP 0.428±0.021
RUS_Sintashta_MLBA 0.270±0.026
TJK_Sarazm_En 0.117±0.027
chisq 11.351
tail prob 0.414345
Full output

CHN_Shirenzigou_IA
KAZ_Botai 0.032±0.027
KAZ_Zevakinskiy_LBA 0.567±0.025
NPL_Mebrak_2125BP 0.401±0.019
chisq 15.157
tail prob 0.232961
Full output

CHN_Shirenzigou_IA
NPL_Mebrak_2125BP 0.452±0.031
RUS_Afanasievo 0.435±0.025
RUS_Okunevo_BA 0.114±0.049
chisq 19.808
tail prob 0.0708003
Full output

CHN_Shirenzigou_IA
NPL_Mebrak_2125BP 0.409±0.031
RUS_Okunevo_BA 0.173±0.050
Yamnaya_RUS_Caucasus 0.418±0.026
chisq 20.453
tail prob 0.0589872
Full output

CHN_Shirenzigou_IA
NPL_Mebrak_2125BP 0.464±0.033
RUS_Okunevo_BA 0.104±0.053
Yamnaya_RUS_Samara 0.432±0.027
chisq 27.189
tail prob 0.0072566
Full output
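Here's a rough Python sketch of how output like this can be screened. It ranks models by tail probability and flags any source whose weight is within two standard errors of zero. The 0.05 cut-off and the two-standard-error rule are common conventions that I'm assuming here, not thresholds taken from the models above, and the short labels and helper names are hypothetical.

```python
ALPHA = 0.05  # assumed cut-off; a common convention, not a value stated in the post

# (label, chisq, tail_prob, {source: (weight, standard_error)}), copied from four of the models above
MODELS = [
    ("Wusun", 5.793, 0.926172, {
        "KAZ_Botai": (0.161, 0.023),
        "KAZ_Wusun": (0.490, 0.023),
        "NPL_Mebrak_2125BP": (0.349, 0.019),
    }),
    ("Zevakinskiy", 15.157, 0.232961, {
        "KAZ_Botai": (0.032, 0.027),
        "KAZ_Zevakinskiy_LBA": (0.567, 0.025),
        "NPL_Mebrak_2125BP": (0.401, 0.019),
    }),
    ("Afanasievo", 19.808, 0.0708003, {
        "NPL_Mebrak_2125BP": (0.452, 0.031),
        "RUS_Afanasievo": (0.435, 0.025),
        "RUS_Okunevo_BA": (0.114, 0.049),
    }),
    ("Yamnaya_Samara", 27.189, 0.0072566, {
        "NPL_Mebrak_2125BP": (0.464, 0.033),
        "RUS_Okunevo_BA": (0.104, 0.053),
        "Yamnaya_RUS_Samara": (0.432, 0.027),
    }),
]

def passes(tail_prob, alpha=ALPHA):
    """True when the model isn't rejected at the assumed cut-off."""
    return tail_prob >= alpha

def weak_sources(sources, n_se=2.0):
    """Sources whose weight is less than n_se standard errors above zero."""
    return [name for name, (w, se) in sources.items() if w < n_se * se]

for label, chisq, tail, sources in sorted(MODELS, key=lambda m: m[2], reverse=True):
    status = "passes" if passes(tail) else "fails"
    weak = weak_sources(sources) or ["none"]
    print(f"{label:<15} chisq={chisq:<7} tail={tail:<10} {status:<6} weak: {', '.join(weak)}")
```

On these numbers, the Yamnaya_Samara model is the only one of the four that fails the assumed cut-off. KAZ_Botai in the Zevakinskiy model and RUS_Okunevo_BA in the Yamnaya_Samara model are the two sources within two standard errors of zero.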

Both the Wusun and Saka are generally accepted to have been speakers of Indo-Iranian languages. So it's possible that the Shirenzigou nomads were Indo-Iranian speakers too, or at least derived from such peoples.

Surprisingly, NPL_Mebrak_2125BP was the key to obtaining the best statistical fits. This is a trio of samples, roughly contemporaneous with the Shirenzigou nomads, from a burial site high up in the Himalayas in what is now Nepal (see here).

To be honest, I'm not quite sure why the Himalayan ancients work so well in my models. Perhaps they're just a really good proxy for an Iron Age population from the northern edge of the Tibetan Plateau?

By the way, most of the Shirenzigou nomads made it into the latest Global25 datasheets (see here). They can be analyzed in a variety of ways described in this blog post: Getting the most out of the Global25. Below is a screen cap of me comparing the effectiveness of Afanasievo, Sintashta and Wusun samples as proxies for the steppe ancestry in the Shirenzigou nomads with an online tool freely available here. As expected, the algorithm picks Sintashta ahead of Afanasievo, and the Wusun ahead of both.

Friday, August 2, 2019

The PIE homeland controversy: August 2019 status report

Archeologist David Anthony has a new paper on the Indo-European homeland debate titled Archaeology, Genetics, and Language in the Steppes: A Comment on Bomhard. It's part of a series of articles dealing with Allan R. Bomhard's "Caucasian substrate hypothesis" in the latest edition of The Journal of Indo-European Studies. It's also available, without any restrictions, here.

Any thoughts? Feel free to share them in the comments below. Admittedly, I found this part somewhat puzzling (emphasis is mine):

It was the faint trace of WHG, perhaps 3% of whole Yamnaya genomes, that identified this admixture as coming from Europe, not the Caucasus, according to Wang et al. (2018). Colleagues in David Reich’s lab commented that this small fraction of WHG ancestry could have come from many different geographic places and populations.

I think that's highly optimistic. It really should be obvious by now thanks to archeological and ancient genomic data, including both uniparental and genome-wide variants, that the Yamnaya people were practically entirely derived from Eneolithic populations native to the Pontic-Caspian (PC) steppe. So, in all likelihood, this was also the source of their minor WHG ancestry.

Indeed, they clearly weren't some mishmash of geographically, culturally and genetically disparate groups that had just arrived in Eastern Europe, but the direct descendants of closely related and already significantly Yamnaya-like peoples associated with long-standing PC steppe archeological cultures such as Khvalynsk and Sredny Stog. I discussed this earlier this year, soon after the Wang et al. paper was published:

On Maykop ancestry in Yamnaya

I hope I'm wrong, but I get the feeling that the scientists at the Reich Lab are finding this difficult to accept, because it doesn't gel with their theory that archaic Proto-Indo-European (PIE) wasn't spoken on the PC steppe, but rather south of the Caucasus, and that late or rather nuclear PIE was introduced into the PC steppe by migrants from the Maykop culture who were somehow involved in the formation of the Yamnaya horizon.

Inexplicably, after citing Wang et al. on multiple occasions and arguing against any significant gene flow between Maykop and Yamnaya groups, Anthony fails to mention Steppe Maykop. But the Steppe Maykop people are an awesome argument against the idea that there was anything more than occasional mating between the Maykop and Yamnaya populations, because they were wedged between them, and yet clearly distinct from both, with a surprisingly high ratio of West Siberian forager-related ancestry (see here and here).

Despite all the talk lately about the potential cultural, linguistic and genetic ties between Maykop and Yamnaya, including claims that the latter possibly acquired its wagons from the former, my view is that the Steppe Maykop and Yamnaya wagon drivers may have competed with each other and eventually clashed in a big way. Indeed, take a look at what happens after Yamnaya burials rather suddenly replace those of Steppe Maykop just north of the Caucasus around 3,000 BCE.

Yamnaya_RUS_Caucasus
RUS_Progress_En_PG2001 0.808±0.058
RUS_Steppe_Maykop 0.000
UKR_Sredny_Stog_II_En_I6561 0.192±0.058
chisq 13.859
tail prob 0.383882
Full output

Yep, total population replacement with no significant gene flow between the two groups. As far as I can tell, there's not even a hint that a few Steppe Maykop stragglers were incorporated into the ranks of the newcomers. Where did they go? Hard to say for now. Maybe they ran for the hills nearby?

Intriguingly, Anthony reveals a few details about new samples from three different Eneolithic steppe burial sites associated with the Khvalynsk culture:

The Reich lab now has whole-genome aDNA data from more than 30 individuals from three Eneolithic cemeteries in the Volga steppes between the cities of Saratov and Samara (Khlopkov Bugor, Khvalynsk, and Ekaterinovka), all dated around the middle of the fifth millennium BC.

...

Most of the males belonged to Y-chromosome haplogroup R1b1a, like almost all Yamnaya males, but Khvalynsk also had some minority Y-chromosome haplogroups (R1a, Q1a, J, I2a2) that do not appear or appear only rarely (I2a2) in Yamnaya graves.

As far as I can tell, he suggests that they'll be published in the forthcoming Narasimhan et al. paper. If so, it sounds like the paper will have many more ancient samples than its early preprint that was posted at bioRxiv last year.

For me the really fascinating thing about these new samples is how scarce Y-haplogroup R1a appears to have been everywhere before the expansion by the putative Indo-European-speaking steppe ancestors of the Corded Ware culture (CWC) people. It's basically always outnumbered by other haplogroups wherever it's found prior to about 3,000 BCE, even on the PC steppe. But then, suddenly, its R1a-M417 subclade goes BOOM! And that's why I call it...

The beast among Y-haplogroups

At this stage, I'm not sure how to interpret the presence of Y-haplogroup J in the Khvalynsk population. It may or may not be important to the PIE homeland debate. Keep in mind that J is present in two foragers from Karelia and Popovo, northern Russia, dated to the Mesolithic period and with no obvious foreign ancestry. So it need not have arrived north of the Caspian as late as the Eneolithic with migrants rich in southern ancestry from the Caucasus or what is now Iran. In other words, for the time being, the steppe PIE homeland theory appears safe.

Update 20/12/2019: A note on Steppe Maykop

See also...

Is Yamnaya overrated?

The PIE homeland controversy: January 2019 status report

Late PIE ground zero now obvious; location of PIE homeland still uncertain, but...

Sunday, July 28, 2019

They mixed up Huns with Tocharians

I don't yet have the genomes from the recent Ning et al. paper on the Iron Age nomads from the Shirenzigou site in the eastern Tian Shan. But I do have most of the previously published data featured in the paper, including the Damgaard et al. 2018 Hun and Saka samples from the western Tian Shan.

After reading between the lines of the Ning et al. paper and running a few analyses of my own, it's clear to me that most of the supposedly Tocharian-related Shirenzigou individuals actually share a very close relationship with the Tian Shan Huns, and indeed may have been their ancestors.

For instance, Ning et al. found that a large part of the ancestry of the Shirenzigou ancients could be modeled with the Tian Shan Huns, which was an anachronistic approach because the former are older than the latter. They also found that Ulchi-related ancestry was a key part of the genetic structure of eight out of the ten Shirenzigou individuals, and this likewise appears to be an important part of the genetic structure of the Tian Shan Huns.

Note the strong statistical fits in the Global25/nMonte and qpAdm mixture models below, respectively, which characterize these Huns as a two-way mixture between the Ulchi and the earlier Tian Shan Saka. And keep in mind that the Saka also harbor significant Ulchi-related ancestry.

Hun_Tian_Shan
Saka_Tian_Shan,92
Ulchi,8
distance%=1.2553

Hun_Tian_Shan
Saka_Tian_Shan 0.928±0.009
Ulchi 0.072±0.009
chisq 4.409
tail prob 0.992464
Full output
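To give a feel for what a two-way distance-based fit like the first model above is doing, here's a minimal Python sketch that finds the mixture weight minimising the Euclidean distance between a target and a blend of two sources. This is a plain closed-form least-squares fit, not nMonte's own algorithm, and the three-dimensional coordinates are made-up toy values rather than real Global25 data.

```python
def two_source_fit(target, src_a, src_b):
    """Return (weight_a, distance) minimising the Euclidean distance between
    target and weight_a * src_a + (1 - weight_a) * src_b, with 0 <= weight_a <= 1."""
    diff = [a - b for a, b in zip(src_a, src_b)]
    denom = sum(x * x for x in diff)
    if denom == 0:
        w = 0.5  # identical sources: every weight fits equally well
    else:
        num = sum((t - b) * x for t, b, x in zip(target, src_b, diff))
        w = min(1.0, max(0.0, num / denom))  # clamp to a valid proportion
    fitted = [w * a + (1 - w) * b for a, b in zip(src_a, src_b)]
    dist = sum((t - f) ** 2 for t, f in zip(target, fitted)) ** 0.5
    return w, dist

# Toy coordinates (hypothetical): the target is built as a 92/8 blend of the two sources.
src_a = [0.10, 0.02, -0.03]
src_b = [0.00, -0.08, 0.07]
target = [0.92 * a + 0.08 * b for a, b in zip(src_a, src_b)]

w, dist = two_source_fit(target, src_a, src_b)
print(f"source A: {w:.1%}, source B: {1 - w:.1%}, distance: {dist:.6f}")
```

Because the toy target is an exact blend, the fit recovers the 92/8 split with a distance of essentially zero. Real samples never fit perfectly, which is why a residual distance is reported.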

Moreover, the Shirenzigou males belong to Y-haplogroups Q1a and R1b (two instances of each), and they share the latter with one of the Tian Shan Huns. Judging by the data from the relevant BAM files, it's also possible that the Shirenzigou males share a very rare subclade of R1b with the Hun, defined by the PH155 mutation (see here). The Y-haplogroup assignments for the other Tian Shan Huns end at R and R1, but that's almost certainly due to missing data.

On the other hand, two Tian Shan Sakas belong to Y-haplogroup R1a but none to R1b, which fits with the pattern from currently available ancient DNA that R1a was more common than R1b in Saka-related groups, such as the Scythians and Sarmatians (see here).

This is all very interesting, because the Huns replaced the Saka in the western Tian Shan, and, considering their R1b and excess Ulchi-related ancestry, very likely moved into the region from the direction of Shirenzigou. Indeed, in my opinion a strong argument can now be made that the Iron Age population from the Shirenzigou region took part in the formation of the Hunnic confederacy.

So where does that leave the theory presented by Ning et al. that the Shirenzigou ancients may have been closely related, and perhaps even ancestral, to the Tocharians, simply because they packed a lot of Yamnaya-related and possibly proto-Tocharian Afanasievo ancestry, and were living close to the Tarim Basin, where Tocharian languages were subsequently first attested?

I'm not sure, but I now find it difficult to reconcile this theory with the fact that they were closely related, and probably ancestral, to the Tian Shan Huns. As far as I'm aware, Huns cannot be linked to Tocharians in any meaningful way.

Of course it's possible that different Afanasievo-derived groups were living in the Tarim Basin and surrounds, and, as some merged with new populations pushing into the region from the east and adopted non-Indo-European languages, others retained their Tocharian speech and eventually split into communities speaking Tocharian A, B and apparently also C (see here).

But this has to be demonstrated directly with ancient DNA from archeological sites where Tocharian languages were attested. Till then, I'll keep thinking that Ning et al. wrote a paper about Tocharians that really should've been a paper about Huns.

Here's a famous wall painting of Tocharian princes from the cave of the sixteen sword-bearers in the Tarim Basin, dated to 432–538 AD. They don't look like guys with a lot of Ulchi-related admixture to me, but I might be wrong. Feel free to let me know what you think in the comments below.

Update 08/17/2019: The Shirenzigou nomads are now in my dataset. Below are a few successful and not so successful qpAdm mixture models for them. Note that I tried to use a wide range of relevant "right pops", but also retain a lot of markers, specifically to be able to discriminate between different types of steppe and steppe-derived sources of gene flow (refer to the full output). Admittedly, the Shirenzigou nomads can be modeled with Afanasievo-related ancestry, but...

CHN_Shirenzigou_IA
KAZ_Botai 0.161±0.023
KAZ_Wusun 0.490±0.023
NPL_Mebrak_2125BP 0.349±0.019
chisq 5.793
tail prob 0.926172
Full output

CHN_Shirenzigou_IA
KAZ_Botai 0.143±0.022
NPL_Mebrak_2125BP 0.295±0.019
Saka_Tian_Shan 0.562±0.024
chisq 6.796
tail prob 0.870794
Full output

CHN_Shirenzigou_IA
KAZ_Botai 0.185±0.023
NPL_Mebrak_2125BP 0.428±0.021
RUS_Sintashta_MLBA 0.270±0.026
TJK_Sarazm_En 0.117±0.027
chisq 11.351
tail prob 0.414345
Full output

CHN_Shirenzigou_IA
KAZ_Botai 0.032±0.027
KAZ_Zevakinskiy_LBA 0.567±0.025
NPL_Mebrak_2125BP 0.401±0.019
chisq 15.157
tail prob 0.232961
Full output

CHN_Shirenzigou_IA
NPL_Mebrak_2125BP 0.452±0.031
RUS_Afanasievo 0.435±0.025
RUS_Okunevo_BA 0.114±0.049
chisq 19.808
tail prob 0.0708003
Full output

CHN_Shirenzigou_IA
NPL_Mebrak_2125BP 0.409±0.031
RUS_Okunevo_BA 0.173±0.050
Yamnaya_RUS_Caucasus 0.418±0.026
chisq 20.453
tail prob 0.0589872
Full output

CHN_Shirenzigou_IA
NPL_Mebrak_2125BP 0.464±0.033
RUS_Okunevo_BA 0.104±0.053
Yamnaya_RUS_Samara 0.432±0.027
chisq 27.189
tail prob 0.0072566
Full output
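
For anyone who wants to line these models up side by side, here's a minimal Python sketch that sorts them by tail prob. The 0.05 cutoff is an assumption: it's the threshold that many qpAdm users apply by convention, not a value taken from the output above. The dictionary keys are shorthand labels that name only the source distinguishing each model.

```python
# Tail probabilities copied from the qpAdm summaries above, keyed by the
# source (or sources) that distinguishes each model. Labels are shorthand.
models = {
    "KAZ_Wusun": 0.926172,
    "Saka_Tian_Shan": 0.870794,
    "RUS_Sintashta_MLBA + TJK_Sarazm_En": 0.414345,
    "KAZ_Zevakinskiy_LBA": 0.232961,
    "RUS_Afanasievo": 0.0708003,
    "Yamnaya_RUS_Caucasus": 0.0589872,
    "Yamnaya_RUS_Samara": 0.0072566,
}

CUTOFF = 0.05  # assumed conventional threshold, not stated in the post

# Best-fitting model first.
ranked = sorted(models.items(), key=lambda kv: kv[1], reverse=True)
rejected = [name for name, p in ranked if p < CUTOFF]

for name, p in ranked:
    verdict = "passes" if p >= CUTOFF else "fails"
    print(f"{name:36s} tail prob {p:.6f}  {verdict}")
```

On these numbers only the Yamnaya_RUS_Samara model drops below 0.05, while the Afanasievo and Yamnaya_RUS_Caucasus models scrape through with much weaker fits than the Wusun and Saka models.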

Both the Wusun and Saka are generally accepted to have been speakers of Indo-Iranian languages. So it's possible that the Shirenzigou nomads were Indo-Iranian speakers too, or at least derived from such peoples.

Surprisingly, NPL_Mebrak_2125BP was the key to obtaining the best statistical fits. This is a trio of samples, roughly contemporaneous with the Shirenzigou nomads, from a burial site high up in the Himalayas in what is now Nepal (see here).

To be honest, I'm not quite sure why the Himalayan ancients work so well in my models. Perhaps they're just a really good proxy for an Iron Age population from the northern part of the Tibetan Plateau? By the way, most of the Shirenzigou nomads made it into the latest Global25 datasheets (see here).

See also...

Almost everything you ever wanted to know about the Xiaohe-Gumugou cemeteries

The mystery of the Sintashta people

Late PIE ground zero now obvious; location of PIE homeland still uncertain, but...

Thursday, May 16, 2019

Fresh off the sledge

As things stand, the closest individual to a Proto-Uralic speaker in the ancient DNA record is arguably 0LS10 from an Iron Age tarand grave in what is now Estonia. I say that because:

- isotopic data suggest that 0LS10 wasn't born where he died, and considering his elevated Siberian ancestry relative to earlier and most contemporaneous Baltic ancients, he was very likely a migrant to the Baltic region from the east

- the tarand grave tradition appears to be specifically a Finnic (west Uralic) phenomenon that probably spread from the Volga-Oka region, which is just west of where most people place the Proto-Uralic homeland

- 0LS10 belongs to Y-chromosome haplogroup N-L1026, a paternal marker that is especially closely associated with Uralic-speaking populations and probably only appeared in the East Baltic region during the transition from the Bronze Age to the Iron Age

You can find more background info about 0LS10 and other relevant samples in Saag et al. 2019 (see here). This is where he sits in my Principal Component Analyses (PCA) focusing on fine scale Northern European genetic diversity. The relevant datasheets are available here and here, respectively.

Note that 0LS10 doesn't cluster strongly with any ancient or modern populations. To investigate this in more detail I ran a series of two-way qpAdm analyses, testing tens of ancient individuals and populations as potential admixture sources. These two models stood out above the rest in terms of their statistical fits, chronology and overall plausibility.

Baltic_EST_IA_0LS10
Baltic_EST_BA 0.826±0.045
RUS_Sintashta_MLBA_o1 0.174±0.045
chisq 12.527
tail prob 0.564048
Full output

Baltic_EST_IA_0LS10
Baltic_EST_BA 0.683±0.102
RUS_Mezhovskaya 0.317±0.102
chisq 13.811
tail prob 0.463864
Full output

Please note that RUS_Sintashta_MLBA_o1 isn't representative of the Sintashta culture population as a whole. It's a group of the most extreme genetic outliers among the Sintashta samples, and they may or may not have been Uralic speakers (see here). Interestingly, the Mezhovskaya culture population is generally associated with the Ugric branch of the Uralic language family.
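
The tail probabilities quoted for the two qpAdm models above can be sanity-checked from their chisq values. The sketch below uses the closed-form upper tail of the chi-square distribution, which only works for even degrees of freedom. The degrees of freedom aren't quoted in the summaries, so the value of 14 is an assumption, chosen because it reproduces the listed tail probs to within rounding. The function name is hypothetical.

```python
import math

def chi2_tail_even_df(chisq: float, df: int) -> float:
    """Upper tail of a chi-square distribution (closed form, even df only)."""
    if df <= 0 or df % 2:
        raise ValueError("this closed form needs a positive, even df")
    half = chisq / 2.0
    term, total = 1.0, 1.0  # i = 0 term of the series
    for i in range(1, df // 2):
        term *= half / i    # builds (chisq/2)**i / i! incrementally
        total += term
    return math.exp(-half) * total

DF = 14  # assumption: not quoted in the summaries above

for source, chisq, reported in [
    ("RUS_Sintashta_MLBA_o1", 12.527, 0.564048),
    ("RUS_Mezhovskaya", 13.811, 0.463864),
]:
    computed = chi2_tail_even_df(chisq, DF)
    print(f"{source}: computed {computed:.4f}, reported {reported}")
```

Small differences in the last digits are expected, because the chisq values are only listed to three decimal places.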

I was also able to closely replicate these results with the Global25/nMonte method, to within about 1.5 percentage points. However, the statistical fits (distances) are poor, probably because the reference populations aren't the real mixture sources. This is in line with the fact that their Y-haplogroups are Q1a, R1a and R1b, rather than any type of N.

Baltic_EST_IA:0LS10
Baltic_EST_BA,83.8
RUS_Sintashta_MLBA_o1,16.2
distance%=4.7955

Baltic_EST_IA:0LS10
Baltic_EST_BA,69.8
RUS_Mezhovskaya,30.2
distance%=3.5783
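
To put a number on how closely the two methods agree, here's a small Python sketch that compares the eastern source's share in each model. All of the figures are copied from the qpAdm and nMonte outputs above and converted to percentages; only the variable names are made up for illustration.

```python
# Share of the second (eastern) source in each model, in per cent.
qpadm = {"RUS_Sintashta_MLBA_o1": 17.4, "RUS_Mezhovskaya": 31.7}
qpadm_se = {"RUS_Sintashta_MLBA_o1": 4.5, "RUS_Mezhovskaya": 10.2}
nmonte = {"RUS_Sintashta_MLBA_o1": 16.2, "RUS_Mezhovskaya": 30.2}

gaps = {}
for source in qpadm:
    # Absolute difference between the two methods, in percentage points.
    gaps[source] = round(abs(qpadm[source] - nmonte[source]), 1)
    within_se = gaps[source] <= qpadm_se[source]
    print(f"{source}: gap {gaps[source]} points, "
          f"within one qpAdm standard error: {within_se}")
```

Both gaps are well inside one qpAdm standard error, so the two methods agree on the proportions even though the nMonte distances are poor.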

I do realize that two Bronze Age samples from Bolshoy Oleni Ostrov, Kola Peninsula, belong to N-L1026, but adding them to my mixture models doesn't help. Little wonder, because the Kola Peninsula lies within the Arctic Circle, and I'm pretty sure that 0LS10 and his N-L1026 came from somewhere just north of the mixture cline marked on the map below. Unfortunately, I can't test this directly yet due to the scarcity of ancient samples from this region.

Friday, May 12, 2017

Late PIE ground zero now obvious; location of PIE homeland still uncertain, but...

All of the post-Middle Neolithic samples from the recent Mittnik et al. and Saag et al. preprints on the ancient population history of the Baltic region belonged to Y-chromosome haplogroup R1a. And most of them belonged to the R1a-M417 (R1a1a) subclade that makes up almost 100% of the R1a lineages in the world today. This is what the results look like in a table (the sample IDs are of my own design):

Earlier samples from the same region belonged to Y-haplogroups I2a and R1a, but this was a subclade of R1a defined by the YP1272 mutation that is extremely rare today even in Northeastern Europe.

And now shifting our focus west of Scandinavia: all but two of the post-Middle Neolithic samples from around the North Sea from the recent Olalde et al. preprint on the Bell Beaker phenomenon and ancient population history of Northwest Europe belonged to Y-chromosome R1b, and more specifically to the R1b-M269 (R1b1a1a2) subclade, which makes up almost 100% of the R1b lineages in the world today. Here's a table:

Earlier samples from the same region belonged to Y-haplogroups I2a, I, G2a and CF, and most of the instances of I and the CF would probably be classified as I2a if not for missing data.

Interestingly, despite the R1a vs R1b dichotomy between these post-Middle Neolithic obvious newcomers to the Baltic and North Sea regions, respectively, they were very similar in terms of overall genetic structure, obviously closely related, starkly different from Middle Neolithic Northern Europeans, and in all likelihood mainly derived from the same homeland that was not located in Northern Europe.

So can we locate this homeland with any degree of certainty, you might wonder? In fact, you might ask, isn't this a futile search for the time being, as we await ancient DNA from many prehistoric Eurasian populations?

Not at all, because when attempting to answer this question we're bounded by two key constraints: the exceptionally high frequencies of R1a and R1b in the post-Middle Neolithic Baltic and North Sea samples, and their close genetic affinity to earlier and contemporaneous populations from the Pontic-Caspian steppe, part of which is due to significant Caucasus Hunter-Gatherer (CHG) admixture that was lacking in Middle Neolithic Northern Europeans.

Indeed, to date, the Pontic-Caspian steppe is the only region where both R1a and R1b have been found in ancient remains from the same sites dating to the Mesolithic, Neolithic and Eneolithic. Here's a table based on results from Mathieson et al. 2015 and 2017. The R and R1 might really be R1a or R1b if not for missing data.

The Pontic-Caspian steppe also abuts the Caucasus foothills, and we know that CHG admixture was a major feature of its inhabitants from at least the Eneolithic. So odds are, and make no mistake, these are indeed excellent odds, that the homeland we're looking for was on the Pontic-Caspian steppe.

But of course I2a has also been recorded in prehistoric samples from the Pontic-Caspian steppe. So, you might ask, why did the populations migrating out of the steppe belong to R1a and R1b, and why did some of them seemingly carry only R1a while others only R1b? This can be explained by local founder effects on the steppe due to patrilocality. Moreover, it's possible that some groups moving out of the steppe did carry high frequencies of I2a, but they're yet to enter the ancient DNA record. [Edit: Maybe they already have? See here]

Now, the aforementioned post-Middle Neolithic newcomers to the Baltic and North Sea regions are most certainly in large part the direct ancestors of modern-day Northern Europeans, speaking languages belonging to three daughter branches of late Proto-Indo-European (PIE): Balto-Slavic, Celtic and Germanic. It's highly unlikely that languages ancestral to these present-day languages were spoken by Middle Neolithic farmers, or that they were introduced into Northern Europe after it was colonized by the migrants from the Pontic-Caspian steppe.

What this strongly suggests is that the Pontic-Caspian steppe was also the late PIE homeland.

But, you might argue, the Pontic-Caspian steppe may have just been the expansion point for some of the late PIE language branches. No, that won't work. For one, modern-day populations speaking languages belonging to all other late PIE branches, such as Armenian, Greek, Indo-Iranian and Italic, show signals of the same population expansion from the Pontic-Caspian steppe that gave rise to modern-day Northern Europeans, in the form of Yamnaya-related genome-wide genetic admixture and appreciable frequencies of Y-chromosome haplogroups R1a-M417 and/or R1b-M269.

Some of these signals are certainly due to fairly recent admixture from Northern Europeans, like in much of Greece as a result of the Slavic expansions during the Early Middle Ages, but most cannot be explained in this way.

Secondly, Balto-Slavic, Celtic and Germanic are not more closely related to each other than to some of the other late PIE branches. For instance, Balto-Slavic is considered far more closely related to Indo-Iranian than to Celtic, which is generally seen as a sister branch to Italic. Therefore, if Balto-Slavic and Celtic derive from a homeland on the Pontic-Caspian steppe, then logically this is also where we should look for the origins of Indo-Iranian and Italic.

So as far as the late PIE homeland is concerned, thanks to ancient DNA, the debate is now practically over. But the PIE homeland debate is still wide open, or so we're told.

Apparently, Mathieson et al. 2017 aren't comfortable with putting the PIE homeland on the Pontic-Caspian Steppe because they can't find any evidence in their ancient DNA dataset of a significant migration through the Balkans that would potentially bring Anatolian languages from the Pontic-Caspian steppe to Anatolia. From the paper:

One version of the Steppe Hypothesis of Indo-European language origins suggests that Proto-Indo European languages developed in the steppe north of the Black and Caspian seas, and that the earliest known diverging branch – Anatolian – was spread into Asia Minor by movements of steppe peoples through the Balkan peninsula during the Copper Age around 4000 BCE, as part of the same incursions from the steppe that coincided with the decline of the tell settlements. [51] If this were correct, then one way to detect evidence of it would be the appearance of large amounts of characteristic steppe ancestry first in the Balkan Peninsula, and then in Anatolia. However, our genetic data do not support this scenario. While we find steppe ancestry in Balkan Copper Age and Bronze Age individuals, this ancestry is sporadic across individuals in the Copper Age, and at low levels in the Bronze Age. Moreover, while Bronze Age Anatolian individuals have CHG/Iran Neolithic related ancestry, they have neither the EHG ancestry characteristic of all steppe populations sampled to date [20], nor the WHG ancestry that is ubiquitous in southeastern Europe in the Neolithic (Figure 1A, Supplementary Data Table 2, Supplementary Information section 1). This pattern is consistent with that seen in northwestern Anatolia [11] and later in Copper Age Anatolia [23], suggesting continuing migration into Anatolia from the East rather than from Europe.

And this...

On the other hand, our data could still be consistent with the Steppe-Balkans-Anatolia route hypothesis model, albeit with constraints. It remains possible that populations dating to around 1600 BCE in the regions where the Indo-European Luwian, Hittite and Palaic languages were spoken did have European hunter-gatherer ancestry. However, our results would require that such ancestry was not ubiquitous in Bronze Age Anatolia, and was perhaps tightly linked to Indo-European speaking groups. We predict that additional insight about the genetic origins of the potential speakers of early Indo-European languages will be obtained when ancient DNA data become available from additional sites in this key period in Anatolia and the Caucasus.

But I'd say the authors are taking that one particular version of the Steppe Hypothesis way too seriously. They might even be implying things that the creator(s) of the said hypothesis never posited.

Why do they seemingly expect a massive surge of steppe admixture into the Balkans during the Copper Age? If the steppe people are just shooting through the Balkans on their way to Anatolia, why would they leave a lot of admixture along the way? And if the locals are abandoning their tell settlements and running for the hills as far away from the oncoming steppe invaders as they can, how exactly would they acquire steppe admixture? Osmosis or what?

The Balkans is not Northern Europe, and the hypothesized migration of the proto-Anatolians from the Pontic-Caspian Steppe to Anatolia through the Balkans was never, as far as I know, meant to parallel the massive Corded Ware expansion across Northern Europe. In other words, why should all of the early Indo-European expansions have been of the same character, especially considering that they moved into such starkly different areas of Eurasia?

Indeed, as Mathieson et al. 2017 point out in the quote above, the evidence for the fleeting presence of steppe peoples in the Copper Age Balkans is in their dataset. For instance, in their Varna 1 sample set from Bulgaria, three out of the five individuals show significant steppe admixture. One of these individuals is almost 50% Yamnaya-like. Surely, there's really no need to expect anything more than that when looking for signals of a proto-Anatolian migration from the Pontic-Caspian Steppe to Anatolia.

In fact, even though I do appreciate the incredible work these guys are doing and the data they're making available to me and everyone else, I suspect that there's a little bit of, shall we say, schadenfreude going on here.

They sequenced all of three Early Bronze Age Anatolians of obscure origin (are they actually suspected Anatolian speakers, like Luwians?), and apparently it's a big deal that they can't find any steppe admixture in Early Bronze Age Anatolia. Come on.

And then we're offered just three Yamnaya samples from the Pontic Steppe in Ukraine. One happens to be a massive outlier towards the Caucasus. Wow, what are the chances of that? And guess what, all three of these Yamnayans are females, so of course we're left wondering about the Y-haplogroups of the Yamnaya males on the Pontic Steppe. What happened to the males? Next paper, that's what.

Update 19/05/2017: Please note that the authors are not holding back any Yamnaya males from Ukraine for a future paper, as per my claim in the last paragraph above. They used what they had for the time being.

Update 21/05/2017: Actually, I suspect that we already have a population from the Bronze Age steppe in the ancient DNA record with a high frequency of Y-haplogroup I2a. See here.

See also...

R1a-M417 from Eneolithic Ukraine!!!11

Ancient herders from the Pontic-Caspian steppe crashed into India: no ifs or buts

Eastern Europe as a bifurcation hotspot for Y-hg R1

Globular Amphora people starkly different from Yamnaya people
