Avaria
https://en.wiktionary.org/wiki/avaria#Italian
Some time ago, one of the Avar-period samples from Hungary was initially reported as female (V37.2: released Feb 2019):
AV1 AV1 AV1 petrous .. .. 1240K AmorimNatureCommunications2018 1361 541-637 calCE (1487±26 BP) Hungary_Avar Szólád Hungary 46.28333333 17.85 F X2m'n 1472±27 0.325459145 2.76426 802843 half All PASS (literature) .. .. 0.325459145 0.0874 X2m'n 0.98 2018
It later gained a male Y-chromosome haplogroup (V42.4: released Mar 2020):
5697 AV1 AV1 AV1 petrous 2018 AmorimNatureCommunications2018 AmorimNatureCommunications2018 1361 541-637 calCE (1487±26 BP) Hungary_AvarPeriod Szólád Hungary 46.28333333 17.85 1240K .. 2.573859 802843 U R1b1a1a2a1 .. X2m'n .. .. .. .. .. .. .. half .. PASS (literature)
This erroneous information may have come from the authors themselves, as even the folks from the Reich Lab were bamboozled for a while (although not completely, as the sex of the sample was changed to Unknown). Fortunately, they have since greatly improved their process of assessing uniparental markers, and the current version of their dataset was released with corrected metadata ([Wed Jan 20 22:51:34 EST 2021]: V44.3 release):
11218 AV1 AV1 AV1 petrous 2018 AmorimNatureCommunications2018 AmorimNatureCommunications2018 Direct (WARNING MISSING LAB CODE): 95.4%; IntCal20, OxCal v4.4.2 Bronk Ramsey (2020); r:5; Atmospheric data from Reimer et al (2020) 1360 25 549-640 calCE (1487±26 BP) Hungary_Avar_5 Szólád Hungary 46.28333333 17.85 1240K .. 2.573859 802843 .. F Hungary, Szólád Family A (2 members) (AV2-AV1 have a mother-daughter relationship) .. 10.47 X2m'n .. Haplofind .. 0.0874 .. n/a (female) n/a (female) n/a (female) n/a (female) -0.015 0.006 None [0,0] ds.half .. PASS (literature)
Could we have somehow avoided these shenanigans?
Below you can see the full (sic!) output from Yleaf:
chr pos marker_name haplogroup mutation anc der reads called_perc called_base state
chry 14730165 FGC26351 A00 T->A T A 1 100 T A
chry 6964065 M6857 B2b1b1b~ A->T A T 1 100 A A
chry 18581720 CTS8960 D1a2a3a1a A->G A G 1 100 A A
chry 22770207 CTS10774 E1b1a1a1a1c1a1a3a1d G->A G A 1 100 G A
chry 22770232 CTS10775 E1b1a1~ G->T G T 1 100 G A
chry 7873617 Y6168 E1b1b1b2a1a1a1a1e1~ C->A C A 1 100 C A
chry 8272388 Z40155 G2a1a1b2 C->T C T 1 100 C A
chry 8272382 FT40739 G2a2b1a1a1a2b2~ G->A G A 1 100 G A
chry 23987375 BY200867 G2a2b2a1a1b1a1e2a T->A T A 1 100 T A
chry 17544312 Z13554 H3 G->T G T 2 100 G A
chry 16202490 L623 I2a1b1a2b2 A->T A T 1 100 A A
chry 13960753 Z28245 J1a2a1a2d2b2b1a~ G->T G T 1 100 G A
chry 16202478 ZS6620 J1a2a1b1a1~ A->G A G 1 100 A A
chry 9990070 Z8257 J2b2a~ C->G C G 2 100 C A
chry 17411756 AM01369 J2b2a~ C->T C T 1 100 C A
chry 14858930 Z42488 L1a1b3a1a1a~ G->A G A 1 100 G A
chry 22567345 M2106 N1a1a1a1a C->G C G 1 100 C A
chry 14513977 YP5780 N1a2b2b G->A G A 1 100 G A
chry 9953214 Z25708 O2a2b2a1 G->A G A 1 100 G A
chry 16202425 F2086 Q1a1a T->C T C 1 100 T A
chry 17544259 YP819 Q1a2a1~ T->G T G 2 100 T A
chry 8502236 L51 R1b1a1b1a G->A G A 1 100 A D
chry 21154383 S22641 R1b1a1b1a1a1c3 G->A G A 1 100 G A
chry 13204207 BY15390 R1b1a2 T->A T A 1 100 T A
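If you want to repeat this sanity check yourself, here is a minimal Python sketch that tallies the Yleaf rows above. It is not part of Yleaf; the helper name parse_yleaf_rows is made up for illustration, and I'm assuming that "A" and "D" in the state column mean ancestral and derived.

```python
from collections import Counter

# Column names exactly as they appear in the header of the Yleaf output.
COLUMNS = ["chr", "pos", "marker_name", "haplogroup", "mutation",
           "anc", "der", "reads", "called_perc", "called_base", "state"]

# All 24 rows copied from the output for AV1.
YLEAF_ROWS = """\
chry 14730165 FGC26351 A00 T->A T A 1 100 T A
chry 6964065 M6857 B2b1b1b~ A->T A T 1 100 A A
chry 18581720 CTS8960 D1a2a3a1a A->G A G 1 100 A A
chry 22770207 CTS10774 E1b1a1a1a1c1a1a3a1d G->A G A 1 100 G A
chry 22770232 CTS10775 E1b1a1~ G->T G T 1 100 G A
chry 7873617 Y6168 E1b1b1b2a1a1a1a1e1~ C->A C A 1 100 C A
chry 8272388 Z40155 G2a1a1b2 C->T C T 1 100 C A
chry 8272382 FT40739 G2a2b1a1a1a2b2~ G->A G A 1 100 G A
chry 23987375 BY200867 G2a2b2a1a1b1a1e2a T->A T A 1 100 T A
chry 17544312 Z13554 H3 G->T G T 2 100 G A
chry 16202490 L623 I2a1b1a2b2 A->T A T 1 100 A A
chry 13960753 Z28245 J1a2a1a2d2b2b1a~ G->T G T 1 100 G A
chry 16202478 ZS6620 J1a2a1b1a1~ A->G A G 1 100 A A
chry 9990070 Z8257 J2b2a~ C->G C G 2 100 C A
chry 17411756 AM01369 J2b2a~ C->T C T 1 100 C A
chry 14858930 Z42488 L1a1b3a1a1a~ G->A G A 1 100 G A
chry 22567345 M2106 N1a1a1a1a C->G C G 1 100 C A
chry 14513977 YP5780 N1a2b2b G->A G A 1 100 G A
chry 9953214 Z25708 O2a2b2a1 G->A G A 1 100 G A
chry 16202425 F2086 Q1a1a T->C T C 1 100 T A
chry 17544259 YP819 Q1a2a1~ T->G T G 2 100 T A
chry 8502236 L51 R1b1a1b1a G->A G A 1 100 A D
chry 21154383 S22641 R1b1a1b1a1a1c3 G->A G A 1 100 G A
chry 13204207 BY15390 R1b1a2 T->A T A 1 100 T A
"""


def parse_yleaf_rows(text):
    """Split whitespace-separated rows into dicts keyed by COLUMNS."""
    records = []
    for line in text.strip().splitlines():
        fields = line.split()
        if len(fields) != len(COLUMNS):
            continue  # skip anything that is not a complete data row
        record = dict(zip(COLUMNS, fields))
        record["reads"] = int(record["reads"])
        records.append(record)
    return records


records = parse_yleaf_rows(YLEAF_ROWS)
state_counts = Counter(r["state"] for r in records)
total_reads = sum(r["reads"] for r in records)
derived_markers = [r["marker_name"] for r in records if r["state"] == "D"]

print(f"markers covered: {len(records)}, total reads: {total_reads}")
print(f"ancestral: {state_counts['A']}, derived: {state_counts['D']}")
print(f"derived markers: {derived_markers}")
```

A single derived call backed by a single read, scattered among ancestral calls from almost every major haplogroup, is not much to hang a Y haplogroup on.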
The proportion of reads mapped to X and Y chromosomes leaves no illusions (Skoglund et al., 2013):
chrX 289426 3.86%
chrY 1732 0.02%
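The Skoglund et al. (2013) check boils down to one ratio, R_y = n_Y / (n_X + n_Y), with a 95% confidence interval around it. The sketch below is only an illustration under stated assumptions: I'm treating the two counts above as the numbers of reads mapped to chrX and chrY, and I'm using the cutoffs published for shotgun data (female if the upper bound is below 0.016, male if the lower bound is above 0.075). For capture data like 1240K the exact cutoffs may differ, so treat them as a rough guide. The function names are made up.

```python
import math


def skoglund_ry(n_x, n_y, z=1.96):
    """Return R_y = n_y / (n_x + n_y) and its normal-approximation 95% CI."""
    total = n_x + n_y
    ry = n_y / total
    std_err = math.sqrt(ry * (1.0 - ry) / total)
    return ry, ry - z * std_err, ry + z * std_err


def assign_sex(ci_low, ci_high, female_max=0.016, male_min=0.075):
    """Apply the shotgun-data cutoffs (assumed here, not tuned for 1240K)."""
    if ci_high < female_max:
        return "XX"
    if ci_low > male_min:
        return "XY"
    return "undetermined"


# Read counts for AV1 quoted above.
ry, ci_low, ci_high = skoglund_ry(n_x=289426, n_y=1732)
sex = assign_sex(ci_low, ci_high)

print(f"R_y = {ry:.4f} (95% CI {ci_low:.4f}-{ci_high:.4f}) -> {sex}")
```

With these counts R_y comes out at roughly 0.006, far below the female cutoff, so the call is XX.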
If checking this was so easy, why didn't all these genetic experts out there do this instead of babbling about some "Celto-Germanic father"? Were they trying to deceive you, dear reader?
I hope that you now know who not to trust.
Comments
Green is more probable because it better fits the whole G25 dataset, but who knows.
Target: Av0_Pannonian
Distance: 1.9282% / 0.01928151 | R3P
52.0 Polish
24.2 French_Occitanie
23.8 Greek_Peloponnese
Target: Av0_Pannonian
Distance: 1.6694% / 0.01669411
15.8 Lithuanian_VZ
15.0 French_Occitanie
14.2 Lithuanian_SZ
12.0 Greek_Central_Macedonia
11.4 Polish
8.6 Dutch
7.0 Basque_French
6.6 Serbian
4.6 BedouinB
1.8 Greek_Peloponnese
1.2 Irish
1.0 Spanish_Galicia
0.8 Yemenite_Mahra
Av0_Pannonian,0.132035,0.137097,0.050534,0.033269,0.043393,0.015338,0.00799,0.011769,0.003068,-0.004556,0.001136,-0.009291,0.00446,0.009083,-0.009771,0.014453,-0.016559,-9.99999999999266E-07,-0.003897,0.003752,-0.004491,-0.002968,0.001972,-0.006387,-0.000958
Thank you for your great work on this blog! I have a couple of questions which--if you haven't already looked into it--I hope you could investigate;
1) Given that Baltic_BA's HG ancestry did not come from the Baltic region, where did it come from? Did you see the recent poster from a conference here: https://twitter.com/GerberDniel2/status/1400010765445353475/photo/1
Using the current HG-rich samples from Hungary, can you show in some way that the HG ancestry in Baltic_BA came from Hungary and surrounds?
2) You showed in the previous article that, after controlling for Baltic_BA and Yamnaya ancestry, the remaining EEF ancestry of Slavs was very Balkan-like. Interestingly, you show this is true also for modern Balts. I think the question of when this Balkan-like EEF signal reached Baltic populations is very interesting, because it may show that Balts reached the region fairly late, after Baltic_BA type ancestry did. The Baltic_BA grouping contains samples from 1200 BC to around the turn of the era, and there are also later samples like LTU_BA from the Baltic region in the historical period. Could you check if later samples from the Baltic region show any sign of Balkan EEF introgression compared to the earliest samples in the Baltic_BA cluster?
Much appreciated - Ryukendo
I'll answer your questions by tomorrow. I did not include later Baltic samples because to do so I need to account for possible Uralic-related ancestry. BTW, a few hours ago I found out that in the newest .anno file one of the Estonia_BA samples is described as an outlier:
s19_X09_1.SG Estonia_BA_o.SG QUESTIONABLE (literature, popgen.outlier)
Unfortunately numsnps is only 16729. Right now I'm running dstats and so far the only Z-score > 3 is this one:
Chimp.REF Hungary_Maros_EBA.SG Estonia_BA.SG Estonia_BA_o.SG 0.0113 0.00347 3.26 0.00111 6297
A smoking gun, isn't it?
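For anyone who wants to double-check the line above: the Z-score is just D divided by its standard error. A minimal sketch, assuming that the three numbers after the population labels are D, the standard error and Z, in that order (the variable names are mine):

```python
def z_score(d_value, std_err):
    """Z-score of a D-statistic: the estimate divided by its standard error."""
    return d_value / std_err


# Values copied from the output line above (already rounded there).
d_value = 0.0113
std_err = 0.00347

z = z_score(d_value, std_err)
is_significant = abs(z) > 3  # the usual |Z| > 3 rule of thumb

print(f"Z = {z:.2f}, |Z| > 3: {is_significant}")
```

Keep in mind that this result rests on only 6297 SNPs, so the standard error is large to begin with.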
Re: BR2 haplotype sharing
Target: HUN_LBA:I1504
Distance: 1.7473% / 0.01747324
68.8 Sorb_Niederlausitz:Niederlausitz1
31.2 POL_Globular_Amphora:I2433
He's almost exactly between one of the GAC samples and that single Sorb.
He's probably not ancestral to the Slavs. Likely it's the opposite - he has the "Slavic" ancestry plus extra neolithic on top of that.
2/3 of his ancestry is like that of modern West Slavs, which is probably why the haplotype sharing in Cassidy et al. (but also check the ChromoPainter results from the supplement of Margaryan et al.) peaks among Poles and drops among the rest of the Balto-Slavs, who have different proportions of the common ancestral components.
The rest of his ancestry can explain why there is a high haplotype sharing with Wales. It can be mediated via neolithic substrate common to him and e.g. one of the Celtic groups that ended up in Wales.
Re: 2)
Most of the samples have a strong preference for HG, but at the same time they pick rather WHG-poor neolithic pops. And now I'm not sure if I should treat them separately, or as a complex proxy that tries to compensate for different things because something is missing, e.g. on the Uralic side.
We can eventually try this approach with Baltic_LTU_BA (because here we don't have to worry about poorly referenced Asian ancestry skewing the model). Turlojiske3 has a nice distance and as an extra neolithic it picks up LBK.
Target: Baltic_LTU_BA:Turlojiske3
Distance: 1.3927% / 0.01392702
54.6 Baltic_LVA_BA
16.0 Baltic_EST_BA
10.2 Corded_Ware_POL_early
9.0 Corded_Ware_DEU
7.8 DEU_LBK_SMH
2.4 Yamnaya_RUS_Samara
Turlojiske1 is more complex. EEF sources are WHG-poor, but in the original model it picks up Iron_Gates. Latvia_HG cannibalizes it completely (assimilation of some HG leftovers?):
Target: Baltic_LTU_BA:Turlojiske1
Distance: 2.1177% / 0.02117722
58.8 Baltic_LVA_BA
11.8 Corded_Ware_DEU
10.0 Yamnaya_RUS_Kalmykia
6.8 Baltic_LVA_HG
5.6 TUR_Barcin_N
3.6 Baltic_EST_BA
2.0 HUN_Tiszapolgar_ECA
1.4 DEU_LBK_HBS
https://i.postimg.cc/bydTHKWr/Baltic-LTU-BA.png
(Iron Gates is not subtracted, but Latvia_HG is, hence the shift)
So it seems that such "southern" ancestry was in the Baltics from the very beginning (of the existence of the Baltic_BA cluster). I wouldn't treat this as definitive proof, though, because the samples from Turlojiske behave slightly... weird. It's either bad quality or admixture from different directions. Or both. Generally I avoid using them in my models.
Re: 1)
Carpathians. Previously I thought that this kind of ancestry survived somewhere north of Ukraine and east of the Baltics, but after the publication of the Volosovo and Fatyanovo samples it became clear that there were just "regular" HGs of the Latvia_MN type.
On the other hand, the area around the Carpathians was (and still is) yielding Baltic_BA-admixed pops one after another. I think that the most important samples are:
BR2 - the GAC-BR2 vector points directly at the Balto-Slavic cline. It's unlikely to be just an accident; probably at least one of his grandparents was on this cline, which wouldn't exist without "Baltic_BA". Such a population had to be somewhere in that area. (Haplotype sharing is a nice bonus.)
_________
Bell_Beaker_HUN_EBA:I3528 (2562-2299 calBCE)
Ancients without VK2020:
Target: Bell_Beaker_HUN_EBA:I3528
Distance: 1.7266% / 0.01726640
19.0 DEU_LBK_SMH
17.2 SRB_Iron_Gates_HG
16.0 Baltic_EST_BA
10.4 ITA_Collegno_MA
6.2 TUR_Alalakh_MLBA
5.0 KAZ_Golden_Horde_Euro
4.8 FRA_Grand_Est_MN
4.6 DEU_Tollense_BA
3.4 FRA_Nouvelle_Aquitaine_Meso
3.2 IRN_Tepe_Hissar_C
3.0 ITA_Sicily_EBA
2.8 DEU_Wartberg_MN
2.4 Baltic_EST_MA
1.2 ROU_C_o
0.8 VUT_2300BP_all
There is enough "drifted" ancestry to reconstruct one of the grandparents as almost pure Baltic_BA (Tollense here is just the WEZ56). Note how early he is.
_______
ROU_C_o:GB (3512-3350 calBCE)
Target: ROU_C_o:GB
Distance: 1.5695% / 0.01569462
33.4 SRB_Iron_Gates_HG
17.6 DEU_Wartberg_MN
15.8 Baltic_EST_BA
9.2 HUN_ALPc_III_MN
8.6 Iberia_Central_CA
7.2 TUR_Pinarbasi_HG
3.6 Baltic_LVA_BA
2.6 HUN_Koros_N_HG
1.6 UKR_Trypillia
0.4 HUN_ALPc_Szakalhat_MN
Nearly 20% of Baltic_BA, even though this sample most likely doesn't have any steppe ancestry, which in turn means that Baltic_BA acts as a proxy for a hypothetical WHG population with the "Balto-Slavic" drift. Matt on Eurogenes once posted a ghost of this pop that was spot on. When I added it to the mix it replaced Baltic_BA completely. I'll re-run this model if I find these coordinates.
To sum things up:
by 1250 BCE we have evidence of the Balto-Slavic cline developing somewhere in the vicinity of Hungary
by 2400 BCE we have evidence of a population similar to Baltic_BA somewhere in the vicinity of Hungary
by 3400 BCE we have evidence of the "Balto-Slavic" drift in a HG-rich sample with zero steppe ancestry in central Romania.
Yeah, I saw it. I also saw the presentation of Anna Szecsenyi-Nagy and I have no doubt what these outliers will look like and what kind of "drift" they'll have.
The only question is whether we will see any Baltic_BA-like sample among them... or whether we will have to wait a little longer for it. At this point finding such a specimen is just a matter of time.
After reading your posts on AG I think I misunderstood you. If you were asking for a model where Baltic_BA is the target, then it's not possible (and especially not in qpAdm, since besides the proper source we also lack a population that we could put in the right pops).