Friday, March 13, 2009
Tired of relying on unlikely TMRCA estimates by others, I decided to take a look at human Y-DNA structure on my own. Unlike mtDNA, which is relatively easy to study in full, the Y chromosome is very large and, with very few exceptions, it has never been studied at such length. Instead geneticists have been gradually adding SNPs to their collections and in that way perfecting our understanding of human paternal genealogy.
But the different branches are very unequally studied. Some like Western European R1b are very well studied (partly via private genealogy companies) and we can presume that we know already most of the SNPs that exist in that line. In comparison others are barely sketched.
This uneven state of knowledge makes it very difficult to compare the different lineages as I did with mtDNA using SNPs (see here, here and here). So I did the following: using the YSOGG data, I took a reference lineage (the longest one from root to tip: R1b1b2a1a4a, 104 known SNPs from "Adam") and also took sample lineages in each of the other branches (always the longest apparent line within each). With simple maths I determined the "informative value" of each SNP in those lineages (always in comparison to the reference) and then calculated the equivalent value of the linear sequences of SNPs at the root of each branch.
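One possible reading of that "simple maths", sketched in Python. This is an assumption, not necessarily the exact calculation used: each SNP in a sample lineage is weighted by how many reference SNPs cover the same stretch of the tree (from the node shared with the reference down to the tips), and the stem of a branch is then expressed in reference-SNP units. The function names and the counts in the example are hypothetical, chosen only to show the arithmetic.

```python
def snp_weight(ref_snps_below_node: int, sample_snps_below_node: int) -> float:
    """Reference SNPs that one SNP of the sample lineage is worth (assumed definition).

    Both counts are taken from the node the two lineages share down to
    their respective tips.
    """
    if sample_snps_below_node <= 0:
        raise ValueError("the sample lineage needs at least one SNP below the shared node")
    return ref_snps_below_node / sample_snps_below_node


def stem_in_reference_units(stem_snps: int,
                            ref_snps_below_node: int,
                            sample_snps_below_node: int) -> float:
    """Length of a branch's root stem, converted to reference-SNP equivalents."""
    return stem_snps * snp_weight(ref_snps_below_node, sample_snps_below_node)


# Hypothetical counts: the reference has 40 SNPs below the shared node,
# the poorly studied sample lineage only 10, and its stem carries 3 SNPs.
weight = snp_weight(40, 10)
stem = stem_in_reference_units(3, 40, 10)
print(weight, stem)
```

Under this reading, a poorly studied lineage gets a weight above 1, so each of its few known SNPs stands for a longer stretch of the reference line.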
The resulting chart is as follows:
Of course, there is some uncertainty, greater for branches that are very badly studied, like C or H, which could well be older than they appear here. But the result is at least fairly illustrative of how things might have been. There is also great uncertainty regarding what exactly "recent past" means. I assume that it means at least two extra reference SNPs, maybe more, as these had no real time to expand and must be restricted to private lineages. Any clade that is widespread enough to be sampled more than once (i.e. non-private) is probably a founder effect from some time ago: it certainly did not spread this century or even this millennium.
Whatever the case, I did some further work on the raw diagram, trying to find some logical timelines. I ended up with the following:
This is by no means definitive: some fine tuning may generate much improved estimates, but it does give a rather consistent overall scheme of how things might have been. It was only natural to place the Toba catastrophe where it is, just before the great expansion at rapidly succeeding nodes: F, IJK, K and PNO, as well as (with some uncertainty) C. But also after the DE and CF nodes. There is archaeological evidence of the presence of H. sapiens in Eurasia before Toba and of continuity in South Asia, so moving the Toba timeline to before the CF node makes no sense to me. It was just very convenient (and also natural) to assign 1000 years per reference mutation along the R1b line, so I did.
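The dating rule itself is plain arithmetic and can be sketched as below. The function name is hypothetical, and the sketch assumes that node positions are counted in reference mutations back from the present, without modelling the extra "recent past" SNPs separately.

```python
YEARS_PER_REFERENCE_MUTATION = 1000  # the convenient rate adopted in the text


def node_age_years(reference_mutations_before_present: float,
                   years_per_mutation: int = YEARS_PER_REFERENCE_MUTATION) -> float:
    """Age of a node in years BP, given its position in reference-mutation units."""
    if reference_mutations_before_present < 0:
        raise ValueError("a node cannot lie in the future")
    return reference_mutations_before_present * years_per_mutation


# Positions matching the lines quoted in the text: R1b at 21, R1 at 23.
r1b_age = node_age_years(21)
r1_age = node_age_years(23)
print(r1b_age, r1_age, r1_age - r1b_age)
```

Changing `years_per_mutation` rescales every node at once, which is the kind of fine tuning that would shift the whole scheme without altering the order of the nodes.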
Analyzing the interpretation a little bit:
R1b may be older than 20,000 years BP (the R1b apparent node falls exactly on the 21 kya line and the R1 node is on the 23 kya line), which fits Solutrean better than Magdalenian. But the most common subclade R1b1b2a (and notice that the other subclades' position is unclear) appears to have diversified as recently as c. 5000 BCE. At that date Europe was like this and even if we stretch the timeline a little (map), we can only safely attribute this homogeneity within R1b to the post-Magdalenian Epipaleolithic context (Tardenoisian especially). So it is possible that most modern European R1b (R1b1b2a) only consolidated in Epipaleolithic times and may have a more northerly origin than the Basque Country (France-Belgium-Rhineland). It might also be Neolithic but it's extremely difficult to find a single Neolithic source for that.
R1b is not just present in Europe and West Asia but it's also frequent in Africa and Central Asia. The latter is mostly R1b1b1, while the (ill-studied) African one falls within two categories: Euro-like R1b1b2a and exotic R1b*. Egyptian R1b is evenly divided between these two categories while Ouldeme R1b appears to be R1b* (some distinct subclade, not yet categorized) in its totality. If the current understanding of non-European R1b is correct (it may not be), then the overall expansion of R1b may have happened c. 20,000 years ago, maybe in connection with Solutrean and related cultures like Iberomaurusian (Oranian), while the "European" R1b would instead have expanded closer to the Epipaleolithic/Neolithic timeline, it seems.
The node shown (c. 18,000 years ago) represents the earliest bifurcation, which surely happened in South or Central Asia. It does not represent the R1a1a main subclade, which in all likelihood expanded much later.
We know nothing yet of the substructure of R2 but from the estimates it seems it diverged from R1 c. 32,000 years ago, deep in the Paleolithic.
The divergence of R and Q appears even older, c. 42,000 years ago, before most of Europe was colonized by H. sapiens.
It's been recently discovered that NO and P share one basal mutation. Still, their split falls wholly within the main period of K branching out, which, according to my estimates, is earlier than 60,000 BP. I will try to address NO substructure and internal timeline in the future.
The K multifurcation appears to represent the main Eurasian expansion better than anything else. It may have happened some 63,000 years ago. Some of the branches (L, T) may have gone through very long coalescence periods, while others expanded soon after the K node instead.
Another recently discovered connection is that of IJK. IJ separated from K not long before the main K divergence and not long after the main F multifurcation. I and J appear to have split c. 50,000 years ago, maybe in connection with the earliest European colonists in Bulgaria. I will try to address IJ internal structure and possible timeline in the future.
F may have been the first lineage to diverge after Toba. If H and G actually hang from the F node directly, this may have happened somewhere in NW South Asia.
The estimated timeline for the coalescence of C may be too long. I am rather inclined to think its multifurcation happened earlier, closer to the F-K spread. Whatever the case, this clade must have participated in the "main Eurasian expansion" right after Toba.
I also promise some greater insight into the internal structure of this macro-clade (soon to come™). As in the case of C, the main bifurcation node may be misdated a little too late anyhow.
It's very curious how this African lineage appears to have an expansion time that parallels that of F-K, right after Toba. This catastrophe surely also affected Africa and may have caused some major alterations and opportunities. Of course, as B is only poorly understood, it may be just an illusion.
The oldest distinct human lineage would seem to have expanded on a timeline parallel to that of the out-of-Africa episode. Probably there were favorable climatic conditions in that window, helping everybody.