New blogs

Leherensuge was replaced in October 2010 by two new blogs: For what they were... we are and For what we are... they will be. Check them out.

Thursday, March 18, 2010

Another tentative mtDNA chronology


[Updated Aug 6 2010: there was a key error in the steps between L3'4'6 and L4 and L3. In other words: I totally forgot about the L4'3 node and the three CR mutations defining it, pushing L(xL4'3) ahead in time some est. 9000 years. Corrected the main est. chronology now but left the text untouched otherwise]


Take what follows with the proverbial pinch of salt please. It's nothing but a working note.



Method:

The first West Eurasian lineages that show up in the downstream CR mutation count are R0 and M1, at 30 CR mutations counting from "Eve" (7 downstream of L3).

The first European-specific lineages are H and V at 33 CR mutations (10 from L3).

These facts may serve to generate a tentative chronology using the simplest molecular clock methodology: 1 CR mutation = 1 time unit.

A further control may be the place of K1 at 48 CRM (21 from L3), the most recent important clade showing signs of expansion in Europe/WEA, which should not be more recent than the Neolithic.

The safest date for the colonization of Europe, which surely included a star-like expansion like the one we see at H, is that of the Aurignacian expansion c. 40 Ka ago. A reasonable (but speculative) date for the arrival of H. sapiens in West Asia is c. 50 Ka ago.

The two dates are 10 Ka apart and the two nodes are 3 CR mutations apart (10 vs. 7 from L3). This makes CRM=10/3=3.33 Ka.

To make calculations simpler I will round down this rate to CRM=3 Ka and take age(H,V)=40 Ka. as main reference.

Control: K1: 40-(3x11)=7 Ka ago. Fits perfectly: 7 Ka ago is roughly when the Neolithic arrived in Central Europe, where K has been detected abundantly in aDNA from the period, and this may coincide with the expansion of K1.
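The arithmetic above can be sketched in a few lines of Python. This is only an illustration of the calculation described in the text, not a tool used for the original note; the function and variable names (`raw_rate`, `age_ka`, `WEST_ASIA`, `EUROPE`) are hypothetical.

```python
# Calibration points from the text: (age in Ka, CR mutations downstream of L3).
WEST_ASIA = (50.0, 7)   # R0 and M1, speculative arrival in West Asia
EUROPE = (40.0, 10)     # H and V, Aurignacian colonization of Europe

def raw_rate(older, younger):
    """Ka per CR mutation implied by two calibration points."""
    (age_old, steps_old), (age_young, steps_young) = older, younger
    return (age_old - age_young) / (steps_young - steps_old)

# The text rounds 3.33 down to 3 Ka per CR mutation.
ROUNDED_RATE = 3.0

def age_ka(steps_from_l3, rate=ROUNDED_RATE, anchor=EUROPE):
    """Estimated age of a node, anchored at age(H,V) = 40 Ka."""
    anchor_age, anchor_steps = anchor
    return anchor_age - rate * (steps_from_l3 - anchor_steps)

if __name__ == "__main__":
    print(raw_rate(WEST_ASIA, EUROPE))  # unrounded rate, 10/3
    print(age_ka(21))                   # K1, 11 mutations downstream of H
    print(age_ka(0))                    # L3 itself
```

With the rounded rate, `age_ka(21)` gives the 7 Ka used as the K1 control, and `age_ka(0)` gives 70 Ka for L3.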


Results:

  • 148 Ka - Root ('Eve')
  • ...
  • 133 Ka - L1"6
  • ...
  • 121 Ka - L0, L1, L2"6 --- Beginning of Abbassia Pluvial
  • ...
  • 115 Ka - L0a'b'f'k
  • ...
  • 109 Ka - L5
  • ...
  • 100 Ka - L2'3'4'6, L0a'b'f
  • ...
  • 91 Ka - L0a'b, L0d, L1c --- End of Abbassia Pluvial
  • 88 Ka - L0f
  • 85 Ka - L3'4'6, L0d1'2
  • ...
  • 79 Ka - L2, L0a
  • 76 Ka - L2a'd, L3'4
  • 73 Ka - L4
  • 70 Ka - L3 --- Out of Africa migration?
  • 67 Ka - L4b, L3a, L3b'f, L3c'd'j, L3e'i'k'x
  • 64 Ka - L3i, L3h
  • 61 Ka - M, L1b, L3e ---- Beginning of Eurasian Expansion
  • 58 Ka - L2b'c, L3f, M1'51, M3a, M3c, M4"64, M5, M9, M12'G, M13'46'61, M25, M29'Q, M32'56, M33, M34'57, M35, M40'62, M44, M49 ---- Arrival to East Asia and Melanesia
  • 55 Ka - L1b1a, L2a, N, M30, M37, M7, M9a'b'c'd, M14, M17, M56, M36, M42, M52'58, M60, D --- Arrival to Australia
  • 52 Ka - R, L0d3, L3d, M3b, M4a, M4b1, M4b2, M45, M13, M21, M27, M39, M71, N1'5, N9, S --- Final pan-Eurasian wave
  • 49 Ka - R0, L4a, L3k, L3x, M2, M38, M43, M6, M8, G, M31, M54, O (N12), R2'JT, R6, R11'B7, B4'5, R30, R31, P ---- Colonization of West Asia
  • 46 Ka - D4, L3e1, M1, M4c, M5a, E, Q, M32a'b, M53, N1, HV, R9, R12'21
  • 43 Ka - M63, M11, M29, M41, M73, D1, N2, N9a, N22, HV0a, JT, R5, R9b, U
  • 40 Ka - H and V, L0k, L2a1, L3b, M1a, M64, M8a, Z, M12, Y, A, X, R0a, F, U6, U2'3'4'7'8'9 ---- Colonization of Europe, North Africa and NE Asia
  • 37 Ka - M10, N5, N9b, X2, H1, H2, H3, H6a, H6b, H7, H9, H10, H13, H14, H15, H16, H17, R8, U6b, U4'9, U8
  • 34 Ka - C, D4a1, A2, R0a2, H2a, H8, H11, H12, H18, H19, J, U3, U5, U6a --- Beginning of coldest conditions
  • 31 Ka - L6, L3c, L3j, H4, J1, U5a, U6d ---- Gravettian
  • 28 Ka - L2d, M7a1a, M23, R2, J1c, J2, R7, R11, U1, U5b
  • 25 Ka - L2e, N1a, I, U6c, U2b, U5b3, U9
  • 22 Ka - W, J2b, U4 --- Solutrean
  • 19 Ka - K, D4h3a
  • 16 Ka - M51, T, K2 --- Magdalenian, end of coldest period
  • 13 Ka - T1
  • 10 Ka - T2, K2a --- End of Ice Age, earliest Neolithic, Epipaleolithic
  • 7 Ka - K1 --- European Neolithic
  • 4 Ka - K1a1, T2b

Note: bold type is arbitrary and marks what I perceive as the "most important haplogroups"; font size, however, reflects the presence of star-like nodes: large size for 5-12 branches, largest size for >15 branches. Clades are listed in logical phylogenetic order, with a few exceptions when a single node seems to define a whole phase, in which case it has been listed first. All suggested dates are in thousands of years (Ka) ago.


Comments:

Of course, I know of no way to properly estimate the effective mutation rate at each point in space and time, which should be affected by factors such as population size and, especially at low population levels, purely random accidents (drift). Still, I would prefer a logarithmic approach, with longer times per mutation towards the past and shorter ones towards the present.

That would probably be better because it would allow pushing the L2 and L3'4'6 expansion towards a more realistic date at the beginning of the Abbassia Pluvial, when we see clear signs of expansion in North Africa and Palestine. It would also push the root of the tree (the earliest genetic signal of expansion of H. sapiens) closer to the oldest known fossils, c. 160 Ka.

Another issue is the known length of downstream branches, even in some well studied lineages, which appear almost "frozen" since their expansion. This seems to happen in particular to large star-like lineages like M and H (not sure why) but, in any case, the high variability in the length of the lineages towards the present is an anomaly that I would rather not have to face.

In this regard, notice the star-like expansions within the K and T haplogroups near the end of the (always tentative) chronology. Wouldn't they fit better some 3,000 years earlier? That way T2 and K2a could take part in the Magdalenian expansion, while K1a1 and T2b would belong to the Neolithic expansion. But maybe they fit well with the Epipaleolithic and what I imagine as some phase of the Indo-European expansion... somehow.

A corrected 2.7 Ka/CRM ratio would fix that.

But it would also push the oldest dates forward quite a bit (for instance the root would be at just 129 Ka), so I feel I need a more refined approach, which should probably be a logarithmic or quasi-logarithmic equation that could account for estimated population sizes. However, my maths skills are terribly rusty...
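The trade-off between the two rates can be checked with a small Python sketch. It assumes the H/V node stays fixed at 40 Ka and uses the step counts given in the Method section (root 33 CR mutations upstream of H, K1 11 downstream); the names `age_ka` and `NODES` are hypothetical.

```python
H_AGE_KA = 40.0  # fixed anchor: age(H,V) = 40 Ka

def age_ka(steps_after_h, rate):
    """Age of a node lying `steps_after_h` CR mutations downstream of H
    (negative values mean upstream, towards the root)."""
    return H_AGE_KA - rate * steps_after_h

# Step counts relative to H, taken from the Method section of the text.
NODES = {"root": -33, "K1": 11}

if __name__ == "__main__":
    for rate in (3.0, 2.7):
        ages = {name: round(age_ka(steps, rate), 1) for name, steps in NODES.items()}
        print(f"{rate} Ka/CRM -> {ages}")
```

At 2.7 Ka/CRM, K1 moves back from 7 Ka to about 10 Ka while the root moves forward from 139 Ka to about 129 Ka. With the three extra CR mutations mentioned in the August 2010 update (36 steps upstream of H), the 3 Ka rate gives the 148 Ka root shown in the results list.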


See also PhyloTree for a whole comprehensive mtDNA phylogeny.

7 comments:

Anonymous said...

Hi. Thanks for your interesting articles.

A few questions :

- You seem to say that U5 is among the youngest U hgs and apparently it's rather seen as older, elsewhere. The "mainstream" estimations seem quite different (from 60 kya to 45 kya). Why the difference?

http://en.wikipedia.org/wiki/Haplogroup_U_%28mtDNA%29#Haplogroup_U5

- Where do you think appeared U8 (and where is its highest frequency in the world?)?
In Europe or in Asia minor/Near-east?
I mean, there's apparently plenty of K in Kurdistan and the near-east. Did they enter Europe or did they leave it through migration at some point?
K kind of puzzles me.

- Where do you think H appeared?
middle-east (apparently that's where HV appeared) or Europe where it is the most numerous mtDNA hg (and as apparently there were some HV found in Europe deep into paleolithic - even though it could have been only at the origin of V) ?

Maju said...

"You seem to say that U5 is among the youngest U hgs and apparently it's rather seen as older, elsewhere".

My method is the hyper-simplest: count the mutations (control region only) from the root (or other ancestral node) down to the relevant node. The usual MCH approaches instead average the length of the various branches downstream of the node they want to estimate. Their system favors lineages which have accumulated more mutations AFTER divergence, hence making lineages such as U (in general) look older and lineages such as HV look younger. I suspect this is an artifact of the method but, of course, my method can also produce some artifacts. I just happen to make more sense of it.

Similarly the usual methods may even say that N "is older" than M and even that H11 "is older" than H (this last is obviously impossible but I have seen it reported).

My method, with whichever flaws it may have, produces results that are consistent in some aspects:

1. The coalescence of L0, L1 and L2"6 is simultaneous, which makes sense if caused by the same kind of conditions.

2. The coalescence of L2 and L3 is simultaneous (same sense as above).

3. The huge star-like M node is the first Eurasian haplogroup to expand, which makes sense with the arrival in huge virgin new lands such as Southern Asia, after the OoA migration.

4. The only other star-like node that can be compared to M, namely H, would also be the first European haplogroup to expand, which makes sense if that happened at the early colonization of Europe (where I think H actually "exploded", rather than in West Asia).
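The contrast drawn in this reply between counting from the root and averaging downstream branch lengths can be illustrated on a toy tree. The tree below is entirely made up for the example (it is not real mtDNA data), and the function names are hypothetical.

```python
# Hypothetical toy tree: child -> (parent, CR mutations on the branch to child).
# A and B split from the root at the same depth, but A's descendants
# carry short branches and B's carry long ones.
TREE = {
    "A": ("root", 2),
    "B": ("root", 2),
    "a1": ("A", 1), "a2": ("A", 1), "a3": ("A", 1),
    "b1": ("B", 4), "b2": ("B", 6),
}

def depth_from_root(node):
    """Root-down count: mutations accumulated from the root to `node`."""
    total = 0
    while node != "root":
        parent, muts = TREE[node]
        total += muts
        node = parent
    return total

def leaves_under(node):
    """All terminal lineages descending from `node`."""
    children = [c for c, (p, _) in TREE.items() if p == node]
    if not children:
        return [node]
    found = []
    for child in children:
        found.extend(leaves_under(child))
    return found

def mean_downstream_length(node):
    """Average number of mutations from `node` down to its terminal lineages."""
    base = depth_from_root(node)
    lengths = [depth_from_root(leaf) - base for leaf in leaves_under(node)]
    return sum(lengths) / len(lengths)

if __name__ == "__main__":
    for node in ("A", "B"):
        print(node, depth_from_root(node), mean_downstream_length(node))
```

In this toy case both nodes sit 2 mutations below the root, so the root-down count treats them as simultaneous, while the downstream average (1 vs. 5 mutations) would make B look several times older than A.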

Maju said...

"Where do you think appeared U8 (and where is its highest frequency in the world?)?"

Highest frequency should be where K is most common, that is in Central Europe (I think), because U8b and U8a are rare. But that's pretty irrelevant because it obviously represents a different and much later process.

It's not totally clear to me where U8 arose: U8a appears clearly European and related to the Franco-Region (but it's a very rare lineage anyhow), while U8b'K might be most diverse around the Alps but, as happens with some other lineages with high diversity in Italy, West Asia is a close competitor. So I can't say for sure.

I think that the expansion signature of U8 as such correlates with the expansion of H (actually slightly after the H node, at the same time as the many H-derived sublineages). But this might just mean a late migration of pre-U8a into Europe, with pre-U8b'K remaining in West Asia.

The U8a-specific expansion signature belongs to the 22 Ka period (but it's probably meaningless, being such a rare lineage anyhow). The U8b'K expansion (bifurcation into pre-U8b and pre-K) instead looks older to me: it belongs to the 31 Ka "moment". This might indicate a penetration in the Gravettian period, if it can be resolved as European.

This is very difficult to say because archaeology supports Paleolithic (minor?) flows between Europe and West Asia in both directions (the European-like art of Turkey and Egypt attests to the West->East direction, while the Epipaleolithic of the Zagros seems to derive from a transcaucasian origin in Eastern European Epigravettian).

When the diversity is so similar, it is extremely difficult to say. Whatever the case, this and other lineages suggest a thin but clear arc of connection between Italy (especially) and West Asia, probably in Paleolithic times too.

"K kind of puzzles me".

Me too: it could be Magdalenian or Neolithic, West Asian or European. I'm undecided.

However, if it could be confirmed that K (rather than the usual suspects), and especially K2a, was the main beneficiary of the Magdalenian re-colonization of Central Europe, then it'd be European almost by default.

However, the aDNA data, in particular the five U individuals from Swabia (U* and U5), rather suggest a Neolithic timeframe for K, allowing it to have coalesced in West Asia. Its presence in North Africa can also be coincident with this Neolithic timeframe (but again it may not be, as the H and V there are almost all derived from SW Europe).

Can't say with any certainty. Hopefully some day we will get some nice aDNA that solves our doubts.

"Where do you think H appeared?
middle-east (apparently that's where HV appeared) or Europe"...?

IMO Europe. There's huge diversity in Europe and many basal sublineages (which must have expanded shortly after H itself) are not found in West Asia.

Also, I have already mentioned the fact that North African H (attested since at least 12,000 BP) is derived from SW Europe, especially Iberia (H1, H3, H4) and also what should be France (H7). This fits perfectly with the classical theory of Oranian being derived from the Gravetto-Solutrean of Iberia (which I think is confirmed by archaeology, even if some "Africanist" authors are reluctant to accept it) and may in turn be related to the high diversity of U6 in Iberia (backflow), a lineage almost non-existent elsewhere in Europe.

Alternatively, one might argue that the "H explosion" (which is really huge in terms of basal sublineages and only comparable to that of M) happened in West Asia but "minutes before" crossing the Bosphorus, so to say. But I'm really inclined to think of it as an Aurignacian marker, because there was some really fast action at that time in Europe, right after the CI supervolcano, which erased almost all previous cultural diversity (some of which might be Neanderthal) and left an almost uniform Aurignacian ethno-cultural landscape in just a few millennia.

Anonymous said...

Thanks for the clarifications.

Anonymous said...

I find it hard to accept that my mtDNA haplogroup originated so long ago. I belong to the V haplogroup.

It goes against what I have read which is an origin around 12-15 Kya.

Maju said...

Hi, Ponto.

"It goes against what I have read"...

It also goes against what I have read... but science is based on methodical doubt.

My method seems to be the opposite: while they count mutations from present-day populations, I count from the root of the genealogical tree. As not all branches are identical in length, there are discrepancies.

So this brings us to a key question: why are some branches longer and others shorter?

I have detected that the shorter branches are also almost invariably the more common ones, which may mean that, when populations expand, the effective mutation rate slows down.

Why? I suspect this may be only natural, as the dominant lineage(s) would easily drift out any novel mutations in most cases unless a major demographic change is involved.

As for haplogroup V specifically, it has been found (2 individuals) in Late Paleolithic North Moroccans of Oranian culture (Kefi 2005 - PPT download from Pasteur Institute) and Cromagnoid morphology, dated to c. 12,000 BP. Oranian itself is dated to c. 22-20,000 BP and probably derived, at least in part, from Iberian Gravetto-Solutrean.

Unless you believe that mtDNA V originated at the Rif and expanded quickly afterwards in some mysterious way (no possible cultural flow can be associated) to as far north as Lapland, I think that this aDNA data clearly rejects the validity of such short estimates.

It does not prove my estimates either but it is at least consistent.

Cheers.

Maju said...

Sorry about the many typos above. :/