New blogs

Leherensuge was replaced in October 2010 by two new blogs: For what they were... we are and For what we are... they will be. Check them out.

Wednesday, April 29, 2009

Eurasian mtDNA expansion


What follows is based mainly on my previous post MtDNA tree (version 1.1), which in turn is based fundamentally on the information found at PhyloTree (as well as on Ian Logan's mtDNA site for many of the locations - but not for the phylogeny).

Throughout, I assume that each SNP is equivalent to roughly the same amount of time in all branches. Of course this is a rather daring assumption, but better bold than coy. In any case, overall, it should be a reasonable approximation to real events.


1. Out of Africa and M explosion:


This map represents the time span included in five SNPs: the L2 and L3 explosions in Africa are roughly contemporary but M explosion (producing some 35 lineages) must necessarily have happened some time later (4 SNPs downstream of L3).

As in subsequent maps (and previous posts), macro-haplogroup M is color-coded in red and macro-haplogroup N in blue. The M node is located in South Asia because that is where the highest top-tier diversity is. The specific placement in Bengal is arbitrary: a concession to the also high diversity of this superlineage in East Asia and, to some extent, Sahul. But the M explosion could well have happened in Western or Southern South Asia too, soon after the arrival from (we assume) the South Arabian coasts.


2. The N explosion and diverse M secondary expansions:


This and subsequent "episodes" represent the (theoretically assumed) temporal equivalent of one SNP mutation.

N has been placed in SE Asia because, of the 12 lineages produced in this explosion, the largest minority appears to be in Sahul, specifically Australia. Next in line is again the prolific South Asia. In fact SE and East Asia do not appear very diverse in regard to this clade, but they are what stands between India and Australia.


Together with the N explosion, several M sublineages appear to have expanded. It is notable that Andaman appears to have been colonized in this early episode, and that an arrival in Japan at this early time should also be considered very possible.

The use of the "coastal route" should appear as obvious at this stage already.


3. First colonization of Sahul and other M and N derived expansions:


The flow along the Indo-Pacific arc continues and for the first time we see clear indications of human presence in Sahul, both Melanesia and Australia. Again a mainly Japanese clade is also seen expanding, which seems to confirm the idea that humans were following the East Asian coast northwards. The split of N1'5 is also an indication that, even if N maybe exploded in SE Asia, there was a westward flow, a flow that must also have carried the precursor of R back into South Asia.


4. The R explosion:


In this map I introduce two new notational features: (1) N-derived macro-haplogroup R is color-coded in green and not plain blue (reserved for N(xR) clades), and (2) some other relevant derived lineages are shown as smaller dots with italic text (in this case N1, ancestor of I, which appears to pioneer the colonization of West Eurasia).

The central event here is of course the R explosion (15 top-tier lineages), that happened without doubt in South Asia and that surprises somewhat because of its vigour. I speculate that the "tribe" that first carried this lineage may have developed some decisive technological, sociological or ecological advantage but it is hard to say for sure.

We also witness the contemporary expansion of several M sublineages, notably D in East Asia. M14, noted with a question mark, is shared between Australia and West Asia and, like others of the same kind, has been tentatively located in SE Asia, without much conviction. M27 probably represents the first colonization of the Melanesian islands NE of New Guinea. N1 must represent the first attempt to colonize West Asia, where its carriers must have met the Neanderthals.


5. Main secondary R expansion:


This moment is clearly dominated by R subclades, which appear for the moment limited to the classic Indo-Pacific arc, already transited by their M and N relatives before them.

The R31 node, shared by South Asia and Australia, is tentatively located in SE Asia but could perfectly well be South Asian. It is notable that an R sublineage arrives in Sahul (located in New Guinea but in fact shared with Australia). R2'JT is also notable because it is in all likelihood located in NW South Asia, where R2 is still rather frequent, and not far from West Asia, where JT would expand soon after.


6. East and West:


After the vigorous R expansion of the two previous episodes, this moment seems somewhat quieter and focused on West and East Asia. This may be a partly false impression caused by the fact that I am not paying much attention to "minor" derived subclades.

In any case, the soon-to-be important R0 lineage (also known as pre-HV) is already present and dynamic in West Asia. In East Asia, we see the expansive drive of three important clades: R11'B (ancestor of B), M8 (ancestor of M8a and CZ) and G. Maybe I should have placed G in Japan, as I did with its ancestor, but G is also found in mainland East Asia, notably in Eastern Siberia and Central Asia. The M8 node may well also have a rather northern location, and both may represent a first attempt at colonizing the Eastern Steppes.


7. East and West II, plus the intriguing N2:


Again maybe for lack of low level resolution, we have the most important events centered in East and West Asia at this episode.

In West Asia probably (but surely not far away from South Asia anyhow) the important R-derived lineage U branches out: U1 and U5 probably stay in West Asia, while U6 may have begun its journey towards North Africa. The remaining haplogroup, U2'3'4'7'8'9, surely coalesced somewhere between Iran and Pakistan. Another notable West Asian event is the branching out of HV (7 sublineages, including the one leading to H, which may be already exploring its way into Europe).

In East Asia we can see the branching of three major lineages: CZ, R9'F and B.

N2 is the ancestor of Australian N2a and West/South Asian W. It is surely the last direct mtDNA link between these two regions and surely represents the last meaningful transiting of the "classical" Indo-Pacific coastal route that has been losing relevance already. Again its location in SE Asia is merely tentative.


8. Some further Asian expansions:

West Asia: split of JT, which means that this lineage was already "on location", and also split of U2'3'4'7'8'9. U2 and U7 are still shared by West and South Asia; U8 (ancestor of K among other lineages), U3 and U4'9 surely headed to or were already in West Asia.

South Asia: the expansion of M2 (eastern coasts) is notable in itself.

East Asia: haplogroup E, most important in SE Asia, expands.


9. European and North African colonization:


Not the only event in this episode but surely the most important one is the simultaneous expansion of haplogroups H (28 sublineages within a starlike structure: a macro-clade in its own right, closer to M in its generated diversity than to N or R) and U6. H is clearly centered in Europe and U6 is in North Africa. They appear to correspond to Aurignacian and to the "aurignacoid" Dabban industries respectively (both derived from a Levantine "proto-Aurignacian", it seems). The expansion of HV0 (of which V is a derived lineage) is surely also related to those events as is that of U8 (not marked in the map but contemporary anyhow).

Other notable events are the expansion of Q in New Guinea, that of N9a and N9b in East Asia (their "sister" Y would follow soon after) and that of R5 and R8 in South Asia.


Epilogue:

And I believe it is a good moment to stop this visual narration. There are still some relevant events that happened after episode 9 but I have not drawn the maps.


Some of the most relevant not yet told events are:

Episode 10: F expands in East Asia, A does too (but further north probably), Y (N9-derived) also expands at this moment, which means the colonization of Sakhalin. X expands in West Asia and Egypt (X1-X2 split).

Episode 11: M1 expands in West Asia and East Africa, V expands in Europe, U5 splits into U5a and U5b (in Europe?).

Episode 13: C and Z expand already as distinct lineages in what is probably another episode of the colonization of the far North (Z is shared by such distant peoples as Japanese, Finns and Hazara, for instance).

Episode 14: J expands in West Asia.

Episode 17: expansion of K.


Episode 18: W and T expand in West Asia (W maybe South Asia too). It is possible that T2 migrated to Europe at this time.

Note: I estimated before that these K, T and W expansions might be as old as c. 21 kya, but right now it appears to me that they could also be Neolithic. The decisive variable is the date of the M (and N) event (as the H node seems necessarily fixed at c. 41 kya): if the M node is estimated at c. 70 or 74 kya, then K, T and W only expanded in Neolithic or Epipaleolithic times; but if, as I did before, it is estimated at c. 60 kya, then this late expansion is still well within the Paleolithic and K could therefore well be associated with the Magdalenian expansion, for instance.

Of course these "age estimates" ignore demographic factors altogether, which is in itself a huge risk.
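The arithmetic behind this note can be sketched as a simple linear interpolation. The sketch assumes (this is a reading of the episode numbering, not something stated explicitly above) that the M node sits at episode 1, the H node at episode 9, and that every later episode is one SNP of equal duration. The function name `episode_age_kya` and its parameters are illustrative only.

```python
def episode_age_kya(episode, m_age_kya, h_age_kya=41.0, m_episode=1, h_episode=9):
    """Age (in kya) of an episode under a strict one-SNP-per-episode clock.

    Assumption: the clock is anchored at the M node (episode 1) and
    the H node (episode 9, fixed at c. 41 kya as in the note).
    """
    ky_per_snp = (m_age_kya - h_age_kya) / (h_episode - m_episode)
    return m_age_kya - ky_per_snp * (episode - m_episode)


# Compare the three M-node dates discussed in the note.
for m_age in (60.0, 70.0, 74.0):
    k_age = episode_age_kya(17, m_age)   # episode 17: expansion of K
    tw_age = episode_age_kya(18, m_age)  # episode 18: expansion of W and T
    print(f"M node at {m_age:.0f} kya -> episode 17: {k_age:.1f} kya, "
          f"episode 18: {tw_age:.1f} kya")
```

Under these assumptions, episodes 17 and 18 fall at roughly 22 and 20 kya with the M node at 60 kya, at roughly 12 and 8 kya with the M node at 70 kya, and at roughly 8 and 4 kya with the M node at 74 kya, which is why the choice of M date decides between a Paleolithic and a Neolithic reading.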


That's all, folks.

23 comments:

terryt said...

Your ideas regarding M and R appear pretty sound, but I spot a difficulty with your Ns. You have S arriving in Australia in number 3 and R popping up in number 4. But A and Y don't make their appearance till later. They must have been around, and presumably geographically separated, before R appeared.

terryt said...

Correction. I've just looked at the PhyloTree again and realised A and Y must have been around contemporary with R. Therefore you should have them in either number 3 or 4.

Maju said...

You have to keep in mind that this is nothing but mapping the mtDNA tree with the assumption that each SNP means some roughly equivalent amount of time.

Haplogroup A is derived directly from N but has a very long stem (many SNPs between N and A with no known branches). This means that pre-A remained a small lineage for a long time. Where? Somewhere in Eastern Asia with all likelihood but we can't know for sure.

Haplogroup Y is a different case because it is derived not directly from N but from N9. N9 branched out (meaning some level of expansion) early after N. Then all three branches remained small (stem phase) until they expanded again in their respective destinations. Nevertheless their respective expansions are very much parallel, with a very similar timeframe (slightly later for Y, but that's probably only because of the greater effort of colonizing its northern destination). This is intriguing in itself and may mean that the three N9 sublineages were closely tied to each other (same ethnicity or whatever) before their respective expansions.

"Correction. I've just looked at the PhyloTree again and realised A and Y must have been around contemporary with R. Therefore you should have them in either number 3 or 4."

Check my version of that tree, please. It is the same tree but I have put the SNP distance in perspective.

Pre-A splits from N at the same time as pre-R (and all other N sublineages) but, while the stem leading to R is 2 SNPs long, the one leading to A is much longer and implies that A expanded not simultaneously to R at all but actually much later, probably after H did, just to mention a reference clade. Same for X and mostly also for Y (with the caveat that Y is derived from N9, not N directly).

All these N sublineages remained as a thin thread with no expansion until they finally expanded. And is only the expanded lineage that we call A, X or Y, not the stem phase. If something would be found branching out from one of these stems, it would be called pre-A or something like that, not A.

Maju said...

PS-

I just wrote: If something would be found branching out from one of these stems, it would be called pre-A or something like that, not A.

In fact Y is an excellent example. After defining this haplogroup in the "paleolithic of population genetics" (whoa, the late 20th century!), it was discovered that in fact there was something hanging from the Y stem. That was called N9a and N9b and the haplogroup including the three was rechristened as N9.

Admittedly the mtDNA nomenclature is pretty much chaotic and has other examples of this kind (or even uglier). The same happened to H and even to HV, the resulting higher level haplogroup was called first pre-HV and then R0. Others just have no commonly accepted names as of now - but they exist regardless of the chaos in the nomenclature.

Maju said...

Or, from another perspective:

After each SNP mutation there is the possibility that a new haplogroup forms. This would happen if the lineage expanded but, if it does not, the proto-branches are drifted out, "reabsorbed", all except one (if that one would also be drifted out, the whole lineage would become extinct and we would not know of it nowadays).

A haplogroup is nothing but a group of lineages that share a common ancestral node. If there is no such node, as may happen with some rare lineages (though most of them have some branching even if only at the very end of the stem, i.e. a few generations ago), we still sometimes call that a haplogroup, but it would be much more proper to speak of a private lineage, stemming from whatever node. H for instance has a great many of those private lineages (which in fact constitute the bulk of the haplogroup), normally grouped under the umbrella of H*.

terryt said...

"After each SNP mutation there is the possibility that a new haplogroup forms".

As long as it leaves daughters, who leave daughters, etc. a new haplogroup has formed.

"This would happen if the lineage expanded".

It doesn't have to expand until long after it first forms. It would probably eventually normally come to near fixation in a population of fairly constant, and relatively limited, size.

"each SNP means some roughly equivalent ammount of time".

Not necessarily, but it will give some idea of the sequence. A derived clade cannot form before its parent clade.

"Somewhere in Eastern Asia with all likehood but we can't know for sure".

But what route did pre-A take to get there? To me India seems unlikely.

"the one leading to A is much longer and implies that A expanded not simultaenoustly to R at all but actually much later".

But that only means it EXPANDED later. As you mention, pre-A still split from the parent haplogroup at the same time as pre-R.

"All these N sublineages remained as a thin thread".

And which regions does this thin thread pass through?

"And is only the expanded lineage that we call A, X or Y, not the stem phase".

Wouldn't you say these stem phases were already quite widely spread? Remembering the many ancient N mt-haps found around and north of the Zagros Mountains.

"it would be called pre-A or something like that, not A".

Surely that's what N is to all of them. There are a variable number of SNPs, and therefore variable numbers of pre-? on each line, but they still all separated at roughly the same time.

"Admittedly the mtDNA nomenclature is pretty much chaotic".

Yes. The Y-chromosome haplogroups are more sequentially named. Mt-haps were first named for American Indian haplogroups, the rest just tack onto those.

"if it does not, the proto-branches are drifted out, 'reabsorbed', all except one (if that one would also be drifted out, the whole lineage would become extinct and we would not know of it nowadays)".

And I happen to believe that's exactly what we see. Haplogroups spread when times are good and then become isolated. At which time individual haplogroups become fixed in individual populations. Many of them then have a secondary spread when good times (or new technology) returns. However this second expansion can often involve a haplogroup of the opposite sex unrelated to the one they first came in with.

"(though most of them have some branching even if only at the very end of the stem, i.e. a few generations ago)".

Surely we're concentrating here on deeper branches.

terryt said...

"it was discovered that in fact there was something hanging from the Y stem. That was called N9a and N9b and the haplogroup including the three was rechristened as N9".

What's the problem? Both N9s are East Asian anyway.

Maju said...

The problem is what we call a haplogroup and what is just a private lineage.

A is not the same as pre-A, just as Y is not the same as N9(xY). We count the haplogroup from the node that branches out, not from its divergence from the ancestral root.

It is an important conceptual difference: something that shares 5 SNPs with a haplogroup but lacks 1 does not belong to that haplogroup, though it's obviously related.
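A minimal sketch of that membership rule, treating a haplogroup as the full set of SNPs that define its node. The SNP labels (`s1` to `s6`, `private1`) and the haplogroup name `A-like` are hypothetical placeholders, not real mtDNA positions.

```python
# Hypothetical defining mutations, for illustration only.
HAPLOGROUP_DEFS = {
    "A-like": frozenset({"s1", "s2", "s3", "s4", "s5", "s6"}),
}


def shared_count(sample_snps, haplogroup):
    """Number of defining SNPs that the sample carries."""
    return len(HAPLOGROUP_DEFS[haplogroup] & set(sample_snps))


def belongs_to(sample_snps, haplogroup):
    """A sample belongs only if it carries every defining SNP."""
    return HAPLOGROUP_DEFS[haplogroup] <= set(sample_snps)


full_member = {"s1", "s2", "s3", "s4", "s5", "s6", "private1"}
stem_lineage = {"s1", "s2", "s3", "s4", "s5"}  # shares 5, lacks 1

print(shared_count(stem_lineage, "A-like"), belongs_to(stem_lineage, "A-like"))
print(shared_count(full_member, "A-like"), belongs_to(full_member, "A-like"))
```

The stem lineage shares five of the six markers and is clearly related, yet the subset test excludes it, which is the "pre-A versus A" distinction in code form.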

It is also important for my analysis because I understand that we do know more or less where a haplogroup expanded from where it is found now (this is more obvious for clades concentrated in an area than for clades that are spread around, logically). But we know almost nothing about where the private lineage precursor of that haplogroup was at the stem phase. The best we can do is locate the ancestral node and the haplogroup node and draw a straight line (which is logically just a gross approximation, nothing else).

So we can guess with some certainty where the L3 and M nodes happened, but for the period of the four SNPs between them we can just draw a straight line and then approximate a most reasonable route, adapting it to actual geography. As I said before, it could also have happened via UFO abduction for all we can tell, as a private lineage technically only needs one succession of individuals of the same gender to exist, nothing else.

In realistic terms it may have been a tiny clan, a larger one that got massacred but one or a few survived, a lineage that was lurking as a minority inside a group dominated by a different lineage, etc.

We can only spot them back in time at the branching nodes, because these represent some sort of expansion. The rest are just speculative dotted lines.

"Wouldn't you say these stem phases were already quite widely spread? Remembering the many ancient N mt-haps found around and north of the Zagros Mountains."

No. If you join the scattered dots derived from N (and even more if you see that from a more general perspective, and still even more if you look also at the first N-derived lineages that do show signs of autonomous expansion) you get the N node located in SE Asia (not the Zagros, not at all).

"...but they still all separated at roughly the same time."

Yes and that's why N has a starlike structure (many lines scattering from a common center) and that's why we think of the N node as a demic expansion event of large dimensions (but not as large as M or even H, not at all).

But it says virtually nothing of A or R, just that they started their journey then, along with many other lineages, many of which probably have not survived the test of time.

What is relevant is when A or R specifically expanded, flourished, not so much their seed-like "pre-" stage. Obviously something happened too at that stage but I can only shrug when faced with questions about it, at most speculate along the lines of: if by seed stage 0 they were at Bangkok and by seed stage n they were at Okhotsk, maybe at seed stage n/2 they were around Shanghai. But who knows? Maybe they migrated fast early on and then languished at the edge of the tundra, or maybe they did exactly the opposite: lurking as a minority clade near Bangkok until an adventurous descendant migrated to Siberia, where she had great success. The path is ultimately unpredictable at such minimal levels of information, at such minimal levels of people actually involved.
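The "straight line" guess can be written down as plain linear interpolation between the two known nodes. This is only a sketch of that speculative rule: the coordinates are approximate values supplied for illustration, the interpolation is done directly on latitude and longitude (not along a great circle), and the function name `interpolate` is hypothetical.

```python
def interpolate(start, end, fraction):
    """Linear interpolation between two (lat, lon) points, fraction in [0, 1]."""
    lat = start[0] + (end[0] - start[0]) * fraction
    lon = start[1] + (end[1] - start[1]) * fraction
    return (lat, lon)


# Approximate coordinates in decimal degrees (assumed for illustration).
BANGKOK = (13.75, 100.50)  # seed stage 0
OKHOTSK = (59.36, 143.20)  # seed stage n

halfway = interpolate(BANGKOK, OKHOTSK, 0.5)  # seed stage n/2
print(f"Stage n/2 guess: {halfway[0]:.2f} N, {halfway[1]:.2f} E")
```

With these inputs the midpoint falls close to Shanghai's longitude and a few degrees further north, so "around Shanghai" is about as precise as the method allows. Nothing in the calculation says the lineage ever actually passed through that point.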

So why the emphasis on what is impossible to determine with any certainty?

"And I happen to believe that's exactly what we see. Haplogroups spread when times are good and then become isolated. At which time individual haplogroups become fixed in individual populations. Many of them then have a secondary spread when good times (or new technology) returns. However this second expansion can often involve a haplogroup of the opposite sex unrelated to the one they first came in with."

We are in agreement on this then. Or at least I hope so. Though fixation does not necessarily result in a single universal haplogroup: it is a process with that tendency but at any moment there may be several haplogroups in different proportions. Drift and fixation need time and small populations. If the population expands again before the fixation process is finished, or if the fixation process just doesn't have time to finish at all (because the population involved is too large), we can see several, possibly unrelated, haplogroups going together.
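The claim that drift and fixation need time and small populations can be illustrated with a toy simulation. This is a sketch under a textbook Wright-Fisher model with neutral, non-overlapping generations, which is an assumption brought in for illustration and not a method used in this discussion. The function names, the four starting lineages, and the population sizes are all arbitrary choices.

```python
import random


def generations_to_fixation(pop_size, n_lineages, rng, max_generations=100_000):
    """Generations until one lineage is carried by the whole population."""
    # Start with the lineages in (near) equal proportions.
    population = [i % n_lineages for i in range(pop_size)]
    for generation in range(max_generations):
        if len(set(population)) == 1:
            return generation
        # Each individual of the next generation picks a random mother.
        population = rng.choices(population, k=pop_size)
    raise RuntimeError("no fixation within max_generations")


def mean_fixation_time(pop_size, n_lineages=4, replicates=20, seed=0):
    """Average fixation time over several independent runs."""
    rng = random.Random(seed)
    times = [generations_to_fixation(pop_size, n_lineages, rng)
             for _ in range(replicates)]
    return sum(times) / len(times)


small = mean_fixation_time(pop_size=20)
large = mean_fixation_time(pop_size=200)
print(f"mean generations to fixation: N=20 -> {small:.0f}, N=200 -> {large:.0f}")
```

In this toy model the larger population takes far longer to reach a single lineage, so a population that starts growing again early would still carry several lineages, as described above.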

Take modern European demic colonization as an example (and let's use Y-DNA for this case): we see that the various European haplogroups (and even West African ones, even against their will) all benefitted from it. Of course R1b was most favored because it's dominant at the origin (Western Europe) but R1a, I, J, E1b1 and even T and N3 also have been carried to the colonies in variable amounts.

If such a process had happened in the much more chaotic Paleolithic conditions (much smaller population: much greater chance of founder effect and drift), you could see, for example, E1b1 becoming dominant in, say, Queensland. Of course, the odds do favor R1b but here and there other lineages may accidentally win the race.

"What's the problem? Both N9s are East Asian anyway."

The N9 case is rather clear, I think but what about A or X? They are related almost as much as N9 derivates are with each other but A is found in NE Asia and X in the SW. This case is confusing if taken in isolation. So when you start talking about the pre-A stage (you could well talk about the pre-X stage if that fit your imagination, I guess) you just drive me nuts.

That is the problem.

terryt said...

"I think but what about A or X? They are related almost as much as N9 derivates are with each other but A is found in NE Asia and X in the SW. This case is confusing if taken in isolation".

And where is the Indian haplogroup that should connect them? Drifted out? In a population the size of India's, with so many M haplogroups surviving, and N's daughter Rs?

Their nearest connection is with the YN9 haplogroup (which can easily be considered to be just one haplogroup, let's combine them as Y) which is found in East Asia. And they're all connected to mtDNA S, in Australia. As you say, "That is the problem".

terryt said...

I just checked your mtDNA tree again and you almost had me convinced: N21 and N22. But I checked. They're Malay. Hardly what I'd call South Asia. More like SE Asia. So my question stands: 'And where is the Indian haplogroup that should connect them?'

terryt said...

By the way. Do you happen to know how widely Y-chromosome F2 is distributed? All I can find is a general 'E Asia'.

Maju said...

"And where is the Indian haplogroup that should connect them? Drifted out? In a population the size of India's, with so many M haplogroups surviving, and N's daughter Rs?"

A and X? The only connector is N-root. Their long stems clearly indicate they were very small private lineages before they found their niches to expand in opposite corners of Asia.

There must have existed many, many private lineages like these; most simply never survived the test of time, while others did, but only as tiny private lineages (most Australian N clades in fact).

"Their nearest connection is with the YN9 haplogroup (which can easily be considered to be just one haplogroup, let's combine them as Y) which is found in East Asia. And they're all connected to mtDNA S, in Australia. As you say, 'That is the problem'."

Well, first, the name is N9, not Y. The nomenclature is already too complicated to allow you to add more of your own naming innovations arbitrarily. Y is a subclade of N9, just like I is a subclade of N1, HV of R0 or CZ of M8. If we can't even agree on the names, we can hardly go anywhere.

Second: neither N9 nor S connect anything regarding A and X. They are sister clades and hence autonomous since the N-node explosion.

So what connects them all? N and only N.

"I just checked your mtDNA tree again and you almost had me convinced: N21 and N22. But I checked. They're Malay."

You may be right on this. I could only access the Pierson 2005 reference and he reports them in Island SE Asia.

It is an interesting erratum. It reinforces the idea that N probably exploded in SE Asia. I'll review that tree later on.

"So my question stands: 'And where is the Indian haplogroup that should connect them?'"

Connect what? We see traces of Sahul-South/West Asia movements in other N lineages like N2 (and others in the M and R sets). The main South/West Asian lineage is anyhow N1'N5, which is also the pioneer of West Asian colonization. The main N-derived South Asian lineage is R.

If what you ask is for something hanging from the pre-X stem, we just don't find it anywhere: X was private before it got lucky and found a niche where it could expand in West Asia/Egypt. The same can be said of A: there's just nothing between its putative SE Asian origins and its actual expansion in NE Asia.

"By the way. Do you happen to know how widely Y-chromosome F2 is distributed? All I can find is a general 'E Asia'."

No, I wish.

Ebizur said...

Haplogroup F2-M427/M428 was first described by Sengupta et al. in 2006:

"Indigenous and Exogenous HGs Represented in India
On the basis of the combined phylogeographic distributions of haplotypes observed among populations defined by social and linguistic criteria, candidate HGs that most plausibly arose in situ within the boundaries of present-day India include C5-M356, F*-M89, H-M69* (and its sub-clades H1-M52 and H2-APT), R2-M124, and L1-M76. The congruent geographic distribution of H-M69* and potentially paraphyletic F*-M89 Y chromosomes in India suggests that they might share a common demographic history.

A median-joining network analysis (not shown) of F*-M89 microsatellite haplotypes in Indians and East Asian Lahu suggested divergence between these two populations. This is confirmed by the discovery of the linked HG F2-M427 and M428 markers (table 2) that are restricted to the Lahu in our data set. HG R2-M124 occurs with a frequency of 9.3% in India, consistent with 8%–10% reported elsewhere (Kivisild et al. 2003a; Cordaux et al. 2004). The decreasing frequency of R2—from 7.4% in Pakistan to 3.8% in Central Asia (Wells et al. 2001) to 1% in Turkey (Cinnioğlu et al. 2004)—is consistent with the pattern observed for the autochthonous Indian H1-M52 HG."

Lahu (Sino-Tibetan > Tibeto-Burman > Loloish)
5/7 = 71.4% F2-M427/M428
1/7 = 14.3% O3a3c-M134
1/7 = 14.3% O3-M122(xO3a3c-M134)

I wonder how much of the F(xK) Y-DNA in the Yi people and other Tibeto-Burman-speaking populations of southwestern China, northeastern India, and vicinity should belong to F2-M427/M428.

Maju said...

Thanks for the input, Ebizur. But I don't get why you think that Sengupta was talking of F2 in that quote: it's all the time F*-M89, not F2-M427/428.

Following what Terry has researched, two other F-number clades (F1 and F4, no mention of F3) are found primarily or exclusively in South Asia, as well as F*. That should account for Sengupta's F*-M89, right?

terryt said...

Also, Thanks for the input, Ebizur.

"I wonder how much of the F(xK) Y-DNA in the Yi people and other Tibeto-Burman-speaking populations of southwestern China, northeastern India, and vicinity should belong to F2-M427/M428".

That might indicate that F2 did not actually move very far into East Asia, just to the border regions between India and China. So F (in the strict sense) is not too widely spread. It's K's relations that really take off, and they basically only move into the southeast, until the descendants of NOP in turn take off.

terryt said...

I realise this is an old article but you may not have seen it. Claims that Northeast India has basically always been a barrier:

http://www.ncbi.nlm.nih.gov/pubmed/15128876

There has, of course, been opposition to the claim.

terryt said...

Further to the rapid southern migration theory. I have Thursdays off and it was raining all day so I was able to follow up some of the mtDNA M haplogroups, and line them up to some extent, similar to what I have done for several Y-chromosome haplogroups and mtDNA N.

It seems the haplogroups M44-M51 (many of which you've labelled unknown) were mostly identified during research in the Khasi Hills, east of the Brahmaputra in Assam. You will be aware of my reasons for preferring to call this region part of a 'borderland' between India and SE Asia. This borderland would include the forested hills of Burma, Northern Thailand, Laos and Yunnan (in Southwestern China). I suggest we then call the region northeast of this borderland 'East Asia'. The region south of the borderland we could profitably call 'Sunda', stretching right down to, and including, Malaya, Western Indonesia and the island of Borneo. For various reasons I would include Vietnam in this geographic region. We could profitably divide the region across Wallacea into two: 'Australia' and 'New Guinea/Melanesia'. Away to the north we find mtDNA M haplogroups centred on what we could call another, different, geographic region, 'Japan'.

So, using the above geographic classification we find (as you are well aware) the greatest number of basal mtDNA M haplogroups in India. Thirteen actually: M2, M3, M4/M30/M37/etc., M5, M6, M22, M33, M34, M35, M36, M39, M40 and M41. With further research we may be able to narrow down each of their centres of expansion within India, especially if we can identify Sri Lankan haplogroups.

The next greatest number of haplogroups (7) are pretty much centred on the borderland region, including the Andaman Islands: M44, M46, M47, M48, M49, M51 and M31/M32.

This takes care of 20 of the 36 haplogroups. Hardly supporting evidence for a rapid eastward expansion from India.

By the time we work our way further north into East Asia we find just four haplogroups (D, M11, M13 and M2) and further north still, around Japan, we find three (C/Z/M8, M7 and G/M12). Southwards, Sunda has just two: E/M9 and M10. Melanesia 3: M27, M28 and Q/M29. And Australia with just two (M15 and M42), although Australia also shares haplogroup M10 with Sunda and M14 with Western Eurasia. Talking of which, we also find M1 in Western Eurasia, almost certainly the product of back migration from India.
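The regional tally in this comment can be checked with a few lines of code. The lists below simply transcribe the haplogroup labels as given in the comment, including M2, which appears under both India and East Asia. Treating the shared M14 and the West Eurasian M1 as their own entries is an assumption made here to reproduce the stated total of 36.

```python
# Basal M haplogroups by region, transcribed from the comment as written.
M_BASAL_BY_REGION = {
    "India": ["M2", "M3", "M4/M30/M37/etc.", "M5", "M6", "M22", "M33",
              "M34", "M35", "M36", "M39", "M40", "M41"],
    "Borderland (incl. Andaman)": ["M44", "M46", "M47", "M48", "M49",
                                   "M51", "M31/M32"],
    "East Asia": ["D", "M11", "M13", "M2"],  # M2 repeated, as in the comment
    "Japan": ["C/Z/M8", "M7", "G/M12"],
    "Sunda": ["E/M9", "M10"],
    "Melanesia": ["M27", "M28", "Q/M29"],
    "Australia": ["M15", "M42"],
    "Australia + Western Eurasia": ["M14"],  # assumed to count once
    "Western Eurasia": ["M1"],
}

counts = {region: len(names) for region, names in M_BASAL_BY_REGION.items()}
total = sum(counts.values())
india_plus_borderland = counts["India"] + counts["Borderland (incl. Andaman)"]

for region, n in counts.items():
    print(f"{region}: {n}")
print(f"India + borderland: {india_plus_borderland} of {total} "
      f"({india_plus_borderland / total:.1%})")
```

Counted this way the entries add up to 36, with 20 of them (a little over half) in India and the borderland, which matches the figures quoted above.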

I'm sure you will be able to correct any mistakes and improve this list.

Maju said...

Yah, it's a nice classical paper. But it refers to a second phase: after South and East Asia had become "disconnected". In a sense it's much like those papers claiming that Gibraltar Strait was a barrier: reality is not as extreme in fact.

What it makes clear is that Assam-plus belongs to SE Asia and not South Asia and that the area has been pretty much on its own and more connected to Burma and Tibet than to anywhere else.

But does this reflect the conditions of the MP? And, if it does, what about the Bengal-Arakan coastal corridor?

"It seems the haplogroups M44-M51 (many of which you've labelled unknown) were mostly identified during research in the Khasi Hills, east of the Brahmaputra in Assam. You will be aware of my reasons for preferring to call this region part of a 'borderland' between India and SE Asia."

Aha, pretty interesting. I agree that Assam-plus is SE Asia rather than South Asia, which actually ends in Bangladesh-Tripura.

"This borderland would include the forested hills of Burma, Northern Thailand, Laos and Yunnan (in Southwestern China). I suggest we then call the region northeast of this borderland 'East Asia'. The region south of the borderland we could profitably call 'Sunda', stretching right down to, and including, Malaya, Western Indonesia and the island of Borneo."

I'd call that "borderland" SE Asia. Everybody does.

Sunda is a term for many Indonesian islands, including the southern part of Wallacea (Lesser Sunda Islands). Sundaland (not just "Sunda") is the ancient peninsula or subcontinent extending south of the Kra Isthmus, now fragmented into islands (Greater Sunda Islands) and the Malay Peninsula.

For various reasons I would include Vietnam in this geographic region.

I see absolutely no reason to get Vietnam into Sundaland.

... another, different, geographic region, 'Japan'.

That could make some sense.

All these regions (except Sahul) anyhow belong to East Asia and there's no way to determine borders between them, the same that you'd be hard pressed to place border lines inside Europe. They are more like interconnected provinces in a major world region.

You could probably do the same with South Asia, dividing it into three or four provinces (like NW, mid-East and South - or SW and SE) but the flow between them is too high to really matter for our purposes. Trying to be too detailed may end up in failure.

So, using the above geographic classification we find (as you are well aware) the greatest number of basal mtDNA M haplogroups in India. Thirteen actually: M2, M3, M4/M30/M37/etc., M5, M6, M22, M33, M34, M35, M36, M39, M40 and M41. With further research we may be able to narrow down each of their centres of expansion within India, especially if we can identify Sri Lankan haplogroups.

I counted 15 in fact. My own partial notes on some of them (as well as on R subclades) show their cores to be pretty widely spread in too many cases. Some lineages extend across the whole subcontinent, while others have wildly scattered cores, maybe in Sri Lanka and Kashmir or things like that.

Going into excessive detail may get us rather lost instead of helping us find out more, I fear.

Anyhow, for your amusement, some informative papers:

- Chaubey et al. on R7

- Metspalu on South Asian mtDNA - maps for M2, M3, M4a, M6, M18, M25, R2, R5, R6, U2, U7 and W.

- Thangaraj et al. on why mtDNA M originated in South Asia. Mentions some rare M haplogroups in the 30-40 range, with sample locations. Not too comprehensive.

- A thread on the Quetzalcoatl Anthropology Forum with a list of many papers on South Asian genetics. Among them, Palani et al. may be important, as it shows why R is South Asian by origin.

In the Y-DNA area you may enjoy reading this one by Redd et al. on the apparent South Asian-Australian connections within haplogroup C specifically (it predates the definition of C4 and C5, but that does not really matter).

The next greatest number of haplogroups (7) are pretty much centred on the borderland region, including the Andaman Islands: M44, M46, M47, M48, M49, M51 and M31/M32.

This takes care of 20 of the 36 haplogroups. Hardly supporting evidence for a rapid eastward expansion from India.

I am not sure you are right about that right now, but in any case 15 > 7, and all spawn anyhow from the M root, in starlike manner, indicating a rapid expansion from somewhere. South Asia is the default conclusion based on diversity levels (plus overall apportions too).

terryt said...

Thanks for those links. A couple of clarifications. First off East Asia includes haplogroups D, M11, M13 and M21 (not M2).

"But does this reflect the conditions of the MP?".

Probably, yes.

"Sundaland ... is the ancient peninsula ... extending south of the Kra Isthmus, now fragmented into islands ... and the Malay Peninsula".

And at such times what we now call Thailand (especially Southern Thailand) and Kampuchea were absolutely part of Sundaland. The mountains to the north of that region would have been part of a different ecological region, the 'borderland' between Sundaland and South Asia.

"I see absolutely no reason to get Vietnam into Sundaland".

I didn't carefully read what I'd written. I actually meant Vietnam was part of East Asia, not Sunda. Vietnam is separated from mainland Sunda by the very rugged Annam Cordillera. Vietnam has been culturally connected to China much more than have Laos and Thailand. This latter region's connection to China is relatively recent.

"there's no way to determine borders between them, the same that you'd be hard pressed to place border lines inside Europe".

You're not that hard pressed. Mountains divide the macro-regions into micro-regions in both places. Sure, the North European Plain is pretty continuous, but even there, although boundaries are ill-defined, the population basically forms a cline from east to west (or vice versa if you insist). Elsewhere in Europe the Pyrenees, Alps, Carpathians and even the Dinaric and Rhodope mountains form boundaries to some extent. Same in East Asia.

"I counted 15 in fact".

I counted haplogroups M4/M30/M37/etc. as a single haplogroup. They all derive from a single root within M.

"South Asia is the default conclusion based on diversity levels".

I totally agree that mtDNA M has its origin in India. It's N I have a problem with, although the research recently posted by Dienekes may show R is actually the root of this haplogroup, in which case it too could derive from India.

Maju said...

And at such times what we now call Thailand (especially Southern Thailand) and Kampuchea were absolutely part of Sundaland. The mountains to the north of that region would have been part of a different ecological region, the 'borderland' between Sundaland and South Asia.

That is an assumption you make but what does the archaeological and genetic data say? All I know is that haplogroup D is important and probably originated in that area of Thailand-Cambodia-Andaman - but instead of migrating to the south, it headed northwards, both along the coast to Japan and to the highlands near Tibet.

I didn't carefully read what I'd written. I actually meant Vietnam was part of East Asia, not Sunda. Vietnam is separated from mainland Sunda by the very rugged Annam Cordillera. Vietnam has been culturally connected to China much more than have Laos and Thailand. This latter region's connection to China is relatively recent.

Well, whatever. I really do not see much sense in chopping SE Asia or East Asia into those reduced and somewhat arbitrary areas. If at least these had an archaeological or genetic basis, but I don't see that clearly anywhere: all I see is speculative assumptions.

Elsewhere in Europe the Pyrenees, Alps, Carpathians and even the Dinaric and Rhodope mountains form boundaries to some extent.

LOL. You are telling a Basque that the Pyrenees are a boundary and not a heartland? C'mon!

When the mountains were covered in glaciers, notably the Alps, they were indeed major barriers. But otherwise...

We were historically a single people living at plains of two distinct climatic-ecological regions (Atlantic and Mediterranean) and the mountains in between were never any frontier but the backbone. The mountains were where we found refuge when attacked and where we defeated our foes who moved insecurely in such terrain. Thanks to the mountains we kept (often) the plains, both to the north and to the south.

For organized armies mountains are barriers but for local guerrilla parties (and in the Paleolithic this was the only warfare), mountains are fortresses, deadly traps where to make the enemies pay in blood the price of their greed.

"I counted 15 in fact".

I counted haplogroups M4/M30/M37/etc. as a single haplogroup. They all derive from a single root within M.

You're correct about that but anyhow they are 14 (not 15, as I said before).

It's N I have a problem with, although the research recently posted by Dienekes may show R is actually the root of this haplogroup, in which case it too could derive from India.

That paper (if it's the same one I commented on here) does not say that: it suggests that "R" would be restricted to the three West Eurasian lineages (R0, U and JT). It basically suggests scrapping R altogether. But it lacks SNP-defined branches and is based on a mere statistical approximation, so I'm all for ignoring it unless it receives more support, which I have not detected so far.

N still looks SE Asian to me in any case, as the South Asian R haplogroups are not even mentioned, with the exception of R5. P was excluded too.

terryt said...

"The mountains were where we found refuge when attacked and where we defeated our foes who moved insecurely in such terrain".

So the mountains were a barrier to everybody else once the Basques had become established. As a result the Pyrenees separate what is now called Spain from what is now called France.

Maju said...

The mountains were a barrier for armies, and certainly the central Pyrenees are hardly passable: no real passes, unlike the Alps. The Pyrenees can be crossed by armies on the lower and less abrupt west (Basque) and east (Catalan) sides. Most invasions in one direction or the other went through the eastern side (Hannibal, etc.); the western side was better guarded and/or was less interesting economically before modernity. No armies crossed it without trouble, but the real trouble was the people living and fighting there.

We are talking about two different things: peoples and states/armies. For states mountains are barriers; for peoples they are often homelands and refuges. If a border is placed there, then the people naturally become smugglers, as the border is their heartland, not any barrier.

What's your point?

terryt said...

"What's your point?"

You took issue with my comment, 'Mountains divide the macro-regions into micro-regions in both places'. Seems there's now no disagreement. People occupying mountains are difficult to displace.