A reader points me to a new and quite interesting paper on human phylogeny, mainly from the viewpoint of autosomal DNA.
Jinchuan Xing et al. Toward a more uniform sampling of human genetic diversity: A survey of worldwide populations by high-density genotyping. Genomics 2010. Pay per view.
A copy can be found at ZohoViewer and the supplementary material is also freely available.
High-throughput genotyping data are useful for making inferences about human evolutionary history. However, the populations sampled to date are unevenly distributed, and some areas (e.g., South and Central Asia) have rarely been sampled in large-scale studies. To assess human genetic variation more evenly, we sampled 296 individuals from 13 worldwide populations that are not covered by previous studies. By combining these samples with a data set from our laboratory and the HapMap II samples, we assembled a final dataset of ~ 250,000 SNPs in 850 individuals from 40 populations. With more uniform sampling, the estimate of global genetic differentiation (FST) substantially decreases from ~ 16% with the HapMap II samples to ~ 11%. A panel of copy number variations typed in the same populations shows patterns of diversity similar to the SNP data, with highest diversity in African populations. This unique sample collection also permits new inferences about human evolutionary history. The comparison of haplotype variation among populations supports a single out-of-Africa migration event and suggests that the founding population of Eurasia may have been relatively large but isolated from Africans for a period of time. We also found a substantial affinity between populations from central Asia (Kyrgyzstani and Mongolian Buryat) and America, suggesting a central Asian contribution to New World founder populations.
The abstract already states the paper's most important conclusions: (1) lower genetic distances with better sampling strategies, (2) the claim of a large, distinct founder population at the origin of the Out of Africa migration and (3) the claim of greater affinity of Native Americans with Central Asians than with East Asians sensu stricto. Additionally, they emphasize (4) the finding that the West Eurasian component in South Asians is of West Asian rather than European origin.
In the graphs I have noticed a couple of other details worth mentioning: (5) that Pygmies appear more distinct than Khoisan from the bulk of the species (which somewhat contradicts the haploid phylogeny) and (6) that the African populations closest to Eurasians are "Nilotic" groups of the Kenya-Uganda-Ituri area (neither the Horn of Africa nor the Nile Basin were sampled).
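To illustrate point (1), here is a minimal sketch of why adding intermediate populations can lower a global FST estimate. It uses the textbook Wright formula FST = (HT - HS) / HT for a single biallelic locus with equal-sized populations. This is an assumption on my part, not necessarily the estimator used in the paper, and the allele frequencies are invented toy numbers (far more differentiated than real human populations), so only the direction of the change matters.

```python
def fst(freqs):
    """Wright's F_ST for one biallelic locus, assuming equal-sized populations."""
    p_bar = sum(freqs) / len(freqs)
    # Expected heterozygosity if all populations were pooled into one.
    h_total = 2.0 * p_bar * (1.0 - p_bar)
    # Average expected heterozygosity within each population.
    h_within = sum(2.0 * p * (1.0 - p) for p in freqs) / len(freqs)
    return (h_total - h_within) / h_total

# Hypothetical allele frequencies: a sparse sample of distant populations...
sparse = [0.1, 0.5, 0.9]
# ...and the same range with geographically intermediate populations added.
dense = [0.1, 0.3, 0.5, 0.7, 0.9]

fst_sparse = fst(sparse)
fst_dense = fst(dense)
print(f"sparse sampling: FST = {fst_sparse:.3f}")
print(f"dense sampling:  FST = {fst_dense:.3f}")
```

In this toy case the pooled heterozygosity stays the same while the average within-population heterozygosity rises, so the estimate drops, which is the same direction as the decrease from ~16% to ~11% reported in the abstract.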
I will address some of these matters now.
The migrant Out of Africa population
The authors take some time to address the issue of the migrant population on pages 20-21:
The OoA hypothesis, proposing a single OoA bottleneck followed by an expansion into Eurasia approximately 50,000 years ago, has gained extensive support from the archaeological record and genetic studies. Nevertheless, many of the historical details of this diaspora remain unclear. A common interpretation is that the OoA bottleneck was the result of a migration of a small founding population into Eurasia. Given the difference in haplotype heterozygosity between African and non-African populations and the relationship between heterozygosity and effective population size, we can estimate the effective population size of such a founding population. Within Africa, the average 100-kb haplotype heterozygosity in our data is 0.91. Immediately outside of Africa in Europe, the Middle East, and Central Asia, the average haplotype heterozygosity is 0.82 (Figure 2). A reduction of heterozygosity from 0.91 to 0.82 in a one-generation bottleneck would require an effective population size of only 5.5 individuals. While a one-generation bottleneck is an oversimplification, these estimates indicate that an OoA bottleneck resulting from the migration of a small founding population would require an extremely small population size. However, given that the archaeological record indicates a rapid expansion of modern humans into Europe and Asia in just a few thousand years, it seems unlikely that Eurasia could be populated so quickly by such a small founding population.
A more likely explanation for the OoA bottleneck is that Eurasia was populated by a larger population that had been relatively isolated from other modern human populations for tens of thousands of years prior to the expansion. The first fossil evidence for modern humans outside of Africa is in the Middle East at Skhul and Qafzeh between 80,000-100,000 years ago, which is at least 20,000 years prior to the Eurasian diaspora. If a population of modern humans remained in the Middle East until the expansion into Eurasia, there would have been sufficient time for genetic drift to reduce heterozygosity dramatically before the Eurasia expansion. This “Middle East isolation” hypothesis provides a robust explanation for the relative homogeneity of European and Asian populations relative to African populations (see Figures 3A-B) and is supported by a recent maximum likelihood estimate of 140,000 years ago for the time of Eurasian-West African population separation. Interestingly, a recent study of the Neandertal genome suggests that the non-African individuals, but not the Africans, contain similar amount of admixture (1-4%) with the Neandertals. The authors suggest that the admixture must have happened between the Neandertals with an ancestral non-African population before the Eurasian expansion. Given the fossil, archaeological, and genetic evidence, the Middle East isolation hypothesis warrants rigorous evaluation as whole-genome sequence data become available.
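As a rough check of the arithmetic in the quote, here is a minimal sketch using the standard drift relation H' = H (1 - 1/(2N)) per generation. The quote does not say which formula the authors used, so this is my assumption. With the rounded figures 0.91 and 0.82 it gives about 5 individuals rather than exactly 5.5, a gap that presumably comes from rounding or from a slightly different formula. The same relation, spread over many generations, shows how large a population could be under their "Middle East isolation" scenario. The 25-year generation time is also my assumption and is not in the paper.

```python
def one_generation_ne(h_before, h_after):
    """Effective size N solving h_after = h_before * (1 - 1/(2N))."""
    return 1.0 / (2.0 * (1.0 - h_after / h_before))

def multi_generation_ne(h_before, h_after, generations):
    """Constant N solving h_after = h_before * (1 - 1/(2N)) ** generations."""
    per_generation_retention = (h_after / h_before) ** (1.0 / generations)
    return 1.0 / (2.0 * (1.0 - per_generation_retention))

H_AFRICA = 0.91          # average 100-kb haplotype heterozygosity within Africa (quoted)
H_NEAR_AFRICA = 0.82     # same statistic immediately outside Africa (quoted)
YEARS_ISOLATED = 20_000  # minimum gap mentioned in the quote
YEARS_PER_GENERATION = 25  # assumption, not from the paper

n_one = one_generation_ne(H_AFRICA, H_NEAR_AFRICA)
generations = YEARS_ISOLATED // YEARS_PER_GENERATION
n_many = multi_generation_ne(H_AFRICA, H_NEAR_AFRICA, generations)

print(f"one-generation bottleneck: N ~ {n_one:.1f}")
print(f"{generations} generations of isolation: N ~ {n_many:.0f}")
```

Under these assumptions the one-generation figure is tiny, while the same loss of heterozygosity spread over 20,000 years is compatible with an effective size of a few thousand. A longer isolation allows an even larger population, since the required N grows roughly in proportion to the number of generations.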
I must say that the real problem is assuming a time depth of a mere 50,000 years for the H. sapiens colonization of Eurasia, when that must instead be the date of the reflux into West Eurasia. The archaeological record for Asia east of Iran is inconclusive (too poor), and the genetic data, including those presented here, strongly suggest that South and East Asia were colonized before West Eurasia.
Hence we must be talking of a considerably greater time depth, such as 75-80,000 years ago or more, as most recent population genetic analyses have suggested. Certainly no less than 60,000 years ago.
The authors' assumption is therefore wrong, so their conclusion is likely to be wrong as well.
That doesn't mean that the considerations they make, especially those regarding a very small colonizing population, do not make sense. This small group of adventurous colonists could perfectly well have colonized Asia over a much longer time, leaving very few remains precisely because they were few, and even when they grew in numbers they were still not many. The relatively poor state of Asian archaeology does not help to settle the case in either direction, but we must remember that the Jwalapuram remains have clear African MSA affinities (and hence are likely to be the product of our species) and date from before the Toba event some 74,000 years ago, which could well have reduced Eurasian heterozygosity even further. And there are other archaeological clues that, while not conclusive, may suggest an expansion into Asia from as early as c. 110,000 years ago.
Sure, it would also be a good idea to consider carefully the role of the Middle Paleolithic colonists of Palestine in the whole process, if that is possible. I have nothing against that, but I still don't like their reasoning on this point.
The branching out of Eurasians and the two South Asian components
The neighbor joining trees (see fig. 3 above and also the very similar fig. S1 in the supplementary material) are among the most interesting results of this paper, and the authors are clearly proud of them.
I am going to ignore this "detail" hereafter, but I must mention that the tree in fig. S2, produced after including a North African and two Palestinian populations, is very different. This is strange and I don't know how to handle the discrepancy. It might be a point of support for their hypothesis of a long separate coalescence in the Levant? Can't say.
The two other trees, however, produce a result that is an almost perfect fit with haploid phylogenies, with Eurasians first of all branching in two in Tropical Asia (South and East Asian branches).
Then the South/West Eurasian branch shows a division between South Indians and the rest, which I interpret as a split that happened while still in South Asia, prior to the colonization of West Eurasia. Then Pakistanis and West Eurasians branch apart, and then Europeans diverge from the West Asian/Caucasus population.
Some of the branches' positions, however, may be caused by later admixture, so let's be careful with that.
The authors also emphasize the finding (consistent with what we have seen in other papers) that the second South Asian component, related to West Eurasians, is essentially of West Asian/Caucasus affinity and not European.
I agree with this and I think that it is an important point to make. It seems to imply that there has been substantial gene flow from West Asia into South Asia, especially into its northwestern part. Of course the flow may have happened at different historical and prehistorical periods, but it is important to realize that the Neolithic is the period when such migrations most likely had the greatest impact.
In contrast, some "European" (darker orange) component is also visible. It may originate in the Indo-European flows, and it might be replaced by a more specific Central Asian component (sadly, Central Asia and Siberia are only sparsely sampled in this paper) if the findings of Hui Li can be reproduced with proper sampling strategies in this delicate area. Whatever the case, the European input in South Asia is very minor, even if maybe slightly larger than among West Asians/Caucasians. We can safely infer, I understand, that it reflects the real Indo-European genetic input via Central Asia.
Most importantly a clearly distinct South Asian component (purple) has been detected and is strong enough to make up 50% of the Pakistani gene pool and almost the totality of some South Indian populations. Also notice the distinctive Irula component (blue), which may reflect the particular long isolation of these tribals, in the past tentatively classified as "Negritos".
Notice also the minor but significant presence of the Indian component in SE Asia, especially in Thailand. I have on occasion noticed that some Thais seem to have a distinctive phenotype and maybe this is the explanation.
East Asians and Native Americans
On this point I want to say that I am not totally persuaded by the authors' claim of a greater Central Asian affinity of Native Americans. The main reason is that the "Central Asians" they mention, such as the Nepalese or the Kyrgyz, are possibly admixed populations that owe their position in the NJ tree to that fact.
Even the Buryats appear to show some of that admixture. In this case (and maybe in the others too) the components are probably specific to Central Asia, but they may still reflect a very ancient admixture event during the early Upper Paleolithic colonization of Central Asia and the Far North.
This is a limitation of this paper: they do some chest beating about a very thorough sampling (somewhat justified indeed), but in the case of Central Asia/Siberia the sampling is lacking and the matter is left unclear.
In any case, Native Americans, or rather their ancestral founder population, do seem to have coalesced in a complex, sparsely populated ancient Central Asian and Siberian landscape prior to their arrival in Beringia and subsequent colonization of America. Haploid genetics is very strongly supportive of such a scenario.
It is difficult to ascertain, however, whether their highly divergent position in the NJ tree, within the East Asian branch, is due to their having diverged very early or rather (as I suspect) to an early admixture event, maybe partly shared with Central Asians and Siberians. We would need a much improved sampling strategy in those areas to get some clear ideas.
Otherwise East Asians appear to show a first division between NE Asians and SE Asians, with the divide running across China. Not much more can be said, as the sample does not have sufficient coverage, especially in SE Asia and Oceania.
One of the details of the trees that caught my attention is that, in contrast to what happens in simplified haploid genetics, Pygmies are more distant from the rest of Humankind than Khoisan. This has an explanation, I believe: the lineages more tightly associated with the Khoisan, such as mtDNA L0 and Y-DNA A, have representatives in NW Africa and even Arabia, indicating a protracted divergence (or repeated re-convergence) between the southern proto-Khoisanid branch and the main proto-Afro-Eurasian one. Instead, when proto-Pygmies diverged they probably did so for good, in spite of recent admixture with Bantus and some ancient lineages also shared with West Africans at minority levels.
Another such detail is that the populations most closely related to Eurasians are East Africans (Hema, Luhya, Alur). Overall the African branching process is coherent with the scenario I described here at Leherensuge some months ago.