Ann. Hum. Genet. (2001), 65, 43–62
Printed in Great Britain

The phylogeography of Y chromosome binary haplotypes and the origins of modern human populations

P. A. UNDERHILL¹*, G. PASSARINO¹,⁵, A. A. LIN¹, P. SHEN², M. MIRAZÓN LAHR³,⁴, R. A. FOLEY³, P. J. OEFNER² AND L. L. CAVALLI-SFORZA¹

¹ Department of Genetics, Stanford University, 300 Pasteur Dr., Stanford, CA 94305–5120, USA
² Stanford DNA Sequencing and Technology Center, 855 California Ave, Palo Alto, CA 94304, USA
³ Department of Biological Anthropology, University of Cambridge, Downing Street, Cambridge CB2 3DZ, UK
⁴ Departamento de Biologia, Inst. de Biociencas, Universidad de São Paulo, Rua do Matão, Travessa 14, No. 321, 05508–900 Cidade Universitária, São Paulo, Brasil
⁵ Department of Cell Biology, Calabria University, Rende, Italy

(Received 24.8.00. Accepted 16.11.00)

SUMMARY

Although molecular genetic evidence continues to accumulate that is consistent with a recent common African ancestry of modern humans, its ability to illuminate regional histories remains incomplete. A set of unique event polymorphisms associated with the non-recombining portion of the Y-chromosome (NRY) addresses this issue by providing evidence concerning successful migrations originating from Africa, which can be interpreted as subsequent colonizations, differentiations and migrations overlaid upon previous population ranges. A total of 205 markers identified by denaturing high performance liquid chromatography (DHPLC), together with 13 taken from the literature, were used to construct a parsimonious genealogy. Ancestral allelic states were deduced from orthologous great ape sequences. A total of 131 unique haplotypes were defined which trace the microevolutionary trajectory of global modern human genetic diversification.
The genealogy provides a detailed phylogeographic portrait of contemporary global population structure that is emblematic of human origins, divergence and population history that is consistent with climatic, paleoanthropological and other genetic knowledge.

* Correspondence: P. A. Underhill. E-mail: under@stanford.edu

INTRODUCTION

A model for the origins of human diversity deduced from palaeontological evolutionary geography maintains that while the modern human species originates from a single evolutionary event, diversity is a result of subsequent multiple evolutionary events associated with various geographic range expansions, migrations, colonizations and differential survival of populations (Lahr & Foley, 1994). Overall, current paleoanthropological evidence would suggest an early set of dispersals across Africa and into Western Asia; an early southern dispersal into Asia and Melanesia; and a later one into Northern Eurasia. Overlain on these events are the contractions associated with the Last Glacial Maximum (LGM), and subsequent post-glacial expansions of both hunter-gatherers and agriculturists.

DNA sequences offer an evidentiary alternative to fossil-based pre-historical reconstructions (Jorde et al. 1998, Owens & King 1999). The uniparentally inherited non-recombining haploid mtDNA and the Y chromosome loci are particularly sensitive to the influences of drift, especially founder effect. Consequently these loci are ideal for assessing the origins of contemporary population diversity, and provide context for paleontological hypothesis testing (Foley, 1998). The combination of a recent
molecular age (Shen et al. 2000), and geographical structure, makes the NRY a sensitive genetic index capable of tracing the microevolutionary patterns of novel modern human diversity. Any and all population level forces and possible localized natural selection that reduces the effective male population size relative to females will influence the genetic landscape.

We combine 205 PCR compatible binary NRY polymorphisms (Underhill et al. 2000; Shen et al. 2000) together with 13 additional markers from the literature to examine phylogeographical patterns that may record historical population migrations, mergers and divisions that account for the current spectrum of human variability. While extrapolating variation associated with a single gene to population history must be interpreted cautiously, the phylogeographic reconstruction presented here offers one such interpretation. It comprehensively integrates the prehistoric and Y-chromosome data, along with inferences from mtDNA and autosomal haplotypes, into a possible hypothesis for the evolution of human diversity. We attempt here to discuss the observed phylogeographic patterns of NRY variation in the context of global population diversification, and integrate it with paleoclimatological, paleoanthropological and other genetic knowledge. In developing our synthesis we have aimed at producing palaeodemographic hypotheses that are consistent with as many other lines of evidence as possible, and that are amenable to testing by further studies from a number of disciplines.

MATERIALS AND METHODS

Samples

DNA from 1062 men belonging to 21 populations was analysed. Further details on the ethnic affiliations of these samples are given in Underhill et al. (2000).

PCR

Primers designed for SMCY, DFFRY, UTY, and DBY covered all unique sequences and repeat elements other than LINE, yielding overlapping amplicons 300–500 bp in length. PCR conditions are given in Underhill et al. 2000 and Shen et al. 2000.
All 218 polymorphisms are given in Appendix I (deposited at http://www.gene.ucl.ac.uk/anhumgen/) which lists primers, the primary reference for each marker, the specific DNA sequence variant and its location in the fragment. Two new markers (M223, M224) found while genotyping other markers are included.

DHPLC analysis

Unpurified PCR products were mixed at an equimolar ratio with a reference Y chromosome and subjected to a 3-min 95 °C denaturing step followed by gradual reannealing from 95 °C to 65 °C over 30 min. Ten µl of each mixture were loaded onto a DNASep column (Transgenomic, San Jose, CA), and the amplicons were eluted in 0.1 M triethylammonium acetate, pH 7, with a linear acetonitrile gradient at a flow rate of 0.9 ml/min (14). Using appropriate temperature conditions, which were optimized by computer simulation (http://insertion.stanford.edu/melt.html), mismatches were recognized by the appearance of two or more peaks in the elution profiles.

DNA sequencing

Polymorphic and reference PCR samples were purified with QIAGEN (Valencia, CA) QIAquick spin columns, cycle sequenced with ABI Dye terminator cycle sequencing reagents and analysed on a PE Biosystems 373A sequencer. Chimpanzee, gorilla and orangutan samples were also sequenced for each human polymorphic locus.

RESULTS AND DISCUSSION

The 218 NRY polymorphisms were used to deduce a phylogenetic tree based on the principle of maximum parsimony, in which a network of branches is drawn that minimizes the number of mutational events required to relate the lineages (Fig. 1). The ancestral alleles were deduced using
great ape sequence data to root the phylogeny. All phylogenetically equivalent mutations whose order cannot be determined are indicated with a slash (e.g. M42/M94/M139). Markers with M numbers > 218 reflect the selective removal of polymorphisms associated with recurrent length variations such as tetra- or pentanucleotide repeats and homopolymer tracts. The determination of the ancestral state for these polymorphisms is uncertain, and (with one exception, M91) they were excluded from the analysis (Underhill et al. 2000). The marker panel comprises 125 transitions, 66 transversions, 26 insertions/deletions, plus an Alu element. All polymorphisms except one are biallelic. A double transversion, M116, has three alleles whose derived alleles define quite different haplotypes. Two transitions (M64 and M108) showed evidence of recurrence but cause no ambiguity. No reversions were observed, although one transition, SRY10831 (Whitfield et al. 1995), also referred to as SRY1532 (Kwok et al. 1996), is known to be a reversion (Hammer et al. 1998). It is not included here as we have phylogenetically stable transversion and deletion polymorphisms that unequivocally mimic its patterns.

Fig. 1. Maximum parsimony phylogeny of human NRY chromosome biallelic variation. Tree is rooted with respect to non-human primate sequences. The 131 numbered compound haplotypes were constructed from 218 mutations that are indicated on segments. Marker numbers are discontinuous (see text). Haplotypes are assorted into 10 groups (I–X).

Haplotypes are partitioned into haplogroups (called Groups I–X) in an attempt to simplify discussion of phylogeography, using the simple criteria of presence or absence of alleles located in the interior of the phylogeny. These discretionary Group designations provide a framework for categorization and discussion of haplotypes. The Y genealogy is composed of 131 haplotypes that delineate the 10 Groups, seven of which are monophyletic. Three groups are polyphyletic, but have related haplotypes defined as follows: the presence of M89/M213 and absence of M9 (Group VI); the presence of M9 and absence of M175/M214 and M45/M74 (Group VIII) or the presence of M45/M74 and absence of M173/M207 (Group X).
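The presence/absence criteria given above for the three polyphyletic Groups can be written as a small rule set. The following is only an illustrative sketch, not the procedure used in the study: the function name and the representation of a chromosome as a set of derived marker names are assumptions made here, and the sketch further assumes that the set lists every derived marker the chromosome carries. Markers joined by a slash in the text are treated as equivalent, so carrying any one of them counts as presence.

```python
# Illustrative sketch only: the function name and data layout are hypothetical,
# not taken from the paper. Input is the set of markers carrying the derived allele.

def assign_polyphyletic_group(derived):
    """Return 'VI', 'VIII' or 'X' under the stated criteria, else None."""

    def has(*equivalent):
        # Slash-joined markers (e.g. M89/M213) are phylogenetically equivalent,
        # so any one of them being derived counts as presence.
        return any(marker in derived for marker in equivalent)

    # Group VI: presence of M89/M213 and absence of M9.
    if has("M89", "M213") and not has("M9"):
        return "VI"
    # Group VIII: presence of M9, absence of M175/M214 and of M45/M74.
    if has("M9") and not has("M175", "M214") and not has("M45", "M74"):
        return "VIII"
    # Group X: presence of M45/M74 and absence of M173/M207.
    if has("M45", "M74") and not has("M173", "M207"):
        return "X"
    # The seven monophyletic Groups are not covered by these three rules.
    return None


if __name__ == "__main__":
    print(assign_polyphyletic_group({"M168", "M89", "M213"}))
    print(assign_polyphyletic_group({"M168", "M89", "M9"}))
    print(assign_polyphyletic_group({"M168", "M89", "M9", "M45", "M74"}))
```

A chromosome that matches none of the three rules returns None, which here simply means that it falls in one of the monophyletic Groups whose defining markers are read from Figure 1.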
The contemporary global frequency distribution of the 10 Groups based on > 1000 globally diverse samples genotyped using a hierarchical top down approach is illustrated in Figure 2, which is based upon frequency data given in Underhill et al. 2000. Autochthonal variations associated with the NRY, in addition to tracing a common African heritage, resolve numerous population subdivisions, gene flow episodes and colonization events. They show the overall pattern of the progressive succession of Group differentiation and movement across the world reflective of expansions and genetic drift processes.

This composite collection of 218 NRY variants provides improved resolution of extant patrilineages. Additional resolution will occur with the discovery of new delimiting markers. The succession of mutations is unequivocal except in branches defined by two or more markers. While uncertainties related to assessing the effective population size of males make temporal estimates of bifurcation events difficult, age estimates of key nodes have been made assuming a model of population growth (Thomson et al. 2000). These indicate a more recent ancestry of the NRY at 59 000 years (95% CI = 40 000–140 000) than previously estimated at 134 250 ± 44 980 years based on 13 mutational events and constant population size (Karafet et al. 1999). Neither demographic model is likely to be realistic, as the palaeoanthropological evidence shows a more complex population history. It should be noted that the lower estimate is considerably younger than the earliest evidence for dispersals of modern humans.

Phylogeography

Intriguing clues about the history of our species can be derived from the study of the geographic distribution of the lineages on the tree in Figure 1, in the approach known as ‘phylogeographic’ (Avise et al. 1987). Such an approach has been previously used for mtDNA networks (Richards et al. 1998, 2000; Kivisild et al. 1999; Macaulay et al. 1999).
Figure 3a–h depicts the hypothesized chronological geographic distribution of Y Groups from the Isotope Stage 5 interglacial to the Holocene. The underlying assumption of phylogeography is that there is a correspondence between the overall distribution of haplotypes and haplogroups and
past human movements. The strong geographical signal seen in the Y chromosome data is consistent with this assumption. The interpretative framework should be compared with alternatives, such as continuous gene flow, selection, or the effects of recent events. However, these alternatives have not been formally developed in ways that can be tested against the data, and are less consistent with other lines of evidence.

Fig. 2. Contemporary worldwide distribution of Y chromosome groups in 22 regions. Each group is represented by a distinguishing colour. Coloured sectors reflect representative group frequencies. Pacific basin not to scale. With respect to Table 1 of Underhill et al. (2000), Hunza and Pakistan+India are combined. In addition the results of Native Americans have been subdivided in North (N = 14), Central (N = 13) and South (N = 79).

Groups I and II are restricted to Africa, and are distinct from all other African and non-African chromosomes on the basis of the M168 mutation. In an analogous context, the mtDNA haplogroups L1 and L2 are distinguished from other Africans and all non-Africans on the basis of the 3594 mutation (or 3592 HpaI restriction site, Chen et al. 2000 and citations therein). Group I is distinguished by the absence of the derived alleles for M42/M94/M139 and presence of M91, while all non-African, as well as the majority of African males, sampled carry the derived alleles. Both Group I and II lineages are diverse and suggest a deeper genealogical heritage than other haplotypes. Representatives of these lineages are distributed across Africa but generally at low frequencies. Populations represented in Groups I and II include some Khoisan and Bantu speakers from South Africa, Pygmies from central Africa, and lineages in Sudan, Ethiopia and Mali. A single Sardinian was in Group I.
All members of Group II share the M60 and M181 mutations that are distributed across Africa, with an idiosyncratic occurrence in Pakistan. M182 defines the major sub-clade, although an intermediate haplotype in Mali with the unique M146 mutation still persists. Although not mutually exclusive, some geo-
P. A. UNDERHILL AND OTHERS Figure ab Late stages 囗回 Fioure Je. Early Glacial? 50-45 Ky F345:30M 叭383-20M FM3LGM:1009M9490 [四@mWE Fig 3a-h Hypothesized chronology of the global geographie distributions of Y chromosome mutations and groups during geological time periods relevant to the history of anatomically modern humans
48 P. A. U Fig. 3a–h. Hypothesized chronology of the global geographic distributions of Y chromosome mutations and groups during geological time periods relevant to the history of anatomically modern humans
y chromosome binary haplotypes and origins of modern human populations 49 graphie substructure of derived Group II haplo- 2000), while early human population history is types is detected Within the major M182 cluster, more likely characterized by sequential M112/M192 lineages reflect mostly central and expansions and contractions. The effects of southern African populations, whereas M150 repeated founder effects on the age estimates of associated lineages tend to be represented by mutations may account for the apparent dis- populations in Sudan, Ethiopia or Mali crepancies. The current age estimates from the Y In conclusion, the pattern of Group I and I chromosomes rule out very ancient histories distributions, their phylogenetic position and However, the confidence limits of the molecular accumulated variation is suggestive of an early estimates are probably less resolved than the diversification and dispersal of human popu- palaeoanthropological data, and so are used here lations within Africa, and an early widespread to give broad relative frameworks(i.e. order of distribution of human populations in that con- events)rather than precise bracketing in terms tinent. Their patchy distribution, with high of an absolute chronology frequencies among isolated hunter-gatherer groups and in parts of Ethiopia and Sudan, may be interpreted as the survivorship of some of Out of africa these ancient lineages through more recent The M168 mutation represents the sig signature of population events. The palaeoanthropological the recent successful modern human migrations record suggests that during the Stage 5 in- across Africa and beyond, and it is at the root of terglacial (130000-90000 years ago)(Fig. 3a), Groups Ill-X. The geographical distribution of early human populations expanded throughout Groups Ill-X allows us to try to understand Africa, north and south of the Sahara, also some of the major movements that occurred reaching the Levant (Lahr& Foley, 1998). 
These after the human beings left Africa population expansions are supported by faunal The main points considered in that recon evidence, which shows the presence of not only struction are: the early formation of a non modern humans, but east African species in the African sub-cluster characterized by mutations Middle East at this time(Tchernov, 1994). A last RPS4Y/M216, present today among interglacial age for the first pan-African dispersal Australians, New Guineans, southeast Asians of humans is much earlier than the presently Japanese and central Asians (Group V): the estimated age of 59 000 years ago for the common shared presence of the derived YAP/M145/M20 ancestor of NRY variation. Three considerations alleles in Africans. southeast Asians and should thus be made. First, that as mentioned J apanese(Groups Ill and IV); the distribution above, the upper limit of the confidence interval of a third sub-cluster, characterized by (CI)of the age estimate(140000 years)embraces mutations M89 /M213, across the entire world the period concerned. Second, that the history of with the exception of most of sub-Saharan Africa human Y-chromosomes is characterized by a (Groups VI-X reduction of variation, more so than that of All Y-chromosomes that are not exclusively female lineages(Shen et al. 2000). Thus the initial African contain the M168 mutation, which may phase of early human expansions within sub- have originated within an East African popu Saharan Africa between 130000 and 70000 years lation as a sub-group of Group Il M168 lineages ago may have witnessed several expansion evolved into three distinct sub-clusters: one events, with the extinction of earlier human which acquired an Alu insertion (YAP= NRY variation. Support for this is the apparent DYS287)and the equivalent M145/M203 nucle absence of intermediate haplotypes related to tide substitutions, and two other lineages the M42/M94+/M139 segment. 
Finally, the es- defined by the distinct mutations RPS4Y/M216 timation of the age presented here assumes a and M89/M213. The destiny of these three model of population growth (Thomson et al. sub-clusters represents a deep structuring
graphic substructure of derived Group II haplotypes is detected. Within the major M182 cluster, M112/M192 lineages reflect mostly central and southern African populations, whereas M150-associated lineages tend to be represented by populations in Sudan, Ethiopia or Mali.

In conclusion, the pattern of Group I and II distributions, their phylogenetic position and accumulated variation is suggestive of an early diversification and dispersal of human populations within Africa, and an early widespread distribution of human populations in that continent. Their patchy distribution, with high frequencies among isolated hunter-gatherer groups and in parts of Ethiopia and Sudan, may be interpreted as the survivorship of some of these ancient lineages through more recent population events.

The palaeoanthropological record suggests that during the Stage 5 interglacial (130 000–90 000 years ago) (Fig. 3a), early human populations expanded throughout Africa, north and south of the Sahara, also reaching the Levant (Lahr & Foley, 1998). These population expansions are supported by faunal evidence, which shows the presence of not only modern humans, but also east African species in the Middle East at this time (Tchernov, 1994). A last interglacial age for the first pan-African dispersal of humans is much earlier than the presently estimated age of 59 000 years ago for the common ancestor of NRY variation. Three considerations should thus be made. First, as mentioned above, the upper limit of the confidence interval (CI) of the age estimate (140 000 years) embraces the period concerned. Second, the history of human Y-chromosomes is characterized by a reduction of variation, more so than that of female lineages (Shen et al. 2000).
Thus the initial phase of early human expansions within sub-Saharan Africa between 130 000 and 70 000 years ago may have witnessed several expansion events, with the extinction of earlier human NRY variation. Support for this is the apparent absence of intermediate haplotypes related to the M42/M94/M139 segment. Finally, the estimation of the age presented here assumes a model of population growth (Thomson et al. 2000), while early human population history is more likely characterized by sequential expansions and contractions. Repeated founder effects may bias the age estimates of mutations and so account for the apparent discrepancies. The current age estimates from the Y chromosomes rule out very ancient histories. However, the confidence limits of the molecular estimates are probably less resolved than the palaeoanthropological data, and so are used here to give broad relative frameworks (i.e. order of events) rather than precise bracketing in terms of an absolute chronology.

Out of Africa

The M168 mutation represents the signature of the recent successful modern human migrations across Africa and beyond, and it is at the root of Groups III–X. The geographical distribution of Groups III–X allows us to try to understand some of the major movements that occurred after humans left Africa. The main points considered in that reconstruction are: the early formation of a non-African sub-cluster characterized by mutations RPS4Y/M216, present today among Australians, New Guineans, southeast Asians, Japanese and central Asians (Group V); the shared presence of the derived YAP/M145/M203 alleles in Africans, southeast Asians and Japanese (Groups III and IV); the distribution of a third sub-cluster, characterized by mutations M89/M213, across the entire world with the exception of most of sub-Saharan Africa (Groups VI–X).
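The branching just described can be sketched as a small lookup. This is only an illustration, not the procedure used to build the genealogy: the marker names and Group labels come from the text, while the table layout, the function name `assign_subcluster` and the choice to treat markers listed together (for example M89 and M213) as interchangeable are assumptions made for the example.

```python
# Sub-clusters of the M168 lineage as listed in the text.
# The data structure and function name are hypothetical.
M168_SUBCLUSTERS = [
    ({"YAP", "M145", "M203"}, "Groups III-IV"),
    ({"RPS4Y", "M216"}, "Group V"),
    ({"M89", "M213"}, "Groups VI-X"),
]


def assign_subcluster(derived_markers):
    """Return the sub-cluster implied by a set of derived (mutated) markers."""
    derived = set(derived_markers)
    if "M168" not in derived:
        # Chromosomes lacking M168 fall outside Groups III-X.
        return "Groups I-II"
    matches = [label for markers, label in M168_SUBCLUSTERS if markers & derived]
    if len(matches) > 1:
        # Unique event polymorphisms should not place one chromosome on two branches.
        raise ValueError(f"conflicting sub-cluster markers: {sorted(derived)}")
    return matches[0] if matches else "M168, sub-cluster unresolved"


print(assign_subcluster({"M168", "M89", "M213"}))
print(assign_subcluster({"M168", "RPS4Y"}))
```

A chromosome carrying derived alleles from two different sub-clusters raises an error in this sketch, since under the unique-event assumption such a combination would point to a typing error or a recurrent mutation rather than a real haplotype.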
All Y-chromosomes that are not exclusively African contain the M168 mutation, which may have originated within an East African population as a sub-group of Group II. M168 lineages evolved into three distinct sub-clusters: one which acquired an Alu insertion (YAP = DYS287) and the equivalent M145/M203 nucleotide substitutions, and two other lineages, defined by the distinct mutations RPS4Y/M216 and M89/M213. The destiny of these three sub-clusters represents a deep structuring of
Y-chromosome diversity outside Africa. However, before considering their history across the world, their contextual origin within Africa is important. We mentioned above the existence of palaeoanthropological evidence for an early pan-African/Levantine expansion of modern humans during the last interglacial period, which we associate with the expansion of Groups I and II lineages. The subsequent African history from the archaeological and fossil records is poorly known, but the palaeoclimatic record shows that the onset of glacial climates 70 000 years ago was accompanied by fragmentation of African environments and isolation of both northwest and northeasternmost Africa from each other and the south. Lahr & Foley (1994, 1998) suggested that it had been during this period of fragmentation and isolation that African populations acquired variation that was then exported out of Africa independently through multiple dispersals of different African groups. In other words, part of the diversity found outside Africa today was the magnification of a process of diversification within Africa between 90 000 and 50 000 years ago (Fig. 3b). This reconstruction received support from the "Weak Garden of Eden" model, based on the pairwise mismatch distribution of mtDNA lineages showing the contraction of ancestral African diversity into separate groups, which would have then expanded independently within and beyond Africa at slightly different times (Sherry et al. 1994). The NRY data give resolution to these models. The evolution of the M168 mutation into four separate clusters (YAP/M145/M203/M40/M96; ancestors of YAP/M145/M203/M174; M89/M213 and ancestors of RPS4Y/M216 lineages) is consistent with a process of population sub-division in Africa of an M168 population, prior to the main dispersal events into Eurasia 50 000–40 000 years ago.
The age of the M168 mutation, representing the last common ancestor of all non-African human Y-chromosomes, has been estimated to be 40 000 years (95% CI 31 000–79 000) (Thomson et al. 2000). As with the age of the common ancestor of NRY variation discussed above, this estimate is young in relation to the events recorded in the archaeological and fossil records. However, it embraces within its confidence interval the period in which the models above postulate an African population fragmented and differentiated into distinct sub-clusters that later dispersed out of the continent. The recent age of the M168 mutation is evidence that most modern extant Y-chromosomes trace their ancestry to African forefathers who left Africa relatively recently and eventually completely replaced archaic Y-chromosome lineages in Eurasia. Suggested departure routes for these dispersals include passages via the Horn of Africa to India and the Middle East Levantine corridor (Cavalli-Sforza et al. 1994; Lahr & Foley, 1994).

The evolution of NRY diversity within Africa

The YAP/M145/M203 Group III

Sub-Saharan African populations today are characterized by the presence of four NRY Groups (I, II, III, and VI), of which Group III is the most frequent. The lineage that acquired the YAP/M145/M203 polymorphisms in Africa divided into two sub-clusters. One is found today in Africa and the Mediterranean, defined by the M40/M96 mutations (Group III), where M40 = SRY4064 (Whitfield et al. 1995). The other sub-cluster is found in Asia and is defined by the M174 mutation (Group IV) (discussed in the next section).

Group III lineages are found at high frequencies in Africa, and relatively high frequencies in the Middle East and southern Europe (characterized by the M35/M215 mutations), with occasional occurrences in Central Asia, Pakistan and America (Hammer et al. 1997; Hammer & Horai, 1995; Qamar et al. 1999; Underhill et al. 2000). Considerable haplotype differentiation within Group III is observed.
Most notably, the PN2 transition (Hammer et al. 1997) unites two high frequency sub-clades, defined by M2/PN1/M180 mutations in sub-Saharan Africa, and M35/M215 in north and east Africa, the Mediterranean basin and Europe. The widespread distribution of these two sub-clades,
which together account for 80% of Group III lineages, is considered to be the result of recent events. The M2 transition (Seielstad et al. 1994) and its analogues, PN1 (Hammer et al. 1997) and M180, are linked to the RFLP 49f Ht4, found in high frequency throughout Africa. The wide distribution of this sub-clade in sub-Saharan Africa probably reflects the Bantu agricultural expansion in the last three thousand years (Passarino et al. 1998; Scozzari et al. 1999). The expansion of Bantu farmers would have been largely accompanied by the replacement of other Y-chromosomes. The extent of founder effects associated with the recent expansion of Group III lineages is illustrated by the M191 mutation, which occurs in approximately 40% of the M2/M180/PN1 clade members. Furthermore, the low frequency of lineages within Groups I and II and of the 20% minority of the haplotypes within Group III that lack the PN2 mutation, distinguished by either the M33 or M75 mutations and confined to Africa, is evidence of the impact of the Bantu expansion, which overwhelmed the pre-existing African NRY chromosome diversity. This is not revealed by the pattern of mtDNA diversity, which indicates a persistence of mtDNA haplotypes, suggesting a larger effective population size of African women versus African men. However, it finds reflection in the sub-Saharan African fossil record, which shows greater early Holocene/late Pleistocene morphological diversity than at present (Lahr, unpublished results).

The M35/M215 sub-clade cluster of haplotypes fragments a lineage (Ht 4) described previously (Hammer et al. 1997). We suggest that a population with this sub-clade of the African YAP/M145/M203/PN2 cluster expanded into the southern and eastern Mediterranean at the end of the Pleistocene (Fig. 3h).
These lineages would then have been introduced from the Middle East into southern Europe (and to a lesser extent northern India and Pakistan) by farmers during the Neolithic expansion.

In East Africa some of the least differentiated YAP/M145/M203/M40/M96 lineages are observed (Ht 28, 35, 39). The M89/M213 mutations are at the root of Groups VI–X. However, Group VI, as presently known, remains polyphyletic, differentiated from the others (Groups VII–X) by the absence of the M9 mutation. We suggest that the M89/M213 mutations shared by East Africans and most Eurasian and Amerindian world populations occurred in a northeast African population carrying the M168 lineage. Similarly, East Africa has been indicated as the source of mtDNA haplogroup M (Quintana-Murci et al. 1999) and superhaplogroup U (the oldest lineage with the 16223 transition) (Kivisild et al. 1999). Thus, part of this early population carrying the M89/M213 mutations is still represented within northeast African NRY diversity, while most of its descendants are found outside of Africa, having dispersed via the Levantine corridor to Eurasia 45–30 K years ago (Fig. 3d). This Eurasian M89/M213 ancestral population would have later diversified into further sub-clusters of Group VI, as well as Groups VII, VIII and a lineage carrying the derived M45/M74 alleles, which gave rise to Groups IX and X (discussed below).

The colonization of Australo-Melanesia and formation of early Asian outliers

The RPS4Y/M216 Group V

RPS4Y/M216 lineages are found in Asia, Australo-Melanesia and North America (Bergen et al. 1999; Karafet et al. 1999). We suggest that an M168 African population dispersed from the Horn of Africa via a coastal or interior route (50–45 K years ago; Walter et al. 2000) towards southern Asia, where the RPS4Y/M216 mutations probably originated (Fig. 3c). Descendants of this dispersal reached southeast Asia and were also the first to colonize the Sahul landmass (New Guinea and Australia).
The latter are characterized by the lineages M38/M208 (ht48) and M210 (ht49). One RPS4Y/M216 lineage acquired the M217 mutation, which spread through central and eastern Asia, also reaching Japan (where individuals with RPS4Y/M216/M8/M105/M131 are much less frequent than RPS4Y/M216/M217 lineages), and later North America. Today, with the exception of
Australia, and to a lesser extent New Guinea, the RPS4Y/M216 lineages have a relic distribution. Significant similarities between Group V lineages in Australia and New Guinea and chromosome 21 MX1 haplotype 2 have been observed (Jin et al. 1999).

The YAP/M145/M203/M174 Group IV

Group IV lineages (YAP/M145/M203/M174) are exclusively Asian. Interestingly, all members carry the ancestral alleles for M40 and M96. On the basis of YAP and M40 and nested cladistic analysis, this pattern has been interpreted as evidence for an Asian origin of the YAP insertion mutations and a subsequent back migration to Africa, followed by a major expansion (Altheide & Hammer, 1997; Hammer et al. 1998). This interpretation requires reassessment. The presence of M174 in the phylogeny underscores the difficulty of basing directionality on the absence of a character state. The M174 data taken alone would support an African origin of the YAP/M145/M203 polymorphisms, as the M174 ancestral allele is found exclusively in Africa. The relatively shallow time-depth of the Y phylogeny (Thomson et al. 2000) suggests that the extinction rate of Y haplotypes is high, which might account for the apparent absence of any YAP/M145/M203 chromosomes in Africa that carry the ancestral allele SRY4064 (= M40). Thus the current apparent lack of less derived precursor haplotypes within both Africa and Asia precludes the localization of the geographical origin of YAP. Our results from nested cladistic analysis (unpublished) indicate a range expansion, but not the migration direction required to localize the origin of YAP. Additional clarification concerning the geographical origin of the YAP insertion may come from the possible eventual discovery of intermediate haplotypes. There is no other evidence, either from the other NRY groups, mtDNA (Quintana-Murci et al. 1999) or palaeoanthropology (Lahr & Foley, 1998), of a Paleolithic migration back to Africa.
We suggest that a population carrying only the YAP/M145/M203 polymorphism dispersed from Africa through the Horn towards southern Asia (Fig. 3c), similar to the Asian dispersal of RPS4Y/M216 lineages discussed above (also probably at a similar time). YAP/M145/M203/M174 lineages are today mostly confined to Japan and Tibet, where they occur at high frequencies, with fewer found scattered throughout southeast Asia (Su et al. 1999). The chromosome 21 MX1 haplotype 6, which correlates with Group III lineages, also occurs in East Asia, including Japan (Jin et al. 1999). This parallels the shared common ancestry of Y chromosome related lineages between Africa (Group III) and East Asia (Group IV). Similar to Group V lineages, Group IV haplotypes have a relic distribution in Asia.

In conclusion, we suggest that Group IV and V lineages in Asia represent the descendants of two early dispersal events of African populations from the Horn of Africa to southern Asia (Fig. 3c), dispersals that would have also taken the mtDNA M haplogroup to India. Alternatively, the distribution of Group IV and V lineages in Asia could reflect a single dispersal event of a population carrying both Group IV and V lineages, the former in low frequencies, facilitating its subsequent extinction in most descendant populations except Japan and Tibet, where drift would have increased its frequency. This southern route of dispersal from East Africa to India and beyond does not seem to have been used very frequently, either by modern humans or archaic hominids (Lahr & Foley, in press), but rather in particular windows of time when climate and geography allowed, possibly related to the combination of low sea-levels and light monsoonal regimes.
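The single-dispersal alternative above relies on drift acting on a lineage that starts at low frequency: it is lost in most descendant populations but can rise by chance in a few. A minimal sketch of that behaviour follows, using a haploid Wright-Fisher resampling model. The model, the function name `drift_trajectory` and every parameter value (starting frequency, number of males per deme, number of generations, number of demes) are assumptions chosen for illustration and are not taken from the paper.

```python
import random


def drift_trajectory(p0, n_males, generations, rng):
    """Haploid Wright-Fisher drift for one lineage in one deme.

    Each generation, n_males Y chromosomes are drawn at random from the
    previous generation; p is the frequency of the lineage of interest.
    """
    freqs = [p0]
    p = p0
    for _ in range(generations):
        carriers = sum(rng.random() < p for _ in range(n_males))
        p = carriers / n_males
        freqs.append(p)
    return freqs


# Hypothetical scenario: 20 isolated demes, each founded with the lineage at 5%.
rng = random.Random(0)
final = [drift_trajectory(0.05, 50, 200, rng)[-1] for _ in range(20)]
lost = sum(f == 0.0 for f in final)
high = sum(f >= 0.5 for f in final)
print(f"lost in {lost} demes, at 50% or more in {high} demes, out of {len(final)}")
```

Under neutrality the chance that such a lineage eventually fixes in a deme equals its starting frequency, so loss is the usual outcome and high frequency the exception, which is the qualitative pattern the single-dispersal scenario requires.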
The early human groups that used this route around 50 000 years ago (taking the earliest occupation of Australia as the endpoint of this dispersal) were not restricted to coastal areas, and must have successfully colonized the Asian mainland, as shown by the distribution of surviving Group IV and V lineages. However, they would have been largely replaced by more recent population events associated with the subsequent expansion of Group VIII throughout all of Asia, Group VII lineages in east Asia, and Group IX lineages in west Eurasia