Two Types of GGE Biplots for Analyzing Multi-Environment Trial Data

Weikai Yan,* Paul L. Cornelius, Jose Crossa, and L. A. Hunt

ABSTRACT

A genotype main effect plus genotype × environment interaction (GGE) biplot graphically displays the genotypic main effect (G) and the genotype × environment interaction (GE) of the multienvironment trial (MET) data and facilitates visual evaluation of both the genotypes and the environments. This paper compares the merits of two types of GGE biplots in MET data analysis. The first type is constructed by the least squares solution of the sites regression model (SREG2), with the first two principal components as the primary and secondary effects, respectively. The second type is constructed by Mandel's solution for sites regression as the primary effect and the first principal component extracted from the regression residual as the secondary effect (SREGM+1). Results indicate that both the SREG2 biplot and the SREGM+1 biplot are equally effective in displaying the "which-won-where" pattern of the MET data, although the SREG2 biplot explains slightly more GGE variation. The SREGM+1 biplot is more desirable, however, in that it always explicitly indicates the average yield and stability of the genotypes and the discriminating ability and representativeness of the test environments.

W. Yan and L.A. Hunt, Dep. of Plant Agriculture, Univ. of Guelph, Guelph, Ontario, Canada N1G 2W1; P.L. Cornelius, Dep. of Agronomy and Dep. of Statistics, Univ. of Kentucky, Lexington, KY 40546-0091; Jose Crossa, Biometrics and Statistics Unit, International Maize and Wheat Improvement Center (CIMMYT), Lisboa 27, Apdo. Postal 6-641, 06600 Mexico D.F., Mexico. Received 14 Feb. 2000. *Corresponding author (wyan@uoguelph.ca).

Published in Crop Sci. 41:656–663 (2001).

Abbreviations: G, genotypic main effect; GE, genotype × environment interaction; GGE, genotype main effects plus genotype × environment interaction; E, environment main effect; SREGM+1, Mandel's sites regression model with one additional multiplicative term; PC, principal component; SREG2, sites regression model with two multiplicative terms; SVD, singular value decomposition.

Multienvironment trials are conducted for all major crops throughout the world. The main purpose of MET is to identify superior cultivars for recommendation to farmers and to identify sites that best represent the target environment. Usually, a large number of genotypes are tested over a number of sites and years, and it is often difficult to determine the pattern of genotypic responses across environments without the help of graphical display of the data.

Yan et al. (2000) developed a "GGE biplot" methodology for graphical analysis of MET data. "GGE" refers to the genotype main effect (G) plus the genotype × environment interaction (GE), which are the two sources of variation that are relevant to cultivar evaluation. A biplot (Gabriel, 1971) is a plot that simultaneously displays both the genotypes and the environments (or in more general terms, both the row and the column factors). The GGE biplot is a biplot that displays the GGE of MET data. It is constructed by plotting the first two principal components (PC1 and PC2, also referred to as primary and secondary effects, respectively) derived from singular value decomposition (SVD) of the environment-centered data. Models that decompose the environment-centered data are commonly referred to as sites regression models or SREG, and SREG with two PCs is referred to as SREG2. SREG can be used on scaled or non-scaled data. When replicated data are available, SREG on scaled data (Crossa and Cornelius, 1997) is more desirable because it deals with any heterogeneity of within-site error variance.

One unique merit of a GGE biplot is that it can graphically show the which-won-where patterns of the data, as first described in Yan et al. (2000). Briefly, markers of the cultivars furthest from the plot origin (0,0) are connected with straight lines to form a polygon such that markers of all other cultivars are contained in the polygon. To each side of the polygon, a perpendicular line, starting from the origin of the biplot, is drawn and extended beyond the polygon so that the biplot is divided into several sectors and the markers of the test sites are separated into different sectors. The cultivar at the vertex for each sector is the best performer at sites included in that sector, provided that the GGE is sufficiently approximated by PC1 and PC2. Thus, groups of sites that share the same best performers are graphically identified.

If the which-won-where patterns identified by a biplot are repeatable over years, different mega-environments (subregions) can be defined. By selecting superior cultivars for each mega-environment, both G and GE can be effectively exploited. The GGE biplot is still useful even in cases where the which-won-where patterns are not repeatable over years, which suggests that the tested environments belong to a single mega-environment. It can be used to identify superior cultivars and test environments that facilitate identification of such cultivars, provided that the target mega-environment is sufficiently sampled and that the genotype PC1 scores have near-perfect correlation (say, r > 0.95) with the genotype main effects. Ideal cultivars should have large PC1 scores (higher average yield) and near-zero PC2 scores (more stable). Similarly, ideal test environments should have large PC1 scores (more discriminating of the cultivars) and near-zero PC2 scores (more representative of an average environment). (Note that a "test environment" refers to a year-site combination; it does not necessarily correspond to a "test site".) Thus, the GGE biplot allows many important questions to be addressed effectively and graphically.

However, the requirement for a near-perfect correlation between genotype PC1 scores and genotype main effects is not always met, which restricts the utility of the SREG2-based GGE biplot. Analysis of the yearly MET data of the Ontario winter wheat performance trials during 1989–1999, and of winter wheat performance trials from several states of the USA (Yan, unpublished), indicates that the genotype PC1 scores are usually highly correlated with the genotype main effect. Poor correlations between genotype PC1 scores and genotype main effects, however, do occur for some years. Moreover, when multiple years of data are analyzed together, this becomes the norm rather than an exception because of large and complex GE interaction (discussed later). In such cases, the genotype PC1 scores cannot be interpreted as representing the same information as the genotype main effects. Consequently, the yielding ability and stability of the genotypes, and the discriminating ability and the representativeness of the test environments, cannot be readily visualized.

To avoid these possible exceptions, in this paper we report an alternative GGE biplot, which is constructed by Mandel's sites regression on genotype main effects as the primary effect and the first principal component derived from subjecting the regression residual to SVD as the secondary effect. Such a GGE biplot is referred to as a SREGM+1 biplot, with the subscript "M" referring to Mandel's solution. In a SREGM+1 biplot, the primary effects are the genotype main effects per se; it is, therefore, free from the problem discussed above for the SREG2 biplot. However, it is not clear if a SREGM+1 biplot is as effective as the SREG2 biplot in explaining the GGE and in displaying the which-won-where patterns of the data. This study was initiated to answer these questions by comparing the SREG2 biplot and the SREGM+1 biplot applied to several datasets that showed different relations between genotype PC1 scores of SREG2 and the genotype main effects.

MATERIALS AND METHODS

The SREG2 Biplot

The SREG2-based GGE biplot is derived from Eq. [1]:

$$Y_{ij} - \beta_j = \sum_{n=1}^{2} \lambda_n \xi_{in} \eta_{jn} + \varepsilon_{ij} = \sum_{n=1}^{2} \xi^*_{in} \eta^*_{jn} + \varepsilon_{ij} \qquad [1]$$

where $Y_{ij}$ is the average yield of Genotype i in Environment j, $\beta_j$ is the average yield of all genotypes in Environment j, $\lambda_n$ is the singular value for principal component PCn, $\xi_{in}$ and $\eta_{jn}$ are scores for Genotype i and Environment j on PCn, respectively, and $\varepsilon_{ij}$ is the residual associated with Genotype i in Environment j. The values of $\lambda_n$, $\xi_{in}$, and $\eta_{jn}$ are simultaneously obtained by subjecting the environment-centered yield (i.e., $Y_{ij} - \beta_j$) to SVD. This can be achieved by principal component analysis of the environment-centered yield using the SAS procedure PRINCOMP. PRINCOMP generates $\xi_{in}$ as the genotype scores and ($\lambda_n \eta_{jn}$) as the environment scores. Alternatively, $\lambda_n$, $\xi_{in}$, and $\eta_{jn}$ can be obtained by the SVD function within the SAS procedure IML, which is a basic function in many SAS procedures related to principal component analysis. A SAS program for principal component analysis of MET data is available from the senior author of this paper.

To display results of fitting Eq. [1] in a biplot, the singular value $\lambda_n$ has to be absorbed by the singular vector for cultivars $\xi_{in}$ and that for environments $\eta_{jn}$. That is, $\xi^*_{in} = \lambda_n^{A_n} \xi_{in}$ and $\eta^*_{jn} = \lambda_n^{1-A_n} \eta_{jn}$. $A_n$ is chosen such that the range of the environment markers is equal to the range of the cultivar markers:

$$\max(\xi^*_{in}) - \min(\xi^*_{in}) = \max(\eta^*_{jn}) - \min(\eta^*_{jn}),$$

i.e.,

$$\lambda_n^{A_n} \left( \max(\xi_{in}) - \min(\xi_{in}) \right) = \lambda_n^{1-A_n} \left( \max(\eta_{jn}) - \min(\eta_{jn}) \right).$$

Thus,

$$A_n = 0.5 \left\{ 1 + \frac{1}{\ln \lambda_n} \ln \left[ \frac{\max(\eta_{jn}) - \min(\eta_{jn})}{\max(\xi_{in}) - \min(\xi_{in})} \right] \right\}. \qquad [2]$$

The SREGM+1 Biplot

Mandel (1961) presented the following model for analysis of non-additivity of two-way data:

$$Y_{ij} = \beta_j + b_j \alpha_i + \varepsilon_{ij} \qquad [3]$$

where $Y_{ij}$ and $\beta_j$ are the same as in Eq. [1], $\alpha_i$ is the main effect of Genotype i, and $b_j$ is the regression coefficient of the environment-centered yields (i.e., $Y_{ij} - \beta_j$) within Environment j on the genotype main effects ($\alpha_i$). Equation [3] is similar to the well-known model of Finlay and Wilkinson (1963), but the roles of cultivars and sites are exchanged.

If the first principal component ($\lambda_1 \xi_{i1} \eta_{j1}$) from SVD of the residual from Eq. [3], i.e., ($Y_{ij} - \beta_j - b_j \alpha_i$), is added, then

$$Y_{ij} = \beta_j + b_j \alpha_i + \lambda_1 \xi_{i1} \eta_{j1} + \varepsilon_{ij}$$

or

$$Y_{ij} - \beta_j = b_j \alpha_i + \lambda_1 \xi_{i1} \eta_{j1} + \varepsilon_{ij} \qquad [4]$$

where all terms are the same as defined in Eq. [1] or [3]. To construct a SREGM+1 biplot, Eq. [4] is written as

$$Y_{ij} - \beta_j = b^*_j \alpha^*_i + \xi^*_{i1} \eta^*_{j1} + \varepsilon_{ij} \qquad [5]$$

with $\xi^*_{i1} = \lambda_1^{A_1} \xi_{i1}$, $\eta^*_{j1} = \lambda_1^{1-A_1} \eta_{j1}$, $b^*_j = B b_j$, and $\alpha^*_i = B^{-1} \alpha_i$, where $A_1$ is defined by Eq. [2], and

$$B = \sqrt{\frac{\max(\alpha_i) - \min(\alpha_i)}{\max(b_j) - \min(b_j)}}. \qquad [6]$$

$A_1$ and $B$ are chosen such that the plot space used by genotypes is the same as that used by environments. Analogous to PC1 and PC2 in the SREG2 model, $b^*_j \alpha^*_i$ and $\xi^*_{i1} \eta^*_{j1}$ are referred to as the primary and secondary effects, respectively. All analyses were conducted using SAS (SAS Institute, 1996).

The Data

The data used in this study were from the 1989 to 1999 Ontario winter wheat performance trials (Yan, 1999). Each year, 10 to 33 winter wheat (Triticum aestivum L.) cultivars are tested with four to six replicates in seven to 14 sites representing the Ontario winter wheat growing areas. Previous analysis indicated that the yearly variance components due to environment (E) dominated the total yield variation, ranging from 55 to 91% and averaging 80% of the total variance. The variance component due to G ranged from 1.8 to 28.5%, whereas that due to GE ranged from 7.3 to 15.1% (Yan, 1999). G ranged from 13 to 65% of the total GGE. Analysis with the SREG2 biplot revealed that in all years except 1995 the environmental PC1 scores were of the same sign; and in all years except 1995 and 1996 the genotype PC1 scores showed high correlation with the mean yield of the genotypes (r > 0.93). Thus, in this study the 1995, 1996, and 1998 datasets, representing different types of relations between genotype PC1 scores and genotype main effects, were chosen to compare the GGE biplot based on SREGM+1 with one based on SREG2. In addition, a complete subset of 11 cultivars by 34 environments (year-site combinations) extracted from the 1996 to 1999 trials was also used in the comparison.
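The SREGM+1 construction (Mandel's regression on genotype main effects, followed by one principal component of the residual, then the scaling by $A_1$ and $B$) can be sketched in the same way. This is a hedged illustration in Python with NumPy under two assumptions that are not stated in the paper: the table is complete (balanced), and the genotype main effect is estimated as the row mean of the environment-centered table. The function name `sregm1_scores` is hypothetical.

```python
import numpy as np

def sregm1_scores(yield_table):
    """Primary and secondary biplot scores for the SREGM+1 model.

    Hypothetical helper for illustration only; assumes a complete table.
    """
    Y = np.asarray(yield_table, dtype=float)
    centered = Y - Y.mean(axis=0)                  # Y_ij - beta_j
    alpha = centered.mean(axis=1)                  # genotype main effects alpha_i
    # Least-squares slope of each environment's centered yields on alpha.
    b = centered.T @ alpha / (alpha @ alpha)       # b_j
    resid = centered - np.outer(alpha, b)          # residual of Mandel's model

    U, s, Vt = np.linalg.svd(resid, full_matrices=False)
    xi, eta, lam = U[:, 0], Vt[0, :], s[0]
    # Same range-equalizing exponent as for the SREG2 biplot, applied to PC1
    # of the residual; undefined if lam == 1 or a score range is 0.
    A1 = 0.5 * (1.0 + np.log(np.ptp(eta) / np.ptp(xi)) / np.log(lam))
    B = np.sqrt(np.ptp(alpha) / np.ptp(b))         # scaling of the primary effect

    geno = np.column_stack([alpha / B, lam ** A1 * xi])
    env = np.column_stack([B * b, lam ** (1.0 - A1) * eta])

    gge_ss = np.sum(centered ** 2)
    explained = np.array([np.sum(np.outer(alpha, b) ** 2), lam ** 2]) / gge_ss
    return geno, env, explained
```

Because the genotype primary scores are the main effects divided by a constant, their correlation with the main effects is exactly 1 by construction, which is the property that motivates this biplot.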
CROP SCIENCE, VOL. 41, MAY–JUNE 2001

Table 1. Proportions of GGE SS explained by SREG2 and SREGM+1 for 12 datasets from the 1989–1999 Ontario winter wheat performance trials.

                                             % of GGE explained
                                             SREG2                  SREGM+1
Year      No. of     No. of  Degrees of     PC1    PC2    Total    Primary  Secondary  Total
          cultivars  sites   freedom
1989      10          9      32             42.5   21.3   63.8     40.7     21.9       62.6
1990      10          7      28             59.7   21.2   80.9     53.5     25.1       78.6
1991      10          9      32             53.3   20.7   74.0     49.1     22.1       71.2
1992      10         10      34             57.0   19.9   76.9     56.4     20.1       76.5
1993      18          9      48             56.8   20.0   76.8     55.4     21.2       76.6
1994      14         11      44             45.6   16.2   61.8     41.6     16.8       58.4
1995      14         14      50             54.2   13.4   67.6     40.8     25.2       66.0
1996      23          9      56             29.6   24.5   54.1     26.7     25.3       52.0
1997      28          8      66             55.0   15.9   70.9     54.0     15.9       69.9
1998      33          8      76             71.5   14.7   86.2     71.0     15.2       86.2
1999      31          9      74             51.5   17.4   68.9     50.7     17.7       68.4
1996–99   11         34      84             24.5   22.7   47.2     23.0     23.9       46.9
Average   –          –       –              50.1   19.0   69.1     46.9     20.9       67.8

RESULTS

For all datasets, both SREG2 and SREGM+1 use the same number of degrees of freedom [(g + e − 2) + (g + e − 4), or 2(g + e) − 6, where g is the number of genotypes and e the number of environments] (Table 1). With the same number of degrees of freedom, SREG2 is theoretically the most effective model for explaining the variation due to GGE, because the first two principal components are computed to explain the maximum amount of variation. Nevertheless, SREGM+1 explained only slightly smaller amounts of GGE. When averaged over 12 datasets, SREG2 explained 69.1%, whereas SREGM+1 explained 67.8% of the total GGE (Table 1). Thus, SREGM+1 is nearly as effective as SREG2 in explaining the variation of GGE. The discussion will therefore focus on whether the SREGM+1 biplot displays a which-won-where pattern similar to that of the SREG2 biplot.

1998 Data

The PC1 scores of the SREG2 model had near-perfect correlation (r = 0.99) with the genotypic main effects for this dataset. Consequently, the SREG2 biplot and the SREGM+1 biplot look almost exactly alike. They were, therefore, equally effective in displaying the GGE information (Fig. 1A and 1B).

The GGE biplot is constructed by plotting the primary effect scores of each genotype and each site (as x-axis) against their respective secondary effect scores (as y-axis) such that each genotype and each test site is represented by a "marker." For visualizing the which-won-where pattern, the genotype markers located away from the plot origin were first visually identified and connected with straight lines to form a polygon, within which the markers of all other genotypes are contained. These away-from-origin genotypes, namely 6, 9, 29, 33, 27, 28, 20, and 2 in Fig. 1A, are called "corner" or "vertex" genotypes because they are at the corners of the polygon. Next, starting from the origin, lines perpendicular to the sides of the polygon are drawn to, and extended beyond, each side of the polygon, dividing the plot into several sectors; each site will fall into one of the sectors (note that only perpendiculars relevant to the discussion were drawn). Assuming that the biplot sufficiently approximates the variation of GGE, it can be mathematically proven that all sites in the same sector share the same winning genotype, which is the vertex genotype for that sector (Yan et al., 2000).

In Fig. 1A, the sites fell into three sectors: the winning genotype for sites RN, WE, ID, and NN was Genotype 6; the winning genotype for sites WK, HN, and EA was Genotype 9; and the winning genotype for site OA was Genotype 29. Note that Genotype 9 was the best performer for WK, HN, and EA because markers of these sites were on Genotype 9's side of the perpendicular to the line that connects Genotype 9's marker and that of Genotype 6. Vertex genotypes without any site in their sectors were not the highest yielding genotypes at any site; moreover, they were the poorest genotypes at all or some sites. Genotypes within the polygon, particularly those located near the plot origin, were less responsive than the vertex genotypes. It can be appreciated that the supplementary lines on the biplot are critical for visual analysis of the MET data.

In addition, a near-perfect correlation between genotype primary effect scores and the genotype main effects allows both biplots, Fig. 1A as well as Fig. 1B, to be used to evaluate cultivars for their yielding ability and stability and to evaluate environments for their discriminating ability and representativeness. Genotypes 6 and 9 gave the highest average yields (largest primary scores) and were relatively stable over the sites (small absolute secondary scores). In contrast, three non-adapted Genotypes 27, 28, and 31 yielded poorly at all sites, as indicated by their small primary scores (low yielding) and relatively small secondary scores (relatively stable). The average yields of Cultivars 1 and 20 were below average (primary scores < 0) and highly unstable (large absolute secondary values). The biplots show not only the average yield of a genotype (the primary effect), but also how it was achieved. That is, the biplots also show the yield of a genotype at individual sites. For example,
Cultivar 6 had the highest average yield because it yielded the highest at sites RN, WE, ID, and NN, and yielded above average at all other sites. On the other hand, the average yield of Cultivar 20 was below average, because it yielded below average at sites OA, EA, HN, WK, and NN, even though it was quite good at RN. A below-average yield is indicated if the virtual line from the origin to the marker of a genotype has an obtuse angle with the virtual line from the origin to the marker of a test site. Likewise, an above-average yield is indicated by an acute angle. Supplementary lines, not presented in the biplots, are required to explicitly determine these relationships.

Fig. 1. SREG2 and SREGM+1 biplots for the 1998 Ontario winter wheat performance trial data. The numbers are different cultivars; the sites are EA = Elora, HN = Harriston, ID = Inwood, NN = Nairn, OA = Ottawa, RN = Ridgetown, WE = Woodslee, WK = Woodstock.

With respect to the test sites, RN was most discriminating as indicated by the longest distance between its marker and the origin. However, due to its large secondary score, cultivar differences observed at RN may not exactly reflect the cultivar differences in average yield over all sites. Site NN was not the most discriminating, but cultivar differences at NN should be highly consistent with those averaged over sites because it had a near-zero secondary effect score. At a site with a near-zero secondary effect score, the genotypes are essentially ranked according to their primary effect scores (i.e., genotype main effects, since they were perfectly correlated in this dataset) and the differences among genotypes are in proportion to the primary effect scores of the sites. Thus, a genotype that yielded well at such a site has a large average yield. On the contrary, site OA was neither discriminating (small primary effect score) nor representative (large secondary effect score); therefore, cultivars that had high yield at OA did not necessarily give high average yield over sites. Analysis of multiple-year data indicated that OA represented a different mega-environment (eastern Ontario) from the major winter wheat growing regions in Ontario (Yan et al., 2000; Yan, 1999).

1996 Data

As with most datasets, the SREG2 biplot (Fig. 2A) for 1996 indicates that all PC1 scores of the sites were of the same sign, which was arbitrarily assigned positive so that the genotype PC1 scores correlated positively with the genotype main effect. However, as mentioned earlier, the correlation between the genotype PC1 scores and the genotype main effects for this dataset was only 0.85. The relatively poor correlation is associated with the fact that the GGE explained by PC1 is only slightly greater than that explained by PC2 (29.6 vs. 24.5%). The poor correlation prevents the genotype PC1 scores of the SREG2 solution from being interpreted as representing the genotype main effect; in fact, PC1 alone is not interpretable in known biological and agricultural terms. In such cases, the utility of a SREG2 biplot is limited to investigation of the which-won-where patterns. Based on Fig. 2A, Cultivar 1 was the best performer at sites RN, LN, ID, and WE; and Cultivar 2 was the best performer at sites EA, WK, CA, and OA, and nearly the best at HW.

The SREGM+1 biplot (Fig. 2B) explained slightly less GGE, but revealed the same which-won-where patterns as the SREG2 biplot. It indicates that Cultivar 1 won at sites RN, LN, WE, and ID, and Cultivar 2 won at sites EA, WK, CA, HW, and OA. In addition, the SREGM+1 biplot is more interpretable. By definition, the primary effects of the SREGM+1 biplot are the cultivar main effects, and its secondary effects are deviations from the main effects of the cultivars. Thus, the SREGM+1 biplot explicitly showed that Cultivars 1 and 2 were the highest yielding cultivars on average, but neither was very stable, as evidenced by their relatively large secondary effects. With respect to the sites, the SREGM+1 biplot indicated that site EA was highly discriminating, but not representative of the average environment, whereas WK and RN were both discriminating and representative.

1995 Data

The 1995 dataset was the only dataset found during the 1989 to 1999 Ontario winter wheat performance trials in which the site PC1 scores of the SREG2 differ in sign (Fig. 3A). Among the 14 test sites, four (Sites 4, 6, 7, and 10) had negative PC1 scores, though their absolute values were small. This led to a poor correlation between the cultivar PC1 scores and the cultivar
main effects (r = 0.83). The SREG2 biplot indicates that Cultivar G6 was the best for nearly all sites except Sites 4, 6, and 7, at which Cultivar G4 (and also G10) was better than G6. Cultivar G7 was as good as G6 for Sites 5 and 12. These patterns are similar in the SREGM+1 biplot (Fig. 3B). It indicates that Cultivar G6 was on average the best and Cultivar G12 the second best, and that Sites 5 and 12 were highly discriminating but neither was representative. Interestingly, all sites had positive primary effects in the SREGM+1 biplot, as compared with the site PC1 scores of different signs in the SREG2 biplot.

Fig. 2. SREG2 and SREGM+1 biplots for the 1996 Ontario winter wheat performance trial data. The numbers are different cultivars; the sites are CA = Centralia, EA = Elora, HN = Harriston, HW = Harrow, ID = Inwood, OA = Ottawa, RN = Ridgetown, WE = Woodslee, WK = Woodstock.

Fig. 3. SREG2 and SREGM+1 biplots for the 1995 Ontario winter wheat performance trial data. Each site is represented by a number, and each cultivar is represented by a number preceded by the letter G.

1996–1999 Data

Although the environmental PC1 scores in the SREG2 model tend to be of the same sign for yearly MET, they often take different signs when multi-year data are jointly analyzed. For this dataset, among all 34 year-site combinations, 9 had negative PC1 scores and the rest had positive PC1 scores (Fig. 4A). Like the 1996 data, the GGE explained by PC1 was only slightly greater than that explained by PC2 (24.5 vs. 22.7%). As a result, the correlation between cultivar PC1 scores and cultivar main effects was only 0.58. This low correlation prevents
visual identification of cultivars with high average yield based on the SREG2 biplot. Nevertheless, as with all previous datasets, both biplots displayed very similar which-won-where patterns (Fig. 4A and 4B). The SREG2 biplot predicted that cultivar “2533” was the best performer in about half of the 34 environments while cultivar “Men” was the best in the other half. Therefore, it can be inferred that cultivars “2533” and “Men” must be the two best performers on average. This, however, is explicitly indicated only in the SREGM+1 biplot. As for the 1995 dataset, while the primary effects of the environments were of different signs in the SREG2 biplot, they were all positive in the SREGM+1 biplot.

Fig. 4. SREG2 and SREGM+1 biplots for the 1996–1999 Ontario winter wheat performance trial data. Sites in different years are represented by different symbols. The full cultivar names are: 2533 = Pioneer 25W33, Ari = OAC Ariss, Fre = Freedom, Fun = Fundulea, Han = Hanover, Har = Harus, Kar = Karena, Mar = Marilee, Men = Mendon, Mor = AC Morley, Ron = AC Ron.

DISCUSSION

Merits of the Two Types of GGE Biplots

This study indicates that both the SREG2 biplot and the SREGM+1 biplot explained similar amounts of variation due to GGE, although the former tends to explain slightly more in most cases. Both biplots displayed the same which-won-where pattern and indicated the same winning cultivars in individual environments. Therefore, the two biplots can be considered as equally effective in these regards.

The SREGM+1 biplot was designed to be more interpretable than the SREG2 biplot. First, since the genotypic scores for the primary effect of SREGM+1 are designated to indicate the average yield (general adaptation) of the cultivars, the genotypic scores of the secondary effect must indicate GE interaction associated with the cultivars, which is an indicator of selective or specific adaptation. Thus, the SREGM+1 biplot simultaneously displays both general adaptation and specific adaptation (stability) of the cultivars. The ideal cultivars are those with large primary effect scores but near-zero secondary scores. Second, because the genotypic primary effects indicate general adaptation of the cultivars, the environmental primary effects must indicate the ability of the environments to discriminate among the cultivars in terms of general adaptation. Environments with larger primary effects would thus facilitate identification of cultivars with better general adaptation. Third, analogous to the genotypic secondary effects, the environmental secondary effects must indicate the tendency of each environment to cause GE interaction. Environments with large (absolute) secondary effects should favor the performance of some cultivars, but disfavor others at the same time. Thus, cultivars selected under environments with large secondary effects may be highly specific to these environments but lack general adaptation or stability. Therefore, from the perspective of selection for high yielding and stable cultivars, the ideal test environments should have large primary effects, but near-zero secondary effects.

Why Correlation between Genotype Scores of PC1 in SREG2 and Genotype Main Effects Varies with Datasets

It was concluded that the SREGM+1 biplot is more desirable than the SREG2 biplot for MET data analysis because the interpretability of the latter is impacted by the uncertain relations between its primary effects and the genotype main effects. On the basis of the trials investigated in this study, Fig. 5 indicates that this correlation is strongly determined by the relative importance of G in GGE. Near-perfect correlation occurs when G is 40% or more of GGE (the 1992, 1993, 1997–1999 datasets), and poor correlation occurs when G is 20% or less of GGE (the 1995, 1996 and 1996–1999 datasets).
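The relation between the share of G in GGE and the PC1 correlation can be explored numerically. The following is a minimal sketch, not the authors' procedure: it assumes a balanced genotype-by-environment table of mean yields, obtains the SREG2 principal components from a singular value decomposition of the environment-centered data, and uses simulated yields in place of the Ontario trial data. The function names `gge_summary` and `simulate_met`, the simulation settings, and the symmetric scaling of the scores are all illustrative assumptions.

```python
import numpy as np


def gge_summary(y):
    """Summarize a balanced genotype x environment table of mean yields.

    Returns (G as a percentage of GGE, |r| between PC1 genotype scores and
    genotype main effects, percentage of GGE explained by PC1 and PC2).
    """
    y = np.asarray(y, dtype=float)
    # Environment-centered data: subtracting each environment (column) mean
    # leaves G + GE, the quantity displayed by a GGE biplot.
    gge = y - y.mean(axis=0, keepdims=True)
    # Column means are now zero, so row means are the genotype main effects.
    g_main = gge.mean(axis=1)
    ss_gge = np.sum(gge ** 2)
    ss_g = y.shape[1] * np.sum(g_main ** 2)
    g_percent = 100.0 * ss_g / ss_gge
    # Principal components via SVD of the environment-centered data.
    u, s, _ = np.linalg.svd(gge, full_matrices=False)
    # Symmetric scaling of the scores is an assumption; it does not affect r.
    pc1_scores = u[:, 0] * np.sqrt(s[0])
    # The sign of a singular vector is arbitrary, so report |r|.
    r = abs(np.corrcoef(pc1_scores, g_main)[0, 1])
    pc_percent = 100.0 * s[:2] ** 2 / np.sum(s ** 2)
    return g_percent, r, pc_percent


def simulate_met(sd_ge, n_gen=10, n_env=8, seed=0):
    """Hypothetical yields: grand mean + G + E + unstructured GE noise."""
    rng = np.random.default_rng(seed)
    g = rng.normal(0.0, 1.0, n_gen)
    e = rng.normal(0.0, 1.0, n_env)
    ge = rng.normal(0.0, sd_ge, (n_gen, n_env))
    return 5.0 + g[:, None] + e[None, :] + ge


if __name__ == "__main__":
    for sd_ge in (0.2, 3.0):
        g_pct, r, pc_pct = gge_summary(simulate_met(sd_ge))
        print(f"sd_GE={sd_ge}: G = {g_pct:.1f}% of GGE, |r| = {r:.2f}, "
              f"PC1/PC2 = {pc_pct[0]:.1f}%/{pc_pct[1]:.1f}%")
```

Running the script with a small and a large GE standard deviation gives a quick feel for how the PC1 correlation behaves as G shrinks relative to GGE; real MET data would replace `simulate_met`.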
CROP SCIENCE, VOL. 41, MAY–JUNE 2001

Fig. 5. Genotype main effect (G) as percentage of GGE and the correlation coefficient (r) between the genotype PC1 scores of the SREG2 model and the genotype main effects for 12 datasets from the 1989–1999 Ontario winter wheat performance trials.

The essence of principal component analysis is to pick up the most important pattern in the data using the smallest number of degrees of freedom. PC1 picks up the largest pattern, PC2 picks up the second largest pattern, and so on. A close correlation between PC1 scores and genotype main effects occurs only when the genotype main effect is large enough to be the most important component of GGE. A poor correlation occurs otherwise, which suggests strong and complex GE interaction in the data. Therefore, it is not surprising that the correlation between PC1 scores of SREG2 and genotype main effect is typically poor when multi-year data are analyzed in a genotype × environment (year-site) fashion, because greater and more complex GE interactions are sampled in a multi-year MET than in a single year MET. Complex GE interaction is usually accompanied by similar amounts of GGE explained by PC1 and PC2 (as for the 1996 and 1996–1999 datasets, Table 1), as opposed to much more GGE explained by PC1 than by PC2 (e.g., the 1998 dataset).

The Usefulness of the GGE Biplot Based on a Single Year MET

As a graphic approach to MET data analysis, the GGE biplot can be useful in two major aspects. The first is to display the which-won-where pattern of the data, which may lead to identification of different mega-environments. The second is to identify high-yielding and stable cultivars and discriminating and representative test environments. However, both promises are based on the assumption that the data are sufficiently representative of the target environment; a conclusion can never go beyond what the data allow. While multi-year MET data are required for any decisive cultivar and site evaluation, they are normally unbalanced, and therefore the biplot technique cannot be readily applied; single year data are usually balanced but they may not be representative of future years. Thus, a question arises whether biplot analysis of single year MET data is really useful if the which-won-where pattern is not repeatable over years.

Single year data may indeed have limited value because of the year-to-year variation. Nevertheless, we believe biplot analysis of single year MET data is worthwhile for the following reasons. First, the GGE biplot is a graphic display of the G and GE of the data, which are relevant to cultivar evaluation and mega-environment identification. Therefore, if the researcher believes that a single year MET is worthy of analysis, and we believe most researchers do, the GGE biplot technique should be the first choice. Although the biplot does not add new information to the data, it does help the researcher quickly view the patterns that are in the data. The biplot gives the researcher the power to “see” what was going on in a particular year. Some may question the usefulness of the single year patterns if they are not repeatable over years. But without knowing the patterns from individual years, how could one know if they are repeatable or not? Second, the biplot can be used to identify research problems. For example, if two cultivars were found to perform the best in two different groups of locations in a particular year, one might want to know what the underlying reasons were, and answers to this question may lead to valuable findings. By relating biplot scores to explanatory variables collected in the trials, Yan and Hunt (2001) were able to reveal that in Ontario, Canada, tall and late winter wheat cultivars tended to be favored in seasons with cold winters and cool summers, whereas early and short cultivars tended to be favored in seasons with warm winters and hot summers. Third, the biplot patterns based on a single year MET can serve as hypotheses, which can be tested using extended data and more critical statistics. For example, biplots based on yearly data from the Ontario winter wheat performance trials led to the hypothesis that two eastern Ontario sites (Ottawa and Kemptville) constituted a mega-environment different from the rest of the Ontario winter wheat growing region, which was subsequently tested and supported by variance component analysis based on pooled data from 11 yr of performance trials (Yan, 1999). Thus, although conclusions from a single year MET may not be decisive, they are valuable suggestions. Fourth, even if the which-won-where pattern is proven to be unrepeatable over years, the researcher would still want to know the average yield and the stability of the cultivars based on each year’s MET. These two aspects of cultivar performance are graphically depicted by the abscissa and ordinate of the biplot, respectively. Finally, although a biplot from a single year may not be very informative, biplots constructed from several years can be highly valuable. Moreover, the biplot technique is not limited to single year MET data analysis. It can also be applied to balanced subsets extracted from multiple years of trials. In Ontario, for example, over 20 winter wheat cultivars are common to three to four years of performance trials
and a balanced subset from such a database should contain valuable information. Furthermore, the biplot technique is not even limited to genotype × environment data analysis. It can also be used in displaying and analyzing other types of two-way data such as genotype × trait data and diallel cross data (Yan, unpublished research). In conclusion, the GGE biplot is a useful tool for, but not limited to, MET data analysis.

REFERENCES

Crossa, J., and P.L. Cornelius. 1997. Sites regression and shifted multiplicative model clustering of cultivar trial sites under heterogeneity of error variances. Crop Sci. 37:405–415.
Finlay, K.W., and G.N. Wilkinson. 1963. The analysis of adaptation in a plant breeding program. Aust. J. Agric. Res. 14:742–754.
Gabriel, K.R. 1971. The biplot graphic display of matrices with application to principal component analysis. Biometrika 58:453–467.
Mandel, J. 1961. Non-additivity in two-way analysis of variance. J. Am. Stat. Assoc. 65:878–888.
SAS Institute. 1996. SAS/STAT user’s guide, second edition. SAS Institute Inc., Cary, NC.
Yan, W. 1999. A study on the methodology of yield trial data analysis: with special reference to winter wheat in Ontario. Ph.D. diss., University of Guelph, Guelph, Ontario, Canada.
Yan, W., L.A. Hunt, Q. Sheng, and Z. Szlavnics. 2000. Cultivar evaluation and mega-environment investigation based on the GGE biplot. Crop Sci. 40:597–605.
Yan, W., and L.A. Hunt. 2001. Genetic and environmental causes of genotype × environment interaction for winter wheat yield in Ontario. Crop Sci. 41:19–25.

Interpretation of Genotype × Environment Interactions for Early Maize Hybrids over 12 Years

C. Epinat-Le Signor, S. Dousse, J. Lorgeou, J.-B. Denis, R. Bonhomme, P. Carolo, and A. Charcosset*

ABSTRACT

Genotype × environment interaction was investigated for grain yield of early maize (Zea mays L.) hybrids. Data were obtained from the French Association Générale des Producteurs de Maïs trial network and included 132 hybrids and 229 environments over 12 yr, following an unbalanced design. Analysis of genotype × environment interaction was done for the 1-yr data sets, for the two successive years data sets, and for the 12-yr data set. The magnitude of genotype × environment interaction variance was equal to, or greater than the genotypic variance. Interaction effect was modeled by factorial regression analysis using additional genotypic and environmental information. Genotypic covariates considered were the sum of growing day degrees (GDD) necessary from sowing to flowering and the GDD necessary from flowering to maturity. Environmental covariates were the mean temperature from sowing to the 12 leaf stage, the mean temperature from the 12 leaf stage to the end of the linear grain-filling stage, the water balance around flowering, and the sum of solar radiation around flowering. These six covariates explained about 40% of the interaction effect in all analyses, with equal contribution of genotypic variates (20%) and environmental variates (20%). Flowering earliness of hybrids, water balance around flowering, and mean temperature from the 12 leaf stage to the end of the grain filling phase were determinants of genotype × environment interaction for grain yield in the considered area. A biological interpretation of the interaction was attempted through examination of the regression parameters.

C. Epinat-Le Signor, S. Dousse, A. Charcosset, INRA-INAPG-UPS, Station de Génétique Végétale, Ferme du Moulon, F91190 Gif sur Yvette; J. Lorgeou, Association Générale des Producteurs de Maïs, Station expérimentale, F91720 Boigneville; J.-B. Denis, INRA, Station de Biométrie, Route de Saint-Cyr, F78026 Versailles; R. Bonhomme, INRA, Unité de recherche en Bioclimatologie, F78850 Thiverval-Grignon; P. Carolo, Rustica-Prograin Génétique, 117 avenue de Vendôme, F41000 BLOIS. Received 18 Aug. 1999. *Corresponding author (charcos@moulon.inra.fr).

Published in Crop Sci. 41:663–669 (2001).

Abbreviations: AGPM, Association Générale des Producteurs de Maïs, France; AMMI, Additive Main effects and Multiplicative Interaction analysis; GDD, sum of Growing Day Degrees; GDDs_f, GDD from sowing to flowering; GDDf_m, GDD from flowering to maturity; GE, genotype × environment; Mg ha⁻¹, ton per hectare; RSD, Residual Standard Deviation; SRf, sum of radiation around flowering (from 06–20 to 08–20); SS, Sum of Squares; TMs_12l, mean temperature from sowing to 500 GDD (12 leaves stage); TM12l_e, mean temperature from 500 GDD to 1425 GDD (end of linear grain-filling phase); WATf, water balance around flowering (rainfall + irrigation − evapotranspiration from 06–20 to 08–20).

Newly registered cultivars generally need to be tested at many locations and for several years before being recommended for a given zone. To achieve this goal, multi-environment trials form the core of varietal testing programs in many countries. These programs have to face the recurring problem of genotype × environment (GE) interactions. Indeed, differential genotypic responses to variable environmental conditions, especially when associated with changes in genotypic ranking, limit the identification of superior, stable hybrids. The GE interactions are as much a function of the environmental variables as a function of the genotypic, morphological, phenological, and physiological traits of the varieties (Nachit et al., 1992). Identification of causal factors of the GE effect and quantification of unexplained variation are of prime importance for selecting for stability or to recommend environmentally specific varieties. During recent decades, new developments have been achieved in crop physiology, agronomy, and statistics and some integrated approaches appeared for GE interactions evaluation (Brancourt-Hulmel, 1999).

Many fixed or mixed models have been used for detecting and characterizing GE interaction (van Eeujwick, 1995a,b; Yan and Hunt, 1998; Vargas et al., 1999). Until now, there have been few attempts to analyze this interaction for the newly registered varieties of maize over an important series of years. Only van Eeujwick et al. (1995b) reported results concerning maize multi-environment trials over a series of 11 yr but they studied forage percent dry-matter content and not yield. Little is known about the most relevant environmental