METHODS

Literature search. We searched the literature for studies reporting organic-to-conventional yield comparisons. First we used the references included in the previous study⁶ and then extended the search by using online search engines (Google Scholar, ISI Web of Knowledge) as well as reference lists of published articles. We applied several selection criteria to address the criticisms of the previous study⁶ and to ensure that minimum scientific standards were met. Studies were only included if (1) they reported yield data on individual crop species in an organic treatment and a conventional treatment, (2) the organic treatment was truly organic (that is, either certified organic or following organic standards), (3) they reported primary data, (4) the scale of the organic and conventional yield observations was comparable, (5) the data were not already included from another paper (that is, to avoid multiple counting), and (6) they reported the mean (X̄), an error term (standard deviation (s.d.), standard error (s.e.) or confidence interval) and sample size (n) as numerical or graphical data, or if X̄ and s.d. of yields over time could be calculated from the reported data. For organic and conventional treatments to be considered comparable, the temporal and spatial scale of the reported yields needed to be the same, that is, national averages of conventional agriculture compared to national averages of organic agriculture, or yields on an organic farm compared to yields on a neighbouring conventional farm. Not included were, for example, single farm yields compared to national or regional averages, or before–after comparisons. Previous studies²⁷ have illustrated the danger of comparing yield data drawn from single plots and field trials to larger state and national averages.

The use of selection criteria is a critical step in conducting a meta-analysis.
On the one hand, scientific quality and comparability of observations need to be ensured. On the other hand, a meta-analysis should provide as complete a summary of the current research as possible. There is an ongoing debate about whether meta-analyses should adopt very specific selection criteria to prevent mixing incomparable data sets and to minimize variation in the data set²⁸ or whether, instead, meta-analyses should include as wide a range of studies as possible to allow for an analysis of sources of variation²⁹. We followed the generally recommended approach, trying to minimize the use of selection criteria based on judgments of study quality³⁰. Instead, we examined the influence of quality criteria empirically by evaluating the differences between observations with different quality standards. We therefore did not exclude a priori yield observations from non-peer-reviewed sources or from studies that lacked an appropriate experimental design. The quality of the study and the comparability of the organic and conventional systems were assessed by evaluating the experimental design of the study as well as the form of publication. Studies that were published in peer-reviewed journals and that controlled for the possible influence of variability in space and time on experimental outcomes through an appropriate experimental design were considered to follow high quality standards.

Categorical variables. In addition to study quality criteria, information on several other study characteristics like crop species, location and timescale, and on different management practices, was collected (see Supplementary Tables 1–3). We also wanted to test the effect of study site characteristics on yield ratios and we thus collected information on biophysical characteristics of the study site.
As most studies did not report climate or soil variables, we derived information on several agroecological variables that capture cropland suitability³¹, including the moisture index α (the ratio of actual to potential evapotranspiration) as an indicator of moisture availability to crops, growing degree days (GDD, the annual sum of daily mean temperatures over a base temperature of 5 °C) as an indicator of growing season length, as well as soil carbon density (C_soil, as a measure of soil organic content) and soil pH as indicators of soil quality, from the latitude × longitude values of the study site and global spatial models/data sets at 5 min resolution³²,³³.

We derived the thresholds for the classification of these climate and soil variables from the probability of cultivation functions previously described³¹. This probability of cultivation function is a curve fitted to the empirical relationship between cropland area and α, GDD or C_soil. It describes the probability that a location with a certain climate or soil characteristic is covered by cropland. Suitable locations with favourable climate and soil characteristics have a higher probability of being cultivated. Favourable climate and soil characteristics can thus be inferred from the probability of cultivation. For α, GDD and C_soil a probability of cultivation under 30% was classified as ‘low’ suitability, between 30% and 70% as ‘medium’ suitability, and above 70% as ‘high’ suitability (Supplementary Table 3). Sites with low and medium suitability for the moisture index are interpreted as having insufficient water availability, sites with low and medium GDD have short growing seasons, and sites with low and medium soil carbon densities are unfertile either because C_soil is too small, with low organic matter content (and thus insufficient nutrients), or because C_soil is too high, as in wetland soils where organic matter accumulates because they are submerged under water.
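The three suitability classes described above can be sketched in a few lines of Python. This is only an illustration: the function name `suitability_class`, the handling of values exactly at 30% and 70% (counted here as ‘medium’), and the example site probabilities are assumptions, not details taken from the study.

```python
def suitability_class(p_cultivation: float) -> str:
    """Map a probability of cultivation (0 to 1) to a suitability class.

    Thresholds follow the text: under 30% is 'low', 30% to 70% is
    'medium', above 70% is 'high'.
    """
    if not 0.0 <= p_cultivation <= 1.0:
        raise ValueError("probability of cultivation must lie in [0, 1]")
    if p_cultivation < 0.30:
        return "low"
    if p_cultivation <= 0.70:  # assumption: both boundary values count as medium
        return "medium"
    return "high"


# Hypothetical site: illustrative probabilities, not values from the study.
site = {"alpha": 0.82, "GDD": 0.55, "C_soil": 0.12}
classes = {name: suitability_class(p) for name, p in site.items()}
print(classes)
```

The probabilities themselves would come from the fitted probability of cultivation curves, which are not reproduced here.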
For soil pH, instead, we defined thresholds based on expert judgment. Soil pH information was often given in the studies and we only derived soil pH values from the global data set if no soil pH value was indicated in the paper.

To assess whether the conventional yield values reported by studies and included in the meta-analysis were representative of regional average crop yields, we compared them to FAOSTAT yield data and a high-resolution spatial yield data set³⁴,³⁵. We used the FAO data³⁵, which report national yearly crop yields from 1961 to 2009, for temporal detail and a yield data set³⁴, which reports subnational crop yields for 175 crops for the year 2000 at a 5-min latitude by 5-min longitude resolution, for spatial detail. We calculated country average crop yields from FAO data for the respective study period and calculated the ratio of this average study-period yield to the year-2000 FAO national yield value. We derived the year-2000 yield value from the spatial data set through the latitude by longitude value of the study site and scaled this value by the study-period-to-year-2000 ratio from FAOSTAT. If the meta-analysis conventional yield value was more than 50% higher than the local yield average derived by this method it was classified as ‘above average’, when it was more than 50% lower as ‘below average’, and when it was within ±50% of local yield averages as ‘comparable’. We chose this large yield difference as a threshold to account for uncertainties in the FAOSTAT and global yield data set³⁴.

Meta-analysis. The natural log of the response ratio²⁵ was used as an effect size metric for the meta-analysis. The response ratio is calculated as the ratio between the organic and the conventional yield. The use of the natural logarithm linearizes the metric (treating deviations in the numerator and the denominator the same) and provides a more normal sampling distribution in small samples²⁵.
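The yield benchmark and the effect size metric described above amount to a few arithmetic steps, sketched below in Python. The function names and the example numbers are hypothetical and chosen only for illustration; the thresholds and the direction of the ratio (organic over conventional) follow the text.

```python
import math


def local_benchmark_yield(spatial_yield_2000, fao_study_period_yields, fao_yield_2000):
    """Scale the year-2000 local yield by the FAO study-period-to-year-2000 ratio."""
    study_period_mean = sum(fao_study_period_yields) / len(fao_study_period_yields)
    return spatial_yield_2000 * study_period_mean / fao_yield_2000


def classify_against_benchmark(conventional_yield, benchmark):
    """Classify a reported conventional yield relative to the local benchmark."""
    ratio = conventional_yield / benchmark
    if ratio > 1.5:   # more than 50% higher than the local average
        return "above average"
    if ratio < 0.5:   # more than 50% lower than the local average
        return "below average"
    return "comparable"


def log_response_ratio(organic_yield, conventional_yield):
    """Effect size: natural log of the organic-to-conventional yield ratio."""
    return math.log(organic_yield / conventional_yield)


# Hypothetical numbers (t/ha), for illustration only.
benchmark = local_benchmark_yield(4.0, [3.0, 3.2, 3.4], 3.2)
print(classify_against_benchmark(6.5, benchmark))
print(log_response_ratio(3.0, 4.0), log_response_ratio(4.0, 3.0))
```

The last line illustrates the symmetry the text refers to: swapping numerator and denominator only flips the sign of the log ratio, whereas the raw ratios 0.75 and 1.33 are not equidistant from 1.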
If the data set has some underlying structure and studies can be categorized into more than one group (for example, different crop species, or different fertilizer types) a categorical meta-analysis can be conducted²⁶. Observations with the same or similar management or system characteristics were grouped together. We then used a mixed effects model to partition the variance of the sample, assuming that there is random variation within a group and fixed variation between groups. We calculated a cumulative effect size as a weighted mean from all studies by weighting each individual observation by the reciprocal of the mixed-model variance, which is the sum of the study sampling variance and the pooled within-group variance. Weighted parametric meta-analysis should be used whenever possible to deal with heteroscedasticity in the sample and to increase the statistical power of the analysis³⁶. The cumulative effect size is considered to be significantly different from zero (that is, the organic treatment shows a significant effect on crop yield) if its 95% confidence interval does not overlap zero.

To test for differences in the effect sizes between groups, the total heterogeneity of the sample was partitioned into the within-group (Q_W) and between-group heterogeneity (Q_B) in a process similar to an analysis of variance³⁷. The significance of Q_B was tested by comparing it against the critical value of the χ² distribution. A significant Q_B implies that there are differences among cumulative effect sizes between groups²⁶,³⁸. Only those effects that showed a significant Q_B are presented in graphs. All statistical analyses were carried out using MetaWin 2.0²⁶. For representation in graphs, effect sizes were back-transformed to response ratios.

Each observation in a meta-analysis is required to be independent. Repeated measurements in the same location over time are not independent.
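The weighting scheme and heterogeneity partition described in the Meta-analysis paragraphs above can be sketched as follows. This is a minimal Python illustration, not a reimplementation of MetaWin: the function `categorical_meta` is hypothetical, the pooled within-group variance is passed in rather than estimated, the confidence interval uses a normal approximation (1.96 standard errors), and the small table of 5% χ² critical values covers only 1 to 3 degrees of freedom.

```python
import math
from collections import defaultdict

# 5% critical values of the chi-square distribution for 1 to 3 degrees of freedom.
CHI2_CRIT_05 = {1: 3.841, 2: 5.991, 3: 7.815}


def weighted_mean(effects, weights):
    return sum(w * e for e, w in zip(effects, weights)) / sum(weights)


def categorical_meta(effects, sampling_vars, groups, pooled_var):
    """Weighted cumulative effect size and Q_T = Q_B + Q_W partition.

    effects: ln response ratios; sampling_vars: their sampling variances;
    groups: group label per observation; pooled_var: pooled within-group
    variance (assumed known here, an input rather than an estimate).
    """
    # Mixed-model weight: reciprocal of sampling variance plus pooled variance.
    weights = [1.0 / (v + pooled_var) for v in sampling_vars]
    grand = weighted_mean(effects, weights)
    se = math.sqrt(1.0 / sum(weights))
    ci95 = (grand - 1.96 * se, grand + 1.96 * se)  # normal approximation

    by_group = defaultdict(list)
    for e, w, g in zip(effects, weights, groups):
        by_group[g].append((e, w))

    q_between = q_within = 0.0
    for members in by_group.values():
        es = [e for e, _ in members]
        ws = [w for _, w in members]
        group_mean = weighted_mean(es, ws)
        q_between += sum(ws) * (group_mean - grand) ** 2
        q_within += sum(w * (e - group_mean) ** 2 for e, w in members)
    q_total = sum(w * (e - grand) ** 2 for e, w in zip(effects, weights))

    df_between = len(by_group) - 1
    crit = CHI2_CRIT_05.get(df_between)
    return {
        "mean_lnrr": grand,
        "ci95": ci95,
        "mean_response_ratio": math.exp(grand),  # back-transformed for graphs
        "q_total": q_total,
        "q_between": q_between,
        "q_within": q_within,
        "df_between": df_between,
        "q_between_significant": None if crit is None else q_between > crit,
    }


# Hypothetical observations: two crop groups, illustrative values only.
result = categorical_meta(
    effects=[-0.2231, -0.2877, -0.0513, 0.0],
    sampling_vars=[0.01, 0.02, 0.01, 0.015],
    groups=["cereal", "cereal", "legume", "legume"],
    pooled_var=0.005,
)
print(result)
```

Q_B is compared against the χ² critical value with (number of groups − 1) degrees of freedom, and the cumulative effect is read as significant when the 95% interval on the log scale excludes zero.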
If yield values from a single experiment were reported over several years, the average yield over time was therefore calculated and used in the meta-analysis. If the mean and variance of multiple years were reported, the weighted average over time was calculated by weighting each year by the inverse of its variance. Different experiments (for example, different tillage practices, crop species or fertilizer rates) from the same study are not necessarily independent. However, it is recommended to still include different experiments from the same study, as their omission would cause more distortion of the results than the lack of true independence³⁸. We therefore included different experiments from a single study separately in the meta-analysis.

If data from the same experiment from the same study period were reported in several papers, the data were only included once, namely from the paper that reported the data in the highest detail (that is, reporting s.d./s.e. and n and/or reporting the longest time period). If instead data from the same experiment from different years were reported in separate papers, the data were included separately in the analysis (for example, refs 39, 40).

In addition to potential within-study dependence of effect size data, there can also be issues with between-study dependence of data³⁶: data from studies conducted by the same author, in the same location or on the same crop species are also potentially non-independent. We addressed this issue by conducting a hierarchical, categorical meta-analysis (as described earlier), specifically testing for the influence of numerous moderators on the effect size. In addition, we examined the interaction between categorical variables through a combination of contingency

LETTER RESEARCH ©2012 Macmillan Publishers Limited. All rights reserved