A Multiple Regression Model to Predict Zebra Mussel Population Growth

Michael P. Schubmehl
Marcy A. LaViollette
Deborah A. Chun
Harvey Mudd College
Claremont, CA 91711

Advisor: Michael E. Moody

The UMAP Journal 22 (4) (2001) 367–383. © Copyright 2001 by COMAP, Inc. All rights reserved. Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice. Abstracting with credit is permitted, but copyrights for components of this work owned by others than COMAP must be honored. To copy otherwise, to republish, to post on servers, or to redistribute to lists requires prior permission from COMAP.

Summary

Zebra mussels (Dreissena polymorpha) are invasive mollusks accidentally introduced to the United States by transatlantic ships during the mid-1980s. Because the mussels have few natural predators and adapt quickly to new environments, they have spread quickly from the Great Lakes into many connected waterways. Although the mussel is hardy, sometimes little or no growth is observed in lakes to which it has been introduced; extensive research indicates that the chemical concentrations in these bodies of water may be unsuitable for the mussels.

To quantify the relationship between chemical contents and mussel population growth, we first use the logistic equation,

\[ \frac{dy}{dt} = ry\left(1 - \frac{y}{K}\right), \]

to model Dreissena population as a function of time. After modeling growth rates under a variety of conditions, we use multiple regression to determine which chemicals affect this growth rate. An extensive literature search supports our finding that population growth is linearly dependent on two primary factors: calcium concentration and pH. After further refining our model using the second set of data from Lake A, we obtain the regression equation

\[ \text{maximum growth rate} = 2338\,[\mathrm{Ca}^{2+}] + 39202\,\mathrm{pH} - 334089, \]
where the maximum growth rate is in juveniles settling per day, and [Ca²⁺] is in mg/L. Using this model, we predict that Lakes B and C cannot support a Dreissena population. Because the levels of calcium in Lake B are close to those required to support a Dreissena population, however, we advise the community near Lake B to use de-icing agents that do not contain calcium.

Environmental Factors Affecting Dreissena

A large body of research links environmental factors such as temperature, pH, calcium ion concentration, and alkalinity to the success or failure of zebra mussel populations. The two factors most consistently associated with survival are calcium concentration and pH. In a survey of 278 lakes, for example, Ramcharan et al. [1992] found no populated lakes with pH below 7.3 or Ca content below 28.3 mg/L. Recent studies have lowered the minimum Ca concentration to 15 mg/L for adults and 12 mg/L for larvae [McMahon 1996]. The upper bound for pH is somewhere near 9.4 [McMahon 1996]. The optimum conditions for growth are a pH of 8.4 and 34 mg/L of Ca [McMahon 1996].

Other requirements for survival include alkalinity, which must be kept above 50 mg/L [Balog et al. 1995], and dissolved oxygen, which must be above 0.82 ppm (approximately 10% of saturation) [Johnson and McMahon 1996]. Dreissena also cannot survive in magnesium-deficient water; they require a minimum concentration of 0.03 mM for a low-density population [Dietz and Byrne 1994]. Sulfate (SO₄) is also required in small amounts for survival [Dietz and Byrne 1999].

Zebra mussels can survive in an amazingly wide range of temperatures, but Van der Velde et al. [1996] determined that exposure to 34°C is lethal within 114 minutes and that any temperature above 25°C inhibits movement and feeding. Some individuals can tolerate short-term sub-freezing air temperatures [Paukstis et al. 1996].

Although not used by the mussels themselves, phosphorus and nitrogen are essential for freshwater phytoplankton survival, and phytoplankton are the main source of food for Dreissena. Densities of mussel populations are negatively related to both phosphates and nitrates, but iron, chlorine, and sodium have no relationship to the existence or density of populations [Ramcharan et al. 1992]. Chlorophyll content measures the density of phytoplankton and thus decreases drastically after the establishment of a zebra mussel colony [Miller and Haynes 1997].

Surprisingly, food availability is not an important factor once a zebra mussel population is established. In one study, Dreissena were able to survive without food for 524 days with only a 60% mortality rate [Chase and McMahon 1995]. Once a population has acclimatized, limited reproduction can occur in brackish water below 7.0 ppt salinity [Fong et al. 1995], with little mortality even up to 10 ppt [Kennedy et al. 1996]. Potassium can be tolerated only in low concentrations, up to 0.3–0.5 mM. Ammonia (NH₃) is lethal in doses as low as 2 mg/L [Baker et
al. 1994]. An extensive literature search revealed no correlation between NH₄ and zebra mussel populations.

Constructing the Model

We need to quantify Dreissena population growth, then examine how this growth is affected by the environment. We use the logistic equation, a standard modeling device in ecology [Gotelli 1998]. We choose a continuous approach because of the huge number of individuals involved, and the logistic equation in particular because its simplicity allows us to make as few assumptions as possible.

Standard techniques for examining the influence of variables like calcium ion concentrations, pH, and temperature on Dreissena populations include multiple regression and discriminant analysis [Ramcharan et al. 1992]. We want to predict actual population growth rates and not just state whether or not a population could exist in certain conditions, so we use multiple regression to relate population growth to chemical concentrations.

Assumptions

• Population growth rate is proportional to total population.
We assume that the growth rate of an area's population is proportional to the rate at which juveniles settle on plates there. This rate is, in turn, proportional to the total number of larvae present in the water, which is proportional to the total population. Thus, the population growth rate is proportional to the population level.

• Carrying capacity is constant.
Larvae can be thought of as a resource necessary for juveniles to exist. Each breeding season, only a certain number of larvae are produced, so the population can increase only to a certain point. Thus, there is effectively a carrying capacity at work. We assume that this carrying capacity does not depend explicitly on time once the breeding season begins.

• Migration, genetic structure, and age structure do not affect the population.
Although Dreissena populations spread quickly from one region to another, individuals can move only at a slow crawl. Thus, migration of existing population into or out of a region is negligible. Also, there is no evidence for the existence of individuals whose ages or genes dramatically affect their influence on the population, so we neglect age and genetic variation.

• Predation is negligible.
We assume that Dreissena are so numerous that any species that prey on them (and there are few) do not have a substantial impact.

• Sites within a lake can be treated as distinct lakes.
Although all of the data came from a single lake, we model each site as a separate lake. That is, we assume that the introduction of mussels from another part of the lake is equivalent to their introduction into a fresh lake, and we model the population at the new site independently.

Population Growth Model: The Logistic Equation

We model a Dreissena population with the logistic equation

\[ \frac{dy}{dt} = ry\left(1 - \frac{y}{K}\right), \]

where r is the intrinsic growth rate of the population and K is the carrying capacity. For simplicity, we let a = r and b = r/K, so that

\[ \frac{dy}{dt} = ay - by^2. \]

With the initial condition y(0) = y₀, the equation has the closed-form solution

\[ y(t) = \frac{a e^{at} y_0}{a - b y_0 + b e^{at} y_0}, \]

shown in Figure 1. Because the data from Lake A measure the population growth rate, what we really want to fit to the data is the derivative of this function,

\[ y'(t) = \frac{a^2 e^{at} y_0 \,(a - b y_0)}{\bigl(a + b(-1 + e^{at})\, y_0\bigr)^2}, \]

whose graph is shown in Figure 2. We can convert the parameters a, b, and y₀ into the position, height, and full width at half maximum (FWHM) of this peak, making it easy to fit to data.

Because the first data set did not include information about changes in chemical concentration over time, we average the population growth rates over all years after the introduction of Dreissena and fit the model curve to this "average year" at each site. The position and width of the peak are fairly constant from site to site, as we expect, since the breeding season usually peaks around mid- to late August and lasts for about three months. The peak heights, however, are radically different at different sites, ranging from about 38,000 juveniles per day at site 2 (Figure 3) to just 1 juvenile per day at site 10. This variation can be explained only by the environmental conditions there, so we determine how these growth rates vary with chemical concentrations.
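The conversion from (a, b, y₀) to the position, height, and width of the growth-rate peak is not spelled out in the text, so the following Python sketch shows one way it could be done. It follows from the logistic form alone: the growth rate ay − by² is largest when y = a/(2b), which gives a peak height of a²/(4b), and writing the derivative as a sech² pulse centered on the peak gives the width. The function names and the sample parameter values are illustrative assumptions, not values from the Lake A data.

```python
import math


def logistic_population(t, a, b, y0):
    """Closed-form solution y(t) of dy/dt = a*y - b*y**2 with y(0) = y0."""
    e = math.exp(a * t)
    return a * e * y0 / (a - b * y0 + b * e * y0)


def logistic_growth_rate(t, a, b, y0):
    """Derivative y'(t) of the closed-form solution above."""
    e = math.exp(a * t)
    return a**2 * e * y0 * (a - b * y0) / (a + b * (e - 1.0) * y0) ** 2


def peak_parameters(a, b, y0):
    """Return (position, height, FWHM) of the growth-rate peak.

    Assumes 0 < y0 < a/b, so that the population is growing toward the
    carrying capacity K = a/b.
    """
    # The peak occurs when y(t) = K/2, i.e. exp(a*t) = (a - b*y0) / (b*y0).
    t_peak = math.log((a - b * y0) / (b * y0)) / a
    # Growth rate at y = a/(2b).
    height = a**2 / (4.0 * b)
    # y'(t) = height * sech^2(a*(t - t_peak)/2); half maximum at cosh = sqrt(2).
    fwhm = 4.0 * math.log(1.0 + math.sqrt(2.0)) / a
    return t_peak, height, fwhm


if __name__ == "__main__":
    # Hypothetical parameters chosen only to exercise the functions.
    a, b, y0 = 0.1, 1e-6, 100.0
    t_peak, height, fwhm = peak_parameters(a, b, y0)
    print(f"peak at t = {t_peak:.2f}, height = {height:.1f}, FWHM = {fwhm:.2f}")
```

Inverting these three relations gives a, b, and y₀ from a measured peak, so a fit can be carried out in terms of quantities that are easy to read off a plot of settlement rate against time.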
Figure 1. Solution to a generic logistic equation, y' = ay − by², with population plotted as a function of time. (Axes: Population versus Time.)

Figure 2. The derivative of the solution to a generic logistic equation, showing the time rate of change of population. The peak corresponds to the Dreissena breeding season in our model. (Axes: Growth Rate versus Time.)
Figure 3. The derivative of the population growth model, along with data for an average year at Lake A (at site 2, the most populous site). The peak height of 38,000 is the quantity that best characterizes the population's success, so it is used in the regression analysis. (Plot title: Actual and Model Growth Rates for an "Average Year". Axes: Growth Rate (juveniles/day) versus Time (fraction of average year).)

Influence of the Environment: Multiple Regression Analysis

To determine the effect of environmental conditions on growth rates, we must correlate the peak growth rates in the logistic model with the chemical concentrations at each site. To this end, we perform a multiple regression with peak growth rate as the dependent variable and some or all of the chemical concentrations as independent variables.

There are only 10 data points, far fewer than needed to separate the effects of all 11 variables. Fortunately, the literature provides guidance in selecting which variables to use. The dominant factors influencing the success of a Dreissena population are the concentration of calcium and the pH. Although alkalinity seems to be somewhat important, it is included in only the first data set; moreover, it also appears to be closely correlated with calcium concentration, so we exclude it. Another marginally important factor, dissolved oxygen, was not measured in the first data set. According to the literature, other chemical factors are negligible as long as they are present in trace amounts. Thus, we perform the regression on just two variables: calcium concentration and pH. The equation we obtain is

\[ \text{maximum rate} = 1687\,[\mathrm{Ca}^{2+}] + 55703\,\mathrm{pH} - 454995, \qquad (1) \]

where the maximum growth rate is in juveniles settling per day and [Ca²⁺]
is in mg/L. Thus, by measuring the concentration of Ca²⁺ and the pH of the water, we can predict the population growth rate.

Tests and Refinements

The population growth model fits the data surprisingly well, considering its simplicity. Although in some cases the model could be strengthened by allowing two peaks of different heights, doing so would introduce at least one more degree of freedom and thus make it difficult to perform a meaningful regression with just 10 sites. Because we are interested in the overall success or failure of the population, we accept some inaccuracy in the population model in order to set up a better regression.

As a first check on the model, we use it to predict the growth rates at sites 1–10 in Lake A and compare the predictions to the actual rates (Table 1).

Table 1. Actual growth rates in Lake A (first data set) vs. predicted growth rates, in thousands per day.

Site    Actual    Model
 1      12        18
 2      38        28
 3      15         6
 4       1        10
 5      30        20
 6       0.002   −100
 7       0.003     0.2
 8       0.2       9
 9       3        14
10       0.001     3

Although far from perfect, the agreement gives us confidence that the model can give at least a qualitative idea of how well a Dreissena population will do in a given calcium concentration and pH.

For a second test of the model, we use it to predict the minimum pH and calcium concentration tolerable to Dreissena. At a pH of 7.7, which is typical of the data available for Lake A, the regression equation predicts that the lowest tolerable concentration of Ca²⁺ would be 15.4 mg/L, very close to the accepted value of 15 mg/L [McMahon 1996]. At a calcium concentration of 25 mg/L, also typical of freshwater lakes, the model predicts a minimum pH of 7.4; this is only slightly higher than the literature value of about 7.3.

Having established some confidence in our model, we test it against the second data set for Lake A. Because this data set does not include pH, we assume that the values reported in the first data set are accurate and use them in concert with the new calcium concentrations to predict growth rates (Table 2). Although this agreement is coincidentally somewhat better than that with the first data set, we perform a new regression on both data sets at once to see
if we can improve the model. This gives us the new regression equation

\[ \text{maximum rate} = 2338\,[\mathrm{Ca}^{2+}] + 39202\,\mathrm{pH} - 334089. \qquad (2) \]

Using this new equation, we predict the peak growth rates at all ten sites, based on data from both sets. The results are given in Table 3.

Table 2. Actual growth rates in Lake A (second data set) vs. predicted growth rates, in thousands per day.

Site    Actual    Model
 1      16        16.5
 2      50        27
 3      45         6
 4      10         9.5
 5      30        20
 6      15       −10
 7       0.02     −0.1
 8       0.5       8
 9       8       150
10       0.03      5

Table 3. Actual growth rates in Lake A (from both data sets) vs. predicted growth rates from combined regression, in thousands per day.

Site    Set 1     Model    Set 2     Model
 1      12        30       16        28
 2      38        32       50        30
 3      15        11       45        10
 4       0.001    12       10        12
 5      30        20       30        19
 6       0.002    −5        0.015    −5
 7       0.0033    6        0.020     5
 8       0.150    11        0.450    10
 9       3        14        8        15
10       0.001     2        0.030     5

The revised model illustrates the sensitivity of the coefficients to changes in the data. Although the additional data incorporated are from the same physical locations as the first data set, they have a significant impact on the regression equation. This modification improves some predictions and worsens others.

Strengths and Weaknesses

Like any model, the one presented above has its strengths and weaknesses. Some of the major points are presented below.
Strengths

• Applies widely accepted techniques.
The logistic equation is often used to model population growth under the conditions set forth in our assumptions [Gotelli 1998]. Multiple regression analysis has been used effectively in predicting Dreissena populations previously [Ramcharan et al. 1992].

• Produces predictions in agreement with the data and other models.
Although agreement with the data provided is far from perfect, our model produces peak growth rates that are largely consistent with observed growth rates. The model also correctly predicts minimum [Ca²⁺] and pH levels for Dreissena survival. Additionally, it is consistent with other models in the literature. Ramcharan et al. [1992], for instance, give a probability-of-survival model

\[ A = 0.045\,[\mathrm{Ca}^{2+}] + 1.246\,\mathrm{pH} - 11.696 \]

that is very nearly a constant multiple of our (1).

• Correctly predicts results at Lakes B and C.
Equation (2) predicts population growth rates of −8,000 juveniles/day for Lake B and −145,000 juveniles/day for Lake C. That is, the lakes are incapable of supporting mussel populations. This is consistent with the fact that both lakes are well below the minimum calcium and pH requirements.

Weaknesses

• Extremely sensitive to changes in experimental data.
Based on the results described above, this seems to be a fairly substantial problem with the model. Given the extraordinarily small amount of data available, though, it is hardly remarkable that a change in any given peak value changes the model significantly. If more data were available, we would expect much better averaging-out of error and a regression equation with much better predictive power.

• Neglects the effects of all factors but [Ca²⁺] and pH.
Again, while this would initially appear to limit the predictive power of the model, the literature supports our selection of these two factors as the dominant ones influencing population growth [Ramcharan et al. 1992].
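Two of the checks cited above can be reproduced directly from the printed coefficients: the minimum tolerable calcium concentration and pH (where the predicted rate crosses zero), and the claim that the model of Ramcharan et al. is nearly a constant multiple of equation (1). The Python sketch below is illustrative only; the dictionary layout and function names are assumptions, and the coefficients are copied from equations (1) and (2) and from the quoted probability-of-survival model.

```python
# Coefficients as printed in the text: rate = ca * [Ca2+] + ph * pH + const,
# with [Ca2+] in mg/L and the rate in juveniles settling per day.
EQ1 = {"ca": 1687.0, "ph": 55703.0, "const": -454995.0}
EQ2 = {"ca": 2338.0, "ph": 39202.0, "const": -334089.0}
# Probability-of-survival model quoted from Ramcharan et al. [1992].
RAMCHARAN = {"ca": 0.045, "ph": 1.246, "const": -11.696}


def predicted_rate(eq, calcium_mg_per_l, ph):
    """Evaluate a linear model of the form ca*[Ca2+] + ph*pH + const."""
    return eq["ca"] * calcium_mg_per_l + eq["ph"] * ph + eq["const"]


def min_calcium(eq, ph):
    """Calcium concentration (mg/L) at which the predicted rate is zero."""
    return -(eq["const"] + eq["ph"] * ph) / eq["ca"]


def min_ph(eq, calcium_mg_per_l):
    """pH at which the predicted rate is zero for a given calcium level."""
    return -(eq["const"] + eq["ca"] * calcium_mg_per_l) / eq["ph"]


def coefficient_ratios(eq, other):
    """Term-by-term ratios; near-equal values mean near-proportional models."""
    return {name: eq[name] / other[name] for name in eq}


if __name__ == "__main__":
    print("min [Ca2+] at pH 7.7:", round(min_calcium(EQ1, 7.7), 2), "mg/L")
    print("min pH at 25 mg/L Ca:", round(min_ph(EQ1, 25.0), 2))
    print("ratios (1)/Ramcharan:", coefficient_ratios(EQ1, RAMCHARAN))
```

With the rounded coefficients of equation (1), the zero crossing at pH 7.7 falls at roughly 15.5 mg/L and the zero crossing at 25 mg/L falls near pH 7.4, in line with the values quoted in the text; the small difference in the calcium threshold presumably reflects rounding of the printed coefficients. The three coefficient ratios all lie between about 37,000 and 45,000, which is the sense in which the two models are close to proportional.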
Results and Interpretation

To apply the model to the data for Lakes B and C, we assume that the values given for the concentrations are representative of the entire lake. With only one
data point for each lake, we must extrapolate. Thus, the model's predictions might not hold in areas where the concentration or pH differs significantly from this value.

The model clearly indicates that there is no chance of zebra mussel infestation in Lake C, consistent with the fact that the pH in the lake is far too low to support a mussel population. The literature indicates zero growth at a pH below about 7.3; the highest measurement of pH in Lake C is 6.0, which is clearly far too acidic. In addition, the calcium concentration must be greater than 12 mg/L for larval survival; Lake C is far below this cutoff, with a mere 1.85 mg/L at maximum.

The chemical data for Lake B are less clear-cut. The pH is in the required range, but the calcium concentration is too low for adult survival. Our model and the literature both indicate that it would take a significant shift in the lake's calcium content for it to support zebra mussels.

Although taken over the course of several years, the data for Lakes B and C are not spread out spatially. It is possible that some region in either lake has much higher pH and calcium concentrations. For example, Lake George in the Adirondacks was initially thought to be immune to zebra mussels because of its water chemistry, but they were later discovered in a small region near a culvert with elevated calcium concentrations. Scientists are now concerned about Dreissena's potential to spread to other parts of Lake George, as the mussels have an amazing ability to adapt once they have settled [Revkin 2000].

Other models strongly agree with our conclusions about Lakes B and C. Hincks and Mackie's model [1997] also found that zebra mussel populations depend only on pH and calcium concentration. Their formula,

$$p = \frac{e^{L}}{1 + e^{L}}, \qquad \text{where } L = 134.7 - 3.659\,[\mathrm{Ca}^{2+}] - 15.868\,\mathrm{pH} + 0.43\,[\mathrm{Ca}^{2+}]\,\mathrm{pH},$$

predicts 100% mortality in Lake C and 99% in Lake B; a population might be able to make some headway if it could establish itself in Lake B.

Ramcharan et al. [1992] modeled the probability of a population becoming established, finding through discriminant analysis that only pH and calcium levels are significant factors. The discriminant function is

$$A = 1.246\,\mathrm{pH} + 0.045\,[\mathrm{Ca}^{2+}] - 11.696,$$

where $A$ must be greater than $-0.638$ for a population to exist. This equation, which is nearly a constant multiple of our (1), suggests that no populations would establish themselves in either lake
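The two literature models above can be checked numerically with a short sketch. It assumes that p in the Hincks and Mackie formula is the mortality fraction, which is how the surrounding text uses it, and it takes the Lake C maxima quoted earlier (1.85 mg/L calcium, pH 6.0) as inputs. The function names are hypothetical. The Lake B measurements are not given in this section, so only Lake C is evaluated.

```python
import math


def hincks_mackie_mortality(calcium_mg_per_l: float, ph: float) -> float:
    """Logistic formula p = e^L / (1 + e^L) with L as given in the text.

    p is read here as the mortality fraction (an assumption based on
    how the paper describes the model's predictions).
    """
    L = (134.7 - 3.659 * calcium_mg_per_l - 15.868 * ph
         + 0.43 * calcium_mg_per_l * ph)
    # Algebraically equal to e^L / (1 + e^L); avoids overflow for large L.
    return 1.0 / (1.0 + math.exp(-L))


def ramcharan_discriminant(calcium_mg_per_l: float, ph: float) -> float:
    """Discriminant A; a population can exist only if A > -0.638."""
    return 1.246 * ph + 0.045 * calcium_mg_per_l - 11.696


A_THRESHOLD = -0.638  # cutoff quoted in the text

# Lake C maxima reported in the text.
p = hincks_mackie_mortality(1.85, 6.0)
a = ramcharan_discriminant(1.85, 6.0)
print(f"Hincks-Mackie mortality for Lake C: {p:.1%}")
print(f"Ramcharan discriminant A for Lake C: {a:.3f} "
      f"({'above' if a > A_THRESHOLD else 'below'} the {A_THRESHOLD} cutoff)")
```

With these inputs the mortality rounds to 100% and A falls well below the cutoff, matching the conclusions stated for Lake C.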