As a preliminary conclusion, this indicates that procedure 2 is probably useful only if a 1-dimensional projection does not alter the problem too much. Naturally, a similar statement is true for procedure 1, but the assumption of equal, or at least similar, covariance structures is, most of the time, not that problematic. That the amount of variance reduction depends on the 'similarity' of the control variate to the true error rate could already have been deduced from equation (11). On the other hand, for procedure 1 the exact expected error rate is, in general, not easily computable, and procedure 2 is much quicker.

4.2 Estimated densities

Since in practice the distributions of the grouped observations are not known, the implementation of the control variate procedures has to be adapted somewhat. For the purpose of this paper we nevertheless assume normal distributions for convenience, so that only the corresponding distribution parameters have to be estimated.
Moreover, since the densities have to be estimated from the observations, the true expected misclassification rate has to be estimated by means of a resampling method in order to avoid overoptimism. As the resampling method we use leave-one-out cross-validation. That is, we temporarily eliminate one observation, estimate the densities from the remaining observations, and predict the class of the eliminated observation by means of these estimated densities. This is done for each observation.

This causes two problems. First, the extra resampling loop leads to such a large computational effort that the number of replicates of the whole Monte Carlo experiment is reduced to V = 10 for simulation 1 and to V = 5 otherwise. Second, the exact expectations in procedures 1 and 2 should not have to be computed for each resampled sample. Thus, we propose to compute the exact expectation from the "observed" sample only, and to use this value for all resampled samples as well. Moreover, for the simulations in this paper we decided to use the exact expectations from the densities used to generate the observations in order to reduce computational effort. Finally, we used the same example densities as in the preceding section to be able to judge the extra variance caused by parameter estimation.

The results of the Monte Carlo simulations can be found in Table 2. Note the increase of variance and the very small correlation ϱ with procedure 2 in the third simulation. The optimal standard deviation reachable by (8) in this case would thus be 2.53, which is only very slightly lower than the 2.57 of naive Monte Carlo. Since, nevertheless, the results are very similar to those of the simulations with known densities, the conclusions from the last subsection appear to remain valid when the density parameters have to be estimated.
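The leave-one-out step described in this subsection can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `log_normal_density` and `loo_error_rate` are hypothetical, and the sketch assumes a separate mean and covariance matrix per class, class priors estimated by relative frequencies, and enough observations per class for the covariance estimate to be nonsingular. The control variate correction itself is not included.

```python
import numpy as np


def log_normal_density(x, mean, cov):
    """Log density of a multivariate normal N(mean, cov) at the point x."""
    d = mean.shape[0]
    diff = x - mean
    # Solve a linear system rather than inverting the covariance explicitly.
    maha = diff @ np.linalg.solve(cov, diff)
    _, logdet = np.linalg.slogdet(cov)
    return -0.5 * (d * np.log(2.0 * np.pi) + logdet + maha)


def loo_error_rate(X, y):
    """Leave-one-out estimate of the misclassification rate.

    Assumption (not from the paper): priors are class frequencies in the
    reduced sample, and each class keeps its own covariance matrix.
    """
    n = X.shape[0]
    errors = 0
    for i in range(n):
        # Temporarily remove observation i.
        keep = np.arange(n) != i
        X_train, y_train = X[keep], y[keep]

        best_class, best_score = None, -np.inf
        for c in np.unique(y_train):
            Xc = X_train[y_train == c]
            # Estimate the normal parameters from the remaining observations.
            mean = Xc.mean(axis=0)
            cov = np.atleast_2d(np.cov(Xc, rowvar=False))
            prior = Xc.shape[0] / X_train.shape[0]
            score = np.log(prior) + log_normal_density(X[i], mean, cov)
            if score > best_score:
                best_class, best_score = c, score

        # Predict the class of the removed observation and count errors.
        errors += int(best_class != y[i])
    return errors / n


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    # Two overlapping bivariate normal classes, 50 observations each.
    X = np.vstack([rng.normal(0.0, 1.0, (50, 2)), rng.normal(1.5, 1.0, (50, 2))])
    y = np.repeat([0, 1], 50)
    print("leave-one-out error rate:", loo_error_rate(X, y))
```

Each call refits the class parameters n times, which shows why nesting this loop inside a Monte Carlo experiment with V replicates becomes expensive.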