that the proportion data are binomially distributed, not normally distributed. Further, the constant variance assumption required for the equivalence between MLE and LSE does not hold for binomial data, for which the variance, s^2 = np(1 - p), depends upon the proportion correct p.
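To make this concrete, the binomial variance can simply be evaluated at the observed proportions. The following Matlab snippet is an illustration of my own, not part of the article's appendix; it uses the example data given in the text:

% Illustration (not from the original appendix): binomial variance of the
% number of correct responses, n*p*(1-p), at each observed proportion.
n = 100;                           % sample size per retention interval
y = [.94 .77 .40 .26 .24 .16]';    % observed proportion correct (from the text)
v = n .* y .* (1 - y);             % variance of the correct-response count
disp([y, v]);                      % ranges from 5.64 at p = .94 to 24.0 at p = .40

Because the variance differs by roughly a factor of four across conditions, an unweighted least-squares fit does not treat all data points on an equal statistical footing, whereas the binomial likelihood accounts for this automatically.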
4.1. MLE interpretation

What does it mean when one model fits the data better than does a competitor model? It is important not to jump to the conclusion that the former model does a better job of capturing the underlying process and therefore represents a closer approximation to the true model that generated the data. A good fit is a necessary, but not a sufficient, condition for such a conclusion. A superior fit (i.e., a higher value of the maximized log-likelihood) merely puts the model in a list of candidate models for further consideration. This is because a model can achieve a superior fit to its competitors for reasons that have nothing to do with the model's fidelity to the underlying process. For example, it is well established in statistics that a complex model with many parameters fits data better than a simple model with few parameters, even if it is the latter that generated the data. The central question is then how one should decide among a set of competing models. A short answer is that a model should be selected based on its generalizability, which is defined as a model's ability not only to fit the current data but also to predict future data. For a thorough treatment of this and related issues in model selection, the reader is referred elsewhere (e.g., Linhart & Zucchini, 1986; Myung, Forster, & Browne, 2000; Pitt, Myung, & Zhang, 2002).

5. Concluding remarks

This article provides a tutorial exposition of maximum likelihood estimation. MLE is of fundamental importance in the theory of inference and is a basis of many inferential techniques in statistics, unlike LSE, which is primarily a descriptive tool. In this paper, I provide a simple, intuitive explanation of the method so that the reader can have a grasp of some of the basic principles. I hope the reader will apply the method in his or her mathematical modeling efforts so that the plethora of widely available MLE-based analyses (e.g., Batchelder & Crowther, 1997; Van Zandt, 2000) can be performed on data, thereby extracting as much information and insight as possible into the underlying mental process under investigation.

Acknowledgments

This work was supported by research Grant R01 MH57472 from the National Institute of Mental Health. The author thanks Mark Pitt, Richard Schweickert, and two anonymous reviewers for valuable comments on earlier versions of this paper.
Appendix

This appendix presents Matlab code that performs MLE and LSE analyses for the example described in the text.

Matlab Code for MLE

% This is the main program that finds MLE estimates. Given a model, it
% takes the sample size (n), time intervals (t), and observed proportion
% correct (y) as inputs. It returns the parameter values that maximize
% the log-likelihood function.
global n t x;                      % define global variables
opts = optimset('DerivativeCheck','off','Display','off','TolX',1e-6,'TolFun',1e-6,...
    'Diagnostics','off','MaxIter',200,'LargeScale','off');
                                   % option settings for the optimization algorithm
n = 100;                           % number of independent Bernoulli trials (i.e., sample size)
t = [1 3 6 9 12 18]';              % time intervals as a column vector
y = [.94 .77 .40 .26 .24 .16]';    % observed proportion correct as a column vector
x = n*y;                           % number of correct responses
init_w = rand(2,1);                % starting parameter values
low_w = zeros(2,1);                % parameter lower bounds
up_w = 100*ones(2,1);              % parameter upper bounds
while 1,
    [w1,lik1,exit1] = fmincon('power_mle',init_w,[],[],[],[],low_w,up_w,[],opts);
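The main program calls an objective function named power_mle, whose definition does not appear in this excerpt. As a minimal sketch only, assuming the two-parameter power model of retention, p(t) = w(1)*t^(-w(2)), used for the example data in the text, such a function might be written as follows; the clipping of the predicted probabilities is an added safeguard against log(0) and is not necessarily part of the original listing:

function loglik = power_mle(w)
% Sketch (assumed form) of the minus log-likelihood for the power model.
% fmincon minimizes its objective, so the log-likelihood is negated.
global n t x;                      % shared with the main program
p = w(1)*t.^(-w(2));               % model prediction of proportion correct
p = min(max(p, 1e-6), 1 - 1e-6);   % guard against log(0) (added assumption)
loglik = -sum(x.*log(p) + (n - x).*log(1 - p));
                                   % binomial minus log-likelihood, omitting the
                                   % constant binomial-coefficient term

In the fmincon call, w1 then holds the parameter estimates, lik1 the minimized objective (the negative of the maximized log-likelihood), and exit1 an exit flag indicating convergence; the enclosing while loop evidently restarts the search from fresh random starting values until convergence, a common guard against local optima.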