CONTENTS

2 Probability Distributions  67
    2.1 Binary Variables  68
        2.1.1 The beta distribution  71
    2.2 Multinomial Variables  74
        2.2.1 The Dirichlet distribution  76
    2.3 The Gaussian Distribution  78
        2.3.1 Conditional Gaussian distributions  85
        2.3.2 Marginal Gaussian distributions  88
        2.3.3 Bayes’ theorem for Gaussian variables  90
        2.3.4 Maximum likelihood for the Gaussian  93
        2.3.5 Sequential estimation  94
        2.3.6 Bayesian inference for the Gaussian  97
        2.3.7 Student’s t-distribution  102
        2.3.8 Periodic variables  105
        2.3.9 Mixtures of Gaussians  110
    2.4 The Exponential Family  113
        2.4.1 Maximum likelihood and sufficient statistics  116
        2.4.2 Conjugate priors  117
        2.4.3 Noninformative priors  117
    2.5 Nonparametric Methods  120
        2.5.1 Kernel density estimators  122
        2.5.2 Nearest-neighbour methods  124
    Exercises  127

3 Linear Models for Regression  137
    3.1 Linear Basis Function Models  138
        3.1.1 Maximum likelihood and least squares  140
        3.1.2 Geometry of least squares  143
        3.1.3 Sequential learning  143
        3.1.4 Regularized least squares  144
        3.1.5 Multiple outputs  146
    3.2 The Bias-Variance Decomposition  147
    3.3 Bayesian Linear Regression  152
        3.3.1 Parameter distribution  152
        3.3.2 Predictive distribution  156
        3.3.3 Equivalent kernel  159
    3.4 Bayesian Model Comparison  161
    3.5 The Evidence Approximation  165
        3.5.1 Evaluation of the evidence function  166
        3.5.2 Maximizing the evidence function  168
        3.5.3 Effective number of parameters  170
    3.6 Limitations of Fixed Basis Functions  172
    Exercises  173