The orange juice data are used to estimate the level of saccharose in orange juice from its observed near-infrared spectra [45].

The abalone data set is used to predict the age of abalone from physical measurements [36].

The Boston 5 and Boston 14 data sets [46, 47] use the 5th and 14th input variables of the Boston data set as outputs, respectively. The fifth variable is the NOX (nitric oxide) concentration and the 14th variable is the house price in the Boston area. For these data sets, only training data are provided.

Techniques for analyzing biological response to chemical structures are called quantitative structure–activity relationships (QSARs). The Pyrimidines [36], Triazines [36], and Phenethylamines [48] data sets are well-known QSAR data sets. For these data sets, only the training data are given.

Table 1.4 Benchmark data specification for function approximation

Data                                Inputs  Training data  Test data
Mackey–Glass                             4            500        500
Water purification (stationary)         10            241        237
Water purification (nonstationary)      10             45         40
Orange juice                           700            150         68
Abalone                                  8          4,177          -
Boston 5                                13            506          -
Boston 14                               13            506          -
Pyrimidines                             27             74          -
Triazines                               60            186          -
Phenethylamines                        628             22          -

1.4 Classifier Evaluation

In developing a classifier for a given problem, we repeatedly determine the input variables (namely, the features), gather input–output pairs according to the determined features, train the classifier, and evaluate its performance. In training the classifier, special care must be taken so that no information on the test data set is used for training.²

Assume that a classifier for an n-class problem is tested using M data samples. To evaluate the classifier for a test data set, we generate an n × n confusion matrix A, whose element a_ij is the number of class i data classified into class j. Then the recognition rate R, or recognition accuracy, in % is calculated by

² It is my regret that I could not reevaluate the computer experiments included in the book that violate this rule.
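
The confusion-matrix bookkeeping described above can be sketched in Python. This is only an illustration, not the book's code: the function names `confusion_matrix` and `recognition_rate` and the toy labels are hypothetical. Because the page breaks off before the formula is shown, the recognition rate is assumed here to follow the usual definition, the number of correctly classified samples (the diagonal sum of A) divided by M, times 100.

```python
def confusion_matrix(true_labels, predicted_labels, n_classes):
    """Build an n x n matrix A where A[i][j] counts class-i samples classified into class j."""
    A = [[0] * n_classes for _ in range(n_classes)]
    for i, j in zip(true_labels, predicted_labels):
        A[i][j] += 1
    return A


def recognition_rate(A):
    """Assumed definition: 100 * (sum of diagonal entries of A) / M, in percent."""
    M = sum(sum(row) for row in A)  # total number of tested samples
    if M == 0:
        raise ValueError("confusion matrix contains no samples")
    correct = sum(A[i][i] for i in range(len(A)))  # diagonal = correct decisions
    return 100.0 * correct / M


# Toy test set with hypothetical labels (classes are 0-indexed here).
true_labels = [0, 0, 0, 1, 1, 2, 2, 2]
predicted_labels = [0, 0, 1, 1, 1, 2, 0, 2]

A = confusion_matrix(true_labels, predicted_labels, n_classes=3)
R = recognition_rate(A)
print(A)  # [[2, 1, 0], [0, 2, 0], [1, 0, 2]]
print(R)  # 75.0
```

In this toy case 6 of the M = 8 test samples lie on the diagonal, so the assumed rate is 75%. The off-diagonal entries show which classes are confused with which, information that the single number R hides.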