Lecture 12
Ordinal Logistic Regression

This lecture briefly introduces ordinal logistic regression
• The context and data type
• The ordinal logistic regression equation
• Fitting an ordinal logistic regression
• Results and interpretation
• An illustrative example of fertility analysis using Stata

The context
• There are many contexts in which a variable is ordinal and has three or more categories.
• Some typical examples are health status (very good, good, so-so, bad, very bad), political ideology (very liberal, slightly liberal, moderate, slightly conservative, very conservative), and fertility intention (the more the better, two, one, none).
• In these examples, the distance between categories is not equal.
• One option is to treat the variable as though it were continuous and simply use OLS regression. This is widely done, particularly when the dependent variable has 5 or more categories. However, it will often result in biased estimates of the regression parameters.
• A second option is to ignore the ordering of the categories and treat the variable as nominal, i.e., use the multinomial logit model (MNLM). The key problem is a loss of efficiency. By ignoring the fact that the categories are ordered, you fail to use some of the information available to you, and you may estimate many more parameters than is necessary. This increases the risk of getting insignificant results, but your parameter estimates should still be unbiased.

Data type
• As in other logistic regressions, the predictors in ordinal logistic regression may be quantitative, categorical, or a mixture of the two. The dependent variable should be discrete and ordinal, with three or more categories.
• In SPSS, discrete (categorical) variables are entered as factors, and continuous variables as covariates.

The Ordered Logit Model (OLM)
• Say Y is an ordinal dependent variable with c categories. Let Pr(Y ≤ j) denote the probability that the response on Y falls in category j or below (i.e., in category 1, 2, …, or j). This is called a cumulative probability. It equals the sum of the probabilities in category j and below:
  Pr(Y ≤ j) = Pr(Y = 1) + Pr(Y = 2) + … + Pr(Y = j)
• A dependent variable Y with c categories has c cumulative probabilities: Pr(Y ≤ 1), Pr(Y ≤ 2), …, Pr(Y ≤ c). The final cumulative probability uses the entire scale; therefore, Pr(Y ≤ c) = 1. The order in which the cumulative probabilities are formed reflects the ordering of the dependent variable scale, and the probabilities themselves satisfy:
  Pr(Y ≤ 1) ≤ Pr(Y ≤ 2) ≤ … ≤ Pr(Y ≤ c) = 1
• In ordered logit, an underlying probability score for an observation being in the ith response category is estimated as a linear function of the independent variables and a set of threshold points (also called cut points).
• The probability of observing response category i corresponds to the probability that the estimated linear function, plus random error, is within the range of the threshold points estimated for that response.

• For the jth observation:
\[
\Pr(\text{outcome}_j = i) = \Pr\left(k_{i-1} < b_1 X_{1j} + b_2 X_{2j} + \cdots + b_n X_{nj} + u_j \le k_i\right)
\]
• One estimates the coefficients b_1, b_2, …, b_n along with the threshold points k_1, k_2, …, k_{I-1}, where n is the number of independent variables and I is the number of possible response categories of the dependent variable. All of this is a direct generalization of the binary logistic model.

• The coefficients and threshold points are estimated using maximum likelihood. In the parameterization of SPSS, no constant appears because its effect is absorbed into the threshold points.
• The SPSS output provides single values for the b coefficients. The b coefficients (one for each X variable) are the main items of interest in the ordered logit table. (One of the advantages of using Stata is that odds ratios are available.)

• When b = 0, X has no effect on Y. The effect of X increases as the absolute value of b increases. There are not separate b coefficients for each of the outcomes (or for the number of outcomes minus one, as we saw in multinomial logistic regression, where the dependent variable is nominal).
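As a hedged sketch of how the interval probability above becomes something one can compute: assume the random error u_j follows a standard logistic distribution with CDF Λ (this is the assumption that makes the model a logit model). The symbols η_j for the linear function of the X variables, Λ, and the conventions k_0 = −∞ and k_I = +∞ for I response categories are notation introduced here for convenience, not taken from the slides. Under these assumptions each category probability is a difference of two adjacent cumulative probabilities:

```latex
\begin{aligned}
\Pr(Y_j \le i) &= \Pr(\eta_j + u_j \le k_i) = \Lambda(k_i - \eta_j), \\
\Pr(Y_j = i)   &= \Lambda(k_i - \eta_j) - \Lambda(k_{i-1} - \eta_j), \\
\Lambda(z)     &= \frac{1}{1 + e^{-z}}, \qquad k_0 = -\infty, \quad k_I = +\infty .
\end{aligned}
```

With only two categories there is a single threshold k_1, and Pr(Y_j = 2) = 1 − Λ(k_1 − η_j) = Λ(η_j − k_1). This is the binary logistic model with intercept −k_1, which is one way to see both the "direct generalization" and why the constant can be absorbed into the thresholds.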
• In OLM, a particular b coefficient takes the same value in the logit equation for each cumulative probability. The model assumes that the effect of X is the same for each cumulative probability. This cumulative logit model with common effects is often called a "proportional odds" model.

The ordered logit model has the form:
\[
\begin{aligned}
\ln\left[\frac{p_1}{1 - p_1}\right] &= a_1 + b_1 X_1 + b_2 X_2 + \cdots + b_n X_n \\
\ln\left[\frac{p_1 + p_2}{1 - (p_1 + p_2)}\right] &= a_2 + b_1 X_1 + b_2 X_2 + \cdots + b_n X_n \\
&\;\;\vdots \\
\ln\left[\frac{p_1 + p_2 + \cdots + p_k}{1 - (p_1 + p_2 + \cdots + p_k)}\right] &= a_k + b_1 X_1 + b_2 X_2 + \cdots + b_n X_n \\
p_1 + p_2 + \cdots + p_{k+1} &= 1
\end{aligned}
\]

Estimating an ordered logit model
• The explication of the OLM is facilitated by considering an example using the 1997 data. Suppose that the response variable is the health status of children, which is captured by question 302F:
  F. Health conditions of live births?
  1). Healthy
  2). Basically healthy
  3). Sick but not disabled
  4). Congenitally disabled
  5). Disabled after birth
  6). Dead
  7). N/A

We are going to examine the effect on child health of maternal age at childbearing, residence, ethnicity, education, duration of breastfeeding, and child sex. We recode the health status variable into 4 categories: (1) healthy, (2) basically healthy, (3) sick or disabled, and (4) dead, as shown in the following table (we restrict our sample to children aged 0-5):

HEALTH4
                              Frequency   Percent   Valid Percent   Cumulative Percent
Valid     healthy                  1121      75.8            89.3                 89.3
          basically healthy          90       6.1             7.2                 96.5
          sick or disabled           15       1.0             1.2                 97.7
          dead                       29       2.0             2.3                100.0
          Total                    1255      84.9           100.0
Missing   System                    224      15.1
Total                              1479     100.0
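The Cumulative Percent column of the HEALTH4 table is exactly the set of cumulative probabilities Pr(Y ≤ j) defined earlier, estimated from the sample. The short Python sketch below rebuilds that column from the valid frequencies and also computes the empirical cumulative logits. The variable names are illustrative only, and Python is used here purely as a calculator; it is not part of the SPSS or Stata workflow in the lecture.

```python
import math

# Valid frequencies from the HEALTH4 table, ordered from best to worst health.
freq = {
    "healthy": 1121,
    "basically healthy": 90,
    "sick or disabled": 15,
    "dead": 29,
}

total = sum(freq.values())  # number of valid cases

# Cumulative probabilities Pr(Y <= j) for j = 1..4.
cum_prob = []
running = 0
for count in freq.values():
    running += count
    cum_prob.append(running / total)

# Empirical cumulative logits ln[Pr(Y<=j) / (1 - Pr(Y<=j))] for j = 1..3.
# The last category is skipped: Pr(Y <= 4) = 1, so its logit is undefined.
cum_logit = [math.log(p / (1 - p)) for p in cum_prob[:-1]]

for label, p in zip(freq, cum_prob):
    print(f"{label:18s} cumulative percent = {100 * p:5.1f}")
print("cumulative logits:", [round(v, 2) for v in cum_logit])
```

In a model with no X variables, the maximum likelihood estimates of the a_j terms in the cumulative logit equations equal these empirical cumulative logits, so they give a feel for the scale on which the thresholds are measured.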
We are going to fit the following equation:
\[
\ln\left[\frac{P(Y \le j)}{1 - P(Y \le j)}\right] = a_j + b_1 X_1 + b_2 X_2 + \cdots + b_n X_n
\]
Dependent variable: health status, denoted as health4 (4 categories: healthy, basically healthy, sick or disabled, and dead).

• Independent variables:
  MAC: maternal age at childbearing, an interval variable
  Par_num: parity, an interval variable
  Bfeed: duration of breastfeeding, an interval variable
  Chdsex: child sex, 1 if a girl, 0 otherwise
  Urban: place of residence, 1 if urban, 0 otherwise
  Han: 1 if Han, 0 otherwise
  Primary: 1 if primary school, 0 otherwise
  Junior: 1 if junior middle school, 0 otherwise
  Sencol: 1 if senior middle school and over, 0 otherwise

• Our hypothesis is that both child and maternal characteristics affect child survival. Women in higher socio-economic categories will be more likely to have healthier children. Prolonged duration of breastfeeding is associated with an increased probability of the child being healthy. The practice of discrimination against girls suggests that a girl is more likely to be in a worse state of health than a boy.

The ordinal logistic regression equation in our example:
\[
\begin{aligned}
\ln\left[\frac{P(Y \le j)}{1 - P(Y \le j)}\right] = a_j &+ b_1\,\text{Mac} + b_2\,\text{Par\_num} + b_3\,\text{Bfeed} \\
&+ b_4\,\text{Chdsex} + b_5\,\text{Urban} + b_6\,\text{Han} \\
&+ b_7\,\text{Primary} + b_8\,\text{Junior} + b_9\,\text{Sencol}
\end{aligned}
\]
SPSS syntax:

PLUM health4 WITH mac par_num bfeed chdsex urban han primary junior sencol
  /CRITERIA=CIN(95) DELTA(0) LCONVERGE(0) MXITER(100) MXSTEP(5) PCONVERGE(1.0E-6) SINGULAR(1.0E-8)
  /LINK=LOGIT
  /PRINT=FIT PARAMETER SUMMARY.

[SPSS output: Parameter Estimates table]

A positive and statistically significant coefficient estimate implies that the corresponding explanatory variable significantly increases the probability that the child is healthy, while a negative and statistically significant coefficient estimate implies that the corresponding explanatory variable significantly increases the probability that the child dies after birth. Thus, higher education of mothers significantly increased the likelihood of having a healthy child, as did longer duration of breastfeeding. The coefficients of the other explanatory variables are, however, not statistically significant.

An example of fertility analysis
• Dependent variable: number of children ever born (three categories: none, few, multiple)
• Independent variables: individual characteristics (age, place of residence, ethnicity, education)
Using Stata to Estimate an Ordered Logit Model of Chinese Fertility
• The dependent variable is called CEB3, an ordinal variable scored 1 if the woman has no births, 2 if the woman has few (1-2) births, and 3 if the woman has multiple (3+) births. Thus, the dependent variable has three outcomes: none, few, multiple.

• Ordinal logistic regression is used to model the CEB3 dependent variable; the X variables are AGE (in years) and six dummy variables representing place of residence, ethnicity, and education: URBAN, HAN, PRIMARY, JUNIOR, SENIOR, COLLEGE.
• The Stata command is ologit, followed by the dependent variable and then the independent variables.

[Stata output: ologit results for CEB3]

• Note that the seven logit coefficients have single values (unlike the situation in the last lecture, when we estimated a multinomial logistic regression).
• Note also the two cut points, cut1 = 0.92 and cut2 = 6.53; these are the so-called ancillary parameters. Their values assist us in calculating the probability of each woman being in each of the three outcomes of the CEB3 dependent variable; they also assist in interpreting the logit coefficients and their odds ratios.
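To make concrete how the cut points "assist us in calculating probabilities", here is a minimal Python sketch. It assumes Stata's ologit parameterization, Pr(CEB3 ≤ j) = Λ(cut_j − xb), where Λ is the logistic CDF and xb is the linear function of the X variables. The cut points are those quoted above; the coefficients are the two-decimal values quoted in this lecture's discussion of the ordered logit coefficients, so the resulting probabilities are approximate. The function names and the two example women are hypothetical and chosen only for illustration.

```python
import math

def logistic_cdf(z):
    """Standard logistic CDF, 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

# Cut points quoted in the lecture.
CUT1, CUT2 = 0.92, 6.53

# Logit coefficients as quoted in the lecture (rounded to two decimals).
COEF = {
    "age": 0.17, "urban": -1.14, "han": -0.66,
    "primary": -0.31, "junior": -0.64, "senior": -1.12, "college": -1.28,
}

def ceb3_probabilities(woman):
    """Return (Pr(none), Pr(few), Pr(multiple)) for a dict of X values.

    Variables missing from the dict are treated as 0 (the reference group).
    """
    xb = sum(COEF[name] * woman.get(name, 0) for name in COEF)
    p_le1 = logistic_cdf(CUT1 - xb)  # Pr(CEB3 <= 1)
    p_le2 = logistic_cdf(CUT2 - xb)  # Pr(CEB3 <= 2)
    return p_le1, p_le2 - p_le1, 1.0 - p_le2

# Hypothetical profiles, for illustration only.
profiles = {
    "rural, minority, illiterate, age 30": {"age": 30},
    "urban, Han, college, age 30": {"age": 30, "urban": 1, "han": 1, "college": 1},
}

for label, woman in profiles.items():
    p_none, p_few, p_multiple = ceb3_probabilities(woman)
    print(f"{label}: none={p_none:.3f} few={p_few:.3f} multiple={p_multiple:.3f}")
```

The three probabilities always sum to one because they are differences of the cumulative probabilities 0 ≤ Pr(CEB3 ≤ 1) ≤ Pr(CEB3 ≤ 2) ≤ 1.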
Model Fit
• From the lecture last week, we already know how to evaluate the adequacy or fit of the overall model. The LR χ² test statistic in the full model has a value of 1429.64, which is the difference between the values of (-2L0) and (-2LF). With 7 degrees of freedom, this statistic has a reported P of 0.0000 (i.e., P < 0.0001) for testing the null hypothesis that β1 = β2 = … = β7 = 0.

• Pseudo R² is 0.2384, indicating a fairly good fit of the model to the data.
• All seven logit coefficients have very high z (t) scores, and all are significant at the 0.01 level, meaning that the seven logit coefficients are all significantly different from 0 and have a significant influence on fertility.

The Ordered Logit Coefficients
• In summary, the coefficients tell us that older women have more children, educated women have fewer children, urban women have fewer children than rural women, and Han women have fewer children than minority women.
• Each coefficient refers to the linear change in the log odds of being above either of the first two categories.

• Other things equal, with each one-year increase in age, there is an increase of 0.17 in the log odds of CEB3 being above either of the two fixed levels, that is, the two fixed levels of none or few.
• Other things equal, the log odds for urban women of having a CEB3 value above either of the two fixed levels are 1.14 lower than for rural women.
• Other things equal, the log odds for Han women of having a CEB3 value above either of the two fixed levels are 0.66 lower than for minority women.
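The two fit statistics on the Model Fit slide are linked through the log-likelihoods of the null and full models. The sketch below assumes that the reported pseudo R² is McFadden's, 1 − LF/L0 (where L0 and LF are the log-likelihoods of the null and full models), and back-calculates the two log-likelihoods implied by the quoted figures. These implied values are derived here for illustration; they are not read from the Stata output, and the variable names are hypothetical.

```python
# Values quoted in the lecture.
LR_CHI2 = 1429.64    # likelihood-ratio chi-square with 7 degrees of freedom
PSEUDO_R2 = 0.2384   # assumed here to be McFadden's pseudo R-squared

# Two relationships:
#   LR          = (-2 * L0) - (-2 * LF) = 2 * (LF - L0)
#   McFadden R2 = 1 - LF / L0
# Solving them jointly gives the implied log-likelihoods.
ll_null = -LR_CHI2 / (2 * PSEUDO_R2)  # implied L0 (null model)
ll_full = ll_null + LR_CHI2 / 2       # implied LF (full model)

# Round-trip check: recover the quoted statistics from the implied values.
lr_check = (-2 * ll_null) - (-2 * ll_full)
r2_check = 1 - ll_full / ll_null

print(f"implied L0 = {ll_null:.2f}, implied LF = {ll_full:.2f}")
print(f"LR chi2 = {lr_check:.2f}, pseudo R2 = {r2_check:.4f}")
```

The point of the exercise is only that the LR χ² and the pseudo R² carry the same underlying information (L0 and LF) on two different scales: one as a difference, the other as a proportional reduction.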
• Other things equal, the log odds for women with primary school education, junior middle school education, senior middle school education, and college or higher education of having a CEB3 value above either of the two fixed levels are, respectively, 0.31, 0.64, 1.12, and 1.28 lower than for illiterate women.
• Of course, each of these interpretations captures the linear effect of the particular X variable, holding all other variables constant.

Odds Ratios
• Stata's listcoef command will give the odds ratios, along with the ordered logit coefficients.
• Stata will give the percentage change in the odds with the listcoef command, followed by percent after the comma.

[Stata output: listcoef] Odds ratios (e^b) are in the 4th column of data.
[Stata output: listcoef, percent] Percentage changes in the odds (%) are in the 4th column of data.

• With every one-year increase in age, the odds of being in a higher fertility outcome category are 1.19 times as large, i.e., they increase by 19%, holding all other variables constant.
• Urban women have odds of being in a higher CEB3 category that are 68% lower than those of rural women, holding all other variables constant.
• Women with college or higher education have odds of being in a higher fertility category that are 72% lower than those of illiterate women, holding all other variables constant.
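The odds ratios and percentage changes quoted above follow directly from the logit coefficients: the odds ratio is e^b and the percentage change in the odds is 100 × (e^b − 1). A small Python sketch, using the two-decimal coefficients quoted in this lecture (so the results can differ slightly in the last digit from Stata's listcoef output, which uses unrounded estimates); the dictionary names are illustrative:

```python
import math

# Ordered logit coefficients as quoted in the lecture (rounded to two decimals).
COEF = {
    "age": 0.17, "urban": -1.14, "han": -0.66,
    "primary": -0.31, "junior": -0.64, "senior": -1.12, "college": -1.28,
}

odds_ratios = {}
pct_changes = {}
for name, b in COEF.items():
    odds_ratios[name] = math.exp(b)                   # e^b
    pct_changes[name] = 100 * (odds_ratios[name] - 1)  # percentage change in odds

for name in COEF:
    print(f"{name:8s} b={COEF[name]:6.2f}  odds ratio={odds_ratios[name]:.2f}  "
          f"% change={pct_changes[name]:6.1f}")
```

Because of the proportional odds assumption, each odds ratio applies to both splits of the outcome: none versus few or multiple, and none or few versus multiple.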
• What of the relative importance of the effects of these logit coefficients? Which of the seven X variables is the most influential in affecting the odds of a woman being in the next higher category of the fertility variable?
• According to the unstandardized logit coefficients, college is ranked first, followed by urban. The smallest is that of age.

Semi- and fully standardized ordered logit coefficients

[Stata output: standardized ordered logit coefficients]

• Semi-standardized ordered logit coefficients (standardized on the X variable) control for the metric of the X variable: for every one standard deviation increase in age, there is an increase of 1.35 in the log odds (i.e., the logit) of CEB3 being above either of the two fixed levels, that is, the two fixed levels of none or few, holding all other variables constant.

• Ordered logit coefficients standardized on the Y variable: for every increase of one year in age, there is an increase of 0.07 standard deviations in the woman's fertility, holding all other variables constant.
• The fully standardized ordered logit coefficient: for every one standard deviation increase in age, there is an increase of 0.056 standard deviations in the woman's fertility, holding all other variables constant.
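The three standardized coefficients are simple rescalings of the raw coefficient b. In the definitions used by listcoef, the X-standardized coefficient is b × sd(X), the Y-standardized coefficient is b / sd(y*), where y* is the latent fertility variable underlying CEB3, and the fully standardized coefficient is b × sd(X) / sd(y*). The sketch below uses only the rounded figures quoted in this lecture for age; the implied standard deviations are back-calculated here and are not reported values, and the variable names are hypothetical.

```python
# Figures for AGE as quoted in the lecture (all rounded).
B_AGE = 0.17         # raw ordered logit coefficient
B_AGE_STD_X = 1.35   # X-standardized coefficient: b * sd(age)
B_AGE_STD_Y = 0.07   # Y-standardized coefficient: b / sd(y*)

# Standard deviations implied by the rounded figures (back-calculated).
sd_age_implied = B_AGE_STD_X / B_AGE     # sd(age)
sd_ystar_implied = B_AGE / B_AGE_STD_Y   # sd(y*), the latent fertility variable

# Fully standardized coefficient: b * sd(age) / sd(y*).
b_age_std_xy = B_AGE * sd_age_implied / sd_ystar_implied

print(f"implied sd(age) = {sd_age_implied:.2f}")
print(f"implied sd(y*)  = {sd_ystar_implied:.2f}")
print(f"fully standardized coefficient = {b_age_std_xy:.3f}")
```

With these rounded inputs the fully standardized coefficient comes out near 0.56 rather than the 0.056 quoted above, which suggests the quoted figure may have a misplaced decimal point; it is worth checking against the original listcoef output before citing it.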