Uploaded by Fanny Sylvia C.

The response variable is a binomial count, that is, a count of binary outcomes. If $X_i \sim \mathrm{Bernoulli}(p)$ independently for $i = 1, \dots, n$, then

$Y = \sum_i X_i \sim \mathrm{Binomial}(n, p)$

Example 1 Case Study 21.1: Island size and bird extinctions: On each island we count the number of species that went extinct out of all the species on the island. What is the relationship between the area of an island and the probability of extinction of birds present on the island?

Example 2 Case study 21.2: Moth coloration and natural selection: At each distance from

Liverpool we count the number of moths from each morph that were taken by predators. What is

the relationship between the distance from Liverpool, where trees are dark from industrial soot,

and the probability of predation on the light and dark morphs of the moth Carbonaria?

Y = the number of successes in m binomial trials. For example, how many species went

extinct in the 10 year period of the study on each island?

$Y_i \sim \mathrm{Binomial}(m_i, \pi_i)$, where $i$ indexes the islands and $\pi_i$ is the probability of extinction on island $i$.

$X_1, \dots, X_p$: explanatory variables. In the extinction example, X is the area of the island.

Y/m = the binomial proportion. Note that the sample size in the bird extinction study is

the number of islands, not the number of species.

• $\mathrm{logit}(\pi) = \eta = \beta_0 + \beta_1 X_1 + \dots + \beta_p X_p$

• As before: $\pi = \dfrac{e^{\eta}}{1 + e^{\eta}}$
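The link and its inverse can be sketched in a few lines of Python (used here instead of Matlab so the block needs only the standard library). The coefficient values `beta0` and `beta1` are hypothetical and chosen only for illustration.

```python
import math

def logit(p):
    """Log odds of a probability p, with 0 < p < 1."""
    return math.log(p / (1.0 - p))

def inv_logit(eta):
    """Probability pi = e^eta / (1 + e^eta) for a linear predictor eta."""
    # Two algebraically equal forms, chosen to avoid overflow for large |eta|.
    if eta >= 0:
        return 1.0 / (1.0 + math.exp(-eta))
    z = math.exp(eta)
    return z / (1.0 + z)

# Hypothetical coefficients for a single explanatory variable (illustration only).
beta0, beta1 = -1.0, -0.5
eta = beta0 + beta1 * 2.0   # linear predictor at X = 2
pi = inv_logit(eta)         # implied success probability
```

Applying `logit` to the result of `inv_logit` returns the linear predictor, which is a quick sanity check on both functions.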

Continuous versus Counted Proportions:

Not all proportions are appropriate to model with logistic regression. We model proportions like

fat calories/total calories, etc., using normal theory, usually. The only proportions that are

appropriate in this context are those that result from an integer count of a certain outcome over

the total number of trials or outcomes.

Variance

$\mu(Y_i \mid X_{1i}, \dots, X_{pi}) = m_i \pi_i$

$\mathrm{SD}(Y_i \mid X_{1i}, \dots, X_{pi}) = \sqrt{m_i \pi_i (1 - \pi_i)}$
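As a small numerical illustration, the following Python sketch evaluates the mean and standard deviation of a binomial count, $m\pi$ and $\sqrt{m\pi(1-\pi)}$. The values m = 20 and π = 0.2 are made up for the example.

```python
import math

def binomial_count_mean_sd(m, pi):
    """Mean and SD of a Binomial(m, pi) count."""
    mean = m * pi
    sd = math.sqrt(m * pi * (1.0 - pi))
    return mean, sd

# Hypothetical cell: 20 trials with success probability 0.2.
mean, sd = binomial_count_mean_sd(20, 0.2)
```

Unlike the normal model, the spread is tied to the mean: once π is known for a given m, so is the standard deviation.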

As for the binary response model we use the maximum likelihood estimators (MLE’s).

Model Assessment

Estimated versus observed: One way to assess the appropriateness of the model and the efficacy of the estimation routine is to plot the estimated probability, $\hat{\pi}_i$, against the observed response proportion, $\bar{\pi}_i = Y_i / m_i$. Additionally, plots of the observed logits versus one or more of the explanatory variables are useful for visual examination, as with ordinary scatterplots in linear regression. See Display 21.2.

Residual analysis: As in the binary response case, we have two widely used residuals for

binomial counts models.

Residuals

There are two standard ways to define a residual in logistic regression.

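1. Pearson residual = $\dfrac{Y_i - m_i \hat{\pi}_i}{\sqrt{m_i \hat{\pi}_i (1 - \hat{\pi}_i)}}$.

2. Deviance residual = $\mathrm{Dres}_i = \operatorname{sign}(Y_i - m_i \hat{\pi}_i) \sqrt{2 \left\{ Y_i \log\!\left( \dfrac{Y_i}{m_i \hat{\pi}_i} \right) + (m_i - Y_i) \log\!\left( \dfrac{m_i - Y_i}{m_i - m_i \hat{\pi}_i} \right) \right\}}$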

The Pearson residual is more easily understood, but the deviance residual directly gives the

contribution of each point to the lack of fit of the model.
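Both residuals are easy to compute once the fitted probabilities are available. The sketch below is a minimal Python version. The cell used in the example (3 successes out of 6) matches the smallest island in the data, but the fitted probability 0.25 is a made-up value, not a model output.

```python
import math

def pearson_residual(y, m, pi_hat):
    """(observed - fitted count) / estimated SD of the count."""
    return (y - m * pi_hat) / math.sqrt(m * pi_hat * (1.0 - pi_hat))

def _xlogy(a, b):
    """a * log(a / b), with the usual convention that 0 * log(0) = 0."""
    return 0.0 if a == 0 else a * math.log(a / b)

def deviance_residual(y, m, pi_hat):
    """Signed square root of this cell's contribution to the deviance."""
    fitted = m * pi_hat
    d2 = 2.0 * (_xlogy(y, fitted) + _xlogy(m - y, m - fitted))
    sign = 1.0 if y >= fitted else -1.0
    return sign * math.sqrt(max(d2, 0.0))  # max() guards against rounding below 0

# 3 extinctions out of 6 species, with a hypothetical fitted probability of 0.25.
pres = pearson_residual(3, 6, 0.25)
dres = deviance_residual(3, 6, 0.25)
```

The two residuals have the same sign and are usually close in size. They differ most when the fitted count is small.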

Since the data are grouped, the residuals in a binomial counts logistic regression (either Pearson

or deviance) are more useful than in the binary response regression.

The residuals should be plotted against the predicted values $\hat{\pi}_i$ and examined for outliers or remaining patterns.

As with the binary response case, we use the value –2ln(Maximized likelihood function) to compare models. Recall that the MLE’s of the $\beta_i$’s are the values that maximize the likelihood function of the data. So we find the values of the $\beta_i$’s that maximize the likelihood function, evaluate the likelihood at those values, take the natural log, and multiply by –2.

• The quantity –2 ln(Maximized likelihood) is also called the deviance of a model since

larger values indicate greater deviation from the assumed model. Comparing two nested

models by the difference in deviances is a drop-in-deviance test.

• The difference between the values of –2ln(Maximized likelihood function) for a full and

reduced model has approximately a chi-square distribution if the null hypothesis that the

extra parameters are all 0 is true. The d.f. is the difference in the number of parameters

for the two models.
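The drop-in-deviance test can be sketched in Python as follows. The standard library has no chi-square distribution, so the helper `chi2_sf` (a name chosen here) uses the standard closed-form upper-tail expressions for integer degrees of freedom. The two deviances in the example are hypothetical numbers, not results from the case studies.

```python
import math

def chi2_sf(x, df):
    """Pr(chi-square with integer df >= x), from closed-form series."""
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{r=0}^{df/2 - 1} (x/2)^r / r!
        term, total = 1.0, 1.0
        for r in range(1, df // 2):
            term *= (x / 2.0) / r
            total += term
        return math.exp(-x / 2.0) * total
    # Odd df: two-sided normal tail plus a finite correction sum.
    root = math.sqrt(x)
    tail = math.erfc(root / math.sqrt(2.0))
    density = math.exp(-x / 2.0) / math.sqrt(2.0 * math.pi)
    term, total = root, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    return tail + 2.0 * density * total

def drop_in_deviance(dev_reduced, dev_full, p_reduced, p_full):
    """Test statistic, degrees of freedom and p-value for nested models."""
    stat = dev_reduced - dev_full
    df = p_full - p_reduced
    return stat, df, chi2_sf(stat, df)

# Hypothetical deviances: reduced model with 2 parameters, full model with 3.
stat, df, p_value = drop_in_deviance(45.0, 40.0, 2, 3)
```

A small p-value is evidence that at least one of the extra parameters in the full model is not zero.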

Model Selection

Both AIC and BIC can be used as model selection criteria. As with linear regression models,

they are only relative measures of fit, not absolute measures of fit.

AIC = Deviance + 2p

where p is the number of parameters in the model. Stepwise model selection methods are

available in SPSS using likelihood ratio tests or Wald’s test. The LR methods are preferred.

Other software programs, like S-Plus, have stepwise procedures using AIC or BIC.
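A minimal Python sketch of comparing candidate models by AIC follows. The notes give only the AIC formula, so the BIC form used here, Deviance + p log(n), is the standard definition rather than something stated in the source. The deviances, parameter counts and model names are hypothetical.

```python
import math

def aic(deviance, n_params):
    """AIC = Deviance + 2p."""
    return deviance + 2 * n_params

def bic(deviance, n_params, n_obs):
    """Standard BIC form (not given in the notes): Deviance + p * log(n)."""
    return deviance + n_params * math.log(n_obs)

# Hypothetical candidates: name -> (deviance, number of parameters).
candidates = {"area only": (45.0, 2), "area + extra term": (44.2, 3)}
scores = {name: aic(dev, p) for name, (dev, p) in candidates.items()}
best = min(scores, key=scores.get)  # smaller AIC is preferred
```

In this made-up comparison the extra term lowers the deviance by only 0.8, which does not pay for the penalty of 2, so the smaller model has the lower AIC.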

Goodness-of-fit Tests

Since we have multiple counts per cell, there is a goodness-of-fit test similar to the one in linear regression. We can compare the model in which the log odds (logit) is linear in the parameters to the model in which each cell has a separate mean. So we are comparing the logistic regression model with p parameters to the model with n different parameters, where n is the number of cells (categorical treatment combinations). That is, we are testing the logistic regression model against the

Saturated model: $\mathrm{logit}(\pi_i) = \alpha_i$ (n parameters)

The test statistic is

$D^2 = \sum_i D_i^2 = \sum_i 2 \left\{ Y_i \log\!\left( \dfrac{Y_i}{m_i \hat{\pi}_i} \right) + (m_i - Y_i) \log\!\left( \dfrac{m_i - Y_i}{m_i - m_i \hat{\pi}_i} \right) \right\}$

Both the denominators, $m_i \hat{\pi}_i$ and $m_i - m_i \hat{\pi}_i$, need to be large for the distribution of the test statistic to be approximately $\chi^2_{n-p}$. The p-value for the test is then $\Pr\{ \chi^2_{n-p} \ge D^2 \}$.
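The statistic and its degrees of freedom can be computed as in the Python sketch below. The function also reports the smallest fitted count, so the "large denominators" condition can be checked. The three-cell data set and its fitted probabilities are invented for the example. The p-value is then the upper tail of the chi-square distribution with the returned degrees of freedom.

```python
import math

def _term(a, b):
    """a * log(a / b), taking 0 * log(0) as 0."""
    return 0.0 if a == 0 else a * math.log(a / b)

def deviance_gof(y, m, pi_hat, n_params):
    """Deviance goodness-of-fit statistic D^2, its df, and the smallest
    fitted count among m_i*pi_i and m_i - m_i*pi_i."""
    d2 = 0.0
    smallest = float("inf")
    for yi, mi, pi in zip(y, m, pi_hat):
        fitted = mi * pi
        d2 += 2.0 * (_term(yi, fitted) + _term(mi - yi, mi - fitted))
        smallest = min(smallest, fitted, mi - fitted)
    df = len(y) - n_params  # n cells minus p parameters
    return d2, df, smallest

# Hypothetical data: three cells of 10 trials, fitted by a 2-parameter model.
d2, df, smallest = deviance_gof([3, 5, 8], [10, 10, 10], [0.4, 0.5, 0.7], 2)
```

Here the smallest fitted count is 3, so the chi-square approximation would be doubtful for data like these.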

Wald Test and Confidence Intervals for Single Coefficients.

The Wald test works as it does in the binary response case. The normal approximation used by this test is adequate as long as n is moderately large and each $m_i \pi_i$ is greater than 5.
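For a single coefficient, the Wald statistic is the estimate divided by its standard error, referred to the standard normal distribution. A minimal Python sketch follows. The estimate and standard error are hypothetical, and 1.96 is the usual multiplier for an approximate 95% interval.

```python
import math

def wald_test(estimate, std_error, z_crit=1.96):
    """Wald z statistic, two-sided normal p-value and confidence interval."""
    z = estimate / std_error
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # equals 2 * (1 - Phi(|z|))
    ci = (estimate - z_crit * std_error, estimate + z_crit * std_error)
    return z, p_value, ci

# Hypothetical slope estimate and standard error (not from a fitted model).
z, p_value, ci = wald_test(-0.30, 0.10)
```

The interval is on the logit scale. Exponentiating its endpoints gives an interval for the multiplicative change in the odds per unit increase in the explanatory variable.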

Below is the code for fitting the bird extinction model in Matlab.

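% Columns: island area, number of species at risk, number of species extinct
case2101=[105.8000 67.0000 3.0000
30.7000 66.0000 10.0000
8.5000 51.0000 6.0000
4.8000 28.0000 3.0000
4.5000 20.0000 4.0000
4.3000 43.0000 8.0000
3.6000 31.0000 3.0000
2.6000 28.0000 5.0000
1.7000 32.0000 6.0000
1.2000 30.0000 8.0000
0.7000 20.0000 2.0000
0.7000 31.0000 9.0000
0.6000 16.0000 5.0000
0.4000 15.0000 7.0000
0.3000 33.0000 8.0000
0.2000 40.0000 13.0000
0.0700 6.0000 3.0000];
extinct=case2101(:,3);   % successes: species that went extinct
atrisk=case2101(:,2);    % binomial totals m_i
area=case2101(:,1);      % explanatory variable
% Fit the binomial logistic regression; the response is [successes trials]
[b,dev,stats]=glmfit(area,[extinct atrisk],'binomial');
% Fitted extinction probabilities over a grid of areas (column vector)
x = (1:10:180)';
y = glmval(b,x,'logit');
% Observed proportions (crosses) with the fitted curve (red line)
plot(area,extinct./atrisk,'x',x,y,'r-')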
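For readers without Matlab, the same maximum likelihood fit can be sketched in plain Python with Newton-Raphson on the binomial log-likelihood. This illustrates the estimation idea and does not describe how `glmfit` is implemented. The function name `fit_binomial_logit` and the fixed iteration count are choices made here. The data rows are copied from the Matlab matrix above, and area is used untransformed, as in the Matlab call.

```python
import math

# (area, species at risk, species extinct), copied from the Matlab matrix.
data = [
    (105.8, 67, 3), (30.7, 66, 10), (8.5, 51, 6), (4.8, 28, 3),
    (4.5, 20, 4), (4.3, 43, 8), (3.6, 31, 3), (2.6, 28, 5),
    (1.7, 32, 6), (1.2, 30, 8), (0.7, 20, 2), (0.7, 31, 9),
    (0.6, 16, 5), (0.4, 15, 7), (0.3, 33, 8), (0.2, 40, 13),
    (0.07, 6, 3),
]

def fit_binomial_logit(rows, n_iter=25):
    """MLE of (b0, b1) in logit(pi_i) = b0 + b1 * x_i by Newton-Raphson."""
    b0, b1 = 0.0, 0.0
    for _ in range(n_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, m, y in rows:
            pi = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            w = m * pi * (1.0 - pi)   # binomial variance of the count
            r = y - m * pi            # observed minus fitted count
            g0 += r                   # score for the intercept
            g1 += r * x               # score for the slope
            h00 += w                  # information matrix entries
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system (information) * step = score.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

b0, b1 = fit_binomial_logit(data)
fitted = [1.0 / (1.0 + math.exp(-(b0 + b1 * x))) for x, _, _ in data]
```

At the maximum likelihood estimate both score sums are zero, so the fitted counts add up to the observed total number of extinctions.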
