
Re: Nagelkerke's R^2 as estimator of goodness of fit

Dear Pragati,

Nagelkerke's R^2 is a reasonable choice though probably not the optimal R^2 measure (see DeMaris, A. (2002). Explained variance in logistic regression - A Monte Carlo study of proposed measures. Sociological Methods & Research, 31(1), 27-74.).
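As a minimal sketch of how the Nagelkerke formula quoted in the original message below can be evaluated: L is the likelihood of the fitted model and L0 the likelihood of the intercept-only (null) model, which predicts the same probability, the observed proportion of 1-responses, for every trial. Working with log-likelihoods avoids numerical underflow for large n. The function name and arguments are illustrative assumptions; the sketch assumes binary responses coded 0/1, fitted probabilities strictly between 0 and 1, and at least one response of each kind.

```python
import math

def nagelkerke_r2(y, p_hat):
    """Nagelkerke's R^2 from binary outcomes y (0/1) and fitted probabilities p_hat."""
    n = len(y)
    n_pos = sum(y)
    # Log-likelihood of the fitted model
    ll_model = sum(math.log(p) if yi == 1 else math.log(1.0 - p)
                   for yi, p in zip(y, p_hat))
    # Log-likelihood of the intercept-only model (constant probability p0)
    p0 = n_pos / n
    ll_null = n_pos * math.log(p0) + (n - n_pos) * math.log(1.0 - p0)
    # Numerator: 1 - (L0/L)^(2/n); denominator: 1 - L0^(2/n)
    numerator = 1.0 - math.exp(2.0 * (ll_null - ll_model) / n)
    denominator = 1.0 - math.exp(2.0 * ll_null / n)
    return numerator / denominator
```

A model that predicts the same constant probability as the null model gives a value of 0, and a model whose fitted probabilities almost perfectly match the responses gives a value close to 1.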

What you might think about is using two alternative approaches to assessing goodness of fit.

First, you could conduct a *test for global goodness of fit*. This test shows whether your model performs significantly worse than the so-called "saturated model", which uses the maximal number of free parameters (cf. Agresti, A. (2002). Categorical data analysis (2nd ed.). New York, NY: Wiley). In other words, it tells you whether there exists a model providing a better description of the data than the model you used. If this is not the case (i.e., if the test is non-significant, say p-value greater than 0.1), then your model is a very reasonable choice! If you conduct this test, make sure to use the correct variant if your data are sparse, that is, if the number of different covariate combinations you studied is not much smaller than the number of subjects (cf. Hosmer, D. W., Hosmer, T., Le Cessie, S., & Lemeshow, S. (1997). A comparison of goodness-of-fit tests for the logistic regression model. Statistics in Medicine, 16(9), 965-980).
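One common form of such a test is the deviance (likelihood-ratio) test against the saturated model. The following is only a sketch for grouped, non-sparse data, where each covariate combination was presented many times; it is not the variant needed for sparse data. The function names, the grouping of the data, and the hand-written chi-square tail function (valid for integer degrees of freedom) are assumptions made for illustration. Fitted probabilities must lie strictly between 0 and 1.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{j < df/2} (x/2)^j / j!
        term, total = 1.0, 1.0
        for j in range(1, df // 2):
            term *= (x / 2.0) / j
            total += term
        return math.exp(-x / 2.0) * total
    # Odd df: erfc(sqrt(x/2)) + 2*phi(sqrt(x)) * sum_r x^(r-1/2) / (1*3*...*(2r-1))
    total = 0.0
    term = math.sqrt(x)
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    phi = math.exp(-x / 2.0) / math.sqrt(2.0 * math.pi)
    return math.erfc(math.sqrt(x / 2.0)) + 2.0 * phi * total

def deviance_gof(successes, trials, fitted_p, n_params):
    """Deviance G^2 of a fitted binomial model against the saturated model.

    successes[i] of trials[i] responses were 1 at covariate combination i,
    and fitted_p[i] is the probability predicted by the model there.
    Returns (G^2, degrees of freedom, p-value).
    """
    g2 = 0.0
    for y, n, p in zip(successes, trials, fitted_p):
        if y > 0:
            g2 += y * math.log(y / (n * p))
        if y < n:
            g2 += (n - y) * math.log((n - y) / (n * (1.0 - p)))
    g2 *= 2.0
    df = len(trials) - n_params  # saturated model has one parameter per group
    return g2, df, chi2_sf(g2, df)
```

A large p-value means the fitted model is not significantly worse than the saturated model; a small one indicates lack of fit.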

The second alternative measure that is useful for assessing goodness of fit for a logistic regression model is the *predictive power*, which provides information about the degree to which the predicted probabilities are concordant with the observed outcome. Statistics software like SAS PROC LOGISTIC by default supplies this information in terms of the area under the ROC curve. Again, a detailed description of this approach can be found in Agresti, A. (2002): Categorical data analysis.
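The area under the ROC curve equals the proportion of concordant pairs: among all pairs of one trial with response 1 and one trial with response 0, the fraction in which the response-1 trial received the higher predicted probability, with ties counted as one half. A minimal sketch of that computation follows; the function name is an illustrative assumption, and the simple double loop is meant for clarity rather than speed.

```python
def roc_auc(y, p_hat):
    """Area under the ROC curve via pairwise concordance (ties count 0.5)."""
    pos = [p for yi, p in zip(y, p_hat) if yi == 1]
    neg = [p for yi, p in zip(y, p_hat) if yi == 0]
    if not pos or not neg:
        raise ValueError("need at least one response of each kind")
    concordant = 0.0
    for pp in pos:
        for pn in neg:
            if pp > pn:
                concordant += 1.0
            elif pp == pn:
                concordant += 0.5
    return concordant / (len(pos) * len(neg))
```

A value of 0.5 corresponds to chance-level prediction and a value of 1 to perfect separation of the two response categories.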

For examples of how to apply these two approaches to data see Dittrich, K., & Oberfeld, D. (2009). A comparison of the temporal weighting of annoyance and loudness. Journal of the Acoustical Society of America, 126(6), 3168-3178. or Oberfeld, D., & Plank, T. (2011). The temporal weighting of loudness: Effects of the level profile. Attention, Perception & Psychophysics, 73(1), 189-208.

In case you need help in implementing these approaches using software, you are welcome to contact me again.



Date:    Thu, 10 May 2012 16:19:34 +0530
From:    Pragati Rao <pragatir@xxxxxxxxx>
Subject: Nagelkerke's R^2 as estimator of goodness of fit


Dear all,

After many suggestions on how to fit the data (for the question GLM vs Cubic
Smoothing Spline) and reading the articles suggested by members, I am now
using maximum likelihood for logistic regression to fit the data. As I
remember reading, the usual R^2 is not a good way to comment on goodness of
fit for logistic regression. So Nagelkerke's R^2 should be used. I am using
the following formula to calculate Nagelkerke's R^2.

R^2=[1- (L0/L)^(2/n)]/ [1-L0^(2/n)]

1. I wanted to know whether L0 is the likelihood of observed data if the
estimator predicted constant probability irrespective of input (vot, f2

2. I have attached two figures where this method was used to estimate the
fit. For figure VOT_hin_sub9 the Nagelkerke R^2 value is 0.9676 and for
the figure VOT_hin_sub15, it is 0.465. I wanted to know if the goodness of
fit is reflected accurately in values of R^2?

Any suggestions/comments are welcome.


Pragati Rao
Research Officer,
All India Institute of Speech and Hearing,
Mysore, India.

Dr. habil. Daniel Oberfeld-Twistel
Johannes Gutenberg - Universitaet Mainz
Department of Psychology
Experimental Psychology
Wallstrasse 3
55122 Mainz

Phone ++49 (0) 6131 39 39274
Fax   ++49 (0) 6131 39 39268