
Re: Nagelkerke's R^2 as estimator of goodness of fit

To: AUDITORY@xxxxxxxxxxxxxxx
Subject: Re: Nagelkerke's R^2 as estimator of goodness of fit
From: Daniel Oberfeld <oberfeld@xxxxxxxxxxxx>
Date: Fri, 11 May 2012 10:07:25 +0200

Dear Pragati,

Nagelkerke's R^2 is a reasonable choice, though probably not the optimal R^2 measure (see DeMaris, A. (2002). Explained variance in logistic regression - A Monte Carlo study of proposed measures. Sociological Methods & Research, 31(1), 27-74).

You might also consider two alternative approaches to assessing goodness of fit.

First, you could conduct a *test for global goodness of fit*. This test will show you whether your model performs significantly worse than the so-called "saturated model", which uses the maximal number of free parameters (cf. Agresti, A. (2002). Categorical data analysis (2nd ed.). New York, NY: Wiley). In other words, it tells you whether there exists a model providing a better description of the data than the model you used. If this is not the case (i.e., if the test is non-significant, say a p-value greater than 0.1), then your model is a very reasonable choice!

If you conduct this test, make sure to use the correct variant in case you have sparse data, in the sense that the number of different covariate combinations you studied is not much smaller than the number of subjects (cf. Hosmer, D. W., Hosmer, T., le Cessie, S., & Lemeshow, S. (1997). A comparison of goodness-of-fit tests for the logistic regression model. Statistics in Medicine, 16(9), 965-980).
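As a rough illustration of such a global test, here is a minimal Python sketch of the deviance (likelihood-ratio) test against the saturated model for grouped data. It is only one of several variants and is not suitable for sparse data, for which the tests compared by Hosmer et al. (1997) are the better choice. The function names, the example counts, and the fitted probabilities are hypothetical; the fitted probabilities would normally come from your own logistic regression.

```python
import math

def chi2_sf(x, df):
    """Upper tail probability of a chi-square distribution (series expansion)."""
    if x <= 0:
        return 1.0
    a, z = df / 2.0, x / 2.0
    # Series for the regularized lower incomplete gamma function P(a, z).
    term = 1.0 / a
    total = term
    for n in range(1, 10000):
        term *= z / (a + n)
        total += term
        if term < total * 1e-15:
            break
    p_lower = total * math.exp(-z + a * math.log(z) - math.lgamma(a))
    return max(0.0, 1.0 - p_lower)

def deviance_gof(successes, trials, fitted, n_params):
    """Deviance test of a fitted logistic model against the saturated model.

    successes[i] and trials[i] are the counts for covariate pattern i,
    fitted[i] is the model's predicted probability for that pattern, and
    n_params is the number of estimated parameters (including the intercept).
    """
    dev = 0.0
    for y, m, p in zip(successes, trials, fitted):
        if y > 0:  # the term is defined as 0 when y == 0
            dev += y * math.log(y / (m * p))
        if y < m:  # the term is defined as 0 when y == m
            dev += (m - y) * math.log((m - y) / (m * (1.0 - p)))
    dev *= 2.0
    df = len(trials) - n_params
    return dev, df, chi2_sf(dev, df)

# Hypothetical example: 5 stimulus levels, 20 trials each, and fitted
# probabilities from a two-parameter (intercept + slope) model.
successes = [1, 4, 11, 17, 19]
trials = [20, 20, 20, 20, 20]
fitted = [0.05, 0.22, 0.55, 0.84, 0.96]
dev, df, p_value = deviance_gof(successes, trials, fitted, n_params=2)
print(f"deviance = {dev:.3f}, df = {df}, p = {p_value:.3f}")
```

A large p-value here means the data give no evidence that the saturated model fits better than the fitted model.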

The second alternative measure that is useful for assessing goodness of fit for a logistic regression model is the *predictive power*, which provides information about the degree to which the predicted probabilities are concordant with the observed outcome. Statistics software like SAS PROC LOGISTIC by default supplies this information in terms of the area under the ROC curve. Again, a detailed description of this approach can be found in Agresti, A. (2002). Categorical data analysis.
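To make the idea of concordance concrete, here is a minimal Python sketch that computes the area under the ROC curve directly from its pairwise definition: the proportion of (event, non-event) pairs in which the event received the higher predicted probability, with ties counted as one half. The function name and the example values are hypothetical, and the quadratic loop is meant for clarity rather than speed.

```python
def roc_auc(outcomes, predicted):
    """Area under the ROC curve via pairwise concordance.

    outcomes are 0/1 responses, predicted are the model's predicted
    probabilities for the same observations.
    """
    events = [p for y, p in zip(outcomes, predicted) if y == 1]
    non_events = [p for y, p in zip(outcomes, predicted) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for pe in events:
        for pn in non_events:
            if pe > pn:
                concordant += 1.0   # event ranked above non-event
            elif pe == pn:
                concordant += 0.5   # tie counts as half
    return concordant / (len(events) * len(non_events))

# Hypothetical example: four observations with predicted probabilities.
print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

A value of 0.5 corresponds to chance-level prediction and 1.0 to perfect separation of the two outcome categories.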

For examples of how to apply these two approaches to data, see Dittrich, K., & Oberfeld, D. (2009). A comparison of the temporal weighting of annoyance and loudness. Journal of the Acoustical Society of America, 126(6), 3168-3178, or Oberfeld, D., & Plank, T. (2011). The temporal weighting of loudness: Effects of the level profile. Attention, Perception & Psychophysics, 73(1), 189-208.

In case you need help in implementing these approaches using software, you are welcome to contact me again.

Best

Daniel

Date:    Thu, 10 May 2012 16:19:34 +0530
From:    Pragati Rao <pragatir@xxxxxxxxx>
Subject: Nagelkerke's R^2 as estimator of goodness of fit

Dear all,

After many suggestions on how to fit the data (for the question GLM vs. cubic
smoothing spline) and reading the articles suggested by members, I am now
using maximum likelihood for logistic regression to fit the data. As I
remember reading, the usual R^2 is not a good way to comment on goodness of
fit for logistic regression, so Nagelkerke's R^2 should be used. I am using
the following formula to calculate Nagelkerke's R^2:

R^2=[1- (L0/L)^(2/n)]/ [1-L0^(2/n)]
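For reference, a minimal Python sketch of this formula, assuming L0 is the likelihood of the intercept-only (null) model, L the likelihood of the fitted model, and n the number of observations. It works with log-likelihoods because the raw likelihoods underflow quickly for realistic n. The function names and the example data are hypothetical.

```python
import math

def log_likelihood(outcomes, probs):
    """Bernoulli log-likelihood of 0/1 outcomes given predicted probabilities."""
    return sum(math.log(p) if y == 1 else math.log(1.0 - p)
               for y, p in zip(outcomes, probs))

def null_log_likelihood(outcomes):
    """Log-likelihood of the intercept-only model (constant probability)."""
    p = sum(outcomes) / len(outcomes)
    return log_likelihood(outcomes, [p] * len(outcomes))

def nagelkerke_r2(loglik_null, loglik_model, n):
    """Nagelkerke's R^2 = [1 - (L0/L)^(2/n)] / [1 - L0^(2/n)]."""
    cox_snell = 1.0 - math.exp((2.0 / n) * (loglik_null - loglik_model))
    upper_bound = 1.0 - math.exp((2.0 / n) * loglik_null)
    return cox_snell / upper_bound

# Hypothetical example: 8 binary responses and made-up fitted probabilities.
outcomes = [0, 0, 0, 1, 0, 1, 1, 1]
fitted = [0.05, 0.10, 0.25, 0.45, 0.55, 0.75, 0.90, 0.95]
ll0 = null_log_likelihood(outcomes)
ll = log_likelihood(outcomes, fitted)
print(nagelkerke_r2(ll0, ll, len(outcomes)))
```

The sketch assumes the outcomes contain both categories; if all responses are identical, the null model already fits perfectly and the denominator is zero.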

1. I wanted to know whether L0 is the likelihood of observed data if the
estimator predicted constant probability irrespective of input (vot, f2
etc)?

2. I have attached two figures where this method was used to estimate the
fit. For figure VOT_hin_sub9 the Nagelkerke R^2 value is 0.9676, and for
the figure VOT_hin_sub15 it is 0.465. I wanted to know if the goodness of
fit is reflected accurately in the values of R^2?

Any suggestions/comments are welcome.

Regards,

Pragati Rao
Research Officer,
All India Institute of Speech and Hearing,
Mysore, India.

--
Dr. habil. Daniel Oberfeld-Twistel
Johannes Gutenberg - Universitaet Mainz
Department of Psychology
Experimental Psychology
Wallstrasse 3
55122 Mainz
Germany

Phone ++49 (0) 6131 39 39274
Fax   ++49 (0) 6131 39 39268
http://www.staff.uni-mainz.de/oberfeld/
