
Session Title: Issues in Measuring Reliability and Retention
Multipaper Session 766, to be held in the International Room on Saturday, November 10, 10:30 AM to 12:00 PM
Sponsored by the Quantitative Methods: Theory and Design TIG
Chair(s):
Brian Dates,  Southwest Counseling Solutions,  bdates@swsol.org
Factors Affecting the Behavior of Interrater Reliability Statistics
Presenter(s):
Brian Dates,  Southwest Counseling Solutions,  bdates@swsol.org
Jason King,  Baylor College of Medicine,  jasonk@bcm.tmc.edu
Abstract: Many evaluators are unaware of recent developments in measures of inter-rater agreement and are familiar only with the earlier kappa statistic. Interpretation of kappa is limited by its dependence on marginal probabilities and trait prevalence. Although alternatives have been developed, they have not been widely implemented in statistical computing packages or systematically studied. We recently (Authors, 2007) developed user-friendly SPSS syntax for this purpose, which can calculate inter-rater reliability estimates for any number of raters and response categories. In the present paper, we report results from a series of Monte Carlo simulations comparing the performance of Scott's pi, Cohen's kappa, Conger's kappa, and Gwet's AC1 statistic across a range of conditions. Application to practice is emphasized.
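For readers less familiar with these statistics, the following minimal two-rater Python sketch (not the authors' SPSS syntax) shows how each index corrects observed agreement with a different chance-agreement model; Conger's kappa, the multi-rater generalization of Cohen's kappa, is omitted for brevity. The ratings are hypothetical and chosen only to illustrate how a highly prevalent category depresses Cohen's kappa and Scott's pi while leaving Gwet's AC1 high.

```python
# Minimal sketch (not the authors' SPSS syntax): chance-corrected agreement
# statistics for two raters, illustrating how kappa depends on the marginals.
from collections import Counter


def agreement_stats(ratings_a, ratings_b):
    """Observed agreement, Cohen's kappa, Scott's pi, and Gwet's AC1 for two
    raters who classified the same set of items."""
    n = len(ratings_a)
    categories = sorted(set(ratings_a) | set(ratings_b))
    k = len(categories)

    # Observed agreement: proportion of items on which the two raters agree.
    p_o = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n

    # Marginal category proportions for each rater, and their average.
    count_a, count_b = Counter(ratings_a), Counter(ratings_b)
    p_a = {c: count_a[c] / n for c in categories}
    p_b = {c: count_b[c] / n for c in categories}
    p_avg = {c: (p_a[c] + p_b[c]) / 2 for c in categories}

    # Each statistic corrects p_o with a different chance-agreement model.
    pe_cohen = sum(p_a[c] * p_b[c] for c in categories)   # product of marginals
    pe_scott = sum(p_avg[c] ** 2 for c in categories)     # squared average marginals
    pe_gwet = sum(p_avg[c] * (1 - p_avg[c]) for c in categories) / (k - 1)

    return {
        "observed agreement": p_o,
        "Cohen's kappa": (p_o - pe_cohen) / (1 - pe_cohen),
        "Scott's pi": (p_o - pe_scott) / (1 - pe_scott),
        "Gwet's AC1": (p_o - pe_gwet) / (1 - pe_gwet),
    }


if __name__ == "__main__":
    # Hypothetical ratings with one highly prevalent category: raw agreement
    # is 92%, yet kappa is depressed by the skewed marginals while AC1 is not.
    rater_1 = ["yes"] * 90 + ["no"] * 2 + ["yes"] * 4 + ["no"] * 4
    rater_2 = ["yes"] * 90 + ["no"] * 2 + ["no"] * 4 + ["yes"] * 4
    print(agreement_stats(rater_1, rater_2))
```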
A General Method for Estimating the Reliability of High-stakes Educational Decisions
Presenter(s):
Karen Douglas,  International Reading Association,  douglasdouglas@verizon.net
Abstract: Important educational decisions use complex rules to combine information from a number of tests and assessments. It is widely recognized that the reliability and validity of the resulting scores are of central importance in supporting the fairness of such decisions. This paper presents a simulation method for estimating the reliability of conjunctive, complementary, and compensatory decision rules for a target group of tests and students. Results show that the reliability of a decision depends on the type of decision rule used, as well as on the number of tests, test difficulty, and the allowable number of attempts to pass. Through application of the suggested simulation method, policy makers can strive to improve the reliability and equity of high-stakes decisions for all students.
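As a rough illustration of the general approach (not the author's exact procedure), the sketch below assumes a classical test theory model with a hypothetical per-test reliability of 0.85 and estimates decision consistency, that is, the proportion of simulated students who receive the same pass/fail decision on two parallel administrations, under conjunctive, compensatory, and complementary (here treated as pass-any-one) rules. The cut score, number of tests, and reliability are placeholders, and retake/attempt rules are not modeled.

```python
# Illustrative sketch only: decision consistency of pass/fail rules under a
# classical test theory model with an assumed per-test reliability.
import numpy as np

rng = np.random.default_rng(42)


def simulate_decision_consistency(n_students=10_000, n_tests=3,
                                  reliability=0.85, cut=0.0,
                                  rule="conjunctive"):
    """Proportion of simulees receiving the same pass/fail decision on two
    independently simulated administrations of the same battery of tests."""
    # Standardized true scores; error variance chosen so that
    # var(error) = (1 - rho) / rho when var(true) = 1, giving reliability rho.
    true = rng.standard_normal((n_students, n_tests))
    err_sd = np.sqrt((1 - reliability) / reliability)

    def decide(observed):
        if rule == "conjunctive":       # must pass every test
            return (observed >= cut).all(axis=1)
        if rule == "compensatory":      # average across tests must pass
            return observed.mean(axis=1) >= cut
        if rule == "complementary":     # passing any one test suffices
            return (observed >= cut).any(axis=1)
        raise ValueError(rule)

    obs_1 = true + rng.standard_normal((n_students, n_tests)) * err_sd
    obs_2 = true + rng.standard_normal((n_students, n_tests)) * err_sd
    return np.mean(decide(obs_1) == decide(obs_2))


for rule in ("conjunctive", "compensatory", "complementary"):
    print(rule, round(simulate_decision_consistency(rule=rule), 3))
```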
Surveying Nonresponders: Implications for Surveying Methods
Presenter(s):
Jacey Payne,  Howard Research & Management Consulting Inc,  jacey@howardresearch.com
Teresa Roeske,  Howard Research & Management Consulting Inc,  teresa@howardresearch.com
Abstract: When evaluators solicit information from the public, clients, or stakeholders, they may be pleased with a 25% response rate or ecstatic with a 50% response rate. While the results may inspire confidence, the question remains: what did the other half think? This paper focuses on the perspectives of responders versus nonresponders through the example of a government-sponsored evaluation of a responsible gaming program aimed at providing education and raising awareness among Video Lottery Terminal (VLT) retailers. A multi-mode survey of retailers yielded a response rate of approximately 50%, with responses heavily weighted toward program participants. The sponsor's concern about the notably absent perspectives of nonparticipants became the impetus to re-tailor the methodology and subsequently survey known nonresponders. Differences between the two phases of the survey can provide insight into the implications of survey nonresponse, particularly when survey results are assumed to represent a given population.
Calculating Retention With Caution: A Look at How Much Measurement Matters
Presenter(s):
Mary Kay Falconer,  Ounce of Prevention Fund of Florida,  mfalconer@ounce.org
Abstract: Retention of participants in voluntary long-term interventions is a challenge that confronts many practitioners interested in providing support services to high-risk families. Participant retention is often an important moderator of outcome performance in evaluations of these programs. This analysis tests the statistical relationships between multiple predictors and different measures of retention. The participant sample consists of families who enrolled in a large home visiting program in a southern state between January 1 and July 1 of 2004. The predictors in the models tested include a selection of participant characteristics and programmatic experiences that have appeared consistently in the relevant research literature. The retention measures include retention rates at different time points and the number of days of participation derived from date parameters. The analytical techniques applied include linear regression and binary logistic regression. The results provide systematic documentation of the importance of retention measurement in program evaluation and improvement.
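The modeling setup can be sketched as follows with entirely synthetic data; the predictors, coefficients, and retention cutoff are hypothetical placeholders, not the study's actual variables or findings. The sketch mirrors the abstract's pairing of a continuous retention measure (days enrolled, modeled with linear regression) and a dichotomous one (retention at a fixed time point, modeled with binary logistic regression).

```python
# Minimal sketch with synthetic data (hypothetical predictors and effects):
# relating the same predictors to two different operationalizations of
# retention, via linear regression and binary logistic regression.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 2_000

# Hypothetical participant characteristics and programmatic experiences.
first_time_parent = rng.integers(0, 2, n)       # 1 = first-time parent
visits_first_90_days = rng.poisson(6, n)        # early home-visit dosage
maternal_age = rng.normal(24, 5, n)

# Synthetic retention outcomes (arbitrary coefficients, illustration only).
days_enrolled = np.clip(
    120 + 25 * visits_first_90_days + 15 * first_time_parent
    + 2 * (maternal_age - 24) + rng.normal(0, 90, n),
    0, 730,
)
retained_12_months = (days_enrolled >= 365).astype(int)

X = sm.add_constant(np.column_stack(
    [first_time_parent, visits_first_90_days, maternal_age]))

# Continuous retention measure: ordinary least squares on days enrolled.
ols = sm.OLS(days_enrolled, X).fit()
# Dichotomous retention measure: logistic regression on 12-month retention.
logit = sm.Logit(retained_12_months, X).fit(disp=0)

print(ols.params)
print(logit.params)
```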