
Session Title: Reliability: The Beginning of Value
Multipaper Session 857 to be held in Pacific C on Saturday, Nov 5, 9:50 AM to 11:20 AM
Sponsored by the Quantitative Methods: Theory and Design TIG
Chair(s):
Dale Berger,  Claremont Graduate University, dale.berger@cgu.edu
Reducing Rater Bias in Scoring Performance Assessments
Presenter(s):
Robert Johnson, University of South Carolina, rjohnson@mailbox.sc.edu
Min Zhu, University of South Carolina, helen970114@gmail.com
Brandon Loudermilk, University of South Carolina, loudermb@mailbox.sc.edu
Xiaofang Jae, University of South Carolina, jae2008@gmail.com
Ashlee Lewis, University of South Carolina, lewisaa2@mailbox.sc.edu
Abstract: This study examines the use of visual representations of types of scoring bias in training raters to score arts assessments. Initial evidence is mixed as to whether the quality of raters' scores improved when their training incorporated visual representations of scoring bias rather than all-verbal descriptions of bias. Two of the three treatment groups (i.e., raters trained with visual representations of bias) displayed closer agreement with validation scores than did the controls, whereas two of the control groups had higher interrater reliability than the treatment groups. The potential utility of visual representations of bias is reflected in the finding that a majority of raters in the treatment groups recalled more of the types of bias presented visually than of those presented verbally.
Bootstrap Reliability
Presenter(s):
Cristian Gugiu, Western Michigan University, crisgugiu@yahoo.com
Abstract: All measures are imperfect, since no test (or survey) can measure with perfect accuracy or precision. Over the past century, psychometricians have developed a multitude of reliability measures to provide researchers with a simple index that quantifies the reliability of a test. Undoubtedly, the most popular index continues to be coefficient (Cronbach's) alpha. However, this estimator is plagued by a number of issues: it (a) provides only a lower-bound estimate of the true reliability, (b) is interpretable only if the test is unidimensional, (c) has no lower bound itself (i.e., negative estimates are possible), and (d) is appropriate only for continuous data. This paper will introduce a new method for estimating the reliability of a test based on the bootstrap. The bootstrap reliability method is simple to understand and is versatile enough to work with continuous, ordinal, and nonlinear data.
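The abstract does not spell out the estimator, but the general recipe for bootstrapping a reliability coefficient is straightforward: resample respondents with replacement, recompute the coefficient on each resample, and summarize the resulting distribution. The Python sketch below illustrates that general idea using Cronbach's alpha as a stand-in statistic; the function names and the percentile interval are illustrative assumptions, not the method presented in the paper.

    import numpy as np

    def cronbach_alpha(items):
        """Coefficient alpha for an (n_respondents, k_items) score matrix."""
        k = items.shape[1]
        item_vars = items.var(axis=0, ddof=1)       # variance of each item
        total_var = items.sum(axis=1).var(ddof=1)   # variance of total scores
        return k / (k - 1) * (1 - item_vars.sum() / total_var)

    def bootstrap_reliability(items, n_boot=2000, seed=0):
        """Bootstrap distribution of a reliability coefficient (hypothetical sketch).

        Resamples respondents (rows) with replacement, recomputes alpha on each
        resample, and returns the mean and a 95% percentile interval.
        """
        rng = np.random.default_rng(seed)
        n = items.shape[0]
        stats = np.empty(n_boot)
        for b in range(n_boot):
            idx = rng.integers(0, n, size=n)        # resample with replacement
            stats[b] = cronbach_alpha(items[idx])
        return stats.mean(), np.percentile(stats, [2.5, 97.5])

For ordinal items, the same resampling loop could wrap a more suitable coefficient (e.g., one based on a polychoric correlation matrix) in place of alpha.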
Employing Generalizability Theory to Assess the Reliability of Peer Evaluation
Presenter(s):
Mihaiela Gugiu, Central Michigan University, gugiu1mr@cmich.edu
Abstract: Since the early 1990s, a growing movement in education has advocated for the implementation of peer evaluation in teaching. Naturally, this movement has not gone unchallenged by those who question the reliability and validity of peer evaluation. The purpose of this paper is to demonstrate the applicability of Generalizability theory to estimating the reliability and validity of student peer evaluation. I draw on my experience teaching introductory courses in political behavior, in which group projects were used as a means of enhancing student knowledge. College students were asked to evaluate the group presentations of their peers and the written papers of two other groups. Generalizability theory was used to examine the reliability of peer evaluations of the students' written group projects and oral presentations. Moreover, the validity of the peer evaluations was examined by comparing the grades resulting from this method with those produced by the instructor and a graduate teaching assistant.
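The abstract does not name the design or software, but for a fully crossed persons-by-raters G study the core computation reduces to estimating variance components from ANOVA mean squares and forming a generalizability coefficient, Erho^2 = sigma^2(p) / (sigma^2(p) + sigma^2(pr,e) / n_r), for the mean over n_r raters. The Python sketch below shows this one-facet case; the function name and the fully crossed layout are assumptions for illustration, not the analysis reported in the paper.

    import numpy as np

    def g_study(scores):
        """One-facet G study for a fully crossed persons x raters design.

        scores: (n_persons, n_raters) array of ratings (hypothetical layout).
        Returns estimated variance components and the relative G coefficient.
        """
        n_p, n_r = scores.shape
        grand = scores.mean()
        person_means = scores.mean(axis=1)
        rater_means = scores.mean(axis=0)
        ss_p = n_r * ((person_means - grand) ** 2).sum()
        ss_r = n_p * ((rater_means - grand) ** 2).sum()
        ss_pr = ((scores - grand) ** 2).sum() - ss_p - ss_r
        ms_p = ss_p / (n_p - 1)                    # persons mean square
        ms_r = ss_r / (n_r - 1)                    # raters mean square
        ms_pr = ss_pr / ((n_p - 1) * (n_r - 1))    # interaction/error mean square
        var_pr = ms_pr                             # sigma^2(pr,e)
        var_p = max((ms_p - ms_pr) / n_r, 0.0)     # sigma^2(p), truncated at 0
        var_r = max((ms_r - ms_pr) / n_p, 0.0)     # sigma^2(r), truncated at 0
        g_coef = var_p / (var_p + var_pr / n_r)    # relative G coefficient
        return var_p, var_r, var_pr, g_coef

A large rater variance component relative to the person component would flag systematic leniency or severity differences among peer raters, which is exactly the kind of unreliability such a G study is meant to expose.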
