Session Title: The Case for Brief(er) Measures

Panel Session 666 to be held in Lone Star E on Friday, Nov 12, 4:30 PM to 6:00 PM

Sponsored by the Quantitative Methods: Theory and Design TIG

Chair(s):
Lee Sechrest, University of Arizona, sechrest@email.arizona.edu

Abstract:
It is often assumed that longer measures will be better than shorter ones; that may not always be the case. Determining how measures might be shortened without appreciable cost to reliability or validity would be of great potential value to program evaluators and other researchers. Instances of the utility, and even superiority, of single-item measures have been identified, and the principles underlying them have been described. Very brief scales have been developed through intensive data analysis, often yielding better predictions of criteria than are possible with the full scales. Moreover, similar methods can be effective in reducing even very large omnibus measures and sets of measures to a small subset of items or scales that can, by regression methods, effectively reproduce the information in the total set. Describing these approaches and methods and illustrating their applications will be the focus of this panel.
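For context, the usual psychometric rationale behind the "longer is better" assumption is the Spearman-Brown prophecy formula, which predicts how reliability changes with test length. The sketch below uses illustrative numbers only; it is not taken from the panel.

```latex
% Spearman-Brown prophecy formula: predicted reliability of a test whose
% length is changed by a factor k, given the original reliability rho.
\rho_k = \frac{k\,\rho}{1 + (k - 1)\,\rho}
% Illustrative example (not from the panel): shortening a scale with
% rho = .80 to half its length (k = 0.5) gives
% \rho_k = (0.5 \times 0.80) / (1 - 0.5 \times 0.80) = .40 / .60 \approx .67
```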

Single Item Measures

Lee Sechrest, University of Arizona, sechrest@email.arizona.edu

Single-item measures are commonplace in social science research: sex, age, marital status, education, income, and many other variables are routinely assessed by single items. Rossiter has observed that single items may be better than multiple items when the characteristic being measured can be conceptualized as concrete and singular. Characteristics such as age, marital status, and income fit those requirements, but so do many other characteristics of interest in program evaluation and in social science more generally. Nicotine addiction, medical conditions (e.g., erectile dysfunction), liking for various “objects,” intentions, and some attitudes, among other characteristics, have been found to be assessed adequately, and sometimes better, by single-item measures. In planning evaluations, careful consideration should be given to whether the characteristics to be assessed can be classified as concrete and singular and are therefore potentially quantifiable by a single item.

Very Brief Scales: Carved Out Scales

Patrick McKnight, George Mason University, pmcknigh@gmu.edu

Evaluators long for short, psychometrically sound instruments for all applications because shorter measures reduce respondent burden and missing data. Psychometric theory, however, dictates the opposite: we must increase the length of our measures to improve reliability and validity. Previous research indicates that some short measures, even single items, provide predictive validity equal to or greater than that of longer versions. These findings led us to create a procedure to test empirically whether shorter versions of a measure can provide equal or better predictive validity than the longer versions. The procedure randomly generates small item subsets and compares their predictive (criterion) validity via a genetic algorithm, an iterative multiple-comparison approach. Only the best-performing subsets survive each comparison round, so that some subsets "win" after several hundred rounds. The purpose of this talk is to demonstrate the procedure and show that in many cases shorter measures do outperform longer measures.
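A minimal sketch of this kind of subset search, written in Python with NumPy. The data, fitness function (squared correlation between a subset's sum score and the criterion), population size, and generation count are illustrative assumptions, not the presenters' actual procedure or settings.

```python
# Sketch of a genetic-algorithm search for a small item subset whose simple
# sum score predicts an external criterion. All settings are illustrative.
import numpy as np

rng = np.random.default_rng(0)

def fitness(subset, X, y):
    """Squared correlation between the subset's sum score and the criterion."""
    sum_score = X[:, subset].sum(axis=1)
    return np.corrcoef(sum_score, y)[0, 1] ** 2

def mutate(subset, n_items):
    """Swap one chosen item for a randomly drawn unused item."""
    subset = subset.copy()
    unused = np.setdiff1d(np.arange(n_items), subset)
    subset[rng.integers(subset.size)] = rng.choice(unused)
    return subset

def crossover(a, b, size):
    """Draw a child subset from the union of two parent subsets."""
    return rng.choice(np.union1d(a, b), size=size, replace=False)

def ga_select(X, y, size=5, pop=60, generations=300):
    """Evolve item subsets; only the best-performing half survives each round."""
    n_items = X.shape[1]
    population = [rng.choice(n_items, size=size, replace=False) for _ in range(pop)]
    for _ in range(generations):
        scores = np.array([fitness(s, X, y) for s in population])
        survivors = [population[i] for i in np.argsort(scores)[-pop // 2:]]
        children = []
        while len(survivors) + len(children) < pop:
            i, j = rng.choice(len(survivors), size=2, replace=False)
            children.append(mutate(crossover(survivors[i], survivors[j], size), n_items))
        population = survivors + children
    best = max(population, key=lambda s: fitness(s, X, y))
    return np.sort(best), fitness(best, X, y)

# Toy data: 200 respondents, a 40-item scale, and an external criterion that
# depends mainly on the first 8 items.
X = rng.normal(size=(200, 40))
y = X[:, :8].mean(axis=1) + rng.normal(scale=0.5, size=200)
best_items, r_squared = ga_select(X, y, size=5)
print("retained items:", best_items, "criterion R^2:", round(r_squared, 3))
```

Truncation selection (keeping the best half each round) is just one way to realize "only the best-performing subsets survive"; tournament selection or another scheme would serve the same role.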

Abbreviated Omnibus Measures

Mei-kuang Chen, University of Arizona, kuang@email.arizona.edu

Yarkoni has recently shown that applying a genetic algorithm to item selection made it possible to reduce the 500 items on eight “broadband” personality measures, constituting 200 separate scales, to a total of 181 items that correlated quite highly with the original scales. The method is computer (and data) intensive, but it has substantial potential importance and wide applicability in many research settings, including program evaluation. Analyses emulating genetic algorithms show that great compression can be achieved in evaluation measurement when the number of variables is reasonably large and many measures are at least modestly inter-correlated. Such analyses will have to be carried out on measures and data sets of general interest so that investigators can use the results to plan and carry out their own work.
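A minimal sketch of the regression idea described above, using simulated data with a simple factor structure so that items are inter-correlated. The retained item pool here is fixed arbitrarily (every fourth item) rather than chosen by Yarkoni's genetic algorithm, and all names and numbers are illustrative assumptions.

```python
# Sketch of how a small retained item pool can reproduce full-scale scores
# by regression. The data, scale definitions, and retained items are all
# illustrative, not Yarkoni's materials or results.
import numpy as np

rng = np.random.default_rng(1)
n_respondents, n_items = 500, 60

# Three latent traits each drive a block of 20 items, so items within a
# block are inter-correlated (the condition noted in the abstract).
traits = rng.normal(size=(n_respondents, 3))
loadings = np.repeat(np.eye(3), 20, axis=0)
X = traits @ loadings.T + 0.6 * rng.normal(size=(n_respondents, n_items))

# The "original" instrument scores three scales as unit-weighted sums.
scales = {"A": slice(0, 20), "B": slice(20, 40), "C": slice(40, 60)}
full_scores = {name: X[:, idx].sum(axis=1) for name, idx in scales.items()}

# A retained pool of 15 items (every fourth item, purely for illustration;
# in practice the pool would come from an item-selection algorithm).
retained = np.arange(0, n_items, 4)

# Estimate regression weights on one half of the sample, evaluate on the other.
half = n_respondents // 2
train, test = slice(0, half), slice(half, None)

for name, score in full_scores.items():
    Z_train = np.column_stack([np.ones(half), X[train][:, retained]])
    weights, *_ = np.linalg.lstsq(Z_train, score[train], rcond=None)
    Z_test = np.column_stack([np.ones(n_respondents - half), X[test][:, retained]])
    reproduced = Z_test @ weights
    r = np.corrcoef(reproduced, score[test])[0, 1]
    print(f"Scale {name}: {retained.size} items reproduce the full score at r = {r:.2f}")
```

Splitting the sample keeps the reported correlations honest: the regression weights are estimated on one half and evaluated on the other, mirroring the cross-validation one would want before using an abbreviated set in a new study.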