Evaluation 2008



Session Title: Core Quantitative Issues: New Developments in Outcome Measures
Panel Session 515 to be held in Centennial Section B on Friday, Nov 7, 9:15 AM to 10:45 AM
Sponsored by the Quantitative Methods: Theory and Design TIG
Chair(s):
Manuel C Voelkle,  University of Mannheim,  voelkle@rumms.uni-mannheim.de
Abstract: The session gives an overview of important new developments in constructing, analyzing, and communicating outcome measures in evaluation research. It contains methodologically oriented presentations as well as examples from current evaluation studies, and is thus expected to be of interest to methodologically oriented and applied researchers alike. The first presenter, Lee Sechrest, talks about the importance of properly calibrating outcome measures in order to make them meaningful for policy makers and practitioners. Andrés Steffanowski proposes a new individualized outcome measure in rehabilitation research by combining information on status and change. Manuel Voelkle addresses the more general issue of emergent versus latent outcome measures in longitudinal designs. Mende Davis demonstrates the use of Rasch modeling to construct a measure of academic and professional success in mathematics, and illustrates its use in an outcome evaluation. Finally, Werner Wittmann discusses different approaches to synthesizing outcome measures from large-scale evaluation studies.
Properly Calibrated Measures
Lee Sechrest,  University of Arizona,  sechrest@u.arizona.edu
The measures used in social science are often expressed in metrics that have no intrinsic meaning. Even measures of effect size are often not interpretable in any direct way. Persistent efforts should be exerted toward calibrating measures for meaning so that the effects of interventions can be described in ways that make sense to policy makers and practitioners. Unfortunately, very few such efforts have been mounted, and none systematically. It is easy, however, to exemplify the need and to illustrate the possibilities by reference to existing work. Examples also show why calibrated measures would be more persuasive and likely to result in implementation of effective interventions. Calibration research is an activity distinct from the development of basic measuring tools and need not interfere in any way with the production of evaluation measurement tools based on sound theory, state of the art methods, and pragmatic concerns for their implementation in the field.
Individualized Outcome Measures in Pre-Post-Studies: Combining Information on Status and Change
Andrés Steffanowski,  University of Mannheim,  andres@steffanowski.de
Manuel C Voelkle,  University of Mannheim,  voelkle@rumms.uni-mannheim.de
Social interventions pursue two goals: first, to cure (or at least ameliorate) dysfunctional states (rehabilitation) and, second, to maintain desirable states (prevention). Given these two aspects, how is a fair outcome evaluation possible? The traditional approach has been to compute pre-post difference scores. This, however, addresses only the first goal, not the second. In other words, maintaining a dysfunctional state and maintaining a healthy state would both result in a zero effect size (d = 0), and zero effects are typically interpreted as 'no success'. Accordingly, when only pre-post difference scores are used in outcome evaluation, the overall effect can be severely underestimated. To deal with this problem, an alternative measure has been developed that combines z-standardized pre-post difference scores with z-standardized post-status scores for each item. The procedure is illustrated using a dataset of N = 858 psychosomatic inpatients, with results indicating high reliability and good validity of the new outcome measure.
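The core idea of the combined measure can be sketched in a few lines. The following is a minimal illustration, not the authors' actual scoring procedure: the data are simulated, and the equal weighting of the two z-scores is an assumption.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated pre/post scores for an illustrative sample of 200 patients
# (higher = healthier; all values are made up for this sketch).
pre = rng.normal(50, 10, size=200)
post = pre + rng.normal(5, 8, size=200)   # some average improvement

def zscore(x):
    # z-standardize: mean 0, standard deviation 1
    return (x - x.mean()) / x.std(ddof=1)

# Combined outcome: z-standardized pre-post change plus z-standardized
# post status, here averaged with equal weights (an assumption).
change_z = zscore(post - pre)
status_z = zscore(post)
combined = (change_z + status_z) / 2
```

A patient who enters healthy and stays healthy shows near-zero change but a high post status, so the combined score credits the maintained state, whereas a raw difference score would report d = 0 and suggest 'no success'.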
Emergent Versus Latent Outcome Measures in Longitudinal Analysis
Manuel C Voelkle,  University of Mannheim,  voelkle@rumms.uni-mannheim.de
Andrés Steffanowski,  University of Mannheim,  andres@steffanowski.de
When designing and analyzing outcome criteria, it is important to distinguish between emergent (i.e., formative) and latent (i.e., reflective) measures. While the former have typically been analyzed using methods of variance decomposition, the latter have a factor-analytic tradition (Cole, Martin, & Steiger, 2005). In this presentation the distinction is reviewed for the analysis of longitudinal data. It is argued that approaching the distinction from the perspective of latent growth curve modeling, as a general data-analytic system for the analysis of change, has several methodological, didactic, and statistical advantages. All arguments are illustrated by our research on quality monitoring in ambulatory psychotherapy, and results of a short Monte Carlo simulation are presented to evaluate the underlying assumptions specific to the analysis of change.
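To make the latent growth curve idea concrete, here is a minimal Monte Carlo sketch of a linear growth model. It is not the simulation from the presentation; all parameter values, the number of waves, and the simple per-person OLS recovery step are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

# Linear latent growth model: each person i has a latent intercept and
# slope; observed scores are y_it = intercept_i + slope_i * t + error.
n, waves = 500, 4
t = np.arange(waves)                      # measurement occasions 0..3
intercept = rng.normal(50, 5, size=n)     # latent initial status
slope = rng.normal(2.0, 0.5, size=n)      # latent rate of change (mean 2.0)
y = intercept[:, None] + slope[:, None] * t + rng.normal(0, 2, size=(n, waves))

# Recover each person's slope by OLS on time; with balanced data the
# average of these estimates recovers the mean of the latent slope factor.
est_slopes = np.polyfit(t, y.T, deg=1)[0]
mean_slope = est_slopes.mean()            # close to the true value of 2.0
```

In practice the same model would be fit as a structural equation model with latent intercept and slope factors; the per-person regression shown here is only the simplest way to see what the slope factor represents.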
Measuring the Educational Pipeline
Mende Davis,  University of Arizona,  mfd@u.arizona.edu
The measurement of change is often limited to a single outcome variable, even when multiple measures have been collected. Relying on a single outcome measure can decrease power and reduce the likelihood of detecting the effect of an intervention. Combining multiple measures into a scale to measure change may result in greater sensitivity to intervention effects. This possibility is demonstrated through the development of a scale to measure academic and professional progress in graduate school. An educational pipeline scale can incorporate multiple types of indicators, multiple sources of data, and even processes that play out over time. Rasch analyses are used in scale development, and the resulting scale is demonstrated in a program evaluation.
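The Rasch model underlying such a scale can be sketched briefly. The milestone difficulties and sample below are hypothetical, invented for illustration; they are not the indicators or estimates from the evaluation.

```python
import numpy as np

rng = np.random.default_rng(2)

def rasch_prob(theta, b):
    """Rasch model: probability that a person with latent trait theta
    passes an item (here, a pipeline milestone) with difficulty b."""
    return 1.0 / (1.0 + np.exp(-(theta - b)))

# Five hypothetical pipeline milestones, ordered from easiest to hardest
# (difficulty values are illustrative assumptions).
difficulties = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])

# Simulate 1000 students' latent "pipeline progress" and their
# pass/fail record on each milestone under the Rasch model.
theta = rng.normal(0, 1, size=1000)
responses = rng.random((1000, 5)) < rasch_prob(theta[:, None], difficulties)

# Under the Rasch model the raw sum score is a sufficient statistic for
# theta, so the count of milestones passed orders students on one scale.
sum_scores = responses.sum(axis=1)
```

Because items of different type and source are placed on the same difficulty continuum, a single interval-like score can track progress even when the underlying indicators are heterogeneous.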
Outcome Measurement and Meta-Analysis, What are the Most Adequate Effect Sizes?
Werner Wittmann,  University of Mannheim,  wittmann@tnt.psychologie.uni-mannheim.de
The majority of programs we evaluate are complex ones, i.e., intervention packages. Fair evaluations therefore need a set of different outcome measures. When applying meta-analysis to synthesize the effects of such programs, we have several options. One is to compute the effect size for each single outcome measure and average them, or to report them as an outcome profile that visualizes the level, scatter, and shape of what the program produced. Another strategy is to consider the redundancy of the measures and to reduce them via aggregation or factor analysis. The effect sizes resulting from the latter should be higher than the average of the single-outcome effect sizes. Data from large-scale program evaluations using such multiple-act outcome criteria are used to illustrate the differences. Meta-analyses with no access to the intercorrelations of the outcomes will underestimate the effects of programs.
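Why aggregation raises the effect size can be shown with the standard formula for an equally weighted composite of k standardized outcomes with average effect size d and average intercorrelation r. This is a generic textbook result, sketched here with made-up numbers, not the data from the evaluations discussed.

```python
import numpy as np

def composite_d(mean_d, k, mean_r):
    """Effect size of an equally weighted composite of k standardized
    outcomes, each with effect size mean_d and average intercorrelation
    mean_r: d_composite = mean_d * sqrt(k / (1 + (k - 1) * mean_r))."""
    return mean_d * np.sqrt(k / (1 + (k - 1) * mean_r))

# Illustrative numbers: five outcomes, each d = 0.30, intercorrelated
# at r = 0.40. The composite effect size exceeds the single-outcome d.
d_single = 0.30
d_agg = composite_d(d_single, k=5, mean_r=0.40)
print(round(d_agg, 2))  # prints 0.42, versus 0.30 for the average outcome
```

The gain shrinks as the intercorrelation grows (with r = 1 the outcomes are fully redundant and the composite d equals the single-outcome d), which is why a meta-analysis that ignores the intercorrelations cannot recover the effect of the program as a whole.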

