
Panel Comments: Thomas Chapel

I’m Tom Chapel; I’m with the Centers for Disease Control and Prevention. What I am, as mentioned, is the very, very Senior Health Scientist for Evaluation. This is a disclaimer: much as I’d like to be the spokesman on behalf of evaluation, as I always tell people, I’m more of the poster child. I’m the living embodiment of what we’re trying to do with evaluation. As we’ll point out, one of the things lacking in CDC is a strong home base for evaluation, and there are ways that operates to our benefit and ways in which it becomes problematic.

Today I’m going to address the three questions that were set out for us by our panel abstract: who sets evaluation policy in the CDC environment, what can AEA do to influence evaluation policy, and what are some particular points of influence? The first thing to understand is that public health is very decentralized and CDC itself is very decentralized, so the short answer is that there really is no home base or center that sets evaluation policy. There’s a variety of forces, which I’ll go into in a second, that tend to shape how it plays out center by center, program by program.

To the degree that anything frames how we approach it and makes our approach distinctive, I think it would be our CDC evaluation framework. This was a very comprehensive and inclusive effort in the late ’90s to rethink how we did evaluation. It was motivated not so much, or not just, by our desire to do better evaluation, or more evaluation, as by the desire that the evaluations we did would actually make a difference - that they would actually be used for program improvement.

With that in mind, there’s nothing very magical about it, and it probably looks like any other applied framework that you’ve seen, but it was fairly influential in changing how we thought about evaluation - particularly by introducing the idea of using logic models to understand your program, and particularly this idea of engaging stakeholders: thinking broadly about who cared about the program, who could actually make a difference. In our framework, we also adopted the evaluation standards - which we did not create but adopted from others - and that’s been very influential in asserting and forcing the point that there’s no one right evaluation. There’s no one right set of evidence, or way to analyze things; things need to be case-specific. Now, the good side of that is that it’s made evaluation much more case-specific and instance-relevant, and probably led to higher use. The downside is that, consequently, the application of this framework is going to look very different, center by center, program by program, instance by instance, and so it becomes hard to typify CDC evaluation.

Who sets CDC evaluation policy - really, the normative practices, standards of practice, peer pressure to do evaluation in a certain way? In the past we probably had something closer to a home base for evaluation. We probably don’t now. There are several places not contending for that role - I don’t know that anyone’s contending at all, which is perhaps part of the problem - but there are several parts of the organization where this kind of thinking might be housed in the future. As for the forces shaping it [evaluation practice] - meaning, when you look across our centers, programs, and divisions and, of course, our many front-line partners doing public health work, what distinguishes why evaluation looks the way it does - there are five big clumps of forces or influences. I’ll talk a little bit about how AEA can find teachable moments to influence or mold those forces.

(1) The first is that CDC is a science-oriented agency. Its roots are in epidemiology. It started off as a communicable disease center, with a very strong epidemiological emphasis. That’s not routinely the case anymore - a lot of what we do tends to be programmatic - but if I think about how evaluation plays out, and particularly what most consultations across the agency are like, they can be very different in parts of the organization with a very heavy science component, or where they see the production or translation of science as one of their major missions. And that’s because epidemiologists, while they may not bring randomized controlled trial thinking, have their own versions of very rigorous methods for cause, distribution, etc., which are often exactly what the doctor ordered for the sorts of evaluations we do. [But in] other cases [this approach runs] aground on the reality of what we’re trying to do at the front line.

The second issue with science orientation is that we have so heavy an investment in and identification with epidemiology and surveillance that it’s not uncommon - and this was a problem for a while, especially in infectious diseases - for people to see surveillance as evaluation. Surveillance, for those of you who are unfamiliar with public health terminology, doesn’t mean the kind of thing we’re worried about these days; it means the reporting on distal outcomes and reportable diseases: how many people in a community have an STD? What are the risk factors that may be exhibited related to it? How many people have HIV? Those sorts of things. And there’s a tendency to see those… well, first of all, God bless us that we have that distal outcome information, but because we have it, there’s a tendency sometimes to see it as evaluation. And when things are going right - we’re doing well on STDs, we’re doing well on HIV or whatever - it works out swell; it can actually be evaluation. When things are heading south, we lack all that intermediate and short-term outcome information that allows us to improve our programs.

(2) A second thing - and I don’t quite know how it plays out - is that in certain parts of the organization, how evaluation is practiced depends on where evaluators sit, and they sit in one of a couple of places. Sometimes they sit in their own unit, closely allied with the research and science side of the aisle. In other cases, they sit firmly in with the programmatic folk. I don’t know that one is better than the other, but I can say that when I see evaluation practiced at the front-line level with grantees and partners, the influence of where the evaluator sits can often be evident.

(3) The third force - and these are not in order of importance, and George alluded to a lot more of them than I even thought of; as I was sitting there, I was thinking, oh, so that’s where that stuff comes from - is the many, many external forces. Of course, we think of GPRA and PART, but also, for us at CDC, legislative mandates and authorizing legislation for program XYZ, which, of course, then cascade down through us as our RFPs, which go out to our grantees and partners, etc. I would say, in general, that [GPRA/PART] tends to be less influential on how evaluation is practiced at CDC than perhaps it might be in other places. As my friend Michael Schooley - who may be here - said in comments when I asked him to look at my talking points, these things tend to be general in orientation and very specific in practice, but also very different, practice by practice.

So GPRA and PART, for example - as a citizen, I just celebrate those. It’s very hard to dispute the logic of the orientation and motivation behind them. As they’re applied in practice, they’ve tended not to be terribly influential, and also - I think this is very disturbing - they’re not seen as evaluation. While they may influence the practice of data collection and the proof of program performance, it’s not like people then go back and say, “Well, that was really an evaluative experience.”

I once heard a friend say, after going through the PART experience: “This has really been good, because it made me really ask hard questions and I’m going to take that back and think about it.” But that’s the only person I’ve ever heard say that in my time working on PART. Most people think of it as, “Good Lord, what do they want? How can I get this done? How can I meet the needs of this and then move on with my life?” So I’m working hard with the folk who now own the PART and GPRA stuff, because they just lament this fact. They know we’re sitting on all this great information. How is it we can get people to see this as evaluation and useful information?

(4) The structure of public health is a huge factor in this, and I don’t know that we’re unique in this, but I think it is definitely a factor. It’s not only [that] CDC is decentralized; remember, the way public health is practiced in the United States is through a network of about 3,000 local health departments. Of course, money is funneled from us sometimes directly to them, more often through a series of state, territorial, and tribal departments, etc. What that means is that our programs tend to be very, very cognizant - and if they weren’t, our partners would make sure that they were - of the limitations of what you can do at the front-line level. And so what you end up with is a very constrained resource environment: either the lack of skills and capacity to do much more than basic performance measurement, even if we were to ask for something very rigorous in our proposals, or, where people do have that capacity, a resentment that, gosh darnit, I’ve got to divert this money [to evaluation]. Every once in a while, the question of 1%, 5%, 10% going to evaluation comes up, and those proposals often don’t get legs on the programmatic side of the aisle because people understand how it’s going to play out at the front-line level.

(5) Finally - and this is probably last but not least, because it actually is the biggest force, which I’ll talk about when we get to teachable moments - there are CDC’s current strategic planning efforts, which have been going on for a reasonably long period of time. They’ve had various iterations, [and], like all strategic planning efforts, they’re probably taking longer than anticipated. This was an effort by the current director to get us as an organization to align ourselves behind some limited number of goals and objectives, to get our programs to align to those, to try to define best pathways to those goals and objectives, and then to measure that stuff and use it for organized reflection.

There’s a whole bunch of associated measurement with that, and here’s the irony - well, the lament and the irony: it does not, at this point, shape evaluation. It shapes data collection, it shapes reflection, it has everybody running around thinking about what they need to do better, but those are not seen as evaluative questions. And that’s probably the biggest lament I have in my life right now: this performance measurement and metric stuff is growing up - it’s important, but it’s growing up - with a whole different crowd of people than the crowd that I’m used to dealing with.

One of the things I want to talk about, when we talk about how to enter into these conversations at CDC, is how to grab these questions and make people see them as evaluation questions, as paths that have already been tried by evaluators long before the corporate consultants and others came in and had us thinking about these questions.

[On the issue of] helpful input - there’s a whole bunch of stuff that would be useful to us, both as an agency and especially in trying to deal with front-line staff. First off, an endorsement that we are right: the realization that focused evaluation means, by definition, [that] there’s no one right evaluation, but rather an application of standards case by case - a matching of standards and approaches to a situation. Having said that, what that means is that we have many cases where we need really strong, rigorous models and sometimes lack the capacity to do them, but also the need to push the point that models of credible evidence can be quite broad, and models of causal attribution can be quite broad and aren’t always RCTs. The discussions that we’ve had within AEA the last couple of years about causal attribution and experimental models as a gold standard - I think those are right on target for some of the conversations we’ll be having in CDC once the dust clears on goals and objectives and people really start turning an eye towards what it means to have performed well, what it means to prove that we made a difference, an impact.

On the application of PART and GPRA, I think what AEA did in commenting on PART and the ways in which it can be used more effectively was exactly on target. Likewise, the comments AEA made - what George alluded to - on the Department of Education and the desire, the pressure, for randomized controlled trials. I thought that was very, very helpful, and I think it can be very influential in a place like CDC, by offering credible experts to come in and endorse [approaches that] may sometimes seem - especially to the science side of the aisle - too “loosey-goosey” an approach to data collection and analysis.

Then finally, we have a big problem, which is, as I said: on the one hand, there are folk who see surveillance as evaluation, and of course we don’t want to preach that - as valuable as surveillance is. On the other hand, we had a panel earlier today where we talked about this challenge of how important performance measurement is, how much brand it has taken on, and yet the danger that performance measurement will trump evaluation. I think it’s going to be very important for evaluators, as credible experts, to come in and say, “Look, this is a continuous cycle of improvement; there’s a role for all these things to play.” But none of them trumps evaluation; they’re all part of the evaluative puzzle.

[Related to the question of] means for providing input: I call these looming teachable moments. The first one is the one about which I have the greatest hope - I’m sorry, I don’t know if I have the greatest hope, but it holds the greatest opportunity. And that is that the goals and objectives have been in the works for a while, but I believe there’s light at the end of the tunnel, that there’s enough clarity on them that they will start rolling out, and people at the divisional level will need to start thinking about: if that’s what we’re going to be as an agency, how do I fit in, how do I align?

This idea of cascading down and aligning up, I think, is on the tip of most people’s tongues, and it comes with a whole bunch of associated reporting. Right now, the associated reporting is probably not well liked - it’s very high-level and very complicated, like many MIS systems are - but I think it’s going to get better, and if evaluators can be there, I think much of what we have already talked about - logic modeling, intermediate outcomes, etc. - will be very helpful.

What’s happened in the absence of evaluators being present at that discussion - it’s not like someone’s closed the door on us; they haven’t even thought about this, so I guess it’s not much different from the questions that we broached with the framework - is that a whole bunch of corporate and management thinking has been brought to bear, and what that’s done is introduce words that have their equivalents in evaluation but are being introduced from the wrong side.

As I said this morning, I have to talk about “metrics”; no one talks about “indicators.” I can’t talk about “intermediate outcomes,” because the management consultant talks about them as “success factors.” You don’t have “logic models”; you have “project charters.” Things like that make it seem like this is the first time we’ve done this, [that] this is a brand-new question. And so I think teaching people that this path has been trod - there’s light at the end of the tunnel; please listen to us, we have a way forward - would be very valuable.

Second thing: the CDC framework is ten years old. I’m a huge fan of it, [and] not just because I’m paid to say that. I really think it has served us well, but interestingly, its blessing is also its curse. It’s so open-ended - because it needs to be; we want people to be case-specific - but the downside is that, now that programs have been around for a while, they’re actually looking for more guidance. They’re looking for more structure. In this very open-ended landscape, can we help with focusing? How many stakeholders are too many stakeholders to engage? How long should a logic model take? How detailed should it be?

You say an evaluation focus can be about anything. It really can’t be about anything, can it? Shouldn’t we demand that it always have process and some outcome orientation, etc.?

A good friend of mine, Goldie McDonald, and I were talking about it - she’s much less of a fan of our approach than I am - and I think what we’ll probably try to do is engage in some conversations this year. Jumping ahead to my slide about opportunities for input, this is a place where experts should be coming in and saying, “This is what we like about your approach, this is what’s perilous about it; there are [something equivalent to] ‘AAA Trip Tiks’ through this landscape, and here’s how you match the characteristics of a situation to a strong evaluation approach.”

Peer review is a huge issue. Our current Director, Julie Gerberding, is a huge fan of peer review and has really moved us toward it for extramural research programs, etc. It doesn’t happen that often for evaluation, although a lot of evaluation stuff gets subjected to clearance. A lot of evaluators spend their lives trying to get their projects classified as not research, because - this is the kind of organization we are - things are either “research” or, the rest of the world, “not research.” So we try hard to justify that we’re on the “not-research” side of the aisle, mainly to avoid a lot of the processes that we talked about. Nevertheless, when things go up for clearance, that’s often where a lot of the science bent of the organization comes to bear - often for good, sometimes for ill, when there’s a mismatch between an understanding of what we’re trying to do and the very typical, rigorous science that we apply. Having said that, it could be that this is exactly the time to start thinking about “peer review,” very loosely held: looking at the portfolio of our evaluations, trying to see if there’s some pattern that, as external experts, we can endorse - something that says, “This is a good approach. That’s probably an approach that’s a little bit wanting.”

Last, but not least, publishing in “our journals.” I dare say most of the folk doing performance measurement at the organization do not read the American Journal of Evaluation. Perhaps they read New Directions for Evaluation, if I’ve left it on their chair. But they do read the American Journal of Public Health, [and] we have some internal in-house journals. There was a big issue on evaluation a couple of years ago; we probably ought to instigate something like that again. That would be a really wonderful opportunity for big names in the field to influence people who read these broader vehicles and are in our leadership.

[Related to] challenges, there are two: one is that there’s no home base for evaluation. There’s a lot to celebrate about that. I’m sure we’re all in situations where having a home base on anything - whether it’s budget, legislation, science - can be as confining and constraining, establishing an orthodoxy that can be unhelpful, as it is liberating by giving you guidance. Having said that, though, I think trying to formalize much of the informal network that we have - perhaps turning our evaluators into an official working group, trying to reconstitute the evaluation working group that established the framework way back when - would be useful right now, because it would give a point of advocacy, or a point of entry, for outside experts to help us advocate for evaluation.

The hugest challenge I see is the most ironic: CDC loves consulting and external advice - maybe even to a fault at times, but it’s good news that our organization is very open to external advice. There’s not an internal orthodoxy that people are wedded to. The issue for us is, as I mentioned, that these issues facing the organization - which I think eventually will lead to an evaluation “brand” down the road, if we channel it right - are not seen as evaluation questions.

When I was in MBA school, we used to talk about primary versus secondary demand. Primary demand: milk is good for you. Secondary demand: buy Sealtest [brand] milk. The issue for AEA is that we have secondary demand, but we don’t have primary demand. The questions being asked in the organization right now are not framed as evaluation questions; they’re framed as performance measurement questions, corporate questions - how can I be more corporate? The first big challenge is to turn that around so that people see this all as part of program reflection - an organized process of program reflection - and see that evaluation, performance measurement, and those sorts of things are all part of the same puzzle.

Once we solve that, it creates a demand: “If this is about evaluation, how can I find the world’s best evaluators to help me out?” Of course, we have them at AEA, and it gives us that entrée. Thank you - those are my setup points, and I’m looking forward to hearing my colleagues and to some discussion later.

Panel Comments: Patrick Clark