Date: Wednesday, January 14, 2026
Greetings! We are Michael FitzGerald, Senior Research and Evaluation Associate, and Lana Rucks, President and CEO, of The Rucks Group. Over the years, we have served as the external evaluator for EvaluATE—NSF’s evaluation hub for the Advanced Technological Education (ATE) program—helping the team understand the extent to which its work supports evaluation capacity building (ECB) across the ATE community. Starting with the 2018–2022 evaluation cycle, we turned to the Kirkpatrick four-level model (Kirkpatrick model) as a structured and adaptable framework for evaluating the layered and evolving nature of ECB—one that helped the EvaluATE team understand not only what changed, but also the significance of those changes across multiple levels of use and impact.
We selected the Kirkpatrick model because it offered a practical structure for organizing evidence across sources and time. Its emphasis on outcomes at multiple levels—from initial satisfaction to applied results—helped ensure we captured, in a clear and credible way, whether evaluation capacity was actually being built. It also shaped our evaluation questions from the outset: rather than treating capacity as a fixed endpoint, we asked how it was changing at each level over time.
To better reflect EvaluATE’s work, we added two supplemental domains: Engagement, to examine how the ATE community interacted with the project’s offerings, and Implementation, to consider how resources and events were developed and delivered. These additions helped us account for the contextual factors that influence whether and how capacity building takes hold (see Figure 1).
Beyond its usefulness in structuring our evaluation approach, the Kirkpatrick model created a clear pathway from data collection to actionable insights. For example, our findings around Levels 2 and 3 (Learning and Application) showed that just-in-time tools like the Evaluation Planning Checklist weren’t just appreciated—they were being used to write proposals, inform evaluation strategies, and prepare for external reviews. Based on this evidence, EvaluATE made a strategic decision to invest more heavily in short, high-utility resources.
The Kirkpatrick model also gave us a shared language in our work with the EvaluATE team. Within our team, it supported cross-method synthesis; with EvaluATE, it made it easier to explore variation across data sources and surface implications without oversimplifying. In a multi-year evaluation effort where no single metric could tell the whole story, that was invaluable.
We describe our adaptation and application of the model in more detail in an open-access article in New Directions for Evaluation. If you’re navigating a similarly complex evaluation, we hope it offers a framework that can provide structure and guidance throughout the process.
Do you have questions, concerns, kudos, or content to extend this aea365 contribution? Please add them in the comments section for this post on the aea365 webpage so that we may enrich our community of practice. Would you like to submit an aea365 Tip? Please send a note of interest to aea365@eval.org. aea365 is sponsored by the American Evaluation Association and provides a Tip-a-Day by and for evaluators. The views and opinions expressed on the AEA365 blog are solely those of the original authors and other contributors. These views and opinions do not necessarily represent those of the American Evaluation Association, and/or any/all contributors to this site.