How Do You Evaluate Leadership? The Impossibility of Evaluating Principals and Leadership Programs Using Student Outcomes

Abstract: 

In our age of accountability and evaluation, teachers were first subject to competency exams in the 1980s; now they're subject to value-added measures. States imposed accountability systems on schools during the 1990s, and the federal government created accountability for districts in No Child Left Behind. Now accountability and more formal evaluation have spread to school leaders, because they may be second in importance only to teachers. The two national organizations of principals have produced frameworks for evaluation (NAESP and NASSP, n.d.), and formal methods of evaluation like VAL-ED are widely used. A statistical literature, so far inconclusive, has tried to ascertain which dimensions of leadership affect outcomes (Grissom et al., 2014). A few states require that every administrator's evaluation be partly based on test scores, and some districts are using test scores to determine salaries or bonuses (Murphy et al., 2014). The U.S. Department of Education is now turning its attention to principals after proposing plans to evaluate teacher preparation programs with various measures, including test score gains.

Now the focus is turning to preparation programs for leaders. Frameworks for evaluating leadership programs exist, using multiple ways of ascertaining effectiveness (Tredway et al., 2012). Two high-profile programs (the Aspiring Principals Program in New York City and New Leaders for New Schools) have linked their efforts to test scores, with distinctly mixed and often negative outcomes (Corcoran et al., 2012; Gates et al., 2014) and with methodological flaws that we will review.

Closer to home, foundations have asked our program, the Principal Leadership Institute (PLI) at the University of California, Berkeley, how we can prove our impact, including effects on test scores, since many foundations do not fund programs without outcome evidence. Federal grants ask grantees to quantify their effects, including those on student outcomes. In California the need to improve and monitor leadership programs is being discussed, though measures of quality have not emerged. The evaluation of principals and leadership programs is now firmly on the agenda of districts, governments, and foundations.

This article reports our efforts to measure the effectiveness of the PLI. In practice, evaluating the impact of leadership programs on student outcomes proves horrendously complex. Existing efforts have not even realized the full extent of the problems, let alone resolved them. Because we were not sure what data existed in our partnership districts and failed to anticipate the methodological challenges, we consider our efforts to be experimental. Our results proved useful for uncovering problems, but not for drawing conclusions about effectiveness.

Author: 
W. Norton Grubb
Patrick Liao
Rebecca Cheung
Publication date: 
April 1, 2015
Publication type: 
Leadership Programs