The Seven Deadly Sins of Program Evaluation

Are you seeking grantmaking perfection? Of course you are.

And the best way to find nirvana is to work out where you're going wrong. Which is why we've delved deep into our vaults to bring you "The Seven Deadly Sins of Program Evaluation". Step through all seven for perfection.

Let's begin at the beginning, with what was, by some accounts, the first evaluation ever:

In the beginning, God created the Heaven and the Earth. And God said, "Let there be light". And there was light. And God saw the light, that it was good…

That's evaluation!

But it didn't stop there. God went on creating - dry land, grass, fish, cattle and so on. And she kept on evaluating as well, finding over and over again that what she was creating "was good".

There was even a program evaluation, which came out rather well - "And God saw everything that she had made, and, behold, it was very good."

The problem with this scenario is that, as an evaluation, it doesn't really measure up to today's best practice.

Let's begin with what the project team did well. One, they were engaging in continuous evaluation - once a day, in fact, which is a pretty taxing schedule for formative evaluation. If it had turned out that, say, the Moon didn't work out according to specifications and had to be taken out of the program, they could have taken that into account straight away and would have been able to compensate by making the tides operate by some other mechanism, perhaps.

The second plus is that evaluation was taken seriously enough to have it considered at the highest level - by God herself, no less. Top management - really top management - was looking over the returns on a regular basis. That's about it for praise, though. After that, we have to start marking the evaluation down.

Perhaps the most obvious problem is that the feedback isn't very specific. There are only two grades - "good" and "very good" - and neither refers back to the original criteria, perhaps because there don't appear to have been any original criteria. Where are the KPIs? And while the speed of the project is impressive for such a large job, the follow-up is neglected. You really can't call a project "very good" after less than a week. Important impact measurements can't really be completed in that time period. Trends take longer to emerge.

In fact, the conclusions do seem to have been rather premature. With the benefit of hindsight, we know that years later the project funder decided to reboot the entire operation by wiping out nearly all the grass, cattle and people in a great flood, and building up the community again from a small group of professionals. That kind of cruel-to-be-kind cut-off decision can be necessary - I'm sure everybody here has had similar discussions in their grants committees - but it does cast some doubt on the optimistic sunniness of the original assessment.

All of which leads up to the next issue: dissemination and diffusion. Someone reading God's program evaluation would be very little wiser about how any other funding body would go about creating another universe next time, if that became necessary. What learnings does one carry away, either for the community that was created or the overseeing body that was supporting that creativity?

We can say with some confidence that we've learned a lot about evaluation in the last 6000-odd years. The biggest problem in God's evaluation, though, is the absence of initial goals. How can we judge the success of the creator if we don't know what she was trying to achieve? The most important element in any evaluation - in any project - is knowing what you want.

It has been suggested, of course, that what the creator was after was a world full of people singing her praises. We've all known grantmakers who seemed to have that as their main goal, but these days we do tend to look for a little more from our resources. We want to change the world for the better. More than that, we want to learn how to change the world for the better. We want to produce certain outcomes, make particular impacts, and produce those outcomes and impacts quicker and cheaper and more effectively next time. Along the way, though, there are many serious mistakes that can impede our progress.

We call these The Seven Deadly Sins of Program Evaluation.

Avarice, or "greediness for riches", if you like that definition better, is - in grantmaking - less about wanting more money (though we always do want more in our particular pot) than it is about wanting more knowledge than we're willing to pay for. Knowledge doesn't come cheap. There's no such thing as a free lunch. Surveys, focus groups, independent reviews, even feedback forms cost money.

You want to spend the maximum feasible on program services, we know that - and money spent on evaluation isn't spent on delivering services to clients. If you're feeding the hungry, a vigorous evaluation program will mean that everybody gets one less spoonful of soup. Still, if you don't write evaluation into the budget you're not going to get any, or at least you're not going to get any that hasn't been scribbled down off the top of someone's head the day before the project report is due in.

So what proportion of the project budget should it be? Two percent? Five? Ten? More? Whatever it takes? What's your absolute maximum? How much of your own resources are you going to contribute? At what point do you think your stakeholders are going to begin to feel uneasy? And how much time are you going to put into drawing up your evaluation requirements for each project? Are you going to write down a set of general principles and hand them out to everyone? Are you going to set down specific milestones and KPIs for each grant? Are you going to work with each grantee to get to a mutually acceptable compromise between their wishes and your own?

The grantee's wishes, generally speaking, will be an assessment form that asks three questions:

Did you get the money?
Did you spend it all?
Do you want some more?

You may want a little more detail than that. And that's just in relation to what you've funded. How much of your funds are you going to spend evaluating your own work - how much value you've added, how much you've moved the process forward, how well your systems functioned?

There are figures for this, at least in America. One survey found that the average amount that foundations spent on evaluation, overall, was two percent. Of those that rated the importance of evaluation particularly highly, one-third spent nothing at all. You get what you pay for. What you need to do is work out how much you want.

Our next Deadly Sin of Program Evaluation is sloth. Sloth, in the world of grantmaking, comes about when you're not prepared to put in the effort to find out what you actually do want. It's a lot easier to set up goals that will satisfy the auditors but don't ask anything much of you.

The most obvious is input assessment - the grant was for $1000; they spent $1000: objective achieved. You don't see that one much these days, except at the very, very small grant end of things. Output assessment, the next stage, isn't a great improvement - you fund the group to deliver 20 vocational training sessions; they deliver 20 sessions: objective achieved. After 'output' comes 'outcomes'. What do you think is going to be there at the end of the project? What's going to be different? What's going to be left behind once the money's spent? Finally we get to a question that's worth asking.

Moving on yet another stage, we have 'impact'. This is where you ask the question "So what?" How has the world become a better place? How much closer are we to heaven on earth? That's what we really want to know. Of course, the more important the question, the harder it is to answer. The world is a big place, a very complicated place, and it's never easy to determine true cause and effect. If you really want to assess impact, one thing is clear: you're going to have to wait a while.

It's hard enough to find a grantmaker who's prepared to make a grant that runs longer than two or three years: finding one that's prepared to come back five years after that and sample the water again is damn near impossible, at least in this country. But it's something you really should be thinking about. It matters. Take the evaluations of the American Head Start preschool program. An early study showed the children making gains, and then a later study showed that the gains children made through the program disappeared after two to four years.

You've got to be patient! Just how patient came out in a more recent evaluation, carried out a full 40 years after the original tests, which found that the Head Start participants were once again statistically significantly better off than the controls. What have you got in your budgets for the 40-year review?

Even in the short term, though, it's important to know what impact you're shooting for.

Evaluation - good evaluation - runs backwards. You have your goal, you explain why you believe your intervention is likely to lead to that goal, you work out what the steps are along the way, and you then work out measures that will tell you whether you're on course at every stage. You know it as a Logic Model. It's hard work, and there's no place for sloth.

OK, Deadly Sin number three: Wrath. For a grantmaker, wrath is something that happens to other people. Specifically, to grantees. Our Community has received numerous stories of 'Grants Rage' over the years, some more graphic than others. Grantees can go postal when they're faced with evaluation plans that haven't been thought through. And there's a long list of ways in which evaluation can go wrong:

Evaluation can be disproportionate, with the grantee having to write 20 pages to acquit a $200 grant.
Evaluation can be unnecessary, with the grantmaker filing the responses carefully away not to be referred to again before the heat death of the universe.
Evaluation can be repetitious, with the grantmaker making you write out 20 pages despite the fact that they've still got your last form in the filing cabinet and nothing has changed.
Evaluation can be just irritating, making you write out a new set of 20 pages because their forms want the data in a slightly different format from the way everybody else asks for the data.
Evaluation can be misguided, where the grantmaker is just making up questions at random that have nothing to do with the real world.
And - what causes more burst blood vessels than any of these - evaluation can be done with computerised forms that won't open, or won't accept your data, or accept it and then lose it, or crash halfway, or have to be printed out and posted, or make the whole machine hang for hours while you watch the spinning basketball of doom.

All in all, it's mildly surprising that more computers aren't hurled through more windows more often. Our SmartyGrants system can save you from most if not all of the above issues - if you find you're on the receiving end of copious grantseeker wrath, it might be time to talk to the SmartyGrants team.

We can't leave the topic of wrath without at least mentioning grantmakers' own wrath - at grantseekers not reading the guidelines, not getting things in on time, not answering the questions, not doing as they said they'd do. But that's for another time.

One of the things that grantees object to particularly, as I say, is being asked for lots and lots of information that you're not going to use. Wanting too much stuff that you don't really need - that's gluttony (Grantmaking Deadly Sin number four).

There's an easy rule that you can apply to your grantmaking that will eliminate gluttony: Don't include anything in the form unless you know it's useful. No exceptions. No questions just because you had them in last year, or because somebody might find them useful in the future. If you haven't thought of the question to which they're the answer, leave them out.

And don't include measures that can be fudged, or faked, or that lead to a misguided reallocation of resources. Just as teachers sometimes teach to the test, grantees can shift resources into meeting the letter of particular criteria. In the old Soviet Union, where all KPIs were set by the Kremlin, nail factories overproduced large spikes when quotas were set by tonnage and small nails when quotas were set by number. And that's not just a quirk of Stalinist functionaries. Even the best of us can be led astray.

As one American not-for-profit director has remarked, "When outcome measures became the way for government to evaluate programs for runaway teens, there was a shift from process objectives (like beds being filled at night in shelters), to outcomes - family reunification, in particular." That sounded like a great idea, until it became clear that the metric was encouraging agencies to return some adolescents to abusive situations.

If you're recording progress on your actual objectives, this won't be a problem. For a very large part of the time, however, we don't really know how to measure what we want to know. Things like capacity building, community cohesion, innovation, or creativity can't simply be laid out flat next to a ruler. Nonetheless, we have to have some way to assess progress. We can't just assume that these good outcomes will come about because we're such good people and such knowledgeable experts - that's pride, but we'll get to that next.

What we tend to do is look for proxies. If what we actually want to measure was increasing, what else would be the case? What would be consistent with that situation? If we're looking for stronger communities, it would make sense to think that more people would be volunteering, or that more people would say that they trusted their neighbours; and those would be proxy measures. The problem with proxies, of course, is that they can lead you off the direct path. Pursuing available proxies too slavishly can lead to real distortions. In the words of one grantmaker, "If all you do is stuff you think you can measure, you're actually lowering the bar."

It's like the old joke: someone comes along and sees you under the street lamp, stooping and fumbling around the street. "Can I help?" they say, being a nice person.

"Thanks," you say, "I'm looking for my keys." You both look around companionably for a while without finding anything.

"Where did you lose them?" they ask.

You say, "About a hundred metres up there, under the bridge."

"So why the devil are you looking here?" they ask.

"The light's better."

You've got to be careful that you aren't limiting yourself to the spot under the streetlamp. You may have to put in some funding to invent the torch.

That leads us into pride, which could be defined as thinking that you know everything that there is to be known and can anticipate everything that's going to happen. This shows up when you set out a rigid evaluation schema to record everything that you want to know and don't have any room to include what you may find. There are, as Donald Rumsfeld famously said, "known unknowns and unknown unknowns" - things we don't know but can test for, and things that simply strike us out of a clear sky. Some of these are pleasant surprises. Some - perhaps more - are disasters; but they have to be taken into account in how we plan the next round.

A lot of project evaluations - perhaps the majority - don't allow for unforeseen findings. We decide in advance what it is we want to know, what we regard as important, and what is going to be recorded. At the end of the project we have filled in those blanks. If, however, anything has happened that we hadn't anticipated - anything that didn't have a box in our data collection form - we have to leave it out of the final report.

That works well enough if we have a grasp of the field so thorough that we can predict accurately the whole range of possible outcomes - but in grantmaking, there aren't many areas where that's true. In reality, evaluations don't deal with what happens during a project. They deal with a very much smaller sub-set - the things that happen in the areas we thought were important before we had the benefit of actually doing the project. The problem has become more acute with the welcome move in grantmaking towards addressing the underlying factors in a situation rather than the less important but more easily counted symptoms. If you want to improve people's health by putting them on a better diet then it's relatively easy to measure this; the problem lies in proving that it has in fact improved their life expectancy. If, however, you follow the direction pioneered by social epidemiologists such as Syme, Marmot, and Kawachi, you're looking at factors such as social involvement, hope, and a sense of control. It's hard to measure these, hard to evaluate them, and hard to remove confounding factors in the social environment.

On the other hand, a lot of the experiences with loose qualitative measures haven't worked out all that well either. People tend to emphasise whatever measures they are doing well on and play down the others. It's hard to compare different projects working under different evaluation codes.

The problem, then, is to find an evaluation strategy that is both rigorous and open-ended. And here it's worth considering the work of Canadian health promotion researcher Ron Labonte. Labonte found, like Syme, that what counted in successful health promotion projects was not what the researchers thought was successful but what the community itself thought was successful. The old practice of assuming that numbers were "hard", "objective", and reliable data, while people's stories of their own lives were "soft", "subjective", and suspect, was missing the point completely. The important thing was not to measure the situation; it was to understand it.

Labonte began moving towards a methodology that was specifically interpretive, where the research findings arose in the course of the process of inquiry rather than simply being nuggets of pre-existing fact. The process relied on "iteration, analysis, critique, reiteration, reanalysis, and synthesis" - or, to put it another way, talking it through with the players, bringing out stories and dialogues. It's resource-intensive, it's difficult, and - perhaps the most important consideration in many offices - it's not what you did last time. But if you want to uncover the true story, you'll have to leave room for a little dialogue along the way.

And then there's envy. The trouble with envy, though, is that these days it's less a deadly sin that plunges you into the heart of the fiery furnace and more an essential component of a modern consumer economy. If we don't want what everyone else has, how are we going to keep on spending? But from a grantmaking point of view, the problem is that we're not envious enough, or at least not envious enough about the right things. How many grantmakers have you seen hurl the newspaper across the room when they read that their worst rival has satisfactorily brought home an important grant program with glowing reviews from all the critics?

Not many, if only because there's very little coverage in the newspapers of anything grantmakers do. Part of this is because of the competition from the Olympics and the Kardashians, and part of it is because we don't really put the effort we ought to into getting our message out. It's not easy to envy something that's invisible.

The point I'm trying to make is that we need to be disseminating our work at every level. Start with what the findings of the funded program were. Who is going to benefit from knowing that, and how are we going to get it out to them? You've heard this line from us before: Information, like manure, is only valuable when it's spread around widely. We're going to keep on saying it.

At the next level, how are you going to ensure that the lessons of all the projects you fund are synthesised into meta-findings? Everybody likes meta, because it can tell us so much about how this program compares with the last, or the one beside it, or the original plan. As a grantmaker, you have privileged access to a lot of useful knowledge. Your office shelves are groaning under the weight of grant reports, evaluations, assessments, and research studies. And of all this bounty, how much gets captured? And how much gets released?

One complicating factor, of course, is that we don't always want things released, because that could be a bit like admitting our mistakes. Learning from our mistakes is more easily said than done, for two reasons. One is that our grantees aren't eager to own up to things that have gone pear-shaped, and the other is that when they do own up we instinctively try to sweep it under the carpet. Both of those reactions, unfortunately, quite often find themselves being positively reinforced. Failed projects quite often result in the grantee being marked down for the next round; grantmaker failure often results in higher management (or the media) storming down to beat you around the head.

We'd desperately like to see a change in this situation. It's stupid. It's just throwing away the best parts.

For one thing, there could be public relations advantages in being open about failure. No, really! The emphasis these days is on communication and transparency and accountability, and it's not easy to establish your credibility in these areas if you talk about nothing but your successes. And from a public relations point of view, if things are going to come out, it's much better that the story comes from you, rather than from someone with an axe to grind. Being open about your mistakes also gives you more opportunities to learn from them - hiding something this time may make the same situation harder to avoid next time, because you haven't really learned anything. I want to urge each one of you to put a stop to this practice of hiding and ignoring from today onwards. Encourage your grantees to let you know as soon as things seem to be straying from the plan. Work with them to see if there's a way to get the project back on track, working at the problems together - after all, you've seen this sort of thing before, and they may not have. See if there's scope for modifying the project objectives to take account of new realities. Let your grantees know you can tolerate bends in the road, and that you don't see this as failure - not unless you and they learn nothing from it in the process.

If you're not making mistakes you're really not trying. You ought to be pushing limits, and taking risks. If you only take on tasks you know you can handle, you're holding yourself back. Above all, don't be scared to share your experiences with others - the good, the bad and the ugly.

Well, here's the one you've all been waiting for - lust. The first thing you think of when we say "Lust" likely isn't "Grantmaking!" (we hope). So let's redefine 'lust' as 'passion'; once it's moved out of the X-rated zone, the connection between the two is much clearer. If you're not passionate about your job - well, not every winter morning, but most of the time - you should start checking out the ads online.

And that goes for evaluation too. It's not something you should see as a chore, or a drag, something to get over and done with as quickly and easily as possible. You should be inspired by it. Enlivened by it. Grantmakers are given a marvellous opportunity, a chance to get paid for spending someone else's money on making the world better. What's not to like?