25 October 2011

The Empire Strikes Back: Sachs Vs. The World

The debate over the Millennium Village Project (MVP) turned the burner on high this month, with more people jumping in to question how the project is being evaluated. Michael Clemens and Gabriel Demombynes put out a paper saying that the evaluation design is flawed, and they were even set to have a public debate with Sachs until the event fell apart.

Things really picked up when Madeleine Bunting wrote about the debate in the Guardian Development on October 10. She writes:
As part of the announcement this week, the MVP proudly claimed that malaria in its villages had fallen by 72%, access to clean water had more than tripled, and average maize yields had doubled. All of this was achieved on a budget of $60 a head per year, according to the project. The next stage of funding will build on business and enterprise to help villages to link better to the wider market. Soros punched the point of this huge programme home: here was a model that was replicable and could be scaled up across Africa.

But it is on this last point that questions continue to dog the project. Is it replicable and does it really serve as the model for development? The handling of those questions has been pretty brusque.
Noting the questions raised by Clemens and Demombynes, she suggests that the project's claims may not be substantiated.

In response to Bunting, Sachs writes in the Guardian Development that the MVP is working quite well.
The inputs and outputs of the project are all carefully measured. The budgetary costs are studied. Contrary to the loose talk of critics, this project is not throwing "gazillions" of dollars at poverty. The project spent $60 on each villager every year between 2006 and 2011 to build the capital of the community. That prompted further contributions from the government itself and in-kind contributions from the community. This is a replicable and scalable budget model, well within the official development assistance amounts donors have long promised. It's nonsense to suggest otherwise, or to change the game now this amount has been shown to work so powerfully.

The systems the Millennium Villages Project are building and working to expand will continue to be improved and upgraded along the way, and we will follow the challenges and successes of our colleagues in the villages for much longer. Our critics are quick to point out how we should be doing this or that better, but they do not offer concrete alternative models for achieving the MDGs. We wish they would. When improved methods of service delivery come along, the project is very keen to take on those lessons and ideas.
Lawrence Haddad, Director of the Institute of Development Studies, then struck with a post slamming the MVPs.
The 2 main critiques of the MVP seem to be (a) of course if you spend $60 per head per person for 5 years you will see dramatic development improvements--but what happens when the donor money runs out? and (b) actually we don't know if the impacts are there because the MVP has no baseline comparison group of villages (and there is absolutely no technical reason the MVP experiment could not have been randomised at the village level a la Progresa).

The second critique seems sound to me. It is hard to understand why baselines of case control villages were not undertaken. The second critique gets us impact folk excited, but I suspect it is the first critique that is more widely supported--who on earth will pay for this once the donors leave?
Not to miss an opportunity to respond, Sachs writes back with the help of Dr. Prabhjot Singh to disabuse Mr. Haddad of his supposed misconceptions.
The critiques presented by Lawrence Haddad on the Development Horizons blog are second hand (we believe that he has not visited a Millennium Village nor consulted with our team in any substantive way) and reflect a real misunderstanding of what is happening in the Millennium Village Project. There is a huge difference between the objectives of a household-level intervention project like the Progresa program in Mexico and the Millennium Village Project. The Millennium Villages Project builds community capital at the scale of 30,000 or more people. A major purpose is the design and implementation of community-based and district-based systems with cutting-edge ICT tools to achieve complex goals (such as fashioning a primary health system from almost nothing). The notion that this is about randomized trials like Progresa is a misunderstanding.

(snip)

There has been much naïve talk about paired “comparison” villages. The Millennium Villages Project actually has them, though we introduced them in year 3 rather than year 1, because in year 1 the considerable work required to create a foundation of community-driven strategies in the context of a very complex project took precedent. We knew from the start that there would be many complexities in comparison sites and we began to introduce them only when the project was functioning in all sites.

For anyone who has taken the time to understand the difference in pace of initiation, organizational culture and preexisting capacity between the varied settings of the Millennium Villages will know that a “Year 1” comparison would be meaningless. The fact that we started in year 3 rather than year 1 of a 10-year project is taken as a mortal sin by some critics, but frankly that position is taken by polemicists who are keen to criticize the project rather than by people who actually carry out complex projects or care to understand the real practicalities of such project.
Sachs and Singh go on to point out where they think Haddad is wrong. Their basic argument is that RCTs are not appropriate for measuring the impact of the MVP.

Not wanting to miss out, David McKenzie took to the World Bank Development Impact blog to address the Sachs/Singh post, which he called "a rather stunning reply." In the post, McKenzie responds to each of the claims made by Sachs and Singh. One example:
“The logic is also flawed. In a single-intervention study at the individual level (e.g. for a new medicine) one can have true controls (one group gets the medicine, the other gets a placebo or some other medicine). With communities, there are no true controls. Life changes everywhere, in the MVs and outside of them.”

Comment: This is just a baffling comment. The whole reason for having controls is that life changes everywhere – if it didn’t, before-after analysis would be just fine. The purpose of having these similar control communities is precisely to control for all the other stuff going on in these countries which could be causing changes in the Millennium Villages regardless of the impacts of the MVP. The work by Clemens and Demombynes critiquing the earliest claims of the MVP’s impacts showed clearly some of the massive changes occurring in Africa in indicators such as cellphone ownership that clearly render before-after analysis misleading.
On the same day, researchers Michael Clemens and Gabriel Demombynes responded to Sachs's claims with an article in Guardian Development. The text covers the findings of their paper and uses that research to directly address the claims that Sachs had made previously on the same news site.
We argue that weaknesses in the MVP's evaluation methods will make it impossible for anyone to know if the project is achieving its goals. We also argue that the published evidence does not provide a basis for advocates' claims that the project "has been shown to work powerfully" and is "enormously successful".

Among the five weaknesses we document in the MVP's impact evaluation, the most important is the failure to properly compare outcomes at the project sites to what would have happened in the absence of the project. In two reports (pdf), the MVP has presented before-and-after comparisons of living conditions at its sites, describing the differences as "impacts" and "results" of the project. These reports give no consideration to the possibility that some or all of these changes might have occurred even if the MVP had never been implemented at those sites.
Taking to the Development Impact blog the following day, Berk Ozler writes a post questioning why the program is getting so much financial support.
[T]he impact evaluation nerds are losing the battle to "I've seen it with my own eyes: it is working" and "we've been monitoring progress: it's all good." This is why I am writing.

The tack that the critics have taken so far is that MVP needs to be evaluated. That has not worked (yet). I propose another tack: we should start talking to the people backing MVP at the UN; Michael Clemens should have his people start calling George Soros' people. We should tell them that they should not be giving away precious resources, which could be used to implement interventions with far better evidence of effectiveness, to people who have not even started to provide evidence of impact for their projects. In other words, Occupy United Nations or the Open Society Foundation.

If we don't succeed in at least getting a more coherent explanation from the donors than "I've seen it working," then we're not doing our jobs.
As the dissent mounted, the MVP looked for someone to shore up support. What it got was a quote from Kenya's Minister of Water, Charity Ngilu.
The Millennium Villages Project, and Professor Sachs individually, had a huge effect in enabling Kenya to pursue a policy of mass distribution of bed nets and the shift to community-based treatment of malaria. The Millennium Villages Project informed our government about the efficacy of such policy breakthroughs. Professor Sachs’s advocacy inside Kenya, with the Global Fund, and at the United Nations, helped not only Kenya, but all of Africa to make a breakthrough in malaria control. It is because of this important work and the lessons of the Millennium Villages that our women and our children have stopped dying from wholly preventable causes. Nobody should doubt the importance of the Millennium Villages in showing the way. It has worked, it has made a huge impact on Kenya.
A summary post from Michael Clemens on the same day catches people up on the debate. He concludes by addressing the statement from Minister Ngilu.
I invite readers to judge for themselves the plausibility of the claim that “Professor Sachs individually” has caused the huge increases in school enrollment, vast increases in cell phone ownership, and huge declines in malaria prevalence that have occurred all across Africa over the past decade—all changes that the Millennium Villages Project has uncritically claimed in large part as its own “impacts” and “achievements” with its before-and-after evaluation. If that is true, Africans themselves would not have made the enormous efforts and sacrifices they have made to accomplish those sweeping improvements if not for the arrival of Professor Sachs from New York. On that, I am speechless.

The most recent post comes from the MVP blog, with yet another researcher wanting to spar. The project's Director of Monitoring and Evaluations, Dr Paul Pronyk, addresses each of the criticisms individually. He summarizes his post by saying,
In all program evaluations, there are trade-offs and limitations – and the need to design systems that are appropriate to the real-world questions and challenges faced by the project. Evaluators should never start with a methodology and define their intervention around it. Rather we have to start with the challenges at hand, and construct the best possible methods to understand what works and why. With this in mind, we have done our best to pull together a suite of analytical methods for learning, documentation, monitoring, evaluation, and scaling.

The Millennium Villages are offering a wealth of knowledge about the systems needed to achieve the MDGs. We are noting not only the key successes – such as in the reduction of malaria, the mobilization of community health workers, the development of pre-paid electricity systems, and more – but we are also learning about the challenges, costs, human resource needs, strategies for community leadership, methods of national policy scale up, and much more. The tools and methods that are in place, and new systems that are being developed over time, allow for the measurement of specific outcomes while simultaneously providing insights into how real-time systems of public services and investment can be replicated and scaled.

I look forward to ongoing discussion, debate, and constructive new ideas around these and other issues, and welcome colleagues to meet with us at The Earth Institute or in the Villages themselves to understand the project and these complex systems and design challenges first hand.
The discussion has spilled over to Twitter (economist Justin Wolfers has two very harsh tweets about Sachs here and here), but it remains wonky and disconnected from the very people who are affected by the MVPs. That is because they have not been a part of the discussion. Sachs argues that the evaluation could not start until year 3 because of the variables and obstacles that exist in setting up the MVP. What is unfortunate is that those first years are exactly the time to have the most rigorous data (qualitative and quantitative) coming out of the intervention villages. This is when beneficiaries are (hopefully) being heard and programs are shifting. If the goal is to reach scale, ignoring the first few steps makes it much harder to replicate.

There is also the issue of scale. The argument against comparison villages partially rests on the idea that outside factors that cannot be measured may affect the development of a village. This may be true, but there is also a chance that the MVP will have an impact on surrounding villages. What if the researchers were to find that villages within 50 miles achieved the same rate of improvement? If there is a noticeable difference between that group and the rest of the country or region, it would change how scale is defined for the project.
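
To make the comparison-village logic concrete, here is a minimal sketch in Python. Every number in it is invented for illustration and none of it is MVP data; the `Survey` class and both function names are my own hypothetical helpers. It also assumes that, without the project, the project villages would have followed the same trend as the comparison villages, which is exactly the assumption the two sides are arguing over.

```python
from dataclasses import dataclass


@dataclass
class Survey:
    """One indicator measured twice (for example, % of households with a cellphone)."""
    before: float  # value at baseline
    after: float   # value at follow-up


def before_after(village: Survey) -> float:
    """Naive estimate: credit the whole change to the project."""
    return village.after - village.before


def difference_in_differences(project: Survey, comparison: Survey) -> float:
    """Subtract the change that happened anyway in the comparison villages."""
    return before_after(project) - before_after(comparison)


# Invented figures, for illustration only (not MVP data).
project_villages = Survey(before=10.0, after=40.0)
comparison_villages = Survey(before=10.0, after=30.0)

naive = before_after(project_villages)
adjusted = difference_in_differences(project_villages, comparison_villages)

print(f"Before-after estimate: {naive:.1f} points")
print(f"Net of comparison villages: {adjusted:.1f} points")
```

With these made-up figures, the before-after number credits the project with a 30-point gain, while the comparison villages suggest that 20 of those points would have happened anyway. The same arithmetic could be run on the villages within 50 miles to look for spillover.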

The group of researchers who have raised questions appear to have made valid points. The defense made by Sachs in the Guardian article (he has made it many times before) is that his team is trying something while the others are only being critical. Fair enough. Sachs should be applauded for trying something and working on an innovation. He should also be held accountable for making sure that his intervention, which garners a lot of financial support, actually works. The reply to the financial question will be that it really costs only $60 per person for the MVP to operate, but that is a rhetorical red herring. Aid money is a finite resource. Spending more money on more people does not lower the stakes; it makes them higher.

My recommendation is that the researchers offer to do an RCT when the next village is started. Between CGD, the World Bank and the various university affiliations, I am sure they could come up with the money to do it. We sure know that the MVP can. What needs to be said outright is that this is not an argument where one side wants the other to fail. Rather, the critics want the MVP to be better and to be an intervention from which lessons can be learned. I often use sports metaphors in my posts, but they break down at the end because we are not competing. This is a field where the goals are shared. Critics are not opponents; they are the technical staff on the team. They spend time breaking down a batter's swing to make it better.

***Just so nobody makes the wrong assumption, the 'Empire' in my title is not meant to imply that anyone is bad or evil. It just means that there are two opposing sides in the debate which, much like Luke and Vader, are more linked than they realize.
