More evidence won’t save development, but it will help the poor
Revisiting an old conversation
Todd Moss recently wrote a piece criticizing randomized controlled trials (RCTs) in development, and those of us who try to encourage evidence use in development. I’m probably among the top offenders: I was the first (and last) Chief Economist in the nascent USAID Office of the Chief Economist, where I tried to encourage more USAID decisions to be made based on evidence (more on that and the demise of USAID here). I’m proud to have shifted $1.7 billion in USAID spending towards more cost-effective programs.
One quick thing to make clear: when USAID tumbled, many argued “we need to make the case on its impact.” Or “if only USAID were more effective, it wouldn’t have been dismantled.” The premise is that some analysis of its actual effectiveness underlay the decision to dismantle it, but I’m unaware of any such analysis. Of course I’m all for being more effective (and then publicizing that too), but I do not believe doing so would have changed anything.
As for the thrust of Todd’s article, these criticisms of RCTs keep coming up and we keep going in circles. Much like the Fast and the Furious franchise, which needs to keep putting out movies, this conversation comes back every so often, which is probably a sign that we should keep clarifying which methods help achieve which goals. For other expositions on this, see Abhijit Banerjee and Esther Duflo’s How Poverty Ends (2020, in Foreign Affairs, gated); for a broader history of RCTs, see Tim Ogden’s interviews with Randomistas and their critics in his book, Experimental Conversations: Perspectives on Randomized Trials in Development Economics; and, for a more recent overview of the field in general, including RCTs, see Noah Smith’s Substack post, Could Development Economics be More Useful?
But nearly all of these attacks on RCTs seem to take the format of “RCTs have taken over the field—but one method can’t tell you everything!” To which we all sigh and say, “Yes, you’re absolutely right. If you know of a method that answers all questions, please let us know. Until then, we will keep working with this method to answer the many questions it does.” So without further ado I present this year’s latest installment in the “RCTs Can’t Do Everything” franchise of blog posts and thought pieces.
Some of Todd’s criticisms are familiar, accurate, and useful to point out. Others are based on common misunderstandings, which I’d like to correct. First, a basic principle I think we can agree on is that we both want the same thing: less poverty and more prosperity for as many people as possible. And it’s useful to remember that in the grand scheme of history, humanity has made great progress reducing extreme poverty, albeit not monotonically improving everywhere. But the world was already at a critical moment, in danger of slowed growth, before aid cutbacks from wealthy nations, disruptions to trade, and a dual energy and fertilizer crisis for the world’s poorest countries, where the vast majority of the population make their living from agriculture. And that means the decisions about how to spend fewer dollars are even more critical to get right, which is why I think it’s unethical not to make the most informed decisions we can.
Where Todd and I part ways is on the role of RCTs—he thinks their inability to answer big macro questions is a reason not to use them (or to scale them far back), and I think it’s a reason to be clear about what they are good for.
Todd’s characterization of the mechanics of an RCT is fairly accurate at its core. Some people get a program (or a version of a program), others don’t (or they get another version of the program), and we compare them. To make sure you’re comparing equivalent groups, you assign them randomly to be offered the different programs. As it happens, in under-resourced areas, this often ends up being the fairest way to choose who should and who shouldn’t get access to something. RCT designs have evolved, however, and we see many that are not just estimating program effects, but are designed to estimate parameters of interest for a broader model, one that could, for instance, even be a macro model! But he’s right that you can’t randomize everything, nor would you want to, and I’ve never met an economist, or any researcher, who’d claim that you should or could. (The closest I came to arguing for it was this satirical piece proposing we randomize which SDGs apply to which countries.)
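For readers who want to see the mechanics spelled out, here is a minimal sketch in Python of the comparison described above. Everything in it is an illustrative assumption: the simulated outcomes, the `true_effect` of 2.0, and the function name `simulate_rct` are made up, and a real study would also report uncertainty and deal with people who decline the offer.

```python
import random
import statistics


def simulate_rct(n_people=1000, true_effect=2.0, seed=0):
    """Randomly offer a hypothetical program to half of a simulated sample,
    then estimate its effect as the difference in mean outcomes."""
    rng = random.Random(seed)

    # Shuffle the sample; the first half of the shuffled list is offered the program.
    people = list(range(n_people))
    rng.shuffle(people)
    treated = set(people[: n_people // 2])

    outcomes_treated, outcomes_control = [], []
    for person in range(n_people):
        baseline = rng.gauss(10.0, 3.0)  # made-up outcome without the program
        if person in treated:
            outcomes_treated.append(baseline + true_effect)
        else:
            outcomes_control.append(baseline)

    # Because assignment was random, the two groups are comparable on average,
    # so the difference in means estimates the program's effect.
    estimate = statistics.mean(outcomes_treated) - statistics.mean(outcomes_control)
    return estimate, len(outcomes_treated), len(outcomes_control)


estimate, n_treated, n_control = simulate_rct()
print(f"treated={n_treated}, control={n_control}, estimated effect={estimate:.2f}")
```

The estimate will not equal the assumed effect exactly, because the two groups differ by chance; with larger samples that chance difference shrinks.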
But the tone of the piece feels a bit straw man-ish.
First, a fact check: He calls RCTs the “dominant tool in development economics,” about which I’m conflicted. Todd says “This approach has pretty much overtaken the entire field of development economics. It’s nearly a cult.” He’s wrong in the implication (as I read it) that the RCT methodology has taken over development economics. Jessica Leight’s recent analysis of published research papers in the field shows that from 2021 to 2025 the share of RCTs in development economics has been consistently about 25%. So about 75% of published papers used other methods. If that’s “dominant,” Todd’s competing with my sycophantic AI for flattery. I’d also point out that only a minority of development programs undergo academically publishable evaluations, so the share of development programs being RCTed is probably very small. (There is a certain irony in RCTs being criticized both for being conducted with only a small set of highly selected NGOs, and thus not being representative of development efforts writ large, and for being so dominant in the space as to drown out all other forms of expertise.)
Second: When it comes to what RCTs can tell us, are they the “Gold Standard”? Yes, when an RCT is the right tool to answer your question. No, when it is not. Put together, that is fairly tautological and thus meaningless, which is why I tend not to use the phrase. Todd’s also right that ranking methodologies is a fool’s errand. We should be asking “is this the best method to answer the question at hand,” and no serious researcher I know would disagree.
RCTs reduce the uncertainty (and in many cases, bias) inherent in estimating the impact of a program or policy, or more generally in estimating any parameter of interest (for example, a price elasticity that is then used to calibrate a macroeconomic model, the type that Todd and I would both like to see more work on). Reducing the uncertainty, and bias, on key decisions that policymakers must make seems tough to argue against. Would we want our doctor to ignore the latest evidence on what medical treatments are most effective? Certainly not. I want nothing less of my policymakers, when the setting allows for it. And often an RCT is the right tool for the job. A lot of concrete day-to-day decisions about how to spend limited dollars on education, health, livelihood promotion, economic opportunities for women and marginalized populations, fighting corruption, and encouraging good governance come down to a limited budget and a choice among many options. Knowing which ones to choose can help guide many government decisions to the best possible outcomes.
Todd’s mostly right when he says that RCTs of individual programs can’t tell the finance minister how to make their country rich, but that’s not exactly a critique of RCTs: what he’s saying is that microeconomics can’t tell the finance minister which fiscal policy will usher in an era of growth and prosperity for all. He’s thinking of macroeconomics, and what he’s saying is that macroeconomics is where it’s at. He’s almost right, except that development macroeconomists don’t know either. Noah’s Substack post cites another similar debate online, and the example of Bolivia and South Korea, which started in the same place, but one got rich and the other didn’t. Macroeconomists can describe what happened in retrospect, but they didn’t know in advance which one would turn out that way, and generally still can’t predict that. (And “growth” is not sufficient; it must be inclusive growth to handle the issues that many of us are quite concerned with, such as the ultra-poor, those most disconnected from markets, etc.)
In fact, while microeconomics is a relative newcomer in development, macroeconomists have been doing development far longer. Give us micro folks a little time to catch up to not knowing as much as them.
Some years ago, when the micro/macro debate in development was already old, I was in the audience at a symposium in which the legendary macroeconomist Robert Lucas, Jr. said (I’m paraphrasing as best as I remember!): “I don’t understand why you’re arguing about which is more important. We learn something from each approach, both are interesting and important and we need advances in both in order to better understand the world.”
A microeconomist can’t tell you how to make Kenya rich. But she can tell you which of several candidate strategies, based on prior theory and evidence, is most likely to get more kids into school, keep them healthy, move them into the workforce, or help girls delay marriage to a healthier age. These things matter, a lot of money rides on them, and (critically) the right answer is often less than obvious. If we can do each of those even 10% better, I count those as wins even if they’re small potatoes by Todd’s standards.
Todd points to an RCT in his area of expertise, providing electricity, which he says didn’t need to be done because the answer was completely obvious to him. I encounter this reaction often: after finding out the answers, many experts say “well that’s obvious! I could have told you that!” Study co-author Catherine Wolfram says it wasn’t, and still isn’t, obvious to many in the field. Todd tells us that the key to success isn’t providing people electricity in their homes, it’s businesses:
don’t prioritize connecting the poorest households. Instead, focus on delivering cheaper and more reliable power for businesses in towns and cities. And let’s just accept that it’s impossible to randomize power for industry. No business would ever knowingly volunteer to suffer blackouts to help an academic study.
Two problems. First, the oversimplification of both the problem and the solution. We frequently find this phenomenon among experts who have the one great idea that will fix everything! But often when we do test it, it doesn’t work for other reasons: the businesses don’t have the machinery to take advantage of the electrification and would first need loans to afford it. Or they don’t have the customer base to sell their higher output to, because transportation to bigger markets is expensive or impractical, or the towns don’t have the trained professionals to fix the machines when they break. If you assure the Kenyan government that business electrification is the answer, and then one of these other constraints turns out to bind, what is the fallback?
Second, the claim that this isn’t randomizable. Of course no business would volunteer for blackouts, but what if there were a way to find out in advance whether Todd’s idea works? Governments typically can’t electrify everywhere at once, so working with one you could randomize the sequence in which towns get connected, and see if prosperity follows (or at least whether the shorter-run changes the theory depends on actually show up). And if it doesn’t, why not? If it does turn out that businesses need a loan to afford the machinery needed to take advantage of the electrification, wouldn’t that be good to know before rolling it out nationally?
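To make the phase-in idea concrete, here is a minimal sketch in Python of how one might randomize the order in which towns are connected. The town names, the number of phases, and the function name `randomize_rollout` are hypothetical, and an actual design would be worked out with the government and would likely balance phases on town characteristics.

```python
import random


def randomize_rollout(towns, n_phases, seed=0):
    """Randomly assign each town to a connection phase (1 = connected first).

    Every town is eventually connected; only the order is random, so towns in
    later phases serve as the comparison group until their turn comes.
    """
    rng = random.Random(seed)
    order = list(towns)
    rng.shuffle(order)  # random order decides who goes first

    schedule = {phase: [] for phase in range(1, n_phases + 1)}
    for position, town in enumerate(order):
        # Deal the shuffled towns out to phases in turn so phase sizes stay balanced.
        schedule[(position % n_phases) + 1].append(town)
    return schedule


towns = [f"town_{i:02d}" for i in range(12)]  # hypothetical list of towns
schedule = randomize_rollout(towns, n_phases=3)
for phase, members in schedule.items():
    print(f"phase {phase}: {sorted(members)}")
```

Comparing outcomes in early-phase towns with those in not-yet-connected towns, before the later phases are rolled out, is what would show whether the shorter-run changes appear.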
If we had a room full of omniscient experts who knew the answers in advance, we wouldn’t need to test anything. But until we do, we need a way to find out practical answers, and if we have a tool that helps do that, isn’t it incumbent on us to use it when we can?
I disagree with Todd on two other aspects. One is the rhetorical style, which paints development research as an all-or-nothing binary: if your method can’t solve poverty, why bother? That’s like saying “Ugh, your car can’t even drive on water? On a planet that’s 70% water?” My car is the right tool for some jobs, but not all. Development challenges are complicated and tricky, and falling into the trap of assuming one tool has to solve all problems is a recipe for failure.
The second is partially economists’ fault, but is not unique to economics. Todd brings up a criticism that is valid and that more people should talk about. He cites the title of the household electrification study, “Does Household Electrification Supercharge Economic Development?” Economics articles often have bold titles that sound like they’re answering the stated question quite generally, when in reality they are testing it in one context. That’s an artifact of how economists write for each other in journals: papers are framed as strong arguments, knowing other economists will discount the strength in their reading and will look in the meat of the paper for contextual information in order to judge how broadly to apply the results. Oncologists who write “Drug X shrinks tumors of type Y,” but note in paragraph 5 that it was a test in mice, cause similar frustrations for lay readers. But the intended oncologist audience likely knows the test was done in mice and was expecting it. All fields use technical jargon and shortcuts that bear resemblance to English but can cause misunderstandings.
To summarize my arguments (feel free to copy/paste the next time this debate gets a sequel release):
1. Micro and macroeconomics both care about development but specialize in answering different kinds of questions. We need both.
2. No single method is the gold standard for everything. Choose good questions and then the best viable tool for each question.
3. I’ve found you can study, and sometimes even randomize, more things than people first assume, including parameters to models that speak to “big” questions in macro.
4. If we have a choice between guiding critical decisions with more information or less information, why not choose more?
Bottom line: Todd’s right that more evidence might not have been a political solution in these times, but it can help us spend scarce dollars where they’ll do the most good. Hopefully that’s a goal on which we can agree.


Perhaps I misread Todd's intent, but I thought the most important point he made in his recent post is that if we (read the "development community") care about reconstituting a meaningful development assistance capability in the US Government, the political argument is not going to be won on the basis of more / better evidence of its effectiveness (in a developmental sense). This particular sequel addressing the macro-micro debate is useful if we are concerned about how to make the most effective use of a given - in fact, rapidly shrinking - pool of resources. But I don't think that's the primary question we should be particularly concerned about now - and I don't think it's the main question Todd was pointing to in his post. Given where we are today, I would prefer to see us work together to rebuild the political constituency for a robust development assistance capability as part of US foreign policy. We need a compelling narrative that appeals to the American people and their representatives for why US support for development remains in the interests of the United States. I don't yet know what that argument is, but I am confident it doesn't require more RCTs.
Great note, Dean.