In an article titled "Field Experiments in Economics: The Past, the Present, and the Future," Levitt and List (2009) make three important claims about the history, philosophy, and future of field experiments in economics. They claim that field experiments in economics began in the 1920s and 1930s, in agricultural work by Neyman and Fisher. Second, they claim that artificial randomization is the sine qua non of good experimental design; they claim that randomization is the only valid justification for use of Student's test of significance. Finally, they claim that the theory of the firm will be advanced by economists doing randomized controlled trials (RCTs) for private sector firms. Several areas of economics, for example the development economics of Banerjee and Duflo, have been influenced by the article, despite the absence of historical and methodological review. This comment seeks to fill that gap in the literature. Student has, it is found, priority over Fisher and Neyman; he compared balanced and random designs in the field (on crops from barley to timber) from 1905 to 1937. The power and efficiency of balanced over random designs (discovered by Student and confirmed by Pearson, Neyman, Jeffreys, and others adopting a decision-theoretic and/or Bayesian approach) is not mentioned by Levitt and List. Neglect of Student is especially regrettable, for he showed in his job as Head Brewer of Guinness that artificial randomization is neither necessary nor sufficient for improving efficiency, identifying causal relationships, or discovering economically significant differences. One way forward is to take a step backwards, from Fisher to Student.
Hopefully some of the randomistas will jump into the discussion to share their thoughts. I am not familiar with either paper and will abstain from in-depth commentary at this point. What I find curious is the suggestion that the rejection of a method 100 years ago is sufficient to show that it is not appropriate now.
I would need far more evidence than that to be convinced that RCTs are not accurate.
Anyone have thoughts?
Read Ziliak's paper here and Levitt and List's here. Also, see Ziliak's full letter to Thoma here.