Does it work? Aid for Trade through the evaluation prism
In the early days after the Hong Kong Ministerial in December 2005, trade negotiators, especially from developing countries, gauged their success in Aid for Trade (AfT) merely by increases in amounts. By this measure, the AfT initiative has indeed been successful, rising from $25 billion in 2005 to $40 billion today. (1) To be sure, difficulties in measurement complicate this calculation. The OECD has elected to define AfT as concessional assistance to (mainly) low-income countries for economic infrastructure, productive capacity, and technical assistance for regulation, capacity building and policy support - a broad definition that subsumes 30 percent or more of all concessional development assistance. In the latest Aid for Trade at a Glance, the OECD and WTO have also reported non-concessional trade-related lending to middle-income countries. The OECD's measure does not capture several other forms of AfT, notably investments by multilateral institutions in private companies and free-standing technical assistance.
Because of the diversity of AfT, we argue that no credible evaluation of AfT's success can rely on a single methodology. Rather, the diversity of objectives, instruments, sectors and activities requires using multiple lenses - in effect, a prism of evaluation approaches. This note briefly reviews different evaluation approaches with a view toward pointing out their strengths and limitations. As a corollary, it notes that one of the lenses, impact evaluation, is conspicuous for its absence in the AfT literature.
Diversity in AfT goals
The WTO's Task Force on AfT highlighted the objective of trade-related development assistance - broadly to help countries expand trade to promote growth and reduce poverty (WTO, 2006). The first problem evaluators confront is the multiple intermediate objectives on the path to the overarching objectives of trade-led growth and poverty reduction - ranging from increasing exports, diversification, and intra-regional trade, to raising the incomes of small-scale (often female) traders, or, in the case of infrastructure, improving competitiveness through wider and cheaper access to power, transport and telecommunication services. This diversity of objectives is highlighted in the OECD/WTO's rich collection of 269 case stories of AfT. A simple word count of performance objectives, for example, surprisingly returns "gender" three times more often than "poverty". Similarly "environment" is mentioned far more often than "poverty reduction".
One commonality of this collection is the relative absence of quantitative indicators of performance. Only 44 percent of the case stories have any quantitative output measure and only 22 percent have any quantitative indicator of outcomes or impacts. Sampling bias may be partly to blame. Still, these findings are symptomatic of Cadot's (2011) conclusion that "the aid-for-trade community has been slow to build a culture of rigorous evaluation".
Towards a prism for evaluation
To supplement case studies, understanding the full effects of AfT, with its variety of objectives and instruments, requires a prism of evaluation comprising three other broad approaches: aggregate cross-country evaluations of AfT, sectoral and program evaluations, and project evaluations.
Aggregate cross-country evaluations
Cross-country (or panel) regressions have long been the method of choice for evaluating the effectiveness of aid. Econometric studies of AfT have shown large positive results in expanding trade. For example, the Commonwealth Secretariat reports studies suggesting that a doubling of AfT to infrastructure would raise merchandise exports by 3.5 percent, while a doubling of aid to trade facilitation would lower import costs by 5 percent. Similarly, UNECA's econometric studies of Africa show that a 10 percent rise in AfT correlates with a 0.4 percent increase in an index of economic diversification.
This methodology has the advantage of neatly capturing all economic interactions. Its results are also, in principle, valid in a variety of contexts, since they identify averages. However, the approach has two limitations. First, the identification of causal linkages is weak, because even clever econometrics cannot filter out many confounding influences and reverse-causality mechanisms. Second, cross-country averages rarely help in providing specific advice at the country level.
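To make the mechanics concrete, the kind of elasticity behind the "doubling of aid" estimates above can be sketched as a log-log regression of exports on AfT. The sketch below uses entirely invented numbers, not the datasets of the studies cited; the slope of a log-log fit is an elasticity, from which a "doubling" effect can be read off.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical panel of 400 country-year observations; all figures are
# invented for illustration and do not come from the studies cited.
n = 400
log_aid = rng.normal(3.0, 1.0, n)            # log of AfT disbursements
true_elasticity = 0.05                       # assumed "true" effect
noise = rng.normal(0.0, 0.5, n)              # everything else driving exports
log_exports = 2.0 + true_elasticity * log_aid + noise

# Log-log OLS: the slope on log aid is the aid-to-exports elasticity.
X = np.column_stack([np.ones(n), log_aid])
beta, *_ = np.linalg.lstsq(X, log_exports, rcond=None)
elasticity = beta[1]

# The elasticity implies a "doubling of aid" effect of the sort quoted
# in cross-country studies.
doubling_effect_pct = (2.0 ** elasticity - 1.0) * 100.0
print(f"estimated elasticity: {elasticity:.3f}")
print(f"implied export rise from doubling aid: {doubling_effect_pct:.1f}%")
```

The first limitation noted above shows up here directly: if the "noise" term were correlated with aid (reverse causality, confounders), the estimated slope would no longer be the causal effect, yet the regression would run just the same.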
Sectoral and program evaluations
Sectoral evaluations by donors - in transportation, agriculture, and power - are rarely centered on trade issues, but they sometimes provide settings where AfT impacts can be measured more directly. Program evaluations, such as the World Bank (2006) evaluation of its support to trade over 1987-2004, focus more directly on trade; while generally supportive, that analysis highlighted gaps in complementary policies. The US evaluation of its AfT program, a review comprising 265 projects over 2002-2006, concluded that "each US$1 invested yielded a return of US$42 in developing country exports two years later". (2)
Sector studies have the advantage of better identifying causal mechanisms and allowing for the review and assessment of individual or combined policy interventions. By themselves, however, they can rarely link policy interventions directly to outcomes, because many intervening or simultaneous variables make attribution difficult. Moreover, their impact assessments are largely based on before-and-after comparisons, which fail to take counterfactuals properly into account.
Project-level evaluations are common for trade-related interventions, but they too often rely on crude methodologies. For example, over 2002-2008, 85 percent of the World Bank's trade-related projects were rated satisfactory or higher, with an average economic rate of return of 32.4 percent (compared with 23.7 percent for other projects - see World Bank Group, 2009). Yet a review of 85 World Bank trade-related investment projects over 1995-2005 found that evaluations were too frequently partial or absent altogether: most projects used simple economic rate-of-return calculations (31 percent), sometimes combined with stakeholder workshops and/or surveys to assess qualitative elements (an additional 26 percent), while 10 percent of surveyed projects had no evaluation at all. Even when quantitative, ex-post assessments cannot control for outside influences, crediting programs with favorable conditions and vice versa.
Project-level evaluation could be made much more informative by adopting formal impact-evaluation methods based on treatment and control groups, widely used in health, education and other areas of development work. By construction, such methods make sense only for "targeted" interventions such as export promotion, technical assistance, or geographically limited interventions. Their advantages and drawbacks mirror inversely those of cross-country studies: whereas they identify causal mechanisms very precisely and provide highly relevant lessons on the ground, it is rarely clear how those lessons would carry over to different settings. Moreover, they are expensive: for small-scale activities, an evaluation can cost as much as the activity itself.
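The logic of a treatment-and-control design can be illustrated in a few lines. The sketch below assumes a hypothetical export-promotion program with randomly assigned participants; every number is invented for illustration. The point is that, with a credible control group, a simple difference in group means recovers the program's average effect, which a before-after comparison on participants alone cannot do.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical export-promotion program with random assignment;
# all figures are assumptions for illustration only.
n = 1000
treated = rng.integers(0, 2, n).astype(bool)   # random assignment to the program
baseline = rng.normal(100.0, 20.0, n)          # firms' pre-program export sales
true_effect = 8.0                              # assumed program effect
outcome = baseline + true_effect * treated + rng.normal(0.0, 10.0, n)

# With a credible control group, the difference in group means estimates
# the average effect of the program, net of everything else.
ate = outcome[treated].mean() - outcome[~treated].mean()
print(f"estimated average treatment effect: {ate:.1f}")
```

The design requirement discussed below follows from this: the control group must be identified and preserved at the outset, since it cannot be reconstructed after the fact.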
Still, much more could be done. Cadot et al. (2011) suggest ways of conducting "quasi-experiments" that circumvent the strictures of classical randomized approaches through so-called matching and difference-in-differences methods. Examples include Brenton and von Uexkull (2009), who used a difference-in-differences method to examine the effects of 88 export development programs in 48 countries, finding that, on average, such programs coincided with or predated stronger export performance. Other examples of this method are Volpe and Carballo's (2008) evaluation of export promotion programs in six Latin American countries, which found positive impacts, and Jaud and Cadot's (2011) evaluation of the EU's Pesticides Initiative Program in Senegal, which found none.
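The difference-in-differences idea used in these studies can be sketched as follows, again with invented numbers rather than the actual data of the papers cited. Treated firms are observed before and after joining a hypothetical export-development program; subtracting the control group's change removes the common time trend that a naive before-after comparison would wrongly attribute to the program.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical two-period export data (all figures invented): the first
# half of firms enrol in an export-development program between periods.
n = 500
treated = np.repeat([True, False], n)

pre = rng.normal(50.0, 5.0, 2 * n)           # exports before the program
trend, true_effect = 4.0, 3.0                # common shock vs. program effect
post = pre + trend + true_effect * treated + rng.normal(0.0, 2.0, 2 * n)

# Difference-in-differences: the common trend cancels out, isolating the
# program effect; a naive before-after comparison conflates the two.
did = (post[treated] - pre[treated]).mean() - (post[~treated] - pre[~treated]).mean()
before_after = (post[treated] - pre[treated]).mean()
print(f"DiD estimate: {did:.2f}; naive before-after: {before_after:.2f}")
```

The gap between the two printed estimates is exactly the common trend, which is why before-after comparisons of the kind criticized above tend to overstate program effects in good times.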
Convincing assessments of the effectiveness of AfT are unlikely to come from any single approach. For a given country, elements of program and sectoral evaluations, ideally combined with cross-country econometric analysis, can begin to provide a more complete picture.
That said, one lens of the prism is woefully under-utilized: impact evaluations of AfT projects. These are the best way to link interventions with outcomes and poverty impacts, yet trade lags behind other fields in applying them. The basic challenges of this approach are twofold: first, mobilizing the necessary financing; and second, ensuring that projects are designed for evaluation at the outset, by identifying and preserving credible control groups. To respond to the first challenge, Aaditya Mattoo (3) suggested that one way to minimize cost is to fund separately the fixed costs of a core team of specialist evaluators, perhaps located in one agency such as the OECD or World Bank, and then finance the marginal survey and data work necessary for each project as a project component, with the specialist team helping to guide or undertake the analysis. To respond to the second challenge, designs that build in quantitative benchmarks of initial conditions and other evaluation elements must become the norm rather than the exception, to overcome the negative incentives facing project managers (who often see impact evaluations as sources of bad news with little upside potential). Crucially, evaluation must also become part of donor dialogue with countries, so that the evaluation culture is built into government interventions rather than just donor practice.
In that spirit, perhaps technical assistance to augment the evaluation capacities of governments in developing countries themselves could well be a next valuable step in aid for trade.
Olivier Cadot is Professor of Economics, University of Lausanne and Richard Newfarmer is Country Director for Rwanda for the International Growth Centre and Senior Fellow at the World Trade Institute in Bern.
See related article: From Market Access to Accessing the Market: Aid for Trade and the Program of the World. Elisa Gamberoni and Richard Newfarmer. Trade Negotiations Insights, Vol. 8, No. 9, November 2009.
Brenton, P. and von Uexkull, E. (2009). "Product specific technical assistance for exports - has it been effective?" The Journal of International Trade and Economic Development: An International and Comparative Review, 18(2), 235-254.
Cadot, O., A. Fernandes, J. Gourdon and A. Mattoo (2011), "Impact Evaluation of Trade Assistance: Paving the Way", in O. Cadot, A. Fernandes, J. Gourdon and A. Mattoo, eds., Where to Spend the Next Million? Impact Evaluation of Trade Interventions, World Bank/CEPR, forthcoming.
Commonwealth Secretariat (2011), "Assessing the Effectiveness of Aid for Trade", Case Story Global 34, OECD/WTO database of Case Stories of Aid for Trade.
Gamberoni, E., and Richard Newfarmer, 2009. "Aid for Trade: Matching Supply and Demand," World Bank Policy Research Working Paper 4991; see also Gamberoni and Newfarmer (2011) "Aid for Trade: Who Gets It, Who Should Get it?" for a more sophisticated econometric treatment.
OECD/WTO (2011), Aid for Trade at a Glance: Showing Results. Report for the Third WTO Global Review of Aid for Trade. Paris: OECD.
World Bank Group (2009), Unlocking Global Opportunities: The World Bank's Program of Aid for Trade. Washington: World Bank.
WTO (2006), Recommendations of the Task Force on Aid for Trade. WT/AFT/1.
(1) OECD/WTO (2011).
(2) USAID (2010), "From Aid To Trade: Delivering Results - A Cross-Country Evaluation of USAID Trade Capacity Building". Washington: USAID.
(3) World Bank.