Resources

General Regression Discontinuity (RD) resources:

Key papers:

Hahn, Todd, and van der Klaauw (2001) "Identification and Estimation of Treatment Effects with a Regression-Discontinuity Design" Econometrica

(Primary resource on identification in RD designs)

Imbens and Lemieux (2008) "Regression Discontinuity Designs: A Guide to Practice" Journal of Econometrics

(Primary resource for estimation in RD designs. Somewhat more technical than Lee and Lemieux (2010); see below)

Lee and Lemieux (2010) "Regression Discontinuity Designs in Economics" Journal of Economic Literature

(Very comprehensive guide to RD for applied work. The "cookbook/recipe" for RD designs)

Bandwidth Selection:

Imbens and Kalyanaraman (2012) "Optimal Bandwidth Choice for the Regression Discontinuity Estimator" Review of Economic Studies

(Other methods exist; see Lee and Lemieux (2010) p. 319 for discussion)

Two sources for code:

1. See Stata and Matlab code here (Code from Imbens' software page. Only one choice of kernel: triangular)

2. See Stata help file here (Code can be installed directly from Stata. Allows two choices of kernel)
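
To make concrete what these packages estimate once a bandwidth has been chosen, here is a minimal Python sketch of a sharp RD estimate by local linear regression with a triangular kernel. It is not the code from either source above, and it does not implement the Imbens and Kalyanaraman bandwidth selector: the bandwidth is simply passed in. The function names, the simulated data, and the bandwidth of 0.5 are illustrative assumptions.

```python
import numpy as np


def triangular_kernel(u):
    """Triangular kernel: weight 1 - |u| inside [-1, 1], zero outside."""
    return np.clip(1.0 - np.abs(u), 0.0, None)


def local_linear_rd(x, y, cutoff=0.0, bandwidth=1.0):
    """Sharp RD estimate of the jump in E[y | x] at the cutoff.

    Runs one kernel-weighted least squares regression of y on
    [1, D, (x - c), D * (x - c)] with D = 1{x >= c}, so each side of
    the cutoff gets its own intercept and slope. The coefficient on D
    is the estimated discontinuity.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - cutoff
    w = triangular_kernel(xc / bandwidth)

    # Observations outside the bandwidth receive zero weight; drop them.
    keep = w > 0
    xc, y, w = xc[keep], y[keep], w[keep]

    d = (xc >= 0).astype(float)
    X = np.column_stack([np.ones_like(xc), d, xc, d * xc])

    # Weighted least squares via OLS on sqrt(weight)-scaled data.
    sw = np.sqrt(w)
    beta, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return float(beta[1])


if __name__ == "__main__":
    # Simulated example (hypothetical data): the true jump is 2.0 by construction.
    rng = np.random.default_rng(0)
    x = rng.uniform(-1, 1, 2000)
    y = 1.0 + 0.5 * x + 2.0 * (x >= 0) + rng.normal(0, 0.3, 2000)
    print(local_linear_rd(x, y, cutoff=0.0, bandwidth=0.5))
```

The sketch returns only the point estimate. Standard errors and a data-driven bandwidth are what the linked packages add on top of this step.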

Testing endogenous sorting across treatment boundary:

McCrary (2008) "Manipulation of the Running Variable in the Regression Discontinuity Design: A Density Test" Journal of Econometrics

(Test for breaks in the density of the forcing variable)

1. See Stata code and description here
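
The idea behind the density test can be sketched in a few lines of Python: bin the forcing variable so that no bin straddles the cutoff, smooth the bin heights by local linear regression on each side, and compare the two fitted densities at the cutoff. This is a simplified illustration and not McCrary's Stata code. It returns only the log difference in densities and leaves out the standard error and the automatic choice of bin width and bandwidth. The function name and default values are assumptions made for the example.

```python
import numpy as np


def density_log_jump(r, cutoff=0.0, bin_width=0.1, bandwidth=1.0):
    """Log difference in the estimated density just right vs. just left of the cutoff."""
    r = np.asarray(r, dtype=float) - cutoff
    n = r.size

    # Step 1: histogram with bin edges at integer multiples of bin_width,
    # so zero (the cutoff) is always an edge and no bin straddles it.
    k_lo = int(np.floor(r.min() / bin_width))
    k_hi = int(np.floor(r.max() / bin_width)) + 1
    edges = (k_lo + np.arange(k_hi - k_lo + 1)) * bin_width
    counts, _ = np.histogram(r, bins=edges)
    mids = 0.5 * (edges[:-1] + edges[1:])
    heights = counts / (n * bin_width)  # normalized histogram heights

    # Step 2: triangular-kernel local linear fit of heights on bin
    # midpoints, separately on each side, evaluated at the cutoff.
    def fit_at_cutoff(side):
        m, h = mids[side], heights[side]
        w = np.clip(1.0 - np.abs(m) / bandwidth, 0.0, None)
        keep = w > 0
        m, h, w = m[keep], h[keep], w[keep]
        X = np.column_stack([np.ones_like(m), m])
        sw = np.sqrt(w)
        coef, *_ = np.linalg.lstsq(X * sw[:, None], h * sw, rcond=None)
        return coef[0]  # intercept = fitted density at the cutoff

    f_left = fit_at_cutoff(mids < 0)
    f_right = fit_at_cutoff(mids > 0)
    if f_left <= 0 or f_right <= 0:
        raise ValueError("fitted density is not positive; try a different bandwidth")
    return float(np.log(f_right) - np.log(f_left))
```

A value near zero is consistent with no sorting around the threshold. A formal test also needs the standard error of this quantity, which the linked code provides.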

Spatial Regression Discontinuity (RD) resources:

Papers illustrating method (Conditional Treatment Effects)

Imbens and Zajonc (2011) "Regression Discontinuity Design with Multiple Forcing Variables" Working Paper

(Available in Chapter 2 of Tristan Zajonc's dissertation)

Keele and Titiunik (2015) "Geographic Boundaries as Regression Discontinuities" Political Analysis

(Applied to geographic boundaries)

Spatially correlated standard errors:

Conley (1999) "GMM Estimation with Cross Sectional Dependence" Journal of Econometrics

Three sources for code:

1. Stata code and description here (Tends to be slow, does not allow for covariates)

2. Stata and Matlab code here (Faster than source 1, allows for covariates and serially correlated errors)

3. R code and description here (Not familiar with this one)
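
For readers who want to see what a spatially correlated standard error involves, here is a minimal Python sketch of OLS with a spatial HAC variance. It is not taken from any of the three sources above. It weights pairs of observations by a kernel in straight-line distance, which is a common simplification in applied code, and it ignores serial correlation. The function name, the Euclidean distance metric, and the kernel options are assumptions made for the example.

```python
import numpy as np


def conley_se(X, y, coords, cutoff, kernel="bartlett"):
    """OLS coefficients with spatial HAC standard errors.

    X      : (n, k) regressor matrix (include a column of ones for the intercept)
    y      : (n,) outcome
    coords : (n, 2) locations, in the same units as cutoff
    cutoff : distance beyond which two observations get zero weight
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    coords = np.asarray(coords, dtype=float)

    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta

    # Pairwise Euclidean distances (assumes planar coordinates; for
    # latitude/longitude a great-circle distance would be needed).
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))

    if kernel == "bartlett":
        K = np.clip(1.0 - dist / cutoff, 0.0, None)  # weight declines linearly to zero
    elif kernel == "uniform":
        K = (dist <= cutoff).astype(float)  # equal weight inside the cutoff
    else:
        raise ValueError("kernel must be 'bartlett' or 'uniform'")

    # Sandwich estimator: the "meat" sums x_i e_i e_j x_j' over nearby pairs.
    score = X * resid[:, None]
    meat = score.T @ K @ score
    V = XtX_inv @ meat @ XtX_inv

    # Note: V is not guaranteed to be positive semi-definite for every
    # kernel and cutoff, in which case the square root below yields NaN.
    return beta, np.sqrt(np.diag(V))
```

When the cutoff is smaller than the distance between any two observations, only each observation's own term survives and the result reduces to the usual heteroskedasticity-robust (White) standard errors, which makes a convenient sanity check.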


ArcGIS resources for Economists:

Learning resources oriented to economists:

Melissa Dell's class notes: GIS Analysis for Applied Economists

(Very comprehensive, good reference for definitions of ArcGIS tools, packages, etc., no exercises to replicate)

Masayuki Kudamatsu's class notes: ArcGIS 10 for Applied Microeconometric Research

(Set of 7 lectures with replication exercises and corresponding data for replication, great learning resource)

Geocoding resources:

Great Stata program to geocode addresses using Google's geocoder tool. Daily limit of 2,500 addresses.

(Documentation here)

Installation requires three steps:

ssc install geocode3

ssc install insheetjson

ssc install libjson


Data sources (GIS, corruption, etc.):

GIS data sources:

Geocommons (Largest source of GIS data. Most datasets are user-submitted, so be careful about data quality)

Elevation data (Shuttle Radar Topography Mission (SRTM) data covering the whole world)

Climate data (Source for historic and current data on temperature (min, mean, max) and precipitation)

Population data (NASA's Socioeconomic Data and Applications Center gridded historic and current population data)

Roads and streets (Street network extracts from OpenStreetMap for major cities around the world)

Land Cover (Set of land cover datasets collected by the USGS Land Cover Institute)

Conflict data (Armed Conflict Location and Event Data Project. Conflict data for over 60 developing countries)

Gazetteers (Great source for geocoding data)

Corruption data sources:

Afghanistan (Electoral fraud data at the polling station level (2009). Election related data (turnout, etc.) for other years)

Country-level (Transparency International's Corruption Perceptions Index. Some issues (see Olken, 2009))

Firm-level (BEEPS, the Business Environment and Enterprise Performance Survey. Data on whether firms have paid bribes for government services)

Afghanistan data sources:

Election data (Afghanistan's Independent Election Commission. Detailed election-related data at polling station level)

Fraud data (Electoral fraud data at the polling station level (2009). Election related data (turnout, etc.) for other years)

Demography (MISTI Project. Data on population, language spoken, etc. for more than 40,000 Afghan villages and cities)

Conflict and Violence data sources:

ESOC (Princeton's Empirical Studies of Conflict. Great source for spatial data on Afghanistan and other conflict zones)

Infant and Perinatal mortality data sources:

EURO-PERISTAT (Great source for data on perinatal mortality and live births that are comparable across 25 European countries)