Real Estate Research provided analysis of topical research and current issues in the fields of housing and real estate economics. Authors for the blog included the Atlanta Fed's Jessica Dill, Kristopher Gerardi, Carl Hudson, and analysts, as well as the Boston Fed's Christopher Foote and Paul Willen.

In December 2020, content from Real Estate Research became part of Policy Hub. Future articles will be released in Policy Hub: Macroblog.

September 18, 2014

The Economic Effects of Urban Renewal

Editor's note: An earlier version of this post inadvertently included a paragraph from last week's post. The corrected post is below, and we apologize for the oversight.

This year, the 50th anniversary of the "War on Poverty," has seen an effort in the news media and among policy commentators to review the success and failure of past efforts to address poverty (see, for example, this, this, this, and this). Some of these efforts have included place-based policies such as the Model Cities program, which attempted to improve housing stock and reduce urban blight at the neighborhood level. In part, this renewed interest is policy-relevant: many cities are struggling with blight in the wake of the foreclosure crisis, and place-based policy has returned to popularity. For these reasons and more, I was quite interested to read a recent article in the American Economic Journal: Applied Economics. "Slum Clearance and Urban Renewal in the United States" by William J. Collins and Katherine L. Shester revisits the topic of urban renewal programs in the latter part of the last century.

The set of policies loosely referred to as "urban renewal" has been controversial since implementation. In fact, the programs changed a lot from 1950 to 1974, largely in reaction to the outraged response and perceived failures of early efforts. Title I of the 1949 Housing Act, which focused on "slum clearance," was a precursor to the 1954 Housing Act, which shifted the emphasis away from demolition and towards rehabilitation and preservation. Later legislation added programs to smooth the relocation process for those who were displaced by Title I programs and to direct resources towards the elderly poor. Throughout the 1960s, policy shifted away from changing the quality of housing stock towards creating a suite of policies focused on healthy communities. In 1965, as a result of a major reorganization, the Housing and Home Finance Agency, which had administered Title I, became the Department of Housing and Urban Development, commonly known as HUD. Finally, in 1968, the Fair Housing Act passed, further affecting the dispersal of funds.

In the early sixties, Jane Jacobs was one of the more famous critics of the destruction of historic neighborhoods and reconstruction along rationalist, modernist lines. In her 1961 classic, The Death and Life of Great American Cities, she argued that cities embodied organized complexity and that so-called "disorderly" slums were better than the rationally planned spaces that displaced them, both economically and socially. Other research on urban renewal has focused on political, social, and legal implications. This line of inquiry has focused on the impact of eminent domain on property rights, aesthetic concerns about how to incorporate historic preservation into revitalization, and concerns of justice and equity, primarily the issue that urban renewal placed the burden of displacement and disruption onto poor and minority residents without due consultation or compensation (see Gans 1962, Gotham 2001, Jacobs 1961).

The 2013 Collins and Shester paper cites this literature, but is distinct from it in its quantitative, nationwide study of economic impacts. It evaluates the effect of a series of programs over a 30-year period across 458 cities, and calculates that effect on broad economic outcomes. The authors measure urban renewal by combining the dollars allocated under the various programs implemented between 1950 and 1974. They evaluate the combined effect of these programs using a regression model. This model estimates the impact of federal dollars spent on the change in economic health of each city between 1950 and 1980. Using census-region fixed effects, the authors evaluate the impact of expenditures on median income, median property values, the employment rate, and the percentage of people living in poverty.

The authors' first-stage findings show that federal dollars spent on urban renewal projects between 1950 and 1974 had a negative effect on various economic outcomes. However, Collins and Shester suspect there is endogeneity in the relationship they are trying to uncover. That is, they say we cannot be sure what causes what: did urban renewal cause economic growth or decline, or did blighted cities pursue more urban renewal? In the latter case, even if the program improved the economy, these cities might still be doing more poorly than cities that had no blight to begin with.

The authors deal with endogeneity using an instrumental variable approach. That is, they seek to use exogenous variation in the allocation of federal funds. The variable they use is the year in which a state passed enabling legislation that made these sorts of projects legal. At first glance, this isn't a great instrument. Instrumental variables have to meet what's called the "exclusion restriction" to be credible. That restriction is untestable; you have to evaluate the claim on its merits. So, for us to believe this instrument delivers credible results, we have to be convinced that a state's decision to pass enabling legislation affects economic outcomes only through its influence on urban renewal expenditures. There can't be any other chain of effects or related issues connecting those two events, the instrument and the outcome.

Collins and Shester perform several tests to justify their instrument. First, they look just at the effect of the instrument in places where court cases affected the timing of the laws' passage. Then they perform a test of known effects to see whether their model predicts economic growth in rural areas where urban renewal was not pursued. Finally, they use an alternate specification of the instrument. The instrument holds up under these examinations.

The authors then use their instrument to predict the urban renewal funds distributed, and then use that predicted value in the original model. In this specification, urban renewal dollars have a strong positive effect on income and property values. These findings are consistent across several specifications and robustness checks. Furthermore, they find no effect on employment or poverty rates, leading them to posit that the positive effects they observe were not generated by displacement of poorer residents from inner cities. Taken together, these results suggest that urban renewal programs created positive growth in average wages and property values.
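The two-step procedure described above can be sketched on simulated data. This is only an illustration, not the authors' model or data: the variable names (`blight`, `timing`, `funding`, `growth`), the coefficients, and the sample size are all assumptions chosen so that unobserved blight pushes the naive estimate below zero even though the true effect of funding is positive.

```python
import random

def ols(x, y):
    """Return (intercept, slope) from a least-squares fit of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def simulate_cities(n=5000, true_effect=1.0, seed=0):
    """Hypothetical cities: blight raises funding but lowers growth."""
    rng = random.Random(seed)
    z, x, y = [], [], []
    for _ in range(n):
        blight = rng.gauss(0, 1)   # unobserved confounder
        timing = rng.gauss(0, 1)   # instrument, independent of blight by construction
        funding = 0.8 * timing + blight + rng.gauss(0, 1)
        growth = true_effect * funding - 4.0 * blight + rng.gauss(0, 1)
        z.append(timing)
        x.append(funding)
        y.append(growth)
    return z, x, y

def two_stage_least_squares(z, x, y):
    """Stage 1: predict x from z. Stage 2: regress y on the prediction."""
    a0, a1 = ols(z, x)
    x_hat = [a0 + a1 * zi for zi in z]
    _, beta = ols(x_hat, y)
    return beta

z, x, y = simulate_cities()
_, naive_slope = ols(x, y)                   # contaminated by blight
iv_slope = two_stage_least_squares(z, x, y)  # uses only instrument-driven variation
print(f"naive OLS slope: {naive_slope:.2f}")
print(f"2SLS slope:      {iv_slope:.2f} (true effect is 1.0)")
```

In this sketch the instrument is valid by construction, because `timing` never enters the `growth` equation except through `funding`. In real data that is exactly the assumption that cannot be tested.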

A concern is that these conclusions rest on the credibility of the instrumental variable, and I'm not sure that the instrumental variable meets the exclusion restriction. I also wonder whether the average effects might reflect underlying variation in the effect of individual programs in urban renewal as well as different contexts where the program was applied. A map of the instrument (below) shows that it has a strong spatial component. Of the 458 cities that the authors measured in 1950–80, 68 percent of the cities, or 311, were in states that passed enabling legislation immediately. States in the Northeast, Midwest, and West pursued urban renewal programs immediately. These states were the most industrialized parts of the country; they experienced sectoral change and decline of their manufacturing centers. The more agricultural, conservative areas of the country pursued funds relatively later, and received funds under later programs.

Source: Collins and Shester 2014, author's calculations

This makes me wonder whether there is too little variation in the instrument among the manufacturing states, so that the instrumental variable down-weights these cases and in essence provides a regional estimate. Looking at the first-stage results within each census region, we find that the results vary by region. For heavily industrial regions (the Mid-Atlantic, East North Central, and East South Central), urban renewal funding had a negative effect on growth. The other regions show a positive relationship between urban renewal and economic growth.

There is also inconsistency in the second-stage, or instrumented, results within each region. The two regions in the Midwest, stretching from Wisconsin to New York, drop out as there is no variation. The regions on the eastern half of the nation show a positive effect, while those in the West show a negative effect.

Collins and Shester want to evaluate the treatment effect of urban renewal dollars by creating as-if-random variation in the administration of urban renewal funds. But if we aren't convinced that the instrument meets the exclusion restriction, or that the policy is having a constant effect, then what can we make of the results generated by this instrumental variable? We might surmise that the instrument is telling us something about the impact of the program in the subset of cities where the instrumental variable generates variation. If we believe that the study design can actually capture the effects of urban renewal, we might think of these new estimates as telling us the average effect of later urban renewal projects in 158 cities in the South and rural West, and not so much the effect of the program in the 311 cities where urban renewal was most intensively pursued.
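To see why an instrument speaks only for the places where it varies, consider a rough simulation. Everything here is assumed for illustration: two hypothetical groups of cities, a true effect of -1 in the "early" group where the instrument is constant, a true effect of +1 in the "late" group where it varies, and group fixed effects implemented by demeaning within each group.

```python
import random

def demean(values):
    """Subtract the group mean, which is what a group fixed effect does."""
    m = sum(values) / len(values)
    return [v - m for v in values]

def wald_iv(z, x, y):
    """Single-instrument IV slope, cov(z, y) / cov(z, x), for demeaned inputs."""
    return sum(a * b for a, b in zip(z, y)) / sum(a * b for a, b in zip(z, x))

def simulate_group(n, effect, z_varies, rng):
    """Hypothetical cities sharing one true effect of funding on growth."""
    z, x, y = [], [], []
    for _ in range(n):
        zi = rng.gauss(0, 1) if z_varies else 0.0  # instrument
        u = rng.gauss(0, 1)                        # unobserved confounder
        xi = zi + u + rng.gauss(0, 1)              # funding
        yi = effect * xi - 2.0 * u + rng.gauss(0, 1)
        z.append(zi)
        x.append(xi)
        y.append(yi)
    return demean(z), demean(x), demean(y)

rng = random.Random(1)
early = simulate_group(3000, effect=-1.0, z_varies=False, rng=rng)
late = simulate_group(1500, effect=1.0, z_varies=True, rng=rng)

# Pool both groups after removing group means.
z = early[0] + late[0]
x = early[1] + late[1]
y = early[2] + late[2]

pooled_iv = wald_iv(z, x, y)
print(f"pooled IV estimate: {pooled_iv:.2f}")
```

Two-thirds of the simulated cities have a true effect of -1, yet the pooled estimate lands near +1, because the early group contributes nothing to either covariance. Under these assumptions the IV estimate describes the late group only.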

By Elora Raymond, graduate research assistant, Center for Real Estate Analytics in the Atlanta Fed's research department, and doctoral student, School of City and Regional Planning at Georgia Institute of Technology

February 14, 2011

New study claims to solve the econometric problem of the link between foreclosure and house prices

Many policymakers are now concerned about how the next wave of foreclosures will affect the housing market. Analysts have cited a large "shadow inventory" of homes, referring to the mass of delinquent mortgages that have yet to make their way through the foreclosure process. When these foreclosures occur, they could raise the number of homes for sale and put downward pressure on house prices. They could also impose negative externalities on other homes in the same neighborhoods, sending house prices even lower. (We recently blogged about the so-called contagion effects of foreclosures on surrounding properties.)

These potential effects seem intuitive, but measuring them is not easy. The main problem is what economists call "simultaneity." Foreclosures lead to an increased supply of homes for sale, which can lower prices—but lower prices also increase the probability that borrowers have negative equity, which can lead to foreclosure. Thus, there is simultaneous causality: foreclosures can reduce prices, and lower prices can cause the negative equity that leads to foreclosure. As a result, simply showing a correlation between foreclosures and falling house prices is not sufficient to measure—or even establish—a causal effect of foreclosures on prices.
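A small simulation makes the simultaneity problem concrete. The setup is hypothetical: a two-equation system in which lower prices raise foreclosures, while the causal effect of foreclosures on prices is set to exactly zero. The parameter names and magnitudes are assumptions for illustration only.

```python
import random

def slope(x, y):
    """Least-squares slope from regressing y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx

def simulate_markets(n=5000, price_to_foreclosure=-1.0,
                     foreclosure_to_price=0.0, seed=0):
    """Solve f = a*p + u and p = b*f + v jointly for each market."""
    rng = random.Random(seed)
    a, b = price_to_foreclosure, foreclosure_to_price
    denom = 1.0 - a * b
    foreclosures, prices = [], []
    for _ in range(n):
        u = rng.gauss(0, 1)  # foreclosure shock
        v = rng.gauss(0, 1)  # housing demand shock
        foreclosures.append((u + a * v) / denom)
        prices.append((v + b * u) / denom)
    return foreclosures, prices

foreclosures, prices = simulate_markets()
estimated = slope(foreclosures, prices)
print(f"true effect of foreclosures on prices: 0.0")
print(f"slope from regressing prices on foreclosures: {estimated:.2f}")
```

Even though foreclosures have no effect on prices in this sketch, the regression slope is clearly negative, since demand shocks move prices and foreclosures in opposite directions.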

A new study by Atif Mian, Amir Sufi, and Francesco Trebbi claims to have solved this econometric problem. Their paper reports a substantial causal impact of foreclosures on not only house prices, but also residential investment and automobile purchases. However, the authors make a major data error that, in our opinion, invalidates a large part of their analysis. In addition, there are important conceptual issues that raise deep questions about their identification strategy, even if it is possible to correct the data error.

Can simultaneity be solved by classifying states as judicial, nonjudicial?
The authors attack the simultaneity problem with a classic method: they use differences in state laws as an instrumental variable. The essential idea is that states vary randomly as to whether they are judicial or nonjudicial. Judicial states are typically characterized by longer foreclosure durations, since the mortgage servicer must navigate through the legal system to get court approval, which usually entails a significant amount of time (see Pennington-Cross 2010 for a nice discussion). If the judicial/nonjudicial classification is random with respect to the health of state-level housing markets, then state laws will generate random variation in the number of foreclosures across states. Under these assumptions, using the classification as an instrument yields consistent estimates of the effect of foreclosures on house prices.

Of course, the classification of states into judicial and nonjudicial groups may not be random. It turns out that there is a strong regional component to this classification. Figure 3 in the Mian-Sufi-Trebbi paper shows that states in the Northeast and Midwest tend to be judicial, while the states in the South and West are mostly nonjudicial. It's no secret that problems in the U.S. housing market also have a strong regional character, with housing markets in Arizona, California, Florida, and Nevada (all located in the South and West) in particularly bad shape.

One way to check for the possibility of confounding effects across the two classifications of states is to compare their observable variables. The authors do this, and then claim that "states with a judicial foreclosure requirement are remarkably similar to other states in all attributes of interest except the propensity to foreclose" (p.3). But eyeballing their Figure 3 should give a reader pause. Nevada and Arizona, which are nonjudicial states, include the number one and two MSAs for new construction and for house price appreciation in the two years prior to the collapse of the mortgage market.1

Cross-state differences challenge regressions
Regional patterns in both state laws and housing markets cause problems for the authors' identification strategy. If we find that foreclosures tend to be more frequent in the nonjudicial states, this might be because foreclosing on delinquent homeowners is easier in those states, as the authors' identification strategy assumes. But high foreclosure rates in the nonjudicial states could also stem from negative shocks to housing demand in the parts of the country where the nonjudicial states happen to be located. Consequently, if we find that housing prices are lower and foreclosure rates are higher in nonjudicial states, then we can't be sure what's causing what. The high foreclosure rates could be causing the falling prices, as the authors claim. But it could also be true that low regional demand and falling prices in the South and West are causing the high foreclosure rate, which is the very possibility that the authors were hoping to rule out.

The authors recognize that unobserved cross-state differences make the state-level experimental approach problematic, so they propose an alternative set of regressions that are not subject to such criticism. In addition to estimating the first set of regressions (which, in the manner described above, uses all the states in the country), they estimate a second set that includes only ZIP codes adjacent to borders between judicial and nonjudicial states. The idea is that while unobserved heterogeneity across states could potentially invalidate the first set of regressions, this heterogeneity is less likely to be a problem in the second. In other words, the housing market in Arizona may differ markedly from the housing market in Maine, and not just because Arizona is a nonjudicial state while Maine is judicial.

However, the ZIP codes just north of the Massachusetts-Rhode Island border are likely to have similar housing markets to the ZIP codes that are just south of this border. So, if the border ZIP codes in Massachusetts, which the authors label a judicial state, are experiencing higher foreclosures than the border ZIP codes in Rhode Island, a nonjudicial state, then differences in the two states' laws, and not unobserved differences in demand, are probably the reason why. And if the state laws are generating random variation in foreclosures, then the authors claim that this variation can be used to get a clean estimate of the causal effect of foreclosures on housing prices.

Problems in the data: Massachusetts, Wisconsin are misclassified
The authors find similar results in both sets of regressions. This similarity gives them some confidence that they have truly pinned down the direct effect of foreclosures on other economic outcomes. But here's where the data error comes in: the authors make a mistake in classifying at least two states as judicial or nonjudicial, which has major implications for their results. Specifically, they misclassify Massachusetts as judicial and Wisconsin as nonjudicial.2 Most sources, including the National Consumer Law Center (NCLC), reverse those classifications.

(For readers interested in the gory details, we show that for Massachusetts, there is no question that the NCLC is right.)

While the misclassification of two out of 50 states may seem minor, it turns out that Wisconsin and Massachusetts dominate the samples for the "border discontinuity" regressions. As the table shows, depending on the sample, using the alternative classification from the NCLC invalidates between 58 and 78 percent of the ZIP codes the authors use. Consider the sample that uses ZIP codes in 5-mile bands around state borders. Because it uses homes closest to state borders, this sample is least susceptible to unobservable differences between geographic areas, although we argue below that even 5-mile bands are inadequate to obtain clean identification. In this sample, classifying Massachusetts—correctly—as nonjudicial eliminates 70 percent of the comparisons.3

One response to this criticism would be to reclassify the states correctly and then reestimate both sets of regressions. The problem for the border regressions is that Massachusetts's and Wisconsin's borders with judicial and nonjudicial states, respectively, are sparsely populated and do not meet the authors' criteria for inclusion in the border sample. For example, farms and weekend homes make up most of the properties in border ZIP codes between western Massachusetts and southern Vermont.

Misclassification proves detrimental to the identification strategy
As the authors have written the paper, they claim to find big differences in ZIP-code-level outcomes based on the judicial/nonjudicial classification. However, they use regressions with the wrong classification for most of the comparisons. If the identification strategy worked as the authors had hoped, their regressions would have implied that there are no important differences on either side of most judicial/nonjudicial borders because these borders in fact separated states with similar laws. However, because the regressions instead reported significant differences, some other important sources of heterogeneity across the state lines must exist—and if the authors can't control for heterogeneity across, say, the Massachusetts–Rhode Island border, the reader can't be expected to have confidence in their ability to control for unobserved differences between Massachusetts and Nevada.

Another way of putting this is that the authors have inadvertently performed and failed a falsification, or placebo, test on their data. They estimated their regressions on a sample of borders that are, for the most part, not characterized by differences in foreclosure laws, at least in terms of the judicial/nonjudicial classification, and found large effects where they should have found none. In our opinion, this is very strong evidence against their claim that judicial/nonjudicial foreclosure laws are a valid instrument for foreclosure rates. Even if the authors correctly reclassify the states and reestimate the IV regressions for the border sample, this failed falsification test still casts doubt on the entire empirical strategy.
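The logic of a placebo test on border pairs can be sketched as follows. This is a toy simulation, not the authors' data or specification: the number of pairs, the `hidden_gap` standing in for unobserved regional differences, and the 1.96 cutoff are all assumptions made for illustration.

```python
import math
import random

def mean_and_se(values):
    """Sample mean and its standard error."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean, math.sqrt(var / n)

def simulate_border_gaps(n_pairs, law_effect, hidden_gap, laws_differ, rng):
    """Outcome gap across each border: law effect (if any) plus unobserved gap."""
    gaps = []
    for _ in range(n_pairs):
        gap = hidden_gap + rng.gauss(0, 1)
        if laws_differ:
            gap += law_effect
        gaps.append(gap)
    return gaps

rng = random.Random(2)

# Placebo sample: the two sides of every border actually share the same law,
# so any systematic gap must come from something other than the law.
placebo_gaps = simulate_border_gaps(
    n_pairs=400, law_effect=-1.0, hidden_gap=-0.5, laws_differ=False, rng=rng
)
placebo_mean, placebo_se = mean_and_se(placebo_gaps)
t_stat = placebo_mean / placebo_se
placebo_failed = abs(t_stat) > 1.96  # a significant "effect" where none should exist

print(f"placebo gap: {placebo_mean:.2f} (t = {t_stat:.1f}), failed: {placebo_failed}")
```

A design that controlled for unobserved differences would produce a gap near zero on the placebo sample. A large, significant gap, as in this sketch, signals that something other than the law differs across the border.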

In addition to this primary critique, we also found some other important drawbacks in the analysis. For readers who are interested in learning more about these issues, here is a detailed discussion.

We remain unconvinced by the authors' claim that exogenous increases in foreclosures substantially reduce housing prices. The link between foreclosure and house prices is of first-order importance to policymakers, who struggle not only with the foreclosure problem itself but also with the potential effects of foreclosures on the economic recovery. However, the authors' research strategy is unlikely to be helpful in addressing these problems, given the deep conceptual issues it did not deal with and the poor data on which it is based.

Kris Gerardi
Research economist and assistant policy adviser at the Federal Reserve Bank of Atlanta


Paul Willen
Research economist and policy adviser at the Boston Fed

1 Moreover, one of the main stylized demographic facts about the United States in the last 50 years has been the spread of population south and west across the country. Indeed, for the past 25 years, population has consistently and steadily grown twice as fast in the states the authors identify as nonjudicial compared to the states they identify as judicial.

2 Arguably, the authors misclassify as many as six states: the two listed plus Maryland, Nebraska, New Mexico, and Iowa. However, as we explain below, it's the misclassification of Massachusetts and Wisconsin that dramatically affects their results.

3 The authors are aware that there are alternative classifications but view the discrepancies as minimal, relegating the following comment to a footnote: "The only states that differ across these three classifications are Massachusetts, Nebraska, Oklahoma, Rhode Island, and Wisconsin." It is unclear whether they were aware that two of those states accounted for most of their border sample and that their border sample specification was not robust to the alternatives.