coronavirus - Building the Skyline

February 25, 2021 by Jason Barr

The Pandemic Tsunami: How COVID-19 Swept Across America

Jason M. Barr and Troy Tassier February 25, 2021

The Great Equalizer?

In March 2020, during the salad days of the COVID-19 pandemic in the U.S., many people, from Madonna to New York Governor Andrew Cuomo, believed the virus was a “great equalizer,” hitting rich as likely as poor, and White as likely as Black or Hispanic.

In short order, however, it became clear that these pronouncements were wrong; people of color and the poorer members of society were hit at rates exceeding their share of the population, particularly in the large cities of the Northeast as well as New Orleans and Detroit. As the great equalizer myth faded, another belief rose in its place: that population density would determine the pandemic’s course. In March, few COVID-19 cases could be found in the sparsely populated rural interior, while New York City was flooded with new cases and deaths.

The Bronx is Burning

By mid-April, the State of New York alone had more cases than any country outside the U.S., with the majority in the New York City metropolitan region. Curiosity about what was unique about New York soon followed. The frequently repeated response: its density. New York is, by far, the densest city in the United States, with 28,000 residents per square mile (10,811 per km²); San Francisco is a distant second at 17,000 (6,564 per km²).

Further singling out New York was its extensive public transportation network. Its subway alone carries five million riders per day. As a comparison, Los Angeles’s subway and rail system takes two weeks to hit this threshold. New York was accused, as it were, of being New York—unique and apart from the heartland—which gave the rest of the country a false sense of security.

But soon that would change—cases spiked in the heartland and south during the summer, followed by a cascade of infections washing over urban, rural, and suburban places alike. As a new spring approaches with cases, hospitalizations, and deaths dropping rapidly, along with the limited arrival of vaccines, it seems as if the great waves of the pandemic are receding with the hope of calmer seas and possible herd immunity by summer.

Figure 1: The Spread of Coronavirus from January 22 to April 25, 2020. Note that cases are given per 100,000 residents. GIF created by Eon Kim, based on data from USAFacts.org.

The Early Role of Population Density

In the early weeks of the pandemic, few items were more discussed than the role of density. We took part in the discussion as well by offering an early critique of its importance. We used the metaphor of lightning striking cities before rural areas because large cities are hubs of tourism and international travel. Our argument was that while density was important, it was not the only determinant of the epidemic. New York and other cities were first because dense places are also central places in the global economy. In this case, people were confusing correlation with causation. The high transmissibility of coronavirus meant it was going to spread widely and soon; no region was safe.

Nonetheless, density should play a role in the spread of an epidemic. Density unchecked is the antithesis of social distancing. More people in tighter spaces lead to more contacts and more opportunities for infectious diseases to spread. It is not a question of whether density is important or not, but rather of its relative magnitude and how its impact may change over time. Others have also investigated its role.

Researchers at Johns Hopkins University performed the best-known study. Using data on 900 counties in U.S. metropolitan areas, they found that a city’s overall population size was correlated with more cases of coronavirus, but a city’s density (people per land area) was not. In other words, if a city has a large population, there will be more cases, on average, regardless of whether the people are sparsely or tightly packed together within the city. Some studies noted the role of density as a primary predictor of the epidemic in the U.S., while still others were more nuanced, claiming, as we did, that density matters but was not the sole factor in the epidemic’s spread. Studies from countries around the world have found mixed results as well (see here and here).

Month to Month

One common feature of the studies, however, is that they all use data from the early part of the pandemic, with analysis ending by the early summer of 2020 at the latest. But how has the virus spread since then, and what role did density play month after month over the past year? Did people of color and the poor continue to bear the brunt of the pandemic throughout the year, or did the burden shift over time?

To understand how and why the virus spread, we have performed an analysis covering the past 12 months. When we do this, we see the pandemic’s changing impact, with waves crashing on different groups at various times over the year. The result was that all groups were hit, but at unequal levels and at different times.

We expand on the studies discussed above by performing a statistical (regression) analysis that looks at the role of density, and other factors, up to February 1, 2021 (data sources and results here). We use county-level reported coronavirus cases from USA Facts. The goal is to see how the monthly increase in cases can be “explained” by density, race, and poverty. Our method allows us to perform a kind of “hot-spot analysis” to see which key drivers statistically explain coronavirus growth rates each month and how these drivers’ impacts evolved.

Our results give the percentage change in coronavirus cases associated with a 1% change in a variable. Economists call this the elasticity. Larger values mean that the variable of interest has a larger percentage impact on new cases. If a particular variable has an elasticity of 0.5, it means a 1% increase in that variable is associated with a 0.5% increase in coronavirus cases, on average. If the elasticity is one, then a 1% increase in that variable is associated with a 1% increase in coronavirus cases, on average.
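To make the idea of an elasticity concrete, here is a minimal sketch of how one can be estimated as the slope of a log-log regression. It is an illustration only, not our actual specification: the data are synthetic, there is a single explanatory variable, and the function name loglog_elasticity and the variable names are hypothetical.

```python
import math

def loglog_elasticity(x_values, y_values):
    """Ordinary least squares slope of ln(y) on ln(x).

    In a log-log regression the slope is the elasticity: the percent
    change in y associated with a 1% change in x.
    """
    log_x = [math.log(x) for x in x_values]
    log_y = [math.log(y) for y in y_values]
    n = len(log_x)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    cov_xy = sum((lx - mean_x) * (ly - mean_y) for lx, ly in zip(log_x, log_y))
    var_x = sum((lx - mean_x) ** 2 for lx in log_x)
    return cov_xy / var_x

# Synthetic, noise-free data (an assumption for illustration only):
# case growth is constructed with a known elasticity of 0.5 with
# respect to density.
density = [50.0, 200.0, 800.0, 3200.0, 12800.0]
case_growth = [3.0 * d ** 0.5 for d in density]

elasticity = loglog_elasticity(density, case_growth)

# Percent change in case growth implied by a 1% increase in density.
pct_change_for_1pct = (1.01 ** elasticity - 1.0) * 100.0

print(f"estimated elasticity: {elasticity:.3f}")
print(f"implied change for a 1% increase: {pct_change_for_1pct:.3f}%")
```

Because the synthetic data are built with an elasticity of 0.5, the regression recovers that value, and a 1% increase in density corresponds to roughly a 0.5% increase in case growth.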

The Density Effect

Figure 2 plots the monthly elasticity for density on the number of cases for the pandemic’s first twelve months. If we look at the effect of density throughout the epidemic, we see that it was most important in the initial months. The most significant impact of density appears during the second month because of where the epidemic was first located, in northeastern cities and other metropolitan centers like New Orleans and Detroit. As the epidemic continued, the role of density became less important.

Figure 2: The Density Waves. Each point is an elasticity: it gives the percent increase in coronavirus cases from the month prior associated with a 1% change in each variable. For example, on July 1, 2020, the elasticity for density was 0.2. This suggests that, on average, across U.S. counties, a county with a 10% higher density had a 2% higher coronavirus case increase. For household size, the points show the percent increase in coronavirus cases associated with one extra household member, on average. The graphs show that denser counties and those with larger average household sizes, by and large, experienced the worst coronavirus case loads from April to June 2020. Sources: See here.

Household Size

A second, more localized aspect of density is the average household size. Early in the pandemic, it was recognized as a key influence in New York City in academic research and the general press. Other studies noted the link more broadly across the U.S. and the world. Household size is more nuanced than the broader concept of population density and has its own avenues for increasing infections. The effect of household size relates to density within a home once an infection is present, the number of opportunities for an infection to enter the home, and the breadth of access to diverse places outside the home where infections are possible. In addition, household size, on average, is positively correlated with more density surrounding the home and with poverty.

The studies mentioned above, again, use data from early in the pandemic. If we look at the effect over time, we see that the impact of household size, or localized density, peaks in April when it becomes the largest factor in the U.S. epidemic. Like density, it also falls off after that point.

Race and Ethnicity

If we return to our elasticity measures, in Figure 3, we see a similar story for race and ethnicity. In the initial months, counties with larger proportions of Asian, Black, and Hispanic people were hit hardest, with this initial wave cresting by May. After that, the measure of elasticity for counties with larger minority populations began to decrease. By July, the elasticity measurement for percentage of Asian and Black residents within a county was approaching zero.

The elasticity for the percentage of Hispanic people within a county, however, decreased at a slower rate, reaching zero only by September. Counties with higher percentages of Native Americans had a slightly different path during the epidemic. After an initial peak in elasticity similar to other groups discussed above, counties with larger Native American populations had a second elasticity peak in August when the pandemic ravaged many areas in the southwest and the Dakotas, among other more rural regions.

Figure 3: Race and Ethnicity Waves. Each point is an elasticity: it gives the percent increase in coronavirus cases from the month prior associated with a 1% change in each variable. For example, on June 1, 2020, the elasticity for Hispanics was 0.24. This suggests that, on average, across U.S. counties, a county with a 10% higher fraction of Hispanic residents had a 2.4% higher coronavirus case increase. The graphs show that high Asian counties were hit hard first, then high Black and Hispanic counties after that. High Native American counties had two waves, one in the spring and one in the summer. Sources: See here.

The Poverty Wave

The second peak of cases in the summer is marked by an increase in the elasticity of county-level poverty rates. Initially, in the U.S. epidemic, impoverished counties had lower elasticities. But the effect of poverty within a county peaked in the month of July. Because there are several variables in our regressions that each pick up different aspects of the epidemic, it can be hard to sort out exactly what is happening. Because average household size within a county is strongly correlated with poverty in the county, there is also a poverty factor in the initial wave. But the “poverty wave” shown in Figure 4 is the movement of the pandemic in the early summer to areas with high rural poverty, particularly in the southern states.

The Cresting Wave

As the poverty wave receded, we entered a new wave in the fall. This third wave was bad for everyone, with lower elasticities for race, ethnicity, density, and poverty. This doesn’t mean that we have returned to the pandemic as “great equalizer,” as we make clear below. But we find that coronavirus infection rates are now more similar across the socioeconomic spectrum than in the early waves of the epidemic. Mortality rates, however, remain unequal.

Excess Mortality

To understand mortality rates, we used data created by the CDC. They calculate a statistic called excess mortality. It measures the number of seasonally adjusted deaths that occur above normal. Below we plot the excess mortality for each of the population groups over the course of 2020 as percentages above normal (epidemiologists call these the “p-scores”).^[1]
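As a rough illustration of what a p-score measures, the sketch below computes the percentage by which observed deaths exceed the expected (seasonally typical) number. The function name p_score and the death counts are hypothetical and chosen only to make the arithmetic easy to follow; they are not CDC figures, and the CDC's own estimation of expected deaths is more involved than this.

```python
def p_score(observed_deaths, expected_deaths):
    """Percent by which observed deaths exceed expected deaths.

    A value of 0 means deaths were at their typical level; 100 means
    deaths were double the typical level.
    """
    if expected_deaths <= 0:
        raise ValueError("expected_deaths must be positive")
    return 100.0 * (observed_deaths - expected_deaths) / expected_deaths

# Hypothetical counts for one group in one month (not real data).
example = p_score(observed_deaths=2200, expected_deaths=1000)
print(f"p-score: {example:.0f}% above normal")
```

With these made-up counts, 2,200 deaths against an expected 1,000 gives a p-score of 120, meaning deaths were 120% above normal, or 2.2 times the typical level.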

For example, in April, the percentage of deaths above typical for Asian, Black, and Hispanic Americans peaked slightly above 100%. Combining this figure with our elasticities paints a clearer picture of the waves. Early on, the initial waves hit dense cities, particularly areas with large numbers of ethnic and racial minority groups. As the initial wave receded, a second wave arrived in more rural areas of the country, mainly rural areas with larger percentages of Hispanic and Native American residents and regions that were poorer on average.

For the Hispanic population, the second wave was almost as severe as the initial wave, if measured by excess deaths. While our new-cases regressions paint a picture of the cresting wave, excess mortality continues to be higher for minority groups (excluding the Black population at least through November), particularly for the Hispanic and Native American populations (see Figure 5).

Figure 5: Excess Mortality. This graph shows, for each group, the percent above typical mortality rates for each month. For example, in April, the excess mortality rate due to COVID-19 for Hispanics was nearly 1.2, or 120% above normal. For nearly all months and minority groups, excess mortality rates were higher than those for whites.

Waves within Waves

The pandemic that struck the United States and the world over the past year has not washed over us equally, nor has it arrived in one constant wave. It has been a pattern of episodes, smaller waves within a big storm, that each have unique characteristics. Overall, the pandemic’s waves and idiosyncrasies have inflicted most harm on the more vulnerable in our society, but even there, the effects have changed over time.

Each wave has held its own unique characteristics and impacts on different groups and different regions. This seemingly ever-changing set of features is not surprising to epidemiologists, who are accustomed to dealing with such idiosyncrasies over time. As epidemiologist Adam Kucharski writes in his book, Rules of Contagion, “if you’ve seen one pandemic, you’ve seen… one pandemic.” We wait hopefully for this one to recede into the history books.

Read more posts on the COVID-19 pandemic and related topics here.

—

^[1] These p-scores are highly correlated with our elasticity measures, which can be seen here (in the data appendix PDF).

June 16, 2020 by Jason Barr

Disease and Unease in New York City (Part I): Mortality Rates since 1800

Jason M. Barr June 16, 2020

Pandemic 2020

On March 11, 2020, the World Health Organization officially declared the COVID-19 outbreak a pandemic. The first case appeared in Wuhan, China, on December 31, 2019. It arrived in the United States toward the end of January. On March 1, the New York metropolitan region saw its initial infection, and soon after, it became one of the world’s epicenters.

The general response by governors across the country was to close businesses, schools, and other public institutions, and send everyone home to “shelter in place.” While the epidemic in New York is far from over, isolation policies have worked to the degree that the region can begin to reopen. Arguably, the reason that the coronavirus requires a swift and forceful policy of social distancing is that the infection not only can cause terrible illnesses but is also frequently fatal, either from the virus itself or a secondary disease, such as pneumonia. A recent study estimated that as many as 13 out of 1,000 people infected are killed by it (compared to 1 in 1,000 for influenza), though that estimate is likely on the high end.

But this raises the question of how bad the COVID-19 epidemic has been in terms of mortality rates for dense cities like New York. In many respects, the history of cities is the history of the ebb and flow of shocks to our health and well-being. Here, we review over two centuries of mortality rates in New York City to not only place COVID-19 in context but also to see how the nature of mortality has evolved. To do this, we separate mortality rates into two parts. First is to look at the baseline or trend rates, and second is to review the so-called excess death rates, meaning those years in which deaths were unusually high relative to the norm. How does the increase in deaths from COVID-19 compare to other big epidemics of the past like the 1918 influenza outbreak or the frequent smallpox epidemics of the 19th century?

Our Urban History

Taking this long-run view allows us to draw some conclusions. First, there is a complex relationship between density and mortality. Just because a place is dense does not mean it automatically has a higher death rate. Manhattan’s death rate peaked in 1849, yet its population continued to rise until 1910. New York City’s death rate has steadily fallen over the last 170 years (data sources are here).

Second, the city is far healthier today than it has ever been in its 400-year history. Before the coronavirus came along, one could argue that it was experiencing a kind of golden age of low mortality. And while COVID-19 is likely to increase the number of deaths in New York by over 40% as compared to last year, ironically, the death rate for the city will be as it was in 1995.

Third, despite dire predictions of the COVID-19 epidemic destroying the city, a long-run view suggests this conclusion is premature. New York, over the centuries, has weathered many, many epidemics. Though the suffering they cause is devastating, the city has not only survived but has also become stronger as a result. The seeds of tragedy plant the flowers for a better, healthier place. Horrible events are met by collective action that improves the human condition (though frequently more slowly than it should). While this may be small comfort to those harmed, it should help us, as a society, feel more confident about our cities.

Lastly, just as land values can be an indicator of the health of a city’s real estate and economy, mortality rates can be an indicator of the health and well-being of its residents. Constant advances in technology, scientific knowledge, and medical treatments generate improvements in the quality of life. On the other hand, random shocks (outbreaks of disease or terrorism) and ongoing urban problems (such as crime, poverty, or failures in governance) drag down health and increase mortality. As a result, the history of mortality is like a horse race between the “good” and the “bad.” When mortality rates are rising, or even flat, it means urban problems are occurring at a rate that, on average, offsets the improvements. When mortality is falling, it means that the advances are winning. In the very long run, the improvements have carried (and will carry) the day.

Mortality Rates for New York City, 1804-2020. Mortality rates are total deaths per 1,000 residents. Note for 2020, I assumed total deaths from 2019 plus deaths from COVID-19, as of June 16, 2020 (this means the actual death rate for 2020 will likely be higher). Sources: See here.

Mortality since 1800: A Helicopter View

Before we proceed, however, one caveat is in order. The focus here is on gross mortality rates: the number of deaths per 1,000 residents. It is admittedly an overly broad indicator. Different groups, of course, have different mortality rates, such as poor versus rich, and infants and the elderly compared to young adults, and so on.[1] A more comprehensive history would need to detail how the less-advantaged groups have fared over time. The quality of a society can be measured by the health of those with the fewest resources. Nonetheless, gross mortality rates can help tell the story of a city’s health and history.

Looking at the broad sweep of mortality in New York City shows, roughly speaking, five major periods. Period I is from about 1800 to 1860. This was a time of rapid urban growth alongside frequent and devastating epidemics (as will be discussed in more detail below), with death rates peaking before the Civil War. Period II is from about 1860 to 1890. This time can be called post-peak mortality, where death rates fell over the period, returning to a level that had not been seen since the early 1800s. Period III, from about 1890 to 1925, is the Mortality Transition, as mortality rates steadily declined and converged to those of the United States as a whole.

Period IV, from 1925 to 1990, is what might be called net-zero mortality: two sets of forces were working at odds, as improved health and medical care fought against ongoing urban problems, such as air pollution, crime, and unemployment. While death rates for the city stayed in a limited band, they began to diverge from the United States, which saw declining mortality. Period V started around 1990, after the crack cocaine epidemic passed, and crime rates began to decrease.[2] The death rate in New York City dropped to a new low level. This is where we were when COVID-19 came along. In fact, starting in 1996, New York’s mortality rate has been lower than that of the United States. Pretty good for a city that was (proverbially speaking) told by President Gerald Ford in 1975 to “drop dead.”

Mortality Rates and Population for New York City (Manhattan) from 1804-1890. Notes: Mortality rates are deaths per 1,000 residents. The trend line shows that mortality rates over the period did not change, on average. Also note that population estimates before 1865 are in 5-year intervals. Sources: See here.

The Epidemic Century

From at least 1791 to 1900, New York City experienced frequent and devastating epidemics.[3] One can even say that epidemics were a regular part of life. The “normal” baseline death rate in New York City, in a good year, in the first half of the 19th century was between 25 and 30 deaths per 1,000 residents. But constant epidemics continually moved the city above that rate. Epidemics often doubled the death rates above the baselines. Between 1804 and 1887, there were at least 25 epidemics, for an average of one every three years. In fact, in the years 1834, 1849, 1851, 1854, 1864-1865, and 1872, two epidemics were occurring at the same time.

The Diseases

Up until 1875, by far the most frequently recurring epidemic was smallpox. This is ironic since a vaccine had been known since the early 1700s. Yet it did not entirely disappear from the city until after 1902 (though a brief resurgence came in 1947). Two diseases tied for second in terms of frequency: typhoid/typhus fever and cholera. Typhoid/typhus fever was transmitted through contaminated food and water, and from direct contact with the infected. Cholera was also spread by contaminated food and water. Additionally, yellow fever was a regularly appearing epidemic, which first arrived in New York in 1791 and disappeared after 1822. It was transmitted by mosquitoes, which likely spawned in the marshes and still waters of the island.

Arguably, the worst year for the city was 1849 (just at the height of immigration from the Irish Potato Famine) when over 5,000 people died of cholera, and the death rate spiked to 60 deaths per 1,000 residents. The additional deaths from cholera, if scaled up to today’s population count, would amount to the equivalent of about 95,000 additional deaths in 2019! And, in the three years from 1847 to 1849, there were some 8,000 reported deaths from typhoid, cholera, and smallpox combined.

Panic in the City

The random emergence of the epidemics caused the same kind of panics then as we see with coronavirus today. As one scholar writes about the 1832 cholera outbreak,

On Sunday, July 2, despite a calculated official silence, the existence of the first cases of cholera in the city was an open secret. Mass exodus from the city had already begun. To those able, flight was the immediate and traditional reaction. A hyperbolic and sarcastic observer remarked later that on Sunday “fifty thousand stout-hearted … New Yorkers scampered away in steamboats, stages, carts, and wheel barrows.” Farm houses and country homes within a thirty mile radius of the city were filled.

The feeling of dread was a common element of city life, as historical demographer Gretchen Condran writes,

Cholera and yellow fever defined the practical and emotional meaning of epidemics during the late eighteenth and early nineteenth centuries. Virtually no deaths from these two diseases were recorded in the nonepidemic years. They arrived suddenly, ran their course in a matter of months, and then returned some years later with little or no warning…. That a person could be well in the morning and dead before nightfall, a stark contrast to the lingering illness associated with many epidemic diseases like tuberculosis, added to the fear and panic that accompanied these epidemic diseases.

Was Density to Blame?

While I will have more to say about the role of urban density and epidemics in a future post, suffice it to say, the relationship between population density and the severity of epidemics is not so simple. Dense tenement districts were often hot spots for the epidemics, and the poor disproportionately suffered more than the rich, many of whom fled the city during the crises.

But over time, diseases like yellow fever and cholera would largely be eliminated, even as the tenement districts became more crowded. While poor hygiene and lack of sanitation were part of the problem, so was the city-wide environment. New York City, then as today, was a great hub of economic activity, and it was mainly through the port that diseases would enter and spread.

At the time, when germ theory was still not widely understood, health officials assigned blame for these diseases to many factors that seem strange to us today. There were also moral dimensions to their assessment: the poor immigrants huddled together in their dank, dirty tenements were held responsible for these diseases because of their poor characters. Or as stated in an 1865 report by the Association for Improving the Condition of the Poor in New York (likely referring to the Irish),

[L]arge masses of the population are debased by the wretched condition in which they are compelled to live. These conditions should be improved; still, it would be true of many thousands, that if left to the uncontrolled indulgence of their reckless, filthy habits, they would convert a palace into a pigsty, and create “fever-nests” and hotbeds of vice and corruption, under circumstances most favorable to health, comfort, and social elevation.

The Mortality Transition

Starting around 1890 or so, the regularly occurring epidemics became less devastating. This is not to say the city was free of epidemics, but those that did occur killed proportionally fewer people. And overall, by the late 1880s, the death rate in New York City began to fall quickly. In 1890, there were 25 deaths per 1,000 residents. By 1925, it was less than half of that, at 11.4.

The reasons for the mortality transition were many. First, scientific knowledge of germ theory helped people better understand and prevent the spread of diseases. In the second half of the 19th century, New York City expanded its water and sewerage system, which, over time, became universal. Housing reforms cracked down on excessive crowding and unhealthy living conditions. Mass transportation investments and rising incomes allowed people to live further away from the city center and have more housing.

Mortality Rates for New York City and the United States, 1920-1990. Note that mortality rates are deaths per 1,000 residents. From 1920 to about 1940, mortality rates in New York were lower than in the U.S. Then they began to diverge, with rates in New York City rising from 1955 to 1970, a difficult period for New York. Sources: See here.

Flatlining

But starting in the 1920s and up until around 1990, mortality rates in New York remained steady, between 10 and 12 deaths per 1,000 residents. What is particularly noteworthy is that mortality rates declined faster in the U.S. than in New York City. New York was suffering from a series of problems that counteracted improvements overall. The effect was a kind of net-zero for gross mortality rates: improvements and problems were offsetting each other. From 1955 to 1970, mortality rates in the city were, in fact, increasing, even while they were falling throughout the rest of the country.

New York, during its industrial heyday, was a very polluted and congested place. Starting in the late 1960s, the city’s industrial base began to disappear, putting many low- and middle-class people out of work. Racial problems intensified as cities throughout the U.S. saw the rise of so-called black ghettos, marked by hyper-segregation and concentrations of poverty, crime, and drugs. In 1975, the government was on the verge of bankruptcy and was forced to reduce services that protected the public. City problems were killing people faster than medical improvements could keep them alive.

Homicides in New York City, 1804-2019. The orange line is homicides per million residents. The blue line is percent of deaths from homicide. Sources: See here.

The Role of Homicides

New York City has always had to deal with violent crime and homicide. Here we take a brief detour to look at the long-run role of homicide in influencing mortality rates. The conclusion from this analysis is that homicides have always represented a small fraction of the total number of deaths, even in the most violent years. In the 1970s and 1980s, homicides accounted for between 2% and 3% of total deaths (which, of course, is still too many). In 2019, murder rates were as low as they have been since the 19th century. Before the modern period, the Civil War era was particularly violent, especially during the Draft Riots of 1863. Though the official figure was that 119 people were killed, estimates of the actual number run as high as 1,200.

Mortality Rates in New York City, 1980-2020. The orange line up to 2019 shows actual mortality rates. For 2020, mortality is estimated based on last year’s death rate and the number of currently recorded deaths from COVID-19. The red circle for 2001 is deaths from the 9/11 terrorist attacks. The black line shows mortality rates excluding HIV/AIDS, 9/11, and COVID-19. Sources: See here.

A New Hope: Since 1990

And then, perhaps unexpectedly, death rates in New York City began to decline. In 1990, mortality rates were about 10.1 people per 1,000 residents. By 2019, they were down to around 6.3, a reduction of over 37%. Many of the urban problems that plagued the city started to dissipate. Crime rates began to plunge, and the internet and telecommunication revolution created new employment sectors. New York City experienced a renaissance, and its residents were healthier as a result. Aside from the HIV/AIDS crisis (and some severe influenza outbreaks in 1957 and 1968), there has not been an epidemic in the city since 1918. However, between 1983 and 2017, some 83,000 New Yorkers died from HIV. In the worst years of 1994 and 1995, HIV-related deaths increased the mortality rate by 11% above baseline.
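The percentage declines quoted in this post follow from simple arithmetic on the gross mortality rates (deaths per 1,000 residents). A minimal sketch, using only rates stated in the text; the function name percent_reduction is hypothetical.

```python
def percent_reduction(start_rate, end_rate):
    """Percent decline from start_rate to end_rate.

    Both rates are gross mortality rates in deaths per 1,000 residents.
    """
    if start_rate <= 0:
        raise ValueError("start_rate must be positive")
    return 100.0 * (start_rate - end_rate) / start_rate

# Rates quoted in the text.
since_1990 = percent_reduction(10.1, 6.3)    # 1990 to 2019
transition = percent_reduction(25.0, 11.4)   # 1890 to 1925

print(f"1990 to 2019: {since_1990:.1f}% lower")
print(f"1890 to 1925: {transition:.1f}% lower")
```

The first figure comes out at a little over 37%, matching the reduction cited above, and the second shows that the 1925 rate was less than half the 1890 rate.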

As of this writing, New York City has seen 17,351 confirmed COVID-19 deaths and another 4,692 probable deaths. The city will likely end the year with a mortality rate of around 9.0, which was last seen 25 years ago. As a comparison, the Spanish Flu Epidemic of 1918 increased the number of deaths in the city by 25% as compared to the year before. To date, the coronavirus has increased the number of deaths by 42% versus 2019. So the COVID-19 epidemic will likely be worse than the 1918 flu epidemic, but not as bad as those that occurred before the Civil War.

The Future

Nonetheless, if there is one conclusion that emerges from the long-run analysis, it is that all epidemics that hit the city were eventually banished. Sometimes they took a while to be eliminated, but they were eliminated. Today, it is hard to feel and understand what diseases like yellow fever, cholera, and typhoid fever did to the people of the city, since their impacts happened so long ago. However, each one likely generated fear and panic and instilled a sense of impending doom. But the idea of New York City as a home for strivers, as a vibrant center of culture and life, will persist. It is this idea that moves us to solve our problems and will one day return New York City to us.

Thanks to Troy Tassier for helpful comments.

—

[1] There is also another important issue to consider, namely that of measurement. The mortality rate is defined as the number of deaths per 1,000 residents. This is possibly problematic on two fronts. First, we don’t know the actual number of deaths in a given year, since reported deaths almost surely undercount the true numbers. Second, population counts also need to be approximately accurate. There’s not much that can be done on this front, but today’s measurements of both figures are likely more accurate. If anything, the undercount of deaths was much higher in the 19th century, which would suggest even larger improvements in health and well-being than indicated by the official data. However, the difficulty of measuring COVID-19 deaths brings to light the problem of measuring death counts during epidemics.

[2] The HIV/AIDS crisis began in 1983. Deaths peaked in 1995 and steeply dropped until 2000. Since then, deaths from HIV have steadily declined.

[3] Until 1874, New York City was Manhattan. After that, the city annexed the western parts of the Bronx. In 1898, the city then annexed the rest of the Bronx, Queens, Brooklyn, and Staten Island. Here death rates until 1900 are given for Manhattan; from 1901 to the present, death rates are for the entire city.

May 19, 2020 by Jason Barr

Border Crossings: The Spread of COVID-19 across U.S. Counties

Jason M. Barr and Troy Tassier May 19, 2020

Pandemic 2020

The COVID-19 pandemic rages on with no end in sight. How long it will take to return to normal, at this point, is anybody’s guess. But because the Federal Government has ceded coordination of further mitigation efforts to individual states, there is going to be a hodgepodge of different policies. Each governor has been left to decide on a suite of strategies for his or her state’s residents.

Some states are on a path to almost full reopening, while others are taking a more cautious approach. This week, most states will loosen at least some restrictions. Even New York, the state most heavily hit by the pandemic, will begin allowing construction and curbside retail in portions of the state less impacted by the epidemic. But many areas along the east and west coasts will remain closed for longer.

While opening too soon is risky for those within each state, one could argue that if a state’s residents want to put their health at risk, that’s their choice. But a key problem with this approach is that the virus knows no borders. What happens in Vegas doesn’t stay in Vegas. If you live in a state that remains closed, but the surrounding states are opening—Illinois, for example—you should be worried.

The Spread of Coronavirus from January 22 to April 25, 2020. Note that cases are given per 100,000 residents. GIF created by Eon Kim, based on data from USAFacts.org.

A Brief History of the Coronavirus in the United States

The first case of coronavirus in the United States was reported in Washington State on January 20, 2020. From there, cases started appearing in various places throughout the country. Stage I was “the spark” in late January, when the first few cases appeared. Stage II was the initial spread, which was still relatively contained and limited to the few areas that saw infections early on.

Then around March 1st, everything changed: the virus spread to the rest of the nation. Community transmission (where a source of the infection cannot be identified) became common. About one month later, we see the rate of increase begin to slow, suggesting that state-level social distancing measures had a positive effect. One could argue that if a strong federal policy had been imposed before the 40th day since the first infection, the rate of increase would have been much slower.

Total Number of Reported COVID-19 Cases Since January 22, 2020. Note: Day 1 here is January 22, 2020. Data is reported on a natural logarithmic scale (which better shows the growth rates of infections). Source: USAFacts.org.

The Birds and The Bees: The Reproduction Number

To understand the impact of the spread of the virus, we need to review basic epidemiology. It all begins with the reproduction number (R): the average number of people to whom one infected person passes the virus. For example, if each individual spreads the virus to two people, then the reproduction number is two. In this case, Greg gives it to Marsha and Jan. Marsha gives it to Bobby and Peter. Jan gives it to Cindy and Alice. And so on. Two give it to four, four to eight, and on and on. In other words, with a reproduction number of two, the number of cases doubles in each generation of infections.

In the U.S., earlier this year, the average reproduction number across all states was a little over two but varied widely across regions. Some estimates place New York City a little under four, while Arkansas and South Dakota may have been below one early in the epidemic. New York was hit so badly in part because of the strength of its social ties.

Four Things

R, however, is determined by four things: the fraction of remaining susceptible people in the population (how many people are left that could be infected), the duration of infection (how long an infected person carries the virus and is able to infect others), the transmission rate (how likely is a susceptible person to be infected if in contact with the virus), and the contact rate (how many other people does an infected person interact with each day when infected).

One fights an epidemic by making R smaller. If it goes below one, then instead of increasing, the epidemic will die out. For instance, if R=1/2, then 100 people become 50 in the next generation, then 25, and so on until the epidemic disappears. Each of the four elements of R can be an avenue for governmental policy.
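The generation-by-generation arithmetic above can be sketched in a few lines of Python. This is only an illustration of the idealized case in the text (a constant R and no other changes); the function name is made up for this example.

```python
def cases_by_generation(initial_cases, r, generations):
    """Expected new cases in each generation, starting with generation 0.

    Assumes a constant reproduction number r, as in the text's examples.
    """
    return [initial_cases * r ** g for g in range(generations + 1)]


# R = 2: one case becomes two, then four, then eight.
doubling = cases_by_generation(1, 2, 3)

# R = 1/2: 100 cases become 50, then 25.
halving = cases_by_generation(100, 0.5, 2)

print(doubling)
print(halving)
```

With R above one the list grows without bound; with R below one it shrinks toward zero, which is why pushing R under one is the goal of every policy discussed here.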

R-Reducing Policies

We can decrease the fraction of susceptible individuals through a vaccine, but this takes time. We can limit the duration of infection through extensive testing and contact tracing, followed by isolation of the newly infected. The transmission rate can be lowered by prophylactic measures such as wearing gloves, masks, and eyewear, and extensive cleaning of surfaces. But the primary method to reduce R has been by lowering the contact rate via social distancing.

After social distancing restrictions were imposed, most states lowered their reproduction number to less than one. Estimates suggest that even New York now sits at about 0.85. But these values vary across states. Places like Iowa, Illinois, and Arizona are likely sitting just above one. Some, like Alaska, Idaho, Montana, and Vermont, are well below one. But as states reopen, their R is likely to increase, even with slight easing of social distancing policies. Additionally, there appears to be no systematic effort for wide-scale testing and contact tracing.

Do No Fences Make Bad Neighbors?

One of the things that students learn early in an economics course is that no individual is an island. The buying and selling decisions of the masses determine the price you pay for any common consumer good you care to name. Laws are passed to limit the pollution of automobiles so that we don’t have too large of an effect on the clean air of others as we commute to work. We ban smoking in many public places, so non-smokers don’t breathe second-hand smoke.

These spillover effects are called externalities. Externalities don’t have to be bad. Planting flowers in your yard has a positive benefit to your neighbors across the street. The vaccines that you get to prevent you from falling sick from the flu also prevent you from infecting others; thus, your decision to get a flu shot makes everyone around you safer.

Many states are making decisions to reopen based on the caseloads, hospital capacities, and economic circumstances within their own borders. But they seem to be considering less, if at all, the potential impacts on surrounding states. A virus does not respect state borders. There is no visible or invisible fence that keeps coronavirus from passing from Georgia to Florida, or from Iowa, Wisconsin, and Indiana (each a high profile state that is reopening) to Illinois, which is still struggling to keep its caseload under control.

Regional Pacts

This is the reason that some states have formed regional pacts to coordinate their reopenings and reduce the negative externalities. For example, New York, New Jersey, Connecticut, Pennsylvania, Delaware, Rhode Island, and Massachusetts have formed a multi-state agreement to coordinate their COVID-19 responses. It is a form of centralization that will help, especially those states in the center of the region like New York and New Jersey, whose state borders are all within the pact’s boundaries. Pennsylvania, on the other hand, shares a border with Ohio, West Virginia, and Maryland. How that affects the Keystone State depends on how well its neighbors “behave.”

Corona Caseloads

To better understand the basic spread of COVID-19, we have undertaken a range of statistical analyses of county-level coronavirus cases. (The results can be found here.) We use data on confirmed cases from USAFacts.org and estimate what drives the virus’s spread by looking at variables within each county. In general, we find that the initial seeds of the epidemic in a local region tended to be driven simply by bad luck and the presence of airports.

But once the virus arrives in a location, several things affect its spread. Specifically, over time, denser counties and those with more use of public transportation have faster growth rates, even with mandated stay-in-place measures.

Spilling Over

But to better understand the virus’s spread, we looked at how neighboring counties affected each other. We do this in two different ways. First, we measure the average number of cases on March 20 in all surrounding counties that share a border with each county. We then statistically estimate the impacts of neighboring county cases on the original county, as of May 9th. Second, we measure the number of cases in all counties on March 20 but discount the effect of counties that are farther away. This gives an average number of cases in surrounding counties, with more weight given to those closer by.
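The two neighbor measures described above can be sketched roughly as follows. This is only an illustration, not the authors’ code: the county names, case counts, and distances are invented, and inverse-distance weighting is an assumed discounting scheme, since the post does not state the exact form used.

```python
def adjacent_mean(cases, adjacency, county):
    """Measure 1: simple average of cases in counties sharing a border."""
    neighbors = adjacency[county]
    return sum(cases[n] for n in neighbors) / len(neighbors)


def distance_weighted_mean(cases, distances, county):
    """Measure 2: average over all other counties, discounted by distance.

    Uses weights of 1/distance (an assumption made for this sketch).
    """
    weights = {other: 1.0 / d for other, d in distances[county].items()}
    total_weight = sum(weights.values())
    return sum(w * cases[other] for other, w in weights.items()) / total_weight


# Hypothetical data: cases on March 20 and distances (in miles) from county A.
cases = {"A": 10, "B": 40, "C": 20, "D": 100}
adjacency = {"A": ["B", "C"]}                      # B and C border A; D does not
distances = {"A": {"B": 10.0, "C": 20.0, "D": 100.0}}

border_measure = adjacent_mean(cases, adjacency, "A")
weighted_measure = distance_weighted_mean(cases, distances, "A")
print(border_measure, weighted_measure)
```

The second measure lets a distant hot spot such as the hypothetical county D still count, just with less weight than the counties next door.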

No matter which measure we choose, we find significant and large externalities across county borders. That is to say, the number of cases in surrounding counties has a significant impact on the number of cases in each county, on average.

Border Effects

We discuss the adjacent neighbors first. Let’s take two counties, call them Oak and Maple, that are the same in all respects but one. They each have about the same population density, socioeconomic and racial profile, etc. The only difference is that Maple County’s neighbors have a 10% higher number of cases, on average, than Oak County’s. That is, the only difference between the two counties is what’s happening outside of them.

In this case, our estimates indicate that Maple County will have 2.7% more cases than Oak. In other words, about ¼ of each one-percent increase in your neighbors’ cases is passed on to you within 60 days. If we repeat this procedure but include all counties across the country, discounting those that are farther away, we get even larger effects. A 10% increase in the cases of other counties results in a 7% increase in cases in your own county 60 days later. The result from this method is larger because it considers how cases multiply across space. Your neighbors are affected by their neighbors, who are affected by their neighbors, and so on. The impact from the 60-day window illustrates how these effects are long lasting.
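As a quick check on the “about ¼” reading, the pass-through implied by each estimate is simply the percent change in a county’s own cases divided by the percent change in its neighbors’ cases. The short sketch below uses only the figures quoted above; the function name and the Oak baseline of 1,000 cases are hypothetical.

```python
def pass_through(pct_change_neighbors, pct_change_own):
    """Share of a one-percent rise in neighbors' cases that shows up locally."""
    return pct_change_own / pct_change_neighbors


adjacent_effect = pass_through(10, 2.7)   # bordering counties only
weighted_effect = pass_through(10, 7)     # all counties, discounted by distance

# Hypothetical baseline: if Oak County has 1,000 cases, the otherwise
# identical Maple County (neighbors 10% higher) would have about 2.7% more.
oak_cases = 1000
maple_cases = oak_cases * (1 + adjacent_effect * 0.10)

print(adjacent_effect, weighted_effect, maple_cases)
```

The first ratio is 0.27, or roughly one quarter, and the second is 0.7, which is why the distance-weighted measure implies much larger spillovers.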

Magnitudes

As an example of this magnitude, suppose that in isolation, you would expect to have 1,000 cases based on your profile of population, density, public transportation use, etc. Then suppose that your neighbors (in decreasing weight by distance) had twice as many cases as another similar county’s neighbors 60 days ago. This doubling of your neighbors’ cases would result in you having 1,700 cases instead of 1,000. So, you have an extra 700 cases as a direct result of your neighbors’ influence on the pattern of the epidemic. This is the negative spatial externality associated with this epidemic. It is the reason why states and regional areas are or should be coordinating their efforts.
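The 1,700 figure follows from scaling the 7%-per-10% effect in proportion to a 100% increase. The sketch below reproduces that arithmetic and, for comparison, shows what a constant-elasticity (log-log) reading of the same estimate would give. The log-log form is an assumption made here for illustration; the post does not say which functional form the underlying regression uses.

```python
ELASTICITY = 0.7  # from the text: 10% more neighbor cases -> 7% more own cases


def proportional_scaling(baseline, neighbor_increase, elasticity=ELASTICITY):
    """Scale the effect linearly; neighbor_increase = 1.0 means +100%."""
    return baseline * (1 + elasticity * neighbor_increase)


def constant_elasticity(baseline, neighbor_ratio, elasticity=ELASTICITY):
    """Assumed log-log form: own cases scale with neighbor_ratio ** elasticity."""
    return baseline * neighbor_ratio ** elasticity


linear_estimate = proportional_scaling(1000, 1.0)   # the text's calculation
loglog_estimate = constant_elasticity(1000, 2.0)    # alternative reading

print(round(linear_estimate), round(loglog_estimate))
```

Under either reading, a doubling of cases next door adds several hundred cases at home, so the qualitative point about spatial externalities is unchanged.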

The main takeaway is that surrounding states can initiate a rise in cases and deaths in their neighbors. If a state such as Illinois is struggling to keep its R value under one, and therefore remains closed, but starts getting “spillover” cases from a neighboring state, those cases could push Illinois’ value of R above one, making it difficult for the state to keep its epidemic under control, let alone reopen its economy.

Border Effects. This map gives the estimated number of COVID-19 cases on May 9, 2020 that originated directly outside the county at least 50 days before they were recorded. That is, the map gives state-level estimates of cases originating in a bordering county 60 days prior. Source: here.

Corona 2.0

And, just as there can be negative externalities in cases from a neighbor reopening, there is a positive benefit from a neighbor remaining closed. If state A opens first, it benefits economically from its actions. But, if its neighbor, B, stays closed, A pays a lower cost in terms of the epidemic because it receives fewer spillover cases. Thus, state A wants to open before state B. But, if state B remains closed while A is open, B will experience the negative externality. It loses economically because it is closed, and receives more spillover cases from A. The epidemic in the closed state will be worse as a result (and the state may even see its R rise above one).

State B may then be forced to lengthen its stay-in-place policies—and the economic harm they cause—because of its neighbor’s actions. State A gets to “free ride” on its neighbor for a while. Of course, free riding will backfire. If state B has a spike of cases, it will come back to its neighbor A in the future.

The Summer of our Discontent?

This lack of coordination can have even deeper ramifications for the nation as a whole. By triggering new outbreaks across state borders, it makes the U.S. radioactive. Other countries are likely to join together to allow trade and travel among themselves, leaving the U.S. behind while the suffering and death toll drags on because some states are going it alone.

In the meantime, let’s hope an effective treatment or a vaccine gets here soon…

April 20, 2020 by Jason Barr

Escape from New York?: Density and the Coronavirus Trajectory

Jason M. Barr and Troy Tassier written on April 3, 2020, posted on April 20.

Note: A shorter version of this post was written for Scientific American, which was posted on April 17th.

The Density Paradox

Anyone who follows the journalist Matt Yglesias on Twitter knows his style. He’s sardonic, facetious, and relies on barbs to make his point. His recent tweet on March 23, 2020, at 8:10 pm EST is a perfect example:

The moral of corona virus is that we should adopt the kind of low-density living patterns associated with Asian countries like South Korea, Japan, Taiwan, and Singapore that have successfully controlled its spread.

Though the tweet oozes sarcasm, it highlights how southeast Asian cities, known for their hyper-density, not only were invaded by the coronavirus pandemic but also figured out ways to slow its spread without destroying the very essence of what makes cities so successful. But his tweet contrasts sharply with the perception that New York City, the epicenter of the epidemic in the United States, is somehow to blame for causing so much damage. For example, the New York Times on March 23rd ran an article entitled, “Density Is New York City’s Big ‘Enemy’ in the Coronavirus Fight.” Journalist Brian Rosenthal writes:

New York has tried to slow the spread of the coronavirus by closing its schools, shutting down its nonessential businesses and urging its residents to stay home almost around the clock. But it faces a distinct obstacle in trying to stem new cases: its cheek-by-jowl density.

This kind of sentiment raises the question: what is the role of population density in spreading the virus? Are big cities worse than smaller ones? Does New York deserve such harsh criticism?

Scenes from Coronavirus, NYC. Sources: Top left, top right, bottom left, bottom right.

How Contagions Conjugate

Before we answer these questions, let’s take a step back and focus on the process of infectious diseases in general. For this, we need to assume that no preventive measures are taken, and everyone goes about business as usual, without social distancing or self-isolation. To understand how widely and quickly an infectious disease will spread, epidemiologists refer to something called the reproduction number or simply, R. It measures the average number of people that an infected individual will infect over the course of their infection.

Among extant diseases, measles has the highest reproduction number, at about 15. That is to say, assuming that the measles vaccine didn’t exist, one person with measles, if left untreated or unquarantined, can expect to pass it along to 15 others (hence the 2019 crisis). For influenza, that number is only about 1.5. The current evidence puts the coronavirus at around 2.65, not as bad as the measles, but worse than the typical flu strain.

The Reproduction Number

The number of new cases on any given day is directly tied to the reproduction number. Say, for example, that R is two. This means that the average infected person passes it on to two people, who then collectively pass it on to four people, who then collectively pass it on to eight people, and so on. Thus, assuming the virus began with a single person, the number of new cases on a given day depends on both the reproduction number and the number of days since “patient zero” was infected.
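To make the link between days and case counts concrete, here is a minimal sketch. It assumes infections arrive in discrete generations of fixed length; the five-day generation length is a hypothetical value chosen for illustration and is not a figure from this post.

```python
def generation_on_day(day, days_per_generation):
    """Which generation of infections is occurring on a given day."""
    return day // days_per_generation


def new_cases(r, generation):
    """New cases in a generation, starting from a single patient zero."""
    return r ** generation


def cumulative_cases(r, generation):
    """All cases up to and including this generation (patient zero included)."""
    return sum(r ** g for g in range(generation + 1))


R = 2
DAYS_PER_GENERATION = 5  # hypothetical; chosen only for illustration

gen = generation_on_day(30, DAYS_PER_GENERATION)  # day 30 -> generation 6
print(gen, new_cases(R, gen), cumulative_cases(R, gen))
```

Under these assumptions, a single case on day 0 becomes 64 new cases in the generation occurring on day 30, and 127 cases in total, which shows how quickly a head start of a few weeks compounds.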

But to know why and how fast the disease spreads, we need to know more about the reproduction number and what determines if it is large or small. It turns out that the reproduction number is determined by four elements. Thus, to see the role that cities play in spreading the virus, we need to look more deeply at those components.

Contact Rate

First is the contact rate. It is a measure of the average number of contacts people have with other people on a given day. Extroverts will have more contacts than their introverted counterparts. People who take public transportation and interact with many co-workers or customers each day will have more contacts than those who commute by car and sit at a solitary desk. Children in a crowded school typically have more contacts than their grandparents. Those sitting at a computer screen in Area 51 monitoring for extraterrestrial life will have fewer contacts than D.J.s (but maybe more with Martians or Klingons).

If we take all the people in a society and calculate how frequently they mix on average, then we get the contact rate. One thing to consider, however, is that contacts are not the same for all infectious diseases. Influenza and HIV spread through very different mechanisms, and thus the count of contacts differs greatly as well. (More generally, the structure of social contacts, who interacts with whom and who they interact with as well, can also play a role, but we ignore that for the time being.)

Transmission Rate

Second is the transmission rate, which is the likelihood that any one individual will pass the disease onto someone else. Not all diseases are equally transmissible. Some infectious diseases, such as HIV, are relatively difficult to transmit. The CDC estimates that only 63 out of 10,000 exposures to shared needles will result in the transmission of HIV. Estimates for the common cold are much higher, 2%-40% depending on age. Coronavirus appears to be in the lower end of this range, perhaps around 1%-5%, although the data is far from complete. One of the key problems with the coronavirus is that even if an infected person chats with only a few other people on a given day, there is a significant chance that she passes on the infection.

Duration

Third is the time it takes for the disease to work its way through the human body from infection to illness to recovery. In the case of the coronavirus, the average time is about a month: two weeks between infection and first symptom and then about two weeks until recovery (or death).

Susceptibility

Last is the fraction of people at a given time who are susceptible to the disease. In the case of a novel virus that is newly introduced into a population, such as the coronavirus, susceptibility is typically near 100%. One may recall the stories of Hernán Cortés carrying smallpox to the Aztecs, who were decimated by its spread. As the disease spreads and people recover, the fraction of susceptible people in a population will fall because of developed immunity.

The Fire

In summary, the reproduction number, the average number of infections a person will bestow on their fellow humans, is determined by the product of the contact rate, the transmission rate, the duration of infection, and the fraction of susceptible people in the population.
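The product can be written out directly. In the sketch below, the input values are hypothetical and chosen only to show how the four pieces combine; they are not estimates from this post.

```python
def reproduction_number(contacts_per_day, transmission_prob,
                        infectious_days, susceptible_fraction):
    """R as the product of the four components described above."""
    return (contacts_per_day * transmission_prob
            * infectious_days * susceptible_fraction)


# Hypothetical inputs: 10 contacts a day, a 2% chance of transmission per
# contact, 14 infectious days, and a fully susceptible population.
baseline_r = reproduction_number(10, 0.02, 14, 1.0)

# Halving the contact rate (e.g., through social distancing) halves R.
distanced_r = reproduction_number(5, 0.02, 14, 1.0)

print(baseline_r, distanced_r)
```

Because R is a product, cutting any one factor by half cuts R by half, whichever factor it is.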

To offer an analogy, we can think of an infectious disease as operating like a fire. A spark ignites the wood, which burns brightly and hot, until finally, all the fuel is spent, and the fire goes out. The reproduction number determines how hot and bright the fire burns. If R is bigger than one, more and more people are infected, and the fire grows larger. But, as the epidemic grows, more people are infected and then recover to become immune; this means that the fraction of susceptible people is smaller.

In turn, as this fraction gets smaller, the reproduction number gets smaller and smaller, eventually dropping below one. It is at this point that the fire has reached the peak of its heat and begins to die out and eventually disappears because there are no longer enough susceptible people to keep the fire going. It may be that people have recovered to become immune or that people simply die. This is the story of the Black Death, which killed somewhere between 75 and 200 million people in the mid-14th century. Once started, it burned until the hosts were buried in the ground.
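The fire analogy can be illustrated with a very simple generation-by-generation simulation. This is a sketch under strong assumptions (a closed population, permanent immunity after infection, and an effective R equal to the initial R times the susceptible fraction); the parameter values are hypothetical.

```python
def simulate_fire(r0, population, initial_infected, generations):
    """Return (effective_r, new_infections) for each generation.

    Effective R shrinks as the susceptible share of the population falls.
    """
    susceptible = population - initial_infected
    infected = initial_infected
    history = []
    for _ in range(generations):
        effective_r = r0 * susceptible / population
        # New infections cannot exceed the people still susceptible.
        new_infections = min(susceptible, infected * effective_r)
        history.append((effective_r, new_infections))
        susceptible -= new_infections
        infected = new_infections
    return history


history = simulate_fire(r0=2, population=1000, initial_infected=1, generations=15)
peak = max(range(len(history)), key=lambda g: history[g][1])
for g, (eff_r, new) in enumerate(history):
    marker = "  <- peak" if g == peak else ""
    print(f"generation {g:2d}: effective R = {eff_r:.2f}, new cases = {new:6.1f}{marker}")
```

In this toy model, new infections rise for as long as the effective R stays above one and fall once it drops below one, matching the description of the fire peaking and then burning out.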

The Role of Density

But what role does density play in these fires? The intuition that population density increases the propensity of an epidemic to spread in cities is correct in the sense that increased density likely leads to an increase in the contact rate of individuals, which makes the reproduction number larger, leading to more infections in dense areas. If you live in an apartment building, commute in a subway, work in a skyscraper and take your lunches in crowded diners, you are more likely to interact with someone who passes the infection along to you.

And, you’re likely to spread the infection more widely than others living in less dense areas. Thus, the reproduction number should be bigger in large dense cities and lead to bigger fires. But, how much bigger and how much of an effect will it have? We aim to answer this question below for the current stage of the epidemic in the US.

The Missing Role of Time

If we think about an epidemic spreading across a spectrum of cities, the reproduction number misses a portion of the story. An infectious disease doesn’t enter all cities at the same time. In the U.S., we heard notable stories of outbreaks on the west and east coasts early in the epidemic. Only more recently are we hearing about New Orleans, Detroit, and other urban areas in North America. The reproduction number tells us little about how the epidemic first appears and spreads through these cities and across the globe.

To think through this spread, imagine each city or town as a small forest of trees. Once a spark begins a fire anywhere in the city, the rest of the forest will burn until the fuel is spent. Because dense cities have higher contact rates, they will burn a little faster and a little longer, but what is currently happening in New York City and New Orleans will be replicated in cities across the country as soon as a spark arrives. As the epidemic fire burns, it will spread to nearby areas: New Rochelle, NY led to New York City and nearby places in New Jersey and Connecticut. But, as people travel between more distant areas of the country, this creates new sparks in new locations, and new fires start.

Thus, a traveler from elsewhere arrives in New Orleans for Mardi Gras or an otherwise anonymous spring-breaker arrives in Miami, and a new fire begins. Because we are early in an epidemic, we haven’t seen how the sparks distribute across the landscape. But, the growing fires will be bigger in areas where the fire started earlier.

Contact Rate versus Sparks: Time is of the Essence

As discussed above, the reproduction number is given by four variables: the contact rate, the transmission rate, the disease duration, and the fraction of the population that remains susceptible. But for analyzing the coronavirus epidemic in the U.S. during the last month, the transmission rate and duration are likely constant across the U.S. As for susceptibility, since we are analyzing things relatively early in the epidemic (at least for the U.S.), it’s safe to assume for now that the susceptibility rate across the U.S. is also constant and likely close to 100%. Even in a city with a large outbreak like New York, with more than 30,000 cases, the number infected is still far below 1% of the population.

But the contact rate across the country is likely affected by factors such as population density or economic density (like regional gross domestic product). At the same time, we can measure the number of sparks that could start a fire by, for example, the presence of a major international airport. Finally, we can investigate timing by looking at when each fire started, measured by the number of cases in each county in early March, a little over a month after the first confirmed case in the U.S.

Left: Number of COVID-19 Cases on March 27, 2020 in U.S. Counties that Had at Least One Case vs. County Population Density (both axes in logs). Right: Number of COVID-19 Cases on March 27, 2020 in U.S. Counties that Had at Least One Case vs. Number of Days since First Case in County. Sources:

The Density Effect

To this end, we have performed a statistical (regression) analysis of daily county-level COVID-19 cases as of March 27, 2020 (day 66 of the epidemic in the U.S.), along with several potential explanatory variables, in an effort to see how density, airports, and timing help to determine the current spread of cases across the United States. (Technical details are here.) The results show that population density does matter, but its effect is not as large as the popular media would have you believe. In fact, an increase in county-level population density by 20% increases the number of cases by about 11-12%, on average (conditional on having at least one case of the virus). In other words, more populous counties are likely to have fewer cases on a per capita basis than their sparser counterparts.
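One way to read the 20% and 11-12% figures is as an elasticity of cases with respect to density. The sketch below backs out that elasticity assuming a log-log relationship, which matches the logged axes in the figure above but is an assumption here about the exact regression form; the function name is made up.

```python
import math


def implied_elasticity(pct_change_density, pct_change_cases):
    """Elasticity implied by two percent changes under a log-log relationship."""
    return (math.log(1 + pct_change_cases / 100)
            / math.log(1 + pct_change_density / 100))


low = implied_elasticity(20, 11)
high = implied_elasticity(20, 12)

# Under the same assumption, what happens to cases if density doubles?
multiplier_low = 2 ** low
multiplier_high = 2 ** high

print(f"elasticity between {low:.2f} and {high:.2f}")
print(f"doubling density multiplies cases by {multiplier_low:.2f} to {multiplier_high:.2f}")
```

An elasticity of roughly 0.6 means cases rise less than proportionally with density: doubling density multiplies cases by about one and a half rather than two.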

We find, however, that what matters more than population density is the presence of a very large airport. A county with an airport that has at least one million passengers is likely to have nearly double the cases as compared to counties with no or smaller airports.

Further, counties that had early cases of COVID-19 have much larger case counts today. Our findings suggest that if a county had at least one case on March 1 (day 40), then by March 27, it is likely to have nearly double the cases of counties without an early case. The timing of early case arrivals is much more important than population density. This suggests that, at this early stage of the epidemic, the current distribution of cases in the U.S. is determined more by the arrival of early cases into a county and less by the density of the county itself. There are no low-density, low-risk counties, just counties that haven’t been found by the pandemic yet.

Around the World

It’s also worth stressing that when we look at the epicenters around the world, we see that they frequently are not located in the largest cities in each of the respective countries. In Italy, the epicenter was in northern Italy around Milan, not in Rome, which has twice as many people. In China, the epicenter was in Wuhan, China’s 10th largest city. If population density were really the reason for the epidemic, the COVID-19 epicenters would have been in Rome and Shanghai, China’s largest city. Timing and bad luck are simply more important drivers.

The Future Course of the Virus

The analysis of density discussed here is based on the premise that the virus is free to run its course without any interventions. This is a reasonable assumption for the early stages in the U.S., before governments were able to mobilize resources and enact isolation measures. Now that states around the country have begun to fight back, there is evidence these measures are slowing its spread. This raises the question of which policies are best for lowering the reproduction number. Do self-isolation, social distancing, or testing provide better means for fighting back? Are cities better places to be if you get infected? These are questions we’ll return to in the future when the virus has finally been vanquished.
