
The Covid-Statistic Wars are Heating Up

After a general consensus or willingness to accept social distancing measures imposed by politicians (often referred to as lock-downs), societies are polarizing. Some citizens want to open stores, bars, and restaurants (and get a haircut). Others want to keep social distancing measures in place. Some people on both sides are not interested in scientific arguments for or against their position. Others like to find scientific evidence that seemingly supports their viewpoint. This abuse of science is becoming more common in a polarized world. As a scientist, I am concerned about the weaponizing of science because it undermines the ability of science to inform decisions and to correct false beliefs. Psychological research has shown how easily we assimilate information that matches our beliefs and treat disconfirming evidence like a virus. These motivated biases in human reasoning are very powerful and even scientists themselves are not immune to them.

Some economists appear to be afflicted by a bias to focus on the economic consequences of lock-downs and to downplay the effects of the virus itself on human lives and the economy. The idea is that lock-downs did little to save lives while imposing immense costs on the economy. I am not denying the severe consequences of unemployment (I actually co-authored an article on unemployment and well-being), but I am shocked by a tweet claiming that social distancing laws are ineffective, which has been retweeted 3,500 times, and by blog posts that make similar claims accompanied by scatterplots that give the claims the appearance of scientific credibility.

There is nothing wrong with these graphs. I have examined the relationship between policies and Covid-19 deaths across US states and across countries, and I have also not found a significant correlation. The question is what this finding means. Does it imply that lock-down measures were unnecessary and have produced huge economic costs without any benefits? As some responses on Twitter indicated, interpreting correlational data is not easy because many confounding factors influence the correlation between two variables.

Social distancing is unnecessary if nobody is infected

Let’s go back in time and impose social distancing policies across the world in May 2019 randomly in some countries and not in others. We observe that nobody is dying of Covid-19 in countries with and without ‘lock-down’. In addition, countries with lock-down suffer high rates of unemployment. Clearly, locking countries down without a deadly virus spreading is not a good idea. Even in 2020 some countries were able to contain relatively small outbreaks and are now mostly Covid-free. This is more or less true of countries like Taiwan, Australia, and New Zealand. However, these countries impose severe restrictions on travel to ensure that no new infections are brought into the country. When I tried to book a flight from Toronto to Sydney, I was not able to do so. So, the entire country is pretty much in lock-down to ensure that people in Australia cannot be infected by visitors from countries that have the virus. Would economists argue that these country-wide lock-downs are unnecessary and only hurt the tourist industry?


The fact that Covid-19 spread unevenly across countries also creates a problem for the correlation between social-distancing policies and Covid-19 deaths across countries. The more countries are actively trying to stem the spread of the virus, the more severe social-distancing measures will be, while countries without the virus are able to relax social distancing measures. Not surprisingly, some of the most severe restrictions were imposed at the peak of the epidemics in Italy and Spain. This produces a positive correlation between severity of lock-downs and spread of Covid-19, which could be falsely interpreted as evidence that lock-downs even increase the spread of Covid-19. A simple correlation between lock-down measures and Covid-19 deaths across countries cannot tell us anything about the effects of lock-down measures on deaths within countries.
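
To make this point concrete, here is a minimal simulation sketch. Everything in it is an assumption made for illustration, not real data: outbreak sizes are random numbers, harder-hit countries are assumed to impose stricter measures, and a full lock-down is assumed to cut deaths in half. Even though lock-downs reduce deaths in every simulated country, the cross-country correlation between severity and deaths comes out positive.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate(n_countries=200, seed=1):
    """Simulate hypothetical countries; all numbers are made up."""
    rng = random.Random(seed)
    severity, deaths, deaths_no_lockdown = [], [], []
    for _ in range(n_countries):
        outbreak = rng.random()  # hypothetical outbreak size between 0 and 1
        # Assumption: the bigger the outbreak, the stricter the measures.
        s = min(1.0, max(0.0, outbreak + rng.gauss(0, 0.1)))
        # Assumption: a full lock-down (s = 1) halves the death toll.
        baseline = 100 * outbreak
        severity.append(s)
        deaths_no_lockdown.append(baseline)
        deaths.append(baseline * (1 - 0.5 * s))
    return severity, deaths, deaths_no_lockdown


severity, deaths, deaths_no_lockdown = simulate()
r = pearson(severity, deaths)
print(f"correlation between lock-down severity and deaths: {r:.2f}")
```

The sign of the correlation reflects who locks down, not what locking down does.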

Social Distancing Effects are Invisible if there is no Variation in Social Distancing Across Countries

To examine the effectiveness of social-distancing measures, we need to consider timing. First, social distancing measures may be introduced in response to a pandemic. Later on, we might see that countries or US states that imposed more severe restrictions were able to slow down the spread of the virus more. However, now we encounter a new problem. Most countries and states responded to the declaration of Covid-19 as a pandemic by the WHO on March 11 with very similar policies (school closures). This makes it difficult to see the effects of social distancing measures because we have little variation in the predictor variable. We simply do not have a large group of countries with a Covid-19 epidemic that did nothing. This means we lack a proper control group to see whether spread in these countries would be bigger than in countries with severe lock-downs. Even countries like the UK closed schools and bars in mid-March.

Sweden is often used as the example of a country that did not impose severe restrictions on citizens and kept schools open. It is difficult to evaluate the outcome of this political decision. Proportional to the population, Sweden ranks number 6 in the world in terms of Covid-19 deaths, but what is a proper comparison standard? Italy and Spain had more severe restrictions and more deaths, but their epidemics started earlier than in Sweden. Other Nordic countries like Norway, Denmark, and Finland have much lower fatalities than Sweden. This suggests that social distancing is effective in reducing the spread, but we do not have enough data for rigorous statistical analysis.

Social Distancing Policies Explain Trajectories of Covid-19 spread in hot-spots.

One advantage of epidemics is that it is possible to foresee the future because exponential growth produces a very notable trajectory over time that is hard to miss in statistical analyses. If every individual infects two or three other people, the number of cases will grow exponentially until a fairly large proportion of the population is infected. This is not what happened in Covid-19 hot spots. Let’s examine New York as an example. In mid-March, the number of detected cases and deaths increased exponentially, with numbers doubling every three days.
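
To get a feeling for what doubling every three days means, here is a small sketch of unchecked exponential growth. The starting value of 100 cases is a hypothetical number chosen for illustration; only the three-day doubling time comes from the text.

```python
def cases_after(days, initial_cases, doubling_time_days=3.0):
    """Cases after `days` of unchecked growth with a fixed doubling time."""
    return initial_cases * 2 ** (days / doubling_time_days)


# Hypothetical starting point of 100 cases.
for day in (0, 3, 6, 15, 30):
    print(f"day {day:2d}: {cases_after(day, initial_cases=100):,.0f} cases")
```

Thirty days at this pace means ten doublings, a roughly thousand-fold increase. A curve that peaks after a few weeks is therefore a clear departure from the unchecked trajectory.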

The number of new cases peaked in the beginning of April and has been decreasing until now. One possible explanation for this pattern is that social-distancing policies that were mandated in mid-March were effective in slowing down the spread of the virus. Anybody who claims that lock-downs are ineffective needs to provide an alternative explanation for the trajectory of Covid-19 cases and deaths over time.

Once more, it is difficult to show empirically what would have happened without “lock-downs”. The reason is that even in countries that did not impose strict rules, people changed their behaviors. Again, we can use Sweden as an example of a country without ‘lock-down’ laws. As in New York, we see that rapid exponential growth was slowed down. This did not happen while people were living their lives as they did in January 2020. It happened because many Swedes changed their behaviors.

The main conclusion is that the time period from March to May makes it very difficult to examine scientifically what measures were effective in preventing the spread of the virus and what measures were unnecessary. How much does wearing masks help? How many lives are saved by school closures? The honest answer to these important questions is that we do not know, because there was insufficient variation in the response to the pandemic across nations or across US states. Most of the variation in Covid-19 deaths is explained by the connectedness of countries or states to the world.

Easing Restrictions and Covid-19 Cases

The coming months provide a much better opportunity to examine the influence of social distancing policies on the pandemic. Unlike New Zealand and a few other countries, most countries do have community transmission of Covid-19. The United States provides a naturalistic experiment because (a) the country has a large population and therefore many new cases each day and (b) social distancing policies are made at the level of the 50 states.

Currently, there are still 20,000 new confirmed (!) positive cases per day in the United States. There are also still over 1,000 deaths per day.

There is also some variation across states in the speed and extent to which states ease restrictions on public life (NYT.05.20). Importantly, there is no state where residents are just going back to life as it was in January of 2020. Even states like Georgia that have been criticized for opening early are by no means back to business as usual.

So, the question remains whether there is sufficient variance in opening measures to see potential effects in case-numbers across states.

Another problem is that it is tricky to measure changes in case-numbers or deaths when states have different starting levels. For example, in the past week New York still recorded 41 deaths per 1 Million inhabitants, while Nebraska recorded only 13 deaths per 1 Million inhabitants. However, in terms of percentages, cumulative deaths in New York increased by only 3%, whereas the increase in Nebraska was 23%. While a strong ‘first wave’ accounts for the high absolute number in New York, it also accounts for the low percentage value. A better outcome measure may be whether weekly numbers are increasing or decreasing.
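
The contrast between absolute and relative change can be worked through with the four numbers above. This sketch backs out the cumulative death toll before the week that the percentages imply. Because the percentages in the text are rounded, the implied baselines are only approximate, and the helper name is my own.

```python
# Figures quoted in the text (past week, per 1 million inhabitants).
weekly_deaths_per_million = {"New York": 41, "Nebraska": 13}
weekly_pct_increase = {"New York": 3, "Nebraska": 23}


def implied_prior_cumulative(weekly, pct_increase):
    """Cumulative deaths per million before the week, implied by the
    weekly count and the percentage increase in cumulative deaths."""
    return weekly / (pct_increase / 100)


for state in weekly_deaths_per_million:
    prior = implied_prior_cumulative(
        weekly_deaths_per_million[state], weekly_pct_increase[state]
    )
    print(f"{state}: about {prior:.0f} cumulative deaths per million before the week")
```

New York's implied baseline is more than twenty times larger than Nebraska's, which is why a larger absolute number turns into a much smaller percentage.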

Figure 1 shows the increase in Covid-19 deaths in the past 7 days (May 14 – May 20) compared to the 7 days after some states officially eased restrictions (May 2 – May 8).

It is clearly visible that states that are still seeing high numbers of deaths are not easing restrictions (CT, NJ, MA, RI, PA, NY, DE, IL, MD, LA). It is more interesting to compare states that did not see a big first wave but vary in their social distancing policies. I therefore limited the analysis to the remaining states.

States below the regression line are showing faster decreases than other states, whereas states above the regression line show slower decreases or increases. When the opening policies on May 1 (NYT) are used as predictors of deaths in the recent week with deaths two weeks before as covariate, a positive relationship emerges, but it is not statistically significant. It is a statistical fallacy to infer from this finding that policies have no influence on the pandemic.
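
For readers who want to see the mechanics, here is a rough sketch of this kind of analysis. It is not my actual analysis or data: the 40 "states" are simulated with a made-up true policy effect of 1 death per million per policy step, and the confidence interval uses a normal approximation rather than the exact t distribution. The slope of the policy variable, controlling for earlier deaths, is obtained by regressing residuals on residuals.

```python
import random
from statistics import NormalDist, mean


def residuals(y, x):
    """Residuals of a simple least-squares regression of y on x."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    return [b - (my + slope * (a - mx)) for a, b in zip(x, y)]


def partial_slope(y, predictor, covariate):
    """Slope of `predictor` in y ~ predictor + covariate, with its
    standard error, computed by residualizing on the covariate."""
    ry = residuals(y, covariate)
    rx = residuals(predictor, covariate)
    sxx = sum(v * v for v in rx)
    slope = sum(a * b for a, b in zip(rx, ry)) / sxx
    rss = sum((b - slope * a) ** 2 for a, b in zip(rx, ry))
    se = (rss / (len(y) - 3) / sxx) ** 0.5  # 3 estimated parameters
    return slope, se


# Simulated data only: hypothetical earlier deaths, policy codes 1 to 3,
# and an assumed true policy effect of 1.0 plus noise.
rng = random.Random(0)
prior = [rng.uniform(1, 20) for _ in range(40)]
policy = [rng.choice([1, 2, 3]) for _ in range(40)]
recent = [0.8 * d + 1.0 * p + rng.gauss(0, 3) for d, p in zip(prior, policy)]

slope, se = partial_slope(recent, policy, prior)
z = NormalDist().inv_cdf(0.975)  # normal approximation to the 95% interval
ci = (slope - z * se, slope + z * se)
print(f"policy slope = {slope:.2f}, 95% CI = ({ci[0]:.2f}, {ci[1]:.2f})")
```

With few states and little variation in policies, the interval is wide. An interval that includes zero is compatible with no effect, but it is equally compatible with every other value inside it.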

More important is the effect size, which is likely to be somewhere between -2 and +4 deaths per million. This may seem a small difference, but we have to keep in mind that there is little variation in the predictor variable. Remember, even in Georgia where restaurants are open, the number of diners is only 15% of the normal number. The hypothetical question is how much bigger the number of Covid-19 cases would be if restaurants were filled to capacity and all other activities were back to normal. It is unlikely that citizens of open states are willing to participate in this experiment. Thus, data alone simply cannot answer this question.


Empirical science relies on data and data analysis. However, data are only necessary and not sufficient to turn a graph into science. Science also requires proper interpretation of the results and honest discussion of their limitations. It is true that New York has more Covid-19 deaths than South Dakota. It is also true that some states like South Dakota never imposed severe restrictions. This does not imply that stay-at-home orders in New York caused more Covid-19 deaths. Similarly, the lack of a correlation between Covid-19 policies and Covid-19 cases or deaths across US states does not imply that these policies have no effect. Another explanation is that there are no states that had many deaths and did not impose stay-at-home orders. For this reason, experts have relied on models of epidemics to simulate what would have happened if New York City had not closed schools, bars, and night clubs. These simulations suggest that the death toll would have been even greater. The same simulations also suggest that many more lives could have been saved if New York City had been closed down just one week earlier (NPR). Models may sound less scientific than hard data, but data are useless and can be misleading when the necessary information is missing. The social-distancing measures that were imposed world-wide did reduce the death toll, but it is not clear which measures reduced it by how much. The coming months may provide some answers to these questions. S. Korea quickly closed bars after one super spreader infected 40 people in one night (businessinsider). What will happen in Oklahoma where bars and nightclubs are reopening? Personally, I think the political conflict about lock-downs is unproductive. The energy may be better spent on learning from countries that have been successful in controlling Covid-19 and that are able to ease restrictions.

Politics vs. Science: What Drives Opening Decisions in the United States?

The New York Times published a map of the United States that shows which states are opening up today on May 1.

I coded these political decisions on a 1 = shut down or restricted to 3 = partial reopening scale and examined numerous predictor variables that might drive the decision to ease restrictions.

Some predictor variables reflect scientific recommendations such as the rate of testing or the number of deaths or urbanization. Others reflect political and economic factors such as the percentage of Trump supporters in the 2016 election.

The two significant predictors were the number of deaths adjusted for population (on a log-scale) and support for Trump in the 2016 election. The amount of testing that is being carried out in different states was not a predictor.

Another model showed that states that have not been affected by Covid-19 are more likely to open. These are states where the population is more religious, White, and rural.

It was not possible to decide which of these variables are driving the effect because predictor variables are too highly correlated. This simply shows the big divide between “red”, rural, religious states and “blue,” agnostic, and urban states.

A bigger problem than differences between states is probably differences within states between urban centers and rural areas, where a single state-wide policy is unlikely to fit the needs of both urban and rural populations. A big concern remains that decisions about opening are not related to testing, suggesting that some states that are opening do not have sufficient testing to detect new cases that may start a new epidemic.

Covid-19 in Quebec versus Ontario: Beware of Statistical Models

I have been tracking the Covid-19 statistics of Canadian provinces for several weeks (from March 16 to be precise). Initially, Ontario and Quebec were doing relatively well and had similar statistics. However, over time the case numbers increased, deaths (especially in care homes) increased, and the numbers diverged. The situation in Quebec was getting worse and recently the number of deaths relative to the population was higher than in the United States. Like many others, I was surprised and concerned when the Premier of Quebec announced plans to open businesses and schools sooner rather than later.

I was even more surprised when I read an article on the CTV website that reported new research that claims the situation in Quebec and Ontario is similar after taking differences in testing into account.

The researchers base this claim on a statistical model that aims to correct for testing bias and that is able to estimate the true number of infections on the basis of positive test results. To do so without a representative sample of tests seems rather dubious to most scientists. So, it would be helpful if the researchers could provide some evidence that validates their estimates. A simple validation criterion is the number of deaths. Regions that have more Covid-19 infections should also have more deaths, everything else being equal. Of course, differences in age structures or infections in care homes can create additional differences in deaths (i.e., the case-fatality rates can differ), but there are no big differences between Quebec and Ontario in this regard as far as I know. So, is it plausible to assume that Quebec and Ontario have the same number of infections? I don’t think so.

To adjust for the difference in population size, all Covid-19 statistics are expressed relative to population. The table shows that Ontario has 1,234 confirmed positive cases per 1 million inhabitants while Quebec has 3,373 confirmed positive cases per 1 million residents. This is not a trivial difference. There is also no evidence that the higher number in Quebec is due to more testing. While Ontario has increased testing lately, testing remains a problem in Quebec. Currently, Ontario has tested more (21,865 tests per million) than Quebec (19,471 tests per million). This also means that the positive rate (percentage of positive tests; positives/tests*100) is much higher in Quebec than in Ontario. Most important, there are 741 deaths per 10 million residents in Ontario and 2,157 deaths per 10 million residents in Quebec. That means there are 2.91 times more deaths in Quebec than in Ontario. This matches the difference in cases, where Quebec has 2.73 times more cases than Ontario. It follows that Ontario and Quebec also have similar case-fatality rates of 6.00% and 6.39%. That is, out of 100 people who test positive, about 6 die of Covid-19.
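
The arithmetic behind these comparisons is simple enough to check directly. The sketch below uses only the figures quoted in this paragraph and treats the figure of 2,157 deaths per 10 million as Quebec's; the dictionary layout and function names are my own.

```python
# Per-capita figures quoted in the text.
stats = {
    "Ontario": {"cases_per_m": 1234, "tests_per_m": 21865, "deaths_per_10m": 741},
    "Quebec": {"cases_per_m": 3373, "tests_per_m": 19471, "deaths_per_10m": 2157},
}


def positive_rate(p):
    """Percentage of tests that came back positive."""
    return p["cases_per_m"] / p["tests_per_m"] * 100


def case_fatality_rate(p):
    """Deaths as a percentage of confirmed positive cases."""
    deaths_per_m = p["deaths_per_10m"] / 10
    return deaths_per_m / p["cases_per_m"] * 100


for name, p in stats.items():
    print(
        f"{name}: positive rate {positive_rate(p):.1f}%, "
        f"case-fatality rate {case_fatality_rate(p):.2f}%"
    )

on, qc = stats["Ontario"], stats["Quebec"]
death_ratio = qc["deaths_per_10m"] / on["deaths_per_10m"]
case_ratio = qc["cases_per_m"] / on["cases_per_m"]
print(f"Quebec vs. Ontario: {death_ratio:.2f}x deaths, {case_ratio:.2f}x cases")
```

The positive rates come out at roughly 5.6% for Ontario and 17.3% for Quebec. If the gap in cases were only an artifact of testing, the deaths would not line up with the cases this closely.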

In conclusion, there is absolutely no evidence for the claim that the Covid-19 pandemic has affected Ontario and Quebec to the same extent and that differences in testing produce misleading statistics. Rather, case numbers and deaths consistently show that Quebec is affected three times worse than Ontario. As the false claim is based on the Montreal authors’ statistical model, we can only conclude that their model makes unrealistic assumptions. It should not be used to make claims about the severity of Covid-19 in Ontario, Quebec, or anywhere else.

Coronavirus in Canada: How have we been doing so far?

Like many colleagues who are professional scientists, I have followed the COVID-19 pandemic closely. We have been warned against turning into “armchair epidemiologists” who are not trained to analyze or understand epidemiological data. The problem is that the real experts are very busy analyzing data and modeling data and don’t seem to have time to communicate with the public about their results or data. So, the general public is left in the dark or is told about the COVID-19 numbers by politicians or journalists, who are not trained in data analysis at all. Some of the numbers that are being shared are doing more harm than good. For example, it is not helpful to compare the number of cases who tested positive in the UK and Canada because Canada has done a lot more testing than the UK, especially when we take population size into account. As I demonstrated elsewhere, taking testing rates into account, Canada looks better than the UK (Schimmack, March 30, 2020).

Ultimately, the most important statistic is not the number of people who tested positive or how many people were infected by the virus, but how many people died of COVID-19. These numbers were small and not very informative in the beginning, but all over the world these numbers are increasing at a rapid rate. This makes it possible to compare Canada to other countries, and to compare Canadian provinces to each other, or to US states. For this purpose, I recorded the cumulative fatality rates in the UK, USA, and Canada, as well as six states in the US and four Canadian provinces: Alberta, British Columbia, Ontario, and Quebec. Figure 1 shows the results.

Figure 1 adjusts for population size. Deaths are computed as the number of deaths per 1 million inhabitants. In New York, 150 people out of 1 million people have died so far. The graph shows that this high fatality rate is unique. In comparison, Canada has recorded only 5 deaths for every 1 million Canadians. This is 30 times fewer deaths than in New York state. The UK has recorded 54 deaths for every million residents. This is 10 times more than in Canada. The USA has recorded 21 deaths for every 1 million residents. This is still 4 times more than in Canada. Thus, in comparison, Canada is doing well. Although some other countries (e.g. Australia) are doing better, we are lucky that COVID-19 has not created a major crisis here.
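
The comparisons in this paragraph are plain ratios of deaths per million, which this short sketch reproduces from the quoted figures (the helper name is my own).

```python
# Deaths per 1 million inhabitants, as quoted in the text.
deaths_per_million = {"New York": 150, "UK": 54, "USA": 21, "Canada": 5}


def ratio_to_canada(region):
    """How many times higher the death rate is than Canada's."""
    return deaths_per_million[region] / deaths_per_million["Canada"]


for region in ("New York", "UK", "USA"):
    print(f"{region}: {ratio_to_canada(region):.1f} times Canada's rate")
```

The exact ratios are 30, 10.8, and 4.2, which the text rounds to 30, 10, and 4.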

Luck is probably not the only reason why Canada has been doing relatively well. It is well-known that the response by Canadian governments (federal and provincial) differed from the response in the UK and in the United States. Canada was faster to conduct more tests, to close schools, and to introduce social distancing measures. Although the data do not prove that these measures caused the better outcome in Canada, it is a plausible and probable explanation. Every Canadian who has complied with social-distancing rules has contributed to the positive outcome so far.

The fairly uniform response in Canada may also explain why there is little variation across Canadian provinces that all cluster at the bottom of the graph. Although BC had an early outbreak, it was able to get on top of it and is now seeing a flattening of the curve (not visible in the graph).

New York

In contrast, US states differ dramatically from each other. The most notable exception is New York. Although the numbers are for the entire state, the death toll is mostly due to a crisis in New York City. I have been asking colleagues and reading newspaper articles to understand why NY has been hit so hard. As the numbers are adjusted for population size, size is not the answer. A more plausible factor is population density, which is mentioned as a key factor. Of course, the virus is spreading more in urban areas all over the world, but I find it hard to believe that this is the only explanation. I have looked at data from Germany, where cities like Berlin have much lower death rates than New York. Like many Berliners, I did not have a car and used public transport to get around the city. If public transport is a major risk factor, the situation in Berlin should be worse than it is. The same is true of Montreal and Toronto. For some reason, Canadian big cities have been spared the fate of New York City. I included Illinois in the graph because over half of the inhabitants of Illinois live in ‘Chicagoland.’ Although Illinois is more affected than Canada or Ontario, it is nothing like New York or even Detroit, Michigan. I don’t have an answer for the unique situation in New York (and New Jersey), but the good news is that neither Canada nor other US states are on the same trajectory.


Like British Columbia, Washington was hit early. It had the highest fatalities on March 18, when the graph starts. However, Washington has been able to ‘flatten the curve’ and to keep fatalities relatively low. This shows that it is possible to flatten the curve and that we can examine what measures Washington put in place to do so. Internationally, Asian countries have also been able to flatten the curve, but it is an open question whether Europeans and North Americans can do so. Washington and British Columbia show that it is possible.

Predicting the Future

You do not need to be a rocket scientist or an epidemiologist to understand the numbers of the past. Training in epidemiology is more important when it comes to predicting the future. Of course, predicting the future is also much harder because we are making educated guesses (guesstimates) about the future based on data from the past. In this way, epidemiology is more like forecasting the weather, stock markets, or climate. Many Canadians like me have been frustrated by the reluctance of scientists or politicians in Canada to share their models and the predictions of these models. A notable exception was the premier of Ontario, Doug Ford, who shared some predictions on April 3 (CBC). The headline figures were that over the full course of the pandemic, which could last up to two years, 3,000 to 15,000 people in Ontario may die of the coronavirus. Of course, a prediction over a two-year time frame is extremely uncertain and depends on many unknown factors such as how quickly a vaccine will be available. Personally, I found it more important to learn that experts expect about 1,600 deaths in Ontario by the end of this month (National Post), if the current measures stay in place. This translates into 123 deaths per 1 million citizens, compared to the current number of 5 deaths per 1 million. That is a 25-fold increase over the next weeks until April 30. In the graph, this level is well above the UK and not much below the current level of 150 deaths / million in NY. This is a scary and sobering thought. Our minds are not used to thinking in terms of exponential growth. We tend to extend graphs in a linear fashion, and the trajectory in Canada seems reassuring, but it is not. Michigan’s trajectory in the figure shows what can happen in a short period of time. I am not an epidemiologist and I don’t have a model or a crystal ball to predict the future. I am only sharing with Canadians how things have played out so far.
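
To see why a 25-fold increase is not as far-fetched as a linear reading of the graph suggests, it helps to count doublings. The sketch below uses the 5 and 123 deaths per million from the text. The number of days until April 30 is a placeholder assumption (27 days), not a figure from the projection.

```python
import math

current_per_million = 5      # current deaths per million (from the text)
projected_per_million = 123  # projected deaths per million by April 30

fold_increase = projected_per_million / current_per_million
doublings = math.log2(fold_increase)

# Assumption for illustration only: 27 days remain until April 30.
days_remaining = 27
implied_doubling_time = days_remaining / doublings

print(f"fold increase: {fold_increase:.1f}")
print(f"doublings needed: {doublings:.2f}")
print(f"implied doubling time: {implied_doubling_time:.1f} days")
```

Fewer than five doublings are enough, so under this assumption deaths would only need to double about every six days, which is slower than the three-day doubling seen in the worst hot spots.
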
We are probably not even at the end of the first period of a hockey game that may well go into several overtimes. So far, we are up 1:0, but as all Canadians know, that doesn’t mean we have won the game or that we can relax. All we know is that we played pretty well in the first period, and that we have a fighting chance if we continue to play well.