Despite tremendous efforts by scientists to aid in the fight against Covid-19, many important questions lack a clear scientific answer. It is known that children can be infected by the virus and that in rare cases the virus can produce severe symptoms or even death. However, it is unclear whether children have a lower risk of getting infected and whether they are less likely to infect others. On May 7, Nature wrote that “scientists are still trying to understand what the deal is with kids and COVID-19”.
A major problem is that most countries responded to the Covid-19 pandemic by closing schools and minimizing children’s contact. Thus, a lower number of children among people testing positive for Covid-19 might simply reflect less exposure to the virus. A notable exception is Sweden, where schools were not closed. Unfortunately, Sweden did not test children, teachers, or parents to examine whether children transmitted the virus (Science, May 22).
A German study of viral load in children and adults suggested that children are no different from adults, but this study has been criticized on methodological grounds (Science Media Centre). A key problem with this claim is that it is impossible to prove the absence of a difference. It is only possible to estimate the size of a difference and to note whether the observed difference is statistically distinguishable from zero. In this study, the sample size is small and there is ample statistical uncertainty. Thus, no firm conclusions can be drawn from this study alone.
My colleagues (Shigehiro Oishi, Youngjae Cha, and Bansi Javiya) and I have been analyzing the open data about Covid-19 deaths and cases in New York City and have used US census data to predict variation in positive cases and deaths across New York City. While we are still working on this project, we would like to share some interesting results about children and Covid-19 that emerged in our analyses. Before we do so, we want to make clear that ZIP code data are not the best data to examine this question, that our results are preliminary, that even if our results hold up they do not provide conclusive evidence, and that the results cannot and should not be used to make policy recommendations about opening schools or not. The main purpose of this blog post is to share information with scientists who are interested in this question and to add a tiny piece of information to the big puzzle.
Occupants and Covid-19
While the role of children in the transmission of Covid-19 is still unclear, the evidence for transmission at home is much stronger. It makes logical sense that the infection rate is greater if more people share the same living space. In addition, crowded living conditions have been linked to higher rates of Covid-19. For this reason, we looked at several predictors from the US census that reflect crowding. The best predictor was the percentage of residences with more than 1 occupant per room.
We used several measures of Covid-19 prevalence. All of them showed a positive correlation with occupants, but the correlations were stronger for the positive rate (positives / tests) than for the positives per capita (positives / population * 100,000). One possible explanation for this is that the amount of testing varies as a function of other factors. It made little difference whether we used the raw numbers or residuals that controlled for differences between boroughs, so we used the raw scores. The correlation with the positive rate was r = .54 and the correlation with deaths was r = .20.
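To make the two prevalence measures and the correlation concrete, here is a minimal Python sketch. The ZIP code rows and the function names are made up for illustration; they are not the NYC data and not the code used for the analyses reported here.

```python
from math import sqrt


def positive_rate(positives, tests):
    """Share of tests that came back positive (positives / tests)."""
    return positives / tests


def positives_per_100k(positives, population):
    """Positive cases per 100,000 inhabitants."""
    return positives / population * 100_000


def pearson_r(xs, ys):
    """Pearson correlation between two equally long lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Made-up ZIP code rows for illustration only (NOT the NYC data).
# crowded_pct: percentage of residences with more than 1 occupant per room.
zip_rows = [
    {"crowded_pct": 3.0, "positives": 200, "tests": 1000, "population": 40000},
    {"crowded_pct": 8.0, "positives": 450, "tests": 1500, "population": 50000},
    {"crowded_pct": 12.0, "positives": 700, "tests": 1750, "population": 60000},
    {"crowded_pct": 15.0, "positives": 600, "tests": 1200, "population": 30000},
]

crowding = [row["crowded_pct"] for row in zip_rows]
rates = [positive_rate(row["positives"], row["tests"]) for row in zip_rows]
per_capita = [
    positives_per_100k(row["positives"], row["population"]) for row in zip_rows
]

print("r(crowding, positive rate):", round(pearson_r(crowding, rates), 2))
print("r(crowding, positives per 100k):", round(pearson_r(crowding, per_capita), 2))
```

The two measures share a numerator but differ in the denominator, which is why they can correlate differently with the same predictor when the amount of testing varies across ZIP codes.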
This simple correlation does not prove causality. There are a host of other variables that may explain this relationship. We tried to address this issue by including other variables in a causal model.
Children and Covid-19
The number of children living in an area defined by a NYC ZIP code was also a predictor of positive cases, r = .53, and deaths, r = .39. It is important to realize that this is an analysis of ZIP codes and not of individuals. It is well known that children are at a much lower risk of dying from Covid-19, so at the level of individuals the link between being a child and dying is actually negative. The positive correlation across ZIP codes must therefore reflect some other causal mechanism. One possibility (out of many) would be that children can infect older people in the same household who are at a high risk of dying when they get infected. This ‘theory’ implies transmission of the virus from young children to old people who live together.

When we fitted this model to the data, it indeed showed a causal path from children to occupants to positive cases to deaths. This path implied that for every 10 percentage point increase in the proportion of children, an additional 2 people per 100,000 would die. The average is 17 deaths per 100,000 inhabitants. So, an increase of 2 deaths is a 12% increase.
The model also shows that there is still an unexplained positive relationship between children and risk of infection. This path would contribute another 4 deaths per 100,000.
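The arithmetic behind these percentages can be checked with a short Python sketch. The numbers are the ones reported above; the variable and function names are just placeholders, and adding the two paths together assumes that their contributions are additive, as is usual for direct and indirect effects in a path model.

```python
BASELINE_DEATHS_PER_100K = 17   # average deaths per 100,000 inhabitants
VIA_OCCUPANTS = 2               # extra deaths via the occupants path
UNEXPLAINED = 4                 # extra deaths via the unexplained path


def percent_increase(extra_deaths, baseline):
    """Extra deaths expressed as a percentage of the baseline death rate."""
    return extra_deaths / baseline * 100


print(round(percent_increase(VIA_OCCUPANTS, BASELINE_DEATHS_PER_100K)))  # 12
print(round(percent_increase(UNEXPLAINED, BASELINE_DEATHS_PER_100K)))    # 24
print(round(percent_increase(VIA_OCCUPANTS + UNEXPLAINED,
                             BASELINE_DEATHS_PER_100K)))                 # 35
```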
These results show that ZIP codes with more children have more deaths, and that this relationship is partially explained by children adding to the number of people in a residence. However, once more these results have to be interpreted with caution because important predictor variables are missing from the model.
Income
One potential confounding factor is income. Several analyses of the NYC data have shown that Covid-19 is more prevalent and deadly in ZIP codes with lower income. Thus, we added income to the model.

Higher ZIP code income predicted fewer children, fewer occupants, a lower positive rate, and fewer deaths. Although the direct effect on deaths was not statistically reliable, income had a clear indirect effect on deaths by lowering the risk of getting infected. These results show that the effect of children in the previous model was inflated by ignoring the confounding effect of income. In this model, the effect of children on deaths was 2.1 deaths per 100,000 for every 10 percentage point increase in the proportion of children: 0.7 was explained by the effect on occupants and 1.4 remained a direct effect.
Ethnicity
Numerous articles found ethnic disparities in Covid-19 deaths. Thus, we added ethnicity as a predictor variable to the model. We used the percentage of White residents as the comparison group and the percentages of Asian, Black, and Hispanic residents as predictors. It is difficult to visualize the complex relationships of this model. Thus, we merely report the key finding about children and Covid-19 here.
Including ethnicity as a predictor further reduced the ‘effect’ of children on Covid-19 deaths to 1.8 deaths (0.3 indirect via occupants and 1.5 direct on the positive rate).
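To see how the total ‘effect’ splits into an indirect part (via occupants) and a direct part in the last two models, here is a small Python sketch. It only redoes the arithmetic on the coefficients reported above; the dictionary labels and the function name are placeholders, and the indirect share is our own derived number, not a model output.

```python
# (indirect via occupants, direct) extra deaths per 100,000
# for a 10 percentage point increase in the proportion of children.
models = {
    "income": (0.7, 1.4),
    "income + ethnicity": (0.3, 1.5),
}


def decompose(indirect, direct):
    """Return the total effect and the indirect share in percent."""
    total = indirect + direct
    return round(total, 1), round(indirect / total * 100)


for label, (indirect, direct) in models.items():
    total, indirect_share = decompose(indirect, direct)
    print(f"{label}: total = {total}, indirect share = {indirect_share}%")
```

In both models most of the remaining ‘effect’ of children is direct, that is, not explained by the number of occupants.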
Age
Just as there can be confounding factors that inflate relationships, some confounding variables can suppress a relationship. We found that this was the case for our measure of the percentage of residents over 65. As expected, ZIP codes with a higher proportion of older residents had more Covid-19 deaths. We also found a negative relationship between the proportion of older residents and occupants, r = -.35. Thus, the fact that high-occupancy ZIP codes tend to be younger reduced the effect of occupants on deaths. In this model, children increased deaths by 2.6 deaths per 100,000. This relationship is highly statistically significant and very unlikely to be just a random fluke. However, it is still possible that other variables that are missing from the model explain this relationship. The multiple pathways are weaker, and it is difficult to say how much they contribute to the relationship.
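The suppression pattern described above can be illustrated with a toy example in Python. The numbers are made up and chosen so that the result is easy to verify by hand: deaths are constructed as the sum of two predictors that are negatively correlated with each other. This is a sketch of the statistical idea only, not our model or our data.

```python
def centered_sum(xs, ys):
    """Sum of products of deviations from the means (n times the covariance)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))


def simple_slope(x, y):
    """OLS slope of y on x, ignoring any other predictor."""
    return centered_sum(x, y) / centered_sum(x, x)


def partial_slope(x, z, y):
    """OLS slope of y on x in the two-predictor regression y ~ x + z."""
    s_xx, s_zz, s_xz = centered_sum(x, x), centered_sum(z, z), centered_sum(x, z)
    s_xy, s_zy = centered_sum(x, y), centered_sum(z, y)
    return (s_xy * s_zz - s_zy * s_xz) / (s_xx * s_zz - s_xz ** 2)


# Made-up scores for five hypothetical ZIP codes (arbitrary units).
occupants = [1, 2, 3, 4, 5]
seniors = [5, 3, 4, 1, 2]          # crowded ZIP codes tend to be younger
deaths = [o + s for o, s in zip(occupants, seniors)]  # both raise deaths by 1

print(simple_slope(occupants, deaths))            # 0.2 (suppressed)
print(partial_slope(occupants, seniors, deaths))  # 1.0 (controlling for seniors)
```

Ignoring the senior variable makes the occupants slope look five times smaller than it is by construction; controlling for it recovers the full slope.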

Conclusion
The key finding of our analyses of the NYC Covid-19 data is that ZIP codes with more children have more Covid-19 as reflected in a higher positive rate (positives / tests) and deaths. This relationship remains after controlling for income, ethnicity, and the proportion of senior residents. The final model suggests that some of the effect is explained by crowded living conditions. These results suggest that children could be transmitting the virus as much as other occupants. However, many other explanations are possible.
The most important limitation of our work is that it relies on ZIP codes, while the actual causal process is person-to-person transmission. We think that it would be valuable to follow up on this work with studies that examine the social networks of NYC residents who contracted the virus and of those who did not. Antibody tests would be particularly useful to examine the spread of Covid-19 within households. Relevant data may already exist from contact tracing of infected individuals. We believe that tracing infections and deaths in NYC would provide useful information about children’s risk of contracting and spreading the virus.
Please feel free to contact us with related information or questions.
Excellent work – thank you for sharing! Question: can you break down by age of children? Some other research suggests that older children (10+ years) are more likely to spread than younger children (preschool, or 5-10 years). Potential differences have big policy implications.
Based on recollection, I believe we didn’t find differences by age group, but the analysis has low power to detect differences that may exist.