We first explore some of the exogenous factors that affect the spread of SARS coronavirus 2 (SARS-CoV-2) and of the Covid-19 disease. One factor that drew speculation early in the pandemic was temperature, and whether it slows or speeds up the spread of the coronavirus within a country. It was suggested that cold, dry weather accelerates the spread of the disease. We test this hypothesis using the mean temperature over the relevant period (tempavg).
Since cases and deaths exploded in Italy, old age has been recognized as one of the factors in the spread of the disease. Is it mainly a factor in the fatality rate, or does it also affect the domestic spread of the disease? If it does affect the spread, is the relevant group the population over 60 (popmge60) or a higher age bracket?
A few studies attributed the spread of the Spanish Flu in England to the slums of London. Within-country data, where available, suggest that cases are concentrated in metros and large cities, with noticeably fewer cases in rural areas or relatively rural states. The degree of urbanization may therefore be a factor. We test for this using the log of urban population (lpopurban).
S-curve analysis for a number of countries suggests that the number of cases rises almost linearly with time during the middle phase of the spread of the virus. As few countries have flattened out at the top of the S curve, and the reports for those said to have reached their peaks are highly suspect, we test a linear time variable, daycvc.
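One way to see why a linear time term is a reasonable approximation in the middle phase is to assume, purely for illustration, that cumulative cases follow a logistic S curve. The functional form is an assumption made here, not something specified in this analysis, and the symbols K, r and t_0 are introduced only for this sketch. Expanding to first order around the inflection point gives a straight line in time:

```latex
C(t) = \frac{K}{1 + e^{-r\,(t - t_0)}},
\qquad
C(t) \approx \frac{K}{2} + \frac{K r}{4}\,(t - t_0)
\quad \text{for } t \text{ near } t_0 .
```

Here K is the eventual plateau, r the growth rate and t_0 the date of the inflection point. The slope Kr/4 is the derivative of the logistic curve at t_0, so the approximation holds only while a country is still well short of its plateau.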
We use cross-country data (only) to determine which of these variables help explain the differences in the number of coronavirus cases among countries as of the end of the third week of May. The results of the cross-country regressions are presented in Table 1. They show that the number of days since the first case and the mean temperature are significant in explaining the domestic spread of the coronavirus. The urban population is highly significant when included along with the first two variables (reg 2, Table 1). The aged population, whether the cutoff is taken as 80, 70 or 60 years, is also significant when introduced along with the first two variables (reg 1, Table 1). Because of the high correlation of 0.96 between the age variables and the urban population, the former are no longer significant when introduced along with the latter (reg 3, table 2).
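The pattern described above, where a regressor is significant on its own but loses significance once a highly correlated regressor is added, can be reproduced in a small simulation. The sketch below uses synthetic data only: the sample size, coefficients and noise levels are invented for illustration and are not the estimates behind Table 1. The variable names are borrowed from the table notes, and the helper `ols_tstats` is a hypothetical name introduced here.

```python
import numpy as np


def ols_tstats(X, y):
    """OLS with an intercept; returns (coefficients, t-statistics)."""
    X = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    dof = X.shape[0] - X.shape[1]            # degrees of freedom
    sigma2 = resid @ resid / dof             # residual variance
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))
    return beta, beta / se


# Synthetic "countries": every number below is an assumption for illustration.
rng = np.random.default_rng(0)
n = 150
daycvc = rng.uniform(40, 120, n)             # days since first case
tempmean = rng.uniform(-5, 30, n)            # mean temperature
lpopurb = rng.normal(4.0, 0.5, n)            # log urban share
# Age variable built to be highly correlated with the urban variable.
lpopmge60 = 0.6 * lpopurb + rng.normal(0, 0.08, n)
# In this simulation only the urban variable truly drives cases.
lcvc = (1.0 + 0.05 * daycvc - 0.05 * tempmean
        + 1.5 * lpopurb + rng.normal(0, 1.0, n))

corr = np.corrcoef(lpopmge60, lpopurb)[0, 1]

# reg 1: days, temperature, age; reg 2: days, temperature, urban; reg 3: all.
_, t_reg1 = ols_tstats(np.column_stack([daycvc, tempmean, lpopmge60]), lcvc)
_, t_reg2 = ols_tstats(np.column_stack([daycvc, tempmean, lpopurb]), lcvc)
_, t_reg3 = ols_tstats(
    np.column_stack([daycvc, tempmean, lpopmge60, lpopurb]), lcvc)

print(f"correlation(age, urban) = {corr:.2f}")
print(f"t(age)   in reg 1 = {t_reg1[3]:.2f}")
print(f"t(urban) in reg 2 = {t_reg2[3]:.2f}")
print(f"t(age)   in reg 3 = {t_reg3[3]:.2f}")
print(f"t(urban) in reg 3 = {t_reg3[4]:.2f}")
```

In this setup the age variable picks up the effect of urbanization when entered alone, and its t-statistic shrinks once the urban variable is included. A simulation like this cannot show which of the two variables matters in the real data; it only illustrates why the two are hard to separate when they are so highly correlated.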
Table 1: Cross-country Regression results for Corona Virus Cases
Note: (1) *** means significant at the 1% level.
(2) Results are the same for lpopmge60, popmge70 & popmge80.
(3) Stringency index is not significant in best equation.
Lcvc = log of total corona virus cases in the country (latest)
Daycvc = days from date of first corona virus case in country
Tempmean = mean temperature in the country during the period.
lpopmge60 = log of percent of population above 60 years of age.
Lpopurb = log of urban share of population
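Written out with the variables defined above, the preferred specification (reg 2) can be sketched as follows. The additive form with an intercept and an error term is an assumption about how the regression is set up. The coefficient symbols are placeholders, and no estimated values are implied.

```latex
\mathrm{Lcvc}_i = \beta_0
  + \beta_1\,\mathrm{Daycvc}_i
  + \beta_2\,\mathrm{Tempmean}_i
  + \beta_3\,\mathrm{Lpopurb}_i
  + \varepsilon_i ,
\qquad i = 1, \dots, N \ \text{(countries)}
```

Under the hypothesis that cold weather accelerates the spread, \beta_2 would be negative. Reg 1 replaces Lpopurb with lpopmge60, and reg 3 includes both.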
We therefore conclude that, along with the days variable, mean temperature and urban population are the most important factors explaining the differences in cases across countries. However, these results are obtained while most countries are in the middle of the pandemic; they could change once most countries have reached the top of the S curve.
The actual and predicted numbers of cases for the preferred regression (reg 2) are plotted in Figure 1 against the urban share variable. Vietnam stands out as an outlier, with the number of reported cases much lower than predicted (in reg 2). Other Asian countries in this category are Cambodia, Nepal and Mongolia. There are also three African countries with similar errors. Under-prediction errors are highest for Qatar, Peru and Panama.
This blog originates in joint research with Dr Surjit Bhalla, Executive Director, IMF.