Saturday, June 6, 2020

Domestic Transmission of Covid19



Spread of Virus

A few primary factors are commonly assumed to be exogenous determinants of the spread of SARS Coronavirus 2 (hereafter Covid19). Now that we have more than 150 days of data, and at least one case of the virus in 202 countries (and territories), there is a strong data foundation for establishing which exogenous factors account for the observed variation in Covid19 across time and space.
Weather/Temperature: One factor speculated about early in the pandemic was the effect of temperature on diffusion. It was suggested that cold, dry weather accelerated the spread of the disease. An alternative view drew on the experience of seasonality of influenza during winter months. Both explanations point to temperature being an important factor. Of the first six countries with the virus (including the origin country, China), four had temperatures well below 13 degrees C (the mean population-weighted world temperature in January), one country, Taiwan, was close to this temperature (15 C), and only one, Thailand, was well above it at 24 C.
This “coincidence” of early diffusion is meant to be suggestive. As we will soon observe, temperature is one of the very few strong, and consistent, determinants of diffusion. Please see our draft paper “Arrival and Departure – Part I.” (https://egrowfoundation.org/research/covid-19-arrival-and-departure/ ).
Old Age, Men, and Virus Diffusion: Italian Covid19 cases and deaths exploded onto the world stage in March, and since then a large population of older men has been believed to be a strong factor behind the spread of the disease. However, it is important to ask whether age affects virus diffusion, the incidence of death (with Covid19 as the cause), or both. To test this, using country-level population data published by the UN, we extracted the male population in each country over the ages of 50, 60, 70 and 80. We generally obtain the result that age >=60 offers the highest explanatory power in regressions involving Covid cases.
Urbanization or population density: The spread of the Spanish Flu in England, a century ago, was attributed by a few studies to the slums in London. Casual inference based on within-country data suggests that cases are largely concentrated in metros and large cities, with noticeably fewer rural cases. Thus, the degree of urbanization in a country may be a factor. A parallel explanation is offered using a population density variable (population size divided by inhabitable area). However, what constitutes an inhabitable area is debatable. In any case, the degree of urbanization dominates population density in regressions, and the latter is therefore not used in our analysis.
Diffusion and Cases: Covid19 has made “flattening of the curve” a household term. The curve is a graphical representation of the spread of the virus, with the Y-axis representing the log of the number of cases and the X-axis the number of days Covid19 has been present in that country (or region, or state, etc.). The X-axis need not be the calendar date, as it is in most time-series economic analysis. For example, day 1 in any diffusion analysis can either be a calendar date, e.g. January 22nd, 2020, or the first day when the virus was observed in each country. Which specification one uses can make a large difference to the interpretation of the results.
This distinction is important because the number of observed cases on any date is affected by the number of days since the first case was observed (in that country). Historical evidence on the diffusion of communicable diseases shows an S-shaped pattern: a slow start followed by an acceleration, then a deceleration, and finally a flattening out. Both the initial speed of the rise in cases and the speed of flattening are parameters which vary between countries. Most investigators have used the logistic curve to represent this pattern. We find that for COVID19, the Gompertz curve provides a better approximation of the elongated S than the logistic curve.
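To make the difference between the two S-curves concrete, here is a minimal sketch in Python. The functional forms are the standard textbook logistic and Gompertz curves; the parameter names (K, r, t0, b, c) and all numerical values are illustrative assumptions, not estimates from our data.

```python
import math

def logistic(t, K, r, t0):
    """Symmetric S-curve: ceiling K, growth rate r, midpoint t0."""
    return K / (1.0 + math.exp(-r * (t - t0)))

def gompertz(t, K, b, c):
    """Asymmetric S-curve: ceiling K, displacement b, growth rate c."""
    return K * math.exp(-b * math.exp(-c * t))

# Illustrative (made-up) parameters, NOT estimates from the post's data.
K = 100_000.0          # assumed eventual number of cases
r, t0 = 0.15, 40.0     # assumed logistic growth rate and midpoint (days)
b, c = 20.0, 0.075     # assumed Gompertz displacement and growth rate

# Day of fastest growth: t0 for the logistic, ln(b)/c for the Gompertz.
t_inflect_gompertz = math.log(b) / c

# Share of the ceiling K already reached at the day of fastest growth.
share_logistic = logistic(t0, K, r, t0) / K                    # one half
share_gompertz = gompertz(t_inflect_gompertz, K, b, c) / K     # 1/e, about 0.37

print(f"logistic: {share_logistic:.3f} of ceiling at peak growth")
print(f"gompertz: {share_gompertz:.3f} of ceiling at peak growth")
```

The logistic curve is symmetric around its midpoint, so half of all cases occur before peak growth. The Gompertz curve reaches peak growth at roughly 37% of its ceiling and then decelerates slowly, which is one way to read the "elongated S".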
Capturing Cross-Sections: One indirect and admittedly imperfect way to approximate the determinants of a cross-section sample of countries (a snapshot) is to estimate separate regressions for countries classified according to number of days since the first observed case, hereafter daycvc. Note that this affects the sample of country observations at any point of calendar time.  For example, for a sample of 167 countries (excluding small economies or territories or islands with a population less than 500,000), there were only 29 countries with a positive case on February 15, 2020. Counting from Jan. 22nd, this would be day 24 of the pandemic. Thus, if the models were estimated by date (say February 15th) there would only be 29 countries with diffusion data. But if the data are ordered by days since the first case was observed for each individual country, there are 167 countries (as of today) with at least 24 days of virus diffusion.
We therefore adopt a novel approach for estimating diffusion models. We estimate such models for daycvc equal to different days; Table 1 presents the results for daycvc equal to 40, 60, 80 and 100 days. For day 100, there are 37 observations, i.e. only 37 countries have had a virus case for a minimum of 100 days.
Old Age Males do not explain diffusion of COVID19?
One important result that emerges is that the percentage of males in the population over the age of 60 is not significant in explaining the number of COVID cases, and this is irrespective of the definition (male population above 50, male population above 60, etc.) and irrespective of the time elapsed (daycvc 40, 60, 80 or 100). This counterintuitive result could be because every country took greater care to protect its aged population from catching the virus, after learning from the world-wide experience of fatality rates among the aged, and especially among aged males. As explained in our next blog, aged males are more likely to die from the virus, as was confirmed by the “vision” of the Italian experience.
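The re-ordering of the data by days since the first case can be sketched as follows. This is a minimal illustration, not our estimation code: the function names, the panel layout (a dictionary of calendar-date series per country) and the two countries "A" and "B" with their case counts are all hypothetical.

```python
from datetime import date, timedelta

def reindex_by_first_case(daily_cases):
    """Map {calendar date: cumulative cases} to {daycvc: cumulative cases}.

    daycvc = 1 on the first calendar date with at least one case.
    """
    dates_with_cases = [d for d, n in daily_cases.items() if n > 0]
    if not dates_with_cases:
        return {}
    first = min(dates_with_cases)
    return {(d - first).days + 1: n for d, n in daily_cases.items() if d >= first}

def cross_section(panel, daycvc):
    """Cases on a given daycvc, for every country observed at least that long."""
    out = {}
    for country, daily in panel.items():
        by_day = reindex_by_first_case(daily)
        if daycvc in by_day:
            out[country] = by_day[daycvc]
    return out

# Hypothetical panel: 30 calendar days starting January 22nd, 2020.
start = date(2020, 1, 22)
panel = {
    # Country A: first case on the first calendar day.
    "A": {start + timedelta(days=i): 1 + i for i in range(30)},
    # Country B: no cases for 10 days, first case on the 11th calendar day.
    "B": {start + timedelta(days=i): max(0, i - 9) for i in range(30)},
}

print(cross_section(panel, 24))  # only A has 24 days of virus history
print(cross_section(panel, 20))  # both A and B have at least 20 days
```

In this toy panel a calendar-date snapshot would compare country A on its 30th day with country B on its 20th; indexing by daycvc compares both on the same day of their own outbreak.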
 
Table 1: Determinants of the Spread of COVID19 within a country

                                    Days since the first day COVID19 was observed
Variable                            40          60          80          100
--------------------------------------------------------------------------------
% male population > 60              0.0315      0.0411      -0.0591     -0.0220
(UN)                                (0.58)      (0.74)      (-0.91)     (-0.24)

% urban                             0.0292***   0.0319***   0.0390***   0.0231
(World Bank WDI)                    (3.90)      (4.23)      (4.17)      (1.30)

Mean temperature                    -0.0491*    -0.0628**   -0.0676**   -0.106*
(average for days of virus          (-2.36)     (-3.16)     (-2.83)     (-2.53)
in each country)

Constant                            4.589***    5.821***    7.162***    9.703***
                                    (6.30)      (7.72)      (7.59)      (5.03)
--------------------------------------------------------------------------------
Number of observations              170         168         129         36
adj. R-sq                           0.21        0.27        0.21        0.20

t statistics in parentheses
* p<0.05, ** p<0.01, *** p<0.001

Data Sources: COVID19 data, JHU, World Bank, UN

Temperature: Among the three major explanatory variables, temperature is the only one that is significant at every horizon, and it has the expected negative sign, i.e. the higher the temperature, the lower the diffusion. The coefficient is fairly stable between day 40 and day 80 (about -0.05 to -0.07) and rises in magnitude to -0.106 on day 100. Evaluated at the mean on day 100, each 1 degree increase in temperature is associated with 8243 fewer cases (coefficient -0.106, mean number of cases 77767). There is also some evidence that the importance of this variable has strengthened over time. This could be due to the arrival of summer in the northern hemisphere and of winter in the southern hemisphere, slowing the spread of COVID cases in the former and accelerating it in the latter. One implication of this result is that countries which are traditionally affected strongly by the seasonal flu may see a similar pattern for COVID. The virus pattern has continually confounded experts, hence the emphasis on “may”.
The third important result is that urbanization is important in explaining the spread of the virus, and is already significant at day 40. The magnitude of the coefficient also increases with days of presence of the virus, from 0.029 for daycvc equal to 40 to 0.039 for daycvc equal to 80; at day 100, with only 36 observations, it is no longer significant. This result is supportive of the social distancing hypothesis.
There is no precedent for this rather unique pandemic. It has befuddled policy makers, economists, and epidemiologists, and tied them in knots. Everybody who could be proved wrong has been proved wrong, and we are all humbled by the uncertain uncertainty of COVID.
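The arithmetic behind the 8243 figure can be checked with a short sketch. It assumes that the dependent variable in Table 1 is the natural log of cases (consistent with the log scale used for the curve), so that a coefficient is approximately a proportional change in cases; the variable names are ours, and the "exact" line is an added comparison, not a number from the analysis.

```python
import math

beta_temp = -0.106    # temperature coefficient quoted in the text (Table 1, day 100)
mean_cases = 77767    # mean number of cases quoted in the text

# Linear approximation used in the text: d(cases)/d(temp) = beta * cases,
# ASSUMING the regression is of log(cases) on temperature.
linear_change = beta_temp * mean_cases

# Exact change implied by a log-linear model for a full 1 degree step.
exact_change = mean_cases * (math.exp(beta_temp) - 1.0)

print(round(linear_change))   # -8243, the figure quoted in the text
print(round(exact_change))    # somewhat smaller in magnitude than the linear figure
```

The two numbers differ because the linear approximation is only accurate for small changes; either way, the implied effect at the mean is a reduction of roughly 10% in cases per degree.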
 
The chart represents the combined effect of all three factors on the prediction of cases. For example, the error in prediction seems to be highest for the USA (it is the largest positive distance away from the red line of equality between predicted and actual cases). Sri Lanka and Canada are at opposite ends of the temperature and COVID spectrum, yet both are on the line. The countries below the line are the good performers, with fewer actual cases than predicted; those farther below the line are the better performers, e.g. Nepal.
 
Chart 1: Actual and Predicted COVID cases on Day 80 of virus in each country
 

 
