with Surjit Bhalla (also at https://medium.com/@time_will_tell/covid19-day-15-0-what-we-know-and-what-we-dont-know-aac6298fcfdb)
Spread of Virus
There are a few primary, commonly assumed exogenous determinants of the spread
of SARS Coronavirus 2 (hereafter Covid19). Now that we have more than 150 days
of data, and at least one case of the virus in 202 countries (and territories),
there is a strong data foundation for establishing which exogenous factors
account for the observed variation in Covid19 across time and space.
Weather/Temperature: One of the factors speculated about early in the life and
times of Covid19 was the effect of temperature on its diffusion. It was
suggested that cold, dry weather accelerated the spread of the disease. An
alternative view drew on the experience of seasonality of influenza during
winter months. Both explanations point to temperature being an important
factor. Of the first six countries with the virus (including the origin
country, China), four had temperatures much below 13 degrees C (the mean
population-weighted world temperature in January), one country, Taiwan, was
close to this temperature (15 C), and only one, Thailand, was well above it at
24 C.
This “coincidence” of early diffusion is meant to be suggestive. As we will
soon observe, temperature is one of the very few strong, and consistent,
determinants of diffusion. Please see our draft paper “Arrival and Departure –
Part I” (https://egrowfoundation.org/research/covid-19-arrival-and-departure/).
Old Age, and Men, and Virus Diffusion: Italian Covid19 cases and deaths
exploded onto the world stage in March, and since then a large population of
old-age men has been believed to be a strong factor behind the spread of the
disease. However, it is important to ask whether age affects virus diffusion,
the incidence of death (with Covid19 as the cause), or both. To test this,
using country-level population data published by the UN, we extracted the male
population in each country over the ages of 50, 60, 70 and 80. Among these
definitions, we generally find that age >=60 offers the highest explanatory
power in regressions involving Covid cases.
Urbanization or population density: The spread of the Spanish Flu in England,
a century ago, was attributed by a few studies to the slums in London. Casual
inference based on within-country data suggests that cases are largely
concentrated in metros and large cities, with noticeably fewer rural cases.
Thus, the degree of urbanization in a country may be a factor. A parallel
explanation uses a population density variable (population size divided by
inhabitable area). However, what constitutes an inhabitable area is debatable.
In any case, the degree of urbanization dominates population density in
regressions, and the latter is therefore not used in our analysis.
Diffusion and Cases: Covid19 has made “flattening of the curve” a household
term. As we all know, the curve is a graphical representation of the spread of
the virus, with the Y axis representing the log of the number of cases and the
X axis the number of days of Covid19 in that country (or region, or state,
etc.). The X axis is not necessarily the calendar date, as it is in most
time-series economic analysis. For example, day 1 in a diffusion analysis can
either be a fixed calendar date, e.g. January 22nd, 2020, or the first day on
which the virus was observed in each country. Which specification one uses can
make a large difference to the interpretation of the results.
This distinction is important because the number of observed cases on any date
is affected by the number of days since the first case was observed (in that
country). Historical evidence on the diffusion of communicable diseases shows
an S-shaped pattern: a slow start followed by an acceleration, followed by a
deceleration, and finally a flattening out. Both the initial speed of the rise
in cases and the speed of flattening are parameters which vary between
countries. Most investigators have used the logistic curve to represent this
pattern. We find that for COVID19, the Gompertz curve provides a better
approximation of the elongated S than the logistic curve.
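To make the "elongated S" concrete, here is a minimal sketch comparing the two curve families in their standard textbook forms. The parameter values are hypothetical and chosen only for illustration; they are not estimates from our data. The point it shows is the asymmetry: the logistic curve reaches its fastest growth at half of its ceiling, while the Gompertz curve reaches it at about 37% (1/e) of its ceiling, which leaves a longer deceleration phase.

```python
import math

def gompertz(t, K, b, c):
    """Gompertz curve: K * exp(-b * exp(-c * t)), with ceiling K."""
    return K * math.exp(-b * math.exp(-c * t))

def logistic(t, K, a, r):
    """Logistic curve: K / (1 + a * exp(-r * t)), with ceiling K."""
    return K / (1.0 + a * math.exp(-r * t))

# Hypothetical parameters, for illustration only.
K = 100_000.0      # eventual number of cases (ceiling)
b, c = 5.0, 0.05   # Gompertz shape and rate
a, r = 50.0, 0.10  # logistic shape and rate

# Day of fastest growth (inflection point) for each curve.
t_gompertz = math.log(b) / c
t_logistic = math.log(a) / r

# Share of the ceiling already reached at the inflection point.
share_gompertz = gompertz(t_gompertz, K, b, c) / K  # approximately 1/e
share_logistic = logistic(t_logistic, K, a, r) / K  # approximately 1/2

print(share_gompertz, share_logistic)
```

In practice one would fit both curves to each country's cumulative cases and compare the fit; the sketch only illustrates why the shapes differ.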
Capturing Cross-Sections: One indirect and admittedly imperfect way to
approximate the determinants for a cross-section sample of countries (a
snapshot) is to estimate separate regressions for countries classified
according to the number of days since the first observed case, hereafter
daycvc. Note that this affects the sample of country observations at any point
in calendar time. For example, in a sample of 167 countries (excluding small
economies, territories or islands with a population of less than 500,000),
there were only 29 countries with a positive case on February 15, 2020.
Counting from January 22nd, this would be day 24 of the pandemic. Thus, if the
models were estimated by date (say February 15th), there would be only 29
countries with diffusion data. But if the data are ordered by days since the
first case was observed in each individual country, there are 167 countries
(as of today) with at least 24 days of virus diffusion.
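The difference between the two ways of indexing time can be sketched as follows. The country labels and first-case dates are hypothetical and are not our data; the function names are ours for illustration. The sketch shows how a sample chosen by calendar date drops late-arriving countries, while a sample chosen by daycvc keeps every country that has accumulated enough days.

```python
from datetime import date

# Hypothetical first-case dates, for illustration only.
first_case = {
    "A": date(2020, 1, 22),
    "B": date(2020, 2, 10),
    "C": date(2020, 3, 5),
}

def daycvc(country, on):
    """Days since the first observed case in `country` as of date `on`.

    Returns None if the virus had not yet been observed there.
    """
    delta = (on - first_case[country]).days
    return delta if delta >= 0 else None

def sample_by_date(on):
    """Countries with at least one observed case on a fixed calendar date."""
    return [c for c in first_case if daycvc(c, on) is not None]

def sample_by_daycvc(k, today):
    """Countries that have reached k days since their own first case."""
    selected = []
    for c in first_case:
        d = daycvc(c, today)
        if d is not None and d >= k:
            selected.append(c)
    return selected

feb15 = date(2020, 2, 15)
today = date(2020, 6, 20)  # hypothetical "as of today" date

print(daycvc("A", feb15))           # 24 days after January 22nd
print(sample_by_date(feb15))        # only countries already affected
print(sample_by_daycvc(24, today))  # every country with 24+ days of virus
```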
We therefore adopt a novel approach for estimating diffusion models. We
estimate such models for daycvc equal to different days; Table 1 presents the
results for daycvc equal to 40, 60, 80 and 100 days. For day 100, there are 37
observations, i.e. only 37 countries have had a virus case for a minimum of 100
days.
Old Age Males do not explain diffusion of COVID19?
One important result that emerges is that the percentage of males in the
population over the age of 60 is not significant in explaining the number of
COVID cases, irrespective of the definition (male population above 50, male
population above 60, etc.) and irrespective of the time elapsed (daycvc 40, 60,
80 or 100). This counterintuitive result could arise because every country
exercised greater care in protecting its aged population from catching the
virus, after learning from the world-wide experience of fatality rates among
the aged, and especially among aged males. As explained in our next blog, aged
males are more likely to die from the virus, as was confirmed by the “vision”
of the Italian experience.
Table 1: Determinants of the Spread of COVID19 within a country
Columns: days since the first day COVID19 was observed

| Variable | 40 | 60 | 80 | 100 |
|---|---|---|---|---|
| % male population >60 (UN) | 0.0315 | 0.0411 | -0.0591 | -0.0220 |
| | (0.58) | (0.74) | (-0.91) | (-0.24) |
| % urban (World Bank WDI) | 0.0292*** | 0.0319*** | 0.0390*** | 0.0231 |
| | (3.90) | (4.23) | (4.17) | (1.30) |
| Mean temperature (average for days of virus in each country) | -0.0491* | -0.0628** | -0.0676** | -0.106* |
| | (-2.36) | (-3.16) | (-2.83) | (-2.53) |
| Constant | 4.589*** | 5.821*** | 7.162*** | 9.703*** |
| | (6.30) | (7.72) | (7.59) | (5.03) |
| Number of observations | 170 | 168 | 129 | 36 |
| adj. R-sq | 0.21 | 0.27 | 0.21 | 0.20 |

t statistics in parentheses
* p<0.05, ** p<0.01, *** p<0.001
Data Sources: COVID19 data, JHU, World Bank, UN
Temperature: Among the three major explanatory variables, temperature is the
only one that is significant at every horizon, and it has the expected negative
sign, i.e. the higher the temperature, the less the diffusion. The coefficient
is also remarkably stable in value, at around -0.08. Evaluated at the mean on
day 90, each 1 degree increase in temperature is associated with about 8,243
fewer cases (coefficient -0.106 multiplied by the mean number of cases,
77,767). There is also some evidence that the importance of this variable has
increased and strengthened over time. This could be due to the arrival of
summer in the northern hemisphere and of winter in the southern hemisphere,
slowing the spread of COVID cases in the former and accelerating it in the
latter. One implication of this result is that countries which are
traditionally affected strongly by the seasonal flu may see a similar pattern
for COVID. The virus pattern has continually confounded experts, hence the
emphasis on "may".
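The arithmetic behind the "8,243 fewer cases" figure can be checked with a short sketch. It assumes, as the log scale of the curve suggests, that the dependent variable is the natural log of cases; that is our reading for illustration, and the variable names are hypothetical. Under that assumption, the coefficient multiplied by the mean number of cases is a linear approximation, and the exact implied change is slightly smaller in absolute value.

```python
import math

coef_temp = -0.106   # temperature coefficient quoted in the text
mean_cases = 77767   # mean number of cases quoted in the text

# Linear approximation for a log-linear model: change in cases = coef * cases.
linear_change = coef_temp * mean_cases

# Exact change implied by a natural-log specification (an assumption here).
exact_change = mean_cases * (math.exp(coef_temp) - 1.0)

print(round(linear_change))  # -8243
print(round(exact_change))
```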
The third important result is that urbanization is important in explaining
the spread of the virus, but only after 40 days have passed. The significance
of urbanization also increases with days of presence of the virus, and the
magnitude increases to 0.04 for daycvc equal to 80, from 0.03 for daycvc
equal to 40 (0.0390 and 0.0292 in Table 1). This result is supportive of the
social distancing hypothesis.
There is no precedent for this rather unique pandemic. It has tied policy
makers, economists, and epidemiologists in knots. Everybody who could be
proved wrong has been proved wrong, and we are all humbled by the uncertain
uncertainty of COVID.
The chart represents the combined effect of all three factors on the
prediction of cases. For example, the error in prediction seems to be the
highest for the USA (it is the largest positive distance away from the red
line of equality between predicted and actual). Sri Lanka and Canada are at
opposite ends of the temperature and COVID spectrum, yet both are on the line.
The countries below the line are the good performers, and those farther away
are the better performers, e.g. Nepal.
Chart 1: Actual and Predicted COVID cases on Day 80 of virus in each country
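As a rough sketch of how a predicted value of the kind plotted in the chart can be formed, the day-80 coefficients from Table 1 can be combined for a single country. Several things here are assumptions for illustration: that the dependent variable is the natural log of cases, that the male and urban variables enter as percentages (0 to 100), and the input values themselves, which describe a hypothetical country rather than any observation in our data. The function names are also ours.

```python
import math

# Day-80 coefficients as reported in Table 1.
COEF_DAY80 = {
    "const": 7.162,
    "male60": -0.0591,  # % male population over 60
    "urban": 0.0390,    # % urban
    "temp": -0.0676,    # mean temperature, degrees C
}

def predicted_log_cases(male60_pct, urban_pct, mean_temp_c, coef=COEF_DAY80):
    """Fitted log of cases from the three factors plus the constant."""
    return (coef["const"]
            + coef["male60"] * male60_pct
            + coef["urban"] * urban_pct
            + coef["temp"] * mean_temp_c)

def residual(actual_cases, male60_pct, urban_pct, mean_temp_c):
    """Actual minus predicted log cases.

    A negative value means fewer cases than the three factors predict.
    """
    fitted = predicted_log_cases(male60_pct, urban_pct, mean_temp_c)
    return math.log(actual_cases) - fitted

# Hypothetical country: 10% males over 60, 50% urban, 20 C mean temperature.
fitted = predicted_log_cases(10, 50, 20)
print(fitted)            # about 7.169 in log units
print(math.exp(fitted))  # the same prediction expressed as a number of cases
```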