---
title: "OzCoViz"
date: "`r paste('last updated', format(lubridate::now(), '%H:%M, %d %B %Y'))`"
params:
  local: true
output:
  flexdashboard::flex_dashboard:
    orientation: columns
    vertical_layout: fill
    theme: lumen
    social: ["menu"]
    self_contained: false
---
```{r setup, include=FALSE}
library(flexdashboard)
library(covidrecon)
library(tidyverse)
library(lubridate)
library(tsibble)
library(ggeasy)
library(googlesheets4)
library(EpiEstim)
library(patchwork)
library(gganimate)
# devtools::install_github('emitanaka/datalegreyar')
library(datalegreyar)
library(memoise)
library(scales)
# remotes::install_github("tylermorganwall/rayshader")
# library(rayshader)
knitr::opts_chunk$set(cache = FALSE)
local_build <- params$local
not_local_build <- !params$local
```
```{r center-covid-from-100}
selected_countries <- c(
"Australia",
"China",
"Italy",
"Singapore",
"United_Kingdom",
"United_States_of_America",
"South_Korea",
"New_Zealand",
"Canada",
"Germany",
"Japan",
"Sweden",
"Spain",
# "Indonesia",
"Brazil",
# "Norway",
"Russia",
"Taiwan")
# "Denmark"
```
```{r pull-data}
yesterday <- today() - 1L
covid <- covid_latest() %>%
# filter(date < yesterday) %>%
filter(country_region %in% selected_countries) %>%
mutate(country_region = case_when(
country_region == "United_States_of_America" ~ "USA",
country_region == "United_Kingdom" ~ "UK",
TRUE ~ country_region
)
)
```
```{r covid-changepoint}
covid_changepoint <- covid %>%
add_covid_change_point() %>%
filter(date >= change_point_date)
```
```{r load-nsw-funcs-and-prep-charts, cache=FALSE, include = FALSE}
if (local_build) {
source("get_nsw_data.R")
source("get_nishiura_si_sample.R")
source("plot_nsw_eff_r.R")
source("nsw_eff_R_data_and_plot_prep.R")
} else {
source("../get_nsw_data.R")
source("../get_nishiura_si_sample.R")
source("../plot_nsw_eff_r.R")
source("../nsw_eff_R_data_and_plot_prep.R")
}
```
Overview
=======================================================================
```{r datalegreyar, fig.align='center'}
# this just takes the last 26 days of Australian data. Ideally it should show the
# whole history of Australian data since, say, 1st March 2020
vals <- covid %>%
filter(country_region == "Australia") %>%
arrange(date) %>%
top_n(26, wt = row_number()) %>%
pull(cases)
fig(datafy(values = vals,
text = "oz covid 19 visualisations",
ignore_space = TRUE),
size=60)
```
```{r old-code, eval=FALSE, echo=FALSE}
maxpos <- which(vals == max(vals))
if (all(vals[maxpos] > vals[18:26])) {
fig(datafy(values = vals,
text = "oz covid 19 visualisations",
ignore_space = TRUE),
size=40,
symbol = setNames(c(maxpos, 18:25),
c("max", rep("down",8))))
} else {
fig(datafy(values = vals,
text = "oz covid 19 visualisations",
ignore_space = TRUE),
size = 60,
symbol = setNames(c(maxpos, 18:25),
c("max", rep("up",8))))
}
```
Column
-----------------------------------------------------------------------
### Introduction
This web site is a joint effort of researchers at the [South Western Sydney Clinical School](https://swscs.med.unsw.edu.au) and the [Centre for Big Data Research in Health](https://cbdrh.med.unsw.edu.au) at the [UNSW Faculty of Medicine](https://med.unsw.edu.au), the [Econometrics and Business Statistics Research Group](https://research.monash.edu/en/organisations/econometrics-business-statistics) of [Monash University](https://www.monash.edu/), and at the [Ingham Institute for Applied Medical Research](https://inghaminstitute.org.au) in Liverpool, Sydney.
```{r, out.width="20%", eval=local_build, fig.show='hold', out.extra='style="padding:25px"'}
knitr::include_graphics(path="docs/assets/cbdrh_logo.png")
knitr::include_graphics(path="docs/assets/unsw_logo.png")
knitr::include_graphics(path="docs/assets/monash-logo-mono.png")
knitr::include_graphics(path="docs/assets/Ingham_Institute_Logo_Horizontal.jpg")
```
```{r, out.width="20%", eval=not_local_build, fig.show='hold', out.extra='style="padding:25px"'}
knitr::include_graphics(path="assets/cbdrh_logo.png")
knitr::include_graphics(path="assets/unsw_logo.png")
knitr::include_graphics(path="assets/monash-logo-mono.png")
knitr::include_graphics(path="assets/Ingham_Institute_Logo_Horizontal.jpg")
```
The intent is to offer a range of principled epidemiological and statistical analyses and visualisations of current COVID-19 data which go beyond the now ubiquitous [world maps and cumulative incidence charts](https://coronavirus.jhu.edu/map.html).
The broad themes for the analyses and visualisations currently available are listed in the menu at the top of this page -- more will be added in due course. For each theme there is an introductory page explaining the motivating ideas and methodology employed for each of the visualisations or analyses for that theme, which are available in the subsequent frames (the series of rectangles at the top of each page). Additional notes or commentary appear on the right of some pages.
The time-series line in the heading above is derived from current incidence cases of COVID-19 in Australia.
#### An Australian focus with an international perspective
This site has been created by researchers at Australian universities, and hence the focus is on the situation in Australia, within the broader international context -- we are, after all, all in this together. However, we hope that some of the analyses and visualisations on this site might be useful elsewhere, and to that end, all the [`R`](https://www.r-project.org) source code used to create this site is freely available -- please see the _Technical details_ tab above for details on software used, and where to find the source code.
Incidence {.storyboard}
=======================================================================
### Explanation
Incidence is the epidemiological term for the number of cases of a disease meeting some case definition in a specified time period. Here we present the daily incidence -- that is, the number of new cases each day -- of COVID-19 for a range of countries, including Australia, of course. Each country may be using a slightly different case definition, although most of the countries presented here have been using case definitions aligned with those [recommended by the WHO](https://www.who.int/publications-detail/global-surveillance-for-human-infection-with-novel-coronavirus-(2019-ncov)). Three types of case definition are used, simultaneously:
* **Suspected cases**
* **Probable cases**
* **Confirmed (by laboratory test) cases**
All of the data reported here are for **confirmed cases**, with a few exceptions -- China included cases diagnosed via lung CT scan for a few days in February, 2020, but we have adjusted the data for those days as far as possible to remove that anomaly.
The data are drawn from the [European CDC](https://www.ecdc.europa.eu/en/publications-data/download-todays-data-geographic-distribution-covid-19-cases-worldwide), which has collected data from various governing agencies around the world. There are some small discrepancies between these data and those given by the Australian government, and indeed other national governments, due to timing issues relating to when cases are tabulated each day, and so on. However, we believe the European CDC to be the best source of automatically downloadable, machine-readable data there is right now.
Note, however, that because the **confirmed** case definition depends on laboratory confirmation, it is influenced by the number of lab tests (RT-PCR, reverse transcriptase polymerase chain reaction tests on nasopharyngeal swabs or sputum) done by each reporting country or jurisdiction. Obviously the number of reported cases is bounded by the number of such lab tests which are done, but the degree of under-ascertainment is also affected by the policies in place which determine which potential COVID-19 cases are tested. These policies are completely country-specific and have changed over time.
Another issue with these data is that it is vastly preferable to analyse incidence by (presumed or definite) date of onset of symptoms, rather than by date of notification or date of reporting. There may be variable delays in the processing of laboratory tests and the reporting of cases to central authorities. Tabulation of incidence counts by date of onset overcomes this problem. Almost all national and jurisdictional health authorities will be collecting data on presumed date of onset for each case (although, inexplicably, it is **not** one of the data items on the [WHO-recommended data collection form](https://apps.who.int/iris/bitstream/handle/10665/331234/WHO-2019-nCoV-SurveillanceCRF-2020.2-eng.pdf)). One reason for not using date of onset is that it may be incomplete, but this ignores the fact that statistical imputation can be used to validly fill in those missing dates of onset. There is also a body of methods, collectively known as _nowcasting_, that use multivariate time-series models to convert data tabulated by date of notification/reporting to (estimated) date of onset. If national or jurisdictional health authorities do not have the technical capacity to undertake such value-adding to their own data, they should make the required data available to trusted partners in the academic sector who can undertake such statistical manipulation for them (or help authorities implement such processing internally).
#### Cumulative Incidence
Cumulative incidence is, as the name suggests, just the cumulative sum of the daily incidence -- that is, a running total of the number of cases. Reporting of cumulative COVID-19 incidence seems to dominate the mainstream media, but it has many disadvantages. In particular, a cumulative sum of case counts is always monotonically increasing -- it can only ever go up, or at best, remain flat if there are no new incident cases. This tends to obscure the rate of change in incidence over time -- subtle, or even large changes in the slope of the cumulative incidence curve are difficult to see.
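The relationship between daily and cumulative incidence can be sketched in a few lines of base `R`. The counts below are made-up numbers for illustration only, not data from any country shown on this site.

```r
# Hypothetical daily incidence counts for a short outbreak (made-up numbers).
daily_incidence <- c(2, 5, 11, 20, 14, 8, 3, 0, 0)

# Cumulative incidence is the running total of the daily counts.
cumulative_incidence <- cumsum(daily_incidence)

# The running total never decreases, even though daily incidence peaks on
# day 4 and then falls to zero.
never_decreases <- all(diff(cumulative_incidence) >= 0)

data.frame(day = seq_along(daily_incidence),
           daily_incidence,
           cumulative_incidence)
```

In this sketch the daily counts rise and fall, but the cumulative column only climbs and then flattens, which is why the turning point is much harder to spot on a cumulative chart.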
#### Semi-log cumulative incidence chart
The first chart presented here is a semi-log cumulative incidence chart. This chart seems to have been popularised by [John Burn-Murdoch](https://twitter.com/jburnmurdoch) in the [Financial Times](https://www.ft.com/coronavirus-latest), but it appears it was [first used by Matt Cowgill](https://blog.grattan.edu.au/2020/03/australian-governments-can-choose-to-slow-the-spread-of-coronavirus-but-they-must-act-immediately/) from Australia's very own [Grattan Institute](https://blog.grattan.edu.au). It has since been widely copied and reproduced. Please see a [blog post by Prof Rob Hyndman](https://robjhyndman.com/hyndsight/logratios-covid19/) for further discussion of this chart, and some alternative analyses.
There are several variations on the _Grattan_ chart presented; please see the notes for each one.
#### _Epicurves_
The _epicurve_ is perhaps the most-used chart in field epidemiology and outbreak control. It is simply a chart of daily (or weekly, for slower-moving diseases) incidence (new cases), traditionally shown as a bar chart. It gives an immediate sense of whether an outbreak or epidemic is in a growth phase, with increasing incident counts each successive day, or in a decay phase, with decreasing counts each successive day. Note that the cumulative count will still increase, day-on-day, even when an outbreak or epidemic is in a decay phase. Only when the epidemic has been completely extinguished will the cumulative incidence stop increasing. That's one of the reasons why cumulative incidence charts are rarely used by epidemiologists.
Three variations of _epicurves_ are provided here:
* the usual _epicurve_, on a linear y-axis, and with the ranges specific for each country to maximise the amount of detail discernible.
* the usual _epicurve_, on a linear y-axis, but with the same y-axis scale across all countries, which clearly shows the relative incidence in each country. This chart is quite startling!
* the same chart, but with a logarithmic y-axis, which allows the periods with lower counts to be inspected in greater detail.
### Semi-log **cumulative incidence** for selected countries -- the _Grattan Institute_ chart
```{r since-100}
covid_since_100 <- covid %>%
add_days_since_limit(limit = 100) %>%
filter(days_since_limit >= 0)
```
```{r gg-cumulative-incidence, fig.height = 8, fig.width = 12}
gg_covid_cumulative_exceed_limit(covid_since_100,
limit = 100) +
theme(panel.grid.minor = element_blank()) +
scale_colour_hue()
```
***
This chart, developed and popularised by the [Grattan Institute](https://blog.grattan.edu.au) and the [Financial Times](https://www.ft.com/coronavirus-latest), shows the cumulative cases of COVID-19 for selected countries on a logarithmic y-axis scale, with the dates on the x-axis normalised to the number of days since each country shown exceeded 100 cumulative cases, on a linear x-axis scale (hence the name _semi-log_, since only one of the two axes is logarithmic).
Note that countries "peel off" the diagonal trajectory as their rate of new (incident) cases reduces. If the line for a country is almost horizontal, it means there are almost no new cases occurring there.
The curve for Australia is clearly flattening, and we are keeping good company with China, South Korea, Taiwan and New Zealand as the other countries with nearly horizontal trajectories. Note that after considerable initial success in containing COVID-19 spread, both Japan and Singapore are now on an upward trajectory, but the slopes of those trajectories are much shallower than the other countries shown, indicating much slower spread.
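The x-axis alignment used in this chart can be sketched in base `R` as follows. This is only an illustration of the idea, using made-up counts and a hypothetical `days_since_limit` calculation; it is not the code inside the `covidrecon` helper used above, and it assumes "exceeded" means the first day on which the running total reached or passed the limit.

```r
# Hypothetical cumulative case counts for two countries (made-up numbers).
cases <- data.frame(
  country = rep(c("A", "B"), each = 5),
  date = rep(as.Date("2020-03-01") + 0:4, times = 2),
  cumulative_cases = c(40, 90, 150, 260, 400,
                       5, 20, 60, 120, 300)
)

limit <- 100

# Within each country, find the first row at or above the limit and count
# days relative to that date (negative values are days before the limit).
cases$days_since_limit <- ave(
  seq_len(nrow(cases)), cases$country,
  FUN = function(rows) {
    first_row <- rows[which(cases$cumulative_cases[rows] >= limit)[1]]
    as.integer(cases$date[rows] - cases$date[first_row])
  }
)

# Keep only the days from the alignment point onwards, as the chart does.
aligned <- cases[cases$days_since_limit >= 0, ]
aligned
```

After alignment, day 0 means the same thing for every country, so trajectories that started on different calendar dates can be compared directly.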
### Semi-log **cumulative incidence** for selected countries -- aligned to start of epidemic in each country
```{r since-cp}
covid_since_changepoint <- covid %>%
add_covid_change_point() %>%
rename(days_since_limit = days_since_changepoint) %>%
filter(days_since_limit >= 0)
```
```{r gg-cumulative-incidence-since-cp, fig.height = 8, fig.width = 12}
gg_covid_cumulative_exceed_limit(covid_since_changepoint,
limit = NULL) +
theme(panel.grid.minor = element_blank()) +
scale_colour_hue()
```
***
This chart is a variation on the previous _Grattan Institute_ chart. It isn't really an improvement on the _Grattan_ chart, but is shown here to illustrate the sensitivity of the _Grattan_ chart to the method used to align the dates on the x-axis. In the chart shown at left, the dates on the x-axis are aligned to the approximate start of the COVID-19 epidemic in each country. The start dates are chosen automatically using a non-parametric changepoint detection algorithm (see the [`changepoint.np` package](https://CRAN.R-project.org/package=changepoint.np) for `R`). The changepoints for each country shown are as follows:
```{r cp-table}
covid_since_changepoint %>%
distinct(country_region, change_point_date) %>%
arrange(change_point_date) %>%
mutate(country_region = stringr::str_replace_all(country_region,
pattern = "_",
replacement = " ")) %>%
knitr::kable(col.names=c("Country", "Detected start of epidemic"))
```
### Semi-log daily **incidence** for selected countries -- aligned to start of epidemic in each country
```{r gg-incidence-since-cp, fig.height = 8, fig.width = 12}
gg_covid_incidence_exceed_limit(covid_since_changepoint,
limit = NULL) +
theme(panel.grid.minor = element_blank()) +
scale_colour_hue() +
scale_y_log10(labels = scales::comma_format(accuracy=1))
```
***
This is another variation on the _Grattan Institute_ chart, but this time showing daily incidence on the y-axis, rather than daily **cumulative** incidence. It is basically the same information as shown in the _epicurve_ charts in the subsequent frames (accessed via the blue rectangles at the top of the page), but all of the selected countries are shown on one plot, and the dates are aligned by the detected start of the epidemic in each country.
One problem is that the daily incidence curves are rather noisy (with missing values for some countries). We can address that by smoothing them, as shown in the next frame.
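As a rough illustration of what such smoothing does, the sketch below fits a local regression (`loess`) to simulated noisy counts. The data are simulated, the variable names are hypothetical, and smoothing on the log scale is our assumption here; the plotting helper used on this site may smooth differently.

```r
set.seed(42)

# Hypothetical noisy daily incidence: a smooth epidemic curve plus Poisson noise.
day <- 1:60
expected <- 200 * exp(-((day - 30) / 12)^2) + 1
observed <- rpois(length(day), lambda = expected)

# Fit a local regression to log incidence; adding 1 avoids log(0).
# span is the fraction of points used in each local fit, so a small span
# follows the data closely and a large span gives a smoother curve.
fit <- loess(log(observed + 1) ~ day, span = 0.2)

# Back-transform the fitted values to the original count scale.
smoothed <- exp(predict(fit)) - 1

head(data.frame(day, observed, smoothed = round(smoothed, 1)))
```

The choice of `span` is a trade-off: too small and the noise survives, too large and genuine changes in the epidemic trajectory are flattened out.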
### Semi-log **smoothed** daily **incidence** for selected countries -- aligned to start of epidemic in each country
```{r gg-incidence-since-cp-smoothed, fig.height = 8, fig.width = 12}
gg_covid_incidence_exceed_limit(covid_since_changepoint,
limit = NULL,
smooth=TRUE,
span=0.2) +
theme(panel.grid.minor = element_blank()) +
scale_colour_hue() +
scale_y_log10(labels = scales::comma_format(accuracy=1))
```
***
In this chart (see discussion in previous frame), we can now discern three distinct groups of countries:
* an upper group of the USA, Russia and Brazil, where the epidemic is still growing rapidly.
* a middle group comprising the UK and European countries.
* a lower group, which includes Australia, New Zealand, South Korea, Taiwan, mainland China, Japan and Singapore, which all have their local COVID-19 epidemics under control (but definitely not eliminated). Note however that Japan and Singapore are now exiting that lower group and heading upwards.
```{r, eval=FALSE}
### Semi-log **cumulative incidence and cumulative deaths** for selected countries -- the _Grattan Institute_ chart with deaths added
---{r gg-cumulative-incidence-deaths, fig.height = 8, fig.width = 12}
p <- gg_covid_cumulative_cases_deaths_exceed_limit(covid_since_100,
limit = 100) +
theme(panel.grid.minor = element_blank()) +
scale_colour_viridis_c("Cumulative deaths")
plot_gg(p, multicore=TRUE,height=5,width=6,scale=500)
---
***
```
```{r, eval=FALSE}
### The Hyndman logratio plot
---{r hyndman-logratio, fig.height = 8, fig.width = 12}
covid_since_changepoint <- covid_since_changepoint %>%
mutate(cases_logratio = tsibble::difference(log(cases)))
covid_data_last <- covid_since_changepoint %>%
group_by(country_region) %>%
top_n(1, days_since_limit) %>%
ungroup()
covid_since_changepoint %>%
# filter(date >= as.Date("2020-03-01")) %>%
ggplot(aes(x = days_since_limit, y = cases_logratio, col = country_region)) +
geom_smooth(method = "loess", se = FALSE, span=0.15) +
scale_y_continuous(
"Daily increase in cumulative cases",
breaks = log(1+seq(-60,60,by=10)/100),
labels = paste0(seq(-60,60,by=10),"%"),
minor_breaks=NULL # ,
#sec.axis = sec_axis(~ log(2)/(.),
# breaks = c(2:7,14,21),
# name = "Doubling time (days)")
) +
facet_wrap(country_region~.) +
# ggrepel::geom_label_repel(data = covid_data_last,
# aes(x = days_since_limit,
# y = cases_logratio,
# col = country_region,
# label = country_region),
# size = 4,
# nudge_x = 4 ,
# # direction = "x",
# segment.alpha = 0.3,
# # segment.size = 0.2,
# # arrow = arrow(length = unit(0.02, "npc"))
# )
#
theme_minimal()
---
***
Hey!
```
### Log-log incidence versus cumulative incidence animation
```{r cum-incidence-incidence-prep}
model_covid_changepoint <- covid_changepoint %>%
# filter(country_region %in% c("USA", "Australia", "China")) %>%
rename(incidence_date=date) %>%
mutate(cases = ifelse(cases == 0, NA, cases),
cumulative_cases = ifelse(cumulative_cases == 0, NA, cumulative_cases)) %>%
mutate(daynum = as.integer(difftime(incidence_date,
min(incidence_date),
units="days")))
i <- 0
for (c in unique(model_covid_changepoint$country_region)) {
  i <- i + 1
  d <- model_covid_changepoint %>%
    filter(country_region == c) %>%
    arrange(incidence_date)
  # print(c)
  # print(nrow(d))
  loess_mod <- loess(data=d, formula=cases ~ cumulative_cases, span = .5)
  smoothed_cases <- predict(loess_mod)
  if (length(smoothed_cases) < nrow(d)) {
    smoothed_cases <- c(rep(NA, nrow(d) - length(smoothed_cases) ), smoothed_cases)
  }
  d$smoothed_cases <- smoothed_cases
  if (i == 1) {
    smooth_covid_changepoint <- d
  } else {
    smooth_covid_changepoint <- smooth_covid_changepoint %>%
      bind_rows(d)
  }
}
min_date <- min(smooth_covid_changepoint$incidence_date)
nframes <- as.integer(difftime(max(smooth_covid_changepoint$incidence_date),
min_date,
units="days"))
nframes <- max(smooth_covid_changepoint$daynum)
```
```{r cum-incidence-incidence, fig.height = 8, fig.width = 12, include=FALSE, warning=FALSE, message=FALSE}
p <- smooth_covid_changepoint %>%
mutate(alfa=ifelse(country_region == "Australia", 1, 0.5)) %>%
ggplot(aes(x = cumulative_cases,
y = smoothed_cases,
group=country_region,
colour=country_region,
label=country_region,
alpha=alfa)) +
geom_line() +
geom_point(size=2) +
ggrepel::geom_label_repel(size = 4,
force=2,
nudge_x = 0.2 ,
direction = "x",
segment.alpha = 0.3) +
scale_alpha(range=c(0.65, 1)) +
transition_reveal(daynum) +
scale_x_log10(limits=c(10, NA), labels=scales::comma_format(accuracy=1)) +
scale_y_log10(labels=scales::comma_format(accuracy=1)) +
# facet_wrap(country_region ~ .) +
labs(title='COVID-19 trajectories for selected countries as at {format(min_date + lubridate::days(frame_along), "%d %b %Y")}',
x = 'Cumulative cases (log scale)',
y = 'Incident cases (log scale)',
caption = glue::glue("Tim Churches (UNSW) & Nick Tierney (Monash)
Data source: European CDC up to {format(max(smooth_covid_changepoint$incidence_date),'%d %b %Y')}")) +
theme_minimal() +
theme(legend.position="none")
suppressMessages({
anim_save("docs/assets/log-log-incidence.gif", animation=p, duration=60, height=500, width=800, nframes=nframes*2, detail=5, start_pause=5, end_pause=10, rewind=TRUE)
})
```
```{r cum-incidence-incidence-fig, fig.height = 8, fig.width = 12}
if (params$local) {
knitr::include_graphics("docs/assets/log-log-incidence.gif")
} else {
knitr::include_graphics("assets/log-log-incidence.gif")
}
```
***
This plot replicates [a chart](https://aatishb.com/covidtrends/) developed by [Aatish Bhatia](https://aatishb.com) in collaboration with [Minute Physics](https://www.youtube.com/user/minutephysics). The plot shows, for each country, the number of incident (new) cases on the y-axis, on a logarithmic scale, versus the cumulative total of cases on the x-axis, also on a logarithmic scale. When plotted in this way, uncontrolled epidemics with exponential growth take a linear (straight line) trajectory at some angle. As an epidemic is brought under control, the trajectory drops below the straight line, eventually falling vertically if the epidemic has been completely extinguished.
This is just an alternative visualisation of the data shown in earlier frames, but it looks pretty and in this case the animation is actually helpful, because it shows time, which is not otherwise one of the explicit dimensions of the chart.
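The straight-line behaviour described above can be checked with a small base `R` sketch. The epidemic below is entirely hypothetical (30 days of 20% daily growth followed by 30 days of 10% daily decline), and the `slope()` helper is our own illustrative function.

```r
# Hypothetical epidemic: 30 days of 20% daily growth, then 30 days of
# 10% daily decline (made-up numbers).
growth <- 10 * 1.2^(0:29)
decline <- growth[30] * 0.9^(1:30)
incidence <- c(growth, decline)
cumulative <- cumsum(incidence)

# Slope of log(incidence) against log(cumulative) over a set of days.
slope <- function(idx) {
  unname(coef(lm(log(incidence[idx]) ~ log(cumulative[idx])))[2])
}

growth_slope <- slope(11:30)   # exponential growth phase
decline_slope <- slope(41:60)  # decay phase

c(growth = growth_slope, decline = decline_slope)
```

During exponential growth the cumulative total is roughly proportional to the current daily count, so the slope on log-log axes is a little below one; once incidence falls while the cumulative total keeps creeping up, the slope turns negative and the trajectory drops away from the diagonal.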
### _Epicurves_ for selected countries
```{r global-cases, fig.height = 8, fig.width = 12}
covid_since_changepoint_w_total <- covid_changepoint %>% # was: covid_since_100
group_by(country_region) %>%
mutate(total_cases = sum(cases)) %>%
ungroup() %>%
mutate(country_region = stringr::str_replace_all(country_region,
pattern = "_",
replacement = " "))
covid_since_changepoint_w_total_labels <- slice(covid_since_changepoint_w_total, 1)
gg_cases <- ggplot(covid_since_changepoint_w_total,
aes(x = date,
y = cases,
fill = country_region)) +
geom_col() +
theme_minimal() +
theme(legend.position = "none") +
labs(title = "COVID-19 epicurves for confirmed incident cases for selected countries",
subtitle = glue::glue("Up to {format(max(covid_changepoint$date), '%d %b %Y')}"),
x = "Date",
y = "Incident cases",
caption = glue::glue("Tim Churches (UNSW) & Nick Tierney (Monash)
Data source: European CDC up to {format(max(covid_since_changepoint_w_total$date),'%d %b %Y')}")) +
scale_x_datetime(date_labels = "%d %b") +
scale_y_continuous(label = comma)
gg_cases +
facet_wrap(~reorder(country_region, -total_cases),
scales = "free") +
labs(subtitle = "Arranged from highest to smallest cumulative incidence.\nNote: both x-axis and y-axis scales are different for each country")
```
***
The charts shown here are _epicurves_ -- that is, daily count of incident (new) cases, using data collated by the European CDC. It is easy to see when (or whether) the epidemic in each country entered a decay phase and the daily incidence started to decline (although the cumulative incidence will, by definition, continue to increase during the decay phase until the epidemic is completely extinguished).
Note that there are some days where the number of cases appears to spike upwards, followed by a decrease the following day. This indicates that there may be some data discrepancies in how the European CDC is capturing data from WHO Situation Reports. It underlines the importance of nations providing reliable machine-readable access to their own COVID-19 data. By "machine-readable" we mean CSV or JSON data files which are automatically downloadable, or an API which can be queried automatically to yield such data. Neither of those is difficult to establish, yet nearly all national governments have failed to provide such data, leaving it to third-party agencies and citizen-science efforts to piece together the required data in a manner that permits ongoing analysis. There is, for example, no official machine-readable source of national COVID-19 data provided by the Australian government. As far as we are aware, NSW is the only State or Territory government that has made any effort in that direction by providing some machine-readable data, which we leverage in the $R_{t}$ _for NSW_ theme (see menu above).
### _Epicurves_ for selected countries (common y-axis scale)
```{r world-incidence-free-scales, fig.height = 8, fig.width = 12}
gg_cases +
facet_wrap(~reorder(country_region, -total_cases),
scales = "free_x") +
labs(subtitle = "Arranged from highest to smallest cumulative incidence\nwith y-axis scale fixed to facilitate comparisons between countries")
```
***
Forcing the y-axis scale to be the same for all the plots in this chart means that, compared to the previous chart where the y-axis could change for each country, the country with the largest number of cases, in this case the USA, appears the same, and the rest of the plots appear smaller.
This provides important context: the number of cases in the USA currently dominates relative to other countries. The bottom row of countries are barely visible, by comparison.
### _Epicurves_ for selected countries (logarithmic y-axis scale)
```{r world-incidence-free-scales-log, fig.height = 8, fig.width = 12}
gg_cases +
facet_wrap(~reorder(country_region, -total_cases),
scales="free") +
scale_y_log10(label = comma) +
labs(subtitle = "Arranged from highest to smallest cumulative incidence\nwith logarithmic y-axis scales, different for each country")
```
***
The final variation on the _epicurve_ chart, shown here, uses a logarithmic y-axis scale. This emphasises lower counts, which is useful for inspecting the beginnings and ends of the epicurves, or the middle parts where there is more than one "wave", as in China and Singapore.
National-level $R_{t}$ {.storyboard}
=======================================================================
### The _reproduction number_
#### The _basic reproduction number_ $R_{0}$
A key statistic in field epidemiology is the _basic reproduction number_, often referred to as "R-zero" or "R-nought", and written $R_{0}$. This is the number of people we expect each new case of a disease such as COVID-19 to infect -- in other words, how many people each case passes the disease on to -- at the **beginning** of a disease outbreak or epidemic, **before** any public health controls or interventions have been established or put in place.
$R_{0}$ was popularised in the 2011 film _Contagion_. Here is Kate Winslet, who plays a US CDC epidemiologist, explaining $R_{0}$ (with a few errors -- it's _reproduction number_, not "reproductive rate" as she says) to government officials:
<iframe width="560" height="315" align="center" src="https://www.youtube.com/embed/VrATMF_FB9M" frameborder="0" allowfullscreen></iframe>
<br>
As explained in the film clip above, $R_{0}$ is determined by a number of factors, but is a very useful metric of how fast a communicable disease will spread in a population, if **nothing is done** to stop or slow it. Note that $R_{0}$ is **not** solely a characteristic of a pathogen (such as a virus) _per se_, but rather depends not only on the nature and biological behaviour of the pathogen, but also on how it is transmitted, how often members of the population engage in behaviours that may transmit it, population density, whether any members of the population are immune or resistant to the disease, and so on. In other words, $R_{0}$ is highly situation-specific.
As such, $R_{0}$ is useful in the early stages of an epidemic, but as soon as (effective) public health interventions have been put in place to slow the spread of the disease, the reproduction number will change. The statistic is then referred to as the _effective reproduction number_, denoted $R_{t}$, where $t$ stands for time. In other words, $R_{t}$ is the reproduction number at a particular point in time, $t$.
#### What are the implications of the _effective reproduction number_?
**If $R_{t}$ equals 1.0**
- Each infected person spreads the disease, on average, to one other person during the course of their illness (which ends when they either recover or die). Therefore, in the long-run, the number of infected people neither grows nor shrinks. From day-to-day there may be some variation, but overall the number of new infections stays the same.
**If $R_{t}$ is greater than 1.0**
- Each infected person spreads the disease, on average, to more than one other person during the course of their illness. Each of those people spreads the disease to more than one additional person, and so on. It is easy to see that this results in a growing number of new infections each day, and that the growth rate itself also grows -- the so-called exponential growth of an epidemic (at least in its early stages). If $R_{0}$ initially, or $R_{t}$ later, is only just greater than 1.0, then the disease spreads only slowly, but if the _reproduction number_ is well above 1.0, say, 2.0 or more, then rapid spread results.
**If $R_{t}$ is less than 1.0**
- Each infected person spreads the disease, on average, to fewer than one other person during the course of their illness. Obviously each person spreads it to a whole number of other people (e.g., zero, one, two or three), but _on average_, taken across a large number of infected individuals, the number infected by each person with the disease is less than one. Again, it is easy to see that this results in a falling number of new infections each day, and that the rate of decay in the number of new infections also decays -- in other words, epidemics decay exponentially too. As an aside, just as epidemics tend to grow much more quickly than most people expect -- humans tend to reason using linear heuristics -- so do epidemics decay much more **slowly** than people expect, for the same reason: humans assume linear behaviour, which does not (necessarily) apply to communicable disease dynamics.
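
To make the three cases concrete, the following minimal sketch projects the expected number of new infections in successive generations of transmission under a constant reproduction number. It assumes a deliberately simplified model in which each generation is exactly $R$ times the size of the previous one; the function name `project_generations` and the starting value of 100 cases are hypothetical, chosen only for illustration.

```r
# Expected new infections in each generation of transmission, assuming a
# constant reproduction number r. This is a simplification: real epidemics
# are noisier, and the reproduction number changes over time.
project_generations <- function(r, initial_cases = 100, generations = 10) {
  initial_cases * r^(0:generations)
}

# Three illustrative values of the reproduction number (assumed, not estimated)
scenarios <- c(shrinking = 0.8, steady = 1.0, growing = 1.5)

# One column per scenario, one row per generation
projection <- sapply(scenarios, project_generations)
rownames(projection) <- paste0("gen_", 0:10)
round(projection)
```

Under these assumptions, a reproduction number of 0.8 needs roughly ten generations to reduce new infections ten-fold, which is the slow decay referred to above, while a value of 1.5 multiplies them more than fifty-fold over the same number of generations.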
#### Estimating the time-varying _effective reproduction number_ $R_{t}$
We estimate the _effective reproduction number_ $R_{t}$ using a statistical model which was developed in 2013 by [Anne Cori and colleagues](https://academic.oup.com/aje/article/178/9/1505/89262) at Imperial College London (ICL). The method was later extended by [Thompson and colleagues](https://doi.org/10.1016/j.epidem.2019.100356) (including Anne Cori), also at ICL. We won't go into the details of the model here -- they are described in detail in the cited papers -- but we will remark on a few key points that need to be borne in mind when interpreting these statistics.
1. The methods we use here estimate the **instantaneous** effective reproduction number, which is the average number of secondary cases that each infected individual would infect if the conditions remained as they were at time _t_. In that respect, it is analogous to _current life expectancy_ (which is the one usually calculated), which is the length of time someone is expected to live if the age- and sex-specific death rates as they currently are persist, unchanged, into the future.
1. The _reproduction number_ is an estimate of the spread of an infection in a specific, local population, and thus we should really only use counts of incident (new) infections which are the result of **local** spread -- that is, we should not include cases which were acquired elsewhere, such as overseas or interstate, when estimating it. Including them will bias our estimates of $R_{t}$ upwards while (in rough terms) the incidence of such cases is increasing, and downwards when they slow or stop. In fact, the estimation method used here can make proper use of **both** incident counts of imported cases and separate counts of locally-acquired cases. Unfortunately, very few jurisdictions have released data which enable that distinction between imported and locally-acquired cases to be made, and so we just use total incidence for the estimates shown in this section. However, incident counts of cases, split into imported and locally-acquired counts, **are** available from NSW Health, and we report on those much better data in the next section (see menu at top of page).
1. We use a seven-day trailing sliding window of incident case counts to calculate the current $R_{t}$ estimate. Thus, for each date, the estimate is based on the count of incident cases on that date and for the six days prior to that. This is done to smooth the estimates and provide greater precision for each estimate.
1. The estimation procedure uses Bayesian methods, and we display the median estimate of the posterior distribution of the estimated $R_{t}$, as well as the 95% credible interval in some of the charts.
1. Estimation of the effective reproduction number should really use incidence data tabulated (aggregated) by date of symptom onset, not date of reporting or date of notification (to the relevant health authority). Please see the explanatory notes in the previous section on incidence for further discussion of this data gap, and how health authorities could readily address it.
1. Estimates of the _reproduction number_ depend critically on using the correct distribution for the _serial interval_. The _serial interval_ is the interval between the onset of symptoms in a case and the onset of symptoms in those infected by that case. In other words, it is the difference between the time (or date) of onset in an _infector_ and the times or dates of onset in its _infectees_. Not every pair of _infector-infectee_ cases will have the same serial interval, even for the same _infector_ case. Thus, the _serial interval_ is not a single number, or even a range, but rather a statistical distribution. It is often expressed or summarised as the mean and standard deviation of an idealised distribution, typically a discretised _gamma_ distribution, a _discrete Weibull_ distribution or a _log-normal_ distribution. In the charts in this section, we have used samples from the Bayesian posterior distribution for the _serial interval_ derived from the observed _serial interval_ pairs for COVID-19 published by [Nishiura _et al_.](https://www.ijidonline.com/article/S1201-9712(20)30119-3/pdf). Note that the resulting set of samples has a mean which is shorter than most estimates of the incubation period of COVID-19, which is indicative of pre-symptomatic transmission -- that is, some cases of COVID-19 infection transmit the infection to others before they themselves start to experience symptoms of illness. That also explains the real difficulties in controlling the spread of the SARS-CoV-2 virus which causes COVID-19.
Further discussion of _serial interval_ estimation and calculation of the _effective reproduction number_ can be found in [this technical blog post](https://timchurches.github.io/blog/posts/2020-02-18-analysing-covid-19-2019-ncov-outbreak-data-with-r-part-1/#estimating-changes-in-the-effective-reproduction-number).
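
The key points above (a trailing seven-day window, a serial interval distribution, and a Bayesian posterior summarised by its median and 95% credible interval) can be sketched in a few lines of base R. This is a simplified illustration of the conjugate gamma-Poisson update described by Cori _et al_., not the code used to produce the charts in this section. The function names `discretise_si` and `estimate_rt`, the serial interval mean and standard deviation, the prior (mean 5, standard deviation 5) and the daily counts are all assumptions made for the example, and a single fixed serial interval distribution is used rather than samples from a posterior.

```r
# Discretised serial interval: probability that the gap between symptom
# onset in an infector and in an infectee is 1, 2, ..., max_days days.
# The gamma mean and sd are illustrative assumptions, not the posterior
# samples used for the charts in this section.
discretise_si <- function(mean_si = 4.7, sd_si = 2.9, max_days = 30) {
  shape <- (mean_si / sd_si)^2
  rate  <- mean_si / sd_si^2
  w <- diff(pgamma(0:max_days, shape = shape, rate = rate))
  w / sum(w)  # renormalise after truncating at max_days
}

# Instantaneous R_t over a trailing window, using a gamma prior
# (prior_shape = 1, prior_scale = 5 gives prior mean 5 and sd 5).
estimate_rt <- function(incidence, si, window = 7,
                        prior_shape = 1, prior_scale = 5) {
  n <- length(incidence)
  # Total infectiousness on day t: earlier cases weighted by the serial interval
  lambda <- vapply(seq_len(n), function(t) {
    s <- seq_len(min(t - 1, length(si)))
    sum(incidence[t - s] * si[s])
  }, numeric(1))
  ends <- (window + 1):n  # last day of each trailing window
  do.call(rbind, lapply(ends, function(t) {
    idx   <- (t - window + 1):t  # this day and the six days before it
    shape <- prior_shape + sum(incidence[idx])
    rate  <- 1 / prior_scale + sum(lambda[idx])
    data.frame(day      = t,
               median_r = qgamma(0.5,   shape, rate = rate),
               lower_r  = qgamma(0.025, shape, rate = rate),
               upper_r  = qgamma(0.975, shape, rate = rate))
  }))
}

# Made-up daily counts: growth followed by decline
incidence <- c(1, 2, 3, 5, 8, 12, 18, 25, 33, 40,
               44, 45, 42, 37, 31, 25, 20, 15, 11, 8)
rt <- estimate_rt(incidence, discretise_si())
head(rt)
```

With these made-up counts the median estimate starts above 1.0 while incidence is growing and falls below 1.0 once it is declining. The EpiEstim R package, by the authors of the cited papers, provides a full implementation of the method (its `estimate_R()` function), including estimation with uncertainty in the serial interval.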
### $R_{t}$ for selected countries on a single plot
```{r covid-instant-repro}
covid_instant_r <- covid_changepoint %>%
add_instant_reproduction(estimate_method="si_from_sample",
si_sample=nishi_si_sample,
config = si_from_sample_nishiura_config) %>%
group_by(country_region) %>%
mutate(total_cases = sum(cases)) %>%
ungroup() %>%
mutate(country_region = stringr::str_replace_all(country_region,
pattern = "_",
replacement = " "))
```
```{r gg-effective-repro-all, cache = FALSE, fig.height = 8, fig.width = 12}
covid_instant_r %>%
# only look at dates from March
filter(date > ymd("2020-03-01")) %>%
gg_effective_repro_all() +
theme(panel.grid.minor = element_blank())
# ggthemes::scale_colour_colorblind() +
# annotation_logticks()
```
***
The $R_{t}$ estimates for each country indicate that most countries are bringing the spread of the virus under control or have brought it under control, with the exception of Brazil.
### $R_{t}$ for selected countries as facetted plots
```{r gg-effective-r-facet, fig.height = 8, fig.width = 12}
covid_instant_r %>%
# only look at dates from March
filter(date > ymd("2020-03-01")) %>%
gg_effective_repro_facet() +
facet_wrap(~reorder(country_region, -total_cases)) +
theme(panel.grid.minor = element_blank()) +
easy_rotate_x_labels(angle=45, side="right")
```
***
This presents the same information as the previous graphic, but with each country split into its own graph. This allows us to see the trajectory of $R_{t}$ for each country more easily.
In most countries $R_{t}$ is decreasing, but we notice that it is rising in Japan and Singapore, due to recent recrudescent outbreaks.
### $R_{t}$ for selected countries as facetted plots with 95% credible intervals shown
```{r gg-effective-r-facet-ribbon, fig.height = 8, fig.width = 12}
covid_instant_r %>%
# only look at dates from March
filter(date > ymd("2020-03-01")) %>%
gg_effective_repro_facet() +
facet_wrap(~reorder(country_region, -total_cases)) +
geom_ribbon(aes(ymin = quantile_0_025_r,
ymax = quantile_0_975_r),
alpha = 0.2) +
easy_rotate_x_labels(angle=45, side="right")
```
***
This shows the uncertainty around the median estimates of the effective reproduction number. The grey bands either side of the median estimate indicate the range in which we are 95% certain the true value lies (note: these are Bayesian credible intervals, not frequentist confidence intervals).
### Australia: $R_{t}$ and incidence
```{r this-is-australia}
gg_effective_repro_incidence_patchwork(covid_instant_r,
covid,
"Australia")
```
***
The $R_{t}$ for the COVID-19 epidemic in Australia appears to have been in decay since about 23rd March. Note that Australia effectively closed its borders to all non-residents on 19th March, and implemented nation-wide social distancing recommendations on 21st March, and state governments began to close non-essential services on 22nd March.
The estimates here use samples from a posterior distribution of _serial intervals_ estimated from the data given by [Nishiura _et al_.](https://www.ijidonline.com/article/S1201-9712(20)30119-3/pdf) using the method of [Thompson _et al_](https://doi.org/10.1016/j.epidem.2019.100356).
### Japan: $R_{t}$ and incidence
```{r this-is-japan}
gg_effective_repro_incidence_patchwork(covid_instant_r,
covid,
"Japan")
```
***
By contrast, Japan appears to have managed the early stages of the epidemic well, but failed, due to lack of enabling legislation, to implement strict social distancing nationwide. This may have contributed to the "escape" of the epidemic around 23rd March. Since then, Japan has struggled to regain control, although emergencies have now been declared in many Japanese prefectures, which have resulted in better containment of spread.
The estimates here use samples from a posterior distribution of _serial intervals_ estimated from the data given by [Nishiura _et al_.](https://www.ijidonline.com/article/S1201-9712(20)30119-3/pdf) using the method of [Thompson _et al_](https://doi.org/10.1016/j.epidem.2019.100356).
### South Korea: $R_{t}$ and incidence
```{r this-is-south-krea}
gg_effective_repro_incidence_patchwork(covid_instant_r,
covid,
"South Korea")
```
***
South Korea is a model of how to fight COVID-19. After initial, very fast growing outbreaks centred in Daegu, the South Korean authorities moved swiftly to implement very efficient and thorough case and contact tracing with rigorous isolation of cases and quarantining of contacts, plus extensive social distancing measures in affected areas. Within a month they had brought the epidemic under control, and have been able to keep it in decay since 9th March, with the exception of a short-lived outbreak in nightclubs in Seoul as soon as social distancing was relaxed, now contained.
The estimates here use samples from a posterior distribution of _serial intervals_ estimated from the data given by [Nishiura _et al_.](https://www.ijidonline.com/article/S1201-9712(20)30119-3/pdf) using the method of [Thompson _et al_](https://doi.org/10.1016/j.epidem.2019.100356).
### China: $R_{t}$ and incidence
```{r this-is-china}
gg_effective_repro_incidence_patchwork(covid_instant_r,
covid,
"China")
```
***
China stunned the world, or at least it stunned epidemiologists and public health practitioners everywhere, when it closed the city of Wuhan, and then the entire province of Hubei, on 24th January. The result was that within a few weeks the epidemic was in decay, and that decay has accelerated due to door-to-door case-and-contact tracing efforts. Small outbreaks around the country, including in Wuhan, have all been rapidly contained.
As COVID-19 epidemics began to become apparent in other countries in March, resulting in the repatriation of Chinese nationals, there was an increase in the $R_{t}$ estimated here, because it is not possible to separate those _imported_ cases from ongoing local transmission. Thus, at least some of the "bump" in the estimated $R_{t}$ around 23rd March is an artefact of inadequate data.
The estimates here use samples from a posterior distribution of _serial intervals_ estimated from the data given by [Nishiura _et al_.](https://www.ijidonline.com/article/S1201-9712(20)30119-3/pdf) using the method of [Thompson _et al_](https://doi.org/10.1016/j.epidem.2019.100356).
### Canada: $R_{t}$ and incidence
```{r this-is-canada}
gg_effective_repro_incidence_patchwork(covid_instant_r,
covid,
"Canada")
```
***
Canada clearly suffers from sharing a long border with the US, but nonetheless appears to almost have its COVID-19 epidemic under control.
The estimates here use samples from a posterior distribution of _serial intervals_ estimated from the data given by [Nishiura _et al_.](https://www.ijidonline.com/article/S1201-9712(20)30119-3/pdf) using the method of [Thompson _et al_](https://doi.org/10.1016/j.epidem.2019.100356).
### Germany: $R_{t}$ and incidence
```{r this-is-germany}
gg_effective_repro_incidence_patchwork(covid_instant_r,
covid,
"Germany")
```
***
Italy, Spain, France and Germany have all suffered large epidemics of COVID-19, but all appear to have now contained spread to the point that their epidemics are now clearly in decay.
The estimates here use samples from a posterior distribution of _serial intervals_ estimated from the data given by [Nishiura _et al_.](https://www.ijidonline.com/article/S1201-9712(20)30119-3/pdf) using the method of [Thompson _et al_](https://doi.org/10.1016/j.epidem.2019.100356).
### Spain: $R_{t}$ and incidence
```{r this-is-spain}
gg_effective_repro_incidence_patchwork(covid_instant_r,
covid,
"Spain")
```
***
Italy, Spain, France and Germany have all suffered large epidemics of COVID-19, but all appear to have now contained spread to the point that their epidemics are now clearly in decay.
The estimates here use samples from a posterior distribution of _serial intervals_ estimated from the data given by [Nishiura _et al_.](https://www.ijidonline.com/article/S1201-9712(20)30119-3/pdf) using the method of [Thompson _et al_](https://doi.org/10.1016/j.epidem.2019.100356).
### Brazil: $R_{t}$ and incidence
```{r this-is-brazil}
gg_effective_repro_incidence_patchwork(covid_instant_r,
covid,
"Brazil")
```
***
Incidence in Brazil lagged that in most northern hemisphere countries, but now appears to be in the exponential growth phase, with growth slowing only gradually.
The estimates here use samples from a posterior distribution of _serial intervals_ estimated from the data given by [Nishiura _et al_.](https://www.ijidonline.com/article/S1201-9712(20)30119-3/pdf) using the method of [Thompson _et al_](https://doi.org/10.1016/j.epidem.2019.100356).
### Italy: $R_{t}$ and incidence
```{r this-is-italy}
gg_effective_repro_incidence_patchwork(covid_instant_r,
covid,
"Italy")
```
***
Italy, Spain, France and Germany have all suffered large epidemics of COVID-19, but all appear to have now contained spread to the point that their epidemics are now clearly in decay.
The estimates here use samples from a posterior distribution of _serial intervals_ estimated from the data given by [Nishiura _et al_.](https://www.ijidonline.com/article/S1201-9712(20)30119-3/pdf) using the method of [Thompson _et al_](https://doi.org/10.1016/j.epidem.2019.100356).
### Russia: $R_{t}$ and incidence
```{r this-is-russia}
gg_effective_repro_incidence_patchwork(covid_instant_r,
covid,
"Russia")
```
***
Incidence in Russia was initially very low, possibly due to limited testing, but has since accelerated dramatically, particularly in Moscow, although social distancing and rigorous lock-down are now having an effect and incidence now appears to be declining.
The estimates here use samples from a posterior distribution of _serial intervals_ estimated from the data given by [Nishiura _et al_.](https://www.ijidonline.com/article/S1201-9712(20)30119-3/pdf) using the method of [Thompson _et al_](https://doi.org/10.1016/j.epidem.2019.100356).
### Singapore: $R_{t}$ and incidence
```{r this-is-singapore}
gg_effective_repro_incidence_patchwork(covid_instant_r,
covid,
"Singapore")
```
***
Singapore was initially a model of how to contain the virus, and has used advanced smartphone technologies to ensure rigorous case isolation and contact quarantining. Nonetheless, the epidemic escaped control, and the authorities have been struggling to bring it back under control since. The reasons for this late failure of control are important to study, as they hold valuable lessons for other countries (such study is beyond the scope of this website, for now).
The estimates here use samples from a posterior distribution of _serial intervals_ estimated from the data given by [Nishiura _et al_.](https://www.ijidonline.com/article/S1201-9712(20)30119-3/pdf) using the method of [Thompson _et al_](https://doi.org/10.1016/j.epidem.2019.100356).
### Sweden: $R_{t}$ and incidence
```{r this-is-sweden}
gg_effective_repro_incidence_patchwork(covid_instant_r,
covid,
"Sweden")
```
***
The estimates here use samples from a posterior distribution of _serial intervals_ estimated from the data given by [Nishiura _et al_.](https://www.ijidonline.com/article/S1201-9712(20)30119-3/pdf) using the method of [Thompson _et al_](https://doi.org/10.1016/j.epidem.2019.100356).
### Taiwan: $R_{t}$ and incidence
```{r this-is-taiwan}
gg_effective_repro_incidence_patchwork(covid_instant_r,
covid,
"Taiwan")
```
***
Please see this [SMH article](https://www.smh.com.au/world/asia/population-the-same-as-australia-s-but-a-fraction-of-the-coronavirus-cases-20200412-p54j67.html) for background on Taiwan's comprehensive and model approach, one which other island states (including Australia, New Zealand and, effectively, South Korea) are seeking to emulate, or should be.
Because Taiwan has so few cases, the estimates of $R_{t}$ are quite unstable, but its response has been a major success story.
The estimates here use samples from a posterior distribution of _serial intervals_ estimated from the data given by [Nishiura _et al_.](https://www.ijidonline.com/article/S1201-9712(20)30119-3/pdf) using the method of [Thompson _et al_](https://doi.org/10.1016/j.epidem.2019.100356).
### United Kingdom: $R_{t}$ and incidence
```{r this-is-uk}
gg_effective_repro_incidence_patchwork(covid_instant_r,
covid,
"UK")
```
***
The UK was initially following a "herd immunity" approach, like Sweden, but changed strategy in response to a [modelling report](https://www.imperial.ac.uk/mrc-global-infectious-disease-analysis/covid-19/report-9-impact-of-npis-on-covid-19/) released on 16 March by Imperial College London, perhaps just in time to avoid a complete catastrophe. However, Britain is still struggling to bring the epidemic under control, and it appears to have only recently left the growth phase.
The estimates here use samples from a posterior distribution of _serial intervals_ estimated from the data given by [Nishiura _et al_.](https://www.ijidonline.com/article/S1201-9712(20)30119-3/pdf) using the method of [Thompson _et al_](https://doi.org/10.1016/j.epidem.2019.100356).
### USA: $R_{t}$ and incidence
```{r this-is-usa}
gg_effective_repro_incidence_patchwork(covid_instant_r,
covid,
"USA")
```
***
The world is aghast at the US response to COVID-19 and the results thereof. But the US does appear to be slowly reducing transmission, mainly due to extremely rigorous social distancing measures in New York city and state, which is a centre of the epidemic. The recent opening of the economy in the US may yet have detrimental effects on control of COVID-19.
The estimates here use samples from a posterior distribution of _serial intervals_ estimated from the data given by [Nishiura _et al_.](https://www.ijidonline.com/article/S1201-9712(20)30119-3/pdf) using the method of [Thompson _et al_](https://doi.org/10.1016/j.epidem.2019.100356).
$R_{t}$ for NSW {.storyboard}
=======================================================================
### Explanation
The charts of the time-varying effective reproduction number $R_{t}$ for New South Wales COVID-19 incidence data are similar to the analyses presented in the previous section, with the important distinction that incident cases can be divided into locally-acquired cases and cases where the infection was acquired elsewhere (overseas or interstate). This permits much more accurate estimation of the true $R_{t}$.
This has been enabled by the publication of more detailed data by NSW Health, available on the [NSW government open data web site](https://data.nsw.gov.au/nsw-covid-19-data), and updated daily, in conjunction with data derived by counting pixels in NSW Health charts to obtain earlier data (luckily the charts were high-resolution and thus the data extracted this way is very accurate).
In each of the frames in this section, estimates of the NSW effective reproduction number are shown, based on either a parametric _serial interval_ distribution, or a _serial interval_ distribution derived from data collated by [Nishiura _et al._](https://www.ijidonline.com/article/S1201-9712(20)30119-3/pdf) using the method of [Thompson _et al_](https://doi.org/10.1016/j.epidem.2019.100356).
Frames in which imported and locally-acquired cases are **not** distinguished in the estimates are also provided for comparison. These are much less likely to be correct than the estimates using split imported/locally-acquired counts.
In addition, frames in which the counts of cases in NSW have been inflated by a factor of 10 for local cases, and by 50% for imported cases, are also presented. This is a _sensitivity analysis_ which mimics the situation in which there is considerable under-ascertainment of cases -- that is, only 1 in 10 cases in the community are actually detected.
#### Technical details
Please see the paper by [Thompson _et al_.](https://www.sciencedirect.com/science/article/pii/S1755436519300350) for details of the methods used here.
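
As a rough illustration of why the split between imported and locally-acquired cases matters, the sketch below compares two point estimates of $R_{t}$ on made-up data. It assumes, as a simplification of the approach in the cited paper, that imported cases contribute to the infectiousness of the population but are not themselves counted as the product of local transmission. The function name `posterior_mean_rt`, the serial interval weights, the prior (mean 5, standard deviation 5) and the case counts are all hypothetical.

```r
# Posterior mean of R_t over a trailing window. `local` and `imported` are
# daily counts; imported cases add to infectiousness but are not counted
# as the result of local transmission (an assumption of this sketch).
posterior_mean_rt <- function(local, imported, si, window = 7,
                              prior_shape = 1, prior_scale = 5) {
  total <- local + imported
  n <- length(total)
  # Total infectiousness on day t: earlier cases weighted by the serial interval
  lambda <- vapply(seq_len(n), function(t) {
    s <- seq_len(min(t - 1, length(si)))
    sum(total[t - s] * si[s])
  }, numeric(1))
  ends <- (window + 1):n
  vapply(ends, function(t) {
    idx <- (t - window + 1):t
    (prior_shape + sum(local[idx])) / (1 / prior_scale + sum(lambda[idx]))
  }, numeric(1))
}

# Illustrative serial interval weights for days 1 to 10 (assumed; sum to 1)
si <- c(0.05, 0.15, 0.20, 0.18, 0.14, 0.10, 0.08, 0.05, 0.03, 0.02)

# Made-up counts: a burst of returning travellers seeding modest local spread
imported <- c(2, 4, 8, 12, 15, 14, 10, 6, 3, 1, 0, 0, 0, 0)
local    <- c(0, 0, 1, 2, 3, 4, 5, 5, 5, 4, 4, 3, 3, 2)

# Split estimate versus a naive estimate that treats every case as local
split_rt <- posterior_mean_rt(local, imported, si)
naive_rt <- posterior_mean_rt(local + imported, rep(0, length(local)), si)
round(cbind(day = 8:14, split_rt, naive_rt), 2)
```

In this toy example the naive estimate is higher than the split estimate in every window that contains imported cases, because those cases are wrongly attributed to local transmission.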
### **NSW -- locally-acquired & overseas/interstate-acquired cases treated separately**<br>(cases under investigation excluded)<br>parametric serial interval distribution
```{r nsw-local-os-sep, fig.width=10, fig.height=10}
A1a / A2a / A3a / A4a + plot_layout(heights=c(2,1,1,1))
```
***
Commentary
### **NSW -- locally-acquired & overseas/interstate-acquired cases treated separately**<br>(cases under investigation excluded)<br>serial interval distribution estimated from data
```{r nsw-local-os-sep-est-si-dist, fig.width=10, fig.height=10}
A1b / A2b / A3b / A4b + plot_layout(heights=c(2,1,1,1))
```
***
Commentary
---
### **NSW -- adjusting for potential under-ascertainment, locally-acquired & overseas/interstate-acquired cases treated separately**, (cases under investigation excluded), parametric serial interval distribution
```{r nsw-adj-local-acquire-separately, fig.width=10, fig.height=10}
UA1a / UA2a / UA3a / UA4a + plot_layout(heights=c(2,1,1,1))
```
***
In these plots, the counts of incident cases with presumed local sources of infection have been inflated by a factor of 10, and the counts of cases with presumed overseas or interstate sources of infection inflated by a factor of 1.5. This mimics ten-fold under-ascertainment of locally-transmitted cases, and 50% under-ascertainment of inbound cases.
---
### **NSW -- adjusting for potential under-ascertainment, locally-acquired & overseas/interstate-acquired cases treated separately**, (cases under investigation excluded), serial interval distribution estimated from data
```{r nsw-adj-under-os-separately, fig.width=10, fig.height=10}
UA1b / UA2b / UA3b / UA4b + plot_layout(heights=c(2,1,1,1))
```
***
In these plots, the counts of incident cases with presumed local sources of infection have been inflated by a factor of 10, and the counts of cases with presumed overseas or interstate sources of infection inflated by a factor of 1.5. This mimics ten-fold under-ascertainment of locally-transmitted cases, and 50% under-ascertainment of inbound cases.
In fact, so long as the rate of under-ascertainment does not change much over time, it has little effect on the effective reproduction number, which is driven by the number of cases each case infects, not the total number of cases or infectious individuals.
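
A minimal sketch of why a constant rate of under-ascertainment has little effect: the estimate is essentially a ratio of new cases to the infectiousness generated by earlier cases, so multiplying every count by the same factor largely cancels out. For simplicity this example uses a single case series and a single inflation factor of 10, rather than the separate factors for local and imported cases used in the plots. The function name `posterior_mean_rt`, the serial interval weights, the prior and the counts are hypothetical.

```r
# Posterior mean of R_t over a trailing window (gamma prior, mean 5 and sd 5)
posterior_mean_rt <- function(incidence, si, window = 7,
                              prior_shape = 1, prior_scale = 5) {
  n <- length(incidence)
  # Total infectiousness on day t: earlier cases weighted by the serial interval
  lambda <- vapply(seq_len(n), function(t) {
    s <- seq_len(min(t - 1, length(si)))
    sum(incidence[t - s] * si[s])
  }, numeric(1))
  ends <- (window + 1):n
  vapply(ends, function(t) {
    idx <- (t - window + 1):t
    (prior_shape + sum(incidence[idx])) / (1 / prior_scale + sum(lambda[idx]))
  }, numeric(1))
}

# Illustrative serial interval weights for days 1 to 10 (assumed; sum to 1)
si <- c(0.05, 0.15, 0.20, 0.18, 0.14, 0.10, 0.08, 0.05, 0.03, 0.02)

# Made-up observed counts, and the same series if only 1 in 10 cases is detected
observed <- c(3, 5, 8, 12, 17, 22, 26, 28, 27, 24, 20, 16, 12, 9)
inflated <- observed * 10

rt_observed <- posterior_mean_rt(observed, si)
rt_inflated <- posterior_mean_rt(inflated, si)
round(cbind(rt_observed, rt_inflated), 2)

# Largest absolute difference between the two sets of estimates
max(abs(rt_observed - rt_inflated))
```

The small remaining difference comes from the prior, which carries relatively less weight when the counts are larger.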
---
### **NSW -- locally-acquired & overseas/interstate-acquired cases treated separately**<br>(cases under investigation included)<br>parametric serial interval distribution
```{r nsw-local-os-separately-paramtric, fig.width=10, fig.height=10}
B1a / B2a / B3a / B4a + plot_layout(heights=c(2,1,1,1))
```
***
Commentary
---
### **NSW -- locally-acquired & overseas/interstate-acquired cases treated separately**<br>(cases under investigation included)<br>serial interval distribution estimated from data
```{r nsw-local-os-sep-si-dist-data, fig.width=10, fig.height=10}
B1b / B2b / B3b / B4b + plot_layout(heights=c(2,1,1,1))
```
***
Commentary
---
### **NSW -- all cases treated as locally-acquired**<br>parametric serial interval distribution
```{r nsw-all-cases-locally, fig.width=10, fig.height=10}
C1a / C2a / C3a + plot_layout(heights=c(2,1.5,1.5))
```
***
Commentary
---
### **NSW -- all cases treated as locally-acquired**<br>serial interval distribution estimated from data
```{r nsw-all-cases-locally-si-est-data, fig.width=10, fig.height=10}
C1b / C2b / C3b + plot_layout(heights=c(2,1.5,1.5))
```
***
Commentary
---
GIFs {.storyboard}
=======================================================================
### Please stay at home these Easter holidays
```{r stay-at-home}
if (params$local) {
knitr::include_graphics("docs/assets/stay_at_home.gif")
} else {
knitr::include_graphics("assets/stay_at_home.gif")
}
```
***
This is a graphic created by Pavel Krivitsky (UNSW Maths and Stats) and Tim Churches (UNSW Medicine) to illustrate the importance of reduced social mixing, particularly over the Easter break in 2020. Further explanation and technical details can be found [here](https://cbdrh.github.io/StayAtHome/).
### Flatten this!