Forecasting Covid-19: An annotated chat transcript of Adrian Raftery's recent Berkeley Demography brown bag

The brown bag seminar in the Berkeley Demography Department has been all Covid-19 all the time the last few weeks. This week we had a terrific panel made up of Adrian Raftery from the University of Washington together with Berkeley’s own Nicholas Jewell and Joseph Lewnard.

Our format is that while the speakers are talking the audience enters questions in the chat room. We try to ask these questions during the session. This week, when I mailed out the transcript to our speakers, I was delighted to get a reply from Adrian with his own written answers.

Enjoy the annotated transcript. And maybe some of you will want to join in live when we have our next session. If you would like to be on the brown bag announcement list, please email me.

Transcript from “Forecasting Covid-19”

(Adrian’s comments appear directly below the question they answer)

12:13:59 From Joshua Goldstein: Question for everyone: is there any forecasting problem comparable to Covid-19, where real-time forecasts are being used for policy?

Adrian: Oh yes. Weather forecasting! Big decisions made often (e.g. close schools for snow).

12:17:08 From Joshua Goldstein: a question for Adrian, was time-varying R0(t) modeling innovated for the HIV epidemic or was this already standard in epi?

Adrian: I don’t think so, actually – I think there were a lot of advances in infectious disease epi from the HIV-AIDS epidemic. But I’m sure Nick would know better than me.

12:20:36 From Joshua Goldstein: another question for the group: does anyone know of sero-tests that are being planned for population surveys (NHIS, …?)

12:21:06 From Magali & Robert: I am curious to know what Adrian thinks of the IHME models.

Adrian: I did make a few comments in the talk.

12:21:41 From Nick Jewell: I will say something about IHME but would be curious about Adrian’s views also.

12:21:41 From Joshua Goldstein: what does “be Bayesian” mean right now for COVID-19?

Adrian: Well, several of the models are already Bayesian. Gabriel Leung’s work at U of Hong Kong (Lancet, January 2020) is Bayesian, and a great use of the framework. The IHME model is also at least partly Bayesian. More generally, I think that if one is using epi models (e.g. SEIR) and trying to combine data sources, a Bayesian approach can be helpful.

12:25:17 From John Sibert: Is it possible to have complete references to the literature cited by the speaker?

Adrian: My own relevant work on this is here. The other papers I cited were Schwartlander et al. (1999, AIDS), and Donnat and Holmes (April 11, 2020, arXiv).

12:25:24 From Wendy Baldwin: There are some gender differences, it seems, with a male disadvantage. Do you need to account for gender in your models?

Adrian: Answered orally. [Note by Josh: Adrian’s answer to the more general question of what to include and what to leave out was that the choice to include a variable in the model needs to be driven both by the theoretical importance of a variable and data availability. Over-parameterized models risk being inaccurate for forecasts. So even if you’re convinced a variable is important, be careful about including it if you don’t have enough data to make good parameter estimates. (This was a general comment, not focused on “gender”.)]

12:25:53 From Christopher Paciorek: Question for Adrian: given data shortcomings, how much hope is there to estimate variation in R0 / effects of social distancing?

Adrian: Well, I think simple models could be estimated (e.g. social distancing reduced exposure by alpha %). It would be important not to get too fancy, though.
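
To make the "reduced exposure by alpha %" idea concrete, here is a minimal sketch, not a model presented in the session. It assumes a daily-step SIR model in which the transmission rate drops by a fraction `alpha` from a chosen day onward. The function name, the parameter values, and the population size are all hypothetical, and `alpha` is written as a fraction (0.6 means 60%).

```python
def simulate_sir(beta, gamma, alpha, distancing_day, days, n=1_000_000, i0=10):
    """Daily-step SIR. From `distancing_day` on, transmission is cut by the fraction `alpha`.

    beta  : baseline transmission rate per day (assumed value, not estimated)
    gamma : recovery rate per day (assumed value, not estimated)
    Returns the number of currently infected people at the end of each day.
    """
    s, i, r = float(n - i0), float(i0), 0.0
    infected = []
    for day in range(days):
        # Social distancing enters the model only through this one parameter.
        beta_t = beta * (1 - alpha) if day >= distancing_day else beta
        new_infections = beta_t * s * i / n
        new_recoveries = gamma * i
        s = s - new_infections
        i = i + new_infections - new_recoveries
        r = r + new_recoveries
        infected.append(i)
    return infected


# Illustrative comparison: the same epidemic with and without a 60% reduction.
baseline = simulate_sir(beta=0.3, gamma=0.1, alpha=0.0, distancing_day=20, days=200)
distanced = simulate_sir(beta=0.3, gamma=0.1, alpha=0.6, distancing_day=20, days=200)
print("peak infected, no distancing:", round(max(baseline)))
print("peak infected, with distancing:", round(max(distanced)))
```

In a real analysis `alpha` would be estimated from data rather than fixed by hand. The sketch only shows how little structure the "simple model" needs: one extra parameter on top of a standard SIR.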

12:26:13 From luisimac21: Handling heterogeneous mixing in SIR models is a problem, right? Solutions? -Luis

Adrian: Oh yes. It makes the model much more complicated. My general approach is to fit a simpler model and validate the forecasts against observations. If it’s OK, you don’t need to get more complicated.
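
The "fit a simpler model and validate the forecasts against observations" approach can be sketched as a holdout check. This is an illustration under stated assumptions, not the procedure used by any of the speakers: the simple model is a log-linear (exponential growth) fit, the error measure is mean absolute percentage error, and the function names and the synthetic series are hypothetical.

```python
import math


def fit_log_linear(counts):
    """Least-squares fit of log(count) = a + b * day. Returns (a, b)."""
    n = len(counts)
    days = list(range(n))
    logs = [math.log(c) for c in counts]
    mean_t = sum(days) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in days)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(days, logs))
    b = sxy / sxx
    a = mean_y - b * mean_t
    return a, b


def holdout_error(counts, horizon):
    """Fit on all but the last `horizon` days, forecast those days,
    and return the mean absolute percentage error of the forecasts."""
    train, held_out = counts[:-horizon], counts[-horizon:]
    a, b = fit_log_linear(train)
    start = len(train)
    forecasts = [math.exp(a + b * (start + k)) for k in range(horizon)]
    errors = [abs(f - obs) / obs for f, obs in zip(forecasts, held_out)]
    return sum(errors) / horizon


# Synthetic series for illustration only (exact 10% daily growth), not real data.
synthetic = [100 * 1.1 ** t for t in range(30)]
print("holdout error:", holdout_error(synthetic, horizon=7))
```

If the holdout error on real data is acceptably small, the simpler model is doing its job. If it is not, that is the signal to consider added structure such as heterogeneous mixing.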

12:26:39 From Lauren Goldstein: A new study has begun recruiting at the National Institutes of Health in Bethesda, Maryland, to determine how many adults in the United States without a confirmed history of infection with SARS-CoV-2, the virus that causes coronavirus disease 2019 (COVID-19), have antibodies to the virus. The presence of antibodies in the blood indicates a prior infection. In this “serosurvey,” researchers will collect and analyze blood samples from as many as 10,000 volunteers to provide critical data for epidemiological models. The results will help illuminate the extent to which the novel coronavirus has spread undetected in the United States and provide insights into which communities and populations are most affected.

12:29:56 From Joshua Goldstein: Thanks, Lauren. Question for group on this: isn’t it a terrible idea to use a convenience sample rather than a probability sample?

Adrian: Yes, I agree. Are they really going to do a sample of volunteers? It seems like a big waste.

Thanks again to Adrian for writing these responses and giving me permission to post them. Thanks to all of our speakers. And a special thank you to everyone who asked a question in the chat room.

See you all next Wednesday at noon (“Berkeley time”).