How many people are infected with COVID-19? Sewage suggests that number is much higher than officially confirmed

April 8, 2020

TL;DR: In an area with 446 reported cases, our sewage-based method estimates up to 115,000.

Estimating the true number of COVID-19 cases is extremely challenging. Counting confirmed clinical cases provides an important view into the scope of this pandemic, but case counts are a dramatic underestimate due to limited access to clinical testing. Moreover, asymptomatic patients or those with mild symptoms may never seek out testing in the first place, but they are potentially still contagious.

SARS-CoV-2 is shed in stool and has been detected in sewage in the U.S. by our team, and in the Netherlands by the KWR Research Institute.

Sewage suggests that a much larger number of people are infected with COVID-19.
Yesterday our team published on medRxiv the first study to estimate the number of people infected with COVID-19 based on the levels of SARS-CoV-2 quantified in sewage. We collected sewage samples from a large metropolitan area in the state of Massachusetts. On March 25, the area represented by the samples had approximately 446 confirmed cases of COVID-19. Based on our sewage analysis, we estimate that up to 115,000 people are infected and shedding the SARS-CoV-2 virus.

Figure: SARS-CoV-2 viral titer per mL of sewage. Estimated gene copies for each of the three CDC primers are shown for all sampling dates.
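
To put the gap between the two figures in perspective, here is a quick arithmetic check in Python. Both numbers come straight from the text above; the variable names are only illustrative, and because the sewage figure is an upper bound, the ratio is an upper bound too.

```python
# Figures quoted in the post (area sampled, as of March 25).
confirmed_cases = 446             # clinically confirmed cases
sewage_upper_estimate = 115_000   # upper-bound estimate from sewage

# Upper bound on estimated infections per confirmed case.
ratio = sewage_upper_estimate / confirmed_cases
print(f"Up to about {ratio:.0f} estimated infections per confirmed case")
```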

Our laboratory protocols were validated to quantify SARS-CoV-2 in sewage
Our detailed laboratory protocols are available as open source to the scientific community on our website. The four main steps in our protocol are:

1. Sample pasteurization
We subject sewage samples to a 60 °C heat bath for 60 minutes to inactivate coronaviruses.

2. Virus concentration
We remove bacterial cells, and we use a PEG-based precipitation method to concentrate viruses in the sewage sample.

3. RNA extraction
We extract RNA, since SARS-CoV-2 has an RNA genome. This RNA is used to create cDNA through a reverse transcription assay.

4. qPCR
Finally, we use qPCR to quantify the level of SARS-CoV-2 virus in the sewage sample. We use the virus genome to make a standard curve, which lets us convert the qPCR signal into genome copies per mL of sewage.
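
The standard-curve idea in step 4 can be sketched in a few lines of Python. This is a generic illustration of how a qPCR standard curve is commonly fit and inverted, not the exact procedure from our protocol: the dilution series, the Ct values, and the `sewage_ml_per_reaction` scaling factor are all made-up placeholders.

```python
def fit_standard_curve(log10_copies, ct_values):
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    n = len(log10_copies)
    mean_x = sum(log10_copies) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in log10_copies)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(log10_copies, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def copies_from_ct(ct, slope, intercept):
    """Invert the standard curve: Ct value -> genome copies per reaction."""
    return 10 ** ((ct - intercept) / slope)


def copies_per_ml(copies_per_reaction, sewage_ml_per_reaction):
    """Scale copies per reaction to copies per mL of sewage.

    sewage_ml_per_reaction is a hypothetical parameter: the volume of
    original sewage represented by the template in one qPCR reaction.
    """
    return copies_per_reaction / sewage_ml_per_reaction


# Hypothetical 10-fold dilution series of a genome standard
# (log10 copies per reaction) with invented Ct values.
standards_log10 = [1.0, 2.0, 3.0, 4.0, 5.0]
standards_ct = [36.7, 33.4, 30.1, 26.8, 23.5]

slope, intercept = fit_standard_curve(standards_log10, standards_ct)

# Amplification efficiency implied by the slope (1.0 would mean the
# product doubles perfectly every cycle).
efficiency = 10 ** (-1 / slope) - 1

# A hypothetical sewage sample with a measured Ct of 30.1.
sample_copies = copies_from_ct(30.1, slope, intercept)
sample_titer = copies_per_ml(sample_copies, sewage_ml_per_reaction=0.5)
print(f"slope={slope:.2f}, intercept={intercept:.1f}, "
      f"efficiency={efficiency:.2f}")
print(f"about {sample_titer:.0f} copies per mL of sewage")
```

A lower Ct means more starting template, which is why the fitted slope is negative. In practice the scaling factor depends on the concentration and extraction volumes used in steps 2 and 3.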

Our protocols were validated with several experiments:

  • The PCR product was sent for Sanger sequencing and we confirmed it uniquely matched the SARS-CoV-2 genome.
  • Samples collected at the same treatment facility back in January (before any confirmed cases) tested negative.
  • Results were reproducible across days, and
  • Results were reproducible in samples stored in the fridge for two weeks.

Figure: Sewage sample stability over time.

Why sewage analysis shows a higher number of cases than confirmed cases from individual testing.
At this time, our prevalence estimates are a back-of-the-envelope exercise and there’s much work to be done to improve accuracy. Beyond our own technical limitations, there are several possible explanations for why sewage gives a higher number of infected people than the count of confirmed cases:

  • People with mild symptoms may not go to the hospital or get tested,
  • There is growing evidence that COVID-19 could have a large asymptomatic population,
  • Access to clinical testing is limited, and
  • Confirmed cases are reported with a lag.
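
For readers who want to see what a back-of-the-envelope estimate of this kind can look like, here is a minimal Python sketch of a simple mass-balance calculation: total viral copies arriving at the treatment plant per day, divided by the copies one infected person is assumed to shed per day. It is an illustration under stated assumptions, not necessarily the model used in the study. The function name and every input value below are hypothetical placeholders, not measurements.

```python
def estimate_infected(copies_per_ml, daily_flow_liters,
                      copies_shed_per_person_per_day):
    """Rough mass-balance estimate of the number of people shedding virus.

    All three inputs are assumptions supplied by the caller:
      copies_per_ml                   viral titer measured in sewage
      daily_flow_liters               sewage volume reaching the plant per day
      copies_shed_per_person_per_day  assumed average shedding per infected person
    """
    if copies_shed_per_person_per_day <= 0:
        raise ValueError("shedding rate must be positive")
    total_copies_per_day = copies_per_ml * 1000 * daily_flow_liters
    return total_copies_per_day / copies_shed_per_person_per_day


# Placeholder inputs, chosen only to make the arithmetic easy to follow.
titer = 100.0   # copies per mL of sewage (hypothetical)
flow = 1e8      # liters of sewage per day (hypothetical)

# The estimate is very sensitive to the assumed shedding rate:
# each 10-fold change in shedding shifts the estimate 10-fold.
for shedding in (1e8, 1e9, 1e10):
    people = estimate_infected(titer, flow, shedding)
    print(f"shedding {shedding:.0e} copies/person/day -> "
          f"about {people:,.0f} people")
```

The loop shows why the per-person shedding rate matters so much: with everything else held fixed, the estimate scales inversely with it.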

Next steps to make our COVID-19 case estimation more accurate.
Our next step is to model the person-to-person variability in SARS-CoV-2 shedding in stool, which should make our case estimates more accurate. Our team is already in conversation with other academic groups tackling this issue. Kudos to academic collaboration!

Sewage samples collected from geographical areas with different rates of COVID-19 infection will also make our modeling more accurate. We will be announcing our campaign shortly. For more information, sign up at

Written by Biobot Analytics

Biobot provides wastewater epidemiology data & analysis to help governments & businesses focus on public health efforts and improve lives.