New York University

What Numbers Can—and Can’t—Tell Us About the Pandemic

Data scientists identify common COVID-19 statistical pitfalls
14-Jul-2020 6:20 PM EDT, by New York University

Newswise — Currently, we are confronted, around the clock, with troubling data as reporters, public health experts, and elected officials seek to understand and describe the path and impact of COVID-19—poring over rates of infection, hospital admission, and death, to name just a few key indicators. 

With so many numbers to digest, it can be challenging to separate statistics that may mislead from those that illuminate, something that has complicated the decision-making of government officials, according to recent news accounts. And while the widespread suspicion that numbers can be manipulated to support almost any conclusion predates the pandemic, partisanship around the response to the virus has further undermined Americans’ trust in COVID-19 data, according to a recent Pew Research Center survey.

But statistics are, of course, vital to understanding the current crisis, as well as other complex problems such as poverty, economic downturns, and climate change, and so researchers stress the importance of learning to distinguish what’s useful from what may be junk. 

“We suspect that statistics may be wrong, that people who use statistics may be ‘lying’—trying to manipulate us by using numbers to somehow distort the truth,” writes sociologist Joel Best in his book Damned Lies and Statistics. But, he explains, “[t]he solution to the problem of bad statistics is not to ignore all statistics, or to assume that every number is false. Some statistics are bad, but others are pretty good, and we need statistics—good statistics—to talk sensibly about social problems.” 

To help enhance our own statistical literacy as the pandemic continues, NYU News spoke with Andrew Gordon Wilson and Jonathan Niles-Weed, assistant professors at NYU’s Center for Data Science and Courant Institute of Mathematical Sciences, who outlined some principles to keep in mind when evaluating figures cited in the news. 

Their tips appear below, but both caution that training in data science alone isn’t enough to equip leaders to make perfect decisions.

“Many people—statisticians included—think that every problem can be solved by getting better data,” says Niles-Weed. “But even with perfect information, beating COVID will require politicians and public health experts to weigh very different considerations and make hard choices despite uncertainty. Data can help, but setting good policy also requires incorporating values and goals.” 

Be certain about the uncertainty in the data.

“Many of the facts and figures we see come with big unstated error bars,” warns Wilson. “Suppose the only person in a village tested for coronavirus tests positive. It could be reported that the incidence rate in that region is 100%. You might say, ‘Surely they need to test more people?’ But how many people should we test for an accurate incidence estimate? Ten people, 100 people, 10,000 people? What’s a reasonable sample size? And do we only test symptomatic people? What fraction of the population is asymptomatic? What constitutes ‘accurate’? Similarly, models predicting quantities such as incidence rate take many variables as input, such as case fatality rate. These inputs similarly have big uncertainty attached to them. We should be conscious of uncertainty in parsing numbers we see in the media—the point predictions, without reasonable estimates of the error bars, are often meaningless.” 
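One way to make Wilson's "error bars" concrete is to compute a confidence interval for the observed positive rate at different sample sizes. The sketch below is an illustration, not something from the interview: it uses the Wilson score interval (a standard method named after a different Wilson), assumes the people tested are a simple random sample, and uses made-up counts.

```python
import math

def wilson_interval(positives, n, z=1.96):
    """Approximate 95% Wilson score interval for a proportion.

    Assumes the n people tested are a simple random sample, which
    real testing programs often are not.
    """
    if n <= 0:
        raise ValueError("need at least one test")
    p_hat = positives / n
    denom = 1 + z**2 / n
    center = (p_hat + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    # Clamp to the valid range for a proportion.
    return max(0.0, center - half), min(1.0, center + half)

# Hypothetical counts: one villager tested, then larger samples.
for positives, n in [(1, 1), (1, 10), (10, 100), (1000, 10000)]:
    low, high = wilson_interval(positives, n)
    print(f"{positives}/{n} positive: plausible rate {low:.1%} to {high:.1%}")
```

With a single positive test, the interval runs from roughly 21% to 100%, which is another way of saying the "100% incidence" figure carries almost no information. With 1,000 positives out of 10,000 tests, the interval narrows to about 9.4% to 10.6%. The calculation says nothing about who was chosen for testing, which is a separate source of error.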

Separate real trends from random occurrences.

“Random variation in data can easily be mistaken for a genuine trend,” says Niles-Weed. “Even if the underlying situation is static, data may change from day to day because of random noise. For example, if a state’s newly confirmed cases are particularly high during a given week and lower the next, it’s easy to interpret this as meaningful: perhaps the high caseload in one week made citizens more cautious, leading to a drop in cases the next week after behaviors changed. But it's just as likely that the first week was just a random outlier, and that nothing at all changed. By contrast, sustained day-over-day increases or decreases can indicate real trends.” 
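The point about noise can be seen in a small simulation. The sketch below assumes a hypothetical region of 20,000 people in which each person has the same fixed 0.5% chance of becoming a confirmed case every week, so nothing about the underlying situation ever changes. The population size, the risk, and the function name are all invented for illustration.

```python
import random

def simulate_weekly_cases(weeks, population, weekly_risk, seed=0):
    """Weekly case counts when the underlying risk never changes."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [
        sum(rng.random() < weekly_risk for _ in range(population))
        for _ in range(weeks)
    ]

cases = simulate_weekly_cases(weeks=8, population=20_000, weekly_risk=0.005)
print("weekly cases:", cases)

# Largest jump between consecutive weeks, produced by chance alone.
swings = [abs(later - earlier) for earlier, later in zip(cases, cases[1:])]
print("largest week-over-week change:", max(swings))
```

The expected count is 100 cases every week, yet the simulated counts move up and down from one week to the next. Any story told about a single rise or fall in this series would be a story about chance.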

Know what different probabilities can tell you—and what they can't. 

“It’s easy to confuse conditional probabilities, which is significant during a pandemic because it can lead to a misreading of testing data,” notes Wilson. “For example, in taking a test for coronavirus, we care about the probability that we have coronavirus given that we test positive, and not the probability that we test positive given that we have coronavirus.

“We have to carefully interpret what a probability is telling us. For example, the sensitivity of a test tells us the probability that we test positive, given that we have the condition. Similarly, another measure, the specificity, is the probability of a negative result if we don’t have the condition. If a test has a high sensitivity, and is thus reported as highly accurate, it does not mean testing positive means we are likely to have coronavirus, especially if the general rate of coronavirus in the population is low. Similarly, if the general rate of coronavirus is high, a negative test result may have high probability of being a false negative, even when the test has high specificity.” 
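The relationship Wilson describes follows from Bayes' rule, and a few lines of arithmetic show how strongly the answer depends on how common the virus is. The sketch below is illustrative only: the 95% sensitivity and specificity and the prevalence values are hypothetical, not figures for any real test.

```python
def predictive_values(prevalence, sensitivity, specificity):
    """Return (P(infected | positive), P(not infected | negative))."""
    true_pos = prevalence * sensitivity
    false_neg = prevalence * (1 - sensitivity)
    false_pos = (1 - prevalence) * (1 - specificity)
    true_neg = (1 - prevalence) * specificity
    ppv = true_pos / (true_pos + false_pos)   # positive predictive value
    npv = true_neg / (true_neg + false_neg)   # negative predictive value
    return ppv, npv

# Hypothetical test: 95% sensitivity and 95% specificity.
for prevalence in (0.01, 0.50, 0.90):
    ppv, npv = predictive_values(prevalence, 0.95, 0.95)
    print(f"prevalence {prevalence:.0%}: "
          f"P(infected | positive) = {ppv:.1%}, "
          f"P(not infected | negative) = {npv:.1%}")
```

Under these assumptions, when 1% of the population is infected a positive result implies only about a 16% chance of infection, because false positives among the many uninfected people outnumber true positives. When 90% are infected, a negative result is correct only about 68% of the time.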

Is your sample biased?

“While a truly random sample can give precise information about the whole population, bias can arise if some people are more likely to be included than others,” explains Niles-Weed. “For example, if a research team performs antibody tests on a random set of people walking down a city street, they will invariably miss those too sick to leave their beds. Data collected in this way can fail to be representative when extended to the whole population.” 
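The street-survey example can be sketched as a simulation. Everything in the block below is an assumption made for illustration: a population of 100,000, a true antibody rate of 10%, and 30% of people with antibodies being too sick to be outdoors when the survey team passes by.

```python
import random

def simulate_street_survey(population=100_000, true_rate=0.10,
                           housebound_rate=0.30, seed=1):
    """Compare the true antibody rate with the rate seen on the street."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    everyone = []
    on_street = []
    for _ in range(population):
        has_antibodies = rng.random() < true_rate
        # Only people who were infected can be housebound in this toy model.
        housebound = has_antibodies and rng.random() < housebound_rate
        everyone.append(has_antibodies)
        if not housebound:
            on_street.append(has_antibodies)
    return sum(everyone) / len(everyone), sum(on_street) / len(on_street)

actual, observed = simulate_street_survey()
print(f"true antibody rate:      {actual:.1%}")
print(f"rate among pedestrians:  {observed:.1%}")
```

Because the people missing from the street are disproportionately those who were infected, the survey underestimates the true rate no matter how many pedestrians are tested. A larger sample shrinks random error but does not remove this kind of bias.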

What information is missing?

“Many claims are factually correct but misleading due to crucial missing information,” says Wilson. “For example, it may be correct to say a majority of confirmed cases in a region are Asian, but if only a very small number of people had tested positive, that may not be a meaningful finding. Similarly, there are many correlations that can easily be explained away by missing causal factors. It was reported at one time that healthcare workers in New York have a slightly lower incidence of coronavirus than the general population. Does that mean social distancing is ineffective, since these workers will be more exposed to infected people? If we condition on the fact that healthcare workers are trained to be vigilant in mask wearing, hand-washing, distancing, and sanitization, it likely means the exact opposite!”

