George Rebane
Every social issue draws relevance ONLY through its numbers; if you don’t understand them, you cannot participate in a reasoned discussion of the issue.
Today we are concerned with two types of C19 tests and testing policies. Tests fall into the ‘detect presence of infection’ variety and the ‘detect presence of immunity’ variety – I’ll call them infection and immunity tests respectively. Testing on an individual basis is supposed to allow the healthcare professional to decide whether you need further attention (drawing down on limited healthcare resources). Comprehensive testing of a sample drawn from a target population is done to inform response policy makers about the ultimate levels of infection and mortality, and the time needed to reach each level, which in turn guides what kind of policy to implement; it then also serves to monitor how well the implemented policy is working.
The infection tests seek to identify the guilty pathogen (today the C19 virus) or the presence of one of its bio-proxies, and the immunity test is used to identify the presence of antibodies (more here) which the body has produced after being infected, and which are now providing immunity to the individual. The problem with interpreting the results of such tests is that they are inherently unreliable (used in the technical sense) – their sensitivities and specificities are less than perfect. We recall that sensitivity is the probability of testing positive, given that you are infected – that is P(TP|V). And specificity is the probability that you test negative, given that you are not infected – that is P(¬TP|¬V) = 1-P(TP|¬V). Here we recall that TP means test positive, and V denotes the presence of the virus, its absence being denoted by ¬V. Good tests enjoy high values for both sensitivity and specificity. I have presented an extensive discussion of the impact of such a test administered to an individual in ‘Testing’s Tower of Babel’.
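The individual-test arithmetic discussed in ‘Testing’s Tower of Babel’ can be sketched with Bayes’ rule. The prevalence and reliability numbers below are illustrative assumptions, not figures from that post:

```python
def posterior_infected(prior, sensitivity, specificity):
    """P(V|TP): probability of infection given a positive test, via Bayes' rule."""
    # P(TP) = P(TP|V)P(V) + P(TP|not V)P(not V)
    p_tp = sensitivity * prior + (1 - specificity) * (1 - prior)
    return sensitivity * prior / p_tp

# With 2% prevalence and a 95%-sensitive, 95%-specific test, a positive
# result still leaves the chance of actual infection well under one in three.
p = posterior_infected(prior=0.02, sensitivity=0.95, specificity=0.95)
```

This is why a ‘good’ test can still mislead an individual when the disease is rare: most positives come from the large uninfected pool.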
In this commentary I want to interest the reader in how to understand and evaluate unreliable testing of a target population for the purpose of determining the fraction of that population that is infected or already immune. Since testing the whole population is almost always impractical, that fraction is determined through comprehensive testing of a random sample of people drawn from the target population. A drawn sample is ‘random’ only when each member of the target population has an equal chance (probability) of being included in the sample. (Target populations segmented according to a set of attributes may be randomized by ensuring that the known segment shares are represented in the sample according to each segment’s share of the overall target population.)
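A minimal sketch of such a random draw, assuming a hypothetical population of 100,000 with a 3% infected fraction (both numbers invented for illustration):

```python
import random

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical target population: 3,000 infected (1) among 100,000 total (0 = not infected).
population = [1] * 3000 + [0] * 97000

# random.sample gives every member an equal chance of inclusion --
# the defining property of a random sample.
sample = random.sample(population, 1000)
estimated_fraction = sum(sample) / len(sample)
```

With a perfectly reliable test, the sample fraction would be an unbiased estimate of the 3% population fraction; the unreliable-test case is taken up below.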
For any given population, curves that indicate the current cumulative number of recoveries and deaths as a function of time are called sigmoids. A sigmoid starts at a designated low initial value (usually zero), increases at an ever faster pace (slope), and then begins rolling over at a decreasing pace (slope) until it flattens out at its saturation level. For epidemics, and as shown above in green (recovered) and black (deceased), such saturation levels are almost always less than the total number in the target population. A complete sigmoid for, say, the increasing number of deaths is highlighted by the red curve below.
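The shape just described can be written down as a logistic function; the saturation level, midpoint, and rate below are arbitrary illustrative values:

```python
import math

def sigmoid(week, saturation, midpoint, rate):
    """Logistic sigmoid: starts near zero, steepens, rolls over,
    and flattens out at `saturation`."""
    return saturation / (1 + math.exp(-rate * (week - midpoint)))

# Cumulative deaths over 30 weeks for a process saturating at 10,000.
deaths = [sigmoid(w, saturation=10000, midpoint=12, rate=0.5) for w in range(31)]
```

Note that the saturation level is a parameter, not the population size, which matches the observation above that epidemics saturate well below the total population.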
The correct method of calculating such population fractions is a bit more complex and takes into account the reliability numbers connected with the test. Doing it right is not that hard to understand, and I explain it all in a technical note that you can download here - Download TN2004-2. Then you can understand what all this comprehensive testing hubbub is about, and understand not only the numbers and their accuracy, but also whether they were calculated correctly. Now that ‘millions’ of these tests will become available, comprehensive testing will be done and their results reported. (There’s also a very good chance that reports of such testing will be utterly screwed up, and, of course, painted with a politicized agenda.)
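The technical note itself is the authoritative source; for orientation only, a standard correction of this kind (the Rogan-Gladen estimator, which may or may not be the exact method in TN2004-2) looks like this:

```python
def corrected_fraction(observed_rate, sensitivity, specificity):
    """Back out the true infected fraction p from the raw positive rate.
    Since observed = sens*p + (1-spec)*(1-p), solving for p gives:
    p = (observed + spec - 1) / (sens + spec - 1)."""
    return (observed_rate + specificity - 1) / (sensitivity + specificity - 1)

# If 12% of the sample tests positive on a 90%-sensitive, 95%-specific test,
# the corrected population fraction is noticeably below the raw 12%.
p = corrected_fraction(0.12, sensitivity=0.90, specificity=0.95)
```

The key point is that the raw positive rate mixes true positives with false positives, so reporting it uncorrected misstates the population fraction.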
The blue jiggly line in the above figure indicates the noisy values of the fraction as obtained through testing a random sample periodically drawn from the target population. It is only from the partial and noisy blue data that we must be able to estimate the true population fraction, and then use it to project when and at what level it will stabilize (see figure below); but in the real world it can get even more complicated. More about that below.
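A sketch of how such a jiggly line arises, assuming a hypothetical underlying sigmoid and weekly samples of 500 people (all parameters invented for illustration):

```python
import math
import random

random.seed(1)  # fixed seed for repeatability

def true_fraction(week):
    """Unobserved infected fraction: a logistic sigmoid with made-up parameters."""
    return 0.08 / (1 + math.exp(-0.6 * (week - 10)))

# Weekly noisy estimates from 500-person samples: the jiggly blue line.
noisy = []
for week in range(25):
    positives = sum(random.random() < true_fraction(week) for _ in range(500))
    noisy.append(positives / 500)

# A crude 3-week moving average as a first pass at recovering the smooth curve.
smoothed = [sum(noisy[i - 1:i + 2]) / 3 for i in range(1, len(noisy) - 1)]
```

Real estimation would fit the sigmoid's parameters to the noisy points rather than merely smoothing them, but the sampling noise itself is the obstacle either way.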
Above we see a figure showing the developing sigmoids of C19 deaths experienced in several developed countries. These are the kind of real-world data from which we want to predict the progress of the disease to support public response policies that will affect both population health and the economy. Considering how these data were developed and obtained, we note that what we have is the sequential sum of possibly multiple sigmoids operating in sequence, each reflecting a change in the target population's behavior in response to changing public policies and/or its own reaction to such policies. In other words, the observed green process is not static, but is instead dynamic, switching from one sigmoid to another over time. We show such a dynamic process in the figure below.
Above we see the different responses (in red) to three distinct policies. For explication purposes let’s say we are attempting to minimize total deaths. We adopt the first policy, represented by the uppermost red sigmoid, whose effects we don’t yet know, but about which we begin to gather meaningful data around week five (W5). The politicians put the technical mavens to work, and using the noisy mortality data from the target population, they estimate the topmost red sigmoid. The politicians ponder and conclude that NFW will we tolerate such a high number of total deaths – ‘we gotta do something’. So at W8 they implement a new policy that switches the ‘mortality regime’ to a more acceptable level indicated by the middle red sigmoid. The techies do their mathematical magic and predict the new lower level of deaths that will be achieved somewhere around W17. Still too high, and the politicos are not satisfied, so at W10 they really turn the screws with a tough hunker-down policy. The noisy data start coming in and the techies tell the politicos that a revised, lower level of total deaths will come to pass at around W30. The new policy has flattened the curve and stretched it out to a later date.
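The three-regime story can be sketched as a simple logistic-growth recursion in which each policy change lowers both the growth rate and the effective ceiling on total deaths. Every number below (rates, ceilings, the W8 and W10 switch weeks) is an illustrative assumption:

```python
def policy(week):
    """(growth rate, effective ceiling of total deaths) for the regime in force."""
    if week < 8:
        return 0.9, 60000    # initial do-nothing regime
    elif week < 10:
        return 0.5, 30000    # first restrictions imposed at W8
    else:
        return 0.25, 12000   # strict hunker-down at W10

def simulate(weeks=40, start=10.0):
    """Cumulative deaths under a regime-switching logistic growth process:
    d' = d + r*d*(1 - d/ceiling), with (r, ceiling) set by current policy."""
    deaths, d = [], start
    for week in range(weeks):
        r, ceiling = policy(week)
        d += r * d * (1 - d / ceiling)
        deaths.append(d)
    return deaths

deaths = simulate()
```

The result is a single flattened curve that stretches out toward a later, lower saturation level, mirroring the regime switches described above.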
What we have seen here is the transition through three policy regimes to finally arrive at a level of deaths that the political leaders consider acceptable, given all the public pushback to more stringent lockdown policies. But we have to remember that for this kind of political decision making, the techies had to work with the developing noisy green data to deduce the predicted smooth blue line that captures the sigmoid of the underlying process imposed by the sequence of new policies. It’s a complex and difficult task to get the parameters of the new blue sigmoid nailed down and communicated to the non-techie politicians.
The overarching problem here involves the intellectual competence of the decision makers. How well do they understand the above-described concepts, which so far have involved no mathematics whatsoever? Unfortunately, we live in a world where vanishingly few decision makers will be able to grasp any of this. And I fear that not many RR readers will have suffered through this simplistic ding-dong school on testing for population fractions and using these fractions to predict the effect of coronavirus response policies. But that doesn’t stop anyone from having twenty opinions on what is being done, what can be done, and what should be done when it comes to testing. Testing, testing, and more testing is the cry across the land.
I suppose that life is just one damn sigmoid after another (h/t to Toynbee). Plus, for extra special magic, you get the added inaccuracies of determining the actual cause of death, virus evolution, and reporting standards that change for political and financial reasons. Yee haw.
I can see the point of epidemiologists' math in an attempt to divine the future or to test policy but after poking through a few papers (praise be to sci-hub) a lot of it strikes me as a combination of fitting data to rule-of-thumb functions and what Taleb calls 'citation rings'. No doubt simulations can produce equally poor results depending on initial conditions and the vagaries of non-linear systems.
Given the politics of the thing, the temptation (at least these days) will tend to favor immediate perceived danger and require overkill in policy. If you are going to invade Normandy beaches, take along 3x what you think you'll need and pray that Das Reich gets caught on the back roads.
The upshot? Thousands of politicians and business executives will just wing it anyway. Analysis is just to keep the TV audience entertained. In the end, there'll be That Guy who correctly guessed the right inaccurate numbers who'll be hailed as a genius.
Posted by: scenes | 04 May 2020 at 08:07 AM
GR - I have read and re-read your "ding dong school on testing" and concur with scenes 08:07 that most policy makers are just winging it, politics and perception being their major inputs.
Posted by: Bob Hobert | 06 May 2020 at 08:24 AM
I just love the disagreements.
https://www.sciencetimes.com/articles/25410/20200421/austria-90-drop-coronavirus-cases-requiring-people-wear-face-masks.htm
My own take is to completely reopen the economy and to require masks outside of the house. But then I've thought so for a couple of months.
Posted by: scenes | 06 May 2020 at 09:56 AM