False Positive Paradox Calculator
Table of contents
- Specificity, sensitivity, and base rate
- The false positive paradox
- Using Bayes's theorem to calculate the false positive paradox
- The false positive paradox formula (positive predictive value)
- Misconceptions about the false positive paradox
- Two ways to overcome the false positive paradox
- FAQs

Medical testing is essential for diagnosing and treating diseases but is not 100% accurate, and, in fact, there's a little-known and counterintuitive phenomenon related to false positive results: the false positive paradox.
Apart from correctly detecting positive cases, a good test should produce as few false positive results as possible, as they can cause undue anxiety and distress for the patient and lead to unnecessary further medical interventions.
The false positive paradox is the most common example of the base rate fallacy: people's tendency to ignore the base rate (e.g., prevalence) in favor of information pertaining only to a specific case (e.g., sensitivity and specificity).
Keep reading this article, and discover why "accurate" can be highly inaccurate.
Specificity, sensitivity, and base rate
To understand the false positive paradox, we must be aware of the following terminology:
- Sensitivity measures the proportion of positive cases correctly identified by a test. Mathematically, it is the ratio of true positive results to the total number of positive cases (i.e., true positives plus false negatives):

  Sensitivity = TP / (TP + FN)
- Specificity measures the proportion of negative cases correctly identified by the test. We calculate it as the ratio of true negative results to the total number of negative cases (i.e., true negatives plus false positives):

  Specificity = TN / (TN + FP)
- The base rate is the proportion of individuals in a population with a particular condition at a given time. In epidemiology, the base rate is the same as the prevalence, which is the proportion of a population affected by a medical condition or disease.
🙋 In the sensitivity and specificity calculator, we dig deeper into these concepts.
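To make these definitions concrete, here's a minimal Python sketch that computes both metrics from a hypothetical confusion matrix (the counts are made up for illustration):

```python
def sensitivity(tp: int, fn: int) -> float:
    """Proportion of actual positive cases the test correctly identifies."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """Proportion of actual negative cases the test correctly identifies."""
    return tn / (tn + fp)


# Hypothetical confusion-matrix counts, for illustration only.
tp, fn = 95, 5        # 100 people who have the condition
tn, fp = 9801, 99     # 9,900 people who don't

print(f"Sensitivity: {sensitivity(tp, fn):.2%}")  # 95.00%
print(f"Specificity: {specificity(tn, fp):.2%}")  # 99.00%
```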
The false positive paradox
The false positive paradox occurs when a large proportion of the positive test results are, in fact, false positives, even if the test has high specificity and sensitivity. This paradox arises when the base rate of the condition is low, as happens with rare diseases that affect a tiny percentage of the population.
Visit the false positive calculator to learn more about false positives.
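To see the paradox in action, here's a minimal sketch of the counting argument, using assumed numbers: a disease with 0.1% prevalence in a population of 100,000, tested with 99% sensitivity and 99% specificity.

```python
population = 100_000
base_rate = 0.001     # 0.1% prevalence (assumed for illustration)
sensitivity = 0.99
specificity = 0.99

sick = population * base_rate                   # 100 people
healthy = population - sick                     # 99,900 people

true_positives = sick * sensitivity             # ≈ 99
false_positives = healthy * (1 - specificity)   # ≈ 999

ppv = true_positives / (true_positives + false_positives)
print(f"True positives:  {true_positives:.0f}")
print(f"False positives: {false_positives:.0f}")
print(f"Chance a positive result is real: {ppv:.1%}")  # ≈ 9.0%
```

Even though the test is "99% accurate", only about 9% of the people who test positive in this scenario actually have the disease.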
Using Bayes's theorem to calculate the false positive paradox
Yes, many positive test results can be false positives, but how do we know the exact proportion? Let's apply the knowledge gained with our Bayes theorem calculator to this problem.
Let A be the event of having the condition (being positive) and B be the event of testing positive. Then, the probability of having a condition given a positive test result is:

P(A|B) = P(B|A) × P(A) / P(B)
where:
- P(A) — Base rate of the condition in the population;
- P(B) — Probability of testing positive; and
- P(B|A) — Probability of testing positive, given that you actually have the condition. In other words, the sensitivity of the test.
We usually know the base rate and sensitivity, but P(B) is not readily available and will depend on other probabilities. Using the law of total probability, we can express P(B) as:

P(B) = P(B|A) × P(A) + P(B|A') × P(A')
where:
- P(B|A') — Probability of testing positive given the actual condition is negative, also known as the false positive rate of the test; and
- P(A') = 1 - P(A) — Complement of the base rate, which is the probability of not having the condition.
Now, we can combine the two equations above and obtain a more explicit version of P(A|B):

P(A|B) = P(B|A) × P(A) / [P(B|A) × P(A) + P(B|A') × P(A')]
Let's relate the above terms to the base rate, sensitivity, and specificity parameters. But first, let's remember (writing BR, SE, and SP for the base rate, sensitivity, and specificity):
- P(A) = BR;
- P(B|A) = SE;
- P(B|A') = 1 - SP (as P(B'|A') is the specificity); and
- P(A') = 1 - BR.
Let's plug all that into the previous equation to finally obtain the probability of having a condition given a positive test result, in terms of base rate (BR), specificity (SP), and sensitivity (SE):

P(A|B) = SE × BR / [SE × BR + (1 - SP) × (1 - BR)]
The probability of having a condition given a positive test result, that is, P(A|B), is also called the positive predictive value.
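Translating the derivation into code, here's a minimal Python sketch of the same computation (the function name is my own, not part of any library):

```python
def positive_predictive_value(sensitivity: float, specificity: float, base_rate: float) -> float:
    """P(A|B): probability of having the condition given a positive test result."""
    # P(B): total probability of testing positive, by the law of total probability.
    p_positive = sensitivity * base_rate + (1 - specificity) * (1 - base_rate)
    # Bayes' theorem: P(A|B) = P(B|A) * P(A) / P(B).
    return sensitivity * base_rate / p_positive


# A 99%-sensitive, 99%-specific test for a condition with 0.1% prevalence.
print(f"{positive_predictive_value(0.99, 0.99, 0.001):.1%}")  # ≈ 9.0%
```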
The false positive paradox formula (positive predictive value)
If you're a TL;DR lover, don't worry. The summary from the previous section is the following formula:

PPV = SE × BR / [SE × BR + (1 - SP) × (1 - BR)]
where:
- — Positive predictive value;
- — Sensitivity of the test;
- — Specificity of the test; and
- — Base rate of the condition, known as "prevalence" in epidemiology.
The PPV is the formal name for the quantity we've been discussing throughout this article: the probability of having a condition given a positive test result.
Misconceptions about the false positive paradox
Even if we're aware of this paradox, we can still fall victim to some misconceptions associated with its counterintuitive nature. Here are some of them:
- "A positive test result is more likely to be incorrect than correct." This affirmation is not necessarily true, as the false positive paradox arises given a low prior probability of the event to test. This low prior probability can increase as we get more information about the actual condition.
💡 "The probability of an event depends on our state of knowledge (information) and not on the state of the real world. Corollary: There is no such thing as the "intrinsic" probability of an event." (
)-
"We can resolve the false positive paradox by increasing the sample size." As we saw in the previous section, the false positive paradox (P(A|B) equation) doesn't depend on the sample size but on the base rate, specificity, and sensitivity. If we increase the sample size, there will be more true positives but also more false positives, and the TP/(TP + FP) proportion will remain the same. The following section explains how to modify the sample to overcome this paradox.
- "Increasing the sensitivity will significantly attenuate the paradox's effects." A higher sensitivity will increase the ability to detect positives correctly, but it isn't enough to solve the main issue: the tremendous number of false positives arising from the imperfect specificity and the enormous number of negative cases.
Two ways to overcome the false positive paradox
In the previous section, we saw that the sample size and sensitivity have little effect on the false positive paradox, but what can we do? Let's find out.
Increase the specificity
Increasing the specificity will attenuate the FP paradox effects, as it'll decrease the number of false positives without affecting the number of true positives.
The easiest way to increase the specificity is by switching to a test with higher specificity. If that's not possible, you can dig deeper into the biological and technical causes behind that specificity error and try to modify them in your favor. Some examples for HIV tests:
- Technical issues include specimen mix-up, mislabeling, improper handling, and misinterpretation of a visually read rapid test result.
- Biological causes include participation in an HIV vaccine study, autoimmune disorders, and other medical conditions that generate HIV-like antibodies.
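To get a feel for how much the specificity matters, here's a quick sketch sweeping the specificity while keeping the assumed 0.1% prevalence and 99% sensitivity from the earlier examples:

```python
def ppv(sensitivity: float, specificity: float, base_rate: float) -> float:
    # Same PPV formula as in the earlier sketches.
    return sensitivity * base_rate / (
        sensitivity * base_rate + (1 - specificity) * (1 - base_rate)
    )


for spec in (0.99, 0.999, 0.9999):
    print(f"Specificity {spec:.2%}: PPV = {ppv(0.99, spec, 0.001):.1%}")
# Specificity 99.00%: PPV ≈ 9.0%
# Specificity 99.90%: PPV ≈ 49.8%
# Specificity 99.99%: PPV ≈ 90.8%
```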
Improve your sampling
The false positive paradox assumes the prevalence is the only prior probability we know. We can modify this by improving our sample and only testing those "suspicious" cases.
If you carefully select your sample to mainly include people with a higher prior probability of having the condition, you can increase the test's positive predictive value and decrease the likelihood that a positive result is a false positive.
For example, suppose you are testing for a rare disease that only affects 1% of the population and select a sample of individuals with a family history of the disease. In that case, the prior probability of having the disease in this sample may be much higher than 1%.
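Numerically, here's what that better sampling can do, using an assumed 95%-sensitive, 95%-specific test and an illustrative jump of the prior from 1% to 20%:

```python
def ppv(sensitivity: float, specificity: float, base_rate: float) -> float:
    # Same PPV formula as before.
    return sensitivity * base_rate / (
        sensitivity * base_rate + (1 - specificity) * (1 - base_rate)
    )


# Assumed test: 95% sensitivity, 95% specificity.
print(f"General population (1% prior):     PPV = {ppv(0.95, 0.95, 0.01):.1%}")  # ≈ 16.1%
print(f"Family-history sample (20% prior): PPV = {ppv(0.95, 0.95, 0.20):.1%}")  # ≈ 82.6%
```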
What are the chances of a false positive HIV test?
If you live in the US and get a positive HIV test result, the chance of it being a false positive is about 63%, assuming the US prevalence is the only information about your HIV risk. The chance of a false positive will be lower (i.e., the positive result is more likely to be real) if you belong to a high-risk group or know you've been exposed to the virus. As always, follow the guidelines of a health professional.
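As a rough sanity check of that figure, here's how a number in the same ballpark comes out of the PPV formula; the prevalence, sensitivity, and specificity values below are illustrative assumptions, not official statistics:

```python
def ppv(sensitivity: float, specificity: float, base_rate: float) -> float:
    return sensitivity * base_rate / (
        sensitivity * base_rate + (1 - specificity) * (1 - base_rate)
    )


# Illustrative assumptions: ~0.36% prevalence, 99.7% sensitivity, 99.4% specificity.
false_positive_chance = 1 - ppv(0.997, 0.994, 0.0036)
print(f"Chance a positive result is a false positive: {false_positive_chance:.1%}")  # ≈ 62.5%
```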
How to calculate sensitivity and specificity?
To calculate sensitivity and specificity:
- Identify the number of true positive (TP), true negative (TN), false positive (FP), and false negative (FN) cases.
- Calculate the sensitivity by dividing the number of true positives by the sum of the true positive and false negative cases:
  Sensitivity = TP / (TP + FN)
- Calculate the specificity by dividing the number of true negatives by the sum of the true negative and false positive cases:
  Specificity = TN / (FP + TN)
How to calculate the positive predictive value (PPV)?
There are two ways to calculate the PPV:
- Dividing the number of true positives by the total number of positive test results (the sum of the false and true positives):
  PPV = TP / (TP + FP)
- If you know the sensitivity (SE), specificity (SP), and base rate (BR, aka prevalence), use the formula:
  PPV = SE × BR / [SE × BR + (1 - SP) × (1 - BR)]
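As a quick cross-check, here's a small Python sketch (with made-up confusion-matrix counts) showing that both routes give the same number:

```python
# Made-up counts for a population of 100,000 with 1% prevalence,
# tested with 90% sensitivity and 95% specificity.
tp, fn = 900, 100        # 1,000 people with the condition
tn, fp = 94_050, 4_950   # 99,000 people without it

# Method 1: directly from the counts.
ppv_counts = tp / (tp + fp)

# Method 2: from sensitivity, specificity, and base rate.
se = tp / (tp + fn)                    # 0.90
sp = tn / (tn + fp)                    # 0.95
br = (tp + fn) / (tp + fn + tn + fp)   # 0.01
ppv_formula = se * br / (se * br + (1 - sp) * (1 - br))

print(f"{ppv_counts:.4f} vs {ppv_formula:.4f}")  # both ≈ 0.1538
```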
Which is an example of base rate fallacy?
The most common example of the base rate fallacy is the false positive paradox, in which, even if the test's accuracy is high, the proportion of false positives among the positive results will be high because of the condition's low prevalence (base rate).