Sunday, July 18, 2010

Success and Failure in Medical Testing

There is nothing quite as unsettling or at times scary as getting back a medical test result that suggests that you have a particular disease or one of your bodily parts is not functioning correctly. Your despair is undoubtedly deepened when your doctor inevitably tells you that the test that has uncovered your sad predicament is in fact very accurate. “97% accurate,” the doctor might say.

Such bad news is faced by people every day. Yet a little understanding of probability can restore some hope to the situation. This is particularly the case for tests for rare diseases, or for diseases to which you have no history of potential exposure. So how do you figure out what your real risk is?

The first thing to notice is that the doctor is quoting how reliably the test avoids errors such as a false positive, namely the test indicating that you have the disease when in fact you don’t. What is not being quoted is the conditional probability that you have the disease, given that you have a positive test result. This might sound like the same thing, but in fact the two probabilities are very different.

To make further headway into the issue, it is important to know how common the stated disease is. Let’s take as an example testing for cancer. The following example uses made-up assumptions to demonstrate the point, and does not represent actual risks.

Say the cancer test says that you do not have cancer 97% of the time when in fact you do not, and says that you do have cancer 99% of the time when in fact you do. The first measure (97%) is called the “specificity” of the test. The second measure (99%) is called the “sensitivity” of the test.

Now assume that two in a hundred people of test-taking age do in fact have cancer. Armed with this information, we can now find the solution with the help of a magic probability box. The box crosses having and not having cancer with the two possible results of the test.

We start by multiplying the proportion of the population having and not having cancer by the matching test rates: the sensitivity (99%) and its complement (1%) for those with cancer, and the specificity (97%) and its complement (3%) for those without.

                        Have Cancer    No Cancer    Total
Test says “No Cancer”   2% x 1%        98% x 97%
Test says “Cancer”      2% x 99%       98% x 3%

Next we calculate the results and add up the rows and columns.

                        Have Cancer    No Cancer    Total
Test says “No Cancer”   0.02%          95.06%       95.08%
Test says “Cancer”      1.98%          2.94%        4.92%
Total                   2.00%          98.00%       100.00%

Note that the bottom row shows the proportion of the population having and not having cancer, and that the row totals and the column totals each add up to 100%. That is, the four calculated cells add up to 100%, and these four cells represent the entire universe of “event” outcomes.
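The box can also be filled in with a few lines of code. The following is a minimal sketch in Python using the made-up numbers from this example; the function name `joint_table` and the dictionary layout are illustrative choices, not part of any standard library.

```python
def joint_table(prevalence, sensitivity, specificity):
    """Return the four cells of the probability box as joint probabilities."""
    return {
        ("test says no cancer", "have cancer"): prevalence * (1 - sensitivity),
        ("test says no cancer", "no cancer"): (1 - prevalence) * specificity,
        ("test says cancer", "have cancer"): prevalence * sensitivity,
        ("test says cancer", "no cancer"): (1 - prevalence) * (1 - specificity),
    }


# Made-up assumptions from the example: 2% prevalence,
# 99% sensitivity, 97% specificity.
table = joint_table(prevalence=0.02, sensitivity=0.99, specificity=0.97)

for (result, state), probability in table.items():
    print(f"{result:20s} | {state:12s} | {probability:.2%}")

# The four cells cover every possible outcome, so they sum to 100%.
print(f"sum of all cells: {sum(table.values()):.2%}")
```

Changing the three inputs lets you rebuild the box for any other test, as long as you have a reasonable estimate of how common the disease is.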

Finally, we are ready to answer the question: if the test says cancer, do you in fact have cancer? The probability of this is

= Prob(having cancer and the test says you have cancer) / Prob(test says you have cancer)

= 1.98% / 4.92%

= 40.24%

Now, what is the probability that you do not have cancer if the test says cancer? The probability of this is

= Prob(not having cancer and the test says you have cancer) / Prob(test says you have cancer)

= 2.94% / 4.92%

= 59.76%
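The two divisions above are an application of Bayes' rule. As a rough sketch, again in Python and again with the made-up numbers from this example, the calculation can be wrapped in one small function. The name `positive_predictive_value` is the usual statistical term for this probability, but the function itself is only an illustration.

```python
def positive_predictive_value(prevalence, sensitivity, specificity):
    """Probability of having the disease, given a positive test result."""
    # Joint probability: have the disease AND test positive.
    true_positive = prevalence * sensitivity
    # Joint probability: do not have the disease AND test positive.
    false_positive = (1 - prevalence) * (1 - specificity)
    # Divide by the total probability of a positive test.
    return true_positive / (true_positive + false_positive)


# Made-up assumptions from the example.
ppv = positive_predictive_value(prevalence=0.02, sensitivity=0.99, specificity=0.97)

print(f"Prob(cancer, given test says cancer)    = {ppv:.2%}")
print(f"Prob(no cancer, given test says cancer) = {1 - ppv:.2%}")
```

The two printed probabilities always add up to 100%, because once the test says cancer you either have it or you do not.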

We are left with a remarkably counter-intuitive result: you are more likely not to have cancer than to have it, despite the so-called “accuracy” (sensitivity and specificity) of the test appearing to be quite high, namely 99% and 97%. Why does this happen? It occurs because the incidence of the disease in the population (2%) is lower than the false positive rate of the test (3%). The test is more likely to be wrong than you are to have cancer.
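To see how much the answer depends on how common the disease is, here is a small sketch that repeats the same calculation for several prevalences. Only the 2% figure comes from the example above; the other prevalences are hypothetical values picked purely to show the trend, and the function is an illustration rather than a clinical tool.

```python
def positive_predictive_value(prevalence, sensitivity=0.99, specificity=0.97):
    """Probability of having the disease, given a positive test result."""
    true_positive = prevalence * sensitivity
    false_positive = (1 - prevalence) * (1 - specificity)
    return true_positive / (true_positive + false_positive)


# Hypothetical prevalences, chosen only to show how the result moves.
prevalences = (0.001, 0.02, 0.03, 0.10, 0.50)

for prevalence in prevalences:
    ppv = positive_predictive_value(prevalence)
    print(f"prevalence {prevalence:6.1%} -> Prob(disease, given positive test) = {ppv:.1%}")
```

With this made-up test, a positive result only becomes more likely right than wrong once the prevalence climbs to roughly the test's false positive rate of 3%.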

The implications of this simple example are profound for how we conduct our lives. There is of course the unnecessary worry and concern that the results of medical tests produce. More problematic is when conducting the test has a cost, particularly where that cost is not just monetary but physical.

Take for instance the test for Down Syndrome using an amniocentesis. An amniocentesis collects amniotic fluid, the fluid in the womb, and tests the fetal cells for Down Syndrome. The test is done in the 16th to 18th week of pregnancy. The test increases the risk of miscarriage by about 0.75%. The test is often recommended for pregnant women 35 years and older. The risk of Down increases with age and is about 1/250 at the time of the test for a 35-year-old and 1/20 for a 45-year-old.

Let’s assume we have 250 women of the same age and risk class taking the amniocentesis and we want to find out the probability that the test will be successful. The question is how to define success. This might be as much a qualitative as a quantitative issue, but one way to do it might be:

  • Success = the test discovers Down and Down exists
  • Failure = the pregnancy does not have Down but ends in a miscarriage as a result of the test
  • Neutral = other outcomes are deemed to be “neutral” and we take the opinion that these are neither good nor bad. This category relates to pregnant women who do not have a Down pregnancy and do not have a miscarriage from the test.

To simplify matters, we assume that the sensitivity and specificity of the test are both 100%. Now we create a new magic probability box and look at the risks for a 35 year old, making the following assumptions:

  • probability of a miscarriage from test for all pregnancies: 0.75%
  • probability of Down in the population of 35-year-old pregnant women at time of test: 1/250

           Miscarriage from Test    No Miscarriage from Test    Total
Down       0.75% x 1/250            99.25% x 1/250              1/250
No Down    0.75% x 249/250          99.25% x 249/250            249/250

Completing the table gives:

           Miscarriage from Test    No Miscarriage from Test    Total
Down       0.00003                  0.00397                     0.004
No Down    0.00747                  0.98853                     0.996
Total      0.0075                   0.9925                      1.000
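The completed box can be reproduced with a short Python sketch. It makes the same simplifying assumption as the table, namely that the chance of a miscarriage from the test is the same whether or not the pregnancy has Down, so the cells are simple products. The variable names are illustrative only.

```python
# Assumptions stated in the text for a 35-year-old.
p_miscarriage = 0.0075  # extra risk of miscarriage from the test
p_down = 1 / 250        # risk of Down at the time of the test

# Each cell is (row probability) x (column probability), which treats
# a test-related miscarriage as independent of Down status.
box = {
    ("Down", "miscarriage"): p_down * p_miscarriage,
    ("Down", "no miscarriage"): p_down * (1 - p_miscarriage),
    ("No Down", "miscarriage"): (1 - p_down) * p_miscarriage,
    ("No Down", "no miscarriage"): (1 - p_down) * (1 - p_miscarriage),
}

for (down_status, outcome), probability in box.items():
    print(f"{down_status:8s} | {outcome:15s} | {probability:.5f}")

print(f"total: {sum(box.values()):.3f}")
```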

For a group of 250 women, we have

  • Successes = 250*0.004 = 1 = Down discoveries
  • Failures = 250*0.00747 = 1.8675 = miscarriages for women without a Down fetus
  • Total successes and failures = 1 + 1.8675 = 2.8675

We note that there are more unnecessary miscarriages than Down discoveries!

We now seek the probability of “success” of the test, given that the test produces either a success or failure, according to our prior description.

  • Prob (success, given a success or failure in the test)
    • = 1 / (1+1.8675)
    • = 34.9%
  • Prob (Failure, given a success or failure in the test)
    • = 1.8675 / (1+1.8675)
    • = 65.1%

The result is quite dramatic if you consider that failures result in an unnecessary miscarriage, and leads to the question of whether the test produces more harm than good. To answer that question, probability calculations alone will not suffice.

For a 45-year-old, with a Down risk of 1/20, the same calculation gives a success probability of 87.5%.
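The success probability for both age groups can be checked with the sketch below. It uses the definitions of success and failure given earlier, keeps the assumption that the test is 100% sensitive and specific, and takes the 1/250 and 1/20 risks quoted above. The function name `success_probability` is made up for this illustration.

```python
def success_probability(p_down, p_miscarriage=0.0075):
    """Prob(success, given that the test ends in a success or a failure).

    Success: the pregnancy has Down (the test is assumed to always find it).
    Failure: the pregnancy does not have Down but the test causes a miscarriage.
    """
    successes = p_down
    failures = (1 - p_down) * p_miscarriage
    return successes / (successes + failures)


# Down risks at the time of the test, as quoted in the text.
risk_by_age = {35: 1 / 250, 45: 1 / 20}

for age, p_down in risk_by_age.items():
    p_success = success_probability(p_down)
    print(f"age {age}: success {p_success:.1%}, failure {1 - p_success:.1%}")
```

The group size of 250 does not appear in the function because it cancels out of the ratio; multiplying both successes and failures by 250 gives the same percentages.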

The advice most doctors give their patients on deciding whether to do the test is to quote the extra risk of miscarriage of 0.75%. This seems very small. However, when you translate this small extra risk to the conditional probability of success or failure, the matter becomes far from clear. The issues become even more complex when you make a value judgment of whether the benefit of a discovery of a Down pregnancy is a better or worse outcome than the causation of miscarriage.

DISCLAIMER: The above analysis does not recommend that you should not get tested for cancer (where I made up the probabilities above), or not have an amniocentesis. The above is just a simplified model of how you can think about these matters, and could be quite inappropriate for your circumstances. Individuals should discuss these matters with their doctors, who might be aware of many other factors. You should not rely only on simplified models when making life-altering decisions. Speak to your doctor!!
