Just as women experience a drop in estrogen during menopause, men might also experience a drop in testosterone and suffer symptoms akin to menopause as they approach middle age. Whereas menopause for women is unequivocal and represents a biological demarcation point from fertility to infertility, the effects of a hormonal downturn for men as they age are most likely subtle and certainly less dramatic than what women experience.
Despite the arguable status of male menopause, a number of authors have strongly suggested that such an event can occur, can be measured, and can be successfully ameliorated. As discussed in prior posts in this series, Morley and his colleagues first labelled male menopause as ADAM or Androgen Deficiency in Aging Males. In addition to describing ADAM as a clinical phenomenon, Morley also developed a 10 item questionnaire to measure the possible presence of androgen deficiency among middle-aged men.
To test the ADAM construct, Morley et al. administered their questionnaire to a sample of 316 Canadian physicians, who ranged in age from 40 to 82 years, and measured these physicians’ testosterone levels. To refresh your memory, here is the ADAM questionnaire and its scoring key:
Morley found that 25 percent of these physicians had bioavailable testosterone levels lower than 70 ng/dL. Using reference values from asymptomatic younger men, 70 ng/dL was deemed to demarcate low testosterone from normal testosterone. Morley and his colleagues found that positive scores on the ADAM questionnaire identified 88 percent of those men with low testosterone.
Let’s stop and consider this outcome. At first glance, identifying almost 90 percent of those men in this sample who had low testosterone based on a simple questionnaire seems quite impressive. The ADAM questionnaire is clearly very sensitive when it comes to identifying the presence of testosterone deficiency. At the same time, it is also very easy to confuse the sensitivity of a test with the accuracy of a test. We naturally think of these two terms as interchangeable. If you are told that a test is 90 percent sensitive to the presence of a disorder, you immediately consider that test to be very accurate. You, in fact, would be wrong but you would not be alone in your error.
For example, take a moment to type in “ADAM and testosterone” and do a Google search on these keywords. You will find innumerable websites, both medical and lay, attesting to the ability of the ADAM questionnaire to quickly and accurately detect the presence of testosterone deficiency. Now I certainly defend your right to stick whatever nostrum you like in your body if it helps you get from today to tomorrow. But do not delude yourself into thinking that these websites are concerned about your health or diagnostic accuracy. They are not.
They are concerned about selling you a product or a service — testosterone enhancement.
The reality is that the sensitivity of a test is only half the story. The other half is whether or not the test achieves its sensitivity through being over-liberal in its detection of the disorder of interest. For example, let’s say we create a new screening test for testosterone deficiency to compete with the ADAM questionnaire and we call it the Everybody Gets A Disorder (EGAD) questionnaire. Here is the test and its scoring key:
Now, to test our questionnaire we give it to another imaginary sample of 316 middle-aged and elderly Canadian physicians and measure those physicians’ testosterone levels. Again we find that 25 percent of the physicians have low levels of testosterone. And we find that our new screening measure, the EGAD questionnaire, outperforms the ADAM questionnaire and is 100 percent sensitive. Our new questionnaire was able to detect all of the physicians who had low levels of testosterone. Amazing!
So, you get the point. Sensitivity is nice but, by itself, not very instructive. We need to consider not only those whom the test correctly identifies as having low testosterone but also those whom the test incorrectly identifies as having low testosterone. This type of error — suggesting that someone has a disorder when they do not — is called a false positive. Most diagnostic tests will report both sensitivity and specificity. Specificity is directly related to the false positive rate. To be exact, one minus the specificity rate gives you the false positive rate. When a diagnostic test has good specificity, false positives are few. When specificity is low, false positives are common.
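To make these definitions concrete, here is a minimal Python sketch that computes sensitivity, specificity, and the false positive rate from raw counts. It rests on two assumptions: that the EGAD questionnaire flags every respondent as positive (its scoring key is not reproduced in this text, so that rule is inferred from the name), and that 25 percent of 316 rounds to 79 men with low testosterone. The function and variable names are purely illustrative.

```python
def rates(true_pos, false_neg, false_pos, true_neg):
    """Return (sensitivity, specificity, false positive rate) from counts."""
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    return sensitivity, specificity, 1 - specificity

# Assumed sample: 316 physicians, about 25 percent (79) with low testosterone.
n_total = 316
n_low = 79
n_normal = n_total - n_low  # 237 men with normal testosterone

# Assumption: EGAD calls everyone positive, so nobody is ever screened out.
egad = rates(true_pos=n_low, false_neg=0, false_pos=n_normal, true_neg=0)
print(egad)  # (1.0, 0.0, 1.0)
```

Perfect sensitivity here is bought entirely with false positives: every one of the 237 men with normal testosterone is flagged as well.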
Morley and colleagues noted that the ADAM questionnaire demonstrated a specificity of 60 percent. Not great but not completely horrible. It is fairly straightforward to determine how often the ADAM questionnaire was correct in its prediction of low testosterone based on its sensitivity, specificity, and base rate (the proportion of people in the study who actually had low testosterone based on blood testing). The rate at which the ADAM positively predicts those with low testosterone is only 42 percent. Or put another way, only four out of ten people whom the ADAM questionnaire predicted would have low testosterone actually did have low testosterone upon follow-up blood testing.
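For readers who want to check the arithmetic, the 42 percent figure follows from Bayes' rule applied to the three numbers above. The sketch below is one way to write it out in Python; the function name is my own and not something taken from Morley's paper.

```python
def positive_predictive_value(sensitivity, specificity, base_rate):
    """Share of positive test results that are true positives (Bayes' rule)."""
    true_pos = sensitivity * base_rate                # has the disorder, flagged
    false_pos = (1 - specificity) * (1 - base_rate)   # healthy, but flagged anyway
    return true_pos / (true_pos + false_pos)

# Figures reported for the ADAM: 88% sensitivity, 60% specificity, 25% base rate.
adam_ppv = positive_predictive_value(sensitivity=0.88, specificity=0.60, base_rate=0.25)
print(f"{adam_ppv:.0%}")  # 42%
```

In words: 22 percent of the whole sample are true positives (0.88 × 0.25) and 30 percent are false positives (0.40 × 0.75), so true positives make up 22 of every 52 positive results.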
That is not very impressive.
But, to be fair, our EGAD questionnaire did not do better. Its rate of positive prediction was 25 percent. So, if we round up, only three out of ten people whom the EGAD questionnaire flagged as testosterone deficient actually were. Not as good as the ADAM but close.
However, as it turns out, the true specificity rate of the ADAM questionnaire may be lower than the 60 percent rate reported by Morley and his colleagues. Following Morley’s original study, subsequent studies have reported considerably lower specificity rates for the ADAM questionnaire, with rates ranging from 22 percent to 40 percent. In fact, the lowest specificity rate of 22 percent was found in the study with the largest sample, which contained over 5000 participants. As well, in 2006, in a second study evaluating the ADAM questionnaire, Morley and his colleagues themselves reported a specificity rate of 30 percent.
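The 25 percent figure can be checked with simple counts. The sketch below assumes, as the name suggests, that the EGAD questionnaire flags all 316 respondents, and it rounds 25 percent of 316 to 79 men; both numbers are illustrative rather than reported data.

```python
n_total = 316
n_low = 79  # about 25 percent of 316 (assumed rounding)

# Assumption: EGAD flags all 316 respondents as positive.
flagged = n_total
correctly_flagged = n_low  # only the men who truly have low testosterone

egad_ppv = correctly_flagged / flagged
print(f"{egad_ppv:.0%}")  # 25%
```

A test that flags everyone can never do better than the base rate, because its positive results are just the whole sample.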
If we recalculate the rate of positive prediction in Morley’s original study using a specificity rate of 30 percent (or false positive rate of 70 percent), then the ADAM questionnaire’s positive prediction rate drops down to three out of ten people.
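To see how quickly the positive prediction rate erodes, the sketch below reruns the same Bayes' rule calculation across the specificity values mentioned above. It holds sensitivity at 88 percent and the base rate at 25 percent, which is an assumption made for illustration; the later studies had their own sensitivities and samples. The function name is again my own.

```python
def ppv(sensitivity, specificity, base_rate):
    """Positive predictive value from sensitivity, specificity and base rate."""
    true_pos = sensitivity * base_rate
    false_pos = (1 - specificity) * (1 - base_rate)
    return true_pos / (true_pos + false_pos)

# Assumption: sensitivity (0.88) and base rate (0.25) stay at the original
# study's values while only specificity changes.
results = {}
for specificity in (0.60, 0.40, 0.30, 0.22):
    results[specificity] = ppv(0.88, specificity, 0.25)
    print(f"specificity {specificity:.0%}: PPV {results[specificity]:.0%}")
```

Under these assumptions the rate falls from 42 percent at a specificity of 60 percent to about 30 percent at a specificity of 30 percent, and to roughly 27 percent at a specificity of 22 percent, which is barely above the 25 percent you would get by flagging everyone.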
Given that our EGAD questionnaire has higher sensitivity than the ADAM questionnaire and most likely an equal rate of specificity, I say EGAD is the winner.
So, pharmaceutical companies and other purveyors of testosterone porn, you know where to find me should you wish to discuss licensing fees.
- Chueh, K. S., Huang, S. P., Lee, Y. C., Wang, C. J., Yeh, H. C., Li, W. M., . . . & Liu, C. C. (2012). The comparison of the Aging Male Symptoms (AMS) scale and Androgen Deficiency in the Aging Male (ADAM) questionnaire to detect androgen deficiency in middle-aged men. Journal of Andrology, 33, 817-823.