8.6. INFERENCE FOR TWO SAMPLES OF BINARY DATA 421
is associated with a large multiplicative increase in the probability of outcome. When overall risk
is larger, a risk ratio may obscure the magnitude of an effect. For example, suppose
overall risk is 0.40 and an intervention under study is thought to reduce risk to 0.35. In a large
population, this reduction in absolute risk of 0.05 may be clinically relevant; in a population of
1,000,000 a reduction in risk from 0.40 to 0.35 will reduce the occurrence of the condition from
400,000 to 350,000, affecting 50,000 individuals. The relative risk of 0.40/0.35 = 1.14 does not
convey the same message as the risk difference. Whichever summary statistic is used as the primary
measure of comparison, both should be provided in the interpretation of a study.
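The arithmetic in this example can be checked with a short Python sketch. The numbers are the ones given in the text; the variable names are chosen here only for illustration.

```python
# Figures from the example in the text: risk falls from 0.40 to 0.35
# in a population of 1,000,000.
baseline_risk = 0.40
intervention_risk = 0.35
population = 1_000_000

# Absolute measure: difference in risk between the two conditions.
risk_difference = baseline_risk - intervention_risk

# Relative measure: ratio of the two risks, as computed in the text.
risk_ratio = baseline_risk / intervention_risk

# Number of individuals for whom the condition is avoided.
cases_avoided = round(population * baseline_risk) - round(population * intervention_risk)

print(f"risk difference: {risk_difference:.2f}")
print(f"risk ratio:      {risk_ratio:.2f}")
print(f"cases avoided:   {cases_avoided:,}")
```

Reporting both numbers side by side makes the contrast visible: a ratio close to 1 can still correspond to tens of thousands of individuals.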
The calculation of risk ratio in the LEAP study used the peanut consumption group as the
baseline. The risk in the peanut avoidance group could have been used for the baseline, yielding a
relative risk of 0.019/0.137 = 0.139. The risk of allergy in the consumption group is approximately
0.14 times that of the avoidance group. While there is no set convention for the choice of the baseline
group, risk ratios greater than 1 are easier for most people to interpret, so the baseline group is
usually chosen to be the one with the smaller risk.
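The effect of switching the baseline group can be illustrated with a brief Python sketch. It uses the two LEAP risks quoted above (0.137 for avoidance, 0.019 for consumption); the variable names are illustrative assumptions.

```python
# Risks of peanut allergy quoted in the text for the LEAP study.
risk_avoidance = 0.137
risk_consumption = 0.019

# Consumption group as baseline: how many times larger is the risk under avoidance?
rr_consumption_baseline = risk_avoidance / risk_consumption

# Avoidance group as baseline: the same comparison, read in the other direction.
rr_avoidance_baseline = risk_consumption / risk_avoidance

print(f"baseline = consumption: {rr_consumption_baseline:.2f}")
print(f"baseline = avoidance:   {rr_avoidance_baseline:.3f}")

# The two ratios are reciprocals, so they carry the same information.
print(f"product of the two:     {rr_consumption_baseline * rr_avoidance_baseline:.3f}")
```

Because the two ratios are reciprocals of one another, the choice of baseline changes only how the comparison is worded, not what it says.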
Prevalence ratio and prevalence difference
The calculations for prevalence ratios and differences mirror those for risk ratios and differences,
but the different terminology reflects an important difference in interpretation. The prevalence of a
disease is the proportion of a population experiencing the disease. Cross-sectional studies sample
a population during a prespecified (usually short) time interval and can be used to estimate the
prevalence of a disease and features of the population that may be associated with the disease.
Since a cross-sectional study does not measure an outcome occurring subsequent to an exposure,
it cannot estimate risk of an outcome from an exposure. Cross-sectional studies can, however,
provide important information about the association between outcome and features of a population
that might justify additional studies.
The US CDC estimates that approximately 14.9% of non-Hispanic Asian adults in the United
States have Type 2 diabetes (T2D);²⁴ the prevalence of T2D in this population is 0.149. For
non-Hispanic white adults, the prevalence of T2D is 0.119 (11.9%). The prevalence difference between
the groups, comparing non-Hispanic Asian to non-Hispanic white adults, is 0.149 - 0.119 = 0.03.
The prevalence ratio for the same comparison is 0.149/0.119 = 1.252. The prevalence of T2D for
non-Hispanic Asian adults is 1.252 times as large as that for non-Hispanic white adults.
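As a quick check of these two summaries, the following Python sketch reproduces the calculation from the CDC prevalences quoted above. The per-1,000 figure is simply the prevalence difference rescaled for readability; the variable names are illustrative.

```python
# Prevalences of Type 2 diabetes quoted in the text.
prevalence_asian = 0.149  # non-Hispanic Asian adults
prevalence_white = 0.119  # non-Hispanic white adults (baseline group)

prevalence_difference = prevalence_asian - prevalence_white
prevalence_ratio = prevalence_asian / prevalence_white

# The same difference expressed per 1,000 adults.
extra_per_1000 = round(1000 * prevalence_difference)

print(f"prevalence difference: {prevalence_difference:.3f}")
print(f"prevalence ratio:      {prevalence_ratio:.3f}")
print(f"additional adults with T2D per 1,000: {extra_per_1000}")
```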
Odds ratios
Odds ratios are used to estimate an association between an outcome and exposure when baseline
risk or prevalence cannot be estimated, such as in a case-control study. In a dataset, the observed
odds of an event is the number of times the event happens divided by the number of times it does
not. The odds ratio (OR) is the odds of an event occurring in one group divided by the odds of an
event occurring in the baseline group. Somewhat surprisingly, even when risk or prevalence ratio
cannot be estimated, the OR comparing the odds of an outcome between exposed and unexposed
groups can.
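One way to see why the odds ratio remains estimable is that it takes the same value whether the odds of the outcome are compared between exposure groups or the odds of exposure are compared between outcome groups. The Python sketch below illustrates this symmetry. The counts are hypothetical, invented purely for illustration (they are not taken from any study in this chapter), and the function names are assumptions.

```python
import math


def odds(events: int, non_events: int) -> float:
    """Observed odds: times the event happens divided by times it does not."""
    return events / non_events


# Hypothetical 2x2 table (made-up counts, for illustration only):
#                 outcome   no outcome
#   exposed          a=20        b=80
#   unexposed        c=10        d=90
a, b, c, d = 20, 80, 10, 90

# Odds of the outcome, exposed group versus unexposed (baseline) group.
or_outcome = odds(a, b) / odds(c, d)

# Odds of exposure, those with the outcome versus those without.
or_exposure = odds(a, c) / odds(b, d)

print(f"OR from odds of outcome:  {or_outcome:.2f}")
print(f"OR from odds of exposure: {or_exposure:.2f}")
print("equal:", math.isclose(or_outcome, or_exposure))
```

Both expressions reduce to (a × d)/(b × c), which is why a design that samples on the outcome can still estimate the odds ratio relating outcome to exposure.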
Figure 8.19 in Section 8.5 summarizes the results of a study examining the association of persistent
pulmonary hypertension of the newborn (PPHN) with exposure to maternal use of a selective
serotonin re-uptake inhibitor (SSRI) during pregnancy. For convenience, the figure is repeated here
as Figure 8.22.
Participants in the PPHN study were sampled and grouped according to whether their in-
fants did or did not suffer from PPHN; the study did not count the number of PPHN outcomes
among women using an SSRI during pregnancy. Thus, the absolute risk of PPHN given SSRI use,
²⁴ Centers for Disease Control and Prevention. National Diabetes Statistics Report, 2020.