Standardized Rates of Disease
When comparing two or more populations with respect to a health outcome, it is tempting to compare crude rates of disease, i.e., the number of disease events divided by the size of the population. The "crude rate" is the measure that was introduced in the module on Measures of Disease Frequency. However, comparisons of crude rates can be misleading if the populations being compared have different distributions of other determinants of disease, such as age, which has an important effect on many health outcomes, including mortality, heart disease, cancer, infectious diseases, and injury. As a result, differences in age can distort comparisons between populations, and this distortion is called confounding. This module focuses on a technique called standardization, which allows one to compute summary rates of health outcomes that are adjusted for differences in confounding factors like age in order to provide a less distorted comparison.
Two closely related techniques are commonly used to compute "age-adjusted" summary rates that facilitate comparisons among populations. Direct standardization applies a standard age distribution to the populations being compared in order to compute summary rates indicating how overall rates would have compared if the populations had had the same age distribution. This method is used when age-specific rates of disease are known for the populations being compared. In contrast, so-called indirect standardization applies a standard set of age-specific rates of disease to the populations being compared in order to compute the number of cases of disease that would be expected in a given population, based on its size and age distribution.
After completing this module, the student will be able to:
- Explain what is meant by:
- Calculate standardized rates of disease or death for two populations using direct standardization and interpret the findings in words.
- Calculate a standardized incidence ratio (SIR) and a standardized mortality ratio (SMR) for a disease and describe their meaning.
Crude rates are quite simple and straightforward. They are calculated by dividing the total number of cases in a given time period by the total number of persons in the population. In this case Population B has a higher crude rate of disease. If we think about these two populations as the 'exposures' of interest, does this imply that it is riskier to live in Population B compared to Population A?
The problem with this comparison is that the crude rate is an overall average rate of disease, but it doesn't take into account possible confounding factors.
A confounding factor is basically another risk factor for the outcome of interest that is also unequally distributed among the populations being compared. In this case, age is clearly an independent risk factor for cancer mortality, but what we really would like to know is whether there are differences in cancer mortality between the two populations that are not due to age differences, i.e., differences in mortality that are independent of age differences. If the two populations have unequal age distribution, it will distort the comparison of interest.
For example, Population B might have a greater percentage of older people, and we know that the risk of cancer mortality increases with age regardless of one's environment. If so, then the risk of death due to cancer in Population B might only appear to be greater simply because it has a greater percentage of old people who have an inherently greater risk of dying. In other words, the crude mortality rate for population B might be higher just because it is weighted more heavily with old people. In this setting we might be interested in comparing the mortality rates without the unwanted confounding effect of age.
This problem is clearer if we take a more detailed look by examining the age-specific mortality rates within each of these populations, as shown below.
In this hypothetical example, the table below shows that the age-specific mortality rates are absolutely identical in the two populations. In other words, in any given age group, the two populations have the same risk. However, note that the risk of mortality increases with age. Note also that Population B has a greater percentage of older people. In other words, population B is more heavily weighted with older people, and age is also associated with risk of mortality, so the comparison of crude rates is unfair, because of the unequal age distributions.
Table - Population A (columns: Number of Deaths, Number of People, Death Rate per 10,000)
Table - Population B (columns: Number of Deaths, Number of People, Death Rate per 10,000)
Since the age-specific rates are identical, the risk of cancer mortality is exactly the same in these two populations. What makes the crude rates different is that older people have a higher risk of cancer mortality, and population B has a greater proportion of older people. In other words, the age-specific rates are the same, but the higher proportion of older people in population B means that the overall crude rate is more heavily weighted by the age-specific rate among older people.
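The effect described above can be reproduced with a few lines of Python. This is only a minimal sketch: the age-group labels, rates, and population counts below are hypothetical (they are not the values from the module's tables) and were chosen so that both populations share identical age-specific rates but differ in age distribution.

```python
# Hypothetical age-specific death rates per 10,000, identical in both populations.
RATES_PER_10K = {"young": 10, "middle": 50, "old": 200}

# Hypothetical population counts; Population B is weighted toward older people.
POP_A = {"young": 5000, "middle": 3000, "old": 2000}
POP_B = {"young": 2000, "middle": 3000, "old": 5000}


def crude_rate_per_10k(population, rates_per_10k):
    """Total deaths divided by total population, expressed per 10,000."""
    deaths = sum(population[g] * rates_per_10k[g] / 10_000 for g in population)
    return deaths / sum(population.values()) * 10_000


crude_a = crude_rate_per_10k(POP_A, RATES_PER_10K)
crude_b = crude_rate_per_10k(POP_B, RATES_PER_10K)
print(f"Population A crude rate: {crude_a:.0f} per 10,000")
print(f"Population B crude rate: {crude_b:.0f} per 10,000")
```

With these made-up inputs, the crude rate for Population B comes out nearly twice that of Population A, even though a person of any given age faces the same risk in both populations.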
Standardized Rates of Disease
This method, sometimes referred to as direct standardization, provides a useful way to compare health outcomes among populations that may have different age distributions. This is done by applying a standard age distribution to the populations being compared in order to compute hypothetical summary rates indicating how the overall rates would have compared if the populations had had the same age distribution. This method is used when age-specific rates of disease are known for the populations being compared.
Death Rates in Florida and Alaska
This table summarizes the data used to calculate crude (unadjusted) rates for Florida and Alaska. Note that the crude rate for Florida is substantially greater than Alaska's, raising the possibility that it is riskier to live in Florida. Are there social, behavioral, or environmental factors that account for the higher mortality rates? Is the risk of death really greater in Florida?
Table columns: Number of deaths; Crude mortality rate (per 100,000)
Note also that the crude mortality rate ratio is 1,069/399 = 2.68. However, as you probably know, many older people move to Florida when they retire, so the population of Florida contains a higher percentage of older people, and they have an inherently greater risk of dying compared to young people. As a result, comparing the crude rates is likely to be misleading about whether the risk of death is truly greater in Florida. This is illustrated in detail in the table below.
Table - Age-specific Mortality Rates in Florida (columns: Number of People, % of Total Pop., Death Rate per 100,000; overall crude rate: 1,069 per 100,000)
Table - Age-specific Mortality Rates in Alaska (columns: Number of People, % of Total Pop., Death Rate per 100,000; overall crude rate: 399 per 100,000)
When we look at the age-specific mortality rates, we see that there is little difference within each age group, certainly nothing like the approximately 2.7-fold (1,069/399) higher crude death rate in Florida than in Alaska. In theory, we could simply report the age-specific rates and let people compare different states by looking at the rates within each age group separately, but that is less than ideal for two reasons. First, if we wanted to look at all 50 states side by side, it would be extremely difficult to compare them by looking at all the age-specific rates in each state. More importantly, looking at the age-specific rates doesn't necessarily tell us whether one state is higher overall than another, and certainly not the size of any difference. What we would like is a single summary rate like the crude rate, but with the distortion caused by age removed. This is what standardization accomplishes.
In order to understand how this works, it is helpful to take another look at the crude rate.
Method #1: The simple, logical way to calculate the crude death rates is to divide the total events by the total population.
(total # deaths / total population) = 131,902 / 12,340,000 = 0.01069 = 1,069 / 100,000 population
Method #2: The Long Way to Calculate the Crude Rate (Just to make a teaching point)
If asked to compute a crude rate, the sensible thing would be to use method #1. However, it is also possible to calculate the crude rate by multiplying the age-specific rates by the fraction of the population that they represent and then summing this up. The "weight" of each age category is given by the fraction of the total population that it represents. For example, the "weight" of the youngest age group in Florida is 0.07 or 7%, while the weight of the oldest age group in Florida is 0.18, or 18%.
So, in the example above, we could calculate the crude rate for Florida as:
(.07) x (284/100,000) = 19.88/100,000
+ (.18) x (57/100,000) = 10.26/100,000
+ (.36) x (198/100,000) = 71.28/100,000
+ (.21) x (815/100,000) = 171.15/100,000
+ (.18) x (4,425/100,000) = 796.50/100,000
Total = 1,069 /100,000 population
NOTE: This is a laborious way to calculate the crude rate; it makes much more sense to just divide the total number of deaths by the total population size. However, we are doing this the long way just to illustrate that if you weight the category-specific rates according to the proportion of the population in each group and then add them, you end up with the crude rate. Because of this, even if two populations have identical category-specific rates, the crude rates will vary if the distributions of the populations are different.
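The equivalence of the two methods can be checked with a short Python sketch. The rates, weights, and totals are the Florida figures quoted above; the variable names are just illustrative choices, and the small gap between the two results is presumably due to the weights being rounded to two decimal places.

```python
AGE_GROUPS = ["<5", "5-19", "20-44", "45-64", ">64"]
FLORIDA_RATES = [284, 57, 198, 815, 4425]         # deaths per 100,000
FLORIDA_WEIGHTS = [0.07, 0.18, 0.36, 0.21, 0.18]  # fraction of total population

# Method 1: total deaths divided by total population.
crude_direct = 131_902 / 12_340_000 * 100_000

# Method 2: weight each age-specific rate by its population share, then sum.
contributions = [w * r for w, r in zip(FLORIDA_WEIGHTS, FLORIDA_RATES)]
crude_weighted = sum(contributions)

for group, part in zip(AGE_GROUPS, contributions):
    print(f"{group:>6}: {part:7.2f} per 100,000")
print(f"Method 1: {crude_direct:.0f} per 100,000")
print(f"Method 2: {crude_weighted:.0f} per 100,000")
```

Both methods round to 1,069 per 100,000, which is the point of the exercise: the crude rate is a weighted average of the age-specific rates.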
Adjustment by Standardization
As noted above, age-specific rates provide a fairer comparison, but in many situations it is useful to have an overall summary rate that is adjusted for a confounding factor like age, so you can easily compare multiple populations. This can be done by calculating an "adjusted" overall rate, which provides for a fairer comparison. In essence, this is accomplished by asking the question "How would the rates have compared if the two populations had had the same age distribution?" I will illustrate how to do this when comparing two populations, but keep in mind that multiple populations can be "adjusted" this way.
The Question We Would Like to Answer:
"What would the comparable death rate be in each state if both populations had identical age distributions?"
We saw above that the crude rate is a weighted average, but the comparison is distorted if the populations have different age distributions. In order to see how the two populations would have compared if they had had the same distribution, we can calculate a summary rate by pretending that the distributions are the same in the populations being compared. We will use the long method of calculating the summary rate, as shown at the bottom of the previous page. We will use each population's actual age-specific rates, BUT we will apply the same set of weights (fraction of people in each age group) to all of the populations being compared. In essence, this will give us a summary rate that is adjusted in a way that answers the question posed above.
Basically, an age-standardized rate is also a weighted average, but the weights for the age categories are artificially set to be equal for the populations being compared by applying the weights of some standard population to each of them. We are still using the actual age-specific rates of each of the populations, but we are weighting them using a uniform standard population distribution.
What age distribution should you use? Any reasonable distribution can serve, as long as the same one is applied to every population being compared, but you usually see one of the following used as the standard age distribution:
- The distribution of one of the populations being compared.
- An independent standard, e.g. US population in an arbitrarily chosen year.
- A distribution constructed by combining the populations, e.g. by averaging.
Example #1: Calculating standardized Rates using Florida's age distribution as the standard
If I wanted to ask the question "What would Alaska's overall mortality rate have looked like if Alaska had its actual age-specific rates but also had the same age distribution in the population as Florida?" I can do this quite simply by applying Florida's population distribution to Alaska's age-specific rates.
First, we will calculate the standardized rate for Florida by multiplying each of Florida's age-specific rates by the fraction of Florida's population in each age group.
For the age group <5 years old: 0.07 x 284 = 19.88
For the age group 5 to 19 years: 0.18 x 57 = 10.26
For the age group 20 to 44 years: 0.36 x 198 = 71.28
For the age group 45 to 64 years: 0.21 x 815 = 171.15
For the age group greater than 64 years: 0.18 x 4,425 = 796.50
SUM = 1069 per 100,000 population
As you would expect, the standardized rate in Florida is the same as its crude rate, because we used Florida's age distribution as the standard.
Now let's use Florida's age distribution as the standard to calculate Alaska's standardized rate by multiplying each of Alaska's age-specific rates by the fraction of Florida's population in each age group.
For the age group <5 years old: 0.07 x 274 = 19.18
For the age group 5 to 19 years: 0.18 x 65 = 11.70
For the age group 20 to 44 years: 0.36 x 188 = 67.68
For the age group 45 to 64 years: 0.21 x 629 = 132.09
For the age group greater than 64 years: 0.18 x 4,350 = 783.00
SUM = 1014 per 100,000 population
We can compare Florida's standardized rate to Alaska's standardized rate by computing a standardized rate ratio (SRR) = 1069/1014 = 1.054, much less than the crude mortality rate ratio of 2.68, suggesting that much of the crude difference was due to confounding by age.
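The calculation in Example #1 can be written as a small reusable function. The sketch below uses the rates and weights from the worked example; the function name `direct_standardized_rate` and the input checks are illustrative assumptions, not part of the original module.

```python
FLORIDA_RATES = [284, 57, 198, 815, 4425]   # deaths per 100,000, by age group
ALASKA_RATES = [274, 65, 188, 629, 4350]    # deaths per 100,000, by age group
FLORIDA_WEIGHTS = [0.07, 0.18, 0.36, 0.21, 0.18]  # standard: Florida's age mix


def direct_standardized_rate(age_specific_rates, standard_weights):
    """Weighted average of age-specific rates using the standard's weights."""
    if len(age_specific_rates) != len(standard_weights):
        raise ValueError("rates and weights must cover the same age groups")
    if abs(sum(standard_weights) - 1.0) > 1e-6:
        raise ValueError("standard weights should sum to 1")
    return sum(w * r for w, r in zip(standard_weights, age_specific_rates))


florida_std = direct_standardized_rate(FLORIDA_RATES, FLORIDA_WEIGHTS)
alaska_std = direct_standardized_rate(ALASKA_RATES, FLORIDA_WEIGHTS)
srr = florida_std / alaska_std  # standardized rate ratio

print(f"Florida: {florida_std:.0f} per 100,000")
print(f"Alaska:  {alaska_std:.0f} per 100,000")
print(f"SRR:     {srr:.3f}")
```

The standardized rates round to 1,069 and 1,014 per 100,000. Because the code divides the unrounded rates, the ratio it prints can differ in the third decimal place from the 1.054 obtained by dividing the rounded values.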
In this example we adjusted for age differences by using Florida's age distribution as a standard set of weights and applied those weights to the age-specific rates of each state. However, we could have achieved a fair comparison by using other standards as well, as long as we applied the same standard or weights to each of the populations being compared. For example, I could have arbitrarily chosen to use the age distribution of the US population in 1988 as the standard, as demonstrated on the next page.
Standardization Using an External Distribution
Example #2: Calculating Age-adjusted Rates Using an External Age Distribution as the Standard (e.g., using the age distribution of the US population in 1988 as the standard age distribution.)
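Table - Distribution of the US Population in 1988
|Age group||Population (% of Total)|
|<5 years||7%|
|5 to 19 years||22%|
|20 to 44 years||40%|
|45 to 64 years||19%|
|>64 years||12%|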
Now, let's use the US population distribution in 1988 as the standard distribution for both Florida and Alaska:
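Here are the age-specific death rates for Florida and Alaska:
|Age group||Florida death rate per 100,000||Alaska death rate per 100,000|
|<5 years||284||274|
|5 to 19 years||57||65|
|20 to 44 years||198||188|
|45 to 64 years||815||629|
|>64 years||4,425||4,350|
First, we will calculate the standardized rate for Florida by multiplying each of Florida's age-specific rates by the fraction of the standard population in that age group.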
For the age group <5 years old: 0.07 x 284 = 19.88
For the age group 5 to 19 years: 0.22 x 57 = 12.54
For the age group 20 to 44 years: 0.40 x 198 = 79.20
For the age group 45 to 64 years: 0.19 x 815 = 154.85
For the age group greater than 64 years: 0.12 x 4,425 = 531.00
SUM = 797 per 100,000 population
Therefore, using the 1988 US population distribution as the standard, the standardized rate in Florida is 797 per 100,000 population.
Now let's use the standard population distribution to calculate Alaska's standardized rate by multiplying each of Alaska's age-specific rates by the fraction of the standard population in that age group.
For the age group <5 years old: 0.07 x 274 = 19.18
For the age group 5 to 19 years: 0.22 x 65 = 14.30
For the age group 20 to 44 years: 0.40 x 188 = 75.20
For the age group 45 to 64 years: 0.19 x 629 = 119.51
For the age group greater than 64 years: 0.12 x 4,350 = 522.00
SUM = 750 per 100,000 population
Therefore, using the 1988 US population distribution as the standard, the standardized rate in Alaska is 750 per 100,000 population.
Using the 1988 US population distribution as the standard gives different adjusted rates than when we used Florida as the standard, but the difference between the two states is almost identical to when we used Florida as the standard. Once again, note that the standardized rate ratio (SRR) = 797/750 = 1.06, i.e., much less than the crude mortality rate ratio of 2.68, but very close to the standardized rate ratio that was obtained when the age distribution of Florida was used as the standard.
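To see how the choice of standard affects the adjusted rates but barely moves the ratio, the two examples can be run side by side. This is a sketch using the age-specific rates and the two sets of weights from the worked examples; the dictionary layout and the helper name `standardized_rate` are assumptions made for illustration.

```python
# Age-specific death rates per 100,000 (age groups: <5, 5-19, 20-44, 45-64, >64).
RATES = {
    "Florida": [284, 57, 198, 815, 4425],
    "Alaska": [274, 65, 188, 629, 4350],
}

# Two candidate standards, expressed as fractions of the total population.
STANDARDS = {
    "Florida's age distribution": [0.07, 0.18, 0.36, 0.21, 0.18],
    "US population, 1988": [0.07, 0.22, 0.40, 0.19, 0.12],
}


def standardized_rate(rates, weights):
    """Apply one set of standard weights to a population's age-specific rates."""
    return sum(w * r for w, r in zip(weights, rates))


srr_by_standard = {}
for name, weights in STANDARDS.items():
    florida = standardized_rate(RATES["Florida"], weights)
    alaska = standardized_rate(RATES["Alaska"], weights)
    srr_by_standard[name] = florida / alaska
    print(f"{name}: Florida {florida:.0f}, Alaska {alaska:.0f}, "
          f"SRR {srr_by_standard[name]:.2f}")
```

The adjusted rates themselves change with the standard (roughly 1,069 and 1,014 versus 797 and 750 per 100,000), while the ratio stays close to 1.06 in both cases.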
These adjusted rates are hypothetical death rates that would have occurred in each state if each had the age distribution of the entire US population in 1988. It is important to note that the adjusted rates are artificial, because they are based on a hypothetical situation, and what one gets for the summary rates depends, to some extent, on what one selects as the standard. However, the more important observation is the impact on the comparison between the two populations. Both sets of weights provided age-standardized rates that showed there is little difference in mortality risk between the two states after adjusting for age.
Because rates can be compared only when weights are the same for each entity, basic public health data almost always use an external population to facilitate comparison with other entities. For example, to compare the mortality rates among all 50 U.S. states, it would make much more sense to use the U.S. population as a whole for the weights than weighting each state's population to Florida or any other state. This consideration carries over to the situation in which only two states are compared, or even when tracking trends over time in a single state, especially if over a time period long enough to see a change in the age distribution of the population.
Therefore, one would be much more likely to see a comparison between Florida and Alaska where the U.S. population was used as the standard (Example #2) than where the population of Florida was used (Example #1). Analogously, comparisons of mortality rates among all the countries in the world typically use a world standard population.
Currently, the age distribution of the population based on the 2000 Census is used for almost all measures in the United States, while the World Health Organization (WHO) has developed a standard population based on the average age distribution of the world's population.
Standardized Rate Ratio
Summary of Standardized Rates
Standardization results in "adjusted" rates that are not real, but they have the advantage of enabling you to compare two or more populations after removing the distorting effect of other confounding factors, such as age. In many public health circumstances, it is important to compare rates of disease among two or more populations, but there may be differences in the distributions of the populations that distort the comparison. In this situation you will frequently see adjusted or standardized rates.
A comparison of crude and adjusted rates also provides a way to identify whether a factor is causing confounding. By definition, if you adjust for a factor like age and the relationship changes, then there was confounding. In the illustration below, Woburn's crude rate was 750 per 10,000 compared to Weymouth's crude rate of 250 per 10,000, a 3-fold difference. However, the age-adjusted rate for Woburn was 383 per 10,000, and the age-adjusted rate for Weymouth was 376 per 10,000. This indicates that the crude comparison was confounded by age.
One might ask "Why not just compare the age-specific rates?" The answer is that there are times when unconfounded summary rates are very useful. For example, suppose you wanted to examine trends in mortality rates for heart disease over time, and you wanted to also see how trends compared among black and white males and females. In this situation you might have age-, race-, and gender-specific rates at multiple time points in a single population. However, there would be so many category-specific rates that it would be impossible to keep track of all of the comparisons and make any sense out of what was going on, as illustrated in the following tables showing age-, gender-, and race-specific rates of mortality from heart disease over time.
Trying to make sense out of all of these category-specific rates would be extremely difficult. On the other hand, if you calculated age-adjusted summary rates for black and white males and females for each year, you could then summarize these with a graph that allowed you to quickly see what the trends were, as illustrated below.
Standardized Incidence Ratios
To calculate age-adjusted standardized rates, as above, one must first have the age-specific rates of disease for each of the populations to be compared. One then uses a standard age distribution to compute a hypothetical summary rate that indicates what the overall rate of disease would be for each population if they had had the same age distribution as the standard. In other words, one uses each population's real age-specific rates and applies these to a single standard age distribution.
In some situations, however, the age distribution of the populations being compared is known, but it is difficult, if not impossible, to obtain reliable estimates of age-specific rates, particularly if one is interested in smaller populations in which age-specific rates would be subject to random error because of relatively small numbers of observations. Consider the problem of a cluster of cancer cases that comes to our attention in a specific community. The obvious question is whether the occurrence of cancer in this community is higher than that of other communities in the same state. However, the number of cases of a particular type of cancer occurring in even a relatively large community is typically small enough to produce very unstable rates due to random error. On the other hand, age-specific rates for the entire state would be much more stable, because of the larger sample size.
In this situation one can approach the problem by using the age-specific rates observed for the entire state population as an estimate of the expected rates for the component communities. One can then apply these rates to the age distribution of each community to compute the expected number of specific cancer cases for a given community and then compare the expected number of cases to the observed cases. This approach is typically used by state cancer registries. Since the frequencies of different cancers often differ by gender, separate computations are performed for men and women.
Consider the following example adapted from the Massachusetts Department of Public Health:
Table columns: A) Overall Age-specific State Rate; B) Town's Population Size; C) Expected Cases (A x B); Observed # of Cases
SIR = (Observed Cases / Expected Cases) x 100 = (144 / 136.354) x 100 = 106
Consequently, these results suggest that, after adjusting for age differences, the incidence of this particular type of cancer in this town was 6% higher than expected based on average age-specific rates for the state. The Massachusetts Department of Public Health provides the following comments regarding the limitations of this type of data:
"... apparent increases or decreases in cancer incidence over time may reflect changes in diagnostic methods or case reporting rather than true changes in cancer incidence. Three other limitations must be considered when interpreting cancer incidence data for Massachusetts cities and towns: under-reporting in areas close to neighboring states, under-reporting of cancers that may not be diagnosed in hospitals, and cases being assigned to incorrect cities/towns."
Another important consideration is the precision of these estimates. This is best evaluated by computing a 95% confidence interval for the SIR. The Epi_Tools.XLS spreadsheet has a worksheet that will help you compute the confidence interval. For this example, the 95% confidence interval is:
95% confidence interval = 88 to 123
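The SIR and its confidence interval can be sketched in Python as follows. The observed and expected totals (144 and 136.354) come from the example above. The interval uses a simple normal approximation (SIR plus or minus 1.96 times SIR divided by the square root of the observed count); this is an assumption on my part, since the text does not say which method the Epi_Tools.XLS worksheet uses, although this approximation does give 88 to 123 here. The function names are illustrative.

```python
import math


def expected_cases(state_rates, town_population):
    """Sum over age groups of (state age-specific rate x town population).

    Rates must be expressed per person (e.g. 0.002, not 200 per 100,000).
    """
    return sum(rate * n for rate, n in zip(state_rates, town_population))


def sir_with_ci(observed, expected, z=1.96):
    """Return (SIR, lower, upper) on the x100 scale.

    Assumes a normal approximation with SE(observed) = sqrt(observed).
    """
    sir = observed / expected * 100
    half_width = z * math.sqrt(observed) / expected * 100
    return sir, sir - half_width, sir + half_width


sir, lower, upper = sir_with_ci(observed=144, expected=136.354)
print(f"SIR = {sir:.0f} (95% CI {lower:.0f} to {upper:.0f})")
```

Because the interval includes 100, these data alone do not show that the town's incidence differs from what the state rates predict. For small observed counts, exact Poisson-based intervals are generally preferred over this approximation.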
One of the important applications of standardized incidence ratios is to monitor the frequency of cancer and other diseases. SIRs are particularly useful because the number of cases of any particular type of cancer is likely to be small in an individual town, particularly if the community is small. In this situation standardized rates are less useful, since the age-specific rates for a particular cancer would be subject to a huge amount of random error due to the small number of cases. SIRs get around this problem by using the more stable rates for the entire state in order to compute the expected number of cases of a given cancer for a community, given the community's age distribution.
In the video below Professor Richard Clapp, the first director of the Massachusetts Cancer Registry discusses the need for registries of this type.
The link below will take you to the website for the Massachusetts Cancer Registry, where you can explore the SIRs and confidence intervals for specific types of cancers throughout Massachusetts.
Standardized Mortality Ratios
When the outcome of interest is a mortality rate, a standardized incidence ratio is referred to as a standardized mortality ratio (SMR).
Section 5: Measures of Association
The key to epidemiologic analysis is comparison. Occasionally you might observe an incidence rate among a population that seems high and wonder whether it is actually higher than what should be expected based on, say, the incidence rates in other communities. Or, you might observe that, among a group of case-patients in an outbreak, several report having eaten at a particular restaurant. Is the restaurant just a popular one, or have more case-patients eaten there than would be expected? The way to address that concern is by comparing the observed group with another group that represents the expected level.
A measure of association quantifies the relationship between exposure and disease among the two groups. Exposure is used loosely to mean not only exposure to foods, mosquitoes, a partner with a sexually transmissible disease, or a toxic waste dump, but also inherent characteristics of persons (for example, age, race, sex), biologic characteristics (immune status), acquired characteristics (marital status), activities (occupation, leisure activities), or conditions under which they live (socioeconomic status or access to medical care).
The measures of association described in the following section compare disease occurrence among one group with disease occurrence in another group. Examples of measures of association include risk ratio (relative risk), rate ratio, odds ratio, and proportionate mortality ratio.
Definition of risk ratio
A risk ratio (RR), also called relative risk, compares the risk of a health event (disease, injury, risk factor, or death) among one group with the risk among another group. It does so by dividing the risk (incidence proportion, attack rate) in group 1 by the risk (incidence proportion, attack rate) in group 2. The two groups are typically differentiated by such demographic factors as sex (e.g., males versus females) or by exposure to a suspected risk factor (e.g., did or did not eat potato salad). Often, the group of primary interest is labeled the exposed group, and the comparison group is labeled the unexposed group.
Method for Calculating risk ratio
The formula for risk ratio (RR) is:
Risk ratio = Risk of disease (incidence proportion, attack rate) in group of primary interest ⁄ Risk of disease (incidence proportion, attack rate) in comparison group
A risk ratio of 1.0 indicates identical risk among the two groups. A risk ratio greater than 1.0 indicates an increased risk for the group in the numerator, usually the exposed group. A risk ratio less than 1.0 indicates a decreased risk for the exposed group, indicating that perhaps exposure actually protects against disease occurrence.
EXAMPLES: Calculating Risk Ratios
Example A: In an outbreak of tuberculosis among prison inmates in South Carolina in 1999, 28 of 157 inmates residing on the East wing of the dormitory developed tuberculosis, compared with 4 of 137 inmates residing on the West wing.(11) These data are summarized in the two-by-two table, so called because it has two rows for the exposure and two columns for the outcome. Here is the general format and notation.
Table 3.12A General Format and Notation for a Two-by-Two Table
|Exposed||a||b||a + b = H1|
|Unexposed||c||d||c + d = H0|
In this example, the exposure is the dormitory wing and the outcome is tuberculosis, as illustrated in Table 3.12B. Calculate the risk ratio.
Table 3.12B Incidence of Mycobacterium Tuberculosis Infection Among Congregated, HIV-Infected Prison Inmates by Dormitory Wing — South Carolina, 1999
|East wing||a = 28||b = 129||H1 = 157|
|West wing||c = 4||d = 133||H0 = 137|
Data Source: McLaughlin SI, Spradling P, Drociuk D, Ridzon R, Pozsik CJ, Onorato I. Extensive transmission of Mycobacterium tuberculosis among congregated, HIV-infected prison inmates in South Carolina, United States. Int J Tuberc Lung Dis 2003;7:665–672.
To calculate the risk ratio, first calculate the risk or attack rate for each group. Here are the formulas:
Attack Rate (Risk)
Attack rate for exposed = a ⁄ (a + b)
Attack rate for unexposed = c ⁄ (c + d)
For this example:
Risk of tuberculosis among East wing residents = 28 ⁄ 157 = 0.178 = 17.8%
Risk of tuberculosis among West wing residents = 4 ⁄ 137 = 0.029 = 2.9%
The risk ratio is simply the ratio of these two risks:
Risk ratio = 17.8 ⁄ 2.9 = 6.1
Thus, inmates who resided in the East wing of the dormitory were 6.1 times as likely to develop tuberculosis as those who resided in the West wing.
EXAMPLES: Calculating Risk Ratios (Continued)
Example B: In an outbreak of varicella (chickenpox) in Oregon in 2002, varicella was diagnosed in 18 of 152 vaccinated children compared with 3 of 7 unvaccinated children. Calculate the risk ratio.
Table 3.13 Incidence of Varicella Among Schoolchildren in 9 Affected Classrooms — Oregon, 2002
|Vaccinated||a = 18||b = 134||152|
|Unvaccinated||c = 3||d = 4||7|
Data Source: Tugwell BD, Lee LE, Gillette H, Lorber EM, Hedberg K, Cieslak PR. Chickenpox outbreak in a highly vaccinated school population. Pediatrics 2004 Mar;113(3 Pt 1):455–459.
Risk of varicella among vaccinated children = 18 ⁄ 152 = 0.118 = 11.8%
Risk of varicella among unvaccinated children = 3 ⁄ 7 = 0.429 = 42.9%
Risk ratio = 0.118 ⁄ 0.429 = 0.28
The risk ratio is less than 1.0, indicating a decreased risk or protective effect for the exposed (vaccinated) children. The risk ratio of 0.28 indicates that vaccinated children were only approximately one-fourth as likely (28%, actually) to develop varicella as were unvaccinated children.
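Both worked examples follow the same steps, which can be sketched as a small Python function. The cell counts are those from Tables 3.12B and 3.13; the function name and argument order (a, b, c, d, matching the two-by-two notation) are illustrative assumptions.

```python
def risk_ratio(a, b, c, d):
    """Risk ratio from a two-by-two table.

    a, b: exposed persons with / without disease
    c, d: unexposed persons with / without disease
    """
    risk_exposed = a / (a + b)      # attack rate in the exposed group
    risk_unexposed = c / (c + d)    # attack rate in the unexposed group
    return risk_exposed / risk_unexposed


# Example A: tuberculosis by dormitory wing (East wing treated as exposed).
tb_rr = risk_ratio(a=28, b=129, c=4, d=133)

# Example B: varicella by vaccination status (vaccinated treated as exposed).
varicella_rr = risk_ratio(a=18, b=134, c=3, d=4)

print(f"Tuberculosis risk ratio: {tb_rr:.1f}")
print(f"Varicella risk ratio:    {varicella_rr:.2f}")
```

The function divides unrounded risks, so its results may differ slightly from hand calculations that round each risk first; here both agree with the values in the text (6.1 and 0.28).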
A rate ratio compares the incidence rates, person-time rates, or mortality rates of two groups. As with the risk ratio, the two groups are typically differentiated by demographic factors or by exposure to a suspected causative agent. The rate for the group of primary interest is divided by the rate for the comparison group.
Rate ratio = Rate for group of primary interest ⁄ Rate for comparison group
EXAMPLE: Calculating Rate Ratios
Public health officials were called to investigate a perceived increase in visits to ships' infirmaries for acute respiratory illness (ARI) by passengers of cruise ships in Alaska in 1998.(13) The officials compared passenger visits to ship infirmaries for ARI during May–August 1998 with the same period in 1997. They recorded 11.6 visits for ARI per 1,000 tourists per week in 1998, compared with 5.3 visits per 1,000 tourists per week in 1997. Calculate the rate ratio.
Rate ratio = 11.6 ⁄ 5.3 = 2.2
Passengers on cruise ships in Alaska during May–August 1998 were more than twice as likely to visit their ships' infirmaries for ARI as were passengers in 1997. (Note: Of 58 viral isolates identified from nasal cultures from passengers, most were influenza A, making this the largest summertime influenza outbreak in North America.)
Table 3.14 illustrates lung cancer mortality rates for persons who continued to smoke and for smokers who had quit at the time of follow-up in one of the classic studies of smoking and lung cancer conducted in Great Britain.
Using the data in Table 3.14, calculate the following:
- Rate ratio comparing current smokers with nonsmokers
- Rate ratio comparing ex-smokers who quit at least 20 years ago with nonsmokers
- What are the public health implications of these findings?
Table 3.14 Number and Rate (Per 1,000 Person-years) of Lung Cancer Deaths for Current Smokers and Ex-smokers by Years Since Quitting, Physician Cohort Study — Great Britain, 1951–1961
|Cigarette smoking status||Lung cancer deaths||Rate per 1000 person-years||Rate Ratio|
|For ex-smokers, years since quitting:|
|Nonsmokers||3||0.07||1.0 (reference group)|
Data Source: Doll R, Hill AB. Mortality in relation to smoking: 10 years' observation of British doctors. Brit Med J 1964; 1:1399–1410, 1460–1467.
An odds ratio (OR) is another measure of association that quantifies the relationship between an exposure with two categories and a health outcome. Referring to the four cells in Table 3.15, the odds ratio is calculated as
Odds ratio = (a ⁄ b) ⁄ (c ⁄ d) = ad ⁄ bc
where
a = number of persons exposed and with disease
b = number of persons exposed but without disease
c = number of persons unexposed but with disease
d = number of persons unexposed and without disease
a+c = total number of persons with disease (case-patients)
b+d = total number of persons without disease (controls)
The odds ratio is sometimes called the cross-product ratio because the numerator is based on multiplying the value in cell "a" times the value in cell "d," whereas the denominator is the product of cell "b" and cell "c." A line from cell "a" to cell "d" (for the numerator) and another from cell "b" to cell "c" (for the denominator) creates an x or cross on the two-by-two table.
Table 3.15 Exposure and Disease in a Hypothetical Population of 10,000 Persons
|Exposed||a = 100||b = 1,900||2,000||5.0%|
|Not Exposed||c = 80||d = 7,920||8,000||1.0%|
EXAMPLE: Calculating Odds Ratios
Use the data in Table 3.15 to calculate the risk and odds ratios.
- Risk ratio
5.0 ⁄ 1.0 = 5.0
- Odds ratio
(100 × 7,920) ⁄ (1,900 × 80) = 5.2
Notice that the odds ratio of 5.2 is close to the risk ratio of 5.0. That is one of the attractive features of the odds ratio — when the health outcome is uncommon, the odds ratio provides a reasonable approximation of the risk ratio. Another attractive feature is that the odds ratio can be calculated with data from a case-control study, whereas neither a risk ratio nor a rate ratio can be calculated.
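The comparison between the two measures can be reproduced with a short Python sketch using the cell counts from Table 3.15. The function names are illustrative and not taken from the source.

```python
def risk_ratio(a, b, c, d):
    """Risk in the exposed group divided by risk in the unexposed group."""
    return (a / (a + b)) / (c / (c + d))


def odds_ratio(a, b, c, d):
    """Cross-product ratio: (a x d) / (b x c)."""
    return (a * d) / (b * c)


# Table 3.15: a = 100, b = 1,900, c = 80, d = 7,920
rr = risk_ratio(100, 1900, 80, 7920)
odds = odds_ratio(100, 1900, 80, 7920)

print(f"Risk ratio: {rr:.1f}")
print(f"Odds ratio: {odds:.1f}")
```

Because the disease is uncommon in both groups (5% and 1%), the two values are close. Rerunning the functions with counts in which the disease is common would show the odds ratio drifting further from the risk ratio.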
In a case-control study, investigators enroll a group of case-patients (distributed in cells a and c of the two-by-two table), and a group of non-cases or controls (distributed in cells b and d).
The odds ratio is the measure of choice in a case-control study (see Lesson 1). A case-control study is based on enrolling a group of persons with disease ("case-patients") and a comparable group without disease ("controls"). The number of persons in the control group is usually decided by the investigator. Often, the size of the population from which the case-patients came is not known. As a result, risks, rates, risk ratios or rate ratios cannot be calculated from the typical case-control study. However, you can calculate an odds ratio and interpret it as an approximation of the risk ratio, particularly when the disease is uncommon in the population.
Calculate the odds ratio for the tuberculosis data in Table 3.12. Would you say that your odds ratio is an accurate approximation of the risk ratio? (Hint: The more common the disease, the further the odds ratio is from the risk ratio.)