How to Visualize Public Health Data?
Part two: Direct and Indirect Standardization Methods
Dr. Mohsen Rezaeian (PhD, Epidemiologist, Associate Professor)
Social Medicine Department, Rafsanjan Medical School, Rafsanjan, Iran.
Correspondence:
Tel: +98 391 5234003
Fax: +98 391 5225209
Email: moeygmr2@yahoo.co.uk
ABSTRACT
Spatial data visualisation is the accurate description of data taking into account the component of space. Although plots such as the box plot are among the fundamental tools for data visualisation in general, maps are the most important tools for visualising spatial data. One necessary step in producing a map is to standardise the rates of disease mortality and morbidity. The aim of the present article, which is the second in a series of two, is to discuss the pros and cons of the two most important methods of standardisation, i.e. the direct and indirect methods, using a hypothetical example.
Key words: Direct standardization, Indirect standardization, Map, Data visualization
Spatial data visualisation is the accurate description of data taking into account the component of space (1). Data visualisation is one of the most important parts of spatial data analysis (2). Although plots such as the box plot are among the fundamental tools for data visualisation in general, maps are the most important tools for visualising spatial data (1).
The use of mapping in the medical context has developed so rapidly (3) that the presentation of maps is now established as a basic tool in the analysis of public health data (4, 5).
However, it should be noted that there are two main classes of disease maps: maps of standardised rates, and maps of the statistical significance of the difference between the disease risk in each area and the overall risk averaged over the entire map (6).
It has been emphasised that mapping standardised rates in small areas might create a misleading picture. On the other hand, mapping statistical significance instead of standardised rates, especially in areas with large populations, might produce small P values that are statistically significant but not scientifically interesting (7).
Given the above facts, it is now generally accepted that standardised rates, rather than P values, should be mapped (8). However, one more important question remains to be answered: what kind of standardisation is the best choice for disease mapping? The aim of the present article is to discuss the pros and cons of the two most important methods of standardisation, i.e. the direct and indirect methods, using a hypothetical example.
DIRECT AND INDIRECT METHODS OF STANDARDISATION
Age and sex influence the risk of most diseases, and comparisons of risk in the form of maps must therefore take them into account. Otherwise, observed differences could be confounded by these variables. As a result, the process of age and sex adjustment has an important role to play in producing disease mortality and morbidity maps. The aim of an adjustment process is to produce a single summary value that is unaffected by differences in age and sex distributions (9).
The two most common approaches to age adjustment are direct and indirect weighting of stratum-specific rates (10). In the direct approach, a weighted average of the age-specific rates from a study population is created, with weights based on the age distribution of a reference population (11).
The corresponding formula is
$$\text{Directly adjusted rate} = \frac{\sum_i N_i \,(d_i / n_i)}{\sum_i N_i}$$
in which capital letters represent values that come from the reference population and small letters represent values from the study population. For instance, $N_i$ denotes the number of people in stratum $i$ of the reference population. Similarly, $n_i$ and $d_i$ respectively represent the number of people and the number of cases in stratum $i$ of the study population. Finally, $\sum$ represents the summation sign, here taken over all strata (11).
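The direct adjustment formula can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `direct_standardised_rate` and the two-stratum numbers are hypothetical and are not taken from the article.

```python
def direct_standardised_rate(cases, person_years, ref_population):
    """Weighted average of the study population's stratum-specific rates
    d_i / n_i, using the reference population sizes N_i as weights."""
    if not (len(cases) == len(person_years) == len(ref_population)):
        raise ValueError("each input needs exactly one value per stratum")
    weighted_sum = sum(
        N * d / n for d, n, N in zip(cases, person_years, ref_population)
    )
    return weighted_sum / sum(ref_population)


# Hypothetical two-stratum example (illustrative numbers only).
rate = direct_standardised_rate(
    cases=[5, 20],               # d_i, study population
    person_years=[1000, 2000],   # n_i, study population
    ref_population=[3000, 1000], # N_i, reference population
)
print(rate)
```

Because the weights come only from the reference population, two study populations with identical stratum-specific rates receive the same adjusted rate, whatever their own age structures.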
It is also possible to obtain an easily interpreted ratio from the directly standardised rate. This is achieved by dividing the number of deaths expected in the reference population, had it experienced the stratum-specific rates of the study population, by the number of deaths observed in the reference population over the same period of time (12). This ratio is termed either the comparative mortality figure (CMF) or, equivalently, the standardised incidence rate ratio (SRR) (13):
$$\mathrm{SRR} = \frac{\sum_i N_i \,(d_i / n_i)}{\sum_i D_i}$$
in which $D_i$ represents the number of cases in stratum $i$ of the reference population. Other symbols in this formula are the same as in the previous one. Furthermore, the approximate standard error (SE) of the SRR can be obtained by using the following formula
$$\mathrm{SE}(\mathrm{SRR}) = \frac{\sqrt{\sum_i N_i^2 \, d_i / n_i^2}}{\sum_i D_i}$$
The skewed distribution of the SRR may make a logarithmic transformation preferable. The approximate standard error for the transformed SRR can be obtained by using the following formula
$$\mathrm{SE}(\ln \mathrm{SRR}) = \frac{\sqrt{\sum_i N_i^2 \, d_i / n_i^2}}{\sum_i N_i \,(d_i / n_i)}$$
and the 95 per cent confidence interval can be taken from (12)
$$95\%\ \mathrm{CI} = \exp\left(\ln(\mathrm{SRR}) \pm 1.96 \times \mathrm{SE}(\ln(\mathrm{SRR}))\right)$$
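As a minimal sketch of the SRR and its log-scale confidence interval, the following Python function assumes the usual approximation in which only the study population's case counts $d_i$ are treated as random. The function name `srr_with_ci` and the example numbers are hypothetical.

```python
import math


def srr_with_ci(d, n, N, D, z=1.96):
    """Return (SRR, SE of ln SRR, (lower, upper)) for stratum-wise inputs.

    d, n: cases and person-years in the study population
    N, D: person-years and cases in the reference population
    """
    # Cases expected in the reference population under the study rates.
    expected_in_ref = sum(Ni * di / ni for di, ni, Ni in zip(d, n, N))
    srr = expected_in_ref / sum(D)
    # SE(ln SRR) = sqrt(sum N_i^2 d_i / n_i^2) / sum N_i d_i / n_i
    se_log = math.sqrt(
        sum(Ni ** 2 * di / ni ** 2 for di, ni, Ni in zip(d, n, N))
    ) / expected_in_ref
    lower = math.exp(math.log(srr) - z * se_log)
    upper = math.exp(math.log(srr) + z * se_log)
    return srr, se_log, (lower, upper)


# Hypothetical two-stratum example (illustrative numbers only).
srr, se_log, (lower, upper) = srr_with_ci(
    d=[5, 20], n=[1000, 2000], N=[3000, 1000], D=[6, 4]
)
print(f"SRR = {srr:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```

The interval is computed on the logarithmic scale and then exponentiated, so it is asymmetric around the SRR and can never fall below zero.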
In order to adjust the rate using the indirect method, the crude rate in the study population is multiplied by a ratio known as the standardised mortality ratio (SMR) (11). The SMR is given by dividing the observed number (O) by the expected number (E) of cases:
$$\mathrm{SMR} = \frac{O}{E} = \frac{\sum_i d_i}{E}$$
and the expected number of cases is given by:
$$E = \sum_i n_i \,\frac{D_i}{N_i}$$
using the symbols as previously described (11). The approximate standard error (SE) of the SMR is given by
$$\mathrm{SE}(\mathrm{SMR}) = \frac{\sqrt{O}}{E}$$
As with the SRR, it may be better to use the log-transformed SMR to take into account its skewed distribution. The approximate standard error for the transformed SMR can be obtained by
$$\mathrm{SE}(\ln \mathrm{SMR}) = \frac{1}{\sqrt{O}}$$
and the 95 per cent confidence interval can be taken from (12)
$$95\%\ \mathrm{CI} = \exp\left(\ln(\mathrm{SMR}) \pm 1.96 \times \mathrm{SE}(\ln(\mathrm{SMR}))\right)$$
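The indirect method can be sketched in the same way. The function below is an illustration only: the name `smr_with_ci` and the example numbers are hypothetical, and the standard error uses the common approximation in which the observed count is treated as Poisson and the expected count as fixed.

```python
import math


def smr_with_ci(d, n, N, D, z=1.96):
    """Return (SMR, SE of ln SMR, (lower, upper)) for stratum-wise inputs.

    d, n: cases and person-years in the study population
    N, D: person-years and cases in the reference population
    """
    observed = sum(d)
    if observed <= 0:
        raise ValueError("the log-scale interval needs at least one case")
    # Cases expected in the study population under the reference rates.
    expected = sum(ni * Di / Ni for ni, Ni, Di in zip(n, N, D))
    smr = observed / expected
    se_log = 1 / math.sqrt(observed)
    lower = math.exp(math.log(smr) - z * se_log)
    upper = math.exp(math.log(smr) + z * se_log)
    return smr, se_log, (lower, upper)


# Hypothetical two-stratum example (illustrative numbers only).
smr, se_log, (lower, upper) = smr_with_ci(
    d=[5, 20], n=[1000, 2000], N=[3000, 1000], D=[6, 4]
)
print(f"SMR = {smr:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```

Note that the weights here are the study population's own person-years $n_i$, which is why SMRs from areas with different age structures are not weighted on a common basis.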
Direct and indirect adjustment techniques apply equally to adjustment by factors other than age, alone or in combination with age. For instance, one might stratify by both sex and age to derive sex- and age-adjusted rates, so that the comparison can be made without concern for confounding by these factors.
The pros and cons of direct and indirect methods of standardisation
It has been argued that if the age distributions of two regions differ, the comparison of their SMRs may suffer from a bias comparable to statistical confounding. Take, for example, the following two hypothetical regions (Table 1).
Table 1. The demographic characteristics of two hypothetical regions.
| Age band | Region one: deaths | Region one: person-years | Region two: deaths | Region two: person-years |
|----------|--------------------|--------------------------|--------------------|--------------------------|
| 15-29    | 2                  | 2500                     | 1                  | 1250                     |
| 30-44    | 3                  | 1500                     | 2                  | 1000                     |
| 45-59    | 6                  | 1000                     | 9                  | 1500                     |
| 60+      | 10                 | 500                      | 50                 | 2500                     |
It should be noted that the age-specific rates are the same in both regions, so the stratum-specific incidence rate ratios are all equal to 1. Therefore, the SMR of region one relative to region two is also equal to 1, as are all other weighted incidence rate ratios. However, when one compares each of these two regions with a large reference population, one finds two different SMRs. For instance, take the following hypothetical reference population (Table 2).
Table 2. The demographic characteristics of the hypothetical reference population.
| Age band | Deaths | Person-years |
|----------|--------|--------------|
| 15-29    | 50     | 500,000      |
| 30-44    | 100    | 1,000,000    |
| 45-59    | 150    | 1,000,000    |
| 60+      | 150    | 1,500,000    |
Based on this reference population, the SMR for region one equals 70, while for region two it equals 177.14. However, the directly standardised rate ratios for the two regions are identical and equal 42.66. Therefore, when regions are compared with an external reference population, the SMR yields different rate ratios for regions with different demographic structures, even though the incidence rates within strata are identical (14).
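The comparison can be checked with a short Python sketch that applies the SMR and SRR formulas to the counts in Tables 1 and 2 exactly as printed. The function and variable names are illustrative assumptions, not part of the original analysis.

```python
# (deaths, person-years) per age band, in the order 15-29, 30-44, 45-59, 60+.
region_one = [(2, 2500), (3, 1500), (6, 1000), (10, 500)]
region_two = [(1, 1250), (2, 1000), (9, 1500), (50, 2500)]
reference = [(50, 500_000), (100, 1_000_000), (150, 1_000_000), (150, 1_500_000)]


def smr(study, ref):
    """Observed study cases over cases expected under the reference rates."""
    observed = sum(d for d, _ in study)
    expected = sum(n * D / N for (_, n), (D, N) in zip(study, ref))
    return observed / expected


def srr(study, ref):
    """Cases expected in the reference population under the study rates,
    over the cases observed in the reference population."""
    expected_in_ref = sum(N * d / n for (d, n), (_, N) in zip(study, ref))
    return expected_in_ref / sum(D for D, _ in ref)


for name, region in (("region one", region_one), ("region two", region_two)):
    print(f"{name}: SMR = {smr(region, reference):.2f}, "
          f"SRR = {srr(region, reference):.2f}")
```

With the tables as printed, this sketch gives SMRs of 35.00 and 88.57 and a common SRR of 85.33. Each of these differs by a factor of two from the corresponding figure quoted in the text, which suggests that either the tables or the quoted figures were altered in reproduction, but the pattern is the same: the SMRs differ between the two regions while the SRRs are identical.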
It should also be noted that directly adjusted rates have their own problems. For instance, in this approach the standard error depends on variation in the age-specific numbers of cases rather than in the total number of cases, which may provide less stable estimates. As a result, the standard error is generally larger than that of indirectly adjusted rates (12). Nevertheless, this advantage of the SMR is easily outweighed by its disadvantage in terms of validity (13).
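The difference in precision can be illustrated with the hypothetical regions of Tables 1 and 2. The sketch below compares the approximate log-scale standard errors, assuming the usual approximations SE(ln SMR) = 1/sqrt(O) and SE(ln SRR) = sqrt(sum N_i^2 d_i / n_i^2) / sum N_i d_i / n_i; the names used are illustrative.

```python
import math

# Reference person-years N_i from Table 2, one value per age band.
reference_py = [500_000, 1_000_000, 1_000_000, 1_500_000]
# (deaths, person-years) per age band from Table 1.
regions = {
    "region one": [(2, 2500), (3, 1500), (6, 1000), (10, 500)],
    "region two": [(1, 1250), (2, 1000), (9, 1500), (50, 2500)],
}


def se_log_smr(study):
    """Depends only on the total number of observed cases."""
    return 1 / math.sqrt(sum(d for d, _ in study))


def se_log_srr(study, ref_py):
    """Depends on how the cases are spread across the age bands."""
    numerator = math.sqrt(
        sum(N ** 2 * d / n ** 2 for (d, n), N in zip(study, ref_py))
    )
    denominator = sum(N * d / n for (d, n), N in zip(study, ref_py))
    return numerator / denominator


for name, study in regions.items():
    print(f"{name}: SE(ln SMR) = {se_log_smr(study):.3f}, "
          f"SE(ln SRR) = {se_log_srr(study, reference_py):.3f}")
```

For both regions the standard error of ln(SRR) comes out larger than that of ln(SMR), most clearly for region one (about 0.257 against 0.218), where few person-years fall in the age band that carries the most weight in the reference population.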
Finally, it is usual to use the national population as the standard population when summarising age- and sex-specific rates for geographical regions within a country (15).
Based on the above discussion, it has been concluded that morbidity and mortality maps can be misleading when based on indirectly adjusted rates or a function of them. Therefore, the use of the direct method of age adjustment for mapping purposes, accompanied by an examination of age-specific rate patterns, is recommended (16).
REFERENCES
1. Bailey TC, Gatrell AC. Interactive spatial data analysis. Harlow: Longman, 1995.
2. Rezaeian M, Dunn G, St. Leger S, Appleby L. Geographical epidemiology, spatial analysis and geographical information systems: a multidisciplinary glossary. J Epidemiol Community Health 2007; 61: 98-102.
3. Cliff AD. Analysing geographically related disease data. Stat Methods Med Res 1995; 4: 93-101.
4. Lawson AB, Bohning D, Biggeri A, Lesaffre E, Viel JF. Disease mapping and its uses. Disease mapping and risk assessment for public health. Chichester: John Wiley and Sons, 1999.
5. MacMahon B, Trichopoulos D. Epidemiology principles and methods. USA: Little Brown and Company, 1996.
6. Clayton D, Bernardinelli L. Bayesian methods for mapping disease risk. In Elliott P, Cuzick J, English D, Stern R. Geographical and environmental epidemiology - methods for small area studies, pp 181-204. Oxford: Oxford University Press, 1996.
7. Bithell JF. Geographical analysis. In Armitage P, Colton T. International encyclopaedia of biostatistics, pp 1701-1716. Chichester: John Wiley, 1998.
8. Cartwright RA, Alexander FE, McKinney PA, Ricketts TJ. Leukaemia and lymphoma: an atlas of distribution within areas of England and Wales 1984-1988. Leeds: Leukaemia Research Fund, 1990.
9. Selvin S. Statistical analysis of epidemiological data. Oxford: Oxford University Press, 1996.
10. Rezaeian M, Dunn G, St. Leger S, Appleby L. The production and interpretation of disease maps: A methodological case-study. Soc Psychiatry Psychiatr Epidemiol 2004; 39: 947-954.
11. Gerstman BB. Epidemiology kept simple. An introduction to classic and modern epidemiology. USA: Wiley-Liss, 1998.
12. Breslow N, Day N. The design and analysis of cohort studies, volume 2. IARC Scientific Publication No. 82, International Agency for Research on Cancer: Lyon, 1987; 65-73.
13. Julious SA, Nicholl J, George S. Why do we continue to use standardised ratios for small area comparisons? J Public Health Med 2001; 23: 40-46.
14. Rezaeian M. Spatial epidemiology of suicide in England and Wales. PhD Thesis. University of Manchester, 2002.
15. Inskip H. Standardisation methods. In Armitage P, Colton T. International encyclopaedia of biostatistics, pp 4237-4250. Chichester: John Wiley, 1998.
16. Pickle L, White AA. Effects of the choice of age-adjustment method on maps of death rates. Stat Med 1995; 14: 615-627.