December 2008 - Volume 6 Issue 10
How to Visualize Public Health Data?
Part one: Box Plot and Map
.........................................................................................................................
Dr. Mohsen Rezaeian (PhD, Epidemiologist, Associate Professor)
Social Medicine Department, Rafsanjan Medical School, Rafsanjan, Iran.

Correspondence:
Dr. Mohsen Rezaeian
Tel: +98 391 5234003
Fax: +98 391 5225209
Email: moeygmr2@yahoo.co.uk

ABSTRACT

Health care professionals, including family physicians, are increasingly involved in public health data analysis. Data visualisation is the first step in data analysis and helps to disclose complex structures within data. The chief aim of the present article, which is the first in a series of two, is to discuss the pros and cons of two methods of data visualisation, i.e. the box plot and the map, using a real public health data example.

Key words: Box plot, Map, Data visualization, Health care professionals.


INTRODUCTION

Health care professionals are increasingly involved in public health data analysis. They either have to analyse public health data themselves or have to use the results of analyses carried out by other health care professionals. Therefore, they have to be familiar with the different methods of public health data analysis. Data visualisation is the first step in data analysis and helps to disclose complex structure in data(1). From this point of view, data visualisation may not only create interest and attract the attention of the viewer but also provide a way of discovering the unexpected(2). In the present article, which is the first in a series of two, the pros and cons of two methods of data visualisation, i.e. the box plot and the map, are discussed using a real public health data example.

 

BOX PLOT

One of the most useful methods of summarising data is to present the lowest value, the lower quartile, the median, the upper quartile and the highest value in a graph called a box plot(3). In this display, the median shows the central value and the distance between the lower and upper quartiles, i.e. the inter-quartile range, shows the variability of the data.

To make this graph, a box is drawn with its ends at the lower and upper quartiles and a crossbar at the median value. Next, a line is drawn from the lower quartile to the lowest value and from the upper quartile to the highest value. To complete the picture, the position of any outliers is also indicated, usually with a circle symbol(3). An observation is treated as an outlier if it lies below the first of the following limits or above the second:

Lower quartile - 1.5 x inter-quartile range and upper quartile + 1.5 x inter-quartile range

The application of the box plot is demonstrated later on using a public health data set.

 

MAP

"From the perspective of public health practice, knowledge that a health problem is concentrated in identifiable places is essential for the efficient distribution of resources for prevention, treatment or amelioration(4)." Therefore, maps are becoming more and more important in public health data analyses.

Attractive and informative disease maps complement any formal statistical analysis of spatial variation and, because of their attractiveness, influence the recipient of the information much more than the associated statistics(5). Maps reveal geographical relations that are not obvious from numerical and tabular data(6).

However, as with any other graphical display, there are a number of principles that one has to follow in order to produce an informative map. For instance, selecting the appropriate administrative boundaries, selecting the appropriate colour scheme or hatching, and selecting an appropriate method of data classification are among the most important issues in mapmaking, and they require careful consideration(5,7).

In the next section, using a real public health data example, I am going to illustrate one of these principles, i.e. selecting an appropriate method of data classification. For the rest of these principles I refer readers to other articles(4,5). It should be noted that classification can be described as systematically grouping data based on one or more characteristics. This should result in a clearer picture and should also improve insight into the data. Research has also revealed that, in order to get an overview of the mapped theme at a single glance, the number of classes should not exceed seven(8).


PUBLIC HEALTH DATA EXAMPLE

The data used in this article come from the results of the Iranian National Demographic Health Survey (DHS), which was conducted in the year 2000(9). The data selected for visualisation are the percentages of people over 15 years with hypertension in the then 28 provinces of Iran (Table 1). From the figures alone, which are presented in ascending order in Table 1, it is very difficult to summarise the data or to visualise any relationship between provinces.

Table 1 The percentage of people over 15 years with hypertension within different provinces of Iran

Iranian province             % of people over 15 years with hypertension
Gom                           7.1
Bushehr                       7.5
Sistan va Baluchestan         7.9
Khuzestan                     8.6
Fars                          8.7
Golestan                      8.7
Semnan                        8.8
Chahar Mahall va Bakhtiar     9.0
Azarbayjan-e-gharbi           9.2
Kordestan                     9.3
Lorestan                     10.4
Kohgiluyeh va Buyer Ahmad    10.7
Ilam                         10.8
Mazandaran                   10.8
Zanjan                       11.2
Khorasan                     11.2
Khorasan                     11.4
Hormozgan                    11.6
Kermanshah                   11.7
Hamadan                      11.7
Kerman                       12.4
Ardabil                      12.5
Tehran                       13.1
Azarbayjan-e-shargi          13.5
Gilan                        15.1
Qazvin                       15.9
Markazi                      18.9
Yazd                         19.3

In order to summarise the data, a box plot was produced (Diagram 1). As mentioned earlier, a number of important summary indices can be read from this graph. For instance, by looking at the graph one can easily see the following summary indices:

Lowest value = 7.10
Lower quartile = 8.85
Median = 11
Upper quartile = 12.47
Highest value = 16.20
Inter-quartile range = 3.62

Diagram 1 Box plot depicting the percentage of people over 15 years with hypertension within different provinces of Iran

One can also easily see that two provinces, i.e. Markazi and Yazd, are identified as outliers because of their high percentages of people over 15 years with hypertension, i.e. 18.9 and 19.3, respectively.
Nevertheless, the box plot is still unable to reveal any relationship between provinces. Therefore, one has to use a map to reveal any such relationship.
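
The box plot indices above can be checked directly from Table 1. The following is a minimal sketch, assuming that the quartiles are interpolated at positions p(n + 1), which is what Python's statistics.quantiles does with method="exclusive"; other software conventions give slightly different quartiles. The names TABLE_1 and box_plot_summary are illustrative only.

```python
from statistics import quantiles

# Table 1: (province, % of people over 15 years with hypertension).
# Province names are copied as printed in the table.
TABLE_1 = [
    ("Gom", 7.1), ("Bushehr", 7.5), ("Sistan va Baluchestan", 7.9),
    ("Khuzestan", 8.6), ("Fars", 8.7), ("Golestan", 8.7), ("Semnan", 8.8),
    ("Chahar Mahall va Bakhtiar", 9.0), ("Azarbayjan-e-gharbi", 9.2),
    ("Kordestan", 9.3), ("Lorestan", 10.4),
    ("Kohgiluyeh va Buyer Ahmad", 10.7), ("Ilam", 10.8),
    ("Mazandaran", 10.8), ("Zanjan", 11.2), ("Khorasan", 11.2),
    ("Khorasan", 11.4), ("Hormozgan", 11.6), ("Kermanshah", 11.7),
    ("Hamadan", 11.7), ("Kerman", 12.4), ("Ardabil", 12.5),
    ("Tehran", 13.1), ("Azarbayjan-e-shargi", 13.5), ("Gilan", 15.1),
    ("Qazvin", 15.9), ("Markazi", 18.9), ("Yazd", 19.3),
]


def box_plot_summary(values):
    """Return the quartiles, inter-quartile range and outlier limits."""
    # Assumption: quartiles are interpolated at positions p * (n + 1).
    q1, median, q3 = quantiles(values, n=4, method="exclusive")
    iqr = q3 - q1
    return {
        "lower_quartile": q1,
        "median": median,
        "upper_quartile": q3,
        "iqr": iqr,
        "lower_limit": q1 - 1.5 * iqr,  # values below this are outliers
        "upper_limit": q3 + 1.5 * iqr,  # values above this are outliers
    }


summary = box_plot_summary([value for _, value in TABLE_1])
outliers = [
    (name, value)
    for name, value in TABLE_1
    if value < summary["lower_limit"] or value > summary["upper_limit"]
]

for key, value in summary.items():
    print(f"{key}: {value:.3f}")
print("outliers:", outliers)
```

Under this convention the sketch gives a lower quartile of 8.85, a median of 11 and an upper quartile of about 12.47, and it flags only Markazi and Yazd as outliers.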

Two maps were therefore produced from the current data, using two acceptable methods of classification. The first method is Quantile, which divides the observations evenly over the chosen number of classes. The name of this method depends on the number of classes: with four classes it is called Quartiles and with five classes, Quintiles(8). The second method is Equal Interval, in which the class width is equal for all classes(8). For each map a white to black colouring scheme has been adopted. According to this scheme, provinces which have a higher percentage of people over 15 years with hypertension are shown in a darker colour and vice versa.

Map 1 depicts a Quintiles classification of the percentage of people over 15 years with hypertension within different provinces of Iran. This map places the 28 provinces of Iran almost evenly in five classes: three classes contain six provinces each, whilst the other two contain five provinces each. On this map five provinces, i.e. Azarbayjan-e-shargi, Gilan, Qazvin, Markazi and Yazd, are shown in black, indicating that they have a high percentage of people over 15 years with hypertension.

Map 1 Map depicting Quintiles classification of the percentage of people over 15 years with hypertension within different provinces of Iran
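
The Quintiles classification behind Map 1 can be sketched as a simple rank-based split of the Table 1 figures. This is an illustration rather than the procedure of any particular mapping software: the function name quantile_classes is hypothetical, and giving the extra provinces to the lowest classes is an assumption, since software packages may distribute them differently.

```python
# Table 1: (province, % of people over 15 years with hypertension).
# Province names are copied as printed in the table.
TABLE_1 = [
    ("Gom", 7.1), ("Bushehr", 7.5), ("Sistan va Baluchestan", 7.9),
    ("Khuzestan", 8.6), ("Fars", 8.7), ("Golestan", 8.7), ("Semnan", 8.8),
    ("Chahar Mahall va Bakhtiar", 9.0), ("Azarbayjan-e-gharbi", 9.2),
    ("Kordestan", 9.3), ("Lorestan", 10.4),
    ("Kohgiluyeh va Buyer Ahmad", 10.7), ("Ilam", 10.8),
    ("Mazandaran", 10.8), ("Zanjan", 11.2), ("Khorasan", 11.2),
    ("Khorasan", 11.4), ("Hormozgan", 11.6), ("Kermanshah", 11.7),
    ("Hamadan", 11.7), ("Kerman", 12.4), ("Ardabil", 12.5),
    ("Tehran", 13.1), ("Azarbayjan-e-shargi", 13.5), ("Gilan", 15.1),
    ("Qazvin", 15.9), ("Markazi", 18.9), ("Yazd", 19.3),
]


def quantile_classes(records, k):
    """Split records, ranked by value, into k classes of near-equal size."""
    ranked = sorted(records, key=lambda record: record[1])
    base, extra = divmod(len(ranked), k)
    classes, start = [], 0
    for i in range(k):
        # Assumption: when the records do not divide evenly, the lowest
        # classes each receive one extra record.
        size = base + (1 if i < extra else 0)
        classes.append(ranked[start:start + size])
        start += size
    return classes


classes = quantile_classes(TABLE_1, 5)
for number, members in enumerate(classes, start=1):
    low, high = members[0][1], members[-1][1]
    print(f"class {number}: {len(members)} provinces, {low} to {high}")
print("darkest class:", [name for name, _ in classes[-1]])
```

With 28 provinces and five classes this gives class sizes of 6, 6, 6, 5 and 5, and the darkest class contains the same five provinces that are shown in black on Map 1.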


Map 2 depicts an Equal Interval classification of the percentage of people over 15 years with hypertension within different provinces of Iran. To produce this map, the lowest percentage, i.e. 7.1, was subtracted from the highest percentage, i.e. 19.3. The resulting figure, i.e. 12.2, was then divided by 5, i.e. the number of classes, which gives 2.44. This means that the width of each class must be set at 2.44. On this map only two provinces, i.e. Markazi and Yazd, are shown in black, indicating that they have a high percentage of people over 15 years with hypertension.

Map 2 Map depicting Equal Interval classification of the percentage of people over 15 years with hypertension within different provinces of Iran
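
The Equal Interval calculation behind Map 2 can be sketched in the same way. The function name equal_interval_classes is hypothetical, and assigning a value that falls exactly on a class limit to the upper class (except for the highest value, which stays in the top class) is an assumption; no figure in Table 1 falls on an inner class limit, so the choice does not affect the result here.

```python
# Table 1: (province, % of people over 15 years with hypertension).
# Province names are copied as printed in the table.
TABLE_1 = [
    ("Gom", 7.1), ("Bushehr", 7.5), ("Sistan va Baluchestan", 7.9),
    ("Khuzestan", 8.6), ("Fars", 8.7), ("Golestan", 8.7), ("Semnan", 8.8),
    ("Chahar Mahall va Bakhtiar", 9.0), ("Azarbayjan-e-gharbi", 9.2),
    ("Kordestan", 9.3), ("Lorestan", 10.4),
    ("Kohgiluyeh va Buyer Ahmad", 10.7), ("Ilam", 10.8),
    ("Mazandaran", 10.8), ("Zanjan", 11.2), ("Khorasan", 11.2),
    ("Khorasan", 11.4), ("Hormozgan", 11.6), ("Kermanshah", 11.7),
    ("Hamadan", 11.7), ("Kerman", 12.4), ("Ardabil", 12.5),
    ("Tehran", 13.1), ("Azarbayjan-e-shargi", 13.5), ("Gilan", 15.1),
    ("Qazvin", 15.9), ("Markazi", 18.9), ("Yazd", 19.3),
]


def equal_interval_classes(records, k):
    """Group records into k classes that all have the same width."""
    values = [value for _, value in records]
    lowest, highest = min(values), max(values)
    width = (highest - lowest) / k  # (19.3 - 7.1) / 5 for Table 1
    classes = [[] for _ in range(k)]
    for name, value in records:
        # The highest value would otherwise fall into a (k + 1)-th class,
        # so the index is capped at the top class.
        index = min(int((value - lowest) / width), k - 1)
        classes[index].append((name, value))
    return width, classes


width, classes = equal_interval_classes(TABLE_1, 5)
print(f"class width: {width:.2f}")
for number, members in enumerate(classes, start=1):
    print(f"class {number}: {len(members)} provinces")
print("darkest class:", [name for name, _ in classes[-1]])
```

For Table 1 the class width is 2.44 and the classes contain 10, 10, 4, 2 and 2 provinces, with only Markazi and Yazd in the darkest class.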


It should be noted that both maps are correct; they look at the problem from different angles. Whilst Map 1 divides the provinces evenly between the classes, Map 2 is more in accordance with the box plot in that it highlights the outliers. Both maps also show that the percentage of people with hypertension tends to be higher in the northern and central provinces of Iran than in the southern provinces.

 

CONCLUSION

Although maps reveal spatial relationships that might not be seen in tables(10), we should not rely on the presentation of a single map(5), because a single map is only one of the large number of maps that might be produced from the same data(11). On the one hand, it has been pointed out that the end point of data visualisation is not necessarily a single 'correct' map(12); on the other hand, it has been argued that it is crucial to ensure that correct rules are applied in the mapping process(13). Furthermore, one should also bear in mind that other graphical displays, such as the box plot, may also help health care professionals to better summarise and visualise their data(5).


REFERENCES
  1. Cleveland WS. Visualising data. Summit, NJ: Hobart Press, 1993.
  2. Everitt BS, Dunn G. Applied multivariate data analysis. London: Arnold, 2001.
  3. Dunn G, Everitt B. Clinical biostatistics. London: Edward Arnold, 1995.
  4. Rezaeian M, Dunn G, St. Leger S, Appleby L. Geographical epidemiology, spatial analysis and geographical information systems: a multidisciplinary glossary. J Epidemiol Community Health 2007; 61: 98-102.
  5. Rezaeian M, Dunn G, St. Leger S, Appleby L. The production and interpretation of disease maps: a methodological case-study. Soc Psychiatry Psychiatr Epidemiol 2004; 39: 947-954.
  6. Parchman ML, Ferrer RL, Blanchard KS. Geography and geographic information systems in family medicine research. Fam Med 2002; 34: 132-137.
  7. Smans M, Esteve J. Practical approach to disease mapping. In: Elliott P, Cuzick J, English D, Stern R. Geographical and environmental epidemiology: methods for small area studies, pp 141-150. Oxford: Oxford University Press, 1996.
  8. Kraak M, Ormeling F. Cartography: visualisation of spatial data. Harlow: Longman, 1996.
  9. National Demographic Health Survey (DHS). Iranian Ministry of Health and Medical Education; 2001.
  10. Bell BS, Broemeling LD. A Bayesian analysis for spatial processes with application to disease mapping. Stat Med 2000; 19: 957-974.
  11. Monmonier M. How to lie with maps. Chicago: The University of Chicago Press, 1996.
  12. Gatrell AC, Bailey TC. Interactive spatial data analysis in medical geography. Soc Sci Med 1996; 42: 843-855.
  13. Cliff AD. Analysing geographically related disease data. Stat Methods Med Res 1995; 4: 93-101.
Disclaimer - ISSN 148-4196 - © Copyright 2007 medi+WORLD International Pty. Ltd. - All rights reserved