Shelley Harris, Ph.D. Associate Professor Department of Epidemiology and Community Health & Center for Environmental Studies saharris@vcu.edu
Outline
Introduction to epidemiology
Study designs and measures of risk in epidemiology
Statistical Analysis programs
Epidemiological Analysis programs
An example calculation
Some cautions
Shelley A. Harris
Digitally signed by Shelley A. Harris DN: CN = Shelley A. Harris, C = US, O = VCU Reason: I am the author of this document Date: 2005.02.02 16:42:43 -05'00'
Introduction to Epidemiology
Epidemiology is the study of patterns of disease occurrence and other health-related conditions in human populations and of the factors that influence these occurrences and conditions.
[Figure: chart with categories including Cirrhosis of Liver, HIV Infection, Suicide, and Homicide]
Estimated deaths
2% Melanoma of skin
2% Oral
33% Lung
4% Pancreas
3% Stomach
10% Colon & Rectum
13% Prostate
5% Urinary
8% Leukemia & Lymphomas
20% All others
[Figure legend: Melanoma of skin, Oral, Breast, Lung, Pancreas, Colon & Rectum, Ovary, Uterus, Urinary, Leukemia & Lymphomas, All others]
A report from the National Cancer Institute (NCI) estimates that about 1 in 8 women in the United States (approximately 12.8 percent) will develop breast cancer during their lifetime.
Breast Cancer
International incidence rates (per 100,000 women)
Rates: 129.5, 108.8, 108.6, 106.8, 94.3, 84.3, 76.4, 60.1, 37.0
Countries: Sweden, US, Italy, Netherlands, United Kingdom, France, Germany, Spain, Japan
[Figure: two panels of country data; axis label: Rate/10000]
Panel 1: Holland 18, 40; England 17, 35; France 15, 30; Canada 10, 26; United States 7, 24
Panel 2: Holland 15, 40; England 17, 35; France 18, 30; Canada 20, 26; United States 25, 24
Popular Hypotheses:
Exposure to synthetic environmental estrogens is related to:
1) increases in breast and prostate cancer over time
2) differences in rates of breast and prostate cancer between countries
3) decreases in sperm quality/quantity observed over the last 50 years
Exposure to natural environmental estrogens accounts for:
1) differences in rates of breast and prostate cancer between countries and differences in heart disease
Lifestyle factors
Diet (high fat)?
High alcohol consumption??
Smoking, passive smoke???
Analytic Studies
To test hypotheses, it is necessary to conduct analytic epidemiological studies. Analytic studies can be divided into two main types:
1) Experimental studies -> Clinical trials
2) Observational studies
[Figure: classification of study designs]
Controlled Assignment: Experimental Studies (Randomized Assignment; Community Trials)
Uncontrolled Assignment: Observational Studies
Sampling with Regard to Exposure, Characteristic, or Cause: Prospective studies (cohort, case-cohort)
Cross-sectional Studies
Case-Control Studies
Subjects are selected into a study based on their disease status.
Sometimes called retrospective studies or case-referent studies.
The most common type of epidemiologic study.
Case-Control Studies
[Figure: timeline marked 1960, 2001, and 2030, with the retrospective and prospective directions labeled]
Odds of being a case if you are exposed = a/b Odds of being a case if you are not exposed = c/d
Breast Cancer                        Totals
Yes              499       19          518
No               462       56          518
Totals           961       75         1036
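The odds defined above can be combined into an odds ratio for this table. The following is a minimal Python sketch, not a definitive analysis: it assumes the first data column is the exposed group and the second the unexposed group (the column labels did not survive in the table), and the function name `odds_ratio` is illustrative.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio from a 2x2 table.

    a = exposed cases,   b = exposed non-cases,
    c = unexposed cases, d = unexposed non-cases.
    """
    odds_exposed = a / b      # odds of being a case if exposed
    odds_unexposed = c / d    # odds of being a case if not exposed
    return odds_exposed / odds_unexposed


# Assumption: column 1 of the breast cancer table is the exposed group.
a, b = 499, 462   # exposed: breast cancer yes, breast cancer no
c, d = 19, 56     # unexposed: breast cancer yes, breast cancer no

print(round(odds_ratio(a, b, c, d), 2))  # 3.18
```

Under that assumption, the odds of being a case are roughly three times higher among the exposed than among the unexposed.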
Cohort Studies
Considered a natural experiment.
Called follow-up studies, incidence studies, or longitudinal studies.
1) Prospective cohort
2) Retrospective cohort
Cohort Studies
[Figure: timeline marked 1960, 2005, and 2030, with the retrospective and prospective directions labeled. In 2005, select exposed and non-exposed groups.]
Measures of Risk
              Exposed   Not exposed   Totals
Disease          a           c          N1
No Disease       b           d          N2
Totals           M1          M2         N

Relative risk = (Rate of disease in exposed) / (Rate of disease in non-exposed) = (a/M1) / (c/M2)
Breast cancer    Exposed   Not exposed   Totals
Yes                  20          5           25
No                 9980       9995        19975
Totals            10000      10000        20000
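The relative risk for this table can be computed directly from the formula (a/M1) / (c/M2). The sketch below is in Python and assumes the first data column is the exposed group, following the layout of the Measures of Risk table; the function name `relative_risk` is illustrative.

```python
def relative_risk(a, c, m1, m2):
    """Relative risk = (a / M1) / (c / M2).

    a = diseased among exposed,     m1 = total exposed,
    c = diseased among non-exposed, m2 = total non-exposed.
    """
    rate_exposed = a / m1       # rate of disease in exposed
    rate_unexposed = c / m2     # rate of disease in non-exposed
    return rate_exposed / rate_unexposed


# Assumption: column 1 of the table is the exposed group.
rr = relative_risk(a=20, c=5, m1=10000, m2=10000)
print(round(rr, 2))  # 4.0
```

Here the rate in the exposed is 20 per 10,000 and the rate in the non-exposed is 5 per 10,000, giving a relative risk of 4.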
SAS is a large, general-purpose package descended from an original program that was designed to run on mainframe computers in a "batch" mode, i.e., by the user submitting a batch of commands and then getting a pile of results in a separate output file (or window, now that Windows and Mac versions are available). Along with a slightly complicated approach to data management, this makes the program harder to learn, and compared with SPSS there is less capability to learn by experiment using menus. On the other hand, the data processing capabilities are extremely powerful and the range of statistical procedures is wide.
University of Melbourne
SPSS is a well-known package, particularly popular in the social sciences and psychology. It is a very large and somewhat cumbersome program but also very powerful and capable of performing almost all the standard methods of analysis. Recent Windows versions have a convenient user interface, but it can still be hard to keep track of exactly what you've done. The menu-based interface makes it relatively easy to learn, at least for simple applications.
University of Melbourne
S-PLUS is a program for specialist statisticians only. It is an interactive, object-oriented system, with both a wide range of built-in functions and complete programming capabilities for extending these. Probably its most useful feature for us is an extremely powerful and relatively easy-to-use capacity for graphics.
University of Melbourne
Epi Info
Latest Version: Epi Info Version 3.3
With Epi Info and a personal computer, epidemiologists and other public health and medical professionals can rapidly develop a questionnaire or form, customize the data entry process, and enter and analyze data. Epidemiologic statistics, tables, graphs, and maps are produced with simple commands such as READ, FREQ, LIST, TABLES, GRAPH, and MAP. Epi Map displays geographic maps with data from Epi Info. A new version, Epi Info for Windows, retains many features of the familiar Epi Info for DOS, while offering Windows ease-of-use strengths such as point-and-click commands, graphics, fonts, and printing.
http://www.cdc.gov/epiinfo/
Egret
Egret software is a statistical package that specializes in offering modeling and graphics capabilities to investigators conducting epidemiological and biomedical studies. Egret is a user-friendly statistical package for epidemiologists.
Comprehensive Set of Models: Many Not Available Elsewhere
Contingency Tables
Logistic Regression
Conditional Logistic Regression*
Logistic Regression with Random Effects
Beta-Binomial Regression
Poisson Regression
Weibull Regression
Exponential Regression
Cox Proportional Hazards Regression
Cox Regression with Time-Dependent Covariates
Kaplan-Meier Analysis and Plots
Extensive Post-Fit Analysis with Plots, Including Delta-Betas, and Hazard Functions
Plus a new spreadsheet-based data editor and a statistical scratchpad
* Unlike other epidemiology software, Egret permits the case/control ratio to vary over strata without using an approximation for the conditional likelihood function.
                                 True Difference
Conclusion of Statistical Test   Present                                         Absent
Different                        Correct (true positive) (1-β = Power)           Incorrect: Type I (α) error (false positive)
Not Different                    Incorrect: Type II (β) error (false negative)   Correct (true negative)
[Figure: distributions under Ho for n=100 and n=1000, with alpha, beta, and power marked]
Sample Size
$$n = \frac{\left(1 + \frac{1}{k}\right)\,\bar{p}\,\bar{q}\,\left(Z_{\alpha/2} + Z_{\beta}\right)^2}{\left(p_1 - p_2\right)^2}$$
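The sample size formula can be evaluated with a few lines of Python. This is a sketch under stated assumptions, since the slides do not define the symbols: n is taken as the size of the first group, k as the ratio of the second group's size to the first (for example, controls per case), p1 and p2 as the proportions being compared, p-bar as the weighted average (p1 + k*p2)/(1 + k), and q-bar as 1 - p-bar. The function name and default values are illustrative.

```python
from math import ceil
from statistics import NormalDist


def sample_size(p1, p2, k=1.0, alpha=0.05, power=0.80):
    """n = (1 + 1/k) * p_bar * q_bar * (Z_{alpha/2} + Z_beta)^2 / (p1 - p2)^2."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided Z_{alpha/2}
    z_beta = NormalDist().inv_cdf(power)            # Z_beta, where power = 1 - beta
    p_bar = (p1 + k * p2) / (1 + k)                 # assumption: weighted average
    q_bar = 1 - p_bar
    n = (1 + 1 / k) * p_bar * q_bar * (z_alpha + z_beta) ** 2 / (p1 - p2) ** 2
    return ceil(n)                                  # round up to whole subjects


# Hypothetical inputs: 30% versus 20%, equal group sizes, alpha 0.05, power 0.80.
print(sample_size(0.30, 0.20))  # 295
```

With these hypothetical inputs the formula calls for about 295 subjects in the first group. Raising k lowers that number at the cost of a larger second group.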
CRAP Detector #1
Beware the large sample size.
Effects can be statistically significant and biologically inconsequential.
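The large-sample caution can be illustrated numerically. The Python sketch below uses a pooled two-proportion z-test, chosen here purely for illustration (the slides do not name a test), and hypothetical proportions of 51% and 50%. The difference stays the same while the number of subjects per group grows.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_p_value(p1, p2, n):
    """Two-sided p-value for a pooled two-proportion z-test with n subjects per group."""
    p_bar = (p1 + p2) / 2                     # pooled proportion (equal group sizes)
    se = sqrt(2 * p_bar * (1 - p_bar) / n)    # standard error of the difference
    z = abs(p1 - p2) / se
    return 2 * (1 - NormalDist().cdf(z))


# Same one-percentage-point difference, increasing sample size.
for n in (100, 10_000, 1_000_000):
    print(n, two_proportion_p_value(0.51, 0.50, n))
```

The p-value falls below 0.05 only at the largest sample size, even though the size of the effect has not changed.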
CRAP Detector #2
Beware the small sample size.
It is hard to find significant differences and no difference means nothing.
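The small-sample caution can be made concrete by estimating power. The Python sketch below uses a normal approximation for comparing two proportions with equal group sizes; the approximation, the function name, and the proportions (30% versus 20%) are assumptions made for illustration.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(p1, p2, n, alpha=0.05):
    """Approximate power of a two-sided test of two proportions, n subjects per group."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    p_bar = (p1 + p2) / 2
    se = sqrt(2 * p_bar * (1 - p_bar) / n)    # standard error of the difference
    # Probability that the test statistic exceeds the critical value
    # when the true difference is p1 - p2.
    return NormalDist().cdf(abs(p1 - p2) / se - z_alpha)


# Same true difference, small versus larger study.
for n in (20, 100, 300):
    print(n, round(approx_power(0.30, 0.20, n), 2))
```

With 20 subjects per group the chance of detecting this difference is low, so a non-significant result says little about whether a real difference exists.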
Some Thoughts
Garbage in, garbage out.
Consult the biostatistician and the epidemiologist.
We charge by the hour.