Associate Professor of Epidemiology & Biostatistics at Drexel University Dornsife School of Public Health,
and Adjunct Associate Professor of Medicine at Drexel University College of Medicine, Philadelphia, PA.
Co-Chair, APHA Epidemiology Sections Cardiovascular Disease and Diabetes Research Interest Group
FOREWORD
PREFACE
ACKNOWLEDGMENTS
CHAPTER 1 INTRODUCTION (15 pages)
Cardiology (Clinical)
Heart Failure Clinic
Research Methods
Study Design
Databases
Primary data collection
Electronic health records
Big Data
Data Analysis
Data Interpretation
Cardiology (from Greek kardia, "heart," and -logia, "study") is a branch of medicine
dealing with disorders of the heart as well as parts of the circulatory system. The field includes the medical
diagnosis and treatment of congenital heart defects, coronary artery disease, heart failure, valvular
heart disease, and electrophysiology.1 Clinical cardiology today covers several standard
and emerging fields, including, but not limited to, acute coronary syndromes, anticoagulation management,
arrhythmias, cardiac surgery, cardio-oncology, congenital heart disease and pediatric cardiology,
diabetes and cardio-metabolic disease, dyslipidemia, geriatric cardiology, heart failure, and
cardiomyopathies. The epidemiology of cardiology covers almost all of these aspects,
but focuses on etiological studies, prevention, quality of life, quality of care, and health promotion at
population and community levels. Of these disease conditions, heart failure has become one of the new epidemics.
Preventive cardiology addresses clinical practice and preventive medicine of cardiology in patients at
individual and population levels. The goal of preventive cardiology is to prevent cardiovascular disease
and to reduce the burden of cardiovascular disease in populations, and improve quality of life and life
expectancy in individuals with cardiovascular disease. In a clinical context, it mainly focuses on (1)
patients with established atherosclerotic disease, (2) people with subclinical status of cardiovascular
disease, and (3) people at high multifactorial risk of developing cardiovascular disease. Most work is
done by clinical physicians in prevention and clinical medicine, nurses, social workers and other
healthcare professionals. Today, a number of hospitals have preventive cardiology clinics and programs
in the U.S. and other countries. Most countries have preventive cardiology professional societies. For
example, the mission of the American Society of Preventive Cardiology is to promote the prevention of
cardiovascular disease, advocate for the preservation of cardiovascular health, and disseminate high-
quality, evidence-based information through the education of healthcare clinicians and their patients.8
Similarly, the mission of the European Society of Preventive Cardiology is to reduce the burden of
cardiovascular disease.9
Epidemiology, as a discipline of medicine, has existed for over 150 years. Most agree that
modern epidemiology began in the 1850s in the U.K., with the control of a cholera outbreak in central
London using a systematic, data-driven approach to examine the pattern of the disease's
distribution and its possible cause, and to control and prevent the disease at the individual and population levels.
<<Insert John Snow Picture>>
Over the past six to seven decades, the leading causes of death in populations have shifted to non-communicable
diseases, and non-communicable (or chronic) disease epidemiology has become one of the most
important areas of study and practice in healthcare and medical education.11 The principles of epidemiology,
including study designs, data-driven hypothesis testing, causal and association analyses, and disease
outcome studies, play a fundamental role in medicine and public health. By the definition endorsed by the
International Epidemiological Association, epidemiology is "the study of the occurrence and distribution
of health-related events, states, and processes in specified populations, including the study of the
determinants influencing such processes, and the application of this knowledge to control relevant
health problems."12 What is noteworthy about this definition is that it includes both a description of the
content of the discipline and the purpose or application for which epidemiological investigations are
carried out.13 The distribution of health-related states or events in specified populations can be outlined
by three Ws: Who, When, and Where. We used data from the U.S. Centers for Disease Control and
Prevention (CDC) Wide-ranging Online Data for Epidemiologic Research (CDC WONDER) to demonstrate these three Ws.
1. Who gets the disease?
Example: Figure 1.1 shows age-specific mortality rates (per 100,000) from heart failure in people aged 35 and
older (Figure 1.1A) and in those aged 65 and older (Figure 1.1B) by sex in the U.S., 2013-2015. It indicates
that males have a higher risk of mortality from heart failure than females in each age group (35-44,
45-54, 55-64, 65-74, 75-84, and 85+ years; Figure 1.1A), and that mortality increases from age
65 onward, with an especially sharp increase in people aged 85 and older among both males and females (Figure 1.1B).
<<Figure 1.1: Panel A, mortality rates (per 100,000) by age group (35-44 through 85+) for males and females; Panel B, rates for ages 65-74, 75-84, and 85+ by sex>>
Figure 1.1 Age-adjusted mortality (per 100,000) from heart failure by age and sex in U.S. 2013-2015
2. When did the disease change its pattern or when did the disease occur?
Example: Figure 1.2 depicts the trend of age-adjusted mortality (per 100,000) from heart failure in White
and African Americans aged 35 and older from 1999 to 2015 in the U.S. It suggests a decreasing
trend from 1999 to 2011. However, this trend reversed after 2011: mortality from
heart failure increased from 2011 to 2015 in both White and African Americans in the U.S.
Example: Figure 1.3 shows age-adjusted mortality rates in patients with heart failure aged 35 and older by
state in the U.S., 2013-2015. It depicts large variations in mortality rates from heart failure
across the states. Of the top five, the state of Mississippi had the highest age-adjusted mortality rate
(85 per 100,000 population), followed by Alabama (84.7 per 100,000 population), Louisiana (71.2 per
100,000 population), Utah (68.4 per 100,000 population), and Georgia (59.3 per 100,000 population) in
2013-2015.
Figure 1.3 Age-adjusted mortality rates (per 100,000 population) from heart failure
in people aged 35 and older by states in U.S. 2013-2015
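Age-adjusted rates such as those in Figures 1.1-1.3 are typically computed by direct standardization: each age-specific rate is weighted by the share of a standard population in that age group. The sketch below illustrates the calculation; the age-specific rates and standard-population counts are made-up illustrative numbers, not actual CDC WONDER values.

```python
def direct_age_adjusted_rate(age_specific_rates, standard_pop):
    """Directly standardized rate: age-specific rates weighted by the
    standard population's age distribution."""
    total = sum(standard_pop)
    return sum(r * p for r, p in zip(age_specific_rates, standard_pop)) / total

# Hypothetical age-specific HF mortality rates (per 100,000) for ages
# 35-44, 45-54, 55-64, 65-74, 75-84, 85+ -- illustrative only.
rates = [2.0, 5.0, 15.0, 60.0, 200.0, 700.0]
# Hypothetical standard population counts for the same age groups.
std_pop = [40000, 35000, 30000, 20000, 10000, 5000]

print(round(direct_age_adjusted_rate(rates, std_pop), 1))  # 52.9 per 100,000
```

Because the weights come from a fixed standard population, adjusted rates from populations with different age structures (e.g., different states or years) can be compared fairly.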
Regarding epidemiological research stages, epidemiology can be divided into two stages: stage one, called
descriptive epidemiology (i.e., the basics of the three Ws), and stage two, called analytical epidemiology. In
descriptive epidemiology, investigators describe the burden of disease and/or risk factors and generate
research questions for further study through analytical epidemiology, which aims to answer Why.
Cardiovascular disease epidemiology is a branch of epidemiology and public health. It involves
understanding the causes of these diseases, identifying means of prevention, and monitoring
populations to assess the changing burden of these diseases and the measurable impact of interventions.
Cardiovascular disease is a group of disorders of the heart and blood vessels, which include six major
forms of diseases: coronary heart disease, stroke, heart failure, peripheral arterial disease, rheumatic
heart disease, and congenital heart disease. It should be noted that by the definition of cardiovascular
disease and the criteria of the International Classification of Diseases (ICD), hypertension is one of the
forms of cardiovascular disease, although hypertension is most commonly studied as a risk factor for the
other forms of cardiovascular disease.
The U.S. Framingham Heart Study (FHS), initiated in 1948, is one of the earliest cardiovascular
epidemiological studies focusing on the causes, distribution, and prevention of cardiovascular
disease at the population level.16 Since the 1970s and 1980s, a number of regional, national, and
international cardiovascular disease epidemiological studies have been carried out, for example the
World Health Organization (WHO) MONICA Study, the Atherosclerosis Risk in Communities (ARIC) Study,
the Cardiovascular Health Study (CHS), the WHO-coordinated Cardiovascular Disease Alimentary
Comparison (WHO-CARDIAC) Study, and more recently the Jackson Heart Study and the Multi-Ethnic
Study of Atherosclerosis (MESA) since the early 2000s.17-23
1.1.2 Significance
Cardiovascular diseases are the leading cause of morbidity and mortality in the U.S. In 2010, cardiovascular
disease was a primary diagnosis for more than 7.8 million inpatient hospital discharges and 4.6 million
emergency department encounters. Although cardiovascular disease death rates have declined over the
past decades, cardiovascular disease remains the leading cause of death. It accounted for 30.6% of all
deaths in the U.S. in 2014, with heart disease still the first and stroke the fourth leading cause of
death.24,25 Moreover, of the major forms of cardiovascular disease, heart failure is the only one for
which the incidence, prevalence, and mortality continue to increase.
1.2 Epidemiology of Heart Failure: New Insights Into Research and Prevention
Heart failure, one of the major forms of cardiovascular disease, is a complex clinical syndrome that can
result from any structural or functional cardiac disorder that impairs the ability of the ventricle to fill
with or eject blood. Patients with heart failure may have a thick and stiff heart that cannot fill properly,
and/or a weakened heart that cannot pump out enough blood.26
The unique aspects of the epidemiology of heart failure in research and healthcare practice include, but
are not limited to, the following:
The burden of the disease has increased dramatically, with special impacts on aging and
underserved populations
Differences and changes in the spectrum of high blood pressure, hypertrophy, and
ventricular systolic and diastolic dysfunction and outcomes vary by age, sex, race/ethnicity,
and region
The complexity of the disease in combination with renal dysfunction, diabetes mellitus, and lung disease
The complexity of the disease with multiple medications and drug interactions.
1.2.2 Significance
Since the mid-1990s, heart failure has emerged as a significant public health concern because of its
Figure 1.4 shows that of the three major forms of cardiovascular disease, the prevalence of
hospitalization from heart failure in patients aged 65 and older who had a primary diagnosis of the
disease has increased.
HF represents a new epidemic of CVD, affecting about 5.7 million Americans aged 20 and older (NHANES
2009-2012). It is estimated that about 915,000 new HF cases occur annually according to data from ARIC
2002-2012 24. In 2013, HF any-mention mortality was 300,122 (140,126 males and 159,996 females) and
HF was the underlying cause in 65,120 of those deaths 24. In contrast to other CVDs, the prevalence,
incidence, and mortality from HF are increasing, and prognosis remains poor 4,5,27. Multiple comorbidities
in patients with heart failure raise even more complicated issues in both etiological studies and healthcare
practice. Figure 1.5 shows that the prevalence of comorbid coronary heart disease (CHD), chronic
obstructive pulmonary disease (COPD), and renal failure in patients with heart failure has increased in
recent decades in both sexes, except for a slight decrease in comorbid cerebrovascular disease
(CBVD).
<<Figure 1.5: Prevalence (%) of comorbid CHD, COPD, DM, renal failure, and CBVD by period (1980-84, 1985-89, 1990-94, 1995-99, 2000-06) in males and females>>
Figure 1.5 Changes in prevalence of comorbid CHD, COPD, DM, renal failure or CBVD in patients with
primary diagnosis of heart failure aged 65 and older in males and females from 1980 to 2006, U.S.
1.3 HFSA and AHA Guidelines for Heart Failure Prevention and Treatment
In 1999, the first HF guideline, developed by the Heart Failure Society of America (HFSA), had a narrow
scope, focusing on HF due to left ventricular systolic dysfunction.28 It did not consider subsets of the
clinical syndrome of HF, such as acute decompensated
HF and diastolic dysfunction, or issues such as prevention.29 In 2006 and 2009, two subsequent
comprehensive clinical practice guidelines were published to address a full range of topics including
prevention, evaluation, disease management, and pharmacologic and device therapy for patients with
HF, and added a section on the genetic evaluation of cardiomyopathy.30,31 In 2010, HFSA updated and
expanded each of these areas.29 Furthermore, since 2001, when the American College of Cardiology
Foundation (ACCF) and the American Heart Association (AHA) proposed a new classification of HF using
stages A, B, C, and D, the prevention and control of HF have received great attention from clinical
cardiologists and other professional healthcare practitioners.11,26,32 For example, in 2013, the ACCF
and AHA jointly published an updated guideline for the management of HF.
1.3.2 Significance
Knowledge about HF is accumulating so rapidly that individual clinicians may be unable to readily and
adequately synthesize new information into effective strategies of care for patients with this syndrome.
Trial data, though valuable, often do not give direction for individual patient management. These
characteristics make HF an ideal candidate for practice guidelines. The 2010 HFSA comprehensive
practice guideline addresses the full range of evaluation, care, and management of patients with HF. It
includes 17 specific sections: (1) Development and implementation of a comprehensive HF practice
cardiac dysfunction, and HF. (4) Evaluation of patients for ventricular dysfunction and HF. (5)
patients with reduced ejection fraction. (8) Disease management, advance directives, and end-of-life
care in HF. (9) Electrophysiology testing and the use of devices in HF. (10) Surgical approaches to the
treatment of HF. (11) Evaluation and management of patients with HF and preserved left ventricular
ejection fraction. (12) Evaluation and management of patients with acute decompensated HF. (13)
Evaluation and therapy for HF in the setting of ischemic heart disease. (14) Managing patients with
hypertension and HF. (15) Management of HF in special populations. (16) Myocarditis: current
treatment; and (17) Genetic evaluation of cardiomyopathy. The updated HFSA 2010 guideline highlights the
importance of controlling HF through primary, secondary, and tertiary prevention, because identifying
and treating patients at risk for HF is
perhaps the most significant step in limiting the public health impact of HF.
secondary prevention is particularly critical because of the difficulty of successfully treating left
ventricular dysfunction, especially when severe. Current therapeutic advances in the treatment of HF do
Research methods refer to descriptive and analytical epidemiological studies, which include applied
biostatistics.
Descriptive epidemiology aims to describe the distributions of diseases and their determinants. It provides a
way of organizing and analyzing these data to understand variations in disease frequency
geographically and over time, and how disease (or health) varies among people based on a host of
personal characteristics (person, place, and time). Descriptive epidemiology can thus generate research hypotheses.
Cross-sectional study designs (see Chapter 4) are most commonly applied in descriptive epidemiological
studies. For example, the National Health and Nutrition Examination Survey (NHANES) is a cross-sectional
study in which participants' health conditions, such as coronary heart disease, stroke, diabetes, and HF, are assessed.
In descriptive analysis, statistical measures such as the mean, standard deviation, percentage, proportion,
and rate are commonly used to describe the distributions of disease and determinants by place
(where), person (who), and time (when), in order to describe what happened (e.g., the health issue of concern) and
generate research hypotheses for further analytic study (why/how = risk factors and possible causality).
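As a sketch of these descriptive measures, the snippet below computes a mean, a sample standard deviation, a prevalence percentage, and a rate per 100,000 in Python. All values are hypothetical, invented for illustration.

```python
import statistics

# Hypothetical systolic blood pressure readings (mmHg) from a survey sample.
sbp = [118, 124, 131, 145, 152, 160, 127, 139]
mean = statistics.mean(sbp)
sd = statistics.stdev(sbp)   # sample standard deviation (n - 1 denominator)

# Hypothetical counts: 12 HF cases observed in a population of 4,000.
cases, population = 12, 4000
prevalence_pct = 100 * cases / population      # prevalence as a percentage
rate_per_100k = 100000 * cases / population    # rate per 100,000 population

print(round(mean, 1), round(sd, 1), prevalence_pct, rate_per_100k)
```

The same handful of summaries, stratified by person, place, and time, is what tables of descriptive results in later chapters are built from.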
Analytical epidemiology aims to study the associations between exposures (i.e., risk and protective
factors for disease) and outcomes (i.e., measured by the incidence, prevalence, and mortality of disease).
Analytical epidemiology can thus test research hypotheses about potential causal effects and/or
associations. Prospective cohort, case-control, and longitudinal intervention study designs are most commonly
used. Typical analyses include:
Testing rate differences using the chi-square test, logistic regression, and Cox proportional hazards
regression models (see Chapter 4).
Testing mean differences using Student's t test, analysis of variance (ANOVA), linear correlation, and
linear regression models (see Chapter 4).
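As a minimal sketch of testing a rate difference, the function below computes the Pearson chi-square statistic for a 2x2 exposure-by-outcome table in plain Python. The cohort counts are hypothetical, chosen only to illustrate the calculation.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (no continuity correction) for the
    2x2 table [[a, b], [c, d]], e.g., exposed/unexposed x cases/non-cases."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

# Hypothetical cohort: 30 of 200 exposed vs. 15 of 220 unexposed developed HF.
stat = chi_square_2x2(30, 170, 15, 205)
print(round(stat, 2))  # compare with the df=1 critical value 3.84 (alpha = 0.05)
```

A statistic above 3.84 (the chi-square critical value with one degree of freedom at the 5% level) would lead to rejecting the hypothesis of equal rates; in practice one would use a statistical package rather than hand-coding the test.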
With advances in computer software and biostatistical methods, a number of new methods are now available:
Mapping and visualization techniques using large-scale data have recently been applied.
Most common software packages, including SAS, SPSS, R, and Stata, can describe the distributions of
disease and determinants using mapping and visualization techniques.
In recent decades, several new biostatistical techniques have been applied in analytical
epidemiological studies. For example, the life course epidemiology design and analysis approach has
been introduced, and propensity score methods are applied to adjust for
multiple confounders.36 Mediation analysis37 and multilevel regression models are applied to
test different associations between exposures and outcomes at the individual and small-area
levels.38-40 Reduced rank regression models are applied to test associations between
combined exposures (e.g., dietary patterns) and outcomes.
Life course epidemiology: the study of long-term biological, behavioral, and psychosocial processes that
link adult health and disease risk to physical or social exposures acting during gestation, childhood,
adolescence, earlier adult life, or across generations.43 David Barker's studies on maternal
nutrition, birth weight, and the risk of cardiovascular disease mortality in later life in the UK are among the earlier
examples of the life course epidemiological approach.35,44 Recently, we worked with data from the US
Human Health Study to demonstrate that stroke mortality in adults is significantly associated with
Propensity score: the probability of treatment assignment conditional on observed baseline
characteristics. The propensity score allows one to design and analyze an observational
(non-randomized) study so that it mimics some of the characteristics of a randomized controlled trial.45
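As a deliberately simplified sketch of the idea, the snippet below estimates each subject's propensity score as the proportion treated within strata of a single observed baseline covariate. This stands in for the usual approach, a logistic regression of treatment on many baseline covariates, and the data are hypothetical.

```python
from collections import defaultdict

def propensity_by_stratum(records):
    """Estimate propensity scores as the proportion treated within each
    stratum of a baseline covariate. records: list of (covariate, treated)."""
    counts = defaultdict(lambda: [0, 0])   # stratum -> [n_treated, n_total]
    for covariate, treated in records:
        counts[covariate][0] += int(treated)
        counts[covariate][1] += 1
    return {stratum: t / n for stratum, (t, n) in counts.items()}

# Hypothetical data: (age_group, received_treatment)
data = [("65+", 1), ("65+", 1), ("65+", 0), ("65+", 1),
        ("<65", 0), ("<65", 0), ("<65", 1), ("<65", 0)]
scores = propensity_by_stratum(data)
print(scores)
```

Subjects with similar scores can then be matched, stratified, or weighted so that treated and untreated groups are comparable on the observed baseline characteristics.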
Mediation analysis: Mediation is a hypothesized causal chain in which one variable (X) affects a second
variable (M) that, in turn, affects a third variable (Y). Mediation analysis is typically applied when a
researcher wants to assess the extent to which the effect of an exposure (X) is explained, or is not
explained, by a given set of hypothesized mediators (M, also called intermediate variables) (Figure
1.6).46,47
<<Figure 1.6: Path diagrams: X → M → Y (effect of X on Y transmitted through the mediator M), or X → Y (direct effect)>>
For example, if a study found a total effect of metabolic syndrome (yes versus no) on risk of heart failure
equal to a relative risk of 2.5 and, after adjustment for inflammation (such as assessed by serum C-
reactive protein), the relative risk decreased to 2.0, the percent excess risk of metabolic syndrome on
the risk of heart failure explained by the inflammation status (high versus low) would be 33.3% [(2.5-
2.0)/(2.5-1)*100].
It should be noted that X must affect M. If X and M have no relationship, M is just a third variable that
may or may not be associated with Y. Mediation makes sense only if X affects M.47
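The arithmetic in the metabolic syndrome example above can be written as a small function:

```python
def percent_excess_risk_explained(rr_total, rr_adjusted):
    """Percent of the excess relative risk explained by the mediator:
    (RR_total - RR_adjusted) / (RR_total - 1) * 100."""
    return (rr_total - rr_adjusted) / (rr_total - 1) * 100

# Metabolic syndrome -> heart failure example from the text: total RR 2.5,
# RR 2.0 after adjustment for inflammation.
print(round(percent_excess_risk_explained(2.5, 2.0), 1))  # 33.3
```

Note that the denominator uses the excess risk (RR - 1), not the RR itself, because only the risk above the baseline of 1.0 can be attributed to the exposure.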
Multilevel analysis: Multilevel regression analysis can be viewed as a modern way of addressing research
questions concerning how outcomes at the individual
level can be seen as the result of the interplay between individual and contextual factors. It is also
known as hierarchical linear regression analysis (models), nested data models, mixed models, random-coefficient
models, random-effects models, random-parameter models, or split-plot designs. In these models,
the statistical parameters of interest vary at more than one level. An example could be an analysis
of heart failure patients in whom we measure several biomarkers for individual patients, as well as
measures for the hospitals within which the patients are grouped. The purpose of the multilevel analysis is
to take into account variation at both the individual and hospital levels. This analysis has become increasingly popular.
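The core idea of accounting for variation at both levels can be illustrated with a one-way variance decomposition: how much of the total variation in a biomarker lies between hospitals versus within them, summarized by the intraclass correlation (ICC). This ANOVA-style estimator is a simplified sketch rather than a full mixed-model fit, and the biomarker values are hypothetical.

```python
import statistics

def intraclass_correlation(groups):
    """ANOVA-style ICC estimate for equal-sized groups: the share of total
    variance attributable to between-group (e.g., between-hospital) differences.
    groups: list of lists of measurements, one inner list per hospital."""
    n = len(groups[0])                                   # subjects per hospital
    group_means = [statistics.mean(g) for g in groups]
    msb = n * statistics.variance(group_means)           # between-hospital mean square
    msw = statistics.mean([statistics.variance(g) for g in groups])  # within-hospital
    var_between = max(0.0, (msb - msw) / n)
    return var_between / (var_between + msw)

# Hypothetical biomarker values for patients in three hospitals.
hospitals = [[100, 110, 105, 95], [140, 150, 145, 155], [120, 115, 125, 130]]
print(round(intraclass_correlation(hospitals), 2))
```

A high ICC signals that patients within the same hospital are much more alike than patients from different hospitals, which is exactly the situation in which a single-level regression understates uncertainty and a multilevel model is warranted.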
1.4.2 Significance
Standard and new epidemiological methods have helped meet the challenges facing heart failure
epidemiological studies. Specifically, common statistical software can conduct conventional and complex
biostatistical analyses efficiently and effectively, and the results are easily interpreted and understood.
In this textbook, a number of standard and new data analysis methods are presented step by step.
CHAPTER 2 PATHOPHYSIOLOGY AND RISK PROFILE OF HEART FAILURE
(Estimated 30 - 40 pages)
HF happens when the heart cannot pump enough blood and oxygen to support the other organs in the
body. HF is a serious condition, but it does not mean that the heart has stopped beating. Fig 2.1 depicts
HF vs. congestive heart failure: HF is a global term for the physiological state in which cardiac output is
insufficient for the body's needs. Congestive heart failure is the most severe clinical manifestation of
heart failure and occurs most commonly when the cardiac output is low. Because not all patients have
volume overload at the time of initial or subsequent evaluation, the term heart failure is preferred over
the older term congestive heart failure.
The clinical syndrome of HF may result from disorders of the pericardium, myocardium, endocardium,
heart valves, or great vessels, or from certain metabolic abnormalities.
HF with reduced ejection fraction (HFrEF) and HF with preserved ejection fraction (HFpEF): HF may be
associated with a wide spectrum of left ventricular (LV) functional abnormalities, which may range from
patients with normal LV size and preserved ejection fraction (EF, called HFpEF) to those with severe
dilatation and/or markedly reduced EF (called HFrEF). In most patients, abnormalities of systolic and
diastolic dysfunction coexist. EF is considered important in the classification of patients
with HF because of differing patient demographics, comorbid conditions, prognosis, and response to
therapies, and because most clinical trials selected patients based on EF. EF values are dependent on the
imaging technique used, the method of analysis, and the operator. As other techniques may indicate
abnormalities in systolic function among patients with a preserved EF, it is preferable to use the terms
preserved or reduced EF rather than preserved or reduced systolic function.
HF involves abnormalities of cardiac function, skeletal muscle, and renal function, stimulation of the
sympathetic nervous system, and a
complex pattern of neuro-hormonal changes that impair the ability of either ventricle to fill with or
eject blood 51,52. Over the past decades, three clinical pathophysiological models (hypotheses) for HF
have been proposed:
1) The cardio-renal model, in which HF is viewed as a problem of excessive salt and water retention;
2) The cardio-circulatory (hemodynamic) model, in which HF is viewed as a problem of
abnormalities in the pumping capacity of the heart and excessive peripheral vasoconstriction; and
3) The neurohormonal model, in which HF progresses as a result of the overexpression of biologically
active molecules that are capable of exerting toxic effects on the heart and circulation 53,54.
NYHA Functional Classification of Heart Failure: The New York Heart Association (NYHA) Functional
Classification provides a simple way of classifying the extent of heart failure. It places patients in one of
four categories based on how much they are limited during physical activity; the limitations/symptoms
concern normal breathing and varying degrees of shortness of breath and/or angina.55
Class I: Cardiac disease, but no symptoms and no limitation in ordinary physical activity, e.g., no
shortness of breath when walking or climbing stairs.
Class II: Mild symptoms (mild shortness of breath and/or angina) and slight limitation during ordinary
activity.
Class III: Marked limitation in activity due to symptoms, even during less-than-ordinary activity, e.g.
walking short distances (20-100 m), and/or being comfortable only at rest.
Class IV: Severe limitations. Experiences symptoms even while at rest, and/or mostly bedbound patients.
ACC and AHA Staging of Heart Failure: In 2001, the American College of Cardiology (ACC) and the
American Heart Association (AHA) took a new approach to the classification of HF that emphasized both
the evolution and progression of the disease. According to this new ACC and AHA approach, HF is
identified in four stages.32
Stage A identifies the patient who is at high risk for developing HF but has no structural disorder of the
heart.
Stage B refers to a patient with a structural disorder of the heart but who has never developed
symptoms of HF.
Stage C denotes the patient with past or current symptoms of HF associated with underlying structural
heart disease.
Stage D designates the patient with end-stage disease who requires specialized treatment strategies,
such as mechanical circulatory support, continuous inotropic infusions, cardiac transplantation, or
hospice care.
2.1.2 Significance
The importance of classifying the subtypes HFrEF and HFpEF: The ejection fraction (EF) is an important
measurement in determining how well the heart is pumping out blood and in diagnosing and tracking
heart failure. Approximately half of all patients with HF have preserved ejection fraction (HFpEF), and
they are often women and the elderly. Thus, as life expectancies continue to increase in western
societies, the prevalence of HFpEF will continue to grow. Many features of the HF syndrome are similar
across the EF spectrum, including elevated left atrial pressure, abnormal left ventricular-filling dynamics,
neurohumoral activation, dyspnea, impaired exercise tolerance, frequent hospitalization, and reduced
survival. In contrast to (classical) HF with reduced ejection fraction (HFrEF), only a limited spectrum of
treatment modalities seem effective in improving morbidity and mortality rates in HFpEF.56 Therefore,
awareness and attention to the HF syndrome in the presence of normal or mildly abnormal EF is
important.
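The EF-based distinction can be expressed as a simple classification rule. The numeric cutoffs below (EF of 40% or less for HFrEF, 50% or more for HFpEF, with a borderline band between) follow commonly cited ACCF/AHA conventions and are included here as an illustrative assumption, not as a clinical tool, since the chapter text does not state the cutoffs itself.

```python
def classify_hf_by_ef(ef_percent):
    """Classify heart failure by left ventricular ejection fraction (%).
    Cutoffs follow commonly cited ACCF/AHA conventions (an assumption here)."""
    if ef_percent <= 40:
        return "HFrEF"
    if ef_percent >= 50:
        return "HFpEF"
    return "HFpEF, borderline"

print(classify_hf_by_ef(35), classify_hf_by_ef(45), classify_hf_by_ef(60))
```

In epidemiological datasets, such a rule is typically applied to an echocardiographically measured EF to define HFrEF and HFpEF subgroups for separate analysis.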
The importance of the ACC and AHA Staging of HF: This classification system is intended to complement
but not to replace the NYHA functional classification, which primarily gauges the severity of symptoms in
patients who are in stage C or D. It has been recognized for many years, however, that the NYHA
functional classification reflects a subjective assessment by a physician and changes frequently over
short periods of time and that the treatments used do not differ significantly across the classes.
Therefore, the committee believed that a staging system was needed that would reliably and objectively
identify patients in the course of their disease and would be linked to treatments that were uniquely
appropriate at each stage of their illness. According to this new approach, patients would be expected to
advance from one stage to the next unless progression of the disease was slowed or stopped by
treatment. This new classification scheme adds a useful dimension to our thinking about HF similar to
2.2 Risk Factors for Heart Failure
A number of risk factors have been suggested to play a causal role in the development of HF, although this
remains the subject of debate. Because HF is a syndrome rather than a primary diagnosis, it has many
potential etiologies, diverse clinical features, and numerous clinical subsets. Patients may have a variety
of primary cardiovascular diseases and never develop cardiac dysfunction, and those in whom cardiac
dysfunction is identified through testing may never develop clinical HF. In addition to cardiac
dysfunction, other factors, such as vascular stiffness, dyssynchrony, and renal sodium handling, play
major roles in the manifestation of the syndrome of HF.29 Figure 2 depicts several major risk factors for
HF.
<<Figure 2: Risk factors for HF (CHD, MS factors*, smoking and diet, drug abuse, aging, lung disease, CKD, and possibly air pollution) acting through LV systolic dysfunction, LV diastolic dysfunction, ventricular-vascular coupling, and other mechanisms, leading to HFpEF and HFrEF>>
*MS factors: Metabolic syndrome factors: Obesity, elevated blood pressure, serum triglycerides,
glucose, and decreased high density lipoprotein cholesterol.
CKD: Chronic kidney disease.
HFpEF: HF preserved ejection fraction. HFrEF: HF reduced ejection fraction.
It should be noted that several approaches have been used to classify risk factors for HF according
to specific concerns. For example, the AHA groups risk factors for HF into two groups: (1) lifestyle factors
that increase the risk of heart disease, stroke, and diabetes (e.g., smoking, being overweight, eating foods
high in fat and cholesterol, and physical inactivity); and (2) health conditions that either damage the heart or
make it work too hard. These conditions include: (a) coronary heart disease; (b) past heart attack
(myocardial infarction); (c) high blood pressure; (d) abnormal heart valves; (e) heart muscle disease;
(g) severe lung disease, because when the lungs don't work properly, the heart has to work harder to get
available oxygen to the rest of the body; (h) diabetes; (i) obesity; (j) sleep apnea; and (k) others, such as a low red
blood cell count (severe anemia).
According to current knowledge of established and hypothesized risk factors, Schocken, Bui, et al.
proposed eight groups of HF risk factors: (1) Major clinical risk factors: age, male sex, hypertension, LV
hypertrophy, myocardial infarction, valvular heart disease, obesity, and diabetes. (2) Minor clinical risk
anemia, increased heart rate, dietary risk factors, sedentary lifestyle, low socioeconomic status, and
(4) Infectious: viral, parasitic (Chagas disease), and bacterial. (5) Toxic risk precipitants: chemotherapy,
alcohol consumption. (6) Genetic risk predictors: SNPs (e.g., α2CDel322-325, β1Arg389), family history,
and history of congenital heart disease. (7) Morphological risk predictors: increased LV internal
dimension, mass, and asymptomatic LV dysfunction. (8) Biomarker risk predictors: immune activation
[e.g., insulin-like growth factor 1 (IGF1), tumor necrosis factor (TNF), interleukin 6 (IL-6), C-reactive
protein (CRP)], natriuretic peptides [e.g., brain natriuretic peptide (BNP) and N-terminal (NT)-proBNP], and
In epidemiology, we commonly classify disease risk factors into unpreventable and preventable categories.
2.2.2 Significance
HF is a major clinical and public health issue. The lifetime risk of developing heart failure is one in five.
Although promising evidence shows that the age-adjusted incidence of HF may have plateaued,
HF still carries substantial morbidity and mortality, with 5-year mortality rates that rival those of many
cancers. Therefore, understanding these risk factors for HF plays a pivotal role in education about
healthy lifestyle habits, and in clinical practice for the purpose of control and prevention of this disease.
CHAPTER 3 RESEARCH AND DESIGN
(Estimated 60 pages)
3.1 INTRODUCTION
This chapter covers the basic concepts of clinical epidemiology and translational epidemiology, the
Clinical epidemiology is the application of the science of epidemiology in a clinical setting. Emphasis is
on a medically defined population, as opposed to statistically formulated disease trends derived from
outcomes and healthcare service research. For example, we conducted a pilot intervention study
to improve HF patients' adherence to medication and healthy behaviors. In the study, we recruited 152
African American (AA) adults who were diagnosed with HF. Of them, 79 were randomly assigned to the
intervention group, which received usual health care plus cardiovascular health-focused
25
education and discussion through health-related survey questionnaires, and 73 participants that
received usual healthcare and received health-related questionnaire surveys only. All patients received a
baseline survey after their initial recruitment, a first follow-up after 3 months, and a second
follow-up 3 months after the first. The results show that of the 152 patients
(M: 68, F: 84), the proportions with HF stage A, B, C, and D were 26.5%, 19.1%, 22.1%, and
10.3% in men, and 52.4%, 21.4%, 21.4%, and 4.8% in women, respectively. Of them, 121 completed the
first (3-month) follow-up, and 39 completed both the first and second (6-month) follow-ups. Significantly
improved scores for knowledge of HF etiology and for self-efficacy were observed at the first follow-up,
and for self-care skills at the second follow-up, in both the intervention and control groups. Compared
with the control group, the improvements in knowledge of HF etiology and self-efficacy scores in the
intervention group were significantly greater at the first follow-up (p<0.01), and self-care skill scores
were significantly greater at both the first and second follow-ups (p<0.01). Significant improvements in
adherence to medication and healthy behaviors were observed in the intervention group (p<0.01). There
was a tendency toward improved quality of life at the first and second follow-ups in the intervention
group, although it did not reach statistical significance.60 The concepts and study designs used in clinical epidemiology have been applied in clinical
trials as well. One example is the Antihypertensive and Lipid-Lowering Treatment to Prevent Heart
Attack Trial (ALLHAT), sponsored by the National Heart, Lung, and Blood Institute (NHLBI).61
In recent decades, advances in our understanding of human biology and the emergence of powerful
new technologies, such as genomics and bioinformatics, have provided new insights into modern
epidemiology. However, the translation of scientific discoveries and advances into effective health
interventions remains limited. Recent emphasis on translational research (TR) highlights the role of
epidemiology in translating scientific discoveries into population health impact. For example, Khoury
and colleagues proposed applications of epidemiology in TR through 4 phases (designated T1-T4),
illustrated by examples from human genomics.62 In T1, epidemiology explores how a basic
scientific discovery might be applied in practice (such as a new biomarker used in risk prediction and
prevention). In T2, epidemiology can help to evaluate the efficacy of a candidate application by using
observational studies and randomized controlled trials. In T3, epidemiology can help to assess
facilitators and barriers to the uptake and implementation of candidate applications in practice. In T4,
epidemiology can help to assess the impact of using candidate applications on population health
outcomes. In their study, Ogilvie and colleagues addressed two research gaps in population
translational research, the first between basic research and early clinical trials, and the second between
Translational epidemiology addresses the effective transfer of new knowledge from epidemiologic
studies into the planning of population-wide and individual-level disease control programs and
policies.62,64,65 For example, the Heart Failure Association of the European Society of Cardiology recently
organized an expert workshop to address the issue of inflammation in heart failure from a basic science,
translational and clinical perspective, and to assess whether specific inflammatory pathways may yet
serve as novel therapeutic targets for this condition. In the workshop, several translational studies were
reviewed; they consistently indicated that chronic inflammation, interacting with increased
oxidative stress, cytokine production, proteolytic matrix degradation, and autoimmunity, is implicated in
HF pathophysiology by increasing cardiac injury, fibrosis, and dysfunction. One important
finding is that the idea of a common inflammatory pathway characterizing all different forms of HF
appears unrealistic. It will probably be important to design specific anti-inflammatory approaches for
different types and stages of HF. In particular, anti-inflammatory approaches may need to differ
3.2.2. Significance
inclusive of basic, clinical, preventive, and predictive research. These basic concepts are the
Credibility involves establishing that the results of a study are believable. It depends on
the richness and accuracy of the information gathered, rather than the amount of data collected. Two
types of error most commonly affect the credibility of a study. They are (1) systematic error (bias), and (2) random error (chance).
Bias is a systematic distortion of the association between a determinant (i.e., an exposure) and an
outcome due to a deficiency in the study design. It may arise (1) from a study sample that is not
representative of the target population of interest (selection bias), (2) from misused instruments and/or
poor measurement (information bias), and/or (3) from the differential effects of other
determinants on the association between the exposures and outcomes of interest (confounding).
Selection Bias
Selection bias can occur from the study design and during the course of study implementation. This
bias may occur when the sample does not represent the target population of interest. For example,
when some eligible participants refuse to participate in a study, the nonparticipation may introduce a
selection bias. In particular, if the probability of participation depends on both exposure and disease
status, the estimated association will be distorted.
Information Bias
Information bias refers to bias arising from measurement error; it is also referred to as
observational bias or misclassification. For example, a variable can often be measured by
several different methods, each with its own accuracy. The most accurate method may be the
most expensive, most time-consuming, or most invasive, and therefore may not be applicable to
all participants of a study population. In this case, another less expensive, less time-consuming,
or less invasive method would be used to collect the information about the variable for all the
participants. An internal validation study is then needed, in which the information about the
variable is collected in a subsample of the study population by both the less accurate method
and the more accurate method. The information would then be compared in the subsample,
yielding values for the bias parameters (e.g., sensitivity and specificity) that can be used to
correct for misclassification.
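The validation-study logic above can be sketched numerically. Below is a minimal illustration (not from the text) of the classic Rogan-Gladen correction, which uses the sensitivity and specificity estimated in a validation subsample to adjust an observed prevalence for misclassification; all numbers are hypothetical.

```python
def corrected_prevalence(observed_prev, sensitivity, specificity):
    """Rogan-Gladen estimator: adjust an observed prevalence for the
    imperfect sensitivity and specificity of the measurement method."""
    if sensitivity + specificity <= 1.0:
        raise ValueError("sensitivity + specificity must exceed 1")
    corrected = (observed_prev + specificity - 1.0) / (sensitivity + specificity - 1.0)
    # Clamp to the valid [0, 1] range, since sampling error can push
    # the raw estimate slightly outside it.
    return min(max(corrected, 0.0), 1.0)

# Example: the cheaper method classifies 12% of participants as exposed;
# the validation subsample gives sensitivity 0.85 and specificity 0.95.
print(round(corrected_prevalence(0.12, 0.85, 0.95), 4))  # 0.0875
```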
Confounding
Confounding is one of the most intriguing biases that occur in epidemiological studies. It arises whenever
an outcome has multiple determinants (e.g., exposures) that are themselves associated, and one or
more of them is omitted from consideration in data collection and analysis. In epidemiological
studies, age and sex are most commonly treated as confounders, because in most cases they are
associated with exposures and are risk (or protective) factors for the outcomes of interest. For example,
a decreased serum 25-hydroxyvitamin D concentration (a marker of vitamin D intake) is a risk factor for
the development of HF and all-cause mortality.68-72 Fig 3.1 depicts
that when we test whether a decreased serum 25-hydroxyvitamin D concentration is a significant risk factor
for heart failure, we should take its association with age into consideration, because a decreased
serum 25-hydroxyvitamin D level is frequently observed in the elderly, and aging is a significant risk factor
for heart failure.
[Fig 3.1. A diagram of a simple confounding structure: Age influences both serum 25-hydroxyvitamin D concentration and heart failure.]
Fig 3.1 depicts a simple confounding case. To test whether there is an association between decreased
serum 25-hydroxyvitamin D concentration (a marker of vitamin D intake) and risk of heart failure, we need
to consider whether this vitamin D and heart failure association, if any, is confounded by age: serum 25-
hydroxyvitamin D concentration is lower in the elderly, and aging is a risk factor for heart failure. In this
case, if our research interest is to examine the association between serum 25-hydroxyvitamin D and heart
failure, age is called a confounding factor and should be adjusted for in the data analysis.
It is also well recognized that serum 25-hydroxyvitamin D levels differ between males and
females (e.g., females have higher serum 25-hydroxyvitamin D than males), and sex is a significant
determinant of disease and mortality. Therefore, if our research interest is to examine whether an
exposure (other than age and sex) is a predictor of disease, we need to control for the effects of age and sex
in the design and analysis.
Definition of confounder:
A confounder is a third variable that can make it appear (sometimes incorrectly) that an observed
exposure is associated with an outcome. A confounder must be associated with the exposure of interest, must be a potential cause of the outcome of interest, and
must not be an intermediate step in the causal path between exposure and outcome.
According to the steps of study implementation, there are various ways to control or reduce
potential confounding at the study design, data collection, and analysis stages. (1) At the study design
and data collection stages: a matching approach can be applied in a case-control study and in a cohort
study (see Section 3.5 for more detail). (2) At the data analysis stage: stratified analysis, multivariable-
adjusted analysis (i.e., modeling), and sensitivity analysis can be applied to control for potential confounding.
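As an illustration of the stratified-analysis option, the following sketch (with hypothetical counts, not data from the text) computes a Mantel-Haenszel pooled odds ratio across age strata and contrasts it with the crude (collapsed) odds ratio, showing how ignoring the confounder can inflate the association.

```python
def mantel_haenszel_or(strata):
    """Mantel-Haenszel pooled odds ratio across 2 x 2 strata.
    Each stratum is (a, b, c, d):
      a = exposed cases,   b = exposed controls,
      c = unexposed cases, d = unexposed controls."""
    num = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return num / den

# Hypothetical data stratified by age (the confounder): within each
# stratum the odds ratio is 2.0.
strata = [
    (10, 40, 20, 160),  # younger stratum
    (60, 60, 30, 60),   # older stratum
]
# Collapsing over age gives a = 70, b = 100, c = 50, d = 220:
crude_or = (70 * 220) / (100 * 50)
print(round(crude_or, 2))                    # 3.08 (confounded by age)
print(round(mantel_haenszel_or(strata), 2))  # 2.0  (age-adjusted)
```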
One of the principal aims of an epidemiological study is to assess the cause of disease: whether the
disease occurs due to one or more risk factors (i.e., exposures). However, since most epidemiologic
studies are by nature observational rather than experimental, several possible explanations for an
observed association (between an exposure and a disease) need to be considered before we can infer that a cause-effect relationship exists. That is, the
observed association may in fact be due to the effect of errors in the study design and analysis stages.
Therefore, the observed statistical association between a risk factor and a disease does not necessarily
lead us to infer a causal relationship. Conversely, the absence of an association does not necessarily
imply the absence of a causal relationship.73,74 To evaluate a statistical association, we need to examine
three alternative explanations:
1) Chance (random error)
2) Bias (systematic error)
3) Confounding
A small sample size may lead to low statistical power and thus to a non-significant
statistical association between the study factors and the disease of interest. Selection bias in
participant recruitment and information bias in data collection may cause systematic errors. Absent or
inappropriate adjustment in the analysis may cause serious confounding effects on the association between
exposures and outcomes.
Even when the above three errors have been controlled appropriately, the judgment as to whether an
observed statistical association (commonly defined as a p-value <0.05; see Chapters 4 and 5) represents
a cause-effect relationship between exposure and outcomes requires inferences far beyond the data
from a single study, and involves consideration of criteria that include the magnitude of the association,
among others. The Bradford Hill criteria are widely used in epidemiology as a framework against which to
assess the evidence for causality:
(1). Strength of the association. According to Hill, the stronger the association between a risk factor and
an outcome, the more likely the relationship is to be causal.
(2). Consistency of findings. Have the same findings been observed among different populations, in
different study designs, and at different times?
(3). Specificity of the association. There must be a one to one relationship between cause and outcome.
(4). Temporality. The exposure must precede the outcome.
(5). Biological gradient. Change in disease rates should follow from corresponding changes in exposure
(dose-response).
(6). Plausibility. Is there a biologically plausible mechanism linking the cause and the outcome?
(7). Coherence. Does the relationship agree with the current knowledge of the natural history/biology of
the disease?
(8). Experiment. Does the removal of the exposure alter the frequency of the outcome?
Epidemiological study design involves descriptive and analytical studies. As discussed in Chapter xx,
descriptive epidemiological studies are concerned with describing the distribution of disease and risk
factors by frequency in terms of person, place, and time. Analytical epidemiological studies are
concerned with the causality of disease, based on comparison of study populations in relation to their
disease or risk-factor exposure status. In general, to study the causes of disease, descriptive studies
raise questions and provide the opportunity to generate hypotheses about the presence of an association
between disease and exposures, and analytical studies test those hypotheses. In general, there are two
major descriptive epidemiological study designs (i.e., the ecological study and the cross-sectional study), and
two major analytical epidemiological study designs (i.e., observational case-control and cohort studies,
and experimental studies). Figure 3.2 depicts the two major types of epidemiological study designs.
An ecological study is also called a correlational study. It uses data from entire populations (not specific
individuals) to describe disease frequencies between different groups during the same period of time (or
in the same population at different points in time) in relation to a presumed risk factor.
The advantage of this design is that it can be conducted relatively quickly and at low cost, because the
required data are relatively easy to obtain. The study is useful for formulating hypotheses for
further studies.
However, an ecological study cannot be used to test hypotheses, because of limitations inherent in the
design. First, the lack of individual-level information leads to a limitation of ecological studies known as
the ecological fallacy or ecological bias: an association observed between variables at the aggregate
level cannot necessarily be applied to individuals. An additional limitation of ecological studies is that
investigators are unable to detect subtle or complicated relationships between a disease and exposures,
because of the crude nature of the data and the lack of information on individual characteristics that
might affect the association. Figure 3.3 shows the ecological relationship between zip-code-level
poverty rate (%) in 2000 and the age-adjusted prevalence of hypertension: neighborhoods (zip-codes)
with higher poverty rates had a higher prevalence of hypertension.
A cross-sectional study describes the pattern of health-related events/factors, and examines the
relationship between disease and other variables of interest as they exist in a defined population at one
point in time.
Cross-sectional studies are carried out for public health planning and etiological research. Most
governmental surveys conducted by the U.S. National Center for Health Statistics are cross-sectional
studies, for example, the National Hospital Discharge Survey, the Nationwide Inpatient Sample, and the
Behavioral Risk Factor Surveillance System.
Example: A cross-sectional study (n=3000) aimed to describe the frequencies of diabetes and heart
failure, and to examine whether there was an association between diabetes and heart failure, in a population
aged 45 and older (Table 3.1a). Among subjects with diabetes mellitus (DM) (n=416), the prevalence of
heart failure was 16.83% (70/416), and among those without DM (n=2584), the prevalence of heart
failure was 5.53% (143/2584). The results suggest that patients with DM had a higher prevalence of heart
failure than those without DM. In statistics, we call this type of table a 2 x 2 table (Table 3.1b), with
cells labeled a, b, c, and d (it is the basic table for the chi-square test; see Chapter 4).
Table 3.1a* A 2 x 2 table of DM by HF

                      HF
               YES        NO        TOTAL
DM    YES      70 (a)     346 (b)   416      Prevalence of HF in subjects with DM:    a/(a + b) = 70/416 = 16.83%
      NO       143 (c)    2441 (d)  2584     Prevalence of HF in subjects without DM: c/(c + d) = 143/2584 = 5.53%
      TOTAL    213        2787      3000

* HF: Heart failure. DM: Diabetes mellitus.
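The arithmetic in Table 3.1a can be reproduced directly. The sketch below recomputes the two prevalences and adds a chi-square statistic for the 2 x 2 table (a preview of the test discussed in Chapter 4); only the counts from the table are used.

```python
# Reproduce the Table 3.1a arithmetic and add a chi-square statistic
# for the 2 x 2 table (without continuity correction).
a, b = 70, 346     # DM yes: HF yes (a), HF no (b)
c, d = 143, 2441   # DM no:  HF yes (c), HF no (d)
n = a + b + c + d  # 3000

prev_dm = a / (a + b)      # prevalence of HF among subjects with DM
prev_no_dm = c / (c + d)   # prevalence of HF among subjects without DM

# chi2 = n * (ad - bc)^2 / [(a+b)(c+d)(a+c)(b+d)]
chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

print(f"HF prevalence with DM:    {prev_dm:.2%}")    # 16.83%
print(f"HF prevalence without DM: {prev_no_dm:.2%}") # 5.53%
print(f"Chi-square statistic:     {chi2:.1f}")       # about 69.3
```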
Successive cross-sectional studies can be used to determine whether there are trends in disease and/or
exposures over time (e.g., whether the prevalence of the disease or the exposure changes, or whether
the strength of the association between the disease and exposures changes). For example, using data
from the National Hospital Discharge Surveys, we examined the trend in the prevalence of hypertensive
disease (HTD) and subtypes of HTD attributable to hospitalization among U.S. adults aged 35 years and
older from 1980 to 2007. Data (n=4,598,488,000 hospitalized cases) from the 1980-2007 National Hospital
Discharge Surveys were used to examine the trends of hospitalized patients with a 1st (the
principal cause) and any 2nd to 7th (as a comorbid condition) diagnosis of HTD (ICD-9-CM: 401-
405), by gender and geographic region. Age-adjusted rates of disease were calculated using the
U.S. 2000 standard population. The results show that age-adjusted hospitalization rates due to a
1st diagnosis of HTD increased from 1.74% in 1980-1981 to 2.06% in 2006-2007 in men (p<0.01),
and from 2.0% to 2.09% in women (p=0.06). Age-adjusted rates due to any 2nd to 7th diagnosis
of HTD significantly increased from 7.06% to 35.09% in men (p<0.001), and from 7.88% to
31.98% in women (p<0.001). Patients with a 2nd to 7th diagnosis of essential hypertension and
hypertensive chronic kidney disease had the highest annual percent increases.80
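The age adjustment used in this example can be illustrated with a direct-standardization sketch. The age-specific rates and standard-population counts below are made up for illustration; they are not the NHDS figures or the actual U.S. 2000 standard population weights.

```python
def direct_age_adjusted_rate(stratum_rates, standard_pop):
    """Directly standardized rate: weight each age-specific rate by the
    standard population's share of that age stratum."""
    total = sum(standard_pop)
    return sum(r * p for r, p in zip(stratum_rates, standard_pop)) / total

# Hypothetical age-specific hospitalization rates (per 100 population)
# and made-up standard-population counts standing in for the U.S. 2000
# standard population.
rates = [0.8, 2.1, 6.5]              # ages 35-54, 55-74, 75+
standard = [80_000, 50_000, 20_000]
print(round(direct_age_adjusted_rate(rates, standard), 3))  # 1.993
```

Because every comparison group is weighted by the same standard population, differences between the resulting rates cannot be due to differences in age structure.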
Unlike an ecological study, a cross-sectional study assesses the status of an individual with respect to the
presence or absence of both the disease and the exposure at the same point in time. Note that in this type
of study, the cases of disease we identify are prevalent cases: we know that the cases
existed at the time of the study but do not know their duration. For this reason, this design is also called
a prevalence study. It should be noted that if we have successive cross-sectional data over multiple years, we
may examine certain cohort effects across different time periods (for example, hypothesizing that changes in
prevalence reflect birth cohort effects).
May provide evidence of birth cohort effect on changes in prevalence and mortality in multiple
years
Cannot establish cause-effect relationships. Because a disease determinant can change with
time, a cross-sectional study can give misleading results. For example, a patient who had
smoked for years may quit smoking after he/she receives a diagnosis of heart disease. If a cross-
sectional study were conducted after the patient quit smoking, he/she would be
grouped with those who do not currently smoke. A weak or null association between current
smoking and the risk of the disease of interest may then be observed, because the comparison
group of current non-smokers includes former smokers.
Survival bias (i.e., people who die soon after diagnosis or who recover quickly are less likely to
be identified as diseased, and survivors, those who live with the disease or who get better
quickly, may be very different from the individuals who die quickly from the disease).
study. To resolve this issue, one approach would be to set up a cohort study that compares the incidence of a
disease (i.e., a new diagnosis of the disease) between participants who smoked and those who did not.
In this situation, we assume that the true smoking status did not change before the disease
was diagnosed.
A cohort study, also called a prospective study, is a longitudinal study in which participants are selected
before they have experienced the outcomes of interest, and their exposures to possible determinants of
interest are then examined and recorded, together with their subsequent disease outcomes. In other
words, a cohort study is designed to compare disease incidence (and/or mortality) in populations
classified by whether or not the participants are exposed to the determinants of interest.
For example, consider how a cohort study could be designed to investigate decreased serum vitamin D
concentration as a risk factor for HF. First, select a group of participants who have a serum 25-
hydroxyvitamin D concentration less than 20 ng/ml (let us define this as a decreased serum vitamin D level),
and a group of participants who have a serum 25-hydroxyvitamin D concentration of at least 20 ng/ml, all free of HF,
and record the numbers in each group who develop HF during a specified follow-up period (say, 10
years). Then compare the incidence of HF between the two groups. Will this method identify a causal
association, if one exists, between serum vitamin D and HF? It is possible if the two groups are similar in
every respect except for their exposure to the determinant of interest (vitamin D concentration in this
case), for then any difference in outcomes between the two groups must be due to the risk factor alone.
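The comparison described above amounts to computing a risk ratio. Here is a small sketch (with hypothetical counts, not actual study data) that compares HF incidence between the low and normal vitamin D groups and attaches a Wald-type 95% confidence interval.

```python
import math

def risk_ratio(a, n1, c, n0, z=1.96):
    """Risk ratio with a Wald-type 95% CI (Katz log method).
    a of n1 exposed participants and c of n0 unexposed participants
    developed the outcome during follow-up."""
    rr = (a / n1) / (c / n0)
    se_log = math.sqrt(1 / a - 1 / n1 + 1 / c - 1 / n0)
    lo = math.exp(math.log(rr) - z * se_log)
    hi = math.exp(math.log(rr) + z * se_log)
    return rr, lo, hi

# Hypothetical 10-year follow-up: 60 of 500 participants with low serum
# 25-hydroxyvitamin D develop HF, versus 40 of 800 with normal levels.
rr, lo, hi = risk_ratio(60, 500, 40, 800)
print(f"RR = {rr:.2f} (95% CI {lo:.2f}-{hi:.2f})")  # RR = 2.40 (95% CI 1.63-3.52)
```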
However, in a cohort study there is no guarantee that the groups are comparable. Suppose that, unknown to
you, most of those who have a higher serum vitamin D concentration are non-smokers, and most of those
who have a lower serum vitamin D concentration are smokers. Given that smoking is a well-established
risk factor for HF, the association between lower serum vitamin D concentrations and an increased
incidence of HF could be due entirely to smoking.
The temporal (cause-effect) relationship between exposures and outcomes may be tested, because
the determinants of interest occur before the development of the disease; this establishes the
time sequence needed for causal inference.
From a cohort study, we are able to obtain incidence data, which indicate the risk of developing the disease.
As compared with a case-control study (see next section), a cohort study minimizes selection bias
and information bias, because the participants are free of the study outcomes of interest at the
starting point of the study, and information on the determinants of interest is measured over the
follow-up period instead of being recalled.
In general, a cohort study is very expensive, because it may take years for the participants to
develop the outcomes of interest.
A cohort study is time-consuming, because the development of a disease or an outcome of
interest may take years to observe.
Only a few possible determinants of the study outcomes of interest can be examined, due to the
cost and logistical constraints of measuring many exposures.
Loss to follow-up. Some participants leave the geographic area of the study and cannot be
traced; some die of unknown causes or lose interest in participating; some are inevitably lost
to follow-up despite intensive efforts to track them. This introduces selection bias.
A major problem with the cohort design just described is that the study population often must be followed
for a long period to determine whether the outcome of interest has developed. For example, consider a
cohort study of the association between decreased serum 25-hydroxyvitamin D
concentration (a marker of vitamin D intake) and risk of heart failure. The study starts by selecting two
groups, one with a decreased serum 25-hydroxyvitamin D concentration (such as an insufficient level, say
<20 ng/ml, as defined by most studies), and the other with a normal serum 25-hydroxyvitamin D
concentration. Both groups will be followed for several years, to see which group has a higher incidence
of heart failure. This type of study design is called a prospective cohort study (or a longitudinal study). It
is time-consuming and costly. Given these issues, the prospective cohort study is often unattractive to
investigators.
However, if the exposure of interest was well recorded a long time in the past (for example, serum 25-
hydroxyvitamin D was measured several years ago), a retrospective cohort study can be designed. In a
retrospective cohort study, both the exposures and the outcomes have already occurred when the study
begins. The study involves assembling the existing information on each person's exposure status from
historical records and combining these data with each person's outcome status.
3) Assessments of outcome
4) Data analyses
Examples:
To examine the associations of unprocessed and processed red meat consumption with heart failure
incidence and mortality in men, a population-based prospective study, the Cohort of Swedish Men (COSM),
was conducted in 37,035 men aged 45-79 who had no history of heart failure, ischemic heart disease, or
cancer at baseline. During a mean follow-up of 11.8 years, 2,891 incident cases of and 266 deaths from heart
failure were identified. In the study, unprocessed and processed red meat consumption was categorized into four
groups (<25, 25-49.9, 50-74.9, and ≥75 g/day). The results show that men who consumed ≥75 g/day of
processed red meat, compared with those who consumed <25 g/day, had 1.28 times (95% CI: 1.10-1.48,
p-trend = 0.01) the risk of heart failure incidence and 2.43 times (1.52-3.88, p-trend <0.001) the risk of
heart failure mortality. This long-term study, with more than 10 years of follow-up, provides new insight
into the prevention of heart failure through dietary change.83
A retrospective cohort study using data from U.S. Medicare beneficiaries was conducted and reported
by Dr. Lindenauer and colleagues. Patients aged 65 years and older who were hospitalized in 2006-2008 with
a principal diagnosis of acute myocardial infarction, heart failure, or pneumonia were examined for the
association between exposure to income inequality and a patient's risk of death and/or readmission
within 30 days of admission to an acute care hospital. ICD-9 codes were applied to identify the study diseases of
interest and mortality. There were 555,962 admissions in 4,348 hospitals for acute myocardial
infarction that met the criteria for the 30-day mortality analysis, 1,092,285 admissions in 4,484 hospitals for heart failure,
and 1,146,414 admissions in 4,520 hospitals for pneumonia. The readmission analysis included 553,073
hospitalizations in 4,262 hospitals for acute myocardial infarction, 1,345,909 hospitalizations in 4,494
hospitals for heart failure, and 1,345,909 hospitalizations in 4,524 hospitals for pneumonia. Multilevel
models showed no significant association between income inequality and mortality within 30 days of
admission for patients with acute myocardial infarction, heart failure, or pneumonia. By contrast,
income inequality was associated with re-hospitalization (acute myocardial infarction: risk ratio 1.09,
95% confidence interval 1.03 to 1.15; heart failure: 1.07, 1.01 to 1.12; pneumonia: 1.09, 1.03 to
1.15).84
A case-control study may overcome some of these limitations and difficulties, because it differs from a
cohort study in the selection of participants. In a case-control study, patients who already have a
disease (the case group) and people who do not have the disease (the control group) are selected for
investigation, and the proportions of each group with the exposures of interest are compared.
1) Selection of cases
Incident or prevalent cases from a hospital (or several hospitals), a physician's office, or a patient
registry
2) Selection of controls
Community population
Neighborhood
Family members
3) Ascertainment of exposures (e.g., recall histories from patients and controls)
4) Data analyses
Compared with a cohort study, a case-control study is relatively quick and inexpensive.
It is the best study design for investigating risk factors for rare diseases.
A case-control study can evaluate multiple risk factors for one disease, because the study begins
with individuals who already have the disease of interest and then collects exposure data through recall.
Selection bias is one of the main issues in a case-control study, because the cases are selected
after the disease has occurred, and the controls are selected on the basis of certain matching
criteria and their availability at the time of the study, so they are not necessarily a representative
sample of the target population. In addition, exposure information depends on participants' recall
and/or historically documented measures, so potential information bias (e.g., recall bias) may arise.
Matching: For each case, find a control that looks just like him/her in all other possible ways except for the exposure of interest.
Individual matching (one-to-one matching): one case is matched to one control (note: this type
of matching has implications for the data analyses). For example, in a case-control study, if we
wanted to test whether drug A is a risk factor for heart failure, and we think (or learn from a
literature review) that age is a probable confounding variable, we should match cases and controls
by age; for example, each 65-year-old patient with heart failure (a case) would be matched with a healthy
65-year-old control person. In a cohort study, a degree of matching is also possible; it is
often achieved by admitting only certain age groups or a certain sex into the study population, creating a
restricted cohort.
Frequency matching: rather than matching each control to one case, frequency matching requires
that the frequency distributions of the matched variables be similar in the case and control groups.
For example, if the case group consists of 100 young men, 50 old men, 25 young women, and 25
old women, then the control group will be assembled to match those frequencies. Frequency matching does not have to
pair individual cases and controls.
Multiple controls (one case matched to many controls): for each case, find one or more controls. This is
recommended by many epidemiologists when feasible (but there is little statistical gain from having more than
about four controls per case).
Blinding: If possible, investigators who assess exposures should be blinded to whether the study participant is a case or a control.
In a case-control study, subjects are sampled from two populations (cases and controls), and cases
may be incident or prevalent. In a cross-sectional study, subjects are sampled from one population, and cases are prevalent only.
Example:
In Dr. Dunlay and colleagues' report, a population-based case-control study and analysis were
conducted to examine risk factors for heart failure. Residents living in Olmsted County, Minnesota, who
had a first diagnosis of heart failure, identified by International Classification of Diseases, Ninth
Revision (ICD-9) code 428, between 1979 and 2002 were selected, and were age- (within three years) and sex-
matched to population-based controls using the Rochester Epidemiology Project resource. The occurrence
of each risk factor of interest (coronary heart disease, hypertension, diabetes, obesity, smoking) was
collected from age 18 (or the date of migration to Olmsted County after that) until the date of incident
heart failure, or the index date for controls (i.e., the date each control was recruited). The results suggest that
hypertension was the most common risk factor (population attributable risk, PAR=66%), followed by smoking
(51%). The risk of heart failure was particularly high for coronary heart disease and diabetes, with odds
ratios (95% confidence intervals) of 3.03 (2.36-3.95) and 2.65 (1.98-3.54), respectively.85
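The odds ratios reported in case-control studies like this one are computed from 2 x 2 tables. The sketch below shows the calculation with a Woolf-type (log-based) confidence interval; the counts are hypothetical and are not Dunlay and colleagues' data.

```python
import math

def odds_ratio(a, b, c, d, z=1.96):
    """Odds ratio with a Woolf (log-based) 95% CI for a case-control
    2 x 2 table: a = exposed cases, b = exposed controls,
    c = unexposed cases, d = unexposed controls."""
    or_ = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lo = math.exp(math.log(or_) - z * se_log)
    hi = math.exp(math.log(or_) + z * se_log)
    return or_, lo, hi

# Hypothetical counts: 120 of 400 HF cases versus 60 of 400 controls
# had the exposure of interest (e.g., coronary heart disease).
or_, lo, hi = odds_ratio(120, 60, 280, 340)
print(f"OR = {or_:.2f} (95% CI {lo:.2f}-{hi:.2f})")  # OR = 2.43 (95% CI 1.71-3.44)
```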
(5). Nested case-control study
A study design that has been used increasingly in recent years is the nested case-control study, a hybrid
design in which a case-control study is nested within a cohort study. The cohort, at its inception or during
the course of follow-up, has had exposure information and/or biospecimens of interest to the
investigator collected. The investigator identifies cases of disease that occurred in the cohort during the
follow-up period, and also identifies disease-free individuals within the cohort to serve as controls.
Using previously collected data and obtaining additional measurements of exposures from available
biospecimens, the investigator compares the exposure frequencies in cases and controls as in a
conventional (non-nested) case-control study.
[Figure: schematic of a nested case-control study. At the study start, all participants are free of the disease of interest. An exposure group (e.g., participants with serum 25-hydroxyvitamin D insufficiency) is followed over time (direction of inquiry). During follow-up, some participants develop the disease and become cases; at the end of follow-up (the original cohort design), a nested case-control study is carried out within the cohort.]
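Control selection in a nested case-control study is often done by incidence-density (risk-set) sampling: for each case, controls are drawn from cohort members still at risk at the case's event time. Below is a toy sketch of this sampling step, with a hypothetical six-person cohort; the data structure and function names are illustrative, not from the text.

```python
import random

def risk_set_sample(cohort, n_controls=2, seed=42):
    """Incidence-density (risk-set) sampling for a nested case-control
    study. cohort maps participant id -> (event_time or None, end of
    follow-up). For each case, controls are drawn from members still
    under follow-up and event-free at the case's event time."""
    rng = random.Random(seed)
    pairs = []
    for case_id, (t_event, _) in cohort.items():
        if t_event is None:
            continue  # never became a case
        risk_set = [pid for pid, (t, end) in cohort.items()
                    if pid != case_id and end >= t_event
                    and (t is None or t > t_event)]
        pairs.append((case_id, rng.sample(risk_set, min(n_controls, len(risk_set)))))
    return pairs

# Toy cohort: participants 1 and 4 develop the disease at years 3 and 6;
# the others are event-free through their follow-up end.
cohort = {1: (3, 3), 2: (None, 10), 3: (None, 10),
          4: (6, 6), 5: (None, 10), 6: (None, 8)}
print(risk_set_sample(cohort))
```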
As compared with a cohort study, a nested case-control study saves time and is less expensive.
There is no recall bias, because exposure information was collected before the disease occurred.
Example
Drs. Hsiao and Hsieh et al. examined the association between use of dopamine agonists and risk of heart
failure in a nested case-control study using data from Taiwan's National Health Insurance Research
Database (NHIRD). They identified a population-based cohort comprising 27,135 patients who were
prescribed anti-parkinsonian drugs between 2001 and 2010. Within this cohort, a nested case-control study
was carried out in which 1,707 cases of newly diagnosed heart failure were matched to 3,414 controls
(1:2, matched according to age, gender, and cohort entry year). Their results showed
an increased risk of heart failure with current use of ergot-derived dopamine agonists
(adjusted odds ratio [OR] 1.46, 95% confidence interval [CI] 0.99-2.12) but not with current use of
non-ergot-derived dopamine agonists (adjusted OR 1.24, 95% CI 0.84-1.82). Among non-ergot-derived
dopamine agonists, both pramipexole (adjusted OR 1.40, 95% CI 0.75-2.61) and ropinirole (adjusted OR
1.22, 95% CI 0.76-1.95) showed a non-significantly increased heart failure risk. Although this study may
have been limited by a lack of statistical power, a clear pattern of increasing risk of heart failure with
increasing duration of pramipexole use was observed.87 Figure 3.6 depicts the adjusted odds ratios (ORs) and 95%
confidence intervals (CIs) for the associations between heart failure and current use of different
dopamine agonists.
Figure 3.6 Adjusted odds ratios (95%CI) for the association heart failure and current use of
different dopamine agonists relative non-dopamine agnist treatment
Use of dopamine agonists                 Adjusted OR (95% CI)
Non-dopamine agonist treatment           1 (Reference)
Current use of any dopamine agonist      1.22 (0.89 - 1.67)
Non-ergot-derived dopamine agonists      1.24 (0.84 - 1.82)
  Pramipexole                            1.40 (0.75 - 2.61)
  Ropinirole                             1.22 (0.76 - 1.95)
Ergot-derived dopamine agonists          1.46 (1.00 - 2.12)
  Cabergoline*                           2.39 (0.41 - 14.1)
  Pergolide                              1.39 (0.77 - 2.48)
  Bromocriptine                          1.54 (0.93 - 2.55)
*The upper confidence limit was 14.1; in the original figure the bar is drawn shorter to fit the plot.
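For readers who want to see how an odds ratio and its 95% CI are computed, the sketch below uses the log (Woolf) method; the 2x2 counts are invented for illustration and are not taken from the Hsiao study.

```python
import math

# Odds ratio with a 95% CI (Woolf / log method) from a 2x2 table.
# The counts below are hypothetical, not data from the study above.
def odds_ratio_ci(a, b, c, d, z=1.96):
    """a = exposed cases, b = unexposed cases,
    c = exposed controls, d = unexposed controls."""
    or_ = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # SE of ln(OR)
    lo = math.exp(math.log(or_) - z * se_log)
    hi = math.exp(math.log(or_) + z * se_log)
    return or_, lo, hi

or_, lo, hi = odds_ratio_ci(40, 160, 60, 360)
print(f"OR = {or_:.2f} (95% CI {lo:.2f}-{hi:.2f})")  # -> OR = 1.50 (95% CI 0.96-2.33)
```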
Case-control study vs. cross-sectional study:
In a case-control study, subjects are sampled from two populations (cases and controls), and cases may be incident or prevalent.
In a cross-sectional study, subjects are sampled from one population, and cases are prevalent only.
As distinct from an observational study (i.e., a cross-sectional, cohort, or case-control study), an
experimental study is one in which the investigators have some control over a determinant of interest. By the
nature of their design, experimental studies are necessarily prospective (cohort) studies. A case-control
study, however, cannot be experimental, because the participants are selected after they have
developed (or not developed) the outcome.
Experimental studies most commonly investigate treatments and/or interventions for a
disease of interest among patients and/or subjects at high risk of the disease. Experimental studies can be
classified in several ways, depending on their design and purpose. Two major types of experimental
studies (i.e., randomized trials) may be identified: (1) randomized clinical trials, in which treatments
and/or interventions are allocated to individuals; and (2) community trials, in which treatments and/or
interventions are allocated to communities or groups.
A trial is called a randomized clinical trial when the participants are hospital or doctors' patients; such trials
may also be classified as hospital-based clinical trials. This design is commonly applied in new drug or medical product development.
Before a clinical or community trial begins, investigators review prior information about the intervention
(such as a drug) to develop specific research questions and objectives. Although different studies have
their own specific research questions and objectives, investigators should keep in mind two important
aspects when generating their research questions and objectives: (1) Significance, and (2)
Innovation.[Ref]
Significance: What is the significance of the study? Does the study address an important problem or a
critical barrier to progress in the field? Is there a strong scientific premise for the study? If the objectives
of the study are achieved, how will scientific knowledge, technical capability, and/or clinical practice be
improved? How will successful completion of the objectives change the concepts, methods,
technologies, treatments, services, or preventive interventions that drive this field?
Innovation: Does the study challenge and seek to shift current research or clinical practice paradigms by
utilizing novel concepts, approaches or methodologies, instrumentation, or interventions?
Are the concepts, approaches or methodologies, instrumentation, or interventions novel to one field of
research, or novel in a broad sense?
When specific research questions have been developed, the investigators then need to decide:
Whether there will be a control group, and what other means will be used to limit research bias
What assessments will be conducted, when, and what data will be collected
Figure 3.7 shows the basic design of a randomized trial. We begin with a defined population that is
randomized to receive either the new treatment or the current treatment. We then follow the subjects in each
group to see how many improved in the new treatment group compared with the current treatment group.
[Figure 3.7: a defined population is randomized to new treatment or current treatment; within each arm, subjects are classified at follow-up as improved or not improved.]
In a clinical trial, randomization and blinding (including double blinding) are important approaches to control bias.
Randomized controlled trial: a method in which the study population is divided randomly, to mitigate the
chances of self-selection by participants or bias by the study designers. Before the experiment begins, the
investigators assign the members of the participant pool to their groups (e.g., the control and intervention
arms of a parallel design).
Two common methods are applied in the randomization process:
Using a random number table, which is created by statisticians and can be found in most
biostatistics textbooks
Computer assignment
Blinding: subjects do not know which group they are assigned to in a study (single-blind). This avoids
certain psychological responses that may affect the study if the subject knows which treatment he or she is
receiving.
Double blinding: not only are subjects unaware of which group they are assigned to, but the data
collectors (often including the physician) and the data analysts are blinded as well.
Placebo: an inert substance that looks, tastes, and smells like the active agent (i.e., the new drug).
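As a sketch of the "computer assignment" option mentioned above, the following uses permuted-block randomization, one common computer-based scheme for keeping the two arms balanced; the block size, seed, and arm labels are illustrative assumptions, not recommendations from the text.

```python
import random

# Permuted-block randomization: each block of 4 contains exactly two
# "new" and two "current" assignments in a random order, so the arms
# stay balanced throughout recruitment. Block size 4 is illustrative.
def permuted_block_randomization(n_subjects, block_size=4, seed=42):
    rng = random.Random(seed)  # fixed seed so the allocation list is reproducible
    assignments = []
    while len(assignments) < n_subjects:
        block = ["new"] * (block_size // 2) + ["current"] * (block_size // 2)
        rng.shuffle(block)
        assignments.extend(block)
    return assignments[:n_subjects]

arms = permuted_block_randomization(12)
print(arms.count("new"), arms.count("current"))  # -> 6 6
```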
For example, a randomized controlled trial was conducted to evaluate the efficacy and safety of treatment
with sildenafil (ClinicalTrials.gov number NCT00309790) for 12 weeks in patients with systolic heart
failure receiving standard heart failure therapy.89 In the trial, placebo-controlled, double-blind and
parallel-assignment methods were used to compare two groups of patients. One group received
sildenafil and the other group received a placebo (a pill that looks like sildenafil but contains no
medication). Patients had a heart catheterization, echocardiogram and exercise stress test, then
took the study medication for 12 weeks, after which a repeat heart catheterization, echocardiogram and
exercise stress test were performed. Patients aged 18 years and older who had left ventricular systolic
dysfunction (LVSD, LV ejection fraction <40%), had New York Heart Association class II to IV chronic
heart failure despite standard heart failure therapies, and had secondary pulmonary hypertension (PH,
defined by a mean pulmonary arterial pressure >25 mm Hg) were included. Patients taking the
following medications: nitroglycerine pill/patch/paste, Isordil, Imdur, antifungal agents and certain
antidepressants, as well as patients with a history of optic neuropathy or unexplained visual impairment
and/or with anemia, were excluded. The primary outcome measures included (1) exercise capacity
measured by an exercise stress test and heart pressures measured by heart catheterization, performed at
baseline and again after taking sildenafil for 12 weeks, and (2) quality of
life measured by the Minnesota Living With Heart Failure questionnaire at baseline and at 12 weeks. A
total of 37 patients who met the recruitment criteria were randomized to the treatment and placebo groups
for a 12-week trial. The results show that sildenafil improves exercise capacity and quality of life in
patients with systolic heart failure and secondary PH. The change in peak VO2 from baseline, the
primary end point, was greater in the sildenafil group (1.8±0.7 mL·kg-1·min-1) than in the placebo group.
It should be noted that the Food and Drug Administration (FDA) describes a clinical trial of a drug with
five phases.91
Phase 0: Exploratory study involving very limited human exposure to the drug, with no therapeutic or
diagnostic goals.
Phase 1: Studies that are usually conducted with healthy volunteers and that emphasize safety. The
goal is to find out what the drug's most frequent and serious adverse events are and, often, how the
drug is metabolized and excreted.
Phase 2: Studies that gather preliminary data on effectiveness (whether the drug works in people
who have a certain disease or condition). For example, participants receiving the drug may be compared
to similar participants receiving a different treatment, usually an inactive substance called a placebo, or
a different drug. Safety continues to be evaluated, and short-term adverse events are studied.
Phase 3: Studies that gather more information about safety and effectiveness by studying different
populations and different dosages and by using the drug in combination with other drugs.
Phase 4: Studies occurring after FDA has approved a drug for marketing. These include
postmarket requirement and commitment studies that are required of or agreed to by the study
sponsor. These studies gather additional information about a drug's safety, efficacy, or optimal use.
It should be noted that current clinical trials typically follow a series from early, small-scale Phase 1
studies to late-stage, large-scale Phase 3 studies.
Phase 1: 20-100 healthy volunteers or people with the study disease or condition of interest.
The purpose of Phase 1 is to test the safety and dosage of a new treatment. It commonly takes several
months to complete this phase.
Phase 2: Up to several hundred patients with the disease or condition of interest. The purpose
of Phase 2 is to test efficacy and side effects. It commonly takes several months to 2 years to complete
this phase.
Phase 3: 300 to 3,000 volunteers who have the disease or condition of interest. The purpose of
Phase 3 is to further test efficacy and to monitor adverse reactions in a larger group of study participants.
The above three phases are mandatory for the development of a new medical product. In addition,
Phase 4 trials are carried out once the drug or device has been approved by FDA, during post-market
safety monitoring. The purpose of Phase 4 is to further test safety and efficacy using real-world data.
It is clear that most clinical trials are conducted in patients with a disease or condition of interest, and they
are hospital-based studies. However, the number of patients who receive healthcare in
hospitals is much smaller than the number of people who live in communities and who may not yet
have been clinically diagnosed with the disease or condition of interest, but who are at high risk
of developing it (the so-called iceberg phenomenon).13 Therefore, a high level
of attention should also be given to community-based studies and trials.
Strategies and processes of data collection, and quality control and evaluation of the data processes
A common mistake is for a researcher to think that the more data they collect, the better. Collecting
data that are unnecessary for the study's purpose not only increases cost, but also consumes time. On
the other hand, failing to collect the data we do need will lead to an unsuccessful study as well. What data
we should collect in a study must be decided case by case. The following principles offer an overall
strategy of data collection in terms of outcomes, predictors and covariates:
(1). Outcome(s): A clearly defined and measurable outcome should be specified. For example, if the
prognoses of patients with HF are the study outcomes, the definition of prognosis should be
articulated, such as whether it is 30-day readmission in patients with HF, or 30-day in-hospital
mortality, etc.
(2). Predictors: The main factors/predictors of interest should be clearly specified and measurable.
For example, to test an education-based intervention program aimed at reducing
readmission in patients with HF, changes in patients' adherence to medications and adherence to a
healthy lifestyle may be the important measures, because these factors predict readmission rates.
Several other factors, such as the severity of disease, age, sex, and comorbidity status, may also need to be measured.
(3). Covariates: Once the outcome(s) and predictors are specified, potential and important covariates
(or confounders) should be identified and collected as well, because they may confound (bias) the
association between the predictors and the outcome(s).
Last, but not least, it is important to develop a SMART approach for data collection:93
Specific
Measurable
Attainable/Achievable
Relevant
Time bound
Specific: What exactly are we going to collect, for the purpose of testing our specific research
questions?
Measurable implies the ability to count or otherwise quantify an activity or its results. It also means that
the source of, and mechanism for collecting, measurement data are identified, and that collection of
these data is feasible.
If a cohort study is designed, a baseline measurement is required to document change (e.g., to measure
percentage increase or decrease). Another important consideration is whether change can be measured
in a meaningful and interpretable way, given the accuracy of the measurement tool and method. For
example, to estimate population awareness of the signs and symptoms of heart attack, we estimate
awareness using a sample of the state population. Since this is an estimate, there is a chance of error
associated with it, usually expressed by a confidence interval (the point estimate, plus or minus an
estimate of variability). Projecting a very small change in population awareness, although measurable,
might not be meaningful if the change projected falls within the expected variability of the estimate.
Attainable/Achievable: Can we get it done in the proposed time frame with the resources and support
we have available?
The objective and data collection approaches must be feasible with the available resources,
appropriately limited in scope, and within the program's control and influence.
Relevant: Are the data being collected relevant to the study?
Relevance concerns the relationship between the specific research questions and the overall goals of the
study. Evidence of relevance can come from a literature review, best practices, or new theory.
Time bound: A specified and reasonable time frame should be incorporated into the study design and the specific
research questions for data collection. If an intervention study (either a clinical trial or a community trial) is
designed, it should take into consideration the environment in which the change must be achieved,
the scope of the change expected, and how it fits into the overall work plan.
One of the most frequent questions asked by physicians conducting trials of new agents is: How many
subjects do we have to study? To answer this question, we need to understand the following relevant
concepts:
Effect size
Type I error and Type II error
Power of the study
Effect size
Assume that in a study we know the current cure rate for patients with heart failure is 60%, and a new
therapy may have a cure rate of 62%. Does the 2% difference mean the new therapy is better than the
current one? Most people would consider this difference too small to be clinically important: it may have
arisen by chance and, even if it did not, it is of little practical consequence. Suppose the difference between the current
and new therapies is 5%. Would we say the new therapy is better? That would depend on how large a
difference we thought was clinically important. The size of the difference we want to detect is called the
effect size.
Example: Consider a trial in which groups receive one of two therapies, therapy A and therapy B. Before
beginning the study, we can list the four possible study results:
(a). The treatments are not different, and we correctly conclude that they are not different.
(b). The treatments are different, but we conclude that they are not different (Type II error).
(c). The treatments are not different, but we conclude that they are different (Type I error).
(d). The treatments are different, and we correctly conclude that they are different.
Figure 3.8 shows the four potential results.

Figure 3.8 The four potential results of a trial
                                           Reality
We conclude                     Treatments are not different    Treatments are different
Treatments are different        Type I error (α)                Correct conclusion
Treatments are not different    Correct conclusion              Type II error (β)

α = probability of concluding the treatments are different when in reality they are not different.
In a significance test, if we set alpha (α) at p ≤ 0.05, this means we are willing to take a risk of a Type I
error of 5%.
β = probability of concluding the treatments are not different when in reality they are different.
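The meaning of α can be made concrete by simulation: when the two treatments are truly identical, a two-tailed test at α = 0.05 should (wrongly) conclude "different" about 5% of the time. The sketch below uses a simple z-test on simulated normal outcomes with known variance, an assumption made purely for illustration.

```python
import random

# Simulate the type I error rate: both arms are drawn from the SAME
# distribution, so any "significant" difference is a false positive.
def type_i_error_rate(n_trials=20000, n_per_arm=50, seed=1):
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_trials):
        a = [rng.gauss(0, 1) for _ in range(n_per_arm)]
        b = [rng.gauss(0, 1) for _ in range(n_per_arm)]
        diff = sum(a) / n_per_arm - sum(b) / n_per_arm
        se = (2 / n_per_arm) ** 0.5      # known variance of 1 in each arm
        if abs(diff / se) > 1.96:        # two-tailed test at alpha = 0.05
            rejections += 1
    return rejections / n_trials

print(type_i_error_rate())  # typically prints a value close to 0.05
```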
Power of the study
Statistical power is the probability of finding a real effect (of the size that you think is clinically
important):
Power = 1 − β
To estimate the required sample size, we need an estimate of the difference (i.e., the effect size)
between the current rate and the expected rate to be detected.
To calculate the sample size (n) for the difference between two proportions:

n = [(p1q1 + p2q2) / (p2 − p1)²] × f(alpha, power)

where p1 = rate in group 1, q1 = 1 − p1; p2 = rate in group 2, q2 = 1 − p2; and alpha = significance level.
The value of f(alpha, power) for a two-tailed test can be obtained from the table of k values below (f = k).
Example: In a study, we know the current cure rate for patients with heart failure is 60%, and we expect a
cure rate of 75% with a new therapy. How many subjects do we have to study, with alpha of 0.05 and
power of 0.80 (f ≈ 7.9)?

n = [((.60 × .40) + (.75 × .25)) / (.75 − .60)²] × 7.9 ≈ 150

The investigators would need 150 participants in each group to be 80% sure that they can detect a
difference of this size (60% vs. 75%), if it exists.
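The calculation above can be sketched in code; the function below simply implements the formula from the text, with f(alpha, power) = 7.849 for alpha = 0.05 and power = 0.80 taken from the table of k values.

```python
import math

# Sample size per group for comparing two proportions, following the
# formula in the text: n = (p1*q1 + p2*q2) / (p2 - p1)^2 * f(alpha, power).
def n_two_proportions(p1, p2, f=7.849):
    q1, q2 = 1 - p1, 1 - p2
    return math.ceil(f * (p1 * q1 + p2 * q2) / (p2 - p1) ** 2)

# Cure rate 60% now vs. 75% expected with the new therapy
print(n_two_proportions(0.60, 0.75))  # -> 150
```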
To calculate the sample size (n) for the difference between two means, assuming the two groups are of equal
size and variance, the formula is:

n = (k × 2σ²) / (MD)²

where σ² is the error variance, MD is the minimum difference an investigator wishes to detect, and k depends on
the significance level and power desired. Selected values of k are shown in the table below.
Table. Alpha, power and k values
Significance level    Power    k
0.05                  0.99     18.372
                      0.95     12.995
                      0.90     10.507
                      0.80      7.849
0.01                  0.99     24.031
                      0.95     17.814
                      0.90     14.879
                      0.80     11.679
Example: To detect a difference in mean systolic blood pressure of 5 mmHg between two groups of people, where
the variance σ² = 16² = 256, at a significance level of 0.05 and with power of 0.80, the estimated
sample size is:

n = (7.849 × 2 × 256) / 5² ≈ 161

That is, we would need 161 participants in each group; this means we are 80% likely to detect a difference
of 5 mmHg if one truly exists.
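The two-means formula can be coded the same way, again simply implementing the formula from the text, with k taken from the table above.

```python
import math

# Sample size per group for comparing two means: n = 2*k*sigma^2 / MD^2,
# with k = 7.849 for alpha = 0.05 and power = 0.80.
def n_two_means(sigma, min_diff, k=7.849):
    return math.ceil(2 * k * sigma ** 2 / min_diff ** 2)

# Detect a 5 mmHg difference in systolic blood pressure, SD = 16
print(n_two_means(16, 5))  # -> 161
```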
To estimate a sample size, we may calculate it using the formulas above, or use computer software; most
statistical software can estimate a sample size easily as long as we have the required information.
Whenever we carry out a trial, the ultimate objective is to generalize the results beyond the study
population itself. Two basic concepts are related to the generalizability of results: internal validity and
external validity.
Internal validity: whether the study was well done, and the findings are valid for the study sample.
External validity: whether the findings are valid for the larger reference population.
[Figure: internal vs. external validity. A study population is drawn from a reference population and randomized to new treatment vs. current treatment. Internal validity refers to the quality of the study itself; external validity (generalizability) refers to whether the findings extend from the study population to the reference population.]
Investigators rely on a number of data sources when planning their research, asking specific research
questions and testing specific research hypotheses. Knowing where a data source resides can make your
research efforts substantially more efficient, representative and meaningful. According to the nature of the
data collection conducted by the investigators, there are two types of data: primary data and secondary
data.
When data are collected directly by the researchers for the first time, they are called primary data. Primary
data collection is necessary when a researcher cannot find the data needed. The advantage
of using primary data is that researchers are collecting information for the specific purposes of their
study. Researchers collect the data themselves, using interviews, surveys and direct observations. Data
collection through chart reviews and/or electronic health record systems (EHRs) can also be considered
primary data collection, particularly if the data are collected by the physicians themselves.
Secondary data collection means obtaining data from another party. There are several types of secondary
data. They include (1) population-based health surveys coordinated by the World Health
Organization, such as the World Health Survey,94-96 (2) population-based health surveys conducted by
national health agencies, such as the U.S. National Health and Nutrition Examination Survey
(NHANES)33,97 and the Behavioral Risk Factor Surveillance System (BRFSS),98 and (3) state-wide, regional, city and
community-based health surveys, such as the Southeastern Pennsylvania Household Health Surveys.38,48
One type of secondary data that is used increasingly is administrative and large survey data, such as NHANES and BRFSS.
Compared to primary data, secondary data tend to be readily available and inexpensive to obtain. In
addition, administrative data tend to have large samples, because the data collection is comprehensive
and routine. What's more, administrative data (and many types of secondary data) are collected over a
long period, which supports analyses of trends over time. An investigator may obtain a secondary dataset
from the U.S. National Institutes of Health directly. For example:
NHLBI Biorepository:
The National Heart, Lung, and Blood Institute (NHLBI) is one of 27 Institutes and Centers at the National
Institutes of Health. The Institute supports basic, translational and clinical research in heart, lung and
blood diseases. It has supported data collection from participants in epidemiology studies and clinical
trials for over six decades. These data have often been sent to the NHLBI at the conclusion of a study
and placed in a Data Repository. The Data Repository is managed by NHLBI staff in the DCVS
Epidemiology Branch and includes individual-level data on more than 580,000 participants from over
110 Institute-supported clinical trials and observational studies.99 Figure 3.10 depicts an overview of the
NHLBI Data Repository.
For example, an investigator may request data from the NHLBI repository on studies that included heart failure:
Heart Failure Network (HFN) CARdiorenal REScue Study in Acute Decompensated Heart Failure
(CARRESS)
Heart Failure Network (HFN) Diuretic Optimization Strategies Evaluation in Acute Heart Failure (DOSE-AHF)
The mission of the National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK) is to conduct and support
medical research and research training, and to disseminate science-based information on diabetes and
other endocrine and metabolic diseases; digestive diseases, nutritional disorders, and obesity; and
kidney, urologic, and hematologic diseases, in order to improve people's health and quality of life. A number of
existing datasets are available upon application to conduct a secondary data analysis and/or to set up
an ancillary study. Figure 3.11 depicts the NIDDK Central Repository website.100
Several study datasets that include heart failure as an outcome and/or comorbidity are available, for example:
Assessment, Serial Evaluation, and Subsequent Sequelae of Acute Kidney Injury (ASSESS-AKI) Study
3.8 Statistical analysis strategies for a study with a specific research design
Studies with different research designs, and different characteristics of the measures of the outcomes and
exposures of interest, in general require different approaches to data analysis. Table 3.2 shows the
commonly used analysis strategies by research design.
Table 3.2 Statistical analysis strategies

Research design     Univariate analysis                        Multivariate analysis
Descriptive study
  Ecological        Mean, frequency; simple correlation,       Partial correlation;
                    regression; test for trend (rates)         multiple regression
  Cross-sectional   Mean, frequency; simple correlation,       Linear regression;
                    regression; test for trend (rates)         logistic regression
Analytical study
  Case-control      Mean, frequency; odds ratios (OR)          Logistic regression
  Cohort            Mean, frequency; relative risk (RR);       Cox's regression
                    PAR*; Kaplan-Meier
  Clinical trial    Kaplan-Meier                               Cox's regression
  Community trial   Kaplan-Meier                               Cox's regression

*PAR: Population attributable risk.
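Several of the designs in Table 3.2 list Kaplan-Meier analysis; a minimal sketch of the estimator is below. The follow-up times and event indicators are invented for illustration.

```python
# A minimal Kaplan-Meier estimator for illustration of the survival
# analyses listed in Table 3.2; the data below are hypothetical.
def kaplan_meier(times, events):
    """times: follow-up time for each subject; events: 1 = event, 0 = censored.
    Returns the survival curve as (time, S(t)) at each distinct event time.
    Subjects censored at time t are counted as at risk at t (usual convention)."""
    data = list(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for tt, e in data if tt == t and e == 1)
        if deaths > 0:
            surv *= (n_at_risk - deaths) / n_at_risk
            curve.append((t, surv))
        n_at_risk -= sum(1 for tt, _ in data if tt == t)  # events + censored leave
    return curve

curve = kaplan_meier([2, 3, 3, 5, 8, 8, 9], [1, 1, 0, 1, 1, 0, 0])
for t, s in curve:
    print(t, round(s, 3))  # -> 2 0.857 / 3 0.714 / 5 0.536 / 8 0.357
```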
Significance
Research design is fundamental to an epidemiological study. There are several ways to
classify research designs: as descriptive versus analytical epidemiological
studies, or as observational versus experimental studies. Whatever classification is used, studies
that describe determinants and outcomes by person, place and time, and that
compare determinants and outcomes between exposed and non-exposed groups, are
important for providing meaningful scientific evidence for etiological studies of disease and for healthcare
practice.
Cross-sectional studies are designed to determine "what is happening?" right now. Subjects are selected
from one population and examined at a single point in time.
Case-control studies begin with the absence (i.e., controls) and presence (i.e., cases) of an outcome and
then look backward in time to try to detect possible causes or risk factors that may have been suggested
in descriptive studies (such as an ecological study, a cross-sectional study or a clinical case report). They ask
"what happened?"
Cohort studies are forward-looking, from a risk factor to an outcome. They ask "what will happen?"
Clinical trials, including community trials, most with randomized control approaches, are forward-
looking from a new treatment and/or intervention to outcomes. They ask "does it work?" (i.e., does the
new treatment and/or intervention work?)