Commentary
Key words: epidemiology; oral health; outcomes assessment; quality of life; data interpretation
Correspondence: Dr Georgios Tsakos, Department of Epidemiology and Public Health, UCL, 1-19 Torrington Place, London WC1E 6BT, UK. e-mail: g.tsakos@ucl.ac.uk
Submitted 2 December 2010; accepted 1 October 2011
doi: 10.1111/j.1600-0528.2011.00651.x

Assessing the subjective dimensions of oral health has become a major focus of enquiry in dentistry, and there is now a substantial body of research documenting the self-perceived oral health of patients and populations. Early contributions to this field (1-6) were concerned with changing concepts of health and models of disease and its consequences. Together these provided a conceptual and theoretical rationale for the development of indices and scales to measure the constructs defined by those models. Indeed, a number of indices have been developed (7, 8) and continue to evolve. At the same time, there has been some debate about what these indices actually measure and what they should be called (8, 9). Although most have the same format, that is they assess the frequency and/or severity of functional and psychosocial impacts associated with oral disorders, they have been variously labelled as sociodental indicators, subjective oral health status measures, oral health outcome measures, oral health-related quality of life measures or quality of life measures. A similar debate has taken place in medicine. As it does not appear to be easily resolved, Fitzpatrick et al. (10) have suggested the umbrella term patient-based outcome measures (PBOs), on the grounds that all are dependent upon what patients have to say about their health. For consistency, this is the term used in this paper, though we acknowledge that it is inaccurate for epidemiological studies, where participants are not patients and participant-based outcomes would be more appropriate.
Tsakos et al.
- Studies exploring their potential use, in combination with clinical measures, in assessing needs for dental care.
- Clinical trials measuring the effectiveness of interventions, where PBO measures are used as either primary or secondary outcomes, in addition to clinical assessments.

The evaluation of PBOs should be based on both theoretical and technical requirements. Theoretical requirements, discussed in a previous commentary (8), refer primarily to the theoretical models employed and precise definitions of the concepts measured, and provide important background information that affects the meaning and interpretation of scores. This commentary focuses on the technical requirements of PBOs in oral health.

Technical requirements

Depending on the context in which they are used and the study design employed (cross-sectional or longitudinal), the main technical requirements of a PBO measure are reliability, validity and sensitivity to change (11-13). Some are suitable for measuring between-group differences in cross-sectional population or clinic-based studies, whereas others can be more suited to measuring change in clinical trials and intervention studies. It must not be assumed that a PBO measure can perform all these tasks equally well. In essence, the measure chosen must be suited to the purpose for which it is being used and have measurement properties to match. However, in all studies using PBO measures, the fundamental aim is to detect differences between groups, either at one point in time (e.g. differences between socioeconomic groups) or over time (e.g. pre- and post-treatment differences). In the literature to date, the most common way of presenting data from these studies is in terms of aggregate scores along with an appropriate (or sometimes inappropriate) test of the statistical significance of the differences. The use of single aggregate scores is not without limitations, as shown for generic PBOs (14, 15) and also in relation to clinical periodontal indicators (16), and they should be interpreted with caution. More importantly, there is limited guidance on what constitutes clinical relevance for PBO measurements, and this also has practical implications, for example offering little help to inform power calculations for clinical trials. We argue that reporting only aggregate scores and assessing the statistical significance of differences are insufficient and suggest instead a more comprehensive and thoughtful approach to the reporting and interpretation of data.
Different scoring formats (prevalence, extent and severity) have been calculated for the OHIP-14 (18, 19) and the OIDP (19, 20). For the OHIP-14, prevalence refers to the proportion of subjects with one or more items experienced 'fairly often' or 'very often', though this cut-off is recognized as arbitrary. Extent is the number of items experienced 'fairly often' or 'very often', while severity is a simple summation of the response codes to all 14 items. Prevalence, extent and intensity have also been suggested for the Child-OIDP score calculation (21). Prevalence refers to the proportion of subjects who reported one or more daily life performances affected by their oral conditions, extent indicates the number of performances affected and intensity is used to classify subjects into groups according to their highest score in any performance. Children with the same overall score may well vary in their extent and intensity scores; higher extent indicates more daily life performances affected, whereas higher intensity indicates a more severe effect in at least one performance.

Such different scoring formats of PBOs provide complementary information and a more sophisticated approach to scoring. Reporting findings for different scoring formats for PBO measures should be encouraged as a first step towards improving their interpretability. There may also be subjects inconsistently classified by the different scoring formats. Whether such inconsistencies can be resolved by altering case definitions is something that might be explored empirically. Conceptually, it may be argued that as the different scoring formats have a different focus, the variable definition of cases is not necessarily a limitation.
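As a sketch of how these complementary scoring formats relate, the fragment below computes prevalence, extent and severity for hypothetical OHIP-14 responses. The 0-4 response coding and the cut-off value are assumptions for illustration, and the function name is invented, not taken from any published scoring script.

```python
# Illustrative OHIP-14 scoring formats: prevalence, extent, severity.
# Responses are assumed coded 0-4 per item, with 3 ("fairly often")
# and 4 ("very often") treated as the (acknowledged arbitrary) cut-off.
CUTOFF = 3

def ohip14_scores(subjects):
    """subjects: list of 14-item response lists, one list per person."""
    n = len(subjects)
    # Prevalence: proportion with >= 1 item at or above the cut-off
    prevalence = sum(
        1 for items in subjects if any(r >= CUTOFF for r in items)
    ) / n
    # Extent: per-person count of items at or above the cut-off
    extent = [sum(1 for r in items if r >= CUTOFF) for items in subjects]
    # Severity: per-person simple summation of all 14 response codes
    severity = [sum(items) for items in subjects]
    return prevalence, extent, severity

sample = [
    [0] * 14,        # no impacts at all
    [4] + [0] * 13,  # one frequent impact, low total score
    [2] * 14,        # many occasional impacts, higher total score
]
prev, ext, sev = ohip14_scores(sample)
print(prev, ext, sev)
```

Note how the third subject has the highest severity score yet zero extent, illustrating why the formats are complementary rather than interchangeable.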
The measurement of change is complex and controversial, and the use of change scores is not free from statistical limitations (22). Furthermore, change can occur in both directions, a pattern masked by aggregate change scores (23). In a trial, some individuals in the intervention or the control group may have positive change scores indicating improvement, while others may have negative change scores indicating deterioration, and some may not change. Moreover, the same mean change score can be due to relatively small changes in the same direction for the whole group or to larger changes in one direction for some subjects while others change in the opposite direction. This applies to all outcome measures, not just PBOs. We suggest that these distinct patterns of change should be reported, rather than simply providing a mean change score that fails to recognize them.

More importantly, significant differences in scores do not provide information about the key research question, which is whether the difference between groups is meaningful either from a clinical or from the patient's perspective. In line with the previous critique, if the PBO scores are meaningless, differences or changes in scores are differences between meaningless estimates. They give the direction of the difference, but without any notion of scale or (more importantly) intrinsic meaning. Statistical significance is only relevant in refuting the null hypothesis of no difference in means. A very large sample, whether in a trial or a population study, can reveal statistical significance that has little relevance or intrinsic meaning. Turned around, when planning power calculations for trials or epidemiological studies, the critical point should be to determine what is clinically meaningful, with the sample size calculated to fit. In this context, interpretability is a key issue.
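The masking effect described above is easy to demonstrate. In this sketch (invented data, not from any trial), two groups share the same mean change score despite very different patterns of change; the sign convention is an assumption:

```python
# Two hypothetical groups of change scores (follow-up minus baseline).
# Negative change = improvement on an impact scale such as the OHIP-14.
uniform_group = [-2, -2, -2, -2, -2, -2]  # everyone improves a little
polarised_group = [-8, -8, -8, 0, 4, 8]   # big gains for some, losses for others

def change_pattern(changes):
    """Summarise direction of change, as the text recommends reporting."""
    mean = sum(changes) / len(changes)
    improved = sum(1 for c in changes if c < 0)
    unchanged = sum(1 for c in changes if c == 0)
    worsened = sum(1 for c in changes if c > 0)
    return mean, improved, unchanged, worsened

print(change_pattern(uniform_group))    # (-2.0, 6, 0, 0)
print(change_pattern(polarised_group))  # (-2.0, 3, 1, 2)
```

Both groups have a mean change of -2.0, but the second contains two people who deteriorated; reporting only the mean hides this entirely.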
Interpretability
An initial distinction that needs to be made is between responsiveness and interpretability. Responsiveness represents a measure's ability to detect change when change has, or might reasonably be expected to have, occurred. Responsiveness is important but, being a technical issue, can be defined relatively easily using statistics. Interpretability, on the other hand, refers to whether these changes are clinically significant or meaningful to a person
experiencing that change. It has been defined as the degree to which one can assign qualitative meaning (that is, clinical or commonly understood connotations) to quantitative scores (13). Assessment of responsiveness implies repeated measurements, such as in a clinical trial or clinical outcome study, while interpretability is a generic concept applying to both longitudinal and cross-sectional studies. Consequently, interpretability refers to both single scores and change scores of PBOs, while responsiveness is relevant only to the latter. For presentation purposes, we will initially focus on the interpretability of change scores in longitudinal studies and then refer separately to interpretability in cross-sectional studies. A key concept in determining interpretability is the minimally important difference (MID).
Both the effect size (ES) and the standardized response mean (SRM) are conventionally interpreted against benchmarks (32) as small (0.2), moderate (0.3-0.7) or large (0.8 or above) effects. These benchmarks are useful but do not provide an actual value for the MID. In addition, as both depend on the distribution and variability of PBO scores, these measures assume a normal distribution of change scores; this needs to be demonstrated rather than simply assumed. More importantly, they are sample dependent and can be considerably affected by the dispersion of observations. So, the ES (or SRM) can be large even if the mean difference in the change scores is modest, provided that there is not much dispersion in the baseline or change scores.

The SEM is expressed in the same (original) units as the PBO measure and is more of a fixed characteristic of a measure (31), hence not sample dependent. The value of the SEM indicates what is likely to be measurement error. Therefore, any change smaller than the SEM cannot be disassociated from measurement error (26), while larger values indicate the existence of real change. However, this does not provide concrete evidence for the MID, because differences larger than the measurement error should not de facto be considered important or meaningful. Wyrwich et al. (30, 31) have provided empirical evidence that the SEM is almost equal to the MID in patients with cancer, and the same was the case in a study on periodontal patients (33); whether this holds for other conditions or groups of patients remains to be proven.

Norman et al. (34) suggested another approach based on the distribution of PBO scores. By reviewing a large number of studies, they concluded that most MIDs were approximately half the standard deviation of baseline PBO scores. Obviously, this is entirely empirical, without any conceptual justification, but it is worth checking whether it continues to provide reasonable estimates as evidence on the MID accumulates from different studies.
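A minimal sketch of the distribution-based quantities discussed here (ES, SRM, SEM and the half-standard-deviation rule of Norman et al.), using invented baseline and follow-up scores and an assumed reliability coefficient:

```python
import statistics as st

# Invented paired scores on an impact scale (lower = better)
baseline = [12, 18, 9, 22, 15, 11, 19, 14]
followup = [9, 15, 8, 16, 13, 10, 14, 12]
changes = [f - b for b, f in zip(baseline, followup)]

sd_baseline = st.stdev(baseline)  # sample SD of baseline scores
sd_change = st.stdev(changes)     # sample SD of change scores
mean_change = st.mean(changes)

es = mean_change / sd_baseline    # effect size
srm = mean_change / sd_change     # standardized response mean

reliability = 0.85                # assumed reliability (e.g. Cronbach's alpha)
sem = sd_baseline * (1 - reliability) ** 0.5  # standard error of measurement
half_sd_mid = 0.5 * sd_baseline   # Norman et al.'s empirical MID estimate

print(round(es, 2), round(srm, 2), round(sem, 2), round(half_sd_mid, 2))
```

Note how the ES and SRM for the same data differ because their denominators differ, and how the SEM and half-SD rule yield candidate MID values in the original score units.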
While distribution-based methods are internally referenced and derived solely from PBO scores, anchor-based (or externally referenced) methods use additional information to determine an external criterion of change to compare PBO scores against. In this respect, known clinical groups, population norms or subjective global transition scales can act as the reference (anchor) point. In the latter case, the MID reflects the mean PBO change score for subjects reporting transition ratings indicative of minimal important change. Transition ratings are easy to use, mirror the kinds of
questions clinicians ask patients and offer a patient-based approach to calculating a MID. However, their use is somewhat controversial, largely because their psychometric properties have been questioned (26, 35, 36), although the progressively larger PBO change scores among groups with better global transition ratings (37) provide some evidence for their construct validity. Examples of this approach to calculating the MID include the work of Juniper et al. (37) using the Asthma Quality of Life measure, and Allen et al. (38) on the OHIP-20. Using an anchor-based method is helpful for longitudinal studies, as it facilitates clinical interpretability while still retaining the richness of the PBO measure as an outcome.
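The anchor-based logic with a global transition rating can be sketched as follows. The data, category labels and sign convention are all invented for illustration; the MID is taken as the mean change score of those reporting minimal important improvement, here the 'a little better' category:

```python
# Hypothetical (change score, global transition rating) pairs.
# Negative change = improvement on an impact scale; labels are assumed.
data = [
    (-1, "no change"), (0, "no change"), (1, "no change"),
    (-3, "a little better"), (-4, "a little better"), (-2, "a little better"),
    (-8, "much better"), (-10, "much better"),
]

def anchor_mid(pairs, anchor="a little better"):
    """Mean change score among those reporting minimal important change."""
    scores = [change for change, rating in pairs if rating == anchor]
    return sum(scores) / len(scores)

print(anchor_mid(data))  # -3.0
```

The choice of which transition category counts as "minimal important change" is itself a judgement, which is one reason these ratings remain controversial.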
MID in cross-sectional studies
The concept of interpretability also applies to cross-sectional studies where the scores of two or more groups are compared. While the differences between groups may be statistically significant, the problem remains whether they are of sufficient magnitude to be regarded as meaningful, either clinically or to the individual. One might suggest that establishing the MID is less critical in cross-sectional studies, as the PBO would rarely be set up as a single primary outcome measure. Nevertheless, we would argue that a marker of interpretability in the form of a MID can still give important clinical context, provided that it is, in turn, interpreted appropriately.

The technical issues in cross-sectional studies are slightly more challenging. Two of the three distribution-based methods for the MID, the ES and the SEM, can be used with cross-sectional data. For example, in a national population survey of Canadian adults, the mean OHIP-14 severity scores were 20.1 for those with only secondary education and 18.3 for those with post-secondary education (P < 0.001). In terms of interpretability, the ES for this comparison was 0.24, i.e. small, and the SEM was 2.7. The difference does not exceed what is likely to be error and hence cannot be considered clinically meaningful. In contrast, the respective difference in mean OHIP-14 severity scores between the lowest and highest income groups was 5.8 (P < 0.001), with an ES of 0.78; this difference exceeded the SEM and should be considered meaningful. Obviously, the use of the ES and SEM in cross-sectional
studies is subject to the same limitations as when assessing changes over time. In contrast to the distribution-based methods, no guidelines have been published for using externally referenced criteria (anchor-based methods) to calculate the MID in cross-sectional studies. Applying the same principle as for longitudinal studies, it is possible to determine anchors for the MID based on differences in mean scores between known clinical groups or oral health ratings. For example, in the same Canadian data, the mean OHIP-14 severity scores of the dentate and edentate, two clinical groups whose PBO assessments could be assumed to differ to a meaningful extent, were 18.6 and 21.8, respectively (P < 0.001). This difference exceeds the SEM and gives an ES of 0.42.

While the distinction between dentate and edentate is important, it may be too broad. The dentate are a diverse group in terms of number of teeth and levels of oral disease. Differences between more refined clinical groups would be preferable, but there is no real consensus as to which groups should be used for that purpose. Considering also the indirect theoretical relationship and relatively weak associations between clinical and PBO measures, there is currently no concrete evidence that PBOs in oral health can be linked to anything other than large clinical differences.

Global ratings of either oral health or its impact on quality of life can also be employed as anchors. These are conventionally scored on ordinal scales, and differences in mean PBO scores between adjacent categories can be used to estimate the MID. However, the choice of categories to use as anchors can have a marked effect on the estimated MID, as differences will probably vary accordingly (in the previous study, they ranged from 5.74 at the bottom of the scale to 1.18 at the top). A potential solution is to use the mean of the mean differences.
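The interpretive logic applied to the survey comparisons above can be sketched by contrasting a between-group mean difference with the SEM and a pooled-SD effect size. The group means echo the education example in the text, but the standard deviations, sample sizes and second comparison are invented for illustration:

```python
# Interpret a cross-sectional between-group difference against the SEM
# and conventional effect-size benchmarks. SDs and ns are invented.
def pooled_sd(sd1, n1, sd2, n2):
    return (((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2)) ** 0.5

def interpret(mean1, sd1, n1, mean2, sd2, n2, sem):
    diff = abs(mean1 - mean2)
    es = diff / pooled_sd(sd1, n1, sd2, n2)
    meaningful = diff > sem  # difference must at least exceed likely error
    return round(es, 2), meaningful

# Small difference: does not exceed the SEM, small ES
print(interpret(20.1, 7.5, 500, 18.3, 7.5, 500, sem=2.7))
# Larger difference: exceeds the SEM, moderate ES
print(interpret(22.0, 7.5, 300, 16.2, 7.5, 300, sem=2.7))
```

A statistically significant P-value could accompany either comparison; only the ES and the comparison against the SEM distinguish them.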
Again, much more empirical evidence is needed to indicate the potential usefulness of this approach. While the challenges of measuring the MID and interpreting PBO scores in cross-sectional studies are acknowledged, there is no reason why conclusions from such studies should not be subject to the same scrutiny as clinical trials. In the case of cross-sectional studies, this paper offers a diagnosis of the problem (of the interpretability of PBO scores) and also some suggestions; we hope that further debate will lead to more robust solutions. And we acknowledge that further progress in terms of interpretability, for both longitudinal and cross-sectional studies, will
described also vary. It seems logical that the interpretability of PBO scores should be context and condition (disease) specific. For instance, it cannot be assumed that the MID for periodontal patients completing a specific PBO measure will also be relevant to patients with TMJ disorders completing the same measure. Such information should be compiled from different cross-sectional and longitudinal studies using PBOs, in the same fashion as for the re-establishment of the psychometric properties of a measure. As a result, a useful body of knowledge on the MID for different PBO measures and populations will gradually be built up and could be stored in a database to facilitate the planning of future studies.
Table 1. Minimum reporting standards for studies using patient-based outcome measures

                                                   Cross-sectional   Longitudinal
Description
  Mean                                             X                 X
  Median                                           X                 (X)
  Alternative scoring formats                      (X)               X
  Change scores distribution
  (improvement; no change; deterioration)          -                 (X)
Interpretation
  Statistical significance                         X                 X
  Effect size                                      X                 X
  Standardized response mean                       -                 X
  Standard error of measurement                    X                 X
  Global ratings (oral health/quality of life)     X                 X
  Well-established clinical groups/benchmarks      X                 X
Where the distribution of scores is skewed, medians rather than means should be used; means are disproportionately affected by outliers, and their reporting may mask the real situation and mislead the discussion. Once the MID is established, it is also important to know what proportion of the sample reported improvement equal to or greater than the MID.

Differences between groups are conventionally assessed through hypothesis tests of statistical significance. In trials, the standard practice is to compare post-treatment scores between groups using ANCOVA to account for the effect of baseline scores. Tests of statistical significance are widely used but are insufficient for decision making, as they provide no information about the magnitude of the difference or whether it has any clinical or public health importance. By addressing this issue, the MID can give meaning to otherwise meaningless PBO scores and guide the interpretation of these differences.

The MID should preferably be calculated through different methods. For distribution-based methods, this implies calculating the ES, SRM and SEM for longitudinal studies, and the ES and SEM for cross-sectional studies. For anchor-based methods, global ratings of oral health (and/or quality of life) should be included in studies using PBOs: current ratings in cross-sectional studies and ratings of change in longitudinal studies. In addition, well-established clinical groups or clinical benchmarks can also be used as anchors for the MID. However, clinical benchmarks require consensus about what constitutes minimum meaningful change in clinical status, and this is still an unresolved issue in most fields, including oral health. Furthermore, clinical benchmarks for calculating the MID in PBOs need to be relevant to, and correspond with, oral health perceptions, for example through clinical manifestations that are recognized by the person. This is not straightforward, particularly for conditions that are silent for part of their progress.
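The point about reporting the proportion of the sample improving by at least the MID reduces to a simple responder count. Change scores, sign convention and MID value here are all invented:

```python
# Proportion of a sample whose improvement meets or exceeds the MID.
# Negative change = improvement; all values invented for illustration.
changes = [-6, -4, -3, -1, 0, 2, -5, -2, -7, 1]
MID = 3  # assumed minimally important difference, in score units

responders = sum(1 for c in changes if -c >= MID)
proportion = responders / len(changes)
print(f"{responders}/{len(changes)} improved by at least the MID "
      f"({proportion:.0%})")
```

Reporting "50% improved by at least the MID" conveys considerably more than a mean change score alone.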
After using different methods, the value of the MID should be determined by triangulating on a single value or small range of values (28, 29). This is easier when the different MID estimates are close to each other, but becomes more difficult when there is larger variation between them. While not all of these recommendations are free of limitations or applicable to all cases, it is worth expanding on the current, very narrowly focused practice and applying them to the reporting of PBOs. After all, this is a way into interpreting what may otherwise be meaningless PBO scores. Future research should also focus on the psychometric properties of global transition ratings and the establishment of consensus clinical benchmarks.
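Triangulating MID estimates from different methods might, as a minimal sketch, reduce to reporting their central tendency and spread. The method labels and estimates below are invented:

```python
import statistics as st

# Hypothetical MID estimates for one PBO measure, one per method
mid_estimates = {
    "SEM": 2.7,
    "half SD": 3.1,
    "transition anchor": 3.0,
    "clinical-group anchor": 4.2,
}

values = sorted(mid_estimates.values())
print("range:", values[0], "-", values[-1])
print("median (triangulated MID):", st.median(values))
```

When the range is narrow, the median is a defensible working MID; when one method diverges sharply (as the clinical-group anchor does here), the divergence itself is worth reporting rather than averaging away.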
Acknowledgements
David Locker initiated this commentary, had primary responsibility for writing it and revised different versions of the text. The paper was finalized after his sudden death, and therefore he did not see the final version. His fellow authors wish to acknowledge the substantial contribution he made to this paper.
References
1. Cohen LK, Jago JD. Toward the formulation of sociodental indicators. Int J Health Serv 1976;6:681-98.
2. Gift HC, Atchison KA. Oral health, health, and health-related quality of life. Med Care 1995;33(11 Suppl):NS57-77.
3. Locker D. Measuring oral health: a conceptual framework. Community Dent Health 1988;5:3-18.
4. Reisine ST. The impact of dental conditions on social functioning and the quality of life. Annu Rev Public Health 1988;9:1-19.
5. Reisine ST, Locker D. Social, psychological and economic impacts of oral conditions and treatments. In: Cohen LK, Gift HC, editors. Disease prevention and oral health promotion. Socio-dental sciences in action. Copenhagen: Munksgaard; 1995; 33-71.
6. Sheiham A, Croog SH. The psychosocial impact of dental diseases on individuals and communities. J Behav Med 1981;4:257-72.
7. Slade GD, editor. Measuring oral health and quality of life. Chapel Hill: Department of Dental Ecology, School of Dentistry, University of North Carolina; 1997.
8. Locker D, Allen F. What do measures of oral health-related quality of life measure? Community Dent Oral Epidemiol 2007;35:401-11.
9. McGrath C, Bedi R. A national study of the importance of oral health to life quality to inform scales of oral health related quality of life. Qual Life Res 2004;13:813-8.
10. Fitzpatrick R, Davey C, Buxton MJ, Jones DR. Evaluating patient-based outcome measures for use in clinical trials. Health Technol Assess 1998;2:i-iv, 1-74.
11. Scientific Advisory Committee of the Medical Outcomes Trust. Assessing health status and quality-of-life instruments: attributes and review criteria. Qual Life Res 2002;11:193-205.
12. Guyatt GH, Kirshner B, Jaeschke R. Measuring health status: what are the necessary measurement properties? J Clin Epidemiol 1992;45:1341-5.
13. Lohr KN, Aaronson NK, Alonso J, Burnam MA, Patrick DL, Perrin EB et al. Evaluating quality-of-life and health status instruments: development of scientific review criteria. Clin Ther 1996;18:979-92.
14. Osoba D. Measuring the effect of cancer on health-related quality of life. Pharmacoeconomics 1995;7:308-19.
15. Simon GE, Revicki DA, Grothaus L, Vonkorff M. SF-36 summary scores: are physical and mental health truly distinct? Med Care 1998;36:567-72.
16. Imrey PB. Considerations in the statistical analysis of clinical trials in periodontitis. J Clin Periodontol 1986;13:517-32.
17. Chobanian AV, Bakris GL, Black HR, Cushman WC, Green LA, Izzo JL Jr et al. The Seventh Report of the Joint National Committee on Prevention, Detection, Evaluation, and Treatment of High Blood Pressure: the JNC 7 report. JAMA 2003;289:2560-72.
18. Slade GD, Nuttall N, Sanders AE, Steele JG, Allen PF, Lahti S. Impacts of oral disorders in the United Kingdom and Australia. Br Dent J 2005;198:489-93.
19. Soe KK, Gelbier S, Robinson PG. Reliability and validity of two oral health related quality of life measures in Myanmar adolescents. Community Dent Health 2004;21:306-11.
20. Kida IA, Astrom AN, Strand GV, Masalu JR, Tsakos G. Psychometric properties and the prevalence, intensity and causes of oral impacts on daily performance (OIDP) in a population of older Tanzanians. Health Qual Life Outcomes 2006;4:56.
21. Gherunpong S, Tsakos G, Sheiham A. The prevalence and severity of oral impacts on daily performances in Thai primary school children. Health Qual Life Outcomes 2004;2:57.
22. Locker D. Issues in measuring change in self-perceived oral health status. Community Dent Oral Epidemiol 1998;26:41-7.
23. Slade GD. Assessing change in quality of life using the Oral Health Impact Profile. Community Dent Oral Epidemiol 1998;26:52-61.
24. Jaeschke R, Singer J, Guyatt GH. Measurement of health status. Ascertaining the minimal clinically important difference. Control Clin Trials 1989;10:407-15.
25. Osoba D, King M. Meaningful differences. In: Fayers P, Hays RD, editors. Assessing quality of life in clinical trials, 2nd edn. Oxford: Oxford University Press; 2005; 243-57.
26. Copay AG, Subach BR, Glassman SD, Polly DW Jr, Schuler TC. Understanding the minimum clinically important difference: a review of concepts and methods. Spine J 2007;7:541-6.
27. Guyatt GH, Osoba D, Wu AW, Wyrwich KW, Norman GR. Methods to explain the clinical significance of health status measures. Mayo Clin Proc 2002;77:371-83.
28. Revicki DA, Cella D, Hays RD, Sloan JA, Lenderking WR, Aaronson NK. Responsiveness and minimal important differences for patient reported outcomes. Health Qual Life Outcomes 2006;4:70.
29. Revicki DA, Hays RD, Cella D, Sloan J. Recommended methods for determining responsiveness and minimally important differences for patient-reported outcomes. J Clin Epidemiol 2008;61:102-9.
30. Wyrwich KW, Tierney WM, Wolinsky FD. Further evidence supporting an SEM-based criterion for identifying meaningful intra-individual changes in health-related quality of life. J Clin Epidemiol 1999;52:861-73.
31. Wyrwich KW, Nienaber NA, Tierney WM, Wolinsky FD. Linking clinical relevance and statistical significance in evaluating intra-individual changes in health-related quality of life. Med Care 1999;37:469-78.
32. Cohen J. Statistical power analysis for the behavioral sciences, 2nd edn. Hillsdale, NJ: Lawrence Erlbaum Associates; 1988.
33. Tsakos G, Bernabé E, D'Aiuto F, Pikhart H, Tonetti M, Sheiham A et al. Assessing the minimally important difference in the Oral Impact on Daily Performances index in patients treated for periodontitis. J Clin Periodontol 2010;37:903-9.
34. Norman GR, Sloan JA, Wyrwich KW. Interpretation of changes in health-related quality of life: the remarkable universality of half a standard deviation. Med Care 2003;41:582-92.
35. Guyatt GH, Norman GR, Juniper EF, Griffith LE. A critical look at transition ratings. J Clin Epidemiol 2002;55:900-8.
36. Wyrwich KW, Bullinger M, Aaronson N, Hays RD, Patrick DL, Symonds T. Estimating clinically significant differences in quality of life outcomes. Qual Life Res 2005;14:285-95.
37. Juniper EF, Guyatt GH, Willan A, Griffith LE. Determining a minimal important change in a disease-specific Quality of Life Questionnaire. J Clin Epidemiol 1994;47:81-7.
38. Allen PF, O'Sullivan M, Locker D. Determining the minimally important difference for the Oral Health Impact Profile-20. Eur J Oral Sci 2009;117:129-34.
39. Locker D, Jokovic A, Clarke M. Assessing the responsiveness of measures of oral health-related quality of life. Community Dent Oral Epidemiol 2004;32:10-8.
40. Malden PE, Thomson WM, Jokovic A, Locker D. Changes in parent-assessed oral health-related quality of life among young children following dental treatment under general anaesthetic. Community Dent Oral Epidemiol 2008;36:108-17.
41. John MT, Reissmann DR, Szentpetery A, Steele J. An approach to define clinical significance in prosthodontics. J Prosthodont 2009;18:455-60.