The Ostomy-Q: Development and Psychometric Validation of an Instrument to Evaluate Outcomes Associated with Ostomy Appliances

Ostomy Wound Management 2017;63(1):12–22
Beenish Nafees, MSc; Mikkel Rasmussen, MSc; and Andrew Lloyd, DPhil


Using an ostomy appliance can affect many aspects of a person’s health-related quality of life (HRQL). A 2-part, descriptive study was designed to develop and validate an instrument to assess quality-of-life outcomes related to ostomy appliance use. Study inclusion/exclusion criteria stipulated participants should be 18 to 85 years of age, have an ileostomy or colostomy, have used an appliance for a minimum of 3 months without assistance, and be able to complete an online survey. All participants provided sociodemographic and clinical information.

In phase 1, a literature search was conducted and existing instruments used to measure HRQL in persons with an ostomy were assessed. Subsequently, the Ostomy-Q, a 23-item, Likert-response type questionnaire, divided into 4 domains (Discreetness, Comfort, Confidence, and Social Life), was developed based on published evidence and existing ostomy-related HRQL tools. Seven (7) participants recruited from a manufacturer user panel took part in exploratory/cognitive qualitative interviews to refine the new quality-of-life questionnaire. In phase 2, the instrument was tested to assess item variability and conceptual structure, item-total correlation, internal consistency, test-retest reliability, sensitivity, and minimal important difference (MID) in an online validation study among 200 participants from the manufacturer’s user panel (equally divided by gender, 125 [62.5%] >50 years old, 128 [64%] with an ileostomy). This exercise also included completion of the Stoma Quality of Life Questionnaire and 2 domains from the Ostomy Adjustment Inventory-23 to assess convergent validity. Eighty-two (82) participants recompleted these study instruments 2 weeks later to assess test-retest reliability. Sociodemographic and clinical data were assessed using descriptive statistics; Cronbach’s alpha was used for internal consistency (minimum 0.70), principal component analysis for item variability/conceptual structure, and item-total correlation; intraclass correlation coefficient was used for test-retest reliability; and standard error of measurement was applied to MID. All domains demonstrated good internal consistency (between 0.69 and 0.78). All scales showed stability, with a minimum intraclass correlation coefficient of 0.743 (P <.001). The Ostomy-Q showed good convergent validity with other instruments to which it was compared (P <.01).
In this study, the Ostomy-Q was found to be a reliable and valid outcome measure that can enhance understanding of the impact of ostomy appliances on users. Some items for social relationships and discreetness may need more exploring in the future with other patient groups. 


Clinical evidence, including observational studies and clinical trials,1,2 has found individuals with an ostomy face many physical and emotional challenges. Ostomy surgery to address medical conditions such as colon/rectal cancer, Crohn’s disease, and trauma3 results in the use of an appliance, often for the rest of the patient’s life, presenting a significant challenge for many people.4  

New innovations in ostomy care have user advantages. Per a review of literature,5 most ostomy pouching systems are designed to be lightweight, odor-proof, and relatively low maintenance in order to provide an acceptable wear time and prevent skin irritation.6 

Pittman et al4 explored how ostomy complications such as skin irritation, leakage, and difficulty adjusting to an ostomy affect a person’s quality of life (QoL); in their cross-sectional study, United States’ veterans (N = 239) completed 2 versions of the City of Hope Quality of Life: Ostomy Instrument, a patient-administered questionnaire designed to assess QoL. The research demonstrated overall QoL was predicted by the severity of skin irritation, leakage, and difficulty adjusting to the appliance.  

Developments in the technology of ostomy care and appliances imply people may feel differently about dealing with an ostomy than in years past. Consequently, older instruments such as the Ostomy Adjustment Scale7 are less valid because some of the issues they address are less relevant.

The current study was conducted to 1) review existing user-completed instruments or questionnaires and literature focused on the impact of ostomy devices on users’ QoL and utilize this information to develop a new instrument to assess aspects of QoL among ostomates; and 2) test the new instrument using qualitative interviews with people who have ostomies in the United Kingdom and then, in a larger psychometric validation study, evaluate test-retest reliability, validity, and responsiveness in the UK.  

Methods and Procedures

Study design. The study was conducted in 2 phases. Phase I involved developing a new measurement tool and evaluating its content validity; phase II involved psychometric validation of the Ostomy-Q. A schematic overview of the study design is shown in Figure 1.

Participant sample. Recruitment of all participants for both phases took place in the UK over 6 months. Participants were recruited from the manufacturer end-user panel, an online forum maintained by Coloplast A/S, Denmark (“the manufacturer”) for contact with users (N = 488). Potential participants were screened online to ensure they met the eligibility criteria, which stipulated they should be a resident of the UK, be 18 to 85 years old, have an ileostomy or colostomy, have used ostomy appliances for a minimum of 3 months, have handled ostomy appliances themselves (ie, without help from others), and be able to complete the online survey or interview. Participants were excluded if they had an acute illness or cognitive impairment that in the opinion of the investigator would interfere with the study requirements. All participants gave online informed consent before participating.

Protocol. A protocol was developed for both phases of the study that was granted ethical approval by the Salus Institutional Review Board (United States).  

Phase I.

Literature review. A targeted literature review was undertaken in order to understand the evidence regarding the impact of ostomies and appliances on people’s health-related quality of life (HRQL). EMBASE and PubMed were searched using the terms ostomy, colostomy, ileostomy, ostomy and quality of life, and health-related quality of life. This search identified 12 relevant studies that detailed physical (eg, skin irritation1), emotional, and relationship issues. 

Dabirian et al1 conducted a qualitative study in which 14 patients with ostomies were interviewed about their QoL; 9 themes emerged, including physical problems (such as rash, lack of sleep), psychological problems (low mood), relationships (family life), costs, nutrition, physical activity, travel, sexual relationships, and religion. Using a questionnaire, Nugent et al2 had participants (N = 391) assess postoperative care, QoL issues, and equipment issues; major problems noted included rashes (51%), leakage (36%), and ballooning, and the latter 2 were found to cause embarrassment, distress, and sleep disturbance. The fear of unpleasant gases and general use of ostomy appliances can have a negative effect on social relationships and inhibit participation in leisure activities.1,2 Additional research3,8,9 including observational studies has highlighted concerns regarding leakage, ballooning, and inability to conceal the pouch.

Additional psychological problems reported by people with an ostomy include reduced confidence, anxiety, depression, and stigma often related to risk of appliance leakage.3,10-12 To explore practical methods that can be taken to lessen a patient’s fear of embarrassment and ridicule after the surgery, Noone10 described the case of a woman who underwent stoma surgery. Danielsen et al12 conducted focus groups with 15 people with permanent ostomies to understand the effect of ostomies on daily living; participants reported they wanted control and more education from health care professionals regarding their new lives with an ostomy. Participants also mentioned isolating themselves to avoid disclosing their stoma to people.

Existing assessment instruments. The literature also was reviewed to examine existing scales, and several scales were identified that were designed to assess the effect of ostomy care. These included the Stoma-QoL,13 Ostomy Adjustment Scale,7 City of Hope Quality of Life: Ostomy Instrument,14 and the Ostomy Adjustment Inventory-23 (OAI-23).15 Most of these instruments evaluated different aspects of the impact of having an ostomy, such as quality of life and physical activity. The Stoma-QoL was of key interest because it was designed to specifically assess HRQL associated with using an ostomy appliance. Therefore, it was reviewed and used as a base for the current research.

The Stoma-QoL was developed to assess QoL for individuals with a colostomy or ileostomy based upon detailed qualitative research (N = 169). This 20-item, unidimensional HRQL instrument is based on Maslow’s Hierarchy of Needs theory,16 a Rasch-based model. The Stoma-QoL captures relevant QoL-related concepts for people with an ostomy, including social interaction, anxiety, and body image, in a single score. Although the Stoma-QoL is built upon a sound foundation of qualitative research, the current authors believe the use of the unidimensional structure derived from the Rasch model is overly restrictive in terms of its assumptions. Therefore, the authors sought to develop a multidimensional scale that relaxed the model assumptions. The content of the Stoma-QoL was used as a starting point for the development of a multidimensional scale to assess the effect of an ostomy appliance on some specific aspects of QoL.

Instrument development and testing. The literature review was used to guide the development of items for the Ostomy-Q. The items in the Stoma-QoL measure were reviewed by the study team (the study authors and 2 speciality nurses from the manufacturer panel). The review was done by assessing items against the team members’ clinical experience and knowledge to see if anything relevant was missing. New items were drafted based on the findings from the literature review and the reviewers regarding what had been reported as important. Five (5) ostomy appliance users from the manufacturer user panel reviewed the first draft of the survey to determine whether the proposed content included important issues. The interviews also were designed to gather general feedback from participants regarding how well they understood each item.  

Content validity of the draft Ostomy-Q was evaluated by conducting cognitive debriefing/exploratory telephone interviews with the same 5 individuals with an ileostomy or colostomy. The participants were asked to evaluate item comprehension and interpretation, completeness of item coverage, relevance, clarity, and readability of Ostomy-Q. The instruction recall period of 1 week and respondent burden (in terms of length and complexity of the items) also were assessed. The interviews involved a think-aloud and retrospective approach that enabled participants to speak freely about how well they understood an item and how it could be improved. The interviewer then recorded the answers. The interview provided flexibility to allow the interviewer to adapt the questions to suit each individual participant; this enabled the interviewer to understand the responses specific to each person.17

The interviews were analyzed using an interview grid developed for the study that included each person’s response to each question to evaluate the information gathered; minor revisions to the new instrument were made based on the interviews. Two (2) additional interviews were conducted with new users from the panel to assess user understanding and interpretability of the revised instrument during which participants were asked to explain what they interpreted to be the meaning of each item and its response. The tested version of the Ostomy-Q measure included a total of 23 items. 

Participants completed a sociodemographic form, clinical background form, and the questionnaires. Following completion of the interviews, participants were offered remuneration in the form of points that could be used in an online shop for ostomy users.  

The tested version of the Ostomy-Q included 23 items rated on a 5-point Likert scale ranging from 1 = strongly agree to 5 = strongly disagree. The items were divided into 4 domains (Discreetness, Comfort, Confidence, and Social life) identified from the literature review as important to ostomy users. These 4 issues or concepts reflected specific aspects of QoL, but they were not designed to describe all domains of QoL.

Phase II. In phase II, the Ostomy-Q underwent psychometric validation. A total of 200 participants with an ileostomy or colostomy participated in an online survey. In order to evaluate convergent validity, participants were asked to complete the Stoma-QoL and the anxious preoccupation and social engagement domains of the Ostomy Adjustment Inventory-23 (OAI-23), which measured the validity of the new instrument against existing instruments. A subset of participants (N = 82) was invited to complete the survey again approximately 2 weeks later to assess test-retest reliability. Furthermore, a series of tests were applied to assess the psychometric performance of the Ostomy-Q (see Figure 2).

Measures. Participants completed sociodemographic and clinical background forms that included demographic and clinical data such as age, date of diagnosis, and treatment (see Table 1). They also completed the new Ostomy-Q instrument (Version 1), including transition items (items relating to each of the new instrument domains discreetness, comfort, confidence, and social life) to assess test-retest reliability, the revised Ostomy-Q instrument, Stoma-QoL, and the 2 domains from OAI-23 (anxious preoccupation and social engagement). Participants took 45 minutes to 1 hour to complete the exercise. 


Analysis. All data were collected and stored electronically in locked files. Descriptive statistics were used to summarize demographic/clinical data. All instruments were scored and summarized for each participant. A series of different aspects of the psychometric performance of the instrument was explored. 

Item variability and conceptual structure. The total number of responses and the percentage of the total responses for each item were calculated. Any floor and ceiling effects for each item were evaluated and defined to occur when 50% of the responses were in the lowest or highest response category for any item.18,19 If floor and ceiling effects occur, it is difficult to measure the influence of the appliance on the given item.  
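The floor and ceiling check described above can be sketched in a few lines of Python. This is only an illustration: the function name `floor_ceiling`, the treatment of missing answers as `None`, and the reading of "50% of the responses" as "at least 50%" are assumptions, not details reported by the study.

```python
from collections import Counter


def floor_ceiling(responses, lowest=1, highest=5, threshold=0.50):
    """Return (floor_share, ceiling_share, flagged) for a single item.

    responses: iterable of Likert answers (1-5); None marks a missing answer.
    threshold: share of answers in an extreme category that flags the item
               (assumed here to mean "at least 50%").
    """
    answered = [r for r in responses if r is not None]
    if not answered:
        raise ValueError("item has no valid responses")
    counts = Counter(answered)
    floor_share = counts[lowest] / len(answered)
    ceiling_share = counts[highest] / len(answered)
    flagged = floor_share >= threshold or ceiling_share >= threshold
    return floor_share, ceiling_share, flagged


# Hypothetical item in which half of the respondents chose the top category.
print(floor_ceiling([5, 5, 5, 1, 2, 3]))
```

An item flagged this way leaves little room to register improvement or deterioration, which is why such items are hard to use for detecting an appliance effect.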

A principal component analysis (PCA) was undertaken to explore the extent to which the selected items naturally grouped into the 4 domains (ie, the extent to which the items load or share variance with the hypothesized domain). An exploratory approach was used because the conceptual framework was only hypothesized at this stage.  This analysis included an oblique rotation, which allows the emergent domains to naturally correlate with each other.  

Internal consistency. Internal consistency helps determine the homogeneity of the items within each of the conceptual domains. Cronbach’s alpha20 was used to assess the internal consistency reliability of each conceptual domain, and minimum alpha values of 0.70 (acceptable consistency21) or 0.80 (good consistency22) were provided as guidelines to determine a domain or total score as internally consistent.  
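For readers unfamiliar with the statistic, Cronbach’s alpha for a domain with k items is k/(k − 1) × (1 − sum of item variances / variance of the summed score). The sketch below computes it with the Python standard library; the function name `cronbach_alpha` and the example data are hypothetical, and the study’s own software and handling of missing data are not reported.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for one domain.

    rows: list of respondents, each a list of k item scores (no missing values).
    """
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least 2 items and 2 respondents")
    # Sample variance of each item across respondents.
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    # Sample variance of the summed domain score.
    total_var = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical 2-item domain answered by 3 respondents.
alpha = cronbach_alpha([[1, 2], [2, 1], [3, 3]])
print(round(alpha, 3), alpha >= 0.70)
```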

Item-total correlation. The item-total correlation (ie, the correlation between an individual item and the overall domain score) was used to assess the homogeneity of the Ostomy-Q.  Usually, items should have a significant correlation ≥0.2023 (although a higher requirement of 0.30 has been proposed24); this was used to interpret the performance of the Ostomy-Q.  Factor analysis and PCA were conducted in order to understand the relationship between the items and overall structure of the instrument. PCA looks at a set of observations from a large dataset and converts them into a smaller set of values (principal components). The proposed items of each domain should cluster together to show consistency with the domain.  
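As an illustration of the item-total calculation, the sketch below correlates each item with its domain score using Pearson’s r. The function names are hypothetical. The optional `corrected` flag, which removes the item from the total before correlating, is a common variant included here as an assumption; the text does not state whether the study used it.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences.

    Raises ZeroDivisionError if either sequence has zero variance.
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def item_total_correlations(rows, corrected=False):
    """Correlate each item with the domain score.

    rows: list of respondents, each a list of item scores for one domain.
    corrected: if True, exclude the item itself from the total (assumption).
    """
    k = len(rows[0])
    result = []
    for i in range(k):
        item = [row[i] for row in rows]
        total = [sum(row) - (row[i] if corrected else 0) for row in rows]
        result.append(pearson(item, total))
    return result


# Hypothetical 3-item domain; flag items below the 0.20 guideline.
data = [[1, 2, 1], [2, 3, 2], [3, 4, 3], [4, 5, 5]]
print([r >= 0.20 for r in item_total_correlations(data)])
```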

Test-retest reliability. The intraclass correlation coefficient (ICC) was used to test for test-retest reliability25 and was assessed by correlating domain scores between baseline and 2 weeks after baseline in participants reporting no change in their use of ostomy appliances and those reporting no change on the relevant global change items (N = 82). 

The ICC ranges from 0 to +1.0 and can be interpreted as the proportion of within-user variability. Although no wide agreement exists regarding benchmarks to help interpret the ICC, scale-level ICCs of ≥0.80 have been proposed.22 For the purposes of this study, the following thresholds were used: scale-level ICC <0.6 = poor test-retest reliability; 0.6 to 0.69 = moderate; 0.7 to 0.79 = good; 0.8 to 1.0 = very good. Pearson’s correlation coefficients and t-tests also were conducted to assess the stability of the measure over time.  
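The study’s interpretation bands for scale-level ICCs can be written as a small lookup. The function name `classify_icc` is hypothetical, and treating the bands as half-open intervals (so that a value such as 0.695 counts as moderate) is an assumption, because the text leaves values between the stated bands undefined.

```python
def classify_icc(icc):
    """Map a scale-level ICC to the qualitative bands used in this study."""
    if not 0.0 <= icc <= 1.0:
        raise ValueError("ICC is expected to lie between 0 and 1")
    if icc < 0.6:
        return "poor"
    if icc < 0.7:
        return "moderate"
    if icc < 0.8:
        return "good"
    return "very good"


# The two extreme ICC values reported for the Ostomy-Q domains.
print(classify_icc(0.743), "|", classify_icc(0.830))
```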

Convergent validity. Convergent validity refers to the extent a measure relates to other measures or variables based on theoretical content or the expected relationship with the variable chosen. Convergent validity of the Ostomy-Q was evaluated using 2 validated instruments: the original Stoma-QoL and the anxious preoccupation and social engagement domains of the OAI-23. Because the current measure was somewhat based upon the Stoma-QoL, these 2 measures were compared for consistency. Associations between these measures were explored using parametric and/or nonparametric correlations as appropriate. Convergent validity was considered supported if correlation coefficients between related scales were >0.40.  

To explore the responsiveness of the Ostomy-Q over time, data from a clinical trial26 involving 129 people with an ostomy were used. This trial employed a cross-over design and explored the performance of a newly developed ostomy appliance compared with the participant’s current ostomy appliance. To explore these issues, the data from both trial arms from the clinical trial and current study were merged. Responsiveness was assessed in terms of sensitivity and minimal important difference (MID).  

Sensitivity and MID. Sensitivity assesses the extent to which the measure’s subscale scores reflect changes in users’ experience of the underlying constructs. Sensitivity was estimated in terms of effect size and standardized response mean (SRM). Mean scores for people using their own ostomy appliance (period 1) were compared to mean scores where they tried the test product (period 2) for 4 weeks. The results from the sensitivity analysis also are interpreted in qualitative terms (eg, moderate or high) using established criteria.27  
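The two sensitivity indices are commonly defined as follows: the effect size divides the mean change by the standard deviation of the baseline scores, and the SRM divides the mean change by the standard deviation of the change scores. The sketch below assumes these conventional definitions and paired scores per participant; the function name and the example data are hypothetical, and the study’s exact computation is not reported.

```python
from statistics import mean, stdev


def effect_size_and_srm(period1, period2):
    """Return (effect_size, srm) for paired scores.

    period1: scores with the participant's own appliance (baseline).
    period2: scores for the same participants with the test product.
    """
    if len(period1) != len(period2):
        raise ValueError("scores must be paired")
    changes = [after - before for before, after in zip(period1, period2)]
    mean_change = mean(changes)
    effect_size = mean_change / stdev(period1)  # change relative to baseline SD
    srm = mean_change / stdev(changes)          # change relative to SD of change
    return effect_size, srm


# Hypothetical domain scores for 4 participants.
es, srm = effect_size_and_srm([10, 12, 14, 16], [12, 13, 17, 18])
print(round(es, 2), round(srm, 2))
```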

An instrument’s MID can be analyzed in different ways. Statistical methods based on variance and dispersion have been shown to provide useful indicators of MID in the literature, including observational studies and clinical trials.28 An anchor-based approach also can be used to determine the minimal change over time, but this relies on having access to a suitable anchor or marker of change. In a previous trial conducted by the manufacturer (trial CP232), no explicit anchor was included. Some analyses have been explored in which a 25% and 50% reduction in leakage onto clothes is used as a benchmark. In the second phase of the trial, the current study investigated whether the Ostomy-Q is sensitive to the presence or absence of any output leaking onto clothes.  

Distribution-based methods offer an alternative and rely on expressing an effect in terms of the underlying distribution of the results.27 The standard error of measurement (SEM) and half a standard deviation are both widely used and accepted methods for estimating MID.29 In the current study, both estimates were considered in order to settle on a single value by averaging the 2 estimates.  
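A minimal sketch of the two distribution-based estimates follows, assuming the conventional formula SEM = SD × √(1 − reliability). Which reliability coefficient the study used (Cronbach’s alpha or the ICC) is not stated, so `reliability` is left as a parameter; the function name and the example values are hypothetical.

```python
from math import sqrt
from statistics import stdev


def mid_estimates(scores, reliability):
    """Distribution-based MID estimates for one scale.

    scores: baseline scores for the scale.
    reliability: a reliability coefficient between 0 and 1 (assumed input).
    """
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie between 0 and 1")
    sd = stdev(scores)
    sem = sd * sqrt(1 - reliability)  # standard error of measurement
    half_sd = 0.5 * sd                # half a standard deviation
    return {"sem": sem, "half_sd": half_sd, "average": (sem + half_sd) / 2}


# With a reliability of 0.75 the two estimates coincide, because sqrt(0.25) = 0.5.
print(mid_estimates([10, 12, 14, 16], reliability=0.75))
```

The lower the reliability, the larger the SEM becomes relative to half a standard deviation, so the averaged value moves with the precision of the scale.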

Three (3) different anchors were tested in the study: 2 within-group MIDs were estimated in terms of a 25% and a 50% reduction in leaks onto clothes. In addition, a between-group estimate of MID was defined in terms of the presence or absence of any leak onto clothes.


Psychometric validation study.

Sample. The sociodemographic profile of all 200 participants is presented in Table 1. The majority of the sample was >50 years of age and reported a wide range of household income and employment status. The majority of participants had an ileostomy (64%) and had used an ostomy appliance for a mean duration of 8.74 (range 0–48) years.  

Item variability. Participants’ scores in each domain and total score were analyzed (data not shown), and some individual items of the Ostomy-Q (items 16, 21–23, which described how the appliance may affect intimate relationships) showed evidence of ceiling effects (see Figure 2). None of the items produced high rates of “not applicable” responses.  

The PCA showed good evidence of most items in the hypothesized Confidence and Discreetness domains loading on the same factor. The Comfort dimension was primarily represented by factor 6; however, some items loaded on other factors that might reflect different elements of this concept. Confidence items loaded consistently on factor 1. Discreetness was best represented by factor 2. Social and relationship difficulties loaded primarily onto factors 3 and 4. Factor 3 may be more about the impact on relations with a partner, and factor 4 may reflect a more general impact. Table 2 shows the hypothesized domain structure followed by the results from the PCA performance of items within each domain.

Some items identified did not appear to function well (ie, they were not good items to assess the relevant meaning). These included an item that loaded on more than 1 factor (item 12) and an item that did not load on any factor (item 17) (see Figure 2). Some items did not load on the same factors as other items in that hypothesized domain (items 13 and 10). Items 20 and 23 from the Social/relationships domain also did not perform well. Items 3 and 6 from the Discreetness domain also did not load well and did not correlate with other items from the same domain.

Internal consistency reliability. The results of the internal consistency reliability for each domain of the total sample are presented in Table 3. A minimum Cronbach’s alpha value of 0.70 was used to define a priori whether a scale or score could be considered internally consistent. The results showed all domains were near or above this criterion, with coefficients ranging from 0.69 (Discreetness and Comfort) to 0.78 (Social life and relationships). The Ostomy-Q total score had a Cronbach’s alpha value of 0.89.

Interdomain correlations. Table 4 presents the Pearson correlation analyses of the association between domain and total scores on the Ostomy-Q. No a priori predictions regarding the nature of these relationships were made. The domain scores were shown to be moderately to highly correlated to one another, ranging from 0.50 to 0.85.

Test-retest reliability. The majority of the sample (N = 82) reported no change on the global concept item (see Table 5); these participants were included for the analysis of test-retest reliability. All scales were considered relatively stable, with ICCs ranging from 0.743 (P <.001) for the Social domain to 0.830 (P <.01) for the Confidence domain.


Convergent validity. Subscale scores of the Ostomy-Q were compared with the original Stoma-QoL and the anxious preoccupation and social engagement domains of the OAI-23 (see Table 6). All domains of the Ostomy-Q had a positive association with the 2 domains of OAI-23 and the total score of the Ostomy-Q (P <.01) in the sample. All domain and total scores on the Ostomy-Q exceeded the correlation benchmark of 0.40 against the OAI-23 (and subscales). The total score of the Ostomy-Q and the OAI-23 had the highest correlation >0.75 (P <.01), and all of the domains of the Ostomy-Q had significant correlations with the original Stoma-QoL and exceeded the criterion (P <.01).

Sensitivity and MID estimates. The effect sizes varied from 0.35 to 0.70, suggesting moderate to high effect sizes for each domain as determined using the criteria established by Luiz and Almeida.26 The estimates of standardized response means were in alignment (data not shown). 

Table 7 shows the MID estimates using the different methods. Different methods gave different results, but the range of scores here can be used to indicate the range of possible values of MID. These data could be used to inform a sensitivity analysis.


This study presents the development and psychometric validation of the Ostomy-Q, designed to measure aspects of QoL experienced by ostomy appliance users. The Ostomy-Q was developed based on a number of sources, including a literature review and interviews with users. The scale then underwent a psychometric assessment in people with an ostomy, followed by further testing using clinical trial data. The Ostomy-Q could be a useful resource for clinicians concerned with the effect of ostomy appliances on aspects of users’ QoL and also as a potential endpoint in clinical trials. A valid and reliable scale allows clinicians to assess the effect of using an appliance and the needs of a user.

The Ostomy-Q demonstrated evidence of internal consistency and test-retest reliability in the sample. Internal consistency was found to be modest (>0.69) for all domain scores and total scores. Internal consistency increased markedly when original items 5 and 6 (“The stoma appliance was discreet” and “The stoma appliance did not look like a medical appliance”) were deleted, which suggests these items are not being interpreted consistently with other items and may need re-examining. The factor analysis also identified some items that did not perform well or appeared to measure more than 1 concept at the same time; however, they were not removed from the questionnaire. These items may measure useful concepts but for different reasons are not interpreted consistently by the participants, potentially adding to measurement error.

Evidence of convergent validity was noted between the Ostomy-Q and other instruments. Data from 2 domains of the OAI-23 (social engagement and anxious preoccupation) were compared against the Ostomy-Q data. The subscale and total scores showed moderate correlations with the domains of the Ostomy-Q, indicating the newer tool is measuring aspects well. The social engagement domain correlated highly with the Social relationships domain of the Ostomy-Q, confirming an anticipated relationship between the domains. The highest correlations were with the total scores of all instruments.  

The new ostomy questionnaire had a higher correlation with the Stoma-QoL total score (r = 0.598–r = 0.800). This is to be expected, given the conceptual overlap between the measures; therefore, it is important to note the Ostomy-Q performs similarly to the Stoma-QoL.  

The analyses of the trial data provided good evidence to show the responsiveness of the Ostomy-Q. The sensitivity analyses reported moderate to high effect sizes for each domain and total score. Multiple estimates of MID were produced, and a good degree of convergence was noted between these estimates.  

The psychometric validation has shown the new instrument could be used to assess aspects of QoL related to ostomy appliances. The measure showed good internal consistency, but some items may need further evaluation (items 3, 6, 10, 12, 13, 17, 20 and 23; see Figure 2). In addition, the psychometric analyses support the validity of the instrument.  


This study had some limitations. The sample was recruited from the end-user panel of a specific manufacturer (ie, indicating most users were using this brand of products only). The validation findings may have been more representative if a broader group of participants was included. It also would be useful to extend the validation and debriefing work to other countries. Some of the psychometric analysis (especially that related to the PCA and the internal consistency analysis) suggests some of the items did not perform as well as others. If the study team had decided to remove items, it is possible some of the psychometric performance of the instrument would have improved. However, it is also worth considering that the coverage of the measure in terms of the items included would have been more restricted.  


The Ostomy-Q is a new tool for measuring outcomes in ostomy users. The items and domain structure are designed to measure some specific aspects of QoL the literature suggested were important to users. The psychometric analysis highlighted some limitations in the measure of which users should be aware. Some items for social relationships and discreetness may need more exploring in the future. These measures should be evaluated in observational studies and clinical trials to demonstrate their applicability in varying settings.


This study was supported by research funding from Coloplast A/S, Denmark to ICON plc, UK. However, no restrictions were placed on the design of the study, the choice of included data sources, or the presentation of results. The authors specifically thank Martin Nottmeier who contributed to data interpretation and critical scientific review of the manuscript. 



Potential Conflicts of Interest: This study was supported by research funding from Coloplast A/S, Denmark to ICON plc, UK, which placed no restrictions on the study design, the choice of included data sources, or the presentation of results. Ms. Nafees is a consultant for ICON plc, UK. 


Ms. Nafees is a Health Outcomes Consultant, ICON plc, UK. Mr. Rasmussen is a Health Economist, Coloplast A/S, Denmark. Dr. Lloyd is a Health Outcomes Consultant, ICON plc, UK.