The cost-effectiveness of screening in the community to reduce osteoporotic fractures in older women in the UK: economic evaluation of the SCOOP study

The SCOOP study was a two-arm randomized controlled trial conducted in the UK in 12,483 eligible women aged 70 to 85 years. It compared a screening program using the FRAX® risk assessment tool in addition to bone mineral density (BMD) measures versus usual management. The SCOOP study found a reduction in the incidence of hip fractures in the screening arm, but there was no evidence of a reduction in the incidence of all osteoporosis-related fractures. To make decisions about whether to implement any screening program, we should also consider whether the program is likely to be a good use of health care resources, ie, is it cost-effective? The cost per gained quality adjusted life year of screening for fracture risk has not previously been demonstrated in an economic evaluation alongside a clinical trial. We conducted a “within trial” economic analysis alongside the SCOOP study from the perspective of a national health payer, the UK National Health Service (NHS). The main outcome measure in the economic analysis was the cost per quality adjusted life year (QALY) gained over a 5-year time period. We also estimated cost per osteoporosis-related fracture prevented and the cost per hip fracture prevented. The screening arm had an average incremental QALY gain of 0.0237 (95% confidence interval –0.0034 to 0.0508) for the 5-year follow-up. The incremental cost per QALY gained was £2772 compared with the control arm. Cost-effectiveness acceptability curves indicated a 93% probability of the intervention being cost-effective at values of a QALY greater than £20,000. The intervention arm prevented fractures at a cost of £4478 and £7694 per fracture for osteoporosis-related and hip fractures, respectively. The current study demonstrates that a systematic, community-based screening program of fracture risk in older women in the UK represents a highly cost-effective intervention. © 2018 The Authors. Journal of Bone and Mineral Research Published by Wiley Periodicals, Inc.


Introduction
There are approximately 9 million osteoporotic or fragility (low-trauma) fractures worldwide per year. (1) In developed nations, around 1 in 3 women and 1 in 5 men aged 50 years or older will suffer a fragility fracture during their remaining lifetime, most commonly at sites such as the hip, distal forearm, vertebrae, and humerus. In the UK, around 536,000 people suffer fragility fractures each year, including 79,000 hip fractures, with a cost in 2010 estimated at £3.5 billion expected to rise to £5.5 billion per year by 2025. (2) For the individual, a hip fracture can be devastating with loss of independence, and less than one-third of patients make a full recovery; mortality at 1 year postfracture is approximately 20%. (3) Advances in osteoporosis management over the last two decades have included the development of bone-strengthening treatments and also fracture risk assessment tools, such as FRAX, improving the ability to target treatment to those most likely to fracture. These elements provide the potential for a community-based screening program to reduce fracture rates. The aim of the SCOOP (Screening for Prevention of Fractures in Older Women) trial was to assess the effectiveness and cost-effectiveness of a FRAX-based screening program for older UK women. The preceding section is adapted from the SCOOP main trial paper by Shepstone and colleagues. (4) The screening program in the SCOOP study used a baseline questionnaire to assess 10-year risk using the FRAX risk algorithm. (5) Individuals judged to have sufficiently high risk were invited to undergo dual-energy X-ray absorptiometry (DXA)-based bone mineral density (BMD) measurement, and this information was used to recalculate the 10-year hip fracture probability. This information was communicated to the participant and their family doctor.
We have recently reported the effectiveness results, (4) which concluded that there was a potential to reduce hip fracture rates substantially over 5 years (hazard ratio [HR] = 0.72, p = 0.002), though not fractures at other sites (HR = 0.94, p = 0.178).
There is extensive literature evaluating the cost-effectiveness of interventions for osteoporosis. However, this has almost exclusively used economic modeling. The breadth of this literature is illustrated by two recent systematic reviews of models to estimate the cost-effectiveness of preventing osteoporotic fractures. Si and colleagues examined the evolution of health economic models aimed at strategies for preventing osteoporotic fractures and identified 104 studies relating to 74 different models, published between 1980 and 2013. (6) They found that models have evolved in terms of complexity and emphasis. Hiligsmann and colleagues looked at cost-effectiveness analyses of drugs for postmenopausal osteoporosis and identified 39 studies, published between 2008 and 2013. (7) These authors concluded that active osteoporotic drugs were generally cost-effective in postmenopausal women over age 60 years, particularly if they had other risk factors. Many of these studies have estimated outcomes in terms of cost per quality adjusted life year (QALY). However, we are aware of no published study that has estimated cost per QALY using an economic evaluation conducted alongside a randomized controlled trial of screening to prevent fractures. The aim of the current study was to use resource-use and outcome data collected as part of the SCOOP study to estimate the cost-effectiveness of the SCOOP screening intervention over a 5-year time horizon.

Materials and Methods
The clinical trial
The SCOOP trial has been described elsewhere. (4,8) In brief, SCOOP is an evaluation of screening aimed at identifying older women at increased risk of fragility fractures. The study was conducted across seven UK geographical regions as a pragmatic, randomized controlled trial. A total of 12,483 women, aged 70 to 85 years, were consented into the trial by post via primary care. Women already on prescriptions for anti-osteoporosis medicines (apart from vitamin D or calcium) were excluded. Participants randomized to the screening arm had 10-year hip fracture probabilities computed from clinical risk factors using the FRAX tool. Those above an age-dependent threshold were invited to have a BMD assessment using DXA. Individuals subsequently above a second age-dependent threshold, with the inclusion of the BMD measure, were recommended for treatment via their general practitioner (GP). Participants in the control arm received standard management: This included referral for DXA scans and anti-osteoporosis treatments if deemed clinically appropriate by their GP. Data collection followed at 6 and 12 months post-randomization and then annually thereafter up to 5 years' follow-up. The primary outcome measure was the proportion of individuals sustaining at least one osteoporosis-related fracture, the assessment of which is summarized below. A number of secondary outcome measures were also collected in the trial, including hip fractures, all clinical fractures, mortality, health-related quality of life, and health care resource use data.

Measurement of outcomes
Data were requested on an annual basis from 2009 to 2014 from NHS Digital, formerly the Health and Social Care Information Centre (HSCIC). (9) This comprised admitted patient care (inpatient), outpatient, and accident and emergency (A&E) data sets. Data were interrogated to identify fractures in study participants from randomization to the end of follow-up. Primary care records were also screened for fractures based on their GP Read codes. Participants could also self-report fractures at each follow-up. In the case of self-report and A&E reported fractures, or for other sources where there was missing information on dates or anatomical site, further verification was sought. This included requests to primary care practices or searches of radiological records at local hospitals. Only verified fractures were included as outcomes. (4) The main outcome measure used in the economic evaluation was the quality adjusted life year (QALY). This was assessed using the 3-level EQ-5D (10) by means of a postal questionnaire and scored using the published tariff. (11) The EQ-5D was assessed at baseline, 6 months, 12 months, and then annually thereafter up to 5 years' follow-up. We estimated QALY by area under the curve at these time points assuming a linear relationship between each EQ-5D value. Any participant who died was assumed to have an EQ-5D score of 0. A secondary analysis was performed using cost per fracture avoided for both any probable osteoporotic-related fracture (defined as fractures of hip, wrist, and spine) and hip fractures only.
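The area-under-the-curve calculation described above can be illustrated with a minimal sketch. It assumes the assessment schedule stated in the text (baseline, 6 months, 12 months, then annually to 5 years) and linear interpolation between EQ-5D values. The function name, the example participants, and the choice to score a participant who has died as 0 at every scheduled assessment after death are illustrative assumptions, not details taken from the study.

```python
from typing import Sequence

# Assessment times in years: baseline, 6 months, 12 months, then annually to 5 years.
TIMES = [0.0, 0.5, 1.0, 2.0, 3.0, 4.0, 5.0]


def qaly_auc(times: Sequence[float], utilities: Sequence[float]) -> float:
    """Undiscounted QALYs as the trapezoidal area under the EQ-5D curve."""
    if len(times) != len(utilities):
        raise ValueError("times and utilities must have the same length")
    total = 0.0
    for i in range(1, len(times)):
        width = times[i] - times[i - 1]
        # Linear interpolation between adjacent EQ-5D values.
        total += width * (utilities[i - 1] + utilities[i]) / 2.0
    return total


# Hypothetical participant with a constant EQ-5D score of 0.8.
stable = qaly_auc(TIMES, [0.8] * len(TIMES))

# Hypothetical participant who dies between the year-3 and year-4 assessments
# (assumption: EQ-5D is set to 0 at each scheduled assessment after death).
died = qaly_auc(TIMES, [0.8, 0.8, 0.7, 0.7, 0.6, 0.0, 0.0])
```

Under these assumptions the first participant accrues 4.0 QALYs and the second about 2.43; discounting, which the study applied after the first year, is not included in this sketch.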

Cost of the screening intervention
Primary care practices identified eligible women from their lists who were then invited to participate. Those individuals who agreed constituted the SCOOP cohort and were randomized to the intervention or control arms. Those in the intervention arm had their fracture risk assessed, and participants and GPs were notified of fracture risk. The resources required to undertake the relevant processes (BMD measurement via DXA scans, calculation and clinical review of final fracture risk, written notification of initial and final fracture risk, and a GP consultation for identified high fracture risk individuals) were recorded as part of the SCOOP study. All costs were either costed using data collected as part of the study or were costed using appropriate unit cost data. (12,13) A breakdown of these costs is given in Table 1.
Costs associated with fracture-related health care contacts
Health Resource Group codes (HRGs) were not available from NHS Digital; therefore, inpatient, outpatient, and A&E data sets were each run through the HRG4+ grouper to derive the HRG codes. (14) Costs of resource use were drawn from HRGs linked to National Health Service (NHS) reference costs via data from HSCIC. (13) Inpatient stay costs were derived from HRG codes corresponding to each Finished Consultant Episode (FCE). Allowances were made for type of admission (elective or nonelective); length of stay; short stays; and excess bed days. A short stay was defined as less than 2 days in hospital. A long stay was costed in the same manner as elective admission costs, but to reflect non-elective NHS reference cost data. Outpatient attendances were costed according to speciality and type of attendance (for example, first or follow-up appointments). Procedure costs, where recorded, were included. The completeness of A&E data was much lower than inpatient and outpatient data, leading to the generation of missing HRG codes. A weighted average cost of £129 per A&E attendance was used in these cases. (13) Medication data were available for anti-osteoporosis medicines for the full period of follow-up for all study participants; these were costed using prices from the British National Formulary No. 66. (15) All costs are for the year 2013/14 in pounds sterling.

Analysis
A "within trial" economic analysis was undertaken on an "intention-to-treat" basis from the perspective of a national health payer, the United Kingdom (UK) NHS. The main cost-effectiveness analysis used QALYs as the outcome measure (cost-utility study). Two additional economic evaluations were performed using osteoporotic fractures and hip fractures as outcomes (cost-effectiveness studies). The study had a 5-year horizon, so discounting was used to allow for differential timing of costs and benefits. Discounting was not used for the first year of follow-up. In subsequent years, costs, QALYs, and fractures were discounted using a rate of 3.5%, as recommended by the National Institute for Health and Care Excellence (NICE).
(16) EQ-5D values were completed by postal questionnaires, with some telephone questionnaire completion for nonresponders. Response rates were high in the SCOOP study, but complete data required up to 7 EQ-5D returns over 5 years, thus increasing the potential for missing data. Missing data is a common problem in economic analysis and can lead to both bias and a lack of precision. (17) To estimate QALYs, we needed EQ-5D values for all points of follow-up when the participant was alive. Each EQ-5D questionnaire has five questions, all of which need to be completed to obtain an EQ-5D score. Where there was a single missing EQ-5D question but the participant had completed the other four questions, we imputed the missing question using a "hot-decking" approach. (18) Using this method, the four completed responses were compared with individuals with complete data who had the same pattern of responses to those four items. The missing value was replaced by the modal response for that item taken from those with complete data. Individuals with complete EQ-5D data, including those imputed using hot-decking, were defined as the complete case analysis (CCA) set. Where participants had missed more than one EQ-5D question, or where the questionnaire had not been returned, that EQ-5D score was deemed missing and a QALY could not be calculated. For these cases, multiple imputation (using five imputed data sets) was used. Discounted QALY scores were imputed using the following variables: baseline EQ-5D, age at randomization, days alive, time without osteoporotic fracture, and time without hip fracture. Imputation was conducted separately for each study group. (17) This analysis was conducted in SPSS version 23 (IBM Corp., Armonk, NY, USA).
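The single-item hot-decking step described above can be sketched as follows. This is an illustration under stated assumptions rather than the procedure run in SPSS: the function name and the donor data are hypothetical, responses are coded 1 to 3 for the five EQ-5D items, a record with no matching donor is left missing, and ties for the modal response are broken by whichever value appears first among the donors.

```python
from collections import Counter
from typing import List, Optional

Response = List[Optional[int]]  # five EQ-5D-3L items, each 1-3, None if missing


def hot_deck_single_item(record: Response, donors: List[List[int]]) -> Response:
    """Fill exactly one missing item with the modal response of matching donors."""
    missing = [i for i, value in enumerate(record) if value is None]
    if len(missing) != 1:
        # Complete, or more than one item missing: outside the scope of hot-decking.
        return list(record)
    m = missing[0]
    # Donors are complete cases with the same answers on the other four items.
    matches = [
        donor[m]
        for donor in donors
        if all(donor[i] == record[i] for i in range(5) if i != m)
    ]
    if not matches:
        return list(record)  # assumption: leave missing when no donor matches
    filled = list(record)
    filled[m] = Counter(matches).most_common(1)[0][0]
    return filled


# Hypothetical complete cases used as donors.
donors = [[1, 2, 1, 2, 1], [1, 2, 1, 2, 1], [1, 2, 1, 2, 2], [2, 2, 2, 2, 2]]

one_missing = hot_deck_single_item([1, 2, 1, 2, None], donors)
two_missing = hot_deck_single_item([1, None, None, 2, 1], donors)
```

In this toy example the first record is completed with the modal donor response, while the second is returned unchanged because it has two missing items and would fall to multiple imputation instead.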
Analysis of the costs and outcomes data was undertaken using seemingly unrelated regression, which allows for correlation between costs and outcomes and is generally considered robust for skewed data. (19) This was conducted using the sureg command in STATA (StataCorp, College Station, TX, USA). Both costs and effects used baseline EQ-5D, age, and study group as explanatory variables. Means and standard errors from imputed data were calculated using Rubin's rules. (20) For the cost per QALY analysis using imputed QALY data, cost-effectiveness acceptability curves (CEACs) were estimated using 1000 samples for each of the five imputed data sets, ie, 5000 samples in total. For analyses where data were not imputed, CEACs were estimated using bootstrapping with 2000 samples. CEACs show the probability that each of the study groups is the most cost-effective option at different valuations of the outcome variable. (21) Analyses were performed in SPSS 23 and STATA 11.
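The discounting rule stated in the analysis (no discounting in the first year, then 3.5% per year) can be written as a short sketch. The text does not give the exact formula, so the convention used here, a factor of 1/(1 + r)^(t - 1) for follow-up year t, is an assumption, as are the function names and the example values.

```python
from typing import Sequence


def discount_factor(year: int, rate: float = 0.035) -> float:
    """Discount factor for follow-up year 1, 2, ...; year 1 is undiscounted."""
    if year < 1:
        raise ValueError("year must be 1 or greater")
    return 1.0 / (1.0 + rate) ** (year - 1)


def discounted_total(yearly_values: Sequence[float], rate: float = 0.035) -> float:
    """Sum of yearly costs, QALYs, or fractures after discounting."""
    return sum(
        value * discount_factor(year, rate)
        for year, value in enumerate(yearly_values, start=1)
    )


# Hypothetical cost of 100 in each of the 5 years of follow-up.
total = discounted_total([100.0] * 5)
```

With these assumptions, a constant yearly value of 100 over 5 years has a discounted total of roughly 467 rather than 500.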

Sensitivity analysis
To evaluate the effect of only using cases where we were able to estimate QALY without multiple imputation, we estimated cost per QALY for the CCA set. We felt that this would likely provide a biased estimator because cases were unlikely to be missing at random. Because we had data on cost per fracture avoided for all

Role of the funding bodies
The funders of the study played no role in the study design, data collection, data analysis, data interpretation, or writing of this report. The corresponding author had full access to all data used in this study and had final responsibility for the decision to submit for publication.

Results
There were 12,483 participants in the SCOOP study, 6250 in the control and 6233 in the intervention group. Comparisons of baseline characteristics between study groups have been published elsewhere; the two study groups were found to be very similar. (4) The number of cases for whom we were able to estimate QALY was 6881 (55%); this comprised 3404 (54%) from the control group and 3477 (56%) from the intervention group. When the hot-decking (18) method was used to impute responses when a single EQ-5D question was missing, this rose to 7975 cases (64%); this comprised the CCA set. A comparison of the CCA set with cases missing one or more EQ-5D values indicated that missing cases had statistically significantly lower baseline EQ-5D, more incident fractures, and higher fracture-related health care costs (Table 2).
The average costs for the intervention were £104 per person (Table 1). The main components of this cost were case finding, DXA scans, and GP consultations for identified cases. The total average discounted costs of both intervention and fracture-related health care for the 5-year follow-up are given in Table 3; estimates are provided for both the full data set and the CCA analysis. For the whole sample, estimated costs are £968 and £900 for the intervention and control groups, respectively. The major component of costs was inpatient stay with other secondary care costs also being important. When costs are examined for the CCA set only, estimated total costs are lower and the difference between the intervention and control is higher at £104, reflecting a lower proportion of fractures in the CCA than the whole sample. Table 4 provides EQ-5D for all available time points for the CCA, as well as QALY estimates without adjustment for baseline EQ-5D. Also provided are baseline EQ-5D and unadjusted QALY for the imputed analysis. Estimates of the discounted QALY difference for the intervention group compared with the control are -0.005 and 0.008 for the CCA and imputed analysis, respectively; neither difference is statistically significant. However, in both cases, baseline EQ-5D values are lower in the intervention group, which would tend to bias QALY estimates in favor of the control group.
The results of the economic evaluations are shown in Table 5. These results were obtained using seemingly unrelated regression and adjust for differences in baseline age and EQ-5D. The estimate of incremental QALYs was 0.0237 per person (95% confidence interval [CI] -0.003 to 0.051). The confidence interval crosses 0; thus, the difference in QALY is not statistically significant at the 5% level. The estimate of the incremental cost-effectiveness ratio (ICER) was £2772. Also shown in Table 5 are the two analyses of cost per fracture prevented. The incremental estimate of fractures prevented was 0.0146 (95% CI 0.0002 to 0.029) and 0.0085 (95% CI 0.0026 to 0.0144) for osteoporotic-related and hip fractures, respectively. These results are for the 5 years of follow-up. The intervention group had an incremental cost per fracture prevented of £4478 per osteoporotic-related fracture and £7694 per hip fracture. The uncertainty surrounding these estimates is shown in cost-effectiveness acceptability curves (CEACs) provided in Fig. 1 for all three analyses. In terms of cost per QALY, there is a 93% probability the intervention is cost-effective at the NICE threshold of £20,000 per QALY. For fractures, the CEACs are slightly lower; for example, there would be an 87% probability the intervention would be considered cost-effective if preventing a hip fracture was valued at £20,000.
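The relationship between the reported ICERs and incremental effects, and the way a CEAC point is read from resampled estimates, can be illustrated with a short sketch. The ICER is the incremental cost divided by the incremental effect, so multiplying each reported ICER by its incremental effect gives an implied incremental cost; this back-calculation is ours and is not a figure quoted in the text. The CEAC function uses the standard net monetary benefit formulation, which is an assumption about how the curves were built. The simulated draws are purely hypothetical: the standard deviation of incremental cost is invented, the QALY standard deviation is approximated from the reported confidence interval, and costs and effects are drawn independently, unlike the correlated estimates from the actual regression.

```python
import random
from typing import List, Sequence, Tuple


def icer(delta_cost: float, delta_effect: float) -> float:
    """Incremental cost-effectiveness ratio: extra cost per unit of extra effect."""
    if delta_effect == 0:
        raise ValueError("incremental effect is zero; ICER is undefined")
    return delta_cost / delta_effect


def ceac(draws: Sequence[Tuple[float, float]], thresholds: Sequence[float]) -> List[float]:
    """Share of (delta_cost, delta_effect) draws with positive net monetary benefit."""
    if not draws:
        raise ValueError("at least one draw is required")
    return [
        sum(1 for d_cost, d_effect in draws if lam * d_effect - d_cost > 0) / len(draws)
        for lam in thresholds
    ]


# Implied incremental cost per person (reported ICER x reported incremental effect).
implied_costs = {
    "qaly": 2772 * 0.0237,
    "osteoporotic_fracture": 4478 * 0.0146,
    "hip_fracture": 7694 * 0.0085,
}

# Hypothetical resampled estimates (independent normal draws; cost SD is invented).
rng = random.Random(0)
draws = [(rng.gauss(65.7, 20.0), rng.gauss(0.0237, 0.0138)) for _ in range(5000)]
curve = ceac(draws, [0, 5000, 10000, 20000, 30000])
```

All three back-calculated incremental costs come to roughly £65 per person, which suggests the three analyses share one adjusted cost difference. The simulated curve is not expected to reproduce the 93% figure, because the draws do not use the trial data.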

Sensitivity analysis
The above analysis was repeated for the CCA set, again shown in Table 5. The incremental effect in terms of QALYs was 0.0214 (95% CI -0.011 to 0.054). The ICER for this analysis was £4646. For the CCA, the probability that the intervention would be cost-effective at the NICE threshold of £20,000 per QALY was approximately 83%. When analysis of osteoporosis-related and hip fractures was restricted to only those cases that were also in the CCA data set, we found a marked difference in results (Table 5). For osteoporotic fractures, the mean estimate of incremental effect was 0.0094 (95% CI -0.0073 to 0.026). For hip fractures, the mean estimate of incremental effect was 0.0049 (95% CI -0.0018 to 0.0108). These were both considerably lower than the values for the whole sample given above. For both types of fractures, the estimates of effect were no longer statistically significant between groups and estimated ICERs were more than double those estimated from the full data sets.

Discussion
Participants in the intervention arm accrued, on average, an additional 0.0237 QALYs, though this difference was not statistically significant. The additional cost per QALY was £2772 compared with the control group in our base case analysis. Although these gains in QALY appear modest given the 5-year follow-up, it should be borne in mind that these are mean incremental values for the whole of the intervention cohort compared with the control. Because this is a screening intervention, the majority of participants in the intervention arm received no change in their health care and hence would not be expected to generate a QALY gain. The CEAC presented in Fig. 1, which allows for the uncertainty inherent in the data, indicated that at the NICE threshold value of £20,000 for a QALY, the intervention had a 93% probability of being cost-effective. The intervention also generated reductions in fractures with a cost per fracture prevented of £4478 for all osteoporotic fractures and £7694 for hip fractures. Together, these results provide strong evidence that screening in the community to reduce fractures in older women represents an efficient use of health care resources. This economic evaluation was based on the SCOOP study. An important aim of resource data collection was to minimize burden on participants and achieve high completion rates of those resources felt likely to be most important in relation to fractures, eg, fracture-related inpatient care. Data on fracture-related outcomes and on the resource implications of fracture-related care were obtained from routine data sources, eg, from NHS Data. Considerable effort was invested in ensuring these were as complete as possible. This also meant that completeness of these data was independent of factors that might normally be expected to affect response, such as poor health. There were also pragmatic issues related to the research burden of collecting resource-use data from a large number of practices.
These considerations meant that some items of resource use were not recorded. Examples included routine primary care contacts and admissions to nursing homes. The former may understate some of the costs of providing anti-osteoporosis medicines should prescription of these drugs lead to an increase in primary care consultations. The latter might understate any potential resource savings associated with preventing fractures.
The SCOOP cohort generally had extremely high rates for the return of questionnaires: At the first follow-up (6 months), 11,967 of 12,483 participants responded (96%), and at the 60-month follow-up, of the 11,408 participants still living, 10,661 (93%) responded. However, the nature of QALY estimation using a repeated series of EQ-5D questionnaires makes estimation of QALYs vulnerable to problems of missing data. Additionally, if participants were less likely to return the EQ-5D in the period immediately after a fracture, there may be fewer observations in a time period that would be expected to have the largest effect on EQ-5D scores. Furthermore, because the EQ-5D questionnaires were completed at set times, it is possible that acute changes in quality of life secondary to a fracture that had occurred 6 months prior, for example, might not be captured. Because the comparison of the CCA with missing data (Table 2) and results for cost per fracture prevented when restricted to the CCA data (sensitivity analysis, Table 5) indicated that results were likely to be biased for the CCA data, we used imputed data as our base case QALY analysis.
A number of modeling studies have evaluated the cost-effectiveness of osteoporotic fracture prevention for the UK (22)(23)(24)(25) or for a number of different countries including the UK. (26)(27)(28)(29)(30) UK-based models have generally found treatment for osteoporosis to be cost-effective but have found variations in estimates of cost-effectiveness. Factors likely to be important were reported as prevalence of osteoporosis, costs of treating fractures, and costs of treatment. One UK study also evaluated the cost-effectiveness in relation to risk of fracture. (25) Treatment in cases with higher risk of fractures was associated with increased probability of being cost-effective, suggesting that strategies to identify higher-risk individuals could be beneficial. The disadvantages of a modeling approach include the requirement for data from a variety of sources and the necessity for a number of assumptions to be made. All primary data used in the current study came from the same source, ie, the SCOOP study. We are aware of no other study that has conducted an economic evaluation looking at cost per QALY alongside a randomized study for the prevention of osteoporotic fractures. The SCOOP study also differs from a number of the published models in its length of follow-up. The effect of variable follow-up time has been investigated using an economic model. (31) In this study, Kanis and colleagues investigated the effect on ICERs of a 10-year follow-up compared with lifetime follow-up for 70-year-old women. Increasing the length of follow-up led to a decrease in estimated ICERs (ie, improved cost-effectiveness). Furthermore, there is often a selection bias in randomized trials, with those consenting to involvement likely to differ in characteristics from decliners.
The clinical study reported that mortality in SCOOP was less than 50% of that expected based on age distribution at entry and that generally participants tended to be better educated and of higher socioeconomic status than those who declined. (4) Conversely, the SCOOP study also appeared to have higher than expected numbers of fractures. (4) The fact that the SCOOP study was a randomized controlled trial also may have affected the costs associated with the screening program. Costs of identification of £44 were paid to practices to reflect the fact that this was a task only carried out because of the trial. If this screening program was rolled out in practice, these costs may be lower. For these reasons, the estimates of cost-effectiveness from SCOOP may represent conservative ones.
The SCOOP clinical trial demonstrated that community screening, based upon the FRAX probability of hip fracture, leads to a significant reduction in hip fractures in older women. (4) The current study provides strong evidence that community screening, based upon the FRAX probability of hip fracture in older women, would likely be cost-effective and represent an efficient use of health care resources.