Patient-Reported Outcome Measures in Chemotherapy-Induced Peripheral Neurotoxicity: Defining Minimal and Clinically Important Changes


Background

Chemotherapy-induced peripheral neurotoxicity (CIPN) is a debilitating adverse event of neurotoxic cancer treatments and is particularly prominent with taxanes, platinum-based agents, vinca-alkaloids, proteasome inhibitors, and immunomodulatory drugs. CIPN produces numbness, tingling, and pain in a length-dependent manner, affecting the hands and feet, and may result in functional impairments, such as difficulty with fine motor tasks, problems with balance, and increased falls risk,1 that can persist long term and negatively affect health-related quality of life.2 There are currently no interventions established to prevent CIPN development, and the only treatment moderately recommended is duloxetine.3,4 Consequently, evaluating CIPN with a valid, reliable, and responsive outcome measure is critical for identifying early signs of nerve damage and the effects of interventions aimed at preventing long-lasting or severe neurotoxicity.

There are numerous approaches to assessing CIPN, with patient-reported outcome measures (PROMs) recognized as valuable tools that provide a patient-based perspective essential to accurate assessment.5 Previously demonstrated discordance between patient- and clinician-reported CIPN6 suggests that PROMs may depict a broader spectrum of CIPN severity than clinician-based assessments,7 enabling better understanding of symptom manifestation and effects on function.8

A number of PROMs have been developed to assess CIPN9; the 2 most commonly used and extensively validated are the EORTC Quality of Life – Chemotherapy-Induced Peripheral Neuropathy questionnaire (QLQ-CIPN20)10 and the Functional Assessment of Cancer Therapy/Gynecologic Oncology Group – Neurotoxicity questionnaire (FACT/GOG-NTX).11 Previous studies have demonstrated that both PROMs are responsive in identifying change in CIPN symptoms over time.7,11–15 However, the utility of these PROMs remains limited due to the lack of guidelines regarding thresholds for identifying the development of clinically relevant and functionally significant CIPN.

Estimates of the smallest meaningful change in an outcome measure (termed the minimally important difference [MID]) enable clinicians and researchers to assess the clinical significance of changes in outcomes over time. CIPN PROMs are currently used in multiple research settings, including observational studies and clinical trials, without clear guidelines on how to interpret change in scores. MID estimates will improve the interpretability of CIPN PROMs by helping guide clinical judgement about whether and when a change in CIPN symptoms has reached clinical relevance, facilitating routine use of CIPN PROMs as clinical trial outcome measures and in clinical practice. Although previous studies have attempted to estimate MIDs for the QLQ-CIPN20 and FACT/GOG-NTX,16,17 methodological constraints remain that may affect the reliability, clinical interpretability, and utility of these estimates.18 Consequently, this study aimed to estimate MIDs for the QLQ-CIPN20 and FACT/GOG-NTX to provide guidance on clinical thresholds for CIPN development. Thresholds for clinically significant CIPN were also estimated.

Methods

Patients and Study Design

Patients were recruited into a prospective longitudinal study at 2 hospitals in Sydney, Australia, from August 2015 to March 2021. Patients were eligible if they were scheduled to commence treatment with neurotoxic agents, including taxanes, platinums, vinca-alkaloids, proteasome inhibitors, or immunomodulatory drugs, and were recruited at treatment commencement. Patients were assessed at baseline (prior to the second administration of neurotoxic agent), midtreatment (halfway through treatment protocol), and end-of-treatment (upon completion of neurotoxic treatment). Demographic and treatment dosing information were obtained from medical records.

The study was approved by the Sydney Local Health District (SLHD) and South-Eastern Sydney Local Health District (SESLHD) Human Research Ethics Committees, and all patients provided informed signed consent in accordance with the Declaration of Helsinki. This study followed the STROBE reporting guidelines.19

CIPN Assessments

At each timepoint, patients completed a comprehensive battery of CIPN assessments as reported in prior studies,20,21 including key assessment tools described later that were used for the threshold estimation analysis. CIPN assessments were undertaken by trained researchers following a standardized testing protocol to ensure reproducibility. Full details of outcome measures and scoring are provided in eAppendix 1 (available with this article at JNCCN.org).

PROMs included the QLQ-CIPN20, a validated CIPN PROM consisting of 20 items addressing symptoms and functional impacts of CIPN (https://qol.eortc.org/).10 A reduced variant of the QLQ-CIPN20, consisting of 8 items (termed CIPN8) from the original scale,22 was also examined due to its recent use in translational studies.22–24 Another validated PROM assessing CIPN, the FACT/GOG-NTX consisting of 13 items (https://www.facit.org/),11 and its reduced version with 4 items (NTX4),25 were also examined.

The NCI’s CTCAE (version 4) peripheral sensory neuropathy subscale26 was used to clinically grade CIPN and was adopted as the clinical anchor for MID estimations. Trained researchers graded each patient using the CTCAE scale immediately following each patient’s clinical review and comprehensive CIPN assessment to maximize the scale’s accuracy and reproducibility.27 The Total Neuropathy Score, clinical version (TNSc), which is a validated composite neurologic grading scale designed to evaluate peripheral neuropathy, was also used to evaluate CIPN severity.28,29

Statistical Analysis and Threshold Estimation Methods

Summary statistics are presented as mean and standard deviation. Mean changes in PROM scores were calculated from baseline to midtreatment and from baseline to end-of-treatment, and their statistical significance from zero was assessed with paired t tests, with statistical significance defined as P<.05. Both anchor-based and distribution-based methods were used to estimate thresholds30 for the 4 PROMs as described later. All statistical analyses were performed using Stata, version 14 (StataCorp LP).
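As an illustration of testing a mean change against zero, the sketch below computes within-patient change scores and the corresponding paired t statistic. It is a minimal example in Python rather than the Stata code used in the study; the function name and the scores are hypothetical, and the P value (which requires the t distribution from a statistics package) is deliberately left out.

```python
import math
import statistics


def paired_t_statistic(baseline, follow_up):
    """Return (mean change, t statistic, degrees of freedom) for paired scores."""
    if len(baseline) != len(follow_up):
        raise ValueError("each patient needs a score at both timepoints")
    # Change score per patient: follow-up minus baseline.
    changes = [after - before for before, after in zip(baseline, follow_up)]
    n = len(changes)
    if n < 2:
        raise ValueError("need at least two patients")
    mean_change = statistics.mean(changes)
    sd_change = statistics.stdev(changes)  # sample SD (n - 1 denominator)
    if sd_change == 0:
        raise ValueError("changes have zero variance; t is undefined")
    # t = mean change divided by its standard error.
    t_stat = mean_change / (sd_change / math.sqrt(n))
    return mean_change, t_stat, n - 1


# Hypothetical PROM scores for five patients (illustrative only, not study data).
baseline_scores = [2.0, 5.0, 0.0, 3.0, 4.0]
midtreatment_scores = [6.0, 9.0, 3.0, 8.0, 6.0]
mean_change, t_stat, df = paired_t_statistic(baseline_scores, midtreatment_scores)
print(f"mean change = {mean_change:.2f}, t = {t_stat:.2f}, df = {df}")
```

The t statistic would then be compared with the t distribution on n - 1 degrees of freedom to obtain the P value.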

Anchor-Based Methods

Anchor-based methods express change in PROM scores for subgroups of patients defined by clinically relevant variables (clinical anchors). For this study, the CTCAE was chosen as the clinical anchor because it is the most commonly used method of CIPN assessment and also has clinical relevance, with dose-modifying decisions in clinical practice often made based on CTCAE neuropathy grade. To assess suitability of the clinical anchor, correlations between the CTCAE scores and PROMs were calculated at each timepoint using Spearman’s rank correlation, and a correlation coefficient of r > 0.30 was required to determine plausibility of the anchor.30 Changes in PROM scores and CTCAE grades were computed from baseline to midtreatment and from baseline to end-of-treatment, and correlations between PROM change scores and CTCAE grade changes were calculated using Spearman’s rank correlation to further assess anchor suitability and MID credibility.18
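The anchor plausibility check can be sketched as follows. This is an illustrative Python implementation of Spearman's rank correlation (Pearson correlation of average ranks, so ties in CTCAE grade are handled), compared against the r > 0.30 criterion stated above. The function names, the constant name, and the six data points are assumptions made for the example and are not taken from the study.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # ranks are 1-based
        for position in range(start, end + 1):
            ranks[order[position]] = mean_rank
        start = end + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length samples with at least two pairs")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return cov / math.sqrt(var_x * var_y)


ANCHOR_THRESHOLD = 0.30  # plausibility criterion stated in the text

# Hypothetical CTCAE grades and PROM scores (illustrative only, not study data).
ctcae_grades = [0, 0, 1, 1, 2, 2]
prom_scores = [1.0, 3.0, 2.0, 6.0, 5.0, 9.0]
rho = spearman_rho(ctcae_grades, prom_scores)
anchor_is_plausible = rho > ANCHOR_THRESHOLD
print(f"rho = {rho:.2f}, plausible anchor: {anchor_is_plausible}")
```

The same function applies unchanged to change scores, by passing CTCAE grade changes and PROM change scores instead of the values at a single timepoint.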

Three clinical change groups (CCGs) were defined: (1) CCG 0: no CIPN development (CTCAE grade 0 at all timepoints); (2) CCG 1: development of minimal CIPN (CTCAE grade 0 at baseline, developing to grade 1 at midtreatment/end-of-treatment), which was deemed the MID, with mean changes from this group providing MID estimates for the PROMs31; and (3) CCG 2: development of clinically significant CIPN (CTCAE grade 0 at baseline, developing to grade 2 at midtreatment/end-of-treatment), sufficient to influence/inform important clinical management decisions such as dose reduction, which was deemed the clinically important difference (CID).32 Patients who developed grade >2 neuropathy were excluded from analysis because this study aimed to estimate score changes that reflect the development of minimally important neuropathy (CCG 1) and the threshold for clinically significant neuropathy (CCG 2). TNSc scores were used as a validated neurologic score of CIPN to further examine quantitative CIPN severity between CCGs. The mean change method was used to calculate PROM score changes over time for each CCG, and statistical significance of change was assessed with paired sample t tests and reported with 95% confidence intervals.
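The grouping and mean change steps can be sketched in Python as below. This is an illustrative simplification, not the study's code: it classifies each patient from the baseline grade and a single follow-up grade, whereas the study's CCG 0 definition requires grade 0 at all timepoints. The function names and all records are hypothetical.

```python
import statistics


def clinical_change_group(baseline_grade, follow_up_grade):
    """Map CTCAE grades at baseline and one follow-up timepoint to a CCG.

    Returns 0, 1, or 2 for the groups defined in the text, or None when the
    patient falls outside them (neuropathy at baseline, or grade > 2 later).
    """
    if baseline_grade != 0:
        return None
    if follow_up_grade in (0, 1, 2):
        return follow_up_grade
    return None


def mean_change_by_group(records):
    """Average the PROM change score within each CCG.

    `records` is a list of (baseline_grade, follow_up_grade, prom_change).
    """
    changes = {0: [], 1: [], 2: []}
    for baseline_grade, follow_up_grade, prom_change in records:
        group = clinical_change_group(baseline_grade, follow_up_grade)
        if group is not None:
            changes[group].append(prom_change)
    # Skip groups with no patients so statistics.mean never sees an empty list.
    return {g: statistics.mean(v) for g, v in changes.items() if v}


# Hypothetical records (illustrative only, not study data).
records = [
    (0, 0, 1.0), (0, 0, 3.0),
    (0, 1, 6.0), (0, 1, 8.0),
    (0, 2, 14.0), (0, 2, 16.0),
    (1, 2, 5.0),   # excluded: neuropathy already present at baseline
    (0, 3, 30.0),  # excluded: grade > 2
]
group_means = mean_change_by_group(records)
mid_estimate = group_means[1]  # mean change in CCG 1 (anchor-based MID)
cid_estimate = group_means[2]  # mean change in CCG 2 (anchor-based CID)
print(group_means)
```

Under this scheme the anchor-based MID is simply the CCG 1 mean and the CID is the CCG 2 mean, each computed separately for midtreatment and end-of-treatment.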

Distribution-Based Methods

Distribution-based methods of threshold estimation use the statistical distribution of PROM scores (such as the SD or standard error of measurement [SEM]) and are considered supportive evidence for the anchor-based MID estimations.30 In this study, the distribution-based MID was calculated as 0.5 SD of PROM scores at midtreatment and end-of-treatment, as in previous studies.33,34 Data from patients who had completed a midtreatment or end-of-treatment timepoint were included in this analysis.
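The 0.5 SD rule is simple enough to show directly. The sketch below assumes the sample SD (n - 1 denominator), which the text does not specify; the function name and the scores are hypothetical.

```python
import statistics


def half_sd_mid(scores, fraction=0.5):
    """Distribution-based MID: a fixed fraction of the SD of PROM scores."""
    if len(scores) < 2:
        raise ValueError("need at least two scores to estimate an SD")
    # Assumption: sample SD (n - 1 denominator); the source does not say which.
    return fraction * statistics.stdev(scores)


# Hypothetical PROM scores at one timepoint (illustrative only, not study data).
midtreatment_scores = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
distribution_mid = half_sd_mid(midtreatment_scores)
print(f"0.5 SD estimate = {distribution_mid:.2f}")
```

Because this estimate depends only on the spread of scores in the sample, it carries no clinical information of its own, which is why it is treated as supportive evidence alongside the anchor-based estimate.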

Results

A total of 478 patients were recruited to the study, with 406 patients completing a baseline and midtreatment or end-of-treatment assessment and suitable for inclusion in this analysis (Figure 1). For these 406 patients, mean [SD] age was 55.6 [12.6] years, 64.0% were female (n=260), and most were treated with taxane (38.9%; n=158), platinum (26.6%; n=108), or combination taxane/platinum-based (19.2%; n=78) neurotoxic regimens for breast (32.3%; n=131), gynecologic (19.2%; n=78), or colorectal/gastrointestinal cancers (17.7%; n=72) (Table 1).

Figure 1. Flowchart and clinical change groups across each timepoint.

Abbreviations: CCG, clinical change group; CTCAE, Common Terminology Criteria for Adverse Events.

Citation: Journal of the National Comprehensive Cancer Network 21, 2; 10.6004/jnccn.2022.7074

Table 1. Patient Demographic and Clinical Characteristics (N=406)

At baseline assessment, most patients (79.3%; n=322/406) did not have neuropathy symptoms (CTCAE grade 0; Table 1). By midtreatment (8.9 ± 4.8 weeks from baseline), 62.2% (n=199/320) had CIPN symptoms (CTCAE grade ≥1), and by end-of-treatment (17.6 ± 10.1 weeks from baseline), 79.9% (n=274/343) had CIPN symptoms (Figure 2). Mean scores for all 4 PROMs similarly reflected statistically significant increased self-reported CIPN (P<.001) at each subsequent timepoint (supplemental eFigure 1).

Figure 2. Distribution of CTCAE grades for CIPN at each timepoint.

Abbreviations: CIPN, chemotherapy-induced peripheral neurotoxicity; CTCAE, Common Terminology Criteria for Adverse Events.


Patients who received taxane- or platinum-only treatments did not have significantly different CTCAE grades at midtreatment or end-of-treatment (both P>.05). Similarly, patients with stage IV disease did not have significantly different CTCAE grades than those with stage 0–III disease at either midtreatment or end-of-treatment (both P>.05).

CTCAE grades were correlated with PROM scores at each timepoint and between timepoints (r = 0.42–0.82; supplemental eTable 1). All correlations were >0.30, indicating that the CTCAE grade was an appropriate clinical anchor.30 Furthermore, all but 1 of the 20 correlations were >0.50, meeting the higher threshold set by Devji et al18 for MID credibility.

Figure 1 shows the number of patients with PROM and CTCAE data that allowed them to be categorized as CCG 0, CCG 1, or CCG 2 at midtreatment and end-of-treatment. TNSc scores significantly worsened with increasing CCG both at midtreatment and end-of-treatment timepoints (P<.001; supplemental eTable 2), further supporting the CTCAE as an appropriate clinical anchor of CIPN severity.

PROM change scores for each CCG and timepoint are shown in Table 2. As expected, the CCG 0 group had the smallest mean score changes and the CCG 2 group had the largest mean score changes at midtreatment and end-of-treatment. The end-of-treatment means were larger than the midtreatment means for all PROMs and CCGs.

Table 2. Mean Score Changes for Each CCG

Distribution-based MID estimates were based on PROM data provided by 320 patients at midtreatment and 343 patients at end-of-treatment. These supportive MID estimates were smaller than the definitive anchor-based MID estimates at both timepoints for all PROMs (Table 3).

Table 3. Comparison of Definitive Anchor-Based MID Estimate (CCG 1) With Supportive Distribution-Based MID Estimate

Discussion

This study established score changes associated with minimal (grade 1) and clinically significant (grade 2) neuropathy development for 2 commonly used CIPN PROMs and their abbreviated versions. CIPN PROMs are widely used in clinical research and these data will enable better understanding of the PROM score differences that reflect development of minimally emergent neuropathy as well as development of significant neuropathy that would warrant consideration of dose modification. These corresponding sets of mean score changes serve as thresholds to guide clinical interpretation of score changes on these commonly used CIPN PROMs.

The reported prevalence of CIPN in this study is within the range reported by previous studies.35 However, some studies have reported lower prevalence of CIPN, which may be due to the methodology of CIPN assessment. In the present study, neuropathy grades were evaluated by trained researchers after completing a battery of CIPN assessments. This comprehensive investigation provided researchers with in-depth CIPN information and may have resulted in increased sensitivity in capturing CIPN compared with other studies.

This study used anchor-based methods, which are preferred over distribution-based estimations because they are underpinned by clinical relevance that enriches the interpretation and utility of the MIDs.30 The CTCAE peripheral sensory neuropathy subscale was used as the clinical anchor, and accordingly these MID estimations reflect development of grade 1 sensory CIPN. MIDs were estimated for both midtreatment and end-of-treatment timepoints, providing estimations of thresholds for CIPN development during treatment and quantifying overall treatment toxicity.

Choice of Clinical Anchor: Limitations and Future Directions

Choosing an appropriate clinical anchor is essential in MID estimation. The MIDs we estimated have high credibility according to the criteria developed by Devji et al18 (supplemental eTable 3), supporting the CTCAE peripheral sensory neuropathy subscale as an appropriate and robust anchor. An appropriate anchor needs to identify the threshold of a small but meaningful change in a patient’s symptoms: grade 1 of the CTCAE meets this requirement because it represents emergence of perceptible CIPN. In addition, we used the clinical significance of the higher grades to provide PROM thresholds that indicate when treatment modification should be considered, given that grade ≥2 CIPN often results in treatment modification. Finally, the CTCAE is familiar to clinicians and researchers, further emphasizing its utility as a clinical anchor to aid interpretation of CIPN PROM scores.

Prior MID estimation studies for non-CIPN PROMs used more than one clinical anchor,34,36 whereas this study used only the CTCAE sensory subscale, which is arguably a limitation of this study. We considered other measures as additional anchors, including the TNSc.28 However, despite being a validated CIPN outcome measure, the TNSc lacks defined clinically significant cutoff values, resulting in ambiguity when benchmarking clinical significance. Although a recent study estimated MIDs for the TNSc,37 the TNSc is not commonly used in routine oncology clinical practice, and accordingly no treatment modification indications are attached to TNSc grades.

The CTCAE also has limitations as a clinical anchor, because it can have low interobserver reliability38 and low sensitivity to change,39 with the scale’s 4 grades not being able to accurately capture the spectrum of CIPN severity. Despite these shortcomings, Cavaletti et al27 demonstrated that standardized training in CTCAE grading and interpretation increases accuracy and reproducibility of results, and this has been adopted in the present study. Furthermore, in our study, the CTCAE grading was completed by trained researchers following the comprehensive CIPN assessment and discussion with the patient, which may explain the high correlations between the change scores compared with previous literature on MIDs.33,34,36 In addition, our study found significant worsening of TNSc scores between clinical change groups, further verifying the CTCAE as an appropriate proxy of CIPN severity.

This study estimated MIDs for the development of CIPN, but was not designed to estimate MIDs for CIPN improvement posttreatment, which we acknowledge may differ. Future studies estimating improvement MIDs for the QLQ-CIPN20, FACT/GOG-NTX, and their abbreviated versions will provide important thresholds, because these PROMs are also increasingly used in CIPN treatment intervention studies to ameliorate chronic CIPN symptoms. Furthermore, use of an anchor based on patient-reported perceived change in future work would provide an MID estimate better linked to patient perceptions of importance.18

Comparison With Previous MID Studies

Two prior studies have investigated MIDs for these PROMs: Yeo et al16 for the QLQ-CIPN20 and Cheng et al17 for the FACT/GOG-NTX. However, methodological differences may limit the comparability of findings. First, the 3 timepoints used by Yeo et al16 (baseline, at second cycle of chemotherapy, and at 12-month follow-up) may not allow accurate capturing of CIPN symptom development. Because CIPN develops cumulatively with neurotoxic treatment, these timepoints may have missed the peak of CIPN symptoms, as suggested by the low proportion of their patient cohort (6.2%–6.6%) that developed CIPN. Consequently, although Yeo et al16 aimed to estimate MIDs for the QLQ-CIPN20 using anchor-based methods, final estimates were based solely on distribution-based methods because their anchor-based estimates were found to be inconsistent due to the low rate of CIPN development. Cheng et al17 also used solely distribution-based methods to estimate MIDs for the FACT/GOG-NTX. As discussed earlier, distribution-based methods are limited in clinical utility because they lack a direct link to a clinically relevant anchor. Furthermore, MIDs were estimated separately for the sensory and motor subscales of the QLQ-CIPN20.16 However, due to a lack of demonstrated structural validity for these QLQ-CIPN20 subscales,9 it has been recommended that the PROM be used in its entirety, rather than as individual subscales.40

Utility in Different Clinical and Research Settings

Clinically relevant thresholds for PROMs have utility in both research settings and routine clinical care.41 The QLQ-CIPN20 and FACT/GOG-NTX are currently used in a wide range of research settings, including CIPN observational and natural history studies, CIPN treatment and intervention studies, and cancer treatment clinical trials. Although not as extensively validated as the original versions, the abbreviated CIPN8 and NTX4 have been used as outcome measures in observational studies22,23 and clinical trials.42,43 However, without guidelines on clinical thresholds, interpretation of observed changes in PROM scores is limited. The application of estimated MID scores in the research setting will amplify the utility of PROMs, guiding researchers to determine whether the PROM score changes are clinically meaningful and helping to define appropriate endpoints for clinical trials when assessing CIPN.

CIPN PROMs also have an important role in routine clinical care during treatment, particularly because clinician-rated CIPN grades are known to consistently underreport severity compared with patient self-report.44 Given that development of CIPN is a major reason for dose modification, it is critical that clinicians have the most accurate representation of CIPN severity when making these decisions. CIPN PROMs provide a promising data source, but to date, the use of CIPN PROMs in clinical practice has received little attention. We have estimated thresholds for clinically significant (grade 2) CIPN, because this level of CIPN often affects management, including dose modification. Careful consideration is needed when adapting PROMs developed and validated for use in research to individual patient care,45 but the clinically significant thresholds we have provided are an important first step toward introducing these PROMs into routine clinical practice. Furthermore, other interventions, such as increasing patient involvement in their symptom management46 and use of decision support algorithms to promote adherence to evidence-based CIPN management,47,48 may also provide clinicians with additional information and aid in treatment modification decisions. However, further studies are needed to determine how best to implement these methods in clinical practice.

Conclusions

MIDs and other clinical thresholds estimated in this study provide guidance on the meaningful interpretation of score changes for CIPN PROMs. These results will assist clinicians and researchers in identifying minimal and clinically significant CIPN development when these PROMs are used in research settings, and potentially in clinical practice.
