Main

Colorectal cancer (CRC) is the second most frequent cause of cancer-related deaths in men and third in women in developed countries (Torre et al, 2015). Randomised controlled trials have demonstrated that screening for CRC can reduce CRC incidence and/or mortality (Atkin et al, 2010; Holme et al, 2013, 2014), but the total benefit and harm of national cancer screening programmes are under debate. Saving relatively few lives requires a large number of people to be screened (Atkin et al, 2010). The vast majority of participants will never develop cancer, but may be exposed to potential psychological stress by participation. On a population level, negative psychological effects can counterbalance the benefits of reduced cancer incidence and mortality. Therefore, investigating the psychological effects is an important part of determining potential harms of a screening programme.

Cancer is one of the largest threats to people’s health, and participating in screening for cancer might therefore cause anxiety (Miles and Wardle, 2006). Several studies report more anxiety in participants who receive a positive faecal immunochemical test (FIT) or faecal occult blood test (FOBT) result (Parker et al, 2002; Brasso et al, 2010; Denters et al, 2013; Bobridge et al, 2014; Laing et al, 2014). However, this effect seems to diminish (Bobridge et al, 2014) or disappear (Parker et al, 2002; Brasso et al, 2010; Laing et al, 2014) with long-term follow-up. Other studies report no psychological harm of participation in CRC screening (Niv et al, 2012; Robb et al, 2013). Further, some evidence exists for positive effects of CRC screening participation, such as reduced anxiety (Thiis-Evensen et al, 1999; Wardle et al, 2003), and improved health-related quality of life (HRQOL) (Taupin et al, 2006; Pizzo et al, 2011). This inconsistency may be due to differences in study design, for example a lack of baseline measures or control groups, as well as the use of different instruments to measure psychological effects. The current study is designed to overcome some of these challenges.

In Norway, CRC incidence has nearly tripled since the 1950s (Bray et al, 2007), and the lifetime risk for the average population to develop CRC is about 6% (Bretthauer et al, 2006). Regardless, no national screening programme for CRC exists. Therefore, the randomised management study Bowel Cancer Screening in Norway—a pilot of a national programme (the BCSN pilot), was started in 2012 to compare the effect of the two different screening modalities FIT and flexible sigmoidoscopy (FS) on reduction in CRC incidence and mortality (Bretthauer and Hoff, 2012).

To evaluate the psychological effects of participation in the BCSN pilot in the Norwegian population, a sub-study was started where a random sample of the original cohort was invited to participate. Our hypothesis was that participants who receive a positive screening result will experience more anxiety and reduced HRQOL, whereas participants who receive negative screening results will report anxiety and HRQOL scores similar to their baseline levels. The primary aim of the present sub-study was to measure short-term changes in participants’ level of anxiety, depression and HRQOL during screening participation, from before screening to receipt of the screening result. Demographical variables are known to influence both anxiety and HRQOL (Loge and Kaasa, 1998; Bjelland et al, 2008, 2009). The secondary aim was to explore whether participation in CRC screening affects demographical groups differently.

Materials and Methods

In the BCSN pilot, 140 000 men and women aged 50–74 years in two defined geographical areas in South-East Norway are invited. During 2012–2018 they are randomised (1 : 1) to receive an invitation to either a once-only FS examination or biennial screening with FIT. Individuals randomised to FIT receive a kit for taking a stool sample together with a prepaid return envelope to send the sample for analysis. Participants with a positive screening result in either modality are invited to a work-up colonoscopy examination. In the current sub-study, we invited a sub-sample of those invited to the main study.

Participants were invited every other week starting in September 2013 until July 2014 (FS group) and from October 2013 until December 2013 (FIT group). All participants received a psychological questionnaire with the invitation, together with a letter asking them to complete and return the questionnaire in a prepaid return envelope or complete the questionnaire online. Participants were informed that non-participation in the questionnaire study would not have consequences for their opportunity to participate in screening. A reminder was sent to participants who did not complete FIT screening or attend the FS examination, but this reminder did not include the psychological questionnaires. Nonresponders to the questionnaire were not reminded to complete it. A control group of sex- and age-matched individuals living in the neighbouring counties was randomly drawn from the Norwegian National Registry and invited to participate in the questionnaire study only. Invitations to this control group were sent once per month.

Definitions

A positive FS is defined as detection of advanced neoplasia (CRC, adenoma ≥10 mm, three or more adenomas, any polyp ≥10 mm, adenoma with high-grade dysplasia or villous components) with subsequent referral to colonoscopy. As we were interested in the effect of a positive screening result, participants diagnosed with CRC at the FS examination were excluded from the analysis. A positive FIT screening result is defined as detection of human blood in the stool sample (cutoff >75 ng ml−1) with subsequent referral to colonoscopy.

Questionnaires

Two patient-reported outcome questionnaires were utilised in this study; the Hospital Anxiety and Depression scale (HADS), and the Short-Form Health Survey 12 (SF-12), a generic HRQOL questionnaire. Both questionnaires have been translated into Norwegian and validated in Norwegian background populations (Loge and Kaasa, 1998; Loge et al, 1998; Olsson et al, 2005).

HADS consists of 14 questions, 7 measuring anxiety and 7 measuring depression (Zigmond and Snaith, 1983). Each question has the same four response options, scored 0–3, with a maximum total score of 21 for each scale. A higher score indicates higher levels of anxiety or depression. Cutoff levels are defined as a score of 8 for possible presence, and 11 for probable presence, of ‘clinically meaningful degrees of mood disorder’. In the current study, a score of 8 was defined as cutoff for caseness in HADS-anxiety and HADS-depression, based on the results from a study performed on Norwegian patients attending primary care physicians (Olsson et al, 2005). For participants lacking one or two items on one subscale, the missing values were substituted by the mean of the completed item scores. The subscale score of participants lacking more than two items was set to missing.
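The scoring and missing-item rule described above can be sketched as follows. This is a minimal illustration rather than the scoring code used in the study: the function names are hypothetical, unanswered items are represented as `None`, no rounding is applied after substitution, and treating a score at or above 8 as a case is an assumption about how the cutoff was applied.

```python
from typing import Optional, Sequence

ITEMS_PER_SUBSCALE = 7   # HADS: 7 anxiety items and 7 depression items
MAX_MISSING = 2          # more than two missing items -> subscale set to missing
CASENESS_CUTOFF = 8      # study cutoff; using ">=" here is an assumption


def hads_subscale_score(items: Sequence[Optional[int]]) -> Optional[float]:
    """Score one HADS subscale (0-21); None marks an unanswered item."""
    if len(items) != ITEMS_PER_SUBSCALE:
        raise ValueError("a HADS subscale has exactly 7 items")
    answered = [x for x in items if x is not None]
    if any(x not in (0, 1, 2, 3) for x in answered):
        raise ValueError("item scores must be 0, 1, 2 or 3")
    n_missing = ITEMS_PER_SUBSCALE - len(answered)
    if n_missing > MAX_MISSING:
        return None  # too many missing items: subscale is missing
    # Each missing item is replaced by the mean of the completed items.
    item_mean = sum(answered) / len(answered)
    return sum(answered) + n_missing * item_mean


def is_case(score: Optional[float]) -> Optional[bool]:
    """Caseness flag for a subscale score; missing scores stay missing."""
    if score is None:
        return None
    return score >= CASENESS_CUTOFF
```

A respondent who answers five anxiety items with a score of 2 each and skips two items would receive a subscale score of 14 under this rule.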

SF-12 consists of 12 questions; each question has response choices varying from 2 to 6 alternatives (Ware et al, 1996). The 12 questions can be transformed to eight dimension scores covering the physical and mental aspects of HRQOL. The dimensions are Physical Functioning, Role Physical (limitation associated with physical problems), Role Emotional (limitations associated with emotional problems, RE), Mental Health, General Health (GH), Bodily Pain (BP), Vitality (energy and happiness, VT), and Social Functioning (SF). The dimension scores range from 0 to 100, with a higher score indicating better HRQOL. Missing values were imputed following the recommendations of Ware and Kosinski (2001). If it was impossible to compute the value for one dimension, then that dimension score was set to missing.

Participants

The number of individuals invited to the psychological questionnaire study was 7270 in the FS group, 7024 in the FIT group, and 7650 in the control group. Criteria for inclusion in the analyses were completion of the baseline and result questionnaires, as well as completion of the screening test for FS and FIT participants.

Participants were ineligible for inclusion in the analysis if they (a) had a previous CRC diagnosis, or (b) were deceased. Unreachable participants (invitation returned to sender), and participants who had moved out of the county were excluded. This study was approved by the Regional Ethics Committee of South-East Norway (REK approval no 2011/1272). Participants gave their consent to participate in the questionnaire study by completing and returning the mailed questionnaire.

Study outline

Participants’ levels of anxiety, depression and HRQOL were measured before and after screening participation. First, participants received the questionnaire with the invitation to screening (baseline measure). They received the questionnaire a second time together with the results of their primary screening test (result measure), but before potential colonoscopy (see Figure 1). FS participants received the second questionnaire when they were informed about the result of the examination by an endoscopist, whereas FIT participants and FS participants who had a histology sample analysed received a letter with their results together with the second questionnaire. Control participants completed the questionnaire at baseline only.

Figure 1 Flow-chart of questionnaire measures in the FS, FIT and control groups. First and second round illustrate data collection in the current study. y, years.

The date of completion of the questionnaire was used to determine whether it had been completed before or after screening compliance. Consequently, participants with a questionnaire lacking a completion date were excluded. Further, we were interested in short-term reactions to the primary screening test result, and therefore participants completing the questionnaire >60 days following the primary screening result, or participants who completed the questionnaire after work-up colonoscopy, were also excluded.
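The date-based exclusion rules for the result questionnaire can be expressed as a small filter. The sketch below is illustrative only: the function and parameter names are hypothetical, and it assumes that "after work-up colonoscopy" means a completion date strictly later than the colonoscopy date, which the text does not specify.

```python
from datetime import date
from typing import Optional

MAX_DAYS_AFTER_RESULT = 60  # questionnaires completed later than this are excluded


def result_questionnaire_eligible(
    completed_on: Optional[date],
    result_on: date,
    colonoscopy_on: Optional[date] = None,
) -> bool:
    """Apply the three date-based exclusion rules to one result questionnaire."""
    # Rule 1: a questionnaire without a completion date is excluded.
    if completed_on is None:
        return False
    # Rule 2: completed more than 60 days after the primary screening result.
    if (completed_on - result_on).days > MAX_DAYS_AFTER_RESULT:
        return False
    # Rule 3: completed after the work-up colonoscopy (assumed: strictly later date).
    if colonoscopy_on is not None and completed_on > colonoscopy_on:
        return False
    return True
```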

Information regarding participants’ nationality, marital status, education, and occupation was obtained through the questionnaire. Education level was classified as low (primary school/high school) or high (minimum 2 years of college/university studies). Information regarding age and gender was obtained from the Norwegian National Registry.

Statistical analysis

Internal consistency for the two HADS subscales was tested by Cronbach’s alpha. To investigate demographical differences between the three groups before screening, χ2-tests were applied for categorical outcomes. Standardised residuals were used to show where the statistical differences lay. Analysis of variance (ANOVA) was applied to test for group differences on continuous demographical outcomes.
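For readers unfamiliar with Cronbach’s alpha, the coefficient for a subscale with k items is k/(k − 1) multiplied by one minus the ratio of the summed item variances to the variance of the total score. The following is a minimal sketch using only the Python standard library; the function name is hypothetical, it assumes complete cases only, and it is not the SPSS procedure used in the study.

```python
from statistics import variance
from typing import Sequence


def cronbach_alpha(rows: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha; rows[i][j] is respondent i's score on item j."""
    if len(rows) < 2:
        raise ValueError("need at least two respondents")
    k = len(rows[0])
    if k < 2:
        raise ValueError("need at least two items")
    if any(len(r) != k for r in rows):
        raise ValueError("every respondent needs a score on every item")
    # Sample variance of each item across respondents.
    item_vars = [variance([r[j] for r in rows]) for j in range(k)]
    # Sample variance of the summed (total) score.
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)
```

Values closer to 1 indicate that the items vary together, which is the sense in which the subscales are later described as internally consistent.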

To compare FIT and FS participants with the control group on anxiety, depression, and HRQOL at baseline, ANOVA analyses were applied. The analyses were adjusted for age, sex, education, marital status, and work. Significant differences in the ANOVA analysis were further probed by contrast analysis, to see which groups differed from each other. ANOVA was used to test for two- and three-way interaction effects between time (baseline and result), screening group (FIT and FS) and screening result (positive and negative) on the outcome variables. Age, sex, education, marital status, and work were included in the model to adjust for the effect of these variables. This analysis also allowed us to investigate interaction effects between screening result and demographic variables on the outcome measures. We compared the mean change scores from baseline to result between groups with positive and negative screening results. This analysis was completed with and without adjustment for participants’ baseline values on the outcome. McNemar’s test was applied to test for changes in prevalence of participants with cutoff levels of anxiety and depression from baseline to result.
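McNemar’s test uses only the discordant pairs, that is, participants who were above the cutoff at one time point but not at the other. As an illustration, a two-sided exact (binomial) version can be sketched as below. The function name is hypothetical, and whether the study used the exact or the chi-square form of the test is not stated, so the choice of the exact form is an assumption.

```python
from math import comb


def mcnemar_exact(b: int, c: int) -> float:
    """Two-sided exact McNemar P value.

    b: participants above cutoff at baseline but not at result.
    c: participants above cutoff at result but not at baseline.
    """
    if b < 0 or c < 0:
        raise ValueError("discordant counts must be non-negative")
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs: no evidence of change
    k = min(b, c)
    # Under the null hypothesis each discordant pair falls either way with
    # probability 0.5, so the smaller count follows Binomial(n, 0.5).
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)
```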

To adjust for multiple comparisons we used a Benjamini–Hochberg (BH) correction with a false discovery rate of 15%. Thus, P values at or below the BH rejection threshold of 0.01 were considered statistically significant (Benjamini et al, 2001). Statistical analyses were performed using the statistical software IBM SPSS v19 (IBM Corporation, Armonk, NY, USA). The first criterion for clinically relevant change was defined as a change of ½ s.d. (Norman et al, 2003), the smallest change perceived by individuals as an actual change in condition. Cohen classified different effect sizes, and defined a Cohen’s d effect size above 0.5 as a medium effect size, or a clinically relevant change (Cohen, 1969). Therefore, our second criterion for clinically relevant change was a Cohen’s d above 0.5.
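The standard Benjamini–Hochberg step-up procedure can be sketched as follows: the P values are ranked, and the largest rank k for which the k-th smallest P value is at or below (k/m) × FDR determines which hypotheses are rejected. The function name is hypothetical and the sketch is generic; the rejection threshold of 0.01 reported above depends on the study’s own set of P values, which is not reproduced here.

```python
from typing import List, Sequence


def benjamini_hochberg(p_values: Sequence[float], fdr: float = 0.15) -> List[bool]:
    """Return, for each P value, whether it is rejected at the given FDR."""
    m = len(p_values)
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k with p_(k) <= (k / m) * fdr.
    largest_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            largest_k = rank
    # Reject every hypothesis ranked at or below largest_k.
    rejected = [False] * m
    for idx in order[:largest_k]:
        rejected[idx] = True
    return rejected
```

Because the procedure is step-up, a P value that misses its own rank threshold can still be rejected when a larger P value meets its threshold.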

Results

Of the 14 294 individuals randomised to screening and invited to complete the questionnaires, 7578 (53%) completed screening. Of these, 3216 (42%) completed the questionnaire both at invitation and after receiving the screening result; 1839 in the FS group and 1377 in the FIT group. Three participants in the FS group learned their diagnosis of cancer at the FS examination, and were excluded from the analysis. In the control group, 2618 individuals (35% of the invited) completed the baseline questionnaire (Figure 2).

Figure 2 Flow-chart showing screening participation and questionnaire response rate for the FS, FIT and control groups.

Table 1 depicts demographical data in each randomised group. Contrast analyses revealed that the FS group had a higher mean age (M=63.7 years, s.d.=7.0) compared with the FIT group (M=62.9 years, s.d.=6.6), P<0.01, and compared with the control group (M=62.4 years, s.d.=6.8), P<0.01. The mean age of the FIT group was higher compared with the control group, P<0.01. There were no statistically significant differences between the groups in the proportion of men and women.

Table 1 Demographic characteristics of the three groups measured at baseline

The HADS-anxiety and the HADS-depression subscales had good internal consistency with Cronbach alpha coefficients of 0.86 and 0.84, respectively. At baseline there were no differences exceeding the criteria for a clinically relevant difference in anxiety, depression or HRQOL between participants in the FS, FIT, and control groups (Table 2).

Table 2 HADS-anxiety, HADS-depression and SF-12 mean scores in the three groups prior to screening (baseline)

Anxiety and depression

Results from the ANOVA analyses with time (baseline and result), screening group (FIT and FS) and screening result (positive and negative) as predictors are shown in Table 3. Participants with positive FS and FIT screening results increased slightly in anxiety from baseline to result, but the effect was not statistically significant. FS-negative and FIT-negative participants reported a statistically significant decrease in anxiety from baseline to result. The interaction effect of time, screening group, and screening result was not statistically significant (P=0.89).

Table 3 Mean HADS-anxiety and HADS-depression score at baseline and at result, and change scores from baseline to result, for participants with positive and negative screening results

Participants receiving a positive or negative FS or FIT screening result did not report a statistically significant change in depression from baseline. There was no statistically significant change in the prevalence of screening participants who scored above cutoff levels for caseness of anxiety or depression from baseline to results (Table 4).

Table 4 Prevalence of participants scoring above 8, cutoff level indicating possible presence of anxiety and depression

A marginally significant interaction effect between gender and screening result was observed for changes in anxiety (P=0.03), and a statistically significant effect was observed for depression (P<0.01). Participants who received negative results showed little effect of gender on anxiety (women M=−0.21, men M=−0.11), whereas in the positive result group, women reported increased anxiety (M=0.49) and men reported a decrease (M=−0.05). Among the negatives, there was little effect of gender on depression (women M=−0.03, men M=−0.01), whereas among the positives men reported a decrease (M=−0.69) and women an increase (M=0.19).

Health-related quality of life

No interaction effects were statistically significant for HRQOL. No statistically significant changes were observed for participants who received a positive screening result. For FS-negative participants, a statistically significant improvement in the dimensions GH, BP, and VT was observed from baseline to receipt of the result (Table 5a). Further, FIT-negative participants reported a statistically significant improvement in VT (Table 5b).

Table 5a Mean HRQOL at baseline and at result and change scores from baseline to result for FS participants with positive and negative screening results
Table 5b Mean HRQOL at baseline and at result and change scores from baseline to result for FIT participants with positive and negative screening results

A significant interaction effect was observed between gender and screening result on the RE dimension (P<0.01). There was little effect of gender among participants with negative screening results (RE women M=−1.2, men M=−0.3). Among the positives, women reported an increase in RE (M=5.4), whereas men reported a large decrease in the RE dimension (M=−8.0).

To assess whether nonresponders to the result questionnaire differed from responders, we repeated the analyses with baseline scores carried forward for participants who did not reply to the result questionnaire. The analyses including these participants yielded similar results.

As can be observed from comparing change scores against s.d. in Tables 3, 5a and 5b, none of the observed changes fulfilled the first criterion of a clinically relevant change (a change of ½ s.d.). Further, comparing the screening groups at result with the control group at baseline, no comparison indicated a medium effect size, defined as a Cohen’s d above 0.5.
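The two criteria for clinical relevance can be checked mechanically. The sketch below is illustrative and makes two assumptions that the text does not confirm: the reference s.d. for the ½ s.d. criterion is taken to be the s.d. of the baseline scores, and Cohen’s d is computed with a pooled standard deviation for two independent groups. The function names are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev
from typing import Sequence


def exceeds_half_sd(mean_change: float, reference_sd: float) -> bool:
    """First criterion: is the absolute mean change at least half an s.d.?"""
    return abs(mean_change) >= 0.5 * reference_sd


def cohens_d(group_a: Sequence[float], group_b: Sequence[float]) -> float:
    """Cohen's d for two independent groups using the pooled s.d."""
    na, nb = len(group_a), len(group_b)
    pooled_var = (
        (na - 1) * stdev(group_a) ** 2 + (nb - 1) * stdev(group_b) ** 2
    ) / (na + nb - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


def medium_effect(d: float) -> bool:
    """Second criterion: Cohen's d above 0.5 in absolute value."""
    return abs(d) > 0.5
```

For example, with a hypothetical reference s.d. of 3.4 points, a mean change of −0.21 points would fall well short of the 1.7-point change required by the first criterion.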

Discussion

This study indicates no clinically relevant psychological harm of receiving a positive CRC screening result. Changes in anxiety, depression and HRQOL were measured from before screening to after knowing the screening results in a large number of asymptomatic participants. A positive screening result did not increase participants’ level of anxiety or depression, or decrease participants’ level of HRQOL. Further, participants who received a negative screening result reported a statistically significant decrease in anxiety, and improvements in the HRQOL dimensions GH, BP, and VT. However, no changes observed in the current study reached the criteria of clinical relevance. Altogether, these results indicate no clinically relevant psychological harm of participation in FIT or FS screening delivered in a pilot for a national screening programme. The randomised design combined with a control group gives confidence in generalisation of the results.

The literature is inconsistent regarding psychological effects of CRC screening. The results from the current study are consistent with some previous research showing no negative effect on anxiety (Wardle et al, 2003; Robb et al, 2013) or HRQOL (Niv et al, 2012) from baseline to receipt of FS and colonoscopy screening results. However, four other studies reported anxiety in FIT-/FOBT-positive participants (Brasso et al, 2010; Denters et al, 2013; Bobridge et al, 2014; Laing et al, 2014). Importantly, three of the studies (Brasso et al, 2010; Denters et al, 2013; Bobridge et al, 2014) did not report changes from baseline to result within groups, and thus can only document a statistically significant difference in anxiety between participants who receive different screening results, after the screening examination. However, an observed difference between recipients of different screening results may exist before screening, rather than the knowledge of the screening result causing psychological distress (McCaffery and Barratt, 2004). This emphasises the importance of a baseline measure to document changes within participants. The last study showing anxiety as a consequence of CRC screening participation (Laing et al, 2014) documents an increase in mean anxiety score from baseline in FOBT-positive participants. However, the study shows an increase of 5% on the State-Trait Anxiety Inventory measure. Research shows that an effect of 20% change in the HADS scale (Puhan et al, 2008), or a difference of ½ s.d. (Norman et al, 2003) in other psychosocial measures, is needed for the individual to experience the change as meaningful. The effect documented in the study by Laing et al (2014), as well as all changes observed in the current study, do not fulfil these criteria for a clinically relevant change. Consequently, there seems to be little support for the concern that CRC screening causes clinically relevant anxiety.

In contrast to previous studies, the current study investigated whether anxiety depended on the type of screening test. Anxiety depended on which result one received, but was not influenced by whether participants completed FIT or FS screening. Thus, our study provides important knowledge for planning new screening programmes.

The study employed HADS, a validated measure of generic anxiety. A meta-analysis on breast cancer screening concluded that false-positive mammograms cause distress in participants, but that cancer-specific anxiety measures detect the effect, whereas generic anxiety measures do not (Salz et al, 2010). As Brodersen et al (2007) argue, generic scales may measure constructs unlikely to be relevant to a screening experience. Consequently, we cannot exclude the possibility that a cancer-specific measure would yield different results. However, several studies have been able to document effects of CRC screening on anxiety using generic measures (Brasso et al, 2010; Bobridge et al, 2014; Laing et al, 2014). Replicating our findings with a cancer-specific measure is an important issue for future research. To enable these studies, development of a validated CRC-specific anxiety measure is needed.

Most research on psychological reactions towards cancer screening stems from studies with female participants (Bond et al, 2013; O'Connor et al, 2015). One study showed that men report lower screening-specific anxiety in CRC screening compared with women (Denters et al, 2013), whereas other studies report no gender difference (Wardle et al, 2003; Taylor et al, 2004). The marginally significant effect of gender in the present study indicates that a positive screening result might have a larger influence on anxiety in women than in men. One explanation could be that groups who report higher initial levels of anxiety are more influenced by threatening experiences such as receiving a positive screening result. Further, women report more pain than men during endoscopy, which is related to anatomical differences (Thiis-Evensen et al, 2000). Pain during screening examination is related to increased anxiety following screening (Hafslund, 2000). The trend observed in our study indicates that the reactions towards a positive screening result could differ depending on gender. This finding warrants future research.

Research on psychological effects of screening has focused on the experience of a false-positive screening result. However, the most frequent screening experience is to receive a true-negative result. The current study shows decreased anxiety and improvements in some dimensions of HRQOL as a consequence of receiving a negative result, in line with previous research (Thiis-Evensen et al, 1999; Wardle et al, 2003; Taupin et al, 2006; Pizzo et al, 2011). One possible explanation is that an invitation for CRC screening may increase initial anxiety levels measured at baseline. However, Wardle et al (1999) showed that receiving information about FS screening did not cause increased anxiety levels. Further, participants in both screening groups were similar to the control participants at baseline, which does not support this hypothesis. Alternatively, decreased anxiety may result from a perceived decreased risk of CRC (Robb et al, 2004), an important motivation for health behaviour (Brewer et al, 2007).

In order to improve studies of psychosocial outcomes in screening, recommendations have been made to include both a baseline measure and a control group (McCaffery and Barratt, 2004). Only one other study of CRC screening has adhered to the recommendations (Taylor et al, 2004). However, in that study participants completed several screening programmes, making it difficult to disentangle the effect of a positive CRC screening result alone. Further, the present study has the larger participant sample of the two studies. A large sample is necessary to combine a prospective design including a baseline measure with a large enough number of responding participants with a positive result to enable investigation of changes within this group. Comparing screening participants with a control group showed that CRC screening participants were similar to a normal population at baseline, which supports generalisation of the results.

While the control group is a strength of the current study, the response rate in the control group was low. Therefore, we cannot exclude the possibility that control participants who completed the questionnaire differ from those who did not, and consequently the control group might not be representative of the normal population as a whole. However, control participants were not very different from a large number of screening participants before screening on demographic variables as well as anxiety, depression and HRQOL. Before an intervention, screening participants are likely to be similar to the normal population.

Owing to the design of the current study the number of participants with a positive screening result is low. The increase in mean anxiety score in participants with a positive screening result is of similar size to the decrease in anxiety in screening-negative participants. The latter sample was larger and therefore statistically significant results are more easily detectable. However, neither change was close to a clinically relevant change. Another limitation in the present study is the low percentage of participants complying with both the screening examination and two questionnaires. Individuals who choose to participate in CRC screening may differ from those who decline participation. CRC screening attenders are more often married (Van Jaarsveld et al, 2006), have higher socioeconomic status (Pornet et al, 2010) and better mental health (Kodl et al, 2010) compared with nonattendees. These participants might respond differently towards a positive screening result, compared with participants with lower levels of social support, lower health literacy, or initial psychological distress. Further, it is possible that participants who complied with both questionnaires differ from nonrepliers, for instance by being more positive towards screening. Therefore, the results might not be generalisable to the population as a whole. However, comparison with the control group indicates that the screening participants were similar to the general population in anxiety, depression and HRQOL before screening.

Conclusion

In conclusion, the current study shows no increase in anxiety or depression or decrease in HRQOL in participants who received a positive CRC screening result. Moreover, participants who received a negative result reported lower levels of anxiety, and improvement on some HRQOL dimensions. However, no changes observed in the current study were considered to be of clinical relevance. Thus, receiving a positive CRC screening result and participating in CRC screening do not seem to have clinically relevant short-term effects on anxiety, depression, or HRQOL in Norwegian participants.