INTRODUCTION

Addressing impaired sleep among trainee physicians as a means of decreasing medical errors and burnout rates continues to be a topic of considerable debate. Training programs have grappled with balancing the educational value of continuous patient care against the mental and physical effects of long shifts. The Accreditation Council for Graduate Medical Education (ACGME) has recently revised nationwide work hour restrictions to eliminate the 16-hour cap on first-year resident shifts.1 In this evolving environment, a robust understanding of the effects of residency training on sleep will be crucial in identifying both downstream effects of these changes and, importantly, mechanisms to mitigate these effects and to enhance trainee health and patient safety. Proactively implementing adaptive strategies to manage negative effects of medical training on sleep may help trainee physicians manage frequently non-circadian schedules throughout a lifelong practice of medicine.

Sleep impairment is linked not only to poor health outcomes such as obesity2,3 and cardiovascular disease4 but also to decreased task performance and concentration.5 Prior studies of medical residents identified associations between impaired sleep and both medical errors and burnout, including increased depression and anxiety.6,7,8 Other research has illustrated that work hour changes may not lead to greater sleep or to better clinical performance9,10,11,12 and that factors other than clinical work hours may contribute to fatigue.13 Many studies have also utilized brief global assessments of wellness to assess effects of work hour changes without quantifying specific sleep data,14,15 though research has shown that global perceptions of function or well-being do not correspond to results on detailed sleep assessments.16 In addition, less is understood about individual-level susceptibility of trainee physicians to either sleep impairment or its negative downstream effects.

Informed by a previous study of our system’s training programs, which demonstrated that some residents begin training with pre-existing impaired sleep,17 we sought to understand how baseline sleep characteristics may evolve during residency in the environment of restricted work hours. This updated study aimed to identify baseline individual-level factors that may aid training programs and residents themselves in anticipating susceptibilities to sleep impairment during training, to allow for future targeted interventions to reach those most at risk. Two principal hypotheses were explored: (1) sleep quality and daytime sleepiness worsen during the first year of residency, and (2) individuals entering residency with impaired sleep at baseline will continue to experience sleep impairment during training.

METHODS

To assess the effects of baseline characteristics and residency training on sleep, this naturalistic observational study utilized repeated-measures surveys of first-year (PGY-1, intern) residents in ACGME-accredited residency programs within Partners HealthCare-affiliated teaching hospitals in Boston, Massachusetts. Data collection occurred from June 2011 to April 2012, during the first academic year in which PGY-1 residents were restricted to maximum shift durations of 16 hours.18 The Partners HealthCare Institutional Review Board approved the study protocol.

Study enrollment took place during PGY-1 orientation, prior to the start of clinical duties. All incoming interns were eligible to enroll, and all received information about the research, including that participation was voluntary and that responses would not be shared with residency programs. Residents electing to participate provided written informed consent and completed self-administered surveys. Follow-up data collection via administration of identical surveys occurred at intern retreats in the spring of the first year, approximately 9 months after initial enrollment. Each participant was assigned a unique study code to allow matching of initial and follow-up data. Participants received a US$5 gift card to the hospital coffee shop upon completion of each survey.

Demographic information obtained included age, gender, presence of a partner in bed, number of children in household, and type of PGY-1 program (dichotomized as surgical versus non-surgical). Two validated survey instruments were utilized: the Pittsburgh Sleep Quality Index (PSQI) and the Epworth Sleepiness Scale (ESS). These were selected due to their ability to detect change over time in multiple domains, their inclusion of cutoff scores that indicate clinically significant levels of sleep impairment, and their widespread use in similar populations.

The PSQI is a 19-item questionnaire that queries quality of sleep over the past 1 month as an aggregate of seven components: sleep latency, sleep duration, sleep efficiency, sleep disturbance, use of sleep medication, daytime dysfunction, and subjective global rating. Individual component ratings are summed to generate a total PSQI score, with a score of 6 or above identifying poor quality sleep warranting clinical intervention, at a sensitivity of 89.6% and a specificity of 86.5%.19,20

The ESS is an eight-item questionnaire that identifies excessive daytime sleepiness by assessing an individual’s likelihood of dozing in various situations. A total ESS score of 10 or above indicates excessive daytime sleepiness. This scale is utilized in the clinical diagnosis of narcolepsy or sleep apnea; however, it does not identify the etiology of problematic levels of daytime sleepiness.21 Individuals meeting the cutoff score on this scale therefore exhibit sleepiness equivalent to that of a patient with an untreated primary sleep disorder, regardless of the cause.
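As a minimal sketch of the scoring rules described above for both instruments, the Python below sums ratings and applies the cutoffs stated in the text (PSQI total of 6 or above; ESS total of 10 or above). All function and variable names are hypothetical, and the example respondent is invented for illustration. The 0 to 3 rating range for each PSQI component and each ESS item comes from the published instruments rather than from this section, and the derivation of the seven PSQI component ratings from the 19 questionnaire items is not shown.

```python
# Component names follow the seven PSQI components listed in the text.
PSQI_COMPONENTS = (
    "sleep_latency",
    "sleep_duration",
    "sleep_efficiency",
    "sleep_disturbance",
    "sleep_medication_use",
    "daytime_dysfunction",
    "subjective_quality",
)
PSQI_CUTOFF = 6    # total of 6 or above = poor sleep quality (per the text)
ESS_ITEM_COUNT = 8
ESS_CUTOFF = 10    # total of 10 or above = excessive daytime sleepiness (per the text)


def _check_ratings(ratings):
    """Assumption: every component or item is rated 0, 1, 2, or 3."""
    for rating in ratings:
        if rating not in (0, 1, 2, 3):
            raise ValueError(f"rating out of range: {rating!r}")


def psqi_total(components):
    """Sum the seven PSQI component ratings into a total score."""
    missing = set(PSQI_COMPONENTS) - set(components)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    ratings = [components[name] for name in PSQI_COMPONENTS]
    _check_ratings(ratings)
    return sum(ratings)


def ess_total(items):
    """Sum the eight ESS item ratings into a total score."""
    if len(items) != ESS_ITEM_COUNT:
        raise ValueError(f"expected {ESS_ITEM_COUNT} items, got {len(items)}")
    _check_ratings(items)
    return sum(items)


def screens_positive(total, cutoff):
    """A total at or above the cutoff counts as a positive screen."""
    return total >= cutoff


# Hypothetical respondent, not study data.
example_psqi = {
    "sleep_latency": 1,
    "sleep_duration": 2,
    "sleep_efficiency": 0,
    "sleep_disturbance": 1,
    "sleep_medication_use": 0,
    "daytime_dysfunction": 1,
    "subjective_quality": 1,
}
example_ess = [1, 2, 1, 0, 2, 1, 1, 1]

psqi_score = psqi_total(example_psqi)
ess_score = ess_total(example_ess)
print(psqi_score, screens_positive(psqi_score, PSQI_CUTOFF))
print(ess_score, screens_positive(ess_score, ESS_CUTOFF))
```

Because the two cutoffs are applied independently, a respondent can screen positive on one instrument and negative on the other, as this invented example does.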

Bias between study completers and non-completers was assessed with chi-square tests for demographic variables and with Wilcoxon rank-sum tests for baseline ESS and PSQI scores. Changes in variable means over time were analyzed with paired-sample t tests after confirming normality of distributions.22 Changes in proportion of residents exceeding scale cutoff scores were analyzed with McNemar’s test. Multiple linear regression analyses were utilized to assess associations between scale scores and demographic variables. Associations based on two-tailed tests were deemed significant at α = 0.01. All analyses were performed with SPSS software (Version 22, IBM Corp., Armonk, NY).
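To illustrate how McNemar’s test handles paired before-and-after screening results, the sketch below implements the exact (binomial) form of the test in plain Python. It is an illustration only: the analyses reported here were run in SPSS, the variant of the test used is not stated, and the function name and discordant-pair counts in the example are hypothetical rather than study data. The test uses only the discordant pairs: b residents who were positive at baseline and negative at follow-up, and c residents who were negative at baseline and positive at follow-up.

```python
from math import comb

ALPHA = 0.01  # significance threshold stated in the text


def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value from discordant pair counts.

    Under the null hypothesis, each discordant pair is equally likely
    to fall in either direction, so min(b, c) follows a
    Binomial(b + c, 0.5) distribution.
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, no evidence of change
    k = min(b, c)
    lower_tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * lower_tail)


# Hypothetical counts for illustration only (not taken from the study).
hypothetical_b = 5    # positive at baseline, negative at follow-up
hypothetical_c = 30   # negative at baseline, positive at follow-up
p_value = mcnemar_exact(hypothetical_b, hypothetical_c)
significant = p_value < ALPHA
print(f"p = {p_value:.2g}, significant at alpha = {ALPHA}: {significant}")
```

Concordant pairs (residents positive at both time points or negative at both) do not enter the calculation, which is why the test is suited to detecting a change in the proportion exceeding a cutoff within the same individuals.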

RESULTS

At study initiation, 281 residents enrolled and completed baseline surveys. Of this initial cohort, 153 (54%) completed surveys at the follow-up time point. Participants who did not complete follow-up surveys were demographically similar to study completers in age, gender, relationship status, and number of children in household. Those lost to follow-up were more likely to be in a surgical training program than a non-surgical training program (p < 0.001). Study completers and non-completers did not differ in baseline mean scores on the ESS (p = 0.78) or the PSQI (p = 0.09). Detailed baseline characteristics of the study population are shown in Table 1. All data presented below refer to the 153 residents who completed both initial and follow-up surveys.

Table 1 Baseline Characteristics of Study Population, Stratified by Follow-up Status

Evolution of Sleep Characteristics Over Time

First-year residents exhibited statistically significant (p < 0.001) changes in the following sleep characteristics from baseline to follow-up: Average bedtime moved from 11:26 p.m. to 10:46 p.m., and average wake time moved from 7:43 a.m. to 5:38 a.m. Mean sleep duration decreased from 7.6 to 6.5 hours. Mean sleep latency did not change over time (p = 0.10). Differences in these characteristics are summarized in Table 2.
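The arithmetic behind the reported clock times can be sketched as follows. The Python below computes the interval between the average bedtime and average wake time quoted above at each time point. The function name is hypothetical, and the result is only an approximation of time in bed: an interval computed from two group averages is not necessarily the same as the average of each resident's own interval, and it is distinct from the self-reported sleep duration.

```python
from datetime import datetime, timedelta


def minutes_in_bed(bedtime, wake_time):
    """Minutes from bedtime to wake time (24-hour 'HH:MM' strings).

    A wake time that is not later than the bedtime on the clock is
    treated as falling on the next day.
    """
    fmt = "%H:%M"
    bed = datetime.strptime(bedtime, fmt)
    wake = datetime.strptime(wake_time, fmt)
    if wake <= bed:
        wake += timedelta(days=1)
    return int((wake - bed).total_seconds() // 60)


# Average times as reported in the text, converted to 24-hour format.
baseline = minutes_in_bed("23:26", "07:43")    # 11:26 p.m. to 7:43 a.m.
follow_up = minutes_in_bed("22:46", "05:38")   # 10:46 p.m. to 5:38 a.m.

for label, minutes in (("baseline", baseline), ("follow-up", follow_up)):
    hours, remainder = divmod(minutes, 60)
    print(f"{label}: {hours} h {remainder} min between bedtime and wake time")
print(f"reduction: {baseline - follow_up} min")
```

On these averages, wake time moved earlier by considerably more than bedtime did, so the interval available for sleep shortened even though residents went to bed earlier.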

Table 2 Change in Sleep Characteristics Over Time (n = 153)

Mean scores on both survey instruments were significantly different from baseline at follow-up (p < 0.001), with total PSQI score increasing from 3.6 to 5.2 and total ESS score increasing from 7.2 to 10.4. Four of the seven PSQI component scores exhibited statistically significant (p < 0.001) increase over time: subjective sleep quality, sleep duration, sleep medication use, and daytime dysfunction. There was no significant difference over time in component scores representing sleep efficiency, sleep latency, or sleep disturbance. PSQI scores with components and ESS scores are summarized in Table 2.

Daytime dysfunction and sleep disturbance component scores were the main contributors to mean total PSQI score at follow-up. Daytime dysfunction score increased from 0.7 at baseline to 1.4 at follow-up. Sleep disturbance score remained constant at 1.0 across both time points. The most frequent causes of sleep disturbance at follow-up, by percentage of residents reporting occurrence at least once or twice per week, were middle- or end-of-night awakening (48%), waking up to use the bathroom (34%), feeling hot (25%), and having difficulty falling asleep within 30 min (18%). Responses to the full set of sleep disturbance queries are shown in Table 3.

Table 3 Reported Causes of Sleep Disturbances on PSQI at Follow-up (n = 153)

Multiple linear regression was used to model relationships between demographic characteristics and sleep scores for study completers at both time points. Greater age and fewer children in the household were associated with increased PSQI score at follow-up (p < 0.001). Gender, surgical versus non-surgical specialty, and presence of a bed partner were not significant in this model. None of the queried demographic variables was a significant predictor of PSQI score at baseline or of ESS score at either time point. Logistic regression models showed no significant association between demographics and positive score on either instrument.

Predictive Utility of Baseline Sleep Scores

A subset of residents exhibited clinically significant sleep impairment at start of residency based on the cutoffs on the PSQI (total score of 6 or above) and on the ESS (total score of 10 or above). Specifically, 23 matriculating residents (15%) screened positive for impaired sleep on the PSQI, and 40 (26%) screened positive for excessive daytime sleepiness on the ESS. There was minimal overlap at baseline, with 9 individuals (6%) screening positive on both instruments. The proportion of residents exceeding cutoff scores increased at follow-up: 62 (40%) screened positive on the PSQI and 90 (59%) screened positive on the ESS. Forty-two residents (27%) screened positive on both instruments at follow-up. Changes in proportions were statistically significant at p < 0.001 for both scales.

Residents who screened positive on either instrument at baseline tended to remain positive; the proportion of these individuals converting to a negative score at follow-up was 30% on the PSQI and 7% on the ESS (p < 0.001, as above). A low initial score on either scale was not protective against later sleep deterioration: Of the 94 residents with scores below cutoff on both scales at baseline, 64 (68%) were positive on at least one scale at follow-up. Figure 1 shows the distribution of PSQI and ESS scores at follow-up, with stratification by baseline scores above or below the corresponding clinical cutoff.

Fig. 1 Distribution of PSQI and ESS scores at follow-up, stratified by baseline scores: Residents with elevated scores on the PSQI (a) or the ESS (b) tended to remain in the elevated score group at follow-up. The cutoff value corresponding to each scale’s definition of an elevated score is indicated by the dashed line. Light shading to the right of the dashed line represents individuals who changed from a normal score at baseline to an elevated score at follow-up.

DISCUSSION

This study measured baseline sleep characteristics of incoming PGY-1 residents across multiple specialties within a large, multicenter healthcare system and tracked the change in these characteristics over 9 months during the first year of the 16-hour limit on continuous duty hours. Prevalence of sleep impairment at baseline, based on the PSQI cutoff score, was nearly identical between this study (15%) and a prior analysis of residents in our training programs (17%).17 In contrast, this study identified a somewhat greater proportion of residents with baseline ESS scores of 10 or above (26%) compared to the proportion observed in the prior analysis (18%).17 The overall concordance of both analyses supports the understanding that many individuals begin residency with impaired sleep. The observed sleep duration and sleep latency at follow-up in this cohort matched results from other groups’ studies of residents without the 16-hour restriction in place,23,24,25 indicating both that limited duty hours may not exert a direct effect on overall sleep characteristics and that this study reaffirms previously established sleep characteristics during training.

The first study hypothesis, that sleep impairment would worsen during intern year, was supported by the significant increase both in average scores on the PSQI and the ESS and in proportion of residents meeting clinical cutoffs on these instruments. The second hypothesis, that individuals beginning training with impaired sleep would continue to suffer from impaired sleep during residency, was confirmed by the data. However, this occurred within a larger pattern of interns becoming more sleep impaired in general. Indeed, sleep impairment worsened to the degree that most participants exceeded the ESS clinical cutoff at follow-up. Previous studies utilizing the ESS have shown that this degree of daytime sleepiness is associated with increased incidence of medical errors that lead to adverse events in patient care.7

Accordingly, it appears that trainees face a general risk of significant deterioration in sleep independent of baseline level of impairment. Increased age was a statistically significant predictor of sleep impairment, which corresponds with existing evidence that aging decreases ability to tolerate sleep disturbance.26 The only other significant demographic predictor of worsening sleep was fewer children in the household, which may run counter to expectations that having young children would increase nighttime awakenings. One possible explanation for this finding could be that residents with children pay more attention to structuring activity outside of work and may emphasize ensuring appropriate sleep and wake times within their households.

While PGY-1 training may be a risk factor itself for worsening sleep, this study does not prove a causal link, as there may be other unmeasured personal and environmental factors driving worsening sleep impairment. For instance, the type of clinical environment or the duration of hours worked may directly influence sleep characteristics. Other environmental variables—commute time, for example—may not be present at pre-matriculation screening but nevertheless impact sleep during training. Furthermore, the finding that 30% of residents with elevated baseline PSQI scores converted to negative scores over time indicates that additional unmeasured factors may mitigate or reverse any residency-related drivers of sleep impairment. Screening for sleep impairment via self-report questionnaires prior to residency matriculation may therefore have some utility in identifying a subset of individuals at risk for poor sleep during training, in order to deliver targeted intervention strategies to reduce fatigue. It will not, however, identify all trainees at risk of clinically significant sleep impairment.

Correspondingly, it is worth further investigating risk factors that lead residents with normal baseline sleep to experience sleep impairment during training. Extensions of this study could include tracking of residents throughout multiple years of training, increasing frequency of assessments, obtaining biometric sleep data via wearable devices, identifying specific clinical settings which may contribute to periods of worsened sleep, measuring other potential sleep risk factors, and gathering qualitative data to capture resident perceptions of contributors to sleep impairment. Additionally, studying the subset of residents whose sleep characteristics improved during residency in greater detail may uncover effective individual strategies that can subsequently be taught to other residents.

Strengths of this work include the enrollment of all incoming residents across a large academic healthcare system and administration of validated survey instruments, which both robustly identify impaired sleep and excessive daytime sleepiness and allow direct comparison to other studies. The inclusion of residents from all specialties enables the results to inform discussions applicable to all residency programs. Collection of baseline data from individuals prior to beginning clinical work enables a direct analysis of the effects of intern year on sleep.

Limitations of this study must be acknowledged. Though all incoming residents enrolled, just over half completed the study. Individuals lost to follow-up were more likely to be training in surgery, rendering these findings less generalizable to surgical residencies. All data collected were self-reported, which may have influenced results if participants did not accurately recall recent sleep patterns. Sampling was performed at only two time points, limiting understanding of the full trajectory of sleep impairment during residency, specifically whether it may vary based on seasonal patterns, intensity of a given rotation, or type of night coverage schedule in place.

In conclusion, given that a large proportion of residents exhibited worsening sleep impairment independent of the hypothesized baseline factors that would identify those at greatest risk, the medical education community should additionally implement practical strategies to allow all trainees to maximize rest and to mitigate the effects of sleep impairment. For example, studies have shown that proactive schedule design can increase sleep or decrease daytime impairment after overnight shifts.27,28 Practical educational offerings on sleep hygiene for trainee well-being may counteract the observed effect that shortened work hours lead residents to increase extracurricular and leisure activities rather than increasing sleep.13 Importantly, as a component of lifelong professional development, trainees must also be able to self-identify sleep impairment and access strategies to optimize their health in this regard; while work schedules may change after training, physicians continue to be at risk of fatigue and sleep impairment throughout their lives.29,30