Interventions for treating proximal humeral fractures in adults

This is not the most recent version

Abstract

Background

Fracture of the proximal humerus, often termed shoulder fracture, is a common injury in older people. The management of these fractures varies widely. This is an update of a Cochrane Review first published in 2001 and last updated in 2012.

Objectives

To assess the effects (benefits and harms) of treatment and rehabilitation interventions for proximal humeral fractures in adults.

Search methods

We searched the Cochrane Bone, Joint and Muscle Trauma Group Specialised Register, the Cochrane Central Register of Controlled Trials (CENTRAL), MEDLINE, EMBASE, and other databases, conference proceedings and bibliographies of trial reports. The full search ended in November 2014.

Selection criteria

We considered all randomised controlled trials (RCTs) and quasi‐randomised controlled trials pertinent to the management of proximal humeral fractures in adults.

Data collection and analysis

Both review authors independently performed study selection, risk of bias assessment and data extraction. Only limited meta‐analysis was performed.

Main results

We included 31 heterogeneous RCTs (1941 participants). Most of the 18 separate treatment comparisons were tested by small single‐centre trials. The main exception was the surgical versus non‐surgical treatment comparison tested by eight trials. Except for a large multicentre trial, bias in these trials could not be ruled out. The quality of the evidence was either low or very low for all comparisons except the largest comparison.

Nine trials evaluated non‐surgical treatment in mainly minimally displaced fractures. Four trials compared early (usually one week) versus delayed (three or four weeks) mobilisation after fracture but only limited pooling was possible and most of the data were from one trial (86 participants). This trial found some evidence that early mobilisation resulted in better recovery and less pain in people with mainly minimally displaced fractures. There was evidence of little difference between the two groups in shoulder complications (2/127 early mobilisation versus 3/132 delayed mobilisation; 4 trials) and fracture displacement and non‐union (2/52 versus 1/54; 2 trials).

One quasi‐randomised trial (28 participants) found the Gilchrist‐type sling was generally more comfortable than the Desault‐type sling (body bandage). One trial (48 participants) testing pulsed electromagnetic high‐frequency energy provided no evidence. Two trials (62 participants) provided evidence indicating little difference in outcome between instruction for home exercises versus supervised physiotherapy. One trial (48 participants) reported, without presentable data, that home exercise alone gave better early and comparable long‐term results than supervised exercise in a swimming pool plus home exercise.

Eight trials, involving 567 older participants, evaluated surgical intervention for displaced fractures. There was high quality evidence of no clinically important difference in patient‐reported shoulder and upper‐limb function at one‐ or two‐year follow‐up between surgical (primarily locking plate fixation or hemiarthroplasty) and non‐surgical treatment (sling immobilisation) for the majority of displaced proximal humeral fractures; and moderate quality evidence of no clinically important difference between the two groups in quality of life at two years (and at interim follow‐ups at six and 12 months). There was moderate quality evidence of little difference in mortality between the surgery and non‐surgical groups (17/248 versus 12/248; risk ratio (RR) 1.40 favouring non‐surgical treatment, 95% confidence interval (CI) 0.69 to 2.83; P = 0.35; 6 trials); only one death was explicitly linked with the treatment. There was moderate quality evidence of a higher risk of additional surgery in the surgery group (34/262 versus 16/261; RR 2.06, 95% CI 1.18 to 3.60; P = 0.01; 7 trials). Although there was moderate quality evidence of a higher risk of adverse events after surgery, the 95% confidence intervals for adverse events also included the potential for a greater risk of adverse events after non‐surgical treatment.
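To show how a risk ratio and its confidence interval relate to the raw counts quoted above, the following is a minimal sketch that computes a crude (unpooled) risk ratio with a Wald interval on the log scale. This is an assumption made for illustration and is not the review's method: the reported RR of 2.06 is a pooled estimate across seven trials, so the crude figure from the summed counts should be close to it but need not match. The function name is hypothetical.

```python
import math


def crude_risk_ratio(events_a, total_a, events_b, total_b, z=1.96):
    """Crude risk ratio of group A versus group B with a Wald CI on the log scale.

    Illustrative only: treats the summed counts as if they came from a single
    two-by-two table, which ignores the trial-level pooling used in a meta-analysis.
    """
    rr = (events_a / total_a) / (events_b / total_b)
    # Standard error of log(RR) for two independent binomial proportions
    se_log_rr = math.sqrt(1 / events_a - 1 / total_a + 1 / events_b - 1 / total_b)
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


# Additional surgery: 34/262 after surgery versus 16/261 after non-surgical treatment
rr, lower, upper = crude_risk_ratio(34, 262, 16, 261)
print(f"crude RR {rr:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

Because the lower limit of the interval stays above 1, the crude calculation points in the same direction as the pooled result: a higher risk of additional surgery in the surgery group.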

Different methods of surgical management were tested in 12 trials. One trial (57 participants) comparing two types of locking plate versus a locking nail for treating two‐part surgical neck fractures found some evidence of slightly better function after plate fixation but also of a higher rate of surgically‐related complications. One trial (61 participants) comparing a locking plate versus minimally invasive fixation with distally inserted intramedullary K‐wires found little difference between the two implants at two years. Compared with hemiarthroplasty, one trial (32 participants) found similar results with locking plate fixation in function and re‐operation rates, whereas another trial (30 participants) reported all five re‐operations occurred in the tension‐band fixation group. One trial (62 participants) found better patient‐rated (Quick DASH) and composite shoulder function scores at a minimum of two years follow‐up and a lower incidence of re‐operation and complications after reverse shoulder arthroplasty (RSA) compared with hemiarthroplasty.

No important between‐group differences were found in one trial (120 participants) comparing the deltoid‐split approach versus deltopectoral approach for non‐contact bridging plate fixation, and two trials (180 participants) comparing 'polyaxial' and 'monaxial' screws in locking plate fixation. One trial (68 participants) produced some preliminary evidence that tended to support the use of medial support locking screws in locking plate fixation. One trial (54 participants) found fewer adverse events, including re‐operations, for the newer of two types of intramedullary nail. One trial (35 participants) found better functional results for one of two types of hemiarthroplasty. One trial (45 participants) found no important effects of tenodesis of the long head of the biceps for people undergoing hemiarthroplasty.

Very limited evidence suggested similar outcomes from early versus later mobilisation after either surgical fixation (one trial: 64 participants) or hemiarthroplasty (one trial: 49 participants).

Authors' conclusions

There is high or moderate quality evidence that, compared with non‐surgical treatment, surgery does not result in a better outcome at one and two years after injury for people with displaced proximal humeral fractures involving the humeral neck and is likely to result in a greater need for subsequent surgery. The evidence does not cover the treatment of two‐part tuberosity fractures, fractures in young people, high energy trauma, nor the less common fractures such as fracture dislocations and head splitting fractures.

There is insufficient evidence from RCTs to inform the choices between different non‐surgical, surgical, or rehabilitation interventions for these fractures.

Interventions for treating shoulder fractures in adults

Background

Fracture of the top end of the upper arm bone (proximal humerus) is a common injury in older people. It is often called a shoulder fracture. The bone typically fractures (breaks) just below the shoulder, usually after a fall. Most of these fractures occur without breaking the skin lying over the fracture. The injured arm is often supported in a sling until the fracture heals sufficiently to allow shoulder movement. More severe (displaced) fractures may be treated surgically. This may involve fixing the fracture fragments together by various means. Alternatively, the top of the fractured bone may be replaced (half 'shoulder' replacement: hemiarthroplasty). More rarely, the whole joint, thus including the joint socket, is replaced (total 'shoulder' replacement). Physiotherapy is often used to help restore function.

Results of the search

We searched medical databases up to November 2014 and included 31 randomised studies with a total of 1941 participants. Most of the 18 treatment comparisons were tested by one study only. The best evidence was from eight studies, one of which was a relatively large multicentre study; these investigated whether surgery gave a better result than non‐surgical treatment for displaced fractures.

Key results

Nine trials evaluated non‐surgical treatment in usually less severe fractures. One trial found a type of arm sling was generally more comfortable than a type of body bandage. There was some evidence that early mobilisation (within one week), compared with delayed mobilisation (after three weeks), resulted in less pain and faster recovery in people with 'stable' fractures. Two studies provided weak evidence that many patients could generally achieve a satisfactory outcome when given sufficient instruction to pursue exercises on their own.

Eight studies, involving 567 participants with displaced fractures, compared surgical versus non‐surgical treatment. Pooled results from the five most recent trials showed that there were no important differences between the two approaches for patient‐reported measures of function and quality of life at 6, 12 and 24 months. There was little difference between the two groups in mortality. Twice as many surgical group patients had additional or secondary surgery. More surgical group patients had adverse events.

Twelve trials (744 participants) tested different methods of surgical treatment. There was weak evidence of some differences (e.g. in complications) between some interventions (e.g. different devices or different ways of using devices).

There was very limited evidence suggesting similar outcomes for early versus delayed mobilisation after either surgical fixation or hemiarthroplasty.

Quality of the evidence

Most of the 31 studies had weaknesses that could affect the reliability of their results. We considered that the evidence was either of high or moderate quality for the results of the surgical versus non‐surgical treatment comparison, which means that we are pretty certain these results are reliable. We considered that the evidence for other comparisons was of low or very low quality, which means we are unsure of these results.

Conclusions

Surgery does not result in a better outcome for the majority of people with displaced proximal humeral fractures and is likely to result in a greater need for subsequent surgery. Otherwise, there is not enough evidence to determine the best non‐surgical or, when selected, surgical treatment for these fractures.

Authors' conclusions

Implications for practice

There is high or moderate quality evidence that, compared with non‐surgical treatment, surgery does not result in a better outcome at one and two years after injury for people with displaced proximal humeral fractures involving the humeral neck and is likely to result in a greater need for subsequent surgery. The evidence does not cover the treatment of two‐part tuberosity fractures, fractures in young people, high energy trauma, nor the less common fractures such as fracture dislocations and head splitting fractures.

There is insufficient evidence from randomised controlled trials to inform the choices between different non‐surgical interventions, different surgical interventions, or different rehabilitation interventions for these fractures.

Implications for research

The availability of high quality evidence primarily from a sufficiently powered multicentre randomised trial (ProFHER 2015) is the key reason why this review can now inform on the use of surgery for the majority of displaced fractures. There is a need for similar trials to help address other key treatment uncertainties. Decisions on priority topics should consider the coverage of the current evidence base as well as the topics covered by the ongoing trials. Of particular note is that three ongoing trials are already comparing reverse shoulder arthroplasty versus hemiarthroplasty.

Although the identification of priority topics requires input from others, including patients, we suggest that research should be focused primarily on optimising non‐surgical treatment. Where randomised trials are warranted, these should use standard and validated outcome measures, including patient‐reported measures of functional outcome and quality of life, and also assess resource implications. They should also meet the CONSORT criteria for design and reporting of non‐pharmacological studies (Boutron 2008) and subsequent developments including the adequate reporting of interventions (Hoffmann 2014).

This Cochrane review should be maintained and updated as further randomised controlled trials become available. The authors would be pleased to receive information about any other randomised controlled trials relating to the treatment of these fractures.

Summary of findings

Summary of findings for the main comparison. Summary of findings: surgical versus non‐surgical treatment for proximal humeral fractures

Surgical versus non‐surgical treatment for proximal humeral fractures

Patient or population: [mainly older] adults with most types of displaced proximal humeral fractures (footnote 1) (8 trials)

Settings: hospital (tertiary care)

Intervention: surgery, various: mainly open reduction and internal fixation (ORIF) with locking plate or hemiarthroplasty

Comparison: non‐surgical treatment, mainly sling 'immobilisation'; more rarely, closed reduction/manipulation of the fracture (2 trials)

Columns: Outcomes; Illustrative comparative risks* (95% CI), given as assumed risk (non‐surgical treatment) and corresponding risk (surgical treatment); Relative effect (95% CI); No of participants (studies); Quality of the evidence (GRADE); Comments

Outcome: Functional scores (footnote 2) (higher = better outcome). Follow‐up: 1 year
Corresponding risk: The mean difference in function (overall) in the surgery groups was 0.07 standard deviations higher (0.12 lower to 0.26 higher)
Relative effect (95% CI): SMD 0.07 (‐0.12 to 0.26)
No of participants (studies): 419 participants (5 studies)
Quality of the evidence (GRADE): ⊕⊕⊕⊕ high (footnote 3)
Comments: This does not represent a clinically important difference:
  • 0.2 represents a small difference, 0.5 a moderate difference and 0.8 a large difference. Thus, based on this 'rule of thumb', there is little difference between the two groups. At most, the extreme range of the 95% CI includes a minimal difference in favour of surgery at one year.
  • All of the best estimates of between‐group differences for the individual outcome scores (footnote 2) were much smaller than their associated MCIDs

Outcome: Functional scores (footnote 4) (higher = better outcome). Follow‐up: 2 years
Corresponding risk: The mean difference in function (overall) in the surgery groups was 0.07 standard deviations higher (0.14 lower to 0.28 higher)
Relative effect (95% CI): SMD 0.07 (‐0.14 to 0.28)
No of participants (studies): 351 participants (4 studies)
Quality of the evidence (GRADE): ⊕⊕⊕⊕ high (footnote 5)
Comments: This does not represent a clinically important difference:
  • 0.2 represents a small difference, 0.5 a moderate difference and 0.8 a large difference. Thus, based on this 'rule of thumb', there is little difference between the two groups. At most, the extreme range of the 95% CI includes a minimal difference in favour of surgery at two years.
  • All of the best estimates of between‐group differences for the individual outcome scores (footnote 4) were much smaller than their associated MCIDs

Outcome: Quality of life assessment: EuroQol (0: dead to 1: best health). Follow‐up: 2 years
Assumed risk: The mean EuroQol score ranged across control groups from 0.7 to 0.85
Corresponding risk: The mean EuroQol score in the surgery groups was 0.03 higher (0.01 lower to 0.08 higher)
No of participants (studies): 354 participants (4 studies)
Quality of the evidence (GRADE): ⊕⊕⊕⊝ moderate (footnote 6)
Comments: The MCID of 0.12 was outside the 95% CI at this time period and at 6 months (MD 0.04, 95% CI 0.01 to 0.08) and 12 months (MD 0.02, 95% CI ‐0.02 to 0.06)

Outcome: Quality of life: SF‐12 Physical Component Score (0 to 100: best). Follow‐up: 2 years
Assumed risk: The mean SF‐12 PCS was 44.1
Corresponding risk: The mean SF‐12 PCS in the surgery group was 1.10 higher (1.99 lower to 4.19 higher)
No of participants (studies): 210 participants (1 study)
Quality of the evidence (GRADE): ⊕⊕⊕⊝ moderate (footnote 7)
Comments: A similar lack of clinically important difference (footnote 8) was noted at 6 and 12 months. This measure may not be sensitive to recovery from this injury.

Outcome: Mortality. Follow‐up: up to 2 years
Assumed risk: 52 per 1000 (footnote 8)
Corresponding risk: 73 per 1000 (4 to 147)
Relative effect (95% CI): RR 1.40 (0.69 to 2.83)
No of participants (studies): 596 participants (6 studies)
Quality of the evidence (GRADE): ⊕⊕⊕⊝ moderate (footnote 9)
Comments: Surgery resulted in 21/1000 more deaths up to 2 years (95% CI 48 fewer to 95 more). Where reported, none of the deaths was related to their fracture or treatment with the exception of one early death due to venous thromboembolism in the surgical group of one trial

Outcome: Additional surgery (re‐operation or secondary surgery). Follow‐up: up to 2 years
Assumed risk: 40 per 1000 (footnote 9)
Corresponding risk: 83 per 1000 (47 to 144)
Relative effect (95% CI): RR 2.06 (1.18 to 3.60)
No of participants (studies): 523 participants (7 studies)
Quality of the evidence (GRADE): ⊕⊕⊕⊝ moderate (footnote 10)
Comments: Surgery resulted in 43/1000 more patients having additional surgery up to 2 years (95% CI 7 to 104 more). One trial (250 participants) also reported on additional shoulder‐related therapy (7/125 versus 4/125; RR 1.75 favouring non‐surgical therapy, 95% CI 0.53 to 5.83)

Outcome: Adverse events / complications ‐ Number of patients with complications. Follow‐up: 2 years
Assumed risk: 184 per 1000 (footnote 9)
Corresponding risk: 239 per 1000 (147 to 389)
Relative effect (95% CI): RR 1.30 (0.80 to 2.11)
No of participants (studies): 250 participants (1 study)
Quality of the evidence (GRADE): ⊕⊕⊕⊝ moderate (footnote 11)
Comments: Surgery resulted in 55/1000 more patients having adverse events up to 2 years (95% CI 37 fewer to 205 more). All 8 trials reported on individual complications, the pattern of distribution generally reflecting the expected: e.g. infection 8/279 cases after surgery versus 0/280 cases after non‐surgical treatment

*The basis for the assumed risk (e.g. the median control group risk across studies) is provided in footnotes. The corresponding risk (and its 95% confidence interval) is based on the assumed risk in the comparison group and the relative effect of the intervention (and its 95% CI).
CI: confidence interval; MCID: minimal clinically important differences; RR: risk ratio; SMD: standardised mean difference

GRADE Working Group grades of evidence
High quality: Further research is very unlikely to change our confidence in the estimate of effect.
Moderate quality: Further research is likely to have an important impact on our confidence in the estimate of effect and may change the estimate.
Low quality: Further research is very likely to have an important impact on our confidence in the estimate of effect and is likely to change the estimate.
Very low quality: We are very uncertain about the estimate.

1. The inclusion/exclusion criteria varied among the trials: one (30 participants) included 2‐, 3‐ or 4‐part fractures; one (60 participants) included only 3‐part fractures that included surgical neck; two (90 participants) included 3‐ or 4‐part fractures, three (137 participants) included only 4‐part fractures. The final trial (250 participants) included "displaced fracture of the proximal humerus that involved the surgical neck", resulting in a few 1‐part (but confirmed as still "displaced") as well as 2‐, 3‐ and 4‐part fractures. The majority of the fractures (146/250 = 58.4%) in the largest trial were either 2‐part (128) or 1‐part (18) fractures. Several trials included further criteria; for example, the largest trial explicitly excluded fracture dislocations (i.e. fractures with an associated dislocation of the injured shoulder joint). Consideration is also needed of other inclusion and exclusion criteria, including multiple trauma, clear indications for surgery (severe soft‐tissue compromise), and co‐morbidities precluding surgery or anaesthesia
2. Patient‐reported functional scores were the Disability of the Arm, Shoulder, and Hand questionnaire (DASH; 2 trials), the Oxford Shoulder Score (OSS; 1 trial); the American Shoulder and Elbow Surgeons (ASES; 1 trial) and Simple Shoulder Test (SST; 1 trial)
3. Although the evidence was first downgraded by one level for study limitations, reflecting a high risk of performance bias relating to lack of blinding in four single‐centre trials, the consistency in the results of these and the fifth and largest trial, where the analysis indicated that the study design limited the risk of bias relating to the inevitable lack of blinding, resulted in an upgrade
4. Patient‐reported functional scores were the Disability of the Arm, Shoulder, and Hand questionnaire (DASH; 2 trials), the Oxford Shoulder Score (OSS; 1 trial); and the American Shoulder and Elbow Surgeons (ASES; 1 trial)
5. The evidence was downgraded by one level for study limitations, reflecting a high risk of performance bias relating to lack of blinding in 3 single‐centre trials. There was, however, consistency in the results of these and the fourth and largest trial, where the analysis indicated that the study design limited the risk of bias relating to the inevitable lack of blinding, resulting in an upgrade
6. The evidence was downgraded by one level for inconsistency, reflecting the statistical heterogeneity (Chi² = 6.76, df = 3 (P = 0.08); I² = 56%), but also data from two trials (102 participants) from the same centre that found minimal clinically important differences favouring surgery
7. The evidence was downgraded one level for imprecision, reflecting that these data were from one trial alone
8. A minimal clinically important difference for the SF‐12 PCS was assumed to be 6.5. Notably, a similar finding applied for the between‐group differences in SF‐12 Mental Component Scores, but the direction of effect favoured non‐surgical treatment
9. Assumed risk is the median control group risk across studies
10. The evidence was downgraded one level for imprecision.
11. The evidence was downgraded one level for inconsistency (heterogeneity: Chi² = 8.50, df = 6 (P = 0.20); I² = 29%), which was greater for the two years follow‐up data (heterogeneity: Chi² = 7.29, df = 3 (P = 0.06); I² = 59%). At two years, three trials (160 participants) reported more additional surgery in the surgery group, but the trial (250 participants) contributing 65% of the weight of the evidence recorded equal numbers of participants (11 versus 11) undergoing additional surgery.
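The heterogeneity figures quoted in the footnotes above (Chi², df and I²) are linked by a simple formula. As a hedged illustration, the sketch below assumes the commonly used definition I² = (Q − df) / Q, floored at zero and expressed as a percentage, where Q is the Chi² statistic. The function name is hypothetical.

```python
def i_squared(chi2, df):
    """Percentage of variability attributed to heterogeneity rather than chance.

    Assumes the usual definition I² = max(0, (Q - df) / Q) * 100,
    with Q taken as the reported Chi² statistic.
    """
    if chi2 <= 0:
        return 0.0
    return max(0.0, (chi2 - df) / chi2) * 100


# Values as reported in the footnotes to the summary of findings table
for chi2, df in [(6.76, 3), (8.50, 6), (7.29, 3)]:
    print(f"Chi² = {chi2}, df = {df}: I² = {i_squared(chi2, df):.0f}%")
```

Under this definition the three reported pairs give values that round to 56%, 29% and 59%, in line with the figures in the footnotes.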

Summary of findings 2. Summary of findings: early versus delayed mobilisation for non‐surgically treated proximal humeral fractures

Early versus delayed mobilisation for non‐surgically treated proximal humeral fractures

Patient or population: adults with minimally displaced or displaced (2‐part or 3‐part) proximal humeral fractures (4 trials)
Settings: various, including fracture clinics and physiotherapy

Intervention: early (within or at one week) mobilisation

Comparison: delayed (usual) mobilisation or physiotherapy after three or four weeks immobilisation

Columns: Outcomes; Illustrative comparative risks* (95% CI), given as assumed risk (3 to 4 weeks immobilisation) and corresponding risk (early mobilisation, ≤ 1 week); Relative effect (95% CI); No of participants (studies); Quality of the evidence (GRADE); Comments

Outcome: Shoulder disability: Croft Shoulder Disability Score ‐ Disability (1 or more problems). Follow‐up: 1 year
Assumed risk: 725 per 1000 (footnote 1)
Corresponding risk: 428 per 1000 (290 to 638)
Relative effect (95% CI): RR 0.59 (0.40 to 0.88)
No of participants (studies): 82 participants (1 study)
Quality of the evidence (GRADE): ⊕⊕⊝⊝ low (footnote 2)
Comments: Early mobilisation resulted in 297/1000 fewer people with one or more problems at 1 year (95% CI 87 fewer to 435 fewer) (footnote 3)

Outcome: Number of treatment sessions (until independent function achieved). Follow‐up: as described
Assumed risk: The mean number of sessions was 14 in the usual timing group (footnote 4)
Corresponding risk: The mean number of sessions in the early group was 5.0 lower (1.75 to 8.25 sessions lower)
No of participants (studies): 86 participants (1 study)
Quality of the evidence (GRADE): ⊕⊕⊝⊝ low (footnote 5)
Comments: This pertains to early recovery to a level that may vary with individual patients.

Outcome: SF‐36 scores: pain & physical dimensions (all 3 dimensions 0‐100: higher scores mean better quality of life). Follow‐up: 16 weeks
Assumed risk: The mean values for 3 dimensions in the delayed group (footnote 4) were: Physical functioning 69.2; Role limitation physical 39.7; Pain 59.9
Corresponding risk: The mean values in the early group were: Physical functioning 0.70 higher (9.91 lower to 11.31 higher); Role limitation physical 22.2 higher (3.82 to 40.58 higher); Pain 12.10 higher (3.26 to 20.94 higher)
No of participants (studies): 81 participants (1 study)
Quality of the evidence (GRADE): ⊕⊕⊝⊝ low (footnote 6)
Comments: An overall score was not available. General physical functioning was high and comparable in the two groups. It is likely that the results for role limitation physical and pain are clinically important. This is consistent with the earlier recovery in independent function judged by treating physiotherapists (see above)

Outcome: SF‐36 scores: pain & physical dimensions (all 3 dimensions 0‐100: higher scores mean better quality of life). Follow‐up: 1 year
Assumed risk: The mean values for 3 dimensions in the delayed group (footnote 4) were: Physical functioning 68.4; Role limitation physical 54.4; Pain 65.6
Corresponding risk: The mean values in the early group were: Physical functioning 3.00 lower (16.48 lower to 10.48 higher); Role limitation physical 5.60 higher (13.75 lower to 24.95 higher); Pain 3.60 higher (8.19 lower to 15.39 higher)
No of participants (studies): 80 participants (1 study)
Quality of the evidence (GRADE): ⊕⊕⊝⊝ low (footnote 6)
Comments: An overall score was not available. Results for all three dimensions are comparable in the two groups. None of the best estimates are likely to equate to clinically important differences.

Outcome: Quality of life assessment: EuroQol 5D (0: dead to 1: best health). Follow‐up: 1 year
Assumed risk: The mean EuroQol 5D score in the early group was 0.76 (footnote 4)
Corresponding risk: The mean EuroQol 5D score in the delayed group was 0.09 lower (0.21 lower to 0.03 higher)
No of participants (studies): 39 participants (1 study) (footnote 7)
Quality of the evidence (GRADE): ⊕⊝⊝⊝ very low (footnote 8)
Comments: Similar results of little between‐group differences of no clinical importance applied at 3 and 6 months.

Outcome: Adverse events: Shoulder complications. Follow‐up: 1 year
Assumed risk: 26 per 1000 (footnote 9)
Corresponding risk: 19 per 1000 (4 to 95)
Relative effect (95% CI): RR 0.73 (0.15 to 3.63)
No of participants (studies): 259 participants (4 studies)
Quality of the evidence (GRADE): ⊕⊝⊝⊝ very low (footnote 10)
Comments: Reported shoulder complications were frozen shoulder (1 case), complex regional pain syndrome type 1 (2 cases) and treated subacromial impingement (2 cases). Early mobilisation resulted in 7/1000 fewer people with a shoulder complication at 1 year (95% CI 22 fewer to 69 more)

Outcome: Adverse events: Fracture displacement and non‐union. Follow‐up: 1 year
Assumed risk: 23 per 1000 (footnote 9)
Corresponding risk: 51 per 1000 (5 to 517)
Relative effect (95% CI): RR 2.20 (0.22 to 22.45)
No of participants (studies): 106 participants (2 studies)
Quality of the evidence (GRADE): ⊕⊝⊝⊝ very low (footnote 10)
Comments: There were no cases of non‐union. All three fracture displacements (none of which required surgery) occurred in one trial that included displaced fractures. Early mobilisation resulted in 28/1000 more people with a fracture displacement at 1 year (95% CI 18 fewer to 494 more)

*The basis for the assumed risk (e.g. the median control group risk across studies) is provided in footnotes. The corresponding risk (and its 95% confidence interval) is based on the assumed risk in the comparison group and the relative effect of the intervention (and its 95% CI).
CI: Confidence interval; RR: Risk Ratio

GRADE Working Group grades of evidence
High quality: Further research is very unlikely to change our confidence in the estimate of effect.
Moderate quality: Further research is likely to have an important impact on our confidence in the estimate of effect and may change the estimate.
Low quality: Further research is very likely to have an important impact on our confidence in the estimate of effect and is likely to change the estimate.
Very low quality: We are very uncertain about the estimate.

1. Control risk based on study data
2. Evidence downgraded one level for imprecision (single small trial) and one level for indirectness (question over outcome measure's validity; the importance of individual problems will vary)
3. Two‐year follow‐up data from the same trial (74 participants) showed that based on a control risk of 595 per 1000 in the delayed group, early mobilisation resulted in 160/1000 fewer people with one or more problems at two years (95% CI 321 fewer to 90 more); very low quality evidence (see above footnote)
4. Data from control group of study
5. Evidence downgraded one level for imprecision (single small trial data) and one level for indirectness ('independent function' and physiotherapy discharge depicts an intermediate outcome)
6. Evidence downgraded one level for study limitations (several domains at unclear risk of bias) and one level for imprecision (single small trial data)
7. Evidence from a trial comparing 1 versus 4 weeks immobilisation for predominantly displaced fractures
8. Evidence downgraded one level for study limitations (study at high risk of bias) and two levels for imprecision (wide confidence intervals; single small trial data)
9. The assumed risk is the median control group risk across studies
10. Evidence downgraded one level for study limitations and two levels for imprecision (sparse data and wide confidence intervals)
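The table note states that the corresponding risk is based on the assumed risk and the relative effect. A minimal sketch of that arithmetic is given below, using the Croft Shoulder Disability Score row (assumed risk 725 per 1000, RR 0.59, 95% CI 0.40 to 0.88). The function names are hypothetical, and the sketch assumes simple multiplication of rounded inputs; the software used to build the table may work from unrounded values, so other rows can differ by one.

```python
def corresponding_risk(assumed_per_1000, rr, rr_lower, rr_upper):
    """Corresponding risk per 1000 (point estimate, lower, upper).

    Assumes corresponding risk = assumed risk x risk ratio, rounded to
    whole people per 1000.
    """
    return tuple(round(assumed_per_1000 * r) for r in (rr, rr_lower, rr_upper))


def absolute_difference(assumed_per_1000, risks):
    """Absolute difference per 1000 (negative values mean fewer people affected)."""
    return tuple(risk - assumed_per_1000 for risk in risks)


risks = corresponding_risk(725, 0.59, 0.40, 0.88)
diffs = absolute_difference(725, risks)
print("corresponding risk per 1000:", risks)
print("difference per 1000:", diffs)
```

For this row the sketch gives 428 per 1000 (290 to 638), that is 297 fewer per 1000 (87 fewer to 435 fewer), which agrees with the table entry.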

Background

Description of the condition

Proximal humeral fractures account for approximately six per cent of all adult fractures (Court‐Brown 2006). Their incidence rapidly increases with age, and women are affected between two and three times as often as men (Court‐Brown 2006; Lind 1989). Many patients who sustain a proximal humeral fracture are old and their bones are osteoporotic. Court‐Brown 2001 found that 87% of these fractures in adults resulted from falls from standing height. Palvanen 2006 found that the incidence of osteoporotic‐related fractures of the proximal humerus in Finland had tripled between 1970 and 2002 to 105 per 100,000 people aged 60 or above. An epidemiological study of upper‐limb fractures occurring in 2009 in the USA reported an incidence of 60 proximal humeral fractures per 100,000 people overall, with four‐fold increased incidence of 253 per 100,000 in those aged 65 or older (Karl 2015).

Most proximal humeral fractures are closed fractures in that the overlying skin remains intact. The most commonly used classification of shoulder fractures is that of Neer (Neer 1970). Neer considered four anatomical segments of the proximal humerus ‐ the articular part, the greater tuberosity, the lesser tuberosity and the humeral shaft. These may be affected by fracture lines but are only considered as a 'part' if displaced by more than one centimetre or 45 degrees angulation from each other. Fractures, regardless of the number of fracture lines present, which did not meet the criteria for displacement of any one segment with respect to the others were considered 'minimally displaced'; these are sometimes referred to as one‐part fractures. Neer's other categories, two‐part, three‐part and four‐part fractures all involved the displacement or angulation of some or all of the above four segments. Each of these fracture types may be potentially associated with an anterior or posterior humeral head dislocation.
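Purely as an illustration of the displacement rule described above, the thresholds can be written as a small check. The function names and the input format (one displacement and angulation pair per segment) are assumptions made for this sketch; it does not attempt to assign two‐, three‐ or four‐part categories and is not a clinical tool.

```python
# Thresholds as described in the text: a segment counts as a 'part' only if
# displaced by more than one centimetre or angulated by more than 45 degrees.
DISPLACEMENT_THRESHOLD_CM = 1.0
ANGULATION_THRESHOLD_DEG = 45.0


def is_displaced_part(displacement_cm, angulation_deg):
    """True if a segment meets the displacement criterion for a 'part'."""
    return (displacement_cm > DISPLACEMENT_THRESHOLD_CM
            or angulation_deg > ANGULATION_THRESHOLD_DEG)


def is_minimally_displaced(segments):
    """True if no segment meets the criterion, whatever the number of fracture lines.

    `segments` is a hypothetical list of (displacement_cm, angulation_deg) pairs.
    """
    return not any(is_displaced_part(d, a) for d, a in segments)


# Hypothetical measurements for illustration only
print(is_minimally_displaced([(0.4, 20.0), (0.8, 30.0)]))  # no segment over threshold
print(is_minimally_displaced([(0.4, 20.0), (1.5, 10.0)]))  # one segment over 1 cm
```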

At initial presentation, it may be difficult to delineate the exact pathoanatomical pattern of the fracture even with sophisticated imaging. In any event, this may not correlate with the extent to which the vascularity (blood supply) of the humeral head is compromised. The vascularity of the proximal humerus is a primary focus of another widely used classification system for these fractures, the AO classification system (Muller 1991), which was updated in conjunction with the OTA classification in 2007 (Marsh 2007). There are three main types (A, B, C), which in turn are further divided into three groups, each with a further three subgroups. Type A fractures are "extra‐articular, unifocal, with intact vascular supply"; type B fractures are "extra‐articular, bifocal, with possible vascular compromise"; and type C fractures are "articular, with a high likelihood of vascular compromise" (Robinson 2008).

Many proximal humeral fractures are only minimally displaced. Neer's estimate (Neer 1970) that approximately 85% of all proximal humeral fractures are minimally displaced, in that no bone fragment is displaced by more than one centimetre or angulated by more than 45 degrees, is often cited (Koval 1997). However, a lower figure of 49% was reported in a prospective consecutive series of over 1000 proximal humeral fractures (Court‐Brown 2001).

Description of the intervention

Non‐surgical (conservative) treatment is generally the accepted treatment option for minimally displaced fractures, and often used also for people with displaced fractures. Non‐surgical treatment usually involves a period of immobilisation, such as in an arm sling, followed by physiotherapy and exercises. Non‐surgical treatment can include closed reduction, where the displaced bone fragments are reduced using various manoeuvres while the arm is under traction. Various aspects of non‐surgical treatment, such as the arm sling and collar and cuff, are illustrated online (AO 2015). Older types of bandages, such as the Desault and Velpeau, are illustrated in Brorson 2011a.

Surgery is usually reserved for displaced and unstable fractures and those with more complicated fracture patterns. Surgical interventions include:

  • closed reduction and percutaneous stabilisation using pins or wires;

  • external fixation;

  • open reduction and plating, for example buttress plates, angle blade plates and proximal humeral locking plates;

  • open reduction and fixation using a tension‐band principle;

  • intramedullary nailing, either antegrade or retrograde insertion (intramedullary nails usually offer the option of locking screws, which are inserted into fracture fragments then traverse the nail, providing additional fracture stability);

  • hemiarthroplasty (replacement of the humeral head);

  • total shoulder replacement (replacement of the entire joint; both the 'ball' (humeral head) and 'socket' (glenoid)). There are two distinct types: anatomical and reverse shoulder arthroplasty. In reverse arthroplasty the joint polarity is reversed such that the ball is on the glenoid side and the socket (fixed on a stem) is on the humeral side.

Post‐operative treatment generally involves a period of immobilisation followed by physiotherapy and exercises.

How the intervention might work

Immobilisation of the injured limb provides support and pain relief during healing. However, there is a risk of the shoulder becoming stiff and painful with substantial reduction of function. Subsequent physiotherapy and exercises aim to restore function and mobility of the injured (or operated) arm. Malunion of proximal humeral fractures may result in impingement or compromised function of the 'rotator cuff' of muscles and tendons that surrounds the shoulder joint. Persistent pain and painful pseudoparalysis are common indications for late surgery.

After reduction or repositioning of the fractured parts, surgical fixation using various techniques aims to stabilise the reduced fracture and restore joint integrity. Surgical stabilisation of the fracture may also allow earlier movement of the shoulder and elbow, preventing stiffness. Surgeons have often followed Neer's premise (Neer 1975) that head avascular necrosis is virtually guaranteed in a four‐part fracture and have usually offered these patients a hemiarthroplasty, where the humeral head is replaced by an artificial part. An exception is often made for a specific type of four‐part fracture, the valgus impacted four‐part fracture, which was not mentioned initially in Neer's classification. This fracture, where the fractured parts are compressed towards each other, is less likely to lead to avascular necrosis of the humeral head, provided the lateral displacement of the head fragment is not excessive (Jakob 1991; Resch 1997). Bone quality also influences the appropriateness of any intervention and hence the long term clinical outcome. Furthermore, the patient's frailty may lead to a low rehabilitation drive and delay any recovery from both the initial trauma and any subsequent management.

Why it is important to do this review

Proximal humeral fractures are increasing in incidence, particularly in older people, and the short and long term consequences for individuals with these injuries and for society are substantial (Palvanen 2006). There is considerable variation in practice, both in terms of definitive treatment such as surgical treatment for displaced fractures (Guy 2010) and rehabilitation (Hodgson 2006). Variation in practice extends to the uptake of new implants, typically before their effectiveness has been evaluated, as illustrated for reverse shoulder arthroplasty in the USA (Schairer 2015). The costs of treating these fractures are also substantial and growing. The direct health‐care costs, adjusted to 2007 prices, in the Netherlands of upper arm fractures, the majority of which were proximal humeral fractures, were EUR 4,440 per case with an overall annual cost of approximately EUR 40M (Polinder 2013). Polinder 2013 suggested that the increase in the cost of fracture care in 'elderly women' from a previous report of costs in the Netherlands was partly because of a higher incidence of surgery. This trend to increased surgery also applies in other countries such as the USA (Bell 2011). A recently published report by the same team in the Netherlands estimated that the medical costs per case in 2012, including hospitalisation, rehabilitation and nursing care, and, primarily in patients aged over 80 years, home care costs, were EUR 11,224 (Mahabier 2015). The estimated costs for lost productivity, including time off work and other costs for those in work, were EUR 20,374 per case in 2012. Schairer 2015 found the estimated mean hospital costs in 2011 in the USA were significantly higher for reverse shoulder arthroplasty than for hemiarthroplasty (USD 21,723 versus USD 18,122), a difference which was almost three times greater when it came to mean hospital charges (USD 75,849 versus USD 65,477).
The often poor treatment outcome and the increasing incidence of these fractures, the increasing use of surgery and of reverse shoulder arthroplasty (Han 2015), high treatment costs and variations in practice all endorse the need for this review update.

The last two versions of this review noted the insufficiency of the evidence to inform practice, but also located ongoing trials that could potentially help to address this deficiency (Handoll 2007; Handoll 2010). This update continues the systematic review of the evidence for managing these fractures.

Objectives

To assess the effects (benefits and harms) of treatment and rehabilitation interventions for proximal humeral fractures in adults.

We defined a priori the following broad objectives:

  • to compare different methods of non‐surgical treatment (including rehabilitation);

  • to compare surgical versus non‐surgical treatment;

  • to compare different methods of surgical treatment;

  • to compare different methods of rehabilitation after surgical treatment.

We planned to study the outcomes in different age groups (initially, under versus over 65 years) and for different types of proximal humeral fractures.

Methods

Criteria for considering studies for this review

Types of studies

We included randomised or quasi‐randomised (method of allocating participants to a treatment that is not strictly random; e.g. by hospital record number) trials which compared two or more interventions in the management of fractures of the proximal humerus in adults.

Types of participants

We included adults with a fracture of the proximal humerus. Stratification was planned by fracture type (e.g. based on the Neer classification (Neer 1970) or the AO classification (Muller 1991)) and by age (under versus over 65 years) if possible. Trials including children were included provided either separate data for skeletally mature participants were available or the proportion of children was small and, preferably, balanced in intervention groups.

Types of interventions

We included non‐surgical and surgical interventions, as exemplified in Description of the intervention, used in the treatment and rehabilitation of fractures of the proximal humerus. Pharmacological trials were excluded.

Types of outcome measures

The primary focus is on long term functional outcome, preferably measured at one year or more.

Primary outcomes

  • Functional outcomes: patient‐reported measures of upper‐limb function (e.g. the Disability of the Arm, Shoulder, and Hand questionnaire (DASH), the Oxford Shoulder Score (OSS; Dawson 1996; Dawson 2009), and other validated shoulder rating scales).

  • Activities of daily living and health‐related quality‐of‐life scores (e.g. EuroQol (EQ‐5D); Short‐Form 36 (SF‐36) and Short‐Form 12 (SF‐12; Ware 1996)).

  • Serious adverse events (e.g. death, deep infection, avascular necrosis, complex regional pain syndrome type 1) and need for substantive treatment, such as an operation.

Secondary outcomes

  • Composite scores of subjectively and objectively rated function and overall outcome (e.g. Constant and Murley's score (Constant 1987); Neer's rating (Neer 1970)).

  • Pain.

  • Upper limb strength and range of movement.

  • Less serious complications/adverse events of limited duration and impact (e.g. superficial infection, transient paraesthesia, skin irritation).

  • Patient satisfaction with treatment, including cosmetic outcomes.

  • Anatomical outcomes (e.g. radiological deformity).

Economic outcomes: each trial report was reviewed for cost and resource data, such as length of hospital stay and number of outpatient attendances, that would enable economic evaluation.

We based our judgement of clinically important between‐group mean differences in measures of pain and function on the following minimal clinically important differences (MCID); alternative sources are listed after the main selected item in bold.

  • ASES (0 to 100: best outcome) (rotator cuff disease): 12.01 (function‐based) (Tashjian 2010).

  • Constant score (0 to 100: best outcome) (proximal humerus fracture): 11.6 (anchor‐based), 5.1 (distribution‐based) (Van de Water 2014); (upper limb proximal diagnosis): MCID 10.2 (Schmitt 2004)

  • DASH (0 to 100: worst outcome) (proximal humerus fracture): 13.0 (anchor‐based), 8.1 (distribution‐based) (Van de Water 2014); 15 recommended in DASH/QuickDASH

  • EQ‐5D (0 to 1: best outcome) (proximal humerus fracture): 0.12 (assessed in relation to a DASH MCID of 10) (Olerud 2011c)

  • OSS (0 to 48: best outcome) (proximal humerus fracture): 11.4 (anchor‐based), 5.1 (distribution‐based) (Van de Water 2014)

  • QuickDASH (0 to 100: worst outcome): 16 in DASH/QuickDASH; 8 (shoulder pain) (Mintken 2009)

  • SF‐12‐PCS (0 to 100: best outcome) (physical component score) (upper limb proximal diagnosis): MCID 6.5 (Schmitt 2004)

  • SST (0 to 12: best outcome) (rotator cuff disease): 2.05 (Tashjian 2010)

  • UCLA (2 to 35: best outcome) (proximal humerus fracture): 2.4 (anchor‐based), 2.0 (distribution‐based) (Van de Water 2014)
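The use of these thresholds can be sketched in a few lines of Python. This is an illustrative sketch only, not the review authors' procedure: the dictionary takes the first value given for each scale in the list above as the main selected MCID (an assumption, since the bold formatting is not recoverable here), and the function name `is_clinically_important` is hypothetical.

```python
# MCID per scale, taken as the first value listed for each scale above.
# Assumption: the first value is the "main selected item".
MCID = {
    "ASES": 12.01,
    "Constant": 11.6,
    "DASH": 13.0,
    "EQ-5D": 0.12,
    "OSS": 11.4,
    "QuickDASH": 16.0,
    "SF-12-PCS": 6.5,
    "SST": 2.05,
    "UCLA": 2.4,
}


def is_clinically_important(scale, mean_difference):
    """Return True if the absolute between-group mean difference
    reaches the MCID recorded for the given scale."""
    return abs(mean_difference) >= MCID[scale]


# Hypothetical between-group differences, for illustration only.
print(is_clinically_important("DASH", -14.0))     # magnitude 14.0 vs MCID 13.0
print(is_clinically_important("Constant", 5.0))   # magnitude 5.0 vs MCID 11.6
```

The sign of the difference is ignored because the direction of benefit differs between scales (for example, higher is worse for DASH but better for the Constant score).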

Search methods for identification of studies

Electronic searches

We searched the Cochrane Bone, Joint and Muscle Trauma Group Specialised Register (10 November 2014), the Cochrane Central Register of Controlled Trials (CENTRAL) (The Cochrane Library 2014, Issue 10), MEDLINE (1966 to October Week 5 2014), MEDLINE In‐Process & Other Non‐Indexed Citations (7 November 2014), EMBASE (1988 to 2014 Week 45), CINAHL (Cumulative Index to Nursing and Allied Health Literature) (10 November 2014), AMED (Allied and Complementary Medicine) (1985 to 10 November 2014), and PEDro ‐ Physiotherapy Evidence Database (10 November 2014).

In MEDLINE, we combined subject‐specific terms with the sensitivity‐maximizing version of the Cochrane Highly Sensitive Search Strategy for identifying randomised trials (Lefebvre 2011) (Appendix 1). Search strategies for CENTRAL, EMBASE, CINAHL, AMED and PEDro can also be found in Appendix 1. For this update, the search results were limited from January 2012 onwards. Details of the search strategies used for previous versions of the review are given in Handoll 2007, Handoll 2010 and Handoll 2012. We applied no language or publication restrictions.

We searched the WHO International Clinical Trials Registry Platform Search Portal, the ISRCTN registry, and ClinicalTrials.gov to identify ongoing and recently completed trials (10 November 2014) (see Appendix 1).

Searching other resources

We searched the reference lists of articles. We also included the findings from handsearches of the British Volume of the Journal of Bone and Joint Surgery supplements (1996 to 2006) and electronic searches of The Bone and Joint Journal Orthopaedic Proceedings (10 November 2014) (see Appendix 1). We searched abstracts of the British Elbow and Shoulder Society annual meetings (2001 to 2013), the American Orthopaedic Trauma Association annual meetings (1996 to 2014), the American Academy of Orthopaedic Surgeons annual meetings (2005, 2006, 2014), and the British Trauma Society annual scientific meetings (2012 and 2014). Prior to this update, we handsearched various orthopaedic proceedings and screened weekly downloads from AMEDEO (to 2007), the details of which can be found in Handoll 2012.

Data collection and analysis

Selection of studies

For this update, both review authors independently screened search results and assessed potentially eligible studies for inclusion. The initial decisions of trial eligibility were based on citations and, where available, abstracts and indexing terms. We obtained full articles and, where necessary to ascertain trial methods and status, one author (HH) sent requests for information to trial investigators. Study inclusion was by consensus. Titles of journals, names of authors or supporting institutions were not masked at any stage. Both authors performed independent study selection on the trials for which the other author was an investigator.

Data extraction and management

Both review authors independently completed a data extraction tool, which had been used in the previous version of the review, for each newly included trial. We recorded details of the study methods, participants, interventions and outcome assessment and results. Any differences that were clearly not transcription errors were discussed between review authors. Data management and entry into Review Manager (RevMan 2014) was mainly by one author (HH) with checks made by both review authors. When necessary, additional details of trial methodology or data, or both were requested from trialists.

Assessment of risk of bias in included studies

Both review authors independently assessed risk of bias for newly included trials, without masking of the source and authorship of the trial reports. HH checked between‐rater and between‐versions consistency in assessment at data entry. All inter‐rater differences were resolved by discussion. We used the tool outlined in the Cochrane Handbook for Systematic Reviews of Interventions (Higgins 2008a). This tool incorporates assessment of randomisation (sequence generation and allocation concealment), blinding (of participants, treatment providers and outcome assessment), completeness of outcome data, selection of outcomes reported and other sources of bias. We considered subjective and functional outcomes (e.g. functional outcomes, pain, clinical outcomes, complications) and 'hard' outcomes (death, reoperation) separately in our assessment of blinding and completeness of outcome data. We assessed two additional sources of bias: bias resulting from major imbalances in key baseline characteristics (e.g. age, gender, type of fracture); and performance bias such as resulting from lack of comparability in the experience of care providers.

Additionally, we assessed four other aspects of trial quality and reporting that would help us judge the applicability of the trial findings. The four aspects were: definition of the study population; description of the interventions; definition of primary outcome measures; and length of follow‐up.

Measures of treatment effect

For each trial, risk ratios (RR) and 95% confidence intervals (CIs) were calculated for dichotomous outcomes, and mean differences (MD) and 95% CIs were calculated for continuous outcomes. Standardised mean differences (SMD) rather than mean differences were used when pooling data from continuous outcome measures based on different scoring schemes.
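As a rough illustration of these effect measures, the following Python sketch computes a risk ratio and a mean difference with large‐sample 95% CIs. It assumes the standard log‐scale standard error for the risk ratio and a normal approximation for both intervals; the exact formulae used by Review Manager are not stated in this section, and the function names and example numbers are hypothetical.

```python
import math


def risk_ratio(events_a, total_a, events_b, total_b, z=1.96):
    """Risk ratio of group A versus group B with a CI computed on the log scale.
    No continuity correction is applied, so both event counts must be non-zero."""
    rr = (events_a / total_a) / (events_b / total_b)
    se_log_rr = math.sqrt(1 / events_a - 1 / total_a + 1 / events_b - 1 / total_b)
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


def mean_difference(mean_a, sd_a, n_a, mean_b, sd_b, n_b, z=1.96):
    """Mean difference (A minus B) with a normal-approximation CI."""
    md = mean_a - mean_b
    se = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    return md, md - z * se, md + z * se


# Hypothetical trial: 10/100 events versus 20/100 events.
rr, rr_low, rr_high = risk_ratio(10, 100, 20, 100)
print(f"RR {rr:.2f} (95% CI {rr_low:.2f} to {rr_high:.2f})")

# Hypothetical continuous outcome: mean 60 (SD 10, n 50) versus mean 55 (SD 10, n 50).
md, md_low, md_high = mean_difference(60, 10, 50, 55, 10, 50)
print(f"MD {md:.1f} (95% CI {md_low:.2f} to {md_high:.2f})")
```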

Unit of analysis issues

We remained aware of potential unit of analysis issues arising from inclusion of participants with bilateral fractures, and presentation of outcomes, such as total complications, by the number of outcomes rather than participants with these outcomes. There was just one patient with bilateral fractures (Kristiansen 1988) but there was insufficient information to quantify the small difference this would have made to study findings. We avoided the second described unit of analysis problem, mainly by reporting on incidences of individual complications.

Dealing with missing data

We contacted trialists for missing information, including for denominators and standard deviations. We performed intention‐to‐treat analyses where possible. Where there were missing standard deviations, we calculated these from other data (standard errors, 95% CIs, exact P values) where available. We did not impute missing standard deviations.
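The conversions mentioned here (from standard errors, 95% CIs and exact P values to standard deviations) can be sketched as below. This is a minimal sketch that assumes a normal approximation throughout; for small trials a t‐distribution would usually be more appropriate, and the function names are hypothetical rather than taken from the review.

```python
import math
from statistics import NormalDist


def sd_from_se(se, n):
    """Standard deviation of a group from the standard error of its mean."""
    return se * math.sqrt(n)


def sd_from_ci(lower, upper, n, level=0.95):
    """Standard deviation of a group from the CI of its mean."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    se = (upper - lower) / (2 * z)
    return se * math.sqrt(n)


def pooled_sd_from_p(mean_diff, p_value, n_a, n_b):
    """Common within-group SD recovered from a between-group mean difference
    and its exact two-sided P value (assumes equal SDs in both groups)."""
    z = NormalDist().inv_cdf(1 - p_value / 2)
    se_diff = abs(mean_diff) / z
    return se_diff / math.sqrt(1 / n_a + 1 / n_b)


# Hypothetical inputs, for illustration only.
print(sd_from_se(2.0, 25))
print(round(sd_from_ci(46.08, 53.92, 25), 2))
print(round(pooled_sd_from_p(5.0, 0.05, 50, 50), 2))
```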

Assessment of heterogeneity

We assessed heterogeneity for pooled data from comparable trials by visual inspection of the analyses along with consideration of the chi² test for heterogeneity and the I² statistic (Higgins 2003). The main quantitative assessment of heterogeneity was based on the I² statistic where the following interpretation from the Cochrane Handbook for Systematic Reviews of Interventions was used: 0% to 40% might not be important; 30% to 60% may represent moderate heterogeneity; 50% to 90% may represent substantial heterogeneity; and 75% to 100% considerable heterogeneity (Deeks 2011).
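For readers unfamiliar with the I² statistic, the sketch below derives it from Cochran's Q using inverse‐variance weights, I² = max(0, (Q − df)/Q) × 100, and then reports which of the overlapping interpretation bands quoted above apply. The function names and the two‐trial example are hypothetical.

```python
def cochran_q_and_i2(effects, variances):
    """Cochran's Q and I-squared (%) for per-trial effects and their variances."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))
    df = len(effects) - 1
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return q, i2


def handbook_bands(i2):
    """All interpretation bands that contain the given I-squared value.
    The bands overlap by design, so more than one label may be returned."""
    bands = [
        (0, 40, "might not be important"),
        (30, 60, "moderate"),
        (50, 90, "substantial"),
        (75, 100, "considerable"),
    ]
    return [label for low, high, label in bands if low <= i2 <= high]


# Two hypothetical trials with effects 0.0 and 1.0, each with variance 0.25.
q, i2 = cochran_q_and_i2([0.0, 1.0], [0.25, 0.25])
print(q, i2, handbook_bands(i2))
```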

Assessment of reporting biases

There are insufficient data thus far (a minimum of 10 trials is required) to merit the production of funnel plots to explore publication bias. The search for trials via conference proceedings and trial registration, together with the contacting of authors for information on trial status and progress, has provided some insights into unpublished trials, which generally were abandoned because of poor recruitment.

Data synthesis

Where the data allowed, the results of comparable groups of trials were pooled using both fixed‐effect and random‐effects models; the selection of the model for presentation was determined by the consideration of the extent of the heterogeneity.
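The difference between the two models can be illustrated with a generic inverse‐variance sketch. It assumes the DerSimonian and Laird estimate of between‐trial variance for the random‐effects model; this section does not state which estimator or weighting method was used in the review, and the function name `pool` is hypothetical.

```python
import math


def pool(effects, variances, random_effects=False):
    """Inverse-variance pooled estimate, its standard error and tau-squared.
    With random_effects=True, a DerSimonian-Laird tau-squared is added to
    each trial's variance before weighting."""
    weights = [1.0 / v for v in variances]
    tau2 = 0.0
    if random_effects:
        fixed = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
        q = sum(w * (e - fixed) ** 2 for w, e in zip(weights, effects))
        df = len(effects) - 1
        c = sum(weights) - sum(w ** 2 for w in weights) / sum(weights)
        tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
        weights = [1.0 / (v + tau2) for v in variances]
    estimate = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    se = math.sqrt(1.0 / sum(weights))
    return estimate, se, tau2


# Two hypothetical trials with effects 0.0 and 1.0, each with variance 0.25.
print(pool([0.0, 1.0], [0.25, 0.25]))
print(pool([0.0, 1.0], [0.25, 0.25], random_effects=True))
```

In this example both models give the same point estimate, but the random‐effects standard error is larger because the between‐trial variance widens the interval.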

Subgroup analysis and investigation of heterogeneity

We set out a priori two subgroup analyses: by age (primarily, under versus over 65 years) and by types of fracture (primarily, minimally displaced versus displaced, based on the Neer classification). To test whether the subgroups are statistically significantly different from one another, we planned to inspect the overlap of confidence intervals and perform the test for subgroup differences available in Review Manager.
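For two subgroups, such as under versus over 65 years, a test for subgroup differences reduces to a chi² test with one degree of freedom on the difference between the two pooled estimates. The sketch below shows that calculation under a fixed‐effect, normal‐approximation assumption; it is not a reproduction of the Review Manager implementation, and the function name and inputs are hypothetical.

```python
import math
from statistics import NormalDist


def subgroup_difference_test(est_a, se_a, est_b, se_b):
    """Chi-squared statistic (1 degree of freedom) and two-sided P value for
    the difference between two subgroup pooled estimates."""
    z = (est_a - est_b) / math.sqrt(se_a ** 2 + se_b ** 2)
    q_between = z ** 2
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return q_between, p_value


# Hypothetical subgroup estimates 0.0 and 1.0, each with standard error 0.5.
q_between, p_value = subgroup_difference_test(0.0, 0.5, 1.0, 0.5)
print(round(q_between, 3), round(p_value, 4))
```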

Sensitivity analysis

We planned sensitivity analyses based on aspects of trial and review methodology, including the effects of missing data, the inclusion of studies at high or unclear risk of bias (primarily, selection bias with reference to allocation concealment), the inclusion of studies only reported in abstracts and using fixed‐effect versus random‐effects models for pooling.

'Summary of findings' tables and quality assessment of the evidence

We produced 'Summary of findings' tables only for the two comparisons where a more substantive body of evidence had accrued. We used the GRADE approach to assess the quality of evidence related to each of the key outcomes listed in the Types of outcome measures for each comparison (see the Cochrane Handbook for Systematic Reviews of Interventions Section 12.2, Schunemann 2011).

Results

Description of studies

Results of the search

The search was updated from January 2012 to November 2014. We screened a total of 796 records from the following databases: the Cochrane Bone, Joint and Muscle Trauma Group Specialised Register (26), CENTRAL (91), MEDLINE (114), EMBASE (199), CINAHL (129), AMED (5), PEDro (55), WHO Trials Registry (61), ISRCTN registry (66) and ClinicalTrials.gov (50). We also identified four potentially eligible studies from other sources (abstracts of American Academy of Orthopaedic Surgeons annual meeting 2014 (331), the American Orthopaedic Trauma Association annual meetings (2012 to 2014) (96), The Bone and Joint Journal Orthopaedic Proceedings (13); British Elbow and Shoulder Society annual meetings (2011 to 2013) (23), British Trauma Society Annual Scientific Meeting 2014 (37); and reports for three other trials from the review authors (4)). Subsequent notification of an ongoing study was received from a trialist (Torrens). One other ongoing study was identified from a subsequent trials registry search.

Overall, 32 new studies were identified. Of these, eight were included (Boons 2012; Buecking 2014; Cai 2012; Lopiz 2014; ProFHER 2015 (5 references, including 1 trial registration and trial protocol); Sebastiá‐Forcada 2014; Soliman 2013 (2 references, including 1 trial registration); Torrens 2012 (1 reference and unpublished data)), 12 were excluded (Cigni 2012; Elidrissi 2013; Erdoğan 2014; Fan 2012; IRCT2013052313435N1; Maniscalco 2014a (2 references); Martetschlager 2012; NCT00384852; NCT01532076; NCT02122315; NTR2186; Zuckerman 2012), eight were placed in ongoing trials (NCT01524965; NCT01847508; NCT01984112; NCT02075476; NTR4019; ROTATE (2 references, including 1 trial registration); SHeRPA; Torrens) and four await classification (Liu 2011 (2 papers); NCT02052206; Wang 2013; Zhu 2014).

Further information was obtained for several studies in the previous version (Handoll 2012); this included the two‐year follow‐up report (Fjalestad 2014a) of functional outcome for Fjalestad 2010, and an additional article (Ockert 2014), which reported on 48 additional participants for Ockert 2010. A trial registration document and published protocol (Fjalestad 2014b) were found for a newly designated ongoing trial (DELPHI), previously Fjalestad (RCT proposal) in studies awaiting classification. Published protocols were also found for two ongoing trials (HOMERUS (Verbeek 2012); TPHF (Launonen 2012)). Additional information from updated trial registration documentation was added in for seven ongoing trials (HURA; NCT00438633; NCT00818987; NCT00999193; NCT01113411; NCT01557413; TPHF). Additional information was also available for Brorson 2009 which was moved from ongoing to studies awaiting classification.

Summaries of the trial populations of past and the present versions of this review as well as the changes between updates are presented in Appendix 2.

In all, 31 trials are now included, 38 trials are excluded, 21 trials are listed as ongoing and seven are in Studies awaiting classification. A flow diagram summarising the study selection process is shown in Figure 1.


Figure 1. Study flow diagram. Legible entries: 8 new ongoing studies (9 articles), with additional materials (9 articles) for 7 already ongoing studies; 4 new studies (5 articles) awaiting classification; change in status: 1 previously ongoing study moved to awaiting classification (1 extra article) and 1 study previously awaiting classification moved to ongoing (2 extra articles).


Included studies

Thirty included trials were published as full reports in journals, their availability ranging from 1979 (Lundberg 1979) to 2015 (ProFHER 2015). The remaining trial was published as a conference abstract only (Torrens 2012). Additional information via other publications, conference abstracts, trial registration details and communications from trial investigators were available for 15 trials (Agorastides 2007; Boons 2012; Fjalestad 2010; Hodgson 2003; Hoellen 1997; Lefevre‐Colau 2007; Ockert 2010; Olerud 2011a; Olerud 2011b; ProFHER 2015; Soliman 2013; Torrens 2012; Voigt 2011; Zhang 2011; Zyto 1997); these sometimes preceded the availability of the main report. Details of study methods, participants, interventions and outcome measurement for the individual studies are provided in the Characteristics of included studies and summarised below.

Design

Thirty trials were randomised clinical trials, although seven of these provided no details of their method of randomisation and thus the use of quasi‐randomised methods for sequence generation cannot be ruled out (Cai 2012; Hoellen 1997; Kristiansen 1988; Kristiansen 1989; Lundberg 1979; Stableforth 1984; Wirbel 1999). Rommens 1993 was a quasi‐randomised trial using alternation for treatment allocation. Livesley 1992 was double‐blinded. Of note is that the design of ProFHER 2015, a multicentre trial that compared surgical versus non‐surgical treatment, was purposefully pragmatic such as in the requirement for individual surgeons to use surgical methods and implants with which they were familiar.

Sample sizes

The 31 included trials involved a total of 1941 participants. Study size ranged from 20 participants (Bertoft 1984) to 250 participants (ProFHER 2015). One trial (Kristiansen 1989) included one person with bilateral fractures; the treatment allocation for this participant is unclear.

Setting

Thirty of the 31 included trials were single centre studies conducted in 13 different countries: Austria (1 trial); Belgium (1); China (3); Czech Republic (1); Denmark (2); Egypt (1); France (1); Germany (5); The Netherlands (1); Norway (1); Spain (3); Sweden (6) and UK (4). (Though essentially a single centre trial, the interventions in Hodgson 2003 were undertaken at two centres within an NHS Trust in the UK.) The remaining trial was a multicentre trial conducted in the UK (ProFHER 2015). Details of the timing or duration or both of trial recruitment provided for 26 included trials (see the Characteristics of included studies) show Stableforth 1984 to have the earliest start date (1970) and longest period of recruitment (11 years).

Participants

With the exception of one trial (Soliman 2013), the majority of participants in each trial were women (67% to 94% of their trial population). Most participants were aged 60 and above; two trials included a small number of children (Livesley 1992; Wirbel 1999). Seventeen trials set lower age limits. In 10 of these (Boons 2012; Cai 2012; Fialka 2008; Fjalestad 2010; Hodgson 2003; Hoellen 1997; Olerud 2011a; Olerud 2011b; Sebastiá‐Forcada 2014; Voigt 2011), the age limit restricted the population to older adults; the most extreme was Sebastiá‐Forcada 2014, where only people who were 70 years or over were included. Zyto 1997 specified that participants should be "elderly". Exceptionally, the participants of Soliman 2013 were aged between 45 and 60 years, with the majority (78%) being male.

Five trials included only minimally displaced fractures (Bertoft 1984; Hodgson 2003; Livesley 1992; Lundberg 1979; Revay 1992), whereas 22 selected only people with displaced fractures (Agorastides 2007; Boons 2012; Buecking 2014; Cai 2012; Fialka 2008; Fjalestad 2010; Hoellen 1997; Kristiansen 1988; Lopiz 2014; Ockert 2010; Olerud 2011a; Olerud 2011b; ProFHER 2015; Sebastiá‐Forcada 2014; Smejkal 2011; Soliman 2013; Stableforth 1984; Voigt 2011; Wirbel 1999; Zhang 2011; Zhu 2011; Zyto 1997). The majority of fractures were minimally displaced in Kristiansen 1989 and Rommens 1993. Lefevre‐Colau 2007 included either minimally displaced or "stable" impacted fractures, the latter included two‐part and three‐part fractures. Torrens 2012 included either minimally displaced or displaced fractures (two‐part or three‐part fractures were reported). Fractures were graded using the Neer classification system (Neer 1970) in 28 trials, together with the AO classification system (Muller 1991) in Fialka 2008, Lefevre‐Colau 2007 and Smejkal 2011. A modification of the AO classification system was described in Wirbel 1999, and no specific classification system was referred to in the remaining two trials (Bertoft 1984; Rommens 1993).

Interventions

Eleven trials evaluated non‐surgical treatment; however, this was post‐surgical treatment in two of these. Eight trials compared surgical with non‐surgical treatment and 12 compared two methods of surgery. A list of the comparisons, associated trials and numbers of trial participants, grouped according to the main objectives presented in the Objectives, is given below.

Methods of non‐surgical management (including rehabilitation)

Initial treatment, including immobilisation

  • "Immediate" physiotherapy within one week of fracture versus delayed physiotherapy after three weeks of immobilisation in a collar and cuff sling: Hodgson 2003 (86 participants).

  • Immobilisation in sling and body bandage for one week versus three weeks: Kristiansen 1989 (85 participants).

  • Physiotherapy started within three days of fracture versus delayed physiotherapy after three weeks of immobilisation in a sling: Lefevre‐Colau 2007 (74 participants).

  • Immobilisation in sling for one week versus four weeks; all followed same "progressive rehabilitation" regimen: Torrens 2012 (42 participants).

  • Gilchrist arm sling versus "classic" Desault bandage: Rommens 1993 (28 participants).

Continuing management (rehabilitation) after initial sling immobilisation

  • Instructed self‐exercise versus conventional physiotherapy: Bertoft 1984 (20 participants); Lundberg 1979 (42 participants).

  • Swimming pool treatment plus self‐training versus self‐training alone: Revay 1992 (48 participants).

  • Apparatus supplying pulsed electromagnetic field versus dummy apparatus: Livesley 1992 (48 participants).

Surgical treatment versus non‐surgical treatment

The currently available trials fall into three subcategories but are all treated together in Effects of interventions.

Fracture fixation versus non‐surgical treatment

  • Percutaneous reduction and external fixation versus closed manipulation and sling: Kristiansen 1988 (30 participants).

  • Internal fixation using surgical tension band or cerclage wiring versus sling: Zyto 1997 (40 participants; three more were recorded in Tornkvist 1995, another report of Zyto 1997).

  • Surgery involving open reduction and fixation with a locking plate and metal cerclages versus non‐surgical treatment starting with immobilisation of the injured arm in a modified Velpeau bandage: Fjalestad 2010 (50 participants).

  • Surgery involving open reduction and fixation with a PHILOS plate and nonabsorbable sutures versus non‐surgical treatment starting with arm immobilisation in a sling: Olerud 2011a (60 participants).

Arthroplasty versus non‐surgical treatment

  • Hemiarthroplasty using the Neer prosthesis versus closed manipulation and sling: Stableforth 1984 (32 participants).

  • Humeral head replacement with the Global Fx prosthesis versus non‐surgical treatment starting with arm immobilisation in a sling: Olerud 2011b (55 participants).

  • Humeral head replacement with the Global Fx prosthesis versus arm immobiliser alone: Boons 2012 (50 participants).

Surgery (surgeon's choice of method according to their experience) versus non‐surgical treatment

  • Surgery involving internal fixation (primarily locking plate fixation, most commonly PHILOS plate) or hemiarthroplasty versus sling: ProFHER 2015 (250 participants).

Different methods of surgical management

Comparisons of different categories of surgical intervention

  • Open reduction with internal fixation using a locking plate (LPHP or PHILOS) versus a locking nail (PHN): Zhu 2011 (57 participants).

  • Open reduction and internal fixation using a PHILOS plate versus the Zifko method of minimally invasive fixation with intramedullary K‐wire (Kirschner wire) insertion (distally inserted): Smejkal 2011 (61 participants).

  • Hemiarthroplasty using a DePuy prosthesis versus open reduction and PHILOS plate fixation: Cai 2012 (32 participants).

  • Hemiarthroplasty using a Global prosthesis versus tension band wiring: Hoellen 1997 (30 participants); an additional nine participants were reported in another report of this trial (Holbein 1999).

  • Reverse shoulder arthroplasty using the SMR Reverse prosthesis versus hemiarthroplasty using the SMR Trauma prosthesis: Sebastiá‐Forcada 2014 (62 participants).

Comparisons of different methods of performing an intervention in the same category

  • Deltoid‐split approach versus deltopectoral approach for non‐contact bridging plate fixation: Buecking 2014 (120 participants).

  • Polyaxial versus monoaxial locking plate fixation. NCB‐PH plate versus PHILOS plate: Ockert 2010 (76 participants; 124 in a later report of this trial (Ockert 2014)); and HSP plate versus PHILOS plate: Voigt 2011 (56 participants).

  • Open reduction with internal fixation with PHILOS locking plate with or without the use of medial support locking screws: Zhang 2011 (72 participants).

  • MultiLoc proximal humeral nail (MPHN) ‐ a straight nail ‐ versus Polarus humeral nail ‐ a curved nail: Lopiz 2014 (54 participants).

  • Hemiarthroplasty using an EPOCA prosthesis versus hemiarthroplasty using a HAS prosthesis: Fialka 2008 (40 participants).

  • Hemiarthroplasty with tenodesis of the long head of the biceps (LHB) versus hemiarthroplasty with LHB tendon left intact: Soliman 2013 (45 participants).

Continuing management (including rehabilitation) after surgical intervention

  • Immobilisation in sling for one week versus three weeks after percutaneous fixation: Wirbel 1999 (77 participants).

  • Early active‐assisted mobilisation (after two weeks) versus late mobilisation (after six weeks) after cemented hemiarthroplasty: Agorastides 2007 (59 participants).

Outcomes

Many trials in previous versions of this review preceded the availability of validated patient‐reported outcome measures (e.g. DASH, Oxford Shoulder Score (Dawson 1996)) for assessing function. From the 2012 update of this review (Handoll 2012), data for these types of outcome have become available from a growing number of trials (Boons 2012; Cai 2012; Fjalestad 2010; Olerud 2011a; Olerud 2011b; ProFHER 2015; Sebastiá‐Forcada 2014; Voigt 2011). All trials except Ockert 2010 assessed functioning and pain, but often reported these as part of a combined overall assessment, such as that of Neer (Neer 1970) and Constant (Constant 1987), that included other measures. Most trials reported on adverse events or complications. Exceptionally, Fjalestad 2010 and ProFHER 2015 reported on costs. Livesley 1992 did not provide outcomes split by treatment group.

Excluded studies

Brief details and reasons for exclusion for 26 studies are given in the Characteristics of excluded studies. It is noteworthy that 11 excluded studies were trials that were registered (usually in the now archived National Research Register, UK) but either did not take place (Mechlenburg 2009) or were abandoned due to lack of or poor recruitment (Brownson 2001; Dias 2001; Flannery 2006; Hems 2000; Sinopidis 2010; Wallace 2000; Welsh 2000) or perhaps both of these (Pullen 2007); or were not put forward for publication due to compromised methods or data (Bing 2002; Martin 2000). Edelson 2008 also reported an abandoned randomised trial because of lack of patient consent.

Ongoing studies

Details of the 21 ongoing trials are given in the Characteristics of ongoing studies. Two trials, both with three interventions under test, appear in two comparisons (NCT00999193; TPHF). Just two trials (aim 140 participants in total) compare different interventions, early versus late mobilisation or physiotherapy, for non‐surgically treated patients (NCT00438633; Torrens). There are four trials (aim 580 participants in total) comparing surgical versus non‐surgical treatment; three are multicentre trials (NCT00818987; ProCon; TPHF) and one is a single‐centre trial (NCT00999193). Three trials (aim 248 participants in total) are comparing nailing versus plating (NCT01557413; NCT01984112; NTR4019); three trials (aim approximately 385 participants in total) are comparing hemiarthroplasty versus plating (NCT00999193; HOMERUS; TPHF); one trial (aim 120 participants) is comparing reverse shoulder arthroplasty versus plating (DELPHI); and three trials (aim 142 participants in total) are comparing reverse shoulder arthroplasty versus hemiarthroplasty (NCT02075476; NTR3208; SHeRPA). Four trials are comparing different methods of performing an intervention in the same category of which two trials (aim 180 participants in total) are comparing minimally invasive versus usual methods of locking plate fixation (ACTRN12610000730000; HURA), one trial (aim 128 participants) is evaluating screw augmentation of locking plate fixation (NCT01847508) and one trial (aim 40 participants) is comparing two designs of reverse shoulder arthroplasty (NCT01086202). Lastly, two trials (aim 180 participants in total) are evaluating early versus standard rehabilitation after locking plate fixation (NCT01113411; NCT01524965) and one trial is comparing an external rotation brace versus a polysling with the arm in internal rotation (ROTATE).

Studies awaiting classification

Seven studies await classification (see the Characteristics of studies awaiting classification). There are insufficient data for Battistella 2011, reported in a conference abstract only. Brorson 2009, which was listed as ongoing in the 2012 version of the review, was stopped after recruiting 25 participants; the use of these data is under discussion. Requests for clarification on study design have been sent to the contact authors of three trials testing bone grafts or substitutes (Liu 2011; Wang 2013; Zhu 2014). The full report of Luo 2008, which tests acupuncture and reports limited findings at one month follow‐up, is in Chinese and we will seek translation of this article for a future update. NCT02052206, which is an ongoing trial, also includes patients with osteoarthritis.

New studies found at this update

Eight trials, including a total of 601 participants, were newly included in this update. One trial compared different methods of non‐surgical treatment (Torrens 2012); two trials compared surgical with non‐surgical treatment (Boons 2012; ProFHER 2015), and the other five trials compared different methods of surgery (Buecking 2014; Cai 2012; Lopiz 2014; Sebastiá‐Forcada 2014; Soliman 2013).

Risk of bias in included studies

The risk of bias judgements on nine items for the individual trials are summarised in Figure 2 and described in the risk of bias tables in the Characteristics of included studies. A 'Yes' (+) judgement means that the authors considered there was a low risk of bias associated with the item, whereas a 'No' (‐) means that there was a high risk of bias. Frequently assessments resulted in an 'Unclear' (?) verdict; this often reflected a lack of information upon which to judge the item (see Figure 3). However, lack of information on blinding for functional outcomes was always taken to imply that there was no blinding and rated as a 'No'.


Risk of bias summary: review authors' judgements about each risk of bias item for each included study.



Risk of bias graph: review authors' judgements about each risk of bias item presented as percentages across all included studies.


Allocation

Twelve trials were judged at low risk of selection bias resulting from adequate sequence generation and allocation concealment (Bertoft 1984; Boons 2012; Buecking 2014; Fjalestad 2010; Lefevre‐Colau 2007; Lopiz 2014; Olerud 2011a; Olerud 2011b; ProFHER 2015; Sebastiá‐Forcada 2014; Smejkal 2011; Voigt 2011); and a further two trials also took adequate measures to safeguard allocation concealment (Hodgson 2003; Livesley 1992). Based on its post‐randomisation application of exclusion criteria, Ockert 2010 was judged at high risk of selection bias; as was Rommens 1993, which was a quasi‐randomised trial using alternation.

Blinding

A low risk of detection bias for functional outcomes resulting from assessor and participant blinding was judged for Livesley 1992, which used sham controls, and Soliman 2013, where the intervention was very likely to have remained unknown to the blinded assessor of Constant scores. While several other trials reported blinded assessors, the lack of reporting of adequate safeguards and the lack of blinding of participants or care providers meant that the risk of bias was considered unclear. A high risk of bias reflecting no reporting or indication of blinding was likely in 21 trials. Exceptionally, ProFHER 2015, which did not blind trial participants, personnel or outcome assessment, was rated at 'unclear' risk of bias. This is because statistical tests showed a lack of a significant effect of baseline patient preferences on the primary outcome results (Oxford Shoulder Score).

Incomplete outcome data

Eight trials were considered to be at low risk of bias from the incompleteness of data on functional outcomes (Boons 2012; Hodgson 2003; Olerud 2011a; Olerud 2011b; ProFHER 2015; Sebastiá‐Forcada 2014; Torrens 2012; Zhu 2011). Thirteen trials were deemed at high risk of bias, usually reflecting large losses to follow‐up and post‐randomisation exclusions.

Selective reporting

The lack of trial registration details and protocols hindered the appraisal of the risk of bias from selective reporting. Seven trials were considered at high risk of selective reporting bias (Agorastides 2007; Hoellen 1997; Livesley 1992; Ockert 2010; Rommens 1993; Soliman 2013; Zyto 1997).

Other potential sources of bias

Baseline characteristics

No trial was considered at high risk of bias because of confounding resulting from major imbalances in baseline characteristics. However, low risk of bias judgements were given for only 11 trials (Boons 2012; Buecking 2014; Kristiansen 1988; Lefevre‐Colau 2007; Lopiz 2014; Lundberg 1979; Olerud 2011a; Olerud 2011b; ProFHER 2015; Wirbel 1999; Zyto 1997).

Care programmes

Risk of performance bias from important differences in care programmes other than the trial interventions, or differences in the experience of care providers, was judged either low (19 trials) or unclear (in the other 12 trials), usually based on inadequate information.

Effects of interventions

See: Summary of findings for the main comparison Summary of findings: surgical versus non‐surgical treatment for proximal humeral fractures; Summary of findings 2 Summary of findings: early versus delayed mobilisation for non‐surgically treated proximal humeral fractures

Where available, outcome data reported at final follow‐up for individual trials are presented in the analyses.

We judged whether between‐group mean differences in the various patient‐reported outcome measures (PROMs) were clinically important using the following minimal clinically important differences (MCIDs); alternative sources and values are listed in the Primary outcomes. Where a scoring system had been rescaled, we rescaled the MCID accordingly; where a scoring instrument had been modified, for example by removing questions, we did not apply these MCIDs.

  • ASES (0 to 100: best outcome): 12.01 (Tashjian 2010; rotator cuff disease)

  • Constant score (0 to 100: best outcome): 11.6 (Van de Water 2014; proximal humerus fracture)

  • DASH (0 to 100: worst outcome): 13.0 (Van de Water 2014; proximal humerus fracture)

  • EQ‐5D (0 to 1: best outcome): 0.12 (Olerud 2011c; proximal humerus fracture)

  • OSS (0 to 48: best outcome): 11.4 (Van de Water 2014; proximal humerus fracture)

  • QuickDASH (0 to 100: worst outcome): 16 (DASH/QuickDASH; general)

  • SF‐12‐PCS (physical component score) (0 to 100: best outcome): 6.5 (Schmitt 2004; upper limb proximal diagnosis)

  • SST (0 to 12: best outcome): 2.05 (Tashjian 2010; rotator cuff disease)

  • UCLA (2 to 35: best outcome): 2.4 (Van de Water 2014; proximal humerus fracture)
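The rescaling rule described above can be illustrated with a short sketch. It assumes that "rescaling" means a linear mapping between score ranges, so that an MCID (being a difference) scales with the ratio of the range widths. The function name `rescale_mcid` and the 0 to 24 target range in the usage lines are illustrative assumptions, not details taken from the review's methods.

```python
def rescale_mcid(mcid, old_range, new_range):
    """Linearly rescale an MCID from one score range to another (assumed rule)."""
    old_lo, old_hi = old_range
    new_lo, new_hi = new_range
    if old_hi <= old_lo or new_hi <= new_lo:
        raise ValueError("each range must be (low, high) with high > low")
    # An MCID is a difference, so only the range widths matter, not the offsets.
    return mcid * (new_hi - new_lo) / (old_hi - old_lo)


# Hypothetical example: the ASES MCID of 12.01 on a 0 to 100 scale,
# expressed on a 0 to 24 version of the same instrument.
ases_mcid_24 = rescale_mcid(12.01, (0, 100), (0, 24))
print(round(ases_mcid_24, 2))
```

Under this assumption an instrument whose items have been changed cannot be handled in the same way, because the mapping between the old and new scores is no longer a simple change of range.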

Methods of non‐surgical management

Initial treatment, including immobilisation

Five trials reported outcomes following initial treatment for non‐surgically managed proximal humeral fractures (Hodgson 2003; Kristiansen 1989; Lefevre‐Colau 2007; Rommens 1993; Torrens 2012). All or most fractures were described as minimally displaced in three of these trials (Hodgson 2003; Kristiansen 1989; Rommens 1993). Both Lefevre‐Colau 2007 and Torrens 2012 included displaced (two‐ or three‐part) fractures; these were described as "stable" in Lefevre‐Colau 2007 while Torrens 2012 put an upper limit to fracture displacement.

Early mobilisation versus delayed mobilisation

Although four trials compared early versus delayed mobilisation (Hodgson 2003; Kristiansen 1989; Lefevre‐Colau 2007; Torrens 2012), the timing of the start of early mobilisation varied as, where described, did the nature and intensity of the physiotherapy provided. Notable is the long (two‐hour) duration of individual physiotherapy sessions in Lefevre‐Colau 2007. With three exceptions, the lack of comparable outcome measurement and data precluded data pooling and so the results of the individual trials are presented separately below.

Hodgson 2003 compared commencing physiotherapy within one week of fracture versus delayed physiotherapy after three weeks of immobilisation in a collar and cuff sling in 86 people with minimally displaced fractures. The results, presented in Hodgson 2007 for self‐reported shoulder disability using the Croft Shoulder Disability Questionnaire (Croft 1994), show a tendency for less disability in the early mobilisation group at one year (e.g. disability (1 or more problems): 18/42 versus 29/40; RR 0.59, 95% CI 0.40 to 0.88; severe disability (5 or more problems): 13/42 versus 17/40; RR 0.73, 95% CI 0.41 to 1.30), continuing improvement and recovery between one and two years, and also reveal that, overall, a substantial proportion of participants continued to report some or severe disability at two years (see Analysis 1.1). Results at two years for eight of the 22 questions of the Croft questionnaire are shown in Analysis 1.2. These are presented to give an indication of the variety of problems experienced by these patients and the variation in the responses. There was some evidence supporting a quicker recovery in the early group as trial participants given early physiotherapy attended fewer treatment sessions (see Analysis 1.3: mean difference (MD) ‐5.00 sessions; 95% CI ‐8.25 to ‐1.75) until they and their physiotherapists agreed that independent shoulder function had been achieved. As can be seen in Analysis 1.4, participants of the early group had significantly better health‐related quality‐of‐life scores at 16 weeks in two dimensions of the SF36 (role limitation physical: MD 22.20, 95% CI 3.82 to 40.58; and pain: MD 12.10, 95% CI 3.26 to 20.94). There were no statistically significant differences between the two treatment groups in the other six dimensions (e.g. physical functioning) of the SF36 at 16 weeks, and in all eight dimensions at one year. There were no complications arising from fracture displacement. The only recorded complication in the trial was a frozen shoulder in a participant of the delayed physiotherapy group (see Analysis 1.6). Shoulder function, relative to the unaffected shoulder, measured using the Constant score (Constant 1987) was better at 8 and 16 weeks in the early group (see Analysis 1.8: mean difference in ratio affected/unaffected arm 0.16; 95% CI 0.07 to 0.25). The between‐group differences were smaller at one year and the confidence intervals crossed the line of no effect (MD 0.07, 95% CI ‐0.03 to 0.17).

Kristiansen 1989, which tested the duration of immobilisation in a sling and body bandage (one week versus three weeks) in 85 people with mainly undisplaced fractures, provided insufficient follow‐up data to allow any test for statistical significance. The authors reported that while pain, function and mobility at six months and over were similar in both groups, the patients who started early mobilisation at one week suffered less pain in the first three months than those who kept their bandaging for three weeks. One case of complex regional pain syndrome type 1 (CRPS‐1) occurred in each group (see Analysis 1.6).

Lefevre‐Colau 2007 compared commencing physiotherapy within three days of fracture with delayed physiotherapy after three weeks of immobilisation in a sling in 74 people with minimally displaced or "stable" impacted fractures. Ten trial participants withdrew from the trial because of difficulties in reaching the hospital for treatment. Participants were discharged from physiotherapy at six months. Shoulder function measured using the Constant score was statistically significantly better in the early group at six weeks and three months (see Analysis 1.9), with the differences probably including a clinically relevant effect; the differences at six months and end of treatment, though favouring the early group, were smaller and not statistically significant (MD 6.10, 95% CI ‐0.22 to 12.42). Although the early group had significantly reduced pain compared with the three weeks group by three months follow‐up, there was no difference at six months (see Analysis 1.11). Active range of motion, measured relative to the opposite arm, also did not differ significantly between the two groups at six months (see Analysis 1.12). There were no cases of fracture non‐union or displacement. One participant from each group received treatment for subacromial impingement (see Analysis 1.6). All participants attended at least 70% of the supervised physiotherapy sessions; and very few expressed dissatisfaction with their treatment (see Analysis 1.13).

Torrens 2012 compared sling immobilisation for one week versus four weeks in 42 people with minimally displaced or displaced two‐ or three‐part fractures; all participants had the same "progressive rehabilitation" regimen. Results were reported at 3, 6 and 12 months. Participants in the four‐weeks group had consistently higher quality‐of‐life scores (EuroQol 5D) at all three follow‐ups (e.g. MD ‐0.09, 95% CI ‐0.21 to 0.03; see Analysis 1.5). All three results include a clinically important difference in quality of life in favour of the four‐weeks group but also the possibility of a much smaller and clinically unimportant effect favouring the early group. Torrens 2012 reported no complications aside from noting that the three participants (two early mobilisation versus one, four‐weeks immobilisation) experiencing a "significant displacement" of their fracture did not require surgical treatment (see Analysis 1.6). One person had died in the four‐weeks group by 12 months follow‐up (see Analysis 1.7). The evidence from Torrens 2012 did not confirm differences between the two groups at any of the three follow‐ups in Constant scores (see Analysis 1.9), pain (see Analysis 1.10) or patient satisfaction (see Analysis 1.14). Of note though is that the confidence intervals of the pain score results at 12 months included a clinically important difference in favour of the four‐weeks group (MD 10.80, 95% CI ‐4.59 to 26.19); these also crossed the line of no effect.
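The reasoning used above, reading a confidence interval against both the line of no effect and an MCID, can be sketched as follows. This is a minimal illustration, not the review's analysis code: the helper name `interpret_md` is hypothetical, and treating the MCID symmetrically in both directions is an assumption.

```python
def interpret_md(md, ci_low, ci_high, mcid):
    """Describe a mean difference and its 95% CI relative to an MCID."""
    if not ci_low <= md <= ci_high:
        raise ValueError("point estimate must lie within the interval")
    return {
        # A CI containing 0 means the difference is not statistically significant.
        "crosses_no_effect": ci_low <= 0 <= ci_high,
        # If either limit reaches the MCID, an important effect is not ruled out.
        "ci_includes_mcid": ci_low <= -mcid or ci_high >= mcid,
        # Whether the best estimate itself is clinically important.
        "estimate_reaches_mcid": abs(md) >= mcid,
    }


# Torrens 2012 EQ-5D result quoted above, with the EQ-5D MCID of 0.12.
eq5d = interpret_md(-0.09, -0.21, 0.03, mcid=0.12)
print(eq5d)
```

For the EQ‐5D result the interval both crosses zero and extends beyond the MCID, which corresponds to the cautious reading in the text: a clinically important difference is possible but not established.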

The exceptions in terms of pooling were data for the adverse events, when pooled under 'shoulder complications' and 'fracture complications', for four and two trials respectively, and the secondary outcomes of pain and Constant scores available from two trials. Data pooled for reported shoulder complications that comprised frozen shoulder, CRPS‐1 and treated subacromial impingement showed little difference between the two groups (2/127 versus 3/132; RR 0.73, 95% CI 0.15 to 3.63; 4 trials, 259 participants). Two trials reporting on fracture complications found no cases of non‐union and only one trial (Torrens 2012) reported actual cases of fracture displacement (2/52 versus 1/54; RR 2.20, 95% CI 0.22 to 22.45; 2 trials; 106 participants). Both analyses of Constant score and pain (see Analysis 1.9; Analysis 1.10) display evidence of statistical heterogeneity, which may in part reflect clinical heterogeneity of the contributing trials (Lefevre‐Colau 2007; Torrens 2012).

Gilchrist arm sling versus the Desault bandage

Rommens 1993 compared the use of two types of immobilisation, the Gilchrist arm sling versus the Desault bandage, worn for two to three weeks in 28 patients with mainly minimally displaced fractures. Reporting up until fracture consolidation, Rommens 1993 reported, without presenting data, that they had found no differences in the end result, either in terms of functional outcome or fracture healing. More people found the initial application of a Desault bandage uncomfortable and severe skin irritation prompted premature removal of the bandage in two people in this group (see Analysis 2.1). Pain during immobilisation was also reported to be greater in the Desault group. Slight displacement of the fracture in the first week was reported in two participants of the Gilchrist group (see Analysis 2.2). At fracture consolidation, patients' rating of their assigned bandage was significantly more favourable in the Gilchrist group (see Analysis 2.3 "Poor or bad rating by patient at fracture consolidation": 2/14 versus 8/14; risk ratio (RR) 0.25, 95% CI 0.06 to 0.97).
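As an illustration of how a risk ratio and its confidence interval follow from the event counts quoted above, the sketch below uses the standard large‐sample (log‐normal) approximation for a single two‐group comparison. It is an assumption that this matches the method used for the review's analyses; the function name `risk_ratio` is hypothetical.

```python
from math import exp, log, sqrt
from statistics import NormalDist


def risk_ratio(events_a, total_a, events_b, total_b, level=0.95):
    """Risk ratio of group A versus group B with a log-normal (Wald) CI."""
    if min(events_a, events_b) <= 0:
        raise ValueError("this simple formula needs at least one event per group")
    rr = (events_a / total_a) / (events_b / total_b)
    # Standard error of log(RR) from the 2x2 table counts.
    se = sqrt(1 / events_a - 1 / total_a + 1 / events_b - 1 / total_b)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return rr, exp(log(rr) - z * se), exp(log(rr) + z * se)


# Rommens 1993: poor or bad rating, Gilchrist 2/14 versus Desault 8/14.
rr, low, high = risk_ratio(2, 14, 8, 14)
print(f"RR {rr:.2f}, 95% CI {low:.2f} to {high:.2f}")
```

With these counts the sketch should closely reproduce the reported RR of 0.25 and an interval of roughly 0.06 to 0.97. Pooled estimates across several trials need a weighting method and will not generally equal the ratio of the summed counts.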

Continuing management (rehabilitation) after initial sling immobilisation

Two small trials (Bertoft 1984; Lundberg 1979) compared self‐directed exercise following a course of instruction versus conventional physiotherapy during the 12 weeks following trauma in a total of 62 patients with minimally displaced fractures. In both trials there were no statistically significant differences between those receiving instruction for exercises at home and those undergoing supervised physiotherapy in any of the outcomes recorded (see Analysis 3.1, Analysis 3.2, Analysis 3.3, Analysis 3.4, Analysis 3.5 and Analysis 3.6). It should be noted that since Lundberg 1979 did not report whether there had been any loss to long‐term follow‐up at an average of 16 months, the results for Neer's score presented in Analysis 3.5 are for illustrative purposes only.

Revay 1992, which included 48 participants with minimally displaced fractures, reported that the addition of supervised exercises in a swimming pool to self‐treatment did not enhance long‐term outcome. Participants of the control group (self‐treatment only) were reported as having significantly better functional movements, joint mobility and activities of daily living at two‐ and three‐month follow‐up. However, there were no significant differences at one year. Revay 1992 suggested that those using the pool may have neglected their home exercises, but the authors did not evaluate compliance.

Livesley 1992, which included 48 patients with minimally displaced fractures, reported that there was no difference in outcome between the two groups (pulsed electromagnetic high frequency energy (PHFE) versus placebo) at any stage of the trial, but provided no quantitative data. All trial participants were reported as achieving a "good" result as opposed to a "poor" one.

Surgical treatment versus non‐surgical treatment

Eight heterogeneous trials, with a total of 567 participants and 568 fractures, evaluated surgical intervention for displaced fractures, of which over 73% were three‐ or four‐part fractures (Neer classification). Table 1 gives a brief summary of their characteristics. The methods of surgery varied between the trials, being restricted to hemiarthroplasty in three trials (Boons 2012; Olerud 2011b; Stableforth 1984), internal fixation in three trials (Fjalestad 2010; Olerud 2011a; Zyto 1997) and external fixation in Kristiansen 1988. Most surgery involved internal fixation in ProFHER 2015, where the surgeons used methods with which they were experienced. Non‐surgical treatment was predominantly sling immobilisation; this was preceded by closed manipulation in all participants in two trials (Kristiansen 1988; Stableforth 1984) and in eight participants in Fjalestad 2010.

Table 1. Surgical versus non‐surgical treatment trials: brief characteristics

| Study | Participants (Neer classification) | Surgery | Non‐surgical (starting with) | Follow‐up |
| --- | --- | --- | --- | --- |
| Boons 2012 | 50 participants with 4‐part fractures (The Netherlands) | Humeral head replacement with the Global prostheses; cemented | Sling immobilisation | 1 year |
| Fjalestad 2010 | 50 participants with 3‐ or 4‐part fractures (Norway) | Open reduction and fixation with an interlocking plate device and metal cerclages | Immobilisation of the injured arm in a modified Velpeau bandage. Closed reduction in 8 patients. | 2 years |
| Kristiansen 1988 | 30 participants with 31 2‐, 3‐ or 4‐part fractures. Included 7 2‐part, 19 3‐part and 5 4‐part fractures (Denmark) | Percutaneous reduction and external fixation | Closed manipulation and sling immobilisation | 2 years |
| Olerud 2011a | 60 participants with 3‐part fractures (all had displaced surgical neck fracture) (Sweden) | Open reduction and fixation with a PHILOS plate and non‐absorbable sutures | Sling immobilisation | 2 years |
| Olerud 2011b | 55 participants with 4‐part fractures (Sweden) | Humeral head replacement with the Global Fx prosthesis | Sling immobilisation | 2 years |
| ProFHER 2015 | 250 participants with "displaced fracture of the proximal humerus that involved the surgical neck". Included 18 1‐part (but confirmed as still "displaced"), 128 2‐part, 93 3‐part and 11 4‐part fractures (UK) | Either internal fixation (majority were PHILOS plates) or joint replacement (hemiarthroplasty). Pragmatic trial ‐ choice based on surgeon's experience with method | Sling immobilisation | 2 years |
| Stableforth 1984 | 32 participants with 4‐part fracture (UK) | Hemiarthroplasty | Closed manipulation and sling | 6 months |
| Zyto 1997 | 40 participants with 3‐ or 4‐part fractures (3 others excluded) (Sweden) | Internal fixation using surgical tension band or cerclage wiring | Sling immobilisation | 50 months |

Primary outcomes

Pooled results of four different patient‐reported functional scores reported by five trials (Boons 2012; Fjalestad 2010; Olerud 2011a; Olerud 2011b; ProFHER 2015) at 12 months follow‐up showed no statistically significant difference between the two groups (standardised mean difference (SMD) 0.07 favouring surgery, 95% CI ‐0.12 to 0.26; P = 0.46; 419 participants, see Analysis 4.1; and Figure 4). The same finding, which was based on data for three scores reported by four trials, of no significant between‐group difference applied at 24 months (SMD 0.07 favouring surgery, 95% CI ‐0.14 to 0.28; P = 0.50; 351 participants; see Analysis 4.2). The Oxford Shoulder Score (OSS) results for ProFHER 2015 showed no clinically important (the MCID for the OSS was set at 5 points in this trial) or statistically significant differences between the two groups over the two‐year follow‐up (MD 0.75, 95% CI ‐1.68 to 3.18; P = 0.55; 231 participants) or at 6, 12 or 24 months (see Analysis 4.3). Pooled DASH scores from Olerud 2011a and Olerud 2011b showed no statistically significant differences between the two groups at four months, or at one or two years (see Analysis 4.4); although the scores favoured surgery, the best estimate at 24 months was still lower than the MCID (10 points) for DASH (0 to 100: worst function): MD ‐7.43, 95% CI ‐16.26 to 1.41; 99 participants. Fjalestad 2010 found no significant differences between the two groups in the American Shoulder and Elbow Surgeons (ASES: 0 to 24: best function) scores at 6, 12 or 24 months follow‐up (see Analysis 4.5). Boons 2012 found no significant differences between the two groups in the Simple Shoulder Test scores at 3 or 12 months (see Analysis 4.6). There were no statistically significant differences in subjective assessment of function between the two groups of Zyto 1997 at either one or three years (see Analysis 4.7).
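The link between a reported confidence interval and its P value can be checked with a short sketch. It assumes a normal approximation and a symmetric interval, which may differ from the model actually used in a given trial, so small discrepancies from reported P values are expected; the function name `p_from_ci` is hypothetical.

```python
from statistics import NormalDist


def p_from_ci(estimate, ci_low, ci_high, level=0.95):
    """Approximate two-sided P value from an estimate and its CI (normal assumption)."""
    norm = NormalDist()
    z_crit = norm.inv_cdf(0.5 + level / 2)
    # Back-calculate the standard error from the interval width.
    se = (ci_high - ci_low) / (2 * z_crit)
    z = estimate / se
    return 2 * (1 - norm.cdf(abs(z)))


# ProFHER 2015 OSS result quoted above: MD 0.75, 95% CI -1.68 to 3.18.
p_oss = p_from_ci(0.75, -1.68, 3.18)
print(round(p_oss, 2))
```

The sketch applies to differences such as MD or SMD; for ratio measures such as RR the same calculation would be carried out on the log scale.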


Forest plot of comparison: 4 Surgical versus non‐surgical treatment, outcome: 4.1 Functional scores at 12 months (higher = better outcome).


Quality of life based on the EuroQol scores from three trials (Olerud 2011a; Olerud 2011b; ProFHER 2015) and 15D (Sintonen 2001) results from Fjalestad 2010 were slightly higher in the surgery group but none of the between‐group differences were clinically important including the statistically significant finding at six months (MD 0.04, 95% CI 0.01 to 0.08; P = 0.02; 381 participants; see Analysis 4.8). A separate breakdown of the results from Fjalestad 2010, which include the number of quality‐adjusted life years (QALYs), showed no differences in any quality‐of‐life outcomes for this trial (see Analysis 4.9). Based on data adjusted for covariates, ProFHER 2015 reported there were also no significant between‐group differences over two years in the mean SF‐12 physical component score (MD 1.77 favouring surgery, 95% CI −0.84 to 4.39 points; reported P = 0.18) and the mean SF‐12 mental component score (MD −1.28, favouring non‐surgical treatment, 95% CI −3.80 to 1.23; reported P = 0.32). The SF‐12 physical component scores (0 to 100: best outcome) were slightly higher in the surgery group at all three follow‐ups (see Analysis 4.10) and, conversely, the SF‐12 mental component scores (0 to 100: best outcome) were slightly higher in the non‐surgical treatment group at all three follow‐ups (see Analysis 4.11). None of these differences was statistically significant and the confidence interval limits are less than the minimal clinically important difference.

There was no significant difference between the two groups in mortality (17/248 versus 12/248; RR 1.40 favouring non‐surgical treatment, 95% CI 0.69 to 2.83; P = 0.35; 6 trials, see Analysis 4.12). Where reported, none of the deaths was related to their fracture or treatment with the exception of one early death due to venous thromboembolism in the surgical group of ProFHER 2015. Notably, the two deaths that occurred within three months of surgery in Fjalestad 2010 were people with underlying health problems. In Zyto 1997, eight of the 11 missing participants had died at 50 months, but no information on group allocation or causes of death was provided.

Significantly more surgical group patients had additional or secondary surgery (34/262 versus 16/261; RR 2.06, 95% CI 1.18 to 3.60; 7 trials; see Analysis 4.13). In Boons 2012, one surgical group participant underwent revision surgery after one week because of head‐stem separation. A non‐surgically treated participant in Boons 2012 who had surgery at 13 months, thus outside the trial's follow‐up period, because of shoulder pain and impairment was not included in this analysis. In Fjalestad 2010, treatment failure resulting in an operation occurred in eight surgical group participants, one of whom had re‐fixation plus bone grafting at six months and seven whose implants were removed because of screw penetration into the joint space; and one non‐surgically treated patient, who had surgery because of fracture redisplacement at two weeks. In Kristiansen 1988, the three cases of treatment failure were the removal of pins due to infection in one surgical group participant and a change of method resulting from a poor initial fracture reduction in two non‐surgical group participants. The reasons for re‐operations in the surgical group of Olerud 2011a were deep infection (two cases), non‐union (one case), impingement (two cases), avascular necrosis (one case), screw penetration into joint (one case) and stiffness (two cases). One non‐surgically‐treated patient in Olerud 2011a had surgery because of impingement. Not included in this analysis is another non‐surgically‐treated patient with non‐union who abstained from surgery partly because of a late diagnosis of axillary nerve palsy. The reasons for additional surgery in Olerud 2011b were screw penetration of the joint (for one patient treated with a locking plate), stiffness and impingement and displaced greater tuberosity respectively in three surgical group patients, and for complete displacement of the humeral shaft without bony contact in one non‐surgically treated patient. Not included in this analysis is another non‐surgically‐treated patient who refused surgery for a non‐union. The reasons for further surgery in the surgical group of ProFHER 2015 were avascular necrosis (two cases), metalwork problems (seven cases) and post‐traumatic stiffness (two cases). The reasons for subsequent surgery in the non‐surgical treatment group of ProFHER 2015 were avascular necrosis (one case), malunion (two cases), non‐union (four cases), post‐traumatic stiffness (one case), rotator cuff tear (one case), severe pain (one case) and not‐reported (one case). In Stableforth 1984, one surgical group participant had their prosthesis removed because of a deep infection. Only ProFHER 2015 reported on additional shoulder‐related therapy, which occurred in slightly more participants of the surgery group (7/125 versus 4/125; RR 1.75 favouring non‐surgical treatment, 95% CI 0.53 to 5.83; P = 0.36; see Analysis 4.14).

The numbers of people in each group with one or more adverse events or complications were available only in ProFHER 2015 (30/125 versus 23/125; RR 1.30 favouring non‐surgical treatment, 95% CI 0.80 to 2.11; P = 0.28; see Analysis 4.14). Analysis 4.14 also presents the available data for individual complications. Unsurprisingly, surgery‐related complications (e.g. infection and screw penetration of the joint) were predominant in the surgery treatment group. While non‐union was more common in the non‐surgical treatment group (P = 0.05), none of the differences between the two groups in the radiologically detected outcomes of avascular necrosis and signs of osteoarthritis were statistically significant. For avascular necrosis, data favouring surgery from Boons 2012 and Olerud 2011b need to be seen in the context that these were only likely to be detected in non‐surgically treated patients, given that surgery involved the replacement of the humeral head. Additionally, some of these outcomes were without symptoms or minor in extent. For instance, in Fjalestad 2010, both cases of non‐union in the non‐surgical treatment group were without symptoms, and 22 of the 27 participants with radiographically‐detected avascular necrosis were asymptomatic.
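For readers who wish to check single‐trial risk ratios such as those quoted from ProFHER 2015, the following is a minimal sketch of the usual large‐sample calculation (risk ratio with a 95% confidence interval computed on the log scale). The function name `risk_ratio` is hypothetical and chosen for illustration; it is assumed here, not confirmed by the review, that this is the method behind the reported single‐trial figures. Pooled estimates from several trials use weighted methods and will not be reproduced by this crude calculation.

```python
import math

def risk_ratio(events_a, total_a, events_b, total_b, z=1.96):
    """Risk ratio of group A versus group B with a log-scale 95% CI.

    Hypothetical helper for illustration; requires non-zero event counts.
    """
    rr = (events_a / total_a) / (events_b / total_b)
    # Standard error of log(RR) for two independent binomial samples
    se_log = math.sqrt(1 / events_a - 1 / total_a + 1 / events_b - 1 / total_b)
    lower = math.exp(math.log(rr) - z * se_log)
    upper = math.exp(math.log(rr) + z * se_log)
    return rr, lower, upper

# ProFHER 2015 counts (surgical versus non-surgical group)
therapy = risk_ratio(7, 125, 4, 125)    # additional shoulder-related therapy
adverse = risk_ratio(30, 125, 23, 125)  # one or more adverse events
print([round(v, 2) for v in therapy])
print([round(v, 2) for v in adverse])
```

Rounded to two decimal places, both calls agree with the values reported for ProFHER 2015 (RR 1.75, 95% CI 0.53 to 5.83; and RR 1.30, 95% CI 0.80 to 2.11).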

In Stableforth 1984, fewer participants of the prosthesis group needed some help with activities of daily living or had died by six months (see Analysis 4.15: 2/16 versus 9/16; RR 0.22, 95% CI 0.06 to 0.87).

Secondary outcomes

The differences between the two groups in the Constant scores (0 to 100: best outcome) at four different time points (3‐4, 12, 24 and 50 months) were all small and clinically not important (e.g. the most data were for 12 months: MD 2.81, 95% CI ‐2.20 to 7.82; 199 participants, 4 trials; see Analysis 4.16). The same lack of differences between the two groups applied to the Constant scores of the injured arm in Fjalestad 2010 at 6, 12 and 24 months follow‐up (see Analysis 4.17). At one year follow‐up in Kristiansen 1988, fewer participants of the surgical group had a poor or unsatisfactory rating of function assessed using the Neer score (3/11 versus 6/10; RR 0.45, 95% CI 0.15 to 1.35; see Analysis 4.18).

Boons 2012 reported similar results in the two groups for patient‐assessed disability based on a 0 to 100 VAS, where the maximum score equated to "no restrictions". The clinical relevance of the results, which were in favour of the surgical group, is uncertain (see Analysis 4.19).

Boons 2012 reported lower pain scores, measured using VAS (0 to 100: higher scores mean worse pain), in the hemiarthroplasty group at three months (MD ‐18.00, 95% CI ‐29.03 to ‐6.97; 49 participants; see Analysis 4.20) than in the non‐surgical group; this difference is likely to be clinically important. In contrast, there were similar results in the two groups at 12 months (median 23 in the surgery group versus 25 in the non‐surgical group; reported P = 0.725). Pooled results from two trials (Olerud 2011a; Olerud 2011b) showed slightly less pain at two year follow‐up in the surgery group (MD ‐6.38; 95% CI ‐14.18 to 1.41; 101 participants; see Analysis 4.20); the clinical importance of this result is questionable. Nearly all trial participants in Stableforth 1984 had shoulder pain but fewer in the prosthesis group reported constant pain that impaired sleep or function (see Analysis 4.22: 2/15 versus 9/15; RR 0.22, 95% CI 0.06 to 0.86). The categorisation of pain is not clear in the trial report nor whether pain was assessed for all participants. Assuming the latter is the case, the difference between the two groups is less marked when all those with more than occasional pain are included (4/15 versus 9/15; RR 0.44, 95% CI 0.17 to 1.13; analysis not shown). Zyto 1997, which provided a breakdown of the Constant score into the separate components (activities of daily living, pain, range of motion, strength), did not confirm a significant difference between the two groups in the pain component, which was in favour of the non‐surgical treatment group, at 50 months (see Analysis 4.21).

Reduced muscle strength and restricted mobility were less frequent in the prosthesis group survivors of Stableforth 1984 (see Analysis 4.23 and Analysis 4.24) than in the group receiving closed manipulation and sling. Zyto 1997 found no difference between the two groups in strength ('power') at 50 months follow‐up. The clinical relevance of the three point difference in the range of motion component of the Constant score in favour of non‐surgical treatment is questionable (see Analysis 4.21). In Boons 2012, abductor strength, reported as a percentage of the opposite shoulder, was lower in the surgery group at both three months (median values: 20% versus 30%; reported P = 0.015) and 12 months (median values: 24% versus 42%; reported P = 0.008). Boons 2012 also found that forward flexion (median 68 versus 88 degrees; reported P = 0.001) and abduction (median 61 versus 78 degrees; reported P = 0.02) were worse in the surgery group at three months. There were no between‐group differences in external rotation and internal rotation at this time, nor in any of the four range of motion measures at 12 months.

Fjalestad 2010 found no differences at one year between the two groups in costs (see Analysis 4.25 and Analysis 4.26). The base case economic analysis of ProFHER 2015 showed that at two years, the cost of surgical intervention was, on average, GBP 1,780.73 more per patient (95% CI GBP 1,152.71 to GBP 2,408.75).

Different methods of surgical management

Comparisons of different categories of surgical intervention

Five trials compared different methods of surgical management (Cai 2012; Hoellen 1997; Sebastiá‐Forcada 2014; Smejkal 2011; Zhu 2011).

Open reduction with internal fixation using a locking plate versus a locking nail

Zhu 2011 compared open reduction with internal fixation using a locking plate (LPHP or PHILOS) versus a locking nail (PHN) in 57 participants with two‐part surgical neck fractures. The American Shoulder and Elbow Surgeons (ASES) scores were statistically significantly better in the plate group at one year (MD 7.20; 95% CI 1.48 to 12.92) and three years (MD 4.00; 95% CI 0.01 to 7.99) (see Analysis 5.1). The clinical importance of these findings, however, is uncertain given that the MCID for ASES is included in the 95% CI only at one year follow‐up. One participant of the nail group died of unrelated causes. While complications were not described in full, significantly more patients in the plate group had a complication (9/29 versus 1/28; RR 8.69, 95% CI 1.18 to 64.19; see Analysis 5.2). This included five patients in the plate group who had a re‐operation for screw penetration into the articular surface of the humeral head. Zhu 2011 found a statistically significant but probably not a clinically important difference in favour of the plate group in pain at one year but not at three years (see Analysis 5.3). There were no statistically significant differences between the two groups in the Constant scores at the two follow‐up times (see Analysis 5.4) or in range of motion measures at either one year (not shown) or three years (see Analysis 5.5 and Analysis 5.6). Although the plate group had greater muscle strength at one year, the difference between the two groups was no longer statistically significant at three years (see Analysis 5.7). Both duration of surgery (MD 24.90 minutes, 95% CI 5.97 to 43.83 minutes) and blood loss were statistically significantly greater in the plate group (see Analysis 5.8). Consistent with the finding of an increased blood loss in the plate group, more people in this group had a blood transfusion but the difference between the two groups was not statistically significant (see Analysis 5.9).

Open reduction with internal fixation using a locking plate versus minimally invasive fixation with distally inserted intramedullary K‐wires

Smejkal 2011 compared open reduction and internal fixation using a PHILOS plate versus the Zifko method of minimally invasive fixation with distally inserted intramedullary K‐wires (Kirschner wires) in 61 participants with two‐ or three‐part fractures. Smejkal 2011 did not report patient‐reported function or activities of daily living. The account of the complications seemed incomplete, with no indication of how many required a re‐operation, but this was perhaps partly due to difficulties in translation from Czech to English. There was no significant difference between the two groups in the overall numbers of participants incurring a complication (11/28 versus 9/27; RR 1.18, 95% CI 0.58 to 2.38; see Analysis 6.1). The recorded nature of the complications reflected the type of implant, with four cases of screw protrusion in the plate group that resulted in impingement and migration of K‐wires, a distal humeral fracture and a nerve injury in the Zifko group. Smejkal 2011 found no difference between the two groups in Constant scores relative to the healthy limb at a mean two years follow‐up (MD ‐0.81%, 95% CI ‐7.45% to 5.83%; see Analysis 6.2). Three participants of each group had a 'poor' Constant score. Analysis 6.3 shows there were no statistically significant differences between the two groups in time to union (MD 2.10 weeks, 95% CI ‐2.25 to 6.45 weeks) or in a vaguely‐described measure of time to recover normal upper limb function (27.2 versus 21.4 weeks; MD 5.80 weeks; 95% CI ‐0.16 to 11.76 weeks). Smejkal 2011 suggested that the greater time to recover in the plate group reflected a greater impact of complications in this group. The duration of operation was significantly greater in the plate group (MD 44.74 minutes, 95% CI 32.23 to 57.25 minutes; see Analysis 6.4), but with a non‐significant tendency for less X‐ray exposure.
The tendency for longer hospital stays for plate group patients did not achieve statistical significance (MD 1.20 days; 95% CI ‐0.34 to 2.74; see Analysis 6.5).

Hemiarthroplasty versus internal fixation

Two small heterogeneous trials compared humeral head replacement versus internal fixation for four‐part fractures (Cai 2012; Hoellen 1997). Only data for re‐operation were available for pooling; these favoured hemiarthroplasty (3/34 versus 8/28; RR 0.32, 95% CI 0.10 to 1.10) but were moderately heterogeneous (heterogeneity: Chi² = 1.82, degrees of freedom (df) = 1 (P = 0.18); I² = 45%; see Analysis 7.3). Given this, the results of the two trials are presented separately.
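The heterogeneity statistics quoted for this pooled re‐operation result can be checked with a short calculation. The sketch below assumes the standard definition of I² as the percentage of the Chi² statistic in excess of its degrees of freedom; the function names are hypothetical, and the P value helper is valid only for one degree of freedom, as applies here.

```python
import math

def i_squared(chi2, df):
    """I-squared (%) from the heterogeneity Chi² statistic and its degrees of freedom."""
    if chi2 <= 0:
        return 0.0
    # Negative values are truncated to zero by convention
    return max(0.0, (chi2 - df) / chi2) * 100

def p_value_df1(chi2):
    """Upper-tail P value for a Chi² statistic; valid only when df = 1."""
    return math.erfc(math.sqrt(chi2 / 2))

# Values quoted for the pooled re-operation data (Cai 2012; Hoellen 1997)
print(round(i_squared(1.82, 1)))
print(round(p_value_df1(1.82), 2))
```

With Chi² = 1.82 and df = 1, this gives an I² of about 45% and a P value of about 0.18, in line with the figures reported above.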

Hemiarthroplasty versus open reduction and locking plate fixation:

Cai 2012, which compared hemiarthroplasty with open reduction and PHILOS plate fixation in 32 participants with four‐part fractures, reported outcome at 4, 12 and 24 months. Although DASH scores at one and two years favoured the hemiarthroplasty group, the mean differences were smaller than the MCID of 13 for DASH (at 12 months: MD ‐7.30, 95% CI ‐16.70 to 2.10, 28 participants; at 24 months: MD ‐6.10, 95% CI ‐11.03 to ‐1.17, 27 participants; see Analysis 7.1). Although favouring the hemiarthroplasty group, the differences between the two groups in quality of life measured via the EQ‐5D were not clinically or statistically significant at any of the three follow‐up times (see Analysis 7.2). Re‐operations were reported for three participants in the hemiarthroplasty group (one dislocation, one prosthesis loosening, one infection) and three participants in the fixation group (one non‐union, two fixation failures) (RR 0.68, 95% CI 0.16 to 2.88; see Analysis 7.3). One person in the hemiarthroplasty group had died by two years (see Analysis 7.4). The Constant scores were higher in the hemiarthroplasty group at all three follow‐ups; in particular, the 95% confidence interval at two years included a clinically important effect (MD 12.20, 95% CI 2.85 to 21.55; 27 participants; see Analysis 7.6). While the results at two years for pain and range of motion favoured hemiarthroplasty, Cai 2012 found no statistically significant between‐group differences in either of these outcomes (see Analysis 7.7 and Analysis 7.9). The mean time of surgery was slightly longer in the hemiarthroplasty group (93 minutes versus 86 minutes).

Hemiarthroplasty versus tension band wiring

Hoellen 1997 compared hemiarthroplasty versus reduction and stabilisation of the fracture using tension band wiring. All 30 patients reported in Hoellen 1997 had four‐part fractures. Patients with three‐part fractures were also eligible according to a later report of the trial (Holbein 1999), which reported on 39 patients. However, until we obtain further information from the trialists, we will continue to report the results from Hoellen 1997. In Hoellen 1997, results for only 18 of the 30 trial participants were available at one year. There were no serious peri‐operative or post‐operative complications such as pulmonary embolism. No participants of the replacement group required further surgery compared with five participants of the osteosynthesis group (the wires displaced in four participants and the fracture completely dislocated in one participant): RR 0.09, 95% CI 0.01 to 1.51 (see Analysis 7.3). Implants were removed in four participants of the osteosynthesis group (see Analysis 7.5). The mean Constant scores (minus the power component) for the 18 people available at one year follow‐up were similar in the two groups (48 versus 49 points out of a maximum of 75). Two participants of the hemiarthroplasty group and one in the fixation group reported pain at one year (see Analysis 7.8). Though we have not obtained clarification on the inadequately reported results presented in Holbein 1999, these did not appear to differ in a major way from those in Hoellen 1997.

Reverse shoulder arthroplasty versus hemiarthroplasty

Sebastiá‐Forcada 2014 compared reverse shoulder arthroplasty with hemiarthroplasty in 62 participants with either three‐ or four‐part fractures, some of which included dislocation. Follow‐up was between 24 and 49 months. Patient‐reported upper‐limb function assessed using the Quick DASH (0 to 55: worst outcome) was superior in the reverse arthroplasty group: MD ‐6.90, 95% CI ‐10.81 to ‐2.99 (see Analysis 8.1). One participant in the reverse arthroplasty group was re‐operated because of deep infection compared with six participants in the hemiarthroplasty group re‐operated because of proximal migration of implant (1/31 versus 6/31; RR 0.17 favouring reverse arthroplasty, 95% CI 0.02 to 1.30; see Analysis 8.2). All seven participants received a reverse shoulder arthroplasty. No deaths occurred in this trial. University of California‐Los Angeles scores and Constant and adjusted Constant scores all favoured the reverse arthroplasty group (see Analysis 8.4). A similar finding applied to pain, range of motion, power and activities of daily living components of the Constant score (see Analysis 8.5). Fewer participants had a complication in the reverse arthroplasty group compared with the hemiarthroplasty group (2/31 versus 10/30; RR 0.19, 95% CI 0.05 to 0.81; see Analysis 8.6 footnotes for actions taken to treat the individual complications). The findings of radiological assessment (see Analysis 8.7) did not confirm a difference between the two groups in malunion or resorption of tuberosities. The one case of scapular notching in the reverse arthroplasty group was without clinical consequence, as were the 11 cases of heterotopic ossification. Anterior forward and abduction were superior in the reverse arthroplasty group (Analysis 8.8).

Comparisons of different methods of performing an intervention in the same category

Seven trials compared different types or methods in the same intervention category (e.g. plating) (Buecking 2014; Fialka 2008; Lopiz 2014; Ockert 2010; Soliman 2013; Voigt 2011; Zhang 2011).

Deltoid‐split approach versus deltopectoral approach for non‐contact bridging plate fixation

Buecking 2014, which made this comparison in 120 people with Neer two‐, three‐ or four‐part fractures, reported results for activity of daily living at 6 and 12 months based on a score by Lawton (Lawton 1969). However, the trialists appear not to have used the scoring system correctly and reported scores that are greater than the maximum score of 8. There was no statistically significant difference between the mean scores at six months or 12 months (18 for the deltoid‐split versus 17 for the deltopectoral) but the clinical relevance of these scores is questionable. Similar numbers of participants in the two groups had a re‐operation for a complication or a fall (9/60 versus 8/60; RR 1.13, 95% CI 0.47 to 2.72; see Analysis 9.1); the same observation applies to the numbers of participants requesting plate removal (see Analysis 9.1). By one year follow‐up, one person had died in the deltoid‐split group versus three in the deltopectoral group (see Analysis 9.2). Analysis 9.3 presents the data for the complications reported for this trial, all of which resulted in a re‐operation. The Constant scores favoured the deltoid‐split group, but the mean differences were smaller than the MCID for the Constant score and the confidence intervals crossed the line of no effect (see Analysis 9.4). A similar finding applied to the pain VAS results (see Analysis 9.5). There were no significant between‐group differences in duration of operation or fluoroscopy time (see Analysis 9.6). The mean length of stay in hospital was 10 days in both groups (see Analysis 9.7).

Polyaxial versus monoaxial locking plate fixation

Two trials made this comparison (Ockert 2010; Voigt 2011). Ockert 2010, which reported on outcome for patients (66 patients in their 2010 publication; 124 patients in their later publication (Ockert 2014)) with Neer two‐, three‐ or four‐part fractures, did not report on functional outcome. Voigt 2011 found no statistically significant differences at one year (48 patients with Neer three‐ or four‐part fractures) between the two groups in their DASH scores (see Analysis 10.1: MD 2.10, 95% CI ‐6.24 to 10.44), nor at 3, 6 or 12 months in the 'simple shoulder test' (see Analysis 10.2).

Since the extended trial report of Ockert 2010 (Ockert 2014) reported only on re‐operation at 12 months, the data from the more detailed report of re‐operations and complications occurring up to six months from the 2010 publication are also presented in the following. Neither trial found statistically significant differences between the two groups in participants having a re‐operation, either at six months (data from Ockert 2010: 2/29 versus 3/37) or at one year (see Analysis 10.3: 15/83 versus 16/97; RR 1.10, 95% CI 0.58 to 2.08). In the initial six months follow‐up report for Ockert 2010, one participant of the polyaxial group had a loosened screw taken out at 10 weeks; one participant of each group had early hardware removal (at five months) because of subacromial impingement from poor plate positioning; and two monoaxial group participants had early hardware removal and a revision respectively because of intra‐articular screw protrusion. In the recruitment and follow‐up extension of Ockert 2010 (Ockert 2014), five polyaxial group versus nine monoaxial group participants had revision because of secondary varus displacement with subsequent intra‐articular screw protrusion; four versus two participants had revision because of subacromial impingement; and one monoaxial group participant had revision surgery because of an infection. In Voigt 2011, one person in each group had an early "prosthetic replacement" and three participants in the polyaxial group and one in the monoaxial group had refixation. The two other re‐operated polyaxial group participants of Voigt 2011 had a corrective osteotomy and a screw removal respectively, while two other re‐operated monoaxial group participants both had early implant removals.

Similarly, neither trial found statistically significant differences between the two groups in their other reported outcome measures. The available data are shown for death (see Analysis 10.4), participants with any or individual complications (see Analysis 10.6), the Constant score relative to the uninjured arm (see Analysis 10.5), range of motion (see Analysis 10.7) and duration of operation or fluoroscopy time (see Analysis 10.8).

Locking plate: use of medial support locking screws

Zhang 2011 tested the use of the medial support locking screws in 72 people with Neer two‐, three‐ or four‐part fractures treated with open reduction with internal fixation using the PHILOS locking plate. They reported results for 68 participants. In the medial support group, locking screws were introduced through the plate so as to run up the inferior portion of the humeral neck providing support to the calcar. In the control group, these screw holes were left empty. One participant in the medial screw group had early failure of fixation due to plate breakage compared with nine participants with early fixation failure (six varus collapse; three screw penetration) in the control group; however, this difference did not reach statistical significance (see Analysis 11.1: RR 0.15, 95% CI 0.02 to 1.11). Seven of these patients, including the patient in the medial screw group, consented to have a re‐operation (RR 0.22; 95% CI 0.03 to 1.11). One patient in the medial screw group had asymptomatic osteonecrosis. The medial screw group had statistically significantly higher Constant scores (0 to 100: best score) at 31 month follow‐up (see Analysis 11.2: MD 9.00, 95% CI 2.41 to 15.59).

MultiLoc proximal humeral nail (MPHN) ‐ a straight nail ‐ versus Polarus humeral nail ‐ a curved nail

Lopiz 2014 compared these two types of intramedullary nails in 54 people with Neer two‐ or three‐part fractures, reporting results at a mean of 14 months (range 6 to 22 months). Of the two excluded participants, who were both in the MPHN group, one had died and one was lost to follow‐up. Patient‐reported outcome measures were not reported in this trial. Adverse events including re‐operations are presented in Analysis 12.1. Significantly fewer participants in the MPHN group had a re‐operation (3/26 versus 11/26; RR 0.27 favouring MPHN, 95% CI 0.09 to 0.87; P = 0.03). All re‐operations involved hardware removal of either a loose screw (one versus seven) or the whole nail (two versus four). One participant of the Polarus group had a non‐union; subsequent to nail removal, this patient had a reverse shoulder arthroplasty. Fewer participants in the MPHN group had rotator cuff symptoms (9/26 versus 19/26; RR 0.47, 95% CI 0.27 to 0.84) or shoulder impingement (2/26 versus 5/26; RR 0.40, 95% CI 0.09 to 1.88). Both the unadjusted and age‐ and sex‐adjusted Constant scores were higher in the MPHN group (e.g. adjusted Constant score: MD 10.60, 95% CI 1.71 to 19.49; see Analysis 12.2). Although the MDs were a little smaller than the MCID (11.2) for the Constant score, the 95% confidence intervals included a clinically relevant difference in favour of the MPHN. There were no significant between‐group differences in range of shoulder motion (see Analysis 12.3), length of surgery or length of hospital stay (see Analysis 12.4).
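The reasoning used here to judge clinical importance, comparing a mean difference and its confidence interval with the MCID, can be laid out explicitly. The following is a minimal sketch only; the function `compare_with_mcid` and its output labels are hypothetical and are not taken from the review's methods.

```python
def compare_with_mcid(md, ci_low, ci_high, mcid):
    """Summarise a mean difference and its 95% CI against an MCID.

    Hypothetical helper: magnitudes are compared, so it applies whether
    a benefit is expressed as a positive or a negative difference.
    """
    return {
        # CI excludes zero (no difference)
        "statistically_significant": ci_low > 0 or ci_high < 0,
        # Point estimate is at least as large as the MCID
        "point_estimate_reaches_mcid": abs(md) >= mcid,
        # CI extends to a difference at least as large as the MCID
        "ci_reaches_mcid": max(abs(ci_low), abs(ci_high)) >= mcid,
    }

# Adjusted Constant score in Lopiz 2014, with the MCID of 11.2 quoted above
result = compare_with_mcid(md=10.60, ci_low=1.71, ci_high=19.49, mcid=11.2)
print(result)
```

For the Lopiz 2014 figures, the difference is statistically significant, the point estimate falls just short of the MCID, and the confidence interval extends beyond it, which matches the interpretation given in the text.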

Hemiarthroplasty: comparison of two types

Fialka 2008 compared two types of hemiarthroplasty, the EPOCA prosthesis versus the HAS prosthesis, which differ in a number of ways including the method of fixation of the tuberosities. Fialka 2008 reported results at one year for 35 of the 40 trial participants. The treatment allocations of three participants who had died and the two who were lost to follow‐up were not reported. Significantly better functional results, including range of motion, at one year were reported for the EPOCA prosthesis group. The relative (compared with the patient's uninjured shoulder) individual Constant score results were 70.4% (range 38% to 102%) for the EPOCA group versus 46.2% (range 15% to 80%) for the HAS group (reported P = 0.001). Reported complications were two patients with deep infection in the EPOCA group, two patients with persistent pain scheduled for a reoperation in the HAS group (see Analysis 13.1), and a periprosthetic fracture that occurred in one of the three patients who had died by one year. Radiological findings, except for heterotopic ossification where there were contradictory data, are shown in Analysis 13.2. These tended to favour the EPOCA prosthesis. Fialka 2008 noted some association between the bony resorption of the tuberosities and a decreased Constant outcome score. Results for range of motion are shown in Analysis 13.3.

Hemiarthroplasty: tenodesis of the long head of the biceps (LHB) versus LHB tendon left intact

Soliman 2013 compared tenodesis of the LHB versus leaving the LHB tendon intact in 45 people undergoing hemiarthroplasty. By deduction from the study report, four participants in each group were excluded because they had a complication within three months of follow‐up. These were reported to be tuberosity malposition (three participants), inferior subluxation of the prosthesis (two participants), loss of reduction of the greater tuberosity (two participants) and deep infection that required surgical debridement (one participant). Data for complications split by treatment group are shown in Analysis 14.1. Of these complications, only deep infection resulted in further surgery. At two years, the difference between the two groups in the Constant scores in favour of the tenodesis group was below the MCID and thus unlikely to be clinically important (MD 4.60, 95% CI 0.38 to 8.82; see Analysis 14.2). Three participants reported mild pain in the tenodesis group and six participants reported pain (four mild and two moderate pain) in the tendon intact group (3/19 versus 6/18; RR 0.47, 95% CI 0.14 to 1.62; see Analysis 14.3). Both participants with moderate pain went on to have a mini‐open biceps tenodesis at 18 and 28 months after diagnosis of an inflamed and scarred biceps tendon. There was no difference between the two groups in active shoulder elevation results at two years (see Analysis 14.4).

Continuing management (including rehabilitation) after surgical treatment

Wirbel 1999 tested the duration of immobilisation (one week versus three weeks) before starting physiotherapy after closed reduction and percutaneous fixation of displaced fractures in 77 patients. Wirbel 1999 reported that there were no statistically significant differences between the two trial groups in their functional results, assessed using the Neer score, at 3, 6 or at an average of 14.2 months. Data provided for unsatisfactory or worse outcome, as defined by the Neer score, at six months are consistent with this claim (see Analysis 15.1: 9/32 versus 10/32; RR 0.90, 95% CI 0.42 to 1.92; 64 participants). Premature removal of Kirschner wires because of loosening occurred in five people in each group (see Analysis 15.2); these results, however, were not provided for the whole study population, nor were the treatment groups reported for the five people who underwent open revision or hemiarthroplasty. Though similar numbers (three versus two) of people underwent removal of screws due to subacromial impingement after six months, the numbers of people in each group whose displaced tuberosity fractures were fixed with cannulated screws were not reported. Of the 21 participants followed up for more than two years, one developed partial necrosis of the humeral head but was symptom‐free and had a full range of motion of his affected shoulder.

Agorastides 2007 reported the findings of early active‐assisted mobilisation (after two weeks) versus late mobilisation (after six weeks) after cemented hemiarthroplasty in 49 of the 59 participants recruited in their trial. At one year follow‐up, there were no significant differences between the two groups in function as rated by the Oxford shoulder score (see Analysis 16.1; MD ‐6.0, 95% CI ‐16.53 to 4.53; scale was 0 to 100) or the overall Constant score (see Analysis 16.2). Two non‐unions occurred in the early group but none of the differences in radiologically‐assessed outcomes between the two groups was statistically significant (see Analysis 16.3). The differences between the two groups at one year in elevation and external rotation were neither statistically nor clinically significant (see Analysis 16.4).

Discussion

Summary of main results

This review, which covers all non‐pharmacological treatment and rehabilitation interventions for proximal humeral fractures in adults, now includes 31 trials involving a total of 1941 participants. The only multicentre trial recruited 250 participants (ProFHER 2015). With the increased availability of trials, we have been able to undertake further pooling of data compared with the last version, but this is still limited to four comparisons. We have undertaken substantive pooling in only one comparison, that of surgical versus non‐surgical treatment, including patient‐reported outcome measures of function and quality of life. This is presented first, below. The main results of the comparisons falling within the three other main treatment categories are then presented in turn. Where data allow, we have given the main results of individual comparisons in terms of the listed Primary outcomes.

Surgical treatment versus non‐surgical treatment

Eight heterogeneous trials, with a total of 567 participants and 568 predominantly displaced fractures evaluated surgical intervention for displaced fractures, of which 73% (415) were three‐ or four‐part fractures (Neer classification). Of note is that the majority of the fractures (146/250 = 58.4%) in ProFHER 2015 were either two‐part (128) or one‐part (18) fractures; the other seven two‐part fractures were included in Kristiansen 1988. Table 1 summarises the main fracture types, the interventions and length of follow‐up of the individual trials. Six trials specifically limited their trial populations to older people. Although ProFHER 2015 recruited adults of any age, the majority of trial participants were over 65 years (142/250 = 57%). Data for patient‐reported functional scores and quality‐of‐life scores were available from the five more recent trials that are thus more likely to represent current practice. The main results of this comparison are presented in summary of findings Table for the main comparison. The results apply to the majority of displaced proximal humeral fractures involving the surgical neck, but note should be taken of clear exceptions, such as where surgery is required for severe soft‐tissue compromise, as well as the exclusion of fracture‐dislocations, in ProFHER 2015. There was high quality evidence of no clinically important difference in patient‐reported shoulder and upper‐limb function at one‐ or two‐year follow‐up between surgical (primarily locking plate fixation or hemiarthroplasty) and non‐surgical treatment (sling 'immobilisation') for the majority of displaced proximal humeral fractures. There was moderate quality evidence of no clinically important difference between the two groups in quality of life at two years. While this observation applied to interim follow‐ups at six and 12 months, pooled data from four studies at six months showed a statistically significant effect. 
There was moderate quality evidence of little difference between the groups in mortality: although there were slightly more deaths in the surgery group, the 95% confidence interval also included the potential for a higher mortality after non‐surgical treatment. Also of note is that, where reported, only one death was explicitly linked with treatment (surgery). There was moderate quality evidence of a higher risk of additional surgery in the surgery group: based on an illustrative risk of 40 subsequent operations per 1000 non‐surgically treated patients, this amounts to an extra 43 subsequent operations per 1000 surgically treated patients (95% CI 7 to 104 more). There was also moderate quality evidence of a higher overall risk of adverse events after surgery; however, the 95% confidence intervals for adverse events also included the potential for a greater risk of adverse events after non‐surgical treatment.
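The per‐1000 figures quoted above for additional surgery can be unpacked with a little arithmetic. The following is a minimal sketch only: the control risk (40 per 1000) and the extra operations (43, with a 95% CI of 7 to 104 more) are taken from the text, whereas the function name and the risk ratios it derives are illustrative assumptions. The ratios are back‐calculated from rounded figures and are not values reported in this section.

```python
def implied_risk_ratio(control_per_1000: float, difference_per_1000: float) -> float:
    """Risk ratio implied by a control risk and an absolute difference (both per 1000).

    ASSUMPTION: hypothetical helper for illustration; it simply inverts
    difference = control * (risk_ratio - 1).
    """
    if control_per_1000 <= 0:
        raise ValueError("control risk must be positive")
    return (control_per_1000 + difference_per_1000) / control_per_1000


# Figures quoted in the text for subsequent operations.
CONTROL = 40  # illustrative risk per 1000 non-surgically treated patients
EXTRA = {"estimate": 43, "ci_lower": 7, "ci_upper": 104}  # extra operations per 1000

# Corresponding risk per 1000 surgically treated patients.
surgical_risk = {name: CONTROL + extra for name, extra in EXTRA.items()}

# Back-calculated (not reported) risk ratios, subject to rounding in the source figures.
ratios = {name: round(implied_risk_ratio(CONTROL, extra), 3) for name, extra in EXTRA.items()}

print(surgical_risk)
print(ratios)
```

On these figures, the corresponding risk after surgery is 83 subsequent operations per 1000 patients (47 to 144), which is roughly double the illustrative non‐surgical risk.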

Methods of non‐surgical management (including rehabilitation)

Non‐surgical management, generally involving a period of arm immobilisation followed by physiotherapy, of (mainly) minimally displaced fractures is the focus of nine trials. Exceptionally, Torrens 2012 included a higher percentage of displaced fractures (81% = 34/42 fractures). There was a general recognition of the impaired function and serious complications, such as complex regional pain syndromes, that could follow a proximal humeral fracture.

Initial treatment, including immobilisation

When considering the extent and duration of initial immobilisation after a fracture, a balance is needed between the advantages of pain relief and avoidance of fracture displacement, and the consequences of immobilisation, notably joint stiffness and muscle atrophy.

Early versus delayed mobilisation

Of the four heterogeneous trials comparing early versus delayed mobilisation for minimally displaced or displaced fractures (Hodgson 2003; Kristiansen 1989; Lefevre‐Colau 2007; Torrens 2012), only limited data, mainly for secondary outcomes, could be pooled from Lefevre‐Colau 2007 and Torrens 2012.

Summary of findings Table 2 summarises the data relating to primary outcome measures for early versus delayed mobilisation in non‐surgically treated fractures. With the exception of adverse event data provided by all four trials, most of these data are from Hodgson 2003. There was low quality evidence in favour of early mobilisation in terms of fewer people with shoulder problems at one year, of the need for fewer sessions of physiotherapy to achieve independent function, and of a better quality of life at 16 weeks in terms of less pain and less limitation of physical function. There was low quality evidence of no difference between early and delayed mobilisation in physical and pain aspects of quality of life at one year. There was very low quality evidence of no clinically important between‐group differences in quality‐of‐life scores for people with mainly displaced fractures. There was very low quality evidence of little difference between the two groups in shoulder complications and fracture displacement and non‐union; the incidences of individual complications were low.

Type of bandage

The one quasi‐randomised trial (28 participants with mainly minimally displaced fractures) provided very low quality evidence on the relative effects of two types of bandages, the Gilchrist arm sling versus the Desault body bandage (Rommens 1993). There was no report of PROMS nor data to support the claims of no between‐group differences in functional outcome or fracture healing. More participants found the arm sling comfortable and acceptable compared with the body bandage.

Continuing management (rehabilitation) after initial treatment involving sling immobilisation

Instructions for home exercises versus physiotherapy

Two small trials including a total of 62 participants with minimally displaced fractures compared home exercises after receiving instructions versus supervised physiotherapy (Bertoft 1984; Lundberg 1979). Neither trial reported on PROMS for function or quality of life. There was very low quality evidence from single trials of little difference between the two groups in pain, change of therapy, adverse events, and range of motion.

Supervised exercises in a swimming pool plus home exercises versus home exercises alone

The trial making this comparison in 48 participants with minimally displaced fractures did not provide evidence that could be presented or tested in the analyses (Revay 1992). Revay 1992 claimed that the self‐treatment group had better activities of daily living and joint mobility in the first two to three months but that the two groups had similar results at one year. Revay 1992 suggested that the supervised group had neglected their home exercises, which effectively undermines the aim of this trial.

Pulsed electromagnetic high frequency energy (PHFE)

Livesley 1992 hypothesised that pain was associated with contracture of the capsule of the glenohumeral joint and that PHFE would reduce inflammation and swelling, improving the end functional result. However, the trial (48 participants with minimally displaced fractures) failed to provide any quantitative data to support or refute this hypothesis.

Different methods of surgical management

Comparisons of different categories of surgical intervention

Five trials compared different methods of surgical management (Cai 2012; Hoellen 1997; Sebastiá‐Forcada 2014; Smejkal 2011; Zhu 2011).

Open reduction with internal fixation using a locking plate versus a locking nail

There is very low quality evidence from one trial (Zhu 2011: 57 participants with two‐part surgical neck fractures) of marginally better function (higher American Shoulder and Elbow Surgeons scores) and slightly less pain after locking plate fixation compared with locking nail fixation at one year but not at three years. There was very low quality evidence of a higher rate of complications, including re‐operation for screw penetration into the humeral head, after plate fixation.

Open reduction with internal fixation using a locking plate versus minimally invasive fixation with distally inserted intramedullary K‐wires

There is very low quality evidence from one trial (Smejkal 2011: 61 participants with two‐ or three‐part fractures) of no difference between these two interventions for numbers of participants incurring a complication or in the Constant scores at two years follow‐up.

Hemiarthroplasty versus internal fixation

With minimal opportunity for pooling, data from two small heterogeneous trials testing this comparison were presented separately.

Hemiarthroplasty versus open reduction and locking plate fixation

The very low quality evidence from one trial (Cai 2012: 32 participants with four‐part fractures) of lower DASH scores (better function) and slightly higher EQ‐5D scores (better quality of life) at one and two years may not equate to clinically important differences in either of these outcomes between hemiarthroplasty and locking plate fixation. Three participants in each group had a re‐operation.

Hemiarthroplasty versus tension band wiring

There is very low quality evidence from one trial (Hoellen 1997: 30 participants with four‐part fractures) of no differences between the two groups in the Constant scores or pain (18 participants). At one‐year follow‐up, all five reoperations occurred in the fixation group.

Reverse shoulder arthroplasty versus hemiarthroplasty

There is low quality evidence from one trial (Sebastiá‐Forcada 2014: 62 participants with either three‐ or four‐part fractures) of better patient‐rated (QuickDASH) and composite shoulder function scores (UCLA and Constant scores) at a minimum of two years follow‐up in the reverse shoulder arthroplasty (RSA) group. Although a condition‐specific minimal clinically important difference is not available for QuickDASH, it is likely that the difference would have been clinically important to some extent. The clinically important differences favouring RSA in the Constant and UCLA scores will in part reflect the greater range of motion in the RSA group. Fewer people in the reverse arthroplasty group had a re‐operation (one versus six) or had a complication (two versus 10).

Comparisons of different methods of performing an intervention in the same category

Seven trials compared different types or methods in the same intervention category (e.g. plating) (Buecking 2014; Fialka 2008; Lopiz 2014; Ockert 2010; Soliman 2013; Voigt 2011; Zhang 2011).

Deltoid‐split approach versus deltopectoral approach for non‐contact bridging plate fixation

There is very low quality evidence from one trial (Buecking 2014: 120 participants with two‐, three‐ or four‐part fractures) of no differences between groups in activities in daily living, re‐operations or complications, Constant scores or pain at one year.

Polyaxial versus monoaxial locking plate fixation

Although two trials (Ockert 2010 and Voigt 2011: 180 participants with two‐, three‐ or four‐part fractures) made this comparison, most of the data were from Voigt 2011 (48 participants for function) and only data for re‐operation were pooled. There was very low quality evidence of no between‐group differences in function (DASH and simple shoulder test scores), re‐operations and complications.

Locking plate: use of medial locking screws

There is very low quality evidence from one trial (Zhang 2011: 68 participants with two‐, three‐ or four‐part fractures) of medial locking screws resulting in fewer early losses of fixation and re‐operations. However, the 95% CI results also included a higher risk of re‐operations in the medial locking screws group. Based on the control risk of 154 re‐operations per 1000 participants, medial locking screws resulted in 120 fewer re‐operations (95% CI 149 fewer to 117 more). Although the medial screw group had statistically significantly higher Constant scores at 31‐month follow‐up, only part of the 95% CI included the minimal clinically important difference.

Nails: comparison of two types

There is low quality evidence from one trial (Lopiz 2014: 54 participants with two‐ or three‐part fractures) of fewer adverse events, including re‐operations and impingement, for the MPHN nail compared with the Polarus nail. Based on the control (Polarus nail) group risk of 423 re‐operations per 1000 participants, the MPHN resulted in 308 fewer re‐operations (95% CI 55 to 385 fewer). Of note is the very low quality evidence that half as many participants in the MPHN group had rotator cuff symptoms. The MPHN group had higher Constant scores (very low quality evidence), which the authors linked with the lower incidence of rotator cuff symptoms in this group.
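The two absolute effects quoted above for re‐operations (Zhang 2011 and Lopiz 2014) differ in whether their 95% confidence intervals are compatible with no difference. The sketch below illustrates this under stated assumptions: the class and method names are hypothetical, "fewer" is encoded as a negative number and "more" as a positive number, and only the per‐1000 figures quoted in the text are used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AbsoluteEffect:
    """Absolute effect per 1000 participants (hypothetical container for illustration)."""

    label: str
    control_per_1000: int
    difference: int  # negative = fewer re-operations with the intervention
    ci_low: int      # lower bound of the 95% CI for the difference
    ci_high: int     # upper bound of the 95% CI for the difference

    def intervention_per_1000(self) -> int:
        """Corresponding risk in the intervention group."""
        return self.control_per_1000 + self.difference

    def ci_spans_no_difference(self) -> bool:
        """True when the interval includes zero, i.e. both fewer and more events."""
        return self.ci_low <= 0 <= self.ci_high


# Figures as quoted in the text.
EFFECTS = [
    AbsoluteEffect("Zhang 2011: medial locking screws", 154, -120, -149, 117),
    AbsoluteEffect("Lopiz 2014: MPHN versus Polarus nail", 423, -308, -385, -55),
]

for effect in EFFECTS:
    print(effect.label, effect.intervention_per_1000(), effect.ci_spans_no_difference())
```

For Zhang 2011 the interval runs from fewer to more re‐operations, which is why a higher risk with medial locking screws cannot be excluded; for Lopiz 2014 both bounds indicate fewer re‐operations with the MPHN.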

Hemiarthroplasty: comparison of two types

There was very low quality evidence from one trial (Fialka 2008: 35 out of 40 people with four‐part fractures) for better function (Constant scores and range of motion) at one year for the EPOCA prosthesis when compared with the HAS prosthesis. Two participants in each group had a serious complication or pain requiring further treatment.

Hemiarthroplasty: tenodesis of the long head of the biceps (LHB) versus LHB tendon left intact

There was very low quality evidence from one trial (Soliman 2013: 45 people with four‐part fractures undergoing hemiarthroplasty) of no between‐group differences in complications at three months follow‐up, in function (Constant score), in numbers of participants with shoulder pain or range of motion.

Continuing management (including rehabilitation) after surgical intervention

The need for and duration of immobilisation before commencing physiotherapy after surgery for displaced fractures was tested in two small heterogeneous trials for fixation and hemiarthroplasty respectively. There was very low quality and incomplete evidence from one trial (Wirbel 1999: 64 participants (of the 77 recruited)) of no difference between one week versus three weeks immobilisation after percutaneous fixation in the numbers of participants with an unsatisfactory or worse outcome based on the Neer outcome score at six months or incurring premature removal of K‐wires because of failure (five in each group). There was very low quality evidence from one trial (Agorastides 2007: 49 participants (of the 59 recruited)) of no difference between participants mobilised after two weeks (which was current practice) after hemiarthroplasty versus those mobilised after six weeks in function (Oxford shoulder score or Constant score), radiological outcomes and range of motion at one year.

Overall completeness and applicability of evidence

To inform consideration of the applicability of the evidence from individual trials, we give quite extensive details of the study populations and interventions in the Characteristics of included studies. Additionally, Table 2 shows our assessments for each trial of four aspects relevant to ascertaining external validity: definition of the study population, description of the interventions, definition of primary outcome measures and length of follow‐up. Incomplete descriptions of study inclusion criteria (10 trials) and of interventions (five trials) are clearly unhelpful. Five trials had less than one year's follow‐up: Lefevre‐Colau 2007 (six months), Livesley 1992 (six months), Ockert 2010 (six months) and Rommens 1993 (until fracture consolidation ‐ time unspecified). Additionally, the minimum follow‐up was six months in Lopiz 2014. Despite the claims of longer follow‐up, the results seemed to apply to six months at most in Stableforth 1984. In Wirbel 1999, though follow‐up of 21 participants was more than two years, the main results applied to the set follow‐up at six months. Our choice of one‐year follow‐up as the criterion for acceptable timing is arbitrary and mainly reflects a reasonable timing for assessment of function. However, in terms of a full outcome assessment, data at one‐year follow‐up must be considered preliminary only, given that complications such as avascular necrosis and device failure may not become evident until later and functional recovery can still be ongoing.

Table 2. Assessment of items relating to applicability of trial findings

| Study | Clearly defined study population? | Interventions sufficiently described? | Main outcomes sufficiently described? | Appropriate timing of outcome measurement? (Yes = ≥ 1 year) |
| --- | --- | --- | --- | --- |
| Agorastides 2007 | Partial: exclusions not specified upfront | Yes | Yes | Yes: 1 year |
| Bertoft 1984 | Partial: no exclusion criteria given (e.g. ability to understand instructions for exercises) | Yes | Yes | Yes: 1 year |
| Boons 2012 | Yes | Yes | Yes | Yes: 1 year |
| Buecking 2014 | Partial: indication for hemiarthroplasty poorly defined (27 excluded before randomisation because "implantation of a prosthesis was planned") | Yes | Yes | Yes: 1 year |
| Cai 2012 | Partial: unclear definition of 4‐part fractures. | Yes: however, time to surgery not reported | Yes | Yes: 2 years |
| Fialka 2008 | Yes | Yes | Yes | Yes: 1 year |
| Fjalestad 2010 | Yes | Yes | Yes | Yes: 2 years |
| Hoellen 1997 | Yes: but some question over fracture type in that the Holbein 1999 report included 3‐part fractures too | Yes | Yes | Yes: 1 year |
| Hodgson 2003 | Yes | Yes | Yes | Yes: 2 years |
| Kristiansen 1988 | Partial: no exclusion criteria given | Partial: incomplete description of timing of sling use and care of external fixator pin sites | Partial: no description of measurement procedures | Yes: 1 year |
| Kristiansen 1989 | Partial: no exclusion criteria given | Partial: although sling and body bandage are common expressions, some variation possible | Partial: no description of measurement procedures | Yes: 2 years |
| Lefevre‐Colau 2007 | Yes | Yes | Yes | Partial: 6 months |
| Livesley 1992 | Yes: although this included 4 patients under 20 years with epiphyseal fractures | Yes | Yes | Partial: 6 months |
| Lopiz 2014 | Partial: insufficient criteria given in terms of suitability for surgery | Yes | Yes | Partial: minimum 6 months |
| Lundberg 1979 | Partial: no exclusion criteria given (e.g. ability to understand instructions for exercises) | Yes | Yes | Yes: 1 year or above (mean: 16 months) |
| Ockert 2010 | Partial: exclusion criteria described in context of post‐randomisation exclusions. | Yes | Yes | Partial: 6 months |
| Olerud 2011a | Yes | Yes | Yes | Yes: 2 years |
| Olerud 2011b | Yes | Yes | Yes | Yes: 2 years |
| ProFHER 2015 | Yes | Yes (in the context of this being a pragmatic trial) | Yes | Yes: 2 years |
| Revay 1992 | Yes | Partial: frequency of swimming sessions not stated | Yes | Yes: 1 year |
| Rommens 1993 | Yes: but to note that other fractures including rib (3 participants) were included | Yes | Partial: functional outcome assessment not described (sufficiently) | No: only until fracture consolidation |
| Sebastiá‐Forcada 2014 | Yes | Yes | Yes | Yes: minimum 2 years |
| Smejkal 2011 | Yes | Partial: Only minimal intra‐operative details given and nothing regarding post‐operative management including rehabilitation | Partial: this may have been ‘lost in translation’ (Czech article) | Yes: mean 2 years but range not stated (probably most/all > 1 year as recruitment had finished January 2010). |
| Soliman 2013 | No: no explanation given for a younger population; insufficient criteria given in terms of suitability for hemiarthroplasty | Yes | Partial: incomplete description of pain categories; no clarification of modification to Constant score | Yes: minimum 21 months follow‐up |
| Stableforth 1984 | Yes | Yes | Partial: no description of measurement procedures, incomplete description of pain categories | Partial: up to 6 months, then between 18 months to 12 years. This is too spread out. Most results applied to the 6‐month follow‐up. |
| Torrens 2012 | Partial: the < 1.5 cm criterion for posterior displacement of the greater tuberosity is unusual and no justification was given by the authors | Partial: incomplete description of accompanying "progressive rehabilitation program" | Partial: incomplete description of measurement procedures | Yes: 1 year |
| Voigt 2011 | Yes | Yes | Yes | Yes: 1 year |
| Wirbel 1999 | Yes | Yes | Partial: no description of measurement procedures | Partial: between 9 and 36 months; < 1 year in 10 participants. Main results applied to 6 months. |
| Zhang 2011 | Yes | Yes | Partial: Insufficient information on measurement of complications and timing of their measurement. | Yes: All over 25 months (mean 30.8 months) |
| Zhu 2011 | Yes | Yes | Yes | Yes: 1 and 3 years |
| Zyto 1997 | Yes | Yes | Yes | Yes: 1 year, and 3 to 5 years |

The measurement of outcome was variable, though generally comprehensive. In most of the older trials, there was frequent use of non‐validated or, at best, partly validated scoring systems such as the Neer (Neer 1970) and Constant (Constant 1987) systems, but also of simple rating systems for individual outcomes. Validated schemes such as the Oxford Shoulder Score (Dawson 1996) and Shoulder Rating Questionnaire (L'Insalata 1997) for subjective assessment of symptoms and function were not available at the time for the trials in earlier versions of this review. Nonetheless, some consideration of interobserver reproducibility and other aspects of validity was evident in the establishment of the Constant score in two trials (Lundberg 1979; Zyto 1997). Non‐validated outcome assessment schemes, often with arbitrary criteria for grading overall outcome (excellent, good, fair, poor), are probably best viewed as 'blunt' and flawed instruments. This needs to be noted when viewing the results of many of the older included trials; in particular Kristiansen 1989, whose outcome assessment is almost completely based on the Neer scoring system. As noted also in our 2012 update, more recent trials continue to be better in this respect. Four of the eight newly included trials reported on PROMS for function: for example, Boons 2012 reported on the Simple Shoulder test; Cai 2012 on DASH; ProFHER 2015, the Oxford Shoulder Score; and Sebastiá‐Forcada 2014, the QuickDASH. The continued use of the Constant score is notable, being reported by the newly included trials with the exception of ProFHER 2015, which did not conduct additional clinical examinations for the collection of such data.

The majority of the trials used Neer's fracture classification (Neer 1970). Problems, such as poor interobserver reproducibility and intraobserver reliability, with the classification of fractures according to the Neer and AO systems have been shown for both radiographs and computerised tomographic scans (Bernstein 1996; Brorson 2008; Sidor 1993; Siebenrock 1993; Sjoden 1997). This variation in the classification of fractures and hence diagnosis needs to be considered when interpreting the results of trials, both in respect to the comparability and composition of the intervention groups and in the applicability of the trial's findings. The limitations of the Neer classification scheme were also demonstrated by the identification of the valgus impacted four‐part fracture as a separate category with a lower risk of avascular necrosis (Jakob 1991). Ideally a fracture classification system should act as a guide to treatment as well as enable the comparison of results from studies of patients with similar fracture patterns. However, other factors, such as osteoporotic bone, associated soft tissue injury and the patient's overall health and motivation, will also influence treatment choices and outcome. A recent study (Brorson 2012) looking at the agreement of surgeons' treatment recommendations in conjunction with the Neer classification concluded that the low observer agreement on the Neer classification may have less clinical importance than previously assumed. However, it noted that inter‐observer agreement on treatment did not exceed moderate levels. The purposefully pragmatic inclusion criteria used in ProFHER 2015 are noteworthy in this regard. These stipulated that the degree of displacement had to be sufficient for the treating surgeon to consider surgical intervention but did not have to meet the displacement criteria of Neer for inclusion in the trial.
Post‐recruitment classification of the baseline X‐rays by two independent surgeons resulted in the identification of 18 one‐part fractures (see Table 1). Nonetheless, these exceptions were judged sufficiently displaced that they would have been considered for surgical intervention in practice, where the exact observation of Neer's arbitrary criteria is rare.

While it is possible that all 31 trials are relevant to current practice somewhere in the world, it is likely that some interventions are now rarely used. These include body bandages as tested in Rommens 1993. Nowadays it is much more common practice to use either a 'collar and cuff' sling or a 'poly‐sling' (these incorporate a chest strap that can be passed around the body). Additionally, the applicability of the findings from older trials, such as Stableforth 1984, is potentially less given subsequent changes in practice including the availability of new implants. These include locking plates, which are being increasingly used and promoted for these fractures (Thanasas 2009). Previously, we noted that the increasing use of locking plates for these fractures was reflected in the use of locking plates in more recently included trials (Handoll 2012). Another more recent development has been the use of reverse shoulder arthroplasty (RSA), typically for more complex four‐part fractures in older people. This was tested in a newly included trial (Sebastiá‐Forcada 2014), with evidence pending from three ongoing trials comparing RSA with hemiarthroplasty (NTR3208; NCT02075476; SHeRPA), one ongoing trial comparing RSA with plating (DELPHI) and one ongoing trial comparing two types of RSA (NCT01086202).

Comments on individual comparisons

Surgical treatment versus non‐surgical treatment

In our previous commentary for this comparison we noted that "Trials comparing surgical versus non‐surgical interventions, or indeed different surgical interventions, risk losing currency as different implants and methods become available and fashionable." (Handoll 2012). We also noted the impact of surgical decision‐making in favour of locking plating systems, which allow for stronger constructs and fixation of more complex fracture patterns in osteopenic bone with the potential for less soft‐tissue stripping and compromise to the blood supply (Thanasas 2009). More recently for more complex (predominantly four‐part) fractures, reverse shoulder arthroplasty is being promoted, as illustrated by its increasing use, for instance in the USA (Han 2015; Schairer 2015). These illustrate how evolving technology (and marketing forces) militates against applying the findings of these types of trials. However, more emphasis can be given to the evidence from the five more recent trials (Boons 2012; Fjalestad 2010; Olerud 2011a; Olerud 2011b; ProFHER 2015), all of which report patient‐reported outcome measures of function and quality of life. When considering the validity and applicability of surgical trials, account needs to be taken also of fundamental variations in surgical practice, including facilities and operator expertise. In particular, operator expertise and the linked issue of the surgical learning curve play a pivotal role in the validity and applicability of surgical trial findings. Awareness of these issues was behind the pragmatic decision in ProFHER 2015 for surgeons to use methods with which they are familiar rather than stipulate the type of surgery.
Indeed, the pragmatic multicentre design of ProFHER 2015, including the constant emphasis on good standard practice and surgery by experienced surgeons (predominantly consultants), means that its results have immediate applicability at least in the setting where it was conducted (UK NHS trauma hospitals) and most likely in many other countries with similar surgical practice.

Because of the dominance of the evidence from ProFHER 2015, particular note should be taken of its exclusion criteria (such as of fracture dislocations and two‐part greater tuberosity fractures and other patterns not involving the surgical neck) and its study population, the composition of which shows the treatment uncertainty covered by this trial applied to the majority of displaced fractures of the proximal humerus. Additionally the lack of subgroup differences in ProFHER 2015, either for age (threshold of 65 years) or fracture type (tuberosity involvement or not; or Neer one‐ or two‐part versus three‐ or four‐part) strengthens the case for not differentiating treatment (use of surgery) on the basis of these characteristics. Nonetheless, in this trial and the other seven trials, six of which purposefully excluded younger adults, the evidence is predominantly from older people. This reflects the population distribution for these fractures (Karl 2015) but also the population for which the main treatment uncertainty applies.

Initial treatment, including immobilisation

Most of the evidence for the comparison of early versus delayed mobilisation came from Hodgson 2003 and thus applies primarily to the less severe fractures (minimally displaced two‐part fractures). A survey sent to senior hospital physiotherapists working directly with orthopaedic patients revealed large variation in rehabilitation, in particular with regard to routine immobilisation, duration of immobilisation and timing of first contact with a physiotherapist, within and between hospitals in the UK (Hodgson 2003a; Hodgson 2006). A survey sent to the participating centres of ProFHER 2015, which included displaced fractures, found the recommended duration of arm immobilisation for non‐surgically treated patients ranged from two to six weeks, with 29 (91%) of 32 UK hospitals recommending immobilisation of ≥ 3 weeks (Handoll 2015). This variation also needs to be viewed in the context of the type of arm immobilisation used, as methods such as collar and cuff provide support rather than rigid immobilisation. As noted by McKee 2007 in his commentary on Lefevre‐Colau 2007, the applicability of this trial is limited by the intensive physiotherapy regimen used in both groups. Both practically and financially, the 32 two‐hour sessions of physiotherapy may be difficult for patients and health providers; notably, 10 participants withdrew from the trial because of difficulties in attending. In contrast, the mean numbers of treatment sessions in Hodgson 2003 were nine and 14 respectively in the two groups.

As stated above, the body bandages tested in Rommens 1993, which compared the Gilchrist arm sling with the Desault body bandage, are rarely used in practice. The above‐mentioned survey of practice carried out as part of ProFHER 2015 confirms this in the UK, where 'collar and cuff' slings, poly‐slings and more rarely broad‐arm slings are used (Handoll 2010).

Continuing management (rehabilitation) after initial treatment involving sling immobilisation

The three trials in this category that examined supervised versus home exercises were based in Sweden and possible differences in conventional physiotherapy regimens within and between countries, then and now, need to be taken into account when considering the application of trial findings. If they work, self‐instruction and home‐based exercise programmes are attractive for patients and conserve health care resources. There is some evidence from a Cochrane Review on fall prevention that older people, if well instructed and with intensive support (regular phone calls etc), can maintain a home‐based exercise programme (Gillespie 2003; Gillespie 2009). However, there will still be some patients with insufficient understanding or motivation to perform the required exercises.

Different methods of surgical management

Most of the recent research activity, in both published and registered trials, evaluates different types of surgery. We now distinguish between trials comparing different categories of surgical interventions (tested by five trials) and trials comparing different methods of performing an intervention in the same category (tested by seven trials).

Comparisons of different categories of surgical intervention

The variety of available implants in the same category can limit the applicability and usefulness of trials comparing different categories of surgical intervention by comparisons of specific implants. Nonetheless, the comparison by Zhu 2011 of one of two locking plates versus a locking nail is very pertinent in terms of providing a useful investigation of the appropriateness of the current trend from locking nails to plates. This trial is too small to establish the superiority of one method over the other but it does provide some evidence of better function in the plate group at one year, and possibly for longer, although at a potentially greater risk of surgical complications and initially more invasive surgery. The comparison by Smejkal 2011 of a locking plate versus minimally invasive fixation with distally inserted multiple intramedullary K‐wires (the Zifko method of minimally invasive fixation) is of relevance to current practice but, while data from Smejkal 2011 lend support to the use of the Zifko method in terms of it being a less extensive surgical procedure with potentially an earlier recovery than plate fixation, there were inadequate data on longer term function and outcome.

Again, the trial comparing hemiarthroplasty versus open reduction and locking plate fixation was too small to inform practice (Cai 2012). The absence of intraoperative conversions for the open reduction and internal fixation (ORIF) group to hemiarthroplasty or early failures is notable for a series of 13 displaced four‐part fractures and fracture‐dislocations and may indicate differences in assessing and dealing with problematic or failed fixation in this centre compared with other centres. Hoellen 1997, a flawed trial with only one‐year follow‐up, considered only one of several shoulder prostheses now available (the prosthesis was cemented in place) in their comparison with tension band wiring.

The comparison of reverse shoulder arthroplasty (RSA) versus hemiarthroplasty tested in Sebastiá‐Forcada 2014 is very topical, as shown also by the three ongoing trials that are making the same comparison (NCT02075476; NTR3208; SHeRPA). Within each of these four trials, the prostheses compared come from a single manufacturer. However, the three ongoing trials examine prostheses from three different manufacturers. Prostheses made by different manufacturers will differ to some extent; however, the variation between prostheses from different manufacturers is likely to be of lesser clinical importance than the large differences between RSA and hemiarthroplasty. In terms of applicability, Sebastiá‐Forcada 2014 is mainly limited by being a single‐centre trial in which the participants were operated on by two surgeons.

Comparisons of different methods of performing an intervention in the same category

The trial (Buecking 2014) comparing two approaches (deltoid‐split approach versus deltopectoral) for non‐contact bridging plate fixation had two notable limitations in terms of external validity. One was the absence of criteria for excluding patients for whom hemiarthroplasty was planned. The second was the inappropriate interpretation of the Lawson quality‐of‐life score. The two trials (Ockert 2010; Voigt 2011) comparing 'polyaxial' (where surgeons had greater control in their positioning of screws into the bone) versus 'monoaxial' locking plate fixation found no difference between the two methods. With no report of functional outcome, Ockert 2010 contributed relatively little to this question. Voigt 2011, which was a stronger trial but still insufficient to be conclusive, pointed out that the "majority of surgeons chose the same screw directions for the polyaxial screws as already exist in the fixed angle plate". In their 2014 publication, Ockert 2010 also found that polyaxial screws were placed similarly to monoaxial screws. Also of note are the differences in the types of screws in the two implants in Voigt 2011, which could in some respects alter the question. Zhang 2011 tested the use of medial support locking screws for fixation using the PHILOS locking plate. While Zhang 2011 did not provide conclusive evidence of clinical benefit of the enhanced stabilisation of this commonly used plate, the direction of effect is consistent with the theoretical advantages of medial support screws.

In their comparison of the MultiLoc Proximal Humeral Nail (MPHN) versus the Polarus nail, Lopiz 2014 found the newer "straight" nail (the MPHN) resulted in fewer adverse events (screw loosening, impingement, rotator cuff symptoms) than the "curved" Polarus nail. This is plausible given the different design features of the MPHN that attempt to avoid the various problems, including impingement, that have been identified when using the Polarus nail. However, some consideration is also required of the rather high incidence of adverse events for the Polarus nail and the general inadequacies of tests for rotator cuff symptoms (Hanchard 2013).

Fialka 2008 compared two shoulder prostheses. Although the authors ascribed the different functional outcomes to tuberosity fixation, other design differences may account for these results. These include a different stem finish and a more accurate recreation of pre‐operative humeral geometry with the EPOCA prosthesis. The study population of Soliman 2013, which compared tenodesis of the long head of the biceps (LHB) versus LHB tendon left intact in people undergoing hemiarthroplasty, was exceptional in being younger (aged 45 to 60 years) than all other trial populations in this review and younger than the population for whom hemiarthroplasty is more typically used. Although the inclusion criteria included more severe injuries such as head‐splitting fractures, Soliman 2013 provided insufficient criteria on which to judge participant suitability for hemiarthroplasty. Also of note is the absence of spontaneous ruptures of the long head of biceps.

Continuing management (including rehabilitation) after surgical intervention

The need for and duration of immobilisation before commencing physiotherapy after surgical treatment is likely to depend on the method of fixation or type of prosthesis; and also other factors such as bone quality. While neither trial (Agorastides 2007; Wirbel 1999) found conclusive evidence for early mobilisation, such as offering any functional advantage, it can also be observed that the evidence was inconclusive for later mobilisation too, such as avoiding destabilisation of the fracture fixation after percutaneous fixation or tuberosity fixation after hemiarthroplasty.

Quality of the evidence

As noted in Handoll 2012, and as continues to apply in this update, more recent trials generally have better study design (e.g. they have appropriate random sequence generation and allocation concealment, and thus are at low risk of selection bias) and reporting (e.g. including participant flow diagrams). Nonetheless, as shown by Figure 2, many of the included trials had serious shortcomings and are at high risk of bias that could affect the validity of their findings. The main shortcoming in trials testing physical and surgical interventions was the lack of blinding, which is unavoidable to a great degree. Twenty‐one trials were considered at high risk of outcome assessment bias for function and other subjective outcomes. The risk of bias resulting from a high loss to follow‐up or exclusion of participants from the analyses was considered high in 13 trials, two of which were new to this version (Buecking 2014; Soliman 2013). Most comparisons were carried out in small single trials only; there is clearly a need for caution in interpreting the results of small trials, which demonstrate 'no evidence of an effect' rather than 'evidence of no effect'. Insufficiencies in the quantity and quality of the evidence still preclude the drawing of robust conclusions for most of the comparisons evaluated by the included trials.
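
To illustrate why a small trial showing 'no evidence of an effect' does not amount to 'evidence of no effect', the following Python sketch computes a risk ratio with an approximate 95% confidence interval using the usual large‐sample formula on the log scale. The event counts are hypothetical and are not taken from any included trial; the function name `risk_ratio_ci` is also an assumption made for this illustration.

```python
import math


def risk_ratio_ci(events_a, n_a, events_b, n_b, z=1.96):
    """Risk ratio (group A versus group B) with an approximate 95% CI.

    Uses the standard large-sample standard error of log(RR).
    Requires at least one event in each group.
    """
    rr = (events_a / n_a) / (events_b / n_b)
    se_log_rr = math.sqrt(1 / events_a - 1 / n_a + 1 / events_b - 1 / n_b)
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


# Hypothetical counts only (not data from this review).
small = risk_ratio_ci(3, 25, 4, 25)      # 50 participants in total
large = risk_ratio_ci(30, 250, 40, 250)  # same proportions, 500 participants

for label, (rr, lower, upper) in (("small trial", small), ("large trial", large)):
    print(f"{label}: RR {rr:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

With the same event proportions, the interval from the smaller hypothetical trial is much wider and is compatible with both a substantial benefit and a substantial harm, which is why such a result cannot be read as evidence of no effect.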

Only one of the eight newly included trials had prospective trial registration and a published protocol (ProFHER 2015). While this is discouraging at present, it is notable that the increased research activity in this previously overlooked area is also associated with far more prospective trial registration as well as publication of trial protocols. Both these items show the greater use of robust methodology that is required to minimise bias. Additionally, there is a growing interest in multicentre trials. Seven of the 21 ongoing trials are multicentre.

The results of the GRADE assessment of the quality of evidence for the individual comparisons are summarised below. With the exception of the evidence for the comparison of surgical versus non‐surgical treatment, most of the GRADE assessments for the other comparisons were low or very low quality. This typically reflects the insufficiency of the evidence from small single trials which have limitations in design, conduct, analysis and reporting, putting them at high risk of bias.

Surgical treatment versus non‐surgical treatment

Initial treatment, including immobilisation

  • Early versus delayed mobilisation: the quality of evidence assessments for different outcomes for this comparison ranged from very low to low (for details please see summary of findings Table 2).

  • Type of bandage (Gilchrist arm sling versus the Desault body bandage): the quality of evidence assessments for all reported outcomes were very low. The evidence was downgraded two levels for serious study limitations reflecting a serious risk of bias (including selection bias: quasi‐randomised trial) and one level for imprecision (single small trial: 28 participants).

Continuing management (rehabilitation) after initial treatment involving sling immobilisation

  • Instructions for home exercises versus physiotherapy: the quality of evidence assessments for all reported outcomes were very low. The evidence was downgraded one level for study limitations reflecting a high risk of bias and two levels for imprecision (evidence from single small trials: 20 and 42 participants).

  • Supervised exercises in a swimming pool plus home exercises versus home exercises alone: the quality of evidence assessments for all reported outcomes were very low. There is no quantitative evidence available for this comparison.

  • Pulsed electromagnetic high frequency energy (PHFE): the quality of evidence assessments for all reported outcomes were very low. There is no quantitative evidence available for this comparison.

Different methods of surgical management

Comparisons of different categories of surgical intervention

  • Open reduction with internal fixation using a locking plate versus a locking nail: the quality of evidence assessments for all reported outcomes were very low. The evidence was downgraded one level for study limitations reflecting a high risk of bias and two levels for imprecision (evidence from 57 participants in one trial).

  • Open reduction with internal fixation using a locking plate versus minimally invasive fixation with distally inserted intramedullary K‐wires: the quality of evidence assessments for all reported outcomes were very low. The evidence was downgraded two levels for study limitations reflecting a serious risk of bias and one level for imprecision (evidence from 55 participants in one trial).

  • Hemiarthroplasty versus open reduction and locking plate fixation: the quality of evidence assessments for all reported outcomes were very low. The evidence was downgraded one level for study limitations reflecting a high risk of bias and two levels for imprecision (evidence from 32 participants in one trial).

  • Hemiarthroplasty versus tension band wiring: the quality of evidence assessments for all reported outcomes were very low. The evidence was downgraded two levels for study limitations reflecting a serious risk of bias and one level for imprecision (evidence from 30 participants in one trial).

  • Reverse shoulder arthroplasty versus hemiarthroplasty: the quality of evidence assessments for all reported outcomes were low. The evidence was downgraded one level for study limitations reflecting unclear risk of bias in several domains and one level for imprecision (evidence from 62 participants in one trial).

Comparisons of different methods of performing an intervention in the same category

  • Deltoid‐split approach versus deltopectoral approach for non‐contact bridging plate fixation: the quality of evidence assessments for all reported outcomes were very low. The evidence was downgraded two levels for study limitations reflecting a serious risk of bias and one level for imprecision (wide confidence intervals; evidence from 120 participants in one trial). The evidence would have been further downgraded one level for indirectness because of the possible misapplication of the Lawson quality‐of‐life score.

  • Polyaxial versus monoaxial locking plate fixation: the quality of evidence assessments for all reported outcomes were very low. The evidence was downgraded one level for study limitations reflecting a serious risk of bias and two levels for imprecision (wide confidence intervals; evidence for function from 48 participants in one trial).

  • Locking plate ‐ use of medial locking screws: the quality of evidence assessments for all reported outcomes were very low. The evidence was downgraded one level for study limitations reflecting unclear risk of bias in several domains and two levels for imprecision (wide confidence intervals; evidence from 68 participants in one trial).

  • Intramedullary nails: MPHN versus Polarus: the quality of evidence assessments for the reported outcomes were low or very low. The evidence was downgraded one or two levels for study limitations reflecting the high risk of outcome assessment bias and the unclear risk of bias relating to detection given the large range in follow‐up (6 to 22 months) for some outcomes, and one level for imprecision (wide confidence intervals; evidence from 54 participants in one trial).

  • Hemiarthroplasty ‐ comparison of the EPOCA versus the HAS prosthesis: the quality of evidence assessments for all reported outcomes were very low. The evidence was downgraded two levels for study limitations reflecting a serious risk of bias and one level for imprecision (inadequate data presented; evidence from 35 participants in one trial).

  • Hemiarthroplasty ‐ tenodesis of the long head of the biceps (LHB) versus LHB tendon left intact: the quality of evidence assessments for all reported outcomes were very low. The evidence was downgraded two levels for study limitations reflecting a serious risk of bias and one level for imprecision (evidence from 45 participants in one trial).

Continuing management (including rehabilitation) after surgical intervention

  • One week versus three weeks immobilisation after percutaneous fixation: the quality of evidence assessments for all reported outcomes were very low. The evidence was downgraded two levels for study limitations reflecting a serious risk of bias and one level for imprecision (evidence at six months from 64 participants in one trial).

  • Mobilisation after two weeks versus six weeks following hemiarthroplasty: the quality of evidence assessments for all reported outcomes were very low. The evidence was downgraded two levels for study limitations reflecting a serious risk of bias and one level for imprecision (evidence from 49 participants in one trial).
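
The downgrading arithmetic behind the assessments listed above can be sketched in a few lines of Python. This is a minimal illustration, assuming the standard four GRADE levels and that evidence from randomised trials starts at 'high'; the function name `grade_rating` and the dictionary keys are hypothetical and are not part of any GRADE software.

```python
# The four GRADE quality levels, from best to worst.
GRADE_LEVELS = ["high", "moderate", "low", "very low"]


def grade_rating(downgrades, start="high"):
    """Return the GRADE level after applying downgrades.

    `downgrades` maps each domain (e.g. study limitations, imprecision)
    to the number of levels deducted for it. The rating cannot fall
    below 'very low'.
    """
    total = sum(downgrades.values())
    index = min(GRADE_LEVELS.index(start) + total, len(GRADE_LEVELS) - 1)
    return GRADE_LEVELS[index]


# Two levels for study limitations plus one for imprecision,
# as for the Gilchrist versus Desault bandage comparison.
print(grade_rating({"study limitations": 2, "imprecision": 1}))  # very low

# One level for each domain, as for reverse shoulder arthroplasty
# versus hemiarthroplasty.
print(grade_rating({"study limitations": 1, "imprecision": 1}))  # low
```

Three levels of downgrading from 'high' always reach 'very low', which is why most single small trials at high risk of bias end up at the bottom of the scale.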

Potential biases in the review process

While our search was comprehensive it is likely that we have failed to identify some randomised trials, particularly those reported only in abstracts or in non‐English language publications. We may also have overlooked mixed‐population trials that included proximal humeral fractures as a subgroup. However, we are almost certain that we have not overlooked trials that would provide definitive evidence that could inform practice. It is clear, from the growing awareness and imperative of trial registration, that such trials are now in progress. We prepared the review using systematic processes throughout, including contacting trial investigators for clarification and missing data. We describe the dilemma presented in the pooling of data from clearly heterogeneous trials for the surgical treatment versus non‐surgical treatment comparison in the Effects of interventions. This is, however, compatible with the overall question and notably the pooled analyses did not result in statistically significant heterogeneity.

Agreements and disagreements with other studies or reviews

Several new systematic reviews, none of which cover all treatment options, were identified via the search update. The only rehabilitation review examined the effects of exercise in people with select upper limb fractures including proximal humeral fractures (Bruder 2011). Two reviews compared surgical versus non‐surgical intervention (Li 2013; Mao 2014). Li 2013 limited surgery to internal fixation. One review compared arthroplasty versus 'joint preservation' that was either non‐surgical treatment or internal fixation (Zhang 2014). Two compared arthroplasty versus internal fixation (Dai 2014; Gomberawalla 2013). Dai 2014 limited internal fixation to locking plate fixation. Two reviews compared reverse shoulder arthroplasty versus hemiarthroplasty (Mata‐Fink 2013; Namdari 2013). All eight reviews, four of which included evidence from a broader spectrum of study designs (Dai 2014; Gomberawalla 2013; Mata‐Fink 2013; Namdari 2013), reported the limitations in the available evidence. Unlike our review, with its later search date, none of the three reviews comparing surgical versus non‐surgical treatment included ProFHER 2015 and neither of the two reviews comparing RSA with hemiarthroplasty included the first randomised trial on this topic (Sebastiá‐Forcada 2014).

Figure 1. Study flow diagram (extract): 8 new ongoing studies (9 articles); additional materials (9 articles) for 7 already ongoing studies; 4 new studies (5 articles) awaiting classification; change in status: 1 previously ongoing study to awaiting classification (1 extra article) and 1 study previously awaiting classification to ongoing (2 extra articles).

Figure 2. Risk of bias summary: review authors' judgements about each risk of bias item for each included study.

Figure 3. Risk of bias graph: review authors' judgements about each risk of bias item presented as percentages across all included studies.

Figure 4. Forest plot of comparison: 4 Surgical versus non‐surgical treatment, outcome: 4.1 Functional scores at 12 months (higher = better outcome).

Analysis 1.1. Comparison 1 Early mobilisation (within or up to 1 week) versus immobilisation for 3 or 4 weeks, Outcome 1 Shoulder disability: Croft Shoulder Disability Score.

Analysis 1.2. Comparison 1 Early mobilisation (within or up to 1 week) versus immobilisation for 3 or 4 weeks, Outcome 2 Croft shoulder disability score: individual problems at 2 years.

Analysis 1.3. Comparison 1 Early mobilisation (within or up to 1 week) versus immobilisation for 3 or 4 weeks, Outcome 3 Number of treatment sessions (until independent function achieved).

Analysis 1.4. Comparison 1 Early mobilisation (within or up to 1 week) versus immobilisation for 3 or 4 weeks, Outcome 4 SF‐36 scores: pain & physical dimensions.

Analysis 1.5. Comparison 1 Early mobilisation (within or up to 1 week) versus immobilisation for 3 or 4 weeks, Outcome 5 Quality of life assessment: EuroQol 5D (0: dead to 1: best health).

Analysis 1.6. Comparison 1 Early mobilisation (within or up to 1 week) versus immobilisation for 3 or 4 weeks, Outcome 6 Adverse events.

Analysis 1.7. Comparison 1 Early mobilisation (within or up to 1 week) versus immobilisation for 3 or 4 weeks, Outcome 7 Mortality.

Analysis 1.8. Comparison 1 Early mobilisation (within or up to 1 week) versus immobilisation for 3 or 4 weeks, Outcome 8 Constant shoulder score (ratio of affected/unaffected arm).

Analysis 1.9. Comparison 1 Early mobilisation (within or up to 1 week) versus immobilisation for 3 or 4 weeks, Outcome 9 Constant shoulder score (0 to 100: best).

Analysis 1.10. Comparison 1 Early mobilisation (within or up to 1 week) versus immobilisation for 3 or 4 weeks, Outcome 10 Pain VAS (0 to 100: worst pain).

Analysis 1.11. Comparison 1 Early mobilisation (within or up to 1 week) versus immobilisation for 3 or 4 weeks, Outcome 11 Changes in pain intensity (mm) from baseline: 100 mm visual analogue scale (positive change = less pain).

Analysis 1.12. Comparison 1 Early mobilisation (within or up to 1 week) versus immobilisation for 3 or 4 weeks, Outcome 12 Range of motion at 6 months (degrees): difference between two shoulders.

Analysis 1.13. Comparison 1 Early mobilisation (within or up to 1 week) versus immobilisation for 3 or 4 weeks, Outcome 13 Patient dissatisfied with treatment.

Analysis 1.14. Comparison 1 Early mobilisation (within or up to 1 week) versus immobilisation for 3 or 4 weeks, Outcome 14 Patient satisfaction (0 to 10: higher scores ‐ greater satisfaction).

Analysis 2.1. Comparison 2 Gilchrist bandage versus 'Classic' Desault bandage, Outcome 1 Problems with bandages.

Analysis 2.2. Comparison 2 Gilchrist bandage versus 'Classic' Desault bandage, Outcome 2 Fracture displacement by 3 weeks.

Analysis 2.3. Comparison 2 Gilchrist bandage versus 'Classic' Desault bandage, Outcome 3 Poor or bad rating by patient at fracture consolidation.

Analysis 3.1. Comparison 3 Instructed self‐exercise versus conventional physiotherapy, Outcome 1 Pain at one year (scale 0 to 8: maximum pain).

Analysis 3.2. Comparison 3 Instructed self‐exercise versus conventional physiotherapy, Outcome 2 Severe or moderate pain at 3 months.

Analysis 3.3. Comparison 3 Instructed self‐exercise versus conventional physiotherapy, Outcome 3 Requested change of therapy.

Analysis 3.4. Comparison 3 Instructed self‐exercise versus conventional physiotherapy, Outcome 4 Adverse events (frozen shoulder: 1 v 2; unexplained prolonged pain: 0 v 1).

Analysis 3.5. Comparison 3 Instructed self‐exercise versus conventional physiotherapy, Outcome 5 Neer's rating (0 to 100: best) at mean 16 months (exploratory analysis).

Analysis 3.6. Comparison 3 Instructed self‐exercise versus conventional physiotherapy, Outcome 6 Active gleno‐humeral elevation (degrees).

Analysis 4.1. Comparison 4 Surgical versus non‐surgical treatment, Outcome 1 Functional scores at 12 months (higher = better outcome).

Analysis 4.2. Comparison 4 Surgical versus non‐surgical treatment, Outcome 2 Functional scores at 24 months (higher = better outcome).

Analysis 4.3. Comparison 4 Surgical versus non‐surgical treatment, Outcome 3 Oxford Shoulder Score (0 to 48: best outcome).

Analysis 4.4. Comparison 4 Surgical versus non‐surgical treatment, Outcome 4 DASH (0 to 100: worst disability).

Analysis 4.5. Comparison 4 Surgical versus non‐surgical treatment, Outcome 5 American Shoulder and Elbow Surgeons score (0 to 24: best).

Analysis 4.6. Comparison 4 Surgical versus non‐surgical treatment, Outcome 6 Simple Shoulder Test (0 to 12: best function).

Analysis 4.7. Comparison 4 Surgical versus non‐surgical treatment, Outcome 7 Activities of daily living.

Analysis 4.8. Comparison 4 Surgical versus non‐surgical treatment, Outcome 8 Quality of life assessment: EuroQol (0: dead to 1: best health).

Analysis 4.9. Comparison 4 Surgical versus non‐surgical treatment, Outcome 9 Quality of life assessment (Fjalestad 2010 and 2014 data).

Analysis 4.10. Comparison 4 Surgical versus non‐surgical treatment, Outcome 10 Quality of life: SF‐12 Physical Component Score (0 to 100: best).

Analysis 4.11. Comparison 4 Surgical versus non‐surgical treatment, Outcome 11 Quality of life: SF‐12 Mental Component Score (0 to 100: best).

Analysis 4.12. Comparison 4 Surgical versus non‐surgical treatment, Outcome 12 Mortality.

Analysis 4.13. Comparison 4 Surgical versus non‐surgical treatment, Outcome 13 Additional surgery (re‐operation or secondary surgery).

Analysis 4.14. Comparison 4 Surgical versus non‐surgical treatment, Outcome 14 Adverse events / complications.

Analysis 4.15. Comparison 4 Surgical versus non‐surgical treatment, Outcome 15 Dependent in activities of daily living (or dead) at 6 months.

Analysis 4.16. Comparison 4 Surgical versus non‐surgical treatment, Outcome 16 Constant scores (overall: 0 to 100: best score).

Analysis 4.17. Comparison 4 Surgical versus non‐surgical treatment, Outcome 17 Constant scores (difference between injured and uninjured shoulder): Normal = 0.

Analysis 4.18. Comparison 4 Surgical versus non‐surgical treatment, Outcome 18 Poor or unsatisfactory function at 1 year (Neer rating).

Analysis 4.19. Comparison 4 Surgical versus non‐surgical treatment, Outcome 19 VAS disability (0 to 100: no restrictions).

Analysis 4.20. Comparison 4 Surgical versus non‐surgical treatment, Outcome 20 Pain: VAS (0 to 100: worst pain).

Analysis 4.21. Comparison 4 Surgical versus non‐surgical treatment, Outcome 21 Constant score at 50 months: overall and components.

Analysis 4.22. Comparison 4 Surgical versus non‐surgical treatment, Outcome 22 Constant (often severe) pain at 6 months.

Analysis 4.23. Comparison 4 Surgical versus non‐surgical treatment, Outcome 23 Failure to recover 75% muscle power relative to other arm (survivors) at 6 months.

Analysis 4.24. Comparison 4 Surgical versus non‐surgical treatment, Outcome 24 Range of movement impairments in survivors at 6 months.

Study | Measure | Surgery | Non‐surgical treatment | Difference (conclusion)
Fjalestad 2010 | Total health‐care costs | mean = 10,367 | mean = 10,946 | Abstract: "the mean difference in total health‐care costs was 597 Euros in favour of surgery (95% CI = ‐5291, 3777)". No significant difference.
Fjalestad 2010 | Health‐care + indirect costs | mean = 23,953 | mean = 21,878 | Reformatted text: "Including indirect costs... the difference [was] 2,075 (95% CI = ‐15,949 to 20,100)". No significant difference, but favours the non‐surgical group.

Analysis 4.25

Comparison 4 Surgical versus non‐surgical treatment, Outcome 25 Costs at 1 year (Euros in 2005).

Analysis 4.26

Comparison 4 Surgical versus non‐surgical treatment, Outcome 26 Total costs including indirect costs (Euros) at 1 year.

Analysis 5.1

Comparison 5 Locking plate versus locking intramedullary nail, Outcome 1 American Shoulder and Elbow Surgeons (ASES) score (0 to 100: best).

Analysis 5.2

Comparison 5 Locking plate versus locking intramedullary nail, Outcome 2 Death, re‐operation and adverse events.

Study | Measure | Locking plate | Locking nail | Reported significance
Zhu 2011 | Pain at 1 year | median = 0.5; interquartile range = 1.8; n = 29 | median = 1.0; interquartile range = 1.0; n = 26 | P = 0.042
Zhu 2011 | Pain at 3 years | median = 0; interquartile range = 0.8; n = 26 | median = 0; interquartile range = 1.0; n = 25 | P = 0.642

Analysis 5.3

Comparison 5 Locking plate versus locking intramedullary nail, Outcome 3 Pain (VAS: 0 to 10: worst).

Analysis 5.4

Comparison 5 Locking plate versus locking intramedullary nail, Outcome 4 Constant score (0 to 100: best).

Analysis 5.5

Comparison 5 Locking plate versus locking intramedullary nail, Outcome 5 Active range of motion (at 3 years).

Study | Measure | Locking plate | Locking nail | Reported significance
Zhu 2011 | At 1 year | mean location = T8; range = T4 to L2; n = 29 | mean location = T9; range = T2 to buttock; n = 26 | P = 0.443
Zhu 2011 | At 3 years | mean location = T8; range = T2 to buttock; n = 26 | mean location = T8; range = T2 to buttock; n = 25 | P = 0.636

Analysis 5.6

Comparison 5 Locking plate versus locking intramedullary nail, Outcome 6 Range of movement: internal rotation (level on spine).

Analysis 5.7

Comparison 5 Locking plate versus locking intramedullary nail, Outcome 7 Strength of supraspinatus (relative to opposite side) % ‐ at 3 years.

Analysis 5.8

Comparison 5 Locking plate versus locking intramedullary nail, Outcome 8 Operation times and blood loss.

Analysis 5.9

Comparison 5 Locking plate versus locking intramedullary nail, Outcome 9 Intra‐operative complication.

Analysis 6.1

Comparison 6 Locking plate versus intramedullary nails (Zifko method), Outcome 1 Complications and [slight] malunion.

Analysis 6.2

Comparison 6 Locking plate versus intramedullary nails (Zifko method), Outcome 2 Constant score (% of healthy limb) at mean 2 years.

Analysis 6.3

Comparison 6 Locking plate versus intramedullary nails (Zifko method), Outcome 3 Time to union and time to recover upper limb function (weeks).

Analysis 6.4

Comparison 6 Locking plate versus intramedullary nails (Zifko method), Outcome 4 Operation and fluoroscopic times.

Analysis 6.5

Comparison 6 Locking plate versus intramedullary nails (Zifko method), Outcome 5 Length of hospital stay (days).

Analysis 7.1

Comparison 7 Replacement (hemiarthroplasty) versus fixation (tension band wiring; plate fixation) (4 part fractures), Outcome 1 DASH score (0 to 100: worst disability).

Analysis 7.2

Comparison 7 Replacement (hemiarthroplasty) versus fixation (tension band wiring; plate fixation) (4 part fractures), Outcome 2 EQ‐5D score (0 to 1: best quality of life).

Analysis 7.3

Comparison 7 Replacement (hemiarthroplasty) versus fixation (tension band wiring; plate fixation) (4 part fractures), Outcome 3 Re‐operation.

Analysis 7.4

Comparison 7 Replacement (hemiarthroplasty) versus fixation (tension band wiring; plate fixation) (4 part fractures), Outcome 4 Dead at 2 years.

Analysis 7.5

Comparison 7 Replacement (hemiarthroplasty) versus fixation (tension band wiring; plate fixation) (4 part fractures), Outcome 5 Implant removal at 1 year.

Analysis 7.6

Comparison 7 Replacement (hemiarthroplasty) versus fixation (tension band wiring; plate fixation) (4 part fractures), Outcome 6 Constant score (0 to 100: best score).

Analysis 7.7

Comparison 7 Replacement (hemiarthroplasty) versus fixation (tension band wiring; plate fixation) (4 part fractures), Outcome 7 Pain VAS (0 to 100: worst pain) at 24 months.

Analysis 7.8

Comparison 7 Replacement (hemiarthroplasty) versus fixation (tension band wiring; plate fixation) (4 part fractures), Outcome 8 Pain at 1 year.

Analysis 7.9

Comparison 7 Replacement (hemiarthroplasty) versus fixation (tension band wiring; plate fixation) (4 part fractures), Outcome 9 Range of motion at 24 months.

Analysis 8.1

Comparison 8 Reverse shoulder arthroplasty (RSA) versus hemiarthroplasty (HA), Outcome 1 Shoulder function scores at 24 to 49 months.

Analysis 8.2

Comparison 8 Reverse shoulder arthroplasty (RSA) versus hemiarthroplasty (HA), Outcome 2 Re‐operation.

Analysis 8.3

Comparison 8 Reverse shoulder arthroplasty (RSA) versus hemiarthroplasty (HA), Outcome 3 Death.

Analysis 8.4

Comparison 8 Reverse shoulder arthroplasty (RSA) versus hemiarthroplasty (HA), Outcome 4 Composite (objective and subjective) shoulder function scores at 24 to 49 months.

Analysis 8.5

Comparison 8 Reverse shoulder arthroplasty (RSA) versus hemiarthroplasty (HA), Outcome 5 Constant score at 24 to 49 months: overall and components.

Analysis 8.6

Comparison 8 Reverse shoulder arthroplasty (RSA) versus hemiarthroplasty (HA), Outcome 6 Complications.

Analysis 8.7

Comparison 8 Reverse shoulder arthroplasty (RSA) versus hemiarthroplasty (HA), Outcome 7 Radiological assessment findings.

Analysis 8.8

Comparison 8 Reverse shoulder arthroplasty (RSA) versus hemiarthroplasty (HA), Outcome 8 Range of motion (degrees) at 24 to 49 months.

Analysis 9.1

Comparison 9 Deltoid‐split versus deltopectoral approaches for plate fixation, Outcome 1 Re‐operation.

Analysis 9.2

Comparison 9 Deltoid‐split versus deltopectoral approaches for plate fixation, Outcome 2 Dead at 1 year.

Analysis 9.3

Comparison 9 Deltoid‐split versus deltopectoral approaches for plate fixation, Outcome 3 Complications.

Analysis 9.4

Comparison 9 Deltoid‐split versus deltopectoral approaches for plate fixation, Outcome 4 Constant score (0 to 100: best score).

Analysis 9.5

Comparison 9 Deltoid‐split versus deltopectoral approaches for plate fixation, Outcome 5 Pain (VAS 0 to 10: intolerable pain).

Analysis 9.6

Comparison 9 Deltoid‐split versus deltopectoral approaches for plate fixation, Outcome 6 Operation and fluoroscopic times.

Analysis 9.7

Comparison 9 Deltoid‐split versus deltopectoral approaches for plate fixation, Outcome 7 Length of hospital stay (days).

Analysis 10.1

Comparison 10 Polyaxial versus monoaxial screw insertion in plate fixation, Outcome 1 DASH score at 12 months (0 to 100: greatest disability).

Analysis 10.2

Comparison 10 Polyaxial versus monoaxial screw insertion in plate fixation, Outcome 2 Simple shoulder test (0 to 12: best outcome).

Analysis 10.3

Comparison 10 Polyaxial versus monoaxial screw insertion in plate fixation, Outcome 3 Re‐operation.

Analysis 10.4

Comparison 10 Polyaxial versus monoaxial screw insertion in plate fixation, Outcome 4 Dead at 1 year.

Analysis 10.5

Comparison 10 Polyaxial versus monoaxial screw insertion in plate fixation, Outcome 5 Constant score at 12 months (% of contralateral limb).

Analysis 10.6

Comparison 10 Polyaxial versus monoaxial screw insertion in plate fixation, Outcome 6 Complications (radiological assessment).

Analysis 10.7

Comparison 10 Polyaxial versus monoaxial screw insertion in plate fixation, Outcome 7 Range of motion (degrees) at 12 months.

Analysis 10.8

Comparison 10 Polyaxial versus monoaxial screw insertion in plate fixation, Outcome 8 Operation and fluoroscopic times.

Analysis 11.1

Comparison 11 Medial support screws versus control for locking plate fixation, Outcome 1 Adverse events.

Analysis 11.2

Comparison 11 Medial support screws versus control for locking plate fixation, Outcome 2 Constant score (0 to 100: best) at 2.5 years.

Analysis 12.1

Comparison 12 MultiLoc Proximal Humeral Nail (MPHN) versus Polarus nail, Outcome 1 Adverse events.

Analysis 12.2

Comparison 12 MultiLoc Proximal Humeral Nail (MPHN) versus Polarus nail, Outcome 2 Constant score (0 to 100: best outcome) at 14 months (6 to 22 months).

Analysis 12.3

Comparison 12 MultiLoc Proximal Humeral Nail (MPHN) versus Polarus nail, Outcome 3 Range of shoulder motion (degrees).

Analysis 12.4

Comparison 12 MultiLoc Proximal Humeral Nail (MPHN) versus Polarus nail, Outcome 4 Lengths of surgery and hospital stay.

Analysis 13.1

Comparison 13 Hemiarthroplasty: EPOCA prosthesis versus HAS prosthesis, Outcome 1 Adverse events.

Analysis 13.2

Comparison 13 Hemiarthroplasty: EPOCA prosthesis versus HAS prosthesis, Outcome 2 Radiological assessment findings.

Study | Measure | EPOCA prosthesis (n = 18) | HAS prosthesis (n = 17) | Reported significance
Fialka 2008 | Active forward flexion | mean = 109°; range = 30° to 150° | mean = 62°; range = 20° to 110° | P < 0.001
Fialka 2008 | Active abduction | mean = 101°; range = 30° to 150° | mean = 62°; range = 30° to 100° | P = 0.001
Fialka 2008 | Active external rotation in 90° abduction | mean = 30°; range = 0° to 60° | mean = 17°; range = 0° to 40° | P = 0.01
Fialka 2008 | Active external rotation in 90° abduction | mean = 45°; range = 0° to 70° | mean = 13°; range = 0° to 40° | P = 0.001

Analysis 13.3

Comparison 13 Hemiarthroplasty: EPOCA prosthesis versus HAS prosthesis, Outcome 3 Range of motion results at one year (degrees).

Analysis 14.1

Comparison 14 Hemiarthroplasty: tenodesis of long head of biceps (LHB) versus LHB tendon left intact, Outcome 1 Complications and further surgery.

Analysis 14.2

Comparison 14 Hemiarthroplasty: tenodesis of long head of biceps (LHB) versus LHB tendon left intact, Outcome 2 Constant score (0 to 100: best function) at 2 years.

Analysis 14.3

Comparison 14 Hemiarthroplasty: tenodesis of long head of biceps (LHB) versus LHB tendon left intact, Outcome 3 Shoulder pain at 2 year follow‐up.

Analysis 14.4

Comparison 14 Hemiarthroplasty: tenodesis of long head of biceps (LHB) versus LHB tendon left intact, Outcome 4 Active shoulder elevation (degrees) at 2 years.

Analysis 15.1

Comparison 15 Post‐operative (percutaneous fixation) immobilisation for 1 week versus 3 weeks, Outcome 1 Neer score ≤ 80 points (unsatisfactory or failure) at 6 months.

Analysis 15.2

Comparison 15 Post‐operative (percutaneous fixation) immobilisation for 1 week versus 3 weeks, Outcome 2 Premature removal of Kirschner wires.

Analysis 16.1

Comparison 16 Post‐operative (hemiarthroplasty) mobilisation: early (2 weeks immobilisation) versus late (6 weeks), Outcome 1 Oxford Shoulder Score at 1 year (adjusted: 0 to 100 best).

Analysis 16.2

Comparison 16 Post‐operative (hemiarthroplasty) mobilisation: early (2 weeks immobilisation) versus late (6 weeks), Outcome 2 Constant shoulder score (at 1 year).

Analysis 16.3

Comparison 16 Post‐operative (hemiarthroplasty) mobilisation: early (2 weeks immobilisation) versus late (6 weeks), Outcome 3 Radiological assessment findings.

Analysis 16.4

Comparison 16 Post‐operative (hemiarthroplasty) mobilisation: early (2 weeks immobilisation) versus late (6 weeks), Outcome 4 Range of motion at 1 year.

Summary of findings for the main comparison. Summary of findings: surgical versus non‐surgical treatment for proximal humeral fractures

Surgical versus non‐surgical treatment for proximal humeral fractures

Patient or population: [mainly older] adults with most types of displaced proximal humeral fractures¹ (8 trials)

Settings: hospital (tertiary care)

Intervention: surgery, various: mainly open reduction and internal fixation (ORIF) with locking plate or hemiarthroplasty

Comparison: non‐surgical treatment, mainly sling 'immobilisation'; more rarely, closed reduction/manipulation of the fracture (2 trials)

Outcomes

Illustrative comparative risks* (95% CI)

Relative effect
(95% CI)

No of Participants
(studies)

Quality of the evidence
(GRADE)

Comments

Assumed risk

Corresponding risk

Non‐surgical treatment

Surgical treatment

Functional scores² (higher = better outcome)

Follow‐up: 1 year

The mean difference in function (overall) in the surgery groups was

0.07 standard deviations higher

(0.12 lower to 0.26 higher)

SMD 0.07
(‐0.12 to 0.26)

419 participants

(5 studies)

⊕⊕⊕⊕
high³

This does not represent a clinically important difference:

  • 0.2 represents a small difference, 0.5 a moderate difference and 0.8 a large difference. Thus, based on this 'rule of thumb', there is little difference between the two groups. At most, the extreme range of the 95% CI includes a minimal difference in favour of surgery at one year.

  • All of the best estimates of between‐group differences for the individual outcome scores² were much smaller than their associated MCIDs.

Functional scores⁴ (higher = better outcome)

Follow‐up: 2 years

The mean difference in function (overall) in the surgery groups was

0.07 standard deviations higher

(0.14 lower to 0.28 higher)

SMD 0.07
(‐0.14 to 0.28)

351 participants

(4 studies)

⊕⊕⊕⊕
high⁵

This does not represent a clinically important difference:

  • 0.2 represents a small difference, 0.5 a moderate difference and 0.8 a large difference. Thus, based on this 'rule of thumb', there is little difference between the two groups. At most, the extreme range of the 95% CI includes a minimal difference in favour of surgery at two years.

  • All of the best estimates of between‐group differences for the individual outcome scores⁴ were much smaller than their associated MCIDs.

Quality of life assessment: EuroQol (0: dead to 1: best health)

Follow‐up: 2 years

The mean EuroQol score ranged across control groups from
0.7 to 0.85

The mean EuroQol score in the surgery groups was 0.03 higher
(0.01 lower to 0.08 higher)

354 participants

(4 studies)

⊕⊕⊕⊝
moderate⁶

The MCID of 0.12 was outside the 95% CI at this time period and at 6 months (MD 0.04, 95% CI 0.01 to 0.08) and 12 months (MD 0.02, 95% CI ‐0.02 to 0.06)

Quality of life: SF‐12 Physical Component Score (0 to 100: best)
Follow‐up: 2 years

The mean SF‐12 PCS was 44.1

The mean SF‐12 PCS in the surgery group was
1.10 higher (1.99 lower to 4.19 higher)

210 participants

(1 study)

⊕⊕⊕⊝
moderate⁷

A similar lack of clinically important difference⁸ was noted at 6 and 12 months. This measure may not be sensitive to recovery from this injury.

Mortality

Follow‐up: up to 2 years

52 per 1000⁸

73 per 1000
(4 to 147)

RR 1.40 (0.69 to 2.83)

596 participants

(6 studies)

⊕⊕⊕⊝
moderate⁹

Surgery resulted in 21/1000 more deaths up to 2 years (95% CI 48 fewer to 95 more)

Where reported, none of the deaths was related to their fracture or treatment with the exception of one early death due to venous thromboembolism in the surgical group of one trial

Additional surgery (re‐operation or secondary surgery)

Follow‐up: up to 2 years

40 per 1000⁹

83 per 1000
(47 to 144)

RR 2.06
(1.18 to 3.60)

523 participants

(7 studies)

⊕⊕⊕⊝
moderate¹⁰

Surgery resulted in 43/1000 more patients having additional surgery up to 2 years (95% CI 7 to 104 more).

One trial (250 participants) also reported on additional shoulder‐related therapy (7/125 versus 4/125; RR 1.75 favouring non‐surgical therapy, 95% CI 0.53 to 5.83)

Adverse events / complications ‐ Number of patients with complications

Follow‐up: 2 years

184 per 1000⁹

239 per 1000
(147 to 389)

RR 1.30
(0.80 to 2.11)

250 participants

(1 study)

⊕⊕⊕⊝
moderate¹¹

Surgery resulted in 55/1000 more patients having adverse events up to 2 years (95% CI 37 fewer to 205 more).

All 8 trials reported on individual complications; their distribution generally followed the expected pattern, e.g. infection in 8/279 cases after surgery versus 0/280 cases after non‐surgical treatment

*The basis for the assumed risk (e.g. the median control group risk across studies) is provided in footnotes. The corresponding risk (and its 95% confidence interval) is based on the assumed risk in the comparison group and the relative effect of the intervention (and its 95% CI).
CI: confidence interval; MCID: minimal clinically important differences; RR: risk ratio; SMD: standardised mean difference
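
The relationship described in the note above (corresponding risk = assumed risk × relative effect) can be illustrated with a short calculation. This is a minimal sketch only, not the review authors' software: the function names are hypothetical, and the inputs are the adverse events row of the table (assumed risk 184 per 1000; RR 1.30, 95% CI 0.80 to 2.11). The results may differ by one unit from the tabulated values, presumably because the published figures were derived from unrounded inputs.

```python
def corresponding_risk(assumed_per_1000, rr, rr_low, rr_high):
    """Scale an assumed (control group) risk per 1000 by a risk ratio
    and by the limits of its 95% CI. Names are illustrative only."""
    return (
        round(assumed_per_1000 * rr),
        round(assumed_per_1000 * rr_low),
        round(assumed_per_1000 * rr_high),
    )


def absolute_difference(assumed_per_1000, risks):
    """Difference per 1000 between each corresponding risk and the assumed risk.
    Negative values mean fewer events with the intervention."""
    return tuple(risk - assumed_per_1000 for risk in risks)


# Adverse events row: 184 per 1000 assumed risk, RR 1.30 (0.80 to 2.11)
risks = corresponding_risk(184, 1.30, 0.80, 2.11)
print(risks)                            # point estimate, lower limit, upper limit
print(absolute_difference(184, risks))  # more (+) or fewer (-) events per 1000
```

The same two functions apply to any dichotomous outcome row in the table; only the assumed risk and the risk ratio with its confidence limits change.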

GRADE Working Group grades of evidence
High quality: Further research is very unlikely to change our confidence in the estimate of effect.
Moderate quality: Further research is likely to have an important impact on our confidence in the estimate of effect and may change the estimate.
Low quality: Further research is very likely to have an important impact on our confidence in the estimate of effect and is likely to change the estimate.
Very low quality: We are very uncertain about the estimate.

1. The inclusion/exclusion criteria varied among the trials: one (30 participants) included 2‐, 3‐ or 4‐part fractures; one (60 participants) included only 3‐part fractures that included surgical neck; two (90 participants) included 3‐ or 4‐part fractures, three (137 participants) included only 4‐part fractures. The final trial (250 participants) included "displaced fracture of the proximal humerus that involved the surgical neck", resulting in a few 1‐part (but confirmed as still "displaced") as well as 2‐, 3‐ and 4‐part fractures. The majority of the fractures (146/250 = 58.4%) in the largest trial were either 2‐part (128) or 1‐part (18) fractures. Several trials included further criteria; for example, the largest trial explicitly excluded fracture dislocations (i.e. fractures with an associated dislocation of the injured shoulder joint). Consideration is also needed of other inclusion and exclusion criteria, including multiple trauma, clear indications for surgery (severe soft‐tissue compromise), and co‐morbidities precluding surgery or anaesthesia
2. Patient‐reported functional scores were the Disability of the Arm, Shoulder, and Hand questionnaire (DASH; 2 trials), the Oxford Shoulder Score (OSS; 1 trial); the American Shoulder and Elbow Surgeons (ASES; 1 trial) and Simple Shoulder Test (SST; 1 trial)
3. Although the evidence was first downgraded by one level for study limitations, reflecting a high risk of performance bias relating to lack of blinding in four single‐centre trials, the consistency in the results of these and the fifth and largest trial, where the analysis indicated that the study design limited the risk of bias relating to the inevitable lack of blinding, resulted in an upgrade
4. Patient‐reported functional scores were the Disability of the Arm, Shoulder, and Hand questionnaire (DASH; 2 trials), the Oxford Shoulder Score (OSS; 1 trial); and the American Shoulder and Elbow Surgeons (ASES; 1 trial)
5. The evidence was downgraded by one level for study limitations, reflecting a high risk of performance bias relating to lack of blinding in 3 single‐centre trials. There was, however, consistency in the results of these and the fourth and largest trial, where the analysis indicated that the study design limited the risk of bias relating to the inevitable lack of blinding, resulting in an upgrade
6. The evidence was downgraded by one level for inconsistency, reflecting the statistical heterogeneity (Chi² = 6.76, df = 3 (P = 0.08); I² = 56%), but also data from two trials (102 participants) from the same centre that found minimal clinically important differences favouring surgery
7. The evidence was downgraded one level for imprecision, reflecting that these data were from one trial alone
8. A minimal clinically important difference for the SF‐12 PCS was assumed to be 6.5. Notably, a similar finding applied for the between‐group differences in SF‐12 Mental Component Scores, but the direction of effect favoured non‐surgical treatment
9. Assumed risk is the median control group risk across studies
10. The evidence was downgraded one level for imprecision.
11. The evidence was downgraded one level for inconsistency (heterogeneity: Chi² = 8.50, df = 6 (P = 0.20); I² = 29%), which was greater for the two years follow‐up data (heterogeneity: Chi² = 7.29, df = 3 (P = 0.06); I² = 59%). At two years, three trials (160 participants) reported more additional surgery in the surgery group, but the trial (250 participants) contributing 65% of the weight of the evidence recorded equal numbers of participants (11 versus 11) undergoing additional surgery.
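
The heterogeneity statistics quoted in the footnotes above can be cross-checked with the commonly used formula I² = max(0, (Chi² − df) / Chi²) × 100%. The sketch below assumes that this standard formula underlies the reported values; the function name is hypothetical and the inputs are the Chi² and df figures given in the footnotes.

```python
def i_squared(chi2, df):
    """I² (%) from the heterogeneity Chi² statistic and its degrees of freedom,
    using I² = max(0, (Chi² - df) / Chi²) * 100. Illustrative helper only."""
    if chi2 <= 0:
        return 0.0
    return max(0.0, (chi2 - df) / chi2) * 100


# Chi² and df values as reported in the footnotes
for chi2, df in [(6.76, 3), (8.50, 6), (7.29, 3)]:
    print(f"Chi² = {chi2}, df = {df}: I² = {round(i_squared(chi2, df))}%")
```

Under this assumption the three pairs give 56%, 29% and 59%, matching the I² values stated in the footnotes.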

Summary of findings 2. Summary of findings: early versus delayed mobilisation for non‐surgically treated proximal humeral fractures

Early versus delayed mobilisation for non‐surgically treated proximal humeral fractures

Patient or population: adults with minimally displaced or displaced (2‐part or 3‐part) proximal humeral fractures (4 trials)
Settings: various, including fracture clinics and physiotherapy

Intervention: early (within or at one week) mobilisation

Comparison: delayed (usual) mobilisation or physiotherapy after three or four weeks immobilisation

Outcomes

Illustrative comparative risks* (95% CI)

Relative effect
(95% CI)

No of Participants
(studies)

Quality of the evidence
(GRADE)

Comments

Assumed risk

Corresponding risk

3 to 4 weeks immobilisation

Early mobilisation (≤ 1 week)

Shoulder disability: Croft Shoulder Disability Score ‐ Disability (1 or more problems)
Follow‐up: 1 year

725 per 1000¹

428 per 1000
(290 to 638)

RR 0.59 (0.40 to 0.88)

82 participants

(1 study)

⊕⊕⊝⊝
low²

Early mobilisation resulted in 297/1000 fewer people with one or more problems at 1 year (95% CI 87 fewer to 435 fewer)³

Number of treatment sessions (until independent function achieved)

Follow‐up: as described

The mean number of sessions was 14 in the usual timing group⁴

The mean number of sessions in the early group was
5.0 lower (1.75 to 8.25 sessions lower)

86 participants
(1 study)

⊕⊕⊝⊝
low⁵

This pertains to early recovery to a level that may vary with individual patients.

SF‐36 scores: pain & physical dimensions (all 3 dimensions 0‐100: higher scores mean better quality of life)
Follow‐up: 16 weeks

The mean values for 3 dimensions in the delayed group⁴ were:

Physical functioning 69.2
Role limitation physical 39.7
Pain 59.9

The mean values in the early group were:

Physical functioning 0.70 higher (9.91 lower to 11.31 higher)
Role limitation physical 22.2 higher (3.82 to 40.58 higher)
Pain 12.10 higher (3.26 to 20.94 higher)

81 participants
(1 study)

⊕⊕⊝⊝
low6

An overall score was not available.
General physical functioning was high and comparable in the two groups. It is likely that the results for role limitation physical and pain are clinically important. This is consistent with the earlier recovery in independent function judged by treating physiotherapists (see above)

SF‐36 scores: pain & physical dimensions ‐ all 3 dimensions 0‐100: higher scores mean better quality of life)
Follow‐up: 1 year

The mean values for 3 dimensions in the delayed group4 were:

Physical functioning 68.4
Role limitation physical 54.4
Pain 65.6

The mean values in the early group were:

Physical functioning 3.00 lower (16.48 lower to 10.48 higher)
Role limitation physical 5.60 higher (13.75 lower to 24.95 higher)
Pain 3.60 higher (8.19 lower to 15.39 higher)

80 participants
(1 study)

⊕⊕⊝⊝
low6

An overall score was not available.
Results for all three dimensions are comparable in the two groups.

None of best estimates are likely to equate to clinically important differences.

Quality of life assessment: EuroQol 5D (0: dead to 1: best health)

Follow‐up: 1 year

The mean EuroQol 5D score in the early group was 0.764

The mean EuroQol 5D score in the delayed group was
0.09 lower (0.21 lower to 0.03 higher)

39 participants

(1 study)7

⊕⊝⊝⊝
very low8

Similar results of little between‐group differences of no clinical importance applied at 3 and 6 months.

Adverse events:

Shoulder complications

Follow‐up: 1 year

26 per 10009

19 per 1000
(4 to 95)

RR 0.73

(0.15 to 3.63)

259 participants
(4 studies)

⊕⊝⊝⊝
very low10

Reported shoulder complications were frozen shoulder (1 case), complex regional pain syndrome type 1 (2 cases) and treated subacromial impingement (2 cases).

Early mobilisation resulted in 7/1000 fewer people with a shoulder complication at 1 year (95% CI 22 fewer to 69 more)

Adverse events:

Fracture displacement and non‐union

Follow‐up: 1 year

23 per 10009

51 per 1000

(5 to 517)

RR 2.20 (0.22 to 22.45)

106 participants
(2 studies)

⊕⊝⊝⊝
very low10

There were no cases of non‐union. All three fracture displacements (none of which required surgery) occurred in one trial that included displaced fractures

Early mobilisation resulted in 28/1000 more people with a fracture displacement at 1 year (95% CI 18 fewer to 494 more)

*The basis for the assumed risk (e.g. the median control group risk across studies) is provided in footnotes. The corresponding risk (and its 95% confidence interval) is based on the assumed risk in the comparison group and the relative effect of the intervention (and its 95% CI).
CI: Confidence interval; RR: Risk Ratio

GRADE Working Group grades of evidence
High quality: Further research is very unlikely to change our confidence in the estimate of effect.
Moderate quality: Further research is likely to have an important impact on our confidence in the estimate of effect and may change the estimate.
Low quality: Further research is very likely to have an important impact on our confidence in the estimate of effect and is likely to change the estimate.
Very low quality: We are very uncertain about the estimate.

1. Control risk based on study data
2. Evidence downgraded one level for imprecision (single small trial) and one level for indirectness (question over outcome measure's validity; the importance of individual problems will vary)
3. Two‐year follow‐up data from the same trial (74 participants) showed that based on a control risk of 595 per 1000 in the delayed group, early mobilisation resulted in 160/1000 fewer people with one or more problems at two years (95% CI 321 fewer to 90 more); very low quality evidence (see above footnote)
4. Data from control group of study
5. Evidence downgraded one level for imprecision (single small trial data) and one level for indirectness ('independent function' and physiotherapy discharge depicts an intermediate outcome)
6. Evidence downgraded one level for study limitations (several domains at unclear risk of bias) and one level for imprecision (single small trial data)
7. Evidence from a trial comparing 1 versus 4 weeks immobilisation for predominantly displaced fractures
8. Evidence downgraded one level for study limitations (study at high risk of bias) and two levels for imprecision (wide confidence intervals; single small trial data)
9. The assumed risk is the median control group risk across studies
10. Evidence downgraded one level for study limitations and two levels for imprecision (sparse data and wide confidence intervals)
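
As a worked illustration of the note on corresponding risk above, the figures in the Croft Shoulder Disability row can be reproduced by multiplying the assumed risk by the risk ratio and its confidence limits. This is a minimal sketch; the helper names `corresponding_risk` and `absolute_difference` are hypothetical and are not taken from the review or from any GRADE software.

```python
def corresponding_risk(assumed_per_1000, rr, rr_low, rr_high):
    """Corresponding risk per 1000 (point estimate, lower limit, upper limit).

    Each value is the assumed (control) risk multiplied by the risk ratio
    or one of its 95% confidence limits, rounded to a whole number.
    """
    return tuple(round(assumed_per_1000 * r) for r in (rr, rr_low, rr_high))


def absolute_difference(assumed_per_1000, risks):
    """Difference per 1000 versus the assumed risk (negative means fewer people)."""
    return tuple(r - assumed_per_1000 for r in risks)


# Croft Shoulder Disability Score at 1 year: assumed risk 725 per 1000,
# RR 0.59 (95% CI 0.40 to 0.88), as reported in the table above.
risks = corresponding_risk(725, 0.59, 0.40, 0.88)
print(risks)                            # (428, 290, 638)
print(absolute_difference(725, risks))  # (-297, -435, -87)
```

The same calculation applied to the adverse-event rows can differ from the published limits by one in the last digit, presumably because the risk ratios printed in the table are themselves rounded.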

Table 1. Surgical versus non‐surgical treatment trials: brief characteristics

| Study | Participants (Neer classification) | Surgery | Non‐surgical (starting with) | Follow‐up |
| --- | --- | --- | --- | --- |
| Boons 2012 | 50 participants with 4‐part fractures (The Netherlands) | Humeral head replacement with the Global prostheses; cemented | Sling immobilisation | 1 year |
| Fjalestad 2010 | 50 participants with 3‐ or 4‐part fractures (Norway) | Open reduction and fixation with an interlocking plate device and metal cerclages | Immobilisation of the injured arm in a modified Velpeau bandage. Closed reduction in 8 patients. | 2 years |
| Kristiansen 1988 | 30 participants with 31 2‐, 3‐ or 4‐part fractures. Included 7 2‐part, 19 3‐part and 5 4‐part fractures (Denmark) | Percutaneous reduction and external fixation | Closed manipulation and sling immobilisation | 2 years |
| Olerud 2011a | 60 participants with 3‐part fractures (all had displaced surgical neck fracture) (Sweden) | Open reduction and fixation with a PHILOS plate and non‐absorbable sutures | Sling immobilisation | 2 years |
| Olerud 2011b | 55 participants with 4‐part fractures (Sweden) | Humeral head replacement with the Global Fx prosthesis | Sling immobilisation | 2 years |
| ProFHER 2015 | 250 participants with "displaced fracture of the proximal humerus that involved the surgical neck". Included 18 1‐part (but confirmed as still "displaced"), 128 2‐part, 93 3‐part and 11 4‐part fractures (UK) | Either internal fixation (majority were PHILOS plates) or joint replacement (hemiarthroplasty). Pragmatic trial ‐ choice based on surgeon's experience with method | Sling immobilisation | 2 years |
| Stableforth 1984 | 32 participants with 4‐part fracture (UK) | Hemiarthroplasty | Closed manipulation and sling | 6 months |
| Zyto 1997 | 40 participants with 3‐ or 4‐part fractures (3 others excluded) (Sweden) | Internal fixation using surgical tension band or cerclage wiring | Sling immobilisation | 50 months |

Table 2. Assessment of items relating to applicability of trial findings

| Study | Clearly defined study population? | Interventions sufficiently described? | Main outcomes sufficiently described? | Appropriate timing of outcome measurement? (Yes = ≥ 1 year) |
| --- | --- | --- | --- | --- |
| Agorastides 2007 | Partial: exclusions not specified upfront | Yes | Yes | Yes: 1 year |
| Bertoft 1984 | Partial: no exclusion criteria given (e.g. ability to understand instructions for exercises) | Yes | Yes | Yes: 1 year |
| Boons 2012 | Yes | Yes | Yes | Yes: 1 year |
| Buecking 2014 | Partial: indication for hemiarthroplasty poorly defined (27 excluded before randomisation because "implantation of a prosthesis was planned") | Yes | Yes | Yes: 1 year |
| Cai 2012 | Partial: unclear definition of 4‐part fractures. | Yes: however, time to surgery not reported | Yes | Yes: 2 years |
| Fialka 2008 | Yes | Yes | Yes | Yes: 1 year |
| Fjalestad 2010 | Yes | Yes | Yes | Yes: 2 years |
| Hoellen 1997 | Yes: but some question over fracture type in that the Holbein 1999 report included 3‐part fractures too | Yes | Yes | Yes: 1 year |
| Hodgson 2003 | Yes | Yes | Yes | Yes: 2 years |
| Kristiansen 1988 | Partial: no exclusion criteria given | Partial: incomplete description of timing of sling use and care of external fixator pin sites | Partial: no description of measurement procedures | Yes: 1 year |
| Kristiansen 1989 | Partial: no exclusion criteria given | Partial: although sling and body bandage are common expressions, some variation possible | Partial: no description of measurement procedures | Yes: 2 years |
| Lefevre‐Colau 2007 | Yes | Yes | Yes | Partial: 6 months |
| Livesley 1992 | Yes: although this included 4 patients under 20 years with epiphyseal fractures | Yes | Yes | Partial: 6 months |
| Lopiz 2014 | Partial: insufficient criteria given in terms of suitability for surgery | Yes | Yes | Partial: minimum 6 months |
| Lundberg 1979 | Partial: no exclusion criteria given (e.g. ability to understand instructions for exercises) | Yes | Yes | Yes: 1 year or above (mean: 16 months) |
| Ockert 2010 | Partial: exclusion criteria described in context of post‐randomisation exclusions. | Yes | Yes | Partial: 6 months |
| Olerud 2011a | Yes | Yes | Yes | Yes: 2 years |
| Olerud 2011b | Yes | Yes | Yes | Yes: 2 years |
| ProFHER 2015 | Yes | Yes. In the context of this being a pragmatic trial | Yes | Yes: 2 years |
| Revay 1992 | Yes | Partial: frequency of swimming sessions not stated | Yes | Yes: 1 year |
| Rommens 1993 | Yes: but to note that other fractures including rib (3 participants) were included | Yes | Partial: functional outcome assessment not described (sufficiently) | No: only until fracture consolidation |
| Sebastiá‐Forcada 2014 | Yes | Yes | Yes | Yes: minimum 2 years |
| Smejkal 2011 | Yes | Partial: Only minimal intra‐operative details given and nothing regarding post‐operative management including rehabilitation | Partial: this may have been ‘lost in translation’ (Czech article) | Yes: mean 2 years but range not stated (probably most/all > 1 year as recruitment had finished January 2010). |
| Soliman 2013 | No: no explanation given for a younger population; insufficient criteria given in terms of suitability for hemiarthroplasty | Yes | Partial: incomplete description of pain categories; no clarification of modification to Constant score | Yes: minimum 21 months follow‐up |
| Stableforth 1984 | Yes | Yes | Partial: no description of measurement procedures, incomplete description of pain categories | Partial: up to 6 months, then between 18 months to 12 years. This is too spread out. Most results applied to the 6‐month follow‐up. |
| Torrens 2012 | Partial: the < 1.5 cm criterion for posterior displacement of the greater tuberosity is unusual and no justification was given by the authors | Partial: incomplete description of accompanying "progressive rehabilitation program" | Partial: incomplete description of measurement procedures | Yes: 1 year |
| Voigt 2011 | Yes | Yes | Yes | Yes: 1 year |
| Wirbel 1999 | Yes | Yes | Partial: no description of measurement procedures | Partial: between 9 and 36 months; < 1 year in 10 participants. Main results applied to 6 months. |
| Zhang 2011 | Yes | Yes | Partial: Insufficient information on measurement of complications and timing of their measurement. | Yes: All over 25 months (mean 30.8 months) |
| Zhu 2011 | Yes | Yes | Yes | Yes: 1 and 3 years |
| Zyto 1997 | Yes | Yes | Yes | Yes: 1 year, and 3 to 5 years |

Comparison 1. Early mobilisation (within or up to 1 week) versus immobilisation for 3 or 4 weeks

| Outcome or subgroup title | No. of studies | No. of participants | Statistical method | Effect size |
| --- | --- | --- | --- | --- |
| 1 Shoulder disability: Croft Shoulder Disability Score | 1 | | Risk Ratio (M‐H, Fixed, 95% CI) | Totals not selected |
| 1.1 Disability (1 or more problems) at 1 year | 1 | | Risk Ratio (M‐H, Fixed, 95% CI) | 0.0 [0.0, 0.0] |
| 1.2 Severe disability (5 or more problems) at 1 year | 1 | | Risk Ratio (M‐H, Fixed, 95% CI) | 0.0 [0.0, 0.0] |
| 1.3 Disability (1 or more problems) at 2 years | 1 | | Risk Ratio (M‐H, Fixed, 95% CI) | 0.0 [0.0, 0.0] |
| 1.4 Severe disability (5 or more problems) at 2 years | 1 | | Risk Ratio (M‐H, Fixed, 95% CI) | 0.0 [0.0, 0.0] |
| 2 Croft shoulder disability score: individual problems at 2 years | 1 | | Risk Ratio (M‐H, Fixed, 95% CI) | Totals not selected |
| 2.1 Pain on movement | 1 | | Risk Ratio (M‐H, Fixed, 95% CI) | 0.0 [0.0, 0.0] |
| 2.2 Bathing difficulties | 1 | | Risk Ratio (M‐H, Fixed, 95% CI) | 0.0 [0.0, 0.0] |
| 2.3 Change position at night more often | 1 | | Risk Ratio (M‐H, Fixed, 95% CI) | 0.0 [0.0, 0.0] |
| 2.4 Disturbed sleep | 1 | | Risk Ratio (M‐H, Fixed, 95% CI) | 0.0 [0.0, 0.0] |
| 2.5 No active pastimes or usual physical recreation | 1 | | Risk Ratio (M‐H, Fixed, 95% CI) | 0.0 [0.0, 0.0] |
| 2.6 Lifting problems | 1 | | Risk Ratio (M‐H, Fixed, 95% CI) | 0.0 [0.0, 0.0] |
| 2.7 Help needed | 1 | | Risk Ratio (M‐H, Fixed, 95% CI) | 0.0 [0.0, 0.0] |
| 2.8 More accidents (e.g. dropping things) | 1 | | Risk Ratio (M‐H, Fixed, 95% CI) | 0.0 [0.0, 0.0] |
| 3 Number of treatment sessions (until independent function achieved) | 1 | | Mean Difference (IV, Fixed, 95% CI) | Totals not selected |
| 4 SF‐36 scores: pain & physical dimensions | 1 | | Mean Difference (IV, Fixed, 95% CI) | Totals not selected |
| 4.1 Physical functioning (0‐100: excellent) at 16 weeks | 1 | | Mean Difference (IV, Fixed, 95% CI) | 0.0 [0.0, 0.0] |
| 4.2 Physical functioning (0‐100: excellent) at 1 year | 1 | | Mean Difference (IV, Fixed, 95% CI) | 0.0 [0.0, 0.0] |
| 4.3 Role limitation physical (0‐100: none) at 16 weeks | 1 | | Mean Difference (IV, Fixed, 95% CI) | 0.0 [0.0, 0.0] |
| 4.4 Role limitation physical (0‐100: none) at 1 year | 1 | | Mean Difference (IV, Fixed, 95% CI) | 0.0 [0.0, 0.0] |
| 4.5 Pain (0‐100: none) at 16 weeks | 1 | | Mean Difference (IV, Fixed, 95% CI) | 0.0 [0.0, 0.0] |
| 4.6 Pain (0‐100: none) at 1 year | 1 | | Mean Difference (IV, Fixed, 95% CI) | 0.0 [0.0, 0.0] |
| 5 Quality of life assessment: EuroQol 5D (0: dead to 1: best health) | 1 | | Mean Difference (IV, Fixed, 95% CI) | Totals not selected |
| 5.1 At 3 months | 1 | | Mean Difference (IV, Fixed, 95% CI) | 0.0 [0.0, 0.0] |
| 5.2 At 6 months | 1 | | Mean Difference (IV, Fixed, 95% CI) | 0.0 [0.0, 0.0] |
| 5.3 At 12 months | 1 | | Mean Difference (IV, Fixed, 95% CI) | 0.0 [0.0, 0.0] |
| 6 Adverse events | 4 | | Risk Ratio (M‐H, Fixed, 95% CI) | Subtotals only |
| 6.1 Frozen shoulder | 1 | 80 | Risk Ratio (M‐H, Fixed, 95% CI) | 0.33 [0.01, 7.95] |
| 6.2 Fracture displacement | 2 | 106 | Risk Ratio (M‐H, Fixed, 95% CI) | 2.2 [0.22, 22.45] |
| 6.3 Non‐union | 2 | 106 | Risk Ratio (M‐H, Fixed, 95% CI) | 0.0 [0.0, 0.0] |
| 6.4 Complex regional pain syndrome type 1 | 2 | 115 | Risk Ratio (M‐H, Fixed, 95% CI) | 1.09 [0.07, 16.71] |
| 6.5 Treated (injection) subacromial impingement | 1 | 64 | Risk Ratio (M‐H, Fixed, 95% CI) | 1.0 [0.07, 15.30] |
| 6.6 Shoulder complications | 4 | 259 | Risk Ratio (M‐H, Fixed, 95% CI) | 0.73 [0.15, 3.63] |
| 6.7 Fracture complications | 2 | 106 | Risk Ratio (M‐H, Fixed, 95% CI) | 2.2 [0.22, 22.45] |
| 7 Mortality | 1 | | Risk Ratio (M‐H, Fixed, 95% CI) | Totals not selected |
| 8 Constant shoulder score (ratio of affected/unaffected arm) | 1 | | Mean Difference (IV, Fixed, 95% CI) | Totals not selected |
| 8.1 8 weeks | 1 | | Mean Difference (IV, Fixed, 95% CI) | 0.0 [0.0, 0.0] |
| 8.2 16 weeks | 1 | | Mean Difference (IV, Fixed, 95% CI) | 0.0 [0.0, 0.0] |
| 8.3 1 year | 1 | | Mean Difference (IV, Fixed, 95% CI) | 0.0 [0.0, 0.0] |
| 9 Constant shoulder score (0 to 100: best) | 2 | | Mean Difference (IV, Fixed, 95% CI) | Subtotals only |
| 9.1 At 6 weeks | 1 | 64 | Mean Difference (IV, Fixed, 95% CI) | 10.10 [2.02, 18.18] |
| 9.2 At 3 months | 2 | 106 | Mean Difference (IV, Fixed, 95% CI) | 6.53 [0.77, 12.30] |
| 9.3 At 6 months | 2 | 105 | Mean Difference (IV, Fixed, 95% CI) | 3.39 [‐1.46, 8.24] |
| 9.4 At 12 months | 1 | 39 | Mean Difference (IV, Fixed, 95% CI) | 1.46 [‐7.05, 9.97] |
| 9.5 6 months: subjective assessment (0 to 35: best) | 1 | 64 | Mean Difference (IV, Fixed, 95% CI) | 1.90 [‐0.54, 4.34] |
| 9.6 6 months: objective assessment range of motion and strength (0 to 65: best) | 1 | 64 | Mean Difference (IV, Fixed, 95% CI) | 4.10 [‐0.62, 8.82] |
| 10 Pain VAS (0 to 100: worst pain) | 2 | | Mean Difference (IV, Fixed, 95% CI) | Subtotals only |
| 10.1 At 6 weeks | 1 | 64 | Mean Difference (IV, Fixed, 95% CI) | ‐3.60 [‐20.76, 13.56] |
| 10.2 At 3 months | 2 | 106 | Mean Difference (IV, Fixed, 95% CI) | ‐5.13 [‐14.76, 4.50] |
| 10.3 At 6 months | 2 | 105 | Mean Difference (IV, Fixed, 95% CI) | 4.29 [‐5.48, 14.07] |
| 10.4 At 12 months | 1 | 39 | Mean Difference (IV, Fixed, 95% CI) | 10.8 [‐4.59, 26.19] |
| 11 Changes in pain intensity (mm) from baseline: 100 mm visual analogue scale (positive change = less pain) | 1 | | Mean Difference (IV, Fixed, 95% CI) | Totals not selected |
| 11.1 At 6 weeks | 1 | | Mean Difference (IV, Fixed, 95% CI) | 0.0 [0.0, 0.0] |
| 11.2 At 3 months | 1 | | Mean Difference (IV, Fixed, 95% CI) | 0.0 [0.0, 0.0] |
| 11.3 At 6 months | 1 | | Mean Difference (IV, Fixed, 95% CI) | 0.0 [0.0, 0.0] |
| 12 Range of motion at 6 months (degrees): difference between two shoulders | 1 | | Mean Difference (IV, Fixed, 95% CI) | Totals not selected |
| 12.1 Abduction | 1 | | Mean Difference (IV, Fixed, 95% CI) | 0.0 [0.0, 0.0] |
| 12.2 Anterior elevation | 1 | | Mean Difference (IV, Fixed, 95% CI) | 0.0 [0.0, 0.0] |
| 12.3 Lateral rotation | 1 | | Mean Difference (IV, Fixed, 95% CI) | 0.0 [0.0, 0.0] |
| 13 Patient dissatisfied with treatment | 1 | | Risk Ratio (M‐H, Fixed, 95% CI) | Totals not selected |
| 14 Patient satisfaction (0 to 10: higher scores ‐ greater satisfaction) | 1 | | Mean Difference (IV, Fixed, 95% CI) | Totals not selected |
| 14.1 At 3 months | 1 | | Mean Difference (IV, Fixed, 95% CI) | 0.0 [0.0, 0.0] |
| 14.2 At 6 months | 1 | | Mean Difference (IV, Fixed, 95% CI) | 0.0 [0.0, 0.0] |
| 14.3 At 12 months | 1 | | Mean Difference (IV, Fixed, 95% CI) | 0.0 [0.0, 0.0] |

Comparison 2. Gilchrist bandage versus 'Classic' Desault bandage

| Outcome or subgroup title | No. of studies | No. of participants | Statistical method | Effect size |
| --- | --- | --- | --- | --- |
| 1 Problems with bandages | 1 | | Risk Ratio (M‐H, Fixed, 95% CI) | Totals not selected |
| 1.1 Application of bandage was uncomfortable | 1 | | Risk Ratio (M‐H, Fixed, 95% CI) | 0.0 [0.0, 0.0] |
| 1.2 Premature bandage removal | 1 | | Risk Ratio (M‐H, Fixed, 95% CI) | 0.0 [0.0, 0.0] |
| 2 Fracture displacement by 3 weeks | 1 | | Risk Ratio (M‐H, Fixed, 95% CI) | Totals not selected |
| 3 Poor or bad rating by patient at fracture consolidation | 1 | | Risk Ratio (M‐H, Fixed, 95% CI) | Totals not selected |

Comparison 3. Instructed self‐exercise versus conventional physiotherapy

| Outcome or subgroup title | No. of studies | No. of participants | Statistical method | Effect size |
| --- | --- | --- | --- | --- |
| 1 Pain at one year (scale 0 to 8: maximum pain) | 1 | | Mean Difference (IV, Fixed, 95% CI) | Totals not selected |
| 2 Severe or moderate pain at 3 months | 1 | | Risk Ratio (M‐H, Fixed, 95% CI) | Totals not selected |
| 3 Requested change of therapy | 1 | | Risk Ratio (M‐H, Fixed, 95% CI) | Totals not selected |
| 4 Adverse events (frozen shoulder: 1 v 2; unexplained prolonged pain: 0 v 1) | 1 | | Risk Ratio (M‐H, Fixed, 95% CI) | Totals not selected |
| 5 Neer's rating (0 to 100: best) at mean 16 months (exploratory analysis) | 1 | | Mean Difference (IV, Fixed, 95% CI) | Totals not selected |
| 6 Active gleno‐humeral elevation (degrees) | 1 | | Mean Difference (IV, Fixed, 95% CI) | Totals not selected |

Comparison 4. Surgical versus non‐surgical treatment

| Outcome or subgroup title | No. of studies | No. of participants | Statistical method | Effect size |
| --- | --- | --- | --- | --- |
| 1 Functional scores at 12 months (higher = better outcome) | 5 | 419 | Std. Mean Difference (IV, Fixed, 95% CI) | 0.07 [‐0.12, 0.26] |
| 1.1 DASH (0 to 100: worst disability) (reversed) | 2 | 105 | Std. Mean Difference (IV, Fixed, 95% CI) | 0.19 [‐0.19, 0.57] |
| 1.2 ASES (0 to 24: best) | 1 | 48 | Std. Mean Difference (IV, Fixed, 95% CI) | ‐0.10 [‐0.67, 0.46] |
| 1.3 SST (0 to 12: best) | 1 | 47 | Std. Mean Difference (IV, Fixed, 95% CI) | 0.11 [‐0.46, 0.69] |
| 1.4 OSS (0 to 48: best) | 1 | 219 | Std. Mean Difference (IV, Fixed, 95% CI) | 0.04 [‐0.22, 0.31] |
| 2 Functional scores at 24 months (higher = better outcome) | 4 | 351 | Std. Mean Difference (IV, Fixed, 95% CI) | 0.07 [‐0.14, 0.28] |
| 2.1 DASH (0 to 100: worst disability) (reversed) | 2 | 99 | Std. Mean Difference (IV, Fixed, 95% CI) | 0.33 [‐0.07, 0.73] |
| 2.2 ASES (0 to 24: best) | 1 | 42 | Std. Mean Difference (IV, Fixed, 95% CI) | ‐0.02 [‐0.62, 0.59] |
| 2.3 OSS (0 to 48: best) | 1 | 210 | Std. Mean Difference (IV, Fixed, 95% CI) | ‐0.03 [‐0.30, 0.24] |
| 3 Oxford Shoulder Score (0 to 48: best outcome) | 1 | | Mean Difference (IV, Fixed, 95% CI) | Subtotals only |
| 3.1 Over 2 years | 1 | 231 | Mean Difference (IV, Fixed, 95% CI) | 0.75 [‐1.68, 3.18] |
| 3.2 At 6 months | 1 | 226 | Mean Difference (IV, Fixed, 95% CI) | 2.25 [‐0.42, 4.92] |
| 3.3 At 12 months | 1 | 219 | Mean Difference (IV, Fixed, 95% CI) | 0.43 [‐2.10, 2.96] |
| 3.4 At 24 months | 1 | 210 | Mean Difference (IV, Fixed, 95% CI) | ‐0.29 [‐2.84, 2.26] |
| 4 DASH (0 to 100: worst disability) | 2 | | Mean Difference (IV, Fixed, 95% CI) | Subtotals only |
| 4.1 at 4 months | 2 | 106 | Mean Difference (IV, Fixed, 95% CI) | 0.91 [‐7.00, 8.83] |
| 4.2 at 12 months | 2 | 105 | Mean Difference (IV, Fixed, 95% CI) | ‐4.51 [‐13.50, 4.48] |
| 4.3 at 24 months | 2 | 99 | Mean Difference (IV, Fixed, 95% CI) | ‐7.43 [‐16.26, 1.41] |
| 5 American Shoulder and Elbow Surgeons score (0 to 24: best) | 1 | | Mean Difference (IV, Fixed, 95% CI) | Totals not selected |
| 5.1 at 6 months | 1 | | Mean Difference (IV, Fixed, 95% CI) | 0.0 [0.0, 0.0] |
| 5.2 at 12 months | 1 | | Mean Difference (IV, Fixed, 95% CI) | 0.0 [0.0, 0.0] |
| 5.3 at 24 months | 1 | | Mean Difference (IV, Fixed, 95% CI) | 0.0 [0.0, 0.0] |
| 6 Simple Shoulder Test (0 to 12: best function) | 1 | | Mean Difference (IV, Fixed, 95% CI) | Totals not selected |
| 6.1 at 3 months | 1 | | Mean Difference (IV, Fixed, 95% CI) | 0.0 [0.0, 0.0] |
| 6.2 at 12 months | 1 | | Mean Difference (IV, Fixed, 95% CI) | 0.0 [0.0, 0.0] |
| 7 Activities of daily living | 1 | | Risk Ratio (M‐H, Fixed, 95% CI) | Totals not selected |
| 7.1 Unable to manage personal hygiene at 1 year | 1 | | Risk Ratio (M‐H, Fixed, 95% CI) | 0.0 [0.0, 0.0] |
| 7.2 Unable to comb hair at 1 year | 1 | | Risk Ratio (M‐H, Fixed, 95% CI) | 0.0 [0.0, 0.0] |
| 7.3 Unable to sleep on fractured side at 1 year | 1 | | Risk Ratio (M‐H, Fixed, 95% CI) | 0.0 [0.0, 0.0] |
| 7.4 Unable to carry 5 kg at 1 year | 1 | | Risk Ratio (M‐H, Fixed, 95% CI) | 0.0 [0.0, 0.0] |
| 7.5 Unable to manage personal hygiene at 50 months | 1 | | Risk Ratio (M‐H, Fixed, 95% CI) | 0.0 [0.0, 0.0] |
| 7.6 Unable to comb hair at 50 months | 1 | | Risk Ratio (M‐H, Fixed, 95% CI) | 0.0 [0.0, 0.0] |
| 7.7 Unable to sleep on fractured side at 50 months | 1 | | Risk Ratio (M‐H, Fixed, 95% CI) | 0.0 [0.0, 0.0] |
| 7.8 Unable to carry 5 kg at 50 months | 1 | | Risk Ratio (M‐H, Fixed, 95% CI) | 0.0 [0.0, 0.0] |
| 8 Quality of life assessment: EuroQol (0: dead to 1: best health) | 4 | | Mean Difference (IV, Fixed, 95% CI) | Subtotals only |
| 8.1 at 3 to 4 months | 4 | 360 | Mean Difference (IV, Fixed, 95% CI) | 0.01 [‐0.02, 0.04] |
| 8.2 at 6 months | 4 | 381 | Mean Difference (IV, Fixed, 95% CI) | 0.04 [0.01, 0.08] |
| 8.3 at 12 months | 4 | 371 | Mean Difference (IV, Fixed, 95% CI) | 0.02 [‐0.02, 0.06] |
| 8.4 at 24 months | 4 | 354 | Mean Difference (IV, Fixed, 95% CI) | 0.03 [‐0.01, 0.08] |
| 9 Quality of life assessment (Fjalestad 2010 and 2014 data) | 1 | | Mean Difference (IV, Fixed, 95% CI) | Totals not selected |
| 9.1 15D at 3 months (0: death; 1: perfect health) | 1 | | Mean Difference (IV, Fixed, 95% CI) | 0.0 [0.0, 0.0] |
| 9.2 15D at 6 months | 1 | | Mean Difference (IV, Fixed, 95% CI) | 0.0 [0.0, 0.0] |
| 9.3 15D at 12 months | 1 | | Mean Difference (IV, Fixed, 95% CI) | 0.0 [0.0, 0.0] |
| 9.4 number of QALYs at 1 year | 1 | | Mean Difference (IV, Fixed, 95% CI) | 0.0 [0.0, 0.0] |
| 9.5 numbers of QALYs at 1 year (‐ deaths) | 1 | | Mean Difference (IV, Fixed, 95% CI) | 0.0 [0.0, 0.0] |
| 9.6 15D at 24 months | 1 | | Mean Difference (IV, Fixed, 95% CI) | 0.0 [0.0, 0.0] |
| 10 Quality of life: SF‐12 Physical Component Score (0 to 100: best) | 1 | | Mean Difference (IV, Fixed, 95% CI) | Subtotals only |
| 10.1 at 6 months | 1 | 216 | Mean Difference (IV, Fixed, 95% CI) | 2.60 [‐0.24, 5.44] |
| 10.2 at 12 months | 1 | 218 | Mean Difference (IV, Fixed, 95% CI) | 1.5 [‐1.42, 4.42] |
| 10.3 at 24 months | 1 | 210 | Mean Difference (IV, Fixed, 95% CI) | 1.10 [‐1.99, 4.19] |
| 11 Quality of life: SF‐12 Mental Component Score (0 to 100: best) | 1 | | Mean Difference (IV, Fixed, 95% CI) | Subtotals only |
| 11.1 at 6 months | 1 | 216 | Mean Difference (IV, Fixed, 95% CI) | ‐0.60 [‐3.57, 2.37] |
| 11.2 at 12 months | 1 | 218 | Mean Difference (IV, Fixed, 95% CI) | ‐2.0 [‐4.81, 0.81] |
| 11.3 at 24 months | 1 | 210 | Mean Difference (IV, Fixed, 95% CI) | ‐1.40 [‐4.33, 1.53] |
| 12 Mortality | 6 | 496 | Risk Ratio (M‐H, Fixed, 95% CI) | 1.40 [0.69, 2.83] |
| 13 Additional surgery (re‐operation or secondary surgery) | 7 | 523 | Risk Ratio (M‐H, Fixed, 95% CI) | 2.06 [1.18, 3.60] |
| 13.1 at 6 to 12 months | 3 | 113 | Risk Ratio (M‐H, Fixed, 95% CI) | 1.37 [0.32, 5.93] |
| 13.2 at 2 years | 4 | 410 | Risk Ratio (M‐H, Fixed, 95% CI) | 2.20 [1.20, 4.04] |
| 14 Adverse events / complications | 8 | | Risk Ratio (M‐H, Fixed, 95% CI) | Subtotals only |
| 14.1 Number of patients with complications | 1 | 250 | Risk Ratio (M‐H, Fixed, 95% CI) | 1.30 [0.80, 2.11] |
| 14.2 Additional shoulder‐related therapy | 1 | 250 | Risk Ratio (M‐H, Fixed, 95% CI) | 1.75 [0.53, 5.83] |
| 14.3 Infection | 8 | 559 | Risk Ratio (M‐H, Fixed, 95% CI) | 4.31 [1.11, 16.74] |
| 14.4 Nerve injury / palsy | 4 | 396 | Risk Ratio (M‐H, Fixed, 95% CI) | 1.16 [0.37, 3.59] |
| 14.5 Non‐union | 7 | 523 | Risk Ratio (M‐H, Fixed, 95% CI) | 0.43 [0.19, 0.98] |
| 14.6 Avascular necrosis | 7 | 513 | Risk Ratio (M‐H, Fixed, 95% CI) | 0.83 [0.53, 1.32] |
| 14.7 Symptomatic malunion | 1 | 250 | Risk Ratio (M‐H, Fixed, 95% CI) | 0.8 [0.22, 2.91] |
| 14.8 Screw penetration into joint | 3 | 160 | Risk Ratio (M‐H, Fixed, 95% CI) | 11.49 [2.25, 58.76] |
| 14.9 Metalwork (internal fixation) problems | 1 | 250 | Risk Ratio (M‐H, Fixed, 95% CI) | 21.0 [1.24, 354.53] |
| 14.10 Wire penetration at 1 year | 1 | 38 | Risk Ratio (M‐H, Fixed, 95% CI) | 3.0 [0.13, 69.31] |
| 14.11 Redisplacement resulting in an operation | 2 | 81 | Risk Ratio (M‐H, Fixed, 95% CI) | 0.26 [0.03, 2.22] |
| 14.12 Implant‐related (hemiarthroplasty) failure | 2 | 300 | Risk Ratio (M‐H, Fixed, 95% CI) | 4.0 [0.45, 35.18] |
| 14.13 Secondary dislocation or resorption of the greater tuberosity | 2 | 101 | Risk Ratio (M‐H, Fixed, 95% CI) | 13.15 [1.78, 96.90] |
| 14.14 Tuberosity displacement at 50 months | 1 | 29 | Risk Ratio (M‐H, Fixed, 95% CI) | 0.15 [0.01, 2.71] |
| 14.15 Fixation failure resulting in an operation | 1 | 50 | Risk Ratio (M‐H, Fixed, 95% CI) | 3.0 [0.13, 70.30] |
| 14.16 Refracture | 1 | 22 | Risk Ratio (M‐H, Fixed, 95% CI) | 1.0 [0.07, 14.05] |
| 14.17 Post‐traumatic stiffness | 1 | 250 | Risk Ratio (M‐H, Fixed, 95% CI) | 1.2 [0.38, 3.83] |
| 14.18 Impingement | 2 | 308 | Risk Ratio (M‐H, Fixed, 95% CI) | 1.0 [0.18, 5.62] |
| 14.19 Rotator cuff tear | 2 | 300 | Risk Ratio (M‐H, Fixed, 95% CI) | 3.0 [0.48, 18.73] |
| 14.20 Post‐traumatic stiffness | 1 | 250 | Risk Ratio (M‐H, Fixed, 95% CI) | 1.2 [0.38, 3.83] |
| 14.21 CRPS or severe pain | 1 | 250 | Risk Ratio (M‐H, Fixed, 95% CI) | 2.0 [0.18, 21.78] |
| 14.22 Dislocation or instability | 1 | 250 | Risk Ratio (M‐H, Fixed, 95% CI) | 0.33 [0.01, 8.10] |
| 14.23 Heterotopic ossification | 1 | 50 | Risk Ratio (M‐H, Fixed, 95% CI) | 0.0 [0.0, 0.0] |
| 14.24 Post‐traumatic osteoarthritis (signs of) | 4 | 183 | Risk Ratio (M‐H, Fixed, 95% CI) | 0.68 [0.27, 1.70] |
| 15 Dependent in activities of daily living (or dead) at 6 months | 1 | | Risk Ratio (M‐H, Fixed, 95% CI) | Totals not selected |
| 16 Constant scores (overall: 0 to 100: best score) | 5 | | Mean Difference (IV, Fixed, 95% CI) | Subtotals only |
| 16.1 at 3‐4 months | 3 | 156 | Mean Difference (IV, Fixed, 95% CI) | ‐2.90 [‐7.35, 1.56] |
| 16.2 at 12 months | 4 | 199 | Mean Difference (IV, Fixed, 95% CI) | 2.81 [‐2.20, 7.82] |
| 16.3 at 24 months | 3 | 143 | Mean Difference (IV, Fixed, 95% CI) | ‐0.25 [‐6.75, 6.25] |
| 16.4 at 50 months | 1 | 29 | Mean Difference (IV, Fixed, 95% CI) | ‐5.0 [‐17.52, 7.52] |
| 17 Constant scores (difference between injured and uninjured shoulder): Normal = 0. | 1 | | Mean Difference (IV, Fixed, 95% CI) | Totals not selected |
| 17.1 at 6 months | 1 | | Mean Difference (IV, Fixed, 95% CI) | 0.0 [0.0, 0.0] |
| 17.2 at 12 months | 1 | | Mean Difference (IV, Fixed, 95% CI) | 0.0 [0.0, 0.0] |
| 17.3 at 24 months | 1 | | Mean Difference (IV, Fixed, 95% CI) | 0.0 [0.0, 0.0] |
| 18 Poor or unsatisfactory function at 1 year (Neer rating) | 1 | | Risk Ratio (M‐H, Fixed, 95% CI) | Totals not selected |
| 19 VAS disability (0 to 100: no restrictions) | 1 | | Mean Difference (IV, Fixed, 95% CI) | Totals not selected |
| 19.1 at 3 months | 1 | | Mean Difference (IV, Fixed, 95% CI) | 0.0 [0.0, 0.0] |
| 19.2 at 12 months | 1 | | Mean Difference (IV, Fixed, 95% CI) | 0.0 [0.0, 0.0] |
| 20 Pain: VAS (0 to 100: worst pain) | 3 | | Mean Difference (IV, Fixed, 95% CI) | Subtotals only |
| 20.1 At 3 months | 1 | 49 | Mean Difference (IV, Fixed, 95% CI) | ‐18.0 [‐29.03, ‐6.97] |
| 20.2 At 2 years | 2 | 101 | Mean Difference (IV, Fixed, 95% CI) | ‐6.38 [‐14.18, 1.41] |
| 21 Constant score at 50 months: overall and components | 1 | | Mean Difference (IV, Fixed, 95% CI) | Totals not selected |
| 21.1 Overall score (0‐100: best score) | 1 | | Mean Difference (IV, Fixed, 95% CI) | 0.0 [0.0, 0.0] |
| 21.2 Pain (maximum score 15) | 1 | | Mean Difference (IV, Fixed, 95% CI) | 0.0 [0.0, 0.0] |
| 21.3 Range of motion (maximum score 40) | 1 | | Mean Difference (IV, Fixed, 95% CI) | 0.0 [0.0, 0.0] |
| 21.4 Power (maximum score 25) | 1 | | Mean Difference (IV, Fixed, 95% CI) | 0.0 [0.0, 0.0] |
| 21.5 Activities of daily living (maximum score 20) | 1 | | Mean Difference (IV, Fixed, 95% CI) | 0.0 [0.0, 0.0] |
| 22 Constant (often severe) pain at 6 months | 1 | | Risk Ratio (M‐H, Fixed, 95% CI) | Totals not selected |
| 23 Failure to recover 75% muscle power relative to other arm (survivors) at 6 months | 1 | | Risk Ratio (M‐H, Fixed, 95% CI) | Totals not selected |
| 23.1 Flexion | 1 | | Risk Ratio (M‐H, Fixed, 95% CI) | 0.0 [0.0, 0.0] |
| 23.2 Abduction | 1 | | Risk Ratio (M‐H, Fixed, 95% CI) | 0.0 [0.0, 0.0] |
| 23.3 Lateral rotation | 1 | | Risk Ratio (M‐H, Fixed, 95% CI) | 0.0 [0.0, 0.0] |
| 24 Range of movement impairments in survivors at 6 months | 1 | | Risk Ratio (M‐H, Fixed, 95% CI) | Totals not selected |
| 24.1 Flexion < 45 degrees | 1 | | Risk Ratio (M‐H, Fixed, 95% CI) | 0.0 [0.0, 0.0] |
| 24.2 Unable to place thumb on mid spine (T12) | 1 | | Risk Ratio (M‐H, Fixed, 95% CI) | 0.0 [0.0, 0.0] |
| 24.3 Lateral rotation < 5 degrees | 1 | | Risk Ratio (M‐H, Fixed, 95% CI) | 0.0 [0.0, 0.0] |
| 25 Costs at 1 year (Euros in 2005) | | | Other data | No numeric data |
| 26 Total costs including indirect costs (Euros) at 1 year | 1 | | Mean Difference (IV, Fixed, 95% CI) | Totals not selected |

Comparison 5. Locking plate versus locking intramedullary nail

Outcome or subgroup title

No. of studies

No. of participants

Statistical method

Effect size

1 American Shoulder and Elbow Surgeons (ASES) score (0 to 100: best)

1

Mean Difference (IV, Fixed, 95% CI)

Totals not selected

1.1 At 1 year

1

Mean Difference (IV, Fixed, 95% CI)

0.0 [0.0, 0.0]

1.2 At 3 years

1

Mean Difference (IV, Fixed, 95% CI)

0.0 [0.0, 0.0]

2 Death, re‐operation and adverse events

1

Risk Ratio (M‐H, Fixed, 95% CI)

Totals not selected

2.1 Death

1

Risk Ratio (M‐H, Fixed, 95% CI)

0.0 [0.0, 0.0]

2.2 Any complication

1

Risk Ratio (M‐H, Fixed, 95% CI)

0.0 [0.0, 0.0]

2.3 Screw penetration into humeral head (all had re‐operation)

1

Risk Ratio (M‐H, Fixed, 95% CI)

0.0 [0.0, 0.0]

2.4 Heterotopic ossification

1

Risk Ratio (M‐H, Fixed, 95% CI)

0.0 [0.0, 0.0]

2.5 Infection

1

Risk Ratio (M‐H, Fixed, 95% CI)

0.0 [0.0, 0.0]

2.6 Osteonecrosis

1

Risk Ratio (M‐H, Fixed, 95% CI)

0.0 [0.0, 0.0]

2.7 Degenerative change of glenohumeral joint

1

Risk Ratio (M‐H, Fixed, 95% CI)

0.0 [0.0, 0.0]

2.8 Secondary varus collapse

1

Risk Ratio (M‐H, Fixed, 95% CI)

0.0 [0.0, 0.0]

2.9 Non‐union

1

Risk Ratio (M‐H, Fixed, 95% CI)

0.0 [0.0, 0.0]

3 Pain (VAS: 0 to 10: worst)

Other data

No numeric data

4 Constant score (0 to 100: best)

1

Mean Difference (IV, Fixed, 95% CI)

Totals not selected

4.1 At 1 year

1

Mean Difference (IV, Fixed, 95% CI)

0.0 [0.0, 0.0]

4.2 At 3 years

1

Mean Difference (IV, Fixed, 95% CI)

0.0 [0.0, 0.0]

5 Active range of motion (at 3 years)

1

Mean Difference (IV, Fixed, 95% CI)

Totals not selected

5.1 Forward elevation (degrees)

1

Mean Difference (IV, Fixed, 95% CI)

0.0 [0.0, 0.0]

5.2 External rotation

1

Mean Difference (IV, Fixed, 95% CI)

0.0 [0.0, 0.0]

6 Range of movement: internal rotation (level on spine)

Other data

No numeric data

7 Strength of supraspinatus (relative to opposite side) % ‐ at 3 years

1

Mean Difference (IV, Fixed, 95% CI)

Totals not selected

7.1 At 1 year

1

Mean Difference (IV, Fixed, 95% CI)

0.0 [0.0, 0.0]

7.2 At 3 years

1

Mean Difference (IV, Fixed, 95% CI)

0.0 [0.0, 0.0]

8 Operation times and blood loss

1

Mean Difference (IV, Fixed, 95% CI)

Totals not selected

8.1 Duration of surgery (minutes)

1

Mean Difference (IV, Fixed, 95% CI)

0.0 [0.0, 0.0]

8.2 Blood loss (ml)

1

Mean Difference (IV, Fixed, 95% CI)

0.0 [0.0, 0.0]

9 Intra‐operative complication

1

Risk Ratio (M‐H, Fixed, 95% CI)

Totals not selected

9.1 Pneumothorax

1

Risk Ratio (M‐H, Fixed, 95% CI)

0.0 [0.0, 0.0]

9.2 Blood transfusion

1

Risk Ratio (M‐H, Fixed, 95% CI)

0.0 [0.0, 0.0]

Comparison 6. Locking plate versus intramedullary nails (Zifko method)

Outcome or subgroup title

No. of studies

No. of participants

Statistical method

Effect size

1 Complications and [slight] malunion

1

Risk Ratio (M‐H, Fixed, 95% CI)

Totals not selected

1.1 Any complication

1

Risk Ratio (M‐H, Fixed, 95% CI)

0.0 [0.0, 0.0]

1.2 Malunion (usually slight)

1

Risk Ratio (M‐H, Fixed, 95% CI)

0.0 [0.0, 0.0]

2 Constant score (% of healthy limb) at mean 2 years

1

Mean Difference (IV, Fixed, 95% CI)

Totals not selected

3 Time to union and time to recover upper limb function (weeks)

1

Mean Difference (IV, Fixed, 95% CI)

Totals not selected

3.1 Time to radiographic union

1

Mean Difference (IV, Fixed, 95% CI)

0.0 [0.0, 0.0]

3.2 Time to recover normal upper limb function

1

Mean Difference (IV, Fixed, 95% CI)

0.0 [0.0, 0.0]

4 Operation and fluoroscopic times

1

Mean Difference (IV, Fixed, 95% CI)

Totals not selected

4.1 Duration of operation (minutes)

1

Mean Difference (IV, Fixed, 95% CI)

0.0 [0.0, 0.0]

4.2 X‐ray exposure (minutes)

1

Mean Difference (IV, Fixed, 95% CI)

0.0 [0.0, 0.0]

5 Length of hospital stay (days)

1

Mean Difference (IV, Fixed, 95% CI)

Totals not selected

Comparison 7. Replacement (hemiarthroplasty) versus fixation (tension band wiring; plate fixation) (4 part fractures)

Outcome or subgroup title

No. of studies

No. of participants

Statistical method

Effect size

1 DASH score (0 to 100: worst disability)

1

Mean Difference (IV, Fixed, 95% CI)

Totals not selected

1.1 At 4 months

1

Mean Difference (IV, Fixed, 95% CI)

0.0 [0.0, 0.0]

1.2 At 12 months

1

Mean Difference (IV, Fixed, 95% CI)

0.0 [0.0, 0.0]

1.3 At 24 months

1

Mean Difference (IV, Fixed, 95% CI)

0.0 [0.0, 0.0]

2 EQ‐5D score (0 to 1: best quality of life)

1

Mean Difference (IV, Fixed, 95% CI)

Totals not selected

2.1 At 4 months

1

Mean Difference (IV, Fixed, 95% CI)

0.0 [0.0, 0.0]

2.2 At 12 months

1

Mean Difference (IV, Fixed, 95% CI)

0.0 [0.0, 0.0]

2.3 At 24 months

1

Mean Difference (IV, Fixed, 95% CI)

0.0 [0.0, 0.0]

3 Re‐operation

2

62

Risk Ratio (M‐H, Fixed, 95% CI)

0.32 [0.10, 1.10]

3.1 Hemiarthroplasty versus tension band wiring

1

30

Risk Ratio (M‐H, Fixed, 95% CI)

0.09 [0.01, 1.51]

3.2 Hemiarthroplasty versus locking plate fixation

1

32

Risk Ratio (M‐H, Fixed, 95% CI)

0.68 [0.16, 2.88]

4 Dead at 2 years

1

Risk Ratio (M‐H, Fixed, 95% CI)

Totals not selected

5 Implant removal at 1 year

1

Risk Ratio (M‐H, Fixed, 95% CI)

Totals not selected

6 Constant score (0 to 100: best score)

1

Mean Difference (IV, Fixed, 95% CI)

Totals not selected

6.1 At 4 months

1

Mean Difference (IV, Fixed, 95% CI)

0.0 [0.0, 0.0]

6.2 At 12 months

1

Mean Difference (IV, Fixed, 95% CI)

0.0 [0.0, 0.0]

6.3 At 24 months

1

Mean Difference (IV, Fixed, 95% CI)

0.0 [0.0, 0.0]

7 Pain VAS (0 to 100: worst pain) at 24 months

1

Mean Difference (IV, Fixed, 95% CI)

Totals not selected

8 Pain at 1 year

1

Risk Ratio (M‐H, Fixed, 95% CI)

Totals not selected

9 Range of motion at 24 months

1

Mean Difference (IV, Fixed, 95% CI)

Totals not selected

9.1 Flexion (degrees)

1

Mean Difference (IV, Fixed, 95% CI)

0.0 [0.0, 0.0]

9.2 Extension (degrees)

1

Mean Difference (IV, Fixed, 95% CI)

0.0 [0.0, 0.0]
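
The pooled and subgroup risk ratios printed for re-operation in Comparison 7 can be sanity-checked from the reported intervals alone. The following is a minimal sketch and not part of the review's analysis. It assumes each 95% interval is symmetric around the estimate on the log scale and was built with a normal quantile of 1.96; the function name `log_scale_summary` is introduced here purely for illustration. Because the printed bounds are rounded to two decimals, the recovered values are approximate.

```python
import math


def log_scale_summary(lower, upper, z=1.96):
    """Return (SE of log RR, geometric midpoint) implied by a reported CI.

    Assumes the interval is exp(log(RR) +/- z * SE), which is an assumption
    about how the interval was constructed, not something stated in the table.
    """
    se_log = (math.log(upper) - math.log(lower)) / (2 * z)
    midpoint = math.sqrt(lower * upper)  # equals exp of the log-scale centre
    return se_log, midpoint


# Values copied from the re-operation rows of Comparison 7:
# label -> (reported RR, lower bound, upper bound)
reported = {
    "pooled (2 studies, 62 participants)": (0.32, 0.10, 1.10),
    "versus tension band wiring (30 participants)": (0.09, 0.01, 1.51),
    "versus locking plate fixation (32 participants)": (0.68, 0.16, 2.88),
}

for label, (rr, lower, upper) in reported.items():
    se_log, midpoint = log_scale_summary(lower, upper)
    print(f"{label}: reported RR={rr}, "
          f"SE(log RR)={se_log:.2f}, interval midpoint={midpoint:.2f}")
```

For the locking plate subgroup the geometric midpoint of the interval falls close to the printed estimate of 0.68. Where the lower bound is very small (0.01 for tension band wiring), rounding of the bound dominates and the agreement is looser.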

Comparison 8. Reverse shoulder arthroplasty (RSA) versus hemiarthroplasty (HA)

Outcome or subgroup title

No. of studies

No. of participants

Statistical method

Effect size

1 Shoulder function scores at 24 to 49 months

1

Mean Difference (IV, Fixed, 95% CI)

Totals not selected

1.1 Quick DASH score (0 to 55: worst outcome)

1

Mean Difference (IV, Fixed, 95% CI)

0.0 [0.0, 0.0]

2 Re‐operation

1

Risk Ratio (M‐H, Fixed, 95% CI)

Totals not selected

3 Death

1

62

Risk Ratio (M‐H, Fixed, 95% CI)

0.0 [0.0, 0.0]

4 Composite (objective and subjective) shoulder function scores at 24 to 49 months

1

Mean Difference (IV, Fixed, 95% CI)

Totals not selected

4.1 UCLA score (0 to 35: best outcome)

1

Mean Difference (IV, Fixed, 95% CI)

0.0 [0.0, 0.0]

4.2 Constant score (0 to 100: best outcome)

1

Mean Difference (IV, Fixed, 95% CI)

0.0 [0.0, 0.0]

4.3 Constant % relative to opposite side

1

Mean Difference (IV, Fixed, 95% CI)

0.0 [0.0, 0.0]

5 Constant score at 24 to 49 months: overall and components

1

Mean Difference (IV, Fixed, 95% CI)

Totals not selected

5.1 Overall score (0‐100: best score)

1

Mean Difference (IV, Fixed, 95% CI)

0.0 [0.0, 0.0]

5.2 Pain (maximum score 15)

1

Mean Difference (IV, Fixed, 95% CI)

0.0 [0.0, 0.0]

5.3 Range of motion (maximum score 40)

1

Mean Difference (IV, Fixed, 95% CI)

0.0 [0.0, 0.0]

5.4 Power (maximum score 25)

1

Mean Difference (IV, Fixed, 95% CI)

0.0 [0.0, 0.0]

5.5 Activities of daily living (maximum score 20)

1

Mean Difference (IV, Fixed, 95% CI)

0.0 [0.0, 0.0]

6 Complications

1

Risk Ratio (M‐H, Fixed, 95% CI)

Totals not selected

6.1 Any complication

1

Risk Ratio (M‐H, Fixed, 95% CI)

0.0 [0.0, 0.0]

6.2 Intra‐operative fracture

1

Risk Ratio (M‐H, Fixed, 95% CI)

0.0 [0.0, 0.0]

6.3 Deep infection

1

Risk Ratio (M‐H, Fixed, 95% CI)

0.0 [0.0, 0.0]

6.4 Superficial infection

1

Risk Ratio (M‐H, Fixed, 95% CI)

0.0 [0.0, 0.0]

6.5 Haematoma

1

Risk Ratio (M‐H, Fixed, 95% CI)

0.0 [0.0, 0.0]

6.6 Neurological complications

1

Risk Ratio (M‐H, Fixed, 95% CI)

0.0 [0.0, 0.0]

6.7 Severe stiffness

1

Risk Ratio (M‐H, Fixed, 95% CI)

0.0 [0.0, 0.0]

6.8 Proximal migration of implant

1

Risk Ratio (M‐H, Fixed, 95% CI)

0.0 [0.0, 0.0]

7 Radiological assessment findings

1

Risk Ratio (M‐H, Fixed, 95% CI)

Totals not selected

7.1 Malunion of tuberosities

1

Risk Ratio (M‐H, Fixed, 95% CI)

0.0 [0.0, 0.0]

7.2 Resorption of tuberosities

1

Risk Ratio (M‐H, Fixed, 95% CI)

0.0 [0.0, 0.0]

7.3 Scapular notching

1

Risk Ratio (M‐H, Fixed, 95% CI)

0.0 [0.0, 0.0]

7.4 Heterotopic ossification

1

Risk Ratio (M‐H, Fixed, 95% CI)

0.0 [0.0, 0.0]

8 Range of motion (degrees) at 24 to 49 months

1

Mean Difference (IV, Fixed, 95% CI)

Totals not selected

8.1 Anterior forward

1

Mean Difference (IV, Fixed, 95% CI)

0.0 [0.0, 0.0]

8.2 Abduction

1

Mean Difference (IV, Fixed, 95% CI)

0.0 [0.0, 0.0]

Comparison 9. Deltoid‐split versus deltopectoral approaches for plate fixation

Outcome or subgroup title

No. of studies

No. of participants

Statistical method

Effect size

1 Re‐operation

1

Risk Ratio (M‐H, Fixed, 95% CI)

Totals not selected

1.1 For complication or a fall

1

Risk Ratio (M‐H, Fixed, 95% CI)

0.0 [0.0, 0.0]

1.2 Plate removal by patient request

1

Risk Ratio (M‐H, Fixed, 95% CI)

0.0 [0.0, 0.0]

2 Dead at 1 year

1

Risk Ratio (M‐H, Fixed, 95% CI)

Totals not selected

3 Complications

1

Risk Ratio (M‐H, Fixed, 95% CI)

Totals not selected

3.1 Injurious fall on shoulder

1

Risk Ratio (M‐H, Fixed, 95% CI)

0.0 [0.0, 0.0]

3.2 Axillary nerve damage

1

Risk Ratio (M‐H, Fixed, 95% CI)

0.0 [0.0, 0.0]

3.3 Screw perforation

1

Risk Ratio (M‐H, Fixed, 95% CI)

0.0 [0.0, 0.0]

3.4 Implant (head or shaft) loosening

1

Risk Ratio (M‐H, Fixed, 95% CI)

0.0 [0.0, 0.0]

3.5 Deep infection

1

Risk Ratio (M‐H, Fixed, 95% CI)

0.0 [0.0, 0.0]

3.6 Humeral head necrosis

1

Risk Ratio (M‐H, Fixed, 95% CI)

0.0 [0.0, 0.0]

4 Constant score (0 to 100: best score)

1

Mean Difference (IV, Fixed, 95% CI)

Totals not selected

4.1 At 6 months

1

Mean Difference (IV, Fixed, 95% CI)

0.0 [0.0, 0.0]

4.2 At 12 months

1

Mean Difference (IV, Fixed, 95% CI)

0.0 [0.0, 0.0]

5 Pain (VAS 0 to 10: intolerable pain)

1

Mean Difference (IV, Fixed, 95% CI)

Totals not selected

5.1 At 6 months

1

Mean Difference (IV, Fixed, 95% CI)

0.0 [0.0, 0.0]

5.2 At 12 months

1

Mean Difference (IV, Fixed, 95% CI)

0.0 [0.0, 0.0]

6 Operation and fluoroscopic times

1

Mean Difference (IV, Fixed, 95% CI)

Totals not selected

6.1 Duration of operation (minutes)

1

Mean Difference (IV, Fixed, 95% CI)

0.0 [0.0, 0.0]

6.2 X‐ray exposure (minutes)

1

Mean Difference (IV, Fixed, 95% CI)

0.0 [0.0, 0.0]

7 Length of hospital stay (days)

1

Mean Difference (IV, Fixed, 95% CI)

Totals not selected

Comparison 10. Polyaxial versus monoaxial screw insertion in plate fixation

Outcome or subgroup title

No. of studies

No. of participants

Statistical method

Effect size

1 DASH score at 12 months (0 to 100: greatest disability)

1

Mean Difference (IV, Fixed, 95% CI)

Totals not selected

2 Simple shoulder test (0 to 12: best outcome)

1

Mean Difference (IV, Fixed, 95% CI)

Totals not selected

2.1 At 3 months

1

Mean Difference (IV, Fixed, 95% CI)

0.0 [0.0, 0.0]

2.2 At 6 months

1

Mean Difference (IV, Fixed, 95% CI)

0.0 [0.0, 0.0]

2.3 At 12 months

1

Mean Difference (IV, Fixed, 95% CI)

0.0 [0.0, 0.0]

3 Re‐operation

2

Risk Ratio (M‐H, Fixed, 95% CI)

Subtotals only

3.1 By 6 months

1

66

Risk Ratio (M‐H, Fixed, 95% CI)

0.85 [0.15, 4.76]

3.2 By 1 year

2

180

Risk Ratio (M‐H, Fixed, 95% CI)

1.10 [0.58, 2.08]

4 Dead at 1 year

1

Risk Ratio (M‐H, Fixed, 95% CI)

Totals not selected

5 Constant score at 12 months (% of contralateral limb)

1

Mean Difference (IV, Fixed, 95% CI)

Totals not selected

6 Complications (radiological assessment)

2

Risk Ratio (M‐H, Fixed, 95% CI)

Totals not selected

6.1 Any complication

1

Risk Ratio (M‐H, Fixed, 95% CI)

0.0 [0.0, 0.0]

6.2 Primary implant malposition

1

Risk Ratio (M‐H, Fixed, 95% CI)

0.0 [0.0, 0.0]

6.3 Secondary loss of reduction and screw perforation

1

Risk Ratio (M‐H, Fixed, 95% CI)

0.0 [0.0, 0.0]

6.4 Non‐union / delayed union due to osteonecrosis (6 months)

1

Risk Ratio (M‐H, Fixed, 95% CI)

0.0 [0.0, 0.0]

6.5 Avascular necrosis at 1 year

1

Risk Ratio (M‐H, Fixed, 95% CI)

0.0 [0.0, 0.0]

6.6 Varus deformity (> 10 / ≥ 20 degrees)

2

Risk Ratio (M‐H, Fixed, 95% CI)

0.0 [0.0, 0.0]

6.7 Greater tuberosity displacement

1

Risk Ratio (M‐H, Fixed, 95% CI)

0.0 [0.0, 0.0]

6.8 Screw cut‐out (intra‐articular)

1

Risk Ratio (M‐H, Fixed, 95% CI)

0.0 [0.0, 0.0]

7 Range of motion (degrees) at 12 months

1

Mean Difference (IV, Fixed, 95% CI)

Totals not selected

7.1 Flexion

1

Mean Difference (IV, Fixed, 95% CI)

0.0 [0.0, 0.0]

7.2 Abduction

1

Mean Difference (IV, Fixed, 95% CI)

0.0 [0.0, 0.0]

7.3 External rotation

1

Mean Difference (IV, Fixed, 95% CI)

0.0 [0.0, 0.0]

7.4 Internal rotation

1

Mean Difference (IV, Fixed, 95% CI)

0.0 [0.0, 0.0]

8 Operation and fluoroscopic times

1

Mean Difference (IV, Fixed, 95% CI)

Totals not selected

8.1 Duration of operation (minutes)

1

Mean Difference (IV, Fixed, 95% CI)

0.0 [0.0, 0.0]

8.2 Fluoroscopic time (minutes)

1

Mean Difference (IV, Fixed, 95% CI)

0.0 [0.0, 0.0]

Comparison 11. Medial support screws versus control for locking plate fixation

Outcome or subgroup title

No. of studies

No. of participants

Statistical method

Effect size

1 Adverse events

1

Risk Ratio (M‐H, Fixed, 95% CI)

Totals not selected

1.1 Early loss of fixation

1

Risk Ratio (M‐H, Fixed, 95% CI)

0.0 [0.0, 0.0]

1.2 Re‐operation for early failure

1

Risk Ratio (M‐H, Fixed, 95% CI)

0.0 [0.0, 0.0]

1.3 Osteonecrosis (asymptomatic)

1

Risk Ratio (M‐H, Fixed, 95% CI)

0.0 [0.0, 0.0]

2 Constant score (0 to 100: best) at 2.5 years

1

Mean Difference (IV, Fixed, 95% CI)

Totals not selected

Comparison 12. MultiLoc Proximal Humeral Nail (MPHN) versus Polarus nail

Outcome or subgroup title

No. of studies

No. of participants

Statistical method

Effect size

1 Adverse events

1

Risk Ratio (M‐H, Fixed, 95% CI)

Totals not selected

1.1 Re‐operation

1

Risk Ratio (M‐H, Fixed, 95% CI)

0.0 [0.0, 0.0]

1.2 Post‐op impingement

1

Risk Ratio (M‐H, Fixed, 95% CI)

0.0 [0.0, 0.0]

1.3 Screw loosening

1

Risk Ratio (M‐H, Fixed, 95% CI)

0.0 [0.0, 0.0]

1.4 Non‐union

1

Risk Ratio (M‐H, Fixed, 95% CI)

0.0 [0.0, 0.0]

1.5 Rotator cuff symptoms

1

Risk Ratio (M‐H, Fixed, 95% CI)

0.0 [0.0, 0.0]

1.6 Intra‐operative complications

1

Risk Ratio (M‐H, Fixed, 95% CI)

0.0 [0.0, 0.0]

1.7 Mortality

1

Risk Ratio (M‐H, Fixed, 95% CI)

0.0 [0.0, 0.0]

1.8 Radiographic malunion

1

Risk Ratio (M‐H, Fixed, 95% CI)

0.0 [0.0, 0.0]

2 Constant score (0 to 100: best outcome) at 14 months (6 to 22 months)

1

Mean Difference (IV, Fixed, 95% CI)

Totals not selected

2.1 Unadjusted Constant score

1

Mean Difference (IV, Fixed, 95% CI)

0.0 [0.0, 0.0]

2.2 Adjusted Constant score

1

Mean Difference (IV, Fixed, 95% CI)

0.0 [0.0, 0.0]

3 Range of shoulder motion (degrees)

1

Mean Difference (IV, Fixed, 95% CI)

Totals not selected

3.1 Lateral elevation

1

Mean Difference (IV, Fixed, 95% CI)

0.0 [0.0, 0.0]

3.2 Forward flexion

1

Mean Difference (IV, Fixed, 95% CI)

0.0 [0.0, 0.0]

3.3 External rotation

1

Mean Difference (IV, Fixed, 95% CI)

0.0 [0.0, 0.0]

4 Lengths of surgery and hospital stay

1

Mean Difference (IV, Fixed, 95% CI)

Totals not selected

4.1 Length of surgery (minutes)

1

Mean Difference (IV, Fixed, 95% CI)

0.0 [0.0, 0.0]

4.2 Length of hospital stay (days)

1

Mean Difference (IV, Fixed, 95% CI)

0.0 [0.0, 0.0]

Comparison 13. Hemiarthroplasty: EPOCA prosthesis versus HAS prosthesis

Outcome or subgroup title

No. of studies

No. of participants

Statistical method

Effect size

1 Adverse events

1

Risk Ratio (M‐H, Fixed, 95% CI)

Totals not selected

1.1 Deep infection

1

Risk Ratio (M‐H, Fixed, 95% CI)

0.0 [0.0, 0.0]

1.2 Persistent pain ‐ scheduled for reoperation

1

Risk Ratio (M‐H, Fixed, 95% CI)

0.0 [0.0, 0.0]

2 Radiological assessment findings

1

Risk Ratio (M‐H, Fixed, 95% CI)

Totals not selected

2.1 Resorption of tuberosities

1

Risk Ratio (M‐H, Fixed, 95% CI)

0.0 [0.0, 0.0]

2.2 Secondary dislocation of tuberosities

1

Risk Ratio (M‐H, Fixed, 95% CI)

0.0 [0.0, 0.0]

2.3 Superior migration of prosthesis

1

Risk Ratio (M‐H, Fixed, 95% CI)

0.0 [0.0, 0.0]

2.4 Anterior subluxations

1

Risk Ratio (M‐H, Fixed, 95% CI)

0.0 [0.0, 0.0]

2.5 Glenoid erosion

1

Risk Ratio (M‐H, Fixed, 95% CI)

0.0 [0.0, 0.0]

2.6 Aseptic loosening of stem

1

Risk Ratio (M‐H, Fixed, 95% CI)

0.0 [0.0, 0.0]

3 Range of motion results at one year (degrees)

Other data

No numeric data

Comparison 14. Hemiarthroplasty: tenodesis of long head of biceps (LHB) versus LHB tendon left intact

Outcome or subgroup title

No. of studies

No. of participants

Statistical method

Effect size

1 Complications and further surgery

1

Risk Ratio (M‐H, Fixed, 95% CI)

Totals not selected

1.1 Any complication

1

Risk Ratio (M‐H, Fixed, 95% CI)

0.0 [0.0, 0.0]

1.2 Further surgery for listed complications

1

Risk Ratio (M‐H, Fixed, 95% CI)

0.0 [0.0, 0.0]

1.3 Deep infection

1

Risk Ratio (M‐H, Fixed, 95% CI)

0.0 [0.0, 0.0]

1.4 Tuberosity malunion

1

Risk Ratio (M‐H, Fixed, 95% CI)

0.0 [0.0, 0.0]

1.5 Inferior subluxation of prosthesis

1

Risk Ratio (M‐H, Fixed, 95% CI)

0.0 [0.0, 0.0]

1.6 Loss of reduction of greater tuberosity

1

Risk Ratio (M‐H, Fixed, 95% CI)

0.0 [0.0, 0.0]

2 Constant score (0 to 100: best function) at 2 years

1

Mean Difference (IV, Fixed, 95% CI)

Totals not selected

3 Shoulder pain at 2 year follow‐up

1

Risk Ratio (M‐H, Fixed, 95% CI)

Totals not selected

4 Active shoulder elevation (degrees) at 2 years

1

Mean Difference (IV, Fixed, 95% CI)

Totals not selected

Comparison 15. Post‐operative (percutaneous fixation) immobilisation for 1 week versus 3 weeks

Outcome or subgroup title

No. of studies

No. of participants

Statistical method

Effect size

1 Neer score ≤ 80 points (unsatisfactory or failure) at 6 months

1

Risk Ratio (M‐H, Fixed, 95% CI)

Totals not selected

2 Premature removal of Kirschner wires

1

Risk Ratio (M‐H, Fixed, 95% CI)

Totals not selected

Comparison 16. Post‐operative (hemiarthroplasty) mobilisation: early (2 weeks immobilisation) versus late (6 weeks)

Outcome or subgroup title

No. of studies

No. of participants

Statistical method

Effect size

1 Oxford Shoulder Score at 1 year (adjusted: 0 to 100 best)

1

Mean Difference (IV, Fixed, 95% CI)

Totals not selected

2 Constant shoulder score (at 1 year)

1

Mean Difference (IV, Fixed, 95% CI)

Totals not selected

2.1 Overall score (0 to 100: best)

1

Mean Difference (IV, Fixed, 95% CI)

0.0 [0.0, 0.0]

2.2 Pain component (0 to 15: best)

1

Mean Difference (IV, Fixed, 95% CI)

0.0 [0.0, 0.0]

2.3 Activities of daily living component (0 to 20: best)

1

Mean Difference (IV, Fixed, 95% CI)

0.0 [0.0, 0.0]

2.4 Mobility component (0 to 40: best)

1

Mean Difference (IV, Fixed, 95% CI)

0.0 [0.0, 0.0]

2.5 Strength component (0 to 25: best)

1

Mean Difference (IV, Fixed, 95% CI)

0.0 [0.0, 0.0]

3 Radiological assessment findings

1

Risk Ratio (M‐H, Fixed, 95% CI)

Totals not selected

3.1 Non‐union (with bone resorption)

1

Risk Ratio (M‐H, Fixed, 95% CI)

0.0 [0.0, 0.0]

3.2 Malunion

1

Risk Ratio (M‐H, Fixed, 95% CI)

0.0 [0.0, 0.0]

3.3 Greater tuberosity migration (all had severe pain at 6 & 12 months)

1

Risk Ratio (M‐H, Fixed, 95% CI)

0.0 [0.0, 0.0]

3.4 Superior luxation of prosthesis

1

Risk Ratio (M‐H, Fixed, 95% CI)

0.0 [0.0, 0.0]

4 Range of motion at 1 year

1

Mean Difference (IV, Fixed, 95% CI)

Totals not selected

4.1 Elevation (degrees)

1

Mean Difference (IV, Fixed, 95% CI)

0.0 [0.0, 0.0]

4.2 External rotation (degrees)

1

Mean Difference (IV, Fixed, 95% CI)

0.0 [0.0, 0.0]
