Music therapy for depression

Abstract

Background

Depression is a highly prevalent mood disorder that is characterised by persistent low mood, diminished interest, and loss of pleasure. Music therapy may be helpful in modulating moods and emotions. An update of the 2008 Cochrane review was needed to improve knowledge on effects of music therapy for depression.

Objectives

1. To assess effects of music therapy for depression in people of any age compared with treatment as usual (TAU) and psychological, pharmacological, and/or other therapies.

2. To compare effects of different forms of music therapy for people of any age with a diagnosis of depression.

Search methods

We searched the following databases: the Cochrane Common Mental Disorders Controlled Trials Register (CCMD‐CTR; from inception to 6 May 2016); the Cochrane Central Register of Controlled Trials (CENTRAL; to 17 June 2016); Thomson Reuters/Web of Science (to 21 June 2016); Ebsco/PsycInfo, the Cumulative Index to Nursing and Allied Health Literature (CINAHL), Embase, and PubMed (to 5 July 2016); the World Health Organization International Clinical Trials Registry Platform (WHO ICTRP), ClinicalTrials.gov, the National Guideline Clearing House, and OpenGrey (to 6 September 2016); and the Digital Access to Research Theses (DART)‐Europe E‐theses Portal, Open Access Theses and Dissertations, and ProQuest Dissertations and Theses Database (to 7 September 2016). We checked reference lists of retrieved articles and relevant systematic reviews and contacted trialists and subject experts for additional information when needed. We updated this search in August 2017 and placed potentially relevant studies in the "Awaiting classification" section; we will incorporate these into the next version of this review as appropriate.

Selection criteria

All randomised controlled trials (RCTs) and controlled clinical trials (CCTs) comparing music therapy versus treatment as usual (TAU), psychological therapies, pharmacological therapies, other therapies, or different forms of music therapy for reducing depression.

Data collection and analysis

Two review authors independently selected studies, assessed risk of bias, and extracted data from all included studies. We calculated standardised mean difference (SMD) for continuous data and odds ratio (OR) for dichotomous data with 95% confidence intervals (CIs). We assessed heterogeneity using the I² statistic.
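As an illustration of the three statistics named above, the following is a minimal sketch under stated assumptions, not the review's actual analysis code. It assumes Hedges' g as the SMD estimator, a log‐odds (Woolf) standard error for the OR, and I² derived from Cochran's Q with inverse‐variance weights. The function names are hypothetical.

```python
import math

Z95 = 1.96  # normal quantile for a two-sided 95% interval


def smd_hedges_g(m1, sd1, n1, m2, sd2, n2):
    """SMD (Hedges' g) with an approximate 95% CI for two independent groups."""
    pooled_sd = math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))
    d = (m1 - m2) / pooled_sd
    g = d * (1 - 3 / (4 * (n1 + n2) - 9))  # small-sample correction
    # One common large-sample approximation of the standard error (an assumption).
    se = math.sqrt((n1 + n2) / (n1 * n2) + g**2 / (2 * (n1 + n2)))
    return g, g - Z95 * se, g + Z95 * se


def odds_ratio(events_tx, n_tx, events_ctl, n_ctl):
    """Odds ratio with a 95% CI on the log scale; assumes no zero cells."""
    a, b = events_tx, n_tx - events_tx
    c, d = events_ctl, n_ctl - events_ctl
    or_ = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return or_, math.exp(math.log(or_) - Z95 * se), math.exp(math.log(or_) + Z95 * se)


def i_squared(effects, std_errors):
    """I² (%) from Cochran's Q under inverse-variance weighting."""
    weights = [1 / se**2 for se in std_errors]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    return max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
```

With made‐up inputs (not data from the review), `smd_hedges_g(12.0, 5.0, 30, 16.0, 5.0, 30)` returns a negative SMD, which under the convention used here favours the first group when lower scores are better.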

Main results

We included in this review nine studies involving a total of 421 participants, 411 of whom were included in the meta‐analysis examining short‐term effects of music therapy for depression. Concerning primary outcomes, we found moderate‐quality evidence of large effects favouring music therapy plus TAU over TAU alone for both clinician‐rated depressive symptoms (SMD ‐0.98, 95% CI ‐1.69 to ‐0.27, 3 RCTs, 1 CCT, n = 219) and patient‐reported depressive symptoms (SMD ‐0.85, 95% CI ‐1.37 to ‐0.34, 3 RCTs, 1 CCT, n = 142). Music therapy was not associated with more or fewer adverse events than TAU. Regarding secondary outcomes, music therapy plus TAU was superior to TAU alone for anxiety and functioning. Music therapy plus TAU was not more effective than TAU alone for improved quality of life (SMD 0.32, 95% CI ‐0.17 to 0.80, P = 0.20, n = 67, low‐quality evidence). We found no significant discrepancies in the numbers of participants who left the study early (OR 0.49, 95% CI 0.14 to 1.70, P = 0.26, 5 RCTs, 1 CCT, n = 293, moderate‐quality evidence). Findings of the present meta‐analysis indicate that music therapy added to TAU provides short‐term beneficial effects for people with depression when compared with TAU alone.

Additionally, we are uncertain about the effects of music therapy versus psychological therapies on clinician‐rated depression (SMD ‐0.78, 95% CI ‐2.36 to 0.81, 1 RCT, n = 11, very low‐quality evidence), patient‐reported depressive symptoms (SMD ‐1.28, 95% CI ‐3.57 to 1.02, 4 RCTs, n = 131, low‐quality evidence), quality of life (SMD 1.31, 95% CI ‐0.36 to 2.99, 1 RCT, n = 11, very low‐quality evidence), and leaving the study early (OR 0.17, 95% CI 0.02 to 1.49, 4 RCTs, n = 157, moderate‐quality evidence). We found no eligible evidence addressing adverse events, functioning, or anxiety.

We do not know whether one form of music therapy is better than another for clinician‐rated depressive symptoms (SMD ‐0.52, 95% CI ‐1.87 to 0.83, 1 RCT, n = 9, very low‐quality evidence), patient‐reported depressive symptoms (SMD ‐0.01, 95% CI ‐1.33 to 1.30, 1 RCT, n = 9, very low‐quality evidence), quality of life (SMD ‐0.24, 95% CI ‐1.57 to 1.08, 1 RCT, n = 9, very low‐quality evidence), or leaving the study early (OR 0.27, 95% CI 0.01 to 8.46, 1 RCT, n = 10). We found no eligible evidence addressing adverse events, functioning, or anxiety.

Authors' conclusions

Findings of the present meta‐analysis indicate that music therapy provides short‐term beneficial effects for people with depression. Music therapy added to treatment as usual (TAU) seems to improve depressive symptoms compared with TAU alone. Additionally, music therapy plus TAU is not associated with more or fewer adverse events than TAU alone. Music therapy also shows efficacy in decreasing anxiety levels and improving functioning of depressed individuals.

Future trials based on adequate design and larger samples of children and adolescents are needed to consolidate our findings. Researchers should consider investigating mechanisms of music therapy for depression. It is important to clearly describe music therapy, TAU, the comparator condition, and the profession of the person who delivers the intervention, for reproducibility and comparison purposes.

Music therapy for depression

Why is this review important?

Depression is a common problem that causes changes in mood and loss of interest and pleasure. Music therapy, an intervention that involves regular meetings with a qualified music therapist, may help in improving mood through emotional expression. This review might add new information about effects of music therapy in depressed individuals.

Who will be interested in this review?

Our review will be of interest for the following people: people with depression and their families, friends, and carers; general practitioners, psychiatrists, psychologists, and other professionals working in mental health; music therapists working in mental health; and mental health policy makers.

What questions does this review aim to answer?

1. Is music therapy more effective than treatment as usual alone or psychological therapy?

2. Is any form of music therapy better than another form of music therapy?

Which studies were included in the review?

We included nine studies with a total of 421 people of any age group (from adolescents to older people). Studies compared effects of music therapy versus treatment as usual, and versus psychological therapy. Additionally, we examined the differences between two different forms of music therapy: active (where people sing or play music) and receptive (where people listen to music).

What does evidence from the review tell us?

We found that music therapy plus treatment as usual is more effective than treatment as usual alone. Music therapy seems to reduce depressive symptoms and anxiety and helps to improve functioning (e.g. maintaining involvement in job, activities, and relationships). We are not sure whether music therapy is better than psychological therapy. We do not know whether one form of music therapy is better than another. The small numbers of identified studies and participants make it hard to be confident about these comparisons.

What should happen next?

Music therapy is likely to be effective in decreasing symptoms of depression and anxiety. Music therapy also helps people to function in their everyday life. However, our findings are not complete and need to be clarified through additional research. Future trials should study depression in children and adolescents, and future trial reports should thoroughly describe music therapy interventions, other interventions, and the person who delivers these interventions.

Authors' conclusions

Implications for practice

For people affected by depressive disorders

Evidence suggests that music therapy, when added to treatment as usual (e.g. psychotherapy in combination with medication, collaborative care, occupational therapy), can help people affected by depressive disorders, such as major depression, by improving symptoms related to the condition (moderate‐quality evidence) and its most frequent comorbidities, such as anxiety (low‐quality evidence). Additionally, social, occupational, and psychological functioning may improve among individuals who are involved in music therapy in addition to treatment as usual (low‐quality evidence). Music therapy was not associated with more or fewer adverse events than treatment as usual (low‐quality evidence). We found no differences in improvement of depressive symptoms between music therapy and psychological therapy or medication alone. We noted no differences between active and receptive music therapy approaches (very low‐quality evidence).

Active participation is crucial for the success of music therapy. Participants do not need musical skills, but motivation to work actively within a music therapy process is important. Some characteristics of these individuals, such as a tendency toward symbolic processing and imagery, or particular personality traits, may favour engagement in music therapy.

For clinicians

Music therapy, when added to treatment as usual, seems to improve symptoms of depression (moderate‐quality evidence). Music therapy seems to be beneficial also for anxiety (low‐quality evidence), which is often a comorbidity of depression. Severely depressed individuals often experience impairment in maintaining relationships and work engagements. In this regard, evidence suggests significant amelioration in the level of functioning among those who attended music therapy sessions (low‐quality evidence). Improvements in depressive symptoms and functioning are likely interrelated: Mitigation of depressive symptoms may lead to better outcomes in the socio‐occupational sphere, and vice versa. In fact, positive effects might help support motivation as well as emotional and relational competencies of people affected by depression, from adolescents to older adults. Our results do not suggest superiority of music therapy over other psychological therapies (evidence of low to very low quality). Rather, music therapy should be considered in combination with standard care, with respect to patient preferences. We do not know whether one form of music therapy is better than another (very low‐quality evidence).

When providing music therapy, clinicians must be mindful that the specific methods and techniques of music therapy, including among others adaptation of musical material to individual needs, musical improvisation, and discussion of personal topics emerging through musical processes, require specialised music therapy training. Training courses and qualified music therapists are available in many countries, but in some countries, training programmes of better quality may be needed.

For managers/policy makers

Evidence suggests that music therapy, when added to treatment as usual, can help people affected by depressive disorders, such as major depression, by improving symptoms related to depression (moderate‐quality evidence; large effect size for clinician‐reported depression and patient‐reported depression) and its most frequent comorbidities, such as anxiety (low‐quality evidence). We are uncertain whether music therapy is better than psychological therapy (evidence of low to very low quality). Neither do we know whether one form of music therapy is better than the other (very low‐quality evidence).

Depression incurs high costs for healthcare systems and for society because it may cause impairment in both psychological and socio‐occupational functioning. Reductions in depressive and anxious symptoms and consequent improvement in everyday life functioning may reduce the costs that burden both healthcare systems and society. Wider implementation may be slow because trained music therapists are not available everywhere. Currently, around 6000 qualified music therapists are practising in Europe (EMTC 2017) and 7000 in the USA (CBMT 2017), with large differences in numbers between and within countries.

Implications for research

In general, the quality of research concerning music therapy for patients with depressive disorders could be improved. Future researchers should adhere to guidelines such as the CONSORT statement and should focus on particular points that deserve to be addressed (Schulz 2010).

Characteristics of the population

Further research is needed on the effectiveness of music therapy for adults with a specific diagnosis of major depressive disorder and other depressive disorders. It appears of primary importance to clarify the type of diagnosis, which should, where possible, be made by clinicians according to an international diagnostic classification (e.g. International Classification of Diseases (ICD) or Diagnostic and Statistical Manual of Mental Disorders (DSM)), not just according to self‐reported scales or questionnaires. Additionally, it would be desirable to investigate effects of music therapy both in recurrent depressive disorders and in single depressive episodes.

Relatively little research has focused on working‐age people with depression. Only four of the nine included studies specifically addressed this broad and important age group, although the largest study did include working‐age people. Future studies should also consider depression in children and adolescents.

Characteristics of the intervention
Music therapy characteristics

Future reports should better describe characteristics of the music therapy approaches adopted in these trials. Researchers should clearly describe the aims and rationale and specific methods, techniques, and procedures implemented by music therapists. First, it should be more consistently stated whether or not the intervention is conducted by a trained music therapist or a music therapy trainee. Interventions developed and conducted in trials by a certified music therapist are needed. Second, a thorough description of the interventions appears essential, to give professionals the opportunity to learn and apply effective methods and techniques in their clinical practice, as well as in music therapy training programmes and future trials. The topic of treatment fidelity is also relevant (Erkkilä 2014). In fact, only one study reported that therapists participated in extensive training to guarantee reliability of the intervention provided (Erkkilä 2011).

It would be desirable to conduct trials in which different music therapy methods are adopted, such as active and receptive music therapy or combinations thereof. This would be important for improved understanding of which form of music therapy could be better tolerated and more pleasant and beneficial for participants, as well as for enhanced knowledge of the mechanisms underlying treatment effects. It could also be hypothesised that a portion of the population is more likely to respond positively to music therapy as the result of individual features. This information could prove helpful in the development of different music therapy techniques that can be tailored to patient characteristics.

Duration of the intervention

To date, researchers have mainly considered short‐term interventions and have paid limited attention to long‐term effects of music therapy extending over more than six months. In fact, only one study evaluated effects of the intervention at medium term (Erkkilä 2011). Interventions of longer duration and longer follow‐up periods are needed to better elucidate the medium‐term and long‐term effects of music therapy on symptoms of depression and its correlates. This is particularly important because the length of trials often does not reflect the complexity of therapeutic processes, which usually last months or years.

Additionally, it would be useful to know the rate of attendance of participants at music therapy sessions; this information is rarely reported. In fact, analyses considering patient compliance could be useful toward understanding whether treatment adherence might influence outcomes.

Dosage of the intervention

Studies randomising high versus low 'dosage' of music therapy would be required. Such trials would require considerably larger sample sizes because expected differences in effect sizes between two active treatments will be smaller than those between music therapy as add‐on treatment and standard care alone.
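To illustrate why smaller expected differences call for larger trials, the following is a rough sketch using the standard normal‐approximation formula for a two‐arm parallel trial, n per group ≈ 2(z₁₋α/₂ + z₁₋β)² / SMD². The two effect sizes in the usage lines are hypothetical values chosen for illustration, not estimates from this review, and the function name is an assumption.

```python
import math
from statistics import NormalDist


def n_per_group(smd, alpha=0.05, power=0.80):
    """Approximate participants per arm needed to detect a given SMD.

    Normal approximation for a two-sided test comparing two equal-sized,
    independent groups; a planning sketch, not a full power analysis.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return math.ceil(2 * (z_alpha + z_beta) ** 2 / smd**2)


# Hypothetical effect sizes: a large add-on effect versus a small
# difference between two active 'dosages'.
print(n_per_group(0.8))  # 25
print(n_per_group(0.3))  # 175
```

Under these assumptions, reducing the expected difference from 0.8 to 0.3 standard deviations raises the required sample from about 25 to about 175 participants per arm.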

Outcomes

This review indicates that reduction in symptoms of depression and anxiety could be identified as the most frequently assessed outcomes. These are of general clinical importance in mental health care, but other health‐related aspects (e.g. quality of life, level of functioning, personality, self‐esteem) supported by music therapy could be similarly relevant to both the patient and the music therapeutic method. Of note, none of the included studies addressed outcomes such as cost, cost‐effectiveness, or satisfaction with treatment. To gain knowledge about effects, mechanisms, and ingredients of music therapy for depression, better‐designed trials must include similar and meaningful outcomes, as well as mixed methods and outcomes more directly related to music therapy processes, such as the impact of music elements and specific music therapy techniques.

The studies included under Studies awaiting classification, once assessed, may alter the conclusions of this review.

Summary of findings

Summary of findings for the main comparison. Music therapy plus treatment as usual (TAU) versus TAU for depression (primary comparison)

Music therapy plus treatment as usual (TAU) versus TAU

Patient or population: individuals with depression
Setting: any setting
Intervention: music therapy plus treatment as usual
Comparison: treatment as usual

| Outcomes | Anticipated absolute effects* (95% CI): risk with treatment as usual | Anticipated absolute effects* (95% CI): risk with music therapy | Relative effect (95% CI) | No. of participants (studies) | Quality of the evidence (GRADE) | Comments |
|---|---|---|---|---|---|---|
| Depressive symptoms (clinician‐rated) (various scales). Up to 3 months | | Mean clinician‐rated depressive symptoms in the intervention group were SMD 0.98 SD lower (1.69 lower to 0.27 lower). | | 219 (3 RCTs; 1 CCT) | ⊕⊕⊕⊝ MODERATE (a) | Lower score equals a better outcome. SMD corresponds to a large effect size. |
| Depressive symptoms (patient‐reported) (various scales). Up to 3 months | | Mean patient‐reported depressive symptoms in the intervention group were SMD 0.85 SD lower (1.37 lower to 0.34 lower). | | 142 (3 RCTs; 1 CCT) | ⊕⊕⊕⊝ MODERATE (a) | Lower score equals a better outcome. SMD corresponds to a large effect size. |
| Any adverse events. Up to 3 months | Study population: 22 per 1000 | Study population: 10 per 1000 (0 to 203) | OR 0.45 (0.02 to 11.46) | 79 (1 RCT) | ⊕⊕⊝⊝ LOW (b) | |
| Functioning (GAF). Up to 3 months | | Mean functioning in the intervention group was SMD 0.51 SD higher (0.02 higher to 1 higher). | | 67 (1 RCT) | ⊕⊕⊝⊝ LOW (b) | Higher score equals a better outcome. SMD corresponds to a moderate effect size. |
| Quality of life (RAND‐36). Up to 3 months | | Mean quality of life in the intervention group was SMD 0.32 SD higher (0.17 lower to 0.80 higher). | | 67 (1 RCT) | ⊕⊕⊝⊝ LOW (b) | Higher score equals a better outcome. |
| Leaving the study early. Up to 3 months | Study population: 65 per 1000 | Study population: 33 per 1000 (10 to 106) | OR 0.49 (0.14 to 1.70) | 293 (5 RCTs; 1 CCT) | ⊕⊕⊕⊝ MODERATE (a) | |
| Anxiety (HADS‐A). Up to 3 months | | Mean anxiety in the intervention group was SMD 0.74 SD lower (1.40 lower to 0.08 lower). | | 195 (2 RCTs; 1 CCT) | ⊕⊕⊝⊝ LOW (a,c) | Lower score equals a better outcome. SMD corresponds to a moderate effect size. |

*The risk in the intervention group (and its 95% confidence interval) is based on the assumed risk in the comparison group and the relative effect of the intervention (and its 95% CI).

CCT: controlled clinical trial; CI: confidence interval; GAF: Global Assessment of Functioning scale; HADS‐A: Hospital Anxiety and Depression Scale ‐ Anxiety; OR: odds ratio; RAND‐36: health‐related quality of life survey distributed by RAND; RCT: randomised controlled trial; RR: risk ratio; SD: standard deviation; SMD: standardised mean difference.

GRADE Working Group grades of evidence.
High quality: We are very confident that the true effect lies close to that of the estimate of the effect.
Moderate quality: We are moderately confident in the effect estimate: The true effect is likely to be close to the estimate of the effect, but there is a possibility that it is substantially different.
Low quality: Our confidence in the effect estimate is limited: The true effect may be substantially different from the estimate of the effect.
Very low quality: We have very little confidence in the effect estimate: The true effect is likely to be substantially different from the estimate of effect.

(a) Downgraded one level for unclear randomisation, allocation concealment, blinding, missing study protocol.

(b) Downgraded two levels for wide confidence intervals, although the trial was adequately powered and well performed.

(c) Downgraded one level for variation in effect sizes, little or no overlap of confidence intervals, and high heterogeneity.
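The note on anticipated absolute effects in the table above describes deriving the intervention‐group risk from the assumed comparison‐group risk and the relative effect. A minimal sketch of that conversion for an odds ratio follows, assuming the usual odds‐to‐risk transformation. The function name is hypothetical, and the inputs in the usage lines are taken from the "Leaving the study early" row.

```python
def risk_per_1000(assumed_risk_per_1000, odds_ratio):
    """Convert an assumed control risk and an odds ratio to an intervention risk.

    Steps: risk -> odds, multiply by the OR, then odds -> risk.
    Risks are expressed per 1000 participants.
    """
    p0 = assumed_risk_per_1000 / 1000
    odds_intervention = odds_ratio * p0 / (1 - p0)
    return 1000 * odds_intervention / (1 + odds_intervention)


# "Leaving the study early": assumed risk 65 per 1000, OR 0.49 (0.14 to 1.70)
print(round(risk_per_1000(65, 0.49)))  # 33
print(round(risk_per_1000(65, 0.14)))  # 10
print(round(risk_per_1000(65, 1.70)))  # 106
```

Applying the same function to the lower and upper confidence limits of the OR gives the interval reported alongside the point estimate for that row.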

Summary of findings 2. Music therapy versus psychological treatment for depression

Music therapy versus psychological treatment for depression

Patient or population: adults with depression
Setting: any setting
Intervention: music therapy
Comparison: psychological therapy (counselling, cognitive‐behavioural therapy)

| Outcomes | Anticipated absolute effects* (95% CI): risk with psychological treatment | Anticipated absolute effects* (95% CI): risk with music therapy | Relative effect (95% CI) | No. of participants (studies) | Quality of the evidence (GRADE) | Comments |
|---|---|---|---|---|---|---|
| Depressive symptoms (clinician‐rated) (MADRS). Up to 3 months | | Mean clinician‐rated depressive symptoms in the intervention group were SMD 0.78 SD lower (2.36 lower to 0.81 higher). | | 11 (1 RCT) | ⊕⊝⊝⊝ VERY LOW (a,b) | Lower score equals better outcome. SMD corresponds to a large effect size. |
| Depressive symptoms (patient‐reported) (various scales). Up to 3 months | | Mean patient‐reported depressive symptoms in the intervention group were SMD 1.28 SD lower (3.57 lower to 1.02 higher). | | 131 (4 RCTs) | ⊕⊕⊝⊝ LOW (a,c) | Lower score equals better outcome. SMD corresponds to a large effect size. |
| Any adverse events ‐ not reported | | | | | | |
| Functioning ‐ not reported | | | | | | |
| Quality of life (Thai RAND‐36). Up to 3 months | | Mean quality of life in the intervention group was SMD 1.31 SD higher (0.36 lower to 2.99 higher). | | 11 (1 RCT) | ⊕⊝⊝⊝ VERY LOW (a,b) | Higher score equals better outcome. |
| Leaving the study early. Up to 3 months | Study population: 35 per 1000 | Study population: 9 per 1000 (1 to 77) | OR 0.17 (0.02 to 1.49) | 157 (4 RCTs) | ⊕⊕⊕⊝ MODERATE (a) | |
| Anxiety ‐ not reported | | | | | | |

*The risk in the intervention group (and its 95% confidence interval) is based on the assumed risk in the comparison group and the relative effect of the intervention (and its 95% CI).

CI: confidence interval; MADRS: Montgomery‐Åsberg Depression Rating Scale; OR: odds ratio; RAND‐36: health‐related quality of life survey distributed by RAND; RCT: randomised controlled trial; RR: risk ratio; SD: standard deviation; SMD: standardised mean difference.

GRADE Working Group grades of evidence.
High quality: We are very confident that the true effect lies close to that of the estimate of the effect.
Moderate quality: We are moderately confident in the effect estimate: The true effect is likely to be close to the estimate of the effect, but there is a possibility that it is substantially different.
Low quality: Our confidence in the effect estimate is limited: The true effect may be substantially different from the estimate of the effect.
Very low quality: We have very little confidence in the effect estimate: The true effect is likely to be substantially different from the estimate of effect.

(a) Downgraded one level for limitations in design such as unclear allocation concealment, blinding, incomplete outcome data, missing protocol.

(b) Downgraded two levels for small sample size.

(c) Downgraded one level for non‐overlap of confidence intervals, high heterogeneity (P < 0.00001; I² = 96%).

Summary of findings 3. Active music therapy versus receptive music therapy for depression

Active music therapy versus receptive music therapy for depression

Patient or population: adults with depression
Setting: any setting
Intervention: active music therapy
Comparison: receptive music therapy

| Outcomes | Anticipated absolute effects* (95% CI): risk with receptive music therapy | Anticipated absolute effects* (95% CI): risk with active music therapy | Relative effect (95% CI) | No. of participants (studies) | Quality of the evidence (GRADE) | Comments |
|---|---|---|---|---|---|---|
| Depressive symptoms (clinician‐rated) (MADRS). Up to 3 months | | Mean clinician‐rated depressive symptoms in the intervention group were SMD 0.52 SD lower (1.87 lower to 0.83 higher). | | 9 (1 RCT) | ⊕⊝⊝⊝ VERY LOW (a,b) | Lower score equals a better outcome. |
| Depressive symptoms (patient‐reported) (TDI). Up to 3 months | | Mean patient‐reported depressive symptoms in the intervention group were SMD 0.01 SD lower (1.33 lower to 1.30 higher). | | 9 (1 RCT) | ⊕⊝⊝⊝ VERY LOW (a,b) | Lower score equals a better outcome. |
| Any adverse events ‐ not reported | | | | | | |
| Functioning ‐ not reported | | | | | | |
| Quality of life (SF‐36 Thai). Up to 3 months | | Mean quality of life in the intervention group was SMD 0.24 SD lower (1.57 lower to 1.08 higher). | | 9 (1 RCT) | ⊕⊝⊝⊝ VERY LOW (a,b) | Higher score equals a better outcome. |
| Leaving the study early. Up to 3 months | Study population: 200 per 1000 | Study population: 63 per 1000 (2 to 679) | OR 0.27 (0.01 to 8.46) | 10 (1 RCT) | ⊕⊝⊝⊝ VERY LOW (a,b) | |
| Anxiety ‐ not reported | | | | | | |

*The risk in the intervention group (and its 95% confidence interval) is based on the assumed risk in the comparison group and the relative effect of the intervention (and its 95% CI).

CI: confidence interval; MADRS: Montgomery‐Åsberg Depression Rating Scale; OR: odds ratio; RCT: randomised controlled trial; RR: risk ratio; SD: standard deviation; SF‐36: Short Form‐36; SMD: standardised mean difference; TDI: Thai Depression Inventory.

GRADE Working Group grades of evidence.
High quality: We are very confident that the true effect lies close to that of the estimate of the effect.
Moderate quality: We are moderately confident in the effect estimate: The true effect is likely to be close to the estimate of the effect, but there is a possibility that it is substantially different.
Low quality: Our confidence in the effect estimate is limited: The true effect may be substantially different from the estimate of the effect.
Very low quality: We have very little confidence in the effect estimate: The true effect is likely to be substantially different from the estimate of effect.

(a) Downgraded one level for limitations in design such as unclear allocation concealment, blinding, missing protocol.

(b) Downgraded two levels for small sample size.

Background

Description of the condition

Depression is a mood disorder and a common mental illness that affects more than 300 million people worldwide. Depression is projected to become the leading cause of disability by the year 2020. At its worst, depression can lead to suicide, and it has been linked to approximately 800,000 cases of suicide per year (WHO 2017).

Depression is characterised by core symptoms of persistent low mood, diminished interest, loss of pleasure, and lack of energy, along with other symptoms such as sleep disturbance, appetite and weight disturbance, poor concentration, psychomotor changes, and feelings of guilt, worthlessness, and low self‐esteem (WHO 1992). Affective disturbance is at the core of depression (Gotlib 2014).

As with most psychiatric disorders, the aetiology of depression appears to be multi‐factorial, involving both genetic and environmental factors, and current evidence points towards a complex interaction between neurotransmitter availability and receptor regulation within the brain (Palazidou 2012).

A major depressive disorder (MDD) can be diagnosed on the basis of one of two widely used classification systems: the World Health Organization’s International Classification of Diseases, Tenth Revision (ICD‐10) (WHO 1992), and the American Psychiatric Association's Diagnostic and Statistical Manual of Mental Disorders, 5th edition (DSM‐5) (APA 2013). In both systems, diagnosis requires the presence of at least one core symptom for most of the day, almost every day for at least two weeks. Severity of depression ‐ mild, moderate, or severe ‐ is determined by the number and severity of symptoms and the degree of functional impairment. Depressive disorders are comorbid with a vast array of other psychiatric disorders, health problems, and diseases, and with many types of severely dysfunctional relationships (Richards 2014).

Depressive symptoms can become chronic and recurrent and can lead to substantial impairment in an individual’s ability to function in everyday life (WHO 2012). It is important to recognise that individuals experiencing persistent depressive symptoms below the threshold for a diagnosis of MDD, previously categorised as having a ‘minor depressive disorder’, may find their symptoms equally distressing and disabling (Fils 2010).

Description of the intervention

Music therapy can be defined as “the professional use of music and its elements as an intervention in medical, educational, and everyday environment with individuals, groups, families, or communities, who seek to optimise their quality of life and improve their physical, social, communicative, emotional, intellectual and spiritual health and well‐being. Research, practice, education, and clinical training in music therapy are based on professional standards according to cultural, social, and political contexts” (WFMT 2011).

Music therapy is delivered in a variety of contexts (e.g. mental health, medical, community, developmental, and educational contexts) (Edwards 2016). Music therapy can be delivered to groups or individually, and participants may drop into an open group (e.g. in a psychiatric ward setting) or may be referred and assessed by the music therapist before placement in individual treatment or closed group therapy.

Music therapy approaches across the world have emerged from diverse traditions such as behavioural, psychoanalytical, educational, or humanistic models of therapy. Music therapy methods can be active and/or receptive and include verbal processing of feelings and experiences. In active methods (improvisational, re‐creative, compositional), participants are ‘making music’, and in receptive music therapy, participants are ‘receiving’ (e.g. listening to) music (Bruscia 2014; Wheeler 2015). Improvisation might be the active method most commonly used in adult mental health (Gold 2009). Often, different methods and techniques are combined in the same therapy. In recent years, specialisations have evolved (e.g. neurologic music therapy (NMT)), to improve cognitive, sensory, and motor functioning (Thaut 1999; Thaut 2014).

The aim of music therapy is to improve health via therapeutic change agents such as music, relationships, and reflections. In both active and receptive methods, the music therapist and participants are actively involved and musical interaction takes place between therapist and patient, or between therapist and group. Sessions are carried out within a structured therapeutic framework that serves as the basis for the music therapy intervention. Music therapy training is delivered at the Master's level, at the Bachelor's level, or at completion of extended undergraduate degree programmes.

Evidence‐based practice (EBP) is receiving increased attention in music therapy (Edwards 2016; Silverman 2015; Wigram 2014). This work involves integration of the best available research evidence, the therapist’s clinical expertise, and the patient’s unique values and circumstances (Hoffmann 2013; Straus 2011). Cochrane reviews are an important source of information on EBP of music therapy and have been conducted to provide a guide for music therapy treatment, music therapy education, and development of meaningful guidelines (Edwards 2016).

How the intervention might work

Music is a powerful stimulus that evokes and modulates moods and emotions (Baumgartner 2006; Baumgartner 2006a; Koelsch 2015); music is often used intentionally to regulate moods and emotions in daily life (Juslin 2010). Juslin reports that music may influence motivation, self‐image, and coping mechanisms around difficult feeling states; in some forms of music therapy, the therapist explicitly helps individuals process feelings that have been aroused by music (Juslin 2010). Other possible mechanisms of action have been described by Maratos and colleagues (Maratos 2011), who suggest that high levels of engagement are seen in music therapy trials because music‐making is largely a social, pleasurable, and meaningful activity, and that therapists use these affordances in a variety of ways to help people.

In active music therapy, the music therapist uses improvisational, re‐creative, or compositional methods. Improvisational methods in music therapy include any experience by which the patient actively participates in spontaneous music‐making with the music therapist or with other individuals while playing instruments, vocalising, or sounding their bodies or other objects. Re‐creative methods involve reproduction of pre‐composed musical material vocally or instrumentally. With compositional methods, the process of composition helps patients generate and refine personal opinions, ideas, and fantasies, and puts them into a workable musical structure (Bruscia 2014).

The putative mechanism of action in active music therapy for depression is that the co‐created musical relationship between the therapist and the patient or the patient group enables the patient to experience and to gain insight into relational and emotional problems by talking about the musical dialogue (Nordoff 1977; Odell‐Miller 1995); to organise, problem‐solve, take responsibility, communicate, improve attention, and experience feelings of self‐worth and achievement (Bruscia 2014); to meet a variety of emotional states and physical needs (Wheeler 2015a); and to express emotions by creating musical sounds and structures (Punkanen 2011). Synchronisation and attuned musical expression can modulate levels of stress and anxiety. Intersubjective moments form the basis for development of subjectivity, togetherness, creation of meaning, and possibilities of actions and language (Trondalen 2016).

Active music therapy is likely to be influenced by psychodynamic, cognitive‐behavioural, or humanistic traditions, and sometimes is combined with other forms of art, such as writing, drawing, and movement.

In receptive music therapy, the music therapist uses methods and techniques by which the patient is a recipient of the music experience (Grocke 2007). The music in music therapy may consist of live or recorded improvisations, performances, or compositions presented in various styles, such as classical, rock, jazz, and country. The patient is encouraged to listen to music and to respond silently, verbally, or in another modality. Methods include music relaxation, song discussion, listening to the patient's preferred music, and imaginal listening, for which Guided Imagery and Music (GIM) is an internationally well‐known method (Bruscia 2014).

The putative mechanism of action in receptive music therapy for depression is that different types of musical stimuli directly induce shifts in consciousness, stimulate imaging and senses, induce moods and evoke feelings, influence the body, stimulate or sedate physical or mental energy, motivate or discourage physical activity, motivate interaction, and evoke introspection, reflection, and insight (Bruscia 2015). It has been suggested that receptive music therapy can help reduce stress, soothe pain, and energise the body (Bruscia 1991; Standley 1991). Intentional listening via images enables the patient to focus, relax, experience, and share experiences, and leads to reduced anxiety (Grocke 2007; Grocke 2015).

Receptive music therapy is also likely to be influenced by cognitive‐behavioural, humanistic, or psychodynamic traditions and may involve an adjunctive activity performed whilst listening, such as relaxation, meditation, movement, drawing, or reminiscence.

Why it is important to do this review

This is an update of a Cochrane review first published in 2008 (Maratos 2008). Authors of the original review stated that music therapy has been offered to people with mental disorders across the world, yet the evidence base of music therapy for depression had not been examined. Trials were not reviewed, and randomised controlled trials (RCTs) included small sample sizes, making outcomes difficult to gauge accurately. Participant groups were often heterogeneous, and approaches to and methods of music therapy varied. Since the first review was published, several larger, more robust RCTs of music therapy for depression have been reported, and an update of the 2008 systematic review has become necessary to assess available evidence on music therapy with the goals of understanding its effectiveness for patients with depression and comparing effects of different forms of music therapy.

Maratos and colleagues included five studies in the first version of this review (Chen 1992; Hanser 1994; Hendricks 1999; Radulovic 1996; Zerhusen 1995); review authors concluded at that time that music therapy was accepted by people with depression and was associated with improvement in depressive symptoms. Because of the small number and low methodological quality of identified studies, review authors could not confidently provide conclusions about the effectiveness of music therapy. Those review authors suggested that high‐quality trials evaluating effects of music therapy on depression were required (Maratos 2008). Additionally, Maratos and colleagues did not conduct a meta‐analysis owing to heterogeneity of studies.

To date, several other trials related to music therapy and depression have been conducted, but they have not yet been systematically reviewed. Authors of a narrative review on music therapy and depression concluded that current research regarding music therapy and depression suggests a significant and persistent reduction in patients' symptoms, along with improvements in quality of life (Assche 2015). However, review authors did not include all relevant data from the most recent trials and did not conduct a meta‐analysis. Also, the authors of another recent systematic review and meta‐analysis concluded that music therapy reduces depressive symptoms, but that review was limited to studies of older adults (Zhao 2016).

We prepared the current update to provide up‐to‐date conclusions on the effectiveness of music therapy for individuals of all age groups with a diagnosis of depression, in any setting. We also aimed to compare different music therapy methods and approaches to enable better understanding of the relationship between process and outcomes. Finally, results of this systematic review might lead to new implications for research, guidelines, clinical practice, policy, and music therapy education.

Objectives

  1. To assess effects of music therapy for depression in people of any age compared with treatment as usual (TAU) and psychological, pharmacological, and/or other therapies.

  2. To compare effects of different forms of music therapy for people of any age with a diagnosis of depression.

Methods

Criteria for considering studies for this review

Types of studies

All randomised controlled trials (RCTs) and controlled clinical trials (CCTs), published and unpublished, undertaken in any country, were eligible for inclusion.

Types of participants

Participant characteristics

People of any age, gender, and ethnicity, in any country.

Diagnosis

The primary diagnosis for trial participants was clinical depression, as classified by the International Classification of Diseases, Tenth Revision (ICD‐10) (WHO 1992), or the Diagnostic and Statistical Manual of Mental Disorders, 3rd edition (DSM‐III) (APA 1980), DSM, 3rd revised edition (DSM‐III‐R) (APA 1987), DSM, 4th edition (DSM‐IV) (APA 1994), DSM, 4th text revised edition (DSM‐IV‐TR) (APA 2000), or DSM, 5th edition (DSM‐5) (APA 2013). Review authors identified this diagnosis by (1) performing a psychological assessment, or making a psychiatric diagnosis; (2) scoring above a cutoff score on a validated self‐rating depression questionnaire; or (3) scoring above a cutoff score on a validated clinician‐rated instrument.

Comorbidities

Given that depression is often related to other health problems and may co‐occur with other diagnoses, we accepted for inclusion any kind of comorbidity such as anxiety disorder, alcohol abuse, personality disorder, dementia, autism, schizophrenia, psychosis, or somatoform comorbidity.

Setting

We included all settings in this review.

Types of interventions

Music therapy

Any form of music therapy (e.g. improvisational, re‐creative, compositional, or receptive methods) provided alone or in addition to any form of treatment as usual (TAU), as defined by trialists.

To be included, music therapy had to be provided by a trained therapist or health professional. To be classified as well‐defined music therapy, a coherent theoretical framework underpinning the intervention must have been described. Trials involving trainees in formal music therapy training programmes were considered, as were programmes provided by music therapists without formal training. Some untrained practitioners call their practice music therapy; owing to the relative newness of music therapy as a regulated profession, we included these studies in this review as well. In summary, to be classified as well‐defined music therapy, the intervention had to comprise the following features.

  1. Sessions were carried out within a structured therapeutic framework.

  2. Some kind of musical interaction took place between therapist and participant, or between therapist and members of a group (e.g. improvisation, other forms of musical expression, listening to music).

  3. The aim of therapy was to improve health.

  4. The main therapeutic change agent could be described as the music; the relationship; or reflections induced by the music.

Comparator interventions

  1. TAU (as defined by trialists)

  2. Psychological therapies

  3. Pharmacological therapies

  4. Another form of music therapy

TAU, which can be defined as the combination of different therapies or activities (e.g. psychotherapy, medication, collaborative care, occupational therapy, re‐creative activities), represents standard treatment for individuals with mental health conditions, such as depression.

Main comparisons

  1. Music therapy alone versus TAU

  2. Music therapy plus TAU versus TAU alone

  3. Music therapy alone versus psychological therapies

  4. Music therapy alone versus pharmacological therapies

  5. One form of music therapy versus another form of music therapy

Types of outcome measures

Primary outcomes

  1. Depressive symptoms: We assessed depressive symptoms according to continuous validated depression measures. We analysed clinician‐rated scales, such as the Hamilton Rating Scale for Depression (HAM‐D; Hamilton 1960), separately from patient‐reported scales, such as the Beck Depression Inventory (BDI; Beck 1961).

  2. Adverse effects: We assessed the number of adverse events.

Secondary outcomes

  1. Social and occupational functioning, as measured by a validated tool, such as the Social Functioning Questionnaire (SFQ; Tyrer 2005)

  2. Self‐esteem, as measured by a validated tool, such as the Rosenberg Self‐Esteem Inventory (RSE; Rosenberg 1979)

  3. Quality of life, as assessed on a validated measure scale, such as EuroQol (Brooks 1995)

  4. Costs or cost‐effectiveness (or a combination) of treatment, as assessed by any type of qualitative or quantitative analysis, such as TiC‐P (commonly applied questionnaire on healthcare utilisation and productivity losses in patients with a psychiatric disorder) (Bouwmans 2013)

  5. Leaving the study early owing to non‐acceptability or tolerability of treatment for any reason, based on any type of qualitative or quantitative analysis

  6. Anxiety, as measured by a validated assessor‐rating scale, such as the Hamilton Anxiety Scale (HAM‐A; Hamilton 1959), or a self‐rating scale, such as the Beck Anxiety Inventory (BAI; Beck 1988)

  7. Satisfaction with treatment, as measured by validated tools, such as the howRwe questionnaire (Benson 2014)

Timing of outcome assessment

We included in the review any duration of treatment period and all time frames of outcome assessment. We grouped time points of outcome assessments and classified them into short‐term (up to three months from randomisation), medium‐term (up to six months), and long‐term (longer than six months) outcomes. We decided that short‐term outcomes were most important to include in the 'Summary of findings' tables. If a study reported more than one time point within the considered time frame, we chose the latest time point for analyses.
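The grouping rule above can be sketched in a few lines of Python. This is an illustration only: the function names are hypothetical, and treating the three‐ and six‐month boundaries as inclusive is an assumption based on the wording "up to".

```python
from typing import Dict, List


def classify_time_point(months: float) -> str:
    """Map months since randomisation to a follow-up category.

    Assumption: "up to three months" and "up to six months" are inclusive.
    """
    if months <= 3:
        return "short-term"
    if months <= 6:
        return "medium-term"
    return "long-term"


def latest_per_time_frame(months_reported: List[float]) -> Dict[str, float]:
    """Keep only the latest reported time point within each time frame."""
    chosen: Dict[str, float] = {}
    for months in sorted(months_reported):
        # Later time points overwrite earlier ones in the same category.
        chosen[classify_time_point(months)] = months
    return chosen
```

For a hypothetical study reporting outcomes at 1.5, 3, 6, and 12 months, the 3‐month assessment would be used for the short‐term analysis.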

Hierarchy of outcome measures

If a study used multiple measures per outcome, we planned to give preference to measures of validated instruments, such as the Montgomery‐Åsberg Depression Rating Scale (MADRS; Montgomery 1979), the HAM‐D (Hamilton 1960), the Beck Depression Inventory (BDI; Beck 1961; Beck 1988), the Inventory of Depressive Symptomatology (IDS; Rush 1986), and the Symptom Checklist‐90‐Revision (SCL‐90‐R; Derogatis 1977). If several measures assessed the same outcomes in one particular study, we prioritised the measures with highest validity and reliability. Rating scales were completed by participants, their significant others, an independent observer who may or may not have been masked, or music therapists conducting the music therapy. We decided to report both clinician‐rated and patient‐reported outcomes in the 'Summary of findings', when available.

Search methods for identification of studies

Specialised Register of the Cochrane Common Mental Disorders Group (CCMD‐CTR)

The Cochrane Common Mental Disorders Group maintains a specialised register of RCTs ‐ the CCMD‐CTR. This register contains over 40,000 reference records (reports of RCTs) for anxiety disorders, depression, bipolar disorder, eating disorders, self‐harm, and other mental disorders within the scope of this Group. The CCMD‐CTR is a partially studies‐based register with more than 50% of reference records tagged to 12,500 individually PICO‐coded study records. We collated reports of trials for inclusion in the register from (weekly) generic searches of MEDLINE (1950‐), Embase (1974‐), and PsycINFO (1967‐); through quarterly searches of the Cochrane Central Register of Controlled Trials (CENTRAL); and by review‐specific searches of additional databases. We also sourced reports of trials from international trial registries and drug companies, and handsearched key journals, conference proceedings, and other (non‐Cochrane) systematic reviews and meta‐analyses. Details of CCMD's core search strategies (used to identify RCTs) can be found on the Group's website; an example of the core MEDLINE search is displayed in Appendix 1.

Electronic searches

We developed a review protocol that was based on the Preferred Reporting Items for Systematic Reviews and Meta‐Analysis (PRISMA) statement (www.prisma‐statement.org). Sarah Dawson (SD), Trials Search Co‐ordinator, Cochrane Common Mental Disorders (CCMD) Group, searched CCMD‐CTR and the Wiley/Cochrane Library from inception. We searched Thomson Reuters/Web of Science, Ebsco/PsycInfo, Ebsco/Cumulative Index to Nursing and Allied Health Literature (CINAHL), Embase.com, PubMed, the World Health Organization International Clinical Trials Registry Platform (WHO ICTRP), ClinicalTrials.gov, the National Guideline Clearing House, OpenGrey, Digital Access to Research Theses (DART)‐Europe E‐theses Portal, Open Access Theses and Dissertations, and the ProQuest Dissertations and Theses Database from inception (JCFK, SA). We also searched CCMD‐CTR to 6 May 2016; the Wiley/Cochrane Library to 17 June 2016 (SD); Thomson Reuters/Web of Science to 21 June 2016; Ebsco/PsycInfo, Ebsco/CINAHL, Embase.com, and PubMed to 5 July 2016; WHO ICTRP, ClinicalTrials.gov, the National Guideline Clearing House, and OpenGrey to 6 September 2016; and DART‐Europe E‐theses Portal, Open Access Theses and Dissertations, and the ProQuest Dissertations and Theses Database to 7 September 2016 (JCFK, SA). We used the following terms (including synonyms and closely related words) as index terms or free‐text words: ‘depression’ or ‘mood disorders’ or ‘affective disorders’ and ‘music’ and ‘RCTs’. We have provided full search strategies for all databases in the Appendices. We performed a further search in August 2017 (Appendix 3). We have added those results to 'Studies awaiting classification' and will incorporate them into this systematic review at the next update.

Searching other resources

Reference lists

We checked the reference lists of all included studies and relevant systematic reviews to identify additional studies missing from the original electronic searches (e.g. unpublished or in‐press citations).

Personal communication

We contacted trialists and subject experts for information on unpublished or ongoing studies, or to request additional trial data.

Other resources

We planned to search the International Music Therapy Research Register, which is specialised in music therapy studies, but this register was no longer available. We did not handsearch specialist journals in music therapy for this review update because all journals are now available online, and articles could be obtained in the databases mentioned above.

Data collection and analysis

Selection of studies

We considered studies for inclusion if they had an RCT or CCT design. We downloaded all search results into EndNote and Review Manager (RevMan 2014). One review author (SA) removed exact duplicates. Two review authors (SA, RF) independently screened remaining titles and abstracts for inclusion to select all potentially relevant studies. To prevent bias in assessment, the first review author was knowledgeable about music therapy, and the second review author was knowledgeable about mental health care. We identified multiple reports related to the same study to determine which studies were eligible for inclusion. If uncertainties about duplication remained, review authors contacted authors of study reports. We coded all articles as potentially eligible or not eligible. After reading full‐text articles, the same review authors independently decided whether studies met the inclusion criteria. We resolved disagreements through discussion or by consultation with a third review author (AV). We have shown the selection process in a PRISMA flow diagram (Figure 1). We listed included studies under Characteristics of included studies; we identified potentially relevant studies that we ultimately excluded under Characteristics of excluded studies, and provided the primary reason for exclusion.


PRISMA flow diagram.

Data extraction and management

Two review authors (SA, RF) independently extracted study characteristics and outcome data from included studies, using a standardised data extraction form in Word, which was piloted on seven studies before use, and double‐entered the data into Review Manager (RevMan 2014) software. In cases of disagreement between review authors, we sought clarification from trial investigators. We obtained missing information from investigators when possible (SA). We resolved disagreements by discussion or through consultation with a third review author (AV). If outcome data were not reported in a usable way, we mentioned this in the notes in the Characteristics of included studies table. SA transferred data into the Review Manager file (RevMan 2014). SA and RF double‐checked whether data were correctly entered. Other review authors (LF, CG) checked study characteristics for accuracy against the trial report and extracted the following study characteristics.

  1. Source: study ID, report ID, review author ID, date of study, citation and contact details.

  2. Methods: study design, power calculation, date of study, duration of study, sequence generation, allocation sequence concealment, blinding, other concerns of bias, ethics.

  3. Participants: total number, setting, diagnostic criteria, severity of depression, number of prior depressive episodes, age, sex, country, comorbidity, sociodemographics, ethnicity.

  4. Intervention: total number of groups, music therapy method, intensity of sessions, duration of session, duration of treatment, individual or group, therapist’s training, therapist's post‐qualifying experience, monitoring of adherence to music therapy paradigm/protocol, comparison, concomitant treatment, medication, excluded interventions, integrity of interventions.

  5. Outcomes: primary and secondary outcomes collected and reported; for scales, upper and lower limits and whether a high or low score is good; time points reported.

  6. Results: number of participants allocated to each intervention group, total sample size, summary data for each intervention group (2 × 2 table for dichotomous data; means and standard deviations (SDs) for continuous data).

  7. Miscellaneous: funding for trial, notable conflicts of interest of trial authors, other and key conclusions (Higgins 2015).

Assessment of risk of bias in included studies

We assessed risk of bias according to the new Cochrane method (Higgins 2015). Two review authors (SA, RF) independently assessed risk of bias for each included study using the criteria outlined in the Cochrane Handbook for Systematic Reviews of Interventions, to prevent overestimation or underestimation of the true intervention effect (Higgins 2015). We assessed risk of bias according to the following domains.

  1. Random sequence generation.

  2. Allocation concealment.

  3. Blinding of participants.

  4. Blinding of personnel.

  5. Outcome assessment.

  6. Incomplete outcome data.

  7. Selective outcome reporting.

  8. Other potential threats to validity.

We judged each potential source of bias as having high, low, or unclear risk. We resolved disagreements by discussion and consensus or, in cases of no consensus, by involving a third review author (AV). We provided a supporting quotation from the study report, together with a justification for judgements, in the Risk of bias in included studies table. We summarised risk of bias judgements across different studies for each of the domains listed. In the case that information on risk of bias was related to unpublished data or to correspondence with a trialist, we planned to quote this in the ‘Risk of bias’ table. When considering conclusions on treatment effects, we took into account risk of bias of trials that contributed to that outcome (Higgins 2015).

For cluster‐randomised trials, we considered particular biases (e.g. recruitment bias), along with baseline imbalance, loss of clusters, incorrect analysis, and comparability with individually randomised trials. To assess risk of bias in cross‐over trials, we took the following topics into account: whether the cross‐over design was suitable, whether a carry‐over effect was evident, whether only first period data were available, whether findings on analysis were incorrect, and whether results were comparable with those reported by parallel‐group trials (Higgins 2015).

Measures of treatment effect

Dichotomous data

We analysed dichotomous outcome data using odds ratios (ORs) and calculated 95% confidence intervals (CIs) for each effect estimate.
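As an illustration of this effect measure, the following Python sketch computes an odds ratio and its 95% CI on the log scale from a 2 × 2 table. It is a minimal sketch, not the Review Manager implementation; the function name is hypothetical, and adding 0.5 to every cell when any cell is zero is an assumed convention.

```python
import math
from typing import Tuple


def odds_ratio_ci(events_t: int, n_t: int, events_c: int, n_c: int,
                  z: float = 1.96) -> Tuple[float, float, float]:
    """Return (OR, lower, upper) for treatment versus control."""
    a, b = float(events_t), float(n_t - events_t)  # treatment: events, non-events
    c, d = float(events_c), float(n_c - events_c)  # control: events, non-events
    if 0.0 in (a, b, c, d):
        # Assumed continuity correction so the log odds ratio is defined.
        a, b, c, d = a + 0.5, b + 0.5, c + 0.5, d + 0.5
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper
```

With hypothetical counts of 10/50 events in the treatment group and 20/50 in the control group, the odds ratio is (10 × 30) / (40 × 20) = 0.375.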

Continuous data

We planned to analyse continuous outcomes as mean differences (MDs) if outcomes were measured on the same scale, and as standardised mean differences (SMDs) if outcomes were measured on different scales. We had to combine different scales for all outcomes and therefore used only SMDs. We calculated 95% confidence intervals for each effect estimate. Because baseline group means varied across studies, we examined change scores (differences between baseline and treatment end or follow‐up). We decided that treatment, participants, and the underlying clinical question were sufficiently similar for pooling, and therefore undertook meta‐analysis. When a single trial reported multiple arms, we included only the relevant arms (Higgins 2015).
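To make the SMD concrete, the sketch below computes a small‐sample‐corrected SMD (Hedges' g) and an approximate 95% CI from group means and SDs of change scores. This is an assumption‐laden illustration: the function name is hypothetical, the variance formula is a common large‐sample approximation, and it may differ slightly from the formula used by Review Manager.

```python
import math
from typing import Tuple


def smd_hedges_g(mean_t: float, sd_t: float, n_t: int,
                 mean_c: float, sd_c: float, n_c: int,
                 z: float = 1.96) -> Tuple[float, float, float]:
    """Return (g, lower, upper) for treatment minus control."""
    df = n_t + n_c - 2
    # Pooled within-group standard deviation.
    sd_pooled = math.sqrt(((n_t - 1) * sd_t ** 2 + (n_c - 1) * sd_c ** 2) / df)
    d = (mean_t - mean_c) / sd_pooled
    j = 1 - 3 / (4 * df - 1)          # small-sample correction factor
    g = j * d
    # Approximate variance of d, scaled by the correction factor.
    var_d = (n_t + n_c) / (n_t * n_c) + d ** 2 / (2 * (n_t + n_c))
    se_g = j * math.sqrt(var_d)
    return g, g - z * se_g, g + z * se_g
```

For example, with hypothetical mean changes of -10 (SD 5, n = 20) and -4 (SD 5, n = 20) on a depression scale, g is about -1.18, and the whole interval lies below zero.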

We planned to narratively describe skewed data reported as medians and interquartile ranges.

Unit of analysis issues

Cluster‐randomised trials

To incorporate cluster‐randomised trials, we intended to reduce the size of each trial to its ‘effective sample size’. If intracluster correlation coefficients were not reported, we planned to find external estimates from similar studies.
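The 'effective sample size' adjustment can be sketched as follows, using the standard design effect 1 + (m - 1) × ICC, where m is the average cluster size. The function names and the example values are illustrative assumptions, not data from any included trial.

```python
def design_effect(avg_cluster_size: float, icc: float) -> float:
    """Design effect for a cluster-randomised trial."""
    return 1 + (avg_cluster_size - 1) * icc


def effective_sample_size(n: float, avg_cluster_size: float, icc: float) -> float:
    """Shrink the nominal sample size by the design effect."""
    return n / design_effect(avg_cluster_size, icc)
```

For a hypothetical arm of 100 participants in clusters of 10 with an ICC of 0.05, the design effect is 1.45 and the effective sample size is about 69.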

Cross‐over trials

To avoid carry‐over effects, we planned to include data from only the first period of cross‐over studies. We detected no cross‐over trials.

Studies with multiple treatment groups

We treated with care included studies that compared more than two intervention groups. To overcome a unit of analysis error, we combined all relevant experimental intervention groups into a single group, and all relevant control intervention groups into a single control group, to create a single pair‐wise comparison (Higgins 2015).
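Combining two arms into one group for continuous data can be sketched with the usual formula for pooling sample sizes, means, and SDs. The function name is hypothetical, and the example values are invented to show that the result matches the mean and SD of the merged raw data.

```python
import math
from typing import Tuple


def combine_groups(n1: int, m1: float, sd1: float,
                   n2: int, m2: float, sd2: float) -> Tuple[int, float, float]:
    """Return (N, mean, SD) of two arms merged into a single group."""
    n = n1 + n2
    mean = (n1 * m1 + n2 * m2) / n
    # Within-arm variability plus the spread between the two arm means.
    numerator = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2
                 + (n1 * n2 / n) * (m1 - m2) ** 2)
    sd = math.sqrt(numerator / (n - 1))
    return n, mean, sd
```

Merging two hypothetical arms with n = 3, mean 2, SD 1 and n = 3, mean 5, SD 1 gives n = 6, mean 3.5, and SD equal to the square root of 3.5, which is what the raw values 1 to 6 would give directly.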

Dealing with missing data

We contacted trial authors to verify key study characteristics and to obtain missing numerical outcome data when possible (e.g. when a study was identified as abstract only, when a study was identified as full text and data regarding an outcome of interest were not reported). We assumed that dropouts from treatment were treatment failures unless trialists expressly stated otherwise. We used intention‐to‐treat (ITT) analysis when data were missing for participants who dropped out of trials before completion. We documented all correspondence with trialists (Higgins 2015).

Assessment of heterogeneity

We assessed clinical and methodological heterogeneity by examining the characteristics of studies. We reported similarities between interventions, participants, design, and outcomes in the Included studies subsection. We visually inspected forest plots to investigate the possibility of statistical heterogeneity. To assess whether observed differences in results were compatible with chance alone, we applied the Chi² test (Cochran's Q). We regarded a P value less than 0.10 as statistically significant, which means that evidence suggested heterogeneity of intervention effects. We took care in interpreting the Chi² test because it has low power when sample sizes are small. As heterogeneity will always exist, we decided to quantify inconsistency by applying the I² statistic to estimate the observed degree of heterogeneity (Higgins 2015).
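The two heterogeneity statistics can be illustrated with a short Python sketch that takes study effect estimates and their standard errors. The function name is hypothetical; the P value for the Chi² test is omitted here because it would need a chi‐squared distribution function from an external library.

```python
from typing import List, Tuple


def q_and_i_squared(effects: List[float],
                    std_errors: List[float]) -> Tuple[float, int, float]:
    """Return (Q, degrees of freedom, I-squared in percent)."""
    weights = [1 / se ** 2 for se in std_errors]        # inverse-variance weights
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    # I-squared is the share of variability beyond chance, floored at 0.
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return q, df, i2
```

For two hypothetical studies with effects 0.0 and 1.0 and standard errors of 0.5 each, Q = 2 on 1 degree of freedom and I² = 50%.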

Assessment of reporting biases

To avoid publication bias, we obtained and included data from unpublished trials and took into account that unpublished studies could introduce new bias through, for example, poor methodological quality or missing data. We planned to create a funnel plot to detect possible publication bias if we identified more than ten studies (in the absence of bias, the plot should approximately resemble a symmetrical funnel). If we thought that asymmetry of the funnel (bias) was explained by other reasons, such as lack of unpublished smaller studies (Higgins 2015), selection bias, poor methodological quality, or chance (Egger 1997; Sterne 2000), we planned to report this information in the Discussion section.

Data synthesis

We analysed data using Review Manager software and pooled data for meta‐analysis when studies assessed similar treatments and had similar outcomes (RevMan 2014). We conducted a meta‐analysis using available or calculated standardised mean differences (SMDs) for continuous outcomes, and odds ratios (ORs) for dichotomous outcomes. We chose SMD because we expected many different scales to be used across studies, and because existing guidelines facilitate clinical interpretation, particularly when lesser‐known scales are used (Cohen 1988). We expected that true effects for all included studies would not be the same; therefore, we planned to analyse data by applying a random‐effects model to combine results and produce a summary of findings of all included studies. We included in the results measures of uncertainty, such as 95% confidence intervals and estimates of T² and I². When suitable numerical data were not available for meta‐analysis, or when meta‐analyses were considered inappropriate to yield clinically meaningful results, we planned to produce only narrative summaries of all included studies to provide a systematic assessment of available evidence. We produced a descriptive paragraph for each study, presenting all studies consistently (e.g. using the same elements of information for each study and in the same order) (Higgins 2015).
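A random‐effects pooling step can be sketched as follows. The text does not state which between‐study variance estimator was used; this sketch assumes the DerSimonian and Laird method, and the function name is hypothetical.

```python
import math
from typing import List, Tuple


def random_effects_pool(effects: List[float], std_errors: List[float],
                        z: float = 1.96) -> Tuple[float, float, float, float]:
    """Return (pooled effect, lower, upper, between-study variance)."""
    w = [1 / se ** 2 for se in std_errors]              # fixed-effect weights
    fixed = sum(wi * y for wi, y in zip(w, effects)) / sum(w)
    q = sum(wi * (y - fixed) ** 2 for wi, y in zip(w, effects))
    df = len(effects) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    # Method-of-moments estimate of between-study variance, floored at 0.
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    w_star = [1 / (se ** 2 + tau2) for se in std_errors]  # random-effects weights
    pooled = sum(wi * y for wi, y in zip(w_star, effects)) / sum(w_star)
    se_pooled = math.sqrt(1 / sum(w_star))
    return pooled, pooled - z * se_pooled, pooled + z * se_pooled, tau2
```

For two hypothetical studies with effects 0.0 and 1.0 and standard errors of 0.5, the estimated between‐study variance is 0.25 and the pooled effect is 0.5 with a standard error of 0.5, a wider interval than a fixed‐effect model would give.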

Subgroup analysis and investigation of heterogeneity

When we identified heterogeneity, we planned to present the results of subgroups separately. We planned to examine clinical heterogeneity according to the following.

  1. Participant characteristics ‐ age, length of depression history, comorbidity.

  2. Duration of treatment ‐ 20 sessions versus more than 20 sessions.

  3. Modality of treatment ‐ individual versus group therapy.

  4. Type of music therapy.

Sensitivity analysis

When applicable, we planned to conduct the following sensitivity analyses for primary outcomes to examine the robustness of observed findings.

  1. Excluding studies with high risk of bias. We defined a study as having an overall "high risk of bias" if we judged that it had high risk of bias in at least one domain.
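The overall "high risk of bias" rule used for this sensitivity analysis can be expressed as a simple filter. The study labels and judgements below are invented for illustration and do not describe any included trial.

```python
from typing import Dict, List


def has_high_overall_risk(domain_judgements: Dict[str, str]) -> bool:
    """A study is overall high risk if any single domain is judged 'high'."""
    return any(judgement == "high" for judgement in domain_judgements.values())


def studies_for_sensitivity_analysis(studies: Dict[str, Dict[str, str]]) -> List[str]:
    """Return the studies that remain after excluding overall high-risk ones."""
    return [name for name, domains in studies.items()
            if not has_high_overall_risk(domains)]


# Hypothetical judgements for two made-up studies.
example_studies = {
    "Study A": {"random sequence generation": "low",
                "allocation concealment": "unclear"},
    "Study B": {"random sequence generation": "low",
                "blinding of participants": "high"},
}
```

Here only "Study A" would be retained, because "Study B" has one domain judged as high risk.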

'Summary of findings' tables

We assessed the quality of the evidence by using the GRADE approach for our main comparisons and outcomes (as listed in Types of outcome measures). We planned to create ‘Summary of findings’ tables to provide key information regarding the quality of evidence and the magnitude of effect of interventions examined, and to summarise available data on all outcomes for a given comparison. To ensure consistency of use across reviews, we prepared standard Cochrane 'Summary of findings' tables by using GRADEproGDT 2015 and including the following elements: comparison, population, setting, intervention, comparator intervention, primary and secondary outcomes, burden of outcomes (illustrative risk, or illustrative mean, on control intervention; source of any external information used in this column), absolute and relative magnitude of effect, numbers of participants and studies, rating of evidence quality, and space for comments. We produced a separate table for each comparison.

We decided that music therapy versus treatment as usual was our main comparison. In the 'Summary of findings' table, we reported the seven main outcomes. Primary outcomes were short‐term clinician‐rated and patient‐reported depression and adverse events. Secondary outcomes included functioning, quality of life, leaving the study early, and anxiety. We created our 'Summary of findings' tables before writing the abstract, discussion, and conclusions to consider how risk of bias in studies contributing to each outcome affected mean treatment effects and our confidence in mean treatment effects (Higgins 2015).

Results

Description of studies

Results of the search

In total, we identified 2867 records. Of these, we retrieved 2784 records through database searching. We found 83 additional references by searching the World Health Organization International Clinical Trials Registry Platform (WHO ICTRP) (n = 4), ClinicalTrials.gov (n = 25), the National Guideline Clearing House (n = 11), OpenGrey (n = 11), the Digital Access to Research Theses (DART)‐Europe E‐theses Portal (n = 16), the ProQuest Dissertations and Theses database (n = 1), personal communications (n = 5), and published systematic reviews on music therapy for depression (n = 10). We found no additional references in the Electronic Theses Online Service (EthOS), the British Libraries e‐thesis online service, Open Access Theses and Dissertations, or the reference lists of included studies.

After removing 1165 duplicates, we screened 1702 titles and abstracts and excluded 1661 irrelevant records. We retrieved full‐text reports for the remaining 41 studies. After reading the full texts, we excluded 30 studies, as they did not meet review eligibility criteria. We have provided primary reasons for exclusion in the Characteristics of excluded studies table and in Figure 1. Two studies are awaiting assessment owing to insufficient information on design, intervention, and analysis (see Characteristics of studies awaiting classification). We added to the Studies awaiting classification section three study reports obtained from an updated search conducted in August 2017 (Ahessy 2016; Jasemi 2016; Kim 2014). In preparing this review, we identified no records of ongoing studies. Finally, we included nine trials in both qualitative and quantitative syntheses.
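
The record counts reported in the two paragraphs above can be cross‐checked with simple arithmetic. The short Python sketch below uses only numbers stated in the text; the variable names are illustrative.

```python
# Records retrieved through database searching.
database_records = 2784

# Additional references by source, as listed in the text.
other_sources = {
    "WHO ICTRP": 4,
    "ClinicalTrials.gov": 25,
    "National Guideline Clearing House": 11,
    "OpenGrey": 11,
    "DART-Europe E-theses Portal": 16,
    "ProQuest Dissertations and Theses": 1,
    "personal communications": 5,
    "published systematic reviews": 10,
}

total_records = database_records + sum(other_sources.values())
screened = total_records - 1165        # after removing duplicates
full_text = screened - 1661            # after excluding irrelevant records
awaiting_assessment = 2                # insufficient information
included = full_text - 30 - awaiting_assessment  # 30 excluded at full text

print(total_records, screened, full_text, included)
```

The three reports added from the August 2017 search update are not part of this flow; they were placed directly in Studies awaiting classification.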

Included studies

We included in the present review a total of nine studies with 421 participants (of whom we included 411 in the meta‐analyses) (see Characteristics of included studies). Of these, we had included five studies in the first version of this review (Chen 1992; Hanser 1994; Hendricks 1999; Radulovic 1996; Zerhusen 1995) (Maratos 2008); we added the other four studies to the current update (Albornoz 2011; Atiwannapat 2016; Erkkilä 2011; Hendricks 2001).

Design

Eight of the included studies were randomised trials (Albornoz 2011; Atiwannapat 2016; Chen 1992; Erkkilä 2011; Hanser 1994; Hendricks 1999; Hendricks 2001; Zerhusen 1995), and one was a controlled clinical trial (Radulovic 1996). All were single‐centre trials.

Participants

Participants across all studies had received a diagnosis of a depressive disorder. Methods of diagnosing depression varied across studies. Three studies performed diagnosis according to ICD‐10 criteria (Atiwannapat 2016; Erkkilä 2011; Radulovic 1996); one study performed diagnosis according to DSM‐III‐R criteria (Chen 1992). Atiwannapat 2016 also required a score of 7 or above on the MADRS. Other studies confirmed the presence of a depressive disorder exclusively by using a validated scale, such as the BDI (Albornoz 2011; Hendricks 1999; Hendricks 2001; Zerhusen 1995), or the Schedule of Affective Disorders (SADS; Hanser 1994). Chen 1992 reported that some participants had a history of bipolar disorder.

In Albornoz 2011, depression was not the primary diagnosis but was diagnosed in comorbidity with a substance disorder. Another psychiatric comorbidity across included studies was represented by anxiety (Erkkilä 2011). In Chen 1992 and Radulovic 1996, anxiety was an outcome, but whether it was diagnosed was not reported.

Information regarding the history of depression was available only for Atiwannapat 2016, with a mean clinical history of 9.07 years.

Ages of participants were heterogeneous among the included studies. In particular, two studies recruited only adolescents aged 14 to 18 years (Hendricks 1999; Hendricks 2001). Three studies exclusively included adults aged 18 to 65 years (Atiwannapat 2016; Erkkilä 2011; Radulovic 1996). Three studies focused on a geriatric population of participants aged 60 to 86 years (Chen 1992; Hanser 1994; Zerhusen 1995). Finally, Albornoz 2011 investigated depression in both adolescents and adults, with an age range of 16 to 60 years.

In seven studies, samples included participants of both sexes; prevalence of males ranged from 10.53% in Hendricks 1999 to 49.21% in Hendricks 2001. Albornoz 2011 recruited only male participants. Zerhusen 1995 did not provide information regarding the sex of participants.

Sample size

The total number of participants enrolled in the nine studies was 421; however, one study randomised 10 participants to an arm that was outside the scope of this review (Hanser 1994). Study sizes varied from 14 participants in Atiwannapat 2016 to 79 participants in Erkkilä 2011.

Setting

Five of the included trials recruited participants from mental health services (Albornoz 2011; Atiwannapat 2016; Erkkilä 2011; Hanser 1994; Radulovic 1996). Zerhusen 1995 enrolled participants who were hospitalised in a nursing home, and participants in Chen 1992 resided in a geriatric facility. Two studies recruited participants from high schools (Hendricks 1999; Hendricks 2001).

Four studies took place in North America, all in the United States (Hendricks 1999; Hendricks 2001; Hanser 1994; Zerhusen 1995). Two trials took place in Asia: Chen 1992 was conducted in China, and Atiwannapat 2016 in Thailand. Two studies took place in Europe: Erkkilä 2011 was conducted in Finland, and Radulovic 1996 in Serbia. Finally, one study was conducted in South America, in Venezuela (Albornoz 2011).

Interventions

We included studies for the following comparisons: music therapy plus TAU versus TAU alone, music therapy alone versus psychological therapies, and one form of music therapy versus another form of music therapy. We found no studies comparing music therapy alone versus TAU or pharmacological therapies.

Music therapy

Music therapy methods were heterogeneous across the included studies. In three studies, researchers adopted an active music therapy method (Albornoz 2011; Chen 1992; Erkkilä 2011). Two studies took into consideration a combination of active and receptive music therapy (Hanser 1994; Hendricks 1999). Of note, Albornoz 2011 provided a specific music therapy intervention by combining music, movement, poetry, psychodrama, and public performance (Artistic Music Therapy; MAR). The intervention was more thoroughly described in a separate publication (Albornoz 2016). Hendricks 2001, Radulovic 1996, and Zerhusen 1995 evaluated receptive music therapy. In Atiwannapat 2016, two of the three arms of treatment involved music therapy: one arm, active music therapy, and the other arm, receptive music therapy.

In seven studies, music therapy sessions were conducted in a group setting (Albornoz 2011; Atiwannapat 2016; Chen 1992; Hendricks 1999; Hendricks 2001; Radulovic 1996; Zerhusen 1995). Two studies provided individual sessions (Erkkilä 2011; Hanser 1994).

In four studies, trained music therapists provided music therapy (Albornoz 2011; Atiwannapat 2016; Hanser 1994; Erkkilä 2011). In the remaining studies it was not clear whether a trained music therapist provided therapy, although trained therapists, counsellors, or other healthcare professionals were mentioned (Chen 1992; Hendricks 1999; Hendricks 2001; Radulovic 1996; Zerhusen 1995).

Lengths of intervention varied from six weeks in Radulovic 1996 to 12 weeks in Atiwannapat 2016 and Hendricks 2001, with the total number of sessions ranging from eight in Hanser 1994 to 48 in Chen 1992. The duration of each session varied from 20 minutes in Radulovic 1996 to 120 minutes in Albornoz 2011.

Comparator interventions

Six studies had one comparator (Albornoz 2011; Chen 1992; Erkkilä 2011; Hendricks 1999; Hendricks 2001; Radulovic 1996). Three studies each included three treatment arms (Atiwannapat 2016; Hanser 1994; Zerhusen 1995).

Five studies compared music therapy versus treatment as usual (Albornoz 2011; Chen 1992; Erkkilä 2011; Radulovic 1996; Zerhusen 1995). Extent of treatment as usual varied both between and within studies, but treatment commonly included antidepressant medication and group or individual psychotherapy. Four studies mentioned antidepressant medication (Albornoz 2011; Chen 1992; Erkkilä 2011; Radulovic 1996). Two studies mentioned group or individual psychotherapy (Albornoz 2011; Erkkilä 2011). Two studies mentioned rehabilitation services and related activities (Albornoz 2011; Zerhusen 1995). Hanser 1994 mentioned no specific therapy as researchers used a waiting list, but all participants were patients from a family research and resource centre.

Four studies used active comparators, which included cognitive‐behavioural therapy in Hendricks 1999, Hendricks 2001, and Zerhusen 1995, and counselling in Atiwannapat 2016. Atiwannapat 2016 compared two types of music therapy (active and receptive) versus each other. We excluded self‐directed music listening from Hanser 1994 as a comparator because this was outside the scope of the review.

Outcomes

Primary outcomes

Depression symptoms

All studies assessed depression symptoms using different scales. Two studies used only a clinician‐rated depression scale (Chen 1992; Erkkilä 2011); four used only a self‐reported depression scale (Hanser 1994; Hendricks 1999; Hendricks 2001; Zerhusen 1995); and three used both types (Albornoz 2011; Atiwannapat 2016; Radulovic 1996).

Researchers used two clinician‐rated depression scales.

  1. The Hamilton Rating Scale for Depression (various abbreviations are encountered: HRSD, HDRS, HAM‐D; in this review, abbreviated as HAM‐D) is a measure of depressive symptoms in adults with a diagnosis of depressive disorder. The original version (Hamilton 1960) contained 17 items, but four questions were added to later revisions (Hamilton 1966; Hamilton 1967; Hamilton 1969; Hamilton 1980). Each item on the questionnaire is scored on a 3‐ or 5‐point scale. Total score can range from 0 to 54 points, with scores from 7 to 17 indicating mild depression, from 18 to 24 indicating moderate depression, and above 24 indicating severe depression. Three studies used the HAM‐D. Albornoz 2011 used the original 17‐item version, and Chen 1992 and Radulovic 1996 did not specify which version investigators used. Although this is one of the most widely used scales, information on its typical standard deviation (SD) in people with depression is not available from the original validation studies. The included studies for which SDs could be derived showed SDs around 10 (Albornoz 2011; Chen 1992), and we imputed this value when the SD was missing (see notes in Characteristics of included studies).

  2. The Montgomery‐Åsberg Depression Rating Scale (MADRS) is a ten‐item questionnaire used to measure the severity of depressive episodes in people with mood disorders. Each item yields a score of 0 to 6, and the overall score can range from 0 to 60 (Davidson 1986). Two studies used MADRS (Atiwannapat 2016; Erkkilä 2011). Its typical SD in people with depression is around 7 (Davidson 1986; Erkkilä 2011), so we imputed this value when it was missing.
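
The HAM‐D severity bands quoted in item 1 above (7 to 17 mild, 18 to 24 moderate, above 24 severe, on a 0 to 54 total) can be written as a small lookup. This is a minimal Python sketch; the function name and the label used for scores below 7 are assumptions, as the text gives no label for that range.

```python
def hamd_severity(total_score):
    """Map a HAM-D total score to the severity bands quoted in the text."""
    if not 0 <= total_score <= 54:
        raise ValueError("HAM-D total score must lie between 0 and 54")
    if total_score > 24:
        return "severe"
    if total_score >= 18:
        return "moderate"
    if total_score >= 7:
        return "mild"
    # The text gives no label for scores below 7; this label is an assumption.
    return "below mild range"

for score in (5, 12, 20, 30):
    print(score, hamd_severity(score))
```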

Investigators used three self‐rated depression scales.

  1. Beck Depression Inventory (BDI) is a self‐report measure of the severity of depression, composed of 21 multiple choice questions related to depression symptoms. The overall score has a possible range from 0 to 63. Five studies used the BDI (Albornoz 2011; Hendricks 1999; Hendricks 2001; Radulovic 1996; Zerhusen 1995). Two of these failed to report SDs (Radulovic 1996; Zerhusen 1995), so we had to impute a typical SD derived from other studies. In particular, as the original report describing the BDI presented SD ≈ 10 based on a sample of 409 participants (Beck 1961), we used this value as the best available estimate when the SD was missing (see notes in Characteristics of included studies section).

  2. Thai Depression Inventory (TDI) is a self‐rating instrument composed of 20 items and used to evaluate the severity of depression (Lotrakul 1999). The score for each item in the TDI ranges from 0 to 3. The overall score has a possible range from 0 to 60. One study used the TDI (Atiwannapat 2016).

  3. Geriatric Depression Scale (GDS) is a self‐report assessment specifically designed to identify depression in older adults. This scale is composed of 30 items with yes/no answers. A score of 11 or above is usually considered as indicative of depression (Yesavage 1983). One study used the GDS (Hanser 1994).

For all depression scales, higher scores represent greater severity of depression symptoms.

Adverse events

Adverse events reported in these studies included worsening of depression and lower back pain (Erkkilä 2011). None of the other studies reported whether any adverse events occurred.

Secondary outcomes

Functioning

Only one study assessed functioning using the Global Assessment of Functioning scale (GAF; APA 2000) (Erkkilä 2011). The GAF is a clinician‐reported scale that is used to rate the social, occupational, and psychological functioning of an individual. Values can range from a minimum score of 0 (severely impaired functioning) to a maximum score of 100 (extremely high functioning).

Quality of life

Two studies measured quality of life. Specifically, Atiwannapat 2016 used the Thai version of the Short‐Form Health Survey (SF‐36; Ware 1992), and Erkkilä 2011 used the Finnish translation of the health‐related quality of life survey distributed by RAND (RAND‐36; Hays 1993). SF‐36 and RAND‐36 are closely related patient‐reported measures that are based on the same set of 36 items but with slightly different scoring (Hays 1993). Lower scores indicate increased disability.

Leaving the study early

Data on leaving the study early were available for all nine studies, although events occurred in only four trials (Atiwannapat 2016; Erkkilä 2011; Hanser 1994; Zerhusen 1995). Of the remaining five studies, four reported no dropouts (Albornoz 2011; Chen 1992; Hendricks 2001; Radulovic 1996), and one did not report to which arm dropouts belonged (Hendricks 1999), thus contributing no usable data for this outcome.

Anxiety

Three studies assessed anxiety (Chen 1992; Erkkilä 2011; Radulovic 1996). Two studies used the clinician‐rated Hamilton Anxiety Scale (HAM‐A) as described in Maier 1988 (Chen 1992; Radulovic 1996), and one study used the clinician‐rated Hospital Anxiety and Depression Scale ‐ Anxiety (HADS‐A), as described in Zigmond 1983 (Erkkilä 2011).

The HAM‐A is a clinician‐rated scale that intends to provide an analysis of the severity of anxiety in adults, adolescents, and children. It is composed of 14 items. Each item can receive a score between 0 and 4, and the composite score can range from 0 to 56. Its validity and reliability in people with depression are well established (Maier 1988), with a typical SD of around 7 (Maier 1988 reported standard error (SE) = 0.8 with n = 73, leading to SD ≈ 7).
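
The conversion from standard error to standard deviation mentioned above follows from SD = SE × √n. A minimal Python check using the values quoted from Maier 1988 (the function name is illustrative):

```python
import math

def sd_from_se(standard_error, n):
    """Recover a standard deviation from a standard error of the mean."""
    return standard_error * math.sqrt(n)

sd_ham_a = sd_from_se(0.8, 73)
print(round(sd_ham_a, 2))  # 6.84, i.e. roughly 7
```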

The HADS is a clinician‐rated scale that comprises 14 items (Zigmond 1983), seven of which are related to anxiety (HADS‐A); the other seven are related to depression (HADS‐D). Erkkilä 2011 used the HADS‐A. Each item on the HADS‐A is scored from 0 to 3, yielding a total score between 0 and 21.

Self‐esteem

One study assessed self‐esteem (Hanser 1994), using the Rosenberg Self‐esteem Inventory (RSE; Rosenberg 1979). The RSE is a 10‐item scale that evaluates global self‐worth. All items are answered via a 4‐point Likert scale format ranging from strongly agree to strongly disagree. Total score ranges from 0 to 30. Scores below 15 suggest low self‐esteem.

Costs and cost‐effectiveness

No studies addressed costs or cost‐effectiveness.

Satisfaction

No studies addressed satisfaction.

Excluded studies

We excluded 30 studies. Two studies were not RCTs or CCTs (Carolan 2016; No author 2008). Nineteen studies included an ineligible population (Ashida 2000; Bae 2011; Bittman 2001; Boothby 2011; Broersen 2013; Carr 2012; Cassileth 2003; Chen 2016; Choi 2008; Chu 2014; Clark 2006; Cross 2012; Iliya 2015; Lu 2013; Mohammadi 2011; Raglio 2015; Romito 2013; Schwantes 2014; Werner 2015). In four studies, the intervention was not music therapy (Brandes 2010; Castillo‐Pérez 2010; Lu 2012; Huang 2010), and two studies did not include a relevant comparator intervention (Chen 2015; Wu 2002). Full‐text reports were not available for three studies (Bradford 1991; Li 2002; Liu 2014). See also Characteristics of excluded studies.

Ongoing studies

We identified no ongoing studies.

Studies awaiting classification

Two studies are awaiting classification (Kumar 2013; Tang 2011). Kumar 2013 provided insufficient information about study design, and Tang 2011 provided insufficient details related to the music therapy intervention and statistical results. We were unable to obtain more information. See also Characteristics of studies awaiting classification. We added to the Studies awaiting classification section three study reports obtained from an updated search conducted in August 2017 (Ahessy 2016; Jasemi 2016; Kim 2014).

New studies found at this update

We added four new studies to the current update of the review (Albornoz 2011; Atiwannapat 2016; Erkkilä 2011; Hendricks 2001). Of note, Albornoz 2011 evaluated the effect of improvisational music therapy on depressed individuals with substance abuse. We did not include this particular population in the previous version of the review. Atiwannapat 2016 compared active and receptive music therapy for adult outpatients with major depression. Hendricks 2001 was a replication of a study that was included in the previous version of this review (Hendricks 1999). Investigators evaluated the effects of school‐based music therapy among adolescents with depressive symptoms. Finally, Erkkilä 2011 investigated individual music therapy provided to a working‐age group of depressed individuals.

Risk of bias in included studies

We present in Figure 2 a summary of risk of bias across domains. Figure 3 provides a summary of risk of bias results for each included study. We provide reasons for judgements in the Risk of bias in included studies tables. For clarification, we provide quotes in these tables.


Risk of bias graph: review authors' judgements about each risk of bias item presented as percentages across all included studies.

Risk of bias summary: review authors' judgements about each risk of bias item for each included study.

Allocation

Random sequence generation

Two studies specified that researchers used spreadsheet software to generate random number lists in blocks (Albornoz 2011; Erkkilä 2011); Atiwannapat 2016 reported drawing lots in a 1:1:1 ratio. We judged these studies to be at low risk of bias. Most studies were at unclear risk of bias because study authors stated only that participants were randomised, but did not describe how (Chen 1992; Hanser 1994; Hendricks 1999; Hendricks 2001; Zerhusen 1995). One study did not describe how participants were allocated, and we judged it to be at high risk of bias (Radulovic 1996).

Allocation concealment

In three studies, participants and investigators enrolling participants could not foresee assignment (Albornoz 2011; Erkkilä 2011; Hendricks 2001). Albornoz 2011 used sequentially numbered envelopes; Erkkilä 2011 used remote email randomisation; and Hendricks 2001 used coded packets. We judged the remaining six studies to be at unclear risk (Atiwannapat 2016; Chen 1992; Hanser 1994; Hendricks 1999; Radulovic 1996; Zerhusen 1995).

Blinding

We judged one of nine studies to be at low risk of bias (Erkkilä 2011). In this study, investigators did not blind participants, but one masked clinical expert conducted all psychiatric assessments. Review authors judged that outcomes in this study were not likely to be influenced by lack of blinding. The remaining eight studies were at unclear risk (Albornoz 2011; Atiwannapat 2016; Chen 1992; Hanser 1994; Hendricks 1999; Hendricks 2001; Radulovic 1996; Zerhusen 1995). In two of these studies (Albornoz 2011; Atiwannapat 2016), both important personnel and clinician‐reported outcomes (HAM‐D; MADRS) were blinded. However, blinding for participants and for self‐reported outcomes in depression (BDI) was not possible in two studies (Albornoz 2011; Atiwannapat 2016); and blinding for quality of life (SF‐36) was not possible in Atiwannapat 2016. Six studies did not address blinding of personnel and participants and provided insufficient information to permit a clear judgement (Chen 1992; Hanser 1994; Hendricks 1999; Hendricks 2001; Radulovic 1996; Zerhusen 1995).

Incomplete outcome data

We judged eight out of nine studies to be at low risk of bias (Albornoz 2011; Chen 1992; Erkkilä 2011; Hanser 1994; Hendricks 1999; Hendricks 2001; Radulovic 1996; Zerhusen 1995). For five of these studies, reports indicated no missing outcome data (Albornoz 2011; Chen 1992; Hanser 1994; Hendricks 2001; Radulovic 1996). In one study, missing outcome data were balanced in numbers across intervention groups, with similar reasons noted for missing data across groups (e.g. one resident left the study early, and corresponding participants in the other two groups were therefore also discarded from the data analysis, leaving 19 participants in each group available for the purpose of data analysis) (Zerhusen 1995). One other study imputed data using appropriate methods (Erkkilä 2011). In another study, it remains unclear to which group participants leaving the study early were originally allocated, although study authors stated that participants dropped out of the music therapy group and the treatment as usual group (Hendricks 1999). We judged one study to be at high risk (Atiwannapat 2016). In this study, the proportion of and reasons for missing data in one of the control arms were sufficient to have a clinically relevant effect because of the small study group.

Selective reporting

We judged two of nine studies to be at low risk (Albornoz 2011; Erkkilä 2011). For Albornoz 2011, an earlier published dissertation was available, and all outcomes were reported as planned (Albornoz 2009). For the other study (Erkkilä 2011), a study protocol was available, and all expected outcomes were identified and reported as planned (Erkkilä 2008). We judged the remaining seven studies to be at unclear risk (Atiwannapat 2016; Chen 1992; Hanser 1994; Hendricks 1999; Hendricks 2001; Radulovic 1996; Zerhusen 1995). For these studies, a protocol was not available, and all outcomes were reported as planned in the Methods section. Therefore, information was insufficient to permit judgement of low or high risk.

Other potential sources of bias

We judged three of nine studies to be at low risk of bias (Albornoz 2011; Atiwannapat 2016; Erkkilä 2011), as these studies appeared to be free of other sources of bias. We judged the six remaining studies to be at unclear risk because risk of bias could be present (Chen 1992; Hanser 1994; Hendricks 1999; Hendricks 2001; Radulovic 1996; Zerhusen 1995), but information was insufficient to show whether an important risk of bias existed.
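
The rule given under Sensitivity analysis (a study is at overall high risk of bias if any single domain is judged high) can be applied mechanically to the judgements described in this section. The Python sketch below encodes those judgements as read from the text above; the data structure and helper name are illustrative assumptions, and the Risk of bias tables remain the authoritative source.

```python
STUDIES = [
    "Albornoz 2011", "Atiwannapat 2016", "Chen 1992", "Erkkilä 2011",
    "Hanser 1994", "Hendricks 1999", "Hendricks 2001", "Radulovic 1996",
    "Zerhusen 1995",
]

def judgements(low=(), high=()):
    """Per-study judgement for one domain; unlisted studies are 'unclear'."""
    table = {study: "unclear" for study in STUDIES}
    table.update({study: "low" for study in low})
    table.update({study: "high" for study in high})
    return table

domains = {
    "random sequence generation": judgements(
        low=["Albornoz 2011", "Atiwannapat 2016", "Erkkilä 2011"],
        high=["Radulovic 1996"]),
    "allocation concealment": judgements(
        low=["Albornoz 2011", "Erkkilä 2011", "Hendricks 2001"]),
    "blinding": judgements(low=["Erkkilä 2011"]),
    "incomplete outcome data": judgements(
        low=[s for s in STUDIES if s != "Atiwannapat 2016"],
        high=["Atiwannapat 2016"]),
    "selective reporting": judgements(low=["Albornoz 2011", "Erkkilä 2011"]),
    "other bias": judgements(
        low=["Albornoz 2011", "Atiwannapat 2016", "Erkkilä 2011"]),
}

# Overall high risk: at least one domain judged 'high'.
overall_high = sorted(
    study for study in STUDIES
    if any(domains[d][study] == "high" for d in domains)
)
print(overall_high)  # ['Atiwannapat 2016', 'Radulovic 1996']
```

Under this reading, a sensitivity analysis that excludes studies at overall high risk of bias would drop these two studies from the affected comparisons.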

Effects of interventions

See: Summary of findings for the main comparison Music therapy plus treatment as usual (TAU) versus TAU for depression (primary comparison); Summary of findings 2 Music therapy versus psychological treatment for depression; Summary of findings 3 Active music therapy versus receptive music therapy for depression

Comparison 1. Music therapy plus treatment as usual (TAU) versus TAU alone

Primary outcomes

Severity of depression symptoms (clinician‐rated)

Four studies addressed clinician‐rated severity of depression symptoms in the short term (up to three months) (Albornoz 2011; Chen 1992; Erkkilä 2011; Radulovic 1996). A significant short‐term effect favoured music therapy (standardised mean difference (SMD) ‐0.98, 95% confidence interval (CI) ‐1.69 to ‐0.27, P = 0.007, 3 randomised controlled trials (RCTs), 1 controlled clinical trial (CCT), n = 219, moderate‐quality evidence) (Analysis 1.1). Heterogeneity was high (I² = 83%). See summary of findings Table for the main comparison.

Only one study evaluated the medium‐term effect (up to six months) of clinician‐rated depressive symptoms and found no significant effect (SMD ‐0.38, 95% CI ‐0.87 to 0.12, P = 0.14, 1 RCT, n = 64, moderate‐quality evidence) (Erkkilä 2011) (Analysis 1.1).
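
For readers who wish to check how a reported P value relates to an SMD and its 95% CI, the following Python sketch back‐calculates the standard error and a two‐sided P value under a normal approximation. The function name is illustrative, and the normal approximation is an assumption of this sketch; because the published estimates are rounded, recovered values can differ slightly from those reported.

```python
import math

def p_from_ci(estimate, lower, upper, z_crit=1.959964):
    """Back-calculate SE, z, and two-sided P from an estimate and its 95% CI."""
    se = (upper - lower) / (2 * z_crit)
    z = estimate / se
    # Two-sided P value: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p

# Short-term clinician-rated depression: SMD -0.98, 95% CI -1.69 to -0.27.
se, z, p = p_from_ci(-0.98, -1.69, -0.27)
print(round(se, 3), round(z, 2), round(p, 3))
```

For the short‐term result, the recovered P value rounds to 0.007, matching the value reported above.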

Severity of depression symptoms (patient‐reported)

In total, four studies evaluated patient‐reported severity of symptoms. Three studies used the Beck Depression Inventory (BDI) (Albornoz 2011; Radulovic 1996; Zerhusen 1995). Hanser 1994 used the Geriatric Depression Scale (GDS). At short term, a significant effect favoured music therapy in patient‐reported severity of symptoms (SMD ‐0.85, 95% CI ‐1.37 to ‐0.34, P = 0.001, 3 RCTs, 1 CCT, n = 142, moderate‐quality evidence) (Analysis 1.2). Heterogeneity was moderate (I² = 49%). See summary of findings Table for the main comparison. Researchers reported no data at medium term (Analysis 1.2).

Adverse events

One RCT provided data concerning this outcome, revealing no significant evidence that music therapy was associated with more or fewer adverse events than treatment as usual in the short term (odds ratio (OR) 0.45, 95% CI 0.02 to 11.46, P = 0.63, n = 79, low‐quality evidence) or in the medium term (OR 0.69, 95% CI 0.06 to 7.91, P = 0.76, n = 79, low‐quality evidence) (Erkkilä 2011) (Analysis 1.3). See summary of findings Table for the main comparison.
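
Confidence intervals for odds ratios are conventionally constructed on the logarithmic scale, which is why the intervals above look strongly asymmetric around the point estimate. As a minimal Python sketch (the function name is illustrative, and a normal approximation on the log scale is assumed), the geometric mean of the two limits should reproduce the reported OR to within rounding:

```python
import math

def or_log_scale_check(lower, upper, z_crit=1.959964):
    """Return the geometric midpoint of an OR's CI and the SE of log(OR)."""
    midpoint = math.sqrt(lower * upper)  # centre of the CI on the log scale
    se_log = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    return midpoint, se_log

# Medium-term adverse events: OR 0.69, 95% CI 0.06 to 7.91.
midpoint, se_log = or_log_scale_check(0.06, 7.91)
p = math.erfc(abs(math.log(0.69)) / se_log / math.sqrt(2))
print(round(midpoint, 2), round(se_log, 2), round(p, 2))
```

The wide interval reflects how few events contributed to this outcome; the recovered P value is close to, but because of rounding in the published limits not necessarily identical with, the reported value.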

Secondary outcomes

Functioning

Only one RCT measured level of functioning using the Global Assessment of Functioning scale (GAF) (Erkkilä 2011). A significant effect favoured music therapy in the short term (SMD 0.51, 95% CI 0.02 to 1.00, P = 0.04, n = 67, low‐quality evidence). In contrast, investigators found no significant differences in the medium term (SMD 0.38, 95% CI ‐0.12 to 0.88, P = 0.13, n = 64, low‐quality evidence). Heterogeneity was not applicable (Analysis 1.4). See summary of findings Table for the main comparison.

Quality of life

Only one study used the health‐related quality of life survey distributed by RAND (RAND‐36) to evaluate quality of life (Erkkilä 2011). Researchers found no significant differences between the music therapy group and the treatment as usual group, both in the short term (SMD 0.32, 95% CI ‐0.17 to 0.80, P = 0.20, n = 67, low‐quality evidence) and in the medium term (SMD 0.26, 95% CI ‐0.23 to 0.76, P = 0.30, n = 64, low‐quality evidence) (Analysis 1.5). Heterogeneity was not applicable. See summary of findings Table for the main comparison.

Leaving the study early

Included studies reported no significant differences in rates of leaving the study early between participants who attended music therapy and those in the treatment as usual group at short term (OR 0.49, 95% CI 0.14 to 1.70, P = 0.26, 5 RCTs, 1 CCT, n = 293, moderate‐quality evidence). Heterogeneity was low (I² = 0%) (Analysis 1.6). At medium term, only Erkkilä 2011 reported events of leaving the study early and noted no significant differences (OR 0.44, 95% CI 0.13 to 1.53, P = 0.20, n = 79, moderate‐quality evidence). Heterogeneity was not applicable (Analysis 1.6). See summary of findings Table for the main comparison.

Anxiety

Three studies evaluated anxiety in the short term (Chen 1992; Erkkilä 2011; Radulovic 1996). Chen 1992 and Radulovic 1996 used the Hamilton Anxiety Scale (HAM‐A) to assess outcome measures, and Erkkilä 2011 used the Hospital Anxiety and Depression Scale ‐ Anxiety (HADS‐A). Trialists reported a significant reduction in anxiety favouring music therapy in the short term (SMD ‐0.74, 95% CI ‐1.40 to ‐0.08, P = 0.03, 2 RCTs, 1 CCT, n = 195, low‐quality evidence) (Analysis 1.7). Heterogeneity was high (I² = 80%). As with the outcome of clinician‐rated depression reported above, Chen 1992 showed the strongest positive effect, again possibly as a result of the geriatric population included or the large number of music therapy sessions provided.

Erkkilä 2011 also measured anxiety symptoms in the medium term and found no significant differences between treatment groups (SMD ‐0.40, 95% CI ‐0.90 to 0.10, P = 0.12, n = 64, moderate‐quality evidence) (Analysis 1.7). See summary of findings Table for the main comparison.

Self‐esteem

One study measured self‐esteem using the Rosenberg Self‐Esteem Inventory (RSE) (Hanser 1994). Results showed no significant differences between music therapy and treatment as usual groups (SMD ‐0.63, 95% CI ‐1.53 to 0.27, P = 0.17, n = 20, low‐quality evidence) (Analysis 1.8). Heterogeneity was not applicable. No data were available at medium term (Analysis 1.8).

Costs or cost‐effectiveness

We found no eligible studies addressing this outcome.

Satisfaction with treatment

We found no eligible studies addressing this outcome.

Comparison 2. Music therapy versus psychological therapy

Primary outcomes

Severity of depression symptoms (clinician‐rated)

One RCT measured severity of depressive symptoms at both short term and medium term (Atiwannapat 2016). Upon combining data regarding active and receptive music therapy approaches, we found no significant differences in comparison with psychological therapy (short‐term: SMD ‐0.78, 95% CI ‐2.36 to 0.81, P = 0.34, n = 11, very low‐quality evidence; medium‐term: SMD ‐1.11, 95% CI ‐2.74 to 0.53, P = 0.19, n = 11, very low‐quality evidence) (Analysis 2.1). Heterogeneity was not applicable. See summary of findings Table 2.

Severity of depression symptoms (patient‐reported)

Investigators found no significant differences in patient‐reported severity of depression symptoms, either at short term or at medium term. In particular, four RCTs evaluated changes in symptoms at short term (SMD ‐1.28, 95% CI ‐3.75 to 1.02, P = 0.28, n = 131, low‐quality evidence) (Atiwannapat 2016; Hendricks 1999; Hendricks 2001; Zerhusen 1995). Heterogeneity was high (I² = 96%). Only Atiwannapat 2016 evaluated patient‐reported symptoms at medium term, noting no significant effects (SMD ‐0.68, 95% CI ‐2.26 to 0.89, P = 0.40, n = 11, very low‐quality evidence) (Analysis 2.2). See summary of findings Table 2.

Adverse events

We found no eligible evidence addressing this outcome. See summary of findings Table 2.

Secondary outcomes

Functioning

We found no eligible evidence addressing this outcome. See summary of findings Table 2.

Quality of life

Only one study evaluated quality of life using the Thai version of Short Form (SF)‐36 (Atiwannapat 2016). In the short term, researchers found no significant differences between music therapy groups and psychological therapy groups (SMD 1.31, 95% CI ‐0.36 to 2.99, P = 0.12, n = 11, very low‐quality evidence) (Analysis 2.3). Heterogeneity was not applicable. Investigators also found no significant effects in the medium term (SMD 0.93, 95% CI ‐0.67 to 2.54, P = 0.25, n = 11, very low‐quality evidence) (Analysis 2.3). See summary of findings Table 2.

Leaving the study early

Four included studies recruited a total of 137 participants (Atiwannapat 2016; Hendricks 1999; Hendricks 2001; Zerhusen 1995). At short term, one participant in the music therapy group and three participants in the psychological therapy group left the study early (OR 0.17, 95% CI 0.02 to 1.49, P = 0.11, n = 137, moderate‐quality evidence). Heterogeneity was low (I² = 0%) (Analysis 2.4). Of note, in Hendricks 1999, two participants who were initially randomised left the study early. However, the study report does not specify to which group these participants belonged; we therefore decided to consider this missing information and assigned a value of zero. At medium term, data from Atiwannapat 2016 revealed no additional dropouts compared with the number reported at three‐month follow‐up, and showed no statistically significant differences between music therapy and psychological therapy groups (OR 0.11, 95% CI 0.01 to 1.92, P = 0.13, n = 14, very low‐quality evidence) (Analysis 2.4). See summary of findings Table 2.

Anxiety

We found no eligible studies addressing this outcome. See summary of findings Table 2.

Self‐esteem

We found no eligible studies addressing this outcome.

Costs or cost‐effectiveness

We found no eligible studies addressing this outcome.

Satisfaction with treatment

We found no eligible studies addressing this outcome.

Comparison 3. Active music therapy versus receptive music therapy

Primary outcomes
Severity of depression symptoms (clinician‐rated)

One RCT evaluated severity of depression symptoms in active and receptive music therapy (Atiwannapat 2016). An expert clinician administered the MADRS to study participants. Investigators found no significant differences between the two music therapy interventions in the short term (SMD ‐0.52, 95% CI ‐1.87 to 0.83, P = 0.45, n = 9, very low‐quality evidence) or in the medium term (SMD ‐0.64, 95% CI ‐2.02 to 0.73, P = 0.36, n = 9, very low‐quality evidence) (Analysis 3.1). Heterogeneity was not applicable. See summary of findings Table 3.

Severity of depression symptoms (patient‐reported)

Atiwannapat 2016 used the Thai Depression Inventory (TDI) to evaluate patient‐reported depressive symptoms. Trialists found no statistically significant differences between active and receptive music therapy groups in the short term (SMD ‐0.01, 95% CI ‐1.33 to 1.30, P = 0.98, n = 9, very low‐quality evidence). In the medium term, analysis likewise showed no differences between the two groups (SMD ‐0.16, 95% CI ‐1.48 to 1.16, P = 0.82, n = 9, very low‐quality evidence) (Analysis 3.2). See summary of findings Table 3.

Adverse events

We found no eligible studies addressing this outcome. See summary of findings Table 3.

Secondary outcomes
Functioning

We found no eligible studies addressing this outcome. See summary of findings Table 3.

Quality of life

Atiwannapat 2016 evaluated quality of life using the Thai version of SF‐36. Investigators found no significant differences between active music therapy and receptive music therapy in the short term (SMD ‐0.24, 95% CI ‐1.57 to 1.08, P = 0.72, n = 9, very low‐quality evidence) (Analysis 3.3). They also found no significant effects in the medium term (SMD 0.02, 95% CI ‐1.29 to 1.34, P = 0.97, n = 9, very low‐quality evidence) (Analysis 3.3). Heterogeneity was not applicable. One study including nine participants contributed data to this comparison. See summary of findings Table 3.

Leaving the study early

According to Atiwannapat 2016, the number of dropouts was higher in the receptive music therapy group, with one participant leaving the study in the first three months. By contrast, no participants in the active music therapy group left the study early. However, this difference was not statistically significant (OR 0.27, 95% CI 0.01 to 8.46, P = 0.46, n = 10, very low‐quality evidence) (Analysis 3.4). Heterogeneity was not applicable. See summary of findings Table 3.
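As an illustration only, the odds ratio reported above can be approximately reproduced from the dropout counts. The sketch below makes two assumptions that the text does not state: that each arm had five participants (inferred from n = 10 and the assumed risk of 200 per 1000 in the receptive group in summary of findings Table 3), and that 0.5 was added to every cell of the 2×2 table because one cell is zero. The function name `odds_ratio_ci` is hypothetical.

```python
import math

def odds_ratio_ci(events_a, n_a, events_b, n_b, z=1.96, correction=0.5):
    """Odds ratio (group A versus group B) with a Wald 95% CI on the log scale.

    If any cell of the 2x2 table is zero, `correction` is added to every
    cell (an assumed continuity correction, not stated in the review text).
    """
    a, b = events_a, n_a - events_a   # group A: events, non-events
    c, d = events_b, n_b - events_b   # group B: events, non-events
    if 0 in (a, b, c, d):
        a, b, c, d = (x + correction for x in (a, b, c, d))
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return (math.exp(log_or),
            math.exp(log_or - z * se),
            math.exp(log_or + z * se))

# Assumed arm sizes: 5 active (0 dropouts) and 5 receptive (1 dropout).
or_, lo, hi = odds_ratio_ci(0, 5, 1, 5)
print(f"OR {or_:.2f}, 95% CI {lo:.2f} to {hi:.2f}")  # OR 0.27, 95% CI 0.01 to 8.46
```

The very wide interval reflects the single event among ten participants, which is consistent with the very low quality rating for this outcome.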

Anxiety

We found no eligible studies addressing this outcome. See summary of findings Table 3.

Self‐esteem

We found no eligible studies addressing this outcome.

Costs or cost‐effectiveness

We found no eligible studies addressing this outcome.

Satisfaction with treatment

We found no eligible studies addressing this outcome.

Sensitivity analyses

We conducted sensitivity analyses to determine the impact of the risk of bias of included studies on primary outcomes.

For Comparison 1 (Music therapy plus TAU vs TAU), removing the study at high risk of bias from the meta‐analysis did not change the significance of effects (Radulovic 1996). The effect estimate for clinician‐rated depression symptoms (Analysis 1.1) became larger (SMD ‐1.12, 95% CI ‐2.10 to ‐0.14, P = 0.03, n = 159); and the effect estimate for patient‐reported depression symptoms (Analysis 1.2) remained similar (SMD ‐0.98, 95% CI ‐1.82 to ‐0.14, P = 0.02, n = 82), both in favour of music therapy plus TAU. Heterogeneity remained high for clinician‐rated depression (I2 = 86%), and moderate for patient‐reported depression symptoms (I2 = 66%).

For Comparison 2 (Music therapy vs psychological therapy), removing the study at high risk of bias did not change the non‐significance of effects (Atiwannapat 2016). No studies remained for clinician‐rated depression symptoms or for patient‐reported depression symptoms at medium term. Effects on patient‐reported depression symptoms at short term (Analysis 2.2) remained non‐significant (SMD ‐1.41, 95% CI ‐4.26 to 1.44, P = 0.33, n = 120). Heterogeneity remained high (I2 = 97%).

In summary, sensitivity analyses did not change results.
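As a rough plausibility check, a two‐sided P value can be recovered from an effect estimate and its 95% CI under a normal approximation. The sketch below is not the method used by the review authors; the function name `p_from_ci` is hypothetical, and the calculation assumes a symmetric interval built with a critical value of 1.96.

```python
import math

def p_from_ci(estimate, lower, upper, z_crit=1.96):
    """Approximate two-sided P value from an estimate and its 95% CI.

    Assumes the interval is symmetric on the analysis scale (true for SMDs;
    odds ratios would need to be log-transformed first).
    """
    se = (upper - lower) / (2 * z_crit)        # standard error implied by the CI
    z = estimate / se                          # Wald z statistic
    return math.erfc(abs(z) / math.sqrt(2))    # two-sided normal tail area

# Sensitivity-analysis estimates for Comparison 1, as reported in the text.
for label, est, lo, hi in [
    ("clinician-rated", -1.12, -2.10, -0.14),
    ("patient-reported", -0.98, -1.82, -0.14),
]:
    print(f"{label}: P = {p_from_ci(est, lo, hi):.2f}")
```

With these inputs the approximation gives 0.03 and 0.02, matching the reported values. Small discrepancies elsewhere can arise because published estimates and intervals are rounded to two decimals.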

Discussion

Summary of main results

Comparison 1. Music therapy plus treatment as usual (TAU) versus TAU

Review authors found a significant short‐term effect of music therapy combined with treatment as usual versus treatment as usual alone according to both clinician‐rated and patient‐reported measures of depressive symptoms. The effect sizes found can be interpreted in accordance with common guidelines for interventions in the behavioural sciences (Cohen 1988), by which effect sizes of up to 0.2 are considered small, those around 0.5 medium, and those at 0.8 and above large.

Our results show a large effect size of music therapy for clinician‐rated depressive symptoms (standardised mean difference (SMD) ‐0.98; moderate‐quality evidence). The effect size translates to a difference of 9.8 points on the Hamilton Rating Scale for Depression (HAM‐D), which normally has a standard deviation (SD) of around 10. This is likely to be a clinically important difference. We found a large effect size for music therapy (SMD ‐0.85), with moderate quality of evidence, when depressive symptoms were evaluated by means of self‐reported instruments. This effect size can be translated to a change of 8.5 points on the Beck Depression Inventory (BDI). This difference is also likely to be clinically relevant. The beneficial effect of music therapy did not seem to be maintained in the medium term. However, only one study evaluated depressive symptoms over a period of six months (Erkkilä 2011), showing a trend towards significance in favour of music therapy. Music therapy was not associated with more or fewer adverse events than treatment as usual, with low quality of evidence.

In the short term, we found a significant reduction in anxiety symptoms, with a medium effect size (SMD ‐0.74; low‐quality evidence). This effect size translates to a change of about 5 points on the Hamilton Anxiety Scale (HAM‐A), which has an SD ≈ 7. This is likely to be a clinically relevant effect. Level of functioning also improved in the short term, with a medium effect size (SMD 0.51; low‐quality evidence). The effect size translates to a change of about 5 points on the Global Assessment of Functioning scale (GAF) (SD ≈ 10), which could be clinically relevant. We found no differences between music therapy added to treatment as usual and treatment as usual alone in terms of quality of life, self‐esteem, or number of adverse events, with low quality of evidence. The proportion of participants who left the study early did not significantly differ between music therapy plus TAU and TAU alone groups, and the quality of evidence was moderate.
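The back‐translation used in this summary (an SMD multiplied by a typical SD of the scale) can be sketched as follows. This is only an illustration of the arithmetic: the function name `smd_to_points` is hypothetical, and the SD of 10 for the BDI is an assumption implied by, but not stated in, the text above.

```python
def smd_to_points(smd, scale_sd):
    """Re-express a standardised mean difference in points on a given scale."""
    return abs(smd) * scale_sd

examples = {
    "HAM-D, clinician-rated depression": (-0.98, 10),  # SD of about 10 per the text
    "BDI, patient-reported depression": (-0.85, 10),   # SD of 10 is assumed
    "GAF, functioning": (0.51, 10),                     # SD of about 10 per the text
}

for scale, (smd, sd) in examples.items():
    print(f"{scale}: about {smd_to_points(smd, sd):.1f} points")
```

The result depends directly on the SD chosen for each scale, so the converted values should be read as approximate.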

Comparison 2. Music therapy versus psychological therapy

Review authors noted no significant differences between music therapy and psychological therapy in severity of depressive symptoms, for both clinician‐rated (very low‐quality evidence) and patient‐reported outcomes (low‐quality evidence). Also, we found no differences in quality of life and in the number of participants who left the study early, with evidence of very low and moderate quality, respectively. No studies reported the number of adverse events, and no studies measured anxiety and level of functioning.

Comparison 3. Active music therapy versus receptive music therapy

Review authors found no significant differences between active and receptive music therapy in severity of depressive symptoms for both clinician‐rated and patient‐reported outcomes. We also noted no differences in quality of life and in the number of participants who left the study early. Quality of evidence was very low for all outcomes. No studies reported the number of adverse events, and no studies measured anxiety and level of functioning.

Overall completeness and applicability of evidence

The present review included nine studies with a total of 421 participants. Of these, we included 411 participants in the meta‐analysis. These individuals belonged to almost all age groups that could be affected by a depressive disorder, from adolescents to older people. However, investigators did not always report the specific type of depressive disorder, and, in some cases, expert clinicians did not perform diagnosis according to valid diagnostic criteria. Although the included studies comprised participants from a broad age range, it would be useful to evaluate the effects of music therapy in larger samples of adults with a specific diagnosis of major depressive disorder, a chronic and severe condition in which patients might benefit from music therapy more than in minor depressive disorders. Additionally, included studies did not evaluate depression in children, and future researchers should take this into consideration. As clinical depression is not usually diagnosed in children, a future review may need to apply wider inclusion criteria to encompass studies that include this group. The largest randomised controlled trial (RCT) of music therapy ever completed (n = 251) found that children with emotional and behavioural problems aged 8 to 16 years who had received music therapy alongside treatment as usual had significantly reduced symptoms of depression compared with those who did not (Porter 2016).

With regard to the intervention, review authors considered active, receptive, and mixed music therapy methods. Seven studies conducted music therapy sessions in groups, and only two studies provided individual sessions. This aspect is important to consider because individual music therapy can be personalised and tailored to the characteristics of individual patients and might show a more beneficial effect than group sessions.

It is important to mention that only single studies addressed some outcomes (i.e. self‐esteem, level of functioning), making these results not generalisable.

Quality of the evidence

Review authors rated quality of evidence for all comparisons using the five GRADE considerations (study limitations, consistency of effect, indirectness, imprecision, and publication bias; Schünemann 2009).

Limitations in study design or execution (risk of bias)

Concerning the main comparison, we downgraded the quality of evidence for the following outcomes for risk of bias (e.g. unclear randomisation, allocation concealment, blinding, absence of a study protocol): clinician‐rated depression, patient‐reported depression, leaving the study early, and anxiety. For both comparisons 2 and 3, we downgraded clinician‐rated depression, patient‐reported depression, quality of life, and leaving the study early by one level for similar reasons.

Inconsistency of results

We downgraded the quality of evidence by one level for inconsistency concerning anxiety in the main comparison and for patient‐reported depression in the second comparison, because we noted some variation in effect sizes, little or no overlap of confidence intervals, and high heterogeneity. Heterogeneity could be explained by differences in age groups, methods of music therapy, professionals providing music therapy, and quality of studies.

Indirectness of evidence

All included trials addressed the main review questions (PICO): treatment of depression in men and women of any age group who received a music therapy intervention. Therefore, we did not downgrade any outcome in any comparison for indirectness of evidence.

Imprecision

For the main comparison, we downgraded the quality of evidence for adverse events, functioning, and quality of life by two levels owing to wide confidence intervals, although we based this decision on an adequately powered, well‐performed, and well‐reported trial (Erkkilä 2011). For the second comparison, we downgraded clinician‐rated depression and quality of life by two levels owing to small sample size. For the same reason, we downgraded by two levels the quality of evidence for clinician‐rated depression, patient‐reported depression, quality of life, and leaving the study early in the third comparison.

Publication bias

For all comparisons and for all outcomes, we did not downgrade the quality of evidence for publication bias, as we did not detect publication bias. We asked experts and known researchers in the field whether they were aware of reported or ongoing trials on music therapy for depression. We did not produce a funnel plot to assess possible publication bias, as the total number of studies (fewer than 10) meant that application of a formal test of asymmetry was not appropriate (Sterne 2011).

Potential biases in the review process

We undertook an extensive search of databases and additional sources and applied no restrictions concerning nationality or language within the search process; thus we believe that we have identified and included in the present systematic review all potentially relevant studies. We translated non‐English abstracts into English for assessment of eligibility. We translated possibly relevant and relevant non‐English full‐text study reports into English, to finalise the eligibility process. We included relevant non‐English articles in the review. Furthermore, at least two review authors systematically extracted and managed trial data.

Some reports did not provide all information required to perform the meta‐analysis (e.g. pre‐post values, standard deviations). In such cases, as stated in the methods and results sections, we searched previously published literature (e.g. validation studies, similar studies with large sample sizes) to retrieve the missing information. This may limit the accuracy of the results.

We attempted to conduct a comprehensive search for studies, but the fact that studies found in the updated search have not yet been incorporated may present a source of potential bias.

Agreements and disagreements with other studies or reviews

In the current literature, we found a limited number of studies for comparison. Our meta‐analysis confirms the conclusions of Maratos 2008 and Assche 2015, which showed a beneficial effect of music therapy on depressive symptoms. However, the previous Cochrane review only narratively reviewed the efficacy of music therapy for depression (Maratos 2008). Our findings are strengthened by the fact that data provided by included studies were meta‐analysed. Assche 2015 did not meta‐analyse data from included studies. Zhao 2016 considered music therapy for older adults only and included trials that assessed participants and interventions that were not eligible for this review. In a broader review, Gold 2009 noted beneficial effects of music therapy on depressive symptoms, with a meta‐regression analysis suggesting that effects increased with the number of sessions.

Figure 1. PRISMA flow diagram.

Figure 2. Risk of bias graph: review authors' judgements about each risk of bias item presented as percentages across all included studies.

Figure 3. Risk of bias summary: review authors' judgements about each risk of bias item for each included study.

Analysis 1.1. Comparison 1 Music therapy plus TAU versus TAU alone (primary comparison), Outcome 1 Severity of depression symptoms, clinician‐rated (primary outcome; high=poor).

Analysis 1.2. Comparison 1 Music therapy plus TAU versus TAU alone (primary comparison), Outcome 2 Severity of depression symptoms, patient‐reported (primary outcome; high=poor).

Analysis 1.3. Comparison 1 Music therapy plus TAU versus TAU alone (primary comparison), Outcome 3 Any adverse event.

Analysis 1.4. Comparison 1 Music therapy plus TAU versus TAU alone (primary comparison), Outcome 4 Functioning (high=good).

Analysis 1.5. Comparison 1 Music therapy plus TAU versus TAU alone (primary comparison), Outcome 5 Quality of life (high=good).

Analysis 1.6. Comparison 1 Music therapy plus TAU versus TAU alone (primary comparison), Outcome 6 Leaving the study early.

Analysis 1.7. Comparison 1 Music therapy plus TAU versus TAU alone (primary comparison), Outcome 7 Anxiety (high=poor).

Analysis 1.8. Comparison 1 Music therapy plus TAU versus TAU alone (primary comparison), Outcome 8 Self‐esteem (high=good).

Analysis 2.1. Comparison 2 Music therapy versus psychological therapy, Outcome 1 Severity of depressive symptoms, clinician‐rated (primary outcome; high=poor).

Analysis 2.2. Comparison 2 Music therapy versus psychological therapy, Outcome 2 Severity of depressive symptoms, patient‐reported (primary outcome; high=poor).

Analysis 2.3. Comparison 2 Music therapy versus psychological therapy, Outcome 3 Quality of life (high=good).

Analysis 2.4. Comparison 2 Music therapy versus psychological therapy, Outcome 4 Leaving the study early.

Analysis 3.1. Comparison 3 Active music therapy versus receptive music therapy, Outcome 1 Severity of depressive symptoms, clinician‐reported (primary outcome; high=poor).

Analysis 3.2. Comparison 3 Active music therapy versus receptive music therapy, Outcome 2 Severity of depressive symptoms, patient‐reported (primary outcome; high=poor).

Analysis 3.3. Comparison 3 Active music therapy versus receptive music therapy, Outcome 3 Quality of life (high=good).

Analysis 3.4. Comparison 3 Active music therapy versus receptive music therapy, Outcome 4 Leaving the study early.

Summary of findings for the main comparison. Music therapy plus treatment as usual (TAU) versus TAU for depression (primary comparison)

Music therapy plus treatment as usual (TAU) versus TAU

Patient or population: individuals with depression
Setting: any setting
Intervention: music therapy plus treatment as usual
Comparison: treatment as usual

| Outcomes | Anticipated absolute effects* (95% CI): risk with treatment as usual | Anticipated absolute effects* (95% CI): risk with music therapy | Relative effect (95% CI) | No. of participants (studies) | Quality of the evidence (GRADE) | Comments |
| --- | --- | --- | --- | --- | --- | --- |
| Depressive symptoms (clinician‐rated) (various scales). Up to 3 months | | Mean clinician‐rated depressive symptoms in the intervention group were SMD 0.98 SD lower (1.69 lower to 0.27 lower). | | 219 (3 RCTs; 1 CCT) | ⊕⊕⊕⊝ MODERATE a | Lower score equals a better outcome. SMD corresponds to a large effect size. |
| Depressive symptoms (patient‐reported) (various scales). Up to 3 months | | Mean patient‐reported depressive symptoms in the intervention group were SMD 0.85 SD lower (1.37 lower to 0.34 lower). | | 142 (3 RCTs; 1 CCT) | ⊕⊕⊕⊝ MODERATE a | Lower score equals a better outcome. SMD corresponds to a large effect size. |
| Any adverse events. Up to 3 months | Study population: 22 per 1000 | 10 per 1000 (0 to 203) | OR 0.45 (0.02 to 11.46) | 79 (1 RCT) | ⊕⊕⊝⊝ LOW b | |
| Functioning (GAF). Up to 3 months | | Mean functioning in the intervention group was SMD 0.51 SD higher (0.02 higher to 1 higher). | | 67 (1 RCT) | ⊕⊕⊝⊝ LOW b | Higher score equals a better outcome. SMD corresponds to a moderate effect size. |
| Quality of life (RAND‐36). Up to 3 months | | Mean quality of life in the intervention group was SMD 0.32 SD higher (0.17 lower to 0.80 higher). | | 67 (1 RCT) | ⊕⊕⊝⊝ LOW b | Higher score equals a better outcome. |
| Leaving the study early. Up to 3 months | Study population: 65 per 1000 | 33 per 1000 (10 to 106) | OR 0.49 (0.14 to 1.70) | 293 (5 RCTs; 1 CCT) | ⊕⊕⊕⊝ MODERATE a | |
| Anxiety (HADS‐A). Up to 3 months | | Mean anxiety in the intervention group was SMD 0.74 SD lower (1.40 lower to 0.08 lower). | | 195 (2 RCTs; 1 CCT) | ⊕⊕⊝⊝ LOW a,c | Lower score equals a better outcome. SMD corresponds to a moderate effect size. |

*The risk in the intervention group (and its 95% confidence interval) is based on the assumed risk in the comparison group and the relative effect of the intervention (and its 95% CI).

CCT: controlled clinical trial; CI: confidence interval; GAF: Global Assessment of Functioning scale; HADS‐A: Hospital Anxiety and Depression Scale ‐ Anxiety; OR: odds ratio; RAND‐36: health‐related quality of life survey distributed by RAND; RCT: randomised controlled trial; RR: risk ratio; SD: standard deviation; SMD: standardised mean difference.

GRADE Working Group grades of evidence.
High quality: We are very confident that the true effect lies close to that of the estimate of the effect.
Moderate quality: We are moderately confident in the effect estimate: The true effect is likely to be close to the estimate of the effect, but there is a possibility that it is substantially different.
Low quality: Our confidence in the effect estimate is limited: The true effect may be substantially different from the estimate of the effect.
Very low quality: We have very little confidence in the effect estimate: The true effect is likely to be substantially different from the estimate of effect.

a Downgraded one level for unclear randomisation, allocation concealment, blinding, missing study protocol.
b Downgraded two levels for wide confidence intervals, although adequately powered, well‐performed trial.
c Downgraded one level for variation in effect sizes, little or no overlap of confidence intervals, high heterogeneity.
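The footnote to this table says that the risk in the intervention group is derived from the assumed comparison risk and the relative effect. A minimal sketch of that conversion for an odds ratio follows, using the "Leaving the study early" row (65 per 1000, OR 0.49, 95% CI 0.14 to 1.70). The function name `risk_per_1000` is hypothetical, and the calculation is the standard odds‐to‐risk conversion rather than a description of the software the review authors used.

```python
def risk_per_1000(assumed_per_1000, odds_ratio):
    """Convert an assumed comparison-group risk and an odds ratio into the
    corresponding intervention-group risk, expressed per 1000 people."""
    p0 = assumed_per_1000 / 1000
    odds1 = odds_ratio * p0 / (1 - p0)   # intervention-group odds
    return 1000 * odds1 / (1 + odds1)    # back to a risk per 1000

# "Leaving the study early": assumed risk 65 per 1000, OR 0.49 (0.14 to 1.70).
point, low, high = (round(risk_per_1000(65, x)) for x in (0.49, 0.14, 1.70))
print(f"{point} per 1000 ({low} to {high})")  # 33 per 1000 (10 to 106)
```

Because the assumed risk is itself rounded in the table, recomputed values for other rows may differ from the published ones by a few units per 1000.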
Summary of findings 2. Music therapy versus psychological treatment for depression

Music therapy versus psychological treatment for depression

Patient or population: adults with depression
Setting: any setting
Intervention: music therapy
Comparison: psychological therapy (counselling, cognitive‐behavioural therapy)

| Outcomes | Anticipated absolute effects* (95% CI): risk with psychological treatment | Anticipated absolute effects* (95% CI): risk with music therapy | Relative effect (95% CI) | No. of participants (studies) | Quality of the evidence (GRADE) | Comments |
| --- | --- | --- | --- | --- | --- | --- |
| Depressive symptoms (clinician‐rated) (MADRS). Up to 3 months | | Mean clinician‐rated depressive symptoms in the intervention group was SMD 0.78 SD lower (2.36 lower to 0.81 higher). | | 11 (1 RCT) | ⊕⊝⊝⊝ VERY LOW a,b | Lower score equals better outcome. SMD corresponds to a large effect size. |
| Depressive symptoms (patient‐reported) (various scales). Up to 3 months | | Mean patient‐reported depressive symptoms in the intervention group were SMD 1.28 SD lower (3.57 lower to 1.02 higher). | | 131 (4 RCTs) | ⊕⊕⊝⊝ LOW a,c | Lower score equals better outcome. SMD corresponds to a large effect size. |
| Any adverse events ‐ not reported | | | | | | |
| Functioning ‐ not reported | | | | | | |
| Quality of life (Thai RAND‐36). Up to 3 months | | Mean quality of life in the intervention group was SMD 1.31 SD higher (0.36 lower to 2.99 higher). | | 11 (1 RCT) | ⊕⊝⊝⊝ VERY LOW a,b | Higher score equals better outcome. |
| Leaving the study early. Up to 3 months | Study population: 35 per 1000 | 9 per 1000 (1 to 77) | OR 0.17 (0.02 to 1.49) | 157 (4 RCTs) | ⊕⊕⊕⊝ MODERATE a | |
| Anxiety ‐ not reported | | | | | | |

*The risk in the intervention group (and its 95% confidence interval) is based on the assumed risk in the comparison group and the relative effect of the intervention (and its 95% CI).

CI: confidence interval; MADRS: Montgomery‐Åsberg Depression Rating Scale; OR: odds ratio; RAND‐36: health‐related quality of life survey distributed by RAND; RCT: randomised controlled trial; RR: risk ratio; SD: standard deviation; SMD: standardised mean difference.

GRADE Working Group grades of evidence.
High quality: We are very confident that the true effect lies close to that of the estimate of the effect.
Moderate quality: We are moderately confident in the effect estimate: The true effect is likely to be close to the estimate of the effect, but there is a possibility that it is substantially different.
Low quality: Our confidence in the effect estimate is limited: The true effect may be substantially different from the estimate of the effect.
Very low quality: We have very little confidence in the effect estimate: The true effect is likely to be substantially different from the estimate of effect.

a Downgraded one level for limitations in design such as unclear allocation concealment, blinding, incomplete outcome data, missing protocol.
b Downgraded two levels for small sample size.
c Downgraded one level for non‐overlap of confidence intervals, high heterogeneity (P < 0.00001); I2 = 96%.
Summary of findings 3. Active music therapy versus receptive music therapy for depression

Active music therapy versus receptive music therapy for depression

Patient or population: adults with depression
Setting: any setting
Intervention: active music therapy
Comparison: receptive music therapy

| Outcomes | Anticipated absolute effects* (95% CI): risk with receptive music therapy | Anticipated absolute effects* (95% CI): risk with active music therapy | Relative effect (95% CI) | No. of participants (studies) | Quality of the evidence (GRADE) | Comments |
| --- | --- | --- | --- | --- | --- | --- |
| Depressive symptoms (clinician‐rated) (MADRS). Up to 3 months | | Mean clinician‐rated depressive symptoms in the intervention group were SMD 0.52 SD lower (1.87 lower to 0.83 higher). | | 9 (1 RCT) | ⊕⊝⊝⊝ VERY LOW a,b | Lower score equals a better outcome. |
| Depressive symptoms (patient‐reported) (TDI). Up to 3 months | | Mean patient‐reported depressive symptoms in the intervention group were SMD 0.01 SD lower (1.33 lower to 1.3 higher). | | 9 (1 RCT) | ⊕⊝⊝⊝ VERY LOW a,b | Lower score equals a better outcome. |
| Any adverse events ‐ not reported | | | | | | |
| Functioning ‐ not reported | | | | | | |
| Quality of life (SF‐36 Thai). Up to 3 months | | Mean quality of life in the intervention group was SMD 0.24 SD lower (1.57 lower to 1.08 higher). | | 9 (1 RCT) | ⊕⊝⊝⊝ VERY LOW a,b | Higher score equals a better outcome. |
| Leaving the study early. Up to 3 months | Study population: 200 per 1000 | 63 per 1000 (2 to 679) | OR 0.27 (0.01 to 8.46) | 10 (1 RCT) | ⊕⊝⊝⊝ VERY LOW a,b | |
| Anxiety ‐ not reported | | | | | | |

*The risk in the intervention group (and its 95% confidence interval) is based on the assumed risk in the comparison group and the relative effect of the intervention (and its 95% CI).

CI: confidence interval; MADRS: Montgomery‐Åsberg Depression Rating Scale; OR: odds ratio; RCT: randomised controlled trial; RR: risk ratio; SD: standard deviation; SF‐36: Short Form‐36; SMD: standardised mean difference; TDI: Thai Depression Inventory.

GRADE Working Group grades of evidence.
High quality: We are very confident that the true effect lies close to that of the estimate of the effect.
Moderate quality: We are moderately confident in the effect estimate: The true effect is likely to be close to the estimate of the effect, but there is a possibility that it is substantially different.
Low quality: Our confidence in the effect estimate is limited: The true effect may be substantially different from the estimate of the effect.
Very low quality: We have very little confidence in the effect estimate: The true effect is likely to be substantially different from the estimate of effect.

a Downgraded one level for limitations in design such as unclear allocation concealment, blinding, missing protocol.
b Downgraded two levels for small sample size.
Comparison 1. Music therapy plus TAU versus TAU alone (primary comparison)

| Outcome or subgroup title | No. of studies | No. of participants | Statistical method | Effect size |
| --- | --- | --- | --- | --- |
| 1 Severity of depression symptoms, clinician‐rated (primary outcome; high=poor) | 4 | | Std. Mean Difference (IV, Random, 95% CI) | Subtotals only |
| 1.1 short‐term (up to 3 months) | 4 | 219 | Std. Mean Difference (IV, Random, 95% CI) | ‐0.98 [‐1.69, ‐0.27] |
| 1.2 medium‐term (up to 6 months) | 1 | 64 | Std. Mean Difference (IV, Random, 95% CI) | ‐0.38 [‐0.87, 0.12] |
| 2 Severity of depression symptoms, patient‐reported (primary outcome; high=poor) | 4 | | Std. Mean Difference (IV, Random, 95% CI) | Subtotals only |
| 2.1 short‐term (up to 3 months) | 4 | 142 | Std. Mean Difference (IV, Random, 95% CI) | ‐0.85 [‐1.37, ‐0.34] |
| 3 Any adverse event | 1 | | Odds Ratio (M‐H, Random, 95% CI) | Totals not selected |
| 3.1 short‐term (up to 3 months) | 1 | | Odds Ratio (M‐H, Random, 95% CI) | 0.0 [0.0, 0.0] |
| 3.2 medium‐term (up to 6 months) | 1 | | Odds Ratio (M‐H, Random, 95% CI) | 0.0 [0.0, 0.0] |
| 4 Functioning (high=good) | 1 | | Std. Mean Difference (IV, Random, 95% CI) | Totals not selected |
| 4.1 short‐term (up to 3 months) | 1 | | Std. Mean Difference (IV, Random, 95% CI) | 0.0 [0.0, 0.0] |
| 4.2 medium‐term (up to 6 months) | 1 | | Std. Mean Difference (IV, Random, 95% CI) | 0.0 [0.0, 0.0] |
| 5 Quality of life (high=good) | 1 | | Std. Mean Difference (IV, Random, 95% CI) | Subtotals only |
| 5.1 short‐term (up to 3 months) | 1 | 67 | Std. Mean Difference (IV, Random, 95% CI) | 0.32 [‐0.17, 0.80] |
| 5.2 medium‐term (up to 6 months) | 1 | 64 | Std. Mean Difference (IV, Random, 95% CI) | 0.26 [‐0.23, 0.76] |
| 6 Leaving the study early | 6 | | Odds Ratio (M‐H, Random, 95% CI) | Subtotals only |
| 6.1 short‐term (up to 3 months) | 6 | 293 | Odds Ratio (M‐H, Random, 95% CI) | 0.49 [0.14, 1.70] |
| 6.2 medium‐term (up to 6 months) | 1 | 79 | Odds Ratio (M‐H, Random, 95% CI) | 0.44 [0.13, 1.53] |
| 7 Anxiety (high=poor) | 3 | | Std. Mean Difference (IV, Random, 95% CI) | Subtotals only |
| 7.1 short‐term (up to 3 months) | 3 | 195 | Std. Mean Difference (IV, Random, 95% CI) | ‐0.74 [‐1.40, ‐0.08] |
| 7.2 medium‐term (up to 6 months) | 1 | 64 | Std. Mean Difference (IV, Random, 95% CI) | ‐0.40 [‐0.90, 0.10] |
| 8 Self‐esteem (high=good) | 1 | | Std. Mean Difference (IV, Random, 95% CI) | Totals not selected |
| 8.1 short‐term (up to 3 months) | 1 | | Std. Mean Difference (IV, Random, 95% CI) | 0.0 [0.0, 0.0] |
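The pooled estimates for Comparison 1 are reported only as point estimates with 95% confidence intervals. As a rough reading aid, the sketch below back-calculates the implied standard error, z statistic, and two-sided P value for two of the reported rows (outcome 1.1, a standardised mean difference, and outcome 6.1, an odds ratio). It assumes the intervals are normal-approximation (Wald) intervals, and that odds ratio intervals are symmetric on the log scale; the function names are illustrative and not part of the review, and results are approximate because the published figures are rounded.

```python
import math

Z_95 = 1.959964  # two-sided 95% quantile of the standard normal


def se_from_ci(lower, upper, log_scale=False):
    """Approximate standard error implied by a 95% confidence interval.

    Use log_scale=True for ratio measures (e.g. odds ratios), whose
    intervals are assumed symmetric on the log scale.
    """
    if log_scale:
        lower, upper = math.log(lower), math.log(upper)
    return (upper - lower) / (2 * Z_95)


def z_and_p(estimate, lower, upper, log_scale=False):
    """Return (z, two-sided P) under a normal approximation."""
    se = se_from_ci(lower, upper, log_scale)
    point = math.log(estimate) if log_scale else estimate
    z = point / se
    p = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return z, p


# Outcome 1.1: clinician-rated depression, short-term, SMD -0.98 [-1.69, -0.27]
z_smd, p_smd = z_and_p(-0.98, -1.69, -0.27)

# Outcome 6.1: leaving the study early, short-term, OR 0.49 [0.14, 1.70]
z_or, p_or = z_and_p(0.49, 0.14, 1.70, log_scale=True)

print(f"SMD: z = {z_smd:.2f}, P = {p_smd:.4f}")
print(f"OR:  z = {z_or:.2f}, P = {p_or:.4f}")
```

Under these assumptions, the interval for outcome 1.1 excludes zero and the implied P value is below 0.05, whereas the interval for outcome 6.1 includes an odds ratio of 1 and the implied P value is above 0.05, consistent with how the intervals in the table read.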
Comparison 2. Music therapy versus psychological therapy

| Outcome or subgroup title | No. of studies | No. of participants | Statistical method | Effect size |
| --- | --- | --- | --- | --- |
| 1 Severity of depressive symptoms, clinician‐rated (primary outcome; high=poor) | 1 | | Std. Mean Difference (IV, Random, 95% CI) | Totals not selected |
| 1.1 short‐term (up to 3 months) | 1 | | Std. Mean Difference (IV, Random, 95% CI) | 0.0 [0.0, 0.0] |
| 1.2 medium‐term (up to 6 months) | 1 | | Std. Mean Difference (IV, Random, 95% CI) | 0.0 [0.0, 0.0] |
| 2 Severity of depressive symptoms, patient‐reported (primary outcome; high=poor) | 4 | | Std. Mean Difference (IV, Random, 95% CI) | Subtotals only |
| 2.1 short‐term (up to 3 months) | 4 | 131 | Std. Mean Difference (IV, Random, 95% CI) | ‐1.28 [‐3.57, 1.02] |
| 2.2 medium‐term (up to 6 months) | 1 | 11 | Std. Mean Difference (IV, Random, 95% CI) | ‐0.68 [‐2.26, 0.89] |
| 3 Quality of life (high=good) | 1 | | Std. Mean Difference (IV, Random, 95% CI) | Totals not selected |
| 3.1 short‐term (up to 3 months) | 1 | | Std. Mean Difference (IV, Random, 95% CI) | 0.0 [0.0, 0.0] |
| 3.2 medium‐term (up to 6 months) | 1 | | Std. Mean Difference (IV, Random, 95% CI) | 0.0 [0.0, 0.0] |
| 4 Leaving the study early | 4 | | Odds Ratio (M‐H, Random, 95% CI) | Subtotals only |
| 4.1 short‐term (up to 3 months) | 4 | 137 | Odds Ratio (M‐H, Random, 95% CI) | 0.17 [0.02, 1.49] |
| 4.2 medium‐term (up to 6 months) | 1 | 14 | Odds Ratio (M‐H, Random, 95% CI) | 0.11 [0.01, 1.92] |
Comparison 3. Active music therapy versus receptive music therapy

| Outcome or subgroup title | No. of studies | No. of participants | Statistical method | Effect size |
| --- | --- | --- | --- | --- |
| 1 Severity of depressive symptoms, clinician‐reported (primary outcome; high=poor) | 1 | | Std. Mean Difference (IV, Random, 95% CI) | Totals not selected |
| 1.1 short‐term (up to 3 months) | 1 | | Std. Mean Difference (IV, Random, 95% CI) | 0.0 [0.0, 0.0] |
| 1.2 medium‐term (up to 6 months) | 1 | | Std. Mean Difference (IV, Random, 95% CI) | 0.0 [0.0, 0.0] |
| 2 Severity of depressive symptoms, patient‐reported (primary outcome; high=poor) | 1 | | Std. Mean Difference (IV, Random, 95% CI) | Totals not selected |
| 2.1 short‐term (up to 3 months) | 1 | | Std. Mean Difference (IV, Random, 95% CI) | 0.0 [0.0, 0.0] |
| 2.2 medium‐term (up to 6 months) | 1 | | Std. Mean Difference (IV, Random, 95% CI) | 0.0 [0.0, 0.0] |
| 3 Quality of life (high=good) | 1 | | Std. Mean Difference (IV, Random, 95% CI) | Totals not selected |
| 3.1 short‐term (up to 3 months) | 1 | | Std. Mean Difference (IV, Random, 95% CI) | 0.0 [0.0, 0.0] |
| 3.2 medium‐term (up to 6 months) | 1 | | Std. Mean Difference (IV, Random, 95% CI) | 0.0 [0.0, 0.0] |
| 4 Leaving the study early | 1 | | Odds Ratio (M‐H, Random, 95% CI) | Totals not selected |
| 4.1 short‐term (up to 3 months) | 1 | | Odds Ratio (M‐H, Random, 95% CI) | 0.0 [0.0, 0.0] |
| 4.2 medium‐term (up to 6 months) | 1 | | Odds Ratio (M‐H, Random, 95% CI) | 0.0 [0.0, 0.0] |