18F‐FDG PET for the early diagnosis of Alzheimer’s disease dementia and other dementias in people with mild cognitive impairment (MCI)

Abstract

Background

¹⁸F‐FDG uptake by brain tissue as measured by positron emission tomography (PET) is a well‐established method for assessment of brain function in people with dementia. Certain findings on brain PET scans can potentially predict the decline of mild cognitive impairment (MCI) to Alzheimer’s disease dementia or other dementias.

Objectives

To determine the diagnostic accuracy of the ¹⁸F‐FDG PET index test for detecting people with MCI at baseline who would clinically convert to Alzheimer’s disease dementia or other forms of dementia at follow‐up.

Search methods

We searched the Cochrane Register of Diagnostic Test Accuracy Studies, MEDLINE, EMBASE, Science Citation Index, PsycINFO, BIOSIS previews, LILACS, MEDION (Meta‐analyses van Diagnostisch Onderzoek), DARE (Database of Abstracts of Reviews of Effects), HTA (Health Technology Assessment Database), ARIF (Aggressive Research Intelligence Facility) and C‐EBLM (International Federation of Clinical Chemistry and Laboratory Medicine Committee for Evidence‐based Laboratory Medicine) databases to January 2013. We checked the reference lists of any relevant studies and systematic reviews for additional studies.

Selection criteria

We included studies that evaluated the diagnostic accuracy of ¹⁸F‐FDG PET to determine the conversion from MCI to Alzheimer’s disease dementia or to other forms of dementia, i.e. any or all of vascular dementia, dementia with Lewy bodies, and fronto‐temporal dementia. These studies necessarily employ delayed verification of conversion to dementia and are sometimes labelled as ‘delayed verification cross‐sectional studies’.

Data collection and analysis

Two blinded review authors independently extracted data, resolving disagreement by discussion, with the option to involve a third review author as arbiter if necessary. We extracted and summarised graphically the data for two‐by‐two tables. We conducted exploratory analyses by plotting estimates of sensitivity and specificity from each study on forest plots and in receiver operating characteristic (ROC) space. When studies had mixed thresholds, we derived estimates of sensitivity and likelihood ratios at fixed values (lower quartile, median and upper quartile) of specificity from the hierarchical summary ROC (HSROC) models.

Main results

We included 14 studies (421 participants) in the analysis. The sensitivities for conversion from MCI to Alzheimer's disease dementia were between 25% and 100% while the specificities were between 15% and 100%. From the summary ROC curve we fitted, we estimated that the sensitivity was 76% (95% confidence interval (CI): 53.8 to 89.7) at the included study median specificity of 82%. This equates to a positive likelihood ratio of 4.03 (95% CI: 2.97 to 5.47), and a negative likelihood ratio of 0.34 (95% CI: 0.15 to 0.75). Three studies recruited participants from the same Alzheimer's Disease Neuroimaging Initiative (ADNI) cohort but only the largest ADNI study (Herholz 2011) is included in the meta‐analysis. To assess whether the choice of ADNI study, discriminating brain region (Chételat 2003) or reader assessment (Pardo 2010) made a difference to the pooled estimate, we performed five additional analyses. At the median specificity of 82%, the estimated sensitivity in these analyses was between 74% and 76%, so there was no impact on our findings. In addition to evaluating Alzheimer's disease dementia, five studies evaluated the accuracy of ¹⁸F‐FDG PET for all types of dementia. The sensitivities were between 46% and 95% while the specificities were between 29% and 100%; however, we did not conduct a meta‐analysis because there were too few studies, and those we found recruited small numbers of participants. Our findings are based on studies with poor reporting, and the majority of included studies had an unclear risk of bias, mainly for the reference standard and participant selection domains. According to the assessment of the index test domain, more than 50% of studies were of poor methodological quality.

Authors' conclusions

It is difficult to determine to what extent the findings from the meta‐analysis can be applied to clinical practice. Given the considerable variability of specificity values and lack of defined thresholds for determination of test positivity in the included studies, the current evidence does not support the routine use of ¹⁸F‐FDG PET scans in clinical practice in people with MCI. The ¹⁸F‐FDG PET scan is a high‐cost investigation, and it is therefore important to clearly demonstrate its accuracy and to standardise the process of ¹⁸F‐FDG PET diagnostic modality prior to its being widely used. Future studies with more uniform approaches to thresholds, analysis and study conduct may provide a more homogeneous estimate than the one available from the included studies we have identified.

Plain language summary

¹⁸F‐FDG PET scan for early prediction of developing Alzheimer’s disease or other dementia in people with mild cognitive impairment (MCI)

Background

The numbers of people with dementia and other cognitive problems are increasing globally. A diagnosis of dementia at an early stage is recommended but there is no agreement on the best approach. A range of tests have been developed which healthcare professionals can use to assess people with poor memory or cognitive impairment. In this review we have focused on the ¹⁸F‐FDG PET test.

Aim

We aimed to see how accurately the ¹⁸F‐FDG PET scan identified those people with MCI who would clinically convert to Alzheimer’s disease dementia or other types of dementia over a period of time.

Study characteristics

The evidence is current to January 2013. We included 16 studies covering 697 participants with MCI. The studies were published over a 14‐year period (1999 to 2013). Study sizes were small and ranged from 19 to 94 participants. In five papers the mean age of participants was less than 70 years. The age range in the youngest sample was 55 to 73 years and in the oldest sample was 71 to 86 years. Participants were mainly recruited from university departments, clinics or research centres. The percentage of participants with positive ¹⁸F‐FDG PET scans at baseline ranged in the included studies from 10.5% to 74% and the percentage of those participants who converted to Alzheimer’s disease dementia over a period of time ranged from 22% to 50%. Included studies reported a range of different cut‐off values used for identifying their participants with positive ¹⁸F‐FDG PET scans.

Quality of the evidence

Our findings are based on studies with poor reporting. The majority of included studies had an unclear risk of bias, mainly because they did not describe in sufficient detail how participants were selected and how the clinical diagnosis of Alzheimer’s disease dementia was justified. According to the assessment of the ¹⁸F‐FDG PET test domain, more than 50% of studies were of poor methodological quality.

The main limitations of the review are poor reporting in the included studies, a lack of a widely‐accepted cut‐off value of the ¹⁸F‐FDG PET scan in people with MCI, and the marked variation in test accuracy between the included studies.

Key findings

In this review, we have found that the ¹⁸F‐FDG PET scan, as a single test, lacks the accuracy to identify those people with MCI who would develop Alzheimer’s disease dementia or other forms of dementia over a period of time. Assuming a typical conversion rate of MCI to Alzheimer’s disease dementia of 38%, the findings indicate that for every 1000 ¹⁸F‐FDG PET scans, 174 cases with a negative scan will progress to Alzheimer's disease dementia and 285 with a positive scan will not. Therefore, a positive ¹⁸F‐FDG PET scan in people with MCI is of no clinical value in early prediction of developing Alzheimer's disease dementia.

Authors' conclusions

Implications for practice

Positron emission tomography (PET) is a unique diagnostic tool, since it can assess pathophysiologic and metabolic processes before any anatomic changes have taken place. This capacity of PET could potentially lead to several future applications in dementias, and generally in the field of neurosciences. ¹⁸F‐FDG PET is becoming increasingly accepted in the diagnostic approach to Alzheimer's disease and related disorders. Moreover, Silverman 2001 has developed a cost‐effectiveness algorithm for assessment of geriatric patients presenting with early symptoms of cognitive decline; according to this, PET can be incorporated into the diagnostic work‐up of these patients, when the ‘standard’ diagnostic testing does not reveal an underlying cause for the cognitive decline (Moulin‐Romsee 2005; Silverman 2001; Silverman 2002).

The results of the included studies show a great deal of heterogeneity, encompassing both the values which would render the technology 'useless' and some which indicate a valuable diagnostic tool. Given the considerable variability of specificity values, the heterogeneity in the conduct and interpretation of the test, and the lack of defined thresholds for determining test positivity, the current evidence does not support the routine use of a ¹⁸F‐FDG PET scan in clinical practice. ¹⁸F‐FDG PET scan is a high‐cost investigation, and it is therefore important to clearly demonstrate its accuracy and to standardise the process of ¹⁸F‐FDG PET diagnostic modality prior to extending its use.

Implications for research

The understanding of the functions of the nervous system and the biology of its disorders remains a big challenge. The attempt to comprehend the molecular basis of such disorders, and to potentially interfere in the natural history of the disease, is not driven just by theoretical or purely scientific needs. In the coming decades the number of adults over 65 years is expected to increase dramatically. In this context, the development and application of functional diagnostic imaging modalities that have the opportunity to detect metabolic changes before any macroscopic anatomical changes take place, and furthermore can achieve this with the highest accuracy, will be pivotal in selecting those people who would be candidates and would benefit most from the application of such treatments.

The ¹⁸F‐FDG PET represents a modality that can reflect biochemical/molecular changes before respective morphological imaging modalities detect them. PET assesses cerebral metabolism by measuring glucose utilisation with the use of the radiotracer ¹⁸F‐FDG, a glucose analogue, which is trapped in the neuronal cell after undergoing the first metabolic step of phosphorylation by hexokinase. Since neuronal activity depends on the continuous supply of energy, the assessment of glucose consumption by the cells is indicative of neuronal integrity and function. The ability of PET to serve as a biomarker of dementia has already been highlighted (Dubois 2007; McKhann 2011). Moreover, a recent meta‐analysis demonstrated that ¹⁸F‐FDG PET is the strongest individual positive predictive biomarker of short‐term incident dementia in MCI (Frisoni 2013). However, energy metabolism, reflected by ¹⁸F‐FDG distribution, is not a specific process. Neurodegenerative diseases are based on complex and overlapping molecular processes, and it is known that the metabolic pattern particularly seen in ADD is a complicated one, resulting from neurochemical changes, neuronal disconnection effects and neuronal cell loss, several of which are probably not detected by ¹⁸F‐FDG PET, due to the non‐specific nature of ¹⁸F‐FDG (Hoffman 2000). Moreover, PET as a technique carries the inherent drawback of low spatial resolution and subsequently provides anatomical information of low accuracy in comparison with computed tomography or magnetic resonance imaging. The application of newer radiopharmaceuticals (e.g. the PET tracer ¹¹C‐PIB that specifically binds fibrillar amyloid‐beta plaques), which reflect different mechanisms that contribute to the progression from MCI to ADD and other dementias (Brück 2013), and the advent of newer hybrid imaging modalities, like PET/MRI, that provide complementary anatomic, physiologic, metabolic, and functional information about the brain (Catana 2012) could therefore significantly aid our understanding of brain pathophysiology, regarding very early neurodegeneration.

Nevertheless, the results of the present analysis do not suggest the routine use of ¹⁸F‐FDG PET for detection of those people with MCI who will develop ADD. Our review carries some limitations, since generally, the methodological and reporting quality of all considered papers was relatively poor. Therefore, future studies with more uniform approaches to thresholds, analysis and study conduct with particular consistency in length of follow‐up may provide a more homogeneous estimate than the one that has been available from the included studies we have identified, in order to determine the exact role of ¹⁸F‐FDG PET in the diagnostic algorithm for such patients.

Summary of findings

What is the diagnostic accuracy of ¹⁸F‐FDG PET biomarker for detecting Alzheimer's disease, and predicting progression to dementia, in people with mild cognitive impairment?

Descriptive

Participant population

Participants diagnosed with MCI at baseline using any of the Petersen criteria or CDR = 0.5 or any of the 16 definitions included by Matthews 2008

Sources of referral

  • GP surgeries or specialists or other institutions or self referral (n = 3)

  • Not reported (n = 13)

MCI criteria

  • Petersen criteria (with or without CDR = 0.5 criterion) (n = 14)

  • AAMI criteria (n = 1)

  • Global Deterioration Scale (n = 1)

Sampling procedure

  • Consecutive or random (n = 3)

  • Unclear (n = 13)

Prior testing

The only testing prior to performing the ¹⁸F‐FDG PET scans was the application of diagnostic criteria for identifying participants with MCI

Sources of recruitment

  • University departments, clinics or research centres (n = 7)

  • Multicentre, not specified (n = 3)

  • Outpatient memory clinic or outpatients (sources not specified) or Geriatric, Research, Education and Medical centre (n = 3)

  • Not reported (n = 3)

Index tests

¹⁸F‐FDG PET

Threshold prespecified at baseline

  • Yes (n = 6)

  • No (n = 10)

PET scan interpretation

  • Combination of visual inspection and quantitative (rCGMr) evaluation (n = 12)

  • Only visual PET scan interpretation (n = 4)

Threshold

Almost all included studies referred to ratios of cerebral glucose metabolism (rCGMr) and not to absolute numbers. They used a range of different thresholds (different brain regions studied as potential AD areas as well as different scaling).

Most of the studies (12/16) performed PET analysis based on the combination of visual analysis (qualitative) and rCGMr estimations (quantitative). The rest (4/16) only referred to visual PET inspections (qualitative‐only analysis).

¹⁸F‐FDG hypometabolism regions

Authors used brain regions that are expected to be affected by AD. In these terms, all studies involved temporo‐parietal lobes and most of them (12/16) also included the posterior cingulate metabolism in their assessment. 7 studies also involved part of the frontal lobes in their evaluations.

Reference standard

For Alzheimer’s disease dementia:

  • NINCDS‐ADRDA (n = 13)

  • Other (n = 3)

Target condition

Conversion from MCI to Alzheimer’s disease dementia or any other forms of dementia.

Included studies

Prospectively well‐defined cohorts with any accepted definition of MCI (as above). 16 included studies (N=697 participants) were identified. 3 studies included ADNI participants. Number included in analysis was 421 from 14 studies.

Quality concerns

QUADAS‐2 scoring was challenging due to insufficient details. Poor reporting about sampling procedure led mainly to unclear risk of bias or contributed to high risk of bias in the participant selection domain. Although the reference standard was regarded as adequate to correctly classify the target condition, poor reporting on blinding of dementia assessors led to unclear risk of bias in the reference standard domain in most of the included studies. According to the assessment of the index test domain, more than 50% of studies were of poor methodological quality due to lack of a prespecified threshold.

Limitations

Limited investigation of heterogeneity due to insufficient number of studies. Lack of common thresholds.

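Target condition: Alzheimer's disease dementia

  • Test: ¹⁸F‐FDG PET

  • Studies: 14

  • Cases/participants: 150/421

  • Median specificity from included studies: 82%

  • Sensitivity (95% CI) at median specificity (footnote 1): 76% (54% to 90%)

  • Consequences in a cohort of 1000, median percentage converting (range) (footnote 2): 38.5% (22% to 50%)

  • Consequences in a cohort of 1000, missed cases (footnote 3): 174

  • Consequences in a cohort of 1000, over‐diagnosed (footnote 3): 285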

Key feature of the results:

The results of the included studies show a great deal of heterogeneity, encompassing both values which would render the technology “useless” and some which indicate a valuable diagnostic tool. The sensitivity values ranged from 25% to 100%, while the specificity values ranged from 29% to 100%. The values for both sensitivity and specificity were mainly over 80% in 7 included studies. In the remaining 7 studies those values were less than 80% or a sensitivity value higher than 80% was accompanied by a very low specificity value and vice versa.

Investigation of heterogeneity:

We investigated the effects of interpretation of PET scan (combination of visual inspection and quantitative rCGMr evaluation versus those that used visual interpretation only) and use of the prespecified threshold on the test results. There was no impact on our findings.

The remaining planned investigations (e.g. the effect of spectrum of participants, referral centres, ¹⁸F‐FDG reduction brain regions, inadequate blinding or loss to follow‐up) were not possible due to the limited number of studies available for each analysis. We conducted sensitivity analyses for type of clinical diagnosis for MCI and for type of reference standard. There was no impact on our findings.

Conclusions:

Given the considerable variability of specificity values, the heterogeneity in the conduct and interpretation of the test, and lack of defined thresholds for determination of test positivity in the included studies, the current evidence does not support the routine use of ¹⁸F‐FDG PET scan in clinical practice. ¹⁸F‐FDG PET scan is a high‐cost investigation, and it is therefore important to clearly demonstrate its accuracy and to standardise the process of ¹⁸F‐FDG PET diagnostic modality prior to its being widely used. Future studies with more uniform approaches to thresholds, analysis and study conduct may provide a more homogeneous estimate than the one that has been available from the included studies we have identified.

1 Meta‐analytic estimate of sensitivity derived from the HSROC model at a fixed value of specificity. We did not compute summary estimates of sensitivity and specificity because the studies that contributed to the estimation of the summary ROC curve used various thresholds.

2 We derived the median proportion converting (reported as a percentage) and range using all the studies included in the analysis for each target condition.

3 We computed missed and over‐diagnosed numbers using the median proportion converting to each target condition.

Background

The most common form of dementia in the general population is Alzheimer’s disease. It is useful to distinguish the term Alzheimer's disease, which refers to underlying pathology, from Alzheimer's disease dementia, which is the final stage of a clinical syndrome associated with the pathology.

Alzheimer’s disease dementia (ADD) afflicts 5% of men and 6% of women over the age of 60 worldwide (ICD‐10 2010). The prevalence increases exponentially with age as Alzheimer’s disease dementia affects fewer than 1% of people aged 60 to 64, but 24% to 33% of those over 85 (Ferri 2005). The earliest symptoms of ADD include short‐term memory loss, a gradual decline in other cognitive abilities and behavioural changes. Cortical intracellular neurofibrillary tangles (NFT) and extracellular β‐amyloid (Aβ) plaques (Braak 1991) represent the neuropathological features of Alzheimer's disease dementia and are responsible for synapse dysfunction, neuronal cell loss and consequent brain atrophy (Ballard 2011). According to the National Institute of Neurological and Communicative Disorders and Stroke and the Alzheimer's Disease and Related Disorders Association (NINCDS‐ADRDA) criteria (McKhann 1984), definite Alzheimer's disease dementia can only be diagnosed following neuropathological examination of brain tissue, obtained by biopsy or autopsy.

Mild cognitive impairment (MCI) represents a possible intermediary condition between normal cognition and dementia (Morris 2001; Petersen 2009). Currently, 16 different classifications are used to define MCI (Matthews 2008). The different definitions of MCI are based on general criteria that include a cognitive complaint (self‐ or informant‐reported or both), preserved basic activities of daily living, cognitive impairment (not normal for age and education) or decline in cognition evidenced by performance on objective cognitive tasks, and absence of dementia (Petersen 2004; Winblad 2004). In this review MCI refers to the clinical criteria defined by Petersen and Winblad (Petersen 1999; Petersen 2001; Petersen 2004; Winblad 2004), or the Clinical Dementia Rating (CDR) scale (CDR = 0.5) (Morris 1993), or any of the 16 descriptions of MCI reported by Matthews 2008.

There are four outcomes for those within an MCI population: progression to ADD, progression to another dementia, maintaining stable MCI, or recovery. An early identification of those people who would convert from MCI to ADD and other forms of dementia may improve the opportunities for early intervention and might help their carers to plan the future. However, current data in the medical literature are still not adequate to guide clinicians and researchers in understanding the progression of dementia. There is no clinical method to predict the possible conversion of people with MCI to ADD or other dementias. Studies (Bruscoli 2004; Mattsson 2009; Petersen 1999; Petersen 2009) indicate that an annual average of 10% to 15% of people with MCI progress to ADD. This all depends on clinical profile, settings and investigation for vascular disease. Thus, the improvement of diagnostic accuracy is critical for the management and treatment of ADD and other dementias. Research suggests that measurable change in positron emission tomography (PET), magnetic resonance imaging (MRI) and cerebrospinal fluid (CSF) biomarkers occurs years in advance of the onset of clinical symptoms (Beckett 2010). This review focuses on the relation between the ¹⁸F‐2‐fluoro‐2‐deoxy‐D‐glucose (¹⁸F‐FDG)‐PET biomarker results, the brain tissue glucose metabolism at baseline, and i) ‘conversion from MCI to Alzheimer’s disease dementia’ or ii) ‘conversion from MCI to other forms of dementia’ at follow‐up.

Target condition being diagnosed

The primary target condition is Alzheimer's disease dementia. The diagnosis is based on the exclusion of other causes of dementia through clinical, paraclinical and neuropsychological investigations criteria as indicated in the NINCDS‐ADRDA guidelines (McKhann 1984). Exclusion of other diseases such as depression, hypothyroidism, and non‐Alzheimer's disease brain lesions is a fundamental part of the diagnostic process (McKhann 1984). A standard diagnostic practice is based on clinical examinations and neurological and mental status examination of the patient. Moreover, the standard diagnostic practice includes caregiver or family member interviews, focusing on progressive cognitive impairments and behavioural changes associated with the disease.

The secondary target condition is any other form of dementia, including all‐cause dementia (APA 1987; APA 1994), vascular dementia (Román 1993), dementia with Lewy bodies (McKeith 2006), and fronto‐temporal dementia (Lund Manchester 1994; Neary 1998).

Index test(s)

PET represents a unique, minimally invasive diagnostic nuclear medicine modality of well‐documented accuracy. It assesses pathophysiologic and chemical processes by using radiopharmaceuticals that mimic endogenous molecules. Depending on the distribution of the radiotracer in the human body, images are produced and diagnostic information is acquired. Kinetic information may also be available. ¹⁸F‐FDG is the most common molecular imaging biomarker used in PET. In particular, ¹⁸F‐FDG is a radiolabeled glucose analogue and thus by entering the glucose metabolic pathway provides information about tissue metabolism. In other words, ¹⁸F‐FDG is an indicator of intracellular glucose metabolism. It has a wide variety of applications in neurosciences, oncology, and cardiology.

¹⁸F‐FDG uptake by brain tissue as measured by PET is a well‐established method for evaluation of brain function and it has been used in the study of dementia for more than three decades. ¹⁸F‐FDG PET evaluates the regional cerebral metabolic rate for glucose (rCGMr), thus giving information about the extent of neuronal loss or synapse dysfunction. The key finding is the reduced brain glucose metabolism that is associated with neurodegenerative diseases. Glucose metabolism imaging with ¹⁸F‐FDG is the most sensitive and specific imaging modality available today for the diagnosis of ADD (Lucignani 2006). Furthermore, ¹⁸F‐FDG PET is nowadays considered an imaging biomarker for Alzheimer's disease before onset of dementia and in clinical trials (Bohnen 2012; Dubois 2010; Hampel 2010). Hypometabolism in the temporo‐parietal lobe, as assessed by qualitative visual interpretation of the PET scan, represents the typical pattern found in ADD (Herholz 2002; Nitrini 2000). Moreover, it has been demonstrated that progression of neurodegenerative changes in people with ADD and other dementias is associated with both more cognitive impairment and larger PET metabolic deficits (Duara 1986; Haxby 1986).

The ¹⁸F‐FDG PET pattern for MCI is not so consistent, which is unsurprising, due to the variable physical history of the disorder. However, people with MCI usually present on PET with mild global and regional hypometabolism (Mosconi 2009). ¹⁸F‐FDG PET studies have found characteristic and progressive cerebral metabolic rate for glucose (CMRgl) reductions in posterior cingulate, precuneus, parietal, temporal and frontal regions in both ADD and in people with MCI, with the findings being more pronounced in those with MCI who eventually converted to ADD (Chen 2010; Morbelli 2010; Patterson 2010). Moreover, a growing body of ¹⁸F‐FDG PET studies have been carried out specifically in order to evaluate the correlation between glucose metabolism impairment and the progression from MCI to ADD and other dementias. These studies suggest that certain findings on brain PET scans can potentially predict the decline of MCI to ADD. In agreement with this, a recent meta‐analysis points out that people converting from MCI, in comparison with those who did not convert to ADD, showed hypometabolism/hypoperfusion in the parietal lobe (Schroeter 2009). Further, Laforce 2010 studied the role of ¹⁸F‐FDG PET in the diagnosis of atypical/unclear dementias in a cohort of 94 people suffering from MCI or dementia. Their results showed that PET significantly reduced the percentage of unclear dementia diagnoses from 39% to 16%.

The mainstay in ¹⁸F‐FDG PET interpretation is the visual reading of the scan, which depends heavily on the experience and previous training of the reading physician. This reliance on the observer’s qualitative interpretation and the lack of well‐defined thresholds for differentiation of pathological from normal scans is an issue regarding the application of the modality in the diagnostic work‐up of people with MCI. Nevertheless, the development and utilisation in recent years of new software tools for image analysis have helped in the direction of carrying out many brain ¹⁸F‐FDG PET studies. These software applications, some of which are currently Food and Drug Administration (FDA)‐approved, have enabled the quantification of brain PET scans, achieving objective evaluation and thus increasing the physician's interpretative confidence. Although subjective (visual) interpretation of the brain scan is the general standard in clinical practice, the addition of quantitative information can be crucial in such studies, since it improves the diagnostic accuracy (Patterson 2010).

Clinical pathway

Dementia develops over a trajectory of several years. There is a presumed period when people are asymptomatic, and when pathology is accumulating. Individuals or their relatives may then notice subtle impairments of recent memory. Gradually, more cognitive domains become involved, and difficulty in planning complex tasks becomes increasingly apparent. In the UK, people usually present to their general practitioner, who may administer some neuropsychological tests, and will potentially refer them to a hospital memory clinic. However, many people with dementia do not present until much later in the disorder and will follow a different pathway to diagnosis, for example being identified during an admission to a general hospital for a physical illness. Thus the pathway influences the accuracy of the diagnostic test. The accuracy of the test will vary with the experience of the administrator, and the accuracy of the subsequent diagnosis will vary with the history of referrals to the particular healthcare setting. Diagnostic assessment pathways may vary between countries and diagnoses may be made by a variety of specialists including neurologists and geriatricians. 

Alternative test(s)

We are not including alternative tests in this review because there are currently no standard practice tests available for the diagnosis of dementia. 

The Cochrane Dementia and Cognitive Improvement Group (CDCIG) is in the process of conducting a series of diagnostic test accuracy reviews of biomarkers and scales (see list below). Although we are conducting reviews on individual tests compared to a reference standard, we plan to compare our results in an overview.

  • ¹¹C‐PIB PET (Pittsburgh Compound‐B positron emission tomography)

  • CSF (Cerebrospinal fluid analysis of abeta and tau)

  • sMRI (structural magnetic resonance imaging)

  • Neuropsychological tests (MMSE; Mini‐Cog; MoCA)

  • Informant interviews (IQCODE; AD8)

  • APOE ϵ4 (apolipoprotein E ϵ4 allele)

  • rCBF SPECT (regional cerebral blood flow single photon emission computerised tomography)

Rationale

According to the latest revised NINCDS‐ADRDA diagnostic criteria for ADD of the National Institute on Aging and Alzheimer's Association (Albert 2011; Dubois 2010; McKhann 2011; Sperling 2011), the confidence in diagnosing MCI due to Alzheimer's disease is raised with the application of biomarkers based on imaging or CSF measures. These tests, added to core clinical criteria, might increase the sensitivity or specificity of a testing strategy. However, it is crucial that each of these biomarkers is assessed for its diagnostic accuracy before it is adopted as a routine add‐on test in clinical practice.

The ¹⁸F‐FDG PET biomarker, as the extra diagnostic criterion, might facilitate accurate identification of those people with MCI who would convert to ADD or other forms of dementia. At the present time there is no 'cure' for dementia, but there are some treatments which can slow cognitive and functional decline, or reduce the associated behavioural and psychiatric symptoms of dementia (Birks 2006; McShane 2006). In addition, the accurate early diagnosis of dementia may improve opportunities for the use of newly‐evolving interventions designed to delay or prevent progression to more debilitating stages of dementia (Oddo 2004). Coupled with appropriate contingency planning, proper recognition of the disease may also help to prevent inappropriate and potentially harmful admissions to hospital or institutional care (NAO 2007).

Objectives

To determine the diagnostic accuracy of the ¹⁸F‐FDG‐PET index test for detecting people with MCI at baseline who would clinically convert to ADD or other forms of dementia at follow‐up. Although we want to identify whether 'early forms of Alzheimer's disease dementia' are present at the moment of examination, there is no reference standard to make a final diagnosis of early ADD. Follow‐up is therefore needed to reconstruct what was going on at the time of the index examination. In this context, it is clear that the duration of follow‐up becomes critical, i.e. sufficiently long to capture the natural course of conversion.

Secondary objectives

To investigate heterogeneity of test accuracy in the included studies.

We expect that heterogeneity will be likely and that it will be an important component of the review. The potential sources of heterogeneity, which we will use as a framework for the investigation of heterogeneity, include target population, index test, target disorder and study quality.

Methods

Criteria for considering studies for this review

Types of studies

We consider longitudinal cohort studies in which index test results are obtained at baseline and the reference standard results at follow‐up (see below for detail about the nature of the index test and reference standard). These studies necessarily employ delayed verification of conversion to dementia and are sometimes labelled as ‘delayed verification cross‐sectional studies’ (Bossuyt 2004; Bossuyt 2008; Knottnerus 2002). 

We also consider case‐control studies if they incorporate a delayed verification design. This occurs in the context of a cohort study, so these studies are invariably diagnostic nested case‐control studies. From those studies, we only include data on the performance of the index test in discriminating between people with MCI who converted to dementia and those who remained stable. We have not considered data from healthy controls or any other control group.

Participants

Participants recruited and clinically classified as those with MCI at baseline were eligible for this review. We include studies that used the Petersen or revised Petersen criteria (Petersen 1999; Petersen 2004; Winblad 2004) or the Clinical Dementia Rating (CDR = 0.5) scale (Morris 1993) or any of the 16 different classifications of MCI described by Matthews 2008 as diagnostic criteria for MCI. Those criteria are presented in Table 1 and Table 2.

Table 1. Classification systems for describing mild cognitive impairment according to Matthews 2008

| System | Description | Impairment |
| --- | --- | --- |
| Age‐related cognitive change | | |
| ACMI | Age‐consistent memory impairment | Memories aging in accord with normative expectations; individual aged 50 to 79 and reports a decline in memory verified using objective memory testing performance (within ± 1 standard deviation of aged norms on 75% of tests (memory) administered); preserved general functioning. |
| ARCD | Age‐related cognitive decline | Objectively identified decline in memory and cognitive functioning considered to be a normal consequence of aging. |
| Category systems | | |
| SMC | Subjective memory complaint | Self‐reported decline in memory. |
| MMSE MCI | Mini‐Mental State Examination | Classification based on MMSE cut‐offs (maximum score 30). 'Normal' defined as a score from 27 to 30; MCI defined as a score from 22 to 26; and 'impaired' defined as a score < 21. |
| Pathological decline | | |
| MNCD | Mild neurocognitive disorder | Impairment resulting from a general medical condition; reported decline in cognitive functioning supported by formal testing; deficits observed in at least 2 areas of cognitive functioning; interference in social, occupational, or other areas of functioning. |
| CIND | Cognitive impairment no dementia | Cognitive impairment in one or more domains (including memory and non‐memory domains) that can have a variety of aetiologies. |
| BSF | Benign senescent forgetfulness | Impairment in remote memory and intact recent memory; awareness of the impairment and use of compensatory strategies. |
| AAMI | Age‐associated memory impairment | Subjective and objective memory loss associated with normal aging; individual aged 50 and older and shows adequate intellectual function (i.e. without dementia). |
| MD | Minimal dementia | Cognitive impairment in memory and minor and variable errors in orientation; no evidence of impaired occupational or social functioning; self care unimpaired. |
| LCD | Limited cognitive disturbance | Reported decline in memory with use of compensatory strategies; occasional forgetfulness (e.g. names, places); 1 or 2 errors on cognitive (memory) testing. |
| QD | Questionable dementia | Impaired memory and non‐memory test performance; no significant activity of daily living or instrumental activity of daily living interference. |
| AACD | Age‐associated cognitive decline | Self‐ or informant report of cognitive decline (gradual and present for 6 months); objective difficulties in any of the following domains: learning and memory, attention and concentration, thinking, language, and visuospatial functioning. |
| MCD | Mild cognitive disorder | Decline in cognitive performance, including memory impairment and learning or concentration difficulties; complaint confirmed by cognitive tests; may precede, accompany, or follow a wide variety of infections or physical disorders. |
| Mayo Clinic Criteria | | |
| N‐MCI | Non‐amnestic mild cognitive impairment | Subjective memory complaint; normal general cognitive function; normal range of activities of daily living; objective impairment in 1 or more domains other than memory. |
| A‐MCI | Amnestic mild cognitive impairment | Subjective memory complaint; normal general cognitive function; normal range of activities of daily living; impaired memory performance; normal non‐memory test performance. |
| M‐MCI | Multiple mild cognitive impairment | Subjective memory complaint; normal general cognitive function; normal range of activities of daily living; impaired memory and non‐memory test performance. |

Table 2. Criteria for the diagnosis of mild cognitive impairment

Petersen 1999

  1. Memory complaint, preferably corroborated by an informant.

  2. Impaired memory function compared for age and education.

  3. Preserved general cognitive function.

  4. Intact activities of daily living.

  5. Not demented.

Petersen 2004

 

4 subtypes of MCI were identified:

  1. Amnesic mild cognitive impairment, single domain – isolated memory impairment of more than 1.0 SD compared with the age‐ and education‐specific norms, and no difficulty in any other area of cognitive functioning.

  2. Amnesic mild cognitive impairment, multiple domain – 2 or more cognitive domains are impaired, 1 of which is memory impairment (impairment of more than 1.0 SD below the mean of the respective age‐ and education‐matched population).

  3. Non‐amnesic mild cognitive impairment, single domain – impairment in a single domain other than memory of more than 1.0 SD.

  4. Non‐amnesic mild cognitive impairment, multiple domains – impairments in 2 or more domains of more than 1.0 SD but no memory impairment.

 

All 4 subtypes of mild cognitive impairment also have to meet the following criteria:

a)  the presence of a complaint about memory – participants or informants (or both) reporting memory impairment.

b)  intact ability to perform activities of daily living – forgetfulness not compromising overall functional ability; impairment owing to physical disease not sufficient for exclusion.

c)   absence of dementia – assessed by DSM–IV criteria.

 

Morris 1993

 

Clinical Dementia Rating (CDR) is a scale used to classify patients along a continuum from normal aging through Alzheimer’s disease. This scale describes a continuum from normal (CDR 0) through questionable dementia or MCI (CDR 0.5) to mild (CDR 1), moderate (CDR 2), and severe (CDR 3) dementia.

 

The patient's cognitive and functional performance is assessed in 6 areas: memory, orientation, judgement and problem solving, community affairs, home and hobbies, and personal care. Scores in each of these areas are combined to obtain the total score.

MCI: mild cognitive impairment

We exclude those studies that involve people with MCI possibly caused by: i) current use or history of alcohol/drug abuse; ii) central nervous system trauma (e.g. subdural haematoma), tumour or infection; iii) other neurological conditions, e.g. Parkinson’s or Huntington’s diseases.

Index tests

¹⁸F‐FDG PET biomarker test

There are currently no generally accepted standards for a ¹⁸F‐FDG positivity threshold. We have therefore used the criteria applied in each included primary study to classify participants as either ¹⁸F‐FDG‐positive or ¹⁸F‐FDG‐negative, according to the degree of glucose metabolism in selected brain regions. Some studies apply a qualitative assessment of PET scans, while some apply both qualitative and quantitative assessments. Moreover, different thresholds are used in quantitative studies. While this may generate heterogeneity, it should be noted that in clinical practice quantitative analysis is added to the interpretation of the ¹⁸F‐FDG PET brain scan in order to support the physician's visual (qualitative) reading of the scan.

A range of thresholds have been used in primary research, for instance: i) "the regional cerebral glucose metabolism ratio (rCGM‐r) is lower than 80% of whole brain mean of control subjects" (Chételat 2003); ii) "the rCGM‐r of temporo‐parietal and posterior cingulate < 1.138" (Anchisi 2005).

We considered the use of any image analysis technique, ¹⁸F‐FDG injection dose, the time between ¹⁸F‐FDG injection and PET acquisition, and ¹⁸F‐FDG reduction regions (e.g. parietal, temporal, frontal lobes, posterior cingulate, precuneus). The exact administered ¹⁸F‐FDG activity does not affect the PET examination (as long as it lies within the accepted limits for acquiring proper images), as this can be compensated for by the duration of the scan; the number of counts detected by the scanner is the key finding.

The accepted limits of administered activity are defined by guidelines published by the nuclear medicine societies. The two major ones are the Society of Nuclear Medicine (SNM, USA) (Waxman 2009) and the European Association of Nuclear Medicine (EANM, Europe) (Varrone 2009). According to SNM, the recommended ¹⁸F‐FDG activity in adults for brain PET is 185 ‐ 740 MBq (or 5 ‐ 20 mCi). According to EANM, the recommended administered activity for adults is 300 – 600 MBq (typically 370 MBq) in 2D mode and 125 – 250 MBq (typically 150 MBq) in 3D mode. The studies included in this review followed a homogeneous protocol, with no substantial differences in administered dose or scan acquisition. Moreover, despite the between‐study differences in interpretation criteria, the evaluation of the PET scans was based on the fundamental principle of detecting a pattern of brain hypometabolism (decreased ¹⁸F‐FDG uptake) in people with MCI that is topographically consistent with the hypometabolic pattern expected in ADD, that is, regional metabolic reductions mainly in the temporo‐parietal and posterior cingulate cortices.
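The activity ranges quoted above can be cross‐checked with a short conversion sketch. It assumes only the standard definition 1 mCi = 37 MBq; the function names and the dictionary of ranges are hypothetical, chosen for illustration, with the limits copied from the SNM and EANM figures in the text.

```python
# Illustrative sketch: convert administered activity between MBq and mCi and
# check a value against the guideline ranges quoted in the text.
# All names here are hypothetical, chosen for this example only.

MBQ_PER_MCI = 37.0  # 1 Ci = 3.7e10 Bq, so 1 mCi = 37 MBq

# Recommended adult activity ranges in MBq, as quoted in the text.
RECOMMENDED_MBQ = {
    "SNM": (185.0, 740.0),
    "EANM 2D": (300.0, 600.0),
    "EANM 3D": (125.0, 250.0),
}


def mbq_to_mci(activity_mbq: float) -> float:
    """Convert an activity in megabecquerels to millicuries."""
    return activity_mbq / MBQ_PER_MCI


def mci_to_mbq(activity_mci: float) -> float:
    """Convert an activity in millicuries to megabecquerels."""
    return activity_mci * MBQ_PER_MCI


def within_range(activity_mbq: float, guideline: str) -> bool:
    """Return True if the activity lies inside the quoted range (inclusive)."""
    low, high = RECOMMENDED_MBQ[guideline]
    return low <= activity_mbq <= high


if __name__ == "__main__":
    low, high = RECOMMENDED_MBQ["SNM"]
    print(f"SNM range: {mbq_to_mci(low):.0f} to {mbq_to_mci(high):.0f} mCi")
    print("370 MBq within EANM 2D range:", within_range(370.0, "EANM 2D"))
    print("370 MBq within EANM 3D range:", within_range(370.0, "EANM 3D"))
```

The check is deliberately limited to the numeric ranges; it says nothing about scan duration, which the text notes can compensate for differences in administered activity.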

The differences in exact timing of image acquisition also do not influence the study, as long as the acquisition does not start earlier than 30 minutes after ¹⁸F‐FDG injection. It is recommended, however, that each department follow a standard protocol with a fixed time for starting the acquisition (e.g. 30 or 60 minutes after injection) (Varrone 2009; Waxman 2009). The aim of the acquisition is good contrast between grey and white matter.

We did not include any comparator test because there are currently no standard practice tests available for the diagnosis of dementia. We compared the index test with a reference standard.

Target conditions

There are two target conditions in this review:

1. Alzheimer’s disease dementia (conversion from MCI to Alzheimer’s disease dementia);

2. Other forms of dementia (conversion from MCI to other forms of dementia, i.e. any or all of vascular dementia, dementia with Lewy bodies, or fronto‐temporal dementia).

Reference standards

For the purpose of this review, several definitions of ADD are acceptable. We included studies that applied probable or possible NINCDS‐ADRDA criteria (McKhann 1984). We also considered those studies that used the Diagnostic and Statistical Manual of Mental Disorders (DSM) (APA 1987; APA 1994) and International Classification of Diseases (ICD) (ICD‐10 2010) definitions for ADD.

Similarly, differing clinical definitions of other dementias are acceptable. For Lewy body dementia the reference standard is the McKeith criteria (McKeith 1996; McKeith 2006). For fronto‐temporal dementia the reference standards are the Lund criteria (Lund Manchester 1994), Neary 1998, Boxer 2005, DSM‐III (APA 1987), DSM‐IV (APA 1994), ICD‐9 (ICD‐9 2006) and ICD‐10 (ICD‐10 2010). For vascular dementia the reference standards are the NINDS‐AIREN criteria (Román 1993), DSM‐III (APA 1987), DSM‐IV (APA 1994), ICD‐9 (ICD‐9 2006) and ICD‐10 (ICD‐10 2010).

The time interval over which progression from MCI to ADD or other forms of dementia occurs is important. We chose one year as the minimum period of delay in the verification of the diagnosis (i.e. the time between the assessment at which a diagnosis of MCI is made and the assessment at which the diagnosis of dementia is made).

Search methods for identification of studies

We used a variety of information sources, aiming to retrieve as many relevant studies as possible. The Trials Search Co‐ordinator of the CDCIG devised search strategies for electronic database searching.

Electronic searches

The most recent search for this review was performed in January 2013. We requested a search of the Cochrane Register of Diagnostic Test Accuracy Studies (managed by the Cochrane Renal Group). We also searched MEDLINE (OvidSP) (1950 to January 2013), MEDLINE (1950 to present), EMBASE (OvidSP) (1974 to week 2 2013), PsycINFO (OvidSP) (1806 to January week 2 2013), all databases in the Web of Science collection: Web of Science (1945 to present); BIOSIS Previews (1926 to present); Journal Citation Reports, and LILACS (Bireme). See Appendix 1 for details of the sources searched, the search strategies used, and the number of hits retrieved. We did not apply any language or date restrictions to the electronic searches; we did not use methodological filters, so as to maximise sensitivity (Beynon 2013; Whiting 2011).

Searching other resources

We checked the reference lists of all relevant studies for additional studies. We also conducted searches in the MEDION database (Meta‐analyses van Diagnostisch Onderzoek) at www.mediondatabase.nl, Database of Abstracts of Reviews of Effects (DARE) at www.crd.york.ac.uk/CRDWeb/, Health Technology Assessments Database (HTA Database) at www.crd.york.ac.uk/CRDWeb/, and Aggressive Research Intelligence Facility (ARIF) database at www.arif.bham.ac.uk for other related systematic diagnostic accuracy reviews. We searched for systematic reviews of diagnostic studies from the International Federation of Clinical Chemistry and Laboratory Medicine Committee for Evidence‐based Laboratory Medicine database (C‐EBLM). We checked reference lists of any relevant systematic reviews for additional studies.

Data collection and analysis

Selection of studies

The CDCIG Trials Search Co‐ordinator (TSC), who is a researcher with experience of DTA systematic reviews, performed the first assessment of the search results in order to remove the obviously non‐relevant studies. Two review authors independently reviewed the remaining titles and abstracts for potentially eligible studies for full paper review. Two review authors then independently assessed full manuscripts against the inclusion criteria. Where necessary, a third review author resolved disagreements that the other two were not able to resolve through discussion.

Where a study did not present all relevant data (for creating a 2 x 2 table) in the published manuscript, we contacted the authors directly to request further information. When the same dataset was presented in more than one paper, we planned to include the primary paper, which is the paper with the largest number of participants or with the most informative data.

We detailed the number of studies selected at each point in a Study flow diagram (below).

Data extraction and management

We extracted the following data on study characteristics (if reported):

Bibliographic details of primary paper:

  • Author, title of study, year and journal

Basic clinical and demographic details:

  • Number of participants

  • MCI clinical criteria

  • Age

  • Gender

  • Referral centre(s)

  • Participant recruitment

  • Sampling procedures

Details of the index test:

  • Method of the ¹⁸F‐FDG PET index test administration, including who administered the test

  • Thresholds used to define positive and negative tests

  • Other technical aspects as seem relevant to the review, e.g. brain areas

Details of the reference standard:

  • Definition of ADD and other dementias used in reference standard

  • Duration of follow‐up from time of index test used to define ADD and other dementias in reference standard: 1 to < 2 years; 2 to < 4 years; and > 4 years; if participants have been followed for varied amounts of time we recorded a mean follow‐up period for each included study

  • Prevalence or proportion of population developing ADD and other dementias, with severity, if described

We created 2 x 2 tables (cross‐relating index test results with the reference standards) as shown in Appendix 2. We also extracted data necessary for the assessment of quality, as defined below. Two blinded review authors (NS, CS) extracted data independently, resolving disagreements in data extraction by discussion, and involving a third review author (CH) as arbiter when necessary.
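As a minimal sketch of how a 2 x 2 table yields the accuracy measures used in this review, the following Python snippet computes sensitivity and specificity from the four cell counts. The class name, field names and example counts are hypothetical and are not taken from any included study; 'positive' here means ¹⁸F‐FDG‐positive at baseline, and the reference standard is conversion to dementia at follow‐up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Cell counts cross-classifying the index test against the reference standard."""

    tp: int  # index test positive, converted to dementia
    fp: int  # index test positive, remained stable
    fn: int  # index test negative, converted to dementia
    tn: int  # index test negative, remained stable

    def sensitivity(self) -> float:
        # Proportion of converters that the index test flagged as positive.
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        # Proportion of non-converters that the index test flagged as negative.
        return self.tn / (self.tn + self.fp)


if __name__ == "__main__":
    # Made-up counts, for illustration only.
    table = TwoByTwo(tp=8, fp=2, fn=2, tn=8)
    print(f"sensitivity = {table.sensitivity():.2f}")
    print(f"specificity = {table.specificity():.2f}")
```

The sketch does not guard against empty rows (for example, a study with no converters), which would need separate handling before any pooling.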

Assessment of methodological quality

We assessed the methodological quality of each study using the QUADAS‐2 tool (Whiting 2011), as recommended by The Cochrane Collaboration. The tool is made up of four domains: i) Participant selection; ii) Index test; iii) Reference standard; iv) Participant flow.

Two independent raters (NS, SM), blinded to each other’s scores, performed the QUADAS‐2 assessment, resolving disagreement by further review and discussion, with potential to involve a third review author (CH) as arbiter if necessary. We assessed each domain in terms of risk of bias, with the first three domains also considered in terms of applicability. The components of each of these domains and a rubric which details how judgements concerning risk of bias are made are detailed in Appendix 3 and Appendix 4. Certain key areas important to quality assessment are participant selection, blinding and missing data.

We did not use QUADAS‐2 data to form a summary quality score, in order to ensure that the nature of the limitations of the studies was as transparent as possible. We produced a narrative summary describing the numbers of studies that were found to have high/low/unclear risk of bias as well as concerns regarding applicability.

Statistical analysis and data synthesis

We evaluated test accuracy according to target condition. There are no accepted thresholds to define ¹⁸F‐FDG PET positivity for Alzheimer's disease dementia and other forms of dementia, and so the estimates of diagnostic accuracy reported in primary studies were likely to be based on data‐driven threshold selection (Leeflang 2008). We conducted exploratory analyses by plotting estimates of sensitivity and specificity from each study in forest plots and in receiver operating characteristic (ROC) space. We meta‐analysed pairs of sensitivity and specificity using the hierarchical summary ROC (HSROC) model (Rutter 2001), which allows for the possibility of variation in threshold between studies. Where too few studies were available to estimate all parameters, we assumed a symmetrical shape for the summary ROC curve. Estimates of summary sensitivities and specificities are not clinically interpretable when studies with mixed thresholds are included in the HSROC model, and so we derived estimates of sensitivity and likelihood ratios at fixed values (lower quartile, median and upper quartile) of specificity from the HSROC models. We performed the analyses using the SAS software (version 9.2; SAS Institute 2011, Cary, NC).
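The likelihood ratios mentioned above follow directly from a sensitivity and specificity pair. The sketch below uses the standard definitions LR+ = sensitivity / (1 − specificity) and LR− = (1 − sensitivity) / specificity. It does not fit the HSROC model itself (the review used SAS for that), and the function name and example values are assumptions chosen for illustration, not results from this review.

```python
from typing import Tuple


def likelihood_ratios(sensitivity: float, specificity: float) -> Tuple[float, float]:
    """Return (LR+, LR-) for a sensitivity/specificity pair on the 0-1 scale."""
    if not (0.0 <= sensitivity <= 1.0 and 0.0 < specificity < 1.0):
        raise ValueError("sensitivity must be in [0, 1] and specificity in (0, 1)")
    lr_pos = sensitivity / (1.0 - specificity)
    lr_neg = (1.0 - sensitivity) / specificity
    return lr_pos, lr_neg


if __name__ == "__main__":
    # Hypothetical (specificity, sensitivity) pairs standing in for points read
    # off a summary ROC curve at fixed values of specificity.
    for spec, sens in [(0.70, 0.85), (0.80, 0.75), (0.90, 0.60)]:
        lr_pos, lr_neg = likelihood_ratios(sens, spec)
        print(f"specificity {spec:.2f}: LR+ = {lr_pos:.2f}, LR- = {lr_neg:.2f}")
```

Specificity values of exactly 0 or 1 are rejected because one of the two ratios would then be undefined.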

Investigations of heterogeneity

In preliminary analyses, we visually examined forest plots of sensitivity and specificity, and SROC plots, to explore the effect of the sources of heterogeneity. We investigated the effect of i) interpretation of PET scan (a combination of visual inspection and quantitative rCGMr evaluation, or visual‐only interpretation) and ii) prespecification of threshold on the diagnostic accuracy of the ¹⁸F‐FDG PET index test. However, as there were insufficient studies, we did not perform meta‐regression (by including each potential source of heterogeneity as a covariate in the HSROC model) as planned (Differences between protocol and review).

Sensitivity analyses

Due to the limited number of studies evaluating ¹⁸F‐FDG PET for all dementia, we performed sensitivity analyses only for studies of ADD. This is a departure from the protocol (Vacante 2013) and is explained in the Differences between protocol and review section.

Assessment of reporting bias

We did not investigate reporting bias because of current uncertainty about how it operates in test accuracy studies and the interpretation of existing analytical tools such as funnel plots.

Results

Results of the search

The total number of records identified by the searches for this review was 9676. After de‐duplication, the Trials Search Co‐ordinator and two paid assessors with experience of screening citations for biomarker diagnostic test accuracy studies screened the titles and abstracts. In total, they assessed 397 full papers and conference abstracts for eligibility (Figure 1). We included 16 papers, and discarded 349 for the following reasons: i) not MCI participants at baseline; ii) not a longitudinal study; iii) index test not a ¹⁸F‐FDG PET. In addition, we excluded 32 papers due to insufficient data for creating 2 x 2 tables (Characteristics of excluded studies). We found no extra studies through reference checking. We obtained usable data for seven studies (Anchisi 2005; Clerici 2009; Galluzzi 2010; Landau 2010; Ossenkoppele 2012a; Ossenkoppele 2012b; Schmand 2012) through contacting the authors.


Figure 1. Study flow diagram.

Included studies

The Characteristics of included studies table lists the details of the 16 included studies covering 697 participants with MCI at baseline, of whom 595 had analysable data. Three studies (Landau 2010; Herholz 2011; Schmand 2012) recruited participants from the same cohort (ADNI participants). The studies have been published over a 14‐year period (1999 to 2013). Most of them (13/16) were conducted in Europe (five in Italy, four in The Netherlands, two in Germany, one in Sweden and one in France), and three in the USA. Fourteen included studies used a version of the Petersen criteria for MCI. Thirteen studies applied NINCDS‐ADRDA criteria as a reference standard for ADD (Berent 1999 also used ICD‐10 while Clerici 2009 and Nobili 2008 also used DSM‐IV; Fellgiebel 2007, Herholz 2011 and Pardo 2010 did not specify the reference standard at follow‐up).

Demographic and participant characteristics are summarised in Table 3. Study sizes were small and ranged from 19 to 94 participants. Five papers had a mean age of under 70 years (Arnáiz 2001; Fellgiebel 2007; Mosconi 2004; Ossenkoppele 2012a; Ossenkoppele 2012b); Schmand 2012 did not report demographic data on the sample with ¹⁸F‐FDG PET scan. The youngest sample was aged 64.0 ± 9.0 (Ossenkoppele 2012b) and the oldest sample was aged 78.3 ± 7.5 (Landau 2010). Seven studies included more men than women in the samples included in the analysis (range from 33% to 75%); three studies did not report gender for those samples (Ossenkoppele 2012b; Pardo 2010; Schmand 2012). APOE Ɛ4 carrier status and sampling procedure were poorly reported. Participants were mainly recruited from university departments, clinics or research centres (seven studies), while three studies did not report their sources of recruitment (Chételat 2003; Mosconi 2004; Ossenkoppele 2012a). Most of the studies did not clearly report whether participants were recruited from secondary or tertiary outpatient care settings.

Table 3. Demographic and participant characteristics of participants included in analysis

| Study | N / n (included in analysis) | Age converters (non‐converters) | Gender (M %) | MMSE score converters (non‐converters) | APOE Ɛ4 carrier (%) | MCI diagnostic criteria | Sampling | Sources of referral | Sources of recruitment (setting) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Anchisi 2005 (Italy) | 67/48 | 71.1 ± 73.9 (65.0 ± 9.0) | 25 (52.1) | 26.6 ± 1.7 (28.4 ± 1.1) | Not reported | Patterson 2010 | Consecutive sample (email from the author on 22nd October 2013) | GP surgeries (email from the author on 22nd October 2013) | Outpatients from 4 University Departments (Milan, Brescia, Cologne and Dresden) (email on 22nd October 2013) |
| Arnáiz 2001 (Sweden) | 20/20 | 64.9 ± 8.3 (60.1 ± 8.4) | 12 (60.0) | 26.7 ± 1.8 (27.2 ± 2.9) | Not reported | Global Deterioration Scale | Consecutive sample | Not reported | Geriatric University clinic |
| Berent 1999 (USA) | 20/20 | 70.2 ± 5.5 Total sample | 13 (65.0) | 26.0 ± 1.9 Total sample | Not reported | AAMI criteria | Not reported | Not reported | Cognitive disorders University clinic |
| Chételat 2003 (France) | 19/17 | 73.0 ± 5.1 (67.8 ± 7.0) | 8 (45.2) | 26.3 ± 1.0 (27.8 ± 1.2) | Not reported | Petersen 2001 | Not reported | Not reported | Not reported |
| Clerici 2009 (Italy) | 30/26 | 74.2 ± 6.9 Total sample | 10 (33.3) | 26.2 ± 1.7 Total sample | Not reported | Petersen 2004 and CDR = 0.5 criteria | Not reported | GP surgeries or specialists (85%) or self referral (15%) | University Centre for Research and Treatment |
| Drzezga 2005 (Germany) | 30/30 | 74.7 ± 4.7 (67.6 ± 2.0) | 14 (46.7) | 25.9 ± 2.1 (27.6 ± 1.5) | 17 (56.7) | Petersen 1999 and CDR = 0.5 criteria | Not reported | GP surgeries or specialists or other institutions | University Research Unit |
| Fellgiebel 2007 (Germany) | 16/16 | 69.5 ± 7.9 (68.8 ± 10.0) | 9 (56.2) | 24.3 ± 1.5 (27.3 ± 1.8) | Not reported | Petersen 1999 | Consecutive sample | Not reported | University memory clinic |
| Galluzzi 2010 (Italy) | 90/38 | 72.0 ± 7.1 Total sample | 37 (41.1) | 26.1 ± 1.8 Total sample | 35 (38.9) | Petersen 1999 | Consecutive sample | Not reported | Outpatient memory clinic |
| Herholz 2011 (USA), ADNI participants | 94/94 | 75.0 ± 7.6 Total sample | 66 (70.2) | 2671 ± 1.6 Total sample | Not reported | Petersen 2010 & CDR = 0.5 | Not reported | Not reported | Multicentre (not specified) |
| Landau 2010 (USA), ADNI participants | 85/85 | 78.3 ± 7.5 (78.0 ± 7.4) | 56 (65.9) | 26.4 ± 1.7 (27.3 ± 1.6) | 25 (29.4) | Petersen 2010 & CDR = 0.5 | Not reported | Not reported | Multicentre (not specified) |
| Mosconi 2004 (Italy) | 37/37 | 69.0 ± 4.0 (63.0 ± 8.0) | 17 (45.9) | 23.9 ± 1.7 (28.1 ± 1.6) | 16 (43.2) | Petersen 2010 | Not reported | Not reported | Not reported |
| Nobili 2008 (Italy) | 36/33 | 77.3 ± 4.8 (74.6 ± 5.4) | 13 (39.4) | 69.0 ± 4.0 (63.0 ± 8.0) | Not reported on all MCI included in analysis | Petersen 2004 | Not reported | Not reported | Outpatients |
| Ossenkoppele 2012a (Netherlands) | 15/12 | 67.0 ± 7.0 Sample analysed | 9 (75%) Sample analysed | 75.0 ± 7.6 Sample analysed | 8 (66.7) Sample analysed | Petersen 1999 | Not reported | Not reported | Not reported |
| Ossenkoppele 2012b (Netherlands) | 30/12 | 64.0 ± 9.0 Total sample | 23 (30.4) Total sample | 75.0 ± 7.6 Total sample | Not reported | Petersen 2001 | Not reported | Not reported | Outpatient University memory clinic |
| Pardo 2010 (USA) | 19/18 Reader 1; 19/19 Reader 2 | Mean 80.0 (range 54 ‐ 83) Total sample | Not reported | Not reported | Not reported | Petersen 1999 | Not reported | Not reported | Geriatric, Research, Education and Medical Centre; Unclear |
| Schmand 2012 (Netherlands), ADNI participants | 89/89 | Not reported on the sample with ¹⁸F‐FDG scan (N = 89) | Not reported on the sample with ¹⁸F‐FDG scan (N = 89) | Not reported on the sample with ¹⁸F‐FDG scan (N = 89) | Not reported on the sample with ¹⁸F‐FDG scan (N = 89) | Petersen 2010 & CDR = 0.5 | Not reported | Not reported | Multicentre (not specified) |

ADNI: Alzheimer's Disease Neuroimaging Initiative

Table 4 summarises the data regarding the threshold used, image scaling, brain region as potential Alzheimer's disease areas, ¹⁸F‐FDG dose and the time between ¹⁸F‐FDG injection and performing a PET scan, the number of ¹⁸F‐FDG‐positive participants at baseline and the number of converters during the follow‐up period.

Open in table viewer
Table 4. Index test and numbers of converters to Alzheimer’s disease dementia

Study

Threshold

(prespecified

Yes/No)

Image scaling

Discriminating brain area

Image analysis

Time between FDG injection and PET acquisition (min)

¹⁸F‐FDG dose

Number of ¹⁸F‐FDG positive

(%)

Number of converters (%)

Duration of follow‐up

Mean (months) / Maximum (years)

Anchisi 2005

(Italy)

rCGMglc of temporo‐parietal and posterior cingulate of 1.138

(No)

Regional sensorimotor

FDG uptake ratio (p 1730)

Bilateral parietal and posterior cingulate cortex

SPM99

Not reported

Not

reported

19 (40)

(calculated in RevMan5)

14 (29)

Median: 12 months

Range: 12 ‐ 27 months

Arnáiz 2001

(Sweden)

rCGMglc of left temporo‐parietal region 13 mm above the basal ganglia

(Model I)

(No)

Sensorimotor

area of the cortex 26 mm above the level of the basal ganglia (p 852)

Temporo‐parietal cortex

SPSS

(Herholz 1999)

60

Not

reported

8 (40)

(calculated in RevMan5)

9 (45)

36.5 months

Berent 1999

(USA)

rCGMglc of

diagnostic index based on Z‐scores of the parietal cortex

(No)

Thalamus (p 11)

Frontal, temporal, parietal and occipital regions normalised to the thalamus

3D‐SSP

Not reported

370 MBq

10 (50)

10 (50)

3 years

Chételat 2003

(France)

rCGMglc at Z‐score of > 3.09

Thresholding was set at 80% of whole brain mean of control subjects

(No)

FDG uptake normalised by and adjusted to the person’s global uptake (p 1375)

Right temporo‐parietal and posterior cingulate

SPM99

Not reported

Not

reported

7 (41)

right temporo‐parietal region

8 (47.0) posterior cingulate

7 (41)

18 months

Clerici 2009

(Italy)

rCGMglc

lower than the control group corresponding to a P value < 0.01 level)

(Yes)

Global counts were normalised by proportional scaling to remove confounding effects due to global changes

(Del Sole 2008)

Posterior gyrus cingulate and bilateral inferior frontal cortex

SPM(t)

45

185‐370 MBq

23 (88.5)

13 (50)

1.5 years aMCI group.

3 years snaMCI group

Drzezga 2005

(Germany)

rCGMglc at

Z‐score of > 1.64 (1‐tail) corresponding to a P value of 0.05 (1‐tail)

(Yes)

Not

reported

Orbitofrontal, prefrontal, premotor, central, parietal superior and inferior, occipital, temporal anterior, temporal posterior and posterior cingulate

3D‐SSP

30

370 MBq

13 (43)

12 (40)

16 ± 2 months

Fellgiebel 2007

(Germany)

rCGMglc at

significantly decreased Z‐score > 2 in more than 50 adjacent pixels

(Yes)

Sensorimotor

area of the cortex (transaxial images parallel to the intercomissural line)

(Fellgiebel 2004)

Parietal mesial or posterior cingulate and temporal regions

SPSS

(Fellgiebel 2004)

30

(Fellgiebel 2004)

180 MBq (Fellgiebel 2004)

7 (44)

4 (25)

19.6 ± 9.0

Galluzzi 2010

(Italy)

rCGMglc of

t sum > 11.090 (email from the author)

(Herholz 2002)

(Yes)

Cerebellum

Temporo‐parietal, hippocampus and posterior cingulate

SPSS

Not reported

Not

reported

28 (74)

14 (37)

20.6.6 ± 9.7

Herholz 2011

(USA)

ADNI participants

rCGMglc of

t sum > 11.090

(Herholz 2002)

(Yes)

Global cortex

Temporal and parietal lobes

PALZ (PMOD software)

30 ‐ 60

Not

reported

38 (40)

30 (32)

2 years

Landau 2010

(USA)

ADNI participants

rCGMglc of 1.21

(No)

Cerebellar vermis and pons

ROIs of interest were study‐independent and frequently associated with decline in AD and MCI. No further details.

SPM5

30 ‐ 60

Not

reported

51 (60)

28 (33)

1.9 ± 0.4 years Range: 2 ‐ 3 years

Mosconi 2004

(Italy)

rCMRglc significantly reduced in certain cerebral areas with emphasis on the inferior parietal lobule (IPL).

(No)

Global cortex

Precuneus, anterior and posterior cingulate, inferior parietal lobe, superior, middle and inferior frontal gyrus, on both hemispheres

SPM99

19 ± 3

110 ‐ 370 MBq

4 (10.5)

8 (22)

12 ± 0.6

Nobili 2008

(Italy)

Visual inspection

rCGMglc threshold not reported

(No)

Global cortex

25 VROI in each hemisphere

Computerized Brain Atlas (CBA;

Applied Medical Imaging©, Uppsala, Sweden)

45

370 MBq

11 (33)

11 (33)

21.1 ± 10.9 months

Ossenkoppele 2012a

(Netherlands)

Visual inspection and SUVr of ROIs (threshold not reported) (No)

Cerebellar grey matter

Frontal, parietal and latero‐temporal and medial temporal lobes and posterior cingulate

PMOD Alzheimer discrimination tool (PALZ)

45 ‐ 60

150 ± 17 MBq

4 (33)

4 (33)

30

Range: 2 ‐ 4 years

Ossenkoppele 2012b

(Netherlands)

Visual inspection and SUVr of ROIs (threshold not reported) (No)

Cerebellar grey matter (p 3)

Frontal, parietal, occipital, and latero‐temporal and medial temporal lobes and posterior cingulate

PMOD Alzheimer discrimination tool (PALZ)

45‐60

185 MBq

5 (42)

6 (50)

2 years

Pardo 2010

(USA)

Visual inspection

Hypometabolism if < 50% of the cubes had the label MCI or normal

(Only SVM analysis used thresholds)

(No)

PET scans were adjusted to a whole‐brain mean activity and stereotactically normalised by using Neurostat (p 328)

Frontal, parietal, occipital, and latero‐temporal and medial temporal lobes and posterior cingulate

SVM

Not reported

5 mCi/70 kg

Reader 1:

6 (32)

Reader 2:

10 (53)

8 (44)

9 (47)

3 years

Schmand 2012

(Netherlands)

ADNI participants

rCGM value of < 1.20

(Email from the author)

(Yes)

Not

reported

Right and left angular gyrus, bilateral posterior cingulate gyrus and left middle/inferior temporal gyrus

SPSS

Not reported

Not

reported

18 (20)

38 (43)

2.7 ± 0.9

Range: 0.5 ‐ 4.6 years

ADNI: Alzheimer's Disease neuroimaging initiative
RevMan5: Review Manager 5 software
SUVr: standardised uptake value ratio
VROI: volumetric region of interest

All included studies applied qualitative/visual evaluations of the PET scans. In particular, 12 of the 16 studies performed PET analysis based on the combination of visual analysis (qualitative) and rCGMr estimations (quantitative), while the remaining four studies referred only to visual PET inspections (qualitative‐only analysis). A range of different thresholds were applied (different rCGMr values, various brain regions studied as potential Alzheimer's disease areas, as well as different scaling). The threshold was prespecified in only six studies (Clerici 2009; Drzezga 2005; Fellgiebel 2007; Galluzzi 2010; Herholz 2011; Schmand 2012). ¹⁸F‐FDG positivity ranged from 10.5% (Mosconi 2004) to 74% (Galluzzi 2010) (Table 5). Conversion to ADD ranged from 22% (Mosconi 2004) to 50% (Berent 1999; Clerici 2009; Ossenkoppele 2012a).

Table 5. Summary of test accuracy at study level for conversion to Alzheimer’s disease dementia

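Studies included in meta‐analysis

| Study ID | Participants (n) | Sensitivity (%) | Specificity (%) | % of MCI with ¹⁸F‐FDG PET positivity |
| --- | --- | --- | --- | --- |
| Anchisi 2005 | 48 | 93 | 82 | 40 |
| Arnáiz 2001 | 20 | 67 | 82 | 40 |
| Berent 1999 | 20 | 70 | 70 | 50 |
| Chételat 2003 (temporo‐parietal brain region) | 17 | 100 | 100 | 41 |
| Clerici 2009 | 26 | 92 | 15 | 88 |
| Drzezga 2005 | 30 | 92 | 89 | 43 |
| Fellgiebel 2007 | 16 | 100 | 75 | 44 |
| Galluzzi 2010 | 38 | 79 | 29 | 74 |
| Herholz 2011 (ADNI study) | 94 | 57 | 67 | 40 |
| Mosconi 2004 | 37 | 38 | 97 | 10.5 |
| Nobili 2008 | 33 | 82 | 91 | 33 |
| Ossenkoppele 2012a | 12 | 75 | 88 | 33 |
| Ossenkoppele 2012b | 12 | 83 | 100 | 42 |
| Pardo 2010 (Reader 1) | 18 | 25 | 60 | 32 |

Studies included only in descriptive analysis

| Study ID | Participants (n) | Sensitivity (%) | Specificity (%) | % of MCI with ¹⁸F‐FDG PET positivity |
| --- | --- | --- | --- | --- |
| Chételat 2003 (posterior cingulate brain region) | 17 | 100 | 90 | 47 |
| Landau 2010 (ADNI study) | 85 | 75 | 47 | 60 |
| Pardo 2010 (Reader 2) | 17 | 33 | 30 | 53 |
| Schmand 2012 (ADNI study) | 98 | 24 | 82 | 20 |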

ADNI: Alzheimer's Disease neuroimaging initiative

Duration of follow‐up was reported as the mean and standard deviation (SD), or the median, or a range of values.

Methodological quality of included studies

We assessed methodological quality using the QUADAS‐2 tool (Whiting 2011). We present the review authors’ judgements about each methodological quality item for each included study in the Characteristics of included studies table and Figure 2. The overall methodological quality of included study cohorts is summarised in Figure 3.


Figure 2. Risk of bias and applicability concerns summary: review authors' judgements about each domain for each included study


Figure 3. Risk of bias and applicability concerns graph: review authors' judgements about each domain presented as percentages across included studies

In the participant selection domain, we considered five studies (Berent 1999; Chételat 2003; Herholz 2011; Landau 2010; Pardo 2010) to be at high risk of bias because the participants were not consecutively or randomly enrolled. We had excluded studies with a case‐control design because we only considered data on performance of the index test to discriminate between participants with MCI who convert to dementia and those who remained stable. We considered four studies (Arnáiz 2001; Galluzzi 2010; Nobili 2008; Schmand 2012) to be at unclear risk of bias due to poor reporting on sampling procedure and exclusion criteria. We considered the remaining seven (44%) studies to be at low risk of bias.

In the index test domain, we considered nine (56%) studies (Anchisi 2005; Arnáiz 2001; Berent 1999; Chételat 2003; Landau 2010; Nobili 2008; Ossenkoppele 2012a; Ossenkoppele 2012b; Pardo 2010) to be at high risk of bias because the threshold used was not prespecified and the optimal cut‐off level was determined from ROC analyses; therefore, the accuracy of the ¹⁸F‐FDG biomarker reported in these studies appeared to be an overestimate. We considered one study (Mosconi 2004) to be at unclear risk of bias due to poor reporting. We considered the remaining six (38%) studies to be at low risk of bias.

In the reference standard domain, we considered 11 studies (69%) to be at unclear risk of bias, mainly because they did not report whether clinicians conducting follow‐up were aware of the initial ¹⁸F‐FDG biomarker analysis results. One of the 11 studies (Fellgiebel 2007) did not clearly report the reference standards used for diagnosing ADD. We considered two studies (Herholz 2011; Pardo 2010) to be at high risk of bias because they failed to provide information on either item in this domain. We were not able to obtain the information about how the reference standard was obtained and by whom, due to poor reporting. We considered the remaining three studies (Berent 1999; Chételat 2003; Drzezga 2005) to be at low risk of bias.

In the flow and timing domain, we judged 10 studies (62%) to be at low risk of bias because all participants were accounted for in the analysis or the reasons for missing data were given, and/or the time interval between index test and reference standard was appropriate (duration of follow‐up longer than one year). We considered three studies (19%) to be at high risk of bias, either because a large number of participants were excluded from the analyses (Anchisi 2005; Ossenkoppele 2012b) or because the interval between index test and reference standard was shorter than one year for some participants (Nobili 2008). We considered the remaining three studies (Clerici 2009; Fellgiebel 2007; Pardo 2010) to be at unclear risk of bias due to poor reporting.

For the assessment of applicability, we had no concerns about whether the included participants and setting, the conduct and interpretation of the index test, and the target condition (as defined by the reference standard) in each of the included studies matched the review question. However, this lack of concern across the three domains reflects the inclusion criteria set for the review. Given the level of heterogeneity with respect to the index test (Table 4), this judgement about applicability may be optimistic.

Findings

The key characteristics of each study are summarised in Table 3 and Table 4. The summary of main results for 16 included studies is presented in the summary of findings Table.

¹⁸F‐FDG PET for Alzheimer's disease dementia

We identified three studies that recruited participants from the same Alzheimer’s Disease Neuroimaging Initiative (ADNI) cohort (Herholz 2011; Landau 2010; Schmand 2012). The largest ADNI study (Herholz 2011) was included in the analysis.

The Chételat 2003 study assessed two different discriminating brain regions (temporo‐parietal and posterior cingulate). We included data for the temporo‐parietal region, since it represents a typical and 'wider' brain area that is potentially involved in conversion to ADD.

In Pardo 2010 the PET scan was interpreted by two raters. As already mentioned, visual/qualitative reading of the ¹⁸F‐FDG PET scan is accepted as the general standard in nuclear medicine (similar to other imaging modalities). It is, therefore, heavily dependent on the physician’s prior experience and training. Quantitative assessment plays a complementary role in PET evaluations, and the physician usually decides based on his or her own visual‐qualitative assessment. We included data from Reader 1 because it is very likely that Reader 1 was more experienced in interpreting brain PET scans. This rater provided more accurate estimates of the diagnostic accuracy of Alzheimer's disease type imaging compared with the diagnosis at follow‐up.

Individual study estimates of sensitivity and specificity are shown in Figure 4 for each of the 14 studies (150 cases and 271 non‐cases) that evaluated ADD. The sensitivity values ranged from 25% to 100%, while the specificity values ranged from 29% to 100%. The criteria for ¹⁸F‐FDG PET positivity varied between studies. Ten studies performed PET analysis based on the combination of visual analysis (qualitative) and rCGMr estimations (quantitative), and four studies referred only to visual PET inspections (qualitative‐only analysis). A range of different thresholds were used. Different brain regions were studied as potential Alzheimer's disease areas, and different scaling methods were applied.
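As an illustration of how a study‐level estimate arises from a two‐by‐two table, the sketch below uses counts for Drzezga 2005 that were back‐calculated from the figures reported in Table 4 and Table 5 (30 participants, 12 converters, 13 PET‐positive, sensitivity 92%, specificity 89%). The counts are an inference from those summary figures, not data quoted from the study, and the Wilson score interval is used here only for simplicity; the forest plots in the review may rely on a different interval method.

```python
import math


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion (about 95% for z = 1.96)."""
    p = successes / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return centre - half, centre + half


def accuracy_from_2x2(tp, fp, fn, tn):
    """Sensitivity, specificity and test positivity from a two-by-two table."""
    return {
        "sensitivity": tp / (tp + fn),
        "sensitivity_ci": wilson_interval(tp, tp + fn),
        "specificity": tn / (tn + fp),
        "specificity_ci": wilson_interval(tn, tn + fp),
        "positivity": (tp + fp) / (tp + fp + fn + tn),
    }


# Assumed counts, back-calculated from the summary figures for Drzezga 2005:
# 12 converters (11 PET-positive) and 18 non-converters (2 PET-positive).
drzezga = accuracy_from_2x2(tp=11, fp=2, fn=1, tn=16)

for key in ("sensitivity", "specificity"):
    low, high = drzezga[key + "_ci"]
    print(f"{key}: {drzezga[key]:.1%} (Wilson 95% CI {low:.1%} to {high:.1%})")
print(f"PET positivity: {drzezga['positivity']:.1%}")
```

With only 12 converters, the interval around the sensitivity estimate is wide, which is typical of the small studies included here.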


Figure 4. Forest plot of ¹⁸F‐FDG PET Conversion from MCI to AD (with Reader 1 Pardo 2010, Chételat 2003 temporo‐parietal region and Herholz 2011 ADNI study).

The summary ROC curve presenting the accuracy of ¹⁸F‐FDG PET across the 14 studies is shown in Figure 5. Because of the variation in thresholds and measurement of ¹⁸F‐FDG uptake in brain regions, we did not estimate a summary sensitivity and specificity. However, we derived estimates of sensitivity and likelihood ratios at fixed values of specificity from the HSROC model we fitted to produce the summary ROC curve. At the median specificity of 82%, the estimated sensitivity was 76% (95% confidence interval (CI): 53.80 to 89.70), the positive likelihood ratio was 4.03 (95% CI: 2.97 to 5.47), and the negative likelihood ratio was 0.34 (95% CI: 0.15 to 0.75).
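For readers less familiar with likelihood ratios, the following minimal sketch shows the standard relationship between sensitivity, specificity and the two likelihood ratios, applied to the rounded values quoted above. It is an illustration only: the review derived its likelihood ratios from the fitted HSROC model, so the figures computed from rounded percentages (about 4.2 and 0.29) are close to, but not the same as, the reported 4.03 and 0.34. The function name is hypothetical.

```python
def likelihood_ratios(sensitivity, specificity):
    """Positive and negative likelihood ratios from sensitivity and specificity."""
    lr_positive = sensitivity / (1 - specificity)
    lr_negative = (1 - sensitivity) / specificity
    return lr_positive, lr_negative


# Rounded summary values from the text: sensitivity 76% at specificity 82%.
lr_pos, lr_neg = likelihood_ratios(0.76, 0.82)
print(f"LR+ = {lr_pos:.2f}, LR- = {lr_neg:.2f}")
```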


Figure 5. Summary ROC plot of ¹⁸F‐FDG PET Conversion from MCI to AD (with Reader 1 Pardo 2010, Chételat 2003 temporo‐parietal region and Herholz 2011 ADNI study).

To examine whether the choice of ADNI study, discriminating brain region or reader assessment made any difference to the pooled estimate, we performed five additional analyses. At the median specificity of 82%, the estimated sensitivity ranged from 74% to 77%. There was no impact on our findings.

¹⁸F‐FDG PET for all types of dementia (combined AD and non‐AD dementia)

Five studies (64 cases and 42 non‐cases) evaluated the accuracy of ¹⁸F‐FDG PET for all dementia (Figure 6) in addition to evaluating ADD. The sensitivity values ranged from 46% to 95%, while the specificity values ranged from 29% to 100%. Two studies used a semi‐quantitative threshold while the other three used visual inspection to determine test positivity. Meta‐analysis was not performed because the studies were too few and their sample sizes were too small. Figure 7 shows study specific estimates of sensitivity and specificity in ROC space together with their 95% confidence intervals.


Figure 6. Forest plot of ¹⁸F‐FDG PET Conversion from MCI to All dementia (with Reader 1 Pardo 2010).


Figure 7. Summary ROC plot of ¹⁸F‐FDG PET Conversion from MCI to All dementia (with Reader 1 Pardo 2010).

The estimates of sensitivity and specificity for the Pardo 2010 study were 46% and 80% for Reader 1, and 64% and 0% for Reader 2.

Investigation of heterogeneity

We visually inspected the summary ROC plot (see Figure 5). The results of the included studies showed a great deal of heterogeneity. Both sensitivity and specificity were mainly over 80% in a number of studies (Anchisi 2005; Chételat 2003; Drzezga 2005; Fellgiebel 2007; Ossenkoppele 2012a; Ossenkoppele 2012b). In the remaining studies, either those values were less than 80%, or a sensitivity higher than 80% was accompanied by a very low specificity or vice versa (Clerici 2009; Galluzzi 2010; Mosconi 2004).

Interpretation of PET scan

There was little evidence that an HSROC model allowing the shape, accuracy and threshold parameters to differ between studies using a combination of visual inspection and quantitative rCGMr evaluation and studies using visual‐only interpretation provided a better fit than the basic model with a single HSROC curve (Chi² = 1.7; df = 3; P = 0.64).

Prespecifying threshold regarding rCGM

This investigation relates to the 10 studies in which semi‐quantitative estimation was used. Meta‐regression revealed little evidence of an improvement in fit between the basic model and the model that allowed the parameters to differ between those studies that did and did not specify a threshold (Chi² = 93.5 ‐ 88.3 = 5.2; df = 3; P = 0.16).
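The P values quoted for the two model comparisons can be checked from the Chi² statistics alone. The sketch below assumes these are likelihood ratio tests referred to a chi‐square distribution with three degrees of freedom (one each for the shape, accuracy and threshold parameters), and uses the closed‐form upper tail for that distribution so that only the Python standard library is needed. The function name is hypothetical.

```python
import math


def chi2_upper_tail_df3(x):
    """Upper-tail probability P(X > x) for a chi-square variable with 3 df."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)


# Interpretation of PET scan: Chi² = 1.7 on 3 df.
p_interpretation = chi2_upper_tail_df3(1.7)

# Prespecified threshold: Chi² = 93.5 - 88.3 = 5.2 on 3 df.
p_threshold = chi2_upper_tail_df3(93.5 - 88.3)

print(f"Interpretation of PET scan: P = {p_interpretation:.2f}")
print(f"Prespecified threshold:     P = {p_threshold:.2f}")
```

Both values agree with the reported P = 0.64 and P = 0.16 after rounding.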

The remaining planned investigations (for instance, the effect of the spectrum of participants, referral centres, ¹⁸F‐FDG uptake reduction brain regions, inadequate blinding or loss to follow‐up) were not possible due to the limited number of studies available for each analysis. We conducted sensitivity analyses for type of clinical diagnosis for MCI and for type of reference standard.

Sensitivity analyses

Of the 14 studies that evaluated ¹⁸F‐FDG PET for ADD, 12 used Petersen criteria for diagnosing MCI. To explore the impact of the type of diagnostic criteria on the summary estimates, we excluded two studies (Arnáiz 2001; Berent 1999) that used the Global Deterioration Scale and age‐associated memory impairment (AAMI) criteria, respectively. There was no impact on our findings.

Eleven studies used NINCDS‐ADRDA as the reference standard. To explore the impact of the type of reference standard on the summary estimates, we excluded one study (Fellgiebel 2007) that used CDR = 1, one study (Herholz 2011) that used a non‐specified clinical dementia rating and the Alzheimer's Disease Assessment Scale (ADAS‐cog), and one study (Pardo 2010) where the reference standard used was not reported. There was no impact on our findings.

Discussion

Summary of main results

For this review we identified 16 studies assessing the diagnostic accuracy of ¹⁸F‐FDG PET for conversion from mild cognitive impairment (MCI) to Alzheimer's disease dementia or to other forms of dementia. The key results are presented in summary of findings Table. Due to variation in thresholds and measurement of ¹⁸F‐FDG uptake in brain regions, we estimated a summary ROC curve for studies that evaluated conversion from MCI to Alzheimer's disease dementia. We did not estimate a summary sensitivity and specificity on the curve because with mixed thresholds a summary point lacks a clinically meaningful interpretation. At the median specificity of 82%, the estimated sensitivity of ¹⁸F‐FDG PET for conversion to Alzheimer's disease dementia derived from the summary ROC curve was 76% (95% confidence interval (CI): 53.80 to 89.70).

We did not pool data for the five studies that evaluated conversion from MCI to all types of dementia. The sensitivities were between 46% and 95%, and specificities were between 29% and 100%.

¹⁸F‐FDG PET for Alzheimer’s disease dementia

We identified three Alzheimer's Disease Neuroimaging Initiative (ADNI) studies (Herholz 2011; Landau 2010; Schmand 2012). ADNI is a multicentre project involving approximately 50 medical centres and university sites across the United States and Canada; its primary goal is to evaluate magnetic resonance imaging (MRI), positron emission tomography (PET), cerebrospinal fluid (CSF), and clinical measures acquired serially over two to three years. The aims of the three studies differed.

Herholz 2011 demonstrated the validity of ¹⁸F‐FDG PET scores as an imaging biomarker for clinical trials to prevent dementia in people with MCI. Longitudinal ADNI data showed that PET scores provide much higher test–retest reliability than the Alzheimer's Disease Assessment Scale (ADAS‐cog), which is the most frequently used outcome measure in dementia trials. The authors argued that PET scan evaluation may also provide a useful measure of disease progression, because the power it provides for one‐year studies in people with MCI is similar to that of two‐year studies based on progression of ADAS‐cog scores.

Landau 2010 evaluated the prognostic ability of genetic, CSF, neuroimaging, and cognitive measurements obtained in the same participants. The authors concluded that baseline ¹⁸F‐FDG PET and episodic memory predict conversion to ADD, whereas P‐tau181p/Aβ1‐42 and, marginally, ¹⁸F‐FDG PET predict longitudinal cognitive decline.

Schmand 2012 examined the value of neuropsychological assessment, structural MRI, CSF biomarkers, and ¹⁸F‐FDG PET scanning with respect to prediction of conversion from MCI to ADD. The authors’ practical conclusions were that these markers are not very useful, either with respect to the diagnosis of ADD beyond the age of 75, or with respect to the prediction of conversion to ADD within a few years. In younger people, all four techniques were equally informative, except for ¹⁸F‐FDG PET, which lost its predictive potential.

We created 2 x 2 tables cross‐relating index test results with the reference standard for each ADNI study. The sensitivity and specificity values varied across those studies. All three studies used a combination of visual inspection and quantitative (rCGMr) evaluation of the PET scan, but the threshold, brain regions, image scaling and analysis used differed between them. The number of participants with a positive ¹⁸F‐FDG test at baseline also varied (Table 5). Although the samples were recruited from the same ADNI cohort, it appears that test accuracy varies with the characteristics of the participants and of the index test. We included the largest ADNI study (Herholz 2011) in the analysis.

The Chételat 2003 study assessed both the temporo‐parietal and posterior cingulate regions. We included data for the temporo‐parietal region. Although the posterior cingulate cortex represents the brain area in which hypometabolism occurs in the earliest disease stage (Lucignani 2006), the bilateral temporo‐parietal hypometabolism is the standard ¹⁸F‐FDG PET finding, which is also highly correlated with the pathologic diagnosis of ADD (Hoffman 2000). Moreover, it has been suggested that hypometabolism or hypoperfusion or both in the inferior parietal lobules are the most reliable functional indicators of progression from amnestic MCI to ADD, while changes in the posterior cingulate cortex are most likely non‐specific (Schroeter 2009).

In addition, in the Pardo 2010 study two independent blinded raters with experience in PET evaluated each image as normal or as having an Alzheimer's disease or fronto‐temporal dementia (FTD) pattern. The physicians usually based the decision on their own visual‐qualitative assessment. We therefore included data from Reader 1 because it is very likely that Reader 1 was more experienced in interpreting brain PET scans.

To explore whether the choice of ADNI study, discriminating brain region or reader assessment made any difference to the pooled estimate, we performed five additional analyses. The estimated sensitivity values at the median specificity of 82% were similar to the sensitivity value in our analysis (ranging from 74% to 77%).

In all studies, ¹⁸F‐FDG PET data evaluation involved qualitative (visual) assessment of the PET scans, and in some of them this visual analysis was supported by semi‐quantitative data (mainly rCGMr estimations) derived from quantification of brain PET images. In particular, 12 of the 16 studies applied a combination of qualitative and quantitative assessment, while the remaining four were based only on visual data analysis. This discrepancy introduces a degree of heterogeneity into our analysis. Moreover, as already mentioned, the PET studies applied different thresholds, both in the specific brain area examined and in the glucose metabolism ratio used to discriminate a pathological from a normal scan, which introduces a further degree of heterogeneity.

The use of quantification methods is not the norm in routine clinical practice. However, their deployment has become more popular in recent years with the development of new software, which renders quantification simpler. In general terms, there are two categories of quantification methods: the ‘traditional’ region of interest (ROI)‐based methods, which are manually operated, and the newer voxel‐based methods, which provide relatively automated results. The ROI‐based methods have the disadvantage of being operator‐dependent and time‐intensive. This approach requires a high level of neuro‐anatomical knowledge by the physician, which is particularly demanding in the case of ADD, in which several specific brain areas need to be assessed.

The newer voxel‐by‐voxel techniques (such as SPM (statistical parametric mapping), 3D‐SSP (3D stereotactic surface projection) and the PMOD software package, which were used in the majority of the studies in this meta‐analysis) permit normalisation of brain images into a stereotactic space, are less biased, and provide higher spatial resolution than the ‘classical’ ROI‐based semi‐quantitative methods. They increase the physician’s diagnostic accuracy in interpreting a brain PET scan, leading to more reliable results (Lucignani 2006; Patterson 2010; Silverman 2009). In this context, the application of different quantification strategies is another factor that could introduce heterogeneity between studies in this review.

Duration of follow‐up is also important in predicting conversion to ADD. The variability in the duration of follow‐up was considerable in the included studies (Table 4). The normal conversion rate of MCI to ADD is between 8% and 16% per year (Mitchell 2009), but the conversion rates in our review ranged from 22% to 50%. There was a positive correlation between follow‐up time and percentage of conversion. For example, in Berent 1999, Clerici 2009 and Ossenkoppele 2012a, the conversion rate from MCI was 50%, with the duration of follow‐up ranging from two to three years. On the other hand, the lowest conversion rates of 22% and 25% were found in the studies (Mosconi 2004; Fellgiebel 2007) with the shortest duration of follow‐up (mean 12 ± 0.6 and 19.6 ± 9.0 months respectively). Conversion rates from MCI could have influenced the test results. However, we were not able to formally investigate the effect of duration of follow‐up on the accuracy of the ¹⁸F‐FDG PET, due to a high level of heterogeneity and the small number of included studies.
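To see how an annual conversion rate relates to the cumulative percentages observed over longer follow‐up, the sketch below compounds the 8% and 16% annual rates quoted above over one to three years. It assumes a constant annual rate applied to those who have not yet converted, which is a simplification introduced for illustration and not an analysis performed in the review.

```python
def cumulative_conversion(annual_rate, years):
    """Cumulative proportion converting, assuming a constant annual rate."""
    return 1 - (1 - annual_rate) ** years


for annual_rate in (0.08, 0.16):
    for years in (1, 2, 3):
        cumulative = cumulative_conversion(annual_rate, years)
        print(f"annual {annual_rate:.0%}, {years} year(s): {cumulative:.1%} converted")
```

Under this assumption the cumulative conversion after two to three years lies between roughly 15% and 41%, which shows why studies with longer follow‐up would be expected to report higher proportions of converters.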

We used the QUADAS‐2 tool to assess methodological quality. We identified methodological issues in all the QUADAS‐2 domains. Assessment of quality is dependent on adequate reporting, and QUADAS‐2 scoring was challenging because of insufficient detail. Poor reporting about sampling procedures led mainly to an unclear risk of bias or contributed to a high risk of bias in the participant selection domain. Although the reference standard was regarded as adequate to correctly classify the target condition, poor reporting on blinding of dementia assessors led to an unclear risk of bias in the reference standard domain in the majority of included studies.

¹⁸F‐FDG PET for other forms of dementia

We were not able to evaluate the accuracy of the index test for conversion from MCI to non‐Alzheimer’s disease dementia, due to insufficient data. Only five included studies (Clerici 2009; Galluzzi 2010; Ossenkoppele 2012a; Ossenkoppele 2012b; Pardo 2010) reported a small number of converters to non‐Alzheimer’s dementia. Clerici 2009 and Galluzzi 2010 each reported six converters (in total, two FTD; four Lewy body dementia (LBD); six non‐Alzheimer's disease non‐specified). The remaining three studies reported only one converter each (three FTD).

As a result of the information available from these five studies, we considered the new target condition (Figure 6 and Figure 7). We did not perform meta‐analysis because the studies were few and small, and there was considerable heterogeneity. Our review therefore suggests that there is inadequate evidence available at present to address the accuracy of the ¹⁸F‐FDG PET scan to identify those people with MCI who will convert to all types of dementia (combined Alzheimer's and non‐Alzheimer's disease dementia).

Strengths and weaknesses of the review

One strength of our review is that the included studies probably represent the majority of studies on this question. We conducted an extensive electronic search, and where a study did not present all relevant data we contacted the study authors and obtained usable data for seven studies (Anchisi 2005; Clerici 2009; Galluzzi 2010; Landau 2010; Ossenkoppele 2012a; Ossenkoppele 2012b; Schmand 2012).

Our review has some limitations. First, the clinical diagnosis of ADD or other forms of dementia is imperfect, so that the findings from studies with post‐mortem confirmation of the diagnosis are more convincing than those from studies with a clinical diagnosis in the evaluation of the accuracy of PET imaging for the early detection of the dementia process in people with MCI.

The findings are based on studies with poor reporting, and the majority of included studies were at unclear risk of bias, mainly in the reference standard and participant selection domains. The studies were relatively homogeneous in their scan acquisition protocols, the process followed, and the underlying evaluation approach, namely that people showing reduced ¹⁸F‐FDG uptake in particular brain regions are potential MCI converters to ADD. Nevertheless, in the index test domain more than 50% of studies were of poor methodological quality because they lacked a prespecified threshold. Index tests that require subjective interpretation (such as ¹⁸F‐FDG PET) are at higher risk of bias in the index test domain than more objective tests with widely accepted thresholds. The Pardo 2010 study illustrates this: agreement between the two experienced raters was poor when baseline ¹⁸F‐FDG PET scans, classified by PET pattern (normal, ADD, or FTD), were compared with the diagnosis at three‐year follow‐up.

Due to the limited number of included studies and the meta‐analytic techniques, we were unable to formally assess the sources of heterogeneity, or to disentangle the reasons for the test accuracy results varying between studies, so even those factors that we have been able to test cannot be excluded as reasons for the heterogeneity.

Applicability of findings to the review question

We had no concerns about whether the included participants and setting, the conduct and interpretation of the index test, and the target condition (as defined by the reference standard) in each of the included studies addressed the review question: Could an ¹⁸F‐FDG PET scan predict whether people with MCI would convert clinically to dementia? However, due to the limited number of included studies and the levels of heterogeneity in the three domains mentioned above, it was difficult to determine to what extent the findings from this meta‐analysis can be applied to clinical practice.

Figure 1. Study flow diagram.

Figure 2. Risk of bias and applicability concerns summary: review authors' judgements about each domain for each included study

Figure 3. Risk of bias and applicability concerns graph: review authors' judgements about each domain presented as percentages across included studies

Figure 4. Forest plot of ¹⁸F‐FDG PET Conversion from MCI to AD (with Reader 1 Pardo 2010, Chételat 2003 temporo‐parietal region and Herholz 2011 ADNI study).

Figure 5. Summary ROC plot of ¹⁸F‐FDG PET Conversion from MCI to AD (with Reader 1 Pardo 2010, Chételat 2003 temporo‐parietal region and Herholz 2011 ADNI study).

Figure 6. Forest plot of ¹⁸F‐FDG PET Conversion from MCI to All dementia (with Reader 1 Pardo 2010).

Figure 7. Summary ROC plot of ¹⁸F‐FDG PET Conversion from MCI to All dementia (with Reader 1 Pardo 2010).

Test 1. ¹⁸F‐FDG PET Conversion from MCI to AD (with Reader 1 Pardo 2010, Chételat 2003 temporo‐parietal region and Herholz 2011 ADNI study).

Test 2. ¹⁸F‐FDG PET Conversion from MCI to AD (with Reader 1 Pardo 2010, Chételat 2003 temporo‐parietal region and Landau 2010 ADNI study).

Test 3. ¹⁸F‐FDG PET Conversion from MCI to AD (with Reader 1 Pardo 2010, Chételat 2003 temporo‐parietal region and Schmand 2012 ADNI study).

Test 4. ¹⁸F‐FDG PET Conversion from MCI to AD (with Reader 1 Pardo 2010, Chételat 2003 posterior cingulate region and Herholz 2011 ADNI study).

Test 5. ¹⁸F‐FDG PET Conversion from MCI to AD (with Reader 2 Pardo 2010, Chételat 2003 temporo‐parietal region and Herholz 2011 ADNI study).

Test 6. ¹⁸F‐FDG PET Conversion from MCI to AD (with Reader 2 Pardo 2010, Chételat 2003 posterior cingulate region and Herholz 2011 ADNI study).

Test 7. ¹⁸F‐FDG PET Conversion from MCI to All dementia (with Reader 1 Pardo 2010).

Test 8. ¹⁸F‐FDG PET Conversion from MCI to All dementia (with Reader 2 Pardo 2010).

Summary of findings table

What is the diagnostic accuracy of the ¹⁸F‐FDG PET biomarker for detecting Alzheimer's disease, and predicting progression to dementia, in people with mild cognitive impairment?

Descriptive

Participant population

Participants diagnosed with MCI at baseline using any of the Petersen criteria, CDR = 0.5, or any of the 16 definitions included by Matthews 2008

Sources of referral

  • GP surgeries or specialists or other institutions or self referral (n = 3)

  • Not reported (n = 13)

MCI criteria

  • Petersen criteria (with or without CDR = 0.5 criterion) (n = 14)

  • AAMI criteria (n = 1)

  • Global Deterioration Scale (n = 1)

Sampling procedure

  • Consecutive or random (n = 3)

  • Unclear (n = 13)

Prior testing

The only testing prior to performing the ¹⁸F‐FDG PET scans was the application of diagnostic criteria for identifying participants with MCI

Sources of recruitment

  • University departments, clinics or research centres (n = 7)

  • Multicentre, not specified (n = 3)

  • Outpatient memory clinic or outpatients (sources not specified) or Geriatric, Research, Education and Medical centre (n = 3)

  • Not reported (n = 3)

Index tests

¹⁸F‐FDG PET

Threshold prespecified at baseline

  • Yes (n = 6)

  • No (n = 10)

PET scan interpretation

  • Combination of visual inspection and quantitative (rCGMr) evaluation (n = 12)

  • Only visual PET scan interpretation (n = 4)

Threshold

Almost all included studies referred to ratios of cerebral glucose metabolism (rCGMr) and not to absolute numbers. They used a range of different thresholds (different brain regions studied as potential AD areas as well as different scaling).

Most of the studies (12/16) performed PET analysis based on the combination of visual analysis (qualitative) and rCGMr estimations (quantitative). The rest (4/16) only referred to visual PET inspections (qualitative‐only analysis).

¹⁸F‐FDG hypometabolism regions

Authors used brain regions that are expected to be affected by AD. Accordingly, all studies assessed the temporo‐parietal lobes, and most of them (12/16) also included posterior cingulate metabolism in their assessment. Seven studies also included part of the frontal lobes in their evaluations.

Reference standard

For Alzheimer’s disease dementia:

  • NINCDS‐ADRDA (n = 13)

  • Other (n = 3)

Target condition

Conversion from MCI to Alzheimer’s disease dementia or any other forms of dementia.

Included studies

Prospectively well‐defined cohorts with any accepted definition of MCI (as above). We identified 16 included studies (N = 697 participants), 3 of which included ADNI participants. The analysis included 421 participants from 14 studies.

Quality concerns

QUADAS‐2 scoring was challenging due to insufficient details. Poor reporting of the sampling procedure led mainly to unclear risk of bias, or contributed to high risk of bias, in the participant selection domain. Although the reference standard was regarded as adequate to correctly classify the target condition, poor reporting on blinding of dementia assessors led to unclear risk of bias in the reference standard domain in most of the included studies. In the index test domain, more than 50% of studies were of poor methodological quality due to lack of a prespecified threshold.

Limitations

Limited investigation of heterogeneity due to insufficient number of studies. Lack of common thresholds.

| Test | Studies | Cases/Participants | Median specificity from included studies (%) | Sensitivity (95% CI)¹ at median specificity (%) | Median percentage converting (range)² | Consequences in a cohort of 1000: missed cases³ | Consequences in a cohort of 1000: over‐diagnosed³ |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Alzheimer's disease dementia | | | | | | | |
| ¹⁸F‐FDG PET | 14 | 150/421 | 82 | 76 (54 to 90) | 38.5 (22 to 50) | 92 | 111 |

Key feature of the results:

The results of the included studies show a great deal of heterogeneity, encompassing both values which would render the technology “useless” and some which indicate a valuable diagnostic tool. The sensitivity values ranged from 25% to 100%, while the specificity values ranged from 15% to 100%. The values for both sensitivity and specificity were mainly over 80% in 7 included studies. In the remaining 7 studies those values were less than 80% or a sensitivity value higher than 80% was accompanied by a very low specificity value and vice versa.

Investigation of heterogeneity:

We investigated the effect on the test results of the method of PET scan interpretation (combination of visual inspection and quantitative rCGMr evaluation versus visual interpretation only) and of the use of a prespecified threshold. Neither had an impact on our findings.

The remaining planned investigations (e.g. the effect of spectrum of participants, referral centres, ¹⁸F‐FDG reduction brain regions, inadequate blinding or loss to follow‐up) were not possible due to the limited number of studies available for each analysis. We conducted sensitivity analyses for type of clinical diagnosis for MCI and for type of reference standard. There was no impact on our findings.

Conclusions:

Given the considerable variability of specificity values, the heterogeneity in the conduct and interpretation of the test, and the lack of defined thresholds for determination of test positivity in the included studies, the current evidence does not support the routine use of ¹⁸F‐FDG PET scans in clinical practice. The ¹⁸F‐FDG PET scan is a high‐cost investigation, and it is therefore important to clearly demonstrate its accuracy and to standardise the process of the ¹⁸F‐FDG PET diagnostic modality before it is widely used. Future studies with more uniform approaches to thresholds, analysis and study conduct may provide a more homogeneous estimate than the one available from the included studies we have identified.

1 Meta‐analytic estimate of sensitivity derived from the HSROC model at a fixed value of specificity. We did not compute summary estimates of sensitivity and specificity because the studies that contributed to the estimation of the summary ROC curve used various thresholds.

2 We derived the median proportion converting (reported as a percentage) and range using all the studies included in the analysis for each target condition.

3 We computed missed and over‐diagnosed numbers using the median proportion converting to each target condition.
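
The arithmetic behind footnote 3 can be sketched for a hypothetical cohort. This is a minimal illustration, assuming that missed cases are converters the test fails to flag and that over‐diagnosed cases are non‐converters the test flags. The function name `consequences_per_cohort` and the rounding to whole people are illustrative choices, not taken from the review.

```python
def consequences_per_cohort(prevalence_pct, sensitivity_pct, specificity_pct, cohort=1000):
    """Return (missed, over_diagnosed) counts for a hypothetical cohort.

    Assumption (not stated in the review): missed cases are false negatives
    and over-diagnosed cases are false positives, rounded to whole people.
    """
    converters = cohort * prevalence_pct / 100
    non_converters = cohort - converters
    missed = converters * (100 - sensitivity_pct) / 100
    over_diagnosed = non_converters * (100 - specificity_pct) / 100
    return round(missed), round(over_diagnosed)


# Median percentage converting 38.5%, sensitivity 76% at median specificity 82%.
missed, over_diagnosed = consequences_per_cohort(38.5, 76, 82)
print(f"missed: {missed}, over-diagnosed: {over_diagnosed}")
```

Working in percentages rather than fractions keeps the intermediate products exact for these inputs, which avoids surprises when rounding.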

Table 1. Classification systems for describing mild cognitive impairment according to Matthews 2008

| System | Description | Impairment |
| --- | --- | --- |
| Age‐related cognitive change | | |
| ACMI | Age‐consistent memory impairment | Memory aging in accord with normative expectations; individual aged 50 to 79 who reports a decline in memory verified using objective memory testing performance (within ± 1 standard deviation of aged norms on 75% of tests (memory) administered); preserved general functioning. |
| ARCD | Age‐related cognitive decline | Objectively identified decline in memory and cognitive functioning considered to be a normal consequence of aging. |
| Category systems | | |
| SMC | Subjective memory complaint | Self‐reported decline in memory. |
| MMSE MCI | Mini‐Mental State Examination | Classification based on MMSE cut‐offs (maximum score 30). 'Normal' defined as a score from 27 to 30; MCI defined as a score from 22 to 26; and 'impaired' defined as a score < 21. |
| Pathological decline | | |
| MNCD | Mild neurocognitive disorder | Impairment resulting from a general medical condition; reported decline in cognitive functioning supported by formal testing; deficits observed in at least 2 areas of cognitive functioning; interference in social, occupational, or other areas of functioning. |
| CIND | Cognitive impairment no dementia | Cognitive impairment in one or more domains (including memory and non‐memory domains) that can have a variety of aetiologies. |
| BSF | Benign senescent forgetfulness | Impairment in remote memory and intact recent memory; awareness of the impairment and use of compensatory strategies. |
| AAMI | Age‐associated memory impairment | Subjective and objective memory loss associated with normal aging; individual aged 50 and older who shows adequate intellectual function (i.e. without dementia). |
| MD | Minimal dementia | Cognitive impairment in memory and minor and variable errors in orientation; no evidence of impaired occupational or social functioning; self care unimpaired. |
| LCD | Limited cognitive disturbance | Reported decline in memory with use of compensatory strategies; occasional forgetfulness (e.g. names, places); 1 or 2 errors on cognitive (memory) testing. |
| QD | Questionable dementia | Impaired memory and non‐memory test performance; no significant activity of daily living or instrumental activity of daily living interference. |
| AACD | Age‐associated cognitive decline | Self‐ or informant report of cognitive decline (gradual and present for 6 months); objective difficulties in any of the following domains: learning and memory, attention and concentration, thinking, language, and visuospatial functioning. |
| MCD | Mild cognitive disorder | Decline in cognitive performance, including memory impairment and learning or concentration difficulties; complaint confirmed by cognitive tests; may precede, accompany, or follow a wide variety of infections or physical disorders. |
| Mayo Clinic Criteria | | |
| N‐MCI | Non‐amnestic mild cognitive impairment | Subjective memory complaint; normal general cognitive function; normal range of activities of daily living; objective impairment in 1 or more domains other than memory. |
| A‐MCI | Amnestic mild cognitive impairment | Subjective memory complaint; normal general cognitive function; normal range of activities of daily living; impaired memory performance; normal non‐memory test performance. |
| M‐MCI | Multiple mild cognitive impairment | Subjective memory complaint; normal general cognitive function; normal range of activities of daily living; impaired memory and non‐memory test performance. |

Table 2. Criteria for the diagnosis of mild cognitive impairment

Petersen 1999

  1. Memory complaint, preferably corroborated by an informant.

  2. Impaired memory function for age and education.

  3. Preserved general cognitive function.

  4. Intact activities of daily living.

  5. Not demented.

Petersen 2004

 

4 subtypes of MCI were identified:

  1. Amnesic mild cognitive impairment, single domain – isolated memory impairment of more than 1.0 SD compared with the age‐ and education‐specific norms, and no difficulty in any other area of cognitive functioning.

  2. Amnesic mild cognitive impairment, multiple domain – 2 or more cognitive domains are impaired, 1 of which is memory impairment (impairment of more than 1.0 SD below the mean of the respective age‐ and education‐matched population).

  3. Non‐amnesic mild cognitive impairment, single domain – impairment in a single domain other than memory of more than 1.0 SD.

  4. Non‐amnesic mild cognitive impairment, multiple domains – impairments in 2 or more domains of more than 1.0 SD but no memory impairment.

 

All 4 subtypes of mild cognitive impairment also have to meet the following criteria:

a) the presence of a complaint about memory – participants or informants (or both) reporting memory impairment.

b) intact ability to perform activities of daily living – forgetfulness not compromising overall functional ability; impairment owing to physical disease not sufficient for exclusion.

c) absence of dementia – assessed by DSM‐IV criteria.

Morris 1993

Clinical Dementia Rating (CDR) is a scale used to classify patients along a continuum from normal aging through Alzheimer’s disease. It runs from normal (CDR 0) through questionable dementia or MCI (CDR 0.5) to mild (CDR 1), moderate (CDR 2), and severe (CDR 3) dementia.

The patient's cognitive and functional performance is assessed in 6 areas: memory, orientation, judgement and problem solving, community affairs, home and hobbies, and personal care. Scores in each of these areas are combined to obtain the total score.

MCI: mild cognitive impairment
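
The CDR stages described under Morris 1993 can be expressed as a simple lookup. This is a minimal sketch that maps an already computed global CDR score to its stage label. The names `CDR_STAGES` and `cdr_stage` are hypothetical, and the sketch does not implement how the six area scores are combined into the total score.

```python
# Global CDR score -> stage label, as listed in the table entry for Morris 1993.
CDR_STAGES = {
    0: "normal",
    0.5: "questionable dementia or MCI",
    1: "mild dementia",
    2: "moderate dementia",
    3: "severe dementia",
}


def cdr_stage(score):
    """Return the stage label for a global CDR score, or raise ValueError."""
    try:
        return CDR_STAGES[score]
    except KeyError:
        raise ValueError(f"not a valid global CDR score: {score!r}") from None


# CDR = 0.5 is the value several included studies used to define MCI.
print(cdr_stage(0.5))
```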

Table 3. Demographic and participant characteristics of participants included in analysis

Study

N / n (included in analysis)

Age converters (non‐converters)

Gender

(M %)

MMSE score

converters (non‐converters)

APOE ε4 carrier (%)

MCI diagnostic criteria

Sampling

Sources of referral

Sources of recruitment

(setting)

Anchisi 2005

(Italy)

67/48

71.1 ± 73.9

(65.0 ± 9.0)

25 (52.1)

26.6 ± 1.7

(28.4 ± 1.1)

Not reported

Petersen 2010

Consecutive sample

(email from the author on 22nd October 2013)

GP surgeries

(email from the author on 22nd October 2013)

Outpatients from

4 University Departments (Milan, Brescia, Cologne and Dresden)

(email on 22nd October 2013)

Arnáiz 2001

(Sweden)

20/20

64.9 ± 8.3

(60.1 ± 8.4)

12 (60.0)

26.7 ± 1.8

(27.2 ± 2.9)

Not reported

Global Deterioration Scale

Consecutive sample

Not reported

Geriatric University clinic

Berent 1999

(USA)

20/20

70.2 ± 5.5

Total sample

13 (65.0)

26.0 ± 1.9

Total sample

Not reported

AAMI

criteria

Not

reported

Not reported

Cognitive disorders University clinic

Chételat 2003

(France)

19/17

73.0 ± 5.1

(67.8 ± 7.0)

8 (45.2)

26.3 ± 1.0

(27.8 ± 1.2)

Not reported

Petersen 2001

Not

reported

Not reported

Not reported

Clerici 2009

(Italy)

30/26

74.2 ± 6.9

Total sample

10 (33.3)

26.2 ± 1.7

Total sample

Not reported

Petersen 2004 and CDR = 0.5 criteria

Not

reported

GP surgeries or specialists (85%) or self referral (15%)

University Centre for Research and Treatment

Drzezga 2005

(Germany)

30/30

74.7 ± 4.7

(67.6 ± 2.0)

14 (46.7)

25.9 ± 2.1

(27.6 ± 1.5)

17 (56.7)

Petersen 1999 and CDR = 0.5 criteria

Not

reported

GP surgeries or

specialists or other institutions

University Research Unit

Fellgiebel 2007

(Germany)

16/16

69.5 ± 7.9

(68.8 ± 10.0)

9 (56.2)

24.3 ± 1.5

(27.3 ± 1.8)

Not reported

Petersen 1999

Consecutive sample

Not reported

University memory clinic

Galluzzi 2010

(Italy)

90/38

72.0 ± 7.1

Total sample

37 (41.1)

26.1 ± 1.8

Total sample

35 (38.9)

Petersen 1999

Consecutive sample

Not reported

Outpatient memory clinic

Herholz 2011

(USA)

ADNI participants

94/94

75.0 ± 7.6

Total sample

66 (70.2)

26.71 ± 1.6

Total sample

Not reported

Petersen 2010 &

CDR = 0.5

Not

reported

Not reported

Multicentre (not specified)

Landau 2010

(USA)

ADNI participants

85/85

78.3 ± 7.5

(78.0 ± 7.4)

56 (65.9)

26.4 ± 1.7

(27.3 ± 1.6)

25 (29.4)

Petersen 2010 & CDR = 0.5

Not reported

Not reported

Multicentre (not specified)

Mosconi 2004

(Italy)

37/37

69.0 ± 4.0

(63.0 ± 8.0)

17 (45.9)

23.9 ± 1.7

(28.1 ± 1.6)

16 (43.2)

Petersen 2010

Not

reported

Not reported

Not reported

Nobili 2008

(Italy)

36/33

77.3 ± 4.8

(74.6 ± 5.4)

13 (39.4)

69.0 ± 4.0

(63.0 ± 8.0)

Not reported on all MCI included in analysis

Petersen 2004

Not

reported

Not reported

Outpatients

Ossenkoppele 2012a

(Netherlands)

15/12

67.0 ± 7.0

Sample analysed

9 (75%)

Sample analysed

75.0 ± 7.6

Sample analysed

8 (66.7)

Sample analysed

Petersen 1999

Not

reported

Not

reported

Not reported

Ossenkoppele 2012b

(Netherlands)

30/12

64.0 ± 9.0

Total sample

23 (30.4)

Total sample

75.0 ± 7.6

Total sample

Not reported

Petersen 2001

Not

reported

Not

reported

Outpatient University memory clinic

Pardo 2010

(USA)

19/18 Reader 1

19/19 Reader 2

Mean 80.0 (range 54 ‐ 83)

Total sample

Not reported

Not reported

Not reported

Petersen 1999

Not

reported

Not reported

Geriatric, Research, Education and Medical Centre

Unclear

Schmand 2012

(Netherlands)

ADNI participants

89/89

Not reported on the sample with ¹⁸F‐FDG scan (N = 89)

Not reported on the sample with ¹⁸F‐FDG scan (N = 89)

Not reported on the sample with ¹⁸F‐FDG scan (N = 89)

Not reported on the sample with ¹⁸F‐FDG scan (N = 89)

Petersen 2010 &

CDR = 0.5

Not

reported

Not reported

Multicentre (not specified)

ADNI: Alzheimer's Disease Neuroimaging Initiative

Table 4. Index test and numbers of converters to Alzheimer’s disease dementia

Study

Threshold

(prespecified

Yes/No)

Image scaling

Discriminating brain area

Image analysis

Time between FDG injection and PET acquisition (min)

¹⁸F‐FDG dose

Number of ¹⁸F‐FDG positive

(%)

Number of converters (%)

Duration of follow‐up

Mean (months) / Maximum (years)

Anchisi 2005

(Italy)

rCGMglc of temporo‐parietal and posterior cingulate of 1.138

(No)

Regional sensorimotor

FDG uptake ratio (p 1730)

Bilateral parietal and posterior cingulate cortex

SPM99

Not reported

Not

reported

19 (40)

(calculated in RevMan5)

14 (29)

Median: 12 months

Range: 12 ‐ 27 months

Arnáiz 2001

(Sweden)

rCGMglc of left temporo‐parietal region 13 mm above the basal ganglia

(Model I)

(No)

Sensorimotor

area of the cortex 26 mm above the level of the basal ganglia (p 852)

Temporo‐parietal cortex

SPSS

(Herholz 1999)

60

Not

reported

8 (40)

(calculated in RevMan5)

9 (45)

36.5 months

Berent 1999

(USA)

rCGMglc of

diagnostic index based on Z‐scores of the parietal cortex

(No)

Thalamus (p 11)

Frontal, temporal, parietal and occipital regions normalised to the thalamus

3D‐SSP

Not reported

370 MBq

10 (50)

10 (50)

3 years

Chételat 2003

(France)

rCGMglc at Z‐score of > 3.09

Thresholding was set at 80% of whole brain mean of control subjects

(No)

FDG uptake normalised by and adjusted to the person’s global uptake (p 1375)

Right temporo‐parietal and posterior cingulate

SPM99

Not reported

Not

reported

7 (41)

right temporo‐parietal region

8 (47.0) posterior cingulate

7 (41)

18 months

Clerici 2009

(Italy)

rCGMglc lower than the control group (corresponding to a P value < 0.01)

(Yes)

Global counts were normalised by proportional scaling to remove confounding effects due to global changes

(Del Sole 2008)

Posterior gyrus cingulate and bilateral inferior frontal cortex

SPM(t)

45

185‐370 MBq

23 (88.5)

13 (50)

1.5 years aMCI group.

3 years snaMCI group

Drzezga 2005

(Germany)

rCGMglc at

Z‐score of > 1.64 (1‐tail) corresponding to a P value of 0.05 (1‐tail)

(Yes)

Not

reported

Orbitofrontal, prefrontal, premotor, central, parietal superior and inferior, occipital, temporal anterior, temporal posterior and posterior cingulate

3D‐SSP

30

370 MBq

13 (43)

12 (40)

16 ± 2 months

Fellgiebel 2007

(Germany)

rCGMglc at

significantly decreased Z‐score > 2 in more than 50 adjacent pixels

(Yes)

Sensorimotor

area of the cortex (transaxial images parallel to the intercommissural line)

(Fellgiebel 2004)

Parietal mesial or posterior cingulate and temporal regions

SPSS

(Fellgiebel 2004)

30

(Fellgiebel 2004)

180 MBq (Fellgiebel 2004)

7 (44)

4 (25)

19.6 ± 9.0

Galluzzi 2010

(Italy)

rCGMglc of

t sum > 11.090 (email from the author)

(Herholz 2002)

(Yes)

Cerebellum

Temporo‐parietal, hippocampus and posterior cingulate

SPSS

Not reported

Not

reported

28 (74)

14 (37)

20.6 ± 9.7

Herholz 2011

(USA)

ADNI participants

rCGMglc of

t sum > 11.090

(Herholz 2002)

(Yes)

Global cortex

Temporal and parietal lobes

PALZ (PMOD software)

30 ‐ 60

Not

reported

38 (40)

30 (32)

2 years

Landau 2010

(USA)

ADNI participants

rCGMglc of 1.21

(No)

Cerebellar vermis and pons

Regions of interest (ROIs) were study‐independent and frequently associated with decline in AD and MCI. No further details.

SPM5

30 ‐ 60

Not

reported

51 (60)

28 (33)

1.9 ± 0.4 years Range: 2 ‐ 3 years

Mosconi 2004

(Italy)

rCMRglc significantly reduced in certain cerebral areas with emphasis on the inferior parietal lobule (IPL).

(No)

Global cortex

Precuneus, anterior and posterior cingulate, inferior parietal lobe, superior, middle and inferior frontal gyrus, on both hemispheres

SPM99

19 ± 3

110 ‐ 370 MBq

4 (10.5)

8 (22)

12 ± 0.6

Nobili 2008

(Italy)

Visual inspection

rCGMglc threshold not reported

(No)

Global cortex

25 VROI in each hemisphere

Computerized Brain Atlas (CBA;

Applied Medical Imaging©, Uppsala, Sweden)

45

370 MBq

11 (33)

11 (33)

21.1 ± 10.9 months

Ossenkoppele 2012a

(Netherlands)

Visual inspection and SUVr of ROIs (threshold not reported) (No)

Cerebellar grey matter

Frontal, parietal and latero‐temporal and medial temporal lobes and posterior cingulate

PMOD Alzheimer discrimination tool (PALZ)

45 ‐ 60

150 ± 17 MBq

4 (33)

4 (33)

30

Range: 2 ‐ 4 years

Ossenkoppele 2012b

(Netherlands)

Visual inspection and SUVr of ROIs (threshold not reported) (No)

Cerebellar grey matter (p 3)

Frontal, parietal, occipital, and latero‐temporal and medial temporal lobes and posterior cingulate

PMOD Alzheimer discrimination tool (PALZ)

45‐60

185 MBq

5 (42)

6 (50)

2 years

Pardo 2010

(USA)

Visual inspection

Hypometabolism if < 50% of the cubes had the label MCI or normal

(Only SVM analysis used thresholds)

(No)

PET scans were adjusted to a whole‐brain mean activity and stereotactically normalised by using Neurostat (p 328)

Frontal, parietal, occipital, and latero‐temporal and medial temporal lobes and posterior cingulate

SVM

Not reported

5 mCi/70 kg

Reader 1: 6 (32)

Reader 2: 10 (53)

8 (44)

9 (47)

3 years

Schmand 2012

(Netherlands)

ADNI participants

rCGM value of < 1.20

(Email from the author)

(Yes)

Not

reported

Right and left angular gyrus, bilateral posterior cingulate gyrus and left middle/inferior temporal gyrus

SPSS

Not reported

Not

reported

18 (20)

38 (43)

2.7 ± 0.9

Range: 0.5 ‐ 4.6 years

ADNI: Alzheimer's Disease Neuroimaging Initiative
RevMan5: Review Manager 5 software
SUVr: standardised uptake value ratio
VROI: volumetric region of interest

Table 5. Summary of test accuracy at study level for conversion to Alzheimer’s disease dementia

| Study ID | Participants (n) | Sensitivity (%) | Specificity (%) | % of MCI with ¹⁸F‐FDG PET positivity |
| --- | --- | --- | --- | --- |
| Studies included in meta‐analysis | | | | |
| Anchisi 2005 | 48 | 93 | 82 | 40 |
| Arnáiz 2001 | 20 | 67 | 82 | 40 |
| Berent 1999 | 20 | 70 | 70 | 50 |
| Chételat 2003 (temporo‐parietal brain region) | 17 | 100 | 100 | 41 |
| Clerici 2009 | 26 | 92 | 15 | 88 |
| Drzezga 2005 | 30 | 92 | 89 | 43 |
| Fellgiebel 2007 | 16 | 100 | 75 | 44 |
| Galluzzi 2010 | 38 | 79 | 29 | 74 |
| Herholz 2011 (ADNI study) | 94 | 57 | 67 | 40 |
| Mosconi 2004 | 37 | 38 | 97 | 10.5 |
| Nobili 2008 | 33 | 82 | 91 | 33 |
| Ossenkoppele 2012a | 12 | 75 | 88 | 33 |
| Ossenkoppele 2012b | 12 | 83 | 100 | 42 |
| Pardo 2010 (Reader 1) | 18 | 25 | 60 | 32 |
| Studies included only in descriptive analysis | | | | |
| Chételat 2003 (posterior cingulate brain region) | 17 | 100 | 90 | 47 |
| Landau 2010 (ADNI study) | 85 | 75 | 47 | 60 |
| Pardo 2010 (Reader 2) | 19 | 33 | 30 | 53 |
| Schmand 2012 (ADNI study) | 89 | 24 | 82 | 20 |

ADNI: Alzheimer's Disease Neuroimaging Initiative
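
The study‐level values in Table 5 can be summarised directly. The sketch below transcribes the sensitivity and specificity of the 14 studies listed under "Studies included in meta‐analysis" and derives their ranges and the median specificity. The variable names are illustrative. It gives descriptive summaries only and does not reproduce the HSROC model used for the meta‐analytic estimate.

```python
from statistics import median

# (sensitivity %, specificity %) for the 14 studies in the meta-analysis,
# transcribed from Table 5.
META_ANALYSIS = {
    "Anchisi 2005": (93, 82),
    "Arnáiz 2001": (67, 82),
    "Berent 1999": (70, 70),
    "Chételat 2003 (temporo-parietal)": (100, 100),
    "Clerici 2009": (92, 15),
    "Drzezga 2005": (92, 89),
    "Fellgiebel 2007": (100, 75),
    "Galluzzi 2010": (79, 29),
    "Herholz 2011": (57, 67),
    "Mosconi 2004": (38, 97),
    "Nobili 2008": (82, 91),
    "Ossenkoppele 2012a": (75, 88),
    "Ossenkoppele 2012b": (83, 100),
    "Pardo 2010 (Reader 1)": (25, 60),
}

sensitivities = [sens for sens, _ in META_ANALYSIS.values()]
specificities = [spec for _, spec in META_ANALYSIS.values()]


def value_range(values):
    """Return the (minimum, maximum) of a list of percentages."""
    return min(values), max(values)


summary = {
    "n_studies": len(META_ANALYSIS),
    "sensitivity_range": value_range(sensitivities),
    "specificity_range": value_range(specificities),
    "median_specificity": median(specificities),
}
print(summary)
```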

Table Tests. Data tables by test

| Test | No. of studies | No. of participants |
| --- | --- | --- |
| 1 ¹⁸F‐FDG PET Conversion from MCI to AD (with Reader 1 Pardo 2010, Chételat 2003 temporo‐parietal region and Herholz 2011 ADNI study) | 14 | 421 |
| 2 ¹⁸F‐FDG PET Conversion from MCI to AD (with Reader 1 Pardo 2010, Chételat 2003 temporo‐parietal region and Landau 2010 ADNI study) | 14 | 412 |
| 3 ¹⁸F‐FDG PET Conversion from MCI to AD (with Reader 1 Pardo 2010, Chételat 2003 temporo‐parietal region and Schmand 2012 ADNI study) | 14 | 416 |
| 4 ¹⁸F‐FDG PET Conversion from MCI to AD (with Reader 1 Pardo 2010, Chételat 2003 posterior cingulate region and Herholz 2011 ADNI study) | 14 | 421 |
| 5 ¹⁸F‐FDG PET Conversion from MCI to AD (with Reader 2 Pardo 2010, Chételat 2003 temporo‐parietal region and Herholz 2011 ADNI study) | 14 | 422 |
| 6 ¹⁸F‐FDG PET Conversion from MCI to AD (with Reader 2 Pardo 2010, Chételat 2003 posterior cingulate region and Herholz 2011 ADNI study) | 14 | 422 |
| 7 ¹⁸F‐FDG PET Conversion from MCI to All dementia (with Reader 1 Pardo 2010) | 5 | 106 |
| 8 ¹⁸F‐FDG PET Conversion from MCI to All dementia (with Reader 2 Pardo 2010) | 5 | 107 |
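
The participant totals for Tests 1, 2, 3 and 5 differ only because one study or reader is swapped for another. The sketch below is a consistency check that uses the analysed sample sizes reported in Table 3. The dictionary and function names are illustrative, and the check is not a step described by the review authors.

```python
# Analysed sample sizes (n) from Table 3 for the 12 studies common to Tests 1-6.
COMMON = {
    "Anchisi 2005": 48, "Arnáiz 2001": 20, "Berent 1999": 20,
    "Chételat 2003": 17, "Clerici 2009": 26, "Drzezga 2005": 30,
    "Fellgiebel 2007": 16, "Galluzzi 2010": 38, "Mosconi 2004": 37,
    "Nobili 2008": 33, "Ossenkoppele 2012a": 12, "Ossenkoppele 2012b": 12,
}
# Only one ADNI study and one Pardo 2010 reader enter each analysis.
ADNI = {"Herholz 2011": 94, "Landau 2010": 85, "Schmand 2012": 89}
PARDO = {"Reader 1": 18, "Reader 2": 19}


def total_participants(adni_study, pardo_reader):
    """Total analysed participants for a given ADNI study and Pardo reader."""
    return sum(COMMON.values()) + ADNI[adni_study] + PARDO[pardo_reader]


totals = {
    "Test 1": total_participants("Herholz 2011", "Reader 1"),
    "Test 2": total_participants("Landau 2010", "Reader 1"),
    "Test 3": total_participants("Schmand 2012", "Reader 1"),
    "Test 5": total_participants("Herholz 2011", "Reader 2"),
}
print(totals)
```

Tests 4 and 6 swap only the Chételat 2003 brain region, which leaves the sample size unchanged, so their totals equal those of Tests 1 and 5.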
