Using the Bayley-III to assess neurodevelopmental delay: which cut-off should be used?
Journal contribution posted on 08.07.2015, 11:11 by Samantha Johnson, T. Moore, N. Marlow
Background: As the latest edition of the Bayley Scales (Bayley-III) produces higher scores than its predecessor (BSID-II), there is uncertainty about how to classify moderate–severe neurodevelopmental delay. We have investigated agreement between classifications of delay made using the BSID-II and Bayley-III.

Methods: BSID-II Mental Development Index (MDI) and Bayley-III cognitive and language scales were administered in 185 extremely preterm (<27 wk) children. A combined Bayley-III score (CB-III) was computed. Agreement between delay classified using MDI scores <70 and various Bayley-III cut-offs was assessed.

Results: Bayley-III cognitive and language scores were close to the normative mean and were higher than BSID-II MDI scores. Nineteen (10.2%) children had MDI <70. Bayley-III scores <70 significantly underestimated the proportion with MDI <70. Bayley-III cognitive and language scores <85 had 99% agreement with MDI <70 and underestimated delay by 1.1%. CB-III scores <80 had 98% agreement and produced the same proportion with delay.

Conclusion: Bayley-III cognitive and language scores <85 or CB-III scores <80 provide the best definition of moderate–severe neurodevelopmental delay for equivalence with MDI <70. CB-III scores have the advantage of producing a single continuous outcome measure but require further validation. The relative accuracy of both tests for predicting long-term outcomes requires investigation.

The Bayley Scales are the most frequently used tests in infant developmental assessment. The second edition of the test, the BSID-II (1), was widely used as an outcome measure in epidemiological studies and randomized controlled trials of infant interventions. The sound psychometric properties of the BSID-II Mental Development Index (MDI), a composite measure of nonverbal cognitive and language development, engendered much professional confidence and the MDI rapidly became the gold standard for assessing neurodevelopmental outcome (2).
The third edition, the Bayley-III (3), was published in 2006 and has separate cognitive and language scales. Although strong correlations are reported between MDI scores and Bayley-III cognitive and language scores (4,5), concerns have arisen over how to interpret test scores. Bayley-III scores are up to 10 points higher than MDI scores (3,4,5,6). Mean scores of control groups are of a similar magnitude above the normative mean (7,8) and those of clinical populations are higher than anticipated (4,5,8). Thus there is concern that the Bayley-III underestimates developmental delay when conventional cut-offs are used (4,6,8,9,10,11). It is not yet clear, however, whether the Bayley-III underestimates developmental delay or the BSID-II overestimated it.

These issues have significant implications for research and clinical practice. Children with developmental problems may not be identified using the Bayley-III and may thus fail to be referred for intervention. For research, the underestimation of delay reduces statistical power in randomized controlled trials in which sample sizes have been calculated using prevalence estimates obtained from studies using the MDI. Some groups have opted to raise Bayley-III SD-banded cut-offs by 15 points (1 SD) to retain power for primary outcomes and to identify children in need of intervention (12). It has also been suggested that raising the cut-off by 10 points maximizes agreement with the MDI (5). However, there remains the problem of how to use Bayley-III scores to provide a relevant and practical composite outcome for identifying delay in research (13).

To address these issues, we have investigated the predictive value of different Bayley-III cut-offs for classifying neurodevelopmental delay as measured by the BSID-II MDI in children born extremely preterm.
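
As a rough illustration of the kind of comparison described above, the following Python sketch computes percent agreement between delay classified by MDI <70 and delay classified by a Bayley-III cut-off. It is not the authors' analysis code. The `Child` record, the function names, and the five score triples are hypothetical and serve only to make the example runnable. Two definitions are assumptions that the text does not state: that delay on the separate scales means either the cognitive or the language score falls below the cut-off, and that the CB-III is the mean of the two scores.

```python
from dataclasses import dataclass


@dataclass
class Child:
    mdi: float        # BSID-II Mental Development Index
    cognitive: float  # Bayley-III cognitive score
    language: float   # Bayley-III language score


def delayed_by_mdi(child, cutoff=70):
    """Reference classification: MDI below the cut-off."""
    return child.mdi < cutoff


def delayed_by_either_scale(child, cutoff):
    """ASSUMPTION: delay if either Bayley-III scale is below the cut-off."""
    return child.cognitive < cutoff or child.language < cutoff


def delayed_by_combined(child, cutoff):
    """ASSUMPTION: the combined score is the mean of the two scales."""
    return (child.cognitive + child.language) / 2 < cutoff


def agreement(children, rule, cutoff):
    """Compare a Bayley-III rule against MDI <70 across all children."""
    ref = [delayed_by_mdi(c) for c in children]
    test = [rule(c, cutoff) for c in children]
    n = len(children)
    return {
        "agreement": sum(r == t for r, t in zip(ref, test)) / n,
        "prop_delay_mdi": sum(ref) / n,
        "prop_delay_bayley3": sum(test) / n,
    }


# Made-up scores for illustration only; these are not study data.
children = [
    Child(mdi=60, cognitive=75, language=70),
    Child(mdi=65, cognitive=65, language=60),
    Child(mdi=80, cognitive=90, language=88),
    Child(mdi=95, cognitive=100, language=103),
    Child(mdi=72, cognitive=86, language=82),
]

for label, rule, cutoff in [
    ("either scale <70", delayed_by_either_scale, 70),
    ("either scale <85", delayed_by_either_scale, 85),
    ("combined <80", delayed_by_combined, 80),
]:
    print(label, agreement(children, rule, cutoff))
```

With real data, the same loop could be run over a range of cut-offs to find the one whose proportion with delay most closely matches that obtained with MDI <70.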