Effective rating scale development for speaking tests: Performance decision trees
Journal contribution posted on 24.10.2012, 09:10 by N. Glenn Fulcher, J. Kemp, F. Davidson
Rating scale design and development for testing speaking is generally conducted using one of two approaches: the measurement-driven approach or the performance data-driven approach. The measurement-driven approach prioritizes the ordering of descriptors onto a single scale. Meaning is derived from the scaling methodology and the agreement of trained judges as to the place of any descriptor on the scale. The performance data-driven approach, on the other hand, places primary value upon observations of language performance, and attempts to describe performance in sufficient detail to generate descriptors that bear a direct relationship with the original observations of language use. Meaning is derived from the link between performance and description. We argue that measurement-driven approaches generate impoverished descriptions of communication, while performance data-driven approaches have the potential to provide richer descriptions that offer sounder inferences from score meaning to performance in specified domains. With reference to original data and the literature on travel service encounters, we devise a new scoring instrument, a Performance Decision Tree (PDT). This instrument prioritizes what we term ‘performance effect’ by explicitly valuing and incorporating performance data from a specific communicative context. We argue that this avoids the reification of ordered scale descriptors which we find in measurement-driven scale construction for speaking tests.