A Study to Investigate Intra-Judge Consistency in Adjudication of Large Instrumental Ensemble Performance

 

Robert L. McWilliams, Ph.D.

University of Wisconsin Oshkosh

mcwilrob@uwosh.edu

 

 

Abstract

The purpose of this study was to investigate intra-judge consistency in the adjudication of large ensemble performance. Nine experienced adjudicators from a large metropolitan area adjudicated the same set of recorded performances on two occasions separated by seven to ten days. The recordings consisted of three contrasting musical items. Listening conditions and time of day were replicated in both adjudication sessions. The adjudicators were not informed that they were adjudicating the same recordings and assumed that the two sessions involved different performances; they did not have access to adjudication sheets or other information from the previous session. The Band Adjudication Form (BAF) required the subjects to score the performances on a scale from 0 to 10 in seven discrete categories and to provide a "global assessment" on a scale from 1 ("poor") to 9 ("outstanding"). For analysis purposes, the seven individual criteria scores were summed to provide a "total" score (0 to 70).

Results showed that, with notable exceptions, the subjects generally exhibited a high degree of consistency within their own scores from one session to the next. Pearson's r correlations on the "total" scores ranged from .84 (item two) to .97 (item three). Analysis of each subject's individual criteria scores between adjudication sessions revealed that scores varied by 20% or more in 17.5% of the comparisons. Analysis of the "total" scores revealed a variation of 10% or more in 14.8% of the comparisons. The most striking exception to the overall trend of consistency was a variation of approximately 30% in three of the seven criteria on musical item two for one judge. While not the primary focus of this study, the data also indicated a relatively high degree of disparity when the judges’ scores were compared with each other (i.e., inter-judge comparisons).
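The arithmetic behind these figures can be illustrated with a minimal sketch. It is not the study's analysis procedure: the scores below are hypothetical, the function names are invented for illustration, and "variation" is assumed here to mean the absolute change between sessions expressed as a percentage of the maximum possible score, a definition the abstract does not specify.

```python
from math import sqrt

MAX_CRITERION = 10  # each BAF criterion is scored from 0 to 10
N_CRITERIA = 7      # seven discrete categories per performance


def total_score(criteria):
    """Sum the seven criterion scores into a 'total' score (0 to 70)."""
    if len(criteria) != N_CRITERIA:
        raise ValueError("expected seven criterion scores")
    return sum(criteria)


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def percent_variation(first, second, scale_max):
    """ASSUMED definition: absolute change as a percentage of the scale maximum."""
    return abs(first - second) * 100 / scale_max


# HYPOTHETICAL data: three judges scoring one musical item in two sessions.
session_1 = [[8, 7, 9, 6, 7, 8, 7], [5, 6, 5, 7, 6, 5, 6], [9, 9, 8, 9, 10, 9, 8]]
session_2 = [[7, 7, 9, 6, 8, 8, 7], [5, 4, 5, 7, 6, 6, 6], [9, 9, 8, 9, 9, 9, 9]]

totals_1 = [total_score(judge) for judge in session_1]
totals_2 = [total_score(judge) for judge in session_2]
r = pearson_r(totals_1, totals_2)

# Count criterion-level comparisons that changed by 20% or more of the scale.
comparisons = [
    percent_variation(a, b, MAX_CRITERION)
    for judge_1, judge_2 in zip(session_1, session_2)
    for a, b in zip(judge_1, judge_2)
]
flagged = sum(1 for v in comparisons if v >= 20)
flagged_share = flagged * 100 / len(comparisons)

print(f"r on totals: {r:.3f}")
print(f"{flagged} of {len(comparisons)} comparisons varied by 20% or more")
```

Under a different definition of variation (for example, change relative to the first session's score), the flagged counts would differ, so the choice of denominator should be stated whenever such percentages are reported.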

The differences revealed in intra-judge and inter-judge consistency clearly need further investigation. This study has shown that there is enough variability between adjudicators, and in some cases within an individual adjudicator's assessments, that point differentials large enough to significantly affect final outcomes do occur. The debate about the value of ranked competitions and rated or "comments-only" festivals must continue to take account of the vagaries of placing a numerical score on an artistic endeavor and the potential inconsistencies that seem to be an inevitable part of that process.

 

Robert L. McWilliams, Ph.D.

Director of Bands and Instrumental Music Education

University of Wisconsin Oshkosh Department of Music

800 Algoma Boulevard

Oshkosh, Wisconsin 54901

mcwilrob@uwosh.edu