A Study to Investigate Intra-Judge Consistency in Adjudication of Large
Instrumental Ensemble Performance
Robert L. McWilliams, Ph.D.
University of Wisconsin Oshkosh
Abstract
The purpose of
this study was to investigate intra-judge consistency in the adjudication of
large ensemble performance. Nine experienced adjudicators from a large
metropolitan area adjudicated the same sets of recorded performances on two
different occasions (separated by seven to ten days). The recordings consisted
of three contrasting musical items. Listening conditions and time of day were
replicated in both adjudication sessions. The adjudicators were not informed
that they were adjudicating the same recordings and assumed the two sessions
were different performances; they did not have access to adjudication
sheets or information from the previous session. The BAF (Band Adjudication
Form) required the subjects to score the performances on a scale from 0 to 10
in seven discrete categories in addition to providing a "global
assessment" on a scale from 1 ("poor") to 9
("outstanding"). For analysis purposes, the individual criteria
scores were summed to provide a "total" score.
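
As a minimal sketch of the scoring structure described above, the following Python represents one completed form and derives the "total" score by summing the seven criteria. The class name `BafSheet` and its field names are hypothetical, and the abstract does not name the seven criteria or state whether fractional scores were allowed, so the criteria are held as an unnamed list of numbers.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class BafSheet:
    """One judge's form for one musical item (hypothetical structure)."""

    criteria: List[float]      # seven category scores, each on a 0 to 10 scale
    global_assessment: int     # 1 ("poor") to 9 ("outstanding")

    def __post_init__(self) -> None:
        # Enforce the ranges stated in the abstract.
        if len(self.criteria) != 7:
            raise ValueError("expected exactly seven criteria scores")
        if any(not 0 <= score <= 10 for score in self.criteria):
            raise ValueError("each criterion score must lie between 0 and 10")
        if not 1 <= self.global_assessment <= 9:
            raise ValueError("global assessment must lie between 1 and 9")

    @property
    def total(self) -> float:
        # The "total" used for analysis is the sum of the criteria scores,
        # so it ranges from 0 to 70.
        return sum(self.criteria)


# Illustrative values only; these are not data from the study.
sheet = BafSheet(criteria=[8, 7, 9, 6, 8, 7, 9], global_assessment=7)
print(sheet.total)
```

Under this reading the global assessment is kept separate from the total, since the abstract describes it as an additional rating rather than an eighth criterion.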
Results showed
that, with notable exceptions, the subjects generally exhibited a high degree
of consistency within their own scores from one session to the next.
Correlations on the "total" scores using Pearson's r revealed
coefficients from .84 (item two) to .97 (item three). Analysis of each
subject's individual criteria scores between adjudication sessions revealed
that scores varied by 20% or more in 17.5% of the comparisons. Analysis of the
"total" score results revealed a variation of 10% or more in 14.8% of
the comparisons. The most striking exception to the overall trend of
consistency was a variation of approximately 30% in three of the seven criteria
on musical item two for one judge. While not the primary focus of this study,
the data obtained also indicated a relatively high degree of disparity when
comparing the judges’ scores with one another (i.e., inter-judge
comparisons).
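
The two kinds of comparison reported above can be sketched in Python as follows. This is an illustration under stated assumptions, not the study's actual procedure: the function names are hypothetical, the sample numbers are invented, and the abstract does not say how percentage variation was computed. Here it is assumed to be the absolute score difference divided by the scale maximum (70 for the "total" score).

```python
from math import sqrt
from typing import Sequence


def pearson_r(first: Sequence[float], second: Sequence[float]) -> float:
    """Pearson product-moment correlation between two sets of paired scores."""
    n = len(first)
    if n != len(second) or n < 2:
        raise ValueError("need two equal-length sequences of at least 2 scores")
    mean_1 = sum(first) / n
    mean_2 = sum(second) / n
    cov = sum((a - mean_1) * (b - mean_2) for a, b in zip(first, second))
    var_1 = sum((a - mean_1) ** 2 for a in first)
    var_2 = sum((b - mean_2) ** 2 for b in second)
    if var_1 == 0 or var_2 == 0:
        raise ValueError("correlation is undefined when one set is constant")
    return cov / sqrt(var_1 * var_2)


def share_varying(first: Sequence[float], second: Sequence[float],
                  scale_max: float, threshold: float) -> float:
    """Fraction of paired comparisons whose difference is at least
    `threshold` of the scale maximum (ASSUMED definition of variation)."""
    if len(first) != len(second) or not first:
        raise ValueError("need two equal-length, non-empty sequences")
    flagged = sum(1 for a, b in zip(first, second)
                  if abs(a - b) / scale_max >= threshold)
    return flagged / len(first)


# Invented "total" scores for four judges on one item, sessions one and two.
session_1 = [50, 60, 40, 55]
session_2 = [52, 50, 41, 47]
print(pearson_r(session_1, session_2))
print(share_varying(session_1, session_2, scale_max=70, threshold=0.10))
```

If variation were instead measured relative to the first-session score, only the denominator in `share_varying` would change, and the flagged proportion could differ.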
The differences revealed
in intra-judge and inter-judge consistency clearly need further
investigation. This study has shown that there is enough variability between
adjudicators, and in some cases within an individual adjudicator's assessments,
that point differentials large enough to significantly affect final outcomes do
occur. The debate about the value of ranked competition, rated or
"comments-only" festivals must continue to take account of the
vagaries of placing a numerical score on an artistic endeavor and the potential
inconsistencies that seem to be an inevitable part of that process.
Robert L. McWilliams, Ph.D.
Director of Bands and Instrumental Music Education
University of Wisconsin Oshkosh
Department of Music
800 Algoma Boulevard
Oshkosh, Wisconsin 54901