UW Oshkosh

Reliability of a Test:


Despite differences in the format and construction of various tests, there are two standards by which whole tests (as opposed to individual items) are assessed. These two standards are reliability and validity.

Reliability refers to the consistency of test scores: how consistent a particular student’s test scores are from one testing to another. In theory, if test A is administered to Class X, and one week later is administered again to the same class, individual scores should be about the same both times (assuming conditions are the same for both sessions, including the students’ familiarity with the test). If the students received radically different scores the second time, the test would have low reliability. A teacher seldom administers a test to the same students more than once, however, so the reliability coefficient must be calculated a different way. Conceptually, this is done by dividing a homogeneous test into two parts (usually the even-numbered and odd-numbered items) and treating them as two tests administered at one sitting. The calculation of the reliability coefficient, in effect, compares all possible halves of the test to all other possible halves.
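The odd/even split described above can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used to produce any particular report. It assumes each student's answers are stored as a list of 0/1 item scores (1 = correct), and the function names are made up for this example. The final step applies the Spearman-Brown correction, a standard adjustment not mentioned in the text above, because the correlation between two half-tests understates the reliability of the full-length test.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def split_half_reliability(responses):
    """Estimate reliability from one sitting by correlating two half-tests.

    responses: one list per student, holding 0/1 scores for each item
    (hypothetical data layout assumed for this sketch).
    """
    # Half-test totals: items 1, 3, 5, ... versus items 2, 4, 6, ...
    odd_totals = [sum(r[0::2]) for r in responses]
    even_totals = [sum(r[1::2]) for r in responses]
    r_half = pearson(odd_totals, even_totals)
    # Spearman-Brown correction: step the half-test correlation
    # up to an estimate for the full-length test.
    return 2 * r_half / (1 + r_half)

# Three students, four items (made-up scores).
scores = [
    [0, 0, 0, 0],
    [1, 1, 0, 1],
    [1, 1, 1, 0],
]
print(round(split_half_reliability(scores), 3))  # 0.667
```

With only three students the number itself means little; the point is the mechanics of treating the two halves as two tests given at one sitting.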

One of the best estimates of the reliability of test scores from a single administration of a test is provided by the Kuder-Richardson Formula 20 (KR20). On the attached “Standard Item Analysis Report,” it appears in the top center area; in this report, for example, the reliability coefficient is .87. For a good classroom test, the reliability coefficient should be .70 or higher.
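For readers who want to see how a KR20 value is obtained, here is a rough Python sketch of the standard formula, KR20 = (k / (k - 1)) * (1 - sum(p * q) / variance of total scores), where k is the number of items, p is the proportion of students answering an item correctly, and q = 1 - p. It assumes every item is scored 0 or 1 and uses the population variance (dividing by the number of students); some software divides by one less than that, which gives slightly different values. The function name and data layout are assumptions for this example, not a description of how the item analysis report is computed.

```python
def kr20(responses):
    """Kuder-Richardson Formula 20 for dichotomously scored (0/1) items.

    responses: one list per student, holding a 0/1 score for each item
    (hypothetical data layout assumed for this sketch).
    """
    n_students = len(responses)
    k = len(responses[0])  # number of items on the test

    # Variance of students' total scores (population form, divide by N).
    totals = [sum(r) for r in responses]
    mean_total = sum(totals) / n_students
    var_total = sum((t - mean_total) ** 2 for t in totals) / n_students

    # Sum over items of p * q, where p is the proportion correct.
    pq_sum = 0.0
    for i in range(k):
        p = sum(r[i] for r in responses) / n_students
        pq_sum += p * (1 - p)

    return (k / (k - 1)) * (1 - pq_sum / var_total)

# Three students, four items (made-up scores).
scores = [
    [1, 1, 1, 1],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
]
print(round(kr20(scores), 3))  # 0.889
```

The function fails with a division by zero if every student earns the same total score, since there is then no score variance to analyze.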

To increase the likelihood of obtaining higher reliability, a teacher can: (1) increase the length of the test, (2) include questions that measure higher, more complex levels of learning, (3) include questions with a range of difficulty, with most questions in the middle range, and (4) if one or more essay questions are included on the test, grade them as objectively as possible.
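The effect of test length on reliability can be estimated with the Spearman-Brown formula, a standard result that is not part of the text above. The sketch below assumes the added questions are comparable in quality and content to the existing ones; if they are not, the predicted gain will not materialize. The function names are invented for this illustration.

```python
def lengthened_reliability(r, factor):
    """Predicted reliability after multiplying test length by `factor`.

    r: current reliability coefficient (e.g. a KR20 value).
    factor: new length divided by old length (2 means twice as long).
    """
    return factor * r / (1 + (factor - 1) * r)

def length_factor_needed(r, target):
    """How many times longer the test must be to reach `target` reliability."""
    return target * (1 - r) / (r * (1 - target))

# A test with reliability .60, doubled in length with comparable items.
print(round(lengthened_reliability(0.60, 2), 2))   # 0.75
# Length factor needed to move the same test from .60 to .75.
print(round(length_factor_needed(0.60, 0.75), 2))  # 2.0
```

Under these assumptions, a 25-question test with a reliability of .60 would need roughly 50 comparable questions to reach .75.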