Despite differences in the format and construction of various tests, whole tests (as opposed to individual items) are assessed
against two standards: reliability and validity.
Reliability refers to the consistency of test scores: how consistent a particular student’s test scores are from one testing to
another. In theory, if test A is administered to Class X, and one week later is administered again to the same class, individual
scores should be about the same both times (assuming unchanging conditions for both sessions, including familiarity with the test).
If the students received radically different scores the second time, the test would have low reliability. Seldom, however, does a
teacher administer a test to the same students more than once, so the reliability coefficient must be calculated a different way.
Conceptually, this is done by dividing a homogeneous test into two parts (usually even and odd items) and treating them as two tests
administered at one sitting. The calculation of the reliability coefficient, in effect, compares all possible halves of the test to
all other possible halves.
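
The odd/even split described above can be sketched in Python. This is a minimal illustration, not the procedure used to produce the report: the function names and the sample data are hypothetical, items are assumed to be scored 0 (wrong) or 1 (right), and the Spearman-Brown step-up applied at the end is a standard correction for the halved test length that the text above does not mention.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("one half has no score variation; correlation is undefined")
    return cov / sqrt(var_x * var_y)


def split_half_reliability(responses):
    """Odd/even split-half estimate for a table of 0/1 item scores.

    responses: one list per student, one 0/1 entry per item (hypothetical layout).
    """
    odd_scores = [sum(row[0::2]) for row in responses]   # items 1, 3, 5, ...
    even_scores = [sum(row[1::2]) for row in responses]  # items 2, 4, 6, ...
    r_half = pearson(odd_scores, even_scores)
    # Assumed step: Spearman-Brown correction from half length to full length.
    return 2 * r_half / (1 + r_half)


# Made-up scores for five students on a six-item test.
scores = [
    [1, 1, 1, 1, 1, 0],
    [1, 1, 1, 0, 1, 1],
    [1, 0, 1, 1, 0, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 1, 0, 0, 0],
]
print(round(split_half_reliability(scores), 2))
```

A single odd/even split is only one of many ways to halve a test, so the value it gives depends on which items happen to fall in each half.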
One of the best estimates of reliability of test scores from a single administration of a test is provided by the Kuder-Richardson
Formula 20 (KR20). On the attached “Standard Item Analysis Report,” it appears in the top center area; in this report, for example,
the reliability coefficient is .87. For good classroom tests, the reliability coefficient should be .70 or higher.
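
As a rough sketch of how a KR20 value could be computed from raw item scores, the Python below applies the usual textbook form of the formula, KR20 = (k / (k - 1)) * (1 - sum(p * q) / variance of total scores), where k is the number of items, p is the proportion of students answering an item correctly, and q = 1 - p. The function name and the sample data are hypothetical, items are assumed to be scored 0 or 1, and the population variance (dividing by the number of students) is assumed; the software behind the report may differ in such details.

```python
def kr20(responses):
    """Kuder-Richardson Formula 20 for a table of 0/1 item scores.

    responses: one list per student, one 0/1 entry per item (hypothetical layout).
    """
    n_students = len(responses)
    n_items = len(responses[0])

    # Variance of students' total scores (population form, an assumption).
    totals = [sum(row) for row in responses]
    mean_total = sum(totals) / n_students
    var_total = sum((t - mean_total) ** 2 for t in totals) / n_students
    if var_total == 0:
        raise ValueError("all students have the same total; KR20 is undefined")

    # Sum of p * q over items, where p is the proportion answering correctly.
    pq_sum = 0.0
    for i in range(n_items):
        p = sum(row[i] for row in responses) / n_students
        pq_sum += p * (1 - p)

    return (n_items / (n_items - 1)) * (1 - pq_sum / var_total)


# Made-up scores for three students on a four-item test.
scores = [
    [1, 1, 1, 1],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
]
print(round(kr20(scores), 2))  # 0.89
```

With so few students and items the result is only a check on the arithmetic; a real class would supply many more rows.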
To increase the likelihood of obtaining higher reliability, a teacher can: (1) increase the length of the test, (2) include
questions that measure higher, more complex levels of learning, (3) include questions with a range of difficulty, with most questions
in the middle range, and (4) if one or more essay questions are included on the test, grade them as objectively as possible.
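
The effect of test length, point (1) above, can be illustrated with the Spearman-Brown prophecy formula. The text above does not name this formula; it is brought in here as an assumption, and it holds only if the added questions are comparable in quality and content to the existing ones. The function name and the numbers are hypothetical.

```python
def projected_reliability(current_reliability, length_factor):
    """Spearman-Brown projection of reliability after changing test length.

    current_reliability: reliability of the existing test (for example, its KR20).
    length_factor: new length divided by old length (2 means twice as many items).
    Assumes the added items behave like the existing ones.
    """
    r = current_reliability
    return (length_factor * r) / (1 + (length_factor - 1) * r)


# A test with reliability .60, doubled in length with comparable items.
print(round(projected_reliability(0.60, 2), 2))  # 0.75
```

Under these assumptions, doubling a test whose reliability is .60 would bring it above the .70 level suggested for good classroom tests.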