Case Analysis 3

Linda Maclean
National University
December 10, 2014

Case Analysis 3

There are numerous ways to test for reliability. I will discuss some of these, including inter-rater reliability, test-retest reliability, and internal consistency. I will also explain the difference between reliability and validity. I will then review case analysis 2, "I want to get into grad school real bad" (Gliner, Morgan, & Leech, 2009, p. 163), to determine whether Gliner's test was reliable.

Reliability is extremely important in the construction of a good test. If a test does not measure consistently (reliably), then we cannot count on the scores being an accurate assessment of a student's knowledge. Just as we could not trust a bathroom scale whose readings fluctuate five pounds up or down in a given day, we cannot trust the scores on a test unless we know the consistency with which they measure. Only when we can determine the extent to which test scores are reliable can they be useful and fair to those taking the tests. A test or measure cannot be valid if it is not reliable (Gliner et al., 2009, p. 368).
Reliability shows the extent to which test scores are free from errors of measurement. No test is perfectly reliable, because random errors cause scores to vary, or be inconsistent, from time to time and from situation to situation. The goal is to minimize these inevitable errors of measurement and thereby increase reliability. Test validity, by contrast, is the extent to which a test accurately measures what it purports to measure. In the fields of psychological and educational testing, validity is the degree to which evidence and theory support the interpretations of test scores entailed by the proposed uses of tests (Wainer & Braun, 1988).

Inter-observer (inter-rater) reliability is used when humans are part of the measurement procedure. When humans are involved, there is concern about whether the results are reliable or consistent, because people can be very inconsistent and can misinterpret questions. There are several ways to estimate inter-rater reliability. If the measurement consists of categories, the raters can check off which category each observation falls into, and the percent of agreement between the raters can then be calculated. The Navy uses this method to a degree during promotion boards.
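To illustrate the percent-agreement approach described above, the short Python sketch below compares two raters' category assignments for the same set of observations. The rating data, category labels, and function name here are hypothetical examples of my own for illustration, not material drawn from Gliner et al. (2009).

    def percent_agreement(rater_a, rater_b):
        """Return the percent of observations on which two raters agree.

        rater_a and rater_b are equal-length lists of category labels,
        where position i holds each rater's category for observation i.
        """
        if len(rater_a) != len(rater_b):
            raise ValueError("Both raters must score the same observations.")
        # Count the positions where the two raters chose the same category.
        matches = sum(1 for a, b in zip(rater_a, rater_b) if a == b)
        return 100.0 * matches / len(rater_a)

    # Hypothetical example: two raters each categorize ten observations.
    rater_1 = ["pass", "pass", "fail", "pass", "fail",
               "pass", "fail", "pass", "pass", "fail"]
    rater_2 = ["pass", "fail", "fail", "pass", "fail",
               "pass", "fail", "pass", "fail", "fail"]

    print(percent_agreement(rater_1, rater_2))  # 80.0 -- agreement on 8 of 10

A higher percentage indicates greater consistency between the raters, which is exactly the kind of consistency this method is meant to assess.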