Reliability Memo
I am evaluating the Stanford Achievement Test (SAT) for school districts across the country that intend to use this test for their students. I recommend this test for its intended purpose. The SAT has been regarded as “one of the best achievement batteries of its type” (Brown 1992).
The SAT’s main objective is to measure student achievement in reading, mathematics, language, spelling, study skills, science, social science, and listening. The SAT has been available since 1923 and is now in its eighth edition. It is designed to test students’ individual achievement.
Test-retest reliability may not be a good way to establish this test’s reliability. If only a short time passes before the exact same test is administered again, test takers may recall their earlier responses, which could allow them to obtain almost the same score. The test is also based on what has been learned, so a person who takes it more than once may reach the answers just by knowing what is on the test. Test-retest could be effective if other factors, such as illness, noise, or even stress, affected a person’s score. No test-retest reliability is provided with the SAT.
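To make the idea concrete, a test-retest coefficient is usually reported as the correlation between the scores from the two administrations. The following is a minimal sketch in Python. The function name `pearson_r` and the score lists are hypothetical and are not taken from the SAT or its manual.

```python
from math import sqrt


def pearson_r(first, second):
    """Pearson correlation between two paired lists of scores."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two paired score lists of equal length (at least 2)")
    n = len(first)
    mean_1 = sum(first) / n
    mean_2 = sum(second) / n
    # Sum of cross-products and sums of squares of the deviations from each mean.
    cross = sum((a - mean_1) * (b - mean_2) for a, b in zip(first, second))
    ss_1 = sum((a - mean_1) ** 2 for a in first)
    ss_2 = sum((b - mean_2) ** 2 for b in second)
    if ss_1 == 0 or ss_2 == 0:
        raise ValueError("scores must vary within each administration")
    return cross / sqrt(ss_1 * ss_2)


# Hypothetical scores for five students on two administrations of the same test.
first_testing = [52, 61, 70, 45, 88]
second_testing = [55, 60, 72, 47, 85]
retest_r = pearson_r(first_testing, second_testing)
```

A coefficient near 1 would mean students kept nearly the same rank order on both occasions. As noted above, recall of earlier answers could inflate this value when the interval is short.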
With equivalent forms-immediate, two parallel forms of the test are administered in immediate succession. The correlation shows reliability across forms but not across occasions, as equivalent forms-delayed would. The hardest part of equivalent forms is constructing two forms that are truly equivalent. This is difficult because both forms should be parallel to each other and should meet the same specifications. They should cover the same material and have the same number of questions, expressed in the same way. The difficulty of the material should also be the same. Other factors such as instructions, time limits, and format should be exactly the same. This method does not seem to suit the SAT. Because the SAT has so many sections, it would be difficult to produce two equivalent forms that make the test as reliable as it could be.
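The requirements listed above can be read as a checklist that both versions must pass. Below is a minimal sketch in Python of such a check. The `FormSpec` fields, the `spec_mismatches` function, the tolerance, and all example values are hypothetical illustrations, not actual SAT specifications.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FormSpec:
    content_areas: frozenset   # material covered
    item_count: int            # number of questions
    mean_difficulty: float     # average proportion passing, from 0 to 1
    time_limit_minutes: int
    item_format: str


def spec_mismatches(form_a, form_b, difficulty_tolerance=0.02):
    """Return the names of the specifications on which two forms differ."""
    mismatches = []
    if form_a.content_areas != form_b.content_areas:
        mismatches.append("content_areas")
    if form_a.item_count != form_b.item_count:
        mismatches.append("item_count")
    # Difficulty is treated as equal if it falls within the assumed tolerance.
    if abs(form_a.mean_difficulty - form_b.mean_difficulty) > difficulty_tolerance:
        mismatches.append("mean_difficulty")
    if form_a.time_limit_minutes != form_b.time_limit_minutes:
        mismatches.append("time_limit_minutes")
    if form_a.item_format != form_b.item_format:
        mismatches.append("item_format")
    return mismatches


# Hypothetical specifications for two versions of one subtest.
areas = frozenset({"reading", "spelling"})
form_a = FormSpec(areas, 40, 0.60, 30, "multiple choice")
form_b = FormSpec(areas, 38, 0.70, 30, "multiple choice")
problems = spec_mismatches(form_a, form_b)
```

Here `problems` names the item count and the difficulty as the points on which the two hypothetical versions are not parallel. An empty list would mean the versions match on every specification checked.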
Internal consistency uses the total score on the test itself as the criterion. Performance in the upper criterion group is compared with performance in the lower criterion group, and items that fail to show a significantly greater proportion of passes in the upper group than in the lower group are considered invalid and are eliminated or revised. There are also scores on different subtests, as with the SAT, which are then combined to find the total test score. Then