Test-Retest Reliability

The same test is administered twice to the same group to assess the consistency of test scores over a period of time. The correlation between the two sets of scores obtained on the test and the retest is then found using the Pearson product-moment coefficient "r". Test-retest reliability is best used for attributes that are stable over time, for example, intelligence. Generally, reliability will be higher when little time has passed between the two administrations (Kubiszyn & Borich, 2003).
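The computation above can be sketched in Python; the Pearson formula is standard, but the score lists below are hypothetical and purely illustrative.

```python
# Test-retest reliability: correlate scores from two administrations
# of the same test given to the same group.

def pearson_r(x, y):
    """Pearson product-moment correlation between two score lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (sd_x * sd_y)

test = [85, 72, 90, 64, 78]    # first administration (hypothetical)
retest = [88, 70, 91, 66, 75]  # second administration, e.g. two weeks later
print(round(pearson_r(test, retest), 3))
```

A value near 1.0 indicates that pupils kept their relative standing across the two administrations, which is what test-retest reliability measures.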

Equivalent/Parallel-Forms method

In the parallel-forms method, reliability is estimated by comparing two different tests that were created with the same content, difficulty, format, and length. The two tests are administered to the same group within a short interval of time, and the scores on the two tests are then correlated. This correlation provides an index of equivalence. For example, in intermediate or secondary board examinations, two question papers for a particular subject are constructed and named paper A and paper B (and sometimes a paper C is also prepared); these represent equivalent forms of the test (Linn & Gronlund, 2000).

Internal Consistency method

The consistency of test results across items on the same test is determined by this method. Test items that measure the same construct are compared with each other to determine the test's internal consistency. If two questions are similar and designed to measure the same thing, the test taker should answer both in the same way, which would indicate that the test has internal consistency (Swain et al., 2000). Three methods of finding the internal consistency of a test, known as the split-half method, the Kuder-Richardson formula 21, and inter-rater reliability, are given below.

Split-half method

Linn and Gronlund (2000) share that the split-half method of determining internal consistency employs a single administration of an even-numbered test on a sample of pupils. The test is divided into two equivalent halves and the correlation between the half-test scores is found: even-numbered items such as 2, 4, 6, … form one half, and odd-numbered items such as 1, 3, 5, … form the other half. The scores of the two halves are correlated, and the resulting coefficient is stepped up to the full-test reliability using the Spearman-Brown formula. The formula is given below.
                                                              r2 = 2r1 / (1 + r1)
                                        where            r2 = reliability coefficient of the full test
                                                               r1 = correlation coefficient between the half tests
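The split-half procedure can be sketched as follows; the Spearman-Brown step-up matches the formula above (r2 = 2r1 / (1 + r1)), while the item-response matrix is hypothetical.

```python
# Split-half reliability: correlate odd-item and even-item half scores,
# then step up with the Spearman-Brown formula r2 = 2*r1 / (1 + r1).

def pearson_r(x, y):
    """Pearson product-moment correlation between two score lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (sd_x * sd_y)

items = [  # rows = pupils, columns = items of a 6-item test (1 = correct)
    [1, 1, 1, 0, 1, 1],
    [1, 0, 1, 1, 0, 1],
    [0, 1, 0, 0, 1, 0],
    [1, 1, 1, 1, 1, 0],
    [0, 0, 1, 0, 0, 1],
]
odd_half = [sum(row[0::2]) for row in items]   # items 1, 3, 5
even_half = [sum(row[1::2]) for row in items]  # items 2, 4, 6

r1 = pearson_r(odd_half, even_half)  # correlation between half tests
r2 = 2 * r1 / (1 + r1)               # reliability of the full test
print(round(r1, 3), round(r2, 3))
```

The step-up is needed because the half-test correlation underestimates the reliability of a test twice as long.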

Kuder-Richardson formula 21 method

Linn and Gronlund (2000) stated that this is another method of determining reliability using a single administration of a test. It is known to provide a conservative estimate of split-half reliability. The procedure is based on the consistency of an individual's performance from item to item and on the standard deviation of the test, such that the reliability coefficient obtained denotes the internal consistency of the test. Internal consistency here means the degree to which the items of a test measure a common attribute of the testee.
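The paragraph does not print the formula itself, so as a sketch: KR-21 is conventionally computed from the number of items k, the mean M, and the variance s² of the total scores. The total scores below are hypothetical.

```python
# Kuder-Richardson formula 21, from a single test administration:
#   KR-21 = (k / (k - 1)) * (1 - M*(k - M) / (k * s^2))
# where k = number of items, M = mean total score, s^2 = variance of totals.

def kr21(scores, k):
    n = len(scores)
    mean = sum(scores) / n
    var = sum((s - mean) ** 2 for s in scores) / n  # population variance
    return (k / (k - 1)) * (1 - mean * (k - mean) / (k * var))

scores = [10, 25, 14, 28, 18, 22, 8, 27]  # total scores on a 30-item test
print(round(kr21(scores, 30), 3))
```

Because KR-21 assumes all items are of roughly equal difficulty, it tends to run lower than the split-half estimate, which is why the text calls it conservative.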

Inter-rater Reliability

In this method, two or more independent judges score the test, and the scores are then compared to determine the consistency of the raters' estimates. One way to test inter-rater reliability is to have each rater score each test; for example, each rater might score items on a scale from 1 to 10. The correlation between the two sets of ratings is then found to determine the level of inter-rater reliability. Another means of testing inter-rater reliability is to have the raters determine which category each observation falls into and then calculate the percentage of agreement between the raters. So, if the raters agree 8 out of 10 times, the test has an 80% inter-rater reliability rate (Swain et al., 2000).
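The percentage-of-agreement approach from the example above can be sketched directly; the category labels and ratings are hypothetical.

```python
# Inter-rater reliability as percentage agreement: two raters assign each
# observation to a category; agreement = matching ratings / total ratings.

rater_a = ["pass", "fail", "pass", "pass", "fail",
           "pass", "pass", "fail", "pass", "pass"]
rater_b = ["pass", "fail", "pass", "fail", "fail",
           "pass", "fail", "fail", "pass", "pass"]

matches = sum(a == b for a, b in zip(rater_a, rater_b))
agreement = matches / len(rater_a)
print(f"{agreement:.0%}")  # the raters agree on 8 of 10 observations
```

For the correlation-based variant (raters scoring on a 1-10 scale), the same Pearson "r" used for test-retest reliability would be applied to the two raters' score lists instead.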

References

Kubiszyn, T., & Borich, G. (2003). Educational testing and measurement: Classroom application
               and practice (7th ed.). New York: John Wiley & Sons.
Linn, R. L., & Gronlund, N. E. (2000). Measurement and assessment in teaching (8th ed.).
               Delhi: Pearson Education.
Swain, S. K., Pradhan, C., & Khotoi, S. P. K. (2000). Educational measurement: Statistics and
               guidance. Ludhiana: Kalyani.