Reliability is the consistency with which a test yields the same result
in measuring whatever it does measure (Swain et al., 2000).

Taiwo (1995) defines reliability as the consistency of measurement, that is,
how consistent test scores are from one measurement to another. For example,
students use a stopwatch to time 15 vibrations of a pendulum and take the
reading two or three times. If two of the three readings agree, they proceed
with that value, because the agreement indicates that the stopwatch provides
reliable readings.

**Nature of reliability**

Reliability refers to the consistency of the results obtained with a test,
not to the test itself: it is the results obtained with a tool or test that
are reliable, not the tool or test. It also refers to a particular
interpretation of test scores. For example, a test score that is reliable over
a period of time may not be reliable from one test to another equivalent test.
Reliability is a statistical concept. To determine consistency, a test is
administered once or more than once, and the consistency is then measured in
terms of relative shifts in scores. Reliability is a necessary but not a
sufficient condition for validity (Linn & Gronlund, 2000).

**Functions of reliability**

The reliability coefficient provides the most revealing statistical index of
quality that is ordinarily available. Estimates of the reliability of a test
provide essential information for judging its technical quality and for
motivating efforts to improve it. Reliability estimation determines how much
of the variability in test scores is due to measurement error and how much is
due to variability in true scores (Swain et al., 2000).
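The split between true-score variability and measurement error can be written compactly in the notation of classical test theory. The symbols below are an assumption added for illustration and are not used in the cited sources: $X$ is an observed score, $T$ the true score, $E$ the measurement error (taken to be uncorrelated with $T$), and $r_{XX'}$ the reliability coefficient.

$$X = T + E, \qquad \sigma_X^2 = \sigma_T^2 + \sigma_E^2, \qquad r_{XX'} = \frac{\sigma_T^2}{\sigma_X^2} = 1 - \frac{\sigma_E^2}{\sigma_X^2}$$

Read this way, a reliability coefficient of 0.80 would mean that 80% of the variance in observed scores is attributed to true scores and 20% to measurement error.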

**Methods of determining reliability**

**Test-Retest Reliability**

The same test is administered twice to the same group to assess the
consistency of test scores over a period of time. The correlation between the
two sets of scores obtained from the test and the retest is then found using
the Pearson product-moment coefficient "r". Test-retest reliability is best
used for attributes that are stable over time, for example intelligence.
Generally, reliability will be higher when little time has passed between the two administrations (Kubiszyne & Borich, 2003).
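The correlation step can be sketched in a few lines of Python. This is a minimal illustration rather than a prescribed procedure: the function name `pearson_r` and the scores of the five pupils are hypothetical and were made up for the example.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two score lists of equal length (at least 2 scores)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dev_x = [a - mean_x for a in x]  # deviations from the mean, first set
    dev_y = [b - mean_y for b in y]  # deviations from the mean, second set
    sum_products = sum(a * b for a, b in zip(dev_x, dev_y))
    ss_x = sum(a * a for a in dev_x)
    ss_y = sum(b * b for b in dev_y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined when all scores are identical")
    return sum_products / sqrt(ss_x * ss_y)


# Hypothetical scores of the same five pupils on the two administrations.
first_administration = [12, 15, 9, 18, 14]
second_administration = [13, 14, 10, 17, 15]

test_retest_r = pearson_r(first_administration, second_administration)
print(f"test-retest reliability estimate: {test_retest_r:.2f}")
```

A value close to 1 indicates that pupils kept roughly the same relative position on both occasions; a value near 0 indicates that the scores shifted unpredictably.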

**Equivalent/Parallel-Forms method**

In the parallel-forms method, reliability is estimated by comparing two
different tests that were built to the same content, difficulty, format and
length. The two tests are administered to the same group within a short
interval of time, and the scores on the two tests are then correlated. This
correlation provides an index of equivalence. For example, in intermediate or
secondary board examinations, two question papers for a particular subject are
constructed and named paper A and paper B, and sometimes a paper C is also
prepared; these papers serve as equivalent forms of the test
(Linn & Gronlund, 2000).

**Internal Consistency method**

This method determines the consistency of results across the items of a
single test. Items that measure the same construct are compared with each
other to determine the test's internal consistency. When two questions are
similar and designed to measure the same thing, the test taker should answer
both in the same way, which would indicate that the test has internal
consistency (Swain et al., 2000). The split-half method and the
Kuder-Richardson formula 21, followed by inter-rater reliability, are
described below.

**Split-half method**

Linn and Gronlund (2000) explain that the split-half method of determining
internal consistency uses a single administration of a test with an even
number of items to a sample of pupils. The test is divided into two equivalent
halves, usually with the even-numbered items (2, 4, 6, ...) in one half and
the odd-numbered items (1, 3, 5, ...) in the other. The scores on the two
halves are correlated, and the Spearman-Brown formula is then applied to
estimate the reliability of the full test. The formula is given below.

$$r_2 = \frac{2 r_1}{1 + r_1}$$

Where $r_2$ = reliability coefficient of the full test
$r_1$ = coefficient of correlation between the half tests

**Kuder-Richardson formula 21 method**

Linn and Gronlund (2000) state that this is another method of determining reliability from a single administration of a test. It is known to provide a conservative estimate of the split-half type of reliability. The procedure is based on the consistency of an individual's performance from item to item and on the standard deviation of the test scores, so the reliability coefficient obtained denotes the internal consistency of the test. Internal consistency here means the degree to which the items of a test measure a common attribute of the testee.
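The two single-administration estimates described above can be sketched side by side in Python. Several things here are assumptions for illustration: the function names, the made-up right/wrong responses of six pupils, the use of the population variance of total scores, and the KR-21 formula itself, which the text does not spell out and is taken here in its standard textbook form, $\frac{k}{k-1}\left(1 - \frac{M(k-M)}{k s^2}\right)$ for $k$ items with mean total score $M$ and variance $s^2$.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length score lists."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sum_products = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined when all scores are identical")
    return sum_products / sqrt(ss_x * ss_y)


def split_half_reliability(item_scores):
    """Odd/even split-half estimate, stepped up with the Spearman-Brown formula.

    item_scores holds one list per pupil; each entry is 1 (right) or 0 (wrong).
    """
    odd_half = [sum(row[0::2]) for row in item_scores]   # items 1, 3, 5, ...
    even_half = [sum(row[1::2]) for row in item_scores]  # items 2, 4, 6, ...
    r_half = pearson_r(odd_half, even_half)
    return 2 * r_half / (1 + r_half)


def kr21(item_scores):
    """KR-21 estimate from the number of items and the mean and variance of totals."""
    k = len(item_scores[0])
    totals = [sum(row) for row in item_scores]
    mean = sum(totals) / len(totals)
    # Population variance of total scores (an assumption; some texts use n - 1).
    variance = sum((t - mean) ** 2 for t in totals) / len(totals)
    if k < 2 or variance == 0:
        raise ValueError("need at least 2 items and some spread in total scores")
    return (k / (k - 1)) * (1 - mean * (k - mean) / (k * variance))


# Hypothetical responses of six pupils to a six-item test.
responses = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1, 0],
    [1, 1, 1, 0, 1, 0],
    [1, 0, 1, 1, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 1, 0, 0, 0],
]

print(f"split-half (Spearman-Brown): {split_half_reliability(responses):.2f}")
print(f"KR-21: {kr21(responses):.2f}")
```

On this made-up data the KR-21 value comes out below the split-half value, which is in line with the description of KR-21 as the more conservative estimate.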

**Inter-rater Reliability**

In this method two or more independent judges score the test. The scores are
then compared to determine the consistency of the raters' estimates. One way
to test inter-rater reliability is to have each rater score each test. For
example, each rater might score items on a scale from 1 to 10. The
correlation between the two sets of ratings is then found to determine the
level of inter-rater reliability. Another way of testing inter-rater
reliability is to have the raters decide which category each observation
falls into and then calculate the percentage of agreement between the raters.
So, if the raters agree 8 out of 10 times, the test has an 80% inter-rater
reliability rate (Swain et al., 2000).
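The percentage-of-agreement calculation can be sketched as follows. The function name `percent_agreement`, the category labels and the ten observations are hypothetical and chosen only to reproduce the 8-out-of-10 case.

```python
def percent_agreement(rater_a, rater_b):
    """Percentage of observations that two raters placed in the same category."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must categorise the same, non-empty set")
    matches = sum(1 for a, b in zip(rater_a, rater_b) if a == b)
    return 100 * matches / len(rater_a)


# Hypothetical categories assigned by two raters to the same ten observations.
rater_a = ["on", "off", "on", "on", "off", "on", "off", "on", "on", "off"]
rater_b = ["on", "off", "on", "off", "off", "on", "off", "on", "off", "off"]

print(f"inter-rater agreement: {percent_agreement(rater_a, rater_b):.0f}%")
```

The two raters disagree on the fourth and ninth observations only, so the agreement is 8 out of 10, or 80%.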

**Factors affecting reliability**

Factors related to the test that affect its reliability are the length of the
test, the content of the test, the characteristics of the test items and the
spread of scores. If the time allowed for taking a test is short, the
reliability of the test will be affected. If the content of the test is not
representative of the whole content to be tested, then the reliability of the
test will be reduced. Other things being equal, the wider the spread of test
scores, the higher the reliability estimate. Factors related to the testee
that affect the reliability of a test are the heterogeneity of the group, the
test-wiseness of the students and the motivation of the students. The time
limit of the test and the opportunity for cheating given to the students are
factors related to the testing procedure that affect the reliability of the
test (Linn & Gronlund, 2000).

**References**

Kubiszyne, T., & Borich, G. (2003). Educational testing and measurement: Classroom application and practice (7th ed.). New York: John Wiley & Sons.

Linn, R. L., & Gronlund, N. E. (2000). Measurement and assessment in teaching (8th ed.). Delhi: Pearson Education.

Rehman, A. (2007). *Development and validation of objective test items analysis in the subject physics for class IX in Rawalpindi city*. Retrieved May 12, 2009, from International Islamic University, Department of Education Web site: http://eprints.hec.gov.pk/2518/1/2455.htm.

Swain, S. K., Pradhan, C., & Khotoi, S. P. K. (2000). Educational measurement: Statistics and guidance. Ludhiana: Kalyani.

Taiwo, A. A. (1995). Fundamentals of classroom testing. New Delhi: Vikas Publishing House.