Face validity
If an evaluator asks whether the items of a test are reasonable in view of the background of the testees, he or she is interested in the face validity of the test, that is, in how the test items look in the light of the objective of the test (Taiwo, 1995). According to Linn and Gronlund (2000), face validity refers to the appearance of the test: the task to be performed by the learner is examined only superficially, to judge whether the test appears to be a reasonable measure. A test should look like an appropriate measure in order to obtain the cooperation of those who are taking it. Face validity, however, should not be treated as a substitute for a more rigorous evaluation of content definitions and sampling adequacy.
There is a clear distinction between making validity claims based on a rationale of content definitions and making claims based on face validity. For example, to test the skill of finding the area of geometrical figures, say the area of a rectangle, the tester may ask students to find the area of a sheet of A4 paper, ask a shopkeeper to find the area of a rectangular piece of cloth, and ask a hockey player to find the area of the nearest hockey ground. In all three test items the idea is the same, finding the area of a rectangle, but each is phrased in the context of its own group.
Content validity
Content validity is one of the simplest ways for a test to gather sufficient validity evidence. It is established by a thorough examination of the test items to check whether they match the instructional objectives of the tester. When the test is intended to measure students' achievement and the items to be included are easy to specify, a content validity claim is easy to make; in personality tests and aptitude tests, by contrast, content validity becomes problematic (Kubiszyne & Borich, 2003). According to Linn and Gronlund (2000), content considerations get first priority in validation when an individual's performance is intended to describe a domain of tasks that the test is supposed to represent.
For example, if the tester expects the students to be able to write the plurals of 300 singular nouns, the tester may select a sample of 30 words; if a student writes 70% of the plurals correctly, it is inferred that the student can write about 70% of the plurals of all 300 words. The result for a sample of items is thus generalized to the whole list of singular nouns. Content validity evidence is then the degree to which the test provides a relevant and representative sample of the domain of tasks about which interpretations of test results are made. To ensure content validity evidence, the tester proceeds from what has been taught to what is to be measured, then to what should be the focus of the test, and finally to a representative sample of relevant tasks.
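The arithmetic behind this generalization can be sketched as follows; this is a minimal illustration using the hypothetical numbers from the plural-nouns example above:

```python
# Generalizing performance on a sample of items to the whole content
# domain, as in the plural-nouns example (numbers are illustrative).

domain_size = 300   # singular nouns in the whole domain
sample_size = 30    # items actually included in the test
correct = 21        # items the student answered correctly

proportion = correct / sample_size            # 0.7, i.e. 70% on the sample
estimated_mastery = proportion * domain_size  # inferred for the full domain

print(f"Sample proportion: {proportion:.0%}")
print(f"Estimated plurals mastered out of {domain_size}: {estimated_mastery:.0f}")
```

The inference is only as good as the sample: the 30 words must be a representative selection from the 300, which is exactly what the content-validity argument has to establish.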
Rational
validation of a test
Analysis and comparison are the procedures used for content-related validation of a test. The test is scanned to find out the subject matter it covers and the responses the pupils are expected to make, and these are compared with the domain of achievement to be measured. No numerical value is required to express content-related validation: it is determined by analyzing the content and tasks given in the test and the domain of outcomes to be measured, and reviewing the degree of connection between them (Swain et al., 2000). The data from this analysis and comparison are expressed in a two-way chart called the table of specifications (Linn & Gronlund, 2000).
Criterion-related validity
A valued standard of performance other than the test itself is known as a criterion. The use of a test to predict future performance, or to establish current standing against such a valued measure, is called criterion-related validation (Swain et al., 2000).
Predictive
validity evidence
Linn and Gronlund (2000) assert that predictive validity evidence refers to the degree of adequacy of a test in predicting the future behavior of an individual. This kind of validity is particularly important in aptitude tests; for example, a scholastic aptitude test is used to decide who should be admitted where. The predictive validity evidence of a test is determined by administering the test to a group of subjects and then, after a period of time has passed, measuring the subjects on whatever the test is supposed to predict. The two sets of scores are then correlated using Pearson's r, and the resulting coefficient is called the predictive validity coefficient.
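As a sketch of this computation: the predictive validity coefficient is simply Pearson's r between the earlier test scores and the later criterion scores. The scores below are invented for illustration:

```python
def pearson_r(x, y):
    """Pearson product-moment correlation between two lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (sd_x * sd_y)

# Hypothetical data: aptitude scores at admission, and the criterion
# (grade-point average) measured after a period of time has passed.
aptitude  = [55, 60, 65, 70, 75, 80, 85, 90]
gpa_later = [2.1, 2.4, 2.3, 2.9, 3.0, 3.2, 3.1, 3.6]

print(f"Predictive validity coefficient: {pearson_r(aptitude, gpa_later):.2f}")
```

A coefficient close to +1 indicates that the aptitude test ranks students much as the later criterion does; a coefficient near 0 would mean the test has little predictive value.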
Concurrent
validity
The degree to which a test estimates present status or performance, and thus the relationship between two measures taken concurrently, is called concurrent validity (Swain et al., 2000). According to Kubiszyne and Borich (2003), concurrent validity evidence of a test is determined by administering two similar tests to a group of students at the same time or within a very short period, each measuring the current performance the test is supposed to measure. The two sets of scores are then correlated using Pearson's r, and the coefficient is called the concurrent validity coefficient.
Presentation
of the relationship of scores in criterion validity evidence
The relationship between the scores of two concurrently administered tests can be shown using an expectancy table, a simple table in which the scores of the two tests are cross-tabulated. Another way of communicating the relationship between the scores is a scatter plot, in which the score pairs are plotted on a graph (Linn & Gronlund, 2000).
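A minimal expectancy table can be built by banding each score as low or high and counting how many students fall into each cell; the cut-off and the scores below are hypothetical:

```python
from collections import Counter

def expectancy_table(test_scores, criterion_scores, cut=60):
    """A minimal expectancy table: counts of students falling in each
    low/high cell on the two measures (the cut-off is illustrative)."""
    band = lambda s: "high" if s >= cut else "low"
    return Counter((band(t), band(c))
                   for t, c in zip(test_scores, criterion_scores))

# Hypothetical scores on two concurrently administered tests.
test_a = [45, 55, 62, 70, 80, 50, 90, 65]
test_b = [40, 58, 65, 72, 85, 48, 88, 60]

for cell, count in sorted(expectancy_table(test_a, test_b).items()):
    print(cell, count)
```

When most students land in the (low, low) and (high, high) cells, the two measures agree, which is exactly the pattern a scatter plot would show as points clustering along a rising diagonal.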
Construct
validity
A construct is a psychological quality that is assumed to exist in order to explain some aspect of behavior among individuals (Linn & Gronlund, 2000); reasoning and problem-solving are examples of such constructs. Construct validation is the process of determining the extent to which a particular test measures the psychological constructs that the tester wants to measure. Construct validity is determined by defining the domain or tasks to be measured, analyzing the response processes required by the assessment tasks, comparing the scores of known groups, comparing scores before and after a particular learning experience, and correlating the scores using the Pearson product-moment correlation (Swain et al., 2000).
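The known-groups step can be sketched as follows; the groups and scores here are entirely hypothetical:

```python
# Known-groups comparison for construct validation: if a test truly
# measures problem-solving, a group known to be strong in that construct
# should outscore a group known to be weak (scores are invented).

strong_group = [78, 82, 85, 90, 88]
weak_group   = [55, 60, 58, 62, 65]

mean = lambda scores: sum(scores) / len(scores)
difference = mean(strong_group) - mean(weak_group)

print(f"Mean of strong group: {mean(strong_group):.1f}")
print(f"Mean of weak group:   {mean(weak_group):.1f}")
print(f"Difference supporting the construct claim: {difference:.1f}")
```

A clear difference in the expected direction supports the claim that the test measures the intended construct; in practice the difference would also be checked for statistical significance.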
Bibliography
Kubiszyne, T., & Borich, G. (2003). Educational testing and measurement: Classroom application and practice (7th ed.). New York: John Wiley & Sons.
Linn, R. L., & Gronlund, N. E. (2000). Measurement and assessment in teaching (8th ed.). Delhi: Pearson Education.
Rehman, A. (2007). Development and validation of objective test items analysis in the subject physics for class IX in Rawalpindi city. Retrieved May 12, 2009, from International Islamic University, Department of Education Web site: http://eprints.hec.gov.pk/2518/1/2455.htm.
Swain, S. K., Pradhan, C., & Khotoi, S. P. K. (2000). Educational measurement: Statistics and guidance. Ludhiana: Kalyani.
Taiwo, A. A. (1995). Fundamentals of classroom testing. New Delhi: Vikas Publishing House.