The Wechsler Adult Intelligence Scale - Third Edition (WAIS-III)
is part of a family of Wechsler tests. "The WAIS-III is the great-grandchild
of the original 1939 Wechsler-Bellevue Form I." (Kaufman and Lichtenberger
1999, p. 3) It has been the subject of extensive research, so this short
review will merely present an overview, with some focus on the issue
of its validity.
The WAIS-III project directors were David Tulsky and Jianjun Zhu,
the publisher is Harcourt Brace & Company, and the test and its normative
data were published in 1997. Administration of the subtests needed for the
Verbal IQ, Performance IQ, and Full Scale IQ is intended to take 60-90
minutes. The test costs either US$ 914 or US$ 967, depending on the
packaging (box or case) required (according to the WAIS-III WMS-III
Technical Manual and the website http://harcourtassessment.com/).
2. Purpose and nature
The purpose of WAIS-III is to measure an adult's intellectual ability
using a multiple aptitude battery. The test is for adults between the
ages of 16 and 89 years. It is designed for use with individuals, and
the battery is composed of seven performance subtests and seven verbal
subtests. The overall result of a WAIS-III test is called a Full Scale
IQ, but the verbal and performance components also yield their own scores
-- respectively the Verbal IQ and the Performance IQ. The verbal and
performance components each also have subcomponents, and these subcomponents
yield scores called indices.
According to Kaufman and Lichtenberger (1999), these subtests were
largely based on other researchers' work, especially the Stanford-Binet
and the Army Performance Scale Examination.
3. Practical evaluation
Face validity is one of the test's strong points, because the test appears
to cover a wide range of intelligence-related skills. The design of the
main materials (Stimulus Booklet, Block Design blocks, Picture Arrangement
pictures, Object Assembly objects, and Administration and Scoring Manual)
is excellent. The content is literate, clearly laid-out, and easy to
use. Everything is attractive and apparently durable, except that the
cover of the stimulus booklet I saw was showing its age, being slightly
turned-up at the bottom. The materials seem appropriate to the age
of the users.
The test is lengthy and somewhat complicated to administer. It cannot be
administered by computer -- this has to be done one-on-one, human-to-human.
The directions are not always completely clear, but administrators would
normally be taught how to administer the test, so this should not be
a real issue in most cases. According to the publisher's webpage
(http://harcourtassessment.com/hai/ProductLongDesc.aspx?Catalog=TPC-USCatalog&ISBN=015-8980-727&Category=Adolescents),
there is a training video available for purchase.
Computer-assisted scoring is available, although the manual scoring procedures
-- while not simple -- are not really difficult. However, there is always
the risk of human error because of the number of manual entries of raw
scores and scaled scores that the administrator has to make. Scoring
templates are used for some subtests. Index scores are generated for
Verbal Comprehension, Working Memory, Perceptual Organization, and Processing
Speed, as well as scores for Verbal IQ, Performance IQ and Full-Scale
IQ.
According to an email received on 25 April 2007 from Harcourt Assessment
Customer Service, WAIS-III requires a high level of expertise in test
interpretation, and can be purchased by individuals with:
Licensure or certification to practice in a field related to the
A doctorate degree in psychology, education, or closely related
field with formal training in the ethical administration, scoring,
and interpretation of clinical assessments related to the intended
use of the assessment.
4. Technical evaluation of psychometric properties
Shum, O'Gorman, and Myors (2006, p. 130) state that one of the strengths
of the WAIS-III is the size and representativeness of the standardisation
sample used in test development. According to the Technical Manual,
the WAIS-III and WMS-III (Wechsler Memory Scale -- Third Edition) normative
information was based on United States standardisation samples of 2,450
individuals representative of the population of adults aged 16-89 years.
A stratified, census-based sampling plan ensured that the standardisation
samples included representative proportions of adults according to each
selected demographic variable. The variables used for stratification
were age, sex, race/ethnicity, education level, and geographic region.
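To illustrate what a census-based stratified plan involves, here is a minimal Python sketch of proportional quota allocation for a single stratification variable. The function name and the region shares are hypothetical placeholders of my own, not figures from the Technical Manual or the US Census, and the actual plan crossed several variables at once.

```python
def proportional_allocation(total, weights):
    """Split `total` places across strata in proportion to integer `weights`.

    Uses largest-remainder rounding so the quotas always sum to `total`.
    """
    weight_sum = sum(weights.values())
    counts, remainders = {}, {}
    for stratum, weight in weights.items():
        # Whole places for this stratum, plus the leftover numerator.
        counts[stratum], remainders[stratum] = divmod(total * weight, weight_sum)
    shortfall = total - sum(counts.values())
    # Hand the remaining places to the strata with the largest remainders.
    for stratum in sorted(remainders, key=remainders.get, reverse=True)[:shortfall]:
        counts[stratum] += 1
    return counts


# Hypothetical percentage shares for one variable (geographic region).
HYPOTHETICAL_REGION_SHARES = {"Northeast": 20, "Midwest": 24, "South": 35, "West": 21}

quotas = proportional_allocation(2450, HYPOTHETICAL_REGION_SHARES)
print(quotas)
```

The point of the sketch is only that each stratum's quota follows its share of the population, so that the sample of 2,450 mirrors the population on the chosen variable.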
According to the Technical Manual, one set of norms was produced that
was representative of US Census proportions as regards all variables
except age. It was based on the performance of a reference group that
consisted of the participants in the standardisation sample who were
between the ages of 20 and 34. The Manual recommends that this set of
norms be used when clinical questions dictate comparisons of an individual's
performance to that of a reference group. Another set of norms was produced
that was based on age-corrected subtest scores. The Manual recommends
that this set of norms be used when clinical questions dictate comparisons
of an individual's performance to that of his or her age peers.
The WAIS-III exists in only one version, so the question of alternate-form
reliability does not arise. According to the Technical Manual, interscorer
agreement is very high, averaging in the high .90s, and a study of the
stability of WAIS-III scores found it to be adequate across time for
all age-groups.
According to the Technical Manual, the reliability of each WAIS-III
subtest (except Digit Symbol-Coding and Symbol Search) was estimated
using a split-half procedure from the item scores from a single administration,
with the correlation corrected using the Spearman-Brown formula. Since
Digit Symbol-Coding and Symbol Search subtests are speeded subtests,
the split-half coefficient was not considered to be a good estimate
of their reliability. For that reason, test-retest stability coefficients
were used as the reliability estimates for these two subtests, with
the correlation being corrected for the variability of the standardisation
sample.
The test-retest sample included 394 participants, with roughly 30 participants
from each of the 13 age-groups. The reliability coefficients of the
WAIS-III IQ scales and indexes were calculated with the formula recommended
by Guilford (1954) and Nunnally (1978). The average reliability coefficients
across age-groups of the subtests (except Picture Arrangement, Symbol
Search and Object Assembly), which were calculated with Fisher's z transformation,
range from .82 to .93. The Symbol Search subtest had a coefficient of
.77, Picture Arrangement had .74, and Object Assembly had .70. The Object
Assembly subtest is not included in the computation of IQ and Index
scores, in part because of its low reliability for older adults.
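The two statistical steps mentioned above, the Spearman-Brown correction and the averaging of coefficients through Fisher's z transformation, can be sketched in Python as follows. This is an illustration of the standard textbook formulas, not a reproduction of the Manual's computations; the function names and the input values are my own assumptions, and the averaging here is unweighted.

```python
import math


def spearman_brown(half_test_r):
    """Step a split-half correlation up to an estimate for the full-length test."""
    return 2 * half_test_r / (1 + half_test_r)


def fisher_z_mean(coefficients):
    """Average correlations by transforming to z (atanh), averaging, and
    transforming back (tanh). Unweighted: every coefficient counts equally."""
    z_values = [math.atanh(r) for r in coefficients]
    return math.tanh(sum(z_values) / len(z_values))


# Hypothetical values, for illustration only.
full_length = spearman_brown(0.80)                # about 0.89
average_r = fisher_z_mean([0.70, 0.80, 0.90])     # about 0.82
print(round(full_length, 2), round(average_r, 2))
```

Note that the Fisher average of .70, .80 and .90 comes out slightly above the arithmetic mean of .80, because the transformation gives more weight to the higher coefficients.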
The Technical Manual (p. 75) asserts that, in order to ensure content
validity, comprehensive literature reviews were undertaken, consultants
were engaged, surveys were carried out, and focus groups and an advisory
panel were set up. The Manual also provides considerable detail about
the testing that was done of the WAIS-III's concurrent criterion-related
validity.
A later section will examine the issue of construct validity in more
detail. Here it suffices to state that the Technical Manual provides
extensive data on intercorrelations among the components of the
WAIS-III itself, on factor analysis, and on the ability of the WAIS-III
to discriminate between the normal population and groups with various
neurological disorders, alcohol-related disorders, schizophrenia, psychoeducational
and developmental disorders, and deafness or hearing-impairment.
5. Research Relevant to Usefulness of Measure
There has been a vast amount of research done on the WAIS-III and its
predecessors, so this review can do no more than sample it, giving a
hopefully varied but unsystematic taster of the available body of
research. Watkins, Campbell, Nieberding and Hallmark (1995) conclude
that the Wechsler scales
are amongst the assessment procedures most frequently recommended by
American clinicians for clinical students to learn about, and that most
clinicians still most often use what the authors call the "most tried
and true" assessment standards, including the Wechsler scales.
The WAIS-R (the immediate predecessor of the WAIS-III) was the clear frontrunner
in terms of frequency of use of intelligence tests. Camara, Nathan and
Puente (2000) made a similar finding.
In this connection, it is worth noting that the WAIS-III Technical
Manual (on page 75) states that "... because of the similarities
between the WAIS-III and the WAIS-R ..., the accumulated research on
the WAIS-R ... should be considered in any evaluation of the validity
of the (WAIS-III)."
In Australia, Sharpley and Pain (1988) report that the Wechsler tests
of intelligence were also the most valued and recommended, and in New
Zealand Knight and Godfrey (1984) report that the WAIS was the test
that the largest number of hospital psychologists believed clinical
psychology graduates
should have had experience in administering and interpreting.
There has been a good deal of factor-analytic research on the validity of various
aspects of the WAIS-III subsequent to its publication, as was anticipated
in the Technical Manual itself. It is interesting to note that such
studies sometimes appear to contradict each other -- for example, Taub
(2001) concluded that his evidence did not support the Verbal IQ/Performance
IQ dichotomy, whereas Jones, van Schaik, and Witts (2006) conclude as
follows:
...we suggest that index scores should be used with caution in individuals
with low IQ (74 or less). The use of two scores (for verbal and performance
domains) is justified based on the two-factor solution obtained in
the current study.
Bennett (1981) investigated the effect of encouragement of examinees
by administrators on measured IQ and found a significant positive correlation
for Full-Scale IQ, with those who had received encouragement scoring
higher than those who had not. This effect was also found for Performance
IQ, but the effect for Verbal IQ was not significant. Bennett cites
previous research which had also found that reinforcement of various
kinds had a significant effect on academic performance and test scores.
He also investigated the interaction of encouragement with examinee
personality-type (Locus of Control), but he found no significant effect
in this case.
With regard to the issue of encouragement, Bennett states (p. 78)
that it is inevitable that some differences will arise among examiners.
"These differences do not matter if the encouragement has no effect,
but if that is the case, there is little point in using it." He
goes on to state (p. 80):
Although the differences obtained in the present research were within
the standard error of measurement of the WAIS, it must be remembered
that as a result of factors mentioned above, and the fact that examiner
differences were kept to a minimum, the effect found in the present
study was probably a minimal one.
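Bennett's reference to the standard error of measurement can be made concrete with the usual classical-test-theory formula, SEM = SD x sqrt(1 - reliability). The Python sketch below assumes the conventional Wechsler IQ scale (standard deviation of 15); the reliability value is a hypothetical one chosen for illustration, not a figure taken from Bennett or from any test manual.

```python
import math


def standard_error_of_measurement(sd, reliability):
    """Classical test theory: SEM = SD * sqrt(1 - reliability)."""
    return sd * math.sqrt(1 - reliability)


# Assumptions: IQ scale with SD = 15; reliability of .96 is hypothetical.
sem = standard_error_of_measurement(sd=15, reliability=0.96)
print(round(sem, 1))  # roughly 3 IQ points
```

On these assumed figures, a group difference of two or three IQ points would fall within one standard error of measurement, which is the kind of comparison Bennett is making.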
Heaton, Taylor and Manly (2003) investigate certain aspects of both
the WAIS-III and the WMS-III, which were standardised jointly. The authors
are concerned to optimise these two tests for clinical -- especially
neurodiagnostic -- purposes. The use that they have in mind is to
compare the scores achieved by particular individuals
with the scores those individuals would have achieved without any neuropsychiatric
disorder, so that the scores can be used to establish the presence or
absence, nature and extent of any such disorder in that individual.
This would involve comparing test results with norms (unless the individuals
concerned happened to have been recently tested prior to the suspected
onset of any relevant morbidity). Confounding variables would, of course,
need to be taken into account and it would be preferable therefore to
have separate norms for every relevant category that an individual might
fall into. The authors state that there is evidence that sex, education-level
and ethnicity are relevant in this regard. However, WAIS-III only has
separate norms for particular age-groups.
The authors address this problem and claim to have solved it. They
investigate the effect of these variables on WAIS-III and WMS-III test
scores and also the effect on score-interpretation of not taking these
factors into account. They then provide new standardised scores that
correct for these demographic influences, and demonstrate how these
result in more accurate score-interpretations.
6. Evaluation and Discussion
The Wechsler family of tests is long-established and well-known,
and as a result has a good deal of both face validity and professional
credibility. The subtests of the WAIS-III are varied and attractive,
which reduces the tedium (for the examinee) which might be associated
with sitting a long test, although there is evidence (Axelrod and Ryan
2000) that some examinee groups can average as long as 110 minutes to
complete the full test.
One of the main strengths of the WAIS-III is the size and representativeness
of the standardisation sample used in test development. However, as
Kaufman and Lichtenberger (1999, p. 3) state, "The development
of Wechsler's tests was not based on theory ... but instead on practical
and clinical perspectives." This theoretical vacuum reflects on
the construct validity of the WAIS-III.
As the Technical Manual states (p. 75):
The validity of a test is regarded as the most fundamental and important
aspect of test development... validity is the overall evaluation of
the degree to which empirical evidence and theoretical rationales
support the adequacy and appropriateness of interpretations of test
The main weakness of the WAIS-III relates to the theoretical rationales
which underpin its claims to validity. The Technical Manual states that
Wechsler maintained throughout his career the definition of intelligence
as the "capacity of the individual to act purposefully, to think
rationally, and to deal effectively with his environment." From
the point of view of construct validity, however, it is implausible
to claim that WAIS-III measures the constructs intended by its design,
if the constructs are based on the above definition, which is extremely
broad. How prominently does "purposefulness" figure in the
WAIS-III? Not at all, as far as I am aware. And the term "environment"
is so broad that it would be implausible to suggest that sitting any
test at a desk under supervision was at all relevant to assessing how
an individual dealt with his environment as a whole (however that might
be defined). I have seen no evidence that the subtests (derived, as
they mostly were, from tests developed by other researchers) were developed
to test a construct based on that definition of intelligence -- or anything
Moreover, it would also be implausible to claim that users (administrators,
user organisations and examinees) of the WAIS-III would generally have
as broad a definition as that in mind when they purchased and/or used
it in good faith to produce scores of "intelligence". It is
beyond the scope of this review to investigate whether there have been
or might in future be legal arguments raised in connection with the
Coolican (2005, p. 288) warns:
Note that psychologists have not discovered that intelligence has
a normal distribution in the population. The tests were purposely created
to fit a normal distribution, basically for research purposes and practical
convenience in test comparisons.
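Coolican's point, that the scores are made to fit a normal distribution by construction, can be illustrated with a short Python sketch that maps a percentile rank in a norm group onto the conventional IQ scale (mean 100, standard deviation 15). This is an illustration of the principle only: the function names are my own, and actual WAIS-III scoring works through published conversion tables rather than a direct calculation of this kind.

```python
from statistics import NormalDist

# The conventional deviation-IQ scale: mean 100, standard deviation 15.
IQ_SCALE = NormalDist(mu=100, sigma=15)


def percentile_to_deviation_iq(percentile):
    """Place a percentile rank (strictly between 0 and 100) on the IQ scale."""
    return IQ_SCALE.inv_cdf(percentile / 100)


def deviation_iq_to_percentile(iq):
    """Return the percentage of the norm group expected to score below `iq`."""
    return 100 * IQ_SCALE.cdf(iq)


print(round(percentile_to_deviation_iq(50)))       # the median maps to 100
print(round(deviation_iq_to_percentile(115), 1))   # one SD above the mean
```

Because the score is defined by rank within the norm group, the normal shape of the resulting distribution is a property of the scaling procedure, not a finding about intelligence itself.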
This artificiality and pragmatism are not limited to the distribution
of intelligence scores. Psychologists often apply their theories to
important social purposes, and one of these purposes is to assess the
"amount" of the pre-existing popular concept of "intelligence"
that particular people possess. This popular concept itself is vague
and understood in different ways by different ordinary people, but tests
such as the WAIS-III are marketed back to ordinary people as being tests
of "intelligence" (as is shown by the appearance of the word
"intelligence" in the name of the test), with the implication
that this is the same concept that lay people have in mind when they
use that word. It might have been better to use a term such as "rational
There are no substantial ethical issues involved with the WAIS-III
that are not common to all psychometric tests. The one possible exception
is the ethical need to resist pressures from political groups to interpret
as ethical issues what are properly considered political issues related
to the education or employment of particular ethnic or other groups.
These need to be decided through the proper democratic political processes.
7. References
Axelrod, B. N. & Ryan, J.J. (2000). Prorating Wechsler Adult Intelligence
Scale-III Summary Scores. Journal of Clinical Psychology, Vol. 56(6),
Bennett, W.J. (1981). Effects of Encouragement and Locus of Control
on WAIS IQ Scores. Massey University: M.A. Thesis.
Camara, W. J., Nathan, J. S., & Puente, A. E. (2000). Psychological
Test Usage: Implications in Professional Psychology. Professional Psychology:
Research and Practice, Vol. 31(2), 141-54.
Coolican, H. (2005). Research Methods and Statistics in Psychology.
London: Hodder & Stoughton.
Guilford, J. P. (1954). Psychometric Methods (2nd ed.). New York: McGraw-Hill.
Heaton, R.K., Taylor, M.J., & Manly, J. (2003). Demographic effects
and Use of Demographically Corrected Norms with the WAIS-III and WMS-III.
In Tulsky, D. S., Chelune, G. J., Ivnik, R. J., Prifitera, A., Saklofske,
D. H., Heaton, R. K., Bernstein, R. & Ledbetter, M. F. (Eds.) (2003).
Clinical Interpretation of the Wechsler Adult Intelligence Scale. San
Jones, J. J. S., van Schaik, P. & Witts, P. (2006). A Factor
Analysis of the Wechsler Adult Intelligence Scale 3rd Edition (WAIS-III)
in a Low IQ Sample. British Journal of Clinical Psychology, Vol. 45(2),
145-152.
Kaufman, A. S. & Lichtenberger, E. O. (1999). Essentials of WAIS-III
Assessment. New York: Wiley.
Knight, R. G. & Godfrey, H. P. D. (1984). Tests recommended by
New Zealand Hospital Psychologists. New Zealand Journal of Psychology,
Nunnally, J. (1978). Psychometric Theory (2nd ed.). New York: McGraw-Hill.
Sharpley, C. F. & Pain, M. D. (1988). Psychological Test Usage
in Australia. Australian Psychologist, Vol. 23(3), 361-9.
Shum, D., O'Gorman, J. & Myors, B. (2006). Psychological Testing
and Assessment. Melbourne: Oxford University Press.
Taub, G. E. (2001). A Confirmatory Analysis of the Wechsler Adult
Intelligence Scale-Third Edition: Is the Verbal/Performance Discrepancy
Justified? Practical Assessment, Research & Evaluation. Retrieved
30 April 2007 from http://PAREonline.net .
Watkins, C. E. Jnr., Campbell, V. L., Nieberding, R. & Hallmark,
R. (1995). Contemporary Practice of Psychological Assessment by Clinical
Psychologists [Psychological Assessment and Clinical Practice]. Professional
Psychology: Research and Practice, Vol. 26(1), 54-60.