"The malleability of gender as demonstrated by Money's work on
hermaphrodites and the adrenogenital syndrome has undermined the
strength of a causal link between hormones and gender-specific
behaviour."
Helen WEINREICH-HASTE, 1983, British Journal of Psychology 74.
|
"Somehow it always seems that the crummier the test, the higher the
heritability it produces."
Peter Schönemann (Purdue University), 1994. Reported by C.Mann, 1994,
'Behavioral genetics in transition', Science 264, 17 vi.
|
"[In 160 same-sex Croatian twin pairs, aged 15-19], intraclass
correlations for monozygotic and dizygotic twins were, respectively,
.75 and .44 for visualization;
.58 and .33 for spatial orientation;
.67 and .41 for word fluency; and
.74 and .41 for vocabulary."
D.BRATKO (University of Zagreb), 1995, from the abstract of an address
to International Society for the Study of Individual Differences,
meeting in Warsaw.
|
|
|
RUIN, v. To destroy. Specifically, to destroy a maid's belief in the
virtue of maids.
[The Devil's Dictionary A.B.]
|
Reflections on Stephen Jay Gould's The Mismeasure of Man
(1981):
JOHN B. CARROLL, University of North Carolina at Chapel Hill
A Retrospective Review in Intelligence 21, 121-134 (1995)
On its publication in 1981, The Mismeasure of Man (Gould,
1981) stirred in the reading public an interest and a clamor almost
equal to that evoked by the recent appearance of Herrnstein and
Murray's (1994) The Bell Curve. Although it never made the New
York Times best-seller list (as did the latter, for 14 weeks),
it was much discussed among intellectual dilettantes, and it received
a National Book Critics Circle award, as well as, perhaps unexpectedly,
the 1983 Outstanding Book Award from the American Educational
Research Association.
The biologist Bernard Davis (1983; see also Gould, 1984; Davis,
1984) called attention to the fact that reviews in the popular
and literary press, such as The New York Times Book Review,
The New Yorker, and The New York Review of Books,
were almost universally effusive in their approbation, whereas
most reviews in scientific journals, such as Science (Samelson,
1982), Nature, and Science '82, tended to be critical
on a number of counts. Davis cited Jensen's (1982) review in Contemporary
Education Review as "the most extensive scientific analysis,"
but mentioned, as an exception, a generally laudatory review by
Morrison that appeared in Scientific American because that
journal's editorial staff had "long seen the study of the
genetics of intelligence as a threat to social justice" (Davis,
1983, p. 45).
To Davis' list of generally critical reviews in scientific journals,
I would add those by Spuhler (1982) in Contemporary Psychology,
and by Jones (1983) and Humphreys (1983) in Applied Psychological
Measurement (the latter appearing also in the American
Journal of Psychology, 1983).
Despite these critical reviews, however, The Mismeasure of
Man continues to be cited frequently in the social science
literature, usually, but not always, with what can be taken as
agreement and approval. In the annual volumes of the Social
Science Citation Index, the numbers of citations listed for
the years 1982 to 1993 were 18 (1982), 32 (1983), 32 (1984), 49
(1985), 46 (1986), 48 (1987, including a citation of a German
translation), 61 (1988), 51 (1989), 53 (1990), 62 (1991), 58 (1992),
and 56 (1993). It is evident that Gould's book has had a powerful
influence on public and professional thinking about mental testing.
I do not wish to imply that all of this influence was unfortunate
or negative. Gould's research on the history of craniometry is
interesting and possibly valuable for historians of science. His
account of the history of mental testing, however, may be regarded
as badly biased, and crafted in such a way as to prejudice the
general public and even some scientists against almost any research
concerning human cognitive abilities. In this account, he indicts
mental testing not only as racially motivated, at least in its
beginnings, but more importantly, as ethically and scientifically
flawed because it "reifies" the IQ as a single number
that places a value on a test result. This despite Gould's admonition
that:
"The misuse of mental tests is not inherent in the idea of
testing itself. It arises primarily from two fallacies, eagerly
(so it seems) endorsed by those who wish to use tests for the
maintenance of social ranks and distinctions: reification and
hereditarianism (p. 155)."
Gould's influence has come to the fore again in his recent review
(Gould, 1994) of Herrnstein and Murray's (1994) The Bell Curve, a
book that takes much stock in the "g" factor
of intelligence postulated by Spearman (1904, 1927) and many others.
Although I do not necessarily ally myself with any of Herrnstein
and Murray's analyses, views, and interpretations about the role
of g in American life, I feel it is important to correct
the impressions about g and factor analysis that Gould
put forth in his review. There he wrote:
"Nothing in 'The Bell Curve' angered me more than the authors'
failure to supply any justification for their central claim, the
sine qua non of their entire argument: that the number known as
g, the celebrated 'general factor' of intelligence ...
captures a real property in the head. Murray and Herrnstein simply
declare that the issue has been decided, as in this passage from
their New Republic article: 'Among the experts, it is by now beyond
much technical dispute that there is such a thing as a general
factor of cognitive ability on which human beings differ and that
this general factor is measured reasonably well by a variety of
standardized tests, best of all by I.Q. tests designed for that
purpose.' Such a statement represents extraordinary obfuscation,
achievable only if one takes 'expert' to mean 'that group of psychometricians
working in the tradition of g and its avatar I.Q.' The
authors even admit that there are three major schools of psychometric
interpretation and that only one supports their view of g
and I.Q.
But this issue cannot be decided, or even understood, without
discussing the key and only rationale that has maintained g
since Spearman invented it: factor analysis. The fact that Herrnstein
and Murray barely mention the factor-analytic argument forms a
central indictment of 'The Bell Curve' and is an illustration
of its vacuousness. How can the authors base an eight-hundred
page book on a claim for the reality of I.Q. as measuring a genuine,
and largely genetic, general cognitive ability - and then hardly
discuss, either pro or con, the theoretical basis for their certainty?
(p. 143)"
Following that are a couple of paragraphs in which Gould tries
to explain what "lay readers" might need to know about
factor analysis. He briefly repeats some of the same ideas that
he offered in his 1981 book: how Spearman identified g
with an axis placed through the middle of a batch of vectors,
and how Thurstone made g "disappear" by rotating
the axes, "giving rise to a theory of multiple intelligences
(verbal, mathematical, spatial, etc.), with no overarching g."
He continues: "In this perspective, g cannot have
inherent reality, for it emerges in one form of mathematical representation
for correlations among tests and disappears (or greatly attenuates)
in other forms, which are entirely equivalent in amount of information
explained" (p. 144).
It is indeed odd that Gould continues to place the burden of his
critique on factor analysis, the nature and purpose of which,
I believe, he still fails to understand. Even if factor analysis
had never been invented, we would nonetheless have IQ tests and
many other kinds of aptitude tests measuring various cognitive
abilities. And there would still be "experts" dealing
with the construction, analysis, and interpretation of these tests,
and behavioral geneticists (Plomin & McClearn, 1993) concerned
with the heritability of the traits measured by these tests.
It is my intention here to focus on the defense of factor analysis
as an effective and scientifically justifiable method for the
study of individual differences in cognitive abilities and other
psychological attributes, as well as to make any necessary statements
concerning the adequate measurement of such attributes. This is
partly because the available scientific reviews of The Mismeasure
of Man gave little attention to Gould's treatment of factor
analysis. If some of my arguments sound pedantic, it is only because
a pedant (as is implied by the derivation of the term) seeks to
teach.
GOULD'S BASIC PREMISES
First, a general remark: I must raise cautions about two of Gould's
basic assumptions: (a) that the "urge to classify and rank
people is strong" and somehow wrong, and (b) that scientists
cannot be objective, because their findings reflect their surrounding
culture and "the unconscious and very personal prejudices
of the scientists themselves" (quoted from the dust jacket).
Regarding the first assumption, why it is wrong to attempt to
classify and rank people is never made completely clear by Gould.
Certainly classification is a basic technique in all of science,
including Gould's paleobiology. One can hardly make progress in
science without determining the attributes of the things being
studied; in many cases, assigning attributes to things "ranks"
them, for example, by length, weight, mass, frequency, and so
on. In psychology and social science, we can assign attributes
to people with respect to age, social status, tolerance, and so
on, to a whole host of entities that can be "measured."
Indeed, measurement is one of the basic techniques of science.
It may become obnoxious, in some circumstances, when the measurements
are assigned "values" of greater or lesser "worthiness"
in terms of ethics, social justice, or social/emotional attitudes.
This is what Gould appears to mean when he objects to "ranking."
However, Gould confuses this kind of ranking with pure measurement.
Some may object to the imputation of ordinal, interval, or even
ratio scaling in the assessment of ability, but in my view there
are adequate logical and scientific reasons to introduce such
scaling, for example in the use of Rasch scaling (Rasch, 1960)
of items (or tasks) in ability and achievement tests, or the more
complex models developed by Lord and Novick (1968) for what they
call Item Response Theory (IRT). These models take account of
the fact that for any ability, it is possible to find tasks that
differ with respect to the number of people in any population
that are able to perform them correctly, and that there are definite
(albeit probabilistic) mathematical relations between such tasks
that can be described in terms of a quantitative scale of ability.
To be more specific, IRT makes mathematical sense of the fact
that, if the tasks on a scale are graded in difficulty, a person
at a certain level of ability tends to be able to perform successfully
all tasks up to a certain point, after which the person tends
to fail the remaining tasks on the scale.
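The pattern just described can be sketched numerically. The following is a minimal illustration of the one-parameter (Rasch) model only, not of the fuller models cited above; the item difficulties and the person's ability are hypothetical values on an arbitrary logit scale.

```python
import math


def rasch_probability(ability: float, difficulty: float) -> float:
    """Probability of a correct response under the one-parameter (Rasch) model."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


# Hypothetical item difficulties, ordered from easiest to hardest.
item_difficulties = [-2.0, -1.0, 0.0, 1.0, 2.0]
# Hypothetical person located between the third and fourth items.
person_ability = 0.5

probabilities = [rasch_probability(person_ability, b) for b in item_difficulties]
for difficulty, probability in zip(item_difficulties, probabilities):
    print(f"difficulty {difficulty:+.1f}: P(correct) = {probability:.3f}")
```

The probability of success falls steadily as the tasks become harder, and it equals one half exactly where task difficulty matches the person's ability. That is the probabilistic sense in which a person "tends" to pass the easier tasks and fail the harder ones.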
Concerning the second assumption, the idea that scientists cannot
be objective is an old one, pursued by many philosophers and sociologists
of science (e.g., Krasner & Houts, 1994; McMullin,
1988; Scheffler, 1967). Obviously there are many factors in scientists'
selection of the things and issues they choose to study, and perhaps
their personal wants, interests, and prejudices constitute some
of these factors. However, having made those choices, there is
no reason why they cannot be objective in their studies, in reporting
independently verifiable observations, analyses, and findings.
In particular, I object to Gould's tendency to visit the alleged
sins of early investigators on present-day investigators. If
Goddard, Brigham, and others once tended to view various human
races as relatively superior or inferior in intelligence and therefore
relatively worthy or unworthy, this does not mean that present-day
investigators, like Jensen (1980) or Rushton (1995), are necessarily
guilty of such views. In fact, from my personal acquaintance with
Jensen and his publications, I can attest that he does not view
the African race (if one accepts that it is a race) as in any
way less worthy than other so-called races. Indeed, Jensen has
been more interested and active than most other scientists in
trying to work through the problem of how to interpret, and what
to do about, the acknowledged lower mean measured intelligence
of Blacks. This fact has been almost totally ignored by most of
Jensen's activist critics.
Gould's remarks about scientific objectivity have at least impelled
me to consider my own motivations, over my career, in the study
of cognitive abilities. At a time when I was searching for a topic
suitable for a doctoral dissertation pertinent to my main interest
in the psychology of language, I became intrigued with Thurstone's
(1938) finding of several so-called "primary" abilities,
"verbal ability" and "word fluency," that
seemed relevant to the study of language behaviors and possibly
indicative of important mental processes. It did not occur to
me that mental testing might be relevant to racial differences;
in fact, I was unconvinced, by data available at the time, that
any important racial differences in mental abilities existed,
particularly because my linguistic studies had persuaded me that
all races and ethnic groups possess complex linguistic systems
that betokened higher states of mental processes among at least
substantial portions of every population that was able to acquire
and use these systems. Whether factor analysis has in fact led,
or will lead, to better understanding of mental processes remains
to be seen, but in any event my motivation to study and use factor
analysis has always been associated with the scientific investigation
of cognitive processes. I cannot believe that such motivation
is in any way associated with pernicious social attitudes.
GOULD ON FACTOR ANALYSIS
Gould (1981) starts his Chapter Six, "The real error of Cyril
Burt: Factor analysis and the reification of intelligence,"
with a consideration of "The case of Sir Cyril Burt,"
recounting the "twice-told tale" of Burt's alleged transgressions
of scientific proprieties. For present purposes, this whole story
is irrelevant. For a discussion of factor analysis it does not
matter whether Burt claimed to have invented it (he did not) or
fabricated data on twins raised apart, although the debate as
to whether he did still goes on (Fletcher, 1991; Hearnshaw, 1979;
Joynson, 1989; Samelson, 1992, 1995). It is interesting, though,
that Gould makes Burt a whipping boy for Spearman; if any blame
attaches to the supposed reification of intelligence, it should
be awarded to Spearman (as Gould eventually recognizes, pp. 250ff).
Gould goes on to discuss factor analysis, which he says "is,
to put it bluntly, a bitch" (p. 238). (Some have called his
exposition masterful, but I would call it masterful only in the
way one might use that word to describe the performance of a magician
in persuading an audience to believe in an illusory phenomenon.)
Gould cites his own use of factor analysis, early in his career:
"I was taught the technique as though it had developed from
first principles using pure logic. In fact, virtually all its
procedures arose as justifications for particular theories of
intelligence. . . . [T]hough its mathematical basis is unassailable,
its persistent use as a device for learning about the physical
structure of intellect has been mired in deep conceptual errors
from the start. The principal error, in fact, has involved a major
theme of this book: reification - in this case, the notion that
such a nebulous, socially defined concept as intelligence might
be identified as a 'thing' with a locus in the brain and a definite
degree of heritability - and that it might be measured as a single
number, thus permitting a unilinear ranking of people according
to the amount of it they possess. By identifying a mathematical
factor axis with a concept of "general intelligence,"
Spearman and Burt provided a theoretical justification for the
unilinear scale that Binet had proposed as a rough empirical guide.
(Gould, 1981, pp. 238-239)"
This statement calls for comment. First, it is not the case that
"virtually all [factor-analytic] procedures arose as justifications
for particular theories of intelligence." Perhaps Spearman
regarded his (at the time very primitive) procedures as justifying
his own theory of intelligence, but the many procedures of factor
analysis that have been developed over the subsequent years cannot
be regarded as "justifications" of any particular theory
of intelligence. Rather, factor-analytic procedures can be regarded
as devices to assist in developing different theories of intelligence
and choosing among them. Consider, for example, the different
theories of intelligence developed by Thurstone (1938), Guilford
(1967), and Cattell (1971). I regard myself as one of these theorists
(Carroll, 1994) in the field of intelligence, or "cognitive
abilities," as I prefer to say, but I do not regard factor
analysis as such a justification for my theory. Justification
of my theory comes, at least in part, from the manner in which
I use factor analysis and other techniques (such as IRT) to analyze
and interpret data.
Second, the wording "the physical structure of intellect"
is strange and misleading. Factor analysts study what they call
the structure of intelligence, but they do not regard it as a
physical thing in any way. It is simply a statement of the varieties
of cognitive ability and the degree to which they occur or do
not occur together, or subsume each other; often the structure
of intelligence is diagrammed as a hierarchical tree structure.
It is no more a physical thing than the structures that biologists
employ to depict the evolutionary relations of biological species.
Third, and most importantly, factor analysis implies no "deep
conceptual error" of "reification." One can agree
with Gould that factors are not properly regarded as "things"
or physical entities. But factorists do not regard them in this
way (or if they do, they can be in error). Merely because it is
convenient to refer to a factor (like g) by use of a noun
does not make it a physical thing. At the most, factors should
be regarded as sources of variance, dimensions, intervening variables,
or "latent traits" that are useful in explaining manifest
phenomena, much as abstractions such as gravity, mass, distance,
and force are useful in describing physical events. Gould's far-reaching
condemnation of factor analysis as a device for producing reifications
is one of his own deepest conceptual errors; it stands factor
analysis on its head. Unfortunately, it has had wide and beguiling
appeal among some readers, even among some social scientists.
Fourth, although the concept of intelligence may be "nebulous"
in Gould's mind, the purpose of factor analysis (and associated
techniques such as psychological tests and other means of behavioral
observation) is to make the concept more tangible, spelled out,
and scientifically respectable. In the preceding statement, Gould
seems to imply that intelligence is only a single "unilinear"
dimension. Actually, factor-analytic and other types of investigations
have revealed that the "socially defined" concept of
intelligence corresponds to a veritable plethora of different
dimensions of cognitive ability, varying in generality and import
(Carroll, 1993). However, persuading my reader that this is the
case must await further consideration of Gould's critique.
The next major section of Gould's chapter is devoted to the topic,
"Correlation, cause, and factor analysis." In it, Gould
offers an elementary exposition of various psychometric concepts,
such as the Pearsonian correlation coefficient, multiple dimensions
of ability, matrices, vectors, factor analysis, principal components,
and rotation of axes. In the main, it is correct as far as it
goes. However, even when it was written, it was outdated because
it omitted mention of various techniques to circumvent the problems
that Gould cited, and it actually misrepresented some of those
problems. I can mention only a few.
One red herring to which Gould devotes much space is the role
of cause in interpreting correlations. After giving an explanation
of what a Pearsonian correlation coefficient is, he points out,
correctly, that "[t]he vast majority of correlations in our
world are, without doubt, noncasual" (sic, p. 242), and that
"[t]he invalid assumption that correlation implies cause
is probably among the two or three most serious and common errors
of human reasoning." Further, "[i]n summary, most correlations
are noncausal; when correlations are causal, the fact and strength
of the correlation rarely specifies the nature of the cause"
(p. 243). In point of fact, factor analysts have not assumed that
the correlations they deal with are causal. The usual explanation
of a statistically significant correlation is that it suggests
that two variables tend to measure something in common; the problem
is to determine what that common something is, and whether it
can be interpreted as in any way causal in its influence, or referred
to still another variable that is causal. That a "factor"
discovered by factor analysis is causal is only a hypothesis to
be later confirmed or disconfirmed.
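A small simulation may make the "something in common" reading concrete. This is only a sketch under stated assumptions: the two loadings (0.8 and 0.7), the sample size, and the function names are all hypothetical, and the common source is built into the simulated data by construction, which is exactly the kind of hypothesis that real data would have to confirm or disconfirm.

```python
import math
import random


def simulate_two_tests(loading_1, loading_2, n_people, seed=0):
    """Generate standardized scores on two tests that share one latent source."""
    rng = random.Random(seed)
    scores_1, scores_2 = [], []
    for _ in range(n_people):
        common = rng.gauss(0.0, 1.0)  # the shared source of variance
        # Each test also has its own independent (unique) part.
        unique_1 = rng.gauss(0.0, 1.0)
        unique_2 = rng.gauss(0.0, 1.0)
        scores_1.append(loading_1 * common + math.sqrt(1 - loading_1 ** 2) * unique_1)
        scores_2.append(loading_2 * common + math.sqrt(1 - loading_2 ** 2) * unique_2)
    return scores_1, scores_2


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


LOADING_1, LOADING_2 = 0.8, 0.7  # hypothetical loadings on the common source
test_1, test_2 = simulate_two_tests(LOADING_1, LOADING_2, n_people=20000)
observed_r = pearson_r(test_1, test_2)
expected_r = LOADING_1 * LOADING_2  # correlation implied by the one-factor model
print(f"observed r = {observed_r:.3f}, model-implied r = {expected_r:.2f}")
```

Neither simulated test influences the other, yet the two correlate at roughly the product of their loadings. The correlation by itself says only that the tests share a source of variance; what that source is remains a separate question.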
Gould proceeds to an exposition of "correlation in more than
two dimensions," winding up with inviting the reader to consider
"20 dimensions, or 100" (p. 245), and thus to appreciate
"the heart of what factor analysis attempts to do."
In his words, "[f]actor analysis is a mathematical technique
for reducing a complex system of correlations into fewer dimensions"
(p. 245), as if this were the only purpose or definition of factor
analysis. Factor analysis is much more than merely a technique
for reducing a system of correlations to fewer dimensions; such
reduction ("factor extraction") is only the first step
in determining what the reduced dimensions are and what they mean,
after any appropriate transformations.
Gould complains that Spearman reified g as an entity and
"tried to give it an unambiguous causal interpretation"
(p. 251). Perhaps so, but any causal explanation that Spearman
attempted to give g was only a hypothesis; it is only recently
that investigators have been able to find at least some evidence
for a physical basis for g in neuropsychological phenomena
(see, e.g., Duncan, 1995). It is incorrect to make a wholesale
accusation that factor analysts reify factors or make unjustified
attributions of causal influence. Gould wrote: "Spearman's
g is particularly subject to ambiguity in interpretation,
if only because the two most contradictory causal hypotheses are
both fully consistent with it: 1) that it reflects an inherited
level of mental activity (some people do well on most tests because
they are born smarter); or 2) that it records environmental advantages
and deficits (some people do well on most tests because they are
well schooled, grew up with enough to eat, books in the home,
and loving parents). (p. 252)"
He fails to make clear why these two hypotheses are "most
contradictory" (they would be only if it is assumed that
only one of them applies) and in any case shows his ignorance
or neglect of the whole of behavioral genetic science, which all
along has emphasized that heredity and environment both participate,
in complementary degrees, in the determination of behavioral outcomes.
Actually, factor analysis says absolutely nothing about the extent
to which a "factor" or dimension identified in a set
of data is affected more by hereditary or environmental determinants.
This is a problem for behavioral genetics and for developmental
and educational research into the effects of environments or interventions,
not for factor analysis.
Further on this page, Gould introduces his readers to one of his
most misleading and erroneous ideas about factor analysis. He
wrote:
"Another, more technical argument clearly demonstrates why
principal components cannot be automatically reified as causal
entities. If principal components represented the only way to
simplify a correlation matrix, then some special status for them
might be legitimately sought. But they represent only one method
among many for inserting axes into a multidimensional space. (p.
252)"
And
"During the 1930s factorists developed methods to treat this
dilemma [in finding the correct location of axes] and to recognize
clusters of vectors that principal components often obscured.
They did this by rotating factor axes from the principal components
orientation to new positions. . . . [But in doing this,] g
has disappeared. We no longer find a "general factor"
of intelligence, nothing that can be reified as a single number
expressing overall ability. Yet we have lost no information. .
. . How can we argue that g has any claim to reified status
as an entity if it represents but one of numerous possible ways
to position axes within a set of vectors? (p. 253)"
In all this, Gould seems to be claiming that factor analysis is
a worthless technique (somewhere he calls it "bankrupt"
because it has no way of assuring that its results are determinate).
It is not until some pages later (pp. 296ff) that he considers
Thurstone's contributions to factor analysis, and even here he
makes mistakes. He calls Thurstone "the exterminating angel
of Spearman's g" (p. 296).
The fact is that Thurstone later came to accept a higher order
g, not only in his monograph (Thurstone & Thurstone,
1941) on factors of intelligence in eighth graders but also in
his text (1947) on factor analysis methods. Indeed, Gould acknowledges
Thurstone's acceptance of a second-order g, but apparently
in order to make his story consistent he wrote:
"Thurstone wrestled with what he called this "second-order"
g. I confess that I do not understand why he wrestled so
hard, unless the many years of working with orthogonal solutions
had set his mind and rendered the concept too unfamiliar to accept
at first. If anyone understood the geometrical representation
of vectors, it was Thurstone. This representation guarantees that
oblique axes will be positively correlated, and that a second-order
general factor must therefore exist. Second order g is
merely a fancier way of acknowledging what the raw correlation
coefficients show - that nearly all correlation coefficients between
mental tests are positive. (p. 313)"
This is a gross misrepresentation of Thurstone's views and methods
of thinking. Almost from the start, Thurstone postulated that
his "primary factors" might be correlated, and was surprised
to find that in the initial sample he studied, they were generally
uncorrelated. In his 1938 monograph, he described them as uncorrelated
mainly because at the time he had not developed completely satisfactory
techniques for making them conform to his criteria for simple
structure. (Gould failed to note Thurstone's statement in an autobiographical
essay [Thurstone, 1952] that the publication of orthogonal results
was due to a suggestion by Thorndike that his study's impact would
be reduced if too many innovations were introduced in one paper).
Certainly in his later years, Thurstone accepted the idea that
the primary factors were often correlated - he did not live to
see the further techniques that were later developed to depict
relationships among factors.
It was these techniques developed later that Gould totally ignored.
Already, 24 years earlier, Schmid and Leiman (1957) had presented
a technique for depicting the hierarchical structure of a group
of variables and their factors. By 1978, Hakstian and Cattell
(1978) published an important paper on "Higher-stratum ability
structures on a basis of twenty primary abilities." In the
meantime, factorists had devoted much attention to methods of
rotating factor axes to simple structure (see Harman, 1976). Also,
by 1978 the Swedish statistician Joreskog (1978) had published several
important contributions to factor analysis, contributions that
have made it possible to confirm the role of g in explaining
factor correlations (see, e.g., Gustafsson, 1984). If Gould
had done his homework properly, he could have seen that his criticisms
of factor analysis could no longer be well supported. I do not
use space critiquing Gould's many assertions about Spearman, Burt,
Jensen, and others, because they only further illustrate Gould's
many errors in interpreting factor analysis.
Two final points of clarification: First, Gould claims that regardless
of how factorial axes are placed, there is "no loss of information."
In a sense, this is true; the situation is analogous to the fact
that if you want two numbers that when multiplied together produce
a given product, there is an infinity of solutions (e.g.,
two numbers that can be multiplied to give 48 include 1 and 48,
-2 and -24, 3 and 16, .0208333 and 2304, etc., etc.) but there
is no "loss of information" in producing the product.
In factor analysis, however, the correct placement of axes to
produce simple structure in a sense adds information, in that
it specifies more clearly how much each test measures each factor,
on the assumption that the measurements ("factor loadings")
are generally either zero or positive, and not negative; this is basically
the idea of "simple structure" that Thurstone (1938,
1940) initiated as a criterion for the "correct" placement
of axes. Contrary to Gould's assertion in the preceding quotation,
the geometrical representation of vectors does not guarantee that
axes must be oblique, or that a g factor must exist. (Also,
obliqueness of axes does not guarantee that they represent positive
correlations; the correlations may be negative.) However, when
the data dictate that correctly placed axes are oblique, it is
useful to specify a higher-order factor (which may or may not
be g) that accounts for their correlation, and then to
compute, by the Schmid and Leiman (1957) method, a hierarchical
orthogonal matrix to represent the positions of the tests in a
hyperspace that still retains their simple structure. In so doing
there may be a slight loss of parsimony, in that at least one
more factor is required to explain the correlations, but there
is a gain of information in the sense of specifying the factor
loadings on a reasonable scale.
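The first point can be checked with a toy example. This is a minimal sketch, assuming a made-up two-factor loading matrix for four tests and an arbitrary 45-degree rigid rotation; it is not taken from any data set discussed here, and the function names are illustrative only.

```python
import math


def rotate(loadings, angle_degrees):
    """Rigidly rotate a two-factor loading matrix (orthogonal rotation)."""
    t = math.radians(angle_degrees)
    c, s = math.cos(t), math.sin(t)
    return [[a * c + b * s, -a * s + b * c] for a, b in loadings]


def reproduced_correlations(loadings):
    """Common-factor part of the correlations implied by the loadings (L L^T)."""
    return [[sum(x * y for x, y in zip(row_i, row_j)) for row_j in loadings]
            for row_i in loadings]


def count_near_zero(loadings, tol=1e-9):
    """Number of loadings that are effectively zero."""
    return sum(1 for row in loadings for value in row if abs(value) < tol)


# Hypothetical simple-structure loadings: each test measures one factor only.
simple_structure = [[0.8, 0.0], [0.7, 0.0], [0.0, 0.6], [0.0, 0.5]]
# The same configuration seen from axes turned by 45 degrees.
rotated = rotate(simple_structure, 45.0)

r_simple = reproduced_correlations(simple_structure)
r_rotated = reproduced_correlations(rotated)
max_difference = max(abs(r_simple[i][j] - r_rotated[i][j])
                     for i in range(4) for j in range(4))

print("largest difference in reproduced correlations:", max_difference)
print("zero loadings before rotation:", count_near_zero(simple_structure))
print("zero loadings after rotation:", count_near_zero(rotated))
```

Both axis placements reproduce the same correlations to within rounding error, which is the sense in which no information is lost. Only the simple-structure placement, however, shows at a glance which tests measure which factor; after the rotation every test loads on both axes and none of the zero loadings survives.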
Second, Gould claims that Thurstone's analysis permitted Burt
and Spearman "at best, a weak second-order g"
(p. 315). On the previous page he had asserted that "[s]econd-order
g (the correlation of oblique simple structure axes) rarely
accounts for more than a small percentage of the total information
in a matrix of tests" (p. 314). This is truly an egregious
error on Gould's part. The fact is that most of the time, g
accounts for a quite large proportion of the information in a
matrix of correlations among cognitive tests. Further, loadings
of tests on a g factor often tend to be fairly high, particularly
if the tests are observed to be "highly g-loaded"
in terms of their content. The g factor can hardly be called
"weak." I have estimated (Carroll, 1993, p. 57) that
typically, a higher-order factor such as g constitutes
about half of the common-factor variance in a cognitive test,
although the proportion may vary considerably.
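The arithmetic behind such an estimate can be sketched with a two-level Schmid and Leiman (1957) style orthogonalization. Everything numerical below is hypothetical: the first-order pattern, the second-order loadings, and the function name are invented for illustration, and the second-order loadings are taken as given rather than estimated from data.

```python
import math

# Hypothetical first-order pattern: six tests, three correlated primary factors.
first_order_pattern = [
    [0.8, 0.0, 0.0],
    [0.7, 0.0, 0.0],
    [0.0, 0.7, 0.0],
    [0.0, 0.6, 0.0],
    [0.0, 0.0, 0.8],
    [0.0, 0.0, 0.6],
]
# Hypothetical loadings of the three primary factors on one second-order factor.
second_order_loadings = [0.8, 0.7, 0.6]


def schmid_leiman(pattern, higher_loadings):
    """Return (general loadings, residualized group loadings) for each test."""
    # Part of each primary factor not explained by the second-order factor.
    residual = [math.sqrt(1.0 - g ** 2) for g in higher_loadings]
    general = [sum(p * g for p, g in zip(row, higher_loadings)) for row in pattern]
    group = [[p * u for p, u in zip(row, residual)] for row in pattern]
    return general, group


general_loadings, group_loadings = schmid_leiman(first_order_pattern,
                                                 second_order_loadings)
general_variance = sum(g ** 2 for g in general_loadings)
group_variance = sum(v ** 2 for row in group_loadings for v in row)
general_share = general_variance / (general_variance + group_variance)
print(f"share of common-factor variance due to the general factor: {general_share:.3f}")
```

With these made-up numbers the general factor carries about half of the common-factor variance, while each test keeps its original communality. The figure illustrates how such a proportion is computed; it is not evidence for any particular value.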
IN DEFENSE OF FACTOR ANALYSIS AND MENTAL TESTING
Although statisticians (e.g., Goodall, 1990) occasionally
express doubts about the validity of factor analysis as a scientific
methodology, it is seldom clear whether such doubts are well founded
or merely the result of ignorance about recent developments in
the technique. There is a large community of social scientists
(psychologists, sociologists, and others) who have confidence
in factor analysis and use it in analyzing different types of
data. In the field of individual differences in cognitive abilities,
it is prized chiefly as a method for identifying the "linearly
independent" dimensions in a set of data that need to be
examined and integrated with other knowledge about the structure
of abilities. (Linear independence means that different dimensions
can be distinguished even though they may be correlated.) The
method has now achieved a high degree of sophistication and reliability,
in that different investigators can obtain the same results in
analyzing a given set of data (Carroll, 1995). One indication
of this is that exploratory analysis procedures can correctly
recover a hypothetical simple structure matrix from a correlation
matrix generated from that matrix. For example, analysis can recover
the structure of a matrix that contains a general factor (g),
or even several higher-order factors with an overarching g
factor. If a general factor is found in a set of empirical data,
there is reason to believe that such a factor exists in the data,
however it may eventually be interpreted. "Confirmatory"
factor analysis, as embodied in procedures developed by Bentler
(1985), Jöreskog and Sörbom (1989), and others, can add weight
to the finding of such a general factor.
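The recovery check mentioned above can be illustrated in its simplest form. The Python sketch below builds an error-free correlation matrix from a hypothetical single general factor and then applies iterated principal-axis factoring to see whether the loadings come back. Principal-axis factoring is used here only as one convenient exploratory procedure; it is an assumption of this sketch, not necessarily the procedure used in the studies cited. The loadings and the function name principal_axis_one_factor are invented for the example. The hierarchical case with several higher-order factors would need rotation steps that are not shown.

```python
import numpy as np

def principal_axis_one_factor(R, n_iter=500, tol=1e-12):
    """Iterated principal-axis extraction of a single factor from R."""
    R = np.asarray(R, dtype=float)
    # Initial communality estimates: squared multiple correlations.
    h2 = 1.0 - 1.0 / np.diag(np.linalg.inv(R))
    for _ in range(n_iter):
        reduced = R.copy()
        np.fill_diagonal(reduced, h2)          # communalities on the diagonal
        eigvals, eigvecs = np.linalg.eigh(reduced)  # eigenvalues in ascending order
        loadings = np.sqrt(eigvals[-1]) * eigvecs[:, -1]
        new_h2 = loadings**2
        converged = np.max(np.abs(new_h2 - h2)) < tol
        h2 = new_h2
        if converged:
            break
    # The sign of an eigenvector is arbitrary; orient the factor positively.
    return loadings if loadings.sum() >= 0 else -loadings

# Hypothetical "true" loadings of six tests on one general factor.
true_loadings = np.array([0.8, 0.7, 0.6, 0.5, 0.7, 0.6])

# Correlation matrix generated from that structure (no sampling error).
R = np.outer(true_loadings, true_loadings)
np.fill_diagonal(R, 1.0)

recovered = principal_axis_one_factor(R)
print(np.round(recovered, 3))
```

With sampled rather than population correlations the recovery would be approximate, which is the situation an investigator actually faces.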
In the meantime, the technology of constructing mental and achievement
tests has enormously improved over what was possible in Spearman's
or Thurstone's days. IRT makes it possible to examine the unidimensionality
of a mental test or other observational procedure. Although there
is much work to be done in providing adequate tests and measurement
procedures, it is possible to show that available procedures sample
the kinds of mental processes and knowledge that operate in the
real world.
For this and other reasons, it is possible to endorse the proposition
that tests designed for the purpose can adequately measure a "general"
or g factor of intelligence.
EPILOGUE
At this point, I hope I have demonstrated that in the main, Gould's
statements and accusations about factor analysis are incorrect
and unjustified, and should not be regarded as constituting an
authoritative guide to evaluating this technique. However, I should
add some cautions about the present-day status of g in
factor analysis.
First, although a higher-order g factor is often found
in factorial investigations, the precise nature of such a g
factor often depends on the types of measures (psychological tests
or other observational procedures) included in the assemblage analyzed.
For example, a g factor based on a series of highly verbal
tests may be biased toward the verbal abilities measured by such
tests. A good measure of g must be based on a suitable
variety of test materials.
The g factor may also depend on the precise method of
computation: whether, for example, it is
calculated on the basis of a first principal component, a first
principal factor, or an orthogonalization of a structure of oblique
factor matrices by the Schmid and Leiman (1957) technique. Results
of these different procedures are generally different only in
small ways, but Jensen and Weng's (1994) work on ways of finding
a "good g" suggests that the Schmid-Leiman technique
is generally preferable (contrary to Jensen's [1980] previous
opinion that the first principal factor is more satisfactory).
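How closely these procedures agree can be explored on an artificial example. The Python sketch below generates a correlation matrix from a hypothetical hierarchical structure and compares three vectors: the first principal component, the first principal factor (one step, with squared multiple correlations as communality estimates), and the g loadings that the Schmid-Leiman orthogonalization yields for the generating oblique solution. All loadings are invented for illustration, the helper names are assumptions, and the use of a congruence coefficient as the yardstick is a choice made for this sketch. It does not reproduce Jensen and Weng's (1994) analyses.

```python
import numpy as np

def first_loading_vector(matrix):
    """Loadings on the largest eigenvector, oriented to be positive."""
    eigvals, eigvecs = np.linalg.eigh(matrix)  # eigenvalues in ascending order
    v = np.sqrt(eigvals[-1]) * eigvecs[:, -1]
    return v if v.sum() >= 0 else -v

def congruence(x, y):
    """Cosine-type congruence coefficient between two loading vectors."""
    return float(x @ y / np.sqrt((x @ x) * (y @ y)))

# Hypothetical hierarchical structure: six tests, three oblique factors, one g.
F = np.array([[0.8, 0.0, 0.0],
              [0.7, 0.0, 0.0],
              [0.0, 0.6, 0.0],
              [0.0, 0.7, 0.0],
              [0.0, 0.0, 0.8],
              [0.0, 0.0, 0.6]])
a = np.array([0.8, 0.7, 0.6])
phi = np.outer(a, a)
np.fill_diagonal(phi, 1.0)

R = F @ phi @ F.T
np.fill_diagonal(R, 1.0)                       # unit diagonal: a correlation matrix

g_hierarchical = F @ a                         # Schmid-Leiman g for this structure
pc1 = first_loading_vector(R)                  # first principal component

reduced = R.copy()
np.fill_diagonal(reduced, 1.0 - 1.0 / np.diag(np.linalg.inv(R)))
pf1 = first_loading_vector(reduced)            # first principal factor

print(np.round(np.column_stack([g_hierarchical, pc1, pf1]), 2))
print(round(congruence(pc1, g_hierarchical), 3),
      round(congruence(pf1, g_hierarchical), 3))
```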
Second, psychometricians continue to be engaged in debate over
the nature of g. Some feel that g is a unitary,
indivisible trait, whereas others (e.g., Detterman, 1982;
Kranzler & Jensen, 1991) postulate that it is actually a composite
of a number of different traits. The reader may consult an edited
work by Detterman (1994) for discussions, by a number of authorities,
of this and related problems in the theory of intelligence.
Above all, it must be realized that the development of mental
tests did not stop with the work of Spearman, Burt, Thurstone,
and others mentioned by Gould. Current research in testing is
much influenced by developments in cognitive psychology and in
the study of children's mental growth. It may be hoped that at
some time in the future, increased knowledge about the status
of g and other factors of cognitive ability will be available,
leading to positive ways in which testing can be of use in society.