Last year, Richard Herrnstein and Charles Murray published The Bell Curve:
Intelligence and Class Structure in American Life. Although it had more
graphs than a Ross Perot speech, The Bell Curve made its authors' names
household words, sometimes accompanied by four-letter words. Herrnstein
and Murray maintained that America is splitting into the intelligent, who
will move and shake society, and the less intelligent, who will be moved
and shaken. They thought that the split is inevitable, because our technological
society requires intelligence to run it. Finally, they said that intelligence
is largely hereditary, and that numerous government programs, especially
Affirmative Action, are undesirable because they amount to discrimination
against the capable.
Such thoughts are not entirely politically correct. The first reactions
to The Bell Curve were expressions of public outrage. In the second round
of reaction, some commentators suggested that Herrnstein and Murray were
merely bringing up facts that were well known to the scientific community,
but perhaps best not discussed in public. A Papua New Guinea language has
a term for this, Mokita. It means "truth that we all know but agree
not to talk about."
The uproar over The Bell Curve is remarkably similar to a debate
in the early 1970s. The earlier debate began when Arthur Jensen (1969) wrote
that the educational enrichment programs of the Great Society were inherently
limited by the immutability of intelligence and when Herrnstein (1973) claimed
that differences in intelligence are largely genetic. Counterattacks followed,
and by the early 1980s widely read books and articles maintained that there
is no such thing as general intelligence (Gardner 1983), or that if there
is it is largely a statistical artifact of the way that tests are constructed
(Gould 1983), and that even if IQ exists it has little to do with life outside
of a few narrow academic settings (Ceci and Liker 1986). Some of these authors
have recanted (Ceci and Bruck 1994, pg. 79).
A central question in the debate is whether or not mental competence is
a single ability, applicable in many settings, or whether competence is
produced by specialized abilities, which a person may or may not possess
independently. Almost equally important is the question of how cognitive
skill, as evaluated by IQ tests, translates into everyday performance. Popular
presentations on both sides of these questions leave the impression that
these questions have simple answers. They do not. My goal in this essay
is to discuss different theories of how intelligence is related to performance
in modern society. The plural was chosen intentionally. Although we know
a good deal about individual differences in human cognition, there is no
monolithic, agreed-upon, all-purpose theory to organize these facts, nor
is there likely to be one. There are a number of different theories that
are neither right nor wrong, but are useful for different purposes.
Psychometric Views of Intelligence
In popular discussions of intelligence, including The Bell Curve, the term
generally refers to scoring well on tests that have been developed to measure
mental ability as psychologists have come to see it. I shall refer to this
emphasis on test scores as the psychometric view of intelligence. Its core
belief is that individual differences in human cognition can be adequately
measured by performance on intelligence tests, and that intelligence itself
can therefore be defined by variations in test scores, across people. This
notion was expressed most pungently when the psychologist Edwin Boring (1923),
in a public debate with the columnist Walter Lippmann, said that "intelligence
is what the intelligence test measures." It turns out that that statement
is not quite so arrogant or self-serving as it sounds. To see why, we have
to look at what intelligence tests are and how intelligence measures are
inferred from test scores.
Although it is not always clear in our everyday use of language, scientists
distinguish carefully between a conceptual variable and its operational
definition--the way that it is measured. Physicists distinguish between
mass as a concept and scale readings as data to be analyzed. In the best
of situations there is a clearly understood link between the two. Physicists
can provide a theory of the relation between a scale's movement and the
mass of the object being weighed. The relation between the data for and
the concept of intelligence is not at all like the relation between scale
readings and mass, because in psychometrics the concept is inferred from
the measuring instrument, rather than having the measurement technique dictated
by the concept.
Most intelligence tests do not measure just one thing, in the sense that
a scale measures only the gravitational attraction between an object and
the earth. Instead, intelligence tests are made up of a number of component
subtests, in which people are asked to perform different cognitive tasks.
The test score is supposed to measure the common thread that runs through
performance on the subtests. For instance, the widely used Wechsler Adult
Intelligence Scale (WAIS) contains subtests that evaluate a person's vocabulary,
short-term memory, arithmetical ability, world knowledge and several other
specific skills. The Scholastic Assessment Test (SAT), which is a widely
used college-screening test, and the Armed Services Vocational Aptitude Battery
(ASVAB), which is used to screen military recruits, are organized in somewhat
the same way. Instead of thinking of these tests as cognitive yardsticks
measuring intelligence the way a real yardstick measures length, it is better
to think of an intelligence test as a sort of mental track meet, in which
cognitive ability is inferred by combining subtest scores, just as athletic
ability can be inferred by combining the scores in a decathlon.
This brings us to the question of how the subtest scores are to be combined.
Although there is some variation from test to test, the formal basis for
test combination is a statistical procedure called factor analysis. Suppose
that an intelligence test consists of K subtests. (To continue the
analogy to the decathlon, K is usually 10 or 12.) A person's scores
on the subtests can be represented by a K-dimensional vector. The
collective scores of all people in the group can be thought of as a swarm
of points in a K-dimensional space. Factor analysis attempts to reduce
the K-dimensional space to a smaller P-dimensional space,
where P < K and the axes defining the dimensions are orthogonal,
or at right angles to one another. Unless the scores of two of the original
tests are perfectly correlated, this always entails some loss of accuracy.
The loss can be measured, so we can determine how much of the variation
in the original K-space lies along a particular dimension in the
reduced P-space.
To get an intuitive idea of factor analysis, imagine buying a hot dog with
pimientos embedded in it. The hot dog is a three-dimensional object, so
it takes three dimensions to specify the exact location of each pimiento.
However, you can locate a pimiento reasonably accurately by saying where
it is along the long axis of the dog. In factor-analytic terms the pimientos
are the data from each person, and the three dimensions of the hot dog represent
the individual tests. The long axis of the hot dog would be the first factor
to be extracted and would capture most of the variation between pimiento
locations. If we apply factor analysis to test scores, instead of hot dogs,
the first factor accounts for most of the variation between people just
as the length of the hot dog accounts for most of the positioning of the
pimientos. But instead of saying "length of hot dog," we say "general
intelligence."
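The hot-dog picture can be made concrete with a small numerical sketch. The Python code below simulates scores on ten positively correlated subtests and measures how much of the total variation lies along the first axis. It uses a principal-components decomposition as a simple stand-in for factor analysis, and the sample size, the number of subtests and the loading of 0.7 are illustrative assumptions, not properties of any real test.

```python
import numpy as np


def simulate_scores(n_people=2000, n_tests=10, loading=0.7, seed=0):
    """Simulate standardized subtest scores that share one common component.

    All parameters are hypothetical; they only guarantee that the
    subtests are positively correlated, as real subtests usually are.
    """
    rng = np.random.default_rng(seed)
    common = rng.standard_normal((n_people, 1))          # shared ability
    specific = rng.standard_normal((n_people, n_tests))  # test-specific part
    return loading * common + np.sqrt(1.0 - loading**2) * specific


def first_factor_share(scores):
    """Fraction of total variance lying along the first principal axis."""
    corr = np.corrcoef(scores, rowvar=False)   # K x K correlation matrix
    eigenvalues = np.linalg.eigvalsh(corr)     # returned in ascending order
    return eigenvalues[-1] / eigenvalues.sum()


scores = simulate_scores()
share = first_factor_share(scores)
print(f"first axis accounts for {share:.0%} of the variation")
```

With ten subtests, any single original test carries one tenth of the total variance, so a first axis that carries about half of it is the analogue of the long axis of the hot dog.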
There are two objections to this argument. One is that when the data are
reduced from the K-dimensional to the P-dimensional space,
the orientation of the orthogonal dimensions in the P-dimensional
space is arbitrary. To see this, consider the hot-dog example again. Although
locating pimientos can be reduced from a problem in three dimensions to
a problem in one dimension, the one dimension does not have to point exactly
along the long axis of the hot dog. It could be rotated to any angle at
all, excepting at a right angle to the long axis, and the pimientos could
still be located with equal accuracy.
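The arbitrariness of orientation within the reduced space can be checked numerically. The sketch below reduces simulated six-test data to P = 2 orthogonal dimensions, rotates those two axes by an arbitrary angle, and compares how well each version reproduces the original scores. The data, the choice of P = 2 and the rotation angle are all assumptions made for illustration.

```python
import numpy as np


def reduce_to_p_dimensions(scores, p):
    """Project centered scores onto the best-fitting p orthogonal axes."""
    centered = scores - scores.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    axes = vt[:p]                 # p x K, orthonormal rows
    coords = centered @ axes.T    # each person's position in the P-space
    return centered, coords, axes


def rotate_solution(coords, axes, angle):
    """Rotate a two-dimensional solution by `angle` radians."""
    c, s = np.cos(angle), np.sin(angle)
    rotation = np.array([[c, -s], [s, c]])
    return coords @ rotation, rotation.T @ axes


rng = np.random.default_rng(1)
scores = rng.standard_normal((500, 6)) @ rng.standard_normal((6, 6))

centered, coords, axes = reduce_to_p_dimensions(scores, p=2)
error_before = np.linalg.norm(centered - coords @ axes)

new_coords, new_axes = rotate_solution(coords, axes, angle=0.8)
error_after = np.linalg.norm(centered - new_coords @ new_axes)

print(error_before, error_after)
```

The rotated axes are different directions in the test space, yet the loss of accuracy is unchanged, which is why the data alone cannot single out one orientation.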
This fact led one critic of the idea of general intelligence, Stephen Jay
Gould (1983), to argue that factor analysis is not an appropriate way of
defining the variables underlying test scores, because one solution is statistically
as good as another. Gould was wrong. There are statistical methods (which
were well known to specialists at the time) that make it possible to compare
the goodness of fit of one factor-analytic solution to another. When these
methods are applied, investigators virtually always find a highly reliable
first factor. The case for general intelligence, the unitary IQ score, is
far from trivial. However, there are alternative explanations for the data,
based on the idea that there are different types of intelligence, even when
one restricts oneself to the notion that intelligence is what the tests
measure. To understand what they are, we need to delve into factor analysis
a bit more.
Suppose that the statistical variation in the data can be reduced from K
dimensions (the original test space) to P orthogonal dimensions.
This is only possible if the K original tests are positively correlated,
which they virtually always are. In this case there will also be a solution
in M dimensions, where P < M < K, in which
some of the M dimensions are not orthogonal to each other. (In psychological
terms, if two abilities are statistically unrelated to each other, the dimensions
representing them will be orthogonal.) Now, suppose that you had some theoretical
reason to believe that the data from the original K tests had been
generated by two or more underlying mental factors that were statistically
related to each other. Returning to the athletic example, you might want
to argue that decathlon scores were determined by the strength and speed
of the athletes, and that there is a statistical relationship between strength
and speed. Reasoning such as this is called specifying a factor structure
for the underlying abilities. Gould claimed that psychometricians could
not distinguish between alternative factor structures. Today they can.
During the 1970s the Swedish psychometrician Karl Jöreskog developed
a statistical technique for evaluating the fit of multivariate data to
an arbitrary, a priori specified factor structure. This made it possible
to compare two proposals about the structure of intelligence to data, to
see which theory best fit the facts. The new methods have been applied to
a number of new data sets (notably Gustafsson 1984) and have become standard
in evaluating models of intelligence. In a related, highly technical but
very important volume, John Carroll (1993) used somewhat different methods
to reanalyze a great many important data sets that have been collected over
the past 60 years. The results of these independent analyses were quite
consistent. Skipping over some details, human intellectual competence appears
to divide along three dimensions. Following Raymond Cattell (1971) and John
Horn (1985), I shall refer to these dimensions as fluid intelligence (Gf),
crystallized intelligence (Gc), and visual-spatial reasoning (Gv).
Cattell and Horn describe them as follows:
Fluid intelligence is the ability to develop techniques for solving
problems that are new and unusual, from the perspective of the problem solver.
Crystallized intelligence is the ability to bring previously acquired,
often culturally defined, problem-solving methods to bear on the current
problem. Note that this implies both that the problem solver knows the methods
and recognizes that they are relevant in the current situation.
Visual-spatial reasoning is a somewhat specialized ability to use
visual images and visual relationships in problem solving--for instance,
to construct in your mind a picture of the sort of mental space that I described
above in discussing factor-analytic studies. Interestingly, visual-spatial
reasoning appears to be an important part of understanding mathematics.
Crystallized- and fluid-intelligence measures are substantially correlated.
For instance, Horn reported a study in which Gf and Gc measures
were extracted from an analysis of the WAIS. The correlation between factors
was 0.61. Such findings have led believers in just one intelligence to argue
that Gf and Gc are simply different flavors of a general intelligence
(IQ) factor. This argument cannot be answered one way or the other solely
by looking at correlations between tests. However, it can be attacked by
stepping outside of factor analysis and looking at how Gf and Gc
measures respond to manipulations that might change mental competence. It
turns out that they respond differently.
The most striking example is aging. Measures of Gf generally decrease
from early adulthood onward, whereas Gc measures remain constant
or even increase throughout most of the working years (Horn 1985; Horn and
Noll 1994). This is not surprising. Experience counts; most of the key leadership
positions in our society are held by people over 40. On the other hand,
middle-aged and older people do take longer than younger people to understand
new problem-solving methods and to deal with unfamiliar tasks. Age is not
the only variable that can be shown to have different influences on fluid
and crystallized intelligence. Alcoholism shows similar effects.
Since variables such as age, which is not itself a cognitive operation,
have different influences on different types of tests, it follows that there
cannot be just one ability underlying test performance. This argument moves
away from the psychometric tradition, which focuses only on test scores,
and towards the cognitive-psychology approach to intelligence. As the name
suggests, it is derived from a more general theory about what human thought
is, so a word about the general theory is in order.
The Cognitive-Psychology View
Cognitive psychologists think of thinking as the process of creating a mental
representation of the current problem, retrieving information that appears
relevant and manipulating the representation in order to obtain an answer.
The problem, its solution and some of the methods used to solve it are then
stored for later reference. The key point in this process is creating the
representation. This is assumed to require a temporary, working memory capability,
which requires attention and is often a bottleneck in thought. When familiar
problems are encountered the process of building an appropriate representation
becomes more efficient, because previously acquired information and problem
solving techniques can be used. This reduces the demand on working memory,
but does not entirely eliminate it.
The cognitive-psychology view is that cognition is a process, whereas the
psychometric view makes it a collection of abilities. Perhaps because it
is more dynamic, the cognitive-psychology view is often seen as more appealing
than the psychometric view, but it has the disadvantage of not lending itself
to easy summarization. When cognitive psychologists try to characterize
a person's thinking, they are not likely to use numbers to place the person
in a "mental space" defined by factors derived from IQ testing.
Instead they frequently use analogies to computing systems. To solve problems
a computing system must have sufficient "number crunching" power
to attack the problem at hand, programs that are appropriate for solving
the problems the system faces, and access to the data required to solve
these problems. Cognitive psychology draws an analogy between computing
power, programs and data access, and the cognitive functions of being able
to process ideas--any ideas--quickly and accurately, knowing how to solve
certain classes of problems, and having access to the knowledge needed to
solve particular problems. In psychological terms, human number-crunching
is a physiological capacity, whereas knowing how to solve problems and knowing
key facts are both products of learning. Each of these aspects of thought
is a legitimate part of intelligence. The physiological capacities are clearly
part of Gf, knowing key facts is part of Gc, and having acquired certain
problem-solving strategies is a bit of both Gc and Gf. A person's capabilities
are determined by the interaction between power, knowledge of how to use
that power and access to required data.
The cognitive-psychology account complements the psychometric distinction
between fluid and crystallized intelligence. Both accounts stress how a
novice's performance depends on the ability to develop new problem representations
(Cattell and Horn's fluid intelligence) and how with experience one shifts
from problem representation to pattern recognition, by applying past solutions
to present problems. Since developing a representation is more demanding
of working memory and attention than pattern recognition is, learning to
do an intellectual task will generally be harder than doing it. The theory
also implies that people who do well on tests of fluid intelligence should
have a large working-memory capacity, and indeed, they do (Carpenter, Just
and Shell 1990).
When cognition is viewed this way it is not surprising that IQ tests, and
especially fluid-intelligence tests, are associated with academic performance.
By definition students are novices. So are apprentices in workplace settings.
Data from the military (Wigdor and Green 1991) have shown that performance
on the Armed Forces Qualification Test (AFQT), which is used to screen military
recruits, has a strong relation with performance on the job in the first
few months. After two years the relation is reduced, but not negligible.
Similarly, the Department of Labor's General Aptitude Test Battery (GATB)
has been shown to be less valid for older than for younger workers. This
is consistent with laboratory studies and theoretical analyses in cognitive
psychology, all of which show that experience reduces but does not eliminate
the relation between general intelligence and performance (Ackerman 1987).
Nonlinearities in Intelligence
Most of our everyday measurements are linear measurements. A linear measurement
is one in which a constant interval means the same thing at any point on
the scale. For instance, adding one inch to a six-foot board produces the
same change in length that adding one inch to a five-foot board does. We
are so familiar with linear measurements that we often assume that the properties
of linear measurements apply to any characteristic that is described by
numbers. That is not so, and the erroneous assumption can be particularly
confusing when we deal with intelligence.
In psychometric theories intelligence is calculated by determining a person's
standard score on an IQ test. The standard score is the deviation of a person's
absolute score on a test from the mean test score of a reference population,
divided by the standard deviation (a measurement of the variability of scores
in the reference population):
z_i = (x_i - µ) / s
where x_i is the ith person's score in absolute units (usually the number
of correct answers on a test) and µ and s are, respectively, the population
mean and standard deviation. If this equation were applied strictly, a person
of exactly average intelligence would have a score of zero, and people with
below-average intelligence would have negative scores. Since the ideas of
zero and negative intelligence do not seem reasonable, it is conventional
to report IQ scores by rescaling standard scores, using the equation
IQ = 15z + 100
This gives the person of average intelligence a score of 100. This equation
is simply a scaling convention; the real definition is contained in the
first equation, which makes the standard deviation the unit of scoring.
Herrnstein and Murray refer to the standard deviation as "like an inch,"
but it is not. The standard deviation is determined not by the absolute
values of the scores in a population, but rather by the extent to which
one score is likely to be different from another. In addition, the zero
point of the IQ scale (IQ = 100) is determined by the population mean, not
by a definition of "average intelligence" in terms of intellectual
performance. Therefore the IQ score of an individual is a relative score,
compared to the mean and variability in the reference population, rather
than an absolute measure of mental competence. If we measured height the
way that we measured IQ, a six-foot, six-inch man would have a standard
score of somewhat greater than 2, in the North American male population.
The same person would have a standard score of about 0 if the reference
population were professional basketball players.
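The dependence of a standard score on the reference population is easy to verify with the two equations given earlier. In the sketch below the means and standard deviations for height (in inches) are rough, assumed values chosen only to illustrate the point; they are not figures from any survey.

```python
def standard_score(raw, mean, sd):
    """Deviation of a raw score from the reference mean, in SD units."""
    return (raw - mean) / sd


def iq_from_standard_score(z):
    """Conventional rescaling of a standard score to the IQ scale."""
    return 15 * z + 100


height = 78.0  # six feet, six inches

# Assumed reference populations (illustrative values only).
z_general = standard_score(height, mean=69.5, sd=3.0)   # adult men in general
z_players = standard_score(height, mean=78.0, sd=3.5)   # basketball players

print(f"general population: z = {z_general:.2f}")
print(f"basketball players: z = {z_players:.2f}")
print(f"IQ for z = 0: {iq_from_standard_score(0)}")
```

The same absolute measurement produces two different standard scores, just as the same number of correct answers would produce different IQ scores against different reference groups.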
The distinction between the relative and absolute definitions of intelligence
becomes important when we consider the relation between IQ, defined by standard
scores, and various dependent measures, such as school achievement and workplace
performance. Suppose a psychometrician records the job performance and intelligence-test
scores of a group of workers. The relationship would be expressed by this
equation, where B is the regression coefficient, or the rate at which
job performance changes as IQ changes:
job performance = average job performance + B * IQ
B is calculated to make predictions as accurate as they can be. The actual
degree of accuracy is measured by the correlation coefficient, which varies
from 0 (no accuracy at all) to 1 (perfect prediction). Determining the regression
and correlation coefficients from a given set of data is straightforward.
The problem comes when an extrapolation is made to new situations, where
some data points lie outside the range of IQ units observed in the original
study. An example might be extrapolating the grade-IQ relationship observed
in high-school students to grade-IQ relations among college students. Such
extrapolations implicitly assume that IQ scores are linear measures of the
intellectual traits that they are supposed to measure. This is not true.
Suppose that a person in his 20s suffered a brain injury or infection that
reduced his IQ score by 20 points. (Such things are possible.) If he were
a medical or law school student with an original IQ of 140, he would probably
still complete his coursework, though perhaps with not quite so high a class
rank as before. If the person were a blue-collar worker with an original
IQ of 80 he would, at IQ 60, have a substantial risk of homelessness, poverty
and a number of other serious social problems.
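A short simulation shows why extrapolating a straight line is risky when the underlying relation is not linear. The sketch below invents a saturating relation between IQ and a performance measure, fits a line to the middle of the range only, and then extrapolates. The curve and its constants are hypothetical, chosen purely for illustration, and are not an empirical finding.

```python
import numpy as np


def true_performance(iq):
    """Hypothetical saturating relation between IQ and performance."""
    iq = np.asarray(iq, dtype=float)
    return 100.0 * (1.0 - np.exp(-(iq - 50.0) / 30.0))


# "Original study": only IQs between 85 and 115 are observed.
observed_iq = np.linspace(85, 115, 31)
slope, intercept = np.polyfit(observed_iq, true_performance(observed_iq), deg=1)


def linear_prediction(iq):
    """Straight-line extrapolation from the restricted sample."""
    return intercept + slope * iq


for iq in (60, 100, 140):
    actual = float(true_performance(iq))
    predicted = float(linear_prediction(iq))
    print(f"IQ {iq}: actual {actual:.1f}, linear prediction {predicted:.1f}")

# Cost of losing 20 IQ points at the top and at the bottom of the scale.
loss_high = float(true_performance(140) - true_performance(120))
loss_low = float(true_performance(80) - true_performance(60))
print(f"loss from 140 to 120: {loss_high:.1f}")
print(f"loss from 80 to 60: {loss_low:.1f}")
```

Under this assumed curve the line fits well inside the observed range and misses at both extremes, and the same 20-point drop costs far more at the low end than at the high end.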
The issue of nonlinearity applies to the very definition of intelligence,
and in particular to the question of whether there is one type of intelligence
or several. Suppose that general intelligence is equally important at all
levels of mental competence. In this case the results of a factor-analytic
study of test scores, based on data from people with high levels of intelligence,
should be similar to the results of a study based on data from people of
lower absolute levels of intelligence. Historically there have been suggestions
that this is not so. The general-intelligence model was first developed
by Charles Spearman (1904, 1927), based on analysis of test results from
English schoolchildren. In 1938 L. L. Thurstone challenged Spearman's conclusion
because he found very little evidence for general intelligence in a sample
of University of Chicago undergraduates. It was observed at the time that
the discrepancy might have arisen because Spearman and Thurstone had taken
data from people of widely different intellectual levels, which would be
evidence that intelligence changes qualitatively as the level of mental
competence changes. However, the results were not definitive because Spearman
and Thurstone had used different tests.
An important study by Douglas Detterman and Mark Daniel (1989) showed that
the relations between subtests do change as the level of scores changes.
Among other things, Detterman and Daniel examined correlations between subtests
of the WAIS and found higher correlations between subtest scores for people
with below-average IQ than for people with above-average IQ. David Waller,
Derek Chung and I found the same thing when we analyzed the ASVAB scores
that Herrnstein and Murray used in The Bell Curve to determine the
relation between IQ and various indicators of social adjustment. It appears
that general intelligence may not be an accurate statement, but general
lack of intelligence is!
The conclusion that the relation between different indices of mental competence
depends on the general level of competence is not consistent with psychometric
approaches, but it is consistent with the cognitive-psychology approach.
Recall that the cognitive-psychology approach assumes that mental competence
is produced by a cascade of progressively more refined abilities, moving
from information processing to problem-solving techniques to knowledge possession.
It follows that problems at the information-processing level will be general,
whereas potentials established at higher levels will be specific. In fact,
Detterman and Daniel did find that the relation between information-processing
measures and intelligence-test performance is higher at low levels of intelligence.
Similar observations have been made by scientists who have studied very
high-level performance, in fields ranging from physics to literature. A
certain amount of intelligence seems to be needed to gain entry to an intellectually
demanding field, but beyond that point success is determined by the effort
put into the job, social support, and just sheer experience. (See Ericsson,
Krampe and Tesch-Römer (1993) on expertise, Simonton (1984) on creativity,
and Gardner (1993) for some interesting biographical data.)
In economic terms it appears that the IQ score measures something with decreasing
marginal value. It is important to have enough of it, but having lots and
lots does not buy you that much. My regrets to Mensa, but that is the way
things are. Nonlinearity becomes important when we ask a key question raised
by Herrnstein and Murray: What is the relation between intelligence and
workplace performance?
How Important Is Intelligence?
No one would worry about who has intelligence, or why, if it did not matter.
Indeed, one of the claims made by the opponents of testing in the 1960s
and 1970s was that intelligence tests just measured academic performance,
and that even there they did not do a good job. One of Herrnstein and Murray's
major contributions has been to expose this bit of Mokita. Intelligence,
as measured by the tests, really does matter in both school and workplace,
although it may matter in somewhat different ways than The Bell Curve suggests.
To argue that IQ is a determinant of economic outcomes, Herrnstein and Murray
relied on two sources of evidence. One was the recent literature, and especially
John Hunter's (1986) summary of the relation between IQ scores and workplace
performance. The other was their own analysis of data from the National
Longitudinal Survey of the Labor Market Experience of Youth (NLSY). The
NLSY is a Department of Labor survey that has followed over 12,000 participants
since 1979. The respondents are now in their late 20s and early 30s. Early
in the survey many participants took the Department of Defense's ASVAB test.
Herrnstein and Murray used the AFQT score, which is derived from the ASVAB
subtest scores, as a measure of IQ. They then related IQ to subsequent life
events, such as being employed or being below the official poverty line.
Hunter reviewed studies of the relationship between job performance and
scores on the General Aptitude Test Battery (GATB), a Department of Labor
test which was widely used until the late 1980s, when the testing program
became embroiled in a controversy over its fairness to minorities. The GATB
was withdrawn as a political rather than a scientific decision. After a
detailed statistical analysis, Hunter concluded that the "true"
relation between intelligence and job performance in the population is about
0.5. This conclusion depended heavily upon extrapolating relationships beyond
the data, which assumes linearity. A National Science Committee reviewing
the GATB argued that Hunter should have used the observed correlations,
which were almost all in the 0.2 to 0.3 range. The truth probably lies between
these estimates, provided that the extrapolation is to comparable jobs
(Hunt 1995). And that is an important qualification.
The GATB was designed to screen applicants for entry-level jobs in blue-collar
and lower-level white-collar occupations. In terms of averages (something
that is well established), we are talking about occupations where the mean
IQ is in the 90-110 range, which covers about half of the population. But
recall that as intelligence goes up cognitive abilities become more differentiated.
Also, as experience goes up the IQ-performance connection gets weaker. These
factors would lead to a reduction in IQ-performance relations within higher-level
job classifications, and when dealing with experienced and older individuals.
(In fact, the GATB is known to be less accurate in predicting the performance
of older workers.)
The qualification within a job class is also important. There are quite
high correlations between the socioeconomic status of a job and the mean
IQ of the jobholders. Truck drivers average slightly under 100, while high-paid
professionals, such as doctors and lawyers, have averages of 125 or above.
It is sometimes asserted that this is because general intelligence is needed
to obtain the educational certification required to qualify for a job, but
is less important to on-the-job performance. There is evidence for this.
Military and civilian studies have found that IQ tests are better predictors
of performance when people are in training programs than when they are on
the job itself. After people are on the job, correlations are higher between
IQ and tests of job knowledge than between IQ and on-the-job observations
of performance. However, none of the correlations vanish.
IQ does not predict all aspects of job performance. In an extensive study
of enlisted personnel (Campbell, McHenry and Wise 1990), the Army found
that it was useful to distinguish between what might be called ability aspects
of performance, which includes such things as knowledge of one's job requirements
and the ability to operate machinery required in the job, and motivational
aspects, which include cooperating with colleagues, showing initiative and
leadership. The ASVAB did a good job of predicting the ability aspects but
had almost no relation to the motivational aspects. This is not surprising,
but it does make any focus on a unitary index of job competence seem simplistic.
In summary, it appears that IQ is an important factor in getting into a
job or profession, but is less important (although not negligible) once
you have learned to do the job. Further improvement is then achieved by
acquiring experience, rather than improving upon an abstract knowledge of
what the job requires.
Untangling Social Variables
If we can predict good things, however imperfectly, for someone with a high
IQ score, what can we predict for a person with a low score? People with
criminal records, people who are below the official poverty line, and people
who are receiving aid for dependent children tend to have low IQ scores.
Based on their analysis of the NLSY data base, Herrnstein and Murray argued
that IQ causes these problems, because AFQT scores are often the best single
predictor of a person's social troubles.
People who are below the poverty line are likely to simultaneously have
low IQs (on the average) and poorer than average health, and to come from
parental families of low socioeconomic status (SES). What is causing what?
The question is hard to answer, partly because of the difficulty of the
statistical analysis and partly because most social problems have multiple
causes. Young adults on welfare may be there because of a combination of
low intelligence, lack of education and limited familial support.
In preparing their book, Herrnstein and Murray used a technique called logistic
regression to attack the statistical problems. They first defined a binary
social variable, such as having an income under the official definition
of poverty, and then modeled the probability that a person will be on the
bad side of this variable as a combined function of various predictor scores,
such as IQ (defined by the AFQT), SES, and education. Because a probability
is confined to the range between 0 and 1, it cannot be modeled directly
as a linear function of the predictors. Instead they calculated a regression
equation for the logarithm of the odds, ln(p/(1-p)), where p is the probability
of being in poverty. This logarithmic expression is related to IQ,
SES, education (ED) and so forth by the regression coefficient for each
(the B terms).
$$\ln\left(\frac{p}{1-p}\right) = A + B_{IQ}(IQ) + B_{SES}(SES) + B_{ED}(ED) + \cdots$$
If all variables are expressed as standard-score units, you can determine
the relative importance of each variable as a predictor by comparing the
regression coefficients. For instance, in the case of poverty status the
regression coefficient for IQ is -0.84 and the regression coefficient for
SES is -0.33. This tells us that the risk of poverty goes up as IQ and parental
SES go down, and that, since the absolute value of the IQ regression coefficient
is greater than the absolute value of the SES regression coefficient, the
risk of poverty is more sensitive to changes in personal IQ than to changes
in parental SES.
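To make the comparison of coefficients concrete, the following minimal sketch inverts the logit to recover a probability and converts each coefficient into a multiplier on the odds. The coefficients -0.84 and -0.33 are the ones quoted above for poverty status. The intercept A is not given in the text, so the value used here is purely illustrative, as are the function names; education and the other predictors are left out.

```python
import math

B_IQ = -0.84    # regression coefficient for IQ, as quoted in the text
B_SES = -0.33   # regression coefficient for parental SES, as quoted in the text
A_HYPOTHETICAL = -2.5  # assumption: the text does not report the intercept


def log_odds(iq_z, ses_z, intercept=A_HYPOTHETICAL):
    """ln(p / (1 - p)) for predictors expressed in standard-score units."""
    return intercept + B_IQ * iq_z + B_SES * ses_z


def poverty_probability(iq_z, ses_z, intercept=A_HYPOTHETICAL):
    """Invert the logit: p = 1 / (1 + exp(-log_odds))."""
    return 1.0 / (1.0 + math.exp(-log_odds(iq_z, ses_z, intercept)))


def odds_multiplier(coefficient, change_in_sd):
    """Factor by which the odds change when a predictor moves by change_in_sd."""
    return math.exp(coefficient * change_in_sd)


# A one-SD drop in each predictor, holding the other fixed.
print(round(odds_multiplier(B_IQ, -1.0), 2))   # about 2.32
print(round(odds_multiplier(B_SES, -1.0), 2))  # about 1.39
```

The odds multipliers do not depend on the intercept, which is why they can be stated even though A is unknown here. The probabilities returned by `poverty_probability` do depend on it and should be read only as illustrations.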
Results like this are ubiquitous in the NLSY data. IQ is the best predictor
of being below the official poverty line, dropping out of high school and
receiving aid for dependent children. IQ and SES are about equal in predicting
risks of long-term unemployment and of divorce. Since the publication of
The Bell Curve, and possibly inspired by it, there have been a number
of privately circulated alternative analyses of the NLSY data. All the ones
that I have seen show that, although you might change the exact numbers
reported by Herrnstein and Murray a bit, intelligence is a substantial predictor
of indicators of social problems.
But just how substantial, and how should a prediction based on intelligence
be related to a prediction based on other factors? This is a hard question
to answer, because of the complicating factors of nonlinearity and collinearity.
Recall that nonlinearity means that a relation is not the same at all levels
of the predictor (IQ). Understanding nonlinearity is always difficult. The
problem is compounded because, in this case, the regression coefficients
are not for the risk of a social problem; they are for the logistic function
of that risk. This function is not intuitive to most people. Collinearity
refers to the fact that the predictor variables--IQ, SES, education and
a number of other possible predictors--are themselves highly correlated.
In the NLSY data, for instance, the correlation between IQ and SES is 0.55,
which is about as high as the correlation between adult height and weight.
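The nonlinearity of the logistic function can be illustrated with a short sketch. It applies the same one-SD drop in IQ (using the coefficient of -0.84 quoted earlier for poverty status) to three different baseline risks and reports the resulting change in probability. The baseline risks are hypothetical values chosen for illustration, not figures from the NLSY, and the function names are my own.

```python
import math

B_IQ = -0.84  # IQ coefficient for poverty status, as quoted in the text


def logit(p):
    """Log-odds of a probability p."""
    return math.log(p / (1.0 - p))


def inverse_logit(x):
    """Probability corresponding to log-odds x."""
    return 1.0 / (1.0 + math.exp(-x))


def shifted_probability(baseline_p, coefficient, change_in_sd):
    """Risk after a predictor moves by change_in_sd standard deviations."""
    return inverse_logit(logit(baseline_p) + coefficient * change_in_sd)


# Hypothetical baseline risks; the log-odds shift is identical in each case.
for baseline in (0.01, 0.10, 0.30):
    new_p = shifted_probability(baseline, B_IQ, -1.0)
    print(f"baseline {baseline:.2f} -> {new_p:.3f} (change {new_p - baseline:+.3f})")
```

The shift in log-odds is constant, but the change in probability is small when the baseline risk is near zero and much larger when the baseline risk is moderate. This is the sense in which a single regression coefficient does not translate into a single statement about risk.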
The graph below shows how these effects combine in the NLSY. This figure
is a three-dimensional view of the relation between the probability of being
in poverty status, represented by color; IQ (the horizontal axis); and SES
(the vertical axis). The figure shows both the nonlinearities and the collinearity
of these data. For anyone of above-average intelligence or high parental
SES, the probability of being in poverty status is very low indeed. This
is indicated by the large black area in the figure. Furthermore, in this
distribution people with moderate or better SES and very low intelligence,
or moderate to better intelligence and low SES, are not likely to exist.
(Note that the figure is not a square.) The red "hot spot" might
be thought of as a danger zone in which relatively high probabilities of
poverty status are associated with the combination of the bottom 15 percent
of the intelligence and parental SES distributions. This does suggest a
troubling, cyclical relation between these variables. But once a person's
scores are in the moderate SES or moderate cognitive ability ranges the
relation between poverty, IQ and parental SES virtually vanishes.
Waller, Chung and I have developed a number of similar analyses for other
"at risk" variables in the NLSY data set, such as health problems
and prolonged unemployment. No single picture emerges. What is clear, though,
is the need to consider nonlinearity and collinearity in each case. Even
after this is done, intelligence test scores in the bottom 15 percent (roughly
an IQ of 85 or below) almost always indicate that a person has a substantial
risk of encountering problems in our society. It is important to remember
that this is a statistical statement, whereas at the individual level nonstatistical
interactions are involved. There are undoubtedly many cases in which a person
with low parental SES inherits genetic limitations in IQ, and IQ score is
indicative, on average, of the extent to which a person can benefit from
education. There are other cases in which limited family support or limited
educational opportunity may restrict a person's intellectual potential,
even when a person is highly motivated to succeed. Statistics cannot tell
us to what extent any of these variables is operating in an individual case.
All statistics can tell us is how many such cases to expect in the population.
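The correspondence between "the bottom 15 percent" and "an IQ of 85 or below" can be checked with a brief calculation. The sketch assumes the conventional IQ scaling, which the text does not state explicitly: scores normally distributed with a mean of 100 and a standard deviation of 15.

```python
from statistics import NormalDist

# Assumption: conventional IQ scaling (mean 100, SD 15), not stated in the text.
iq_scale = NormalDist(mu=100, sigma=15)

share_at_or_below_85 = iq_scale.cdf(85)     # fraction scoring 85 or lower
cutoff_for_bottom_15 = iq_scale.inv_cdf(0.15)  # score marking the bottom 15 percent

print(f"{share_at_or_below_85:.3f}")
print(f"{cutoff_for_bottom_15:.1f}")
```

Under that assumption an IQ of 85 sits one standard deviation below the mean and cuts off a little under 16 percent of the population, so the two descriptions agree only roughly, as the text indicates.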
We once again see that the data are more easily explained by the cognitive-psychology
view of intelligence as an interacting process than by the psychometric
emphasis on linear relationships. From a cognitive-psychology perspective,
low IQ might cause social problems, because of the failure of some general
component of cognition, but once beyond a given level of ability people
would be able to cope with the general society adequately. (Anthropologists
will hardly be surprised to find that most people are able to operate in
their own cultures!) Social problems could arise, though, if the threshold
for doing well in society were set so high that a substantial number of
people could not meet it. This topic will appear again when we look at the
interaction between scientific facts and public policies.
Can Cognitive Abilities Be Improved?
Because expressed intelligence must be drawn out from innate ability, through
cultural experiences, it is natural to ask whether certain cultural experiences,
including education, can improve intelligence. Some social programs have
had this as an explicit goal. It is also natural to ask whether societies
can improve intelligence by altering the physical environment--for instance,
through programs to improve nutrition or the family environment. Finally,
whether or not intelligence, as measured by tests, is subject to improvement,
there remains the question of whether cognitive competence can be manipulated.
These questions have been looked at in three ways: in statistical and historical
comparisons of cultures, from within our own culture's experience and from
the viewpoint of statistical and theoretical biology. They are at the core
of the debate reignited by Herrnstein and Murray, who argue that competence
in today's workplace is determined by IQ, that IQ is determined by inheritance
and that since IQ is resistant to change, social programs that rely on changing
or disregarding IQ are misguided and even counterproductive.
If we take a cross-cultural perspective, there is evidence that broad characteristics
of a society can influence reasoning, probably by placing a value on the
practice of certain intellectual skills. Literacy is associated with an
appreciation for abstract reasoning, which is of considerable importance
in a technologically oriented society. Nonliterate, traditional cultures
seem to place more weight on reasoning based on memory and personal experience.
These observations, though, are of limited importance for the study of variations
in intelligence within our own society, where minimal literacy is virtually
universal.
There is some indication that intelligence levels have changed over time
within Western cultures. Flynn (1987) observed that the absolute scores
on widely used tests of abstract reasoning (Gf) have increased in
North America and Europe since World War II. Interestingly, scores on tests
that are designed to evaluate cultural knowledge and problem-solving techniques
(e.g., the SAT) declined over the same period. Although the reasons
for these changes are not known, the fact that they have moved in opposite
directions is further evidence for distinguishing intelligence as
an abstract problem-solving ability from intelligence as an ability to attack
culturally relevant problems.
When we move from comparisons across cultures and across time to our own
society, we find surprisingly little evidence for influences of cultural
experiences on intelligence--once again, as measured by intelligence-test
scores--in spite of many efforts to find such effects. Two well-documented
findings capture the gist of the results. Studies of adopted children have
repeatedly shown that the IQ of the biological parent is a better predictor
of the child's IQ than is the IQ of the adopting parent, even when adoption
is virtually at birth. Consistent with this observation, the quality of
home or school environments appears to have relatively little relation to
permanent changes in test scores, once one has taken account of the correlation
between genetic and social variables. Put a slightly different way, genetic
predictions based on parental or sibling IQ can account for IQ variability
in children, after social factors have been taken account of, but social
factors are not related to children's IQ after genetic variability has been
accounted for (Scarr, in press).
Within the framework of the psychometric definition, in fact, the evidence
is quite clear that intelligence is substantially inherited. Behavior-genetics
studies have shown repeatedly that IQ scores behave as if between 40 and
80 percent of the variation in intelligence, across individuals, can be
accounted for by genetic variation. The exact value does not matter. Identical
(monozygotic) twins who are adopted at birth and raised apart will resemble
each other in IQ more than fraternal (dizygotic) twins raised together.
Genetic inheritance is a major determinant of whatever is behind
the IQ scores.
Genetic heritability has become entangled with racial and ethnic issues
each time the national intelligence debate has flared up. Gaps in intelligence-test
scores among groups exist; Herrnstein and Murray, like Jensen before them,
posit a genetic explanation. Many social activists have responded by denying
the tests' validity in minority groups. The facts in this debate are pretty
clear, but the explanation for the facts is not.
Numerous studies have found that in the United States the average IQ score
in samples of blacks and Latinos is about one standard-deviation unit below
the average score for whites and Asians. This means that the median black
score is exceeded by 87 percent of whites. There is, at best, marginal evidence
showing that the tests do not predict minority academic performance as well
as they predict majority performance. With a few exceptions (primarily involving
language tests in Latinos) test items that appear to have the least cultural
bias show some of the largest ethnic-group differences. Herrnstein and Murray
asserted that the tests are equally valid for minorities and majorities;
although too strong, this statement is closer to the truth than the claim
that the tests are totally invalid. This does not mean that the differences
in IQ scores between ethnic groups are genetic in origin. In our society
ethnic status and social variables that might correlate with intelligence
are highly confounded. Therefore the currently available data do not discriminate
between genetic and nongenetic explanations. We do not know whether ethnic-group
differences are innate or not. Given the complexities of the situation,
not the least of which is defining what ethnic group a person belongs to,
we should perhaps let the issue go at that.
IQ and Cognitive Skills
The view is different as soon as one steps outside psychometrics. The sociologist
Christopher Jencks (1992) has observed that genetic explanations that stop
with a heritability coefficient are unsatisfactory because they do not specify
how intelligent behavior is produced. No one inherits an intelligence-test
score in the sense that one inherits eye color. What must be inherited is
a physiological capacity for paying attention, learning and reasoning that
allows us to extract from our experiences the knowledge and problem-solving
techniques required to solve test problems. We have very little idea about
what these physiological mechanisms might be, especially insofar as they
are related to variation in abilities within the normal range of intelligence.
(There is a considerable knowledge of physiological problems associated
with specific types of mental retardation.)
Whichever model they adopt, psychologists have been frustrated in the search
for ways to enhance cognitive function. Research has shown how we might
lower a person's intelligence by physical intervention, but not how to improve
it. There are drugs that produce brief improvements in specific cognitive
functions, such as memory or attention, but the intelligence pill is nowhere
in sight. And although nutrition might be thought to have a significant effect,
there is at best marginal evidence for nutritional effects within the range
of nutrition encountered in the developed world.
Even if we do not know how to improve intelligence, as indicated by the
test scores, the economic issue is what skills people possess, not what
their IQ scores are. We may not be able to destroy the linkage between IQ
scores and the relative possession of cognitive skills (and it is not clear
why we would want to), but improved education and training can raise the
average achievement of all students.
A study by one of my colleagues (Levidow 1994) showed this in a controlled
way. High-school students were given a test of fluid intelligence. They
then took a year-long problem-solving-oriented course in elementary physics.
The IQ test did indeed predict how much physics the students learned. At
the end of the year they took an equivalent IQ test. Their IQ scores had
not changed a whit. Furthermore, the IQ test did predict the relative standings
of the students on the final examination. However, all students had learned
a great deal of physics, as evidenced by comparisons to national standards.
IQ may not have been changed, but cognitive competence, in the sense of
the problems the student could solve, was increased.
Levidow's study involved a carefully monitored educational program. Could
similar increases in skill be obtained just by putting more effort into
education? In 1994 the New York City school system, at the insistence of
its new chancellor, required that virtually all 10th-grade students take
science courses that previously had been taken by only half the students,
usually the more able ones. Enrollment jumped from 20,000 to 48,000 students.
Failure rates went up, from 13 percent to 25 percent. Pessimists can point
to this as a consequence of trying to teach hard topics to less-intelligent
students. There is probably some truth to this. But more than twice as many
students successfully completed science courses in 1994 as in 1993.
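The arithmetic behind the last sentence can be laid out in a few lines. The sketch uses only the enrollment and failure figures quoted above and assumes that every enrolled student either passed or failed; the function name is my own.

```python
def passing_students(enrolled, failure_percent):
    """Number of students who passed, given enrollment and percent failing."""
    return enrolled * (100 - failure_percent) / 100


passed_1993 = passing_students(20_000, 13)  # before the requirement
passed_1994 = passing_students(48_000, 25)  # after the requirement

print(passed_1993)                          # 17400.0
print(passed_1994)                          # 36000.0
print(round(passed_1994 / passed_1993, 2))  # 2.07
```

The failure rate nearly doubled, yet the number of students completing the courses rose from about 17,400 to about 36,000, which is slightly more than a doubling.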
I have just cited examples of programs that achieved success by one measure,
which happens not to be IQ scores. Herrnstein and Murray cited different
examples to buttress their conclusion that programs intended to enrich children's
intellectual experiences, such as Head Start, have failed. This has serious
policy implications, because enrichment programs are generally targeted
toward children who, as a statistical group, have low IQ and are considered
at risk for school failure. Saying that the programs have failed is a bit
strong, because the programs certainly should not be judged solely by their
effect on children's IQ scores, and perhaps not even solely upon children's
school records. But by these measures it is clear that enrichment programs
have not been nearly as successful as it was hoped that they would be when
they were initiated in the 1960s and early 1970s.
What measures are appropriate to judging such programs? In our society the
labor market supplies the yardstick. Herrnstein and Murray maintained that
changes in our society are increasing the value of intellectually demanding
occupations, relative to the value placed on less intellectually demanding
ones. For example, they would argue that in modern times the values to society
of computer-system designers and bank-portfolio managers have increased
relative to the values of bookkeepers and tellers. They are not the only
ones to have made this observation. Secretary of Labor Robert Reich (1991)
has described the ascendancy of the "symbolic analyst," the person
whose expertise is in dealing with abstract models of the world rather than
dealing with it directly. The evidence for this trend is overwhelming, and
all indications are that it will be accelerated by technological changes
that are clearly on the horizon (Hunt 1995).
The trend has implications for economic investment in education. During
the 1960s and 1970s, and to a considerable extent today, special funds were
made available to deal with the "at risk" student, where there
was a greater expectation of educational failure. Much less was spent on
funding for gifted students. Herrnstein and Murray argue that this is a
poor investment policy, on the grounds that education produces a greater
added value for society when applied to the top student than when applied
to the bottom one. They also argue that because IQ is the driving force
in workplace success, and because little can be done to change it, little
can be done to change the situation at the bottom.
Given the evidence for increasing economic value for highly educated, skilled
workers, this is not unreasonable. A good case can be made for investing
more in the development of high-level skill than we do now. The United States
charges tuition to university students who, in other industrial countries,
would receive stipends as part of an effort to improve national human resources.
Two qualifications have to be added. One is that because of the nonlinearities
between intelligence and performance, as documented above, it is not clear
that the gains from the cultivation of high-level skills would be as great
as The Bell Curve suggests. The other is that because SES is positively
correlated with intelligence, emphasizing the development of upper-level
intellectual skills does tend to make the fortunate more fortunate. The
economic advantages of the investment have to be weighed against our society's
general disinclination to support the privileged.
When it comes to programs to improve cognition generally, there is little
room for argument. We need to increase competence at all levels because the
increasing technological nature of our society has both increased the opportunities
available to the capable and increased the penalties for not being able
to keep up. Consumer credit is a good example; new banking technologies
have provided the average citizen with opportunities for leveraged investment
that were previously open only to the wealthy. (This is what a credit card
is!) Managing the opportunity requires a good bit of sophistication, so
consumer debt is a problem. The cognitive skills needed to be a fully functional
member of our society are clearly on the rise. Once again, intelligence
is more closely linked to acquiring these skills than to exercising them
once they are acquired. Therefore investments that improve the efficiency
of training and education will have larger and larger payoffs as the technological
sophistication required to function in society increases.
Intellectual Resources in the Workforce
Facts about intelligence are relevant to policy in another area: the question
of how society should use those resources that it already has. Affirmative-action
programs are now on the political chopping block, and the question raised
by Herrnstein and Murray--Do they discriminate against the capable, and
thereby squander the nation's intellectual resources?--is squarely in front
of us.
From a narrow perspective, if the payoff for performance is highest at the
top end of intellectual demands, we should be zealous about ensuring that
the most demanding, generally best paid, jobs do in fact go to the most
competent. To the extent that IQ scores indicate who these people are, we
should pay a premium for intelligence. This policy, which Herrnstein and
Murray (and others) advocate, has an unfortunate side effect. At the present
time assignment of jobs solely on the basis of performance predictors, such
as skills tests, would result in marked underrepresentation of minorities
in high-level job classes. This, in itself, would create a costly division
in society, because the ethnic groups involved would understandably refuse
to accept this outcome as just.
The only way out of this situation is to make major investments in training
and education in the affected communities, so that the distribution of workforce
skills becomes more equitable across ethnic groups. There is also a good
deal of evidence that successful investment must include participation and
support by the minority communities themselves. Simply admitting more minority-group
members to present programs does not work. In fact, there is evidence that
some such efforts have amounted to certification that minority group members
have passed through an educational program without a concomitant emphasis
on performance. A recent survey of workplace skills showed that blacks with
graduate-school experience have, on the average, writing and computational
skills equivalent to whites who have only a community-college education
(Kirsch et al. 1993). The issue is the changing of skill levels, not certification
levels!
The Bell Curve leaves the impression that nothing can be done because
of immutable IQ differences. This position goes beyond the evidence. In
fact, Herrnstein and Murray admit that some educational improvement programs
that they regard as far too expensive to be feasible nationwide have been
effective. The decision about whether a program is "too expensive"
or not is a matter of political rather than scientific judgment.
As this essay has shown, our knowledge of intelligence has been extracted
from complex statistical relationships. Queen Victoria's Prime Minister,
Benjamin Disraeli, is reputed to have said, "There are lies, damned lies, and statistics."
What social policies are dictated by selected facts about intelligence depends
on who is doing the selecting. Besides, while social policies are certainly
constrained by scientific findings, it is seldom the case that findings
in the social sciences will dictate just one policy.
Variations in intelligence have always been with us. How important they
are depends on the technological level and social organization of society.
The "village idiot" was a stock figure in medieval and early industrial
stories. In pre-industrial days, though, an able-bodied person, living in
a tightly knit society where economic, extended family and social roles
merged, may have been able to be a contributing member of society. In fact,
in such societies most of the brighter members may not have been
able to divorce themselves from the problems of dealing with such individuals,
so that it was to their advantage to see that everyone could cope. This
probably became less true as agrarian societies were replaced by industrial
ones. Today we live in a society where economic roles dominate other roles,
where the extended family is reduced to an exchange of Christmas cards with
cousins (and even ex-spouses) and where the movers and shakers of society
can, indeed, afford to remove themselves from the moved and shaken. There
are fascinating questions here for those interested in the intersections
between sociology, economics, anthropology and cognitive psychology. We
do not have the answers yet. We may need them soon, for policy makers who
rely on Mokita are flying blind.
Acknowledgment
The author and his colleagues, David Waller and Derek Chung, are indebted
to Charles Murray for his cooperation in advising them as they re-examined
his data.
Bibliography
Ackerman, P. 1987. Individual differences in skill learning: An integration
of psychometric and information processing perspectives. Psychological Bulletin
102(1):3-27.
Campbell, J. P. 1990. An overview of the Army Selection and Classification
Project (Project A). Personnel Psychology 43:231-239.
Campbell, J. P., J. J. McHenry and L. L. Wise. 1990. Modeling job performance
in a population of jobs. Personnel Psychology 43:313-333.
Carpenter, P. A., M. A. Just and P. Shell. 1990. What one intelligence test
measures: A theoretical account of processing in the Raven Progressive Matrices
Test. Psychological Review 97(3):404-431.
Carroll, J. B. 1993. Human Cognitive Abilities. Cambridge: Cambridge
University Press.
Cattell, R. B. 1971. Abilities: Their Structure, Growth, and Action.
Boston: Houghton Mifflin.
Ceci, S. J., and M. Bruck. 1994. The bio-ecological theory of intelligence:
A developmental-contextual perspective. In Current Topics in Human Intelligence,
ed. D. K. Detterman. Volume 4: Theories of Intelligence. Norwood, N.J.:
Ablex.
Ceci, S. J., and J. Liker. 1986. Academic and nonacademic intelligence:
An experimental separation. In Practical Intelligence: Nature and Origins
of Competence in the Everyday World, ed. R. J. Sternberg and R. K. Wagner.
Cambridge: Cambridge University Press.
Detterman, D. K., and M. H. Daniel. 1989. Correlations of mental tests with
each other and with cognitive variables are highest in low IQ groups. Intelligence
13:349-360.
Ericsson, K. A., R. Th. Krampe and C. Tesch-Romer. 1993. The role of deliberate
practice in the acquisition of expert performance. Psychological Review
100(3):363-406.
Gardner, H. 1983. Frames of Mind: The Theory of Multiple Intelligences.
New York: Basic Books.
Gardner, H. 1993. Creating Minds. New York: Basic Books.
Gould, S. J. 1983. The Mismeasure of Man. New York: Basic Books.
Gustafsson, J. E. 1984. A unifying model for the structure of intellectual
abilities. Intelligence 8:179-203.
Herrnstein, R. J. 1973. I.Q. in the Meritocracy. Boston: Little,
Brown.
Herrnstein, R. J., and C. Murray. 1994. The Bell Curve: Intelligence
and Class Structure in American Life. New York: The Free Press.
Horn, J. L. 1985. Remodeling old models of intelligence. In Handbook
of Intelligence. Theories, Measurements, and Applications, ed. B. B.
Wolman. New York: Wiley. Pg. 267-300.
Horn, J. L., and J. Noll. 1994. A system for understanding cognitive capabilities:
A theory and the evidence on which it is based. In Current Topics in
Human Intelligence, ed. D. K. Detterman. Volume 4: Theories of Intelligence.
Norwood, N.J.: Ablex.
Hunt, E. 1995. Will We Be Smart Enough? A Cognitive Analysis of the Coming
Workforce. New York: Russell Sage Foundation.
Hunter, J. E. 1986. Cognitive ability, cognitive aptitudes, job knowledge,
and job performance. Journal of Vocational Behavior 29:340-362.
Jencks, C. 1992. Rethinking Social Policy: Race, Poverty, and the Underclass.
Cambridge, Mass.: Harvard University Press.
Jensen, A. R. 1969. How much can we boost IQ and scholastic achievement?
Harvard Educational Review 39(1):1-123.
Kirsch, I. S., J. Jungeblut, L. Jenkins and A. Kolstad. 1993. Adult Literacy
in America. Washington: National Center for Education Statistics.
Levidow, B. B. 1994. The effect of high school physics instruction on measures
of general knowledge and reasoning ability. Unpublished Ph.D. Dissertation,
U. of Washington.
Reich, R. 1991. The Work of Nations: Preparing Ourselves for 21st Century
Capitalism. New York: Knopf.
Scarr, S. In press. Behavior genetic and socialization theories of intelligence:
Truce and reconciliation. In Intelligence, Heredity, and Environment,
ed. R. J. Sternberg and E. L. Grigorenko. Cambridge: Cambridge University
Press.
Simonton, D. K. 1984. Genius, Creativity, and Leadership: Historiometric
Inquiries. Cambridge, Mass.: Harvard University Press.
Spearman, C. 1904. General intelligence, objectively determined and measured.
American Journal of Psychology 15:201-293.
Spearman, C. 1927. The Abilities of Man. London: MacMillan.
Thurstone, L. L. 1938. Primary Mental Abilities. Chicago: University
of Chicago Press.
Wigdor, A. K., and B. F. Green, Jr. 1991. Performance Assessment in the
Workplace. Washington: National Academy Press.
©American Scientist 1995