The Brain Project

Neural Networks and the Computational Brain

or
Matters relating to Artificial Intelligence
by Stephen Jones

Automata

The idea that we might be able to produce an artificial intelligence or perhaps even a conscious machine has had a long history. Harking back to the time of Descartes there was a great deal of activity in producing hydraulic automata for the pleasure gardens of the wealthy. These were hydraulic devices which, for example, might respond to a person stepping on a specially constructed flagstone in a garden pathway by triggering a cupid sculpture to spray water over that person.

Using hydraulic and clockwork models, many automata were produced emulating in some way the activities of animals or humans, and many of the mechanical devices of the 17th and 18th centuries echoed aspects of human and animal motion and behaviour. The mechanistic view of the world developed greatly, and natural philosophers felt that all human behaviour could be explained by mechanical models. In 1680 an Italian student of Galileo's, Giovanni Borelli, published De Motu Animalium (On the Motion of Animals), a study of the mechanical action of the muscles. In France in 1748 de la Mettrie's L'Homme Machine (Man a Machine) was published, in which he claimed that all human behaviour, including the mind, had a mechanical explanation. This work was burned as atheistic and is still considered by historians of science as unnecessarily extreme. [for example see C. Singer A Short History of Biology, 1931, p357] One should note that there was also a great deal of opposition in some academic quarters to this mechanistic view, which was expressed under the framework of 'vitalism'.

In the same period Vaucanson produced a number of quite successful toys which emulated some activity or another of an animal or bird. Sir David Brewster in his book Letters on Natural Magic provides a description of Vaucanson's duck:

It "exactly resembled the living animal in size and appearance. It executed accurately all its movements and gestures, it ate and drank with avidity, performed all the quick motions of the head and throat which are peculiar to the living animal, and like it, it muddled the water which it drank with its bill. It produced also the sound of quacking in the most natural manner. In the anatomical structure of the duck, the artist exhibited the highest skill. Every bone in the real duck had its representative in the automaton, and its wings were anatomically exact. Every cavity, apophysis, and curvature was imitated, and each bone executed its proper movements. When corn was thrown down before it, the duck stretched out its neck to pick it up, it swallowed it, digested it, and discharged it, in a digested condition. The process of digestion was effected by chemical solution, and not by trituration, and the food digested in the stomach was conveyed away by tubes to the place of its discharge." [Brewster, 1868, p321]
The possibility of the automaton has enticed engineers in the western world for many centuries, providing many an exhibit at fairs and expositions and featuring in tales and novels from the Golem to Frankenstein. The robot workers of Karel Capek's R.U.R. and Fritz Lang's Maria in Metropolis provide memorable 20th-century examples.

Neural Network theory & non-reducibility of brain operation to the neuron

Research into potential systems of artificial intelligence now looks to the brain for models, rather than looking to technology for ideas from which to model the brain. A number of scientists are approaching artificial intelligence from a developing understanding of the architecture of the human brain. This work is now represented in two interlocking disciplines: computational neurobiology, which involves understanding human and animal brains using computational models; and neural computing, which involves simulating and building machines to emulate the real brain. The analysis is made on two levels: coarse-grained, examining and elucidating networks of interacting subsystems, which is largely a neurophysiological activity; and fine-grained, building theories and models of actual artificial neural networks as subsystems.

By the 1940s enough work had been done on describing the behaviour of the neuron for psychologists and mathematicians to make a serious attempt at a mathematical theory of the neuron, both natural and artificial.

The artificial neuron

The original neural network was based on work by Warren McCulloch and Walter Pitts published in 1943. They built up a logical calculus of sequences of nerve connections based on the observation that a nerve's action potential fires in an all-or-none manner, and only if the threshold for that nerve has been exceeded.

They produced an artificial logical neuron network consisting of three kinds of neurons:

1. Receptor, afferent or input neurons which receive the impulse to fire from a sensor.
2. Central or inner neurons which are synapsed onto from receptor and other neurons and synapse onto output and other neurons.
3. Effector neurons which receive impulses from both inner neurons and directly from receptors.

They described a set of rules for the operation of the neurons:

1. Propagation delay is assumed to be constant for all neurons.
2. Neurons fire at discrete moments, not continuously.
3. Each synaptic output stage impinges onto only one synaptic input stage on a subsequent neuron.
4. Each neuron can have a number of input synaptic stages.
5. Synaptic input stages contribute to overcoming a threshold below which the neuron will not fire.

An artificial neuron is set up to fire at any time t if and only if (e - i) exceeds h,
where e is the number of excitatory synapses active onto it at time t, i is the number of inhibitory ones, and h is the firing threshold for that neuron.

[Figure: threshold formula]
[Figure: analog neuron]

Given a clearly defined set of input and output conditions it is possible to create an arbitrarily complex neural network from the three types of neurons, with appropriate thresholds at the various synapses of the network.
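
By way of illustration, here is a minimal sketch in Python of the threshold rule just described; the AND-gate example and all names are mine, not McCulloch and Pitts'.

```python
# A minimal sketch of the McCulloch-Pitts threshold rule described above.
# 'excitatory' and 'inhibitory' are counts of active synapses at time t,
# and 'threshold' is h for this neuron. Names are illustrative only.

def mp_neuron_fires(excitatory: int, inhibitory: int, threshold: int) -> bool:
    """Fire if and only if (e - i) exceeds the threshold h."""
    return (excitatory - inhibitory) > threshold

# A two-input AND gate built from one such neuron: both inputs must be
# active (2 - 0 = 2 > 1) for the neuron to fire.
for a in (0, 1):
    for b in (0, 1):
        print(a, b, mp_neuron_fires(a + b, 0, 1))
```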

Compared with biological neurons

McCulloch and Pitts suggested that this network might describe the functioning of a human nervous system as well as it describes an automaton. Nevertheless, the whole system is deterministic. The network is a scanning device which reads the input-to-output transform specification as if it were a dictionary: the 'meaning' of every possible input 'word' is determined by the dictionary of associated inputs and outputs in its repertoire.

"Given any finite dictionary of input stimuli and their associated meanings or output responses, we can...always make (on paper) a scanning device or neural network capable of consulting the dictionary and producing the listed meaning or response for each input 'word' denoting its associated stimulus." [Singh, 1965, p158]

The associations of input to output are altered by altering the pattern of interconnections between neurons of each layer.

This is really a look-up-table device using neurons to carry out logic hardware functions; all its inputs and outputs are predetermined, and for each set of possible inputs and interconnections there is a fixed result. Obviously human intelligence is not so fixed, and there will always be shortfalls in any strictly defined neural system. Active human neural systems learn and adapt to the culture in which they grow, so the McCulloch and Pitts neuron is inadequate to describe what is really going on, but networks starting at this level can be set up to learn and adapt.

Jagjit Singh in his textbook on information theory speaks of the potential behaviour repertoire of natural neural systems as being impossible to reduce adequately to unambiguous description:

"Whether any existing mode of behaviour such as that of the natural automata like the living brains of animals can really be put 'completely and unambiguously' into words is altogether a different matter...Consider, for instance, one specific function of the brain among the millions it performs during the course of its life, the visual identification of analogous geometrical patterns. Any attempt at an 'unambiguous and complete' verbal description of the general concept of analogy, the basis of our visual faculty, will inevitably be too long to be of much use for even drawing (on paper) neuron networks having the wide diversity of visual responses the natural automata normally exhibit as a matter of course. No one in our present state of knowledge dare hazard a guess whether such an enterprise would require thousands or millions or any larger number of volumes. Faced with such a descriptive avalanche, one is tempted to substitute the deed for the description, treating the connection pattern of the visual brain itself as the simplest definition or 'description' of the visual analogy principle." [Singh, 1965, pp171-2]

These neural networks are essentially digital, computer-like models with profound differences from real neural systems. For example, in real neural systems the pulse trains carrying quantitative sensory information seem to be coded in pulse-frequency-modulation form rather than as digital representations of number; also, real neural operations seem to be much more efficient in their depth of connection. That is, the number of layers of neurons (sensory input, processing, and output or efferent layers) is much smaller than appears necessary with artificial neural nets.

McCulloch and Pitts also spoke of neuron nets having circular interconnections in which "activity may be set up in a circuit and continue reverberating around it for an indefinite period of time, so that any realisable (result) may involve reference to past events of an indefinite degree of remoteness." [McCulloch & Pitts, 1943] thus producing a regenerative process which might be akin to learning and to memory.

In considering the differences between biological systems and automata von Neumann examined the problem of self-reproducing machines. He discerned that in systems below a certain level of complexity the product of those systems would always be less complex than the system itself, but with a sufficient degree of complexity the system can reproduce itself or even construct more complex entities.

"Since the physical basis of mindlike qualities resides in the patterns of organisation of biological materials occurring naturally in animals, there is no reason why similar qualities may not emerge (in the future) from patterns of organisation of similar or other materials specially rigged to exhibit those qualities." [Singh, 1965, p202].

One should note here that it is this claim, that 'mindlike qualities reside in the physical', which computational physiologists set out to demonstrate in this work exploring neural nets.

As Charles Sherrington has remarked,

"It is a far cry from an electrical reaction in the brain to suddenly seeing the world around one with all its distances, its colours and chiaroscuro." [Singh, p203]

and in Penfield's work of direct electrical stimulation of the exposed cortex, the patient

"is aware that he moved his hand when the electrode is applied to the proper motor area, but he is never deluded into the belief that he willed the action." [Singh, p204]

That is, there will be action co-ordinating or integrating centres 'above' the direct control networks. The stimulated and the willed movement are distinguished as having different antecedents. The complex systems of neural nets are organised hierarchically, with layers of processing nets projecting to higher "integrating" layers and so on up to the cortical planning and control layers. Many layers also use descending projections to control what they are being fed in the way of information. This prevents swamping and allows attention and concentration on particular processes.


W.Ross Ashby, Warren McCulloch, Grey Walter, and Norbert Wiener
at a meeting in Paris
(from Latil, P de: Thinking By Machine, 1956)

Coping with failures

John von Neumann, in attempting to produce a useful description of a reliable computing system, added the idea of redundancy to the neural network in order to bring it more into line with the inherent unreliability of the physiological neuron net. Redundancy is a matter of using several copies of the same device with their outputs going to a majority-decision device, so that if any one device fails the system still has enough functioning copies to keep going. He showed that by using many redundant components in a circuit one could make an automaton with an arbitrarily high degree of reliability.

If any particular neuron net of moderately reliable organs is triplicated and the results of the three sent to a majority-decision organ, then the latter will give a highly reliable result, equivalent to the result from a perfectly reliable single organ. But this leads to an uncontrollable proliferation of organs in something as complex as the brain. Von Neumann resolved this problem by increasing the number of input lines to each organ in the net, such that the misfiring of a small number of components cannot cause a failure of the whole automaton or network. The redundancy in the system is increased considerably in order to control errors.
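
As an illustration of the triplication scheme, here is a small Python sketch; the failure probability and trial count are arbitrary assumptions, not figures from von Neumann.

```python
import random

# A sketch of von Neumann-style redundancy: three unreliable copies of the
# same 'organ' feed a majority-decision organ. Failure probabilities are
# illustrative assumptions, not figures from the text.

def unreliable_organ(correct: bool, p_fail: float = 0.05) -> bool:
    """Return the correct answer, except with probability p_fail."""
    return (not correct) if random.random() < p_fail else correct

def majority(outputs):
    return sum(outputs) > len(outputs) // 2

correct = True
trials = 100_000
errors = sum(majority([unreliable_organ(correct) for _ in range(3)]) != correct
             for _ in range(trials))
print("error rate with triplication ~", errors / trials)  # roughly 3*p^2, far below p
```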

Coping with disturbances

The first thorough exploration of the behaviour of a mechanism in emulating living nervous systems was carried out by Ross Ashby in Great Britain and published in his book Design for a Brain in 1952. He was interested in the problem of how a dynamic system achieves a range of behaviours which may be said to show stability within the limits of survival for that dynamic system as well as adaptability to changes in the environment of that system.

When a system is perturbed by the occurrence of an input, the system's response to the perturbation will be determined by previous experience (training). If the response is not exactly appropriate to the input then an error will occur. This error could be catastrophic for the system, especially if the system is not particularly adaptive. In order for the system to adapt to a changing range of inputs it must be able to accommodate the errors. The incorporation of the difference between the actual response and the required response is known as feedback self-regulation.

Ashby built an electro-mechanical system employing a set of four pivoted magnets and an arrangement of electrical connections and impedances. The effects of the position of each magnet were routed to the other three magnets via a number of parameter-altering devices, viz. selection switches and motion constraints. With any change in the operating conditions the positions of the magnets would automatically shift until the originally specified condition of stability was re-established. This machine was the Homeostat, and it demonstrated an operating procedure which he called ultrastability.

Ashby's Homeostat (from Ashby, W.R: Design for a Brain, 1952)

Detail of magnet and coil from Homeostat

Ashby used feedback in this self-regulating mechanical system, the Homeostat, so that it would reach a stable state no matter how serious the perturbation of the inputs. Ultrastability is the capacity of a system to reach a stable state under a variety of environmental conditions. But the probability of stability being achieved decreases steadily as the system becomes more complicated: a large and complex system is very much more likely to be unstable.
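
The flavour of ultrastability can be sketched in a few lines of Python. This is a toy, not a model of the Homeostat itself: the linear dynamics, the bounds and the random re-selection of parameters are all assumptions made purely for illustration.

```python
import random

# Toy sketch of Ashby's ultrastability (not a model of the actual Homeostat):
# a variable x is driven by feedback parameters; whenever x leaves its
# 'essential' bounds the parameters are re-selected at random, and the
# search stops only when the system settles. All dynamics are illustrative.

def settles(params, x=1.0, steps=200, limit=10.0):
    a, b = params
    for _ in range(steps):
        x = a * x + b          # simple linear feedback dynamics (assumed)
        if abs(x) > limit:     # essential variable out of bounds
            return False
    return True

random.seed(0)
params = (random.uniform(-2, 2), random.uniform(-1, 1))
attempts = 0
while not settles(params):             # random step-change of parameters,
    params = (random.uniform(-2, 2),   # loosely as the Homeostat's selectors did
              random.uniform(-1, 1))
    attempts += 1
print("stable parameters", params, "found after", attempts, "random changes")
```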

Ashby showed that within a large complex system, when any input disturbance is handled by only a small subset of the full array of subsystems, that disturbance will not affect the stability of the overall system but will, in fact, be easily accommodated by it. If the subsets of input-handling devices are different for differing ranges of input disturbances, then the behaviours of the system will be spread over the system and no one input disturbance can take over, disturb or cause the failure of the full system. Ashby calls this the "dispersion of behaviour", i.e. responses to a range of inputs may be said to be "dispersed" over the system. Within Ashby's framework each of these input-handling subsystems will be ultrastable and the complex will be multistable.

Obviously this is what our brains do with our various modes of sensory faculties. Ashby's theory is applicable to living animals: the entire array of possible environmental disturbances is grouped or dispersed into separate sensory systems, and these systems are specialised to filter out all but a very specific subset of inputs. An animal is thus built up of a number of ultrastable subsystems in a dispersed organisation. The animal's behavioural adaptation to new stimuli will reach appropriate responses much more quickly than in a single ultrastable system which had to generate responses to the full array of possible input or perturbing conditions. This sort of behaviour occurs similarly between the system and its environment and between subsystems within the multistable system. Ashby is suggesting that adaptive and goal-seeking behaviour in animals is handled via this principle of multistability (a system of ultrastable devices in a dispersed-behaviour organisation).

The adjustable synapse

Returning to neural nets: the McCulloch and Pitts neuron had a fixed threshold, so McCulloch developed a model with a variable threshold, which in a network provided a means of changing the internal organisation of the network, making it better able to adjust itself to a changing environment.

D.O. Hebb suggested a principle for the modification of synaptic strength:

"when an axon of cell A is near enough to excite a cell B and repeatedly or persistently takes part in firing it, some growth process or metabolic change takes place in one or both cells such that A's efficiency as one of the cells firing B is increased." [Hebb, 1949]

The idea of an adjustable synapse allows an artificial neuron network to go beyond the process of simply making decisions based on a look-up table or the execution of a set of logical rules as in an ordinary computer. The network can tailor its response to, or "interpret", its input by adjusting the weighting each synapse contributes towards a threshold, so that new responses can be made to variations in the input conditions. If the actual output of the network is compared with the desired output, then an error value can be determined which can be incorporated into the weighting of the synapse.

There is an almost biological principle of adaptation to conditions in operation here: the internal organisation of the learning machine can be altered according to the 'feedback' of an error value from the output or result of the process. This is the principle embodied in F. Rosenblatt's Perceptron.

The Perceptron consists of a net of sensor units feeding a set of association units, which feed one or more response units. If the sensor units feed enough 'yes' votes to the association unit to which they are mapped to exceed the threshold of that association unit, then it will be excited or 'fire'. When enough association units fire to exceed the threshold of the response unit to which they are mapped, then the response unit will fire. If the result is correct then the thresholds of the response units are left as they are, but if the result is incorrect then the thresholds of the response units are modified. This process is iterated enough times for the response unit to give a correct response to the input of the whole Perceptron system. Thus the Perceptron is said to be 'trainable'. The output of the network is adjusted by altering the weighting, or value contributed, by each connection.

"In sum, the essence of the training scheme is to reduce selectively and iteratively the influence of active inner units which tend to give unwanted response, and enhance the influence of those that tend to give the desired response...after a certain number of trials a stage is reached when no further adjustment of weights of inputs into a response unit is required to secure correct identification by the machine in all subsequent presentations of picture patterns to it." [Singh, p228-9]

Mismatch

The next problem for neural nets, particularly for pattern-recognition machines, is to allow for some level of mismatch, i.e. recognition on the basis of similarity. Humans handle the similarity problem with ease, but a machine, especially an essentially classificatory machine, will have to go through a wide range of image transformation and generalisation. Image transformation employs shifting, rotation and scaling, and in neural net systems is very neuron-intensive, far too much so for it to be a viable model.

Pattern recognition by classification, and, with the inclusion of probability factors, by similarity, is massively neuron-intensive given the huge number of possibilities that any system might encounter. But if the system starts in a massively overconnected way, thus being provided with many more options than will be needed once trained, and abandons connections which are not used, then the neuron count can be kept down considerably once trained. But this arrangement would suffer from not being very flexible and from being unable to account for new variations not encountered during training. The human brain is incredibly flexible and able to accommodate novelty, which none of the standard feedforward neural net models are able to do.

There is a model developed by Uttley for a kind of machine based on conditional rarity, designed on classification and conditional-probability principles with vast overconnection. This model does achieve the necessary economy of units, using chance connections. Because of the overconnection there is initially ambiguity of discrimination, with units failing to a high degree to recognise their associated input representations. As information is accumulated, those connections which carry little information become less effective until they are disconnected. Ambiguity is then eliminated and the system learns to discriminate, meeting some of the physiological and psychological facts rather well. This model has considerable similarity to what occurs in the maturation of an infant's brain, which is massively oversupplied with neurons, millions of which die off as it matures.

Simulation of biological neural nets

The strength of interconnections in a network of neurons determines how the network will respond as a whole to a particular input; "the pattern of connection strengths represents what the network knows." (Ferry, 1987, p55) The connections are bidirectional allowing for feedback circuits.

It now seems generally accepted that the brain's power arises spontaneously from huge numbers of highly interconnected neurons processing information in parallel. These neuronal assemblies are defined as groups of 'neurons that are simultaneously active in response to a certain input'. But it is incredibly difficult to study these assemblies physiologically: getting enough electrodes into a small enough space to study enough neurons is currently out of the question, so finding an assembly and then showing its synchronised operation and its spread is extremely hard.

The neural net approach developed by McCulloch and Pitts was a hardware approach, or was carried out on paper in mathematical and logical procedures. For a long time this work received very little attention, and only in the last fifteen years has a simulation approach emerged which models in the computer the interconnections and interactions and simulates the activity of the nerve 'assembly' or neural network. Given the massive number of synapses onto any one nerve, some excitatory and some inhibitory, and given the modulations of neurotransmitters across those synapses, the triggering of that nerve depends on the summation of all those inputs exceeding its threshold. A cell's response might be graded according to the strength of the overall inputs, or it might be an all-or-nothing response based on a threshold. All the outputs might be feedforward processes, or some might feed back into layers of cells preceding the layer of the cell being considered, thus controlling its behaviour.

These systems of simulated neural nets can exhibit learning when based on D.O. Hebb's rule for learning, developed in 1949, such that:

"the connections between cells that are active at the same time will be strengthened, increasing the probability that the first cell will excite the second cell in the future. Connections between cells whose activity is not synchronised will be weakened. Synchronised patterns of firing that occur repeatedly will eventually become stable representations (or memories) of the inputs that give rise to them, and can be reactivated by only partial inputs." [Ferry, 1987, p56].

Other aspects of brain-function modelling being explored include the question of how to maximise the number of representations that a given network can hold. One obvious answer is to maximise the number of connections; it may be that every neuron in the brain is only a very few neurons away from every other (in the realm of five neurons distant). But then how do these overlapping assemblies operate without interfering with each other? Possibly some form of negative feedback is involved in preventing an active assembly from becoming too large.

In a neural network simulation system each neuron takes account of the weightings of the outputs of its upstream neighbours to determine its own state. Of course, it is not that the neuron somehow decides of itself to look at what its neighbours are doing; this connectivity is already in place and is, so to speak, woken up by use. Each neuron exists in a net of similar neurons and they are wired up, axon to dendrite, synaptically connected, forming self-programming processing subsystems. These networks are inherently robust, answering von Neumann's earlier call for reliability in artificial computing systems: if some neurons malfunction the overall function of the network is not affected.

Information is encoded in the neural connections rather than in separate memory elements, as unique patterns of interconnection. The system also learns 'spontaneously' because it alters the strength of particular interconnections according to the repetition of use of those interconnections and the array of possible experiences provided and possible solutions arrived at, i.e. training. In computer terms it might be thought of as 'self-programming'.

This training process (the alteration of weightings on particular synapses in the network) can also be achieved by a recurrent or feedback network, in which an error weighting is generated by comparing the actual output with the desired output and is then fed back into the weightings of the synapses of the processing layer of neurons.

Summary

To rehash the neural net idea I want to quote from Tank and Hopfield's article in the December 1987 issue of Scientific American.

"A biological neuron receives information from other neurons through synaptic connections and passes on signals to as many as a thousand other neurons. The synapse, or connection between neurons, mediates the 'strength' with which a signal crosses from one neuron to another. Both the simplified biological model and the artificial network share a common mathematical formulation as a dynamical system - a system of several interacting parts whose state evolves continuously with time. Computational behaviour is a collective property that results from having many computing elements act on one another in a richly interconnected system. The overall progress of the computation is determined not by step-by-step instructions but by the rich structure of connections between computing devices. Instead of advancing and then restoring the computational path at discrete intervals, the circuit channels or focuses it in one continuous process." [Tank & Hopfield, 1987, pp62-63]

One might liken this activity to the human process of consensus decision making, where a problem is discussed until everyone involved knows enough about it for a decision to evolve from the range of opinions held by individual members of the group. The 'computational surface' of the nodes in a neural network shifts according to the weightings of each node. The weightings alter with training, i.e. through exposure to examples of the kinds of problems to be encountered by the particular network, and the solutions develop in the form of a kind of best fit. "The network carries out the computation by following a trajectory down the computational surface. In the final configuration the circuit usually settles in the deepest valley" to find the best solution. [Tank & Hopfield, 1987, p67]. This approach is good for perceptual problems and for modelling associative memory. Using Hebbian synapses we can develop a model for learning: "synapses linking pairs of neurons that are simultaneously active become stronger, thereby reinforcing those pathways in the brain that are excited by specific experiences. As in our associative-memory model, this involves local instead of global changes in the connections." [Tank & Hopfield, 1987]
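
The 'settling into the deepest valley' that Tank and Hopfield describe can be sketched as a small associative memory in Python; the network size and the stored patterns here are invented purely for illustration.

```python
# Sketch of a Hopfield-style associative memory: patterns are stored as
# Hebbian connection strengths, and a noisy or partial input relaxes toward
# the nearest stored pattern (the 'deepest valley'). Patterns are invented.

def store(patterns, n):
    w = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    w[i][j] += p[i] * p[j] / n   # Hebbian outer product
    return w

def recall(w, state, steps=5):
    n = len(state)
    for _ in range(steps):
        for i in range(n):                        # asynchronous updates
            total = sum(w[i][j] * state[j] for j in range(n))
            state[i] = 1 if total >= 0 else -1
    return state

stored = [[1, 1, 1, 1, -1, -1, -1, -1], [1, -1, 1, -1, 1, -1, 1, -1]]
w = store(stored, 8)
cue = [1, 1, 1, -1, -1, -1, -1, -1]               # corrupted version of pattern 0
print(recall(w, cue))                             # settles on the stored pattern
```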

Modelling real networks

Computational neuroscience is one of the major areas of investigation into what it is that brings about consciousness in what we know to be the extraordinarily complex but highly organised networks of neurons in the human brain. At the Tucson II conference Paul Churchland, of the University of California at San Diego, asked:
"Have we advanced our theoretical understanding of how cognition arises in the brain?
Yes: Through artificial neural networks that display

a) learning from experience,
b) perceptual discrimination of inarticulable features,
c) development of a hierarchy of categories or framework of concepts,
d) spontaneous inductive inference in accordance with past experience ("vector completion").
e) Sensorimotor coordination between sensory inputs and motor outputs
f) short term memory with information selective decay time
h) variable focus of attention

[from a slide in Churchland's presentation at Tucson II]

Churchland took us briefly through (a "cartoon version" of) the visual system.

A pattern of light: "... a representation on your retina is transformed by going through a trillion little synapses into a new pattern at the LGN that is projected forward to V1 (the primary visual cortex), (where it is) transformed by another population of synaptic connections into a third pattern. It rather looks like the basic mode of representation in the brain is in patterns of activation across populations of neurons, and the basic mode of computation is transformation from one pattern to another pattern, to another pattern. Transformations which co-opt relevant kinds of information. Information that is relevant to its day-to-day behavior." [from Churchland's presentation at Tucson II]

This is very similar to the structure of a neural net (of course, given that they were designed from actual neurons).

One of the primary problems being used in neural network development is that of face recognition, i.e. attaching a name to the face. Churchland presented work done by Garrison Cottrell's group at the University of California at San Diego, using a feedforward neural network having 80 cells in the inner layer, which did a pretty good job of the basic face-recognition task. He then mentioned some of the ways in which it failed and how one might deal with these failures using recurrent or feedback weighting of the connections in the network, and discussed how this relates to some aspects of consciousness such as short-term memory.
Churchland's description of the face-recognition network: the input layer is made up of 64 x 64 neurons (4096 neurons), consisting of photocells having a photograph of a face projected onto them ("being stimulated to an appropriate level of brightness"). The middle layer consisted of eighty cells. The output layer had eight cells, giving 8 bits to identify: face/non-face; male; female; "name" (5 bits). The network was trained up on about 11 faces and a small number of non-faces, with a number of examples of each face, and it did very well on the three kinds of distinguishing it had to do. When shown a test set of novel faces it did about 10-15% less well than it did on the learned set. Still a remarkable performance, and not far down on our own sort of performance.
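
The architecture Churchland describes can be sketched as a plain feedforward pass in Python. The weights below are random and untrained, so this is an illustration of the layer sizes only, not Cottrell's actual network, and the sigmoid units and weight ranges are my assumptions.

```python
import random, math

# Sketch of a feedforward pass through a network with the layer sizes
# Churchland describes (4096 inputs, 80 inner units, 8 output units).
# Weights are random and untrained: an architectural illustration only,
# not Cottrell's actual network.

random.seed(1)

def layer(n_in, n_out):
    return [[random.uniform(-0.05, 0.05) for _ in range(n_in)] for _ in range(n_out)]

def forward(weights, inputs):
    sigmoid = lambda z: 1.0 / (1.0 + math.exp(-z))
    return [sigmoid(sum(w * x for w, x in zip(row, inputs))) for row in weights]

w_hidden = layer(64 * 64, 80)        # retina-like input layer -> 80 inner cells
w_output = layer(80, 8)              # 80 inner cells -> 8 output bits
image = [random.random() for _ in range(64 * 64)]   # stand-in for a face image
outputs = forward(w_output, forward(w_hidden, image))
print([round(o) for o in outputs])   # face/non-face, male, female, 5-bit "name"
```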

But this network cannot discriminate ambiguous images (like the duck-rabbit illusion). To paraphrase Churchland: what a feedforward neural network does is embody an input/output function, with a unique output for every different input. To achieve something like the handling of ambiguity we need something more than feedforward networks. So he introduces "recurrent pathways" which bring contextual information from the rest of the system of the brain and feed it back into the network. This allows the network to "modulate its own responses to perceptual input". These recurrent pathways are the channels for the feedback information which we have discussed above. For example, there are a very large number of descending pathways from the visual cortex back to the LGN, more than there are projecting from the LGN up to the visual cortex.

Recurrent pathways were originally introduced into neural nets as a form of short-term memory. They also provide a level of directability and handling of ambiguity, as well as answering some of the other desiderata for a theory of consciousness. In the brain, the best candidate for a neural correlate of consciousness is the thalamo-cortical system, which is a massive recurrent network centring on the thalamus (see Newman and Baars on the thalamo-cortical system).

Artificial Intelligence

Of course, all this neural network work has another intention besides elucidating what it is in humans, or biological systems in general, that produces perception, attention and consciousness, and that is: is it possible to build an artificial consciousness, a silicon system which might display some kind of consciousness? And even more importantly, how do we test the machine we have built to see if it really is conscious?

Two tasks had to be carried through before real computing machines, let alone intelligent machines or AIs, could be developed. First was the software problem: developing systems of algorithms for problem solving. Second was the hardware problem: a theoretical machine had to be produced which could deal with these algorithmic systems and treat them as instructions for its operation.

Dealing with the hardware problem first:
Alan Turing is the name most associated with the development of electronic computers. He showed that it would be possible to build universal, or reprogrammable, computing machines by developing an abstract version which became known as the Turing Machine. In this device a machine reads a paper tape on which symbols or spaces are written. These symbols tell the machine what to do next after reading the symbol: the symbols may be erased or new symbols may be written, and then the tape is stepped on to the next symbol. This provided a conceptual base for the development of machines which could be supplied with lists of instructions on how to solve a problem, proceed as instructed, and generate a solution to that problem, at which point the machine would halt. He also showed that some mathematical problems were not 'computable', i.e. were unable to bring the universal machine to a halt, and were therefore not amenable to solution using an algorithmic process. [An algorithm is a fixed set of instructions which, if carried out correctly, will solve the problem it is designed to solve, no matter how much time that might take.]
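
A minimal Python sketch of such a tape-reading machine follows; the rule table is invented for illustration and simply flips the bits on the tape before halting.

```python
# A minimal sketch of the tape-reading machine described above: a table of
# rules maps (state, symbol) to (write, move, next state), and the machine
# halts when it reaches a halting state. The example rules simply flip the
# bits on the tape and are purely illustrative.

def run(tape, rules, state="start", head=0, max_steps=1000):
    tape = dict(enumerate(tape))                  # sparse tape of symbols
    for _ in range(max_steps):
        if state == "halt":
            break
        symbol = tape.get(head, " ")              # a blank square reads as a space
        write, move, state = rules[(state, symbol)]
        tape[head] = write                        # erase/write a symbol
        head += 1 if move == "R" else -1          # step the tape on
    return "".join(tape[i] for i in sorted(tape))

rules = {
    ("start", "0"): ("1", "R", "start"),
    ("start", "1"): ("0", "R", "start"),
    ("start", " "): (" ", "R", "halt"),
}
print(run("01101", rules))   # -> "10010 "
```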

His other great contribution was the development of the Turing Test. Turing considered that a universal problem-solving machine was something which could be constructed in any of a variety of materials. In the 19th century Charles Babbage had proposed an 'analytical engine' which would have been constructed from mechanical components and driven by steam. Turing was building machines using electrical components, but he considered that in principle it wouldn't matter what the machine was built of so long as the operating principles were of the nature of his universal machine. So a flesh-and-blood machine would be possible, and given that the existing flesh-and-blood machines were, more or less, intelligent, then a machine built in other materials might also be intelligent. But how would you know whether this possible machine, once built, was truly intelligent? Thus the Turing Test: a machine and a human are placed in a room, with an interrogator outside the room. They communicate by some agency, e.g. a teletype machine, which doesn't reveal clues such as voice quality. The interrogator asks questions of the two in the room, and if the interrogator is unable to tell which of the two is the source of the answers then the machine can be said to be intelligent. (Or, of course, the human can be said to be dumb, which is a comment that Jaron Lanier made in his presentation at Tucson II.) So the idea is that if a machine can be made to behave in a manner indistinguishable from a human then that machine should be described as intelligent.

So this is the prime test that an artificial intelligence would have to pass.

The other problem was the software issue:
The software task for the putative designers of artificially intelligent machines was the production of a mathematical system which reduced reasoning to a mechanical process, first in arithmetic and then in more generalised systems of reasoning. So mathematicians of the second quarter of the 20th century spent a good deal of effort trying to develop formal logical systems which could provide general problem-solving algorithms. These were to provide a basis for a consistent theory of arithmetic (known as number theory) and would later be employed in programming computers. But the mathematician Kurt Goedel, born in Brno in what is now the Czech Republic, caused a considerable upset when he showed that any consistent formal system rich enough to express arithmetic was necessarily incomplete. This is known as Goedel's theorem, or Goedel's Incompleteness theorem, and works like this:

In any consistent formal system S capable of expressing arithmetic, a proposition can be constructed which denies the provability of that very proposition within the system; i.e. the statement "this statement cannot be proven within S" can be expressed within S. If S could prove this proposition, then S would be proving a statement which is thereby false, and S would be inconsistent; since S is assumed to be consistent, the proposition cannot be proved within S, and so what it asserts is true. There is therefore a true proposition of S which S cannot prove, and thus no such formal system of propositions can be complete.

This result has quite extraordinary consequences, but these have been very differently interpreted by various people. The first interpretation has been that it provides an avenue for the existence of free will in the world. Another interpretation, which is slightly more relevant to our discussion here, is the suggestion that no computing machine will be capable of becoming intelligent in the way that humans are, because the formal algorithmic systems, i.e. the systems of programs, that a computer is constructed with can never provide the sort of mathematical 'understanding' or 'truth-finding' capabilities that humans have. But this is to take a very narrow and simplistic view of the ways of designing machines, as well as of intelligence.

Now, I have another view, divergent from this, which is that what the Goedel result indicates is that to consider intelligent machines as being restricted to formal mathematical systems of propositions, that is to algorithmic programming, is to severely misunderstand both human intelligence and the implications of incompleteness. The point is this: human intelligence is generative, i.e. it is capable of constantly producing new ideas, new sentences of language, new creations of art, new musical productions, and so on, and this is because an intelligent system is necessarily incomplete. It is in this idea that the possibilities for intelligent artificial constructions lie. A machine which is capable of passing the Turing Test must not only be able to pass a maths exam; it must also be able to make up a new story about the neighbours, or worse, tell a lie about itself. That is, it must be generative, always able to produce new sentences about itself and any other content of its system or its context.

There were two presentations at the Tucson conference which used an idea similar to this.

One was from Steve Thaler of Imagination Engines, Inc., in which he introduced a neural net construction in which the standard input stimuli are removed and replaced by various kinds of noise generation within the network cavity itself. The "internal noise...is shown to produce a temporal distribution of distinct network activations identical to that in human cognition" [Thaler, abstract of Tucson II presentation; and see Holmes, 1996]. The nets he uses generate new versions of existing structures by stochastically altering the weights of the internal connections of the net. Many of the results will be useless, but some will be useful, and these can easily be selected out. For example, Thaler uses his neural nets to develop everything from new designs for motor cars to composing all possible pop songs.
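
The general idea, though emphatically not Thaler's actual system, can be sketched as stochastic perturbation of a trained net's weights followed by a selection filter; every number and the 'usefulness' test below are invented.

```python
import random

# A loose sketch of the general idea described above (not Thaler's actual
# system): take a network's connection weights, perturb them stochastically
# in place of external input, and keep only those generated variants that
# pass some usefulness filter. All numbers and the filter are invented.

random.seed(2)
trained_weights = [0.8, -0.3, 0.5, 0.1]          # stand-in for a trained net

def generate(weights, noise=0.2):
    return [w + random.gauss(0, noise) for w in weights]

def useful(candidate):
    return sum(candidate) > 1.5                   # illustrative selection rule

candidates = [generate(trained_weights) for _ in range(1000)]
keepers = [c for c in candidates if useful(c)]
print(len(keepers), "of 1000 perturbed variants passed the filter")
```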

The other was by Daniel Hillis of Thinking Machines Corporation, the developers of massively parallel computers. Hillis spoke about his attempt to use evolutionary techniques to simulate the design of a very complexly organised machine.

In his presentation to Tucson II Hillis outlined a number of the current arguments against intelligent machines, based on the failure of algorithmic computation to produce true intelligence. He then presented a technique of evolving machines which, more or less randomly, may or may not come near to solving the problem being set. The most successful of these machines are then selected out and 'married' by a sexual combination process, and new machines are produced which are tested, and so on around the cycle for many thousands of generations. As he says, the resulting circuit diagrams may be impossible to read, but they are the most efficient machines ever produced for solving the particular problem.
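
A bare-bones Python sketch of this evolutionary cycle follows, using an invented bit-string matching problem in place of Hillis's circuit designs; the population size, mutation rate and fitness function are all assumptions.

```python
import random

# A bare-bones sketch of the evolutionary cycle described above: score a
# population of candidate 'machines', select the fittest, combine pairs of
# them (crossover) with some mutation, and repeat over many generations.
# The bit-string target problem is purely illustrative.

random.seed(3)
TARGET = [1] * 20

def fitness(genome):
    return sum(g == t for g, t in zip(genome, TARGET))

def crossover(a, b):
    cut = random.randrange(1, len(a))
    return a[:cut] + b[cut:]

def mutate(genome, rate=0.02):
    return [1 - g if random.random() < rate else g for g in genome]

population = [[random.randint(0, 1) for _ in range(20)] for _ in range(50)]
for generation in range(200):
    population.sort(key=fitness, reverse=True)
    if fitness(population[0]) == len(TARGET):
        break
    parents = population[:10]                     # select the most successful
    population = [mutate(crossover(random.choice(parents), random.choice(parents)))
                  for _ in range(50)]
print("best fitness after", generation, "generations:",
      fitness(max(population, key=fitness)))
```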

Finally, I'll give John Searle the last word on "Can a computer be conscious?". He says that if you define a machine as a physical system that performs certain functions then our brains are machines: "I am a machine that thinks...The question: can a computer think? obviously has a trivial answer, Yes. My brain is a computer. Listen 2 + 2 = 4, that's a computation. So I am a computer that thinks." But conversely, computation as defined by Turing (as symbol manipulation) does not constitute thinking or consciousness. [Searle, presentation to Tucson II]

References

Ashby, W.R. (1952) Design for a Brain. Wiley

Brewster, Sir D. (1868) Letters on Natural Magic. W. Tegg.

Churchland, P.M. (1988) Matter and Consciousness. MIT Press

Churchland, P.M. (1989) A Neurocomputational Perspective: The Nature of Mind and the Structure of Science. MIT Press

Churchland, P.M. (1995) The Engine of Reason, the Seat of the Soul: A Philosophical Journey into the Brain. MIT Press

Ferry, G. (1987) "Networks on the Brain". New Scientist. 16 July 1987, pp54-58.

Goedel, K. (1931) "On Formally Undecidable Propositions of Principia Mathematica and Related Systems I" in Davis, M.(ed) (1965) The Undecidable. Raven Press

Hebb, D.O. (1949) The Organisation of Behavior Wiley

Hillis, W.D. (1985) The Connection Machine. MIT Press

Hillis, W.D. (1985) "The Connection Machine". in Scientific American, June, 1987, pp86-93

Holmes, B. (1996) "The Creativity Machine". New Scientist. 20 Jan.1996, pp22-26.

McCulloch, W.S. and Pitts, W.H. (1943) "A Logical Calculus of the Ideas Immanent in Nervous Activity", reprinted in McCulloch, W.S. (1965) Embodiments of Mind. MIT Press.

Searle, J. (1992) The Rediscovery of Mind. MIT Press

Singh, J. (1965) Great Ideas in Information Theory. Dover.

Tank, D.W. & Hopfield, J.J. (1987) "Collective Computation in Neuronlike Circuits", Scientific American, Dec 1987, pp62-70.

Turing, A. (1950) "Computing Machinery and Intelligence" in Mind, 59

von Neumann, J. (1958) The Computer and the Brain. Yale University Press.

von Neumann, J. (1966) Theory of Self-Reproducing Automata. University of Illinois Press.
