AN OVERVIEW OF CONNECTIONISM & LANGUAGE ACQUISITION

Brain-style computation began to be explored in the 1940s by McCulloch, Pitts and Hebb. Early work in artificial intelligence was carried out within two competing models: parallel, brain-like systems versus symbol processing on von Neumann machines. Initially there were no learning procedures for multilayer networks, and single-layer networks were quite limited in the functions they could approximate. Partly for these reasons, the symbol-processing approach became the dominant view. In the 1980s there was a resurgence of interest in neural networks, driven by the popularization of the back-propagation learning algorithm for multilayer networks. A multilayer network can in theory approximate any continuous function if given enough hidden units.

The appeal of connectionism lies in the fact that it is, at bottom, a brain metaphor. The basic units of a connectionist model are simple computing devices (the neurons of the system) organised in a massively interconnected network that corresponds to our current understanding of brain structure. Understanding how a connectionist model works therefore requires an understanding of how signals are transmitted at the microscopic level of the brain. An adult human brain weighs about 1.5 kilos and is composed of perhaps one trillion cells, of which between ten and one hundred billion are neurons, organised into a massively interconnected network. Each neuron has:

I. a nucleus
II. a number of dendrites, which receive chemical input from other neurons
III. an axon, whose branches emit chemical output to be taken up by the dendrites of connected neurons.

Neuronal input/output takes the form of chemical neurotransmitters. These are stored in vesicles at the axon terminals and released into the synaptic cleft (synapse) between neurons when a neuron has received sufficient input to cause it to fire. The neurotransmitters are collected from the synapse by receptors on the dendrites of the receiving neuron. Neurotransmitters not taken up in this way by the dendrites of neighbouring neurons may be reabsorbed into the axon, or may remain in the synapse for some time.
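
The idea that a unit fires once it has received sufficient input, and the earlier point that a single layer of such units is limited while a multilayer network is not, can be illustrated with a small sketch. This is only an illustration in the spirit of McCulloch and Pitts' threshold units: the function names, weights and thresholds below are chosen by hand for the example and are not taken from any of the models discussed here. Exclusive-or (XOR) is a standard case that no single threshold unit can compute, but two layers can.

```python
def threshold_unit(inputs, weights, threshold):
    """Fire (return 1) when the weighted sum of the inputs reaches the threshold."""
    total = sum(x * w for x, w in zip(inputs, weights))
    return 1 if total >= threshold else 0


def xor_two_layer(a, b):
    """Compute XOR with a hidden layer of two threshold units (hand-set weights)."""
    # Hidden layer: one unit detects "a OR b", the other detects "a AND b".
    h_or = threshold_unit([a, b], [1, 1], 1)
    h_and = threshold_unit([a, b], [1, 1], 2)
    # Output unit: excited by the OR unit, inhibited by the AND unit.
    return threshold_unit([h_or, h_and], [1, -1], 1)


xor_table = [xor_two_layer(a, b) for a in (0, 1) for b in (0, 1)]
print(xor_table)  # [0, 1, 1, 0]
```

The negative weight on the second hidden unit plays the role of an inhibitory connection, and the positive weights play the role of excitatory ones.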


Characteristics of Biological Networks
o Human brains contain a massive number of neurons (10^10 to 10^11)
o These neurons are densely interconnected
o Connections can be excitatory or inhibitory
o Learning involves modifying synapses
o Elimination and addition of connections can occur

Characteristics of Connectionist Networks
o Neurally inspired: slow and parallel, highly interconnected, learning is
done by changing the strengths of connections, processing is
distributed and decentralized
o Neuron is the basic processing unit
o Configuration of connections is the analog of a program
o Local computation produces global behaviour
o Long term memory is in the strength of the connections (i.e. the weights)
o Short term memory is in the pattern of activity
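
Two of the points in the list above, that learning is done by changing the strengths of connections and that long-term memory resides in the weights while short-term memory is the current pattern of activity, can be made concrete with a minimal sketch. It assumes a single threshold unit trained with a simple error-correction (perceptron-style) rule on the logical OR function. The rule, the learning rate and the training data are assumptions chosen for illustration; this is not back propagation, and it is not the procedure used in any of the models described in this overview.

```python
def step(total):
    """Activity of the unit: 1 if the net input is positive, otherwise 0."""
    return 1 if total > 0 else 0


def net_input(weights, bias, inputs):
    return sum(w * x for w, x in zip(weights, inputs)) + bias


def train_unit(examples, epochs=20, rate=0.1):
    """Adjust connection strengths whenever the unit's activity is wrong."""
    weights = [0.0, 0.0]  # long-term memory: persists across inputs
    bias = 0.0
    for _ in range(epochs):
        for inputs, target in examples:
            activity = step(net_input(weights, bias, inputs))  # short-term state
            error = target - activity
            weights = [w + rate * error * x for w, x in zip(weights, inputs)]
            bias += rate * error
    return weights, bias


or_examples = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
weights, bias = train_unit(or_examples)
learned = [step(net_input(weights, bias, inputs)) for inputs, _ in or_examples]
print(learned)  # [0, 1, 1, 1]
```

After training, nothing about the OR function is stored anywhere except in the two weights and the bias. The activity value is recomputed for every input and then discarded.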

SAMPLES OF WHAT NEURAL NETWORKS CAN DO

Connectionists believe that neural networks can be used to master cognitive tasks and to explain certain aspects of child language acquisition. The following are three well-known experiments involving connectionist models and language output. The models are trained to reproduce certain grammatical aspects of speech and are claimed to approximate child language acquisition.

The first model, created by Sejnowski and Rosenberg (1987), is NETtalk, a net that can read English text aloud. The training set for NETtalk was a large database of English text coupled with its corresponding phonetic output, written in a code suitable for use with a speech synthesizer. At first the output is random noise. Later the net sounds as though it is babbling, and later still as though it is speaking English double talk (speech formed of sounds that resemble English words). At the end of training, NETtalk does a fairly good job of pronouncing the text it is given, including text that was not presented in the training set. An interesting feature of NETtalk is that the babbling produced during training sounds like the vocalization of children attempting to produce words.
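
To give a rough idea of how running text can be presented to a net of this kind, the sketch below slides a fixed window across the text and turns each window into a vector of 0s and 1s, one block of units per letter position. NETtalk is usually described as looking at a seven-letter window and producing the sound for the middle letter. The symbol set, the padding character and the function names here are assumptions made for the example and differ in detail from the original system. The phonetic output code is not modelled at all.

```python
# Hypothetical symbol set: 26 letters, space, and "_" as a padding symbol.
ALPHABET = "abcdefghijklmnopqrstuvwxyz _"


def one_hot(ch):
    """One unit per symbol; only the unit for this symbol is switched on."""
    vec = [0] * len(ALPHABET)
    vec[ALPHABET.index(ch)] = 1
    return vec


def windows(text, size=7):
    """Yield one window per character, with that character in the middle."""
    half = size // 2
    padded = "_" * half + text + "_" * half
    for i in range(len(text)):
        yield padded[i:i + size]


def encode(window):
    """Concatenate the one-hot codes of all letters in the window."""
    return [bit for ch in window for bit in one_hot(ch)]


wins = list(windows("a cat"))
print(wins[2])               # _a cat_
print(len(encode(wins[2])))  # 196 input units (7 positions x 28 symbols)
```

In training, each such input vector would be paired with the phonetic code for the middle letter of the window.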

Rumelhart and McClelland (1986) built a net that could predict the past tense of English verbs. The net was trained on data involving both regular and irregular verbs. During learning, as the system was exposed to a training set containing more regular verbs, it tended to overregularize, i.e. to combine irregular and regular forms (producing break / broked instead of break / broke). This was corrected with further training. Children are known to make the same types of overgeneralization during language learning, a phenomenon popularly known as the "U" curve (Bloom 1994). The left tail of the curve represents the earliest period of morphological production, when children produce both regular and irregular forms correctly. This is followed by a long period of overgeneralization, marked by the sagging middle of the U. Given time and exposure, children stop overgeneralizing and return to adult-like competence, represented by the right tail of the U. Rumelhart and McClelland's model seems to have captured this phenomenon.

However, the model has been criticised as an account of how humans learn and process verb endings. One criticism is that it captures overgeneralization and the U curve only through some very unrealistic assumptions. In training their model, Rumelhart and McClelland started by feeding the network a higher proportion of irregular verbs than regular verbs. This was followed by a set of verbs in which the proportion of irregular verbs was smaller. The rationale was the assumption that irregular verbs are of higher frequency and hence that children may be expected to learn them first. However, research by Pinker and Prince (1988) found that there is no change in the relative frequency of regular and irregular verbs when children start to overgeneralize. The Rumelhart and McClelland model therefore cannot be taken as an accurate account of child language development.

Pinker and Prince (1988) also point out that the model does a poor job of generalizing to some novel regular verbs. Nets may be good at making associations and matching patterns, they argue, but they have fundamental limitations in mastering general rules such as the formation of the regular past tense. In their account, the past tense rule applies to the verb stem as it is stored in the lexicon, not to the sound of the verb. For example, wring and ring sound the same but have different past tense forms: wrung and rang. A network model such as Rumelhart and McClelland's, which works from the sound of the verb, would not be able to tell them apart. Furthermore, unlike the net, children are quite capable of using words that they have never been exposed to before.

Recent research has also shown that inflectional overgeneralization, which is known to last into the child's school years, is less frequent than was previously thought. According to Cho and O'Grady (1996), "pre school children seem to overregularize irregular verbs less than 25 per cent of the time at any point in development." They further suggest that the overgeneralization errors observed in early speech reflect lapses in accessing the appropriate irregular form from the lexicon rather than a failure to learn irregular forms per se. In light of these findings, it is doubtful that the Rumelhart and McClelland model can be taken as typifying child language acquisition.

One area of contention on which the Rumelhart and McClelland model sheds some light is the acquisition of the irregular past tense in terms of the "blocking" or "elsewhere" condition (Aronoff, 1976; Kiparsky, 1982, as cited by Bloom 1994). This condition postulates a rule that blocks the production of a regular form when a corresponding irregular item already exists in the lexicon. For instance, the past tense form went would block the production of 'goed', given that went exists as the past tense form of go in the speaker's lexicon. The same condition would block mans as the plural form of man. According to this theory, overregularization comes to an end when the child has stored enough irregular forms in the lexicon to block the application of the regular past tense rule. This is similar to what happens to the Rumelhart and McClelland model during training: once the net has been trained to recognize the irregular verb forms, its overregularization ceases. The blocking theory has thus gained some supporting evidence from this network.
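
The blocking condition itself is easy to state as a procedure, and a minimal sketch may help. Note that this is a symbolic rule-plus-lexicon illustration of the theory, not a connectionist model. The function name, the example lexicons and the crude "add -ed" rule are all assumptions made for the example.

```python
def past_tense(stem, irregulars):
    """Return the stored irregular form if there is one, else apply the regular rule."""
    if stem in irregulars:
        return irregulars[stem]  # the stored form blocks the regular rule
    return stem + "ed"           # crude regular rule; ignores spelling changes


early_lexicon = {}  # hypothetical child lexicon: no irregular forms stored yet
later_lexicon = {"go": "went", "ring": "rang", "wring": "wrung"}

print(past_tense("go", early_lexicon))     # goed
print(past_tense("go", later_lexicon))     # went
print(past_tense("walk", later_lexicon))   # walked
print(past_tense("wring", later_lexicon))  # wrung
```

Because the lexicon is keyed by the verb stem rather than by its sound, ring and wring can have different stored past tense forms even though they are pronounced alike, which is the point Pinker and Prince make against a purely sound-based model.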

Elman (1991) produced a net that could predict the next word in an English sentence. The net allowed unlimited formation of relative clauses while keeping to the rules of subject-verb agreement. For example, the net was able to handle the following sentence:

Any man that chases dogs that chase cats ... runs.

In the above example the net was able to match the singular subject man with the verb runs, rather than run, despite the intervening plurals dogs and cats. Elman's net displayed an appreciation of the grammatical structure of sentences that were not in the training set. However, the performance of this net is limited to the vocabulary on which it was trained. It will not be able to generalize this performance to sentences formed with a novel vocabulary.
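
What lets a net of this kind keep track of an early subject across intervening words is that its hidden layer receives, along with the current word, a copy of its own previous state. The sketch below shows one forward pass of such a simple recurrent network, as Elman-style nets are usually described. It is an untrained toy: the class name, the five-word vocabulary, the hidden layer size and the random weights are all assumptions made for illustration, and no learning is performed, so the predicted probabilities are meaningless in themselves.

```python
import math
import random

VOCAB = ["man", "dogs", "chases", "runs", "that"]  # hypothetical toy vocabulary


def make_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def softmax(scores):
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


class SimpleRecurrentNet:
    """Untrained sketch: input and context units feed the hidden units."""

    def __init__(self, vocab, hidden_size=4, seed=0):
        rng = random.Random(seed)
        self.vocab = vocab
        self.w_in = make_matrix(hidden_size, len(vocab), rng)
        self.w_context = make_matrix(hidden_size, hidden_size, rng)
        self.w_out = make_matrix(len(vocab), hidden_size, rng)
        self.context = [0.0] * hidden_size  # copy of the previous hidden state

    def step(self, word):
        """Read one word and return a probability for each possible next word."""
        x = [1.0 if w == word else 0.0 for w in self.vocab]
        from_input = matvec(self.w_in, x)
        from_context = matvec(self.w_context, self.context)
        hidden = [math.tanh(a + b) for a, b in zip(from_input, from_context)]
        self.context = hidden  # remembered for the next word
        return softmax(matvec(self.w_out, hidden))


net = SimpleRecurrentNet(VOCAB)
for word in ["man", "that", "chases", "dogs"]:
    prediction = net.step(word)
print(dict(zip(VOCAB, prediction)))
```

Because the context units carry forward a trace of everything read so far, the same word (here dogs) yields a different internal state depending on what preceded it. A trained net can exploit this to prefer runs over run after a singular subject.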

Despite these interesting models and features, there are some general weaknesses in connectionist accounts of language development that bear mentioning. First, most neural network research does not closely approximate many interesting and possibly important features of the brain. For example, connectionists usually do not attempt to model explicitly the variety of different kinds of brain neurons, nor the effects of neurotransmitters and hormones. Such biological features, and their effect on language development, remain an elusive topic for connectionists. Furthermore, it is far from clear that the brain contains the kind of reverse connections that would be needed if it were to learn by a process like backpropagation, and the immense number of repetitions such training methods require seems far from realistic. Under normal conditions children do not need repeated instruction to acquire the features of language, whereas neural networks need many repeated inputs and repeated recalculation of the connection weights. Another matter that network researchers have not explained satisfactorily is the language development of children who live in bilingual environments. Children in such situations are known to develop more than one linguistic form for the same object, action or event. Attention to these matters will probably be necessary if convincing connectionist models of human cognitive processes are to be constructed.


References

Berko-Gleason, J. and Bernstein Ratner, N. (eds.) (1998). Psycholinguistics (2nd ed.). Orlando: Harcourt Brace.

Bloom, P. (1994). Recent Controversies in the Study of Language Acquisition. In Gernsbacher, M.A. (ed.), Handbook of Psycholinguistics. San Diego: Academic Press.

Carroll, D.W. (1999). Psychology of Language. California: Brooks/Cole Publishing Company.

Cho, S.W. and O'Grady, W. (1996). Language Acquisition: The Emergence of a Grammar. In O'Grady, W. et al., Contemporary Linguistics: An Introduction. London: Longman Limited.

Damon, W. (ed.) (1998). Handbook of Child Psychology. Volume 2. New York: John Wiley & Sons.

Eggen, P. and Kauchak, D. (1999). Educational Psychology. New Jersey: Prentice Hall.

Gernsbacher, M.A. (ed.) (1994). Handbook of Psycholinguistics. San Diego: Academic Press.

Harley, T. (1994). The Psychology of Language. Hove: Psychology Press.

McClelland, J. and Rumelhart, D. et al. (1986). Parallel Distributed Processing. Vol. II. Cambridge: MIT Press.

Morrow, L.M. (1993). Literacy Development in the Early Years. Boston: Allyn & Bacon.

Slavin, R.E. (1994). Educational Psychology: Theory & Practice. Massachusetts: Allyn & Bacon.

Sternberg, R.J. (1985). Cognitive Approaches to Intelligence. In Wolman, B.B. (ed.), Handbook of Intelligence. New York: Wiley Interscience Publication.

Sternberg, R.J. (1991). A Contextualist View of the Nature of Intelligence. In Fry, P.S. (ed.), Changing Conceptions of Intelligence and Intellectual Functioning: Current Theory and Research. Amsterdam: Elsevier Pub.

Wolman, B.B. (ed.) (1982). Handbook of Developmental Psychology. New Jersey: Prentice Hall.

The Issue of Language Acquisition.
http://snow.uscd.edu/~ezra/msc/122.html

Introduction to Connectionism.
http://yoda.cis.temple.edu:8080/UGAIWWW/lectutes95/meeden/nnet/311.html

Child Development
http://www.abacon.com/berk/cd/sum9.html

Child Language Data.
http://www.arts.uwa.edu.au/lingWWW/LIN102-99/Notes/dataAcquis.html

Neural Computing Surveys
http://www.icsi.berkeley.edu/ jagota/NCS,

 
