Cascaded vs. Discrete Models of Speech Production in Bilinguals

  • Written by: Valentina Temina-Kingsolver
  • Country: Russian Federation

This article discusses the controversial issue of bilingual speech production, and the differing models of how the activation of lexical nodes is spread to the corresponding phonological representations.

This issue is of great importance and interest for those who are concerned with the psychology of bilingualism and multilingualism. The understanding of the processes of language production that are related to bilingual speakers of different age groups and levels of proficiency may shed light on general issues of L1 and L2 acquisition. In addition, the knowledge of L1 and L2 development can have practical application for language instruction.

The present article first gives background information on models of language production and then discusses two highly debated hypotheses, the discrete and the cascaded models. A large portion of the article is devoted to the experimental evidence that supports each model. Finally, the experimental paradigms are analyzed, which suggests new ideas for further research in this area.


There are different models of language production (Caramazza, 1997; Dell, 1986; Dell et al., 1997; Levelt, 1989; Levelt et al., 1999). Most models recognize at least three different levels of representations. At each level, the speaker is involved in the processes of activation and selection of corresponding representations (concepts, lexical nodes/words, and phonemes/phonological segments).

First, at the conceptual/semantic level, the speaker has to decide which conceptual information to convey (concept selection). During this step, not only the semantic representations (lemmas) of the target concept but also the semantic representations of related concepts become activated to some degree. Activation then spreads from these semantic representations to the corresponding lexical representations at the lexical level. The dominant view in models of bilingual speech production is that activated semantic representations automatically spread activation from the conceptual level to the lexical level of both of the bilingual's languages (Costa et al., 1999; Gollan & Acenas, 2000; Green, 1998; Levelt et al., 1999; Poulisse, 1999; Starreveld & La Heij, 1996).

Second, at the lexical level, lexical representations (lexemes/word forms) are accessed. Because all activated concepts, not only the selected one, activate their lexical representations, and these representations act as competitors, the speaker has to decide which lexical item out of the set of candidates to choose for further processing. For bilinguals, a further question is whether the lexical items of the nonresponse language also become activated to some degree during lexical access. Empirical evidence from semantic errors, blends of semantically related words, the semantic interference effect, and L1 lexical intrusions during L2 production supports the view that they do.

Finally, at the sublexical/phonological level, the lexical selection of the target lexical item results in the retrieval of the corresponding phonological segments/word form that will be articulated.

The spreading activation principle, which describes how activation flows between the conceptual/semantic and lexical levels, has been widely accepted across models; however, theories of word production disagree about the relationship between the level of lexical selection and the level of phonological encoding.

Discrete vs. Cascaded Processes/Models

There are two questions that are commonly discussed in the research of bilingual speech production regarding this issue: (1) whether the activated lexical nodes of the language not in use activate the corresponding phonological representations, and (2) whether such activation affects the phonological encoding of the intended lexical item.

These questions are central to understanding language production in bilingual speakers. Costa (2005) claims that the issue remains unresolved. He also notes that the first question is independent of the second one, since it is possible for activation to spread to both languages of a bilingual while the selection mechanism remains insensitive to the activation of representations that do not belong to the intended language.

There are two different models that concern these questions of lexical access in speech production: the discrete and the cascaded models. According to discrete/serial models (Levelt, 1989; Levelt et al., 1991; Levelt et al., 1999; Schriefers et al., 1990), only the selected lexical representations activate their phonological segments, and the lexical representations that were not selected for production do not spread activation to their corresponding phonological representations. The activation of the phonological form begins only after the target lexical node has been selected (discrete processing) and there is no feedback to higher level processes from lower levels (feed-forward processing).

On the contrary, in the cascaded models of lexical access (Caramazza, 1997; Costa et al., 2000; Cutting & Ferreira, 1999; Dell, 1986; Dell et al., 1997; Dell & O'Seaghdha, 1991; Morsella & Miozzo, 2002; Peterson & Savoy, 1998), all activated lexical representations affect the next level by spreading a proportional amount of activation to their corresponding phonological segments. The activation of the phonological word form occurs before the lexical selection takes place (cascaded processes), and the information from the sublexical level affects the higher levels of processing (feedback processing).

These views can be logically extended to models of bilingual language production. If semantic representations activate both lexicons of a bilingual concurrently, then lexical nodes belonging to the nonresponse language may activate their phonological representations. For example, when a Spanish-English bilingual wants to name a picture of a bed in English, the phonological properties of bed, along with those of its Spanish translation cama, would become activated. There is some empirical evidence that is consistent with this view (Colomé, 2001; Costa et al., 2000; Costa & Caramazza, 2002; Meyer & Schriefers, 1991; Roelofs, 2000; Starreveld, 2000).
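The contrast between the two models can be made concrete with a toy sketch. The Python code below is only an illustration under stated assumptions: the activation values, the spreading rate, the extra node couch, and the letter-like segments are all hypothetical, and feedback from the phonological level is not modeled. It shows which phonological segments receive activation when only the selected lexical node passes activation on (discrete) and when every activated node does so in proportion to its own activation (cascaded).

```python
# Toy illustration only: all values and names below are assumptions, not fitted data.

# Hypothetical activation reaching each lexical node from the concept BED.
lexical_activation = {"bed": 1.0, "cama": 0.8, "couch": 0.3}

# Rough stand-ins for the phonological segments of each word.
segments = {
    "bed": ["b", "e", "d"],
    "cama": ["k", "a", "m", "a"],
    "couch": ["k", "au", "ch"],
}


def phonological_activation(lexical, segs, cascaded, rate=0.5):
    """Return segment activations under a discrete or a cascaded regime."""
    # Lexical selection: the most active lexical node wins.
    selected = max(lexical, key=lexical.get)
    result = {}
    for word, activation in lexical.items():
        if not cascaded and word != selected:
            continue  # discrete: unselected nodes pass nothing on
        for seg in segs[word]:
            # cascaded: every node spreads a proportional amount of activation
            result[seg] = result.get(seg, 0.0) + rate * activation
    return result


discrete = phonological_activation(lexical_activation, segments, cascaded=False)
cascaded = phonological_activation(lexical_activation, segments, cascaded=True)

print("discrete:", sorted(discrete))
print("cascaded:", sorted(cascaded))
```

In the discrete run only the segments of bed are active, whereas in the cascaded run the segments of cama (and of the semantically related couch) are active as well, which is the pattern the bilingual studies discussed here try to detect.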

Experimental Evidence

There is little agreement among the researchers on the issue of whether the selection of the lexical and sublexical representations in the response language of a bilingual is affected by the activation of corresponding linguistic representations in the other language.

Several studies have addressed the question of whether phonological activation of non-selected lexical nodes is found during lexical access in monolingual speakers (Cutting & Ferreira, 1999; Levelt et al., 1991; Peterson & Savoy, 1998). Fewer studies deal with bilingual language production. The results of several reaction time studies favor the notion that concurrently activated lexical nodes of the nonresponse language spread activation to their corresponding phonological segments, that is, cascaded processing in language production (Colomé, 2001; Gollan & Acenas, 2000; Hermans et al., 1998; Jescheniak et al., 2006; Morsella & Miozzo, 2002). However, some other studies have been interpreted as supporting the notion that lexical nodes belonging to the nonresponse language do not compete during lexical selection (Costa et al., 1999; Costa & Caramazza, 1999).

Costa (2005) notes that, given the present state of research in this area, it is not possible to conclude which of the models better explains the processes of bilingual language production. He also raises the question of whether the results depend on the category of bilingual speakers tested.

One important piece of evidence for cascaded processing comes from a reaction time study conducted by Peterson and Savoy (1998). In some trials the participants had to name a picture presented on the screen; in others, instead of naming the picture, they had to read a word that appeared on the screen after the picture. In the critical condition the word was phonologically related to a synonym of the picture’s name. For example, when the target picture was a couch, the word was phonologically related to its synonym sofa (e.g., soda). The authors hypothesized that if the non-target word sofa sends activation to its phonological segments, reading latencies for the word soda should be shorter when it is preceded by the picture of a couch than when it is preceded by an unrelated picture (e.g., lemon). The data supported this hypothesis. The authors concluded that when a participant is asked to name a picture, the target’s synonym is also highly activated, because the two words overlap semantically. This study nevertheless raises the question of whether the semantic overlap between the target word and its synonym created a semantic facilitation effect rather than a phonological facilitation effect.

Another reaction time experiment that can be interpreted as revealing the presence of cascaded processes in bilingual language production is phonological influence of L1 lexical items in a phoneme monitoring task observed by Colomé (2001). In this study, Catalan-Spanish bilinguals were asked to decide whether a target phoneme was included or not in the Catalan name of the picture. Colomé hypothesized that in naming pictures in Catalan the reaction time would be affected by the activation of the phonological segments of the corresponding Spanish names of the picture. She also argued that in such a scenario it would be harder for participants to reject a phoneme as not being part of the target Catalan word if it was part of its Spanish name because the level of activation of the related phoneme would be higher than the unrelated one. The results of the study confirmed the predictions and were interpreted as evidence for activation of the lexical items of the nonresponse language and their corresponding phonological segments. Colomé concluded that making a phonological decision about the name of a target picture in L2 is affected by the phonological content of the name of the target picture in L1. In contrast to the previously described study, Colomé’s findings display a phonological interference effect rather than a phonological facilitation effect, which equally supports the cascaded model, because any kind of phonological effect would account for activation of phonological segments of the nonresponse language.

Hermans et al. (1998) conducted several picture-word interference experiments in which Dutch-English bilinguals were required to name pictures in their L2 and to ignore distractor words in either L1 or L2. In the critical condition, the distractor word was phonologically related to the target's translation. The authors hypothesized that the target’s translation would interfere during the selection of the target’s lexical node, and that naming latencies would consequently be slower when the translation received extra activation from a phonologically related distractor word than when the distractor was unrelated. The results supported this hypothesis: naming latencies were slower in the phonologically related condition, which revealed the activation of the target’s translation in the nonresponse language. The evidence was interpreted as favoring the activation of lexical representations of the nonresponse language.

Costa et al. (2000), however, proposed an alternative explanation for these results. They claim that the interference effect in this study can arise during the retrieval of the phonological elements of the target word. In the phonologically related condition the phonemes of the target's translation receive extra activation from the distractor word, while they receive no extra activation in the unrelated condition. Therefore, they assert that “the selection of the target’s phonological information might be delayed by the competition of other activated phonological information” (Costa et al., 2000, pp. 428-429). They base this argument on the assumptions that (a) there is cascaded processing, and therefore even if lexical nodes are not selected for production, they still activate their phonological segments, and (b) the activation of the phonological segments of a non-target item may still interfere with the retrieval of the target item.

Another set of results indicating that lexical nodes from the nonresponse language activate their phonological properties was reported by Gollan and Acenas (2000), who explored the occurrence of tip-of-the-tongue (TOT) states in bilingual speech production. The authors observed fewer TOT states when cognate words were involved. They argued that the cognate effect arose because the target's translation was sending extra activation to the phonological elements of the target word, thus making the phonological segments of a cognate word more available than those of a noncognate. One could object that, since cognate words in two languages share conceptual representations, it is not obvious whether the cognate facilitation effect is caused by the semantic overlap or by the phonological similarity.

Morsella and Miozzo (2002) used a picture-picture interference paradigm, in which distractor pictures were used rather than distractor words. The participants were to name pictures shown in green while ignoring superimposed pictures shown in red. In one part of the experiment, both phonologically related and unrelated target-distractor pairs were used. The names of the phonologically related pairs shared some phonemes but were not homophones. The authors hypothesized that distractor pictures with phonologically related names would facilitate naming of the target pictures. As predicted, naming latencies were faster when the distractor picture’s name was phonologically related to the name of the target picture than when it was unrelated. The authors interpreted this phonological facilitation effect as evidence that distractor pictures activate their phonological segments, and concluded that the study supports the cascaded model of lexical access.

The results of a study by Jescheniak et al. (2006) provide further evidence for the cascaded model of lexical retrieval. The series of experiments demonstrated mediated effects in children of about 7 years of age, and these effects decreased with age. Specifically, the mediated distractors led to interference in the youngest age group but not in adults. This study offers a different perspective on language production processes in bilinguals, since the age of the speaker plays an important role in the way language is processed.

Proponents of discrete models have also obtained evidence of discrete processing in language production. Levelt et al. (1991) searched for phonological activation of semantic alternatives to the target word. They found no effect on lexical decision latencies for spoken probes that were phonologically related to semantic competitors of the target lexical items (e.g., dog as a competitor of cat), even though the semantic competitors themselves (e.g., dog) showed semantic interference. This result supports the discrete view.

More evidence for the discrete models comes from the experiments conducted by Levelt (1989), Levelt et al. (1999), Schriefers et al. (1990) and others.

Experimental Paradigms

The processes of bilingual speech production have not been fully explored and there seem to be a number of unanswered questions. The major reason for this state of affairs is, according to Costa (2005), problems in developing experimental paradigms in language production of bilingual speakers.

One of the most popular and frequently used paradigms for studying processes involved in lexical access is the picture-word interference task, which is a variant of the color-word Stroop task. In this task, the participant is presented with a target picture and a distractor word, and is required to name the picture while ignoring the distractor word. Under different conditions, the distractor word can contain different critical features (for example, be phonologically and/or semantically related). Major effects, such as semantic and phonological facilitation effects (SFE and PFE), have been observed in this paradigm. PFE refers to the faster naming latencies observed when the distractor word and the picture’s name are phonologically related (picture: bed, distractor: bell) than when they are not (picture: bed, distractor: dog) (Costa et al., 2000).
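As a small worked illustration of how such an effect is quantified, the sketch below computes a facilitation effect as the difference between mean naming latencies in the unrelated and related conditions. The function name and the latency values are hypothetical placeholders invented only to show the arithmetic; they are not data from any of the studies cited here.

```python
from statistics import mean


def facilitation_effect(related_ms, unrelated_ms):
    """Mean unrelated latency minus mean related latency.

    A positive value means faster naming in the related condition
    (facilitation); a negative value would indicate interference.
    """
    return mean(unrelated_ms) - mean(related_ms)


# Placeholder latencies in milliseconds, invented purely for illustration.
related = [610, 595, 620]    # e.g. picture: bed, distractor: bell
unrelated = [650, 640, 660]  # e.g. picture: bed, distractor: dog

effect = facilitation_effect(related, unrelated)
print(f"facilitation effect: {effect:.1f} ms")
```

The same difference score, with the sign reversed, describes interference effects such as those reported for phonologically related translations.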

However, La Heij (2005) sees one major problem with this paradigm. He claims that words activate their lexical representations faster and more strongly than nonverbal stimuli do. Words access their lexical and phonological representations directly, bypassing the conceptual level, while visual stimuli have to follow the top-down architecture of language production. Consequently, after the presentation of, for example, the picture of a bed with the word dog superimposed, the incorrect word dog will reach the activation threshold before the target word bed does.

Another problem can be detected with the picture-word interference paradigm. Costa et al. (2003) claim there is an assumption that a distractor word exerts some effects at one single level of representation. For instance, semantically related distractors would exert their effects at the lexical level, whereas phonologically related distractors would do so at the sublexical/phonological level. However, they noted that some distractor words might exert their effects at different levels of representation. Accordingly, phono-translation, cognate distractors and synonym distractors may all create an effect at both lexical and sublexical levels.

Costa et al. (2006) point out several potential variables that may influence the type of processes involved in the speech production of bilinguals. Among those variables are “the similarity of the two languages of a bilingual, the age (and manner) at which L2 has been acquired, the proficiency achieved in L2, the recency and frequency of use of the two languages, and the discourse topic [that] may affect whether or not the two languages become activated in parallel even in monolingual contexts” (p. 148).


Taking into account the issues discussed above, it seems reasonable to use a picture-picture interference paradigm, a task in which a number of objects are presented (e.g., two pictures), one of which has to be named aloud. This paradigm avoids known problems of the picture-word paradigm, such as the possibility that a written-word distractor bypasses the lexical nodes and activates phonology directly from orthography.

Also, it seems to be important to control for various alternative explanations of experimental results. As such, including participants of different age groups and levels of language fluency is crucial for obtaining convincing results about the cascaded and discrete processes of language production.


References

Caramazza, A. (1997). How many levels of processing are there in lexical access? Cognitive Neuropsychology, 14(1), 177-208.

Colomé, A. (2001). Lexical activation in bilinguals' speech production: Language-specific or language-independent? Journal of Memory and Language, 45(4), 721-736.

Costa, A. (2005). Lexical Access in Bilingual Production. In J. F. Kroll & A. M. B. De Groot (Eds.), Handbook of Bilingualism: Psycholinguistic Approaches (pp. 308-325). New York: Oxford University Press.

Costa, A., & Caramazza, A. (1999). Is lexical selection in bilingual speech production language-specific? Further evidence from Spanish-English and English-Spanish bilinguals. Bilingualism: Language and Cognition, 2(3), 231-244.

Costa, A., & Caramazza, A. (2002). The production of noun phrases in English and in Spanish: Implications for the scope of phonological encoding in speech production. Journal of Memory and Language, 46, 178-198.

Costa, A., Colomé, A., & Caramazza, A. (2000). Lexical access in speech production: The bilingual case. Psicológica, 21, 403-437.

Costa, A., La Heij, W., & Navarrete, E. (2006). The dynamics of bilingual lexical access. Bilingualism: Language and Cognition, 9(2), 137-151.

Costa, A., Miozzo, M., & Caramazza, A. (1999). Lexical selection in bilinguals: do words in the bilingual's two lexicons compete for selection? Journal of Memory and Language, 41, 365-397.

Cutting, J. C., & Ferreira, V. S. (1999). Semantic and phonological information flow in the production lexicon. Journal of Experimental Psychology: Learning, Memory and Cognition, 25, 318-344.

Dell, G. S. (1986). A spreading-activation theory of retrieval in sentence production. Psychological Review, 93(3), 283-321.

Dell, G. S., Burger, L. K., & Svec, W. R. (1997). Language production and serial order: A functional analysis and a model. Psychological Review, 104(1), 123-147.

Dell, G. S., & O'Seaghdha, P. G. (1991). Mediated and convergent lexical priming in language production: A comment on Levelt et al. (1991). Psychological Review, 98, 604-614.

Gollan, T. H., & Acenas, L. A. (2000, April). Tip-of-the-tongue incidence in Spanish-English and Tagalog-English bilinguals. Presented at the Third International Symposium on Bilingualism, Bristol, U.K.

Green, D. W. (1998). Mental control of the bilingual lexico-semantic system. Bilingualism: Language and Cognition, 1(2), 67-104.

Hermans, D., Bongaerts, T., De Bot, K., & Schreuder, R. (1998). Producing words in a foreign language: Can speakers prevent interference from their first language? Bilingualism: Language and Cognition, 1, 213-230.

Jescheniak, J. D., Hahne, A., Hoffmann, S., & Wagner, V. (2006). Phonological activation of category coordinates during speech planning is observable in children but not in adults: Evidence for cascaded processing. Journal of Experimental Psychology: Learning, Memory and Cognition, 32(3), 373-386.

La Heij, W. (2005). Selection processes in monolingual and bilingual lexical access. In J. F. Kroll & A. M. B. De Groot (Eds.), Handbook of Bilingualism: Psycholinguistic Approaches (pp. 289-307). New York: Oxford University Press.

Levelt, W. J. M. (1989). Speaking: From Intention to Articulation. Cambridge, Massachusetts: MIT Press.

Levelt, W. J. M., Roelofs, A., & Meyer, A. S. (1999). A theory of lexical access in speech production. Behavioral and Brain Sciences, 22(1), 1-38.

Levelt, W. J. M., Schriefers, H., Meyer, A. S., Pechmann, T., Vorberg, D., & Havinga, J. (1991). The time course of lexical access in speech production: A study of picture naming. Psychological Review, 98(1), 122-142.

Meyer, A., & Schriefers, H. (1991). Phonological facilitation in picture-word interference experiments: Effects of stimulus onset asynchrony and types of interfering stimuli. Journal of Experimental Psychology: Learning, Memory and Cognition, 17(6), 1146-1160.

Morsella, E., & Miozzo, M. (2002). Evidence for a cascade model of lexical access in speech production. Journal of Experimental Psychology: Learning, Memory and Cognition, 28(3), 555-563.

Peterson, R. R., & Savoy, P. (1998). Lexical selection and phonological encoding during language production: Evidence for cascaded processing. Journal of Experimental Psychology: Learning, Memory and Cognition, 24, 539-557.

Poulisse, N. (1999). Slips of the Tongue: Speech Errors in First and Second Language Production. Amsterdam, Philadelphia: John Benjamins.

Roelofs, A. (2000). WEAVER++ and other computational models of lemma retrieval and word-form encoding. In L. Wheeldon (Ed.), Aspects of language production (pp. 71-114). Sussex, U.K.: Psychology Press.

Schriefers, H., Meyer, A. S., & Levelt, W. J. M. (1990). Exploring the time-course of lexical access in production: Picture-word interference studies. Journal of Memory and Language, 29, 86-102.

Starreveld, P. A. (2000). On the interpretation of onsets of auditory context effects in word production. Journal of Memory and Language, 42(4), 495-525.

Starreveld, P. A., & La Heij, W. (1996). The locus of orthographic-phonological facilitation: A reply to Roelofs, Meyer, and Levelt. Journal of Experimental Psychology: Learning, Memory and Cognition, 22(1), 252-255.
