Formulaic Sequences: Definition Problems

This article discusses the problem of defining formulaic sequences in language use as it is presented in the 2004 volume “Formulaic Sequences”, edited by Norbert Schmitt.

Over the past three decades, much research has been done on language patterning. However, as Schmitt & Carter (2004) note, the diversity of formulaic sequences has made it difficult to develop a comprehensive definition of the phenomenon.

Kuiper (2004) states that as early as 1976, Austin observed that some utterances acting as speech acts (greetings, apologies, etc.) were used within particular contexts and served as both cultural and linguistic artifacts, but he failed to notice that those utterances were relatively fixed formulae.

Schmitt & Carter (2004) and Schmitt et al. (2004a) claim that Pawley & Syder (1983) were the first English-based researchers to notice the magnitude of ‘sentence-length expressions’ (conventionalized language). Pawley and Syder explained why word clusters appear to hold such a prominent place in language use: storing ‘formulaic sequences’ holistically, as ‘prepackaged’ units, allows the language user to retrieve them easily and frees up cognitive resources for other language processes.

In 1991, as Schmitt & Carter (2004), Jones & Haywood (2004), and Adolphs & Durow (2004) point out, Sinclair put forward two main principles that structure language as a whole: an ‘open choice principle’, which assumes free choice of individual lexical items, and an ‘idiom principle’, which involves extensive use of formulaic stretches of words, or ‘semi-preconstructed phrases’. The idiom principle presupposes that words are often selected as part of a ‘co-selection process’ which ‘leads to a strong syntagmatic relationship between individual lexical and grammatical items’.

In 1992, according to Schmitt et al. (2004a), Nattinger & DeCarrico (1992) conducted an investigation of the relationship between ‘lexical phrases’ and functional language use.

In 1995, Jackendoff (1995) and Mel’čuk (1995) separately echoed the same conclusion, using the terms ‘formulaic sequences’ and ‘phraseology’ respectively (Schmitt & Carter, 2004).

Jones and Haywood (2004) assert that studies of computerized corpora revealed additional patterning in language use. In 1999, for example, Biber et al. (1999) investigated ‘lexical bundles’, which they defined as ‘bundles of words that show a statistical tendency to co-occur’ and also as ‘recurring sequences of three or more words’ (pp. 989-990). They reported that three-word sequences were at least ten times more common than longer sequences.
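The idea of ‘recurring sequences of three or more words’ can be illustrated with a minimal sketch. It is not the procedure Biber et al. used: the function names, the whitespace tokenization, the toy sentences, and the threshold `min_count` are all assumptions made for illustration only.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return every contiguous run of n tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def recurring_sequences(texts, n=3, min_count=2):
    """Count n-word sequences and keep those seen at least min_count times.

    Sequences are counted within each text only, never across text
    boundaries. min_count is an arbitrary illustrative cut-off.
    """
    counts = Counter()
    for text in texts:
        tokens = text.lower().split()  # naive whitespace tokenization
        counts.update(ngrams(tokens, n))
    return {" ".join(gram): c for gram, c in counts.items() if c >= min_count}


# Invented toy sentences, not real corpus data.
toy_corpus = [
    "i do not know what you mean",
    "i do not know if it is true",
    "well i do not think so",
]

three_word = recurring_sequences(toy_corpus, n=3)
four_word = recurring_sequences(toy_corpus, n=4)
print(three_word)
print(four_word)
```

In this toy data two three-word sequences recur but only one four-word sequence does. A real study would use a large corpus and a frequency threshold normalized to corpus size.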

In 2000, Wray proposed a definition of the phenomenon that has since been quite dominant, although, as Read and Nation (2004) note, some researchers acknowledge that it must be treated with caution.

Wray’s definition of the formulaic sequence is:

A sequence, continuous or discontinuous, of words or other meaning elements, which is, or appears to be, prefabricated: that is, stored and retrieved whole from memory at the time of use, rather than being subject to generation or analysis by the language grammar. (Wray 2000:465)

Wray sees wholeness as the main characteristic of a formulaic sequence. The sequence has to be identified as ‘stored and retrieved whole from memory at the time of use’. Holistic processing does not allow ‘generation or analysis by the language grammar’.

Wray’s definition includes both continuous and discontinuous sequences. That is, there may be insertions in a sequence (the point of … is that…). The definition, however, excludes substitution of items within a sequence (pull his leg/pulling his sister’s leg/yank her leg, etc.), and transformations of a sequence (chew the fat/fat chewing/fat chewer), because they would involve ‘generation or analysis by the language grammar’.
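The contrast between an allowed insertion and an excluded substitution can be sketched as a simple pattern match. This is only an illustration of the distinction drawn above, not a method proposed by Wray: the helper `frame_to_regex`, the `...` slot notation, and the limit `max_insert` are hypothetical.

```python
import re


def frame_to_regex(frame, max_insert=3):
    """Compile a frame such as 'the point of ... is that' into a regex.

    '...' marks an open slot that must be filled by 1..max_insert words.
    All other words must appear verbatim (case is ignored).
    """
    parts = []
    for piece in frame.split("..."):
        words = piece.split()
        parts.append(r"\s+".join(re.escape(w) for w in words))
    # An open slot: whitespace, then one to max_insert inserted words.
    slot = r"\s+(?:\w+\s+){1,%d}" % max_insert
    return re.compile(r"\b" + slot.join(parts) + r"\b", re.IGNORECASE)


# A discontinuous sequence: the fixed words match around an insertion.
discontinuous = frame_to_regex("the point of ... is that")
print(bool(discontinuous.search("The point of the exercise is that we learn")))

# A fully fixed sequence: a substituted variant does not match.
fixed = frame_to_regex("pull his leg")
print(bool(fixed.search("Do not pull his leg")))
print(bool(fixed.search("They yank her leg")))
```

The verbatim matching is deliberate: it shows why substituted or transformed variants fall outside a strictly fixed form, which is the point Read and Nation raise about verbatim storage.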

Read and Nation (2004) assert that Wray’s definition does not specify the form of the items in storage. They suggest that if storage is verbatim, with no possibility of substitution or transformation, then according to Grant’s (2003) research, formulaic sequences are rather infrequent in language use.

Read and Nation (2004) claim that interest in formulaic sequences arose as a reaction to the absence of any description of semantic patterning in language, which they assert is different from formulaic sequences. They call for a distinction between the two phenomena. Given the variability of formulaic language and of the ways different researchers see the construct, they consider it important to adapt the definition of formulaic sequences to the purpose of each research study.

Wray (2004) argues that if any word string can be treated as formulaic, it will be difficult to identify formulaic strings just by looking at their form, meaning, or usage. According to her, compiling a complete list of formulaic sequences is difficult, but it remains possible with techniques such as tracking pauses, eye gaze, intonation, or typing fluency.

Wray (2002:9) found over fifty terms that different researchers used for word sequences. Among them are: chunks, formulaic speech, multiword units, collocations, formulas, prefabricated routines, conventionalized forms, holophrases, ready-made utterances.

The editors of the volume considered terms such as phrasal lexical items and phrasal lexemes but, in the end, following Wray’s definition, decided on the term formulaic sequences because it covers different kinds of patterned language.

Other contributors to the volume use other terms for various reasons. Kuiper (2004), for example, has his own definition of formulaic sequences. He suggests that such sequences are easy to identify because they have fixed forms. He also notes that there are other sequences on the scale of formulaicity that are examples of collocational prosody (e.g., bordering on). These sequences allow insertion, inflection, substitution, deletion, and transformation, all of which conflict with Wray’s definition because they involve ‘generation or analysis by the language grammar’. He therefore admits that these sequences or patterns cannot easily be called formulaic sequences.

Adolphs and Durow (2004) also point out different approaches used to define the form and function of different types of relatively fixed sequences in language use. They refer to Aijmer (1996), Manes & Wolfson (1981) and Moon (1998) for examples of different word sequences ranging from chunks to multi-word units to formulas.

As we see from the discussion above, defining formulaic sequences has been a real challenge for linguists who work in this area.


