Speech production is a process that runs through a series of highly heterogeneous levels: it starts from an idea and ends with a sound wave. Focusing on the production of a single word, the steps may be schematized as follows:¹

From idea to articulation

| level | operations/processes | brain area | example |
|---|---|---|---|
| Pragmatics | a cognitive and communicative idea is formed | tertiary association cortices | involves a tiger |
| Conceptualization | idea is analyzed in terms of constitutive notions | | notion of ‘tiger’ is activated |
| Lexical selection | notions are mapped onto items of the mental lexicon | [various regions of left hemisphere] | the word tiger is selected |
| Morphological adaptation | lexeme is specified as a word form according to its function in the sentence | Broca's area | (nothing in this example) |
| Symbolization | combination of morphs is mapped onto a phonological representation | Wernicke's area | /tajgər/ |
| Phonetics | phonological representation is converted into a plan for the execution of phonatory and articulatory movements² | Broca's area + premotor cortex + basal ganglia³ | [tʰɑɪ•gɚ] |
| Articulatory representation | phonetic plan is temporarily kept in working memory | prefrontal cortex | |
| Motor commands | sequentially, each component of the plan is sent to speech apparatus | primary motor cortex | |
| Phonation and articulation | speech apparatus executes motor commands (with proprioceptive and auditory feedback) | phonatory and articulatory organs | [Röntgen video here] |
| Acoustics | sound wave is produced | air | [see next section] |

Although this model is schematic and simplified in some respects, its steps can be tentatively associated with the passage of a neural impulse through the brain regions indicated in the table.

The model incorporates self-monitoring (Levelt et al. 1999). This implies that the speaker checks the result of each step executed, compares it with what was intended and, if an error is found, may take corrective measures. The last monitoring step is the auditory feedback that speakers receive from their own utterance.
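
The stepwise character of the model and its monitoring can be illustrated with a small toy program. It is a minimal sketch under strong assumptions, not an implementation of the model of Levelt et al. 1999: the lookup tables, the stage functions, the function name `produce` and the `INTENDED` dictionary are all hypothetical and cover only the 'tiger' example, from lexical selection to the phonetic plan. In particular, it assumes that the intended result of every level is known in advance, and that a repair simply substitutes the intended form.

```python
from typing import Callable, Dict, List, Tuple

# A stage pairs a level name with the operation carried out at that level.
Stage = Tuple[str, Callable[[str], str]]

# Toy mappings for the running example only (assumption: one entry each).
LEXICON: Dict[str, str] = {"TIGER": "tiger"}               # notion -> lexical item
PHONOLOGY: Dict[str, str] = {"tiger": "/tajgər/"}          # word form -> phonological representation
PHONETIC_PLANS: Dict[str, str] = {"/tajgər/": "[tʰɑɪ•gɚ]"}  # phonological representation -> phonetic plan

STAGES: List[Stage] = [
    ("lexical selection", lambda notion: LEXICON[notion]),
    ("morphological adaptation", lambda lexeme: lexeme),  # nothing to do in this example
    ("symbolization", lambda form: PHONOLOGY[form]),
    ("phonetics", lambda phon: PHONETIC_PLANS[phon]),
]

# Hypothetical stand-in for the speaker's intention at each level.
INTENDED: Dict[str, str] = {
    "lexical selection": "tiger",
    "morphological adaptation": "tiger",
    "symbolization": "/tajgər/",
    "phonetics": "[tʰɑɪ•gɚ]",
}


def produce(notion: str, stages: List[Stage],
            intended: Dict[str, str]) -> Tuple[str, List[str]]:
    """Run the stages in order; monitor and, if needed, repair each result."""
    repaired: List[str] = []
    current = notion
    for level, operation in stages:
        result = operation(current)
        if result != intended[level]:   # monitoring: compare result with intention
            repaired.append(level)
            result = intended[level]    # simplest possible repair: substitute the intended form
        current = result
    return current, repaired


plan, repaired = produce("TIGER", STAGES, INTENDED)
print(plan, repaired)
```

Replacing the first stage by one that makes a slip, for instance `("lexical selection", lambda notion: "lion")`, yields the same phonetic plan, but the list of repaired levels then contains "lexical selection". The final monitoring step, auditory feedback, lies outside this sketch.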


1 Cf. the “Standard Model of Word-Form Encoding” (Meyer 2000) and Stille et al. 2020.

2 The step-wise conversion of a relatively abstract phonological representation into an individual phonetic one is flatly contradicted by Port 2007.

3 The three components mentioned are linked by a closed loop in which motor programs are selected and refined.