Elizabeth Stokoe

The science of analyzing conversations, second by second
TEDxBermuda, Dec 4, 2014

What is it that you actually do?
first mover

Prof. Elizabeth Stokoe takes a run on what she terms the “conversational racetrack”—the daily race to understand each other when we speak—and explains how to avoid hurdles that trip us up and cause conflict.

Elizabeth Stokoe is a British scientist who studies conversation analysis. She is a professor at Loughborough University. She graduated from the University of Central Lancashire (Preston Poly) in 1993 with a traditional psychology degree, then completed three years of PhD research at Nene College (Leicester University) with Dr. Eunice Fisher.

Her research included videotaping interaction in university tutorials, and conducting conversation analyses of topic production, topic management, academic identity, and the relevance of gender. She developed these and other interests while working at the Institute of Behavioural Sciences (University of Derby, 1997-2000) and University College Worcester (2000-2002).

Stokoe joined the Department of Social Sciences at Loughborough in October 2002 and was promoted to Reader (2007) and Chair (2009). She teaches on the BSc Social Psychology programme, covering modules in relationships, qualitative methods and forensic psychology.

Stokoe developed the Conversation Analytic Role-play Method (CARM), an approach based on evidence about the sorts of problems and roadblocks that can occur in conversation, as well as the techniques and strategies that best resolve them. CARM won Loughborough University’s Social Enterprise award (2013).

The Mehrabian Study

The Infamous Mehrabian Study And Why You Should Care
Forbes. 2/28/2011

Forty years ago communications expert Albert Mehrabian did a little study that got an outsized reputation — and is often misunderstood.
Here’s what he actually found, and what it means, in a short video from a speech last year.


The 7%-38%-55% rule describes the relative impact of words, tone of voice, and body language when speaking.



Brain responses to angry prosody

The voices of wrath: brain responses to angry prosody in meaningless speech
Nature Neuroscience 8, 145–146 (2005)
Didier Grandjean, et al.

emotional enhancement was voice specific, unrelated to isolated acoustic amplitude or frequency cues in angry prosody, and distinct from any concomitant task-related attentional modulation.
Attention and emotion seem to have separate effects on stimulus processing, reflecting a fundamental principle of human brain organization shared by voice and face perception.

voice perception

Prosody cues word order

Prosody cues word order in 7-month-old bilingual infants
Nature Communications, Feb. 14, 2013, 4: 1490
Judit Gervain

A central problem in language acquisition is how children effortlessly acquire the grammar of their native language even though speech provides no direct information about underlying structure.
This learning problem is even more challenging for dual language learners, yet bilingual infants master their mother tongues as efficiently as monolinguals do.

Here we ask how bilingual infants succeed, investigating the particularly challenging task of learning two languages with conflicting word orders (English: eat an apple versus Japanese: ringo-wo taberu ‘apple.acc eat’).

We show that 7-month-old bilinguals use the characteristic prosodic cues (pitch and duration) associated with different word orders to solve this problem.
Thus, the complexity of bilingual acquisition is countered by bilinguals’ ability to exploit relevant cues.
Moreover, the finding that perceptually available cues like prosody can bootstrap grammatical structure adds to our understanding of how and why infants acquire grammar so early and effortlessly.

Foreign accent syndrome (Jamaican + Italian accent)

A fall on a stairway led to Robin Jenks Vanderlip’s foreign accent syndrome.

A case of foreign accent syndrome.
J Emerg Med. 2013 Jul;45(1):26-9.
Tran AX, Mills LD.

Foreign accent syndrome is a rare but potentially devastating clinical condition associated with altered speech rhythm and prosody, often occurring after a cerebral vascular accident. Missing this diagnosis can lead to delayed stroke work-up and treatment.

We report a case of foreign accent syndrome in a 60-year-old woman who presented to the Emergency Department (ED) with 3 weeks of altered speech pattern, widened gait, bilateral leg heaviness, and mild headache.

The patient had a history of Type 2 diabetes, malignant hypertension, toxic nodular goiter, and hyperlipidemia. She was initially thought to have a speech change secondary to a goiter impinging on the recurrent laryngeal nerve, and was discharged.
She returned to the ED 3 weeks later when outpatient imaging revealed subacute infarction of the left hemi-pons and absent flow within the left vertebral artery.
On examination, the patient was alert and conversational. She spoke fluently with an accent that had components of Jamaican and the accent of an Italian speaking English.
She developed a second, more significant stroke 1 month later, with unilateral weakness and slurred speech in the middle cerebral artery distribution.

Clinicians should be aware that some stroke patients present with various atypical symptoms, and should suspect stroke in any patient with acute-onset neurological symptoms, including speech change.


Rare and unusual psychiatric syndromes


Speaking In Tones

Speaking In Tones
By Diana Deutsch
Scientific American Mind July / August 2010


perceptual transformation

the brain areas governing music and language/speech overlap

a person’s native tongue influences the way he or she perceives music

speakers of tonal languages such as Mandarin are much more likely than Westerners to have perfect pitch

nonverbal sounds such as music

some aspects of music engage the left hemisphere more than the right

the neural networks dedicated to speech and song significantly overlap.

This overlap makes sense, because language and music have a lot in common.
They are both governed by a grammar, in which basic elements are organized hierarchically into sequences according to established rules.
In language, words combine to form phrases, which join to form larger phrases, which in turn combine to make sentences.
Similarly, in music, notes combine to form phrases, which connect to form larger phrases, and so on.
Thus, to understand either language or music, listeners must infer the structure of the passages that they hear, using rules they have assimilated through experience.

In addition, speech has a natural melody called prosody.
Prosody encompasses overall pitch level and pitch range, pitch contour (the pattern of rises and falls in pitch), loudness variation, rhythm and tempo. Prosodic characteristics often reflect the speaker’s emotional state. When people are happy or excited, they frequently speak more rapidly, at higher pitches and in wider pitch ranges; when people are sad, they tend to talk more slowly, in a lower voice and with less pitch variation.
Prosody also helps us to understand the flow and meaning of speech.
Boundaries between phrases are generally marked by pauses, and the endings of phrases tend to be distinguished by lower pitches and slower speech. Moreover, important words are often spoken at higher pitches. Interestingly, some pitch and timing characteristics of spoken language also occur in music, which indicates that overlapping neural circuitries may be involved.

In 2009, medical anthropologist Kathleen Wermke of the University of Würzburg in Germany and her colleagues recorded the wails of newborn babies—which first rise and then fall in pitch—who had been born into either French- or German-speaking families.
The researchers found that the cries of the French babies consisted mostly of the rising portion, whereas the descending segment predominated in the German babies’ cries. Rising pitches are particularly common in French speech, whereas falling pitches predominate in German. So the newborns in this study were incorporating into their cries some of the musical elements of the speech to which they had been exposed in the womb, showing that they had already learned to use some of the characteristics of their first language.

When parents speak to their babies, they use exaggerated speech patterns termed motherese that are characterized by high pitches, large pitch ranges, slow tempi, long pauses and short phrases.
These melodious exaggerations help babies who cannot yet comprehend word meanings grasp their mothers’ intentions. For example, mothers use falling pitch contours to soothe a distressed baby and rising pitch contours to attract the baby’s attention. To express approval or praise, they utter steep rising and falling pitch contours, as in “Go-o-o-d girl!” When they express disapproval, as in “Don’t do that!” they speak in a low, staccato voice.

the melody of the speech alone, apart from any content, conveys the message.

… but after the six months of instruction, the children who had taken music lessons outperformed the others. Musically trained children may thus be at an advantage in grasping the emotional content—and meaning—of speech.

music lessons can improve the ability to detect emotions conveyed in speech (presumably through a heightened awareness of prosody).

that the language we learn early in life (e.g., English, Vietnamese) provides a musical template that influences our perception of pitch.

pitch range of speech

Vietnamese and Mandarin are tone languages (words take on entirely different meanings depending on the tones with which they are spoken)


September 10, 2014


Beat boxing

‘Er’ cautions listeners to stay on side

‘Er’ cautions listeners to stay on side
28 May 2002 | Nature

‘Uh’ and ‘um’ send information to listeners just like proper words, say Herbert Clark of Stanford University, California, and his colleague Jean Fox Tree at the University of California, Santa Cruz.

English speakers lob in ‘um’ before a long pause and ‘uh’ in front of a brief hiatus, the analysis revealed. People even create compounds such as ‘the-um’ or ‘and-uh’, says Clark, showing that speakers know that there is going to be a problem after the word even before they begin it.

‘Uh’ and ‘um’ are commonly thought just to fill a pause or prevent interruptions.

gap-signaling words

Public speakers learn to suppress umming and erring, hiding moments of uncertainty.
For example, there’s not a single ‘um’ or ‘uh’ in any of the recorded inaugural addresses made by US presidents between 1940 and 1996.

see also:


  • linguistics
  • conversation
  • speech
  • um
  • uh