Speech prosody: a methodological review


Yi Xu, University College London




Analysis-by-introspection, Analysis-by-transcription, Analysis-by-hypothesis-testing, Analysis-by-modeling, Predictive knowledge, Degrees of freedom


This critical review is mainly concerned with methodological issues in prosody research, with the aim of highlighting progress toward developing predictive knowledge about prosody. The review shows that methodological rigor has progressed steadily as the field has gone through major methodological trends that can be described as analysis-by-introspection, analysis-by-transcription, analysis-by-hypothesis-testing and analysis-by-modeling. All of these major methodologies still co-exist, and each still has its own merit. But all of them are evaluated in terms of their effectiveness in establishing generalizable knowledge. Finally, the review emphasizes the need for much more linking and integration between different subareas of prosody research.



Author Biography

Yi Xu, University College London

Professor of Speech Sciences at University College London.






How to Cite

Xu Y. Speech prosody: a methodological review. J. of Speech Sci. [Internet]. 2011 Jul. 1 [cited 2022 Jun. 29];1(1):85-115. Available from: https://econtents.bc.unicamp.br/inpec/index.php/joss/article/view/15014