Dynamic model of speech: intonation, rhythm and discourse in Brazilian Portuguese

Luciana Lucente

doi:10.20396/joss.v3i2.15045

Vol. 3 No. 2 (2013), Articles


Published 2021-02-05

Universidade Federal de Minas Gerais

Keywords

Phonetics
Speech prosody
Discourse and dialogue
Intonation

How to Cite

Lucente L. Dynamic model of speech: intonation, rhythm and discourse in Brazilian Portuguese. J. of Speech Sci. [Internet]. 2021 Feb. 5 [cited 2024 Jul. 26];3(2):21-62. Available from: https://econtents.bc.unicamp.br/inpec/index.php/joss/article/view/15045

Abstract

This article explores how intonational patterns relate to speech rhythm and discourse, within the dynamic systems research program. The study of these relationships was based on Barbosa’s (2006) Dynamic Model of Speech Rhythm; on the DaTo intonational annotation system proposed by Lucente (2008); and on the Computational Model of the Structure of Discourse proposed by Grosz & Sidner (1986). The Dynamic Model of Rhythm holds that speech rhythm results from the action of two oscillators, accentual and syllabic, which receive linguistic and gestural information as input and produce gestural duration as output. The hypothesis of this article is that, in addition to these oscillators, a glottal oscillator may act to control the intonation patterns of speech. These patterns, or intonational cycles, which organize the intonation of Brazilian Portuguese (BP), emerge when related to the segmentation of spontaneous discourse. For each discourse segment classified as spontaneous according to criteria proposed in this article, the speech is segmented in the DaTo system into linguistically structured units, which carry the purposes of communication and attention. Each of these segments is aligned with an intonation pattern delimited by a rising contour (LH or >HL) at the beginning and by a falling contour (LHL), or a boundary level (L), at the end. Speech rhythm is also aligned with the pattern formed between intonation and discourse. By including a new layer for stress-group segmentation in the DaTo system, it was possible to observe that the points where stress-group segmentation and intonational annotation align coincide with discourse segment boundaries. The alignment between intonation, rhythm and discourse, with stress groups acting as attractors, allowed us to propose the insertion of a glottal oscillator into the Dynamic Model of Rhythm.
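The boundary coincidence described in the abstract can be illustrated with a minimal Python sketch. It assumes that discourse-segment boundaries and stress-group boundaries are available as lists of time stamps in seconds, for example exported from annotation tiers. The function names, the 50 ms tolerance and the sample values are hypothetical illustrations, not the procedure or the data of the article.

```python
from typing import List, Tuple


def boundary_matches(
    discourse_bounds: List[float],
    stress_group_bounds: List[float],
    tolerance: float = 0.05,  # hypothetical tolerance in seconds
) -> List[Tuple[float, float, bool]]:
    """Pair each discourse boundary with its nearest stress-group boundary.

    Returns (discourse_time, nearest_stress_group_time, within_tolerance).
    """
    if not stress_group_bounds:
        raise ValueError("at least one stress-group boundary is required")
    results = []
    for d in discourse_bounds:
        # Nearest stress-group boundary by absolute time difference.
        nearest = min(stress_group_bounds, key=lambda s: abs(s - d))
        results.append((d, nearest, abs(nearest - d) <= tolerance))
    return results


def coincidence_rate(
    discourse_bounds: List[float],
    stress_group_bounds: List[float],
    tolerance: float = 0.05,
) -> float:
    """Share of discourse boundaries that fall on a stress-group boundary."""
    matches = boundary_matches(discourse_bounds, stress_group_bounds, tolerance)
    if not matches:
        return 0.0
    return sum(1 for _, _, ok in matches if ok) / len(matches)


# Invented time stamps, for illustration only.
discourse = [1.20, 2.85, 4.10]
stress_groups = [0.40, 0.82, 1.22, 1.70, 2.30, 2.84, 3.50, 4.30]

for d, s, ok in boundary_matches(discourse, stress_groups):
    print(f"discourse boundary {d:.2f} s, nearest stress group {s:.2f} s, coincides: {ok}")
print(f"coincidence rate: {coincidence_rate(discourse, stress_groups):.2f}")
```

A nearest-neighbour comparison with a fixed tolerance is only one simple way to operationalize "coincide"; the tolerance would need to be chosen to suit the annotation precision of the corpus at hand.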

References

Abner, N. (2009) Phrasing and Prominence: Extracting Prosodic Information from the BU Radio Corpus. Ms. UCLA.

Atterer, M. and Ladd, D. R. (2004) On the phonetics and phonology of "segmental anchoring" of f0: evidence from German. Journal of Phonetics 32, 177-197.

Barbosa, P. A. (2006). Incursões em torno do ritmo da fala. Campinas: Pontes.

__________ (2007) From syntax to acoustic duration: a dynamical model of speech rhythm production. Speech Communication 49 (1-2), 725-742.

__________ (2008) Prominence- and boundary-related acoustic correlations in Brazilian Portuguese read and spontaneous speech. Proceedings of the Speech Prosody 2008 Conference. Campinas.

__________ (2010) Automatic duration-related salience detection in Brazilian Portuguese read and spontaneous speech. In Proceedings of the Speech Prosody 2010. Chicago.

Barbosa, P. A., Arantes, P., Meireles, A. R., Vieira, J. M. (2005). Abstractness in Speech-Metronome Synchronisation: P-Centres as Cyclic Attractors. Proceedings of the Ninth European Conference on Speech Communication and Technology (Interspeech 2005), Lisbon, Portugal (1441-1444).

Beckman, M. E. (1996). A typology of spontaneous speech. In Y. Sagisaka, W. N. Campbell & N. Higuchi, eds., Computing Prosody, pp. 7-26 . New York: Springer-Verlag.

Beckman, M. E., Hirschberg, J., Pitrelli, J. F. (1994). Evaluation of Prosodic Transcription Labeling Reliability in the ToBI Framework. (Available at http://www.ling.ohiostate.edu/~tobi/ame_tobi).

Biber, D., Conrad, S. (2009) Register, genre and style. Cambridge University Press.

Boersma, P., Weenink, D. (2009) Praat: doing phonetics by computer (Version 5.1.05) [Computer program]. Retrieved May 1, 2009, from http://www.praat.org/

Botinis, A., Granström, B., Möbius, G. (2001) Developments and paradigms in intonation research. (4): 263-296.

Brown, G. (1983) “Prosodic structure and the given/new distinction”. In Ladd, D. R. & Cutler, A. (Eds.) Prosody: Models and Measurements, Springer Verlag, Berlin, p. 67-78.

Browman, C.P. & Goldstein, L. (1986) Towards an articulatory phonology. In C. Ewan and J. Anderson (Eds.) Phonology Yearbook 3. Cambridge: Cambridge University Press, p. 219-252.

Butterworth, B. (1975) Hesitation and semantic planning in speech. Journal of Psycholinguistic Research (4), p. 143-178.

Campbell, W.N. (1996) Synthesizing Spontaneous Speech. In Sagisaka. Y., Campbell, N., Higuchi, N. (Eds.) Computing Prosody. Computational Models for Processing Spontaneous Speech. New York: Springer-Verlag.

Chafe, W. (1976) “Givenness, contrastiveness, definiteness, subjects, topics and point of view”. In Li, C. (Ed.) Subject and Topic, Academic Press, New York, p. 25-55.

Chafe, W. L. (1979) The Flow of Thought and the Flow of Language. In Givon, T. (Ed.) Syntax and Semantics, Vol. 12, Discourse and Syntax. Academic Press, New York: 159-182.

Chafe, W.L. (1980) The Deployment of Consciousness in the Production of a Narrative. In Chafe, W.L., Ed., The Pear Stories: Cognitive, Cultural and Linguistic Aspects of Narrative Production. Vol. 3. Advances in Discourse Processes. Ablex Publishing Corp, Norwood, New Jersey: 9-50.

Clark, H. H., & Haviland, S. E. (1977). Comprehension and the given-new contract. In R. O. Freedle (Ed.), Discourse production and comprehension (pp. 1-40). Hillsdale, NJ: Erlbaum.

Cohen, P. R., and Levesque, H. J. (1980). Speech Acts and the Recognition of Shared Plans. In Proceedings of the 3rd Conference of the Canadian Society for Computational Studies of Intelligence, Victoria, B.C., p. 263-271.

Côrtes, P. O., Mittmann, M. M., Caetano, R. V. O., Mello, H. R.; Raso, T. (2011) A convergência entre anotadores na segmentação prosódica do corpus C-ORAL-BRASIL. In: Anais do III Colóquio Brasileiro de Prosódia da Fala. Belo Horizonte, p. 1-7.

Dogil, G., Braun, G. (1988) The PIVOT model of speech parsing. Vienna, Austria: Verlag.

Fernandes, F. (2007) Ordem, focalização e preenchimento em português: sintaxe e prosódia. Ph.D. Thesis. Unicamp, Campinas.

Fujimura, O. (2000) The C/D model and prosodic control of articulatory behavior. Phonetica 57, p.128-138.

Gravano, A. and Hirschberg, J. (2006) "Effect of Genre, Speaker, and Word Class on the Realization of Given and New Information," Proceedings of Interspeech 2006, Pittsburgh.

Grice, M., Ladd, D. R., Arvaniti, A. (2000) On the place of phrase accents in intonational phonology. Phonology 17: p. 143-185.

Grosz, B. J., and Sidner, C. L. (1986) "Attention, Intentions, and the Structure of Discourse", Computational Linguistics 12(3).

Hirschberg, J., and Litman, D. (1993) Empirical Studies on the Disambiguation of Cue Phrases, Computational Linguistics 19(3), p. 501–530.

Hasegawa-Johnson, Mark, Chen, Ken, Cole, Jennifer, Borys, Sarah, Kim, Sung-Suk, Cohen, Aaron, Zhang, Tong, Choi, Jeung-Yoon, Kim, Heejin, Yoon, Tae-Jin, & Chavarría, Sandra. (2005) Simultaneous Recognition of Words and Prosody in the Boston University Radio Speech Corpus. Speech Communication 46, p. 418-439.

Jubran, C. C. A. S. & Koch, I. G. V. (Org.). (2006). Gramática do português culto falado no Brasil. Volume 1: Construção do texto falado. Campinas: Editora da Unicamp.

Kelso, S. (1984). Phase transitions and critical behavior in human bimanual coordination. American Journal of Physiology: Regulatory, Integrative and Comparative 15.

_______ (1995). Dynamic Patterns. Cambridge: MIT Press.

Kohler, K. J. (1996) Modelling Prosody in Spontaneous Speech. In Sagisaka. Y., Campbell, N., Higuchi, N. (Eds.) Computing Prosody. Computational Models for Processing Spontaneous Speech. New York: Springer-Verlag.

Kohler, K. J. (2005) Timing and Communicative Functions of Pitch Contours. Phonetica 62, p. 88-105.

Kugler, P. N., Turvey, M. T. (1987). Information, natural law, and the self-assembly of rhythmic movement. Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.

Ladd, D. R. (1996) Intonational Phonology. Cambridge: Cambridge University Press.

Lakatos, I. (1978) Falsificação e Metodologia dos Programas de Investigação Científica. Lisboa: Edições 70.

Lehiste, I. (1970) Suprasegmentals. Cambridge: MIT Press.

Liberman, M. (1975) The intonational system of English. Ph.D. Thesis, MIT.

Lucente, L., Hirschberg, J., Barbosa, P. A. (2012) “The role of discourse structures and intonational features on determining the information status prominence” (to appear).

Lucente, L. (2008) DaTo: Um sistema de notação entoacional do português brasileiro baseado em princípios dinâmicos. Ênfase no foco e na fala espontânea. Master's thesis. Unicamp.

Lucente, L., Barbosa, P. A. (2010) The role of alignment and height in the perception of LH contours. Proceedings of Fifth Conference on Speech Prosody. Chicago.

_________ (2008) Narrow focus in Brazilian Portuguese: spatial and temporal constraints. Proceedings of Fourth Conference on Speech Prosody. Campinas.

_________ (2007) Notação Entoacional do Português Brasileiro em Corpora de fala Semi-Espontânea e Espontânea. Revista Intercâmbio 16.

Lucente, L. (2007) Abelhas, pessoas, zebras e caracóis: auto-organização na fala. Unpublished.

Marcus, S.M. (1981) Acoustic determinants of Perceptual-center (p-center) location. Perception and Psychophysics 30 (3), 247-256.

Moraes, J. A. (1998) “Intonation in Brazilian Portuguese”. In Hirst, D., Di Cristo, A. (eds.) Intonational Systems: a Survey of Twenty Languages. Cambridge. MIT Press.

Nakatani, C. H. (1996). Integrating prosodic and discourse modelling. In Y. Sagisaka et al. (Eds.) Computing Prosody, New York: Springer, p. 67-80.

Nakatani, C. H., Hirschberg, J. (1994) A Corpus-based study of repair cues in spontaneous speech, Journal of the Acoustical Society of America, 95-3: p. 1603–1616.

Ostendorf, M., Price, P., and Shattuck-Hufnagel, S. (1996) “Boston University Radio Speech Corpus”, Linguistic Data Consortium, Philadelphia.

Pierrehumbert, J. (1980) The Phonology and Phonetics of English Intonation. Ph.D thesis, MIT.

Polanyi, L. and Scha, R.J.H. (1986) “Discourse Syntax and Semantics”. In Polanyi, L., Ed., The Structure of Discourse. Ablex Publishing Co., Norwood, New Jersey.

Port, R. & van Gelder. T. (1995). It’s About Time: An Overview of the Dynamical Approach to Cognition. In R. Port, & T. van Gelder, (Eds.). Mind as motion: Dynamics, behavior, and cognition. Cambridge, MA: MIT Press.

Raso, T., Mello, H., (2012) C-ORAL-BRASIL I: Corpus de referência do português brasileiro falado informal. Belo Horizonte, UFMG-CNPq.

Rietveld, A. C. M., Gussenhoven, C. (1988) “On the relation between pitch excursion size and prominence”. Journal of Phonetics, Vol 13(3), 299-308.

Rosenberg, A., Hirschberg, J. (2009) Detecting Pitch Accents at the Word, Syllable, and Vowel Level, NAACL/HLT, Boulder, CO.

Saussure, F. et al. (2002) Curso de linguistica geral. 24. ed. São Paulo, SP: Cultrix.

Sagisaka, Y., Campbell, N., Higuchi, N. (1996) Computing Prosody: Computational Models for Processing Spontaneous Speech. New York, N.Y.: Springer.

Saltzman, E. L. (1995). Dynamics and coordinate systems in skilled sensorimotor activity. In Port, R. & van Gelder, T., (Eds.). Mind as motion: Dynamics, behavior, and cognition. Cambridge, MA: MIT Press.

Scherer, K. R. (1984) On the nature and function of emotion: a component process approach. In Scherer, K. R., Ekman, P. (Eds.) Approaches to emotion. Hillsdale, NJ: Lawrence Erlbaum, p. 293-318.

Silverman, K., M. Beckman, J. Pitrelli, M. Ostendorf, J. Pierrehumbert, J. Hirschberg, and P. Price (1992). TOBI: A Standard Scheme for Labeling Prosody. Proceedings of the International Conference on Spoken Language, Banff.

Sonntag, G. P. & Portele, T. (1998) Comparative evaluation of synthetic prosody with the PURR method. International Conference on Spoken Language Processing.

Tenani, L. E. (1996) Análise prosódica das inserções parentéticas no corpus do projeto da gramática do português falado. Master's thesis. Universidade Estadual de Campinas, Instituto de Estudos da Linguagem, Campinas.

Thelen, E. and Smith, L.B. (1994) A Dynamic Systems Approach to the Development of Cognition and Action, MIT Press.

Wong, S. W., Schreiner, C. E. (2003) Representation of CV-sounds in cat primary auditory cortex: intensity dependence. Speech Communication 41, p. 93-106.

Xu, Y. (1999). Effects of tone and focus on the formation and alignment of F0 contours. Journal of Phonetics 27, p.55-105.

______ (2005). Speech melody as articulatorily implemented communicative functions. Speech Communication 46, p. 220-251.

______ (2006). Speech prosody as articulated communicative functions. In Proceedings of Speech Prosody 2006, Dresden, Germany.

______ (2010) In defense of lab speech. Journal of Phonetics 38, p. 329-336.

This work is licensed under a Creative Commons Attribution 4.0 International License.
