QUANTITATIVE ANALYSIS OF FUNDAMENTAL FREQUENCY IN SPANISH (L2) AND BRAZILIAN PORTUGUESE (L1): EVIDENCE OF LEARNING AND LANGUAGE ATTRITION

This paper analyzes the intonation of Spanish and Brazilian Portuguese (BP) produced by monolingual speakers of both languages and by bilingual BP speakers who had lived in Spain for 6 years on average. Bilinguals produced data in both Spanish L2 (BL2) and BP L1 (BL1). Speech materials are sentences in different modalities (declaratives, yes-no and wh-questions) and reading styles (isolated sentences and storytelling). Fundamental frequency (f0) contours were analyzed to assess learning in Spanish L2 and language attrition in the L1 production of bilinguals. Variability in the f0 contours of the four language conditions was gauged by means of three indices (peak rate, peak range and global standard deviation). Dynamic time warping (DTW) distances between pairs of f0 contours were also computed as a way to measure within- and between-language differences in intonation patterns. The main results are: 1) BL2 and BL1 contours are significantly more variable than the monolingual ones both quantitatively and qualitatively; 2) BL2 contours partially converge towards the patterns of Spanish monolinguals, showing that there is learning; 3) there is evidence for language attrition in the form of transfer of Spanish patterns to BP contours produced by bilinguals; 4) learning and attrition levels differ depending on sentence modality, such that learning is greater in modalities that differ less between BP and Spanish and attrition is greater in modalities that differ the most.


Introduction
Intonation conveys many different communicative functions (1)(2)(3)(4)(5). In this paper we focus on the modality function and analyze differences between fundamental frequency (henceforth f0) contours in declarative sentences, yes-no questions and wh-questions. We compare Spanish as a second language (L2) with Spanish as a mother tongue (L1), and Brazilian Portuguese (henceforth BP) as a mother tongue spoken by monolingual Brazilian speakers (henceforth MP) with BP spoken by bilingual Brazilian speakers (henceforth BL1). We will look for evidence of learning in the production of L2 speakers by comparing it to that of monolingual Spanish speakers, and for evidence of language attrition by comparing the bilinguals' L1 production to that of the control monolingual BP speakers. In the following subsections of this introduction, we discuss in detail the topics that we bring together in this study. Firstly, we present the most relevant details of how monolingual Spanish and BP speakers implement sentence modality in terms of intonation (sections 1.1 and 1.2). We also present results of studies that deal with Brazilian bilinguals learning Spanish L2 intonation (section 1.3). Then, we review the literature on language attrition on which our interpretation of the results is based (section 1.4). Lastly, we lay out the goals of the present study and the research questions we investigate (section 1.5).

Intonation of Spanish L1
There are many studies on the intonation of declarative and interrogative sentences in Castilian Spanish (6)(7)(8)(9)(10). Based on the Autosegmental-Metrical (AM) theory, Estebas-Vilaplana and Prieto (10) described the basic intonation patterns of Castilian Spanish. The authors analyzed many different sentence modalities, but we focus here on their analysis of the intonation of broad-focus declaratives, yes-no questions and wh-questions. The authors transcribe declaratives with an L+>H* pitch accent in pre-nuclear position, indicating that the rising movement develops within the stressed syllable and culminates in the post-stressed syllable. The contour then progressively descends throughout the rest of the sentence and ends at a low value. The final pitch accent does not show any relevant pitch movement; it is just part of the progressive descent in f0. Since the low f0 is reached during the last stressed syllable, this final pattern is described by an L* pitch accent followed by a boundary tone L%. Yes-no questions are described as having a pre-nuclear L*+H accent. The nuclear accent shows an f0 drop which, according to the authors, can be interpreted as an L* pitch accent. The final rising movement is described as HH% given the rapid rising movement at the end of the sentence. Finally, the authors observed two different intonational patterns in nuclear position in wh-questions: a falling contour (L*L%) or a rising contour (L*HH%), the latter expressing "a nuance of interest and greater speaker involvement in the speech act" (10, p. 35). In pre-nuclear position, the pitch accent is the same (H*, aligned to the interrogative pronoun) for both types of final contour.
Henriksen (11) carried out an experimental study devoted exclusively to wh-questions. His study aimed at identifying how many different f0 contours would be produced in peninsular Spanish. Six Spanish speakers from León took part in the production experiment. The experiment consisted of producing wh-questions under two different conditions: (1) a short task-based dialogue, which the author called a person identification task, and (2) a sentence reading task.
The author found four different f0 contours: (1) final rise (an f0 peak at the offset of the first stressed syllable of the sentence, a gradual fall throughout the last stressed syllable, a nuclear valley and a boundary rise); (2) nuclear circumflex (either a tonal plateau that starts at the onset of the sentence and lasts until the beginning of the nuclear stressed syllable, or a slightly higher initial tonal level followed by a falling movement until the final position; in both cases, there is a rise-fall movement aligned to the stressed syllable of the last content word); (3) global falling (the highest f0 peak is aligned to the initial wh-word, and the initial rise is followed by a gradual descent throughout the sentence until a valley at the nuclear stressed syllable; in some cases the f0 continues its descent throughout the final syllable and does not fall to a valley); (4) nuclear falling (a rise to an f0 peak on the wh-word, a plateau that lasts until the nuclear stressed syllable, then a falling movement and a tonal valley at the end of the sentence). The author showed that there was a significant effect of speech style on the use of different contour types. In the reading task, 70% of occurrences were final rises, 13% global falling, 10% nuclear circumflex and 7% nuclear falling contours. In the person identification task, there was much less variability: 57% of occurrences were final rises and 43% nuclear circumflex contours.

Intonation of Brazilian Portuguese L1
As is the case for Spanish L1, the intonation of declarative and interrogative sentences in BP has also been the object of extensive research (12)(13)(14)(15)(16)(17)(18). According to Frota and Moraes (18), declarative sentences have one pitch accent per prosodic word in pre-nuclear position, and non-final stressed syllables typically exhibit a rising melody (L+H). The nuclear contour, represented by an H+L* pitch accent, surfaces as a movement that starts at a peak and develops into a fall that extends through the stressed syllable towards a low target. The nuclear accent in yes-no questions presents a circumflex or rise-fall contour (L+H*L%) that surfaces as a rising movement through the final stressed syllable, with a late alignment of the f0 peak, followed by a falling movement in the post-stressed syllables (13). According to Frota and Moraes (18), this complex rise-fall final contour is truncated in sentences that end in words with final stress. Lastly, in wh-questions the pre-nuclear contour has an extra-high initial f0 peak aligned to the wh-word (H+H*). Then, there is a gradual falling movement over the following syllables up to the last stressed syllable in the sentence. The nuclear contour is the same as in declarative sentences, followed by a low boundary tone (H+L*L%) (13).

Intonation of Spanish L2
The first studies of the intonation of declarative and interrogative sentences in Spanish as an L2 spoken by Brazilians analyzed mainly the productions of Brazilians who learned Spanish in Brazil. Most of these studies used the AM theory to analyze Spanish L2 intonation, typically describing the pre-nuclear and nuclear pitch accents in Spanish L2 and comparing them with those of Spanish L1, especially the peninsular variety (19)(20)(21)(22). In more recent research, the L2 intonation of Brazilians has been analyzed from different theoretical perspectives; see, for instance, (23)(24)(25)(26).
Sá (19) analyzed the intonation of Spanish as a Foreign Language (SFL) and of Spanish L1 in read sentences and in jokes. Two Brazilians and one Spaniard participated in the study. She observed that the pitch accents in SFL and Spanish L1 were similar only in declaratives. In yes-no questions, the native speaker produced the initial L*+H pitch accent and, for the most part, the L*H% final pitch accent, whereas learners used H*+H or H*+L tones in initial position and L+H*L% in final position. In wh-questions, the native speaker and the learners produced similar patterns in pre-nuclear position but differed in final position, the native speaker using a high tone and the learners a low tone.
Oliveira (23) studied L2 Spanish intonation using a model proposed by Cantero Serena (28). She analyzed data from twelve Brazilians who lived in Barcelona and Valladolid. The data were recorded during an interview in which participants talked about their experiences with the Spanish language and culture. She analyzed 152 declaratives, 28 yes-no questions and 15 wh-questions. Regarding the final contour of the declarative sentences, the author observed that 45% had a suspended melody (slight fall), 38% a neutral melody (steeper fall) and 17% an emphatic melody (circumflex-type contour). In yes-no questions, 57% of the cases had the final Spanish L1 contour (ascending), 33% the BP contour (circumflex) and 8% a suspended contour. For wh-questions, the contour type distribution observed was: 52% emphatic, 33% suspended and 20% neutral. The fact that the author investigated spontaneous speech might explain the presence of more than one contour form for each sentence modality, in contrast to many of the previously cited studies, which mostly analyzed read speech.
Unlike studies that analyze transfer of contour features from BP to SFL as an all-or-nothing phenomenon (19,20), Oliveira's study points to a gradient variability in the production of the f0 contours in Spanish L2. In each of the modalities analyzed, the author observed the occurrence of three types of final contours, which highlights the variability present in L2 production. This variability was also observed by Silva (26), who analyzed the production of 15 Brazilians who lived in Madrid. She analyzed the data using the PENTA framework (1). Her goal was to determine if the values for the PENTA model parameters inferred from the Spanish L2 f0 contours in her corpus were more similar to those of Spanish L1, to those of BP L1, or if they have their own characteristics. In order to do that, the author analyzed three communicative functions: prominence, boundary and modality (declarative, yes-no and wh-questions). Figure 1 shows f0 contours over the prominent phonological word in final position in a declarative sentence in the three language conditions: Spanish L1, BP L1 and Spanish L2 (26). We can see that Spanish L1 speakers produce a smoother fall towards the low boundary. BP L1 contours, in comparison, tend to show sharper final falls. Spanish L2 productions show variability: it is possible to see sharper falls similar to those typical in BP L1 and also a good number of smoother falls similar to what is seen for Spanish L1.
Further, her results showed that in prominent phonological words in continuative contours the PENTA model parameter height was higher in Spanish L2, indicating that Spanish L2 contours tend, ceteris paribus, to be globally shifted upwards in comparison to their Spanish L1 counterparts. This effect can be seen in Figure 2, adapted from Silva (25, p. 167), which shows contours of the same Spanish wh-question produced by Spanish L1 and Spanish L2 speakers. Visual inspection shows that contours by both male and female Spanish L2 speakers tend to be produced at a slightly, although noticeably, higher register and to show greater f0 modulation than the corresponding Spanish L1 contours. This contour height difference may result in L2 intonation being perceived by L1 listeners as having greater prominence than the monolingual intonation, although this hypothesis was not instrumentally tested in her work. Silva's data present evidence of L2 learning, since some of the participants are able to reproduce aspects of Spanish L1 f0 contours, although learning was not complete, since there is evidence of L1 transfer in the data as well. One result in Silva's data that the author did not anticipate in her initial hypotheses was that the L1 production of bilingual speakers showed more variability in the implementation of tonal patterns than expected from the traditional description of BP L1 intonation. Some BP L1 contours produced by BP bilinguals presented patterns similar to the ones observed in Spanish L1, hinting at the occurrence of L2 transfer to L1. Based on this result, it was suggested that this could be the result of language attrition. As Silva's study did not include BP monolingual speakers who could be used as a control condition, this hypothesis could not be properly tested.
One of the aims of the present study is to add this control group to the speech material studied by Silva (26) in order to further explore the possible occurrence of language attrition.

Language attrition
The phenomenon of language attrition can be defined as a "non-pathological, non-age related, structural change of an L1 within a late bilingual, assuming that the acquisition of the L1 precedes this change" (28, p. 3). Regarding the manifestation of attrition, there has been a debate in the history of the field about which particular changes in L1 could actually be considered manifestations of language attrition, as opposed to short-lived, performance-related changes (see footnote 4). Since the emergence of language attrition as a research field, the theoretical understanding of the phenomenon has evolved significantly. In current accounts, this division appears to have been replaced by a consensus view that attrition "is not [...] an 'extreme' form of development, experienced by a small minority of bilinguals with decades of no or very little exposure, but a form of language development experienced from the early stages of L2 development and thus, in all likelihood, common to all bilinguals" (30, p. 3), supporting the conclusion, stated elsewhere, that "all instances of L2-to-L1 transfer should therefore be considered attrition phenomena" (31, p. 5) regardless of time frame or considerations on how "deep" or "shallow" the changes in L1 are.
Footnote 4. Chang (30), for instance, advances a proposal to distinguish language attrition from what he calls phonetic drift. In his view, what distinguishes drift from attrition is that the former happens in the short term and arises from recent exposure to the L2, whereas attrition persists past the decline of exposure to the L2. A criticism that could be leveled against Chang's distinction is that the author provides no objective definitions of "recent" and "long-term", and the author himself recognizes that "few studies [...] have actually tracked L1 learners through alternations in language environment, with the result that it is often unclear whether observed L1 changes are short-term, long-term, or medium-term (i.e., reversible, but not quickly or easily) and, more problematically, impossible to tease apart the effect of L2 exposure from the role of L2 acquisition, L2 knowledge, or L2 use" (29, p. 202).
In a recent overview of the field, Kupske (33) points out that one common scenario in which changes in L1 are triggered by the development and dominance of other languages is when there is "a change in the linguistic environment, such as migratory processes" (32, p. 101, our translation). This is the situation of all bilingual participants in the data sample we analyze here (see section 2.1 for sociodemographic information on the participants). Migration, however, is not the only possible scenario conducive to language attrition. It has been shown that attrition can occur in L1-dominant contexts as well (34).
Studies on attrition at the phonetic and phonological levels currently concentrate on the acoustic analysis of consonants and vowels, and there are still few studies on such influences in prosody. The studies by Flege (35), Major (36), Chang (37), Chang (38) and Kupske (39), for example, analyzed the production of consonants and/or vowels in an L2-dominant context. There are also works on the production of consonants and vowels in L1-dominant contexts (34)(40)(41)(42)(43)(44)(45).
Studies on attrition in prosody are few (29,46). Mennen (46) studied intonation in native Dutch speakers who learned Greek as an L2 and looked for evidence of attrition caused by Greek in L1 Dutch. Dutch has long and short vowels, and f0 peaks in pre-nuclear position align differently depending on vowel length. Greek, on the other hand, does not contrast vowel length. Results show that the difference between long and short vowels was reduced in the production of Dutch bilinguals. As a consequence of this quantity merger, the f0 alignment difference is lost. Leeuw et al. (29) looked for evidence of attrition in L1 German caused by English in bilinguals. The authors looked for differences in the alignment of f0 peaks in pre-nuclear position in both languages. Peaks align later in German in this context when compared to British English. The authors hypothesized that attrition would cause German bilinguals to align peaks earlier in pre-nuclear position under the influence of the typical English pattern. Results show partial evidence for language attrition: although there is no statistically significant difference in the alignment of the peak in German bilinguals, the valley preceding the peak has a significantly earlier alignment than in the German monolingual control group.

Study goals and research questions
Most of the previous literature on L2 Spanish intonation learning and cross-language influences bases its analysis on the Autosegmental-Metrical theory (47,48) and the ToBI intonational phonological notation derived from that theory (19)(20)(21)(22)(46). However, we consider that a discrete prosodic annotation such as ToBI may not be fully capable of capturing the variability in L2 intonation (23,25,26). The studies mentioned in section 1.3 have shown that, although there is transfer of f0 contour features from L1 to L2, it does not happen systematically among all learners and does not affect all types of sentence modality in the same way (declarative sentences seem to be more easily acquired than interrogatives, for instance). In addition, a notation based on discrete tones would have difficulty explaining gradient variability in f0 contours, that is, the presence of contour features that occur in neither L1 Spanish nor L1 BP but that are present in L2 Spanish (26). Also, choosing a ToBI-based system as a notation device to describe f0 contours poses an important problem: what set of tones should be used to annotate Spanish L2 contours, those recognized by Sp-ToBI or the ones employed in ToBI-like systems developed for BP, such as Lucente (15)? Although at first it would seem obvious to answer with the first alternative, i.e. Sp-ToBI, what should be done when a contour feature from BP gets transferred to an L2 contour? Should a "BP tone" be used? That seems to be at odds with the phonological nature of a ToBI-inspired analysis.
The same conundrum arises in possible cases of cross-language influences: if a ToBI set of tones especially chosen to represent BP intonation is used to describe the L1 production of bilinguals, what should be done when a Spanish contour feature intrudes in the BP L1 production? Should a Sp-ToBI tone be "borrowed"?
Besides the problem of how best to describe contour features in bilinguals' productions, another issue with the previous literature has to do with what parts of the contours are analyzed. AM-based analyses in most cases only look at pre-nuclear and nuclear pitch accents. Such a practice will fail to capture differences that modify the contour as a whole. For all these reasons, in this work we follow a different approach. Instead of analyzing discrete tonal configurations, we apply the Dynamic Time Warping (DTW) technique as a way of obtaining an objective measure that helps us quantify overall differences among f0 contours produced under the different conditions. DTW is a technique used to find the best possible alignment between two temporal sequences, indicating how much one of them would have to be warped in order to become identical to the other. It thus offers a measure of similarity (or distance) between the two sequences, such that the distance has a low value when the two sequences are equal or very similar to each other and grows larger the more dissimilar they are, i.e., when many changes would have to be applied to one of them to make it identical to the other (49)(50)(51)(52)(53)(54). The advantage of this technique in the context of comparing pairs of f0 contours is that it provides a measure of similarity between two time series regardless of the alignment between the sequences, that is, the analyst does not need to select landmark points in advance in the contours being compared. This is a useful feature, both because it makes the analysis less time-consuming and because it makes few theoretical assumptions about the contents of the contours. DTW seems to be a good alternative to labeling f0 contours with discrete phonological tones because it can reduce to a single number the differences that spread over the whole of the two contours being compared.
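To make the idea of a DTW distance concrete, the following is a minimal sketch of the textbook dynamic-programming formulation, using the absolute difference between samples as local cost. It is not the implementation used in this study: the function name `dtw_distance`, the local cost, the absence of any path constraint or length normalization, and the toy contours are all illustrative assumptions.

```python
from math import inf


def dtw_distance(a, b):
    """Textbook DTW distance between two sequences of f0 values.

    Assumption: local cost is the absolute difference between samples;
    no windowing or length normalization is applied.
    """
    n, m = len(a), len(b)
    if n == 0 or m == 0:
        raise ValueError("both contours must contain at least one sample")
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # extend the cheapest of the three allowed predecessor alignments
            cost[i][j] = local + min(cost[i - 1][j],      # a advances alone
                                     cost[i][j - 1],      # b advances alone
                                     cost[i - 1][j - 1])  # both advance
    return cost[n][m]


# Hypothetical contours (e.g. in semitones), for illustration only
rise = [0.0, 1.0, 2.0, 3.0]
slow_rise = [0.0, 0.0, 1.0, 2.0, 2.0, 3.0]  # same shape, stretched in time
fall = [3.0, 2.0, 1.0, 0.0]

print(dtw_distance(rise, slow_rise))  # same shape at a different tempo
print(dtw_distance(rise, fall))       # different shape
```

In this toy example the stretched rise can be warped onto the original rise at no cost, whereas the falling contour cannot, which is the property that makes the measure insensitive to timing differences but sensitive to contour shape.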
This study aims at comparing the f0 contours of declarative sentences, yes-no questions and wh-questions produced by Brazilian speakers of Spanish L2 with those produced by L1 Spanish speakers, and also at comparing BP L1 contours produced by monolingual and bilingual Brazilian speakers. For this we used the DTW technique as a tool to calculate a measure of similarity between pairs of contours. To better understand the source of DTW distances, we performed an acoustic analysis of the f0 contours in order to calculate three f0 estimators: peak rate, peak range and global standard deviation. Regarding the DTW analysis of f0 contours in this study, we have three main research questions, stated here as predictions:
(1) The biggest DTW distances observed in the study will be those involving pairs of contours spoken by monolingual Spanish (MS) and monolingual BP (MP) participants, since the two languages have different intonational patterns and monolingual speakers of one language have no knowledge of the intonational patterns of the other. We also predict that this result will be modulated by a modality effect, such that the DTW distance will be bigger for modalities where the intonational patterns of Spanish and BP differ more, especially interrogatives, as described earlier in sections 1.1 and 1.2.
(2) If bilingual speakers have learned the L2 patterns to a high degree, DTW distances between their contours and those produced by Spanish monolingual speakers will tend to approach the between-speaker distance variability seen in the Spanish monolingual sample. On the other hand, if bilingual speakers' L2 contours show signs of L1 transfer, i.e., contours have a mix of L1 and L2 features, DTW distances between pairs of monolingual Spanish contours and bilingual L2 contours will tend to be lower than the ones observed for monolingual Spanish and monolingual BP pairs, but higher than the internal (between-speaker) variability of both monolingual Spanish and BP speakers.
(3) If successful learning of L2 f0 patterns by bilingual speakers does not have an impact on bilinguals' L1 performance, then DTW distances between pairs of monolingual BP contours and bilinguals' L1 contours will not differ significantly from the between-speaker distance variability seen in the BP monolingual sample. If learning the L2 gives rise to systematic changes in the production of the L1, then bilinguals' L1 contours will have features of both L1 and L2, and DTW distances between pairs of monolingual BP contours and bilingual L1 contours will be higher than the internal (between-speaker) variability of monolingual BP speakers, but lower than distances between pairs of monolingual Spanish and monolingual BP contours.
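The logic shared by predictions (2) and (3) can be summarized as locating a mean DTW distance relative to two reference values. The sketch below is only an illustration of that ordering: the function name `interpret_distance`, the `tol` parameter and the numbers are hypothetical, and a plain comparison of means stands in for the significance testing that the predictions actually call for.

```python
def interpret_distance(pair_mean, within_mean, cross_language_mean, tol=0.0):
    """Locate a mean DTW distance relative to two reference values.

    pair_mean: mean distance for the pairs of interest (e.g. MS vs. BL2)
    within_mean: between-speaker distance inside the monolingual group
    cross_language_mean: mean distance for MS vs. MP pairs
    tol: hypothetical tolerance standing in for a significance test
    """
    if pair_mean <= within_mean + tol:
        return "within monolingual variability"
    if pair_mean < cross_language_mean:
        return "intermediate: mix of L1 and L2 features"
    return "as distant as the two monolingual groups"


# Hypothetical mean distances, for illustration only
print(interpret_distance(pair_mean=4.1, within_mean=4.0,
                         cross_language_mean=9.0, tol=0.2))
print(interpret_distance(pair_mean=6.5, within_mean=4.0,
                         cross_language_mean=9.0))
```

Under prediction (2), the first outcome for MS versus BL2 pairs would indicate a high degree of learning and the second would indicate L1 transfer; under prediction (3), the second outcome for MP versus BL1 pairs would be compatible with attrition.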

Speakers and experimental procedure
Fifteen native speakers of Brazilian Portuguese (10 women and 5 men) who speak Spanish as an L2, five native speakers of Spanish (3 women and 2 men) and five native speakers of Brazilian Portuguese (3 women and 2 men) who have never studied Spanish took part in the experiment. None of them were paid for their participation. All 25 subjects, Brazilian and Spanish, reported having no hearing or phono-articulatory problems.
The fifteen Brazilian speakers are from São Paulo State, college-educated, aged between 27 and 48 years (mean age 35 years), and they learned Spanish after the age of 18. All lived in Madrid at the time of the recordings. The length of residence in Madrid ranged from six months to 16 years (mean residence time is 6 years). Ten Brazilian speakers studied Spanish in Brazil before moving to Spain. Even though all participants declared themselves proficient in Spanish, linguistic factors such as age of L2 learning, age of arrival in Spain, length of residence in Spanish-speaking countries and frequency of BP use in Madrid, as well as social factors such as motivation to emigrate and personal experience with the country's culture, may affect participants' performance in the L2. It was not possible to match participants on all of these linguistic and social factors, but a sociolinguistic interview was conducted with the participants to better understand their linguistic and social experience with Spanish as L2. The interview data are summarized in Silva (25, p. 206-208). In this study, though, we do not correlate social factors with the production patterns. Because of the relative heterogeneity of participants' language experience, it is possible that learning level varies in the group. This aspect will be discussed in more detail in section 4. We consider this group to be bilingual, and they provided data for conditions BL1 (Brazilian Portuguese L1 production) and BL2 (Spanish L2 production). The definition of a bilingual speaker varies across authors and theoretical perspectives. The studies of Conxita Lleó, for example, analyze L1 acquisition in bilingual contexts (55). Her studies focus especially on the simultaneous acquisition of two languages (Spanish together with another language).
On the other hand, models such as the revised Speech Learning Model proposed by Flege and Bohn (56) consider a bilingual speaker to be a person who already has an established L1 before learning an L2 in a naturalistic context. In this case the second language learning process is not simultaneous but sequential. We adopt this second perspective, so that "bilingual" is a shorthand term for speakers who learned an L2 mostly as adults living in a foreign country where that language is spoken.
The five monolingual Spanish speakers contributed data for the MS condition (Spanish L1 production). Three speakers are from Madrid, one from Segovia and another from Ciudad Real (both cities are close to Madrid). All of them are college-educated, aged between 22 and 33 years (mean 28 years). They have never studied Portuguese as a foreign language.
Both bilingual Brazilian and monolingual Spanish speakers were recorded at two different moments. The first recording took place in a soundproof room at the phonetics laboratory of the CCHS-CSIC (Centro de Ciencias Humanas y Sociales, Consejo Superior de Investigaciones Científicas) in Madrid. We recorded directly on a desktop computer using the Adobe Audition 1.0 software, with an AKG C444 headset microphone. The audio files were sampled at a 44,100 Hz rate and saved in WAV format in one channel. We collected the data from 13 subjects in this way. The second recording took place in each subject's house, where we collected the data from the remaining seven subjects. We recorded directly on the computer using Praat software, with a Behringer B-2 PRO microphone and a Focusrite Scarlett 2i2 sound card. The audio files were sampled at a 44,100 Hz rate and saved in WAV format in one channel.
The five monolingual native BP speakers are from São Paulo State in Brazil, and they provided data for the MP condition (Brazilian Portuguese L1 production). All of them are college-educated, aged between 19 and 42 years (mean 25 years). They have never studied Spanish as a foreign language. The recording sessions took place in a room at the phonetics laboratory of the UFSCAR (Universidade Federal de São Carlos) in São Carlos, Brazil. We recorded directly on the computer using Praat software (57), with a Samson C01 microphone and a Behringer UMC202HD sound card. The audio files were sampled at a 44,100 Hz rate and saved in WAV format in one channel. We collected the data from all five subjects in this way.
Subjects read an excerpt from the story of Don Quijote adapted for children (58) and further modified by Silva (26). The excerpt chosen, which has 720 words, was the beginning of the chapter entitled Gigantes con aspas, which tells the classic episode of Don Quijote fighting against the windmills. After reading the whole excerpt, subjects read 39 sentences taken from the story (15 declaratives, 12 yes-no questions and 12 wh-questions) in isolation and in random order.
The Portuguese translation of the Spanish excerpt has 685 words. The translation preserved the position of the lexical stress of the final word of each sentence and, whenever possible, the same number of words per sentence, the same syntactic and prosodic structure, and the segmental content of the Spanish excerpt. Despite these structural constraints, Brazilian and Spanish participants considered the texts easy to read.
Monolingual speakers performed the experiment in one language only (Spanish for monolingual Spanish speakers and Portuguese for monolingual Brazilian speakers). Bilingual speakers performed the experiment in two languages (Spanish and Portuguese). First, they performed it in Spanish and then in Portuguese. Speakers were instructed in the language in which the task would be performed. The recording session consisted of reading aloud the text two times and reading each isolated sentence three times. First, they were instructed to read silently and comment on what they thought about the text and if it was difficult to read.
Subjects were instructed to start reading the text aloud only when they felt comfortable doing so. In addition, they were instructed to repeat any sentence that, in their opinion, had not been properly pronounced. After reading the text, they read the isolated sentences. These sentences were displayed in a Microsoft Office PowerPoint presentation. The 39 test sentences were presented one at a time in random order, and the participant had to press a button for the next sentence to be shown. Finally, they were asked to tell the story read earlier in their own words. The recording sessions took about 20 minutes per speaker in each language. A complete list of the sentences in the corpus and their translation into Brazilian Portuguese can be found in Silva (26).

Corpus
The corpus consists of parallel speech productions by three groups: native monolingual Spanish speakers (MS), native monolingual Brazilian Portuguese speakers (MP) and native Brazilian Portuguese speakers who live in Madrid and have Spanish as L2; this last group contributes two sets of recordings: Spanish as L2 (BL2) and Brazilian Portuguese as L1 (BL1). The four types of productions are considered levels of a variable henceforth referred to as LANGUAGE condition. For all the analyses reported here, both read text and sentences read in isolation were included but not analyzed separately due to the already complex design of the study. The three types of sentence modalities are levels of a variable we will refer to as MODALITY. As mentioned in section 2.1, there are a total of 39 sentences: 15 declarative sentences and 12 of each type of interrogative (yes-no and wh-question). Thus, in MS and MP there are nominally 975 sentences (39 sentences × 5 repetitions × 5 speakers) per language, and in BL2 and BL1 there are nominally 2,925 sentences (39 sentences × 5 repetitions × 15 speakers) per language. Of these 7,800 nominal productions, a total of 7,651 were analyzed: 963 (MS) + 960 (MP) + 2,876 (BL1) + 2,852 (BL2).
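The corpus arithmetic above can be checked with a few lines of Python. This is only a bookkeeping sketch: the dictionary names are illustrative, and the counts are the ones reported in the text.

```python
# Bookkeeping sketch for the corpus size; all figures come from the text.
SENTENCES = 39      # 15 declaratives + 12 yes-no + 12 wh-questions
REPETITIONS = 5     # 2 readings of the text + 3 readings in isolation

speakers = {"MS": 5, "MP": 5, "BL1": 15, "BL2": 15}
analyzed = {"MS": 963, "MP": 960, "BL1": 2876, "BL2": 2852}

# Nominal number of productions per language condition.
nominal = {cond: SENTENCES * REPETITIONS * n for cond, n in speakers.items()}

total_nominal = sum(nominal.values())
total_analyzed = sum(analyzed.values())

for cond in speakers:
    print(cond, nominal[cond], analyzed[cond])
print("nominal:", total_nominal, "analyzed:", total_analyzed)
```

The difference between the nominal and the analyzed totals corresponds to productions that did not enter the analysis.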

F0 extraction procedure
The procedure to extract the f0 contour for each audio sample in the dataset relied on two Praat scripts. The first script (59) implements a two-pass algorithm suggested by de Looze and Hirst (60) that optimizes the selection of values for the floor and ceiling parameters used by the autocorrelation algorithm underpinning Praat's "To Pitch" function. This optimization reduces extraction errors such as octave doubling and halving. A second script (61) was run on each contour in order to look for errors not eliminated by the first script. This script goes through all voiced samples in a contour and flags as potential extraction errors two consecutive voiced samples that differ by more than a frequency threshold and are less than a duration threshold apart. Both frequency and duration thresholds are defined by the user. The values we used were 0.5 octaves for the frequency threshold and 80 ms for the duration threshold; they were selected on a trial-and-error basis so as to maximize true positives (true errors) and minimize false positives (falsely identified errors). All f0 samples flagged as errors by the script were later manually checked and corrected if necessary.
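The flagging rule used by the second script can be illustrated with a minimal Python sketch. It is not the Praat script (61) itself: the function name and the input layout (parallel lists of times and f0 values for voiced samples only) are assumptions made for illustration; only the two thresholds come from the text.

```python
import math

def flag_octave_jumps(times_s, f0_hz, max_octaves=0.5, max_gap_s=0.080):
    """Flag pairs of consecutive voiced samples that look like extraction errors.

    times_s and f0_hz hold voiced samples only (hypothetical input layout).
    Returns the index i of every pair (i, i + 1) whose f0 values differ by
    more than max_octaves while being less than max_gap_s apart in time.
    """
    flagged = []
    for i in range(len(times_s) - 1):
        gap = times_s[i + 1] - times_s[i]
        jump = abs(math.log2(f0_hz[i + 1] / f0_hz[i]))  # distance in octaves
        if jump > max_octaves and gap < max_gap_s:
            flagged.append(i)
    return flagged

# Toy contour: an octave doubling at the third sample, and a large drop after
# a long unvoiced stretch (not flagged, because the gap exceeds 80 ms).
times = [0.00, 0.01, 0.02, 0.03, 0.20]
f0 = [200.0, 202.0, 404.0, 205.0, 100.0]
print(flag_octave_jumps(times, f0))
```

Flagged samples are only candidates; as stated above, each one was checked manually before being corrected.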
After the extraction and correction of the contours, a third script was used to perform the following procedures: smooth the contours using an 8 Hz bandwidth to minimize micromelodic effects, apply linear interpolation over unvoiced sections of the contour and, lastly, convert the contour values from Hertz to semitones relative to the mean f0 value in Hz of each contour processed. After these steps, the script saved each contour as time-value pairs in tab-separated raw text files for further processing.
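The interpolation and semitone conversion steps can be sketched as follows. This is a simplified illustration, not the script used in the study: smoothing is omitted, frames are assumed to be evenly spaced, unvoiced frames are represented as None, leading and trailing unvoiced frames are filled with the nearest voiced value, and the reference mean is taken over the interpolated contour. All of these are assumptions, as are the function names.

```python
import math

def interpolate_unvoiced(f0_hz):
    """Linearly interpolate over None (unvoiced) frames, assuming even spacing."""
    voiced = [i for i, v in enumerate(f0_hz) if v is not None]
    if not voiced:
        raise ValueError("contour has no voiced frames")
    out = list(f0_hz)
    for i, v in enumerate(f0_hz):
        if v is not None:
            continue
        left = max((j for j in voiced if j < i), default=None)
        right = min((j for j in voiced if j > i), default=None)
        if left is None:            # leading gap: copy first voiced value
            out[i] = f0_hz[right]
        elif right is None:         # trailing gap: copy last voiced value
            out[i] = f0_hz[left]
        else:                       # interior gap: straight line between neighbours
            w = (i - left) / (right - left)
            out[i] = f0_hz[left] + w * (f0_hz[right] - f0_hz[left])
    return out

def hz_to_semitones(f0_hz, ref_hz):
    """Convert Hz to semitones relative to ref_hz: 12 * log2(f / ref)."""
    return [12.0 * math.log2(v / ref_hz) for v in f0_hz]

contour = interpolate_unvoiced([100.0, None, None, 130.0, 200.0])
mean_hz = sum(contour) / len(contour)
print(hz_to_semitones(contour, mean_hz))
```

Expressing each contour relative to its own mean removes differences in overall pitch level between speakers, so that later comparisons reflect contour shape rather than register.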

F0 statistical estimators
According to the description presented in Section 1, intonational differences between Brazilian Portuguese and Spanish in the three modalities studied here stem in part from the types of tones typically used to convey each modality and even peak height. Intonational differences between languages result in contours that can differ in peak height, peak density and overall contour variability. We decided to collect three parameters for each f0 contour in our corpus: peak rate, peak range and contour standard deviation with the help of a Praat script (62). The script measures peak rate as the number of f0 peaks per second. Prior to peak counting, the raw f0 contour undergoes heavy smoothing using a 2 Hz bandwidth and quadratic interpolation over unvoiced periods. All local maxima in the smoothed, interpolated f0 contours are included when counting peaks. Calculating peak rate instead of just giving a peak count is a way to normalize the measure and compensate for the fact that the sentences to be analyzed may have different lengths. Peak range is defined as the median excursion (measured as the valley-to-peak distance) of all peaks present in the contour spanning at least 0.5 semitones. Standard deviation (SD) takes into account all the voiced samples in the contour. For the SD calculation, the contour was smoothed using a bandwidth value of 4.5 Hz and values in Hertz were converted to semitones relative to 1 Hz.
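The three estimators can be made concrete with a short Python sketch operating on an already smoothed, interpolated contour in semitones. It is an illustration under stated assumptions, not the Praat script (62): the valley of a peak is taken to be the closest preceding local minimum (or the contour onset when there is none), plateaus are ignored, and the sample standard deviation is used. Function names are hypothetical.

```python
import statistics

def local_extrema(values):
    """Return indices of strict local maxima and minima (plateaus are ignored)."""
    peaks, valleys = [], []
    for i in range(1, len(values) - 1):
        if values[i - 1] < values[i] > values[i + 1]:
            peaks.append(i)
        elif values[i - 1] > values[i] < values[i + 1]:
            valleys.append(i)
    return peaks, valleys

def peak_rate(values, duration_s):
    """Number of f0 peaks per second (all local maxima are counted)."""
    peaks, _ = local_extrema(values)
    return len(peaks) / duration_s

def peak_range(values, min_excursion_st=0.5):
    """Median valley-to-peak excursion of peaks spanning at least 0.5 semitones."""
    peaks, valleys = local_extrema(values)
    excursions = []
    for p in peaks:
        earlier = [v for v in valleys if v < p]
        start = earlier[-1] if earlier else 0   # assumption: fall back to onset
        rise = values[p] - values[start]
        if rise >= min_excursion_st:
            excursions.append(rise)
    return statistics.median(excursions) if excursions else None

def contour_sd(values):
    """Standard deviation over all samples of the contour (sample SD assumed)."""
    return statistics.stdev(values)

contour_st = [0.0, 2.0, 1.0, 1.3, 0.5, 3.5, 1.0]   # toy contour, 2 s long
print(peak_rate(contour_st, 2.0), peak_range(contour_st), contour_sd(contour_st))
```

In the toy contour, the small bump at 1.3 semitones counts towards peak rate but is excluded from peak range because its excursion is below the 0.5 semitone threshold.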
We consider that the three selected parameters can be seen as a coarse-grained way to characterize the contour features that may drive the DTW distances between pairs of f0 contours, especially when comparing conditions where we expect to find the greatest differences (see the research questions outlined at the end of section 1).

DTW analysis
The R package DTW (53) was used to obtain DTW distances for pairs of f0 contours. In the present study, the two contours in a pair can be repetitions of the same sentence uttered by one speaker or renditions of the same sentence uttered by two different speakers.
We interpret DTW distance results as follows: a normalized distance equal to zero means that the two contours being compared are identical. Thus, the closer the normalized distance is to zero, the more alike the two contours are, and the further it is from zero, the more different they are. In Figures 3 and 4, we present two examples of sentences analyzed using the algorithm and their respective normalized distance values. In Figure 3 we have the f0 contours of the yes-no question ¿Aquellos?, produced by two Spanish speakers (MS), and in Figure 4 by a Spanish and a Brazilian speaker (BL2). In Figure 3, we observe that the two f0 contours are very similar. This similarity is reflected in a low normalized distance of 0.21. In Figure 4 we have a contour produced by a monolingual Spanish speaker (the same contour represented by the black line in Figure 3) and one produced by a bilingual Brazilian speaker in Spanish L2. There is little correspondence between the two contours. The native Spanish contour presents a final rise, whereas the bilingual speaker produces a circumflex pattern that ends in a descending movement. The cumulative alignment differences between the two contours cause a roughly sixfold increase in the normalized distance (1.25).
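To make the notion of normalized DTW distance concrete, the following pure-Python sketch computes it for two contours. It is a didactic stand-in, not the R package's implementation: it assumes an absolute-difference local cost, a symmetric step pattern in which diagonal steps are weighted twice, and normalization by the summed lengths of the two contours. The source does not state which step pattern or normalization was used, so these choices are assumptions.

```python
def dtw_normalized(a, b):
    """Normalized DTW distance between two sequences of f0 values.

    Assumed recursion: D[i][j] = min(D[i-1][j] + d, D[i][j-1] + d,
    D[i-1][j-1] + 2*d), with d = |a[i] - b[j]|; the total is divided
    by len(a) + len(b).
    """
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = min(cost[i - 1][j] + d,          # stretch b
                             cost[i][j - 1] + d,          # stretch a
                             cost[i - 1][j - 1] + 2 * d)  # advance both
    return cost[n][m] / (n + m)

rise = [0.0, 1.0, 2.0, 3.0]
slow_rise = [0.0, 0.0, 1.0, 1.0, 2.0, 2.0, 3.0, 3.0]   # same shape, half speed
fall = [3.0, 2.0, 1.0, 0.0]

print(dtw_normalized(rise, rise))       # identical contours
print(dtw_normalized(rise, slow_rise))  # same shape, different timing
print(dtw_normalized(rise, fall))       # opposite shapes
```

The second comparison shows why DTW suits intonation data: two contours with the same shape but different timing are warped onto each other at no cost, whereas a rise compared with a fall yields a clearly positive distance.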
Pairs of f0 contours were analyzed in two groups:
• Within-language: both contours in the pair come from the same language condition. This group is further divided into:
o Within-speaker: both contours in the pair come from the same speaker;
o Between-speaker: each contour in the pair comes from a different speaker.
• Between-language: contours in the pair come from different language conditions.
The within-language group allows us to estimate how variable contours are in a given language condition due to two factors: the same speaker implementing contours of the same modality type in different ways (within-speaker group) and different speakers implementing contours differently (between-speaker group). The between-language group allows us to compare how similar or different contours from different language conditions are.
Specifically, comparing MS to BL2 can estimate learning and comparing MP to BL1 can be a proxy for attrition. Lastly, comparing BL1 to BL2 can estimate how the bilingual speakers separate their L1 and L2 productions. The total number of combinations of all pairs of utterances of all speakers in the three modalities (declaratives, yes-no and wh-questions) was 567,800, according to Tables 1 and 2.

• Statistical estimators extracted for each f0 contour in the corpus: peak rate, peak range and contour standard deviation (see section 2.3.2 for a definition of each).
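The grouping of contour pairs can be sketched in Python as follows. The record layout (dictionaries with language, speaker, sentence and rep keys) and the speaker labels are hypothetical and serve only to illustrate the classification; the sketch assumes that only renditions of the same sentence are paired.

```python
from collections import Counter
from itertools import combinations

def pair_group(c1, c2):
    """Assign a pair of contours to one of the analysis groups."""
    if c1["language"] != c2["language"]:
        return "between-language"
    if c1["speaker"] == c2["speaker"]:
        return "within-language/within-speaker"
    return "within-language/between-speaker"

# Hypothetical records for renditions of a single sentence.
contours = [
    {"language": "MS", "speaker": "s1", "sentence": 7, "rep": 1},
    {"language": "MS", "speaker": "s1", "sentence": 7, "rep": 2},
    {"language": "MS", "speaker": "s2", "sentence": 7, "rep": 1},
    {"language": "BL2", "speaker": "b1", "sentence": 7, "rep": 1},
]

counts = Counter(
    pair_group(c1, c2)
    for c1, c2 in combinations(contours, 2)
    if c1["sentence"] == c2["sentence"]   # pair only the same sentence
)
print(counts)
```

Note that, under this scheme, a BL1 and a BL2 contour from the same bilingual speaker fall into the between-language group, since the language condition is checked first.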
• DTW normalized distance, calculated for pairs of f0 contours (see section 2.3.3 for details on the procedures).
For the statistical estimators analysis, we performed two-way ANOVA tests with language and modality as independent variables, including their interaction. Separate tests were carried out for each statistical estimator (peak rate, peak range and contour standard deviation). Pairwise independent t-tests were used to conduct post hoc comparisons among modality levels within each level of language or to compare the same modality level across two language levels. Bonferroni correction was applied to p-values in the pairwise comparisons in order to control the familywise error rate. Cohen's d was used to determine effect size.
DTW data were also analyzed by means of two-way ANOVA tests with language and modality as independent variables plus their interaction, followed by selected post hoc comparisons done through Bonferroni-corrected pairwise t-tests. Analysis was conducted in two groups, as explained in section 2.3.3:
• Within-language: contours from the same language condition. Further divided into:
o Within-speaker: contours from the same speaker;
o Between-speaker: contours from different speakers.
• Between-language: contours from different language conditions.
All descriptive statistics and statistical tests were carried out within the R statistical computing environment (63). From the rstatix (64) R package we used the cohens_d function, which provides both a d value and a verbal label to describe the effect size, ranging from "negligible" to "large". We used both in the analysis.
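The effect-size and correction steps can be illustrated with a small Python sketch. It is not the rstatix implementation: the pooled-standard-deviation form of Cohen's d is assumed here (the source does not say which variant was used), and the label thresholds (0.2, 0.5, 0.8) are chosen to be consistent with the labels reported in the results (for instance, d = 0.17 negligible, 0.36 small, 0.5 moderate, 0.83 large). Function names are hypothetical.

```python
import math
import statistics

def cohens_d(x, y):
    """Cohen's d for two independent samples, using the pooled SD (assumed)."""
    nx, ny = len(x), len(y)
    pooled_var = ((nx - 1) * statistics.variance(x)
                  + (ny - 1) * statistics.variance(y)) / (nx + ny - 2)
    return (statistics.mean(x) - statistics.mean(y)) / math.sqrt(pooled_var)

def magnitude(d):
    """Verbal label for |d|, with thresholds matching the labels in the text."""
    size = abs(d)
    if size < 0.2:
        return "negligible"
    if size < 0.5:
        return "small"
    if size < 0.8:
        return "moderate"
    return "large"

def bonferroni(p_values):
    """Bonferroni adjustment: multiply each p by the number of tests, cap at 1."""
    k = len(p_values)
    return [min(1.0, p * k) for p in p_values]

d = cohens_d([2.0, 4.0, 6.0], [1.0, 3.0, 5.0])
print(d, magnitude(d))
print(bonferroni([0.01, 0.04, 0.5]))
```

With samples as large as those in the DTW analysis, even tiny differences reach significance, which is why the results rely on effect size labels rather than on p-values alone.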

Results
In section 3.1 we describe the effect of both LANGUAGE and SENTENCE MODALITY variables on the three f0 dispersion estimators. Then, in section 3.2 we present the DTW analysis results for each language and modality (both within- and between-speaker) and compare the DTW results between language pairs.
We start by presenting the results of the main ANOVA analysis run to test for main effects and interactions in the overall analysis that includes the four levels of LANGUAGE (MP, BL1, MS and BL2) and the three levels of MODALITY (declarative sentences, yes-no and wh-questions). For all three dependent variables there are significant effects of both LANGUAGE and MODALITY as well as a significant interaction between the two. Since the two independent variables yield significant effects, next we explore the results in detail. First, we look at possible effects of LANGUAGE between selected pairs of languages and then the effects of MODALITY. After that, we explore interactions between the two.
Table 3 presents comparisons between three language pairs: MS and MP, MS and BL2, and MP and BL1. In the first pair we expect to find the largest differences, since we have two different languages with different intonational patterns spoken by monolingual speakers, i.e., speakers that have no knowledge of the other language; the second pair is where we can expect to find some degree of learning, i.e., BL2 converging towards MS in terms of the dependent variables measured in the experiment; the third pair is the one where deviations in the values of the dependent variables can constitute evidence for attrition caused by the process of learning Spanish, giving rise to deviations in the prosodic parameters from BP towards Spanish.
Looking at MS-MP comparison in Table 3, we have evidence that Spanish and BP spoken by monolinguals differ significantly in terms of the contour features captured by the three f0 variability measures studied. Results show that f0 contours of Spanish as L1 have higher peak rate, peaks that span a broader range and greater overall variability when compared to their BP counterparts, regardless of sentence modality. Results presented in MS-BL2 comparison show that Brazilian speakers of Spanish as L2 converge towards the values of Spanish spoken by natives; although there are statistically significant differences between mean values for both language conditions in the three acoustic parameters, the effect sizes are small or negligible as can be seen by examining Cohen's d values. As for MP-BL1 comparison, results point to the fact that speaking Spanish as L2 has a significant effect on Brazilian Portuguese speakers' performance on their L1, since there are statistically significant effects on the three acoustic parameters; notably, the drift in the three acoustic parameters is directed towards the values typical of Spanish; these results can be interpreted as language attrition, since the performance of Brazilian speakers in their L1 seems to be affected by L2.
Overall, the analysis confirms the expectation that the greatest effect sizes are found when comparing f0 contours in BP and Spanish as L1. It also shows that BP speakers do change their prosodic patterns towards the L2, an observation supported by the fact that the effect sizes when comparing BL2 and MS are small or negligible on the three acoustic parameters. Finally, we observe that Brazilian speakers who speak Spanish as L2 do present performance differences in their L1 when compared to their monolingual counterparts: their contours drift away from the patterns of BP speakers who do not speak Spanish towards the patterns of the L2. In terms of effect sizes, the attrition effects are intermediate between what is seen in the MS-MP and MS-BL2 pairs.
Table 4 shows mean values of peak rate, peak range and SD as a function of sentence modality (all languages pooled). Mean values for the three modalities are significantly different for the three acoustic parameters: peak rate (all p < 0.001), peak range (p < 0.02 or less) and SD (all p < 0.001). Pairwise values of effect size are shown in Table 5. Peak features (rate and range) share a pattern: declaratives and wh-questions are similar (negligible to small effect size) and yes-no questions have higher mean peak rate and peak range (effects are moderate to large). When it comes to SD, we see a ladder pattern: declaratives on the bottom rung, closely followed by yes-no questions, and wh-questions at the top. Summing up the results concerning modality comparisons, declarative sentence contours have an intermediate value of peak rate and the lowest values of peak range and SD; yes-no questions have the highest peak rate and range and an intermediate SD value; wh-questions have the lowest peak rate, an intermediate value of peak range and the highest SD value. As can be seen in Figure 5, the high overall mean SD of wh-question contours is driven by MP, BL1 and BL2, not MS. This interaction is further explored in Table 6.
Interactions between languages and modalities were explored by examining, for different language pairs (lines in Table 6), how the three sentence modalities differ in terms of the three acoustic correlates. Since most pairwise comparisons yielded statistical significance (22 out of 27), we base our conclusions mostly on effect size as a proxy measure for distance between the language pair for a given acoustic correlate and sentence modality.
• MP-MS: declarative sentences have the least differences, with small and moderate effect sizes for peak rate and peak range and a large effect for SD (MS values are greater than MP); yes-no questions have similar peak rate (small effect size) and large differences for the other two variables; wh-questions present large differences in peak rate and peak range (MS > MP) and a small difference in SD (MP > MS; wh-question is the modality for which MP has the greatest mean SD value).
• MS-BL2: the three modalities are rather similar; effect sizes are small with the exception of contour SD in wh-questions, where there is a moderate difference (BL2 > MS).
• MP-BL1: negligible or small effect size for peak rate in all three modalities; large (declarative) and moderate effect sizes (yes-no and wh-question) for peak range; large effect size for SD in all three modalities. Effects are all in favor of BL1, showing that BL1 deviates from MP towards targets more compatible with MS.
Summing up, the interaction analysis shows that the largest differences are seen in the MP-MS pair, as could be expected, since both are languages spoken by monolingual speakers. Also as would be expected from the contrastive description of modality intonation presented in Sections 1.1 and 1.2, the largest differences are seen in the two interrogative modalities. The results also show that there is a great deal of learning by BP speakers, since BL2 and MS comparisons yield small effect sizes regardless of sentence modality, indicating that BL2 speakers are able to modify their contours to match to a good degree those of the target language (MS). Lastly, results also point to the fact that the performance of BP bilingual speakers in their native language (BL1) shows attrition. The effect is rather uniform among the three modalities, and contour SD is the most affected acoustic parameter, followed by peak range (two moderate and one large effect size); peak rate is unaffected or shows a significant but small-sized effect (wh-question).
In order to show in visual terms the contour patterns described by the preceding statistical analysis, Figure 6 shows time-normalized f0 contours of BP and Spanish sentences as a function of modality and language. Starting with MP and MS, we can see the intonational features described in sections 1.1 and 1.2 for both languages: i) declarative contours have a similar pattern in both languages, comprising a number of peaks, roughly one for each phonological word, but differing in how the final low tone aligns at the boundary (later for BP and earlier for Spanish); ii) yes-no questions in BP start with a high or extra-high peak, followed by a low section, ending in a circumflex tone; in Spanish, the recurrent features are a pre-final circumflex tone, immediately followed by a high or extra-high rising contour; iii) wh-questions in BP have a descending pattern that starts at a high or extra-high point and drifts down towards the final boundary; in some cases the contour does not present peaks, just a smooth glide towards the low final target; in Spanish, as noted in Section 1.1, there are at least two distinct patterns: the contour can start with an optional peak and it can end either with a circumflex or a rising tone; in the contours shown, there is no connection between the presence of a peak at the start and the two possible final tones.
The visual examination of the contours (Figure 6) agrees with the statistical analysis: Spanish contours tend to have more peaks, peaks that span a greater range and overall greater variability. Also, the differences seem more visible in interrogative sentences, both yes-no and wh-questions. Comparing BL2 and MS, it is immediately visible that BL2 contours show a great deal of variability, irrespective of sentence modality. This is in agreement with the statistical results that show learning, since MS contours are overall more variable than MP. Given this systematic difference, it can be said that successful learning of MS intonation by BP natives must include the magnification of contour parameters such as peak rate, peak range and overall variability (of which SD can be seen as a proxy).
Turning to the interrogative modalities, we can see that BL2 contours show evidence of both learning and L1 transfer. Looking at yes-no questions, we see that BL2 contours incorporate features that are crucial to MS intonational grammar: the pre-final circumflex tone followed by a final rising tone ending in extra-high values, although some of the speakers skip the final rising tone and end the contour with a circumflex tone. L1 transfer can be seen in the initial high peak that is absent in MS contours, but present in MP. Similar remarks can be made about wh-question contours: BL2 incorporates a key feature of the MS contour, namely a final tone that can be either circumflex or rising (again, those are not seen in MP contours); L1 transfer is also seen, in the form of initial peaks that tend to be high or extra-high, as is typical in MP. Especially for wh-questions, there seems to be some overshooting in terms of SD in BL2 (note the moderate effect size in favor of BL2 reported in Table 6).
Comparing now BL1 and MP (and keeping BL2 contours in the background), the results suggest that our data confirm the prediction that there is an interaction between learning and the occurrence of attrition. The overall (modality-independent) greater contour variability that bilingual BP speakers had to learn in order to speak Spanish properly bleeds back into their L1 production. As reported when discussing the data in Table 6, SD and peak range are the most affected parameters. Focusing on yes-no questions, we see that BL1 contours preserve MP's high or extra-high initial peak, but we also see circumflex tones that are higher and are aligned earlier than what is typical in MP, and final extra-high rising tones that are totally absent in MP; for this modality in particular, BL1 and BL2 contours seem very similar. Lastly, looking at wh-questions, we also see a kind of hybrid pattern: a falling contour that starts at a high or extra-high point as in MP, but instead of smoothly going down and ending on a low final tone, we see a number of cases of final circumflex and even final rising tones that are completely absent in MP contours but are typical of MS; for this modality, BL1 and BL2 have different patterns, showing less L2 transfer to the L1.

DTW analysis
Sections 3.2.1 and 3.2.2 present the main statistical results separately by language group. In Section 3.2.1 we present the within-language analysis and in 3.2.2 the between-language analysis.

Within-language analysis
In this section we present the statistical analysis of effects of the independent variables LANGUAGE and MODALITY on DTW distance between contour pairs taken from the same language. Mean DTW values are presented as a function of language condition and sentence modality in Tables 7 and 8, respectively. In both cases, the results are also presented as a function of group (within- and between-speaker). Step-like pattern: declarative > yes-no questions > wh-questions, small-size effects. Figure 7 shows mean DTW distance as a function of both sentence modality and language condition and also group.
Looking at the interaction of language and modality, the first and most obvious result is that DTW distances are smaller in the within-speaker than in the between-speaker group. Two other observations can be made by looking at Figure 7: (1) languages spoken by monolinguals (MP and MS) show smaller mean distances than the others (BL1 and BL2), regardless of group, and (2) wh-questions have bigger distances compared with the other modalities. The latter observation does not hold for monolingual BP speakers, for whom there is no significant difference in either comparison group. For the other three languages, wh-questions have bigger mean distances than the other two modalities, regardless of group. MS is the language in which the difference between wh-questions and the other modalities is biggest: the median effect size is moderate (d = 0.58) for within-speaker and large (d = 0.83) for between-speaker.
MP and MS have similar levels of mean distance in both within- and between-speaker groups, except for wh-questions. The languages spoken by monolinguals have a low level of variability at within- and between-speaker levels for declarative sentences and yes-no questions. The heightened level of DTW distance seen for MS in wh-questions is due to the variability shown in Figure 6, where it is possible to see that contours start with either a high or a level tone and end either with a final rising or a circumflex pattern. Since within-speaker distance is also higher for wh-questions than for other modalities, this is evidence that the same speaker will vary in the contour implementation.
When it comes to BL1 and BL2, all three modalities hover at similar levels of mean distance in both within- and between-speaker groups. Declarative sentences and yes-no questions have lower mean distance values than wh-questions, similar to what happens with MS. The overall higher levels seen for BL1 and BL2 in comparison to MP and MS can be explained by the fact that the contours in those languages have a hybrid character, as pointed out in Section 3.1: because of attrition, some BL1 contours can present tones that are typical of MS while others remain very much like the ones in MP; because there is (incomplete) learning, some BL2 contours emulate the tones of MS but some still transfer MP tonal structure to the second language. This variability contributes to inflating DTW distances for BL1 and BL2 in general and for wh-questions even more.

Between-language analysis
In this section we present the statistical analysis of effects of the independent variables LANGUAGE PAIR and MODALITY on DTW scores. Figure 8 shows mean values of pairwise DTW scores as a function of both variables. We start by presenting the results of the main ANOVA analysis run to test for main effects and interactions in the overall analysis that includes the four levels of the LANGUAGE PAIR variable (MP-MS, MP-BL1, MS-BL2 and BL1-BL2) and the three levels of MODALITY. Since all independent variables yield significant effects, next we explore the results in detail. First, we look at possible effects of LANGUAGE PAIR between selected pairs of languages and then the effects of MODALITY. After that, we explore interactions between the two.
Mean DTW distance (SD in parentheses) for each language pair are listed below. The six pairwise comparisons yield significant differences (all p < 0.001) except for MP-BL1 compared to MS-BL2. The pair of languages spoken by monolingual speakers (MP-MS) has the lowest mean distance and the pair involving the BP bilinguals speaking their native language and Spanish as L2 (BL1-BL2) has the highest mean distance. The other two pairs are in-between. Effect sizes range from negligible to small (Cohen's d ranging from 0.0006 to 0.32).
Two factors may be the cause of higher DTW distances involving pairs that have BL1 or BL2 as a member: (1) higher overall variability in f0 contours for BL1 and BL2 compared to MP and MS, as can be seen in Figure 5, and (2) the hybrid intonation patterns seen in the time-normalized contours in Figure 6 for BL1 and BL2. Mean DTW distance for each modality is listed below. Results of pairwise comparisons are: declarative/yes-no questions (p < 0.001, d = 0.17, negligible), declarative/wh-questions (p < 0.001, d = 0.5, moderate) and yes-no/wh-questions (p < 0.001, d = 0.36, small). Declarative sentences are the modality with the smallest DTW distance, followed by yes-no questions and finally wh-questions. This step-like pattern can be seen in Figure 8.
We can explain why BL1-BL2 and MP-BL1 are the pairs that have the highest mean DTW in declaratives and yes-no questions by bearing in mind the results described in Section 3.2.1 about the variability in contours. Starting with BL1-BL2, this is a pair where both languages are highly variable: BL1 contours show both L2 influence and faithful L1 performance; similarly, BL2 contours show both evidence for learning and L1 transfer. Since both languages in the pair present high between-speaker variability, this favors higher DTW distances. Let us now consider the MP-BL1 and MS-BL2 pairs; in both cases, one of the elements in the pair is a language with high between-speaker variability (BL1 or BL2). The fact that MP-BL1 has significantly higher mean distances than MS-BL2 may be considered evidence that cases of attrition in BL1 contours are more common than retention of native performance and that BL2 contours where features learned from L2 are present are more numerous than those showing transfer from L1. Lastly, the low mean values for the MP-MS pair may be explained by the fact that MP and MS both have low between-speaker variability, as shown by within-language DTW distances for declaratives and yes-no questions. Since monolingual speakers tend to be very consistent, f0 contour pairs for which one is taken from MP and the other from MS will tend to generate lower DTW distances; even though there are systematic differences between contours in both languages, they tend to be constant for all speakers.
Wh-questions present an altogether different pattern. MS-BL2 is the pair with the highest mean DTW distance, followed by BL1-BL2, in turn followed by MP-MS and MP-BL1 at the same level. The same principle behind the results for the other modalities seems to explain what is seen here. As pointed out earlier, BL2 is characterized by a high level of both within- and between-speaker variability; wh-questions in MS are special in the sense that this modality differs from the others in presenting significantly higher within- and between-speaker variability in within-language DTW distance; the combination of these two factors makes a good number of contours in the MS-BL2 comparison differ substantially, yielding higher DTW distances. The second highest value in the wh-question modality, for the BL1-BL2 pair, can be explained by the same reasons presented in the previous paragraph for this pair. The same holds for the low mean value associated with MP-MS: MP speakers have low within- and between-speaker variability in wh-questions, but MS speakers do not; this explains why MP-MS has one of the lowest means in the wh-question modality while, at the same time, wh-questions yield the highest mean for the MP-MS pair across the three modalities (all pairs p < 0.001). Again, the low variability of MP explains why MP-BL1 also has a low mean value within wh-questions; we can also hypothesize that BL1 has fewer contours affected by attrition in this modality than in the others.

Discussion and conclusion
In this section we review what we consider to be the main contributions brought by the study. Firstly, the use of a set of tools to describe f0 contours that do not rely on theory-dependent transcription systems that are language-specific; we consider this point to be important, given the inherently variable nature of L2 production. Secondly, the results present an objective and quantitative account of the highly variable nature of both Spanish L2 and BP L1 production of bilingual speakers. Lastly, what we see as the most original result is the evidence of L2 influence on the L1 prosodic production of Brazilian bilinguals, a phenomenon consistent with the definition of attrition we gave in section 1.4. There is also evidence of learning in the data, although this is not entirely new in the previous literature on Spanish L2 spoken by Brazilian learners. The novelty in this respect is that our participants are bilinguals who were living in an L2-dominant context, while most previous literature studies late bilinguals learning Spanish in formal settings while living in Brazil.
Three acoustic parameters (rate of f0 peaks, mean peak range and standard deviation of the whole f0 contour) were used as proxies for the overall variability of f0 contours in the four language conditions studied here. A detailed statistical analysis of these parameters is reported in section 3.1 and shows that there are significant effects of both language and sentence modality. A comparison of language pairs indicates that Spanish and BP spoken by monolinguals (MS-MP) is the pair for which the differences in parameter values are greatest (using effect size as a measure). This result corroborates hypothesis 1, stated at the end of section 1, as it seems reasonable to attribute differences in the acoustic parameters to differences in intonational phonology between the two languages. The MS-BL2 pair shows the smallest differences, suggesting a good deal of learning, as f0 parameters of bilinguals (BL2) come close to the values of the L2 target (MS). Results for the MP-BL1 pair show that the level of divergence in the f0 patterns is intermediate between those of the two other pairs, MS-MP and MS-BL2. We interpret this result as evidence that the successful learning process revealed by the close approximation between BL2 and MS may have interfered with the bilingual speakers' performance of their L1, evidenced by a greater degree of divergence between the bilinguals' L1 performance and the monolingual BP performance, a situation that we interpret as an instance of language attrition. Regarding the other two language pairs, MS-BL2 presents smaller distances on the three acoustic parameters; distances are not that different between modalities, which we interpret as evidence of similar levels of learning throughout modalities. In MP-BL1 we see bigger differences compared to the ones in MS-BL2. Considering the modalities, wh-questions yield bigger differences compared to the other two modalities.
Overall, the results suggest that, at least in our data, successful learning in bilinguals is associated with an increase in attrition, especially in wh-questions, the modality among the three studied here in which Spanish and BP differ the most. As Brazilian bilinguals strive to change their f0 contours to match the patterns of the L2, their native L1 production is warped towards the L2 patterns. We interpret this bidirectional influence as evidence in favour of the proposition in Flege and Bohn's revised Speech Learning Model (SLM-r) that the L1 and L2 share a common phonetic space in bilinguals and, as such, compete for shared cognitive resources. Furthermore, the L2 Intonation Learning theory (LILt) model, proposed by Mennen (65), also seems to be relevant to explain the results presented here as we pointed out earlier in this section.
Results of contour distance as measured by the DTW technique, reported in section 3.2, seem to partially confirm hypothesis (1) stated at the end of section 1. The mean DTW distance in the within-language group for monolingual speakers (MP and MS) is around 0.45 for all modalities in MP and for declaratives and yes-no questions in MS, the exception being wh-questions in MS (mean value of about 0.8); this exception is compatible with the observation made in section 1 about the well-documented variability of wh-question patterns in Spanish. When these figures are compared to the mean DTW distance for the MP-MS pair in the between-language condition (see Figure 8), we see that the values are always higher than 0.45: mean values go up from 0.5 (declaratives) to around 0.65 (yes-no questions) and then to almost 0.8 (wh-questions). Comparing the two sets of results, we can say that variability in DTW distance due to between-speaker differences within a monolingual sample (either MP or MS) is more or less constant (with the exception of wh-questions in MS, for which an explanation is presented in section 1.1), and that variability due to between-language differences, when each f0 contour in the pair comes from a monolingual sample (MP or MS), is always at a higher mean level and also varies as a function of sentence modality. These results suggest that DTW distances are sensitive to differences in intonation patterns between languages and also between modalities within each language. We say that the hypothesis was partially confirmed because the MP-MS pair is not the one yielding the highest mean DTW distances. What seems to be behind this result is not that BP and Spanish intonational patterns are less different than we thought at first, but that f0 contours produced by speakers in our monolingual samples (MP and MS) tend to be much more homogeneous (that is, they show less between-speaker variability) than contours produced by bilinguals, in either the BL1 or the BL2 condition.
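For readers unfamiliar with the technique, the following is a minimal sketch of a DTW distance between two f0 contours. It uses the textbook dynamic-programming formulation with an absolute-difference local cost; the normalization by the summed contour lengths and the function name `dtw_distance` are assumptions for this example, and the values it returns are not comparable to the distances reported in section 3.2.

```python
def dtw_distance(a, b):
    """Length-normalized DTW distance between two f0 contours (lists).

    Local cost: absolute difference between samples.
    Allowed steps: diagonal, vertical and horizontal, all with weight 1.
    Normalization by len(a) + len(b) is an assumption of this sketch.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(cost[i - 1][j],      # a advances
                                     cost[i][j - 1],      # b advances
                                     cost[i - 1][j - 1])  # both advance
    return cost[n][m] / (n + m)


# Same shape produced at a slower tempo: warping absorbs the difference.
same_shape = dtw_distance([1, 2, 3], [1, 2, 2, 3])
# Contours differing in level by one unit throughout.
shifted = dtw_distance([0, 0, 0], [1, 1, 1])
```

Because the alignment is free to stretch or compress time, two contours with the same shape but different durations yield a distance of zero, which is why DTW is described above as a holistic measure of contour shape. For the same reason, contours are usually normalized for speaker register (for instance by z-scoring) before comparison; that step is omitted here.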
Hypotheses (2a) and (2b) predicted that bilinguals' contours would lie in between MP and MS: in the BL2 condition, they would be closer to MS if L2 learning was predominant and closer to MP if L1 transfer was typical; in the BL1 condition, contours would be close to MP in most cases, but L2 traits would appear if cases of attrition were the norm. However, we tacitly assumed that speakers' behavior would be homogeneous. What the results show is that the behavior of bilinguals is significantly more variable than that of their monolingual counterparts, as can be seen in Figure 6. Inspection of time-normalized f0 contours of individual speakers indicates that in the BL2 condition the same speaker will produce some contours that are identical to the L1 pattern and others that are very similar to MS, the latter constituting instances of learning; conversely, in BL1, some speakers will produce contours that are faithful to the MP pattern and others that have the MS patterns, a situation we identify as evidence of attrition. Cases of contour "hybridization", where traits of both languages are present in the same contour, do happen, as can be seen in Figure 6. Inspection of Figure 6 suggests they are not the norm, but we have not quantified how prevalent they are. Contours produced by bilinguals are more variable not only qualitatively but also quantitatively. The most striking difference in this respect is contour standard deviation. As can be seen in Figure 5, BL1 has significantly higher SD values than MP in all modalities; BL2 has higher SD values than MS, especially in wh-questions.
The results of the DTW analysis allow us to conclude that the technique was useful as a descriptive device that helped us make sense of the f0 data, although some of our initial hypotheses were not confirmed. Despite their usefulness, DTW distances alone are not sufficient as an analytical tool. To better make sense of our data, we also had to time-normalize the f0 contours of specific sentences in order to visualize the alignment of f0 movements with specific words, something that the holistic nature of DTW does not allow.
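The kind of time-normalization referred to here can be sketched as follows. The example resamples each word of a contour to a fixed number of points by linear interpolation, so that contours of the same sentence produced at different tempos can be overlaid word by word. The function names, the use of sample indices as word boundaries and the number of points per word are assumptions made for illustration, not a description of the procedure actually used.

```python
def time_normalize(f0, n_points):
    """Resample a contour to n_points by linear interpolation."""
    if len(f0) < 2 or n_points < 2:
        raise ValueError("need at least 2 samples and 2 output points")
    last = len(f0) - 1
    out = []
    for k in range(n_points):
        pos = k * last / (n_points - 1)   # fractional index into f0
        lo = int(pos)
        hi = min(lo + 1, last)
        frac = pos - lo
        out.append(f0[lo] * (1 - frac) + f0[hi] * frac)
    return out


def normalize_by_word(f0, boundaries, points_per_word):
    """Time-normalize each word separately and concatenate the results.

    boundaries -- hypothetical list of sample indices marking word edges,
                  e.g. [0, 2, 4] for two words f0[0:2] and f0[2:4]
    """
    out = []
    for start, end in zip(boundaries[:-1], boundaries[1:]):
        out.extend(time_normalize(f0[start:end], points_per_word))
    return out


# Two words of two samples each, resampled to three points per word.
normalized = normalize_by_word([0, 10, 20, 40], [0, 2, 4], points_per_word=3)
```

After this step, point k of word w refers to the same relative position in every rendition of the sentence, which is what makes it possible to see where in a given word an f0 peak or valley falls.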
One important trait of the f0 contours produced by bilinguals is their great overall variability. This result could be explained by a bidirectional L1-L2 linkage, as proposed by the SLM-r model (56) to explain the learning of segmental features. According to the latest version of the model, there are interactions between the L1 and L2 phonetic subsystems, and such interactions occur because L1 and L2 sounds exist in a common phonetic space. As a result, the theory predicts that, as a bilingual develops an L2 category alongside a previously existing L1 category, the distance between the two in the shared phonetic space tends to be magnified in order to increase the likelihood of acquisition, especially if both categories are phonetically close. The theory predicts that this contrast-enhancing strategy leads bilinguals to produce both L1 and L2 imprecisely. A similar prediction is made by the LILt model, which deals specifically with the learning of intonation. Our data seem to corroborate this theoretical prediction: the cost of trying to maintain the contrasts between the intonation of BP and that of Spanish, which have some contour features in common, would be to produce both languages imprecisely. In our data, this less precise production led to increased variability in the f0 contours. Within the LILt model, this could be interpreted as a deviance in the realizational dimension.
Considering that the influence between L1 and L2 is bidirectional, the SLM-r model also allows for the possibility of language attrition. When L2 sounds are very similar to those of the L1, the substitution would go unnoticed by monolingual speakers of the target L2. This could be what happened, in part, with the intonation of declarative sentences: our results showed smaller DTW distances for declarative sentences than for both types of interrogatives. On the other hand, when there is an L2 sound pattern for which a new category was not formed, the model predicts that a compromise L1-L2 category will develop based on the combined distribution of sound patterns defining the L1 and L2 categories. This is what we observed in bilingual interrogatives. In yes-no questions, the bilinguals produce both final f0 contours, rising (from Spanish) and circumflex (from BP), in both Spanish L2 and BP L1. Similarly, in wh-questions the bilinguals produce an extra-high pitch accent on the interrogative pronoun (from BP) and final f0 contours with falling (from BP), rising (from Spanish) and circumflex (from Spanish) shapes in both Spanish L2 and BP L1. The fact that declaratives yield less deviance in L2 production can be explained in terms of the LILt model as arising from the fact that declaratives are phonologically more similar (called the systematic dimension in the model) in BP and Spanish than interrogatives, as explained in sections 1.1 and 1.2.
As mentioned in Section 1.4, there are only a couple of studies about the intonation of bilingual speakers that also discuss language attrition in their L1 intonation (29,46). As far as we know, no one has published on language attrition in the intonation produced by Brazilian bilingual speakers of Spanish L2, other than a mention of the possibility by Silva (26). For this reason, this study could make an important theoretical contribution to studies of language attrition in prosody. We consider that our results make a better case for attrition involving intonation than the ones previously reported in the literature. Compared to the results in Mennen (46), the attrition we observe is not caused by L2 influence on a confounding variable (vowel length, in her case). Compared to the results presented by Leeuw and colleagues (29), the influence of L2 on L1 we see in our data is more robust, affecting the choice of contour and not only the alignment of a tone.
As noted in Section 2.1, the speaker sample analyzed in this study is relatively heterogeneous in terms of time of residence in Spain, frequency of L1 usage and the social setting in which their use of L2 takes place. These factors may have an impact on the observed results and explain at least part of the variability observed in the bilingual production, both in L1 and L2. Future work should look at the data of individual speakers separately and explore the possible influence of speaker experience on the patterns observed here. In particular, it should try to establish correlations between individual speaker experience (length of residence, L1 and L2 frequency of use) and outcomes of learning and attrition.