PROSODY AS A RECURSIVE EMBEDDING TOOL IN PRODUCTION AND PERCEPTION OF KARAJÁ: AN ACOUSTIC AND NEURO-PSYCHOLINGUISTIC INVESTIGATION

Hierarchical or indirect recursion can be found in different domains of human language, and it has thus been claimed to be the only part of language that is specific to humans (Hauser, Chomsky, and Fitch 2002). However, in the past decade, both the claim that recursion is the central component of the "narrow faculty of language" and the claim that it should be present in all languages have been the object of intense debate (cf. Pinker & Jackendoff, 2005; Everett, 2005). This debate triggered the exploration of new frontiers in the examination of embedded structures, which have been examined in acquisition and in processing and have been shown to be implemented through a wide array of linguistic resources in different languages. This paper presents an acoustic description and a neuro-psycholinguistic analysis (ERP/EEG) of an uncommon cognitive device to embed relative clauses. It is implemented in Karajá, a Macro-Je language spoken in Central Brazil, which uses pitch accent to signal relativization: (i) [tori do'rode] 'the white man arrived' versus (ii) [tori doro'de] 'the white man who arrived', first described as stress shift in Ribeiro (2006). The major interest of this phenomenon is that, in Karajá, prosody does more than structure envelopes for acts of speech: it codes directly onto the central syntactic algorithm of recursion. We found evidence that a coordinated structure is easier to process than a recursive structure: RTs were shorter and EEG amplitudes smaller in the coordinated conditions than in the embedding conditions. Also, it seems that even though embedding is harder to launch, hierarchical structuring makes it easier to process at the third embedding, when comprehenders learn they are in an embedding mode. Coordination, on the other hand, being a default, is easier to launch, but it seems to become progressively harder as it does not benefit from hierarchical structuring.

structure relative sentences that contrast with their juxtaposed versions, which are exactly the same except for this marked prosodic structure: (i) [tori do'rode] 'the white man arrived' versus (ii) [tori doro'de] 'the white man who arrived', first described in Ribeiro (2006) as stress shift. In order to analyze this phenomenon, we used a neurophysiological protocol to compare the ERPs (event-related brain potentials) extracted from Karajá speakers exposed to juxtaposed and embedded sentences.
Since we were dealing with an under-studied language, we first needed to confirm our impressionistic observation that a prosodic phenomenon contrasts juxtaposed and embedded sentences in Karajá. To do so, we carried out a detailed acoustic analysis of the audio sentences that we would use as stimuli in the electroencephalography (EEG) test; this analysis is also presented here.
Maia, França, Gesualdi, Lage, Oliveira, Soto & Gomes (to appear) and Maia (2016) have investigated Prepositional/Postpositional Phrase (PP) embedding vis-à-vis coordination both in Brazilian Portuguese and in Karajá, using EEG and eye-tracking, respectively. The Event-Related Potentials (ERPs) methodology used in this work consists in using EEG to extract brain potentials related to sentence processing, and averaging them until an ERP can be distinguished from noise and clearly correlates with the presentation of linguistic stimuli. Until the year 2000, two major ERPs were robustly elicited, the N400 and the P600. The N400 is a negative component (by convention plotted upward) that peaks at around 400 ms after stimulation. The P600 is a positive wave (by convention plotted downward) that peaks at around 600 ms. The N400 has been mostly related to semantic processes and the P600 to late syntactic processes.
Maia, França, Gesualdi, Lage, Oliveira, Soto & Gomes (forthcoming) argue that the results of the neurophysiological tests show that the ERP latencies of both Karajá and Brazilian Portuguese are compatible with the behavioral Response Times (RTs) in two aspects: (i) the three types of stimuli listing one, two and three coordinated words had the same latency at 380 ms; (ii) coordination yielded earlier N400s than those of recursion. Strikingly, as to the recursive stimuli, there was a progressive facilitation going from the constructions with one embedded PP to those with two and three PPs. Since marked N400s are connected with difficulties in integrating a word with the working context, on this view, the more salient the N400, the harder the combinatorial process (Kutas & Hillyard, 1980, 1984; Brown & Hagoort, 1993; Lau et al., 2008; Gomes & França, 2008). The authors' interpretation was, therefore, that recursion should be analyzed as the result of a syntactic algorithm that "is costly to be launched, but once it is established, it does not pose any extra significant effort to the system." Further, Maia (2016) presented a reading study which discussed two eye-tracking experiments comparing the processing of coordination and embedding of Prepositional Phrases in Brazilian Portuguese (BP) and of Postpositional Phrases in Karajá. Experiment 1 compared the processing of sentences containing Prepositional Phrases which could either be self-embedded or conjoined in Brazilian Portuguese. Experiment 2 compared the processing of sentences containing Postpositional Phrases which likewise could be either self-embedded or conjoined in Karajá. Twenty BP and 20 Karajá subjects had their eye movements monitored as they performed a sentence/picture-matching task including sentences in BP and in Karajá, respectively.
Based on Maia, França, Gesualdi, Lage, Oliveira, Soto & Gomes (forthcoming), two hypotheses were formulated, both for BP and Karajá, namely, (i) launching the self-embedding of PPs would be more costly to process than launching the conjoining of PPs; (ii) after launching, the subsequent self-embedding of a third PP would be less costly than the previous PP. Results confirmed these predictions and were analyzed in terms of a third-factor computational effect learning algorithm. Chomsky (2005) proposes that the design of the human language faculty is determined by three factors, namely, Universal Grammar (UG), experience and principles of computational efficiency. UG is the internal innate factor which is part of the genetic endowment of humans; experience provides the triggering effects of linguistic input which interact with the innate principles of UG; and computational efficiency factors are taken to be general principles, which may apply "well beyond language" and are taken to be part of natural laws affecting the development of biological systems (cf. Chomsky, 2005, p. 9).
The present paper examines Karajá declarative and relative clause sentences, which do not resort to the use of conjunctions. In Karajá, an alternative prosodic mechanism is instantiated in relative clause embedding. We describe this mechanism through a detailed acoustic analysis. Additionally, an EEG experiment tracks the embedding operation, which is here analyzed as a third-factor principle of computational efficiency in the processing of embedding, as described for embedded PPs in Maia, França, Gesualdi, Lage, Oliveira, Soto & Gomes (to appear) and Maia (2016).
The paper is organized as follows. In section 2, relevant syntactic aspects of the juxtaposed and embedded constructions in Karajá used in the study are analyzed and discussed. In section 3, we provide a full acoustic description of the pitch-accent phenomenon in the sentences that are later used as stimuli in the EEG-ERP test. In section 4, the EEG-ERP study is presented. Finally, in section 5, we conclude with findings that contribute to the wider panorama of independent studies on recursive strategies and their computational cost in different languages.

The Karajá language
The Karajá language is spoken by a population of about 3,000 people on Bananal Island and its surroundings, in Central Brazil. The data collection and the EEG experiment were carried out in the Karajá Hawalo village, Tocantins state. Karajá is an agglutinative, pro-drop, verb-final language, allowing Wh-movement. As first noticed in Ribeiro (2012), our impressionistic analysis is also that there is a specific prosodic structure for relativization: [tori do'rode] 'the White man arrived' versus [tori doro'de] 'the White man who arrived'.
Sentences (1) to (4) below show samples of juxtaposed and embedded clause constructions in Karajá with three and four recursive layers.
(1) Sentence with 3 clauses in juxtaposition
hawɨˈɨ dʒuaɗa riˈmãra, dʒuaɗa hãˈbu riˈrɔra, hawɨˈɨ kaˈu ruˈrura
woman piranha caught, piranha man bit, woman yesterday died
"The woman caught the piranha, the piranha bit the man, the woman died yesterday"

(2) Sentence with 3 clauses: 2 relative clauses and a main one
hawɨˈɨ dʒuaɗa rimãˈra hãˈbu rirɔˈra kaˈu ruˈrura
woman piranha caught man bit yesterday died
"The woman who caught the piranha who bit the man died yesterday"

(3) Sentence with 4 clauses in juxtaposition
hawɨˈɨ dʒuaɗa riˈmãra, dʒuaɗa hãˈbu riˈrɔra, bɛˈra-ki robuˈnãra, hawɨˈɨ kaˈu ruˈrura
woman piranha caught, piranha man bit, river-in swam, woman yesterday died
"The woman caught the piranha, the piranha bit the man, the man swam in the river, the woman died yesterday"

(4) Sentence with 4 clauses: 3 relative clauses and a main one
hawɨˈɨ dʒuaɗa rimãˈra hãˈbu rirɔˈra bɛˈra-ki robunãˈra kaˈu ruˈrura
woman piranha caught man bit river-in swam yesterday died
"The woman who caught the piranha who bit the man who swam in the river died yesterday"

In (1), three juxtaposed clauses are presented. Notice that there is no conjunction or operator connecting the clauses, which are only prosodically separated, as indicated by the commas in the transcriptions. Notice also that all three verbs have the main stress on the root, as indicated in the transcriptions by the stress diacritic (ˈ), which signals the tonic syllable. Since the basic word order in Karajá is SOV in declarative sentences, verbs are consistently in final position in the three coordinated clauses.
In (2), on the other hand, a sentence with a main clause and two embedded relative clauses is presented. Notice that the first relative clause refers to the noun phrase hawɨˈɨ "the woman", which is both the subject of the first relative clause and the subject of the main clause verb in the last clause of the sentence. The second relative clause, however, applies to the object of the first clause, the NP dʒuaɗa "piranha". In both cases, RCs are signaled by a displacement of the pitch accent in the relativized verbs from the stressed syllable onto the final syllable: the declarative riˈmãra "caught" in (1) becomes the relativized rimãˈra in (2), just as riˈrɔra in (1) becomes rirɔˈra in (2). Only the verb in the main clause at the end of the sentence does not undergo such a pitch shift, as it is not relativized: ruˈrura "died" keeps the same pitch movement on the root both in (1) and in (2).
In examples (3) and (4), an additional clause is added, namely, a juxtaposed clause in (3) and a third relativized clause in (4). Again, there is no conjunction or operator to signal coordination in (3). Notice that this additional RC in (4) is also an object RC, which applies to the same object NP as the previous RC, the NP hãˈbu "the man". And again, the pitch accent moves to signal relativization. The verb in the main clause is at the very end of the sentence, after the three RCs.
The four sentence types described above are the four conditions that will be empirically compared in a neurophysiological experiment. Nevertheless, since the language had not been described acoustically before, in section 3 we provide a full description of juxtaposed and relative clauses, the latter involving the pitch-accent phenomenon.

The Acoustic Analysis
The audio stimuli we built consisted of 15 sentences for each one of the four syntactic conditions:
• Condition 1: Sentences with 3 clauses in juxtaposition;
• Condition 2: Sentences with 2 embedded clauses and a main clause;
• Condition 3: Sentences with 4 clauses in juxtaposition;
• Condition 4: Sentences with 3 embedded clauses and a main clause.
Sentences thus vary in the number of their relative clauses (2 or 3) and in the nature of these clauses (juxtaposed or embedded). The words bearing the distinctive information on the nature of the syntactic relation are the clause-final verbs, which are always three syllables long in this experiment. As the phenomenon under investigation concerns the pitch shift between the penultimate and final syllables of verbs, the verbs ending each clause were segmented, so as to isolate their vowels and allow their acoustic analysis.
All 60 sentences were produced by the same male speaker of Karajá, who is a language consultant. In order to ascertain that there are changes in the speech signal related to recursive relative clause attachment, we first provide, for the sake of a visual understanding of the acoustic differences, a clause-by-clause mapping of the fundamental frequency changes in the two most complex conditions. This is followed in section 3.1 by an acoustic analysis made using the PRAAT software. [...] clause, except for the final syllable of the main clause, which always shows a fall.
The two sentences shown here are the following:

Sentence with 4 clauses in juxtaposition
waʃiˈdu ɔdɛmaˈhi ɾiˈmɨɾa wɛɾɨˈɾɨ ɾiɔˈhɛɾa bɛˈɾa-ɔ ɾaˈlɔɾa ɗiˈi kaˈki hãˈwɔ ɾiɗɔˈɾɔɾa
fisherman crab caught boy cut river-to entered he here canoe rested
"The fisherman caught a crab, the boy cut (it), entered the river, he rested the canoe here"

Sentence with 3 relativized clauses and a main clause
waʃiˈdu ɔdɛmaˈhi ɾimɨˈɾa wɛɾɨˈɾɨ ɾiɔhɛˈɾa bɛˈɾa-ɔ ɾalɔˈɾa kaˈki hãˈwɔ ɾiɗɔˈɾɔɾa
fisherman crab caught boy cut river-to entered here canoe rested
"The fisherman who caught a crab, which cut the boy who entered the river, rested the canoe here"

Acoustic Measures
From the 60 stimuli, the fundamental frequency (F0, expressed in semitones), A-weighted intensity (expressed in decibels, cf. Liénard and Barras 2013), and the first three formants (F1, F2, F3, expressed in Hertz) were measured every 5 milliseconds along the vowels, and their median values were then taken for each vowel. A 50 to 350 Hz range was used to estimate F0, and the estimated values were manually checked; a 5000 Hz ceiling was used for formants, as suggested for male voices by Gendrot and Adda-Decker (2005). Vocalic durations are expressed in milliseconds. Formants were measured in order to ascertain that the difference is indeed based on prosody-related parameters, and not due to a change at the segmental level.
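As an illustration of the measurement step described above, the following minimal Python sketch converts a frame-by-frame F0 track from Hertz to semitones and takes the median over one vowel. It is not the procedure actually run in PRAAT: the function names, the reference frequency `ref_hz` (the paper does not state the semitone reference) and the frame values are all assumptions made for the example.

```python
import math
from statistics import median

def hz_to_semitones(f0_hz, ref_hz=1.0):
    """Convert a frequency in Hz to semitones relative to ref_hz (assumed reference)."""
    return 12.0 * math.log2(f0_hz / ref_hz)

def median_f0_semitones(f0_track_hz, floor_hz=50.0, ceiling_hz=350.0, ref_hz=1.0):
    """Median F0 (in semitones) over the frames of one vowel.

    Unvoiced frames (marked None) and frames outside the 50-350 Hz
    estimation range are dropped before the median is taken.
    """
    voiced = [f for f in f0_track_hz if f is not None and floor_hz <= f <= ceiling_hz]
    if not voiced:
        return None
    return median(hz_to_semitones(f, ref_hz) for f in voiced)

# Hypothetical 5-ms frames for one vowel (values invented for illustration).
frames = [118.0, 120.0, None, 122.0, 121.0, 400.0]
print(median_f0_semitones(frames))
```

Taking the median rather than the mean makes the per-vowel value less sensitive to isolated tracking errors, which is presumably one reason the estimates were also checked manually.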
Because of the limited size of the items in the experiment (punctual measures on three or four points in 60 sentences of heterogeneous phonemic construction), normalization of phoneme-intrinsic differences (notably for duration: cf. Campbell 1993; Barbosa 2007, but also formants) was impossible. Thus, in order to observe differences between the two types of clauses (juxtaposed vs. embedded), comparisons between sentences in conditions 1 and 2, and between sentences in conditions 3 and 4, were based on the difference in F0 (or intensity, duration, formant frequency) between pairs of sentences. For conditions 1 and 2 (resp. 3 and 4), the final two vowels of each of the three verbs (resp. four verbs) were analyzed, and the difference between the parameters estimated on both sentences was used for subsequent analyses. There are thus differences in F0 and intensity levels, in vocalic duration, and in formant values, for the 2-1 (resp. 4-3) pairs of conditions, at two locations (penultimate and final vowels) on each of the 3 (resp. 4) verbs. The differences between conditions 2 and 1 (resp. 4 and 3) are expressed as the ratio between a measure taken on sentence 2 (resp. 4) and the same measure on sentence 1 (resp. 3); to allow a symmetric comparison between the two sentences, the logarithms of these ratios were considered, as $\log(M_A/M_B) = -\log(M_B/M_A)$. Thus, comparisons are based on equation 1:

$$R_M(B, A) = \log\left(\frac{M_B}{M_A}\right) \qquad (1)$$

where $R_M(B, A)$ denotes the logarithm of the ratio between sentences B and A for the acoustic measure M (e.g. F0, intensity, etc.), and $M_A$ and $M_B$ denote the estimation of parameter M in sentences A and B. Note that for measures based on a logarithmic scale (semitones and decibels), this ratio is simply obtained as the difference between the two logarithmic measures. The variations of these ratios across the stimuli are described in the next section.
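The log-ratio comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the natural logarithm is assumed (the base is not specified in the text), and the duration values are invented.

```python
import math

def log_ratio(m_a, m_b):
    """Log of the ratio of sentence B over sentence A, for a linear-scale measure."""
    return math.log(m_b / m_a)

def log_scale_difference(m_a, m_b):
    """For measures already on a log scale (semitones, dB), the log ratio is a difference."""
    return m_b - m_a

# Invented durations (ms) of the same vowel in a juxtaposed (A) and an embedded (B) sentence.
dur_a, dur_b = 80.0, 120.0
forward = log_ratio(dur_a, dur_b)    # B relative to A
backward = log_ratio(dur_b, dur_a)   # A relative to B

# The plain ratios 1.5 and 0.667 are not symmetric around 1,
# but their logarithms are symmetric around 0.
print(dur_b / dur_a, dur_a / dur_b)
print(forward, backward)
```

The design choice is that swapping the two sentences only flips the sign of the measure, so lengthening and shortening of the same magnitude are treated alike in the later models.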

Acoustic differences between types of clauses
The three (resp. four) verbs of the pairs of sentences in conditions 2-1 (resp. 4-3) correspond to: the end of the subject relative clause, the end of the one (or two) object relative clause(s), and the end of the final main clause. Boxplots in figures 2 and 3 illustrate the distributions of the differences for four acoustic measures, according to the position of verbs across the syntactic structure, and depending on the syllable considered (penultimate or final).
Three measured F0 differences exceeded 10 semitones because of the presence of creaky voice in the final syllable of sentences in conditions 1 and 3; they were discarded from the dataset, since keeping them would not alter the conclusions but would introduce a statistical bias. Acoustic differences between pairs of sentences in conditions 2 vs. 1 (resp. 4 vs. 3) appear to be greater on the final syllables than on the penultimate ones, and greater for verbs ending a relative clause. Also, the acoustic correlates of prosody (F0, intensity, duration) show wider variations than segment-related measures (formants). Statistical models of analysis of variance (ANOVA, based on the lm() function of the R software; R Core Team, 2016) were fitted on each acoustic measure, to assess the relative importance of each factor for the acoustic variations, and the relative role one may attribute to each acoustic difference in the performance of syntactic embedding. The following factors were taken into consideration: the Pair of sentences across conditions (two levels: 2-1 and 4-3), the type of Clause (3 or 4 levels: subject relative, object relative (1 or 2 depending on the condition), and main clause), the Position of the syllable under consideration (2 levels: penultimate and final), and all double and triple interactions between these factors. From these maximal models, a stepwise process of simplification was pursued to remove non-significant interaction terms, level differences and main effects (cf. details on this procedure in Crawley 2013), so as to obtain a minimal adequate model of the factors (and their levels) having a significant influence (with a risk alpha set at 5%) on the response variable, and to assess the size of these effects. This process was done separately for each acoustic parameter.

Model for F0
All interactions with the factor Pair were removed as non-significant; the different levels of the relative clauses ("subject" and the one or two "relative") of the Clause factor were grouped in a common "relative" level, opposed to the "main" clause. The minimal adequate model is thus based on the pair of sentences considered (2-1 vs. 4-3), the type of Clause (relative vs. main), the syllable Position, and the double interaction between Position and Clause. Table 1 displays a summary of the model, with the effect size of each factor; the model explains slightly more than half of the observed variance in F0 differences (R² = 0.53).

Model for duration
The factor Pair and all its interactions were removed as non-significant; the different levels of the relative clauses ("subject" and the one or two "relative") of the Clause factor were grouped in a common "relative" level, opposed to the "main" clause. The minimal adequate model is thus based on the type of Clause (relative vs. main), the syllable Position, and their double interaction. Table 2 displays a summary of the model, with the effect size of each factor; the model explains about half of the observed variance in duration differences (R² = 0.46).
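To make the reported R² concrete, here is a minimal Python sketch of how the explained variance of a model with two categorical factors and their interaction (such as Clause × Position) can be computed. It is not the authors' analysis, which was run with R's lm(): the function name is hypothetical and the data are synthetic. The sketch relies on the fact that, when both main effects and their interaction are in the model, each fitted value is simply the mean of its (Clause, Position) cell.

```python
from collections import defaultdict

def r_squared_cell_means(observations):
    """R^2 of a model with two categorical factors and their interaction.

    observations: list of (clause, position, value) tuples. The fitted value
    of each observation is the mean of its (clause, position) cell, so
    R^2 = 1 - SS_within / SS_total.
    """
    values = [v for _, _, v in observations]
    grand_mean = sum(values) / len(values)
    cells = defaultdict(list)
    for clause, position, value in observations:
        cells[(clause, position)].append(value)
    ss_total = sum((v - grand_mean) ** 2 for v in values)
    ss_within = sum(
        (v - sum(cell) / len(cell)) ** 2 for cell in cells.values() for v in cell
    )
    return 1.0 - ss_within / ss_total

# Synthetic log-ratios of duration (invented): lengthening only on the
# final vowel of relative clauses.
data = (
    [("relative", "final", v) for v in (0.30, 0.40, 0.35)]
    + [("relative", "penultimate", v) for v in (0.00, 0.10, 0.05)]
    + [("main", "final", v) for v in (0.05, -0.05, 0.00)]
    + [("main", "penultimate", v) for v in (0.00, 0.05, -0.05)]
)
print(r_squared_cell_means(data))
```

A model that also contains a factor without its interactions (as Pair in the F0 model) is no longer a pure cell-means model, so this shortcut would not apply there.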

Model for Intensity
The factor Pair and the interactions it participates in were removed as non-significant; the different levels of the relative clauses ("subject" and the one or two "relative") of the Clause factor were grouped in a common "relative" level, opposed to the "main" clause. The minimal adequate model is thus based on the type of Clause (relative vs. main), the syllable Position, and the interaction between Position and Clause. Table 3 displays a summary of the model, with the effect size of each factor; the model explains only about one third of the observed variance in intensity differences (R² = 0.30).

Discussion about the acoustical measures
The statistical models built to explain the variation in acoustic parameters across sentences with juxtaposed or embedded clauses showed that the main changes are linked to F0 and duration. Figure 4 summarizes the changes in these two parameters for the factors found to have a significant effect. One may observe the following differences. For F0, the main rise happens on the final vowels of embedded relative clauses, and on the root (penultimate) syllable of juxtaposed clauses. This rise is larger for embedded than for juxtaposed clauses. This variation in pitch setting explains one third of the total variance (and 40% of the variance accounted for by the model). This variation in pitch does not happen on verbs ending main clauses, hence the significant interaction between the syllable's position and the location of the clause, which explains one sixth of the total variance. A difference was found in the range of pitch differences (thus in the peak's height) depending on the length of the sentences: shorter sentences (based on three clauses) bear higher peaks than longer sentences (based on four clauses). This effect may be explained by the speaker's planning of sentences: he had to keep more air in his lungs during longer sentences, and thus produced lower F0 peaks. This last effect is marginal in the model (1% of the total variance), even if significant. Differences in vocalic duration show the same pattern as F0, with simpler effects: the main factor explaining duration changes is the position of the syllable, with larger lengthening on final vowels, a phenomenon that marks embedding. Such lengthening does not occur for sentence-final main clauses (i.e. it occurs only on relative clauses), and no difference in lengthening was observed depending on the sentence's length. In sum, the pitch accent that marks embedding in Karajá is acoustically salient when compared with juxtaposition.
To analyze the neurophysiological impact of such differences, we present the EEG-ERP experiment in section 4.

EEG/ERP Experiment
Electroencephalography makes it possible to acquire and store bioelectric signals. It continuously registers electro-cortical activity by means of electrodes fixed to the scalp. Each one of these electrodes is placed at a specific point, which is directly related to an area of the cerebral cortex. These points on the scalp are called derivations. The tip of the electrode captures the electric activity of thousands of neurons. Any voltage fluctuation (µV) captured between pairs of electrodes, or rather, between two derivations, is registered by the EEG, enabling the measurement of electric activity at the derivations (Bear, Connors & Paradiso, 2006), which reflects the electric activity in the brain (ERPs).
ERPs, responses of the nervous system to motor or sensory stimulation, are composed of a sequence of waves characterized by their latency, amplitude, and polarity. ERPs usually have an instantaneous value 10 to 1000 times smaller than the background EEG, which is why they cannot be seen directly. For them to be visualized, the average of various epochs needs to be calculated. This procedure is justified by considering spontaneous EEG as zero-mean white Gaussian noise and the ERPs as the only responses that are really synchronized with the stimulus. In this way, the effect of the grand average is to increase the signal-to-noise ratio (SNR), thus allowing for the visualization of the specific effect of the (in this case linguistic) stimulus. Thus, ERPs provide a continuous sampling of the brain's electrical activity. This sampling can come from different sorts of linguistic phenomena produced in the cortex of volunteers performing linguistic tasks while monitored by the electroencephalograph (Osterhout & Holcomb, 1993).
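The averaging rationale above can be illustrated with a small simulation. The following Python sketch is a toy example under explicit assumptions, not the processing pipeline used in this study: the ERP template, the noise level (background EEG taken to be ten times larger than the ERP) and the number of epochs are all invented. It simply shows that averaging stimulus-locked epochs of zero-mean Gaussian noise brings the estimate closer to the underlying response.

```python
import math
import random

def simulate_epoch(template, noise_sd, rng):
    """One epoch: the fixed ERP template plus zero-mean Gaussian noise."""
    return [s + rng.gauss(0.0, noise_sd) for s in template]

def average_epochs(epochs):
    """Point-by-point average across stimulus-locked epochs."""
    n = len(epochs)
    return [sum(samples) / n for samples in zip(*epochs)]

def rms_error(estimate, template):
    """Root-mean-square deviation of an estimate from the true template."""
    return math.sqrt(
        sum((e - t) ** 2 for e, t in zip(estimate, template)) / len(template)
    )

rng = random.Random(0)
# Hypothetical template: a 5 µV negative deflection centered on sample 40 of 100.
template = [-5.0 * math.exp(-((i - 40) ** 2) / 200.0) for i in range(100)]
noise_sd = 50.0  # assumed background EEG, about 10 times the ERP amplitude

single = simulate_epoch(template, noise_sd, rng)
averaged = average_epochs(
    [simulate_epoch(template, noise_sd, rng) for _ in range(400)]
)
# The residual noise shrinks roughly with the square root of the number of epochs.
print(rms_error(single, template), rms_error(averaged, template))
```

Under the white-noise assumption, the residual noise after averaging N epochs is expected to be about the single-epoch noise divided by √N, which is why a few hundred epochs are enough to reveal a response that is invisible in any single trial.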
The possibility of seeing how a human brain works on-line, tracking processes with millisecond resolution, is one of the most important contributions of electromagnetic methods to the neuroscience of language. With methods like the ones used in this work, it is possible to follow the electrical response to linguistic stimuli, and many correlations have been established between special cortical waves, the ERPs (event-related brain potentials), and linguistic phenomena. Especially when the problem at hand has to do with the chronology of the different computations in language processing and with the choice of the best language architecture, the most suitable assessment method is EEG-ERP.
We set off by examining whether the processing of embedded clauses was costlier than the processing of juxtaposed clauses. To investigate the hypothesis that embedded clauses, as in Conditions 2 and 4, should be harder to process than their juxtaposed counterparts, as in Conditions 1 and 3, we ran a listening comprehension experiment with a group of 28 Karajá subjects tested in their native Karajá. We used event-related brain potentials (ERPs/EEG) with auditory sentences in the four conditions presented in section 2.2.
Our working hypothesis is that, in Karajá, the relativized or embedded clauses, which are only prosodically marked (no conjunction or operator is used), will be harder to process than their juxtaposed counterparts. Since the N400 component is associated in the literature with difficulties in integrating a word with the working context (Kutas & Hillyard, 1980, 1984; [...] & França, 2013), the more salient the N400, the harder the combinatorial process will be. Thus, we expect the relativized or embedded clauses to elicit the larger N400 effects.
Moreover, we aim to investigate whether the difficulty of processing and understanding sentences in Karajá is also modulated by the number of juxtaposed and embedded clauses. To this end, we analyze the electrocortical manifestations of the time course of juxtaposed sentences with two versus three levels of juxtaposition, and of embedded clause constructions likewise with two versus three levels, as shown above.

Methods
We recorded ERPs to critical words (verb area) while participants listened to sentences in all four conditions. The four syntactic conditions of Karajá sentences were contrasted. The independent variables in this study were the type of compound (juxtaposition versus embedding) and the number of levels (two versus three levels), while the subjects' response times, error rates and ERPs were the dependent variables. To ensure participant engagement, there was a yes-or-no interpretation question to be answered after each sentence, for instance:
hawɨˈɨ aõbo ruˈrura?
woman INT died
"Did the woman die?"
Subjects were expected to answer either kohe "yes" or kõre "no".

Participants
A total of 28 native speakers of Karajá (14 males) took part in this experiment. They live in the Araguaia Reserve on Bananal Island, located in Central Brazil. Participants were aged 18-44 (mean: 35.7), were right-handed and had normal or corrected-to-normal vision. Written consent was obtained from all participants prior to their engagement in the test.
ERP components of interest were identified based on visual inspection of ERPs, ROIs and topographic maps, as well as prior findings.
These dependent measures, that is, the voltages within the N400 mean voltage time-window and P600 mean voltage time-window, were analyzed with repeated measures analyses of variance (ANOVA). ANOVAs were performed separately at each electrode site. A two-way ANOVA model was used, and the factors were sentence-type (juxtaposed versus embedded) and number of levels (two or three).
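As an illustration of the dependent measure described above, the following Python sketch computes the mean voltage of an averaged ERP within a latency window such as the 300-500 ms N400 window. It is a sketch under stated assumptions, not the software used in this study: the function name, the 250 Hz sampling rate, the epoch limits and the voltage values are all hypothetical.

```python
def mean_amplitude(epoch_uv, sampling_rate_hz, epoch_start_ms, window_ms):
    """Mean voltage (µV) of one averaged ERP within a latency window.

    epoch_uv: voltages, one per sample; the first sample lies at
    epoch_start_ms relative to stimulus onset.
    window_ms: (start, end) in ms, with the end excluded.
    """
    ms_per_sample = 1000.0 / sampling_rate_hz
    start, end = window_ms
    selected = [
        v for i, v in enumerate(epoch_uv)
        if start <= epoch_start_ms + i * ms_per_sample < end
    ]
    if not selected:
        raise ValueError("window lies outside the epoch")
    return sum(selected) / len(selected)

# Hypothetical ERP at one electrode, sampled at 250 Hz from -100 ms onward
# (225 samples, 4 ms apart; voltage values invented for illustration).
erp = [0.0] * 100 + [-4.0] * 50 + [2.0] * 75
n400 = mean_amplitude(erp, 250.0, -100.0, (300.0, 500.0))
print(n400)
```

One such value per participant, electrode and condition would then be the input to the repeated measures ANOVA with the factors sentence-type and number of levels.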
The Greenhouse and Geisser (1959) correction for inhomogeneity of variance was applied to all ANOVAs with greater than one degree of freedom in the numerator. In such cases, the corrected p value was reported. Significant main effects were followed by simple-effects analysis.

ERPs
The grand-average ERPs to the critical verb in Condition 1, juxtaposition with two levels (as in 'The woman caught the piranha, the piranha bit the man, the woman died yesterday'), are shown in Figure 6. We compared the three clauses of these juxtaposed sentences, and the three comparisons can be visually inspected in Figure 6.
We analyzed the differences in the amplitude and in the latency axes among the three clauses. The first clause ("the woman caught the piranha"), in black, has the lowest amplitude compared to the second ("the piranha bit the man") and the third ("the woman died yesterday"), in red and blue, respectively, which basically overlap.
In the 300-500 ms window, ANOVAs revealed a statistically significant effect of stimulus type in Condition 1 (juxtaposed condition): simple-effects analysis at midline sites showed that ERPs to the first clause were more negative than those to the second and third clauses [CPZ, F(1, 6) = 5.868, p* = .0229]. ANOVAs revealed no statistically significant effect of stimulus type in the comparison between the second and the third clauses in the same time window [CPZ, F(1, 6) = 9.276, p = .2797].
Likewise, in Condition 2, we compared the embedded clauses ('The woman [that] caught the piranha, [that] bit the man, [the woman] died yesterday'), and these three comparisons can be visually inspected in Figure 7 below. As we can see, in the 300-500 ms window there is not much difference in the amplitude or in the latency axes between the first two clauses in the embedded sentences condition. In fact, ANOVAs revealed no statistically significant effect of stimulus type in Condition 2 between the first and second clauses: simple-effects analysis at midline sites showed that ERPs to the first clause were not different from those to the second one [CPZ, F(1, 6) = 9.534, p = .40].
The first clause ('the woman caught the piranha'), in black, seems to be as hard to process as the second one ('the piranha bit the man'), in red. But the same is not true for the third clause ('the woman died yesterday'), in blue. Simple-effects analysis at midline sites showed that ERPs related to the third clause have greater amplitude at around 400 ms than those of the first two clauses [CPZ, F(1, 6) = .573, p* = .0044]. Thus, it is possible to entertain the idea that the processing of the entire sentence gets harder towards the end.
As part of our working hypothesis, we also investigated the number of juxtaposed and embedded clauses in a sentence, in order to see if there was a difference in processing two-level sentences versus three-level ones. Thus, we created Conditions 3 and 4.
These comparisons can be seen in Figures 8 and 9. Figure 8 shows the comparison among the clauses in the juxtaposed condition with three levels (Condition 3), as in 'The woman caught the piranha, the piranha bit the man, the man swam in the river, the woman died yesterday'.
As we can see, the amplitudes of the three clauses seem to be in a progression. The first clause has the lowest amplitude of the three, and the third has the highest amplitude of them all. This is exactly what happened with the conditions with two levels.
These results, taken together, support the idea that juxtaposition computation gradually increases in complexity with the number of juxtaposed layers, most probably because of the memory resources involved.

hawɨˈɨ dʒuaɗa riˈmãra, dʒuaɗa hãˈbu riˈrɔra, bɛˈra ki robuˈnãra
woman piranha caught, piranha man bit, river in swam
“The woman caught the piranha, the piranha bit the man, the man swam in the river, the woman died yesterday”

We also compared the three clauses in the embedded condition with three levels (Condition 4), as in ‘The woman [that] caught the piranha, [that] bit the man [that] swam in the river, died yesterday’.
It is worth noting that, although the syntactic form of the juxtaposed and embedded sentences seems to be the same (‘The woman caught the piranha, the piranha bit the man, the man swam in the river, the woman died yesterday’), Karajá uses variation in the pitch accent of the main verbs, ‘riˈmãra’ versus ‘rimãˈra’, ‘riˈrɔra’ versus ‘rirɔˈra’ and ‘robuˈnãra’ versus ‘robunãˈra’, to signal the difference between juxtaposition and embedding. Thus, the relativized (embedded) clauses differ from the juxtaposed ones only prosodically.
hawɨˈɨ dʒuaɗa rimãˈra, dʒuaɗa hãˈbu rirɔˈra, bɛˈra ki robunãˈra
woman piranha caught, piranha man bit, river in swam
“The woman caught the piranha, the piranha bit the man, the man swam in the river, the woman died yesterday”

As we can see in Figure 9, there is not much difference in either amplitude or latency between the first two clauses in the embedded condition. The first clause (“the woman caught the piranha”), in black, seems to be as hard to process as the second one (“the piranha bit the man”), in red. But, in the second comparison in Figure 9, the same is not true for the third clause (“the woman died yesterday”), in blue. The third clause has a higher amplitude than the first two. These results point to the interpretation that, unlike the sentences in the juxtaposed condition, which seem to increase in complexity layer by layer, the third layer in the embedded condition is much simpler to process than the first. Thus, embedded sentences are gradually facilitated, most probably for performance reasons: once the embedded computation is deployed, reiteration in this mode becomes progressively easier.

Figure 10 shows a panoramic view of the comparison between the two conditions with two layers: the juxtaposed one in Condition 1, plotted in black, and the embedded one in Condition 2, plotted in red.

Condition 1 (solid black line):
The woman caught the piranha, the piranha bit the man, the woman died yesterday

Condition 2 (solid red line):
The woman [that] caught the piranha, [that] bit the man, died yesterday

Figure 11 shows a panoramic view of the comparison between the two conditions with three layers: the juxtaposed one in Condition 3, plotted in black, and the embedded one in Condition 4, plotted in red.

Condition 3 (solid black line):
“The woman caught the piranha, the piranha bit the man, the man swam in the river, the woman died yesterday”

Condition 4 (solid red line):
“The woman caught the piranha, [that] bit the man, [that] swam in the river, the woman died yesterday”

Figure 11 (Graphics 1, 2 and 3): Grand-average ERPs recorded at the CPZ site to Condition 3, juxtaposed sentences with three levels (solid black line): first, second and third clauses (each graphic from left to right, respectively) versus Condition 4, embedded sentences with three levels (solid red line): first, second and third clauses (each graphic from left to right, respectively). Onset of the critical verbs is indicated by the vertical bar. Each hash mark represents 100 ms of activity. Positive voltage is plotted down.
The panoramic views of the contrasting conditions, juxtaposed and embedded, in two and three layers, allow us to draw the overall conclusion that it is harder to process an embedded sentence than a juxtaposed one, but that, as the layers come in, the processing difficulty increases with juxtaposition and decreases with embedding (cf. third cell in Figure 11).
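One simple way to express the opposite trends described here is to fit a straight line to a per-clause difficulty index and inspect the sign of its slope. The sketch below is only illustrative: the function name is hypothetical, the index is assumed to be something like the magnitude of the mean amplitude per clause, and the numbers are invented for the example rather than taken from this study.

```python
def linear_slope(values):
    """Least-squares slope of values against clause position 1, 2, 3, ..."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two clause positions to fit a slope")
    positions = range(1, n + 1)
    x_mean = sum(positions) / n
    y_mean = sum(values) / n
    numerator = sum((x - x_mean) * (y - y_mean) for x, y in zip(positions, values))
    denominator = sum((x - x_mean) ** 2 for x in positions)
    return numerator / denominator


if __name__ == "__main__":
    # Hypothetical difficulty index per clause (arbitrary units), NOT study data.
    juxtaposed = [1.0, 1.6, 2.3]  # assumed to grow clause by clause
    embedded = [2.4, 2.3, 1.5]    # assumed to drop at the last clause
    print("juxtaposed slope:", linear_slope(juxtaposed))
    print("embedded slope:", linear_slope(embedded))
```

Under these assumptions, a positive slope corresponds to difficulty increasing layer by layer, and a negative slope to facilitation as layers are added.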

Conclusions
We found evidence of stronger facilitation in processing a coordinated structure than a recursive structure. We found shorter RTs and smaller EEG amplitudes in the coordinated conditions than in the embedding conditions. We also found shorter RTs and smaller EEG amplitudes for coordinated sentences with two juxtapositions than for their embedding counterparts. As for the number of errors, our subjects seemed to get more embedded sentences wrong than coordinated sentences.
Closer inspection also reveals that the electric wave amplitude of each clause in the coordinated conditions seemed to be in a progression, meaning that the amplitude increases with the number of juxtapositions (one versus two). We did not find the same results in the embedding conditions. In line with previous results obtained for Prepositional Phrase coordination and embedding in both Karajá and Brazilian Portuguese (BP), we interpret the facilitation observed for the last embedding as the result of a third-factor learning algorithm. Even though embedding is harder to launch, hierarchical structuring makes it easier to process in the third embedding, when comprehenders learn they are in an embedding mode. Coordination, on the other hand, being a default, is easier to launch, but it seems to become progressively harder as it does not benefit from hierarchical structuring.

Therefore, the Karajá data discussed in the present study support the analysis that launching the embedding process is difficult, in contrast with launching the coordination process, which is the default one. Crucially, however, once launched, embedding becomes gradually easier, in contrast with juxtaposition, since the embedded phrases share structural identity. A similar hierarchical learning algorithm, applied exclusively to the processing of recursively embedded phrases, had already been found in previous work on embedded vs. coordinated Prepositional Phrases in BP and Karajá (cf. Maia, 2016; Maia, França, Gesualdi, Lage, Oliveira, Soto & Gomes, to appear). As we tried to demonstrate above, this learning algorithm is also attested for Karajá in the present study, as there is a significant fall in processing difficulty only for the second and third RC embeddings, and not for the second and third RC juxtapositions.
These patterns allow for the interpretation that there is a learning algorithm at work: once the hierarchical syntactic process is established, it falls back to lower levels of difficulty and does not impose any significant extra effort on the system. Ultimately, we speculate that this crucial difference between embedding and juxtaposition is due to the hierarchical structuring present only in embedding, which would make up for the increase in memory load. Coordination, which is not a hierarchical process, would not allow for the structural identity that is inherent in a hierarchical process and would not benefit from it, being, therefore, more sensitive to the increase in memory load.
Finally, we would like to point out that the proposal made in Everett (2005) that indirect recursion is not found in the Brazilian indigenous language Pirahã may be the result of a false negative, for two reasons: (i) as exemplified by the Karajá data discussed here, languages present a wide array of largely unknown grammatical devices that they use to code embedding; (ii) even though it is harder to process than juxtaposition, and thus tends to appear less frequently in corpora, embedding may actually be available in the grammar. The present study showed that embedding can be grammatically coded through variations in the location of the pitch accent, and that it is an available syntactic resource in the language for expressing a type of modification that is more commonly expressed through relativizing particles or nominalizations. Additionally, the present study shows that an embedding analysis is more difficult to launch than coordination. This fact may account for the fewer observations of embedding than of coordination, which has also been shown to be easier to process and to acquire. Because of these facts, it is advisable to consider processing and acquisition data in the investigation of understudied languages before making definitive claims about the nonexistence of recursive processes in these languages.