THE ROLE OF RHYTHMIC CUES FOR SPEECH SEGMENTATION IN MONOLINGUAL AND BILINGUAL LISTENERS

Spoken word recognition involves the segmentation and identification of a continuous and highly complex stimulus. It has been proposed that, in segmenting speech, listeners apply a universal rhythmic strategy that has language-specific manifestations depending on the phonological characteristics of their native language (Cutler, Mehler, Norris, & Segui, 1983, 1986): while native listeners of Romance languages like French are said to rely on syllabic structures, native listeners of Germanic languages like English or Dutch would use metrical structures. In the first part of the present paper, these proposals are discussed with regard to speech segmentation in monolinguals. It will be argued that word stress may provide powerful cues to word boundaries in both French and Dutch. The second part of the present contribution addresses the issue of speech segmentation in bilinguals, and, in particular, the claim that bilinguals develop a single rhythmic segmentation procedure restricted to their dominant language (Cutler, Mehler, Norris, & Segui, 1992). It will be argued instead that the use of adapted rhythmic segmentation cues is a necessary component of second language acquisition, and, consequently, that bilinguals who attain a high level of proficiency in their second language are able to exploit the rhythmic structures of that language in speech segmentation.

word specifies the location of the following word onset.
The segmentation efficiency of such a speech recogniser lies in its anticipation of word offsets through the isolation of the correct candidate before all phonetic information concerning this candidate is available (see e.g. Mattys, 1997, for a comprehensive review). However, words are not always recognised before their end (Bard, Shillcock, & Altmann, 1988; Grosjean, 1985; Tabossi, Burani, & Scott, 1995). Moreover, many polysyllabic words contain shorter embedded words. For example, the French word ravissant (ravishing) contains the shorter words rat (rat), vit (lives), sans (without), ravi (delighted), vissant (screwing), etc. (for statistical analyses, see Frauenfelder, 1991a, for Dutch; McQueen, Cutler, Briscoe, & Norris, 1995, for English). This embeddedness problem, which cannot be solved by strictly sequential models, requires decisions about word identity to be delayed until enough information is available. Such a delayed commitment component has been introduced in subsequent speech recognition models like TRACE (McClelland & Elman, 1986) and Shortlist, which involve a competition between many lexical candidates beginning at different points of the signal and considered in parallel (for a detailed comparison of these two models, see e.g., McQueen, Norris, & Cutler, 1994; Norris, McQueen, & Cutler, 1995).
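The embeddedness problem can be made concrete with a small sketch. The Python fragment below is not taken from any of the models cited above; it simply enumerates every lexicon entry that occurs inside a carrier word. The function name `embedded_words`, the toy lexicon, and the simplified SAMPA-like transcriptions (with `a~` standing for the nasal vowel) are assumptions introduced purely for illustration.

```python
# Illustrative sketch only: the lexicon and transcriptions are simplified
# assumptions, not materials from the studies cited in the text.

# Toy lexicon: phoneme tuple -> orthographic form.
LEXICON = {
    ("r", "a"): "rat",
    ("v", "i"): "vit",
    ("s", "a~"): "sans",
    ("r", "a", "v", "i"): "ravi",
    ("v", "i", "s", "a~"): "vissant",
    ("r", "a", "v", "i", "s", "a~"): "ravissant",
}

CARRIER = ["r", "a", "v", "i", "s", "a~"]  # assumed transcription of "ravissant"


def embedded_words(carrier, lexicon):
    """Return (start, end, word) for every lexicon entry found strictly
    inside the carrier, i.e. excluding the carrier word itself."""
    hits = []
    n = len(carrier)
    for start in range(n):
        for end in range(start + 1, n + 1):
            if (start, end) == (0, n):
                continue  # skip the full carrier
            chunk = tuple(carrier[start:end])
            if chunk in lexicon:
                hits.append((start, end, lexicon[chunk]))
    return hits


for start, end, word in embedded_words(CARRIER, LEXICON):
    print(start, end, word)
```

Because the candidates start at several different positions and overlap one another, a strictly left-to-right recogniser cannot choose among them without waiting for later input, which is the motivation for delayed commitment.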
When considered alone, lexical competition does not solve the problem of speech segmentation, which requires a decision on the parts of the acoustic wave that correspond to the beginnings of words (e.g., Frauenfelder, 1991b; Norris et al., 1995). Moreover, the highly complex and variable nature of the speech stream would frequently lead to erroneous recognition if speech parsing relied merely on a simple matching process between parts of the input and stored representations. Instead, listeners are able to identify spoken words correctly and almost instantly, i.e. to discriminate each of them from amongst the tens of thousands of other words stored in their mental lexicon. As Cutler argued (1994), "a much more robust model is needed to account for what is obviously true, namely that human speech recognition is extremely successful even with background noise, distance between speakers, distortion of the speaker's vocal tract, foreign accent, slips of the tongue, etc." (p. 89).
To explain how the complex mapping between form and meaning can be successfully achieved, it was proposed that both segmentation and recognition processes could be best achieved by a prelexical component that exploits units or cues indicating where word boundaries are likely to occur. Within such a view, speech segmentation is an explicit process that precedes the listener's attempts to recognise words. Prelexical segmentation is actually not incompatible with lexical competition, as shown by several studies that have assessed the specific and additional effects of these two components (McQueen et al., 1994; Norris et al., 1995; Vroomen & de Gelder, 1997a). The next part of the present paper focuses on the nature of the segmentation cues that have been proposed to assist the speech recognition process.

1.1. From the Search for the Universal Unit to the Notion of Language-Specific Implementations of a Universal Rhythmic Strategy
The idea that the human speech recogniser is provided with a separate, prelexical, segmentation component has prompted the search for a universal speech segmentation unit. Several candidates have been considered, ranging from temporally defined templates (e.g., Klatt, 1980) to abstract linguistic units. The most influential of these proposals was that speech recognition is assisted by a syllabic segmentation procedure (e.g., Mehler, 1981; for a review, see e.g. Frauenfelder & Kearns, 1996). In their seminal study, Mehler, Dommergues, Frauenfelder and Segui (1981) showed that French listeners detect a sequence of two or three phonemes faster when it matches exactly the first syllable of a subsequent auditory carrier word than when it does not. For example, pa was detected faster in pa.lace than in pal.mier, while pal was detected faster in pal.mier than in pa.lace. The authors concluded that "the syllable is probably the output of the segmenting device operating upon the acoustic signal. The syllable is then used to access the lexicon" (p. 342).
However, a major conceptual change in theories of speech segmentation emerged with the cross-linguistic follow-up study of Cutler, Mehler, Norris, and Segui (1983, 1986). They showed that while French listeners syllabified English words, English listeners did not show a syllabic effect with either French or English materials in the fragment detection task. This cross-linguistic difference was interpreted as a consequence of the specific phonological structures of the two languages. That is, French listeners segment the speech stream into syllables because their native language displays clear and little-diversified syllabic structures. English listeners do not use such a strategy because English presents both widespread ambisyllabicity, i.e., consonants belonging to two syllables at once (see Kahn, 1980; Kager, 1989), and a larger variety of syllabic structures than French (see Goldman, Content, & Frauenfelder, 1996).
The idea that syllabic structures are inappropriate parsing units in English led Anne Cutler and her colleagues to consider other phonological properties of that language. An essential characteristic of English and other stress-based languages like Dutch is the metrical distinction between strong syllables, which have full vowels, and weak syllables, which have reduced vowels (usually a schwa). Since most English words display word-initial strong syllables (Cutler & Carter, 1987), which is also the case in Dutch (Vroomen & de Gelder, 1995), Cutler and Norris (1988) proposed that an efficient strategy in such languages is to segment the speech stream and start a lexical access attempt at every strong syllable.
Laboratory-induced juncture misperceptions provide strong empirical evidence supporting the use of this Metrical Segmentation Strategy (henceforth, MSS) in both English (Cutler & Butterfield, 1992) and Dutch (Vroomen, van Zon, & de Gelder, 1996). These slips of the ear, which are word boundary localisation errors, are observed when listeners are asked to recognise spoken sequences on the basis of partial acoustic cues, for example when these sequences are presented at individual speech perception threshold. In both the English and the Dutch studies, such juncture misperceptions were significantly related to the metrical structure of the sentences: listeners erroneously inserted word boundaries mainly before strong syllables and deleted word boundaries mainly before weak syllables (e.g., conDUCT asCENTS upHILL perceived as the DOCtor SENDS her BILL in Cutler & Butterfield's study).
Word-spotting experiments, in which listeners are asked to detect any real word embedded in nonsense disyllabic strings (McQueen, 1996), also support the use of the MSS in English (Cutler & Norris, 1988; McQueen et al., 1994). For example, Cutler and Norris (1988) showed faster and more accurate detection of the words (e.g., MINT) when the second syllable of the string was metrically weak (e.g., in MINtesh, /mintəʃ/) than when it was metrically strong (e.g., in MINTAYVE, /minteif/). This was predicted by the MSS, according to which the second syllable triggers segmentation only when it is strong, i.e. only in MINTAYVE, thus requiring assembly of the speech material across a segmentation point for successful detection.
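The segmentation rule proposed by Cutler and Norris (1988), namely opening a new lexical access attempt at every strong syllable, can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function name `mss_segment`, the orthographic syllabification min.tesh / min.tayve, and the boolean strong/weak coding are all assumptions made for the example.

```python
# Illustrative sketch only: syllabification and strong/weak labels are
# assumptions made for this example.

def mss_segment(syllables):
    """Group syllables into chunks, opening a new chunk at every strong
    syllable. `syllables` is a list of (text, is_strong) pairs."""
    chunks = []
    for text, is_strong in syllables:
        if is_strong or not chunks:
            chunks.append([text])      # segmentation point: start a new chunk
        else:
            chunks[-1].append(text)    # weak syllable joins the current chunk
    return chunks


MINTESH = [("min", True), ("tesh", False)]    # strong-weak
MINTAYVE = [("min", True), ("tayve", True)]   # strong-strong

print(mss_segment(MINTESH))    # one chunk: no boundary inside the string
print(mss_segment(MINTAYVE))   # two chunks: boundary before the second syllable
```

Under this assumed syllabification, the final consonant of the target MINT falls in the second syllable, so in the strong-strong string the target straddles a segmentation point, whereas in the strong-weak string it lies within a single chunk.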
The metrical effect observed by Cutler and Norris (1988) was replicated only in accuracy, not in reaction times, in a similar word-spotting experiment conducted in Dutch (e.g., detection of MELK in either /melkos/ or /melkəs/; Vroomen et al., 1996). This suggests that the rhythmic cues used for speech segmentation in Dutch differ from those used in English, which might be a consequence of the different linguistic characteristics of these two languages (for a review, see Cutler & van Donselaar, in press). In particular, whereas vowel quality differences determine the metrical status of syllables in English (Fear, Cutler, & Butterfield, 1995), many weak syllables contain unstressed unreduced vowels in Dutch (Quené, 1993; Quené & Smith, 1992; Quené & Koster, 1998; Vroomen & de Gelder, 1995). For example, the weak syllables of words like cigar or cobra, which nearly always contain reduced vowels in English, display unreduced vowels in Dutch (Cutler & van Donselaar, in press; Koster & Cutler, 1997). Therefore, vowel quality is not a reliable predictor of the metrical status of the syllable in Dutch. Since Dutch word-initial strong syllables often bear primary stress (e.g. Kager, 1989; van der Hulst, 1984), the most obvious acoustic characteristics other than vowel quality that predict metrical structure are the suprasegmental manifestations of primary stress, i.e., longer duration, higher intensity, and a flatter spectrum (Sluijter, 1995). Vroomen and de Gelder (submitted) provided support for the idea that Dutch listeners use a Stress Based Segmentation strategy (henceforth, SBS), according to which word boundaries are better signalled by the degree of syllable stress than by the occurrence of a metrically strong syllable.
These authors indeed showed that the spotting of a Dutch disyllabic word like KRAter (meaning crater and pronounced as /ˈkratər/²) embedded in trisyllabic strings was faster when the first syllable of the target word was realised with primary stress (e.g., /ˌpɔˈkratər/) than when it was realised with secondary stress (e.g., /ˈpɔˌkratər/). This difference cannot be accounted for by the MSS, which attributes the same segmentation power to any syllable bearing a full vowel. Nor can it be attributed to acoustic differences between the two sets of trisyllabic carriers, since the effect was still observed after these differences were factored out of the response times. Vroomen, Tuomainen and de Gelder (1998) further showed that, in a learning task where listeners had to recognise new "words" previously presented in the context of an artificial continuous speech stream (cf. Saffran, Aslin, & Newport, 1996; Saffran, Newport, & Aslin, 1996), Dutch listeners (but not French listeners) performed better at recognising initial-stressed than non-initial-stressed "words" in materials containing full vowels exclusively. Taken together, these results strongly suggest that stress is a determinant cue for speech segmentation in Dutch, and that vowel quality is less important in Dutch than in English.
Recent research has further suggested that stress may also play an important role in speech segmentation and lexical access in English. Fine-grained stress discrimination, which is a prerequisite for being able to use stress cues in speech segmentation, can be performed by English listeners in forced-choice discrimination (Mattys, 2000) or cross-modal fragment priming (Cooper, 2000). Mattys (2000; see also Mattys & Samuel, 2000) further argued that an SBS strategy would be more efficient than the MSS in English, since nearly half of the English words contain at least two strong syllables.
With such words, a recogniser provided with the MSS would postulate at least one erroneous word boundary, and would therefore trigger at least one erroneous lexical attempt. Mattys (2000) presented lexical statistics showing that most English words start with a primary stressed syllable, at least when weighted by frequency of occurrence, to argue that the SBS would improve the recogniser's accuracy in comparison with the MSS.
The use of stress cues in segmenting English has also been supported by studies on very young English learners. Vroomen and de Gelder (submitted) pointed out that stress pattern was confounded with metrical pattern in some of the studies showing infants' preference for the predominant strong-weak metrical pattern of English words (Jusczyk, Cutler, & Redanz, 1993; see also Echols, Crowhurst, & Childers, 1997; Morgan, 1996). For instance, Morgan (1996) showed that nine-month-olds find trochaic sequences more cohesive than iambic sequences when the material includes only full vowels, i.e. differs only in duration pattern (long-short vs. short-long). Together with other similar findings (Mattys, Jusczyk, Luce, & Morgan, 1999), this suggests that 9-month-old English learners use the distinction between primary and secondary stress, rather than the metrical distinction between full and reduced vowels, to compute word boundaries. However, since other studies showed that, by 7.5 months, English learners exploit the metrical strong/weak distinction in extracting words from connected speech (Jusczyk, Houston, & Newsome, 1999; see also Jusczyk & Aslin, 1995), this issue deserves further investigation.
Whatever the exact basis (stress or metrics) of the rhythmic strategy applied in Dutch and English, the important point is that this strategy has been considered to be similar to the syllabic strategy observed in French, since "both stress in English and the syllable in French are the basis of rhythmic structure in their respective language" (Cutler et al., 1997, p. 147; see also Cutler, Mehler, Norris, & Segui, 1992). This led to the hypothesis that listeners apply a universal rhythmic solution to the word-boundary problem, by exploiting whatever rhythmic structure characterises their native language. Language-specific implementations of the universal rhythmic segmentation strategy would be rooted in specific capacities developing from infancy on (Cutler, 1994): infants enter the world with a "periodicity bias" (Cutler & Mehler, 1993, p. 17) that enables them to pick out the smallest recurring rhythmic regularities from the speech stream, which then allows them to progressively develop discrete lexical entries from the continuous signal.
This view is reminiscent of the early typological work of linguists like Pike (1945) or Abercrombie (1967), who classified languages into rhythmic classes. These authors proposed to distinguish between stress-timed languages (like English and Dutch), which were said to display regular inter-stress intervals, and syllable-timed languages (like French and other Romance languages), in which syllables, rather than only stressed vowels, were said to recur at regular intervals of time, establishing temporal organisation. This way of clustering languages, though criticised as too simplistic (e.g., Bertinetto, 1989; Dauer, 1983; Nespor, 1990) and as not fitting different temporal regularities³, is still useful. Indeed, the impression of different rhythmic types may be the by-product of the specific phonological properties of languages. In particular, syllable complexity (Bertinetto, 1981; Dauer, 1983) seems to have reliable acoustic/phonetic correlates in speech, like the vowel/consonant temporal ratio (Ramus, Nespor, & Mehler, 1999).
The rhythmic segmentation hypothesis proposed by Cutler and colleagues implies that if a language presents a rhythmic structure based on some phonological construct other than the syllabic or stress pattern, this construct should be used in segmentation. This has been illustrated with Japanese, which belongs to a third category of "mora-timed" languages⁴ (Ladefoged, 1975; see also Port, Dalby, & O'Dell, 1987). Otake, Hatano, Cutler and Mehler (1993) observed in a Japanese fragment detection task that (mora) targets (e.g., ta) were detected as easily in CVC carriers (e.g., tan.shi) as in CV carriers (e.g., ta.ni.shi). In addition, Japanese listeners often missed the CVC target (tan) in CV carriers, presumably because this matching requires an internal segmentation of the second mora (ni).
Second, as Kolinsky (1998) further argued, the fragment detection task may tap post- rather than prelexical representations (see also Frauenfelder & Content, 1999; Meunier et al., 1997). If this is the case, literacy-induced metaphonological or orthographic representations may intervene in the fragment detection task (see also Dupoux & Mehler, 1992). Japanese provides the clearest demonstration of this possibility. Indeed, Kolinsky (1998) suggested that the "moraic" effect observed by Otake et al. (1993) could alternatively result from the application of an orthographic strategy based on the written kana characters, which coincide regularly with the mora structure⁵. Recent evidence supporting this interpretation has been provided by Inagaki, Otake and Hatano (2000). Testing Japanese adults as well as children of various levels of kana literacy with a Japanese version of the fragment detection task, these authors observed that the mora-based segmentation pattern was strongly associated with kana reading level. This result led the authors to conclude that, as children acquire kana literacy, the Japanese segmentation unit changes from a mixture of syllable- and mora-based to a strictly mora-based one.
We may of course accept the notion that literacy affects speech segmentation at an early processing level. Yet, a more conservative view, compatible with other experimental evidence (see discussion in Kolinsky, 1998, and in Morais & Kolinsky, 1994), would be that the fragment detection task taps later, post-lexical representations. This further complicates the empirical verification of the notion that Romance and Germanic languages induce radically different segmentation strategies. Up to now, while the use of a syllabic segmentation routine has been investigated by means of the fragment detection task, the MSS has been assessed using other experimental techniques, mainly the word-spotting task.
⁵ Regarding the CVC targets, the effect observed by Otake et al. (1993) may also result from phonetic/phonotactic mismatch between targets and carriers (Nakamura, Kolinsky, Spagnoletti & Morais, 1998).
These two tasks differ according to several parameters. Whereas listeners are required both to segment words from continuous speech and to access the lexical entries corresponding to these words in word spotting, this is not the case in fragment detection. First, the experimenter has essentially solved the segmentation problem when he/she presents the listeners with isolated carrier words and targets matching the onsets of these carriers (Frauenfelder & Content, 1999). As Kearns (1994) pointed out, no conclusion can be drawn from these data concerning the idea that French listeners apply a syllabic segmentation procedure to the continuous speech input in everyday contexts. Second, no lexical access attempt is required to perform the fragment detection task⁶. More importantly, as will be discussed next, the two tasks actually reflect important conceptual differences between the underlying assumptions of the MSS and the original conception of word segmentation that led to the development of the fragment detection task.

From Uncovering the Nature of Classification Units to Determining the Cues Indicating Word Boundaries
Direct comparisons of Romance and Germanic languages using the same technique are rather scarce. Cutler and colleagues (reported in Cutler & Norris, 1988, p. 114) did perform such a comparison by examining the English alternative to the syllable hypothesis, which would consist in classifying the speech input into feet⁷. Yet, they did not observe the corresponding effects: for English speakers, fragment detection was not faster when the target, for example GAR, corresponded exactly to a foot, as in GARGOYLE, which includes two feet, than when it was smaller than a foot, as in GARgle, which constitutes one foot (but see Echols et al., 1997). The word-spotting task was then designed "to put on a test, in a way that directly measures speech recognition processes, the hypothesis that segmentation for lexical access occurs at strong syllables" (Cutler & Norris, 1988, p. 114). This citation illustrates how the focus of research shifted from uncovering the nature and size of the prelexical unit(s) towards investigating the cue(s) that may best indicate where word boundaries are likely to occur in the speech stream. Indeed, the fragment detection technique was aimed at uncovering the classification representations that are computed from the auditory input to contact the lexical representations (e.g., Frauenfelder & Tyler, 1987). Under this view, a prelexical representation of the signal is constructed as a sequence of specific units (e.g., feet, syllables, etc.). In contrast, the word-spotting task was aimed at testing the MSS, which is a segmentation device that makes no claim about classification processes. Since its role is merely to indicate where in the speech stream lexical access must be initiated, the MSS is compatible with models of speech perception involving classification as well as with models involving no prelexical units at all.
Despite the different nature and focus of the fragment detection and word-spotting tasks, the syllabic segmentation strategy and the MSS were repeatedly considered as equivalent (although language-specific) solutions applied to the word boundary problem. It was indeed argued that even if the fragment detection task does not directly address the issue of segmentation, the classification of the speech input into any set of units is logically entailed by a segmentation process at the boundaries of these units (Norris & Cutler, 1985; Cutler & Norris, 1988).
If syllables were important segmentation cues in French, syllabic information should help French listeners to detect word boundaries efficiently in fluent speech. This is what has been shown for metrical cues to support the MSS. This model has been underpinned by computer simulations showing the independent contribution of metrical cues in predicting the accuracy of the speech recognition process (see Norris et al., 1995; Cutler, Norris, & McQueen, 1996) and by distributional analyses performed on the English and Dutch vocabulary showing a systematic relationship between metrical patterns and word boundary locations (Cutler & Carter, 1987; Vroomen & de Gelder, 1995; see also Cutler, 1994). These analyses led to the conclusion that the false alarm rate of the MSS would "be low in comparison with a lexical segmentation procedure that considered each phoneme or syllable to be a potential word onset location" (Cutler & Norris, 1988, p. 114). As has already been discussed, the same arguments were used to support the SBS (e.g., Mattys, 2000). There is to our knowledge no similar argument as regards the syllable in French or other Romance languages. On the contrary, a rough look at the distribution of French words according to their number of syllables reveals that no less than 93% of the 37,000 words included in the BRULEX database (Content, Mousty, & Radeau, 1990) are polysyllabic, the majority being di- or trisyllabic. When only the 1000 most frequent words of this database are taken into account, about two thirds of them include at least two syllables. Thus, a syllabic segmentation procedure in French would lead to a huge number of erroneous lexical attempts. In fluent speech, a pure syllabic segmentation strategy would sometimes run into trouble even for monosyllables: because of frequent phenomena like resyllabification and liaison⁸, syllabic boundaries do not always coincide with word boundaries (e.g., Dejean de la Bâtie & Bradley, 1995).
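The false-alarm argument above can be illustrated with a small counting sketch in Python. It is a toy calculation under stated assumptions, not a reanalysis of BRULEX or of any published corpus: words are reduced to hypothetical strings of `S` (strong) and `W` (weak) syllables, the five patterns in `TOY_PATTERNS` are invented for the example, and only word-internal false starts are counted (missed onsets of weak-initial words are ignored).

```python
# Illustrative sketch only: TOY_PATTERNS is an invented mini-lexicon of
# metrical patterns, not data from any database cited in the text.

def false_starts_syllabic(pattern):
    """A procedure that treats every syllable as a potential word onset
    opens one unwarranted lexical attempt per non-initial syllable."""
    return len(pattern) - 1


def false_starts_mss(pattern):
    """A procedure that segments only at strong syllables opens one
    unwarranted attempt per non-initial strong syllable."""
    return sum(1 for syllable in pattern[1:] if syllable == "S")


TOY_PATTERNS = ["S", "SW", "WSW", "SWS", "SWW"]

total_syllabic = sum(false_starts_syllabic(p) for p in TOY_PATTERNS)
total_mss = sum(false_starts_mss(p) for p in TOY_PATTERNS)

print("syllabic:", total_syllabic)
print("metrical:", total_mss)
```

The point of the sketch is only the form of the comparison: the more polysyllabic words a vocabulary contains, the more word-internal syllables a purely syllabic procedure would treat as candidate onsets, whereas a cue that is largely confined to word edges keeps that count low.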
One is thus faced with a rather asymmetrical account of speech segmentation, since a syllabic segmentation procedure would be obviously less efficient for French than the MSS (or SBS) for English and Dutch. Does one have to conclude that French is intrinsically harder to parse than English or Dutch? We know of no evidence suggesting that this is the case. Rather, as pointed out by Kolinsky (1998), we should question the notion that cues used for segmentation are necessarily isomorphic to classification units. Alternatively, we hypothesise that segmentation in French is triggered by (at least partially) other cues than syllable boundaries, even if the classification format may be syllabic. In the next section, it will be argued that French listeners exploit stress cues in locating word boundaries, as was already suggested for Dutch and English listeners.

Rhythmic Regularities in French
French has fixed final stress. More precisely, syllables are grouped into right-headed prosodic units that generally do not exceed two or three syllables (Wenk & Wioland, 1982). Consequently, most polysyllabic words with a non-schwa final syllable bear stress on their last syllable. Stressed syllables are mainly characterised by a sizeable lengthening in comparison with unstressed syllables (Fletcher, 1991; Garde, 1968; Tranel, 1987; Vaissière, 1983, 1991; Wenk & Wioland, 1982). Although this final lengthening is often accompanied by a falling or rising F0 movement, it is perceptually salient because the durational increase widely exceeds the perception threshold of duration differences (Rossi, 1972). This led Wenk and Wioland (1982) to characterise French as being "trailer-timed" rather than "syllable-timed": its basic rhythmic unit is an iambic foot characterised by a short-long durational pattern.
As Cutler et al. (1997) proposed, "it might be imagined that fixed stress could provide an excellent cue to word-boundary location" (p. 146; see also Cutler, 1990; Vaissière, 1983). Such a segmentation device based on stress may even be more effective in fixed-stress languages than in variable-stress languages like English or Dutch, in that it constitutes a deterministic cue for word boundary location only in the former case.
⁸ Resyllabification occurs when a word-final consonant becomes the onset of the syllable of a following word beginning with a vowel (e.g. par ici syllabified pa.ri.ci). Liaison is a particular instance of resyllabification in which the first word ends in a normally silent consonant, like the /t/ of the French word petit. This consonant surfaces when it is followed by a vowel-initial word (as in petit air), resulting in the resyllabification of the surfaced latent consonant across word boundaries (pe.ti.tair).
Several studies have provided support for the idea that stress helps word boundary localisation in French. A pioneering study by Rietveld (1980) showed that French speakers produce reliable suprasegmental contrasts when asked to pronounce phonologically ambiguous sequences like "le couple est complet" (the pair is complete) vs. "le couplet complet" (the complete verse). At the durational level, the critical sequence presented a trochaic (long-short: /ku:plɛ/) or an iambic (short-long: /kuplɛ:/⁹) pattern, respectively. In a subsequent perception experiment in which French listeners were asked to choose between the two semantic interpretations, the durational contrasts provided the best predictor of the participants' responses. Banel and Bacri (1994) addressed more directly the use of syllabic lengthening as a word boundary marker in French. When asked to report the number of perceived words in ambiguous disyllabic strings, French listeners more often perceived a single word (e.g., /kɔrbo/, crow) when the strings presented the typical French iambic pattern, but two words (e.g., /kɔr/, body, and /bo/, beautiful) when the strings presented the unusual trochaic pattern. In a word-spotting task, Banel and Bacri (1997) further observed faster detection times for monosyllabic words (e.g., lampe, /lɑ̃p/) in trochaic sequences (/lɑ̃:pzak/) than in iambic sequences (/lɑ̃pza:k/). In the same vein, Banel, Frauenfelder and Perruchet (1998) showed better recognition performance in the learning of an artificial language for trisyllabic "words" that displayed the typical iambic pattern of French words (short-short-long) than for "words" that displayed a trochaic (long-short-short) or neutral (long-long-long) rhythmic pattern.
Taken together, these studies support the idea that French listeners treat lengthened syllables as word-offset markers, thus postulating a word boundary after stressed syllables. However, the durational contrasts used by Banel and co-workers reproduced the ratio observed between syllables of words pronounced in isolation. Such differences (e.g., 520 milliseconds between short and long syllables) are less likely to occur in connected speech, where temporal contrasts are far less marked (e.g., Klatt, 1975).
To further examine the use of syllabic lengthening in the segmentation of continuous and natural speech in French, we applied Cutler and Butterfield's (1992) "juncture misperception" method to French listeners (Goetry, Van de Velde, & Kolinsky, in progress). Participants were asked to write down what they thought they had heard of parts of sentences presented at individual speech perception threshold. The relationship between the rhythmic pattern and the word boundaries of identical phonemic strings (e.g., /livre/) was manipulated across sentences. In one type of sentence (henceforth, trochaic 2-words), the sequence included two (parts of) words and formed a trochaic pattern, e.g., /li:vr#e/¹⁰ in "j'ai vu que les livres, essentiels à notre époque, se vendaient très mal" ("I saw that the books, essential in our epoch, were sold very badly"). In another sentence type (henceforth, iambic 2-words), the phonemic string also contained two words but formed an iambic pattern, e.g., /li#vre:#/ in "j'ai vu que les lits vrais, sans cesse vantés, sont plus confortables que ces paillasses" ("I saw that the true beds, unceasingly praised, were more comfortable than those pallets"). In a third sentence type (henceforth, iambic 1-word), the sequence contained a single disyllabic word and displayed an iambic pattern, e.g. /livre:#/ in "j'ai vu que les livrets, censés détenir les comptes du commerce, étaient vides" ("I saw that the booklets, supposed to contain the accounts of the firm, were empty"). According to the idea that in French a trochaic pattern should trigger segmentation while an iambic pattern should not, we predicted that the sequences including two words would be correctly segmented when trochaic but would often be misperceived as a single disyllabic word when iambic. In addition, few segmentation errors were expected for the iambic sequences including one word, since their stress pattern should not trigger segmentation.
The results largely confirm these predictions by showing that the French listeners produced many more segmentation errors for the iambic 2-words sequences than for the two other sequence types. This supports and extends the idea that stress cues assist speech segmentation in French by showing that rhythmic effects emerge for temporal contrasts displayed in continuous and natural speech.

Rhythmic Regularities in Dutch: A Direct Comparison With French
The basic iambic rhythmic structures of French words differ fundamentally from those of "leader-timed" languages (cf. Wenk & Wioland, 1982) like Dutch or English, in which stress usually falls on the initial syllables of words (for Dutch, see Vroomen & de Gelder, 1997b; Vroomen et al., 1998; for English, see Mattys, 2000). These languages present basic trochaic, left-headed, rhythmic structures. As already mentioned, direct evidence for the use of stress as a word-onset marker has been provided in Dutch (Vroomen & de Gelder, 1997b, submitted; Vroomen et al., 1998).
¹⁰ The symbol # indicates a word boundary.

Specific segmentation procedures across languages should be assessed with similar tasks (see sections I.2 and I.3). The fact that French and Dutch present contrasting stress structures provides a very good testing ground for examining the notion that listeners develop specific segmentation devices based on the recurrent rhythmic or stress regularities displayed in their native language (Cutler, 1985; Cutler et al., 1983, 1986). To this aim, we used a cross-linguistic design in which Dutch listeners were presented with Dutch material matched to the French material (Goetry et al., in progress). If Dutch listeners use syllabic lengthening (together with other stress cues) as a word-onset marker, they should display opposite segmentation patterns for the three sequence types compared to those observed for the French listeners. That is, the Dutch listeners should produce far fewer segmentation errors for the iambic 2-words than for the trochaic 2-words sequences, as well as many segmentation errors for the iambic 1-word sequences. Indeed, the more prominent second syllable of the iambic 2-words sequences was expected to correctly trigger segmentation, thus leading to the correct detection of the medial word boundary. Conversely, the less prominent second syllable of the trochaic 2-words sequences should not trigger segmentation and thus should be perceived together with the preceding one as a single disyllabic word. Likewise, we predicted that the more prominent second syllable of the iambic 1-word sequences would erroneously trigger segmentation and induce the perception of two monosyllabic words.
The results computed on both the total set of responses and the relative proportions of segmentation errors corroborated these predictions. They showed a reversed rhythmic effect for the two sequence types containing two words, compared with what was observed for the French listeners. Moreover, about 85% of the segmentation errors made by the Dutch listeners were distributed across the trochaic 2-words and iambic 1-word sequences, in which the rhythmic patterns are incongruent with word boundaries.
A control experiment showed that the difference between the French and Dutch listeners could not be attributed to some acoustical mismatch between the two material sets. Indeed, another group of French listeners presented with the Dutch material displayed segmentation patterns similar to those observed for the French participants tested in their native language, thus differing significantly from the Dutch listeners presented with the same Dutch material.
Taken together, these cross-linguistic results show that rhythmically matched materials lead to opposite segmentation patterns in French and Dutch listeners, thus suggesting that stress cues are exploited in opposite ways in these two languages. In French, a "trailer-timed" language, disyllabic sequences are interpreted as a single word when they display an iambic pattern but as two monosyllabic words when they display a trochaic pattern. In a "leader-timed" language like Dutch, stress cues would indicate the beginning of words, thus leading listeners to perceive a disyllabic trochaic sequence as a single lexical item but an iambic sequence as two monosyllabic words.
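The contrast summarised above can be made concrete with a small illustrative sketch. It is not the procedure used in the experiments; it only encodes the two idealised heuristics discussed in the text, under the assumption that a listener posits exactly one boundary relative to the stressed (lengthened) syllable. The function name `predicted_parse` and the strategy labels `"offset"` (French-like, boundary after the stressed syllable) and `"onset"` (Dutch-like, boundary before the stressed syllable) are hypothetical names introduced for illustration.

```python
def predicted_parse(syllables, stressed_index, strategy):
    """Predict how a syllable sequence is grouped into words.

    Hypothetical toy model, not taken from the studies discussed:
    - "offset": stress marks a word offset, so a boundary is placed
      AFTER the stressed syllable (French-like heuristic).
    - "onset": stress marks a word onset, so a boundary is placed
      BEFORE the stressed syllable (Dutch-like heuristic).
    """
    if strategy == "offset":
        cut = stressed_index + 1
    elif strategy == "onset":
        cut = stressed_index
    else:
        raise ValueError("strategy must be 'offset' or 'onset'")
    left, right = syllables[:cut], syllables[cut:]
    # Drop an empty side: a boundary at the edge of the sequence
    # does not split it.
    return ["".join(part) for part in (left, right) if part]


sequence = ["li", "vre"]
for label, stressed in (("trochaic", 0), ("iambic", 1)):
    for strategy in ("offset", "onset"):
        print(label, strategy, predicted_parse(sequence, stressed, strategy))
```

Under these assumptions, the offset heuristic splits the trochaic sequence and keeps the iambic one whole, while the onset heuristic does the reverse, which mirrors the opposite patterns described for the French and Dutch listeners.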
Since the different rhythmic structures of French and Dutch seem to induce native monolingual listeners to use opposite stress-based segmentation strategies, one may wonder what happens in bilinguals who master these two languages: may opposite strategies coexist in the same individual? This issue will be discussed in the final part of the present paper.

II. Speech Segmentation in Bilinguals
Although one may consider that "genuine" bilingualism refers to a native and equal competence in more than one language (e.g., Thiery, 1976, 1978), most researchers consider different degrees of bilingualism to range along a continuum (e.g., Baetens Beardsmore, 1986; de Groot & Kroll, 1997; Grosjean, 1982; Romaine, 1995; Schaerlaekens, 1998). It should be noted that whereas few (if any) people represent the bilingual extremity of this continuum (see e.g., Flege, Yeni-Komshian, & Liu, 1999; Pallier, Sebastián-Gallés, & Colomé, 1999), the opposite extreme, that is, pure monolingualism, is no more representative of human linguistic knowledge. Indeed, it has been estimated that more than one person out of two uses at least two languages in everyday interactions (Grosjean, 1982; Harris & Nelson, 1992). Any general model of speech processing should thus be able to account for the understanding of more than one language.
Paradoxically, studies relating speech segmentation to bilingualism are scarce, and, as will be argued, most of them provide us with an incomplete picture of speech processing in bilinguals since they focus on the availability of a syllabic segmentation procedure in these listeners.

II.1. The Hypothesis of Restricted, Mutually Exclusive, Segmentation Routines
To our knowledge, Cutler, Mehler, Norris and Segui (1992) conducted the only study that explored the availability of specific segmentation procedures in bilinguals' two languages. These authors presented French-English bilinguals with the experimental situations that had been used previously to demonstrate syllabic parsing in French monolinguals (that is, fragment detection, Mehler et al., 1981) and metrical parsing in English monolinguals (that is, word-spotting, Cutler & Norris, 1988). The participants had learned both languages from the earliest stages of acquisition, spoke both languages daily, and were considered by monolinguals as native speakers in each language.
The fragment detection results showed that neither for the English nor for the French material did the bilingual group as a whole produce results analogous to those of either previously studied monolingual group. To look further for a reflection of their findings with monolinguals, the authors subdivided the bilinguals according to several criteria, including their country of residence, parents' native language, and preferred language. The factor dividing the bilinguals according to their preferred language (called "dominant" by the authors), which amounted to a decision as to which of their two languages they would be most sorry to lose if they had to, produced interpretable data. Indeed, the "English-dominant" bilinguals showed no syllabic effect in either language, while the "French-dominant" bilinguals produced a syllabic effect in French but not in English. This was interpreted as evidence for the development of a restricted segmentation procedure based on the bilinguals' "dominant" language, in this case, syllabic for the "French-dominant" bilinguals and non-syllabic for the "English-dominant" bilinguals. Cutler et al. found confirmation of the idea that bilinguals develop an adapted segmentation procedure restricted to their "dominant" language with the English word-spotting experiment. In this task, significantly faster target detection in the strong-weak than in the strong-strong items, indicative of the use of the MSS, was observed for the "English-dominant" but not for the "French-dominant" bilinguals.
From the entire set of results, Cutler et al. (1992) concluded that the restricted speech segmentation procedures are mutually exclusive. In other words, bilinguals would behave functionally as monolinguals in some aspects of their processing, hence developing a single restricted segmentation procedure adapted to their "dominant" language.
In the next sections, the two main issues related to Cutler et al.'s (1992) influential claims will be examined more closely, namely the relationship between language dominance and the restricted segmentation procedure (section II.2) and the mutual exclusivity of the restricted segmentation procedures (sections II.3, II.4 and II.5).

II.2. Is Language Dominance Actually Predictive of the Restricted Segmentation Procedure?
Cutler et al. (1992) were more interested in knowing whether balanced bilinguals can process their two languages in the same way as the respective monolingual groups than in characterising their absolute competence levels.
These authors nevertheless attempted to establish whether the participants' way of responding in the two languages was related to their language dominance. However, we have no idea of the relationship that may exist between the actual competence-related language dominance of these bilinguals and Cutler et al.'s classification criterion for assessing the bilinguals' "dominance", which was based on a single forced-choice preference question. Moreover, the participants examined by Cutler et al. were very reluctant to state any preference for one language over the other. In fact, they often claimed to prefer French for some purposes but English for other purposes, hence the necessity to present a forced-choice question (see also Kearns, 1994, for similar concerns). As Grosjean (1998) pointed out, one does not know on which criteria the participants answered this difficult question, and, therefore, what kind of variables underlay the different patterns of results observed in the "French-dominant" and "English-dominant" subgroups.
Consequently, the replication of these results with other participants may be very difficult. Kearns (1994, Experiment 5) examined French-English balanced bilinguals comparable to those tested by Cutler et al. (1992), using similar fragment detection tasks and classifying them on the basis of the same question. Surprisingly, she observed a syllabic effect for the French material in the "English-dominant" subgroup but not in the "French-dominant" subgroup (neither of the two groups syllabified the English material). Kearns further reasoned that dominance in balanced bilinguals might be better assessed on the basis of the participants' linguistic background and linguistic habits rather than on the basis of their preferred language. She computed for each bilingual an "Englishness score" derived from a questionnaire related to the acquisition and use of the two languages. Contrary to what was predicted, correlational analyses showed a positive relationship between this Englishness score and the syllabic effect in French. Multiple regressions revealed that the first spoken language was the best predictor of the syllabic effect in French: the bilinguals were more likely to show a syllabic effect if they had spoken English first. Further analyses performed on the French data showed a clear syllabic effect for the bilinguals who had spoken English first, but not for those who had spoken French first.
Thus, even if balanced bilinguals seem to separate into two groups of which only one shows a syllabic effect in French, the "dominant" language (as defined either in terms of preferred language or in terms of "most native" language) is not predictive of the use of a syllabic segmentation procedure. What would induce only some balanced bilinguals to parse the French input into syllabic units has so far escaped clear understanding.
This problem is further complicated by the extreme difficulty of assessing the multi-dimensional nature of dominance in this population, if dominance is a meaningful concept at all for such bilinguals (e.g., Grosjean, 1982; Kearns, 1994; Romaine, 1995; Schaerlaekens, 1998). As Schaerlaekens (1998) pointed out, "bilingualism is a loose concept that covers many quantitative variants, and within each variant there are also many qualitative nuances" (p. 131). While traditional competence-based measures of dominance may be unsuitable for early balanced bilinguals, some authors have argued that differences in performance in the two languages may nevertheless be revealed under specific conditions. For example, Baetens Beardsmore (1986) and Dornic (1978, 1981) both pointed out that stress and fatigue can reveal differences in ease of use of two languages even in early balanced bilinguals. Kearns (1994, Experiments 6 and 7) showed that early balanced bilinguals selected on the same criterion as those used by Cutler et al. (1992) displayed unequal performances in their two languages, as compared to monolinguals, when required to recognize speech under difficult listening conditions (i.e., faint speech or speech in noise). Recent studies on early bilinguals have reported subtle differences between their performances in their two languages as regards both phonetic perception (Bosch, Costa, & Sebastián-Gallés, 2000; Pallier, Bosch, & Sebastián-Gallés, 1997; Sebastián-Gallés & Soto-Faraco, 1999) and lexical activation (Pallier et al., 1999). For example, Pallier et al. (1999) examined the abilities of early Spanish-Catalan bilinguals to identify words differing on a vocalic contrast which exists in Catalan (i.e., /ɛ/ vs. /e/) but not in Spanish (which has only /e/). In a lexical decision task, the Spanish-dominant bilinguals treated Catalan minimal pairs like /nɛtə/ (granddaughter) vs. /netə/ (clean) as if they were homophones.
That is, they showed a repetition priming effect, suggesting that they did not correctly perceive the Catalan vocalic contrast. By contrast, the Catalan-dominant bilinguals did not show any repetition priming effect for these minimal pairs. Interestingly, no significant overall difference in reaction times and error rates was observed between the two bilingual groups, which thus displayed equivalent competence at the vocabulary level.
It thus seems possible to reveal fine-grained differences between the performances in the two languages of early bilinguals. Yet, neither Cutler et al. (1992) nor Kearns (1994, Experiment 5) presented their bilingual participants with tests assessing competence in their two languages. It seems worth examining in further bilingual studies whether or not there is a relationship between the performance level of bilinguals in their two languages and the monolingual-like speech segmentation procedure they may apply to these languages.
Whether defined in terms of competence or in terms of "nativeness", the dominant language of many other (unbalanced) bilinguals, who have acquired their second language many years after their native language, is much less problematic to determine. Bradley, Sánchez-Casas and García-Albea (1993) turned to native Spanish speakers who subsequently learned English through immersion in an English-speaking community for a long period (namely, for 18 years on average). According to the idea that listeners' segmentation routines depend on the rhythmic regularities encoded during childhood (cf. e.g., Cutler & ...), these unbalanced bilinguals were predicted to show a syllabic effect in their native language. Unexpectedly, this group showed no tendency to syllabify Spanish. It is difficult to accept the idea that Spanish-English bilinguals abandon their native processing routine when we have no evidence that they have ever applied this routine to their native input. Even if a syllabic effect had been observed with the same material in Spanish monolinguals (Bradley et al., 1993), other experiments have not replicated these results unless listeners were artificially slowed down (Sebastián-Gallés et al., 1992). Moreover, the bilinguals' results of Bradley et al. have been replicated in another study conducted by Kearns (1994, Experiment 4) on French native speakers acquiring English as a second language. That is, these listeners did not show any sign of syllabification of the French material in a fragment detection task.
Thus, one is still left with a puzzle. What seems clear is that dominance is not a good predictor of the use of a syllabic segmentation procedure, since late unbalanced bilinguals whose native language is Spanish or French seem not to use a syllabic segmentation strategy to process that language. As Kearns (1994) pointed out, more variables may influence the way in which bilinguals process speech, in comparison to monolinguals, since the former have a wider variety of linguistic knowledge to apply to the input than the latter. For example, bilinguals may not have exactly monolingual-like phonemic categories (Elman, Diehl, & Buchwald, 1977; Flege & Eefting, 1987; Hazan & Boulakia, 1993) and phonological representations (Bosch et al., 2000; Sebastián-Gallés & Soto-Faraco, 1999; Pallier et al., 1997). According to Kearns (1994), this may hold true for other factors that influence the phonological information about syllable boundaries in the speech stream. For example, within the frame of parameter-setting (e.g., Roeper & Williams, 1987), Flege (1988) argued that bilinguals establish mean values between the syllabic structures of the first language and those of the target language, which would allow them to use a single system adapted to both languages. The bilinguals' different amount of phonological knowledge, in comparison with monolinguals, may have been responsible, at least partly, for the puzzling picture of results obtained with the fragment detection task. It is therefore difficult to draw conclusions regarding the basic mechanisms underlying speech segmentation in bilinguals on the basis of these results alone.

II.3. What is the Evidence for Arguing That Restricted Segmentation Procedures Are Mutually Exclusive?
Neither Bradley et al. (1993) nor Kearns (1994) addressed the issue of speech processing in the bilingual participants' second language. Thus, these two studies do not provide information regarding Cutler et al.'s (1992) strong claim that bilinguals develop a single restricted segmentation procedure, i.e. that speech segmentation procedures are mutually exclusive. This argument was based on the fact that, in Cutler et al.'s study, the bilinguals who fulfilled their criterion for French "dominance" syllabified the French fragment detection material but did not show evidence of a stress-based segmentation in the English word-spotting experiment, while the bilinguals who fulfilled their criterion of English "dominance" did not syllabify the French fragment detection material but showed evidence for the use of the MSS in the English word-spotting task. However, it should be noted that, in this study, only six participants (out of 24) who took part in the fragment detection study also took part in the word-spotting study. In other words, for most of the sample, different bilingual participants were examined with the experimental situations assessing the use of a specific segmentation strategy in French and English. It thus could be the case, as suggested by Kearns (1994), that at least some of the bilinguals who showed evidence for syllabification in French may also have shown evidence of stress-based segmentation in English, and vice versa. This possibility will be further discussed in the next two sections, in the light of some studies conducted on language development in bilingual infants as well as additional speech segmentation data collected on bilingual adults.

Against Mutual Exclusivity of Restricted Segmentation Procedures: Language Development in Early Bilinguals
Cutler et al. (1992) claimed that individuals learning two languages simultaneously from birth develop a single restricted segmentation procedure based on the rhythmic regularities of their "dominant" language. If this were the case, one would expect to observe an asymmetrical development of the knowledge of the two languages in bilingual infants, the "dominant" language being prioritised over the other. Bosch and Sebastián-Gallés (in press) did not find support for this view. Using the familiarisation-preference procedure (Jusczyk & Aslin, 1995), they observed that four-month-old infants from bilingual Catalan-Spanish environments were able to discriminate between Spanish and Catalan, even though these two languages share many prosodic features. These results suggest an early capacity to distinguish languages in simultaneous bilingual exposure.
Moreover, bilingual infants do not differ from "monolingual" infants in showing within-rhythmic class language discrimination abilities by four or five months of age (Bosch & Sebastián-Gallés, 1997; see also Nazzi, Jusczyk, & Johnson, 2000). This suggests that early, simultaneous, exposure to two languages does not delay language discrimination and lexicon pre-compiling abilities. When subsequently faced with the necessity of compiling two lexicons simultaneously, bilingual babies seem to progressively understand and speak their two languages without any difficulty (for reviews, see e.g. de Houwer, 1990; Genesee, Nicoladis, & Paradis, 1995).
According to the notion that a language's rhythmic regularities play a determining role in lexical acquisition, it is reasonable to predict that bilingual infants need to acquire appropriate rhythmic segmentation cues to parse their two languages. Cutler et al.'s (1992) claim that bilinguals would develop a single restricted segmentation procedure fails to explain how these infants are able to successfully develop two lexicons simultaneously.

A Direct, Within-Subject But Between-Languages Comparison: French and Dutch Segmentation in Bilingual Adults
As Cutler (1994) pointed out, "the scale of the segmentation problem in the structure of the input is remarkably similar for the infant and for the adult" (p. 87). In other words, adults acquiring a second language are faced with the same auditory input as the infants, from which they must pick out discrete chunks that have to be mapped onto stored representations for recognition. As already argued, universal processes such as implicit segmentation or competition, when considered alone, may not be powerful enough to allow a correct retrieval of the lexical units from the highly variable acoustic signal (Frauenfelder, 1991b; Norris et al., 1995). Nevertheless, language learners become progressively successful at recognising auditory words from their non-native language, and some of them become highly proficient bilinguals even if they have acquired their second language post-puberty (e.g., Birdsong, 1992; Coppieters, 1982).
On the basis of the importance of rhythmic cues for speech segmentation in both infants and adults, one may hypothesise that all language learners, including bilinguals, must progressively develop rhythmic solutions adapted to the target language, which may be at least partially similar to those used by monolingual speakers. Indeed, if some bilinguals reached very high proficiency levels in a second language without the help of these rhythmic cues, there would be no a priori reason to believe that monolinguals need to use these cues, or even that these cues play any role in speech segmentation.
Such a view is at odds with Cutler et al.'s (1992) proposal that bilinguals do not develop a restricted segmentation procedure to parse their "non-dominant" language. However, as already mentioned, the evidence on this issue is limited to Cutler et al.'s study in which different bilingual participants took part in the fragment detection and word-spotting experiments. As acknowledged by the authors themselves (p. 407), the different processes tapped by the two tasks (see sections I.2 and I.3) further complicate the comparison between the two sets of results. This is why we tried to address the issue of the coexistence of specific rhythmic segmentation cues in bilinguals by presenting them with identical tasks in their two languages (Goetry et al., in progress). We presented French-Dutch bilinguals, all French-dominant, with both the French and the Dutch materials that had induced opposite segmentation patterns in monolinguals tested in their respective native language (see sections I.4 and I.5).
We reasoned that the bilinguals should have developed adapted segmentation cues to correctly segment their two languages. Yet, since French and Dutch monolinguals use stress cues in an opposite way, in bilinguals it may be the case that the stress-based strategy used in the (French) dominant language interferes with the use of an adapted strategy in the (Dutch) non-dominant language. As we expected this influence to be greater if Dutch is acquired later on, we tested two groups of bilinguals: early bilinguals who had acquired French and Dutch before the age of four, and late bilinguals who had French as native language and had acquired Dutch during adolescence.
The results for the French materials showed no significant difference between the segmentation patterns of the two bilingual groups and those of the French monolinguals. That is, both bilingual groups produced more segmentation errors for the iambic 2-words sequences than for the two other sequence types. This is consistent with the idea that bilinguals and monolinguals relied to a similar extent on the typical rhythmic structures of French to segment that language.
For the Dutch materials, the two groups of bilinguals also displayed segmentation patterns similar to those found for the Dutch monolinguals. These results suggest that both bilingual groups were highly familiar with the typical trochaic rhythmic structures of Dutch words and exploited these rhythmic cues to locate word boundaries in the same way as the Dutch monolinguals did. This is even more remarkable if we consider the fact that all the bilinguals, although displaying general listening abilities similar to those of Dutch monolinguals (as assessed by a speeded listening comprehension task), were nevertheless clearly French-dominant. This was documented by the participants' self-ratings as well as by their lower recognition rates of the experimental sentences (presented at speech perception threshold) and their lower performance in a lexical decision task, as compared to the Dutch monolinguals.
The surprising absence of difference between early and late bilinguals' segmentation patterns for the Dutch material may be related to the fact that the late bilinguals displayed Dutch listening comprehension abilities roughly equivalent to those of the early bilinguals, both in normal and in difficult listening conditions (assessed through a speeded listening comprehension task and through the recognition rate of the experimental sentences, respectively). This absence of difference between early and late bilinguals may reflect the ability of both groups to rely on adapted rhythmic cues. While further investigation is required, our results thus suggest that the use of rhythmic cues may be little affected by age of acquisition for language learners who have reached a certain proficiency threshold in that language. Sanders, Yamada and Neville (1999) provided convergent evidence supporting the idea that, in highly proficient bilinguals, the late acquisition of a second language does not prevent them from exploiting the rhythmic regularities of that language. These authors compared the reaction times and event-related potentials (ERPs) of English monolinguals and Japanese speakers who learned English after the age of 12, but who became highly fluent in English. The participants were presented with normal English sentences or sentences in which words were replaced with pseudowords displaying stress patterns similar to those of the original words. They were asked to detect target phonemes as well as to locate their position (word-initial vs. word-medial).
Both groups of listeners detected initial and non-initial target phonemes (e.g., /n/) significantly faster in words and pseudowords when these displayed the typical English strong-weak stress pattern (e.g., NEctar, WITness) than when these displayed the infrequent weak-strong stress pattern (e.g., neGLECT, igNITE). This indicates that the bilinguals were able to exploit the English stress pattern to perform the detection task, even though they had not been exposed to it before the age of twelve. Thus, monolinguals and bilinguals seem to rely on the same prosodic cues, although the ERPs suggest that bilinguals did so using different neural systems than English native speakers.
The notion that attained proficiency may be more important than age of acquisition in bilinguals' second language processing has also received support from a series of PET studies. Perani et al. (1998) examined two groups of highly proficient bilinguals: Italian-English bilinguals who acquired their second language after the age of 10, and Spanish-Catalan bilinguals who acquired their second language before the age of four. The cortical responses of these bilinguals listening to stories were highly similar for their first and second language, regardless of age of acquisition. This was not the case for low proficiency, late Italian-English bilinguals, who showed very different patterns of cortical activity when listening to stories in the first vs. second language (Perani et al., 1996; see also Dehaene, Dupoux, & Mehler, 1997).
Taken together, the three sets of results (Goetry et al., in progress; Perani et al., 1998; Sanders et al., 1999) thus show little effect of age of acquisition on some aspects of second language processing. Yet, it should be noted that these results do not question the notion that age of acquisition is a major determinant of attained proficiency in the second language of bilinguals (e.g., Johnson & Newport, 1989). Rather, they suggest that when proficiency in the second language is kept constant, age of acquisition per se does not have an impact on second language processing and (macroscopic) brain representations.
Nevertheless, not all aspects of the linguistic knowledge of a second language are equally easy to master, and age of acquisition may have different effects on different types of linguistic abilities. For example, whereas lexical acquisition and semantic processing seem to occur normally even in late learners (e.g., Long, 1990; Weber-Fox & Neville, 1996), other aspects of linguistic knowledge show stronger effects of late acquisition. This would be the case for accent (Oyama, 1976; Flege, Munro, & McKay, 1995), unknown phonemic contrasts (Bosch et al., 2000; Pallier et al., 1997, 1999; Sebastián-Gallés & Soto-Faraco, 1999), and complex grammatical structures (Johnson & Newport, 1991; Newport, 1990; but see Flege et al., 1999, for an alternative account). Similarly, it might be the case that age of acquisition is more critical for acquiring segmentation cues other than the rhythmic one examined in our study. What our results suggest is only that, once a high level of proficiency has been reached in a second language, the learner would be sufficiently attuned to the typical stress pattern of that language to be able to exploit it in speech segmentation.

General Discussion
The central problem of spoken word recognition is to understand how listeners segment continuous speech into discrete portions so efficiently despite the lack of reliable acoustic cues signalling the beginning of words.
Two different proposals have been made regarding the information that assists the speech segmentation processes. The first involves universal mechanisms like lexical competition. However, lexical competition fails as a segmentation procedure in many contexts (e.g. for very short or embedded words), and is unable to explain how infants begin to extract meaningful units from the signal in the absence of lexical knowledge. The second involves some specific knowledge of the phonological and rhythmic characteristics of the native language. According to this proposal, the listeners' speech segmentation routines are critically influenced by the basic rhythmic regularities of their native language (e.g. stress-, syllable- or mora-based). Recent studies have provided support for the notion that stress may also indicate word boundaries and/or initiate lexical access in languages like Dutch (Vroomen & de Gelder, 1997b, submitted) or English (Mattys, 2000; Mattys & Samuel, 1997). Such language-specific adaptations may be traced back to an innate rhythmic sensitivity (Nazzi, Bertoncini, & Mehler, 1998; Ramus, Hauser, Miller, Morris, & Mehler, 2000; see also Mehler & Christophe, 2000), which is rapidly supplemented by language-specific prosodic knowledge of the mother tongue (Bosch & Sebastián-Gallés, 1997; Nazzi et al., 2000). In English, the sensitivity to the predominant metrical and/or stress pattern of words has been shown to develop between six and nine months (Jusczyk et al., 1993; Mattys et al., 1999; Morgan, 1996; Turk, Jusczyk, & Gerken, 1995). By 7.5 months, English learners already rely on this information to extract whole words (rather than just the salient strong syllables) from fluent speech.
As pointed out by several authors (e.g., Cutler, 1990; Cutler et al., 1997; Jusczyk et al., 1999; Vaissière, 1983), learners of languages other than English may also develop a stress-based segmentation strategy. As far as French is concerned, Jusczyk et al. (1999) suggested that "French learners could possibly use information about lengthening at the ends of words as a marker of word offsets" (p. 201). However, data on French infants' sensitivity to the stress pattern of their native language are critically lacking, and future research should address this issue.
In the present paper, we discussed studies conducted on adult French listeners which strongly suggest that one of the main acoustic correlates of stress, namely vowel lengthening, may act as a powerful word-offset marker in that language. In addition, we discussed data suggesting that, because of the differing rhythmic structures of their languages, French and Dutch adult listeners display opposite segmentation patterns when presented with matched material sets in a similar experimental situation (Goetry et al., in progress). Similar stress cues (vowel lengthening) were used as word-initial boundary markers by Dutch listeners and as word-offset boundary markers by French listeners. This suggests that listeners of both languages use stress cues in locating word boundaries, although they rely on the rhythmic structure typical of their mother tongue, namely on the iambic (short-long) pattern for French listeners and on the trochaic (long-short) pattern for Dutch listeners. These results thus extend, with a direct cross-linguistic comparison, the idea that listeners adopt a universal rhythmic segmentation strategy exploiting the specific regularities of their native language.
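The contrast between the two patterns can be made concrete with a minimal sketch. It is an illustration only, not a model taken from the studies discussed: it assumes that each syllable has already been classified as lengthened or not, that lengthening is the sole cue, and that a boundary is placed before a lengthened syllable under the trochaic pattern and after it under the iambic pattern. The function name `segment` and the nonsense syllables are hypothetical.

```python
from typing import List, Tuple

# Hypothetical input format: (syllable text, whether its vowel is lengthened).
Syllable = Tuple[str, bool]


def segment(syllables: List[Syllable], pattern: str) -> List[List[str]]:
    """Group syllables into candidate words using vowel lengthening alone.

    pattern="trochaic": a lengthened syllable opens a new word (long-short).
    pattern="iambic":   a lengthened syllable closes the current word (short-long).
    """
    if pattern not in ("trochaic", "iambic"):
        raise ValueError("pattern must be 'trochaic' or 'iambic'")

    words: List[List[str]] = []
    current: List[str] = []
    for text, lengthened in syllables:
        # Trochaic: boundary BEFORE the lengthened syllable.
        if pattern == "trochaic" and lengthened and current:
            words.append(current)
            current = []
        current.append(text)
        # Iambic: boundary AFTER the lengthened syllable.
        if pattern == "iambic" and lengthened:
            words.append(current)
            current = []
    if current:  # keep any trailing syllables as a final candidate
        words.append(current)
    return words


# Made-up nonsense stream: short, long, short, long.
stream = [("ba", False), ("du", True), ("ki", False), ("mo", True)]
trochaic_parse = segment(stream, "trochaic")
iambic_parse = segment(stream, "iambic")
```

Applied to the same stream, the two settings place boundaries on opposite sides of each lengthened syllable, which is the sense in which identical acoustic cues can yield opposite segmentation patterns.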
The parallel drawn between the use of stress cues in French and Dutch does not deny the fact that stress has a different role in lexical access in these two languages. As a consequence of its fixed final position, stress has no distinctive function in French, so that coding stress in the representation of the words would be completely uninformative. Therefore, it is likely that stress is not represented lexically in French. By contrast, although rare, minimal pairs that differ only in their stress patterns can be found in both Dutch (e.g., 'voornaam, "first name", vs. voor'naam, "respectable") and English (e.g., 'forbear vs. for'bear). At least in Dutch, stress has been shown to constrain lexical activation. For example, Cutler and van Donselaar (in press; see also Koster & Cutler, 1997) showed no facilitation from one member of a minimal stress pair to the other in auditory lexical decision, although reliable repetition priming occurred. In the same study, word-spotting results further showed that mismatching suprasegmental information reduced word activation (e.g., mu'seum received a greater degree of activation from the fragment mu'zee than from the fragment 'muzee). In English, stress is rather redundant with vowel quality, and therefore would not constrain the initial stages of word activation (Cutler & van Donselaar, in press). For example, Cutler (1986) showed that presentation of either of the two members of a minimal stress pair, e.g., either 'forbear or for'bear, primed lexical decision to words associated with both of them, like ancestor and tolerate (see also Bond & Small, 1983; Small, Simon, & Goldberg, 1988). Yet, other data suggest that stress information does play an important role in lexical activation in English.
Using the migration paradigm (Kolinsky & Morais, 1996), Mattys and Samuel (1997) showed that the vowels of secondary stressed syllables were less likely to migrate in words than in matched nonsense words, whereas no such lexical effect was observed with the vowels of primary stressed syllables. This difference was interpreted as support for the idea that a primary stressed syllable would be processed more autonomously (i.e., with less assistance from the lexicon) than a secondary stressed syllable, and thus that primary stress may play a critical role in lexical access in English (see also Mattys & Samuel, 2000). This issue thus clearly deserves further investigation.
In any case, the important point is that the role of stress in lexical access and its potential use as a word-boundary marker are two different issues. Such a view has also been held by Vroomen et al. (1998), who examined word segmentation in Finnish. Finnish has fixed word stress on the initial syllable, and thus stress is probably not represented in the lexical representations of Finnish words. Nevertheless, using both word spotting and a task in which participants had to segment an artificial language, Vroomen et al. showed that a stressed syllable is a reliable segmentation cue for Finnish listeners. Similarly, in our view, the fact that stress is probably not part of the lexical representations in French, while it is in Dutch, is not incompatible with the idea that the acoustic correlates of word stress can signal likely word boundaries in both languages.
Relying on rhythmic or stress cues alone would, however, sometimes lead to incorrect segmentations. For example, while vowel lengthening may constitute a much better word boundary predictor in French than a (pure) syllabic segmentation strategy, it would lead listeners to missegment words ending with a schwa syllable. Likewise, many Dutch (and English) words do not bear word-initial primary stress or do not begin with a strong syllable. In these cases, it seems likely that listeners take advantage of other cues that can be indicative of likely word boundaries in the language input.
Indeed, as proposed for example by Church (1987), there are several other potential sources of language-specific information that listeners could draw on in segmenting words from fluent speech, such as allophonic, phonotactic, and distributional cues. Phonotactics refers to the constraints on the possible ordering of phonetic segments within morphemes, syllables and words in a language. Similarly, different phonetic realisations (allophones) of the same phoneme are often restricted in terms of the positions in which they can appear within a word. For example, /t/ will be aspirated at the beginning of English words but not at their end (e.g., Umeda & Cocker, 1974). The knowledge of co-occurrence relations between syllables may also be exploited to extract words from fluent speech (e.g., Brent & Cartwright, 1996). Recognising and segmenting words has been shown to be helped by these additional cues, including phonotactics (McQueen, 1998; McQueen & Cox, 1995; Vitevitch & Luce, 1999), distributional regularities (Saffran, Newport, & Aslin, 1996), vocalic harmony (Suomi, McQueen, & Cutler, 1997), and allophonic variation (Yerkey & Sawusch, 1993; see also the studies conducted by Frauenfelder et al. reported in Kearns, 1994, and Stoujesdijk, 1994, who showed that syllabic parsing may be critically affected by the phonetic make-up of the material).
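As an illustration of how co-occurrence relations between syllables could in principle signal word boundaries, the sketch below uses forward transitional probabilities, one common way of operationalising distributional regularities. It is an assumption made for illustration and not the procedure of any study cited above. The function names, the fixed threshold, and the nonsense syllables are all hypothetical.

```python
from collections import Counter
from typing import Dict, List, Tuple


def transitional_probabilities(stream: List[str]) -> Dict[Tuple[str, str], float]:
    """Estimate P(next syllable | current syllable) from a syllable stream."""
    pair_counts = Counter(zip(stream, stream[1:]))
    # Count each syllable only where it is followed by another syllable.
    first_counts = Counter(stream[:-1])
    return {pair: n / first_counts[pair[0]] for pair, n in pair_counts.items()}


def segment_by_tp(stream: List[str], threshold: float) -> List[List[str]]:
    """Place a boundary wherever the transitional probability falls below threshold."""
    if not stream:
        return []
    tps = transitional_probabilities(stream)
    words: List[List[str]] = []
    current = [stream[0]]
    for prev, nxt in zip(stream, stream[1:]):
        if tps[(prev, nxt)] < threshold:
            words.append(current)
            current = []
        current.append(nxt)
    words.append(current)
    return words


# Three made-up trisyllabic "words" concatenated without pauses.
A, B, C = ["bi", "da", "ku"], ["pa", "do", "ti"], ["go", "la", "bu"]
order = [A, B, A, C, B, C, A]
stream = [syllable for word in order for syllable in word]

tps = transitional_probabilities(stream)
parsed = segment_by_tp(stream, threshold=0.75)
```

In this toy stream every word-internal transition has probability 1.0, whereas each transition across a word boundary has probability 0.5, so a threshold between the two values recovers the original words. Natural speech offers no such clean separation, which is one reason a single cue is unlikely to suffice.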
Such an approach to the word boundary problem, relying on several sources of information that would be extracted and integrated from infancy on, might provide a realistic account of listeners' accuracy in recognising continuous speech (e.g., Mattys et al., 1999; Morgan & Saffran, 1995; Myers et al., 1996; Saffran, Newport, & Aslin, 1996). Computational models even suggest that a multiple-cue integration approach is more powerful than what would be predicted on the basis of the individual contribution of each cue (Christiansen, Allen, & Seidenberg, 1997).
A major challenge for future research will be to characterise the time course over which these different cues become available and the way in which they are combined. The few adult studies in which multiple speech segmentation cues were systematically manipulated suggest that rhythmic cues are prioritised: listeners would rely on other potential word boundary markers only when rhythmic cues are absent. For example, Banel and Bacri (1994) observed syllable frequency effects in parsing when stress cues were absent, i.e., in spondees displaying a long-long rhythmic pattern. Likewise, Vroomen et al. (1998) showed that vowel harmony was used for segmentation in Finnish only when stress cues were unavailable. The infant data are also compatible with this view. For example, Mattys et al. (1999) showed that, in case of conflicting rhythmic and phonotactic information, nine-month-olds preferentially relied on prosodic cues to locate word boundaries.
These investigations may also shed some light on the processes involved in the acquisition of a second language. In the second part of the present paper, we argued that the acquisition of a language-specific rhythmic strategy is a necessary component of second language learning. In particular, it was hypothesised that the typical stress pattern of words, which has been shown to provide a strong cue for word boundary location in monolinguals, must be used by second language learners if they are to attain a certain level of proficiency in their second language. This hypothesis seems supported by our data (Goetry et al., in progress), which show that the typical word stress pattern of the second language is exploited in speech segmentation by bilingual listeners who have attained a high proficiency level in that language. Future research should address this issue more systematically, as well as the potential impact of proficiency level, for example by examining sensitivity to the specific rhythmic structures of the second language in several groups of bilinguals with various levels of second language comprehension abilities.
We further suggested that, for highly proficient bilinguals, the use of rhythmic cues for speech segmentation in a second language may show weaker effects of age of acquisition than the learning of other aspects of the linguistic system characterising that language. Yet, this might hold true only for pairs of languages where similar rhythmic word boundary cues can be used. Given the considerable implications of this issue for foreign-language learning, it would be critical to determine exactly what aspects of language-specific segmentation strategies do (or do not) strongly depend on age of acquisition.