
Available online at www.sciencedirect.com

Speech Communication 50 (2008) 163–178
www.elsevier.com/locate/specom

Adapting speaking after evidence of misrecognition: Local and global hyperarticulation

Amanda J. Stent a,c,*, Marie K. Huffman b, Susan E. Brennan a,c

a Department of Computer Science, State University of New York at Stony Brook, Stony Brook, NY 11794, USA
b Department of Linguistics, State University of New York at Stony Brook, Stony Brook, NY 11794, USA
c Department of Psychology, State University of New York at Stony Brook, Stony Brook, NY 11794, USA

Received 14 November 2006; received in revised form 27 July 2007; accepted 28 July 2007

Abstract

In this paper we examine the two-way relationship between hyperarticulation and evidence of misrecognition of computer-directed speech. We report the results of an experiment in which speakers spoke to a simulated speech recognizer and received text feedback about what had been "recognized". At pre-determined points in the dialog, recognition errors were staged, and speakers made repairs. Each repair utterance was paired with the utterance preceding the staged recognition error and coded for adaptations associated with hyperarticulate speech: speaking rate and phonetically clear speech. Our results demonstrate that hyperarticulation is a targeted and flexible adaptation rather than a generalized and stable mode of speaking. Hyperarticulation increases after evidence of misrecognition and then decays gradually over several turns in the absence of further misrecognitions. When repairing misrecognized speech, speakers are more likely to clearly articulate constituents that were apparently misrecognized than those either before or after the troublesome constituents, and more likely to clearly articulate content words than function words. Finally, we found no negative impact of hyperarticulation on speech recognition performance.

Published by Elsevier B.V.

Keywords: Hyperarticulation; Clear speech; Speaking rate; Adaptation in speaking; Speech recognition; Spoken dialog

1. Introduction

Speech recognition technology has made its way into many telephone and information applications in wide use by the general public; people routinely encounter the option of speaking to a machine when they request phone numbers, make collect calls, and seek information about schedules, events, or accounts. Most speech applications used by the public achieve acceptable performance by strongly constraining what users can say—for instance by asking users questions with yes or no answers or by presenting menus containing just a few items with short labels that users are invited to repeat. By seizing most or all of the initiative, spoken dialog systems increase the likelihood that input utterances will be predictable and recognizable (Schmandt and Arons, 1984; Schmandt and Hulteen, 1982).

* Corresponding author. Address: Department of Computer Science, Stony Brook University, Stony Brook, NY 11794-4400, USA. Tel.: +1 631 335 2849; fax: +1 631 632 8334. E-mail addresses: amanda.stent@stonybrook.edu, amanda.stent@gmail.com (A.J. Stent), marie.huffman@stonybrook.edu (M.K. Huffman), susan.brennan@stonybrook.edu (S.E. Brennan).

0167-6393/$ - see front matter Published by Elsevier B.V.
doi:10.1016/j.specom.2007.07.005
In contrast, applications that recognize spontaneous, unconstrained utterances, such as dictation programs, have many fewer users, who need to be motivated enough to co-train with a particular application over time.

A long-standing goal of the speech and dialog research communities has been to enable less constrained, more flexible, mixed-initiative interaction with spoken dialog systems (e.g., Allen et al., 2001; Gorin et al., 2002); this goal has yet to be realized. The problem is that speech is highly variable.

In addition to those variations characteristic of individual speakers (e.g., voice quality, dialect, and idiosyncratic pronunciation), there is variation in lexical choice and choice of syntactic structures, as well as prosodic or articulatory variability (due, e.g., to emphasis, affect, fluency, or even the speaker having a cold). Generally speaking, variability is associated with error: larger vocabularies and greater syntactic flexibility are associated with higher perplexity and, correspondingly, with higher word error rates (Huang et al., 2001), and disfluent or fragmented utterances, with recognition errors (Core and Schubert, 1999). To the extent that a source of variability is systematic, it can be described and modeled, which (in theory at least) should lead to ways in which to handle it successfully.

Through the experiment presented in this paper, we examine the causes and consequences of a kind of adaptive variation in speaking that has been loosely labeled hyperarticulation. When speakers believe that their addressees cannot understand them, they adapt in a variety of ways, such as by speaking more slowly, more loudly, and more clearly. Speakers have been found to adapt their speech to babies (Fernald and Simon, 1984), to foreigners (Ferguson, 1975; Sikveland, 2006), in noisy rooms (Summers et al., 1988) or on cell phones, as well as to computer-based speech recognizers. Each of these situations inspires a set of distinct but overlapping adaptations (see Oviatt et al., 1998a,b for discussion). For example, utterances directed to young children as well as those directed to speech recognizers tend to be shorter than those to adults; at the same time, child-directed speech typically has expanded pitch contours (Fernald and Simon, 1984) while machine-directed speech does not. Although hyperarticulation can improve intelligibility in speech directed at people (Cutler and Butterfield, 1990; Picheny et al., 1985), especially in the listener's native language (Bradlow and Bent, 2002), it can also result in increased error rates in automated speech recognizers (Shriberg et al., 1992; Soltau and Waibel, 1998; Wade et al., 1992).

The relationship between hyperarticulation in speaking and misrecognition by computers is thought to be bi-directional. This relationship has been described by some as a spiral in which evidence of misrecognition causes speakers to hyperarticulate, in turn causing even more recognition errors (e.g., Hirschberg et al., 1999; Levow, 1998; Oviatt et al., 1998a; Soltau and Waibel, 2000b). For example, in one study of machine speech recognition, an utterance produced right after a misrecognized utterance was itself misrecognized 44% of the time, compared to only 16% when produced after a correctly recognized utterance (Levow, 1998). Because of such observations, it has been widely presumed that increased error rates in automatic speech recognition are due to hyperarticulation. However, there is a shortage of systematic data documenting the effects of specific features of hyperarticulation on speech recognition performance, as well as the persistence or actual time course of this kind of adaptation over the course of a human–machine dialog.
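The perplexity claim above can be made concrete. A language model's perplexity on a test set is 2 raised to the negative average log2-probability per word, i.e. the effective number of equally likely choices the recognizer faces at each word. The following is a minimal sketch (ours, not from the paper; the toy training data and the add-alpha smoothing are illustrative assumptions), showing how a larger, flatter vocabulary raises unigram perplexity:

import math
from collections import Counter

def unigram_perplexity(train_tokens, test_tokens, alpha=1.0):
    # Add-alpha-smoothed unigram model; perplexity = 2 ** (mean negative
    # log2 probability per test word).
    counts = Counter(train_tokens)
    vocab_size = len(set(train_tokens) | set(test_tokens))
    total = sum(counts.values())
    log_prob = 0.0
    for word in test_tokens:
        p = (counts[word] + alpha) / (total + alpha * vocab_size)
        log_prob += math.log2(p)
    return 2 ** (-log_prob / len(test_tokens))

# A tightly constrained yes/no vocabulary vs. a larger, flatter one:
yes_no = "yes no yes no yes yes no".split()
open_domain = "book a flight to boston from new york on monday please".split()
test = "yes no yes".split()
print(unigram_perplexity(yes_no, test))        # ~1.9: few choices per word
print(unigram_perplexity(open_domain, test))   # 24.0: many choices per word

Constrained, menu-style prompts of the kind described above keep effective perplexity, and with it word error rate, low; unconstrained utterances raise both.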
1.1. Elements of hyperarticulation

Hyperarticulation is really an umbrella term for many different adaptations in speaking, including prosodic adaptations due to speaking more slowly, pausing more often, and speaking more loudly, as well as segmental adaptations due to replacing reduced or assimilated forms of vowels and consonants with more canonical forms. As used in the literature, the term hyperarticulation is sometimes equated with clear speech, and often contrasted with casual speech (e.g., Moon and Lindblom, 1994) or conversational speech (e.g., Picheny et al., 1986; Levow, 1998; Krause and Braida, 2004). But the distinction is not a simple binary one. Hyperarticulate speech is a gradient phenomenon (e.g., Moon and Lindblom, 1994; Oviatt et al., 1998b); the properties of speech that vary during hyperarticulation do not all vary at the same rates or under the same conditions.

Perhaps the most detailed analyses of both prosodic and segmental aspects of hyperarticulate speech have been provided by Oviatt and colleagues (Oviatt et al., 1998a,b). These studies examined the duration of utterances, segments and pauses; pause frequency; F0 minimum, maximum, range and average; amplitude; intonation contour; and the incidence of these segmental features: stop consonant release, /t/ flapping, vowel quality, and segment deletion. These studies used a simulated ("Wizard of Oz") multimodal spoken dialog system and a form-filling task. Users were given staged error messages at random points in the dialog; this elicited matched pairs of short utterances with the same wording by the same speaker, produced before and after evidence of speech recognition error. In a corpus of 250 paired utterances, speakers spoke more slowly (by about 49 ms/syllable) and paused longer and more often after evidence of recognition failure than before, whether they experienced high (20%) or low (6.5%) error rates; this hyperarticulation was not accompanied by much variation in amplitude and pitch (Oviatt et al., 1998b). Only the speakers who experienced the higher error rate produced clearer phonetic segments (e.g., released stop consonants) after error messages than before (Oviatt et al., 1998b).

The second study in this series by Oviatt and colleagues provided acoustic evidence that hyperarticulation in speech to machines is targeted to the perceived problem within an utterance, rather than produced as a persistent, non-specific adaptation in speaking style. A somewhat larger corpus of 638 pairs of utterances produced by 20 speakers (and elicited using the same task, the same simulated-error technique, and a 15% error rate, with errors distributed randomly during the dialog) yielded consistent increases in features of hyperarticulation across paired utterances (Oviatt et al., 1998b). These included prosodic adaptations such as increased duration and pausing as well as segmentally clearer forms on 6% of repetitions. In a further analysis of 96 paired utterances, speakers hyperarticulated most during the part of the repaired utterance perceived to have been problematic (Oviatt et al., 1998b). That is, speech at the focal area of the repair was greater in pitch range (11%), amplitude (1%), pausing (149%), and duration (11%) than adjacent segments before or after.
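For readers unfamiliar with these measures, the sketch below shows one conventional way to compute them from a recorded utterance. It is our illustration, not the pipeline used in any of the studies cited: it assumes the open-source librosa library, and the pause threshold and pitch bounds are arbitrary choices.

import numpy as np
import librosa  # assumed available: pip install librosa

def prosodic_profile(wav_path):
    # Per-utterance measures of the kind reported above: duration,
    # pausing, F0 statistics, and amplitude (RMS energy).
    y, sr = librosa.load(wav_path, sr=None)

    # F0 track via probabilistic YIN; unvoiced frames come back as NaN.
    # Assumes the file contains at least some voiced speech.
    f0, _, _ = librosa.pyin(y, fmin=librosa.note_to_hz("C2"),
                            fmax=librosa.note_to_hz("C6"), sr=sr)
    voiced = f0[~np.isnan(f0)]

    # Crude pause estimate: frames whose energy falls below 10% of peak.
    rms = librosa.feature.rms(y=y)[0]
    hop_seconds = 512 / sr  # librosa's default hop length
    pause_s = float(np.sum(rms < 0.1 * rms.max()) * hop_seconds)

    return {
        "duration_s": len(y) / sr,
        "pause_s": pause_s,
        "f0_min_hz": float(voiced.min()),
        "f0_max_hz": float(voiced.max()),
        "f0_mean_hz": float(voiced.mean()),
        "f0_range_hz": float(voiced.max() - voiced.min()),
        "mean_rms": float(rms.mean()),
    }

# Comparing the profile of a pre-error utterance with that of its repair
# would expose the slowing, extra pausing, and pitch changes discussed
# above, e.g.: prosodic_profile("before.wav"), prosodic_profile("after.wav")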

Another study (Levow, 1999) analyzed spontaneous commands directed at an interactive, working system (the Sun Microsystems SpeechActs system). Utterances produced following two types of system error were analyzed: those after a general failure (where the system simply indicated that it could not process the input) and those after a misrecognition (in which part of the utterance was correctly recognized and part was not, as evident from the feedback provided to the user). Both types of error resulted in more pausing and longer word durations, particularly in utterance-final position, but the effect was stronger after misrecognition errors. In addition to these prosodic changes, there were also segmental changes during repairs, in the form of a higher incidence of full vowels and released stop consonants. These segmental changes apparently did not depend on the type of the preceding error (misrecognition or general failure).

1.2. Effects of hyperarticulation on automatic speech recognition

Although hyperarticulation has been widely blamed for speech recognition errors (e.g., Levow, 1999; Oviatt et al., 1998a,b), the effect is by no means large, determinate, or well understood. Relatively few studies have systematically examined the effects of hyperarticulate speech on automated speech recognition (ASR). One set of studies (Shriberg et al., 1992; Wade et al., 1992) looked at dialogs with a working spoken dialog system, DECIPHER, with which speakers were able to take substantial initiative and produce spontaneous, relatively long utterances. Speakers experienced a higher word error rate in their first session with DECIPHER than their second (20.4% vs. 16.1%), suggesting that they successfully adapted their speech to the system over time. In this experiment, the speakers' recorded utterances were subjectively categorized by human raters on a three-point scale as natural-sounding, hyperarticulated in portions, or completely hyperarticulated, and then re-processed through the speech recognizer with a bigram and a 'nogram' language model. Overall, the reduction in word error rate for utterances from the first to the second session was about 4% regardless of language model, suggesting that the reduction in word error rate was due to speakers' prosodic and segmental adaptation rather than any adaptation to the system's grammar.

Most speakers in that experiment actually reduced their use of hyperarticulation from the first to the second session. However, improved performance by DECIPHER was due not only to reduced frequency of hyperarticulation, but also to adaptation in the nature of hyperarticulation. While utterances rated as strongly hyperarticulated yielded higher word error rates than ones not so rated, even the strongly hyperarticulated utterances from the second session were better recognized than the strongly hyperarticulated ones from the first session (Wade et al., 1992).
Wade et al. documented that over time, the hyperarticulated utterances actually became more acoustically similar to the data on which the speech recognizer had originally been trained (whereas the natural-sounding utterances did not). This set of findings highlights the need to better understand just what about the broad category of "hyperarticulation" is detrimental to speech recognizer performance; the mapping of speaking style to word error rate is not a simple one.

Another study that used corpora of utterances directed to a working spoken dialog system (the TOOT and W99 corpora; Hirschberg et al., 1999, 2000, 2004) analyzed the acoustic–prosodic characteristics of recognized vs. misrecognized utterances. Significant differences were found in loudness, pitch excursion, utterance length, pausing, and speaking rate. As in the studies by Shriberg et al. (1992) and Wade et al. (1992), utterances were rated subjectively on a three-point scale as to whether they sounded hyperarticulated. Utterances rated as sounding hyperarticulated were more likely to have been misrecognized, and misrecognized utterances had higher hyperarticulation ratings; moreover, utterances rated as not hyperarticulated were more likely to have been misrecognized when they were higher on objective loudness, pitch, and durational measures (Hirschberg et al., 1999). In follow-on work, Hirschberg et al. identified and labeled corrections in these corpora; compared to non-corrections, corrections were significantly longer and louder and had a slower speaking rate, longer prior pause, higher pitch, and less silence. 52% of corrections vs. 12% of non-corrections were subjectively rated as sounding hyperarticulated. Corrections were more likely to be misrecognized than non-corrections, and hyperarticulated corrections more likely than non-hyperarticulated ones. However, the number of misrecognized corrections varied by type, with corrections including additional information and paraphrases being misrecognized at higher rates than repetitions and corrections omitting information (Litman et al., 2006).

A third set of studies confirmed that hyperarticulation lowers word accuracy in ASR (Soltau and Waibel, 1998, 2000a,b). These studies elicited a corpus of highly confusable word pairs in either German or English as baseline pronunciations, for comparison with pronunciations after simulated evidence of error in a dictation task. These ASR studies measured not only adaptation in speaking rate but also hyperarticulation of phonetic segments. In English, phone duration increased by 28% on average (44% for voiced plosives but only 16% for vowels; Soltau and Waibel, 2000a), and in German, by 20% (with the greatest increases for voiced consonants and schwa sounds; Soltau and Waibel, 2000b). Recognition of before- and after-error tokens of isolated words was compared using the JANUS II speech recognition toolkit (with a 60K vocabulary for German and a 30K vocabulary for English). These studies report on the order of 30% more errors in hyperarticulate than in casual speech (Soltau and Waibel, 2000a).
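All of these comparisons are stated in terms of word error rate (WER): the minimum number of word substitutions, insertions, and deletions needed to turn the recognizer's hypothesis into the reference transcript, divided by the reference length. A self-contained implementation (ours, for illustration; the example strings anticipate the staged misrecognition shown later in Table 1):

def word_error_rate(reference, hypothesis):
    # Standard Levenshtein dynamic program over words:
    # d[i][j] = edit distance between ref[:i] and hyp[:j].
    ref, hyp = reference.split(), hypothesis.split()
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i  # delete all i reference words
    for j in range(len(hyp) + 1):
        d[0][j] = j  # insert all j hypothesis words
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            substitute = d[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            d[i][j] = min(substitute, d[i - 1][j] + 1, d[i][j - 1] + 1)
    return d[len(ref)][len(hyp)] / len(ref)

# Four substitutions and one deletion against a 15-word reference:
ref = "kate tolstoy is bringing some cookie dough and a picnic table to the food sale"
hyp = "kate tolstoy is bringing some cooking label in a pickle to the food sale"
print(word_error_rate(ref, hyp))  # 5/15 = 0.333...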

1.3. Strategies for avoiding hyperarticulation

Although users of spoken dialog systems are often explicitly instructed to speak naturally, it is questionable whether this strategy works for minimizing misrecognition. For example, when speakers in one study were told not to "overenunciate", they produced utterances that yielded lower subjective ratings of hyperarticulation, and yet this adjustment did not result in reliably lower ASR error rates (Shriberg et al., 1992).

Oviatt and colleagues (Oviatt et al., 1998b) also looked for prosodic and segmental differences in speech in response to three different kinds of error messages: those for which users saw only the message "?", those for which the system apparently substituted a related (semantically plausible) word in the utterance, and those for which it substituted an unrelated (semantically implausible) word. These situations (experienced by all the speakers) led to no differences in prosodic or segmental measures of hyperarticulation.

To summarize, previous research on hyperarticulation in human–computer interaction has shown that when speakers experience misrecognition, they adapt by exaggerating their speech: speaking more loudly and more slowly, with greater variety in pitch, and with greater attention paid to the articulation of certain phonemes. Speakers focus their hyperarticulation on the part of the utterance that was misrecognized. The impact on speech recognition performance is unclear: misrecognized utterances exhibit features of hyperarticulation, and on isolated-word tasks hyperarticulate tokens are more likely to be misrecognized. On the other hand, in spoken dialog to a computer where users can produce continuous speech, reduced word error rates over time are partly due to adaptation in the nature of hyperarticulate speech and to syntactic and lexical adaptation, as well as to reduction in the amount of hyperarticulate speech.

1.4. Rationale and predictions

Our goal for the current project was to investigate:

(1) How speakers adapt spontaneous speech directed at spoken dialog systems after they receive evidence of misrecognition. When speakers encounter evidence that an utterance was misrecognized, they should repair by repeating the utterance more slowly, and forms that had been relaxed in the "before" utterance should tend to be replaced by clear forms in the "after" version.

(2) How long adaptations in response to evidence of misrecognition persist during a dialog. We expected that segmental adaptations would be targeted to troublesome parts of the utterance (local adaptation); we were interested in whether segmental and prosodic adaptations would persist over turns (global adaptation). We were particularly interested in whether hyperarticulation to a computer is like a "switch": an adaptation that, once turned on, persists mostly independent of later system behavior (as suggested by the notion of "spiraling errors"); or whether it is like a "dial" that is adjusted gradually during the interaction (see the sketch after this list).

(3) When (or whether) adaptations in response to evidence of misrecognition cause problems for speech recognition. We investigated the effects of hyperarticulation on ASR systems trained on broadcast speech and conversational speech and configured with different statistical language models (word list, unigram, bigram, and trigram), as well as for a grammar-based ASR. We were not primarily interested in staging a competition between ASR systems, but in establishing whether the features of hyperarticulate speech are really as severe a problem as has been assumed, and which features of hyperarticulation (prosodic or segmental) are problematic.
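The "switch" versus "dial" contrast in question (2) can be made concrete with two toy turn-level models (our illustration, not a model proposed in the paper): under a switch, one error pushes hyperarticulation to a high level where it stays; under a dial, each error-free turn relaxes it back toward baseline.

def switch_model(error_turns, high=1.0):
    # Once any misrecognition is seen, the level jumps and persists.
    level, trajectory = 0.0, []
    for error in error_turns:
        if error:
            level = high
        trajectory.append(level)
    return trajectory

def dial_model(error_turns, high=1.0, decay=0.5):
    # Each misrecognition resets the level; error-free turns decay it.
    level, trajectory = 0.0, []
    for error in error_turns:
        level = high if error else level * decay
        trajectory.append(level)
    return trajectory

turns = [0, 1, 0, 0, 0, 1, 0, 0]  # 1 = evidence of misrecognition
print(switch_model(turns))  # [0.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
print(dial_model(turns))    # [0.0, 1.0, 0.5, 0.25, 0.125, 1.0, 0.5, 0.25]

The finding summarized in the abstract, that hyperarticulation increases after evidence of misrecognition and then decays gradually over several turns, corresponds to the dial.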
Because speech read aloud has different prosodic, segmental, and fluency characteristics than spontaneous speech, and because we wanted to examine speech generated by speakers who were trying to repair errors, we did not have speakers read sentences aloud, as in most other controlled studies of hyperarticulation (e.g., Harnsberger and Goshert, 2000; Johnson et al., 1993). We used a Wizard-of-Oz procedure (adapted from Brennan, 1991, 1996; Oviatt et al., 1998a,b) to collect a corpus of spontaneous utterances from naive volunteer speakers who were led to believe that they were interacting with an ASR in order to enter information into a computerized database. In fact, the system's responses were simulated by a human operator behind the scenes. To elicit paired tokens with identical lexical and syntactic content from each speaker that could be compared for hyperarticulation, we adapted Oviatt and colleagues' (Oviatt et al., 1998a,b) and Soltau and Waibel's (1998) method of simulating errors by providing spurious error messages so that speakers would spontaneously repeat utterances.

We wished to extend Oviatt and colleagues' findings by looking not only at focal prosodic adaptations within repairs, but also at segmental adjustments before, during, and after the problematic word(s). Unlike Oviatt et al. (1998a,b), our errors appeared at pre-planned locations in the dialog for all the speakers, so that we could examine designated target words for hyperarticulation. This was an important property of the corpus we collected, as it enabled us to systematically conduct both local and global analyses of the persistence of hyperarticulation by multiple speakers, over multiple utterances, and across parts of the dialog that had higher and lower incidence of errors. We also wished to extend previous research on the impact of hyperarticulation in spoken dialog to a computer by looking at the impact of hyperarticulate speech on automatic speech recognition.

We used a task that enabled us to elicit spontaneous speech in the form of complete sentences containing multiple tokens of words with specific phonetic segments, articulated within controlled contexts. This is difficult to do, but not impossible (e.g., Brennan, 1996; Kraljic and Brennan, 2005). Fortunately, what speakers choose to say can be constrained implicitly by the dialog context to some degree. Previous studies of lexical and syntactic entrainment have demonstrated that speakers are strongly influenced by a dialog partner's words and syntax and tend to re-use these elements (Brennan, 1991, 1996; Brennan and Clark, 1996). In fact, the tendency to entrain on a partner's wording and syntax occurs not only with human partners, but also with computer partners (Brennan, 1991), and this is true whether the currency of communication is speech or text (Brennan, 1996).

We aimed to collect a speech corpus that met the following criteria: it should contain (1) spontaneous speech, (2) in the form of sentences, (3) by multiple speakers, (4) who produced target words with particular phonetic segments, (5) in relatively controlled phonetic environments, (6) in a dialog context in which they received responses contingent upon their utterances, (7) enabling us to collect paired tokens of the same utterance, before and after the speaker received evidence that the utterance was misrecognized.

2. Method

2.1. Task and setup

We designed an information-entry task to elicit spontaneously planned yet predictable utterances. Participating speakers were supplied with a one-page spreadsheet depicting a database of a hypothetical children's softball team containing the children's names, positions on the team, parents' occupations, and what the children would bring to sell at two fund-raising events (a food and kitchen items sale and a garage sale). Speakers were to use this spreadsheet to look up the answers to questions they would be asked and present their answers by speaking (following the procedure in Brennan, 1996). They were told to answer in complete sentences. Feedback from the "dialog system" was provided as text messages. When the speaker made a speaking error (for example, using an incomplete sentence or abandoning an utterance), the system produced an unplanned error message (e.g., "Complete sentences are necessary for the database – please repeat"). In other cases, the system displayed a message in the form "You said:", followed by a transcription of the participant's utterance. Sometimes, when the utterance was the site of a planned error, the transcription would contain a "misrecognition" 1–6 words long. This was done to localize the site of the misrecognition within the utterance and the interaction. By analyzing speech before, during, and after these misrecognition sites, we hoped to discern the time course of hyperarticulation adaptations. Table 1 shows example planned and unplanned errors.

The planned errors were distributed throughout the dialog as follows: of the 66 total answer turns to be provided by speakers, #1–24 were associated with only two planned errors (an 8.3% error rate), utterances #25–47 with eight planned errors (a 33% error rate), and utterances #48–66 with nine planned errors (a 50% error rate that began with a highly salient run of five errors in a row, followed by a run of five non-errors and then some alternation).

Table 1
Examples of planned and unplanned error/repair sequences (Q = question; U = user's answer; S = system's feedback response)

Planned error:
Q. What is Kate Tolstoy bringing to the food sale?
U. Kate Tolstoy is bringing some cookie dough and a picnic table to the food sale
S. You said: Kate Tolstoy is bringing some cooking label in a pickle to the food sale
U. Kate Tolstoy is bringing some cookie dough and a picnic table to the food sale
S. You said: Kate Tolstoy is bringing some cookie dough and a picnic table to the food sale

Unplanned error:
Q. What is Ryan Dade bringing to the food sale?
U. Ryan Dade is bringing cat collars, and a basket, and pet toys to the food, to the garage sale, oops
S. Please repeat
U. Ryan Dade is bringing cat collars, a basket, and pet toys to the garage sale
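The distribution just described is a fixed schedule over the 66 answer turns. The sketch below reconstructs one such schedule; the per-block counts and the opening run of five consecutive errors follow the text, but the remaining error positions are hypothetical, since the paper does not list them.

# Planned-error turns (1-indexed). Block counts match the text:
# 2 of turns 1-24, 8 of 25-47, 9 of 48-66, with the last block
# opening on five errors in a row, then five non-errors, then
# alternation. Positions not fixed by the text are invented.
PLANNED_ERROR_TURNS = {
    10, 20,                          # block 1: two errors
    26, 29, 32, 35, 38, 41, 44, 47,  # block 2: eight errors
    48, 49, 50, 51, 52,              # block 3: salient run of five
    58, 60, 62, 64,                  # ...then alternation
}

def is_planned_error(turn):
    # True if the simulated recognizer stages a misrecognition here.
    return turn in PLANNED_ERROR_TURNS

assert sum(map(is_planned_error, range(1, 25))) == 2
assert sum(map(is_planned_error, range(25, 48))) == 8
assert sum(map(is_planned_error, range(48, 67))) == 9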
To make the recognition errors as realistic as possible, we had a research assistant produce the answers to the questions while talking to computer-based dictation software; the planned error messages in the experiment were based on the resulting misrecognitions.

We elicited spontaneous spoken sentences by having speakers answer questions heard over a headset, pre-recorded in a female voice (e.g., "Who is the catcher of the softball team?"). We made this modality distinction between the questions (speech) and system responses (text) because we wanted speakers to consider what they heard over the headphones to be prompts that did not necessarily originate from the system. This way, the recognition system would not be assumed to be a partner with prior knowledge of the discourse, and speakers would be less likely to use pronouns and ellipses.

2.2. Materials and stimulus words

We identified categories of sounds that can be spoken with both clear and relaxed forms and that would enable us to quantify hyperarticulation.

The first is mid-word /t/ before an unstressed vowel, as in water. In relaxed speech, a /t/ in this context is said as a "flap", which is short and /d/-like, while in clear speech this would be a strong voiceless [t]. The next is word-final /t/ (as in cat), which in relaxed speech is produced without audible noise at the end of the oral closure for the sound; a clear speech form would have audible noise as oral pressure is released after oral closure. The third /t/ variant is mid-word /t/ after n, which may be absent in relaxed forms (as when winter sounds much like winner), and clearly voiceless and released in clear forms. As noted earlier, Levow (1999) and Oviatt et al. (1998a,b) found consonant release and unflapped /t/ to occur more frequently in corrections after system misrecognitions. These same features have also been reported in nonsense sentences when subjects are told to speak more clearly, as for a hearing-impaired or non-native speaking partner (Picheny et al., 1986; Krause and Braida, 2004). The latter studies also report a higher occurrence of full vowels in function words, as opposed to the reduced vowel schwa. In our materials the indefinite article a occurred very frequently, and could thus be examined for changes in vowel quality. Finally, the d in the word and may be unpronounced in a relaxed form of the word, or audibly produced in a clear form of the word. Since a highly frequent variant of this word is the relaxed form with no /d/ (e.g., Bell et al., 2003 report that in the Switchboard corpus, /d/ was articulated in and only 14% of the time), presence of the /d/ was taken as a sign of clearer, or hyperarticulated, speech.

We chose a set of target words containing several tokens for each of these sound categories. The database provided to experiment participants contained these target words (see Table 2). One of the experimenters recorded a set of 66 prompting questions about this database; the questions were worded to evoke answers in complete sentences containing the target words.
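These five contrasts amount to a per-token coding scheme: for each target-word token, a coder logs whether the relaxed or the clear variant was produced. A sketch of that scheme as a data structure follows (ours; the category labels and variant descriptions paraphrase the text above, and the helper function is hypothetical).

# Segmental categories and the variants a coder listens for.
CLEAR_SPEECH_CODES = {
    "medial_t_before_unstressed_vowel": {   # e.g. 'water'
        "relaxed": "flap: short and /d/-like",
        "clear": "strong voiceless [t]",
    },
    "word_final_t": {                       # e.g. 'cat'
        "relaxed": "no audible release noise",
        "clear": "audible release after oral closure",
    },
    "medial_t_after_n": {                   # e.g. 'winter' ~ 'winner'
        "relaxed": "/t/ absent",
        "clear": "voiceless and released [t]",
    },
    "indefinite_article_a": {
        "relaxed": "reduced vowel (schwa)",
        "clear": "full vowel",
    },
    "d_in_and": {
        "relaxed": "/d/ unpronounced (the frequent variant)",
        "clear": "/d/ audibly produced",
    },
}

def code_token(category, heard_clear_variant):
    # One coder judgment for one target-word token.
    variant = "clear" if heard_clear_variant else "relaxed"
    return {"category": category,
            "variant": variant,
            "cue": CLEAR_SPEECH_CODES[category][variant]}

# e.g. a released word-final /t/ in 'basket' during a repair:
print(code_token("word_final_t", heard_clear_variant=True))

Tallying such judgments across paired utterances yields the clear-speech rates compared in the analyses.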

