1) Speech Articulation And The Sounds Of Speech. 2) The .

Upload by : Xander Jaffe
Transcription

Overview
1) Speech articulation and the sounds of speech.
2) The acoustic structure of speech.
3) The classic problems in understanding speech perception: segmentation, units, and variability.
4) Basic perceptual data and the mapping of sound to phoneme.
5) Higher-level influences on perception.
6) Physiology of speech perception and language.
9/8/11 PSY 719 - Speech

Vocal Tract
(Figure: the vocal tract.)

Articulation - 1
The speech signal is the result of the movement of the tongue, lips, jaw, and vocal cords in modifying the air stream from the lungs.
The movement of the articulators takes time (they have inertia). The movements are rapid (for communication). This leads to the phenomenon of coarticulation. The preceding segment alters the precise realization of the current segment (inertia or perseveration). The next segment also alters how the current segment is realized (planning or anticipation).

Articulation - 2
For example, when a vowel sound is produced, the vocal cords vibrate and the tongue is in a particular position within the oral cavity. The lips are open (either spread or rounded).
For a nasal consonant, such as /m/, the uvula is pulled down and sound is allowed to resonate (flow) through the nasal cavity. During /m/ production, the lips are closed for a brief interval.
For a fricative consonant, such as /s/, the vocal folds are held open (they do not vibrate) and air is forced through a narrow opening between the tongue and the alveolar ridge. This produces the noise quality of /s/.

Articulation - 3
Basic dimensions of articulation:
1) Voicing - Vocal folds vibrate or are held open (voiced or voiceless)
2) Nasalization - Nasalized or not (for non-nasal sounds, the uvula closes off the nasal cavity)
3) Place - Location in vocal tract of constriction: bilabial, labiodental, inter-dental, alveolar, palatal, velar, glottal
4) Manner - Degree of constriction in vocal tract (open, moderate, constricted and closed)
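
The four dimensions above can be treated as a small feature bundle for each consonant. The following is only a sketch: the class name, the mini-inventory, and the mapping of stops to "closed" and fricatives to "constricted" are assumptions made for illustration, not part of the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Consonant:
    """One consonant described on the four articulatory dimensions."""
    symbol: str
    voiced: bool   # do the vocal folds vibrate?
    nasal: bool    # does sound resonate through the nasal cavity?
    place: str     # location of the constriction
    manner: str    # degree of constriction (assumed labels)


# Hypothetical mini-inventory for illustration only.
INVENTORY = [
    Consonant("p", False, False, "bilabial", "closed"),
    Consonant("b", True, False, "bilabial", "closed"),
    Consonant("m", True, True, "bilabial", "closed"),
    Consonant("s", False, False, "alveolar", "constricted"),
    Consonant("z", True, False, "alveolar", "constricted"),
    Consonant("g", True, False, "velar", "closed"),
]


def differing_dimensions(a, b):
    """List the dimensions on which two consonants differ."""
    names = ("voiced", "nasal", "place", "manner")
    return [name for name in names if getattr(a, name) != getattr(b, name)]
```

With this representation, /p/ and /b/ differ only in voicing, and /b/ and /m/ differ only in nasalization, which is one way to see why such pairs are easily confused.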

Phonemes
A phoneme is the smallest segment of the signal that, if changed, would produce a different word with a different meaning. Thus, while words carry meaning, phonemes are the units from which words are built. /m/ and /b/ are different phonemes in English because /mæd/ (mad) and /bæd/ (bad) are different words.
Different languages have different numbers of phonemes (Hawaiian has 11, Midwestern American English has 39), but all come from a universal set.
All languages divide their inventory of phonemes into vowels and consonants. All languages group phonemes into sequences to form syllables.
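
The mad/bad argument is the minimal-pair test: two words that differ in exactly one segment show that the two segments are different phonemes. A minimal sketch of that check, assuming words are given as lists of segment symbols (the function name and the list representation are illustrative choices, not from the slides):

```python
def is_minimal_pair(a, b):
    """True if two transcriptions have equal length and differ in exactly one segment."""
    if len(a) != len(b):
        return False
    return sum(x != y for x, y in zip(a, b)) == 1


mad = ["m", "æ", "d"]
bad = ["b", "æ", "d"]
bat = ["b", "æ", "t"]

print(is_minimal_pair(mad, bad))  # differ only in the first segment
print(is_minimal_pair(mad, bat))  # differ in two segments
```

The check only compares transcriptions; whether the two forms are actually different words of the language still has to come from a speaker or a dictionary.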

Phonemes - 2
Every phoneme represents the coordinated movement of the articulators that results in a different sound. Because the movement from one phoneme to the next is continuous, the precise sound that represents a phoneme varies with the nature of what precedes and follows it. This phenomenon is called coarticulation.
The position of the articulators is similar to the different pipes in an organ. The position of the tongue, lips and jaw produces a set of resonators (tubes with a particular length and area). These amplify some frequencies and attenuate others. The pattern of this frequency information, over time, is speech.
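
The organ-pipe comparison can be made concrete with a common textbook simplification that is not in the slides: treat the vocal tract for a neutral vowel as a uniform tube closed at the glottis and open at the lips. Such a tube resonates at F_n = (2n - 1) c / (4 L). The tract lengths (17.5 cm and 14 cm) and the speed of sound (350 m/s) below are assumed round values chosen for illustration.

```python
def tube_resonances(length_m, speed_of_sound=350.0, count=3):
    """Resonances (Hz) of a uniform tube closed at one end and open at the other.

    Uses the quarter-wavelength formula F_n = (2n - 1) * c / (4 * L).
    """
    return [(2 * n - 1) * speed_of_sound / (4.0 * length_m)
            for n in range(1, count + 1)]


longer_tract = tube_resonances(0.175)   # assumed 17.5 cm tract
shorter_tract = tube_resonances(0.14)   # assumed 14 cm tract

print([round(f) for f in longer_tract])
print([round(f) for f in shorter_tract])
```

Under these assumptions the longer tube resonates near 500, 1500 and 2500 Hz, and the shorter tube resonates at proportionally higher frequencies. A real vocal tract is not uniform, so actual formants depart from these values as the tongue, lips and jaw move.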

Some Terms
The terms to describe the segments include allophonic, phonetic, phonemic and phonological:
Phonetic – a description of the sound segment that includes details of production. [pʰ] vs [p]
Phonemic – a description of the sound segment where anything predictable is omitted. /p/
Allophonic – variations within a segment category (phonetic) that do not change the identity of the segment (phonemic).
Phonological – the sound segment inventory and constraints on sequences.

Acoustics of Speech
The speech signal is broken down by the ear into a representation of the intensity at each frequency over time. The sound spectrogram is a similar representation of the acoustic information in speech.
Dark areas are concentrations of energy at a particular frequency. When such a concentration occurs over time, it is called a formant. In the next graph, the energy in four syllables that start with the consonant /b/ and end in the consonant /g/ is shown. These syllables differ in the vowel and illustrate the different sound for these four vowels.
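
A spectrogram of this kind can be approximated by cutting the signal into short overlapping frames and measuring the energy at each frequency in every frame. The sketch below uses only the Python standard library and a direct (slow) discrete Fourier transform. The frame length, hop size, Hann window, 8000 Hz sampling rate and 1000 Hz test tone are all assumptions chosen to keep the example small; real speech analysis would use a fast Fourier transform library.

```python
import cmath
import math


def spectrogram(samples, frame_len=64, hop=32):
    """Return one list of magnitudes (bins 0..frame_len//2) per frame."""
    # Hann window to taper the edges of each frame.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        chunk = [samples[start + n] * window[n] for n in range(frame_len)]
        mags = []
        for k in range(frame_len // 2 + 1):
            # Direct DFT of the windowed frame at frequency bin k.
            acc = sum(chunk[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            mags.append(abs(acc))
        frames.append(mags)
    return frames


SAMPLE_RATE = 8000  # Hz, assumed
FRAME_LEN = 64

# A steady 1000 Hz tone stands in for a single concentration of energy.
tone = [math.sin(2 * math.pi * 1000 * n / SAMPLE_RATE) for n in range(512)]
spec = spectrogram(tone, frame_len=FRAME_LEN)

# The strongest bin in the first frame, converted back to Hz.
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
peak_hz = peak_bin * SAMPLE_RATE / FRAME_LEN
print(peak_hz)
```

Each row of `spec` is one time slice and each column one frequency band, so a band that stays strong across many rows corresponds to the dark horizontal bands described above.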

bVg Examples
(Figure: spectrograms of the syllables /big/, /bɛg/, /bɔg/ and /bʊg/, with a formant labeled. Axes: time, with a 100 msec scale; frequency in kHz.)

Speech Acoustics - 2
The formants in the speech signal vary with the position of the tongue, lips and jaw. They are a “cue” for listeners to recognize the sounds of speech.
For the vowel in /bɛg/ (beg), spoken by this male talker, the center frequencies of the first three formants at the middle of the syllable are approximately 560 Hz, 1750 Hz and 2400 Hz.
For the consonants /b/ and /g/ at the beginning and end of each syllable, the formants change rapidly over time. These changes are called formant transitions and are critical to our ability to recognize consonants and vowels.

Consonants and Vowels
The sounds of speech vary in:
1) The frequencies and intensities of the formants
2) The pattern of change in the formants, over time
3) Voicing (whether the vocal folds vibrate or not)
4) The presence of nasal formants
5) Presence of noise in the spectrum and the frequency/intensity distribution of the noise
6) The duration of 1 through 5 above

The Challenge of Speech Perception
The question of how humans perceive speech is complex because of two classical problems:
1) How is a continuous signal divided up into phonemes, syllables and words?
2) How does the listener recognize the sequence of phonemes, syllables and words when the speech signal changes because of differences in speaker, speaking rate, dialect, and coarticulation?

Segmentation
Speech is continuous. There are no breaks between words in fluent speech. One effect of coarticulation is to smear the boundaries between adjacent phonemes, syllables and words.
As an illustration, consider the following sentence. Where does one word end and the next begin? The sentence is shown first as a waveform then as a spectrogram.

Segmentation Illustration - 1
(Figure: waveform of the sentence. Axes: time in msec, amplitude.)
In this sentence, where are the boundaries between words?

Segmentation Illustration - 2
(Figure: spectrogram of the sentence. Axes: time, frequency in kHz.)

Segmentation - 2
In fluent speech, silence occurs when we take a breath, when we deliberately pause for emphasis (or to think), or when a particular type of phoneme occurs (stops) in which the vocal tract is briefly closed.
These silence intervals that occur with stops can be within words or between words.

Segmentation Illustration - 3
“Our waiter was rudely interrupted while working.”
(Figure labels: silent gaps; waiter, interrupted, working.)

Variability
Variability refers to the changes that occur in phonemes, syllables, and words because:
1) They occur with different sounds before and after them. This influence of coarticulation alters the sound for a phoneme based on what came before and after.
2) Different talkers have vocal tracts of different lengths and speak with different dialects and idiolects.
3) Talkers vary their rate of speech and the accuracy (carefulness) of their articulation.

Variability: Coarticulation
The formant transitions that characterize a /b/ or a /g/ change with the vowel. That is, the acoustic details of /b/ in the words beat, bit, bet, bat, box, bought, boat, book, boot, but, bird, bite, bout, and boy are different.
One of the primary goals of research in speech is to find a way to characterize a pattern of change in the sound which is the same for all examples of a particular phoneme (e.g., /b/s) and distinguishes it from other phonemes (an invariant). Finding such a pattern for each consonant and vowel has so far proved elusive.

Variability: Talkers
1) Individuals differ in vocal tract length and size of their larynx. They speak different dialects with different accents. This leads to different physical sounds that correspond to the same phonemes and words.
2) Individuals vary in how careful or “sloppy” they are in their articulation. In saying “Did you get to know him well”, “Did you” is often said as “dija”, “get to” becomes “geta”, and the “h” in “him” is omitted. In spite of this, listeners have relatively little difficulty understanding speech across this range of variation.

Variability: Speaking Rate
A person may intrinsically speak rapidly or slowly. Each individual also varies their rate of speech. This causes segments (phonemes and syllables) to vary in duration.
However, some phonemes such as /b/ and /w/ (“beat” and “wheat”) are differentiated (in part) by their duration. This implies that listeners adjust their perception for the rate at which the person is speaking.
Like size-distance scaling in vision, this implies that certain properties of the sound must be extracted first to properly perceive the distal object.

Variability: Environment
A person listens to speech in many different environments.
They may hear a person speaking against a quiet background, against a background of environmental noise (e.g., a busy street) or against a background of other conversations.
How does a listener recover the intended message? How do they separate the aspects of sound that belong to the speech signal that they are trying to recognize from other sound or other speech?

Perception
In speech perception, listeners show evidence of phonetic constancy. They hear the same speech sounds in spite of variation in who is talking, how fast they talk, or other variation in the sound.
From an ecological perspective, this has led to a search for invariant properties or features in the sound. When the feature is present, it would signal the listener that a particular phoneme has been spoken.
While some invariant properties may have been identified, there are also phonemes for which no invariant properties have been found yet.

Speech by Ear and by Eye
The perception of speech can take advantage of visual information (from looking at the face of the speaker) in addition to the sound. That is, a listener is more accurate in their perception when they have both auditory and visual information.
This can also lead to illusions. If we edit a video to show a speaker saying /ga/ while the audio track plays /ba/, an observer will report that they hear “da”. If the observer closes their eyes, they hear “ba”. Known as the McGurk Effect, this illustrates how listeners integrate information from two sensory systems in speech perception.

Articulation of Vowels
In English, vowels are voiced, non-nasal and their manner is vocalic and continuant (very little constriction). They are classified according to the position of the tongue and the shape of the lips. The point of maximal constriction in the oral tract with the tongue can be front, mid or back. The height of the tongue in the vocal tract can be high, mid or low. The lips can be rounded (round opening) or spread (wide, shallow opening). The vowel can be tense (long duration) or lax (short duration).

Articulation of Vowels - 2
For example, /i/ as in beat is a high (height), front (front to back), spread (lip rounding), tense (long duration), non-nasal vowel.
/ / is a low, mid, spread, lax vowel.
The next diagram shows the position of the maximal constriction of the tongue in vowel production for most of the vowels of American English.

American English Vowel Space
(Figure: vowel symbols arranged by tongue position, front to back, and tongue height, high to low. Symbols in dotted boxes are allophones.)

Articulation of Vowels - 3
The vowel /ɝ/ was not shown in the diagram because it is distinguished by the curvature of the tongue and not the tongue position or height, where it is similar to /ʊ/ or / /.
The diphthongs were not shown since they are characterized by movement of the tongue over time.

A Talker’s Vowel Space
(Figure: one talker’s vowels and diphthongs plotted by F1 in Hz, from 200 to 900, against F2 in Hz, from 2400 to 800.)

Acoustics of Vowels
The vowels are plotted as a function of the first formant (F1) and the second formant (F2) frequencies.
The diphthongs are shown as an “arrow” representing their formant frequencies at the beginning, middle and end.

Other Languages
There are attributes of vowel articulation that English does not use.
1) Nasalization. A vowel can be nasal or non-nasal. French and Hindi use this distinction.
2) Long or short (duration). A vowel can be long in duration or short. Japanese uses this distinction.
Note that English speakers make long and short, nasal and non-nasal vowels, but these are allophonic variations.
