
Cortical tracking of hierarchical linguistic structures in connected speech

Nai Ding (1,2), Lucia Melloni (3–5), Hang Zhang (1,6–8), Xing Tian (1,9,10) & David Poeppel (1,11)

1 Department of Psychology, New York University, New York, New York, USA. 2 College of Biomedical Engineering and Instrument Sciences, Zhejiang University, Hangzhou, China. 3 Department of Neurology, New York University Langone Medical Center, New York, New York, USA. 4 Department of Neurophysiology, Max Planck Institute for Brain Research, Frankfurt, Germany. 5 Department of Psychiatry, Columbia University, New York, New York, USA. 6 Department of Psychology and Beijing Key Laboratory of Behavior and Mental Health, Peking University, Beijing, China. 7 PKU-IDG/McGovern Institute for Brain Research, Peking University, Beijing, China. 8 Peking-Tsinghua Center for Life Sciences, Beijing, China. 9 New York University Shanghai, Shanghai, China. 10 NYU-ECNU Institute of Brain and Cognitive Science at NYU Shanghai, Shanghai, China. 11 Neuroscience Department, Max Planck Institute for Empirical Aesthetics, Frankfurt, Germany. Correspondence should be addressed to N.D. (ding_nai@zju.edu.cn) or D.P. (david.poeppel@nyu.edu). Received 12 August; accepted 3 November; published online 7 December 2015; doi:10.1038/nn.4186.

The most critical attribute of human language is its unbounded combinatorial nature: smaller elements can be combined into larger structures on the basis of a grammatical system, resulting in a hierarchy of linguistic units, such as words, phrases and sentences. Mentally parsing and representing such structures, however, poses challenges for speech comprehension. In speech, hierarchical linguistic structures do not have boundaries that are clearly defined by acoustic cues and must therefore be internally and incrementally constructed during comprehension. We found that, during listening to connected speech, cortical activity of different timescales concurrently tracked the time course of abstract linguistic structures at different hierarchical levels, such as words, phrases and sentences. Notably, the neural tracking of hierarchical linguistic structures was dissociated from the encoding of acoustic cues and from the predictability of incoming words. Our results indicate that a hierarchy of neural processing timescales underlies grammar-based internal construction of hierarchical linguistic structure.

To understand connected speech, listeners must construct a hierarchy of linguistic structures of different sizes, including syllables, words, phrases and sentences [1–3]. It remains puzzling how the brain simultaneously handles the distinct timescales of the different linguistic structures, for example, from a few hundred milliseconds for syllables to a few seconds for sentences [4–14]. Previous studies have suggested that cortical activity is synchronized to acoustic features of speech, approximately at the syllabic rate, providing an initial timescale for speech processing [15–19]. But how the brain utilizes such syllabic-level phonological representations, closely aligned with the physical input, to build multiple levels of abstract linguistic structure, and to represent these concurrently, is not known. We hypothesized that cortical dynamics emerge at all timescales required for the processing of different linguistic levels, including the timescales corresponding to larger linguistic structures such as phrases and sentences, and that the neural representation of each linguistic level corresponds to timescales matching the timescales of the respective linguistic level.

Although linguistic structure building can clearly benefit from prosodic [20,21] or statistical cues [22], it can also be achieved purely on the basis of the listeners' grammatical knowledge. To experimentally isolate the neural representation of the internally constructed hierarchical linguistic structure, we developed new speech materials in which the linguistic constituent structure was dissociated from prosodic or statistical cues. By manipulating the levels of linguistic abstraction, we found separable neural encoding of each different linguistic level.

RESULTS

Cortical tracking of phrasal and sentential structures

In the first set of experiments, we sought to determine the neural representation of hierarchical linguistic structure in the absence of prosodic cues. We constructed hierarchical linguistic structures using an isochronous, 4-Hz sequence of syllables that were independently synthesized (Fig. 1a,b, Supplementary Fig. 1 and Supplementary Table 1).
As a result of the acoustic independence between syllables (that is, no co-articulation), the linguistic constituent structure could only be extracted using lexical, syntactic and semantic knowledge, and not prosodic cues. The materials were first developed in Mandarin Chinese, in which syllables are relatively uniform in duration and are also the basic morphological unit (always morphemes and, in most cases, monosyllabic words). Cortical activity was recorded from native listeners of Mandarin Chinese using magnetoencephalography (MEG). Given that the different linguistic levels, that is, the monosyllabic morphemes, phrases and sentences, were presented at unique and constant rates, the hypothesized neural tracking of hierarchical linguistic structure was tagged at distinct frequencies.

The MEG response was analyzed in the frequency domain, and we extracted the response power in every frequency bin using an optimal spatial filter (Online Methods). Consistent with our hypothesis, the response spectrum showed three peaks, at the syllabic rate (P = 1.4 × 10⁻⁵, paired one-sided t test, false discovery rate (FDR) corrected), the phrasal rate (P = 1.6 × 10⁻⁴, paired one-sided t test, FDR corrected) and the sentential rate (P = 9.6 × 10⁻⁷, paired one-sided t test, FDR corrected), and the response was highly consistent across listeners (Fig. 1c). Given that the phrasal- and sentential-rate rhythms were not conveyed by acoustic fluctuations at the corresponding frequencies (Fig. 1b), cortical responses at the phrasal and sentential rates must be a consequence of internal, online structure-building processes. Cortical activity at all three peak frequencies was seen bilaterally (Fig. 1c). The response power averaged over sensors in each hemisphere was significantly stronger in the left hemisphere at the sentential rate (P = 0.014, paired two-sided t test), but not at the phrasal (P = 0.20, paired two-sided t test) or syllabic rates (P = 0.40, paired two-sided t test).
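The peak analysis above amounts to asking, for each listener, whether spectral power at a tagged frequency exceeds the power in neighboring bins. A minimal sketch of that style of test is given below; it is not the authors' pipeline (the optimal spatial filter from the Online Methods is omitted, the data layout and variable names are assumed, and only standard NumPy/SciPy calls are used):

```python
# Sketch: frequency-tagged response analysis (assumed data shapes, not the authors' code).
# `epochs`: array of shape (n_trials, n_times) for one listener; `fs`: sampling rate in Hz.
import numpy as np
from scipy import stats

def response_spectrum(epochs, fs):
    """Power spectrum of the trial-averaged response for one listener."""
    evoked = epochs.mean(axis=0)                  # average over trials
    spec = np.abs(np.fft.rfft(evoked)) ** 2       # power in each frequency bin
    freqs = np.fft.rfftfreq(evoked.size, d=1.0 / fs)
    return freqs, spec

def peak_test(all_spec, freqs, f_target, n_neighbors=4):
    """One-sided paired t test: power at the target bin vs. the mean of neighboring bins.
    `all_spec` has shape (n_listeners, n_bins)."""
    idx = np.argmin(np.abs(freqs - f_target))
    neighbors = np.r_[idx - n_neighbors:idx, idx + 1:idx + 1 + n_neighbors]
    target = all_spec[:, idx]                     # one value per listener
    baseline = all_spec[:, neighbors].mean(axis=1)
    t, p_two = stats.ttest_rel(target, baseline)
    return p_two / 2 if t > 0 else 1 - p_two / 2  # one-sided p value
```

In this scheme, the one-sided P values obtained at the sentential, phrasal and syllabic rates would then be FDR-corrected across the tested bins, mirroring the statistics reported above.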

Figure 1 Neural tracking of hierarchical linguistic structures. (a) Sequences of Chinese or English monosyllabic words were presented isochronously, forming phrases and sentences. (b) Spectrum of stimulus intensity fluctuation revealed the syllabic rhythm, but no phrasal or sentential modulation. The shaded area covers ±2 s.e.m. across stimuli. (c) MEG-derived cortical response spectrum for Chinese listeners and materials (dark red curve, grand average; light red curves, individual listeners; N = 16, 0.11-Hz frequency resolution). Neural tracking of syllabic, phrasal and sentential rhythms was reflected by spectral peaks at the corresponding frequencies. Frequency bins with significantly stronger power than neighbors (0.5-Hz range) are marked (*P < 0.001, paired one-sided t test, FDR corrected). The topographical maps of response power across sensors are shown for the peak frequencies. In panel a, each sentence divides into a noun phrase and a verb phrase, with sentential, phrasal and syllabic rhythms at 1, 2 and 4 Hz; example English materials include "Dry fur rubs skin" and "New plans gave hope".

Dependence on syntactic structures

Are the responses at the phrasal and sentential rates indeed separate neural indices of processing at distinct linguistic levels, or are they merely sub-harmonics of the syllabic-rate response, generated by intrinsic cortical dynamical properties? We address this question by manipulating different levels of linguistic structure in the input. When the stimulus is a sequence of random syllables that preserves the acoustic properties of Chinese sentences (Fig. 1 and Supplementary Fig. 2), but eliminates the phrasal/sentential structure, only syllabic (acoustic) level tracking occurs (P = 1.1 × 10⁻⁴ at 4 Hz, paired one-sided t test, FDR corrected; Fig. 2a). Furthermore, this manipulation preserves the position of each syllable in a sentence (Online Methods) and therefore further demonstrates that the phrasal- and sentential-rate responses are not a result of possible acoustic differences between the syllables in a sentence. When two adjacent syllables and morphemes combine into verb phrases, but there is no four-element sentential structure, phrasal-level tracking emerges at half of the syllabic rate (P = 8.6 × 10⁻⁴ at 2 Hz and P = 2.7 × 10⁻⁴ at 4 Hz, paired one-sided t test, FDR corrected; Fig. 2b). Similar responses are observed for noun phrases (Supplementary Fig. 3).
To test whether the phrase-level responses segregate from the sentence level, we constructed longer verb phrases that were unevenly divided into a monosyllabic verb followed by a three-syllable noun phrase (Fig. 2c). We expected the neural response to the long verb phrase to be tagged at 1 Hz, whereas the neural responses to the monosyllabic verb and the three-syllable noun phrase would present as harmonics of 1 Hz. Consistent with this hypothesis, cortical dynamics emerged at one-fourth of the syllabic rate, whereas the response at half of the syllabic rate was no longer detectable (P = 1.9 × 10⁻⁴, 1.7 × 10⁻⁴ and 9.3 × 10⁻⁴ at 1, 3 and 4 Hz, respectively, paired one-sided t test, FDR corrected).

Dependence on language comprehension

When listening to Chinese sentences (Fig. 1a), listeners who did not understand Chinese showed responses only to the syllabic (acoustic) rhythm (P = 3.0 × 10⁻⁵ at 4 Hz, paired one-sided t test, FDR corrected; Fig. 2d), further supporting the argument that cortical responses to larger, abstract linguistic structures are a direct consequence of language comprehension.

If aligning cortical dynamics to the time course of linguistic constituent structure is a general mechanism required for comprehension, it must apply across languages. Indeed, when native English speakers were tested with English materials (Fig. 1a), their cortical activity also followed the time course of larger linguistic structures, that is, phrases and sentences (P = 4.1 × 10⁻⁵, syllabic rate; Fig. 2e; P = 3.9 × 10⁻³, 4.3 × 10⁻³ and 6.8 × 10⁻⁶ at the sentential, phrasal and syllabic rates, respectively; Fig. 2f; paired one-sided t test, FDR corrected).

Figure 2 Tracking of different linguistic structures. Each panel shows the syntactic structure repeating in the stimulus (left) and the cortical response spectrum (right; shaded area indicates ±2 s.e.m. over listeners, N = 8). (a) Chinese listeners, Chinese materials: syllables were syntactically independent and cortical activity encoded only the acoustic and syllabic rhythm. (b,c) Additional tracking emerged with larger linguistic structures. Spectral peaks are marked by a star (black, P < 0.001; gray, P < 0.005; paired one-sided t test, FDR corrected). (d) English listeners, Chinese materials from Figure 1: acoustic tracking only, as there was no parsable structure. (e,f) English listeners, English materials: syllabic rate (4/1.28 Hz) and sentential- and phrasal-rate responses to parsable structure in the stimulus.
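The frequency tags follow directly from the isochronous design: each linguistic level repeats with a fixed period, so its rate is the reciprocal of its duration. The short sketch below spells out that arithmetic for the Chinese materials (250-ms syllables) and the English materials (1.28-s sentences, that is, 320-ms syllables); the helper function is ours and is used only for illustration:

```python
# Sketch: how the frequency-tagging design maps each linguistic level to a rate.
# Syllable durations follow the materials described above (Chinese: 250 ms;
# English: 1.28-s four-syllable sentences, i.e., 320 ms per syllable).
def tagged_rates(syllable_s, syllables_per_phrase, syllables_per_sentence):
    """Return the stimulation rate (Hz) of each linguistic level."""
    return {
        "syllable": 1.0 / syllable_s,
        "phrase":   1.0 / (syllable_s * syllables_per_phrase),
        "sentence": 1.0 / (syllable_s * syllables_per_sentence),
    }

print(tagged_rates(0.250, 2, 4))  # Chinese: syllable 4.0 Hz, phrase 2.0 Hz, sentence 1.0 Hz
print(tagged_rates(0.320, 2, 4))  # English: syllable 3.125 Hz (4/1.28), sentence 0.78125 Hz (1/1.28)
```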

Neural tracking of linguistic structures rather than probability cues

We found that concurrent neural tracking of multiple levels of linguistic structure was not confounded with the encoding of acoustic cues (Figs. 1 and 2). However, is this simply explained by neural tracking of the predictability of smaller units? As a larger linguistic structure, such as a sentence, unfolds in time, its component units become more predictable. Thus, cortical networks solely tracking transitional probabilities across smaller units could show temporal dynamics matching the timescale of larger structures. To test this alternative hypothesis, we crafted a constant transitional probability Markovian Sentence Set (MSS) in which the transitional probability of lower-level units was dissociated from the higher-level structures (Fig. 3a and Supplementary Fig. 1e,f). The constant transitional probability MSS was contrasted with a varying transitional probability MSS, in which the transitional probability is low across sentential boundaries and high within a sentence (Fig. 3b,c). If cortical activity only encodes the transitional probability between lower-level units (for example, acoustic chunks in the MSS), independent of the underlying syntactic structure, it should show tracking of the sentential structure for the varying probability MSS, but not for the constant probability MSS. In contrast with this prediction, indistinguishable neural responses to sentences were observed for both MSS (Fig. 3d), demonstrating that neural tracking of sentences is not confounded by transitional probability. Specifically, for the constant transitional probability MSS, the response was statistically significant at the sentential rate, twice the sentential rate and the syllable rate (P = 1.8 × 10⁻⁴, 2.3 × 10⁻⁴ and 2.7 × 10⁻⁶, respectively). For the varying transitional probability MSS, the response was statistically significant at the sentential rate, twice the sentential rate and the syllable rate (P = 7.1 × 10⁻⁴, 7.1 × 10⁻⁴ and 4.8 × 10⁻⁶, respectively).

Figure 3 Dissociating sentential structures and transitional probability. (a,b) Grammar of an artificial Markovian stimulus set with constant (a) or variable (b) transitional probability. Each sentence consists of three acoustic chunks, each containing 1–2 English words. The listeners memorized the grammar before the experiments. (c) Schematic time course and spectrum of the transitional probability. (d) Neural response spectrum (shaded area covers ±2 s.e.m. over listeners, N = 8). Significant neural responses to sentences were seen for both MSS. Spectral peaks are marked by an asterisk (P < 0.001, paired one-sided t test, FDR corrected, same color code as the spectrum). Responses were not significantly different between the two MSS in any frequency bin (paired two-sided t test, P > 0.09, uncorrected).
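The logic of Figure 3 can be made concrete with a small simulation of the transitional-probability time course. The numbers follow the figure (constant set: transition probability 1/5 between all chunks; varying set: probability 1 within a sentence and 1/25 across sentence boundaries, with three chunks per sentence); the 0.35-s chunk duration is inferred from the 1/1.05-Hz sentence rate on the Fig. 3d frequency axis, and the code itself is our own sketch rather than the authors' stimulus generator:

```python
# Sketch: transitional probability time courses for the two Markovian sentence sets (Fig. 3).
import numpy as np

def tp_timecourse(n_sentences, p_within, p_across):
    """Transitional probability of each chunk given the previous one (3 chunks per sentence)."""
    tp = []
    for _ in range(n_sentences):
        tp += [p_across, p_within, p_within]   # chunk 1 follows a sentence boundary
    return np.array(tp)

chunk_rate = 1.0 / 0.35                        # three 0.35-s chunks per 1.05-s sentence (cf. Fig. 3d axis)
for name, tp in [("constant", tp_timecourse(60, 1 / 5, 1 / 5)),
                 ("varying",  tp_timecourse(60, 1.0, 1 / 25))]:
    spec = np.abs(np.fft.rfft(tp - tp.mean())) ** 2
    freqs = np.fft.rfftfreq(tp.size, d=1.0 / chunk_rate)
    f_sentence = chunk_rate / 3                # sentential rate
    print(name, "power at sentence rate:", spec[np.argmin(np.abs(freqs - f_sentence))].round(3))
```

The constant-probability sequence has a flat transitional-probability profile and therefore no power at the sentential rate, so a sentential-rate response to that material cannot reflect tracking of transitional probability.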
Given that the MSS involved real English sentences, listeners had prior knowledge of the transitional probabilities between acoustic chunks. To control for the effect of such prior knowledge, we created a set of Artificial Markovian Sentences (AMS). In the AMS, the transitional probability between syllables was the same within and across sentences (Supplementary Fig. 4a). The AMS was composed of Chinese syllables, but no meaningful Chinese expressions were embedded in the AMS sequences. As the AMS was not based on the grammar of Chinese, the listeners had to learn the AMS grammar to segment sentences. By comparing the neural responses to the AMS sequences before and after the grammar was learned, we were able to separate the effect of prior knowledge of transitional probability from the effect of grammar learning. Here, the grammar of the AMS means the set of rules that governs the sequencing of the AMS, that is, which syllables can follow which syllables.

The neural responses to the AMS before and after grammar learning were analyzed separately (Supplementary Fig. 4). Before learning, when the listeners were instructed that the stimulus was just a sequence of random syllables, the response showed a statistically significant peak at the syllabic rate (P = 0.0003, bootstrap), but not at the sentential rate. After the AMS grammar was learned, however, a significant response peak emerged at the sentential rate (P = 0.0001, bootstrap). A response peak was also observed at twice the sentential rate, possibly reflecting the second harmonic of the sentential response. This result further confirms that neural tracking of sentences is not confounded by neural tracking of transitional probability.

Figure 4 Neural tracking of sentences of varying structures. (a) Neural activity tracked the sentence duration, even when the sentence boundaries (dotted lines) were not separated by acoustic gaps. (b) Averaged response near a sentential boundary (dotted line). The power continuously changed throughout the duration of a sentence. Shaded area covers ±2 s.e.m. over listeners (N = 8). Significant power differences between time bins (shaded squares) are marked by asterisks (P < 0.01, one-sided t test, FDR corrected). (c) Confusion matrix for neural decoding of the sentence duration. (d) Neural activity tracks noun phrase duration (shown at the bottom). Yellow areas show significant differences between the curves (P < 0.005, bootstrap, FDR corrected).
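Figure 4c summarizes single-trial decoding of sentence duration as a confusion matrix. The decoder itself is specified in the paper's Online Methods, which are not part of this excerpt; the sketch below shows one plausible, deliberately simple scheme, a nearest-template correlation classifier applied to the sensor-RMS waveform, with all names and design choices our own:

```python
# Sketch: nearest-template decoding of sentence duration (an illustrative scheme only;
# the paper's actual decoder is described in its Online Methods, not reproduced here).
import numpy as np

def decode_duration(trial_rms, templates):
    """Assign a single-trial RMS waveform to the duration (in syllables) whose
    average template it correlates with best."""
    best, best_r = None, -np.inf
    for duration, template in templates.items():
        n = min(len(trial_rms), len(template))
        r = np.corrcoef(trial_rms[:n], template[:n])[0, 1]
        if r > best_r:
            best, best_r = duration, r
    return best

# templates[k] would be the waveform averaged over the other trials containing k-syllable
# sentences (leave-one-out); decode_duration() then fills one cell of the confusion matrix.
```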

Neural tracking of sentences varying in duration and structure

These results are based on sequences of sentences that have uniform duration and syntactic structure. We next addressed whether cortical tracking of larger linguistic structures generalizes to sentences that are variable in duration (4–8 syllables) and syntactic structure. These sentences were again built on isochronous Chinese syllables, intermixed and sequentially presented without any acoustic gap at the sentence boundaries. Examples translated into English include "Don't be nervous," "The book is hard to read," and "Over the street is a museum."

As these sentences have irregular durations that are not tagged by frequency, the MEG responses were analyzed in the time domain by averaging sentences of the same duration. To focus on sentential-level processing, we low-pass filtered the response at 3.5 Hz. The MEG response (root mean square, r.m.s., over all sensors) rapidly increased after a sentence boundary and continuously changed throughout the duration of a sentence (Fig. 4a). To illustrate the detailed temporal ...

Figure 5 Localizing cortical sources of the sentential- and phrasal-rate responses using ECoG (N = 5). Left, power envelope of high-gamma activity. Right, waveform of low-frequency activity. Electrodes in the right hemisphere were projected to the left hemisphere, and right-hemisphere (left-hemisphere) electrodes are shown by hollow (filled) circles. The figure only displays electrodes that showed statistically significant neural responses ...
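For the variable-duration sentences, the time-domain analysis described above (not the ECoG analysis of Fig. 5) reduces to three steps: low-pass filter the response at 3.5 Hz, take the RMS over MEG sensors, and average trials whose sentences have the same length. Below is a minimal sketch under assumed data shapes and an assumed zero-phase Butterworth filter; it is an illustration of the described steps, not the authors' code:

```python
# Sketch: time-domain analysis of variable-duration sentences (assumed shapes and filter).
import numpy as np
from scipy.signal import butter, filtfilt

def sentence_rms(trials, fs, cutoff_hz=3.5):
    """trials: array (n_trials, n_sensors, n_times) -> one RMS waveform per trial."""
    b, a = butter(4, cutoff_hz / (fs / 2), btype="low")
    low = filtfilt(b, a, trials, axis=-1)        # zero-phase low-pass per sensor
    return np.sqrt((low ** 2).mean(axis=1))      # RMS over sensors: (n_trials, n_times)

def average_by_duration(rms, durations):
    """Average the RMS waveforms of trials whose sentences have the same length (in syllables)."""
    durations = np.asarray(durations)
    return {d: rms[durations == d].mean(axis=0) for d in np.unique(durations)}
```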
