Computational Models Of Discourse - Columbia University

2y ago
24 Views
2 Downloads
2.64 MB
75 Pages
Last View : Today
Last Download : 3m ago
Upload by : Macey Ridenour
Transcription

Computational Models of DiscourseRegina BarzilayMIT

What is Discourse?

What is Discourse?

Landscape of Discourse Processing Discourse Models: cohesion-based, content-based,rhetorical, intentional Applications: anaphora resolution, segmentation,event ordering, summarization, natural languagegeneration, dialogue systems Methods: supervised, unsupervised, reinforcementlearniing

Discourse Exhibits Structure! Discourse can be partition into segments, which canbe connected in a limited number of ways Speakers use linguistic devices to make thisstructure explicitcue phrases, intonation, gesture Listeners comprehend discourse by recognizing thisstructure– Kintsch, 1974: experiments with recall– Haviland&Clark, 1974: reading time forgiven/new information

Modeling Text StructureKey Question: Can we identify consistent structuralpatterns in text?“various types of [word] recurrence patterns seem tocharacterize various types of discourse” (Harris, 1982)

ExampleStargazers Text(from Hearst, 1994) Intro - the search for life in space The moon’s chemical composition How early proximity of the moon shaped it How the moon helped the life evolve on earth Improbability of the earth-moon system

---------------- Sentence:05101520253035404550556065707580859095 --------- 14form1111 111 1111111 8 scientist1111111 1 5space 11111 25star1111 22 111112 1 1 111 11111 5binary11 111 4trinary1111 8 astronomer 111 1111 1 7orbit11121 1 6pull21 11 1 16planet11111121 1111111 7galaxy111 1111 4lunar1 111 19life 1 1 1111 1 11 111 11 111 1 1 27moon13 11111 1 22 21 212111 1 3move111 7 continent2 1 1 2 1 3 shoreline12 6time11 1 111 3water111 6say1 11111 3species1 1 1 --------- Sentence:05101520253035404550556065707580859095 ---------

Outline Text segmentation Coherence assessment

Flow model of discourseChafe’76:“Our data . suggest that as a speaker moves fromfocus to focus (or from thought to thought) thereare certain points at which they may be a more orless radical change in space, time, character configuration, event structure, or even world . Atpoints where all these change in a maximal way,an episode boundary is strongly present.”

Segmentation: AgreementPercent agreement — ratio between observedagreements and possible agreementsA B 22 91%8 3C

Results on AgreementPeople can reliably predict segment boundaries!Grosz&Hirschbergberg’92newspaper text74-95%Hearst’93expository text80%Passanneau&Litman’93monologues82-92%

DotPlot RepresentationKey assumption: change in lexical distribution signalstopic change (Hearst ’94) Dotplot Representation: (i, j) – similarity betweensentence i and sentence j0100Sentence Index2003004005000100200300Sentence Index400500

Segmentation Algorithm of Hearst Initial segmentation– Divide a text into equal blocks of k words Similarity Computation– compute similarity between m blocks on the right andthe left of the candidate boundary Boundary Detection– place a boundary where similarity score reacheslocal minimum

Similarity Computation: RepresentationVector-Space RepresentationSENTENCE1 : I like applesSENTENCE2 : Apples are good for 110Sentence21111001

Similarity Computation: Cosine MeasureCosine of angle betweentwo vectors in n-dimensional spacePsim(b1 ,b2 ) pPtwy,b1 wt,b2w2t t,b1Pnt 1w2t,b2SENTENCE1 : 1 0 0 0 1 1 0SENTENCE2 : 1 1 1 1 0 0 1sim(S1 ,S2 ) 2 2 21 0 0 1 0 1 0 1 1 0 1 0 0 122222222(1 0 0 0 1 1 0 ) (1 1 1 1 02 02 12 )Output of Similarity computation:0.220.33 0.26

Boundary Detection Boundaries correspond to local minima in the gap 80200220240260 Number of segments is based on the minima threshold(s σ/2, where s and σ corresponds to average andstandard deviation of local minima)

Segmentation EvaluationComparison with human-annotated segments(Hearst’94): 13 articles (1800 and 2500 words) 7 judges boundary if three judges agree on the same segmentationpoint

Evaluation ResultsMethodsPrecisionRecallRandom Baseline 33%0.440.37Random Baseline 41%0.430.42Original method thesaurus-based similarity0.640.58Original method0.660.61Judges0.810.71

Evaluation Metric: Pk onokaymissfalsealarmokayPk : Probability that a randomly chosen pair of words kwords apart is inconsistently classified (Beeferman ’99) Set k to half of average segment length At each location, determine whether the two ends of theprobe are in the same or different location. Increase acounter if the algorithm’s segmentation disagree Normalize the count between 0 and 1 based on thenumber of measurements taken

Notes on Pk measure Pk [0, 1], the lower the better Random segmentation: Pk 0.5 On synthetic corpus: Pk [0.05, 0.2] On real segmentation tasks: Pk [0.2, 0.4]

Outline Text segmentation Coherence assessment

Modeling CoherenceActive networks and virtual machines have a long history ofcollaborating in this manner. The basic tenet of this solutionis the refinement of Scheme. The disadvantage of this typeof approach, however, is that public-private key pair and redblack trees are rarely incompatible. Coherence is a property of well-written texts that makesthem easier to read and understand than a sequence ofrandomly strung sentences Local coherence captures text organization at the level ofsentence-to-sentence transitions

Centering TheoryGrosz&Joshi&Weinstein,1983; Strube&Hahn,1999;Poesio&Stevenson&Di Eugenio&Hitzeman,2004 Constraints on the entity distribution in a coherent text– Focus is the most salient entity in a discourse segment– Transition between adjacent sentences is characterizedin terms of focus switch Constraints on linguistic realization of focus– Focus is more likely to be realized as subject or object– Focus is more likely to be referred to with anaphoricexpression

Phenomena to be ExplainedJohh went to his favorite musicstore to buy a piano.John went to his favorite musicstore to buy a piano.He had frequented the store formany years.It was a store John had frequented for many years.He was excited that he could finally buy a piano.He was excited that he could finally buy a piano.He arrived just as the store wasclosing for the day.It was closing just as John arrived.

Analysis The same content, different realization Variation in coherence arises from choice ofsyntactic expressions and syntactic forms

Another ExampleJohn really goofs sometimes.Yesterday was a beautiful day and he was excited abouttrying out his new sailboat.He wanted Tony to join him on a sailing trip.He called him at 6am.He was sick and furious at being woken up so early.

Centering Theory: Basics Unit of analysis: centers “Affiliation” of a center: utterance (U) and discoursesegment (DS) Function of a center: to link between a givenutterance and other utterances in discourse

Center Typology Types:– Forward-looking Centers Cf (U, DS)– Backward-looking Centers Cb (U, DS) Connection: Cb (Un ) connects with one of Cf(Un 1 )

Constraints on Distribution of Centers Cf is determined only by U; Cf are partially ordered in terms of salience The most highly ranked element of Cf (Un 1 ) isrealized as Cb (Un ) Syntax plays role in ambiguity resolution: subj ind obj obj others Types of transitions: center continuation, centerretaining, center shifting

Center ContinuationContinuation of the center from one utterance not onlyto the next, but also to subsequent utterances Cb (Un 1 ) Cb (Un ) Cb (Un 1 ) is the most highly ranked element ofCf (Un 1 ) (thus, likely to be Cb (Un 2 )

Center RetainingRetention of the center from one utterance to the next Cb (Un 1 ) Cb (Un ) Cb (Un 1 ) is not the most highly ranked element ofCf (Un 1 ) (thus, unlikely to be Cb (Un 2 )

Center ShiftingShifting the center, if it is neither retained no continued Cb (Un 1 ) Cb (Un )

Coherent DiscourseCoherence is established via center continuationJohn went to his favorite musicstore to buy a piano.John went to his favorite musicstore to buy a piano.He had frequented the store formany years.It was a store John had frequented for many years.He was excited that he could finally buy a piano.He was excited that he could finally buy a piano.He arrived just as the store wasclosing for the day.It was closing just as John arrived.

Application to Essay Grading(Miltsakaki&Kukich’00) Framework: GMAT e-rater Implementation: manual annotation of coreferenceinformation Grading: based on ratio of shifts Data: GMAT essays

Study results Correlation between shifts and low grades(established using t-test) Improvement of score prediction in 57%

Statistical ApproachKey Premise: the distribution of entities in locallycoherent discourse exhibits certain regularities Abstract a text into an entity-based representationthat encodes syntactic and distributionalinformation Learn properties of coherent texts, given a trainingset of coherent and incoherent texts

Text Representation Entity Grid — a two-dimensional array that capturesthe distribution of discourse entities across textsentences Discourse Entity — a class of coreferent nounphrases

Input Text1 Former Chilean dictator Augusto Pinochet, was arrested in London on October 14th, 1998.2 Pinochet, 82, was recovering from surgery.3 The arrest was in response to an extradition warrant served by a Spanish judge.4 Pinochet was charged with murdering thousands,including many Spaniards.5 He is awaiting a hearing, his fate in the balance.6 American scholars applauded the arrest.

Input Text with Syntactic AnnotationUse Collins’ parser(1997):1. [Former Chilean dictator Augusto Pinochet]s , was arrested in[London]x on [October 14th]x 1998.2. [Pinochet]s , 82, was recovering from [surgery]x .3. [The arrest]s was in [response]x to [an extradition warrant]xserved by [a Spanish judge]s .4. [Pinochet]s was charged with murdering [thousands]o , including many [Spaniards]o .5. [He]s is awaiting [a hearing]o , [his fate]x in [the balance]x .6. [American scholars]s applauded the [arrest]o .Notation: S subjects, O object, X other

Input Text with Coreference InformationUse noun-phrase coreference tool (Ng and Cardie,2002):1. [Former Chilean dictator Augusto Pinochet]s , was arrested in[London]x on [October 14]x 1998.2. [Pinochet]s , 82, was recovering from [surgery]x .3. [The arrest]s was in [response]x to [an extradition warrant]x servedby [a Spanish judge]s .4. [Pinochet]s was charged with murdering [thousands]o , includingmany [Spaniards]o .5. [He]s is awaiting [a hearing]o , [his fate]x in [the balance]x .6. [American scholars]s applauded the [arrest]o .

sOutput Entity Grid1S X X2S– –– – – – – – – – – – – 1X3 – – – –– – – – – – – – – – 2S X X S– – – – – – 34S– – – – – – –5S– – – – – – – – –6 – – – –OO O– – – – 4O X X– – – – – – – –– 5S6

Comparing ––X

Coherence Assessment Text is encoded as a distribution over entity transitiontypesSOX––––––XX OX XX S–OO SO OO X–SS OS XS S Entity transition type — {S, O, X, –}ndi10 0 0 .03 0 0 0 .02 .07 0 0 .12 .02 .02 .05 .25di2.02 0 0 .03 0 0 0 .06 0 0 0 .05 .03 .07 .07 .29How to select relevant transition types?: Use all the unigrams, bigrams, . . . over {S, O, X, –} Do feature selection

SOX––––––XX OX XX S–OO SO OO X–SS OS XS SText Encoding as Feature Vectordi10 0 0 .03 0 0 0 .02 .07 0 0 .12 .02 .02 .05 .25di2.02 0 0 .03 0 0 0 .06 0 0 0 .05 .03 .07 .07 .29Each grid rendering xij of a document di is represented by afeature vector:Φ(xij ) (p1 (xij ), p2 (xij ), . . . , pm (xij ))where m is the number of all predefined entity transitions, andpt (xij ) the probability of transition t in the grid xij

Learning a Ranking Function Training SetOrdered pairs (xij , xik ), where xij and xik are renderingsof the same document di , and xij exhibits a higher degreeof coherence than xik Training Procedure– Goal: Find a parameter vector w that yields a “rankingscore” function w · Φ(xij ) satisfying:w · (Φ(xij ) Φ(xik )) 0 (xij , xik ) in training set– Method: Constraint optimization problem solved usingthe search technique described in Joachims (2002)

Evaluation: Information Ordering Goal: recover the most coherent sentence ordering Basic set-up:– Input: a pair of a source document and a permutationof its sentences– Task: find a source document via coherence ranking Data: Training 4000 pairs, Testing 4000 pairs (Naturaldisasters and Transportation Safety Reports)

Information Ordering(a) During a third practice forced landing, with the landinggear extended, the CFI took over the controls.(b) The certified flight instructor (CFI) and the private pilot,her husband, had flown a previous flight that day andpracticed maneuvers at altitude.(c) The private pilot performed two practice power offlandings from the downwind to runway 18.(d) When the airplane developed a high sink rate during theturn to final, the CFI realized that the airplane was lowand slow.(e) After a refueling stop, they departed for another trainingflight.

Information Ordering(b) The certified flight instructor (CFI) and the private pilot,her husband, had flown a previous flight that day andpracticed maneuvers at altitude.(e) After a refueling stop, they departed for another trainingflight.(c) The private pilot performed two practice power offlandings from the downwind to runway 18.(a) During a third practice forced landing, with the landinggear extended, the CFI took over the controls.(d) When the airplane developed a high sink rate during theturn to final, the CFI realized that the airplane was lowand slow.

Evaluation: Summarization Goal: select the most coherent summary amongseveral alternatives Basic set-up:– Input: a pair of system summaries– Task: predict the ranking provided by human Data: 96 summary pairs for training, 32 pairs fortesting (from DUC 2003)

Baseline: LSACoherence Metric: Average distance between adjacentsentences measured by cosine (Foltz et al. 1998) Shown to correlate with human judgments Fully automatic Orthogonal to ours (lexicalized)

Evaluation ResultsTasks: O1 ordering(Disasters) O2 ordering(Reports) S summary rankingModelO1O2SGrid87.390.481.3LSA72.172.125.0

Varying Linguistic ComplexityWhat is the effect of syntactic knowledge? Reduce alphabet to { X,– }ModelO1O2S Syntax87.390.468.8-Syntax86.988.362.5

Conclusions Word distribution patterns strongly correlate withdiscourse patterns within a text Distributional-level approaches can be successfullyapplied to text-level relations

Computational Models of Discourse Regina Barzilay MIT. What is Discourse? What is Discourse? Landscape of Discourse Processing Discourse Models: cohesion-based, content-based, rhetorical, intentional

Related Documents:

International Journal of Peace Studies, Volume 10, Number 1, Spring/Summer 2005 THE POWER OF DISCOURSE AND THE DISCOURSE OF POWER: PURSUING PEACE THROUGH DISCOURSE INTERVENTION Michael Karlberg Abstract Western-liberal discourses of power and the social practices associated with them are proving inadequate to the task

Korean: Papers and Discourse Date Discourse and Grammar Asian Discourse and Grammar Discourse Transcription East Asian Linguistics Aspects of Nepali Grammar Prosody, Grammar, and Discourse in Central Alaskan Yup'ik 15.00 Proceedings from the fIrst 20.00 Workshop on American Indigenous Languages Proceedings from the second 15.00

Columbia 25th Birthday Button, 1992 Columbia 25th Birthday Button, 1992 Columbia Association's Celebrate 2000 Button, 1999 Columbia 40th Birthday Button, 2007 Lake-Front Live, Columbia Festival of the Arts Button, n.d. Columbia 22nd Birthday Button, 1989 I Love Columbia Button, n.d. Histor

Columbia Days Inn 1504 Nashville Highway Columbia, TN 38401 1-800-576-0003 Comfort Inn Columbia 1544 Bear Creek Pike Columbia, TN 38401 1-866-270-2846 Holiday Inn Express 1554 Bear Creek Pike Columbia, TN 38401 1-800-465-4329 Jameson Inn Columbia 715 James M. Campbell Columbia, TN 34802 1-800-423-7846

Computational models have been designed to obtain a better understanding of the mechanisms behind angiogenesis. In this paper we review computational models of sprouting angiogenesis. These models can be subdivided into three categories: models that mainly focus on tip cell migration, models that make a distinction between the role of tip

theoretical framework for computational dynamics. It allows applications to meet the broad range of computational modeling needs coherently and with fast, structure-based computational algorithms. The paper describes the SOA computational ar-chitecture, the DARTS computational dynamics software, and appl

share some philosophical underpinnings. This article will describe the theoretical antecedents for the Foucaultian version of this useful method of inquiry. Keywords: Foucault, discourse analysis 1. Introduction Discourse analysis (also called critical discourse analysis) is a relatively recent

Agile methods in SWEP Scrum (mainly) XP Head First Software Development Process The Scrum process follows the agile manifesto is intended for groups of 7 consists of simple rules and is thus easy to learn 15.04.2012 Andreas Schroeder 9