Deep Parsing in Watson

M. C. McCord, J. W. Murdock, B. K. Boguraev

Digital Object Identifier: 10.1147/JRD.2012.2185409


Abstract

Two deep parsing components, an English Slot Grammar (ESG) parser and a predicate-argument structure (PAS) builder, provide core linguistic analyses of both the questions and the text content used by IBM Watson to find and hypothesize answers. Specifically, these components are fundamental in question analysis, candidate generation, and analysis of passage evidence. As part of the Watson project, ESG was enhanced, and its performance on Jeopardy! questions and on established reference data was improved. PAS was built on top of ESG to support higher-level analytics. In this paper, we describe these components and illustrate how they are used in a pattern-based relation extraction component of Watson. We also provide quantitative results of evaluating the component-level performance of ESG parsing.

Introduction

Two deep parsing components, an English Slot Grammar (ESG) parser and a predicate-argument structure (PAS) builder, provide core linguistic analyses of both the questions and the text content used by IBM Watson to find and hypothesize answers. Specifically, these components are fundamental in question analysis, candidate generation, and analysis of passage evidence [1-3].

ESG [4-7] is a deep parser in the sense that the parse trees it produces for a sentence (or segment of any phrasal category) show a level of logical analysis (or deep structure). However, each parse tree also shows a surface-level grammatical structure (surface structure), along with the deep structure. The parse trees for a segment are ranked according to a parse scoring system (described below), and for Watson, we use only the highest-ranked parse. (A parse score roughly corresponds to the likelihood that the parse is a correct one.) In this paper, we provide an overview of Slot Grammar (SG) in its current state, discussing new features and special adaptations made for the Jeopardy! question-answering (QA) task. Most of the improvements motivated by the Jeopardy! challenge are applicable to general English and other applications. The adaptations that are really special to Jeopardy! questions can be controlled by flag settings, which are off by default and can be turned on when ESG is used for the Jeopardy! task.

Parse analysis by ESG is followed by the application of a PAS builder, which simplifies and abstracts from the ESG parse in a variety of ways; for example, it drops some terms (e.g., auxiliary verbs) that are rarely very important for the tasks that our downstream components perform. Active/passive alternations such as "John sold a fish" and "A fish was sold by John" have slightly different structures in ESG but the same structure in PAS.
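As an illustration of that normalization, here is a minimal sketch, in Python, of how active and passive parses of these two sentences might be mapped to one predicate-argument triple. The Node class, the slot names, and the build_pas function are our own simplified assumptions for exposition, not Watson's actual PAS builder.

    # Minimal sketch (not the actual Watson code): normalizing an active and a
    # passive ESG-style parse to the same predicate-argument structure.
    from dataclasses import dataclass, field

    @dataclass
    class Node:
        """Simplified stand-in for an ESG parse node (an assumption)."""
        lemma: str
        pos: str
        features: set = field(default_factory=set)   # e.g., {"passive"}
        slots: dict = field(default_factory=dict)    # slot name -> filler Node

    def build_pas(root: Node) -> tuple:
        """Return a (predicate, logical_subject, logical_object) triple,
        dropping auxiliaries and undoing the passive alternation."""
        # Skip auxiliary verbs such as "be" in "was sold".
        if root.lemma == "be" and "pred" in root.slots:
            return build_pas(root.slots["pred"])
        subj = root.slots.get("subj")
        obj = root.slots.get("obj")
        if "passive" in root.features:
            # The grammatical subject of a passive is the logical object;
            # the "by"-agent, if present, is the logical subject.
            subj, obj = root.slots.get("agent"), subj
        return (root.lemma,
                subj.lemma if subj else None,
                obj.lemma if obj else None)

    john, fish = Node("John", "noun"), Node("fish", "noun")
    active = Node("sell", "verb", slots={"subj": john, "obj": fish})
    passive = Node("be", "verb",
                   slots={"pred": Node("sell", "verb", features={"passive"},
                                       slots={"subj": fish, "agent": john})})

    assert build_pas(active) == build_pas(passive) == ("sell", "John", "fish")

Both variants map to the same triple, which is the property that downstream components rely on.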
The deep parsing suite for Watson consists of ESG, followed by the PAS builder. Deep parsing results are pervasively used in the Watson QA system, in components within every stage of the DeepQA architecture [8]: question analysis, question decomposition, hypothesis generation, hypothesis and evidence scoring, etc. Here are a few specific examples.

- Relation extraction (see [9] and the section on relation extraction below) identifies semantic relationships (in the sense of that section) among entities using the results of deep parsing.
- Question analysis [1] uses results of deep parsing to identify the type of answer that a question is seeking.
- The keyword search component [2] uses semantic relations in the question to identify keywords that have some strong semantic connection to whatever the question is asking for; those keywords are given a higher weight in the search query.
- Passage-scoring components [3] use the results of deep parsing on both the question text and the passages found by keyword search to determine whether a passage aligns well to the question and thus provides evidence in support of some candidate answer.
- Some of the type coercion components [10] use the PAS to compare the type requested to answer types found in natural-language text.
- The results of PAS (and relation extraction) across a large corpus are aggregated in the PRISMATIC knowledge base [11], which is used by a variety of search [2] and answer-scoring [3, 10] components.

The pervasive usage of deep parsing results reflects the fact that these components provided the core natural-language processing capabilities for the Watson QA system.

In this paper, we first describe SG parsing for Watson, and then we discuss PAS, followed by an approach to relation extraction that illustrates the use of ESG and PAS. These main sections are followed by sections on use in Watson, evaluation, related work, and conclusion and future work.

SG parsing

The SG parsing system is divided into a large language-universal shell and language-specific grammars for English, German, French, Spanish, Italian, and Portuguese. Some of the SG features described in this section are in the shell, and some are specific to English (ESG); all of the examples are for ESG. We discuss 1) the pipeline of SG parsing, 2) the nature of SG parses, 3) the lexical system, and 4) syntactic analysis. At the end of this paper, we describe an evaluation of ESG performance.

Pipeline of SG parsing

The main steps of SG parsing are (A) tokenization and segmentation, (B) morpholexical analysis, and (C) syntactic analysis. Step (A) is self-contained and can handle various tagging systems (such as HTML). Its output is directly used in some components of Watson. Unlike some parsers, SG uses no part-of-speech (POS) tagger; the corresponding information simply comes out of syntactic analysis. In the following, after describing the nature of SG parse trees, we concentrate on the lexicon and syntactic analysis.
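As a schematic illustration of steps (A) through (C), the following sketch wires three stages into a pipeline. The function names and data shapes are our own assumptions, not the ESG interface.

    # Illustrative sketch of the three SG pipeline stages named above;
    # function names and data shapes are assumptions, not the ESG API.
    from typing import Dict, List

    def tokenize(text: str) -> List[str]:
        # (A) Tokenization and segmentation; real ESG also handles markup (e.g., HTML).
        return text.replace(".", " .").split()

    def morpholex(tokens: List[str]) -> List[Dict]:
        # (B) Morpholexical analysis: lexicon lookup plus inflectional and
        # derivational morphology. No POS tagger is involved; every candidate
        # sense is carried forward for syntactic analysis to choose among.
        toy_lexicon = {
            "chandeliers": [{"lemma": "chandelier", "pos": "noun"}],
            "look": [{"lemma": "look", "pos": "verb"}, {"lemma": "look", "pos": "noun"}],
            "great": [{"lemma": "great", "pos": "adj"}],
        }
        return [{"token": t, "senses": toy_lexicon.get(t.lower(), [])} for t in tokens]

    def syntactic_analysis(analyses: List[Dict]) -> Dict:
        # (C) Syntactic analysis: slot-filling over the candidate senses,
        # producing ranked parse trees. Stubbed here as a placeholder.
        return {"nodes": analyses, "score": 0.0}

    parse = syntactic_analysis(morpholex(tokenize("Chandeliers look great.")))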

Nature of SG analyses

[Figure 1: ESG parse of the Jeopardy! question "Chandeliers look great but nowadays do not usually use these items from which their name is derived."]

Figure 1 shows a sample Jeopardy! question and its ESG parse tree. We look at the example and give an overview of SG parses in general.

An SG parse tree is a dependency tree: Each tree node N is centered on a headword, which is surrounded by its left and right modifiers, which are, in turn, tree nodes. Each modifier M of N fills a slot in N. The slot shows the grammatical role of M in N. In our example, the node with headword "chandelier" fills the subj (i.e., subject) slot for the coordinated VP (verb phrase) node with headword "but". This modifier tree structure is the surface structure of the parse analysis. In our sample parse display, you can see, on the left side, the surface structure tree lines, each line connecting a node M to its mother node N and showing the slot filled by M in N.

Slots are of two kinds: complement slots and adjunct slots. Complement slots, such as subj and obj (i.e., direct object) for verbs, are idiosyncratic to senses of their headwords and are associated with these senses in their lexical entries. Adjunct slots, such as vadv (verb-modifying adverbial), are associated with the POS of the headword sense in the SG syntax module. Adjunct slot-fillers can typically modify any node of the category (POS) with which they are associated. Complement slots play a dual role in SG: They can name grammatical roles, as mentioned. In addition, they can name logical arguments of word senses, as described later in this section.

In the sort of parse display given in Figure 1, the lines/rows correspond 1-to-1 to tree nodes. We now describe the five main ingredients associated with a parse node (a schematic sketch of these ingredients, in code, appears at the end of this subsection), and then we state which parts constitute deep structure and which constitute surface structure.

(1) The headword of the node: The internal parse data structure stores several versions of the headword, including a) the form of it as it occurs in the text (inflected, mixed case, etc.); b) the lemma (citation) form; and c) the SG word sense of the node, which we explain below in the "SG lexicons" subsection. Typically, the headword comes from a single-word token, but it may be a multiword or a punctuation symbol acting as a coordinator or a special symbol for a special kind of node, such as a quote node as described below. In the above form of parse display, the headword is shown in lemma form, but there are display options to show other forms. The headword is seen in the middle column as a predicate followed by arguments, for example, in derive(17, u, 15, 12). We call this predication the word-sense predication for the node, and it is the main vehicle for showing the deep structure of the parse.

(2) The ID of the node: This is an integer that, in most cases, is the word number of the headword in the segment, but there are exceptions, with the most common being for multiwords, where the ID is the word number of the head of the multiword. In this parse display, the node ID is shown as the first argument of the word-sense predication, for example, 17 in derive(17, u, 15, 12).

(3) The (logical or deep) argument frame of the node: In the internal parse data structure, this consists of the list of complement slots of the word sense, each slot being associated with its filler node (or nil if it has no filler). In the "derive" node of our example, this list of pairs would be (subj: nil, obj: ph15, comp: ph12), where ph15 is the phrase with node ID 15, spanning "their name", and ph12 is the phrase with ID 12, spanning "from which". The subj slot has no overt filler. Note that "derive" is given in the passive, but these three slot-fillers constitute the logical (active form) arguments of the verb. For example, ph15 is the logical obj of "derive", although grammatically, it is a subj (of "be"). That is why we speak of the logical or deep argument frame of the node. For a verb, the first member of its argument frame is always its logical subject.

Now we can say what the word-sense predication of a node is in the above form of parse display. The predicate name is the word sense or, optionally, the citation form. The first argument is the node ID. The remaining arguments are the IDs of the filler nodes in the argument frame or u (for "unfilled" or "unknown") if there is no filler.

The word-sense predication can be directly translated into a logical predication. We can replace the numerical arguments by similarly indexed logical variables, for example, as in derive(e17, x, x15, x12), where, in general, derive(e, x, y, z) means that e is an event where x derives y in manner z. Hence, the node ID argument can be thought of as an event argument or, more generally, an entity argument for the predication.
Note that, in the example, "chandeliers" (node 1) is shown as the logical subj of the predicates for "look", "do", and "use", although in surface structure, its only role is as the (grammatical) subj of the coordinated node 4. In handling coordination, the SG parsing algorithm can "factor out" slots of the conjuncts. This happens with nodes 2 and 6, providing the common subj filled by node 1, but still showing 1 as logical subj for each conjunct. SG parsing also fills in implicit arguments for many nonfinite VPs, and this results in 1 being the logical subj of node 9.

The sample parse shows two other kinds of implicit arguments filled in: a) The predication great(3, 1, u), where "chandeliers" (1) fills the first slot (asubj) of the adjective "great", directly shows that "great" applies to "chandeliers" (under the context of "look"). b) The predication which(13, 11, u), where "items" (11) fills the first slot (nsubj) of the relative pronoun "which", is interpreted as showing that the relative pronoun is co-referent with "items". Then, in building a logical form, the relative pronoun's variable can simply be replaced throughout with the variable for "items".

(4) The features of the node: In our parse display, the node's features are listed to the right of the headword. These can be morphosyntactic or semantic features. (In this example, some features were omitted for brevity's sake.) The first feature listed is the POS of the node. Most of the features come from those of the headword sense, as obtained from morpholexical analysis of the headword, but some may be added during syntactic analysis when the node acquires modifiers.

(5) The (surface) modifier structure for the node: In the internal parse data structure, a node N has two associated lists, one for its left modifiers (premodifiers) and one for its right modifiers (postmodifiers), where each modifier node is paired with the slot it fills in N.

In our parse display, it should be clear how to read the tree structure from the lines and dots on the left of the display (picture a tree diagram in standard form turned on its side). The slot shown closest to the headword of a node N is the slot that N fills in its mother node. For each complement slot S, the slot option used for S is shown in parentheses after S. For instance, node 17, for "derived", fills slot pred(en), meaning that node 17 fills a past-participial form of the pred (predicate) slot, for "be" (in node 16). More information about slot options is given in the next subsection.

The core of the SG deep structure is the set of word-sense predications described in (3) above, since these are close to logical predications. Deep structure information also exists in the semantic features of nodes. However, even some morphosyntactic features, such as tense and number, matter for logical form. The core of the surface structure lies in the headword information (1) and (2), the morphosyntactic features, and the surface modifier structure (5). However, adjunct slots appearing in (5) can also be of relevance to deep structure, because, e.g., determiners may produce quantifiers in logical form.
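The following schematic sketch, in Python, pulls together ingredients (1) through (5) and shows how a word-sense predication such as derive(17, u, 15, 12) can be read off a node. The class layout is our own illustrative assumption, not ESG's actual internal data structure.

    # Schematic sketch of an SG parse node, covering ingredients (1)-(5) above;
    # the layout is an assumption, not ESG's actual internal representation.
    from dataclasses import dataclass, field
    from typing import List, Optional, Tuple

    @dataclass
    class SGNode:
        surface: str      # (1a) inflected form as it occurs in the text
        lemma: str        # (1b) citation form
        sense: str        # (1c) SG word sense (here just the lemma)
        node_id: int      # (2) usually the headword's word number in the segment
        # (3) logical argument frame: complement slots with fillers (None = nil)
        arg_frame: List[Tuple[str, Optional["SGNode"]]] = field(default_factory=list)
        features: List[str] = field(default_factory=list)  # (4) POS first, then others
        premods: List[Tuple[str, "SGNode"]] = field(default_factory=list)   # (5) left mods
        postmods: List[Tuple[str, "SGNode"]] = field(default_factory=list)  # (5) right mods

        def word_sense_predication(self) -> str:
            """Predicate name, node ID, then filler IDs ('u' for unfilled slots)."""
            args = [str(self.node_id)]
            args += [str(f.node_id) if f else "u" for _, f in self.arg_frame]
            return f"{self.sense}({', '.join(args)})"

    # Nodes 15 (heading "their name") and 12 (heading "from which") of Figure 1:
    name = SGNode("name", "name", "name", 15, features=["noun", "cn"])
    frm = SGNode("from", "from", "from", 12, features=["prep"])
    derived = SGNode("derived", "derive", "derive", 17,
                     arg_frame=[("subj", None), ("obj", name), ("comp", frm)],
                     features=["verb"])

    print(derived.word_sense_predication())  # -> derive(17, u, 15, 12)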
SG lexicons

In this subsection, we describe the SG lexical system and improvements made to it to benefit Watson. Much of the SG analysis process is driven by the lexicons used, particularly because SG lexicons specify (complement) slot frames for word senses, and the main step in syntactic analysis is slot-filling.

SG lexical entries are typically indexed by citation forms of words or multiwords. Morpholexical analysis of tokens does efficient lookup in the lexicons, along with morphological analysis, both inflectional and derivational. ESG morphology currently handles 29 derivational affixes.

For any language version of SG (such as ESG), there is a main lexicon called the base lexicon. The system allows any number of lexicons; ones besides the base lexicon would typically be user addendum lexicons. The ESG base lexicon has approximately 87,000 entries, but many more word forms are recognized because of the derivational and inflectional morphology.

In the work on Watson, we have developed a way of augmenting (i.e., expanding and improving) the ESG base lexicon automatically from other sources, particularly Princeton WordNet [12, 13]. The process of augmentation is done before run time for any new version of the base lexicon and takes only about 5 seconds on a standard desktop. We describe the augmentation methods in this subsection.

In the following, we describe (A) the form of SG lexical entries and then (B) improvements made during the work on Watson.

Form of SG lexical entries

The following is a sample entry (slightly simplified) from the ESG base lexicon:

    talk < v (obj n (p about)) (comp (p to with))
         < v obj1 (comp1 (p into))
         < n nsubj (nobj n (p about)) (ncomp (p to with))

In general, a lexical entry has an index word, given in citation (lemma) form, which can be a single word or a multiword (talk in our example). This is followed by a sequence of sense frames for the word (three in our example: two verb frames and one noun frame). Each sense frame can specify any of the following seven kinds of items, all of which are optional except the first: (1) POS, (2) complement slot frame, (3) features, both semantic and syntactic, (4) word-sense name, (5) numerical score, (6) subject area test, and (7) generalized support verb construction. Our sample shows only (1) and (2) in the sense frames. Each sense frame defines an SG word sense for the index word.

The ESG lexical word senses are rather syntactic in nature, although the differing slot frames do constrain the possible semantic word senses. However, the SG framework allows finer semantic distinctions in its word senses, because slot options can make semantic type tests on the slot's fillers. This is done to some extent in the ESG lexicon.

Now, let us look in more detail at the seven kinds of items in a sense frame.

Part of speech

In parse data structures, there are 15 possible parts of speech, which include noun, verb, adj, adv, qual (qualifier), det, prep, subconj (subordinating conjunction), and conj (coordinating conjunction). Some of these are seen in the sample parse tree of Figure 1, where the POS is listed as the first of the nodes' features. The lexicon uses these same POS names, except in the case of nouns and verbs, for the sake of brevity. For instance, the lexical POS n is expanded into noun cn (common noun) in parse trees, and v expands into verb. Other features, such as number and tense, are added on the basis of morphology.

Complement slot frame

In our sample entry, the first sense frame for talk shows two slots, namely, obj and comp, in its slot frame. An initial subj slot is implied; every verb has a subj slot, and this can be omitted as long as it needs no special options. The obj slot shown has two options: n and (p about). The first allows NP (noun phrase) fillers for obj (plus some other nominals such as gerund phrases), and the second allows "about"-PPs (prepositional phrases). The options are disjunctively viewed.
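To make the shape of such an entry concrete, the sketch below transcribes the talk entry into data. The SenseFrame and Slot classes and the "p:to"-style option notation are our own expository assumptions, not ESG's lexicon machinery.

    # Illustrative encoding of the sample "talk" entry above; the classes and
    # the option notation are assumptions for exposition, not ESG's internals.
    from dataclasses import dataclass, field
    from typing import List

    @dataclass
    class Slot:
        name: str                 # e.g., "obj"; a trailing "1" in the entry = obligatory
        obligatory: bool = False
        options: List[str] = field(default_factory=list)  # disjunctive filler tests

    @dataclass
    class SenseFrame:
        pos: str                  # (1) POS; the other five kinds of items
        slots: List[Slot] = field(default_factory=list)   # are omitted here

    # talk < v (obj n (p about)) (comp (p to with))
    #      < v obj1 (comp1 (p into))
    #      < n nsubj (nobj n (p about)) (ncomp (p to with))
    talk = [
        SenseFrame("v", [Slot("subj"),                      # implied for every verb
                         Slot("obj", options=["n", "p:about"]),
                         Slot("comp", options=["p:to", "p:with"])]),
        SenseFrame("v", [Slot("subj"),
                         Slot("obj", obligatory=True),      # obj1: default options
                         Slot("comp", obligatory=True, options=["p:into"])]),
        SenseFrame("n", [Slot("nsubj"),
                         Slot("nobj", options=["n", "p:about"]),
                         Slot("ncomp", options=["p:to", "p:with"])]),
    ]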

The comp slot has option (p to with), which allows both "to"- and "with"-PPs. Thus, the slot frame allows variants such as "John talked (about) mathematics to/with Bill".

In general, a slot has an associated list of possible (slot) options, which are symbols that name constraints on the possible fillers of the slot. This applies to adjunct and complement slots (the options for each adjunct slot are specified along with the slot in the syntax module). The idea for complement slots is that the slot deals with one argument of the word sense, which can be realized in different ways syntactically. Slot options can specify not only the basic syntactic category of the filler but also many other kinds of tests on the filler, such as semantic type requirements, other feature tests, subject area tests, tests for specific words, or recursively any Boolean combination of tests. Our example shows how lexical entries name specific options for complement slots. If none is specified, as in the obj1 slot in the second sense frame for talk, then default options are assigned by the system.

Most slots are optional by default, meaning that they are not required to be filled for a valid use of the sense frame. This applies to the two slots listed in the first sense frame of our example. A suffix 1, as in our second sense frame, indicates that the slot is obligatory, i.e., that it must be filled.

Features

The features can be syntactic features or semantic types. ESG currently uses approximately 160 semantic types, belonging to a type hierarchy. Examples can be seen in the sample parse in Figure 1, e.g., artf (artifact) and langunit (language unit). The types are mainly high-level and include, e.g., physical object; substance; abstraction; property; natural phenomenon; event; act; change; various types for time, location, and so on.
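As noted above, slot options can impose semantic type tests on fillers and can be combined recursively with Boolean operators. The following toy sketch shows how such filler tests might compose over a small fragment of a type hierarchy; the hierarchy fragment, the combinators, and all names are illustrative assumptions only.

    # Toy sketch of slot-option filler tests over a semantic type hierarchy;
    # the hierarchy fragment and the combinators are illustrative assumptions.
    TYPE_PARENT = {"artf": "physobj", "langunit": "abstraction",
                   "physobj": "top", "abstraction": "top"}

    def has_type(node_types, required):
        """True if any of the node's semantic types is `required` or a subtype of it."""
        for t in node_types:
            while t is not None:
                if t == required:
                    return True
                t = TYPE_PARENT.get(t)
        return False

    # Options can be recursively any Boolean combination of tests on the filler:
    def option_and(*tests):
        return lambda node: all(t(node) for t in tests)

    def option_or(*tests):
        return lambda node: any(t(node) for t in tests)

    is_physical = lambda node: has_type(node["types"], "physobj")
    is_np = lambda node: node["pos"] == "noun"

    test = option_and(is_np, is_physical)
    print(test({"pos": "noun", "types": ["artf"]}))  # True: artf is under physobj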

