Comparison of Parsers Dealing with Text Ambiguity in Natural Language Processing


Sareya Qayyum, Nimra Aziz, Waqas Anwar, Usama Ijaz Bajwa
Computer Science
COMSATS University Islamabad, Lahore Campus
{sareyaqayyum, nimraiftikhar1995}, {waqasanwar, usamabajwa}

Abstract

Parsing in Natural Language Processing is a vast domain which serves as a pre-processing step for different NLP operations such as information extraction. Multiple parsing techniques have been presented to date. Some of them are unable to resolve the ambiguities that arise in text corpora. This paper compares different models presented in two parsing strategies, statistical parsing and dependency parsing. The comparison has been made on the well-known Penn Treebank corpus, specifically its Wall Street Journal portion.

1. Introduction

In Natural Language Processing (NLP), parsing acts as an essential and key component of many problems. Parsing is the analysis of syntax, commonly called syntactic analysis, in which we process sentences following the rules of a formal grammar. Parsing involves uncovering the meaning, content and underlying structure that make up a sentence. Every language has a unique grammar, thus making the parsing process unique as well. Languages can be divided into two categories:

Segmented languages: languages whose words are space-delimited, e.g. English and Spanish.

Un-segmented languages: languages for which word segmentation is required as a pre-processing step before further processing, e.g. Chinese and Japanese.

A language is not just a 'bag of words', or else there would be no need for grammar. Grammatical rules apply to sentences, where a sentence in a language is a group of strings that consists of two things: a subject and a predicate. A subject is expressed as a noun phrase (NP) and a predicate as a verb phrase (VP). For example, in the English sentence 'Sam went to school', 'Sam' is the NP and 'went' is the VP. Parsing has many applications in Natural Language Processing: machine translation, text summarization and question answering are a few areas of NLP in which immense work is being performed, and parsing serves as an initial step for these problems.

The result of parsing a sentence using a formal grammar is a tree structure. A sentence can have exactly one or many such tree structures. The grammar that is usually used is a CFG (context-free grammar). One of the major challenges in parsing is dealing with ambiguities in the sentence. Ambiguity refers to sentences that are subjective, open to interpretation and can have multiple meanings. Three types of ambiguity are present in a sentence when parsing is performed, namely syntactic ambiguity, lexical ambiguity, and semantic ambiguity.

Syntactic ambiguity: sentences can be parsed into multiple syntactic forms. E.g. in 'I heard his cell phone ring in my office', the phrase 'in my office' can be parsed in a way that modifies the noun or, alternatively, the verb.

Lexical ambiguity: sentences with lexical ambiguity contain words with multiple possible readings. E.g. 'book' is used as a noun in 'He loves to read books'; on the other hand, it is used as a verb in 'He books an appointment at the dentist'.

Semantic ambiguity: this relates to the interpretation of the sentence. E.g. 'I saw a man with the telescope' can be read as 'I saw a man holding a telescope' or as 'I saw a man through a telescope'.

Parsing of sentences happens in multiple stages:
a) Dividing a sentence into tokens. These tokens are used as input to other tasks like parsing.
b) Tagging each token with its part of speech.

Eight parts of speech are observed in the English language: verbs, nouns, pronouns, adverbs, adjectives, conjunctions, interjections and prepositions.

1.1. Types of parsing

Two main types of parsing will be discussed in this paper, and a comparison of performance for both types will be made. Following are the types:
i. Statistical parsing
ii. Dependency parsing

1.1.1. Statistical parsing

In Natural Language Processing, statistical parsers are the type of parsers that associate grammar rules with probabilities. A statistical parser is an algorithm that looks for a tree that maximizes the probability P(T|S). PCFG (probabilistic context-free grammar), an extended form of CFG, is used as the underlying grammar. The probability of a parse tree of a sentence can be computed by first calculating the probability of each production used in the derivation of the tree and then taking the product of these probabilities. Three main tasks are involved in statistical parsing:

1. Determine the likely parse trees for the sentence.
2. Assign a probability to each derived parse.
3. Select the most probable parse (highest probability).

Presently the task of syntactic parsing is very unpredictable because a given sentence can have numerous parse trees, which we call ambiguities. Consider the sentence 'Book that flight', which can form various parse trees depending on its uncertain part-of-speech tags unless these ambiguities are settled. Picking the right parse from the many possible parses is called syntactic disambiguation.

Statistical parsing requires a corpus of hand-parsed text. For this purpose, we have the Penn Treebank (Marcus, 1993). The Penn Treebank is extensively used since it is the largest annotated dataset for English; in recent years it has been immensely used and considered a standard for training and testing statistical parsers. Parseval measures are used to evaluate Penn Treebank parsers. Of the many Parseval measures, the most commonly used are labelled recall (LR) and labelled precision (LP). Sometimes bracketed precision (BP) and bracketed recall (BR) are also used, which are less strict measures than LP and LR.

1.1.2. Dependency parsing

Identifying a sentence and allocating a syntactic structure to it is the major task of dependency parsing. In the dependency-based method, the head-dependent relation provides an approximation of the semantic relationship between arguments and their predicates.

The translation of a sentence to its dependency structure is done in two subtasks:
· Classify the structure for all head-dependent relationships.
· Label these relations with their correct dependency relations.

A dependency parse tree is a directed graph which satisfies the following constraints:
· It has a single designated root node with no incoming arcs.
· With the exception of the root node, every vertex has exactly one incoming arc.
· A unique path exists from the root node to every vertex in V.

In short, these requirements guarantee that every word has a single head, that the dependency structure is connected, and that there is a single root node from which one can follow a unique directed path to each word of the sentence.

The notion of projectivity imposes an additional constraint and is closely related to the context-free nature of human languages. An arc from a head to its dependent is considered projective if there is a path from the head to every word that lies between the head and its dependent. A dependency tree is therefore said to be projective if every single one of its arcs is projective. There are, however, many perfectly valid constructions, especially in languages with relatively flexible word order, that lead to non-projective trees.

The two main methods of dependency parsing are:
a) Transition-based dependency parsing
b) MST (maximum spanning tree) dependency parsing

Transition-based parsers commonly have linear or quadratic complexity. MST-based parsers divide the dependency structure into small parts called 'factors'. The components of the principal MST parsing algorithm are edges consisting of the head, the edge label and the dependent (child). This algorithm has quadratic complexity (Bernd Bohnet, 2010).

Treebanks have a critical role in the development and evaluation of dependency parsers. Dependency structures for a given corpus can be created directly by human annotators. The most widely used syntactic structure is the parse tree, which can be produced using parsing algorithms. These parse trees are valuable in applications like sentence grammar checking, co-reference resolution, question answering and information extraction, and they play a basic role in the semantic analysis stage. We can also use a deterministic procedure to translate existing constituency-based treebanks into dependency trees using head rules. The major English dependency treebanks have largely been extracted from existing resources, for example the Wall Street Journal sections of the Penn Treebank (Marcus et al., 1993). The later OntoNotes project (Hovy et al., 2006; Weischedel et al., 2011) extends this methodology, going beyond conventional news content to include conversational telephone speech, newsgroups, weblogs and talk shows in English, Arabic and Chinese.

2. Literature Review

Following is the previous work on statistical parsing and dependency parsing.

2.1. Statistical parsing

Magerman [9] presented a statistical parser called SPATTER. It achieves the best accuracy by building a complete parse for every sentence in the corpus. SPATTER is based on a decision tree learning technique.
Using the PARSEVAL evaluation measure, SPATTER on the Penn Treebank Wall Street Journal corpus achieves 86% precision P (see equation 1) and recall R (see equation 2), and 1.3 crossing brackets CB (see equation 3) per sentence for sentences of 40 words or less. For sentences of between 10 and 20 words, SPATTER achieves 91% P, 90% R, and 0.5 CB.
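The bracket-matching arithmetic behind these PARSEVAL scores can be sketched in a few lines. The snippet below is a minimal illustration, not any evaluator's actual implementation, and the constituent spans in it are invented:

```python
# Minimal sketch of the PARSEVAL bracket measures: labelled precision (1),
# labelled recall (2) and crossing brackets (3). The constituent spans below
# are invented for illustration; real evaluators score full treebank parses.

def parseval(gold, predicted):
    """gold/predicted: sets of (label, start, end) constituent spans."""
    correct = len(gold & predicted)
    precision = correct / len(predicted)   # correct / components in parser output
    recall = correct / len(gold)           # correct / components in gold standard

    def crosses(a, b):                     # spans overlap without nesting
        (_, s1, e1), (_, s2, e2) = a, b
        return s1 < s2 < e1 < e2 or s2 < s1 < e2 < e1

    crossing = sum(any(crosses(p, g) for g in gold) for p in predicted)
    return precision, recall, crossing / len(predicted)

# Toy spans over a five-word sentence: one predicted bracket is wrong
gold = {("S", 0, 5), ("NP", 0, 2), ("VP", 2, 5), ("NP", 3, 5)}
predicted = {("S", 0, 5), ("NP", 0, 2), ("VP", 2, 5), ("NP", 1, 4)}

p, r, cb = parseval(gold, predicted)
print(p, r, cb)  # -> 0.75 0.75 0.25
```

The wrong span (NP, 1, 4) both misses its gold counterpart (hurting precision and recall) and straddles the gold (NP, 0, 2), so it also counts as a crossing bracket.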

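The three statistical-parsing steps from Section 1.1.1 (derive candidate parses, score each by the product of its PCFG rule probabilities, select the argmax) can be made concrete with a minimal Viterbi-style CKY sketch over the ambiguous 'Book that flight' example. The grammar and every probability below are invented for illustration; real parsers estimate rule probabilities from a treebank, but the mechanics are the same:

```python
# Viterbi CKY sketch for a toy PCFG: the probability of a parse is the product
# of its rule probabilities, and the parser returns the argmax parse of S.
# The grammar and all probabilities are invented for illustration.
from collections import defaultdict

# Binary rules (lhs, rhs1, rhs2) -> prob and lexical rules (tag, word) -> prob.
# 'book' is lexically ambiguous: it can be tagged Verb or Noun.
binary = {("S", "Verb", "NP"): 0.6, ("S", "Noun", "NP"): 0.4,
          ("NP", "Det", "Noun"): 1.0}
lexical = {("Verb", "book"): 1.0, ("Noun", "book"): 0.3,
           ("Noun", "flight"): 0.7, ("Det", "that"): 1.0}

def viterbi_cky(words):
    """Return (probability, backpointer) of the most probable S over words."""
    n = len(words)
    chart = defaultdict(dict)  # chart[(i, j)][label] = (best prob, backpointer)
    for i, w in enumerate(words):                 # seed the chart with tags
        for (tag, word), p in lexical.items():
            if word == w:
                chart[(i, i + 1)][tag] = (p, w)
    for span in range(2, n + 1):                  # build wider spans bottom-up
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):             # every split point
                for (lhs, r1, r2), p in binary.items():
                    if r1 in chart[(i, k)] and r2 in chart[(k, j)]:
                        prob = p * chart[(i, k)][r1][0] * chart[(k, j)][r2][0]
                        if prob > chart[(i, j)].get(lhs, (0.0, None))[0]:
                            chart[(i, j)][lhs] = (prob, (r1, r2, k))
    return chart[(0, n)].get("S")                 # the argmax parse of S

best = viterbi_cky(["book", "that", "flight"])
print(best)  # the Verb reading of 'book' wins: 0.6 * 1.0 * 1.0 * 1.0 * 0.7 = 0.42
```

The Noun reading of 'book' yields a competing parse with probability 0.4 × 0.3 × 1.0 × 1.0 × 0.7 = 0.084, so the Viterbi step resolves the lexical ambiguity in favour of the imperative Verb reading.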
P = number of correct components / number of components in parser output (1)

R = number of correct components / number of components in the gold standard (2)

CB = number of components in parser output that cross gold-standard components / number of components in parser output (3)

In 1996, Collins [10] presented his first model for statistical parsing. Equation 4 below shows a conditional model capable of parse selection, where for a given sentence S in the corpus, the probability of a parse tree T is calculated directly. The input to the model is a part-of-speech (POS) tagged sentence, and the output is a tree. For a given sentence S in the corpus and its tree T, Collins' conditional model represents the probability as P(T|S). The most likely parse under the model is then:

T(best) = argmax_T P(T|S) (4)

Collins' (1996) model showed an improvement in precision and recall when compared with Magerman's (1995) results.

Collins presented a new model in 1997 [11] which improved on the previous results of his conditional model (1996). This approach is based on a generative model. Collins (1997) accounts for word-word dependencies when generating a parse tree. The model focuses on the modeling of the parses and deals with the flat trees of the Penn Treebank corpus. This generative version of the Collins parser shows an improvement of 2.3% over the conditional model of Collins (1996). It achieves 88.1% P and 87.5% R on the Wall Street Journal.

In [12] Collins (1999) observed a few problems with the generative model of Collins (1997). It was noticed that the Collins model is no longer a maximum-likelihood model because of how the dependency probabilities were estimated. Another deficiency is its treatment of all dependency relations as independent. For these reasons, Collins presented another modification of his previous model.

After presenting his first model for statistical parsing in 1997, Charniak [13] (1999, 2000) presented another statistical treebank parser. It outperformed the Collins generative model of 1997. Charniak's main advantage over Collins' is generating candidate parses using a simple probabilistic chart parser. Charniak's model showed an improvement of 0.45% in labelled recall (LR) and labelled precision (LP). The model achieved an average precision and recall of 91.1% on sentences of length 40 or less. For sentences of length 100 or less, the model achieved 89.5% average precision and recall on the Penn Treebank corpus. Over the previously best results, Charniak's model achieves an error reduction of 13% for a single parser on this test set.

Henderson and Brill [14] presented an approach in which they combined the predictions of three existing parsers. They combined Charniak's 1997 model with the Collins 1997 and Ratnaparkhi 1998 models to better understand the capabilities of the parsers and to check whether they yield better results. This combination of three parsers gave the best results, with a labelled precision of 92.1% and labelled recall of 89.2% on the development set, and an LP of 92.4% and LR of 90.1% on the test set of the Penn Treebank. These are the best-known results to date.

The parser presented by Bod [15] claims to give a better performance in terms of Parseval measures. It improved on Charniak's result by achieving 89.7% LP and LR on sentences with up to 100 words. For sentences with 40 words or less, Bod's model achieves 90.8% LP and 90.6% LR, although it is debatable whether an increase of 0.2% in LP and 0.1% in LR should be considered an improvement. Bod's model takes arbitrary structural and lexical dependencies into consideration when computing the probabilities of a parse tree, as it is based on Data Oriented Parsing (DOP).

Collins in 2000 and Collins and Duffy in 2002 [16] presented two approaches in which performance was improved by re-ranking the output of Collins' 1999 model with a different model. For the Collins 2000 model, using a variant of boosting, LP improved from 88.1% to 88.3% and LR from 88.3% to 89.6% over the Collins 1999 model. For the Collins and Duffy 2002 model, using a DOP-like approach with a voted perceptron, LP improved to 88.6% and LR to 88.9%.

2.2. Dependency parsing

A huge interest has been seen in dependency parsing lately for applications such as relation extraction, machine translation, synonym generation, and lexical resource augmentation. The main reason for using dependency structures is that they are highly efficient to learn and parse while still encoding much of the predicate-argument information needed in many applications.

Most of these parsing models have concentrated on projective trees, including the work of Eisner (1996), Yamada and Matsumoto (2003), Collins et al. (1999), Nivre and Scholz (2004), and McDonald et al. (2005).

A parsing model presented by Nivre and Nilsson in 2005 allows edges that are non-projective to be included in trees, using learned edge transformations in a memory-based parser. The MST method differs in that it efficiently searches the full space of non-projective trees. The main insight is that dependency parsing can be treated as the search for a maximum spanning tree in a directed graph. This generalizes the standard projective parsing models based on the Eisner algorithm (Eisner, 1996) and achieves a better efficiency of O(n^2). By using the spanning-tree representation, the work of McDonald et al. (2005) on online large-margin discriminative training methods (McDonald, Pereira, and Ribarov, 2005) is extended to cover non-projective dependencies.

Nowadays there has been an increase in the usage of dependency representations across many tasks of natural language processing (NLP). Stanford dependencies are extensively used in both NLP and biomedical text mining.
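The MST formulation just described treats dependency parsing as the search for a maximum spanning tree over scored head-dependent arcs. A brute-force sketch over a three-word sentence makes the idea concrete, including the well-formedness constraints from Section 1.1.2; the words and arc scores are invented, and real parsers use the Chu-Liu/Edmonds algorithm instead of exhaustive search:

```python
# Brute-force sketch of dependency parsing as maximum-spanning-tree search:
# enumerate every head assignment, keep only well-formed trees, take the argmax.
# Words, arc scores and node 0 (an artificial ROOT) are invented for illustration.
from itertools import product

words = ["ROOT", "John", "saw", "Mary"]
scores = {  # scores[(head, dependent)] = arc score; higher is better
    (0, 1): 3, (2, 1): 9, (3, 1): 1,
    (0, 2): 10, (1, 2): 2, (3, 2): 2,
    (0, 3): 2, (1, 3): 1, (2, 3): 8,
}

def well_formed(heads):
    """Dependency-tree constraints: one head per word holds by construction,
    so it remains to check that every word is reachable from ROOT (which
    also rules out cycles)."""
    children = {}
    for dep, head in heads.items():
        children.setdefault(head, []).append(dep)
    seen, stack = set(), [0]
    while stack:
        for child in children.get(stack.pop(), []):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen == set(heads)

def mst_parse():
    deps = range(1, len(words))
    best, best_score = None, float("-inf")
    for assignment in product(range(len(words)), repeat=len(deps)):
        heads = dict(zip(deps, assignment))
        if any(d == h for d, h in heads.items()) or not well_formed(heads):
            continue  # skip self-loops and non-trees
        score = sum(scores[(h, d)] for d, h in heads.items())
        if score > best_score:
            best, best_score = heads, score
    return best, best_score

tree, total = mst_parse()
print(tree, total)  # -> {1: 2, 2: 0, 3: 2} 27  (John <- saw -> Mary, rooted at 'saw')
```

Cyclic assignments such as {1: 2, 2: 1, ...} are rejected by the reachability check even when their arcs score well, which is exactly the role the spanning-tree constraint plays in MST parsing.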

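The transition-based family contrasted with MST parsing above can likewise be sketched. The fragment below is a hypothetical arc-standard system in which a deterministic gold-tree oracle stands in for the learned classifier that a real transition-based parser (e.g. MaltParser) would train; the sentence and its heads are invented:

```python
# Arc-standard transition sketch: SHIFT / LEFT-ARC / RIGHT-ARC over a stack and
# buffer, giving the linear-time behaviour noted above. A gold-tree oracle
# replaces the learned classifier of a real transition-based parser.

def arc_standard(n_words, gold_heads):
    """gold_heads[d] = head of word d (0 is the artificial ROOT);
    returns the heads recovered by replaying oracle transitions."""
    stack, buf, arcs = [0], list(range(1, n_words + 1)), {}

    def deps_done(w):  # all of w's dependents attached (required for RIGHT-ARC)
        return all(d in arcs for d, h in gold_heads.items() if h == w)

    while buf or len(stack) > 1:
        if len(stack) >= 2:
            top, below = stack[-1], stack[-2]
            if below != 0 and gold_heads.get(below) == top:
                arcs[below] = top       # LEFT-ARC: attach 'below' under 'top'
                stack.pop(-2)
                continue
            if gold_heads.get(top) == below and deps_done(top):
                arcs[top] = below       # RIGHT-ARC: attach 'top' under 'below'
                stack.pop()
                continue
        if not buf:
            break                       # non-projective/malformed input: give up
        stack.append(buf.pop(0))        # SHIFT: move the next word onto the stack
    return arcs

gold = {1: 2, 2: 0, 3: 2}               # invented heads for "John saw Mary"
print(arc_standard(3, gold))
```

Each word is shifted once and popped at most once, which is why the text above credits transition-based parsers with linear complexity; the sketch only handles projective trees, matching the projectivity restriction discussed in Section 1.1.2.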
The study in [10] reveals that Collins and Magerman both use a lexicalized PCFG, associating a head word with every non-terminal present in the parse tree. The Collins parser performs almost equally to SPATTER when trained and tested on the Wall Street Journal portion of the Penn Treebank. The advantage of the Collins 1996 model over SPATTER is the simplicity of its architecture and operation. Still, many improvements could be made by using more sophisticated probability estimation techniques such as deleted interpolation, or estimation that relaxes the distance measure for smoothing. Another limitation of the Collins 1996 model is that it does not account for valency when calculating the parse.

Collins [11] attempts to address the flaws of the model presented in 1996 by putting forth three models. It shows that subcategorization and wh-movement can be given a probabilistic treatment; the resulting statistical interpretation of these concepts increases performance by adding useful information to the parser's output. The average improvement of Collins 97 over the previous model is 2.3%. Model 1 has clear advantages when handling unary rules and distance measures. Models 2 and 3 can condition on any structure that has been previously generated, a treatment Collins 96 lacks.

[12] is an update of Collins' previous work. It addresses the limitation of Collins 1996 and 1997 related to punctuation as surface features of the sentence: the previous models failed to generate punctuation, which is considered a deficiency. Collins 2000 uses a technique based on boosting algorithms from machine learning for re-ranking the best outputs using additional features.

The major innovation of Charniak's [13] 2000 model is the use of a maximum-entropy-inspired model, which yields an increase of 2% in performance due to its smoothing strategy and its combination of multiple conditioning events for testing. The maximum-entropy-inspired approach has certain advantages over the probabilistic model and recommends itself for use due to its novel approach to smoothing. The most important progress achieved by Charniak's model over conventional deleted interpolation is the flexibility obtained from the simpler maximum-entropy-inspired model, which allows experimenting with different conditioning events and moving up to a Markov grammar without significant programming. The model uses Markov processes to generate rules, and the additional features incorporated boost performance. The main goal of Charniak's parser is a model flexible enough to allow parsing to move to more semantic levels.

To solve some of the fundamental problems of Natural Language Processing, such as parsing, some authors, including Henderson and Brill [14], adopt the distinctive approach of combining previous parsers to obtain better results. The Collins, Charniak and Ratnaparkhi models were used in experiments exploring different parser combinations. Even with a poor parser introduced during the experiments, techniques like parser switching and parser hybridization still gave better results, and with more powerful parser combinations the results could be improved further.

Stanford dependencies were originally extracted from constituent parses, but the production of parse trees from raw text was quite time-consuming. Approaches designed specifically for dependency parsing, such as Covington, minimum spanning tree (MST), Eisner and Nivre, should therefore be faster, assuming they have low time complexity. The different approaches are compared in terms of their collective accuracy and speed, and characteristic errors are reported. The parsing models were trained on a training set extracted from the Penn Treebank consisting of sections 2 through 21, and different dependency parsers were compared: selected models from the MaltParser package v1.3, MSTParser 0.4.3b, and the rule-based RelEx parser 1.2.0. F1-score was used rather than accuracy because typical Stanford dependency parsers can generate a variety of different dependencies for every sentence. The fastest parsers were those in the Malt package (Nivre, Covington, Nivre Eager) and MSTParser (Eisner), and they achieved better F1-scores when the interaction between model and features was not used. If a huge amount of data is being parsed and speed is important, the experiments suggest that the top choice is to use the parsers included in the Malt package (Cer, de Marneffe et al., 2010).

3. Performance Evaluation

