Lexical Resources to Enrich English Malayalam Machine Translation (LREC Conference)


Lexical Resources to Enrich English Malayalam Machine Translation

Sreelekha S, Pushpak Bhattacharyya
Dept. of Computer Science & Engg., IIT Bombay, Mumbai, India
{sreelekha, pb}@cse.iitb.ac.in

Abstract
In this paper we present our work on the usage of lexical resources for machine translation between English and Malayalam. We describe a comparative performance study of different Statistical Machine Translation (SMT) systems built on top of a phrase-based SMT system as the baseline. We explore different ways of utilizing lexical resources to improve the quality of English-Malayalam statistical machine translation. In order to enrich the training corpus we have augmented the lexical resources in two ways: (a) additional vocabulary and (b) inflected verbal forms. The lexical resources include the IndoWordnet semantic relation set, lexical words, verb phrases, etc. We describe case studies and evaluations, and give a detailed error analysis for both Malayalam-to-English and English-to-Malayalam machine translation systems. We observed significant improvement in the evaluated translation quality. Lexical resources do help uplift performance when parallel corpora are scanty.

Keywords: Lexical Resources, Statistical Machine Translation, English-Malayalam Machine Translation

1. Introduction

Machine processing of natural (human) languages has a long tradition, benefiting from decades of manual and semi-automatic analysis by linguists, sociologists, psychologists and computer scientists, among others. Development of a full-fledged bilingual Machine Translation (MT) system for any two natural languages with limited electronic resources and tools is a challenging and demanding task. Since India is rich in linguistic divergence, with many morphologically rich languages quite different from English as well as from each other, there is a large requirement for machine translation between them.
Development of efficient machine translation systems using appropriate methodologies and with limited resources is a challenging task. There are many ongoing attempts to develop MT systems for Indian languages (Antony, 2013; Kunchukuttan et al., 2014; Sreelekha et al., 2014; Sreelekha et al., 2015) using both rule-based and statistical approaches. There have also been many attempts to improve the quality of statistical MT systems, such as using monolingually-derived paraphrases (Marton et al., 2009) and using related resource-rich languages (Nakov and Ng, 2012). Considering the large amount of human effort and linguistic knowledge required for developing rule-based systems, statistical MT systems became the better choice in terms of efficiency. Still, statistical systems fail to handle rich morphology. Consider the English sentence,

English: He has been sent to the mosque for opening the door

The English-Malayalam SMT system translated it as,

Malayalam: അവൻ അയച്ചു mosque വാതിൽ തുറന്നു
{avan ayachu mosque vathil thurannu}
{He sent mosque door opened}

Here the system fails to translate the verb phrase "has been sent to" as a unit; it translated only a part of the phrase, "sent", as "അയച്ചു" {ayachu} {sent}, which is wrong in the context. In the same way, another verb phrase, "for opening the door", has been translated only partly, as "തുറന്നു" {thurannu} {opened}. The system also has a deficiency in vocabulary and could not translate the English word "mosque". In these situations, lexical resources can play a major role in helping the system learn various inflected forms and verb phrases.
In this paper we discuss the usage of various lexical resources and how they can be used to improve translation quality, with a detailed analysis of the linguistic phenomena involved.

2. Challenges in English-Malayalam Machine Translation

The major design challenges in Machine Translation (MT) are the syntactic structural transfer, i.e. the conversion from a syntactic analysis structure of the source language to the structure of the target language, and ambiguity.

2.1 Challenge of Ambiguity
There are three types of ambiguity: lexical ambiguity, structural ambiguity and semantic ambiguity.

2.1.1. Lexical Ambiguity
Words and phrases in one language often have multiple meanings in another language. For example, the English sentence,

English: He picked the photo
Malayalam: അവൻ ഫോട്ടോ എടുത്തു
{avan photo eduthu}

Here "picked" is ambiguous in meaning: it is not clear whether "picked" is used in the sense of "clicked the photo" (എടുത്തു {eduthu} in Malayalam) or in the sense of "took". This is a good example where ambiguity is present for the same word both in the source language and in the target language. This kind of ambiguity has to be resolved from the context.

2.1.2. Structural Ambiguity
In this case multiple meanings arise due to the structural order. For example,

Malayalam: അവിടെ പൊക്കമുള്ള പെൺകുട്ടികളും ആൺകുട്ടികളും ഉണ്ടായിരുന്നു
{avide pokkamulla penkuttikalum ankuttikalum undayirunnu}
English: There were tall girls and boys there

Here, from the words {pokkamulla penkuttikalum aankuttikalum} {tall girls and boys}, it is clear that the girls are tall, but it is not clear whether the boys are tall, since in Malayalam only one word, "പൊക്കമുള്ള" {pokkamulla} {tall}, is used for both. According to its structure, the sentence can therefore have two interpretations in English:

{There were tall girls and tall boys there}
or
{There were tall girls, and boys, there}

One of the big problems in machine translation is to generate appropriate translations while handling this kind of structural ambiguity.

2.1.3. Semantic Ambiguity
In this case multiple translations arise due to the semantics. For example, consider the English sentences,

I travel with bag and umbrella
I travel with my kids

These can be translated into Malayalam as,

ഞാൻ ബാഗും കുടയും കൊണ്ടാണ് സഞ്ചരിക്കാറുള്ളത്
{njan bagum kudayum kontanu sancharikkarullathu}
{I bags umbrella with travel}
and
ഞാൻ എന്റെ കുട്ടികളോടൊപ്പമാണ് സഞ്ചരിക്കാറുള്ളത്
{njan ente kuttikalodoppamanu sancharikkarullathu}
{I travel with my kids}

Here "with" in the two English sentences gets translated to കൊണ്ടാണ് {kontanu} {with} and ഒപ്പമാണ് {oppamanu} {with} respectively. This disambiguation requires knowledge to distinguish between "bag and umbrella" and "kids".

2.2 Structural Differences
There are word order differences between English and Malayalam: English follows Subject-Verb-Object (SVO) order, whereas Malayalam follows Subject-Object-Verb (SOV) order. Consider an example of word ordering,

English: Gita went to the market (S-V-O)
Malayalam: ഗീത ചന്തയിൽ പോയി {Gita chanthayil poyi} (S-O-V)

In addition, Malayalam is morphologically very rich compared to English; there are many post-modifiers in the former where the latter uses prepositions. For example, the word form "കടലിൽ" {kadalil} {in the sea} is derived by attaching "ൽ" {il} as a suffix to the noun "കടൽ" {kadal} {sea} through an inflectional process. Malayalam exhibits agglutination of suffixes, which is not present in English, and therefore these suffixes have English equivalents in the form of prepositions. For the above example, the English equivalent of the suffix "ൽ" {il} is the preposition "in the", which stands separate from the noun "sea". These kinds of structural differences have to be handled properly during MT.

2.3 Vocabulary Differences
Languages differ in the way they lexically divide the conceptual space, and sometimes no direct equivalent can be found for a particular word or phrase of one language in another. Consider the sentence,

Malayalam: നാളെ കളഭാഭിഷേകം ഉണ്ട്
{nale kalabhabhishekam undu}

Here the word "കളഭാഭിഷേകം" {kalabhabhishekam} has no direct equivalent in English and has to be translated as "the pooja which will cover the idol with sandalwood". Hence the sentence will be translated as,

Tomorrow, the pooja which will cover the idol with sandalwood is there.

Translating such language-specific concepts poses additional challenges in machine translation.

3. Experimental Discussion

We now describe our experiments and results on a phrase-based baseline SMT system (www.statmt.org) for English-Malayalam and Malayalam-English, specifically with the usage of lexical resources. We use Moses (Koehn et al., 2007) and GIZA++ for learning the statistical models (Och, 2001). There are structural differences between Malayalam and English, and differences in the generation of word forms due to the morphological complexity. In order to overcome this

difficulty and make the machine learn different morphological word forms, lexical resources can play a major role. Different word forms such as verb phrases, morphological forms, prepositional phrases, etc. can be used. Moreover, the SMT system lacks vocabulary due to the small amount of parallel corpus available. Comparative performance studies conducted by Och and Ney (2003) have shown the significance of adding lexical words into the corpus and the resulting improvement in translation quality. We have used lexical words, IndoWordnet (Bhattacharyya, 2010), verb phrases, etc. to increase the coverage of the vocabulary, and have done many experiments to improve the quality of machine translation by augmenting various lexical resources into the training corpus. The statistics of the lexical resources used are shown in Table 1 and the results in Tables 2, 3, 4 and 5. Our experiments are listed below.

Table 1: Statistics of corpus and lexical resources used

Training corpus [manually cleaned and aligned]      Size [sentences]
  ILCI Tourism                                      23750
  ILCI Health                                       23750
  Additional corpus                                 29518
  Total                                             77018

Lexical resources                                   Size [words]
  IndoWordnet synset words (CFILT, IIT Bombay)      25341
  Lexical words (CFILT IITB, Joshua, Olam)          144505
  Verb phrases (CFILT)                              200544
  Total                                             370390

Tuning corpus [manually cleaned and aligned]        Size [sentences]
  ILCI Tourism                                      250
  ILCI Health                                       250
  Total                                             500

Testing corpus [manually cleaned and aligned]       Size [sentences]
  ILCI Tourism                                      1000
  ILCI Health                                       1000
  Total                                             2000

3.1 SMT system with an unclean corpus

The learning of proper grammatical structures was prevented by the stylistic constructions, misalignments, and wrong and missing translations present in the unclean corpus (refer to Table 1). This reduced the translation quality. For example, consider a sentence from the corpus where the translation is wrong:

English: Tus is located on the banks of the Berach river near Udaipur and the Sun temple here has an important place in the study of sculpting tradition.
Equivalent Malayalam translation (wrong):
{bedach nadiyude thadathil sthithi cheyyunna ...}
{Berach river's bank located Tus Udaipur's Sun temple sculpting importance}

The comparative performance results of the cleaned corpus over the unclean corpus are shown in Tables 2, 3, 4 and 5.

3.2 SMT system with cleaned corpus

In order to improve the quality of translation, we removed the stylistic constructions, unwanted characters and wrong translations from the parallel corpus. We corrected the grammatical structures, missing translations, wrong phrases and misalignments between parallel sentences, which improves the learning of word-to-word alignments. Consider the sentence discussed in Section 3.1, which has both source-side and target-side translation errors. We corrected the translations as,

English: Tus located on the banks of the Berach river near Udaipur and the Sun temple have an important place in the study of sculpting tradition.
Malayalam:
{udaipurinte aduthu bedahcu nadiyude tadathil sthiti cheyyunna toosinum sooryakshethrathinum silpakalyude padanathinu pratyeka sthanamundu}
{Udaipur's near Berach river's bank located Tus and Sun temple sculpting tradition's study for important place}

After cleaning, the translation quality improved to more than 14 times that of the system with the unclean corpus. We observed during error analysis that the machine lacks

in vocabulary coverage, and hence we investigated the usage of lexical words to improve the quality of machine translation.

3.3 Corpus with lexical words

We extracted a total of 437832 parallel English-Malayalam lexical words from the parallel corpus. We also used dictionary words available from the Joshua corpus (www.cs.jhu.edu/joshua-docs/index.html) and the Olam dataset (www.olam.in) after manual validation. For example, while translating the English sentence,

He reached early on for the movie

the SMT system translated it in Malayalam as,

അവൻ സിനിമയ്ക്ക് എത്തി
{avan cinemaykku ethi}
{He reached the cinema}

Here the system failed to translate the meaning of the multiword expression "early on". In the lexical word list it has the following equivalent,

early on : മുൻപുതന്നെ {munputhanne}

We augmented the extracted parallel lexical words into the training corpus. After training, the above sentence is translated correctly as,

അവൻ സിനിമയ്ക്ക് മുൻപുതന്നെ എത്തി
{avan cinemaykku munputhanne ethi}
{He reached early on for the movie}

Since the lexical words are extracted from the same corpus, they helped in improving the translation quality to a great extent. During error analysis we observed that even though the machine translation system gives translations of considerably good quality, it faces difficulties in translating certain words and their concepts. Hence we investigated the usage of word-synsets to make the system learn words together with their concepts.

3.4 Corpus with IndoWordnet synsets

We used an algorithm to extract the bilingual words from IndoWordnet according to its semantic and lexical relations (Bhattacharyya, 2010). Bilingual mappings are generated using the concept-based approach across words and synsets (Kumar et al., 2008). We considered all the synset word mappings for a single word and generated that many entries of parallel words. For example, the word "beautify" has the following equivalent synset words in IndoWordnet:

beautify : {shobhikkuka, alankarikkuka, sajjeekarikkuka, manoharamakkuka, modi pidippikkuka}
{shining, decorating, arranging, beautifying, making up}

Consider an English sentence,

Decorations should beautify the occasion

The SMT system translated it in Malayalam as,

അലങ്കാരം ചെയ്യൂ അവസരം
{alankaram cheyyu avasaram}
{decorations do occasion}

Here the system fails to translate the meaning of "beautify" correctly. After augmenting the synsets of "beautify" into the corpus, the SMT system was able to translate the equivalent English meaning in Malayalam as,

അലങ്കാരം അവസരത്തിനെ മനോഹരമാക്കണം
{alankaram avasarathine manoharamakkanam}
{Decorations should beautify the occasion}

Since the synsets cover all common forms of a word, the augmentation of the extracted parallel synset words into the training corpus not only helped in improving the translation quality to a great extent but also helped in handling word sense disambiguation well. However, we observed during error analysis that the system fails in handling case markers and inflected forms, and we investigated handling these further.

3.5 Corpus with verb phrases

In order to overcome the verbal translation difficulty, we programmatically extracted English-Malayalam parallel verbal forms and their translations, covering various phenomena, together with a frequency count. In addition, we used a POS-tagged corpus to extract verbal phrases. We augmented the manually validated 200544 entries of verbal translations into the training corpus. Consider an English sentence,

English: He took the decision for being alive

The SMT system translated it in Malayalam as,

അവൻ തീരുമാനം എടുത്തു ജീവൻ
{avan theerumanam eduthu jeevan}
{He decision took live}

Here the system fails to translate and convey the importance of the verb phrase "for being alive" in this sentence. After augmenting the corpus with the equivalent English-Malayalam verb phrase pair,

for being alive : നിലനിൽക്കാൻ വേണ്ടി {nilanilkkan vendi}

the system translated the sentence correctly as,

അവൻ നിലനിൽക്കാൻ വേണ്ടി തീരുമാനം എടുത്തു
{avan nilanilkkan vendi theerumanam eduthu}
{He took the decision for being alive}
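The resource-augmentation step used in Sections 3.3 to 3.5 is, at its core, an append of validated entry pairs to the sentence-aligned training files before retraining. The following is a minimal illustrative sketch in Python; the function and the example entries are ours (Malayalam shown in transliteration), not the actual pipeline code.

```python
# Illustrative sketch: append validated lexical-resource pairs (dictionary
# words, IndoWordnet synset mappings, verb phrases) to a sentence-aligned
# parallel corpus before retraining the SMT system.

def augment_parallel_corpus(src_lines, tgt_lines, resources):
    """Return the corpus with each (source, target) resource pair appended.

    Each resource pair is added as one extra "sentence" pair, which raises
    the frequency of that mapping during word alignment.
    """
    src_out, tgt_out = list(src_lines), list(tgt_lines)
    for en, ml in resources:
        src_out.append(en)
        tgt_out.append(ml)
    # The two sides must stay sentence-aligned for alignment training.
    assert len(src_out) == len(tgt_out)
    return src_out, tgt_out

# Hypothetical usage with entries from the examples above (transliterated):
src = ["he reached the cinema"]
tgt = ["avan cinemaykku ethi"]
lex = [("early on", "munputhanne"), ("for being alive", "nilanilkkan vendi")]
new_src, new_tgt = augment_parallel_corpus(src, tgt, lex)
print(len(new_src))  # 3
```

The augmented files can then be fed to the usual Moses/GIZA++ training run, so the training pipeline itself needs no change.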

[Table 2: Results of English-Malayalam SMT BLEU score, METEOR and TER evaluations for the systems with unclean corpus, cleaned corpus, lexical words, Wordnet synsets and verb phrases, each without and with tuning.]

Table 3: Results of Malayalam-English SMT BLEU score, METEOR and TER evaluations

System                                      BLEU     METEOR   TER
With unclean corpus    Without tuning       1.16     0.105    95.32
                       With tuning          1.80     0.119    91.52
With cleaned corpus    Without tuning       22.01    0.187    86.32
                       With tuning          25.22    0.190    83.70
With lexical words     Without tuning       28.65    0.210    79.56
                       With tuning          30.54    0.226    76.30
With Wordnet synsets   Without tuning       32.20    0.263    73.19
                       With tuning          34.46    0.283    71.19
With verb phrases      Without tuning       36.10    0.299    68.36
                       With tuning          37.90    0.355    63.88

Table 4: Results of Malayalam-English SMT subjective evaluation

System                                      Adequacy   Fluency
With unclean corpus    Without tuning       12.87%     16.30%
                       With tuning          15.56%     19.65%
With cleaned corpus    Without tuning       51.01%     61.21%
                       With tuning          54.00%     65.32%
With lexical words     Without tuning       59.08%     71.21%
                       With tuning          62.01%     75.04%
With Wordnet synsets   Without tuning       67.36%     79.21%
                       With tuning          69.10%     81.68%
With verb phrases      Without tuning       72.01%     84.32%
                       With tuning          74.89%     85.34%

[Table 5: Results of English-Malayalam SMT subjective evaluation (adequacy and fluency) for the same five systems, without and with tuning.]
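For readers unfamiliar with the automatic metrics reported above: BLEU combines modified (clipped) n-gram precisions with a brevity penalty. The following is a simplified single-reference sketch for illustration only; the scores in Tables 2 and 3 come from the standard evaluation tools, not from this code.

```python
# Simplified single-reference BLEU sketch: clipped n-gram precisions up to
# 4-grams, combined by a geometric mean and scaled by a brevity penalty.
import math
from collections import Counter

def ngrams(tokens, n):
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def bleu(candidate, reference, max_n=4):
    cand, ref = candidate.split(), reference.split()
    precisions = []
    for n in range(1, max_n + 1):
        cand_counts = Counter(ngrams(cand, n))
        ref_counts = Counter(ngrams(ref, n))
        # Clipping: a candidate n-gram is credited at most as many
        # times as it occurs in the reference.
        overlap = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        total = max(sum(cand_counts.values()), 1)
        precisions.append(max(overlap, 1e-9) / total)  # smooth zero counts
    # Brevity penalty discourages overly short candidates.
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / max(len(cand), 1))
    return bp * math.exp(sum(math.log(p) for p in precisions) / max_n)

print(round(bleu("he reached the cinema early", "he reached the cinema early"), 2))  # 1.0
```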

[Figure 1: Malayalam-English SMT analysis.]
[Figure 2: English-Malayalam SMT analysis.]
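TER, the third metric in Tables 2 and 3, counts the edits (insertions, deletions, substitutions and phrase shifts) needed to turn the hypothesis into the reference, normalised by reference length. Ignoring shifts, it reduces to a word-level edit rate; the following minimal sketch is illustrative only, not the evaluation tool used in the paper.

```python
# Word-level edit rate: TER without the shift operation, i.e. the standard
# dynamic-programming edit distance over words divided by reference length.

def word_edit_rate(hypothesis, reference):
    hyp, ref = hypothesis.split(), reference.split()
    prev = list(range(len(ref) + 1))
    for i, h in enumerate(hyp, 1):
        curr = [i]
        for j, r in enumerate(ref, 1):
            cost = 0 if h == r else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution / match
        prev = curr
    return prev[-1] / max(len(ref), 1)

# The wrong baseline output from the introduction versus its source:
print(round(word_edit_rate("he sent mosque door opened",
                           "he has been sent to the mosque"), 3))
# 5 edits over 7 reference words, i.e. approximately 0.714
```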

The error analysis shows that the verb phrase augmentation helped in translating verbal inflections correctly, and hence the quality of the translation improved drastically.

4. Evaluation & Error Analysis

We tested the translation systems with a corpus of 2000 sentences taken from the ILCI tourism and health corpora, and used a tuning (MERT) corpus of 500 sentences, as shown in Table 1. We evaluated the translated outputs of both the Malayalam-to-English and the English-to-Malayalam SMT systems in all five categories using several methods: subjective evaluation, BLEU score (Papineni et al., 2002), METEOR and TER (Agarwal and Lavie, 2008). The results of the BLEU, METEOR and TER evaluations are displayed in Tables 2 and 3. We gave importance to subjective evaluation to determine the fluency (F) and adequacy (A) of the translation, since for morphologically rich languages subjective evaluation can give more accurate results than automatic measures. We followed the subjective evaluation procedure, carried out with the help of linguistic experts, described in Sreelekha et al. (2013); the results are given in Tables 4 and 5. Fluency is an indicator of the correct grammatical constructions present in the translated sentence, whereas adequacy is an indicator of the amount of meaning carried over from the source to the target. For each translation we assigned scores between 1 and 5 depending on how much sense the translation made and its grammatical correctness.

We observed that the quality of the translation improves as the corpus is cleaned and more lexical resources are used. Hence there is an incremental growth in adequacy, fluency, BLEU score and METEOR score, and a reduction in TER score. In addition, we were able to handle the one-to-many mapping of phrases to a great extent by increasing the frequency of occurrence through the usage of linguistic resources. The performance comparison graphs are shown in Figures 1 and 2. The fluency of the translation increased up to 85.34% in the case of Malayalam to English and up to 87% in the case of English to Malayalam, which is four times more than the baseline system results.

5. Conclusion

In this paper we have focused on the usage of various lexical resources for improving the quality of machine translation for low-resource languages. We discussed the comparative performance of phrase-based statistical machine translation with various lexical resources for both Malayalam-English and English-Malayalam. As discussed in the experimental section, statistical machine translation quality improved significantly with the usage of the various lexical resources. Moreover, the systems were able to handle the rich morphology to a great extent. We can see that there is an incremental growth in both systems in terms of BLEU score and METEOR, and a decrease in TER, which shows the improvement in translation quality. Our subjective evaluation results also show promising scores in terms of fluency and adequacy. This points to the importance of utilizing various lexical resources for developing an efficient machine translation system for morphologically rich languages. Our future work will focus on investigating more effective ways to handle the rich morphology and hence on improving the quality of statistical machine translation.

6. Acknowledgements

This work is funded by the Department of Science and Technology, Govt. of India, under Women 1075/2014.

7. Bibliographical References

Agarwal, A. and Lavie, A. (2008). Meteor, M-BLEU, M-TER: evaluation metrics for high correlation with human rankings of machine translation output. Proceedings of the Third Workshop on Statistical Machine Translation, pages 115-118, Columbus, Ohio, USA. Association for Computational Linguistics.
Antony, P. J. (2013). Machine translation approaches and survey for Indian languages. Computational Linguistics and Chinese Language Processing, Vol. 18, No. 1, March 2013, pp. 47-78.
Bhattacharyya, P. (2010). IndoWordnet. LREC 2010.
Koehn, P., Hoang, H., Birch, A., Callison-Burch, C., Federico, M., Bertoldi, N., Cowan, B., Shen, W., Moran, C., Zens, R., Dyer, C., Bojar, O., Constantin, A. and Herbst, E. (2007). Moses: open source toolkit for statistical machine translation. Annual Meeting of the Association for Computational Linguistics (ACL), demonstration session, Czech Republic.
Kumar, R., Mohanty, Bhattacharyya, P., Kalele, S., Pandey, P., Sharma, A. and Kapra, M. (2008). Synset based multilingual lexical: insights, applications and challenges. Global Wordnet Conference.
Kunchukuttan, A., Mishra, A., Chatterjee, R., Shah, R. and Bhattacharyya, P. (2014). Shata-Anuvadak: tackling multiway translation of Indian languages. LREC 2014, Reykjavik, Iceland, 26-31.
Marton, Y., Callison-Burch, C. and Resnik, P. (2009). Improved statistical machine translation using monolingually-derived paraphrases. Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing (EMNLP), Volume 1, pages 381-390.
Nakov, P. I. and Ng, H. T. (2012). Improving statistical machine translation for a resource-poor language using related resource-rich languages. Journal of AI Research, Volume 44, pages 179-222.
Och, F. J. and Ney, H. (2003). A systematic comparison of various statistical alignment models. Computational Linguistics.
Och, F. J. and Ney, H. (2001). Statistical multi-source translation. MT Summit.
Papineni, K., Roukos, S., Ward, T. and Zhu, W. J. (2002). BLEU: a method for automatic evaluation of machine translation. Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics (ACL), Philadelphia, pp. 311-318.
Sreelekha, S., Dabre, R. and Bhattacharyya, P. (2013). Comparison of SMT and RBMT: the requirement of hybridization for Marathi-Hindi MT. ICON, 10th International Conference on NLP.
Sreelekha, S., Bhattacharyya, P. and Malathi, D. (2014). Lexical resources for Hindi-Marathi MT. WILDRE proceedings.
Sreelekha, S., Bhattacharyya, P. and Malathi, D. (2015). A case study on English-Malayalam machine translation. iDravidian proceedings.
Sreelekha, S., Dungarwal, P., Bhattacharyya, P. and Malathi, D. (2015). Solving data sparsity by morphology injection in factored SMT. ICON 2015, International Conference on NLP.

