Improved Statistical Translation Through Editing


Chris Callison-Burch, Colin Bannard
University of Edinburgh
2 Buccleuch Place
Edinburgh EH8 9LW
{chris,colin}@linearb.co.uk

Josh Schroeder
Linear B Ltd.
39 B Cumberland Street
Edinburgh EH3 6RA
josh@linearb.co.uk

Abstract

In this paper we introduce Linear B's statistical machine translation system. We describe how Linear B's phrase-based translation models are learned from a parallel corpus, and show how the quality of the translations produced by our system can be improved over time through editing. There are two levels at which our translations can be edited. The first is through a simple correction of the text that is produced by our system. The second is through a mechanism which allows an advanced user to examine the sentences that a particular translation was learned from. The learning process can be improved by correcting which phrases in the sentence should be considered translations of each other.

1 Introduction

Statistical machine translation was first proposed in Brown et al. (1988). Since statistical machine translation systems are created by automatically analyzing a corpus of example translations they have a number of advantages over systems that are built using more traditional approaches to MT:

• They make few linguistic assumptions and can therefore be applied to nearly any language pair, given a sufficiently large corpus.
• They can be developed in a matter of weeks or days, whereas systems that are hand-crafted by linguists and lexicographers can take years.
• They can be improved with little additional effort as more data becomes available.

More recent advances in phrase-based approaches to statistical translation (Koehn et al., 2003; Marcu and Wong, 2002; Och et al., 1999) have led to a dramatic increase in the quality of the translation systems. Phrase-based translation systems produce higher-quality translation since they use longer segments of human translated text. Using longer segments of human translated text reduces problems associated with literal word-for-word translations. For example, multi-word expressions such as idioms are better translated.

Linear B is a commercial provider of statistical machine translation systems. This paper describes Linear B's advances to phrase-based machine translation that allow translation quality to be improved through editing translations that are produced by our system. There are two levels at which our translations can be edited:

• The first is through a simple correction of the text that is produced by our system. Our system improves by dynamically learning the correct translations of new phrases. These new phrases are extracted from the corrected sentence pair using the existing translation models, and can be used immediately for subsequent translations.
• The second is through a mechanism that allows an advanced user to inspect which phrases the

system's translation was composed from. If a particular phrase was mistranslated, the user can examine the set of sentence pairs that a particular translation was learned from, and [...]

These features mean that our system is capable of improving with use and adapting to be more appropriate for a new domain. This has two main implications: our systems get better as our customers use them, and our systems have the potential to be trained using example translations from one domain (such as government documents, which have abundant translations) and gradually adapted to a new domain.

The remainder of the paper is as follows: Section 2 describes how our phrase-based models of translation are learned from archived translations, and gives example output produced by a system trained on data from the Canadian parliament. Section 3 shows how our system dynamically integrates edited output by extracting the translations of new phrases, and weighting the corrected translations more heavily than existing translations in the model. Section 4 describes the advanced editing technique that allows a user to inspect the sentence pairs which a faulty translation was learned from, and correct the statistical models by explicitly showing the system which phrases ought to be learned from those sentence pairs instead.

2 Phrase-based Statistical Translation

The goal of statistical machine translation is to be able to choose that English sentence, e, that is the most probable translation of a given sentence, f, in a foreign language. Rather than choosing e that directly maximizes the conditional probability p(e | f), Bayes' rule is generally applied:

$$\hat{e} = \arg\max_{e} \; p(e)\, p(f \mid e)$$

The effect of applying Bayes' rule is to divide the task into estimating two probabilities: a language model probability p(e) which can be estimated using a monolingual corpus, and a translation model probability p(f | e) which is estimated using a bilingual, sentence-aligned corpus. Here we examine different ways of calculating the translation model probability.

Figure 1: A word-level alignment for a sentence pair that occurs in our training data

2.1 Word Alignments

Brown et al. (1993) define a series of translation models, which are commonly referred to as IBM Models 1 to 5. The IBM Models formulate translation essentially as a word-level operation. The probability that a foreign sentence is the translation of an English sentence is calculated by summing over the probabilities of all possible word-level alignments, a, between the sentences:

$$p(f \mid e) = \sum_{a} p(f, a \mid e)$$

Thus they decompose the problem of determining whether a sentence is a good translation of another into the problem of determining whether there is a sensible mapping between the words in the sentences. Figure 1 illustrates a probable word-level alignment between a sentence pair in the Canadian Hansard bilingual corpus.

Brown et al. (1993) formulate alignment probability p(f, a | e) in terms of distortion, fertility, and spurious word probabilities in addition to word-for-word translation probabilities. The act of translation in the Brown et al. (1993) approach is one of string rewriting. In string rewriting each word in a source sentence is replaced by zero or more words in the target language. Then, a number of "spurious" target

words might be inserted with no direct connection to the original source words. Finally, the words are reordered in some fashion to form the translation.

Problems with the Brown et al. (1993) approach to translation include:

• It doesn't have a direct way of translating phrases; instead fertility probabilities are used to replicate words and translate them individually.
• Using small units such as words means that a lot of word reordering has to happen. But the distortion probability is a poor explanation of word order.

2.2 Phrase Alignments

Phrase-based translation, by contrast, uses larger segments of human translated text. Phrase-based translation does away with fertility and spurious word probabilities. While it does have some notion of distortion, this is less pertinent since local reorderings such as adjective-noun alternation can be easily captured in phrases. The main part of phrase-based translation models is the estimation of phrasal translation probabilities.

In general, the probability of an English phrase ē translating as a French phrase f̄ is calculated as the number of times that the English phrase was aligned with the French phrase in the training corpus, divided by the total number of times that the English phrase was aligned with any French phrase:

$$p(\bar{f} \mid \bar{e}) = \frac{\mathrm{count}(\bar{f}, \bar{e})}{\sum_{\bar{f}} \mathrm{count}(\bar{f}, \bar{e})}$$

The trick is how to go about extracting the counts for phrase alignments from a training corpus. Many methods for calculating phrase alignments use word-level alignments as a starting point.¹ There are various heuristics for extracting phrase alignments from word alignments; some are described in Koehn et al. (2003), Tillmann (2003), and Vogel et al. (2003).

Figure 2 gives a graphical illustration of the method of extracting incrementally larger phrases² from word alignments described in Och and Ney (2003). Counts are collected over phrases extracted from word alignments of all sentence pairs in the training corpus. These counts are then used to calculate phrasal translation probabilities.

The act of translating with a phrase-based translation model involves breaking an input sentence into all possible substrings, looking up all translations that were aligned with each substring in the training corpus, and then searching through all possible translations to find the best translation of the source sentence.

¹ There are other ways of calculating phrasal translation probabilities. For instance, Marcu and Wong (2002) estimate them directly rather than starting from word-level alignments.

² Note that the 'phrases' in phrase-based translation are not the traditional notion of syntactic constituents; instead they might be more aptly described as 'substrings' or 'blocks'.

2.3 Example translation

This section gives an example translation which was produced by our system when it was trained on a collection of example translations from the Canadian Parliament. For the source passage

    L'honorable Leonard J. Gustafson: Honorables sénateurs, tandis que la guerre en Irak entre dans sa troisième semaine, nous ne devons pas oublier qu'il faut prendre des mesures pour éviter une crise humanitaire dans la population civile. À cet égard, il y a de nombreux domaines dans lesquels les Canadiens doivent faire preuve de leadership.

    Maintenant que la guerre fait rage en Irak et que des pénuries de produits alimentaires et de fournitures médicales commencent à se produire, les pays qui ont accès à des ressources ont la responsabilité d'essayer de minimiser les effets négatifs du conflit sur la population irakienne. Je crois personnellement que le Canada devrait jouer un plus grand rôle à cet égard.

    Au Canada, nous disposons en abondance de blé et d'autres produits alimentaires. Nous pouvons également fournir des articles médicaux aux citoyens de l'Irak. Le Canada a une fière réputation pour ce qui

Phrases extracted on iteration 1: (Ces : Those), (gens : people), (ont : have), (grandi : grown up), (, : ,), (vécu : lived), (et : and), (oeuvré : worked), (des dizaines d' : many), (années : years), (dans : in), (le : a), (domaine : district), (agricole : farming), (. : .)

Notice that one word can translate as a phrase, such as 'grandi' → 'grown up'. Incrementally larger phrases are created by incorporating adjacent words and phrases.

Iteration 2: (Ces gens : Those people), (gens ont : people have), (ont grandi : have grown up), (grandi , : grown up ,), (, vécu : , lived), (vécu et : lived and), (et oeuvré : and worked), (oeuvré des dizaines d' : worked many), (des dizaines d' années : many years), (années dans : years in), (dans le : in a), (domaine agricole : farming district)

Notice that 'a farming' does not have a translation since the phrases that it spans are not adjacent.

Iteration 3: (Ces gens ont : Those people have), (gens ont grandi : people have grown up), (ont grandi , : have grown up ,), (grandi , vécu : grown up , lived), (, vécu et : , lived and), (vécu et oeuvré : lived and worked), (et oeuvré des dizaines d' : and worked many), (oeuvré des dizaines d' années : worked many years), (des dizaines d' années dans : many years in), (années dans le : years in a), (le domaine agricole : a farming district), (domaine agricole . : farming district .)

Notice that 'a farming district' now has a translation since the phrase 'farming district' is adjacent to the phrase 'a'.

Iteration 4: (Ces gens ont grandi : Those people have grown up), (gens ont grandi , : people have grown up ,), (ont grandi , vécu : have grown up , lived), (grandi , vécu et : grown up , lived and), (, vécu et oeuvré : , lived and worked), (vécu et oeuvré des dizaines d' : lived and worked many), (et oeuvré des dizaines d' années : and worked many years), (oeuvré des dizaines d' années dans : worked many years in), (des dizaines d' années dans le : many years in a), (dans le domaine agricole : in a farming district), (le domaine agricole . : a farming district .)

Figure 2: Extracting incrementally larger phrases from a word alignment
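The incremental growth illustrated in Figure 2 can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not Linear B's implementation: it assumes the iteration-1 phrase pairs have already been read off a word alignment and are given as token spans, and the names `PhrasePair`, `grow_phrases` and `render` are hypothetical. A phrase is extended by one iteration-1 unit only when the two touch on both the French and the English side.

```python
from typing import List, NamedTuple, Set, Tuple

Span = Tuple[int, int]  # half-open token span [start, end)


class PhrasePair(NamedTuple):
    src: Span  # span over the source (French) tokens
    tgt: Span  # span over the target (English) tokens


def adjacent(a: Span, b: Span) -> bool:
    """True if two non-empty spans touch without overlapping, in either order."""
    return a[1] == b[0] or b[1] == a[0]


def union(a: Span, b: Span) -> Span:
    """Smallest span covering both inputs."""
    return (min(a[0], b[0]), max(a[1], b[1]))


def grow_phrases(base: List[PhrasePair], iterations: int) -> List[Set[PhrasePair]]:
    """levels[k] holds the phrase pairs built from k + 1 iteration-1 units."""
    levels = [set(base)]
    for _ in range(iterations - 1):
        grown = set()
        for phrase in levels[-1]:
            for unit in base:
                # Merge only when the spans touch on BOTH language sides.
                if adjacent(phrase.src, unit.src) and adjacent(phrase.tgt, unit.tgt):
                    grown.add(PhrasePair(union(phrase.src, unit.src),
                                         union(phrase.tgt, unit.tgt)))
        levels.append(grown)
    return levels


def render(p: PhrasePair, src_tokens: List[str], tgt_tokens: List[str]) -> Tuple[str, str]:
    """Turn a pair of spans back into (French phrase, English phrase)."""
    return (" ".join(src_tokens[p.src[0]:p.src[1]]),
            " ".join(tgt_tokens[p.tgt[0]:p.tgt[1]]))


# Fragment of the Figure 2 sentence pair; alignments copied from iteration 1.
src = ["dans", "le", "domaine", "agricole"]
tgt = ["in", "a", "farming", "district"]
base = [
    PhrasePair((0, 1), (0, 1)),  # dans : in
    PhrasePair((1, 2), (1, 2)),  # le : a
    PhrasePair((2, 3), (3, 4)),  # domaine : district
    PhrasePair((3, 4), (2, 3)),  # agricole : farming
]

levels = grow_phrases(base, iterations=4)
for i, level in enumerate(levels, start=1):
    print(i, sorted(render(p, src, tgt) for p in level))
```

On this fragment the second level contains (dans le : in a) and (domaine agricole : farming district) but nothing for 'a farming', because 'le' and 'agricole' are not adjacent in the French; (le domaine agricole : a farming district) only appears at the third level, as in the figure.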

    est d'offrir une aide humanitaire aux gens quand ils en ont besoin.

Our system produces the translation

    The Honourable Leonard J. Gustafson: Honourable senators, while the war in Iraq extending into a third week, the minister should not forget the one we must take some steps to prevent a crisis humanitarian face in the civilian populations. In this regard, there are a number of areas in which the Canadians must concentrate to show leadership.

    That the war is raging in Iraq and that a shortage of food and medical supplies are beginning to take place, those countries that have access to the resources are the responsibility for doing try to minimize the negative effects on workers on the people Irakienne. I personally believe that Canada should play a more significant role in this regard.

    In Canada, we have in abundance of wheat, as have other food products. We can also provision of medical articles to the people on the other Iraq. Canada has a proud record in so far, as would be to give humanitarian assistance to people when they need it.

As a reference, here is how a human translator rendered the passage

    Hon. Leonard J. Gustafson: Honourable senators, as the war in Iraq enters its third week, the need to take measures to avoid a humanitarian crisis among Iraq's civilian population should not be forgotten. In this regard, there may be avenues where Canadians must provide leadership.

    As the war wages in Iraq and shortages of foodstuffs and medical supplies start to occur, countries with access to resources have a responsibility to help minimize the negative impact of the conflict on the country's population. In this regard, it is my belief that Canada should play a greater role.

    In Canada, we have access to plentiful supplies of wheat and other foodstuffs. Canada can also provide medical supplies to the citizens of Iraq. Canada has a proud history in providing humanitarian assistance to people in times of need.

Figure 3 shows which phrases were selected by our system when deciding how to translate two of the sentences from the passage.

    Je crois personnellement que le Canada devrait jouer un plus grand rôle à cet égard.
    I personally believe that Canada should play a more significant role in this regard.

    À cet égard , il y a de nombreux domaines dans lesquels les Canadiens doivent faire preuve de leadership.
    In this regard , there are a number of areas in which the Canadians must concentrate to show leadership.

Figure 3: An example of the phrases that were used to translate two sentences

3 Simple Editing

Because our translations are learned from example translations we can exploit edited system output to improve the quality of subsequent translations. The most straightforward way of improving the translation quality would be to add the edited translations

(along with the source sentences that they were produced from) to the training corpus, and re-train the system on the augmented set of data. However, training the system can take a long time: it took longer than a week to train the system using the parallel corpus containing roughly 25 million words of Canadian Parliament text. Since it takes so long to re-train a system, this method could only be done periodically, and translation quality would not immediately benefit from a user correcting the system's translations.

Instead we have developed what we term an "alignment server". The alignment server is a variant of the software that trains the translation models. It keeps the parameters of a translation model in memory, and is able to create a word alignment on-the-fly from a new sentence pair.

For example, if our system had translated the French

    Durant l'ère glacière, les changements climatiques obligèrent le règne animale à migrer vers des contrées plus accueillantes.

into English

    During the era refrigerator, the climatic changes oblige the the reign animal to migrate towards more accessible regions.

A user would easily be able to edit the translation to make it

    During the ice age, climatic changes caused the animal kingdom to migrate towards more accessible regions.

Our alignment server then produces a word-level alignment such as the one depicted in Figure 4. From this we extract a set of new phrases and their translations (as shown in Figure 5). We extract "the ice age" as the correct translation of the phrase "l'ère glacière", and "animal kingdom" as the correct translation of "règne animale". These would be added to the database of phrasal translations that the decoder draws upon, and would prevent the incorrect translation of "l'ère glacière" as "the era refrigerator" or "règne animale" as "reign animal" in the future.

Figure 4: Word alignment produced by the alignment server for an edited [...]

Figure 5: Word alignment produced by the alignment server for an edited translation

Since our alignment server produces new word alignments dynamically as text is edited, we are able

to update our database in real time. In this way we are able to take advantage of new phrases almost immediately. A problem occurs when adding new translations for phrases which already exist in the database. The problem is that there are usually many more occurrences of the phrase in the existing training data.

We would like the edited text to play an important role in shaping what translations are produced, since it will reflect how our customers would like the text to be translated. However, since the amount of edited text will be dwarfed by the amount of text in the training data, the translations will likely still look very similar to the training data. To overcome this, we employ a weighting scheme in calculating the probabilities.

$$p(\bar{f} \mid \bar{e}) = \frac{\lambda_1\, \mathrm{count}_{C_1}(\bar{f}, \bar{e}) + \lambda_2\, \mathrm{count}_{C_2}(\bar{f}, \bar{e})}{\lambda_1 \sum_{\bar{f}} \mathrm{count}_{C_1}(\bar{f}, \bar{e}) + \lambda_2 \sum_{\bar{f}} \mathrm{count}_{C_2}(\bar{f}, \bar{e})}$$

The λ1 and λ2 coefficients allow us to scale the contribution of phrase alignments extracted from the newly created corpus (C2) of edited translations, and downweight the counts of phrases from the original training corpus (C1). This allows us to create a system from existing public domain data, such as the translations of government proceedings, and gradually adapt it to a new domain, while placing more weight on the newly created data.

4 Advanced Editing

As an alternative to adding new phrasal translations to the database, we have also implemented a feature which allows advanced use
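As a rough illustration of the weighting scheme from Section 3, the Python sketch below computes the weighted estimate from two count tables. It is a sketch under assumptions rather than the system's actual code: the function name `weighted_prob`, the phrase pairs, the counts and the λ values are all hypothetical, and counts are stored as a `Counter` keyed by (French phrase, English phrase).

```python
from collections import Counter


def weighted_prob(f, e, counts_c1, counts_c2, lam1, lam2):
    """Weighted relative-frequency estimate of p(f | e) over two corpora.

    counts_c1: phrase-alignment counts from the original training corpus (C1)
    counts_c2: phrase-alignment counts from the corpus of edited translations (C2)
    """
    def total_for_e(counts):
        # Sum over every French phrase aligned with the English phrase e.
        return sum(n for (_, e2), n in counts.items() if e2 == e)

    numerator = lam1 * counts_c1[(f, e)] + lam2 * counts_c2[(f, e)]
    denominator = lam1 * total_for_e(counts_c1) + lam2 * total_for_e(counts_c2)
    return numerator / denominator if denominator else 0.0


# Hypothetical counts: the original corpus strongly prefers one French phrase,
# while the few edited translations consistently use the other.
c1 = Counter({("banque", "bank"): 90, ("rive", "bank"): 10})
c2 = Counter({("rive", "bank"): 5})

unweighted = weighted_prob("rive", "bank", c1, c2, lam1=1.0, lam2=1.0)
weighted = weighted_prob("rive", "bank", c1, c2, lam1=1.0, lam2=20.0)
print(f"lambda2 = 1:  {unweighted:.3f}")
print(f"lambda2 = 20: {weighted:.3f}")
```

With equal weights the five edited examples barely move the estimate, whereas a larger λ2 lets them outweigh the original corpus. How λ1 and λ2 should be chosen is not specified in the text above, so the values here are purely illustrative.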
