LIDIOMS: A Multilingual Linked Idioms Data Set


Diego Moussallem 1,2, Mohamed Ahmed Sherif 2, Diego Esteves 1, Marcos Zampieri 3, Axel-Cyrille Ngonga Ngomo 2
1 Faculty of Mathematics and Computer Science - University of Leipzig, Germany
2 Data Science Group - University of Paderborn, Germany
3 Research Group in Computational Linguistics - University of Wolverhampton, United Kingdom

Abstract
In this paper, we describe the LIDIOMS data set, a multilingual RDF representation of idioms currently containing five languages: English, German, Italian, Portuguese, and Russian. The data set is intended to support natural language processing applications by providing links between idioms across languages. The underlying data was crawled and integrated from various sources. To ensure the quality of the crawled data, all idioms were evaluated by at least two native speakers. Herein, we present the model devised for structuring the data. We also provide the details of linking LIDIOMS to well-known multilingual data sets such as BabelNet. The resulting data set complies with the best practices of the Linguistic Linked Open Data community.

Keywords: multilingual, idioms, translation

1. Introduction
Recently, the Linguistic Linked Open Data (LLOD; http://linguistics.okfn.org/) movement has gained significant momentum. According to McCrae et al. (2016), a large number of linguistic data sets have been extracted from various sources and represented as Linked Data (LD). This new movement was motivated by the novel capabilities of the LD paradigm pertaining to transforming, sharing, and linking linguistic data on the Web (Chiarcos et al., 2012). Resources such as dictionaries and knowledge bases are essential in the development of Natural Language Processing (NLP) systems. However, most of these resources on the LLOD are still bilingual. It is thus worthwhile to develop multilingual knowledge bases by reusing this bilingual content. Multilingualism is important not only for sharing information across the Web but also for learning new concepts from other cultures.

There are many data sets and linguistic resources available on the LLOD; however, most of them contain little information about Multiword Expressions (MWE). MWE are known to constitute a difficult problem in a number of NLP tasks such as machine translation, language generation, and sentiment analysis/opinion mining. There are different types of MWE. According to Nunberg et al. (1994), MWE are categorized as phrasal verbs, compounds, fixed expressions, semi-fixed expressions, idioms, slang, and others. This work focuses on idioms, a particular type of MWE.

Most idioms are culture-bound and their senses come from particular concepts of everyday life in a given culture. By definition, an idiom is a sequence of words whose meaning cannot be derived from the meaning of the words that constitute it (Nunberg et al., 1994). Idioms are generally classified as non-compositional. One of the direct consequences of non-compositionality is the impossibility of translating this kind of word group literally (Nunberg et al., 1994), which poses challenges to human translators and to machine translation systems.

In this paper, we propose LIDIOMS, a multilingual linked data set of idioms in five languages. In LIDIOMS, we do not distinguish between idiom sub-categories and thus work on idioms in general by providing lexical and semantic knowledge on a multilingual basis. The selected languages are English, German, Italian, Portuguese, and Russian. This choice of languages is intended to show that idioms can be translated correctly regardless of their language family, syntax or culture.
Additionally, one of the goals of LIDIOMS is to support further investigations of similarity among idioms from different languages.

In the following, we begin by presenting the related work (Section 2) and the data sources that we used for the extraction (Section 3). In Section 4, we give an overview of the model that underlies our data set. Section 5 depicts the creation process that led to the publication of our data set. In Section 6, we present our approach to link LIDIOMS internally and externally. Then, we present usage scenarios for our data set in Section 7. Subsequently, we discuss LLOD quality in Section 7.4.1, and we conclude the paper and provide avenues for future work in Section 8.

2. Related Work
A large number of ontologies have been developed to represent natural language data as LD on the Web of Data. In this context, the well-known ontology lemon (McCrae et al., 2012) was originally developed to model lexical data in a monolingual or multilingual way. Subsequently, a significant amount of effort has been invested in improving the support of multilingual content. To this end, other modules have been derived from lemon for representing multilingual data, including that of Gracia et al. (2014), which extends some of the lemon properties to describe relationships among translations.

Recently, multilingual data sets have been created, such as DBnary (Sérasset, 2012), which was released with the main purpose of describing translations among lexical entries. Another resource that describes multilingual content is BabelNet (Navigli and Ponzetto, 2010), which integrates knowledge from various lexical resources, such as WordNet (Miller, 1995). Additionally, BabelNet has adopted the lemon structure for representing lexical entries (Ehrmann et al., 2014). Although these resources are linked lexical multilingual data sets, they contain a limited number of idioms described correctly along with their respective translations across languages. This lack of information about MWE and idioms is due to the absence of appropriate ontologies and vocabularies for handling this phenomenon properly. Although the Lexinfo ontology (Cimiano et al., 2011) contains a property dedicated to representing idioms, there are no appropriate classes to reuse this information. Fortunately, the W3C Ontology Lexica Community Group has created an extension of lemon called Ontolex (ww.w3.org/community/ontolex/wiki/Final Model Specification) in order not only to address this lack of information but also to describe linguistic terms more appropriately (Bosque-Gil et al., 2015). This enables LIDIOMS to represent a particular type of linguistic unit, namely idioms. In the following, we present the data set creation process in more detail.

Additionally, a number of multilingual data sets have been published as Linked Open Data (LOD) in recent years. The well-known knowledge base DBpedia (Lehmann et al., 2015) is one of the first multilingual knowledge bases extracted from Wikipedia (https://www.wikipedia.org/). Recently, the Semantic Quran data set has published translations of the Quran in 43 different languages as linked data (Sherif and Ngonga Ngomo, 2015). xLiD-Lexica (Zhang et al., 2014) is a cross-lingual linked data lexicon constructed by exploiting all language versions of Wikipedia. Terminesp (Bosque-Gil et al., 2015) is another multilingual resource for terms along with their definitions in various languages.

3. Data Sources
In this section, we list the data sources from which LIDIOMS originates and describe the data collection process for each of them. In addition, we discuss how we ensure the quality of the collected data.

3.1. Data sets
We collected a set of MWE from the following online lexical resources: (1) Phrase finder, (2) Memrise, (3) Collins and (4) Oxford dictionaries (all repository web pages: http://faturl.com/repositories/?open). Phrase finder is an online dictionary of idiomatic expressions created by Gary Martin (Martin, 2007) in 1997 to support his post-graduate research in computational linguistics. Memrise is an online course on idiomatic expressions aimed at achieving a native-speaker level. Collins and Oxford provide high-quality lexical resources; we therefore use them to guarantee the quality of the idiom definitions and also to gather some additional idioms. Memrise and Oxford provided idioms in English, German, Italian, and Russian, while the idioms in Phrase finder and Collins are in English. The Portuguese idioms were initially gathered from the Portuguese Wikipedia page, but because of the limited number of Portuguese idioms available in Wikipedia, we asked four native speakers (one from Portugal and three from Brazil) to add more Portuguese idioms.

Regarding the right to use the data, Memrise and Collins granted us full permission, while the other data providers have a free licence policy for the use of their data for research purposes.

3.2. Data Collection process
Using a custom web crawler, we collected the MWE from the aforementioned online data sources. Each of the crawled resources has a specific page for each MWE, which eased the configuration of our crawler.
Note that all data sources are bilingual but do not necessarily include English as one of the languages involved. For instance, Oxford has idiomatic expressions from Italian to Portuguese. We also noticed that most online dictionaries do not correctly categorize MWE. For example, in some cases the meaning of an MWE can be deduced from the meaning of its components (e.g. “by the book”), while in other cases this is not possible (e.g. “out of the blue”). Therefore, MWE whose meaning can be derived from the meaning of their components should not be placed in the same category as those with pragmatic meanings (i.e. non-compositional idioms).

Collecting the right idioms was a hard task due to the lack of MWE categorization. Thus, we carried out the idiom collection manually, discarding all entries that were semantically equivalent to their lexical definitions, that is, entries that are compositional. We dubbed this process pragmatically-based selection. The pragmatically-based selection identified only 50% of the MWE retrieved by our crawler as idioms. For instance, the previously mentioned MWE “by the book” means “to follow the rules as demand”. The meaning of “book” is “a stuff which contains information, rules, descriptions, and it can be a manual”. The meaning of this MWE can therefore be deduced from the meaning of its components: it becomes “to follow the book’s writing”. It is thus not considered an idiom, in contrast to “out of the blue”, which means “an event that occurs unexpectedly”: the meaning of “blue” is “color”, so no relationship exists between “blue” or “out of” and “unexpected happening”.

Moreover, the meaning of idioms may vary according to the geographical location where they are used (Martin, 2007). For example, American idioms, which come from the United States of America, differ from British idioms, which come from the United Kingdom.
We consider the location of idioms an important characteristic to be included in LIDIOMS.

3.3. Data Evaluation
To ensure the quality of the retrieved data, we asked two native speakers and one linguist (per language) to evaluate the extracted idioms and their respective definitions in English. Each native speaker evaluated the idioms' definitions separately. Idioms whose definitions were accepted by both evaluators were accepted, and idioms whose definitions were marked as wrong by both evaluators were discarded. When the two evaluations disagreed, the idiom was judged by the linguist. This procedure resulted in a manually checked data set containing a large number of idioms, as shown in Table 1. The Collection column shows the number of all MWE retrieved by our web crawler. The Filter column shows the number of idioms retained by our pragmatically-based selection, a step which recognizes only idioms among MWE (see Section 3.2). The Total column presents the resulting number of idioms after the manual review process carried out by the native speakers and the linguists.

Table 1: Number of idioms retrieved by step

4. Semantic Representation Model
The representation model of LIDIOMS aims at describing idioms correctly as a sub-type of MWE together with their translations and geographical usage area. For this purpose, the LIDIOMS data set is based on the Ontolex model. We chose the Ontolex model because it contains the necessary classes to represent MWE and their translations properly. Ontolex also reuses the well-known Lexinfo ontology, which has an essential term type called lexinfo:idiom for representing idioms as one type of MWE.

We used the core Ontolex classes to model LIDIOMS:
(1) ontolex:LexicalEntry represents a lexical entry (i.e. a word, a multi-word expression or an affix);
(2) the sub-class ontolex:MultiwordExpression specifies a lexical entry as a multi-word expression;
(3) ontolex:LexicalConcept suits the representation of an idiom's meaning well, as its formal definition is “to be a mental abstraction, concept or a thought that can be described by a given collection of senses”;
(4) ontolex:LexicalSense represents the lexical sense of an idiom;
(5) ontolex:Form describes the written and alternative forms of the entries; and
(6) ontolex:Lexicon represents a collection of lexical entries.

For translations, Ontolex uses the vartrans module, which connects ontolex:LexicalSense instances among themselves through the vartrans:Translation class. vartrans:Translation uses the property vartrans:category for describing translations and also for representing variations of these translations across entries in the same language (i.e. the same entry from a given language with different meanings) or in different languages. The vartrans module was inspired by Gracia et al. (2014), and we also reuse one of its translation categories, trcat:culturalEquivalent, which represents a translation between two entries that are not semantically but pragmatically equivalent. Note that a cultural translation of an idiom is not a literal translation; rather, it represents the specific cultural semantics of that idiom.

For the geographical area of idioms, we use the lexvo:usedIn property from the Lexvo Ontology (de Melo, 2015). The geographical area of an idiom is of great importance because the meaning of an idiom can vary within the same language depending on where it is used (diatopic variation). For instance, the Portuguese idiom “amarrar o burro” (literal translation: “tie the donkey”) means “to relax” in Portugal, while in Brazil it means “to advise someone about future problems from one action”. Furthermore, this idiom has further meanings even within Brazil, for example “to be angry when someone does not allow you to do something”, which is typical of children. In addition, some idioms are not understood in all countries that share the same language.
For instance, the Portuguese idiom “comprei um mamao” (eng: “buy a lemon”) is used in Brazil but not in Portugal.

In Figure 1, we present a complete example of a translation between two idioms, from Portuguese (“custa os olhos da cara”) to English (“arm and a leg”), using the vartrans class along with the other descriptions modeled by Ontolex in LIDIOMS. In order to represent the names of the languages in a unified way, we publish LIDIOMS based on the best practices of the International Organization for Standardization (ISO). Given that Brazilian Portuguese does not have an ISO resource, we chose to use the Brazilian Portuguese DBpedia resource as a substitute.

5. RDF Generation
The original idioms were crawled in heterogeneous formats such as CSV, XML, and HTML. To convert the idiom data into RDF, we used OpenRefine together with its RDF extension. The model underlying the RDF conversion relies on the group of patterns for generating linguistic resources as LD recommended by the Best Practices for Multilingual Linked Open Data (BPMLOD) W3C community group. Although our work is multilingual, we followed the patterns for bilingual dictionaries. We were able to use bilingual patterns because we use English as the pivot language, given that all the target translations are in English. Thus, the multilingual translations were found by inference relying on the reflexivity property of the vartrans:target. For more details about LIDIOMS, see Table 2 and visit the LIDIOMS GitLab repository.

6. Linking
In this section, we describe how we link idioms in LIDIOMS internally (i.e. within the data set) and externally (i.e. with other data sets).

Figure 1: RDF representation of a translation between two idioms, from Portuguese (“Custa os olhos da cara”) to English (“arm and a leg”), by entries modeled with the LIDIOMS model.

Table 2: Technical details of LIDIOMS.
Name: LIDIOMS
Example: http://lid.aksw.org/en/kill two birds with one
Ver. Date: 20.04.2017
Ver. No: 1.0
License: CC BY-NC-SA 3.0

6.1. Internal linking
While most of the definitions of the retrieved idioms were in English (87%), in a few cases the definition was in another language. We therefore decided to provide the definitions of all idioms in English regardless of the idioms' original language. The remaining 13% of idioms, whose definitions were in another language, were translated into English by a native speaker. The English definitions thus became our pivot, i.e. the idioms' English definitions were used as a bridge for the internal linking process across languages. For instance, the English idiom “when pigs fly” has the definition “something that will never happen”. In Portuguese, the idiom “nem que a vaca tussa” has exactly the same lexical definition, although its literal translation would be “nor the cow cough”. It is therefore valid to link these two idioms internally based on their definitions. Figure 2 illustrates the main idea underlying this work, i.e. the provision of indirect translations (represented by dotted lines) of idioms through a pivot language. Note that some idioms have multiple idiomatic equivalents in other languages while others have none.

However, some idioms have definitions with almost equivalent syntactic structures while the semantics of the definitions are very different. For instance, the English idiom “Once in a blue moon” means “something that happens rarely” and another English idiom, “When pigs fly”, means “something that will never happen”. This kind of phenomenon is likely to decrease the quality of an automatic linking process, because current link discovery frameworks (Nentwig et al., 2015) only support syntax-based string similarities. Given the lack of support for semantic-based string similarity functions, the internal linking was carried out manually by the authors, and this manual internal linking was cross-validated by the native speakers and linguists.

Figure 2: An indirect translation excerpt

Table 3 shows the number of direct and indirect translations found for the selected idioms per language.

Table 3: Number of idioms and translations

6.2. External linking
Linking LIDIOMS to external resources is based on the string similarities between LIDIOMS's resources and the other data sets' resources. The current version of LIDIOMS is linked to two other data sets in order to ensure reusability and integrability.

The first data set we linked to LIDIOMS is DBnary. We used the algorithms provided in the LIMES framework (Ngomo, 2012; Sherif and Ngonga Ngomo, 2015), which are time-efficient, to carry out the DBnary linking tasks. The linking was performed on the rdfs:label property using trigram similarity with an acceptance threshold of 0.85.
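To make the label-matching step concrete, the following is a minimal Python sketch of trigram similarity with the 0.85 acceptance threshold. It assumes a common Dice-style definition over padded character trigrams; the exact measure implemented in LIMES may differ, and the function names (trigrams, trigram_similarity, link_labels) are hypothetical and not part of any LIMES API.

```python
def trigrams(text):
    """Return the set of character trigrams of a lower-cased, padded label."""
    # Padding (an assumption of this sketch) lets word boundaries contribute trigrams.
    padded = "  " + text.lower().strip() + " "
    return {padded[i:i + 3] for i in range(len(padded) - 2)}


def trigram_similarity(a, b):
    """Dice coefficient over the trigram sets of two labels, in [0, 1]."""
    ta, tb = trigrams(a), trigrams(b)
    return 2 * len(ta & tb) / (len(ta) + len(tb))


def link_labels(source_labels, target_labels, threshold=0.85):
    """Return (source, target, score) for every label pair at or above the threshold."""
    links = []
    for s in source_labels:
        for t in target_labels:
            score = trigram_similarity(s, t)
            if score >= threshold:
                links.append((s, t, score))
    return links


if __name__ == "__main__":
    # Toy labels for illustration only; they are not taken from the linking run.
    for link in link_labels(["out of the blue"],
                            ["Out of the Blue", "once in a blue moon"]):
        print(link)
```

A high threshold such as 0.85 accepts labels that differ only in case or by a small typo while rejecting labels that merely share a word, which matches the purely syntactic nature of the measure discussed in Section 6.1.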

Table 4: Number of links and precision values obtained between LIDIOMS and other data sets.

The second data set we linked with LIDIOMS is BabelNet. The BabelNet linking process was carried out using the BabelNet API to retrieve senses and definitions. While linking, we noticed that BabelNet does not correctly type idioms (for more details, see Section 7.4.1). We thus linked to BabelNet manually by comparing our skos:definition property with the bn-lemon:definition property of the BabelNet resources. This task was performed by the same group of linguists consulted previously.

6.3. Linking Quality
In this section, we show and discuss the linking statistics of LIDIOMS with BabelNet and DBnary. Table 4 presents the number of links per resource and language in the LIDIOMS data set. Note that all the links were evaluated manually. The Retrieval columns show the total number of idioms collected from a given data set, and the Accepted columns present the number of idioms which were matched exactly as an idiom. We also present the precision achieved by the aforementioned link specifications.

DBnary achieved good precision in general. Its lower scores only come from Portuguese and Russian, as these languages are little exploited by linguistic resources in terms of MWE and thus contain only few idioms. DBnary follows the best practice of publishing linked data without typos in labels (e.g. rdfs:label), in contrast to BabelNet (see /rdf/page/once in ablue moon r EN). This problem contributes to the lower precision score of BabelNet, because its API does not handle it, unlike LIMES.

7. Use Cases
In this section, we outline selected application scenarios for our data set. Listing 1, Listing 2 and Listing 3 illustrate different facets of how LIDIOMS can support translation use cases. LIDIOMS contains a significant number of instances of concepts, places and translations. Thus, multilingual idioms along with their definitions concerning specific information can be easily retrieved from our data set. Moreover, the aligned multilingual representation allows searching for idioms with the same meaning across different languages.

7.1. Gathering idioms by definitions
The first use case for our data set is exploratory in nature. Machine translation agents are commonly in need of expressions that have a certain meaning. A simple SPARQL query over LIDIOMS enables these potential agents to easily find idioms which contain a keyword of choice. For example, Listing 1 shows a SPARQL query for retrieving English, Italian and Russian idioms which contain the verb “to deceive” in their definitions.

    SELECT ?label ?definition
    WHERE {
      ?idiom rdfs:label ?label .
      ?idiom ontolex:sense ?sense .
      ?sense ontolex:isLexicalizedSenseOf ?concept .
      ?concept skos:definition ?definition .
      FILTER( bif:contains(?definition, "deceive") ) .
      FILTER( lang(?label) = "it" || lang(?label) = "en" || lang(?label) = "rus" ) .
    }

Listing 1: Idiom definitions that contain the same verb in (i) English, (ii) Italian and (iii) Russian.

7.2. Idioms usage per area
LIDIOMS provides information about the place of usage of each idiom. For instance, the idiom “it’s raining cats and dogs” has English as its language property and comes from England. By being aware of the place of origin of an idiom, translators are empowered to translate an idiom into the right idiom for a given target group. Listing 2 shows a SPARQL query which retrieves all idioms from England.

    SELECT ?idiom ?label
    WHERE {
      ?idiom rdfs:label ?label ;
             lexvo:usedIn dbr:England .
    }

Listing 2: All idioms coming from England.

7.3. Translating across languages
Another important use of LIDIOMS is to retrieve indirect translations. By indirect translation we mean a translation which is based on another translation.
The RDF representation of LIDIOMS enables the induction of indirect translations through the English translations. For example, the SPARQL query in Listing 3 first finds the English translation of the German idiom “Zwei Fliegen mit einer Klappe schlagen”, then retrieves Russian idioms with equivalent English translations.
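The pivot idea can also be sketched outside SPARQL as a plain join on the shared English target. The Python sketch below is only an illustration under stated assumptions: the DIRECT triples are hypothetical toy data (they are not claimed to be entries of the data set), the language tags follow the listings in this paper, and the function name indirect_translations is invented for this example.

```python
from collections import defaultdict

# Hypothetical toy data: (idiom, language tag, English pivot translation).
DIRECT = [
    ("zwei fliegen mit einer klappe schlagen", "de", "kill two birds with one stone"),
    ("убить двух зайцев одним выстрелом", "rus", "kill two birds with one stone"),
    ("nem que a vaca tussa", "pt", "when pigs fly"),
]


def indirect_translations(direct, source_idiom, target_lang):
    """Find idioms in target_lang that share an English pivot with source_idiom."""
    idioms_by_pivot = defaultdict(set)   # pivot -> {(idiom, lang)}
    pivots_by_idiom = defaultdict(set)   # idiom -> {pivot}
    for idiom, lang, pivot in direct:
        idioms_by_pivot[pivot].add((idiom, lang))
        pivots_by_idiom[idiom].add(pivot)

    results = set()
    for pivot in pivots_by_idiom[source_idiom]:
        for idiom, lang in idioms_by_pivot[pivot]:
            # Keep only idioms in the requested language, excluding the source itself.
            if lang == target_lang and idiom != source_idiom:
                results.add(idiom)
    return sorted(results)


if __name__ == "__main__":
    print(indirect_translations(DIRECT, "zwei fliegen mit einer klappe schlagen", "rus"))
```

The join mirrors the two translation patterns that share one target: the first lookup resolves the source idiom to its English pivot, the second collects every other idiom that translates to the same pivot.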

    SELECT ?idiom
    WHERE {
      ?i rdfs:label "zwei fliegen mit einer klappe schlagen"@de ;
         ontolex:sense ?sense .
      ?trans vartrans:source ?sense ;
             vartrans:target ?target .
      ?transind vartrans:target ?target ;
                vartrans:source ?source .
      ?lex ontolex:sense ?source ;
           rdfs:label ?idiom .
      FILTER( lang(?idiom) = "rus" ) .
    }

Listing 3: Indirect translation.

7.4. Third-party uses: Retrieving More Information through Links
LIDIOMS is linked to other data sets, from which we are able to retrieve additional idiom-related information. For example, Listing 4 shows a SPARQL query for retrieving the part-of-speech tag of the English idiom “out of the blue” from the corresponding resource in DBnary.

    SELECT ?pos
    WHERE {
      ?idiom rdfs:label "out of the blue"@en ;
             owl:sameAs ?ext_idiom .
      SERVICE <http://kaiko.getalp.org/sparql> {
        SELECT ?ext_idiom ?pos
        WHERE {
          ?ext_idiom dbnary:partOfSpeech ?pos
        }
      }
    }

Listing 4: Retrieving data from different resources.

7.4.1. Discussion
A main limitation of the data sets currently available in the LLOD is the lack of proper categorization of MWE. For example, neither BabelNet nor DBnary has specific MWE types. For instance, in BabelNet, some idioms were not typed as lexical entries: we were able to find exact matches for many idioms included in LIDIOMS, but the matches belonged to other classes such as a film, a book or a music album (e.g., “head over heels” is the label of a film). In order to alleviate this problem, we also tried to filter the idioms by bn-lemon:synsetType in BabelNet; however, incorrect types prevented us from linking them easily. For example, the idiom “The Goose That Laid the Golden Eggs” is typed as “Named Entity” (see http://babelnet.org/rdf/page/s03200922n), but it should be a concept. Additionally, Listing 5 shows an example resource from BabelNet.

    bn:arm_and_a_leg_n_EN
        a lemon:LexicalEntry ;
        rdfs:label "arm and a leg"@en ;
        lemon:canonicalForm <http://babelnet.org/rdf/arm_and_a_leg_n_EN/canonicalForm> ;
        lemon:language "EN" ;
        lemon:sense <http://babelnet.org/rdf/arm_and_a_leg_EN/s13676929n> ;
        lexinfo:partOfSpeech lexinfo:noun .

Listing 5: Fragment of a BabelNet resource.

In Listing 5, the idiom “arm and a leg” is represented as a noun, while it should first be represented as an MWE or, more precisely, as an idiom. This lack of accurate categorization of MWE makes linking data sources such as LIDIOMS with other resources very difficult. In particular, using declarative link discovery frameworks for computing similarities among MWE without the right classification information becomes a slow task which leads to links with a low level of precision.

Furthermore, this incomplete categorization also exists in other data sets such as DBpedia and DBnary. We thus regard LIDIOMS as a first effort towards a better LLOD, where MWE (especially idioms) are represented as such. We envision that this better representation will lead to high-quality linked-data-driven NLP systems, including but not limited to better Machine Translation (MT) applications.

8. Summary
In this paper, we described LIDIOMS, a multilingual Resource Description Framework (RDF) data set containing idioms represented in five languages. The data set fills an important gap in MWE processing and can be used as a resource in NLP pipelines. The current version of LIDIOMS contains 13,889 triples modeling 815 concepts with 488 translations (115 indirect translations) coming from 7 different sources and linked to 645 external resources. LIDIOMS connects idioms from different languages that have semantically equivalent definitions.
To ensure interoperability with other data sets on the LLOD, LIDIOMS is linked to BabelNet and DBnary.

8.1. Future Work
We are currently working to extend the coverage of LIDIOMS so that researchers and developers who work on languages not currently present in the data set can benefit from it. Future versions of LIDIOMS will include idioms from other languages such as Arabic, Chinese, Korean, Czech, Finnish, and French. Moreover, to handle diatopic language variation, the current languages of LIDIOMS are being updated to include more fine-grained locations (e.g., cities) as the geographical area of use for idioms with more than one meaning within the same country and language. Finally, we plan to improve the automation of the internal and external linking of idioms by implementing an approach for semantically linking idioms' definitions.

Acknowledgements
We would like to thank all native speakers who contributed to this work, specifically Maria Sukhareva from the Olia Project, Will Hanley, Chiara Pace and Devayani Bhave. Special thanks to John McCrae, Jorge Gracia and Gilles Sérasset for their constructive discussions about modelling and best practices to create our data set.

This work has been supported by the H2020 project HOBBIT (GA no. 688227) and by the Brazilian National Council for Scientific and Technological Development (CNPq) (no. 206971/2014-1). This research has also been supported by the German Federal Ministry of Transport and Digital Infrastructure (BMVI) in the projects LIMBO (no. 19F2029I), OPAL (no. 19F2028A) and GEISER (no. 01MD16014E), as well as by the BMBF project SOLIDE (no. 13N14456).

Bibliographical References
Bosque-Gil, J., Gracia, J., Aguado-de Cea, G., and Montiel-Ponsoda, E. (2015). Applying the ontolex model to a multilingual terminological resource. In The Semantic Web: ESWC 2015 Satellite Events, pages 283–294. Springer.
Chiarcos, C., Nordhoff, S., and Hellmann, S. (2012

