An Automatic Ontology Generation Framework with an Organizational Perspective


Proceedings of the 53rd Hawaii International Conference on System Sciences, 2020

An Automatic Ontology Generation Framework with an Organizational Perspective

Samaa Elnagar, Virginia Commonwealth University, elnagarsa@vcu.edu
Victoria Yoon, Virginia Commonwealth University
Manoj A. Thomas, University of …

Abstract

Ontologies are known for their powerful semantic representation of knowledge. However, ontologies cannot automatically evolve to reflect updates that occur in their respective domains. To address this limitation, researchers have called for automatic ontology generation from unstructured text corpora. Unfortunately, systems that aim to generate ontologies from unstructured text corpora are domain-specific and require manual intervention. In addition, they suffer from uncertainty in creating concept linkages and difficulty in finding axioms for the same concept. Knowledge Graphs (KGs) have emerged as a powerful model for the dynamic representation of knowledge. However, KGs have many quality limitations and need extensive refinement. This research aims to develop a novel domain-independent automatic ontology generation framework that converts an unstructured text corpus into a domain-consistent ontological form. The framework generates KGs from the unstructured text corpus and refines and corrects them to be consistent with domain ontologies. The power of the proposed automatically generated ontology is that it integrates the dynamic features of KGs and the quality features of ontologies.

1. Introduction

Ontologies have been used as a model for knowledge storage and representation [1]. The characteristics of a good ontology are memory, dynamism, polysemy, and automation [2]. In an ideal scenario, systems must be capable of generating and enriching ontologies automatically. However, most ontologies are generated manually by ontology engineers who are familiar with the theory and practice of ontology construction [3].
The goal of automatic ontology generation is to convert new knowledge into ontological form, enabling related processing techniques such as semantic search and retrieval [4]. Automatic ontology generation will significantly reduce the labor cost and time required to build ontologies [5].

Most current automatic ontology generation systems convert existing structured knowledge (e.g., database schemas and XML documents) into ontological formats [6]. However, approaches that convert unstructured text corpora into ontological format have not been fully developed. Moreover, existing approaches are domain-specific and require manual intervention to create domain rules and patterns [2]. Similar to ontologies, Knowledge Graphs (KGs) encode structured information about entities and their relations into a graphical form, or a directed graph G(C, R), where C is the set of vertices and R is the set of edges symbolizing relationships between two concepts in the graph [7]. However, there are significant differences between ontologies and KGs that are important to note. First, from a practical viewpoint, KGs are powerful in many aspects, but their quality and reliability are questionable. Second, there is usually a trade-off between the coverage and the correctness of KGs, which could be perilous for certain business problems. Third, from a theoretical perspective, the trustworthiness of KGs has not been established [8], particularly for organizations that assign high priority to data quality and system reliability.

This research aims to develop a domain-independent automatic ontology generation framework that enables organizations to generate ontological form from unstructured text corpora. The study is fueled by the lack of fully automated, domain-independent ontology generation systems that address common data quality issues. The framework utilizes refined KGs that are mapped and tailored to fit into target domain ontologies.
The generated ontologies benefit from KGs' features and avoid quality issues traditionally associated with automatic ontology generation. In addition to enabling organizations to store and retrieve new knowledge in ontological RDF format, the study also shows how the framework can facilitate interoperability to efficiently employ knowledge across multiple domains. It is to be noted that the generated ontologies are in the basic triple RDF format; their hierarchical structure (i.e., the OWL format) is beyond the scope of the paper. Generating ontologies from refined KGs will not only overcome the limitations of ontologies, such as data integration

and evolution, but also take advantage of the benefits of KGs, such as timeliness.

The contributions of this paper can be summarized as:
1. The design of an automatic ontology generation framework from unstructured knowledge sources that can be used across various domains.
2. The alignment of KGs with reference ontologies after refining them in terms of correctness, completeness, and consistency with target domain ontologies.
3. The development of criteria for KG correction and consistency checking.

2. Literature Review

Due to enhancements in machine learning and Natural Language Processing (NLP) algorithms, many studies have addressed automatic ontology generation. Rule-based approaches have been used extensively in ontology engineering [9]. However, those approaches require manually crafted sets of rules or patterns to represent knowledge, making them narrow in scope and domain-dependent. The authors of [10] and [11] presented ontology generation systems from plain text using predefined dictionary, statistical, and NLP techniques. The two approaches target the medical domain specifically. Additionally, their approaches require extensive labor to construct patterns and maintain comprehensive dictionaries.

The system in [12] used Wikipedia texts to extract concepts and relations for ontology construction. It used a supervised machine learning technique that required substantial effort for manual labeling and validation of data. An Alzheimer ontology generation system was built in [13]. The system used a controlled vocabulary along with linked data to build the ontology based on the Text2Onto system, combining machine learning approaches with part-of-speech (POS) tagging. Unfortunately, the involvement of domain experts is needed during the development process.

Alobaidi et al. [3] asserted the need for automatic and domain-independent ontology generation methods. They identified biomedical concepts using Linked Life Data (LOD) and linked medical knowledge bases, and applied semantic enrichment to enrich the concepts.
A Breadth-First Search (BFS) algorithm was used to search the LOD repository to create a precise, well-defined ontology. However, this approach targets the medical domain, and the framework is trained only with linked biomedical ontologies. Further, the quality of the generated ontologies is neither evaluated nor checked for errors and consistency with domain ontologies. The system in [14] automatically constructed an ontology from a set of text documents using WordNet, but no details were provided on how the terms are extracted, and no qualitative assessment is provided.

Kong et al. [15] designed a domain-specific automatic ontology system based on WordNet. The approach is highly dependent on the quality of the starting knowledge resource. User intervention is also necessary to avoid incompatible concepts. In [16], dictionary parsing mechanisms and discovery methods were used for acquiring domain-specific concepts. The developed framework is considered a semi-automatic ontology acquisition system for mining ontologies from textual resources. The framework also depends on technical dictionaries for building a concept taxonomy for the target domain.

Meijer et al. [17] developed a framework for the generation of a domain taxonomy from a text corpus. The framework employed a disambiguation step for both the extracted taxonomy and the reference ontology used for evaluation. In addition, the subsumption method was used for hierarchy creation. However, the scope of this study was only to build a taxonomy of concepts and relations, with minimal focus on relations between instances. Further, the system has very low semantic precision and recall because of improper relation representation.

To summarize, we conclude that most of the approaches used for automatic ontology generation from unstructured text corpora are domain-specific, demonstrating the need for domain-independent ontology generation methods [3, 4]. Additionally, few systems used unstructured text from external heterogeneous sources [17, 18].
Human intervention was also required in one or more tasks. Further, few approaches take into consideration the quality of the generated ontology. Moreover, issues such as timeliness, evolution, and integration have not been discussed. To fill this gap in the literature, our study aims to develop a fully automatic, domain-independent ontology generation framework that handles various types of unstructured text corpora. Quality issues are a core consideration in the design of our system. In our method, we focus on the relationships between different instances with reference to the structure of reference ontologies.

3. What are Knowledge Graphs?

A knowledge graph is used mainly to describe real-world entities and their interrelations, organized in a graph [19]. It is considered a dynamically growing semantic network of facts about things. Some argue that KGs are somewhat superior to ontologies and provide additional features such as timeliness and scalability [20]. A knowledge graph G consists of

a schema graph Gs, a data graph Gd, and the relations R between Gs and Gd, denoted as G = (Gs, Gd, R). A KG schema does not necessarily contain all the concepts and relations of the domain ontology. Instead, KGs are generated based on the concepts found in the source corpus. For example, Figure 1.a shows a KG for IMDB reviews. In the figure, movies are connected to actors, directors, and genres. We can easily infer that the IMDB reviewer Rita (the beige-colored node in the middle) likes Charles Chaplin as both an actor and a director. The ontological representation of such a graph, however, would be the expansion of the composing ontologies shown in Figure 1.b. Expanding the basic ontologies to include all concepts and attributes in each ontology would result in the representation of unnecessary concepts.

Figure 1.a: Knowledge Graph for IMDB reviews and the basic equivalent ontologies (generated using the Neo4j sandbox)
Figure 1.b: The main equivalent ontologies representation for a retail store web application

4. A Comparison between KGs and Ontologies

A KG is considered a dynamic or problem-specific ontology [21]. KGs are domain-independent methods for knowledge representation, while ontologies are known for representing domain knowledge [22]. Thus, in KGs the number of instance statements is far larger than the number of schema-level statements. The focus of knowledge graphs is the instance (A-box) level more than the concept (T-box) level. While ontologies focus on building schematic taxonomies of concepts and relations for a certain domain [23], a KG schema is rather shallow, with a small degree of formalization and no hierarchical structure. Most KGs follow the Open World Assumption (OWA), which states that KGs contain only true facts and that non-observed facts can be either false or just missing. In contrast, most ontologies follow a domain-specific approach, or the Closed World Assumption (CWA), which assumes that facts not contained in the domain are false [24].

Scalability is "the ability of a system to be enlarged to accommodate growth" [25]. KGs are very scalable, as shown in Figure 1.a. Reliability as a concept depends on the availability of sources [26], and therefore the availability of data in ontologies is higher than in KGs. The automatic construction and maintenance of KGs face substantial challenges: maintenance of KGs depends on manual user feedback, which is burdensome, subjective, and difficult [27].

Timeliness measures how up-to-date data is relative to a specific task [28]. Since most updates to ontologies are done manually by domain experts [29], there are periods during which an ontology is incomplete. In contrast, KGs are generated at runtime, so the data is current. Evolution is very likely in KGs for several reasons: (i) KGs represent dynamic resources, and (ii) the entire graph can change or disappear [30]. On the other hand, ontologies need a domain expert's intervention to evolve, which is usually a daunting and costly process.

Licensing is defined as "the granting of permission for a consumer to re-use a dataset under defined conditions" [31]. Licensing is a new quality dimension not considered for relational databases [32]. KGs should contain a license or clear legal terms so that their content can be (re)used. Interoperability is "the usage of relevant vocabularies for a particular domain" [33] such that different systems can exchange information. Interoperability is a main issue in KGs, and some argue that KGs may create inconsistencies with many information systems [27, 34].

Relevancy refers to "the provision of information which is in accordance with the task at hand" [35]. Because KGs are multi-domain graphs, knowledge about a certain domain might be superficial. Ontologies, unlike KGs, contain detailed descriptions of concepts and relations for a specific domain. Data integration in ontologies is challenging, especially when it comes to extending the knowledge beyond the domain knowledge [36, 37]. Data integration in the case of KGs might lead to duplication of instances and referential conflicts. Therefore, KG refinement is crucial. KGs are essential for real-time processing such as real-time recommendations and fraud detection [38].

The size of KGs is usually far larger than the size of ontologies [39]. Although extensively in use, KGs are hard to compare against each other in a given setting [40]. Unlike KGs, there are many tools that can be used to compare different ontologies [41]. Computational performance concerns become more important as KGs become larger; typical performance measures are runtime measurements as well as memory consumption [42]. Since knowledge graphs explicitly identify all concepts and their relationships to each other, they are inherently explainable, which is not the case with ontologies [43]. KGs have a higher degree of agility (the rate of knowledge change) than ontologies because of their dynamicity and continuous evolution [29]. Redundancy refers to the duplication of relations, attributes, or instances [27]. KGs might be generated on the fly, so they are prone to duplication of instances. KGs connect instances visually, so they are human-friendly. KGs have also changed the nature of many ML techniques, such as the graph-convolutional neural network [44]. A comparison between ontologies and KGs is illustrated in Table 1.

Table 1: A Comparison between KGs and Ontologies

Dimension            | KG                       | Ontology
World assumption     | OWA                      | CWA
Size                 | Massive                  | Relatively small
Scalability          | Very scalable            | Limited scalability
Scope                | Problem specific         | Domain specific
Timeliness           | Generated at real time   | Limited
Generation           | Automatic                | Mostly by humans
Trustworthiness      | Not very trustworthy     | Trustworthy
Knowledge base type  | More A-Box than T-Box    | Usually more T-Box than A-Box
Markup language      | RDF                      | RDF, OWL, OIL
Data integration     | Easily integrated        | Hard to integrate
Quality              | Questionable             | High quality
Redundancy           | Very likely              | Not likely
Licensing            | Questionable             | Reasonable
Interoperability     | Low                      | Moderate
Relevancy            | Low                      | High
Comparability        | Very hard                | Achievable
Friendliness [40,43] | Machine/human friendly   | Machine friendly, not human friendly
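The OWA/CWA distinction in the comparison above can be made concrete with a few lines of Python. The triples and predicate names below are toy examples for illustration, not part of the paper's system:

```python
# A tiny triple store: each fact is a (subject, predicate, object) tuple.
TRIPLES = {
    ("USA", "locatedNear", "Mexico"),
    ("Chaplin", "actedIn", "ModernTimes"),
}

def holds_cwa(triple):
    """Closed World Assumption (ontology-style): any fact not stated is false."""
    return triple in TRIPLES

def holds_owa(triple):
    """Open World Assumption (KG-style): any fact not stated is unknown,
    not false -- it may simply be missing from the graph."""
    return True if triple in TRIPLES else None  # None encodes "unknown"

q = ("Chaplin", "directed", "ModernTimes")
print(holds_cwa(q))  # False -> under CWA the missing fact is denied
print(holds_owa(q))  # None  -> under OWA the missing fact is merely unobserved
```

The same missing triple is denied under CWA but left open under OWA, which is why KG completion (Section 5.2.4) is meaningful for KGs: a missing edge is a candidate for inference rather than a negative fact.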
4.1. Domain and Quality Constraints

Building knowledge graphs from scratch is a tedious proposition that requires machine learning models trained on huge datasets, in addition to strong NLP techniques and reasoning capabilities. Third-party solutions, such as IBM Watson and Neo4j [49], offer knowledge graph generation as an on-demand service. However, some organizations may not trust third-party generated KGs owing to concerns about security, reliability, and relevancy. Therefore, an organization should weigh the time and effort required to produce a knowledge graph itself against the value it receives from using third-party KGs.

4.2. Why Ontologies but Not KGs?

While many argue for the superiority of KGs over ontologies, ontologies are superior to KGs in interoperability and many quality measurements [20]. In fact, whatever approach is used to build a knowledge graph, the result will never be perfect [8]. The sources of imperfection are mainly incompleteness, incorrectness, and inconsistency [45]. For example, if KGs are constructed from RSS feeds or social media websites, there is a high probability that the knowledge will be noisy, missing important pieces of information, or containing false information such as rumors. In addition, the accuracy of the generated knowledge graph depends on the accuracy of the KG generation system.

Since the quality of a generated KG is strongly dependent on the data quality of the knowledge source and the accuracy of the KG generator, mapping the generated KG to reference ontologies will ensure a quality and reliable representation of knowledge. Another reason for using ontologies over KGs is that most information systems in organizations are designed using domain-specific ontologies. Those systems cannot store and retrieve from KGs, because interoperability issues would emerge if KGs were used.

5. Proposed Automatic Ontology Generation Framework

The proposed framework is inspired by the ontology generation life cycle developed in [2]. The framework
The frameworkconsists of three main phases of the Generation phase,the Refinement phase, and the Mapping phase asshown in Figure 2. Each phase is discussed below indetails.5.1. Generation PhasePage 4863

In the generation phase, the input to the framework is an unstructured text corpus and the output is the preliminary generated KG. The processes conducted in the generation phase are:

5.1.1. Data Cleaning

Data cleaning is a very important process. Without cleaning, irrelevant concepts and relations could degrade the reliability of the results. An unstructured corpus might contain HTML tags, comments, social website plugins, ads, etc. Therefore, cleaning irrelevant information is necessary; otherwise, we might find the term "Facebook" among the main concepts simply because it is repeated in many webpages.

5.1.2. Knowledge Graph Generator

A KG generator can be used to generate the preliminary KG in the form of triples in RDF format (subject, predicate, object), or (Resource, Property, Property value). The generated graph includes the entities and relations with their corresponding confidence scores. For example, the relation that the USA is located near Mexico has a confidence score of 0.917, as shown below. This confidence score is not only derived from the corpus but also reinforced by prior knowledge stored in reference ontologies. The generated KG can also be visualized on demand.

    "results": [ {
        "id": "eea16dfd5fe6139a25324e7481a32f89",
        "result metadata": { "confidence": 0.917 }
    } ]

Figure 2: Proposed Automatic Ontology Framework

5.2. Refinement Phase

As mentioned earlier, the most problematic issue with KG generation systems is that they cannot distinguish between reliable and unreliable knowledge sources. Facts extracted from the Web may be unreliable, and the generated KG will be based on the information given in the knowledge source. So, deviation could emerge from the incorrect and incomplete ontological coverage of the generated KG [50]. The best approach to address these problems is to compare and complete the generated KG using prior reference knowledge.

5.2.1. Reference Ontologies

Reference ontologies are used to evaluate and verify the generated KGs. However, there is no single ontology that is considered the best reference for all domains. Accordingly, reference ontologies are selected based on the nature of the problem behind the generated KG, in addition to the nature of the domain ontologies themselves. For example, if the generated KG contains many general topics, DBpedia could be a perfect fit because DBpedia is the ontological form of Wikipedia [51]. In addition to reference ontologies, benchmark ontologies are needed to train the KG completion algorithms before they can be used in KG refinement.

5.2.2. Anomalies Exclusion

This step aims to exclude irrelevant, illogical, and unrelated nodes (concepts) and relations. There might be some nodes that are not connected to the rest of the graph. For example, a political article may contain idioms such as "kick the bucket"; considering the bucket a concept or entity is totally out of context. Most KG generators create concepts and relations along with confidence scores, which represent the generator's certainty about a concept or a relation. Therefore, concepts or relations with low confidence scores should be removed.

5.2.3. Correctness Module

Based on the literature review, insufficient research has addressed KG correctness. The primary source of errors in KGs is errors in the data sources used for creating them. Association rule mining has been used extensively for error checking and removing inconsistent axioms [52]. In the framework, the system learns disjointness axioms (class disjointness assertions), and then applies those axioms to identify potentially wrong type assertions. For example, a school could be named "Kennedy", but a school cannot be a person (disjointness). So, a rich ontology is required to define the possible restrictions that cannot coexist [42]. DOLCE is a top-level ontology that is rich with disjointness axioms [53].

5.2.4. Completion Module

This is the most important module in the framework, as most generated KGs are incomplete. KG completion is performed via knowledge graph embedding, which can be summarized as follows: for each triple (h, r, t), the embedding model defines a score function f(h, r, t). The goal is to choose the f that makes the score of a correct triple (h, r, t) higher than the score of an incorrect triple (h′, r′, t′) [54].
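A minimal sketch of such a score function, using the translational form discussed below. The 2-d embeddings here are hand-picked toy values for illustration, not trained parameters:

```python
# TransE-style score sketch: a correct triple should satisfy h + r ≈ t,
# so the score is the negated distance ||h + r - t|| (higher = more plausible).
import numpy as np

emb = {  # toy 2-d embeddings, chosen so USA + locatedNear lands on Mexico
    "USA": np.array([1.0, 0.0]),
    "Mexico": np.array([1.0, -1.0]),
    "locatedNear": np.array([0.0, -1.0]),
    "Chaplin": np.array([5.0, 5.0]),
}

def score(h, r, t):
    """f(h, r, t) = -||h + r - t||; 0 is the best possible score."""
    return -np.linalg.norm(emb[h] + emb[r] - emb[t])

good = score("USA", "locatedNear", "Mexico")    # exact translation -> 0.0
bad = score("Chaplin", "locatedNear", "Mexico")  # far from t -> strongly negative
assert good > bad
```

A real completion module would learn the embeddings by minimizing a margin loss that pushes `good` above `bad` over many sampled corrupted triples.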

Completion approaches can be classified into two categories: Translational Distance Models and Semantic Matching Models. Translational Distance Models, such as the popular TransE, have been used extensively in scientific research. However, TransE has defects in dealing with 1-to-N and N-to-N relations [55]. Semantic Matching Models, such as RESCAL and its extensions, link each entity with a vector to capture its latent semantics [56].

5.3. Mapping Phase

5.3.1. Domain Ontologies

Most information systems are based on domain ontologies. For example, in a hospital there are fundamental healthcare ontologies. So, the generated KG must be mapped to fit the domain ontologies, to ensure the consistency and interoperability of the generated KG with the ontologies used in an organization.

5.3.2. Consistency Check

This step aims to solve the interoperability issue in KGs by checking whether all concepts and relations are consistent with the domain and range of each object-property in the target domain. For example, if (x, p, y) indicates John (x, the subject) with property (p, hasLastName) of Robert (y, the object), then hasLastName makes John belong to the domain class c = Person, i.e., (x, rdf:type, c). In the framework, we extend the method developed by Péron et al. [57]. According to Péron, a domain inconsistency is the occurrence of an object-property p whose subject does not belong to the declared domain of p. Similarly, a range inconsistency is the occurrence of an object-property p whose object does not belong to the declared range of p.

5.3.3. The Generated Ontology

After validating consistency with the previous method, the KG is trimmed and transformed into a domain ontology. In this step, super-/subtype classes are resolved, and any relation that is inconsistent with the domain ontologies is removed, to ensure that the generated ontology can be easily integrated into the knowledge bases of the target organization.
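The domain/range consistency check of Section 5.3.2 can be sketched as follows. The in-memory ontology fragment, property names, and type assertions below are hypothetical illustrations, not the paper's actual data:

```python
# Hypothetical domain-ontology fragment: each object-property declares the
# class of its subject (domain) and the class of its object (range).
ONTOLOGY = {
    "hasLastName": {"domain": "Person", "range": "Literal"},
    "locatedIn":   {"domain": "Place",  "range": "Place"},
}

# Hypothetical rdf:type assertions for instances.
TYPES = {"John": "Person", "Robert": "Literal", "Columbus": "Person"}

def consistent(x, p, y):
    """A triple (x, p, y) is consistent when p is a known property and the
    types of x and y match p's declared domain and range, respectively."""
    spec = ONTOLOGY.get(p)
    if spec is None:
        return False  # unknown property -> cannot be verified
    return TYPES.get(x) == spec["domain"] and TYPES.get(y) == spec["range"]

print(consistent("John", "hasLastName", "Robert"))  # True: Person/Literal match
print(consistent("Columbus", "locatedIn", "Ohio"))  # False: subject typed Person, not Place
```

Triples that fail the check would be trimmed in step 5.3.3 before the KG is transformed into the domain ontology.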
6. Implementation

In this section, we discuss the implementation details of the proposed framework in terms of the algorithms, ontologies, and refinement methods used.

Data Cleaning: For data cleaning, Python code was developed to parse webpages, search for HTML tags, irrelevant meta-tags, and social media plugins, and discard them from the input corpus. The data cleaning procedure is summarized in Algorithm 1.

Algorithm 1: Cleaning Unstructured Corpus
Input: set of text files T{t1, .., tn}
Output: cleaned set of text files T′{t′1, .., t′n}
For each t in T
    If t extension is HTML
        Search for "p" or "span" tags
        Apply NLP to check the sentence
        If the tag forms a sentence
            If the container tag contains no ads
                Add the resulting tag to T′
            Else
                Discard
        End if
    Else if t extension is RSS
        Search for "Description" or "Title" tags
        Add the resulting tags to T′
    Else if t extension is XML
        For each tag tg in t
            Apply NLP to the sentence
            If tg contains a sentence
                Add tg to T′
            Else
                Discard tg
        End if
    Else
        If t contains sentences
            Add t to T′
    End if
End foreach

Selected Reference Ontologies: To select the appropriate reference ontologies, we followed the criteria developed by [51] for finding the most suitable knowledge graph for a given setting. Since we are adopting an organizational perspective, YAGO and DOLCE were the most suitable reference ontologies, for the following reasons:
- YAGO currently has around 10 million entities and contains more than 120 million facts about these entities [58]. YAGO provides source information per statement. It also links classes to the WordNet knowledge base and DBpedia.
- DOLCE is used for KG correction [53], as it provides high-level disjointness axioms.

For our empirical analysis, we used the WN18 and FB15k datasets, the most popular benchmark datasets, built on WordNet and Freebase respectively [59, 60], for training and testing KG completion.
These datasets serve as realistic KB completion datasets and are used for training the completion module network.

KG Generator: To ensure the best results, we used third-party KG generators to avoid the time and resources wasted in building generic KGs (we recognize that a home-grown approach may not provide accuracy comparable to those third-party solutions). Neo4j and Watson are third-party services that achieve outstanding performance in generating KGs from text. They offer KG generation as a service through APIs or special browsers.

Anomaly Exclusion: The first step is to exclude any concept or relation with a confidence score less than 0.3. A low confidence score means that the KG generator could not find enough evidence, either in the corpus or in the reference ontologies, to support the concept or relation. Implausible links, such as an RDF:sameAs assertion between a person and a book, can be identified based only on the overall distribution of all links, where such a combination is infrequent [61]. For concepts and relations with confidence scores between 0.3 and 0.5, the Local Outlier Factor² is applied to check their validity.

KG Error Correction: We used first-order logic to check for erroneous relations and associations, by checking whether each class has any relations with other classes found in the disjointness axioms associated with that class. For example, assume that the KG misrepresented the city "Columbus" as a person, but it has a located-in property. The class Person is disjoint with located-in, which belongs to the class Location. The DOLCE ontology contains hundreds of axioms and combines many (non-trivial) formalized ontological theories into one theory [52]. Aside from axioms, each property associated with each instance is validated using YAGO, to check whether the data is outdated or misrepresented. For example, imagine the KG represented "Einstein" correctly as a scientist but incorrectly related him to Biology. In this case, YAGO would be used as the reference for checking errors beyond class types. The error correction process is summarized in Algorithm 2, along with SPARQL queries. The set of DOLCE axioms A is generated using the following SPARQL query:

A = {
    PREFIX DOLCE
    SELECT DISTINCT ?subject ?object ?property
    WHERE { { ?subject DOLCE:type [] } UNION { [] DOLCE:type [] } }
}

Algorithm 2: KG Error Correction
Input: G{g1, .., gn}, the generated KG consisting of triples, and A{a1, .., an}, a set of DOLCE axioms
Output: corrected KG G′{g′1, .., g′n}
// search for disjointness axioms for each class
Foreach g in G.subject.Type
    Foreach a in A
        If (g rdf:disjointWith(a))
            // search if class g has a relation with a in the graph G
            PREFIX G
            SELECT ?property ?value AS e
            WHERE { ?g rdf:subClassOf a . ?d rdf:onProperty a }
            // if an erroneous relation is found, delete it from G
            If (e is not null)
                DELETE { ?property ?value }
                WHERE { ?property rdf:type e . ?property rdf:type a }
                Update G
            End if
        Else // no disjointness detected
            // use YAGO to check for erroneous info
            PREFIX YAGO
            SELECT ?property ?value ?subject ?object
            WHERE { ? rdf:predicate g }
            If (property value subject object != g.resources)
                Update G
            End if
        End if
    End foreach
End foreach

² The function on GitHub, LOF.

KG Completion: Many models have been used for KG completion. Among those models, Complex Embeddings (ComplEx) [62] has achieved promising results in comparison to other methods [63, 64]. ComplEx can be seen as an extension of DistMult to complex-valued embeddings [65]. ComplEx uses tensor factorization to model asymmetric relations. ComplEx is implemented as part of an embedding-methods project on GitHub used for graph completion tasks³. The project contains ComplEx as one of six powerful graph embedding methods, including TransE and HolE. We use the same settings created by the project in terms of the number of epochs, batch sizes, and optimization function. ComplEx is based on the Hermitian dot product, the complex counterpart of the standard dot product between real vectors. In ComplEx, the embedding is complex.
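The Hermitian-product scoring behind ComplEx can be sketched in a few lines. The embeddings below are hand-picked toy values, not trained parameters; they are chosen only to show that, unlike a purely real (DistMult-style) factorization, swapping head and tail changes the score:

```python
# ComplEx score sketch: f(h, r, t) = Re( sum_i h_i * r_i * conj(t_i) ).
# Because of the conjugate on t, the score is asymmetric in (h, t),
# which is what lets ComplEx model directed (asymmetric) relations.
import numpy as np

def complex_score(h, r, t):
    """Real part of the Hermitian dot product of h, r, and conj(t)."""
    return float(np.real(np.sum(h * r * np.conj(t))))

h = np.array([1 + 1j, 0.5 - 0.5j])
r = np.array([0 + 1j, 1 + 0j])  # an imaginary component in r breaks symmetry
t = np.array([1 - 1j, 0.5 + 0.5j])

print(complex_score(h, r, t))  # -2.0
print(complex_score(t, r, h))  #  2.0 -- reversing the triple flips the score
```

If r were purely real, the two directions would score identically; this asymmetry is the practical advantage of ComplEx over DistMult noted above.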
