Attribute Extraction Study in the Field of Military Equipment Based on Distant Supervision

Hindawi
Wireless Communications and Mobile Computing
Volume 2021, Article ID 2549488, 13 pages
https://doi.org/10.1155/2021/2549488

Research Article
Attribute Extraction Study in the Field of Military Equipment Based on Distant Supervision

Xindong You,1 Meijing Yang,1 Junmei Han,2 Jiangwei Ma,1 Gang Xiao,2 and Xueqiang Lv1
1Beijing Key Laboratory of Internet Culture Digital Dissemination, Beijing Information Science and Technology University, Beijing, China
2National Key Laboratory for Complex Systems Simulation, Institute of Systems Engineering, China

Correspondence should be addressed to Xueqiang Lv; lxq@bistu.edu.cn

Received 6 August 2021; Revised 24 September 2021; Accepted 23 October 2021; Published 23 November 2021

Academic Editor: Honghao Gao

Copyright 2021 Xindong You et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

The effective organization and utilization of military equipment data is an important cornerstone for constructing a knowledge system. Building a knowledge graph in the field of military equipment can effectively describe the relationships between entities and entity attribute information, so that relevant personnel can obtain information quickly and accurately. Attribute extraction is an important part of building the knowledge graph. Given the lack of annotated data in the field of military equipment, we propose a new data annotation method, which adopts the idea of distant supervision to automatically build the attribute extraction dataset. We convert the attribute extraction task into a sequence annotation task. At the same time, we propose a RoBERTa-BiLSTM-CRF-SEL-based attribute extraction method. Firstly, a list of attribute name synonyms is constructed; then, a corpus of military equipment attributes is obtained through automatic annotation of semistructured data in Baidu Encyclopedia.
RoBERTa is used to obtain the vector encoding of the text, which is then input into the entity boundary prediction layer to label the entity head and tail, and into the BiLSTM-CRF layer to predict the attribute label. The experimental results show that the proposed method can effectively perform attribute extraction in the military equipment domain. The F1 value of the model reaches 77% on the constructed attribute extraction dataset, which outperforms the current state-of-the-art models.

1. Introduction

With the continuous development of Internet technology, data from all walks of life is growing rapidly. Organizing these data through knowledge graph technology can effectively improve data utilization efficiency. In the military field, the construction of a knowledge graph is not only conducive to military commanders quickly and deeply understanding certain military equipment but can also be combined with the knowledge graph and intelligent systems for rapid intelligent decision-making assistance [1].

Attribute extraction is an important step in knowledge graph construction, which refers to extracting the attribute names and attribute values of entities from text data. Facing the large amount of text data in the military field, extracting attribute data automatically is one of the keys to constructing a military knowledge graph. Traditional attribute extraction methods are divided into rule-based methods and machine learning-based methods. Zhai and Qiu [2] proposed a rule-based knowledge meta-attribute extraction method based on phrase structure trees. Rule-based methods need rules set manually according to the data characteristics, so they migrate poorly to new domains. Jakob and Gurevych [3] fused multiple features and used conditional random fields [4] to extract attributes. However, machine learning methods require a large amount of labelled data and manual features. In recent years, deep learning methods have also been gradually applied to attribute extraction.
Toh and Su [5] used a bidirectional recurrent neural network (BRNN) combined with a conditional random field for attribute value extraction. Cheng et al. [6] used a bidirectional long short-term memory network (BiLSTM) combined with a gated dynamic attention mechanism for attribute extraction. However, attribute extraction

based on deep learning methods also requires a large amount of annotated data. In the field of weaponry, there is a lack of corresponding annotated datasets. Manual annotation is not only time-consuming, but the level of the annotator also largely affects the quality of the annotated corpus [7]. Through investigation, we found that Baidu Encyclopedia currently contains a large number of weapon and equipment entries. The encyclopedia pages hold a large amount of semistructured and unstructured data, which contains rich entity attribute information. We propose a new way of annotating attribute data based on the characteristics of the encyclopedia pages: we annotate the unstructured text data by distant supervision based on the InfoBox data of the encyclopedia pages. At the same time, we convert the attribute extraction task into a sequence annotation task and use the RoBERTa-BiLSTM-CRF-SEL method for attribute data extraction.

In summary, the contributions of this paper can be divided into the following three points.

(1) A new way of data annotation is proposed for the characteristics of encyclopedia data. In the annotation process, the subject is fixed according to the name of the encyclopedia page, and then its attributes and attribute values are annotated

(2) Based on Baidu Encyclopedia data, the military domain attribute extraction dataset is automatically constructed by using the idea of distant supervision

(3) RoBERTa-BiLSTM-CRF-SEL is designed for automatic attribute extraction in the field of weapons. The method obtains entity boundary features through the entity boundary prediction layer. The loss of the boundary prediction layer and the loss of the attribute prediction layer are weighted and summed as the loss value of the model. In this way, the entity recognition effect of the model is improved.
On the military equipment attribute extraction dataset, the F1 of the proposed method reaches 0.77, which is better than other existing methods

2. Related Work

Attribute extraction methods can be mainly classified into rule-based methods, machine learning-based methods, and deep learning-based methods. The rule-based approach needs rules formulated manually for specific situations. This approach is simple and usually oriented to specific domains; although it has a high accuracy rate, it has a small scope of application and is difficult to migrate to other domains. Methods based on machine learning are more flexible, but they need the support of hand-crafted features and large-scale datasets. Methods based on deep learning can automatically mine hidden features of texts through a neural network model, but they also require large amounts of labelled data for model training and optimization.

In the early studies of attribute extraction, scholars mainly formulated series of rules to extract attributes. Hu and Liu [8] extracted commodity attributes from customer reviews by frequent itemset feature extraction. Li et al. [9] presented an automatic method to obtain encyclopedia character attributes, in which the part-of-speech tagging of each attribute value was used to locate it in the encyclopedia free text; the rules were discovered by a statistical method, and the character attribute information was obtained from encyclopedia text by rule matching. Yu et al. [10] proposed an approach for extracting maritime information and converting unstructured text into structured data. Ding et al. [11] formed nine types of description rules for attribute extraction by manually constructing rules. They analyzed the quantitative relationship and emotional information of attribute descriptions and finally designed and implemented an academic concept attribute extraction system. Qiao et al. [12] suggested a rule-based character information extraction algorithm.
Based on the rules, they researched and developed a character information extraction system and finally realized the automatic extraction of semistructured character attribute information. Kang et al. [1] offered an unsupervised attribute triplet extraction method for the military equipment domain. According to the distribution law of attribute triples in sentences, this method adopts an attribute indicator extraction algorithm based on frequent pattern mining and completes the extraction of attribute triples by setting extraction rules and filtering rules.

Among machine learning-based attribute extraction methods, Zhang et al. [13] introduced word-level features into the CRF model and used domain dictionary knowledge as an aid for product attribute extraction. Xu et al. [14] introduced shallow syntactic information and heuristic location information and input them to CRF as features, which effectively improved the attribute extraction performance of the model. Gurumdimma et al. [15] presented an approach to extracting events based on the dependency parse tree relations of the text and its part of speech (POS); the proposed method uses a machine learning algorithm to predict events from text. Cheng et al. [16] broke through the limitation of current methods, which mainly perform statistical operations within the scope of single sentences when judging attribute attribution, and proposed a method of character attribute extraction that classifies from text down to sentence level with the guidance of text knowledge. Kambhatla [17] employed maximum entropy models to combine diverse lexical, syntactic, and semantic features derived from the text. References [18–20] suggested weakly supervised automatic extraction methods that use very little human participation to solve the problem of lacking a training corpus. Zhang et al. [21] offered a novel composite kernel for relation extraction.
The composite kernel consists of two individual kernels: an entity kernel that allows for entity-related features and a convolution parse tree kernel that models syntactic information of relation examples. Liu et al. [22] put forward a perceptron learning algorithm that fuses global and local features for attribute value extraction from unstructured text; the combination of features gives the model better feature representation ability. Li et al. [23] constructed three kinds of semantic information through word attributes, word dependencies, and word embeddings. The three kinds of semantic information are

combined with the conditional random field model to realize the extraction of commodity attributes.

In recent years, attribute extraction methods based on deep learning have gradually become mainstream. Wang et al. [24] regarded attribute extraction as a text sequence labelling task: the word sequences and lexical sequences are input into a GRU network, and then CRF is used for sequence label prediction. Xu et al. [25] considered that there is a gap between the meaning of a word in general and specialized domains; therefore, they input both word embeddings from the generic domain and word embeddings from the specialized domain into a convolutional neural network model, which decides which expression is preferable for attribute extraction. Given the low performance of slot filling methods applied to Chinese entity-attribute extraction at present, He et al. [26] presented a distant supervision relation extraction method based on a bidirectional long short-term memory neural network. Wei et al. [27] proposed an attribute-extraction-oriented class-convolutional interactive attention mechanism: the target sentence is first input into a bidirectional recurrent neural network to obtain the implicit representation of each word, and representation learning is then performed by the class-convolution interactive attention mechanism. To solve the problem that traditional information extraction methods give poor extraction results in the presence of long, difficult sentences and diverse natural language expressions, Wu et al. [28] introduced text simplification as a preprocessing step for extraction, in which text reduction is modeled as a sequence-to-sequence (seq2seq) translation process and implemented with the seq2seq-RNN model from the field of machine translation. Huang et al.
[29] proposed a different method, which uses an independent graph-based neural network as the input and is accompanied by two attention mechanisms to better capture indicative information. Cheng et al. [30] used the advantages of the CRF model in dealing with sequence labelling problems and realized automatic extraction of journal keywords by integrating part-of-speech information and the CRF model into a BiLSTM network. Luo et al. [31] proposed a new bidirectional dependency grammar tree to extract the dependency structure features of a given sentence, combined the extracted grammar features with the semantic features extracted using BiLSTM, and finally used CRF for attribute word annotation. Feng et al. [32] introduced an entity attribute value extraction method based on a machine reading comprehension model and crowdsourcing verification, owing to the high noise of Internet corpora; the attribute extraction task is transformed into a reading comprehension task. Luo et al. [33] introduced MLBiNet (multilayer bidirectional network), which integrates cross-sentence semantics and associated event information, thereby enhancing the discrimination of the events mentioned within. Xi et al. [34] presented a bidirectional entity-level decoder (BERD) to gradually generate argument role sequences for each entity.

To address the lack of annotation data in the military equipment domain, in this paper the attribute extraction dataset in the military equipment domain is automatically constructed based on distant supervision. The attribute annotation sequence is decoded by the RoBERTa model combined with the BiLSTM-CRF model, and an entity boundary prediction layer is added to improve the effect of entity recognition.

3. Attribute Extraction Methods Based on RoBERTa and Entity Boundary Prediction

The model proposed in this paper is mainly composed of a text coding layer, an entity boundary prediction layer, and a BiLSTM-CRF attribute prediction layer.
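Before walking through the three layers in detail, the end-to-end flow can be sketched in NumPy, with a random matrix standing in for the RoBERTa hidden states and a plain cross-entropy standing in for the CRF likelihood of Section 3.3; all names, dimensions, and weights here are illustrative assumptions, not the authors' code:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, n_tags = 8, 16, 5            # tokens, hidden size, attribute tags

H = rng.normal(size=(n, d))        # stand-in for RoBERTa hidden states

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Entity boundary prediction layer: two fully connected 0/1 heads.
W_start, W_end = rng.normal(size=(d,)), rng.normal(size=(d,))
p_start = sigmoid(H @ W_start)     # P(token is an entity head)
p_end = sigmoid(H @ W_end)         # P(token is an entity tail)

# Gold 0/1 sequences for one entity spanning tokens 2..4.
y_start, y_end = np.zeros(n), np.zeros(n)
y_start[2], y_end[4] = 1, 1

def bce(p, y):                     # binary cross-entropy per sequence
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

loss_start, loss_end = bce(p_start, y_start), bce(p_end, y_end)

# Attribute prediction layer: the boundary sequences are spliced onto
# the text vectors as extra features before tag prediction (the real
# model decodes these features with BiLSTM-CRF).
features = np.concatenate([H, p_start[:, None], p_end[:, None]], axis=1)
W_tag = rng.normal(size=(d + 2, n_tags))
logits = features @ W_tag
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
y_tags = rng.integers(0, n_tags, size=n)   # stand-in gold attribute labels
loss_attr = -np.mean(np.log(probs[np.arange(n), y_tags]))

# Weighted sum of the three loss values, as in Equation (13).
alpha, beta, gamma = 0.3, 0.3, 0.4
loss = alpha * loss_start + beta * loss_end + gamma * loss_attr
print(round(float(loss), 4))
```

The splice of `p_start`/`p_end` onto `H` mirrors how the paper feeds boundary features into the attribute predictor.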
We first encode the input text through RoBERTa [35] to obtain its hidden layer state vectors. Then, these are input into the entity boundary prediction layer and the BiLSTM-CRF attribute prediction layer, respectively. At the entity boundary prediction layer, the 0/1 coding method is used to label the entity head and tail, respectively, and then the start loss and end loss of the two sequence labels are calculated. In the BiLSTM-CRF attribute prediction layer, we take the output of the entity boundary prediction layer as a feature and splice it with the text vector. The spliced result is input into BiLSTM-CRF to predict the text attribute tags, and its loss value att_loss is calculated. Finally, in model optimization, we consider the three loss values together, weight and sum them, and achieve the overall optimization of the model by backpropagation. The model structure diagram is shown in Figure 1.

3.1. Text Encoding Layer. BERT is a pretrained language model proposed by Google in 2018. BERT uses the bidirectional transformer structure as the main framework of the algorithm, which can capture the bidirectional relations in utterances more thoroughly. BERT uses a self-supervised approach to train the model on a massive corpus, which can learn good feature representations for words. Therefore, BERT has achieved good results in several downstream tasks such as text classification and sequence annotation. The RoBERTa model is an improved version of the BERT model. Compared with BERT, RoBERTa improves both the training data and the training methods and pretrains the model more adequately.

In terms of training data, RoBERTa uses 160 GB of training text, while BERT only uses 16 GB. RoBERTa also uses the new CC-News dataset and confirms that using more data for pretraining can further improve the performance of downstream tasks. At the same time, RoBERTa has increased the batch size.
BERT uses a batch size of 256, while RoBERTa uses a larger batch size in the training process; researchers have tried batch sizes ranging from 256 to 8000. Liu et al. found through experiments that the performance of certain downstream tasks can be slightly improved after removing the NSP (next sentence prediction) loss. Therefore, in the training method, RoBERTa deletes the NSP task. In addition, unlike the static masking mechanism of BERT, RoBERTa uses a dynamic masking mechanism to randomly generate a new mask pattern every time. BERT relies on randomly masking and predicting tokens, and the original BERT implementation performs masking during data

preprocessing to obtain a static mask. RoBERTa uses a dynamic mask: each time a sequence is fed to the model, a new mask pattern is generated. In this way, over the continuous input of a large amount of data, the model gradually adapts to different masking strategies and learns different language representations. Byte-pair encoding (BPE) is a mixture of character-level and word-level representations and supports the processing of many common words in natural language corpora. The original BERT implementation uses a character-level BPE vocabulary with a size of 30K, which is learned after preprocessing the input using heuristic word segmentation rules. Facebook researchers did not adopt this approach but instead used a larger byte-level BPE vocabulary to train BERT, containing 50K subword units, without any additional preprocessing or word segmentation of the input. Compared with BERT, RoBERTa makes small improvements in each part of model training, and the combination of these improvements effectively improves the model.

We use HIT's open-source Chinese RoBERTa to encode the input text and obtain its hidden layer state vectors.

[Figure 1: Structure diagram of the attribute extraction model. RoBERTa encodes an example sentence ("On November 11-1989, the USS Abraham Lincoln was officially commissioned at Naval Station Norfolk and integrated into the American Atlantic Fleet."); the hidden layer state vectors feed the entity header sequence (e.g., 1 0 0 0 ...) and entity tail sequence (e.g., 0 ... 1 ...) and the BiLSTM layers, yielding triples such as (USS Abraham Lincoln, nation, American) and (USS Abraham Lincoln, service, November 11-1989).]

3.2. Entity Boundary Prediction Layer.
In the constructed military equipment attribute extraction dataset, entity names are generally long, such as "65-type 82mm recoilless gun" and "105mm 6×6 wheeled armored assault gun." To avoid fuzzy entity boundary recognition in the process of attribute extraction, the entity boundary prediction layer is added for entity boundary recognition. In this layer, the hidden layer state vectors output by RoBERTa are input to a fully connected layer to generate two 0/1 annotation sequences. One annotation sequence is for entity heads, in which 1 represents an entity head and 0 a non-head token; the other is for entity tails, where 1 represents an entity tail and 0 a non-tail token.

After obtaining the two sequence labels, we compare them with the correct labels and calculate the loss values of entity head sequence recognition and entity tail sequence recognition. Meanwhile, to further exploit the boundary information of entities, we take the entity head sequence and entity tail sequence as features and splice them with the hidden layer state vectors output by RoBERTa. The spliced result is then input to the BiLSTM-CRF layer for attribute prediction.

3.3. BiLSTM-CRF Attribute Prediction Layer. In the attribute prediction layer, we use a classical sequence labelling structure, BiLSTM-CRF, for the identification of attribute value labels. The long short-term memory network LSTM [36] is

a temporal recurrent neural network, which can better capture longer-distance dependencies in text. The LSTM model structure is shown in Figure 2.

There are three inputs to the LSTM: the hidden layer state vector h_{t-1} at the previous moment, the cell state C_{t-1} at the previous moment, and the input x_t at the current moment. Inside the LSTM, the retention and forgetting of information are decided by three gating mechanisms. The first is the forget gate, which decides what information to discard from the cell state: it reads h_{t-1} and x_t and outputs values between 0 and 1 that decide which information in C_{t-1} to keep and which to discard, where 1 means fully retained and 0 means fully discarded. The input gate decides which new information is added to the cell state, and the output gate decides which data in the cell state will be output. The calculation formulas of the LSTM model are shown in Equations (1)–(6):

f_t = σ(W_f · [h_{t-1}, x_t] + b_f),  (1)
i_t = σ(W_i · [h_{t-1}, x_t] + b_i),  (2)
o_t = σ(W_o · [h_{t-1}, x_t] + b_o),  (3)
C̃_t = tanh(W_C · [h_{t-1}, x_t] + b_C),  (4)
C_t = f_t · C_{t-1} + i_t · C̃_t,  (5)
h_t = o_t · tanh(C_t).  (6)

An LSTM can only encode information in one direction. To effectively use the context information, we use a bidirectional LSTM structure for encoding. By computing the hidden layer vector outputs of the LSTM in both the forward and backward directions and splicing them together, the hidden layer state vector of the BiLSTM is finally obtained:

→h_t = LSTM(→h_{t-1}, w_t),  (7)
←h_t = LSTM(←h_{t-1}, w_t),  (8)
h_t = concat(→h_t, ←h_t).  (9)

The conditional random field is a conditional probability distribution model of an output sequence Y = (Y_1, Y_2, …, Y_n) given a set of input variables X = (X_1, X_2, …, X_n).
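The gate equations (1)–(6) and the bidirectional splicing (7)–(9) can be checked with a small NumPy sketch; the dimensions and random weights below are arbitrary stand-ins, not trained values:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(h_prev, c_prev, x, W, b):
    """One LSTM step following Equations (1)-(6).

    W maps the concatenation [h_prev, x] to the four gate
    pre-activations, stacked as (f, i, o, c~) along the first axis."""
    z = W @ np.concatenate([h_prev, x]) + b
    d = h_prev.size
    f = sigmoid(z[0*d:1*d])          # forget gate, Eq. (1)
    i = sigmoid(z[1*d:2*d])          # input gate,  Eq. (2)
    o = sigmoid(z[2*d:3*d])          # output gate, Eq. (3)
    c_tilde = np.tanh(z[3*d:4*d])    # candidate cell state, Eq. (4)
    c = f * c_prev + i * c_tilde     # Eq. (5)
    h = o * np.tanh(c)               # Eq. (6)
    return h, c

rng = np.random.default_rng(0)
d_in, d_hid, T = 4, 3, 5
xs = [rng.normal(size=d_in) for _ in range(T)]
W = rng.normal(size=(4 * d_hid, d_hid + d_in))
b = np.zeros(4 * d_hid)

def run(seq):                        # one unidirectional pass
    h, c = np.zeros(d_hid), np.zeros(d_hid)
    hs = []
    for x in seq:
        h, c = lstm_step(h, c, x, W, b)
        hs.append(h)
    return hs

fwd = run(xs)                        # forward pass, Eq. (7)
bwd = run(xs[::-1])[::-1]            # backward pass, Eq. (8)
H = [np.concatenate([f, g]) for f, g in zip(fwd, bwd)]  # Eq. (9)
print(H[0].shape)                    # → (6,): each h_t has size 2*d_hid
```

A real BiLSTM learns separate forward and backward weights; sharing `W` here only keeps the sketch short.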
CRF is a serialization annotation algorithm that can consider the dependencies between tags to obtain the globally optimal tag sequence. For a label prediction sequence y, the scoring formula is shown in Equation (10):

score(x, y) = Σ_{i=0}^{n} A_{y_i, y_{i+1}} + Σ_{i=1}^{n} P_{i, y_i}.  (10)

Here, P is an n × m matrix, where m is the number of labels to be predicted and P_{i,j} represents the possibility that input i takes label j; A is the transition matrix, and A_{i,j} represents the probability of transitioning from label i to label j.

Therefore, over the set Y_x of all possible prediction sequences for the input sequence X, the conditional probability is as shown in Equation (11):

P(y | x) = e^{score(x, y)} / Σ_{ỹ ∈ Y_x} e^{score(x, ỹ)}.  (11)

In training, we optimize the model by maximizing the log-likelihood of the correct output label sequence under Equation (11). For prediction, we select the sequence with the highest score as the best prediction sequence, calculated as shown in Equation (12):

y* = arg max_{ỹ ∈ Y_x} score(x, ỹ).  (12)

Take a sentence in the dataset as an example: in "On November 11-1989, the USS Abraham Lincoln was officially commissioned at Naval Station Norfolk and integrated into the American Atlantic Fleet," "November 11-1989" would be marked as "B-FY," "USS" as "B-ST," "Abraham" as "I-ST," "Lincoln" as "I-ST," and "American" as "B-GJ" (please refer to Chapter 4 for label meanings).

3.4. Loss Value Calculation. For the loss value, we take the weighted sum of the entity boundary loss values and the attribute identification loss value as the final loss value, which is used to optimize the overall parameters of the model (as shown in Figure 3). The loss value calculation formula is given in Equation (13), where loss_start and loss_end represent the loss values of entity head recognition and entity tail recognition, respectively, and loss_attribute represents the loss value generated by the attribute sequence labelling.
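Equations (10)–(12) can be verified by brute force on a toy example, enumerating every tag sequence; the matrices below are arbitrary stand-ins, the boundary (start/stop) transitions of Equation (10) are omitted for brevity, and a real implementation would use the forward algorithm and Viterbi decoding instead of enumeration:

```python
import itertools
import numpy as np

n, m = 3, 2                          # sequence length, number of labels
rng = np.random.default_rng(0)
P = rng.normal(size=(n, m))          # emission scores P[i, j]
A = rng.normal(size=(m, m))          # transition scores A[i, j]

def score(y):
    """Equation (10): transition scores plus emission scores."""
    s = sum(A[y[i], y[i + 1]] for i in range(n - 1))
    return s + sum(P[i, y[i]] for i in range(n))

paths = list(itertools.product(range(m), repeat=n))
Z = sum(np.exp(score(y)) for y in paths)   # partition function

def prob(y):
    """Equation (11): softmax of path scores over all paths in Y_x."""
    return np.exp(score(y)) / Z

best = max(paths, key=score)               # Equation (12)
assert abs(sum(prob(y) for y in paths) - 1.0) < 1e-9
print(best, round(float(prob(best)), 4))
```

Enumeration costs O(m^n); the forward algorithm and Viterbi recursion bring Z and the argmax down to O(n·m²).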
α, β, γ ∈ [0, 1] are hyperparameters that control the weighted summation of the three loss values:

loss = α · loss_start + β · loss_end + γ · loss_attribute.  (13)

4. Experimental Results and Analysis

4.1. Acquisition of Military Equipment Attribute Data

4.1.1. Data Acquisition. The experimental data came from the Baidu Encyclopedia website (https://baike.baidu.com/), and the data acquisition process is shown in Figure 4. We cannot directly obtain military-related terms from Baidu Encyclopedia, because the website does not classify and index its entries. The military channel of http://globe.com/ has a summary display of various types of weapons and equipment, so we get the names of various military equipment from that channel. Then, we expand the rules, splice the names with the links of encyclopedia entries, and finally get the URL links of the required military equipment-related entries. After obtaining the links of military equipment entries in Baidu Encyclopedia, we analyzed

[Figure 2: LSTM structure diagram.]

[Figure 3: Calculation of loss value — the start, end, and attribute losses from the entity header sequence annotation layer, entity tail sequence annotation layer, and property annotation layer are combined as loss = α·loss_start + β·loss_end + γ·loss_attribute.]

the encyclopedia entry pages and found that the entries mainly consist of entry names, information boxes containing attribute data, and a large amount of unstructured text. We used a crawler to collect the InfoBox data and text data in the Baidu Encyclopedia entries of weapons and equipment, and finally collected 1757 encyclopedia pages of military equipment.

4.1.2. Data Annotation. Manual data annotation is not only time-consuming and laborious, but different annotators may also apply different annotation rules to the same piece of data. Therefore, automatic annotation of data has become a focus of current research. Encyclopedia entry data consists of two main parts: the attribute data in the information box and the unstructured text description. Taking the "Nimitz aircraft carrier" as an example, the entry's information box contains basic attributes such as "English Name," "Nation," "pretype/level," and "subtype/level." The text data is an introduction to the basic information of the "Nimitz aircraft carrier." Observing this text data, it can be seen that it contains textual expressions of the "English Name," "Nation," and other attribute values of the "Nimitz aircraft carrier."

Given this data characteristic, the data annotation in this paper is based on the distant supervision hypothesis [33]. The distant supervision hypothesis means that when there is a

[Figure 4: Flow chart of attribute data collection — weapon and equipment names from the Globe military channel (e.g., J-16 fighter aircraft, FC-1 "Dragon"/JF-17 "Thunderbolt" multirole attack aircraft, Su-27 fighter, Liaoning ship, "North Sea" (558) missile frigate) undergo rule expansion and URL splicing into Baidu Encyclopedia weaponry links, which are web-parsed into InfoBox data and text description data.]

relationship between two entities, all sentences containing the pair of entities are considered to express this relationship to some extent. Distant supervision provides labels for data with the help of external knowledge bases, saving the trouble of manual labelling [37]. Attributes can also be considered a type of relationship, so the distant supervision assumption can be applied to the annotation of attribute data. Taking the Nimitz aircraft carrier as an example, the information box in Figure 5 shows that the relationship between "Nimitz aircraft carrier" and the attribute value "United States" is the "Nation" attribute. Then, based on the distant supervision assumption, all sentences containing "Nimitz aircraft carrier" and "United States" can be labelled with the "Nation" attribute, for example, the sentences "Nimitz Aircraft Carrier (CVN-68) is the first ship of the Nimitz class aircraft carriers of the United States Navy" and "The Nimitz aircraft carrier started construction in June 1968. It was launched in May 1972 and delivered to the United States Navy in May 1975." Both of these sentences contain the words "Nimitz aircraft carrier" and "United States," and the triple (Nimitz aircraft carrier, nation, United States) can be considered to exist in these two sentences when labelling the data. Suppose a dataset D = {s_1, s_2, …, s_n}, where each s_i is a sentence of unstructured text. We train a model F such that F(s_i; θ) = [(e_i^t, e_j^t, e_k^t)], where θ represents the model parameters, and e_i^t, e_j^t, e_k^t represent the t-th entity triple and its corresponding relationship.
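The labelling step just described reduces to a substring alignment between knowledge-base triples and sentences; a minimal sketch, using the triple and example sentences from the text (the case-insensitive matching here is an implementation assumption, and plain string matching is exactly the source of the label noise distant supervision is known for):

```python
def distant_label(sentences, triples):
    """Label every sentence that contains both the entity and the
    attribute value of a knowledge-base (entity, attribute, value) triple."""
    labelled = []
    for s in sentences:
        low = s.lower()
        for entity, attribute, value in triples:
            if entity.lower() in low and value.lower() in low:
                labelled.append((s, (entity, attribute, value)))
    return labelled

triples = [("Nimitz aircraft carrier", "Nation", "United States")]
sentences = [
    "Nimitz Aircraft Carrier (CVN-68) is the first ship of the Nimitz "
    "class aircraft carriers of the United States Navy",
    "The Nimitz aircraft carrier started construction in June 1968. It was "
    "launched in May 1972 and delivered to the United States Navy in May 1975.",
]
data = distant_label(sentences, triples)
print(len(data))  # → 2: both sentences receive the "Nation" label
```

Each matched sentence then becomes one training instance for the sequence annotation task.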
The idea of the distant supervision algorithm is to use a knowledge base to align plain text for annotation and then perform supervised training.

However, the Baidu Encyclopedia website is an open knowledge platform, and the editors of entries are not fixed. Therefore, there is a lack of standardization and unity in the naming of attributes, which leads to a variety of expressions for the same attribute. Since data in the military field has a certain degree of confidentiality, the field itself suffers from data sparsity. Different attribute expressions lead to a variety of data labels; if the labels are too scattered, the annotation data for each type of attribute will be small, making it difficult to obtain a good attribute extraction effect. To merge the expressions of multiple attribute labels, we count the distribution of attribute names and select high-frequency words as attribute names. The attribute expressions present in the military equipment data of the encyclopedia website were merged by manual means, and a synonym table of military equipment attribute names was constructed.
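The counting-and-merging step can be sketched as follows; the synonym table in the paper is built manually from the encyclopedia data, so the raw names and synonym entries below are invented purely for illustration:

```python
from collections import Counter

# Raw attribute names as scraped from InfoBoxes (illustrative values).
raw_names = ["Nation", "Country", "Country of origin", "Nation",
             "English Name", "English name", "Nation"]

# Hypothetical synonym table: variant spelling -> canonical
# high-frequency attribute name.
synonyms = {
    "Country": "Nation",
    "Country of origin": "Nation",
    "English name": "English Name",
}

# Map every raw name to its canonical form, then count the distribution
# to confirm which attribute names are frequent enough to keep.
canonical = [synonyms.get(name, name) for name in raw_names]
freq = Counter(canonical)
print(freq.most_common(2))  # → [('Nation', 5), ('English Name', 2)]
```

Merging variants this way concentrates the automatically produced labels onto a small set of canonical attributes, which mitigates the sparsity problem described above.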

