Attribute Extraction Study in the Field of Military Equipment Based on Distant Supervision

Hindawi
Wireless Communications and Mobile Computing
Volume 2021, Article ID 2549488, 13 pages
https://doi.org/10.1155/2021/2549488

Research Article
Attribute Extraction Study in the Field of Military Equipment Based on Distant Supervision

Xindong You,1 Meijing Yang,1 Junmei Han,2 Jiangwei Ma,1 Gang Xiao,2 and Xueqiang Lv1
1Beijing Key Laboratory of Internet Culture Digital Dissemination, Beijing Information Science and Technology University, Beijing, China
2National Key Laboratory for Complex Systems Simulation, Institute of Systems Engineering, China

Correspondence should be addressed to Xueqiang Lv; lxq@bistu.edu.cn

Received 6 August 2021; Revised 24 September 2021; Accepted 23 October 2021; Published 23 November 2021

Academic Editor: Honghao Gao

Copyright 2021 Xindong You et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

The effective organization and utilization of military equipment data is an important cornerstone for constructing a knowledge system. Building a knowledge graph in the field of military equipment can effectively describe the relationships between entities and entity attribute information, so that relevant personnel can obtain information quickly and accurately. Attribute extraction is an important part of building the knowledge graph. Given the lack of annotated data in the field of military equipment, we propose a new data annotation method, which adopts the idea of distant supervision to automatically build the attribute extraction dataset. We convert the attribute extraction task into a sequence annotation task. At the same time, we propose a RoBERTa-BiLSTM-CRF-SEL-based attribute extraction method. Firstly, a list of attribute name synonyms is constructed; then, a corpus of military equipment attributes is obtained through automatic annotation of semistructured data in Baidu Encyclopedia.
RoBERTa is used to obtain the vector encoding of the text, which is then input into the entity boundary prediction layer to label the entity head and tail, and into the BiLSTM-CRF layer to predict the attribute label. The experimental results show that the proposed method can effectively perform attribute extraction in the military equipment domain. The F1 value of the model reaches 77% on the constructed attribute extraction dataset, which outperforms the current state-of-the-art models.

1. Introduction

With the continuous development of Internet technology, data from all walks of life is growing rapidly. Organizing these data through knowledge graph technology can effectively improve data utilization efficiency. In the military field, the construction of a knowledge graph is not only conducive to military commanders quickly and deeply understanding certain military equipment but can also be combined with the knowledge graph and intelligent systems for rapid intelligent decision-making assistance [1].

Attribute extraction is an important step in knowledge graph construction, which refers to extracting the attribute names and attribute values of entities from text data. Facing the large amount of text data in the military field, extracting attribute data automatically is one of the keys to constructing a military knowledge graph. Traditional attribute extraction methods are divided into rule-based methods and machine learning-based methods. Zhai and Qiu [2] proposed a rule-based knowledge meta-attribute extraction method based on phrase structure trees. Rule-based methods need rules set manually according to the data characteristics, so they migrate poorly to new domains. Jakob and Gurevych [3] fused multiple features and used conditional random fields [4] to extract attributes. However, machine learning methods require a large amount of labelled data and manual features. In recent years, deep learning methods have also been gradually applied to attribute extraction.
Toh and Su [5] used a bidirectional recurrent neural network (BRNN) combined with a conditional random field for attribute value extraction. Cheng et al. [6] used a bidirectional long short-term memory network (BiLSTM) combined with a gated dynamic attention mechanism for attribute extraction. However, attribute extraction

based on deep learning methods also requires a large amount of annotated data. In the field of weaponry, there is a lack of corresponding annotated datasets. Manual annotation is not only time-consuming, but the level of the annotator also largely affects the quality of the annotated corpus [7]. Through investigation, we found that Baidu Encyclopedia currently contains a large number of weapon and equipment entries. The encyclopedia pages hold a large amount of semistructured and unstructured data, which contains rich entity attribute information. We propose a new way of annotating attribute data based on the characteristics of the encyclopedia pages: we annotate the unstructured text data by distant supervision based on the InfoBox data of the encyclopedia pages. At the same time, we convert the attribute extraction task into a sequence annotation task and use the RoBERTa-BiLSTM-CRF-SEL method for attribute data extraction.

In summary, the contributions of this paper can be divided into the following three points.

(1) A new way of data annotation is proposed for the characteristics of encyclopedia data. In the annotation process, the subject is fixed according to the name of the encyclopedia page, and then its attributes and attribute values are annotated

(2) Based on Baidu Encyclopedia data, the military domain attribute extraction dataset is automatically constructed by using the idea of distant supervision

(3) RoBERTa-BiLSTM-CRF-SEL is designed for automatic attribute extraction in the field of weapons. The method obtains entity boundary features through the entity boundary prediction layer. The loss of the boundary prediction layer and the loss of the attribute prediction layer are weighted and summed as the loss value of the model. In this way, the entity recognition effect of the model is improved.
On the military equipment attribute extraction dataset, the F1 of the proposed method reaches 0.77, which is better than other existing methods

2. Related Work

Attribute extraction methods can be mainly classified into rule-based methods, machine learning-based methods, and deep learning-based methods. The rule-based approach needs rules formulated manually for specific situations. This approach is simple and usually oriented to specific domains; although it has a high accuracy rate, it has a small scope of application and is difficult to migrate to other domains. Methods based on machine learning are more flexible, but they need the support of hand-crafted features and large-scale datasets. Methods based on deep learning can automatically mine hidden features of texts through a neural network model, but they also require large amounts of labelled data for model training and optimization.

In the early studies of attribute extraction, scholars mainly formulated series of rules to extract attributes. Hu and Liu [8] extracted commodity attributes from customer reviews by frequent itemset feature extraction. Li et al. [9] presented an automatic method to obtain encyclopedia character attributes, in which the part-of-speech tagging of each attribute value was used to locate it in the encyclopedia free text; the rules were discovered by a statistical method, and the character attribute information was obtained from encyclopedia text by rule matching. Yu et al. [10] proposed an approach for extracting maritime information and converting unstructured text into structured data. Ding et al. [11] formed nine types of description rules for attribute extraction by manually constructing rules. They analyzed the quantitative relationship and emotional information of attribute descriptions and finally designed and implemented an academic concept attribute extraction system. Qiao et al. [12] suggested a rule-based character information extraction algorithm.
Based on the rules, they researched and developed a character information extraction system and finally realized the automatic extraction of semistructured character attribute information. Kang et al. [1] offered an unsupervised attribute triplet extraction method for the military equipment domain. According to the distribution law of attribute triples in sentences, this method adopts an attribute indicator extraction algorithm based on frequent pattern mining and completes the extraction of attribute triples by setting extraction rules and filtering rules.

Among machine learning-based attribute extraction methods, Zhang et al. [13] introduced word-level features into the CRF model and used domain dictionary knowledge as an aid for product attribute extraction. Xu et al. [14] introduced shallow syntactic information and heuristic location information and input them to CRF as features, which effectively improved the attribute extraction performance of the model. Gurumdimma et al. [15] presented an approach to extracting events based on the dependency parse tree relations of the text and its part of speech (POS); the proposed method uses a machine learning algorithm to predict events from text. Cheng et al. [16] broke through the limitation of current methods, which mainly perform statistical operations within the scope of single sentences when judging attribute attribution, and proposed a method of character attribute extraction that classifies from text down to sentence level with the guidance of text knowledge. Kambhatla [17] employed maximum entropy models to combine diverse lexical, syntactic, and semantic features derived from the text. References [18–20] suggested weakly supervised automatic extraction methods that use very little human participation to solve the problem of lacking a training corpus. Zhang et al. [21] offered a novel composite kernel for relation extraction.
The composite kernel consists of two individual kernels: an entity kernel that allows for entity-related features and a convolution parse tree kernel that models syntactic information of relation examples. Liu et al. [22] put forward a perceptron learning algorithm that fuses global and local features for attribute value extraction from unstructured text; the combination of features gives the model better feature representation ability. Li et al. [23] constructed three kinds of semantic information through word attributes, word dependencies, and word embeddings. The three kinds of semantic information are

combined with the conditional random field model to realize the extraction of commodity attributes.

In recent years, attribute extraction methods based on deep learning have gradually become mainstream. Wang et al. [24] regarded attribute extraction as a text sequence labelling task: the word sequences and lexical sequences are input into a GRU network, and then CRF is used for sequence label prediction. Xu et al. [25] considered that there is a gap between the meaning of a word in general and specialized domains; therefore, they input both word embeddings from the generic domain and word embeddings from the specialized domain into a convolutional neural network model, which decides which expression is preferable for attribute extraction. Given the low performance of slot filling methods applied to Chinese entity-attribute extraction at present, He et al. [26] presented a distant supervision relation extraction method based on a bidirectional long short-term memory neural network. Wei et al. [27] proposed an attribute-extraction-oriented class-convolutional interactive attention mechanism: the target sentence is first input into a bidirectional recurrent neural network to obtain the implicit representation of each word, and representation learning is then performed by the class-convolution interactive attention mechanism. To solve the problem that traditional information extraction methods give poor extraction results in the presence of long, difficult sentences and diverse natural language expressions, Wu et al. [28] introduced text simplification as a preprocessing step for extraction, in which text reduction is modeled as a sequence-to-sequence (seq2seq) translation process and implemented with the seq2seq-RNN model from the field of machine translation. Huang et al.
[29] proposed a different method, which uses an independent graph-based neural network as the input and is accompanied by two attention mechanisms to better capture indicative information. Cheng et al. [30] used the advantages of the CRF model in dealing with sequence labelling problems and realized automatic extraction of journal keywords by integrating part-of-speech information and the CRF model into a BiLSTM network. Luo et al. [31] proposed a new bidirectional dependency grammar tree to extract the dependency structure features of a given sentence, combined the extracted grammar features with the semantic features extracted using BiLSTM, and finally used CRF for attribute word annotation. Feng et al. [32] introduced an entity attribute value extraction method based on a machine reading comprehension model and crowdsourcing verification, owing to the high noise of Internet corpora; the attribute extraction task is transformed into a reading comprehension task. Luo et al. [33] introduced MLBiNet (multilayer bidirectional network), which integrates cross-sentence semantics and associated event information, thereby enhancing the discrimination of the events mentioned within. Xi et al. [34] presented a bidirectional entity-level decoder (BERD) to gradually generate argument role sequences for each entity.

To address the lack of annotation data in the military equipment domain, in this paper the attribute extraction dataset in the military equipment domain is automatically constructed based on distant supervision. The attribute annotation sequence is decoded by the RoBERTa model combined with the BiLSTM-CRF model, and an entity boundary prediction layer is added to improve the effect of entity recognition.

3. Attribute Extraction Methods Based on RoBERTa and Entity Boundary Prediction

The model proposed in this paper is mainly composed of a text coding layer, an entity boundary prediction layer, and a BiLSTM-CRF attribute prediction layer.
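Before walking through the three layers in detail, the end-to-end flow can be sketched in NumPy, with a random matrix standing in for the RoBERTa hidden states and a plain cross-entropy standing in for the CRF likelihood of Section 3.3; all names, dimensions, and weights here are illustrative assumptions, not the authors' code:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, n_tags = 8, 16, 5            # tokens, hidden size, attribute tags

H = rng.normal(size=(n, d))        # stand-in for RoBERTa hidden states

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Entity boundary prediction layer: two fully connected 0/1 heads.
W_start, W_end = rng.normal(size=(d,)), rng.normal(size=(d,))
p_start = sigmoid(H @ W_start)     # P(token is an entity head)
p_end = sigmoid(H @ W_end)         # P(token is an entity tail)

# Gold 0/1 sequences for one entity spanning tokens 2..4.
y_start, y_end = np.zeros(n), np.zeros(n)
y_start[2], y_end[4] = 1, 1

def bce(p, y):                     # binary cross-entropy per sequence
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

loss_start, loss_end = bce(p_start, y_start), bce(p_end, y_end)

# Attribute prediction layer: the boundary sequences are spliced onto
# the text vectors as extra features before tag prediction (the real
# model decodes these features with BiLSTM-CRF).
features = np.concatenate([H, p_start[:, None], p_end[:, None]], axis=1)
W_tag = rng.normal(size=(d + 2, n_tags))
logits = features @ W_tag
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
y_tags = rng.integers(0, n_tags, size=n)   # stand-in gold attribute labels
loss_attr = -np.mean(np.log(probs[np.arange(n), y_tags]))

# Weighted sum of the three loss values, as in Equation (13).
alpha, beta, gamma = 0.3, 0.3, 0.4
loss = alpha * loss_start + beta * loss_end + gamma * loss_attr
print(round(float(loss), 4))
```

The splice of `p_start`/`p_end` onto `H` mirrors how the paper feeds boundary features into the attribute predictor.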
We first encode the input text through RoBERTa [35] to obtain its hidden layer state vectors. Then, these are input into the entity boundary prediction layer and the BiLSTM-CRF attribute prediction layer, respectively. At the entity boundary prediction layer, the 0/1 coding method is used to label the entity head and tail, respectively, and then the start loss and end loss of the two sequence labels are calculated. In the BiLSTM-CRF attribute prediction layer, we take the output of the entity boundary prediction layer as a feature and splice it with the text vector. The spliced result is input into BiLSTM-CRF to predict the text attribute tags, and its loss value att_loss is calculated. Finally, in model optimization, we consider the three loss values together, weight and sum them, and achieve the overall optimization of the model by backpropagation. The model structure diagram is shown in Figure 1.

3.1. Text Encoding Layer. BERT is a pretrained language model proposed by Google in 2018. BERT uses the bidirectional transformer structure as the main framework of the algorithm, which can capture the bidirectional relations in utterances more thoroughly. BERT uses a self-supervised approach to train the model on a massive corpus, which can learn good feature representations for words. Therefore, BERT has achieved good results in several downstream tasks such as text classification and sequence annotation. The RoBERTa model is an improved version of the BERT model. Compared with BERT, RoBERTa improves both the training data and the training methods and pretrains the model more adequately.

In terms of training data, RoBERTa uses 160 GB of training text, while BERT only uses 16 GB. RoBERTa also uses the new CC-News dataset and confirms that using more data for pretraining can further improve the performance of downstream tasks. At the same time, RoBERTa has increased the batch size.
BERT uses a batch size of 256, while RoBERTa uses a larger batch size in the training process; researchers have tried batch sizes ranging from 256 to 8000. Liu et al. found through experiments that the performance of certain downstream tasks can be slightly improved after removing the NSP (next sentence prediction) loss. Therefore, in the training method, RoBERTa deletes the NSP task. In addition, unlike the static masking mechanism of BERT, RoBERTa uses a dynamic masking mechanism to randomly generate a new mask pattern every time. BERT relies on randomly masking and predicting tokens, and the original BERT implementation performs masking during data

preprocessing to obtain a static mask. RoBERTa uses a dynamic mask: each time a sequence is fed to the model, a new mask pattern is generated. In this way, over the continuous input of a large amount of data, the model gradually adapts to different masking strategies and learns different language representations. Byte-pair encoding (BPE) is a mixture of character-level and word-level representations and supports the processing of many common words in natural language corpora. The original BERT implementation uses a character-level BPE vocabulary with a size of 30K, which is learned after preprocessing the input using heuristic word segmentation rules. Facebook researchers did not adopt this approach but instead used a larger byte-level BPE vocabulary to train BERT, containing 50K subword units, without any additional preprocessing or word segmentation of the input. Compared with BERT, RoBERTa makes small improvements in each part of model training, and the combination of these improvements effectively improves the model.

We use HIT's open-source Chinese RoBERTa to encode the input text and obtain its hidden layer state vectors.

[Figure 1: Structure diagram of the attribute extraction model. RoBERTa encodes an example sentence ("On November 11-1989, the USS Abraham Lincoln was officially commissioned at Naval Station Norfolk and integrated into the American Atlantic Fleet."); the hidden layer state vectors feed the entity header sequence (e.g., 1 0 0 0 ...) and entity tail sequence (e.g., 0 ... 1 ...) and the BiLSTM layers, yielding triples such as (USS Abraham Lincoln, nation, American) and (USS Abraham Lincoln, service, November 11-1989).]

3.2. Entity Boundary Prediction Layer.
In the constructed military equipment attribute extraction dataset, entity names are generally long, such as "65-type 82mm recoilless gun" and "105mm 6×6 wheeled armored assault gun." To avoid fuzzy entity boundary recognition in the process of attribute extraction, the entity boundary prediction layer is added for entity boundary recognition. In this layer, the hidden layer state vectors output by RoBERTa are input to a fully connected layer to generate two 0/1 annotation sequences. One annotation sequence is for entity heads, in which 1 represents an entity head and 0 a non-head token; the other is for entity tails, where 1 represents an entity tail and 0 a non-tail token.

After obtaining the two sequence labels, we compare them with the correct labels and calculate the loss values of entity head sequence recognition and entity tail sequence recognition. Meanwhile, to further exploit the boundary information of entities, we take the entity head sequence and entity tail sequence as features and splice them with the hidden layer state vectors output by RoBERTa. The spliced result is then input to the BiLSTM-CRF layer for attribute prediction.

3.3. BiLSTM-CRF Attribute Prediction Layer. In the attribute prediction layer, we use a classical sequence labelling structure, BiLSTM-CRF, for the identification of attribute value labels. The long short-term memory network LSTM [36] is

a temporal recurrent neural network, which can better capture longer-distance dependencies in text. The LSTM model structure is shown in Figure 2.

There are three inputs to the LSTM: the hidden layer state vector h_{t-1} at the previous moment, the cell state C_{t-1} at the previous moment, and the input x_t at the current moment. Inside the LSTM, the retention and forgetting of information are decided by three gating mechanisms. The first is the forget gate, which decides what information to discard from the cell state: it reads h_{t-1} and x_t and outputs values between 0 and 1 that decide which information in C_{t-1} to keep and which to discard, where 1 means fully retained and 0 means fully discarded. The input gate decides which new information is added to the cell state, and the output gate decides which data in the cell state will be output. The calculation formulas of the LSTM model are shown in Equations (1)–(6):

f_t = σ(W_f · [h_{t-1}, x_t] + b_f),  (1)
i_t = σ(W_i · [h_{t-1}, x_t] + b_i),  (2)
o_t = σ(W_o · [h_{t-1}, x_t] + b_o),  (3)
C̃_t = tanh(W_C · [h_{t-1}, x_t] + b_C),  (4)
C_t = f_t · C_{t-1} + i_t · C̃_t,  (5)
h_t = o_t · tanh(C_t).  (6)

An LSTM can only encode information in one direction. To effectively use the context information, we use a bidirectional LSTM structure for encoding. By computing the hidden layer vector outputs of the LSTM in both the forward and backward directions and splicing them together, the hidden layer state vector of the BiLSTM is finally obtained:

→h_t = LSTM(→h_{t-1}, w_t),  (7)
←h_t = LSTM(←h_{t-1}, w_t),  (8)
h_t = concat(→h_t, ←h_t).  (9)

The conditional random field is a conditional probability distribution model of an output sequence Y = (Y_1, Y_2, …, Y_n) given a set of input variables X = (X_1, X_2, …, X_n).
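The gate equations (1)–(6) and the bidirectional splicing (7)–(9) can be checked with a small NumPy sketch; the dimensions and random weights below are arbitrary stand-ins, not trained values:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(h_prev, c_prev, x, W, b):
    """One LSTM step following Equations (1)-(6).

    W maps the concatenation [h_prev, x] to the four gate
    pre-activations, stacked as (f, i, o, c~) along the first axis."""
    z = W @ np.concatenate([h_prev, x]) + b
    d = h_prev.size
    f = sigmoid(z[0*d:1*d])          # forget gate, Eq. (1)
    i = sigmoid(z[1*d:2*d])          # input gate,  Eq. (2)
    o = sigmoid(z[2*d:3*d])          # output gate, Eq. (3)
    c_tilde = np.tanh(z[3*d:4*d])    # candidate cell state, Eq. (4)
    c = f * c_prev + i * c_tilde     # Eq. (5)
    h = o * np.tanh(c)               # Eq. (6)
    return h, c

rng = np.random.default_rng(0)
d_in, d_hid, T = 4, 3, 5
xs = [rng.normal(size=d_in) for _ in range(T)]
W = rng.normal(size=(4 * d_hid, d_hid + d_in))
b = np.zeros(4 * d_hid)

def run(seq):                        # one unidirectional pass
    h, c = np.zeros(d_hid), np.zeros(d_hid)
    hs = []
    for x in seq:
        h, c = lstm_step(h, c, x, W, b)
        hs.append(h)
    return hs

fwd = run(xs)                        # forward pass, Eq. (7)
bwd = run(xs[::-1])[::-1]            # backward pass, Eq. (8)
H = [np.concatenate([f, g]) for f, g in zip(fwd, bwd)]  # Eq. (9)
print(H[0].shape)                    # → (6,): each h_t has size 2*d_hid
```

A real BiLSTM learns separate forward and backward weights; sharing `W` here only keeps the sketch short.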
CRF is a serialization annotation algorithm that can consider the dependencies between tags to obtain the globally optimal tag sequence. For a label prediction sequence y, the scoring formula is shown in Equation (10):

score(x, y) = Σ_{i=0}^{n} A_{y_i, y_{i+1}} + Σ_{i=1}^{n} P_{i, y_i}.  (10)

Here, P is an n × m matrix, where m is the number of labels to be predicted and P_{i,j} represents the possibility that input i takes label j; A is the transition matrix, and A_{i,j} represents the probability of transitioning from label i to label j.

Therefore, over the set Y_x of all possible prediction sequences for the input sequence X, the conditional probability is as shown in Equation (11):

P(y | x) = e^{score(x, y)} / Σ_{ỹ ∈ Y_x} e^{score(x, ỹ)}.  (11)

In training, we optimize the model by maximizing the log-likelihood of the correct output label sequence under Equation (11). For prediction, we select the sequence with the highest score as the best prediction sequence, calculated as shown in Equation (12):

y* = arg max_{ỹ ∈ Y_x} score(x, ỹ).  (12)

Take a sentence in the dataset as an example: in "On November 11-1989, the USS Abraham Lincoln was officially commissioned at Naval Station Norfolk and integrated into the American Atlantic Fleet," "November 11-1989" would be marked as "B-FY," "USS" as "B-ST," "Abraham" as "I-ST," "Lincoln" as "I-ST," and "American" as "B-GJ" (please refer to Chapter 4 for label meanings).

3.4. Loss Value Calculation. For the loss value, we take the weighted sum of the entity boundary loss values and the attribute identification loss value as the final loss value, which is used to optimize the overall parameters of the model (as shown in Figure 3). The loss value calculation formula is given in Equation (13), where loss_start and loss_end represent the loss values of entity head recognition and entity tail recognition, respectively, and loss_attribute represents the loss value generated by the attribute sequence labelling.
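Equations (10)–(12) can be verified by brute force on a toy example, enumerating every tag sequence; the matrices below are arbitrary stand-ins, the boundary (start/stop) transitions of Equation (10) are omitted for brevity, and a real implementation would use the forward algorithm and Viterbi decoding instead of enumeration:

```python
import itertools
import numpy as np

n, m = 3, 2                          # sequence length, number of labels
rng = np.random.default_rng(0)
P = rng.normal(size=(n, m))          # emission scores P[i, j]
A = rng.normal(size=(m, m))          # transition scores A[i, j]

def score(y):
    """Equation (10): transition scores plus emission scores."""
    s = sum(A[y[i], y[i + 1]] for i in range(n - 1))
    return s + sum(P[i, y[i]] for i in range(n))

paths = list(itertools.product(range(m), repeat=n))
Z = sum(np.exp(score(y)) for y in paths)   # partition function

def prob(y):
    """Equation (11): softmax of path scores over all paths in Y_x."""
    return np.exp(score(y)) / Z

best = max(paths, key=score)               # Equation (12)
assert abs(sum(prob(y) for y in paths) - 1.0) < 1e-9
print(best, round(float(prob(best)), 4))
```

Enumeration costs O(m^n); the forward algorithm and Viterbi recursion bring Z and the argmax down to O(n·m²).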
α, β, γ ∈ [0, 1] are hyperparameters that control the weighted summation of the three loss values:

loss = α · loss_start + β · loss_end + γ · loss_attribute.  (13)

4. Experimental Results and Analysis

4.1. Acquisition of Military Equipment Attribute Data

4.1.1. Data Acquisition. The experimental data came from the Baidu Encyclopedia website (https://baike.baidu.com/), and the data acquisition process is shown in Figure 4. We cannot directly obtain military-related terms from Baidu Encyclopedia, because the website does not classify and index its entries. The military channel of http://globe.com/ has a summary display of various types of weapons and equipment, so we get the names of various military equipment from that channel. Then, we expand the rules, splice the names with the links of encyclopedia entries, and finally get the URL links of the required military equipment-related entries. After obtaining the links of military equipment entries in Baidu Encyclopedia, we analyzed

[Figure 2: LSTM structure diagram.]

[Figure 3: Calculation of loss value — the start, end, and attribute losses from the entity header sequence annotation layer, entity tail sequence annotation layer, and property annotation layer are combined as loss = α·loss_start + β·loss_end + γ·loss_attribute.]

the encyclopedia entry pages and found that the entries mainly consist of entry names, information boxes containing attribute data, and a large amount of unstructured text. We used a crawler to collect the InfoBox data and text data in the Baidu Encyclopedia entries of weapons and equipment, and finally collected 1757 encyclopedia pages of military equipment.

4.1.2. Data Annotation. Manual data annotation is not only time-consuming and laborious, but different annotators may also apply different annotation rules to the same piece of data. Therefore, automatic annotation of data has become a focus of current research. Encyclopedia entry data consists of two main parts: the attribute data in the information box and the unstructured text description. Taking the "Nimitz aircraft carrier" as an example, the entry's information box contains basic attributes such as "English Name," "Nation," "pretype/level," and "subtype/level." The text data is an introduction to the basic information of the "Nimitz aircraft carrier." Observing this text data, it can be seen that it contains textual expressions of the "English Name," "Nation," and other attribute values of the "Nimitz aircraft carrier."

Given this data characteristic, the data annotation in this paper is based on the distant supervision hypothesis [33]. The distant supervision hypothesis means that when there is a

[Figure 4: Flow chart of attribute data collection — weapon and equipment names from the Globe military channel (e.g., J-16 fighter aircraft, FC-1 "Dragon"/JF-17 "Thunderbolt" multirole attack aircraft, Su-27 fighter, Liaoning ship, "North Sea" (558) missile frigate) undergo rule expansion and URL splicing into Baidu Encyclopedia weaponry links, which are web-parsed into InfoBox data and text description data.]

relationship between two entities, all sentences containing the pair of entities are considered to express this relationship to some extent. Distant supervision provides labels for data with the help of external knowledge bases, saving the trouble of manual labelling [37]. Attributes can also be considered a type of relationship, so the distant supervision assumption can be applied to the annotation of attribute data. Taking the Nimitz aircraft carrier as an example, the information box in Figure 5 shows that the relationship between "Nimitz aircraft carrier" and the attribute value "United States" is the "Nation" attribute. Then, based on the distant supervision assumption, all sentences containing "Nimitz aircraft carrier" and "United States" can be labelled with the "Nation" attribute, for example, the sentences "Nimitz Aircraft Carrier (CVN-68) is the first ship of the Nimitz class aircraft carriers of the United States Navy" and "The Nimitz aircraft carrier started construction in June 1968. It was launched in May 1972 and delivered to the United States Navy in May 1975." Both of these sentences contain the words "Nimitz aircraft carrier" and "United States," and the triple (Nimitz aircraft carrier, nation, United States) can be considered to exist in these two sentences when labelling the data. Suppose a dataset D = {s_1, s_2, …, s_n}, where each s_i is a sentence of unstructured text. We train a model F such that F(s_i; θ) = [(e_i^t, e_j^t, e_k^t)], where θ represents the model parameters, and e_i^t, e_j^t, e_k^t represent the t-th entity triple and its corresponding relationship.
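The labelling step just described reduces to a substring alignment between knowledge-base triples and sentences; a minimal sketch, using the triple and example sentences from the text (the case-insensitive matching here is an implementation assumption, and plain string matching is exactly the source of the label noise distant supervision is known for):

```python
def distant_label(sentences, triples):
    """Label every sentence that contains both the entity and the
    attribute value of a knowledge-base (entity, attribute, value) triple."""
    labelled = []
    for s in sentences:
        low = s.lower()
        for entity, attribute, value in triples:
            if entity.lower() in low and value.lower() in low:
                labelled.append((s, (entity, attribute, value)))
    return labelled

triples = [("Nimitz aircraft carrier", "Nation", "United States")]
sentences = [
    "Nimitz Aircraft Carrier (CVN-68) is the first ship of the Nimitz "
    "class aircraft carriers of the United States Navy",
    "The Nimitz aircraft carrier started construction in June 1968. It was "
    "launched in May 1972 and delivered to the United States Navy in May 1975.",
]
data = distant_label(sentences, triples)
print(len(data))  # → 2: both sentences receive the "Nation" label
```

Each matched sentence then becomes one training instance for the sequence annotation task.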
The idea of the distant supervision algorithm is to use a knowledge base to align plain text for annotation and then perform supervised training.

However, the Baidu Encyclopedia website is an open knowledge platform, and the editors of entries are not fixed. Therefore, there is a lack of standardization and unity in the naming of attributes, which leads to a variety of expressions for the same attribute. Since data in the military field has a certain degree of confidentiality, the field itself suffers from data sparsity. Different attribute expressions lead to a variety of data labels; if the labels are too scattered, the annotation data for each type of attribute will be small, making it difficult to obtain a good attribute extraction effect. To merge the expressions of multiple attribute labels, we count the distribution of attribute names and select high-frequency words as attribute names. The attribute expressions present in the military equipment data of the encyclopedia website were merged by manual means, and a synonym table of military equipment attribute names was constructed.
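The counting-and-merging step can be sketched as follows; the synonym table in the paper is built manually from the encyclopedia data, so the raw names and synonym entries below are invented purely for illustration:

```python
from collections import Counter

# Raw attribute names as scraped from InfoBoxes (illustrative values).
raw_names = ["Nation", "Country", "Country of origin", "Nation",
             "English Name", "English name", "Nation"]

# Hypothetical synonym table: variant spelling -> canonical
# high-frequency attribute name.
synonyms = {
    "Country": "Nation",
    "Country of origin": "Nation",
    "English name": "English Name",
}

# Map every raw name to its canonical form, then count the distribution
# to confirm which attribute names are frequent enough to keep.
canonical = [synonyms.get(name, name) for name in raw_names]
freq = Counter(canonical)
print(freq.most_common(2))  # → [('Nation', 5), ('English Name', 2)]
```

Merging variants this way concentrates the automatically produced labels onto a small set of canonical attributes, which mitigates the sparsity problem described above.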

