“The Boating Store Has Its Best Sale Ever”: Pronunciation .

“The boating store has its best sale ever”: Pronunciation-attentive Contextualized Pun Recognition
Yichao Zhou, Jyun-yu Jiang, Jieyu Zhao, Kai-Wei Chang and Wei Wang
Department of Computer Science, University of California, Los Angeles

What is a Pun?
I'd tell you a chemistry joke but I know I wouldn't get a reaction.
(Figure labels: Global Context, Local Context.)
Both the local and global contexts are consistent with the pun word “reaction”.
“Reaction” means both “chemical change” and “response”.
The contrast between the two meanings creates a humorous pun.

Homographic Puns
I'd tell you a chemistry joke but I know I wouldn't get a reaction.
Homographic puns rely on multiple interpretations of the same expression.

Heterographic Puns
The boating store had its best sail (sale) ever.
(Figure labels: Global Context, Local Context.)
The local and global contexts are consistent with the pun words “sail” and “sale” separately.
“Sail” links to “boating”, while “sale” relates to “store had its best” and “ever”.
The same or similar pronunciation connects the two words, while their different meanings create the humor.

Heterographic Puns
The boating store had its best sail (sale) ever.
Heterographic puns take advantage of words that are phonologically the same or similar.

Puns

Task and Previous Research
In this paper, we tackle the pun detection and location tasks.
Deploying word sense disambiguation methods or using an external knowledge base cannot tackle heterographic puns (Pedersen, 2017; Oele and Evang, 2017).
Static word embedding techniques cannot model puns very well, because a word should have very different representations depending on its context (Hurtado et al., 2017; Indurthi and Oota, 2017; Cai et al., 2018).

Contribution of our work
In this paper, we propose Pronunciation-attentive Contextualized Pun Recognition (PCPR) to jointly model contextualized word embeddings and phonological word representations for pun recognition.
We demonstrate the effectiveness of the different embeddings and modules via extensive experiments.

Task Formulation
Suppose the input text consists of a sequence of N words, and each word has M phonemes in its pronunciation. For instance, the phonemes of the word “pun” are {P, AH, N}.
Pun detection is a sentence-level binary classification problem.
Pun location can be modeled as a sequential tagging task, assigning a binary label to each word.
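As a rough illustration of this formulation, the sketch below represents one sentence with a detection label and per-word location tags. The tiny lexicon is an assumption made for the example: only the entry for "pun" comes from the slide, the "sail" and "sale" entries are ARPAbet-style pronunciations added here, and the helper name make_location_labels is hypothetical.

```python
# Toy pronunciation lexicon. Only "pun" is taken from the slide; the other
# entries are assumed ARPAbet-style pronunciations added for illustration.
PHONEMES = {
    "pun": ["P", "AH", "N"],
    "sail": ["S", "EY", "L"],
    "sale": ["S", "EY", "L"],
}


def make_location_labels(words, pun_index):
    """Return one binary tag per word: 1 for the pun word, 0 otherwise."""
    return [1 if i == pun_index else 0 for i in range(len(words))]


words = "the boating store had its best sail ever".split()

# Pun detection: a single binary label for the whole sentence.
detection_label = 1

# Pun location: a binary label for every word in the sentence.
location_labels = make_location_labels(words, words.index("sail"))

for word, tag in zip(words, location_labels):
    print(word, tag)
```

In this toy lexicon "sail" and "sale" share a phoneme sequence, which is the kind of link a pronunciation-aware model can exploit for heterographic puns.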

Framework Architecture

Framework Architecture
Here, we choose BERT to derive contextualized word embeddings without loss of generality.

Framework Architecture
We apply the attention mechanism to simultaneously identify important phonemes and derive the pronunciation embedding for each word.
F_P(·) is a fully-connected layer and u_{i,j} represents the phoneme embeddings.
(Figure label: context vector.)
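The slide's equations did not survive extraction, so the following is only a minimal sketch of attention pooling over phoneme embeddings, not the authors' exact formulation. It assumes a tanh activation after the fully-connected layer, a softmax over dot products with a learned context vector, and a weighted sum of the raw phoneme embeddings; the function name and all numeric values are hypothetical.

```python
import math


def phoneme_attention(phoneme_embs, weight, bias, context):
    """Pool the phoneme embeddings of one word into a pronunciation embedding.

    Assumed steps: v_j = tanh(W u_j + b), score_j = v_j . context,
    alpha = softmax(scores), output = sum_j alpha_j * u_j.
    """
    scores = []
    for u in phoneme_embs:
        # Fully-connected layer followed by tanh (the activation is an assumption).
        v = [math.tanh(sum(w * x for w, x in zip(row, u)) + b)
             for row, b in zip(weight, bias)]
        scores.append(sum(vi * ci for vi, ci in zip(v, context)))

    # Numerically stable softmax over the phonemes of this word.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    alphas = [e / total for e in exps]

    # Attention-weighted sum of the phoneme embeddings.
    dim = len(phoneme_embs[0])
    pooled = [sum(a * u[d] for a, u in zip(alphas, phoneme_embs))
              for d in range(dim)]
    return alphas, pooled


# Three toy 2-dimensional phoneme embeddings for one word.
u = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
W = [[1.0, 0.0], [0.0, 1.0]]  # identity projection, for readability
b = [0.0, 0.0]
context = [1.0, 1.0]

alphas, pooled = phoneme_attention(u, W, b, context)
print(alphas, pooled)
```

With these toy values the third phoneme aligns best with the context vector, so it receives the largest attention weight.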

Framework Architecture for Pun Location
A self-attentive encoder blends contextualized word embeddings and pronunciation embeddings to capture the overall representation for each word.
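One way to picture this blending step is scaled dot-product self-attention over per-word vectors formed by concatenating the two embeddings. The sketch below is an assumption-heavy simplification: it omits the learned query, key and value projections a real encoder would normally have, and the function names and toy numbers are hypothetical.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    top = max(values)
    exps = [math.exp(v - top) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def self_attend(context_embs, pron_embs):
    """Blend contextual and pronunciation embeddings for every word.

    Each word vector is the concatenation of its two embeddings; every word
    then attends to all words in the sentence (no learned projections here).
    """
    joint = [c + p for c, p in zip(context_embs, pron_embs)]  # list concatenation
    scale = math.sqrt(len(joint[0]))
    outputs = []
    for query in joint:
        scores = [sum(q * k for q, k in zip(query, key)) / scale for key in joint]
        weights = softmax(scores)
        outputs.append([sum(w * vec[d] for w, vec in zip(weights, joint))
                        for d in range(len(query))])
    return outputs


# Toy sentence of three words: 2-d contextual and 1-d pronunciation embeddings.
context_embs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
pron_embs = [[0.5], [0.5], [0.0]]

blended = self_attend(context_embs, pron_embs)
print(blended)
```

For pun location, each blended vector would then feed a per-word binary classifier.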

Framework Architecture for Pun Detection
The whole input embedding can be derived by concatenating the overall contextualized embedding and the self-attentive embedding.

Dataset and Evaluation
The experiments are conducted on two publicly available benchmark datasets: SemEval 2017 shared task 7 and Pun of the Day (PTD).
We adopt precision, recall and F1-score to evaluate both the pun detection and pun location tasks.
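For reference, the three metrics can be computed from binary gold and predicted labels as sketched below. This is the textbook definition under the assumption that every item receives a prediction; the benchmark's official scorer may count items differently, and the function name and toy labels are hypothetical.

```python
def precision_recall_f1(gold, pred):
    """Compute precision, recall and F1 for binary labels (1 = positive)."""
    tp = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, pred) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 0)

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


# Toy detection labels for five sentences (1 = contains a pun).
gold = [1, 1, 0, 0, 1]
pred = [1, 0, 0, 1, 1]

precision, recall, f1 = precision_recall_f1(gold, pred)
print(precision, recall, f1)
```

The same function applies to pun location if the per-word tags of all sentences are flattened into a single list.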

Main Experiment on SemEval-2017
SemEval task participants: extract complicated linguistic features to train rule-based and machine-learning-based classifiers.

Main Experiment on SemEval-2017
Incorporates word sense embeddings into an RNN.

Main Experiment on SemEval-2017
Captures linguistic features such as POS tags, n-grams, and word suffixes.

Main Experiment on SemEval-2017
Jointly models the two tasks with RNNs and a CRF tagger.

Main Experiment on SemEval-2017
Exploits only the contextualized word encoder, without considering phonemes.

Main Experiment on SemEval-2017
PCPR dramatically improves pun location and detection performance compared to the state-of-the-art models, Joint and CPR.

Main Experiment on SemEval-2017
The pronunciation-attentive representations link different words with similar pronunciations, which makes it much easier to pinpoint the pun word in the heterographic dataset.

Main Experiment on SemEval-2017
Pronunciation embeddings also facilitate homographic pun detection, implying the potential of pronunciation for enhancing general language modeling. This is consistent with [1], which improves the quality of word embeddings by introducing pronunciation features.
[1] Wenhao Zhu et al. "Improve word embedding using both writing and pronunciation." PloS one, 2018.

Main Experiment on PTD
Exploits word representations with multiple stylistic features.
Applies a random forest model with Word2Vec and human-centric features.
Trains a CNN to learn essential features automatically.
Improves the CNN by adjusting the filter size and adding a highway layer.

Main Experiment on PTD
The contextualized word embeddings can implicitly reveal the contradictions between meanings and further improve pun modeling.
Phonetic embeddings are intuitively useful for recognizing identically pronounced words when detecting heterographic puns.

Ablation Study on SemEval-2017
All these components are essential for PCPR to recognize puns.

Attention Visualization
Visualization of the attention weights of each pun word (marked in pink) in the sentences. A deeper color indicates a higher attention weight.

Conclusion and Future Work
In this paper, we propose a novel approach, PCPR, for pun recognition by leveraging a contextualized word encoder and modeling phonemes as word pronunciations.
Extensive experiments demonstrate the effectiveness of the attention mechanisms, contextualized embeddings and pronunciation embeddings.
We release our implementations and pre-trained phoneme embeddings at https://github.com/joey1993/pun-recognition to facilitate future research.
