End-to-end Neural Information Retrieval - GitHub Pages


End-to-end Neural Information Retrieval
MMath thesis
Wei Yang
Cheriton School of Computer Science
University of Waterloo
April 2019

Table of Contents
1 Introduction
2 Related Work
3 End-to-end Neural Information Retrieval Architecture
4 Experiments
5 Conclusion and Discussion


Types of NLP Tasks
- Sequence classification
- Sequence pair classification (text matching)
- Sequence labeling
- Sequence-to-sequence generation


Information Retrieval

Tasks                        Text 1      Text 2      Objective
Paraphrase Identification    string 1    string …    …
Textual Entailment           text        …           …
Question Answering           question    …           …
Conversation                 dialog      …           …
Information Retrieval        query       …           R

Table: Typical text matching tasks (C: classification; R: ranking)

Problem Definition
Search microblogs:
- Query: 2022 fifa soccer
- Relevant document: #ps3 best sellers: fifa soccer 11 ps3 #cheaptweet https://www.amazon.com/ fifa-soccer -11-playstation-3
Search newswire articles:
- Query: international organized crime
- Relevant document: The past few years have been characterized by an unprecedented growth in crime, changes in its characteristics, and for all practical purposes the loss of state and public control over the crime situation. More than 40 international smuggling crime groups have been identified. More than 130 "Russian" stores selling Russian antiques have been found abroad.


Why Neural Network
Why NN for NLP?
- Low-dimensional semantic space
- Thousands of variations
- Hierarchical structure
- Hardware developments
Why NN for IR?
- Relevance judgments are based on a complicated human cognitive process



Related Work
- Before 2009: vector space models and probabilistic models (query likelihood (QL), BM25, RM3)
- 2009: Learning-to-rank models (tens of hand-crafted features)
- 2013: Deep Structured Semantic Model (DSSM)
- 2014: CDSSM, ARC-I, ARC-II (mainly for short text ranking)
- 2016: MatchPyramid, DRMM
- 2017: KNRM, DUET, DeepRank, PACRR
- 2018: HINT, MP-HCNN (hierarchical matching patterns)


Contribution
- An end-to-end retrieval and reranking system to allow the user to apply different retrieval models and neural reranking models on different datasets.
- State-of-the-art performance on two benchmark datasets (Robust04 and Microblog) for document retrieval.
- Prove the effectiveness and additivity of a strong baseline for neural reranking methods.
- Co-design the MP-HCNN model for social media post searching.



Architecture
Figure: The architecture of the Retrieval-rerank framework. (Diagram labels: Corpus, Indexing, Raw model, Trained model, top k1 documents, top k2 documents, Reranker, retrieval score.)
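The two-stage flow in the figure (retrieve the top k1 documents, then rerank them and keep the top k2) can be sketched roughly as follows. This is a toy illustration, not the thesis implementation: `retrieve`, `rerank`, the term-overlap scoring, and the `scorer` callable are all hypothetical stand-ins for a real retrieval model and a trained neural reranker.

```python
from typing import Callable, Dict, List, Tuple

Scored = List[Tuple[str, float]]  # (doc_id, score), best first


def retrieve(query: str, corpus: Dict[str, str], k1: int) -> Scored:
    """Toy first stage: score each document by how many of its tokens are query terms.

    Stands in for a real retrieval model such as BM25 (assumption for illustration).
    """
    terms = set(query.lower().split())
    scored = [
        (doc_id, float(sum(tok in terms for tok in text.lower().split())))
        for doc_id, text in corpus.items()
    ]
    # Sort by descending score; break ties by doc_id so the output is deterministic.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:k1]


def rerank(
    query: str,
    candidates: Scored,
    corpus: Dict[str, str],
    scorer: Callable[[str, str], float],
    k2: int,
) -> Scored:
    """Second stage: rescore only the retrieved candidates and keep the top k2."""
    rescored = [(doc_id, scorer(query, corpus[doc_id])) for doc_id, _ in candidates]
    rescored.sort(key=lambda pair: (-pair[1], pair[0]))
    return rescored[:k2]


# Made-up corpus and a made-up scorer that simply prefers shorter documents.
corpus = {
    "d1": "fifa soccer 11 ps3 best sellers",
    "d2": "international organized crime groups",
    "d3": "soccer world cup news",
}
candidates = retrieve("fifa soccer", corpus, k1=2)
final = rerank("fifa soccer", candidates, corpus, lambda q, d: -float(len(d.split())), k2=1)
```

The point of the split is cost: the expensive scorer only sees k1 candidates rather than the whole corpus.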

Retrieval, Rerank and Aggregation
Retrieval: Anserini (QL, QL+RM3, BM25, BM25+RM3)
Rerank:
- MatchZoo models (DSSM, CDSSM, DUET, KNRM, DRMM)
- MP-HCNN
- BERT
Aggregation:
\[ \mathrm{rel}(q, d) = \lambda \cdot \mathrm{Reranker}(q, d) + (1 - \lambda) \cdot \mathrm{Retriever}(q, d) \tag{1} \]
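As a small sketch of the aggregation step in Equation (1), the function below linearly interpolates reranker and retriever scores per document. The function name `interpolate` and the dictionary layout are assumptions for illustration. The slides do not say whether scores are normalized before mixing, so this sketch assumes both score sets are already on comparable scales.

```python
from typing import Dict


def interpolate(
    reranker_scores: Dict[str, float],
    retriever_scores: Dict[str, float],
    lam: float,
) -> Dict[str, float]:
    """Return lam * Reranker(q, d) + (1 - lam) * Retriever(q, d) for each document d.

    Only documents scored by both stages are combined.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    shared = reranker_scores.keys() & retriever_scores.keys()
    return {
        doc: lam * reranker_scores[doc] + (1.0 - lam) * retriever_scores[doc]
        for doc in shared
    }


# Made-up scores for two documents.
combined = interpolate({"d1": 0.2, "d2": 0.9}, {"d1": 1.0, "d2": 0.5}, lam=0.5)
ranking = sorted(combined, key=combined.get, reverse=True)
```

With `lam = 0` the ranking falls back to the retriever alone, and with `lam = 1` it uses only the reranker, so the weight can be tuned on held-out data.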


MatchZoo
Figure: Details of five neural information retrieval models in MatchZoo

MP-HCNN
Figure: Overview of our Multi-Perspective Hierarchical Convolutional Neural Network (MP-HCNN), which consists of two parallel components for word-level and character-level modeling between queries, social media posts, and URLs.

BERT
Figure: The architecture of the BERT model for text matching.


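Evaluation Metrics
\[ \mathrm{Precision}_q = \frac{\sum_{\langle i,d \rangle \in R_q} \mathrm{rel}_q(d)}{|R_q|} \tag{2} \]
\[ \mathrm{AP}_q = \frac{\sum_{\langle i,d \rangle \in R_q} \mathrm{Precision}_{q,i} \times \mathrm{rel}_q(d)}{\sum_{d \in D} \mathrm{rel}_q(d)} \tag{3} \]
\[ \mathrm{NDCG}_q = \frac{\mathrm{DCG}_q}{\mathrm{IDCG}_q} \tag{4} \]
\[ \mathrm{DCG}_q = \sum_{\langle i,d \rangle \in R_q} \frac{2^{\mathrm{rel}_q(d)} - 1}{\log_2(i + 1)} \tag{5} \]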
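To make the metrics on this slide concrete, here is a minimal sketch of precision at a cutoff, average precision, and NDCG for a single query. The function names and the `rels` dictionary (document ID to relevance grade) are assumptions for illustration; the sketch treats any grade above zero as relevant for precision and AP, and uses the graded gain 2^rel - 1 for DCG. Reported numbers would normally come from a standard evaluation tool rather than code like this.

```python
import math
from typing import Dict, List


def precision_at_k(ranking: List[str], rels: Dict[str, int], k: int) -> float:
    """Fraction of the top-k ranked documents that are relevant."""
    return sum(1 for d in ranking[:k] if rels.get(d, 0) > 0) / k


def average_precision(ranking: List[str], rels: Dict[str, int]) -> float:
    """Sum of precision at each relevant rank, divided by the number of relevant documents."""
    num_rel = sum(1 for grade in rels.values() if grade > 0)
    if num_rel == 0:
        return 0.0
    hits, total = 0, 0.0
    for i, d in enumerate(ranking, start=1):
        if rels.get(d, 0) > 0:
            hits += 1
            total += hits / i  # precision at rank i
    return total / num_rel


def dcg(gains: List[int], k: int) -> float:
    """Discounted cumulative gain over the first k grades."""
    return sum((2 ** g - 1) / math.log2(i + 1) for i, g in enumerate(gains[:k], start=1))


def ndcg_at_k(ranking: List[str], rels: Dict[str, int], k: int) -> float:
    """DCG of the ranking divided by the DCG of an ideal ordering of the judged documents."""
    gains = [rels.get(d, 0) for d in ranking]
    ideal = sorted(rels.values(), reverse=True)
    idcg = dcg(ideal, k)
    return dcg(gains, k) / idcg if idcg > 0 else 0.0


# Made-up example: three relevant documents, two of them retrieved.
ranking = ["d1", "d2", "d3", "d4"]
rels = {"d1": 1, "d3": 1, "d5": 1}
ap = average_precision(ranking, rels)  # (1/1 + 2/3) / 3
```

Note that AP divides by all relevant documents, including the unretrieved `d5`, so missing a relevant document lowers the score.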


Datasets

Test Set    # query topics    # query-doc pairs
…           …                 …000

Table: Statistics of the TREC Microblog Track datasets

# query topics    # query-doc pairs
250               250,000

Table: Statistics of the Robust04 datasets


Experimental Setup
Train/test splits:
- Microblog: train on 2011, 2012 and 2013, test on 2014
- Robust04: five-fold cross validation
Hyper-parameter tuning: 10% of the training data
Models:
- Microblog: MatchZoo models, MP-HCNN, BERT
- Robust04: MatchZoo models
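One way the Robust04 protocol above could be set up is sketched below: topics are divided into five folds, each fold serves once as the test set, and 10% of the remaining training topics are held out for hyper-parameter tuning. Splitting at the topic level, the shuffling seed, and the function name `five_fold_splits` are assumptions for illustration; the slides do not specify how the folds were drawn.

```python
import random
from typing import Dict, Iterable, List


def five_fold_splits(
    topic_ids: Iterable[int],
    n_folds: int = 5,
    dev_fraction: float = 0.1,
    seed: int = 0,
) -> List[Dict[str, List[int]]]:
    """Build train/dev/test topic splits for n-fold cross validation."""
    ids = sorted(topic_ids)
    random.Random(seed).shuffle(ids)  # fixed seed keeps the folds reproducible
    folds = [ids[i::n_folds] for i in range(n_folds)]

    splits = []
    for held_out in range(n_folds):
        test = folds[held_out]
        train_pool = [t for i, fold in enumerate(folds) if i != held_out for t in fold]
        # Hold out a slice of the training topics for tuning.
        n_dev = max(1, int(len(train_pool) * dev_fraction))
        splits.append(
            {"train": train_pool[n_dev:], "dev": train_pool[:n_dev], "test": test}
        )
    return splits


# With 250 topics: 50 test, 20 dev and 180 train topics per fold.
splits = five_fold_splits(range(250))
```

Splitting by topic rather than by query-document pair keeps all documents for a query on the same side of the split, which avoids leaking test queries into training.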


Results of Baselines: Robust04

Model
QL (Guo et al.)
BM25 (Guo et al.)
DRMM (Guo et al.)
MatchPyramid (Pang et al.)
BM25 (McDonald et al.)
PACRR (McDonald et al.)
Surviving values: …250.443

Table: Previous results on the Robust04 dataset

            AP        P@20      NDCG@20
QL          0.2465    0.3508    0.4109
QL+RM3      0.2743    0.3639    0.4172
BM25        0.2515    0.3612    0.4225
BM25+RM3    0.3033    0.3973    0.4514

Table: Our results of retrieval models on the Robust04 dataset


Results of End-to-end Neural IR Models: Robust04

Models       MAP        P@20       NDCG@20
BM25+RM3     0.3033     0.3973     0.4514
DSSM         0.0982-    0.1331-    0.1551-
CDSSM        0.0641-    0.0842-    0.0772-
DRMM         0.2543-    0.3405-    0.4025-
KNRM         0.1145-    0.1480-    0.1512-
DUET         0.1426-    0.1561-    0.1946-
DSSM+RM3     0.3026     0.3946     0.4491
CDSSM+RM3    0.2995     0.3944     0.4468
DRMM+RM3     0.3151+    0.4147+    0.4717+
KNRM+RM3     0.3036     0.3928     0.4441
DUET+RM3     0.3051     0.3986     0.4502

Table: Results of retrieval and reranking on the Robust04 dataset. RM: retrieval model. NRM: neural re-ranking model. Significant improvement or degradation with respect to the retrieval model is indicated (+/-) (p-value threshold 0.05).

Results of Baselines: Microblog

Method
QL (Rao et al.)
RM3 (Rao et al.)
L2R (Rao et al.)
MP-HCNN (Rao et al.)
BiCNN (Shi et al.)
Surviving values: …39, 0.6200, 0.6612, 0.6806

Table: Previous results on Microblog datasets

            AP        P@30
QL          0.4181    0.6430
QL+RM3      0.4676    0.6533
BM25        0.3931    0.6212
BM25+RM3    0.4374    0.6442

Table: Our results of retrieval models on Microblog datasets


Results of End-to-end Neural IR Models: Microblog

Models         AP        P@30
QL+RM3         0.4676    0.6533
DSSM           0.2634    0.3836
CDSSM          0.1936    0.2636
DRMM           0.4477    0.6127
KNRM           0.3432    0.5121
DUET           0.2713    0.3533
MP-HCNN        0.4497    …
BERT           0.4646    …
DSSM+RM3       0.4666    …
CDSSM+RM3      0.4703    …
DRMM+RM3       0.4862    …
KNRM+RM3       0.4848    …
DUET+RM3       0.4844    …
MP-HCNN+RM3    0.4902    …
BERT+RM3       0.5011    …6842

Per-topic Analysis: Microblog
Figure: Per-topic differences between BERT+RM3 and QL+RM3 (panel title: Per-topic analysis on map; x-axis: Topics; y-axis: map Diff).
Figure: Per-topic differences between BERT and QL+RM3 (panel title: Per-topic analysis on map; x-axis: Topics; y-axis: map Diff).
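The per-topic comparison behind these figures amounts to subtracting one run's per-topic AP from another's and ordering the topics by the difference. A minimal sketch follows; the function name `per_topic_diff`, the topic IDs, and the AP values are all made up for illustration and are not results from the thesis.

```python
from typing import Dict, List, Tuple


def per_topic_diff(
    run_a: Dict[str, float], run_b: Dict[str, float]
) -> List[Tuple[str, float]]:
    """Return (topic, AP_a - AP_b) for topics present in both runs, largest gain first."""
    shared = run_a.keys() & run_b.keys()
    diffs = {topic: run_a[topic] - run_b[topic] for topic in shared}
    # Sort by descending difference; break ties by topic ID for a stable order.
    return sorted(diffs.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical per-topic AP for a reranked run and a retrieval-only baseline.
reranked = {"t1": 0.50, "t2": 0.25, "t3": 0.75}
baseline = {"t1": 0.25, "t2": 0.50, "t3": 0.75}
diffs = per_topic_diff(reranked, baseline)
```

Plotting the sorted differences as bars shows at a glance how many topics improve, how many degrade, and by how much, which a single averaged score hides.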
