Identification of Factors Predicting Clickthrough in Web Searching


Identification of Factors Predicting Clickthrough in Web Searching Using Neural Network Analysis

Ying Zhang
The Harold and Inge Marcus Department of Industrial and Manufacturing Engineering, College of Engineering, The Pennsylvania State University, University Park, PA 16802. E-mail: yzz114@psu.edu

Bernard J. Jansen
329F Information Sciences and Technology Building, College of Information Sciences and Technology, The Pennsylvania State University, University Park, PA 16802. E-mail: jjansen@ist.psu.edu

Amanda Spink
Faculty of Information Technology, Queensland University of Technology, Gardens Point Campus, 2 George St, GPO Box 2434, Brisbane QLD 4001, Australia. E-mail: ah.spink@qut.edu.au

In this research, we aim to identify factors that significantly affect the clickthrough of Web searchers. Our underlying goal is to determine more efficient methods to optimize the clickthrough rate. We devise a clickthrough metric for measuring customer satisfaction with search engine results using the number of links visited, the number of queries a user submits, and the rank of clicked links. We use a neural network to detect the significant influence of searching characteristics on future user clickthrough. Our results show that high occurrences of query reformulation, lengthy searching duration, longer query length, and higher ranking of prior clicked links correlate positively with future clickthrough. We provide recommendations for leveraging these findings to improve the performance of search engine retrieval and result ranking, along with implications for search engine marketing.

Introduction

The usefulness of a search engine depends on the relevance of the results retrieved and ranked in response to user queries. While millions of Web pages may include a particular word or phrase, some may be more relevant, popular, useful, or authoritative than others.
Most search engines employ methods to rank the results so as to provide the best, most useful, or most relevant results first. How a search engine decides which pages are the best matches, and in what order to show the results, varies from one engine to another. The retrieval and ranking methods also change over time as Web use changes and new techniques evolve. Therefore, the evaluation of searching efficiency is a critical and ongoing research area.

(Received August 4, 2008; revised October 11, 2008; accepted October 11, 2008. © 2009 ASIS&T. Published online XXX in Wiley InterScience, www.interscience.wiley.com. DOI: 10.1002/asi.20993)

To perform this evaluation, search engines record user-system interactions in a transaction log (a.k.a. a search log or query log) for analysis. A search engine transaction log is an electronic record of the interactions that have occurred during a searching episode between a Web search engine and users searching for content on that Web search engine. Just as transaction logs have yielded comprehensive documentation of users' online behaviors, they have become important resources for system evaluation and studies of user searching behavior. The voluminous nature of such logs, however, means that companies interested in user behavior on the Web face enormous amounts of data to analyze to determine valuable metrics.

One of these commercial metrics is clickthrough rate (CTR), which is one measure of user satisfaction with the results retrieved by a search engine based on a query submitted by a user (Joachims, 2002; Joachims, Granka, Pan, Hembrooke, & Gay, 2005; Xue et al., 2004). Naturally, this may not always be the case. There are certainly times when higher clickthrough may indicate users not finding what they are looking for. Additionally, Dupret and Piwowarski (2008) point out that search logs may be missing important data, such as documents that the user has already seen.
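To make the CTR metric concrete, here is a minimal sketch of computing a clickthrough rate from transaction-log records. This is illustrative only, not the authors' code; the `LogRecord` fields and the log contents are assumptions:

```python
from dataclasses import dataclass

@dataclass
class LogRecord:
    """One interaction in a search transaction log (illustrative schema)."""
    session_id: str
    action: str     # "query" or "click"
    rank: int = 0   # rank of the clicked link; 0 for query records

def clickthrough_rate(records: list) -> float:
    """CTR as the ratio of click interactions to query submissions."""
    queries = sum(1 for r in records if r.action == "query")
    clicks = sum(1 for r in records if r.action == "click")
    return clicks / queries if queries else 0.0

# A toy searching episode: three queries, two clicks.
log = [
    LogRecord("s1", "query"),
    LogRecord("s1", "click", rank=1),
    LogRecord("s1", "query"),
    LogRecord("s1", "click", rank=3),
    LogRecord("s2", "query"),
]
print(clickthrough_rate(log))  # 2 clicks / 3 queries ≈ 0.667
```

The paper's actual metric also folds in the rank of clicked links and the number of queries per session; the ratio above is only the simplest variant.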
However, CTR is an important element reflecting the quality and effectiveness of commercial search engines and online advertising (Nettleton, Calderon, & Baeza-Yates, 2006), such as sponsored search campaigns. Using the data recorded in the transaction log and knowing the number of results retrieved in response to a query, one can calculate the existing CTR. Given the importance of clickthrough as a measure of user satisfaction, and with clickthrough being the primary revenue-generating mechanism for most search engines, it would be beneficial to develop more advanced inferential models that could predict the future CTR of a given user based on current searching characteristics. Commercial search engine companies could then utilize user-system interaction data to improve the CTR by designing more efficient searching algorithms or advertising platforms, which could potentially improve revenue streams for online advertising. This is the primary motivation for focusing this research on clickthrough. Using methods that result in predictive models will aid search engines in serving relevant organic and sponsored results to users.

In this research, we identify and model the relationship between the data recorded in transaction logs (logon time, browser type, query length, etc.) and the future propensity of user clickthrough (i.e., how likely the user is to click on links in the results listing).

In the next sections, we first summarize concepts and previous work related to the use of Web transaction logs to investigate user behaviors. Then, the basic theories and training algorithms underlying our neural network method are introduced. We used the multilayer perceptron neural network (MLPN), which is a backpropagation neural network. Afterwards, we discuss the data sets (training, testing, and evaluating) needed to build the corresponding neural networks, explore the constructed networks using our prepared data sets, and analyze the networks' characteristics by varying parameters. Then, we present a sensitivity analysis of the input on clickthrough. Finally, the results and importance of the models utilized are highlighted before concluding with a discussion of the findings.

(JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE AND TECHNOLOGY, 60(2):1-14, 2009)

Review of Literature

Web search engine transaction logs have become an important data collection method for studying information retrieval and searching. However, companies interested in Web user behavior face enormous amounts of data that they must analyze to gain worthwhile information. For example, Nielsen/NetRatings monitors the search behavior of approximately 500,000 people worldwide (Sullivan, 2006, 2008), and datasets of this size present significant challenges to analysts. There has been research in overall characteristics of Web users (Jansen & Spink, 2005; Park, Bae, & Lee, 2005; Wang, Berry, & Yang, 2003; Wolfram, 1999), as well as methods to analyze these logs effectively and efficiently (Almpanidis, Kotropoulos, & Pitas, 2007; Meghabghab & Kandel, 2004), along with several studies investigating other aspects of Web searching. (For a comprehensive review, see Markey, 2007a, 2007b.) Additionally, Chau, Fang, and Yang (2007) present results from the analysis of the search logs from Timway, a Chinese search engine, reporting that search topics and the mean number of queries per session are similar to usage of English search engines. Whittle, Eaglestone, Ford, Gillet, and Madden (2007) have explored new ways to mine value from search logs. Kellar, Hawkey, Inkpen, and Watters (2008) explore augmenting log analysis with other research methods. Machill, Beiler, and Zenker (2008) highlight the need for search engine research along cultural and social lines.

Beitzel, Jensen, Chowdhury, Grossman, and Frieder (2004) reviewed a log of hundreds of millions of queries that constituted the total query traffic of a general-purpose commercial Web search service. They found that query traffic from particular topical categories differed both from the query stream as a whole and from other categories. This analysis provided valuable insight for improving retrieval effectiveness and efficiency. It is also relevant to the development of enhanced query disambiguation, routing, and caching algorithms.

Yates, Benavides, and González (2006) presented a framework for the automatic identification of user interests based on the analysis of query logs. The researchers found that supervised learning could identify user interests given certain established goals and categories. With unsupervised learning, one can validate the goals and categories used, refine them, and then select those most appropriate to the user's needs. Fan, Pathak, and Wallace (2006) proposed a representation scheme for nonlinear ranking functions and compared this new design to the vector space model. Fan et al. then tested the new representation scheme with a genetic-programming-based discovery framework in a personalized search context using a Text Retrieval Conference (TREC) Web corpus.

This line of research is primarily descriptive of current actions, and researchers are beginning to use more robust methodologies to analyze the interactions between users and systems to predict future actions. One of the most challenging problems in building an efficient predictive model of Web search is that search engine transaction logs contain technically discrete time-series data. Neural networks are good tools for identifying relationships between inputs and outputs from a set of examples; therefore, neural networks are good candidates for transaction log analysis. Due to neural networks' approximation properties as well as their inherent adaptation features, they have wide application for the modeling of nonlinear systems (Giles, Lawrence, & Tsoi, 2001). We were surprised to learn that only a few neural networks have been applied to the analysis of Web search engine logs.

Özmutlu, Spink, and Özmutlu (2004) provided the results from a comprehensive time-based Web study of the US-based Excite and Norwegian-based Fast Web search logs, exploring variations in user searching related to changes in the time of day. The researchers reported that the analysis of the datasets was very useful to Web search engines for reconstructing the search structure and reallocating the resources with respect to different periods. In a follow-up of this research, Özmutlu, Seda, and Çavdur (2005) analyzed contextual information in search engine query logs. The study proposed a topic identification algorithm using artificial neural networks. A sample from an Excite data log was selected to train the neural networks, and the neural network was then used to identify topic changes in the data log. The researchers reported that topic shifts were estimated correctly, with 77.8% precision over the whole database. Özmutlu, Çavdur, Spink, and Özmutlu (2005) have shown that one can train neural networks using multiple search logs. Özmutlu, Çavdur, and Özmutlu (2008) conducted a cross-validation of an artificial neural network application to automatically identify topic changes in Web search engine user sessions by using data logs of different Web search engines for training and testing the neural network.

However, these works focused primarily on classifying past behaviors or query topics. These studies did not provide an efficient model to identify user-system interactions that could reliably predict future user behaviors, especially user clickthrough. An exception is Zhang, Jansen, and Spink (forthcoming), who explore time-series analysis to predict clickthrough. A primary metric in search engine evaluation for both organic and sponsored links, CTR is a critical measure of both system performance and revenue generation. Jansen, Brown, and Resnick (2007) have conducted laboratory investigations of factors influencing clickthrough. Ravid, Bar-Ilan, Baruchson-Arbib, and Rafaeli (2007) explore the relationship between search engine queries and the access pages on Web sites.

To address this gap in current research, we construct a neural network to study user-system interaction and to provide an efficient mechanism to predict user clickthrough. We explore two primary neural networks, applying each network method to a training data set to compare the fitting results of each approach. We follow this by conducting a sensitivity analysis of the input neurons based on the better-fitted neural network method, which is the MLPN, to determine which types of data represented in the transaction log are predictive of users' clickthrough.

Research Question

Specifically, we ask: which user-search engine interaction factors are correlated with future clickthrough? From a practical point of view, much of the information included in the transaction log may or may not impact the user's clickthrough. Therefore, we want to find the potential factors that predict increased or decreased clickthrough of a user, so that search engine companies can determine more efficient methods to optimize the CTR.

Methodology

Neural networks are powerful data modeling tools that are able to capture and represent complex relationships between input and output. Neural networks are complex, nonlinear, distributed systems and, consequently, have broad application. Many remarkable properties of neural networks result from their origins as biological information-processing cells. Neural networks are especially useful for open loop and closed loop feedback control, which makes them especially useful for our application with search log data. Log data is not normally distributed; therefore, standard statistical methods such as regression may not be effective.

In the open loop application, neural networks serve for classification, pattern recognition, or function approximation. To perform any of these functions, however, one must train the neural network, and a widely used training technique for neural networks is the backpropagation error algorithm. This training technique involves a forward pass to compute responses corresponding to the input patterns, followed by a backward pass to adjust the synaptic weights. Both passes are repeated until the actual responses of the network match the desired ones (Kampolis, Karangelos, & Giannakoglou, 2004). Feedforward networks are memory-less in the sense that their response to an input is independent of the previous network state.

Unlike open loop neural networks, closed loop neural networks are dynamic systems. When a new input pattern is presented, neuron outputs are computed. Because of the feedback paths, the inputs to each neuron are modified, which leads the network to enter a new state. Consequently, different network architectures require different learning algorithms. For this project, we use open loop feedforward neural networks because transaction log analysis conforms most closely to pattern recognition.

Since the purpose of this research is to explore the behaviors of online users and to discover which information in the transaction log influences and predicts future clickthrough, we designed two primary open loop neural networks and, after tuning the networks, analyzed the weights of each input element. Knowing how specific types of information impact clickthrough will allow commercial search engine companies to leverage user-system interaction data to design more efficient searching algorithms to increase clickthrough. After evaluating the two types of neural networks, we use the MLPN because it was the better-performing network.

MLPN

In our study, an MLPN is a network with multiple layers that uses the backpropagation algorithm to tune the weights. Generally, the backpropagation algorithm in a multilayer feedforward network is sufficient to perform the system identification and has wide application in different areas. Therefore, in this section, we introduce its basic structure and provide the pseudo-code used to train the network on the transaction log data.

Basic structure. MLPNs often have one or more hidden layers followed by an output layer of linear neurons.
Multiple layers of neurons with nonlinear transfer functions (i.e., the sigmoid nonlinearity function) allow the network to learn nonlinear as well as linear relationships between input and

output vectors. Generally, such networks are trained more efficiently with standardized data. In this research, we use normalized input and target data as the training and testing samples, and we use the sigmoid transfer function as the activation function to constrain the output from the hidden layers of the network within the range from 0 to 1.

There is no clear way of determining how many hidden neurons and layers are necessary to form a decision region that is sufficiently complex to satisfy the demands of a given problem. Thus, the required parameters are best determined through experimentation. For the current project, we designed a neural network with a flexible number of layers to filter out nonlinear relationships as much as possible. After building the network structure, we also designed a learning algorithm to fit the desired output. Figure 1 shows the detailed structure of an MLPN.

[FIG. 1. MLPN structure. N is the number of input elements, U(i) is the number of units in layer i, and W(i, j) is the weight vector for layer i, which has j units. Each hidden layer and the output layer apply a sigmoid activation and a bias input.]

Training algorithm. Training the data set for the MLPN comprises two parts: forwarding the network and backpropagating. Forwarding the network means that all outputs are computed using sigmoid thresholds of the inner product of the corresponding weight and input vectors. Backpropagating the network entails transmitting errors backwards through the network by apportioning them to each unit according to the portion of the error for which the unit is responsible. In this research, we use the DELTA backpropagation method to train the neural network. Because numerous textbooks and papers have illustrated the basic algorithms of backpropagation (cf. Haykin, 1999), here we simply list the notation of the variables and the algorithm used to train the network.

Variable notation:
- t_j: target vector for unit j in the output layer.
- η: learning rate; in this study, η = 0.25.
- n_l: number of units in layer l.
- Bias: the bias for the threshold function in each layer.
- α: momentum, the proportion of the previous weight adjustment used to adjust the current weight for the whole neural network. To increase the learning rate without leading to oscillations, Rumelhart, Hinton, and Williams (1986) suggested modifying the generalized delta rule to include a momentum term. In our study, α = 0.9.
- δ_j^l: first partial derivative of the sum-square error E with respect to the input z_j^l of each unit, δ_j^l = (t_j - o_j^l)(1 - o_j^l)o_j^l.
- Gain: proportion of δ used to tune the neural network.
- x_j^l: input vector for unit j in layer l. The input from unit i in layer l to unit j in layer l+1 is denoted by x_ji^(l+1).
- w_j^l: weight vector for unit j in layer l. The weight between unit i in layer l and unit j in layer l+1 is denoted by w_ji^(l+1). In addition, we use w_ji^(adj,l)(t) to represent the adjusted weight at the t-th iteration.
- z_j^l: weighted sum of the inputs for unit j in layer l.
- o_j^l: output vector for unit j in layer l.

Data sets. We have three data sets with which to construct the neural network:
1. Training sample: (x_j^l, t_j^l).
2. Testing sample: (x_j^l, t_j^l).
3. Evaluating sample: (x_j^l, t_j^l).

Training, testing, and evaluating algorithm:
1. Normalize the input and target values into the range between a lower and an upper limit (i.e., 0.1 and 0.9).
2. Generate a feedforward network through all the layers (see Figure 1):
   a. Input the instance x_j^l and calculate the weighted sum of inputs and weights, z_j^l = w_j^l · x_j^l.
   b. Put the weighted sum into the sigmoid activation function and obtain the output from each layer, o_j^l = 1/(1 + e^(-z_j^l)). Regard the output of layer l as the input of layer l+1.
3. Initialize all the weights to small random values (e.g., between -0.5 and 0.5).
4. Use the DELTA backpropagation method to train the network backwards, using the calculated error between target and output in each layer, until the termination condition is met. For each training sample (x_j^l, t_j^l), which may be picked randomly from the training data set:
   a. Based on the feedforward network constructed in step 2, calculate the error of the output layer units, δ_j^l = gain × (t_j^l - o_j^l)(1 - o_j^l)o_j^l. At the same time, calculate the training error of the whole neural network as Σ_j gain × (t_j^l - o_j^l)(1 - o_j^l)o_j^l.
   b. Calculate the error in each hidden layer's units, δ_j^l = gain × o_j^l(1 - o_j^l) Σ_i w_ji^(l+1) δ_i^(l+1), summing over each unit i in the next layer.
   c. Update the neural network weight w_ji^l for each unit at each layer l as follows: w_ji^l(n) = w_ji^l(n) + η δ_j^l o_i^(l-1) + α w_ji^(adj,l)(n - 1), where w_ji^l(n) is the weight connecting neuron i and neuron j at the n-th iteration, and w_ji^(adj,l)(n - 1) is the adjusted weight at the (n - 1)-th iteration.
   For each testing sample (x_j^l, t_j^l):
   (1) Use the constructed network and the testing sample to calculate the output error for the entire testing sample: testing error = Σ_j gain × (t_j^l - o_j^l)(1 - o_j^l)o_j^l.
   (2) Use the minimum-testing-error variable to store the minimum testing error found at each iteration.
   Termination condition: if the testing error is greater than β × minimum testing error (here, β = 1.2), terminate the training process.
5. For each evaluating sample (x_j^l, t_j^l), evaluate the network.

MLPN Compared to RBFN

We also explored another neural network for transaction log analysis. The Radial Basis Function Neural Network (RBFN) is an alternative to highly nonlinear-in-the-parameters neural networks (Park & Sandberg, 1991), meaning that the determination of the neural centers is highly nonlinear. Traditionally, the RBFN method has been used for strict interpolation in multidimensional space. The original RBFN method requires as many radial basis function centers as there are data points.

We continuously trained both neural networks until the termination condition was satisfied, namely, that the current iteration's error on the testing data set was greater than 1.2 times the previous iteration's error. For the MLPN method, the number of hidden layers and hidden neurons was selected based on experiments. After testing the network several times, we chose two hidden layers with four and six hidden neurons, respectively. In this study, one iteration means training the network using 3,000 pieces of training data, which equals the number of epochs (10 in this study) times the number of records in the training data set (300 in this study).

[FIG. 2. The trend of training error for the MLPN.]
[FIG. 3. The trend of training error for the RBFN.]

Figures 2 and 3 show that the training error for the MLPN starts at about 0.9 and is close to 0.2 after iteration 29, while the training error for the RBFN starts at about 0.15 and shrinks to almost 0.05 after iteration 17. This phenomenon is explainable by the training characteristics of the MLPN and the RBFN. The MLPN uses differentiable, continuous activation functions within the hidden layers to screen out the nonlinear behaviors and tune the weights, while the RBFN uses a linear output layer to tune the weights after ruling out all the nonlinear behaviors using clustered centers. Therefore, the RBFN deals with less random and irregular nonlinear data than the MLPN does.
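For concreteness, the MLPN training procedure described above (sigmoid forward pass, DELTA backpropagation with momentum, and β-based early stopping) can be sketched as follows. This is an illustrative re-implementation, not the authors' code: η = 0.25, α = 0.9, β = 1.2, and the 4- and 6-neuron hidden layers come from the text, while the toy data, target function, and epoch structure are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class MLPN:
    """Feedforward sigmoid network trained with delta-rule backprop and momentum."""
    def __init__(self, sizes, eta=0.25, alpha=0.9):
        self.eta, self.alpha = eta, alpha
        # Small random initial weights between -0.5 and 0.5; the +1 column is the bias.
        self.W = [rng.uniform(-0.5, 0.5, (n_out, n_in + 1))
                  for n_in, n_out in zip(sizes[:-1], sizes[1:])]
        self.dW = [np.zeros_like(w) for w in self.W]  # previous weight adjustments

    def forward(self, x):
        outs = [x]
        for w in self.W:
            x = sigmoid(w @ np.append(x, 1.0))  # bias input fixed at 1
            outs.append(x)
        return outs

    def train_step(self, x, t):
        outs = self.forward(x)
        delta = (t - outs[-1]) * (1 - outs[-1]) * outs[-1]  # output-layer delta
        for l in range(len(self.W) - 1, -1, -1):
            grad = np.outer(delta, np.append(outs[l], 1.0))
            if l > 0:  # hidden-layer delta, computed before the weights change
                delta_prev = outs[l] * (1 - outs[l]) * (self.W[l][:, :-1].T @ delta)
            self.dW[l] = self.eta * grad + self.alpha * self.dW[l]  # momentum term
            self.W[l] += self.dW[l]
            if l > 0:
                delta = delta_prev

def sse(net, X, T):
    """Sum-square error over a data set."""
    return sum(float(np.sum((t - net.forward(x)[-1]) ** 2)) for x, t in zip(X, T))

# Toy data normalized into [0.1, 0.9] as in step 1 of the algorithm (assumed target).
X = rng.uniform(0.1, 0.9, (300, 5))
T = 0.1 + 0.8 * X.mean(axis=1, keepdims=True)

net = MLPN([5, 4, 6, 1])        # two hidden layers with 4 and 6 neurons
err0 = sse(net, X, T)           # error before any training, for reference
best, beta = float("inf"), 1.2
for iteration in range(50):
    for x, t in zip(X, T):
        net.train_step(x, t)
    err = sse(net, X, T)
    best = min(best, err)
    if err > beta * best:       # stop once the error exceeds 1.2x the minimum seen
        break
```

For simplicity, this loop measures the stopping error on the training data; the paper applies the termination check to a held-out testing sample.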

For this reason, the RBFN could begin to train the network with a lower training error and terminate the iteration earlier. Additionally, between iteration 7 and iteration 17, the training error for the MLPN drops dramatically, while the error does not change much elsewhere. For the RBFN, the training error maintains the same slope throughout.

Although the training error of the RBFN is much smaller than that of the MLPN, we cannot say that the RBFN performs better than the MLPN, because each calculates errors based on different input data sets. The error for the RBFN is based on the output coming out of the hidden layer, which has been screened of some nonlinear behaviors, thereby creating data that is more aggregated. We use the evaluation data set from the transaction log to test the fit of the curve between the output data and the objectives to see which neural network works better on the transaction log.

[FIG. 4. Objective data vs. output data using the MLPN.]
[FIG. 5. Objective data vs. output data using the RBFN.]

Figures 4 and 5 show the fitting curves for the MLPN and the RBFN using the same evaluating data set. We can see that the MLPN performs much better than the RBFN in the fitting curves. In other words, the RBFN hidden layer cannot filter out the nonlinear behaviors as well as the MLPN hidden layer does. From a practical point of view, different users have different searching styles, which is possibly the primary cause of the high nonlinearity in the data set.

Because the MLPN behaves much better than the RBFN does, in the rest of this study we focus only on the sensitivity analysis of the input neurons based on the MLPN.

Data Analysis

In this study, we used a Dogpile (www.dogpile.com) search engine transaction log.
Owned by InfoSpace, Dogpile is a market leader in the meta-search engine business, incorporating into its search results the listings from other search engines, including results from the four leading Web search indices (i.e., Ask Jeeves, Google, MSN, and Yahoo!). When accepting a submitted query, Dogpile simultaneously sends the query to multiple Web search engines,

collects the results from each Web search engine, removes duplicate results, and aggregates the remaining results into a combined ranked listing using a proprietary algorithm. Dogpile has tabbed indexes for federated searching of Web, Images, Audio, and Video content. Dogpile also offers query reformulation assistance, with query suggestions listed in an "Are You Looking for?" section of the interface.

TABLE 1. Fields in the transaction log.

- Record number: A unique identifier for the record. A record is a single tuple in the database; it is the log of an interaction between the user and the search engine. An interaction is one of the following actions: submit a query, click on a link, or view a results page.
- IP address: The Internet protocol (IP) address of the computer on which the user was logged on during the searching session.
- Cookie: Parcels of text sent by a server to a Web browser and then sent back unchanged by the browser each time the browser accesses that server. Cookies are used for authenticating, tracking, and maintaining specific information about users, such as site preferences and the contents of their electronic shopping carts.
- Time: The time when an interaction was recorded by the search engine server.
- Query: The terms of the query that the user typed into the search engine text box when searching.
- Vertical: There are five types of verticals (Web, Audio, Image, Video, News) representing different content collections. They are represented by tabs on the search engine interface and provide a convenient way for users to find information in different formats.
- Sponsored: One of two possible types of links retrieved and presented on the search engine results page (SERP). Sponsored links appear because a company, organization, or individual purchased the keywords that the user used in the search query. If the user clicked a sponsored link, this field shows 1; otherwise, it shows 0.
- Organic: The other type of link retrieved and presented on the SERP. These links are retrieved by the search engine using its proprietary matching algorithm. If the user clicked an organic link, this field shows 1; otherwise, it shows 0.
- Browser: The type of browser used by the user.
- Location: The place/country from which the user used the search engine, as determined by the IP address.

TABLE 2. Additional calculated fields in the transaction log.

- User intent: There are three categories of user intent that we calculated (informational, transactional, and navigational), reflecting the type of content the user desired. For this process, we selected a sample of records containing not only the query but also other attributes, such as the order of the query in the session, query length, results page, and vertical, and then manually classified the queries into one of the three categories, which are derived from work in Rose and Levinson (2004), using an algorithm developed by Jansen, Booth, and Spink (2008).
- Query length: The number of terms contained in a particular query.
- Results page: A number representing the search engine results page (SERP) viewed during a given interaction (blank is the first page, 1 is the second page, etc.).
- Reformulation pattern: There are nine categories of query reformulation. We used the algorithm outlined in Jansen, Zhang, and Spink (2007) to classify the queries.
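As a small illustration of the calculated fields in Table 2, the sketch below derives query length from a raw query string and performs a crude reformulation check. The functions are illustrative assumptions; in particular, the paper's nine-category reformulation scheme (Jansen, Zhang, & Spink, 2007) is reduced here to a simple shared-terms test:

```python
def query_length(query: str) -> int:
    """Number of terms in the query (the 'Query length' field in Table 2)."""
    return len(query.split())

def shares_terms_with_previous(prev_query: str, query: str) -> bool:
    """Toy stand-in for reformulation detection: the new query differs from the
    previous one in the session but reuses at least one of its terms."""
    prev, cur = set(prev_query.lower().split()), set(query.lower().split())
    return bool(prev) and prev != cur and bool(prev & cur)

print(query_length("cheap flights to brisbane"))                                 # 4
print(shares_terms_with_previous("cheap flights", "cheap flights to brisbane"))  # True
```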

