A Model-Based Approach to Measuring Expertise in Ranking Tasks

A Model-Based Approach to Measuring Expertise in Ranking Tasks

Michael D. Lee (mdlee@uci.edu)
Mark Steyvers (msteyver@uci.edu)
Mindy de Young (mdeyoung@uci.edu)
Brent J. Miller (brentm@uci.edu)
Department of Cognitive Sciences, University of California, Irvine
Irvine, CA, USA 92697-5100

Abstract

We apply a cognitive modeling approach to the problem of measuring expertise on rank ordering tasks. In these tasks, people must order a set of items in terms of a given criterion. Using a cognitive model of behavior on this task that allows for individual differences in knowledge, we are able to infer people's expertise directly from the rankings they provide. We show that our model-based measure of expertise outperforms self-report measures, taken both before and after doing the task, in terms of correlation with the actual accuracy of the answers. Based on these results, we discuss the potential and limitations of using cognitive models in assessing expertise.

Keywords: expertise, ordering task, wisdom of crowds, model-based measurement

Introduction

Understanding expertise is an important goal for cognitive science, for both theoretical and practical reasons. Theoretically, expertise is closely related to the structure of individual differences in knowledge, representation, decision-making, and a range of other cognitive capabilities (Wright & Bolger, 1992). Practically, the ability to identify and use experts is important in a wide range of real-world settings. There are many possible tasks that people could do to demonstrate their expertise, including estimating numerical values (e.g., "what is the length of the Nile?"), predicting categorical future outcomes ("who will win the FIFA World Cup?"), and so on.
In this paper, we focus on the task of ranking a set of given items in terms of some criterion, such as ordering a set of cities from most to least populous.

One prominent theory of expertise argues that the key requirements are discriminability and consistency (e.g., Shanteau, Weiss, Thomas, & Pounds, 2002). Experts must be able to discriminate between different stimuli, and they must be able to make these discriminations reliably or consistently. Protocols for measuring expertise in terms of these two properties are well developed, and have been applied in settings as diverse as audit judgment, livestock judgment, personnel hiring, and decision-making in the oil and gas industry (Malhotra, Lee, & Khurana, 2007). However, because these protocols need to assess discriminability and consistency, they have two features that will not work in all applied settings. First, they rely on knowing the answers to the discrimination questions, and so must have access to a ground truth. Second, they must ask the same (or very similar) questions of people repeatedly, and so are time-consuming. Given these limitations, it is perhaps not surprising that expertise is often measured in simpler and cruder ways, such as by self-report.

In this paper, we approach the problem of expertise from the perspective of cognitive modeling. The basic idea is to build a model of how a number of people with different levels of expertise produce judgments or estimates that reflect their knowledge. This requires making assumptions about how individual differences in knowledge are structured, and how people apply decision-making processes to their knowledge to produce answers.

There are two key attractive properties of this approach. The first is that, if a reasonable model can be formulated, the knowledge people have can be inferred by fitting the model to their behavior. This avoids the need to rely on self-reported measures of expertise, or to use elaborate protocols to extract a measure of expertise.
The cognitive model does all of the work, providing an account of task behavior that is sensitive to the latent expertise of the people who do the task.

The second attraction is that expertise is determined by making inferences about the structure of the different answers provided by individuals. This means that performance does not have to be assessed in terms of an accuracy measure relative to the ground truth. It is possible to measure the relative expertise of individuals without already having the expertise to answer the question.

The structure of this paper is as follows. We first describe an experiment that asks people to rank order sets of items, and to rate their expertise both before and after having done the ranking. We then describe a simple cognitive model of the ranking task, and use the model to infer individual differences in the precision of the knowledge each person has. In the results section, we show that this individual differences parameter provides a good measure of expertise, in the sense that it correlates well with actual performance. We also show that it outperforms the self-reported measures of expertise. We conclude with some discussion of the strengths and limitations of our cognitive modeling approach to assessing expertise.

Table 1: The six rank ordering tasks. Each involves ten items, shown in correct order. (Several columns of the table were garbled in this transcription; illegible entries are marked with "…".)

Holidays: New Year's, Martin Luther King Day, …
Landmass: …, China, United States, …
Amendments: Freedom of speech and religion, Right to bear arms, No quartering of soldiers, No unreasonable searches, Due process, Trial by jury, Civil trial by jury, No cruel punishment, Right to non-specified rights, Power for states and people
US Cities: …
Presidents: …
World Cities: …

Experiment

Participants

A total of 70 participants completed the experiment. Participants were undergraduate students recruited from the University of California, Irvine subject pool, and given course credit as compensation.

Stimuli

We used six rank ordering problems, all with ten items, as shown in Table 1. All involve general 'book' knowledge, and were intended to be of varying levels of difficulty for our participants, and to lead to individual differences in expertise.

Procedure

The experimental procedure involved three parts. In the first part, participants completed a pre-test self-report of their level of expertise in the general content area of each of the stimuli. This was done on a 5-point scale, simply by asking questions like "Please rate, on a scale from 1 to 5, where 1 is no knowledge and 5 is expert, your knowledge of the order of American holidays."

In the second part, participants completed each of the six ranking questions from Table 1 in a random order. Within each problem, the ten items were presented in an initially random order, and could then be 'dragged and dropped' to any part of the list to update the order. Participants were free to move items as often as they wanted, with no time restrictions. They hit a 'submit' button once they were satisfied with their answer.

The third part of the experimental procedure was completed immediately after each final ordering answer was submitted.
Participants were asked to express their level of confidence in their answer, again on a 5-point scale, where 1 was 'not confident at all' and 5 was 'extremely confident'.

Table 1 (continued):
US Cities: New York, Los Angeles, Chicago, Houston, Phoenix, Philadelphia, San Antonio, San Diego, Dallas, San Jose
Presidents: …, Roosevelt, Wilson, Roosevelt, Truman, Eisenhower
World Cities: Tokyo, Mexico City, New York, Sao Paulo, Mumbai, Delhi, Shanghai, Kolkata, Buenos Aires, Dhaka

A Thurstonian Model of Ranking

We use a previously developed Thurstonian model of how people complete ranking tasks (Steyvers, Lee, Miller, & Hemmer, 2009). Originally, this model was developed in the context of the 'wisdom of the crowd' phenomenon as applied to order data. The basic wisdom of the crowd idea is that the average of the answers of many individuals may be as good as or better than all of the individual answers (Surowiecki, 2004). An important component in developing good group answers is weighting those individuals who know more, and so the model we use is already designed to accommodate individual differences in expertise.

We first illustrate the model intuitively, and explain how its parameters can be interpreted in terms of levels of knowledge and expertise. We then provide some more formal details, including some information about the inference procedures we used to fit the model to our data.

Overview of Model

[Figure 1: Illustration of the Thurstonian model.]

The model is described in Figure 1, using a simple example involving three items and two individuals. Figure 1(a) shows the 'latent ground truth' representation for the three items, represented by µ1, µ2, and µ3 on an interval scale. Importantly, these coordinates do not necessarily correspond to the actual ground truth, but rather represent the knowledge that is shared among individuals. Therefore, these coordinates are latent variables in the model that can be estimated on the basis of the orderings from a group of individuals.

Figure 1(b) and (c) show how these items might give rise to mental representations for two individuals. The individuals might not have precise knowledge about the exact location of each item on the interval scale, due to some sort of noise or uncertainty. This mental noise might be due to a variety of sources, such as encoding and retrieval errors. In the model, all these sources of noise are combined into a single Gaussian distribution.[1]

[1] In our experiment, participants give only one ranking for each problem. Therefore, the model cannot disentangle the different sources of error related to encoding and retrieval.

The model assumes that the means of these item distributions are the same for every individual, because every individual is assumed to have access to the same information about the objective ground truth. The widths of the distributions, however, are allowed to vary, to capture the notion of individual differences. There is a single standard deviation parameter, σi, for the ith participant, that is applied to the distributions of all items. In Figure 1, Individual 1 is shown as having more precise item information than Individual 2, and so σ1 < σ2.

The model assumes that the realized (latent) mental representation is based on a single sample from each item distribution, represented by x in Figure 1, where xij is the sample for the jth item by the ith participant. The ordering produced by each individual is then based on an ordering of the mental samples. For example, Individual 1 in Figure 1(b) draws samples for the items that lead to the ordering (1,2,3), whereas Individual 2 draws a sample for the third item that is smaller than the sample for the second item, leading to the ordering (1,3,2). Therefore, the overlap in the item distributions can lead to errors in the orderings produced by individuals.

The key parameters in the model are µ and σi. In terms of the original wisdom of the crowd motivation, the most important was µ, because it represents the assumed common latent ordering that individuals share. Inferring this ordering corresponds to constructing a group answer to the ranking problem. In our context of measuring expertise, however, it is the σi parameters that are important. These are naturally interpreted as a measure of expertise. Smaller values will lead to more consistent answers, closer to the underlying ordering.
Larger values will lead to more variable answers, with more possibility of deviating from the underlying ordering.

Generative Model and Inference

Figure 2 shows the Thurstonian model, as it applies to a single question, using graphical model notation (see Koller, Friedman, Getoor, & Taskar, 2007; Lee, 2008; Shiffrin, Lee, Kim, & Wagenmakers, 2008, for statistical and psychological introductions). The nodes represent variables, and the graph structure is used to indicate the conditional dependencies between variables.

[Figure 2: Graphical representation of the Thurstonian model.]

Stochastic and deterministic variables are indicated by single- and double-bordered nodes, and observed data are represented by shaded nodes. The plates represent independent replications of the graph structure, which correspond to individual participants in this model.

The observed data are the ordering given by the ith participant, denoted by the vector yi, where yij represents the item placed in the jth position by the participant. To explain how these data are generated, the model begins with the underlying location of the items, given by the vector µ. Each individual is assumed to have access to this group-level information. To determine the order of the items, the ith participant samples the jth item as xij ~ Gaussian(µj, σi), where σi is the uncertainty that the ith individual has about the items, and the samples xij represent the realized mental representation for the individual. The ordering for each individual is determined by the ordering of their mental samples, yi = Rank(xi).

We used a flat prior for µ, and a σi ~ Gamma(λ, 1/λ) prior on the standard deviations, where λ is a hyperparameter that determines the variability of the noise distributions across individuals.
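As a concrete illustration, the generative process and the two MCMC updates described in this section can be sketched for a single participant. This is a minimal sketch, not the authors' implementation: the item locations, proposal width, chain length, and the reading of Gamma(λ, 1/λ) as a shape/scale pair are our own choices for the example, and µ is held fixed rather than given its own Metropolis-Hastings update.

```python
import math
import random

random.seed(0)

LAMBDA = 3.0  # hyperparameter of the Gamma(LAMBDA, 1/LAMBDA) prior on sigma


def simulate_ranking(mu, sigma):
    """Generative step: one Gaussian mental sample per item, then the
    participant reports the ordering of the samples (smallest first)."""
    x = [random.gauss(m, sigma) for m in mu]
    return sorted(range(len(mu)), key=lambda j: x[j])


def trunc_gauss(mean, sd, lo, hi):
    """Rejection-sample a Gaussian variate truncated to (lo, hi)."""
    while True:
        v = random.gauss(mean, sd)
        if lo < v < hi:
            return v


def gibbs_update_x(x, y, mu, sigma):
    """Resample each latent mental sample from its full conditional: a
    Gaussian truncated so the samples stay consistent with ordering y."""
    n = len(y)
    for pos, item in enumerate(y):
        lo = x[y[pos - 1]] if pos > 0 else -math.inf
        hi = x[y[pos + 1]] if pos < n - 1 else math.inf
        x[item] = trunc_gauss(mu[item], sigma, lo, hi)


def mh_update_sigma(sigma, x, mu):
    """Random-walk Metropolis-Hastings step on one participant's sigma,
    under a Gamma prior with shape LAMBDA and scale 1/LAMBDA (rate LAMBDA)."""
    def log_post(s):
        loglik = sum(-math.log(s) - (xi - m) ** 2 / (2 * s * s)
                     for xi, m in zip(x, mu))
        return loglik + (LAMBDA - 1) * math.log(s) - LAMBDA * s

    prop = abs(sigma + random.gauss(0.0, 0.1))  # proposal reflected at zero
    if math.log(random.random()) < log_post(prop) - log_post(sigma):
        return prop
    return sigma


# Tiny demonstration: one participant, three items at invented locations.
mu = [0.0, 1.0, 2.0]
y = simulate_ranking(mu, 0.3)  # observed ordering from a fairly precise person

# Initialize latent samples consistently with y, then run a short chain.
x = [0.0] * 3
for pos, item in enumerate(y):
    x[item] = float(pos)
sigma = 1.0
for _ in range(200):
    gibbs_update_x(x, y, mu, sigma)
    sigma = mh_update_sigma(sigma, x, mu)
```

The Gibbs step keeps the latent samples inside the region consistent with the observed ranking, which is what makes inference tractable despite the deterministic Rank operation.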
We set λ = 3 in the current modeling, but plan to explore a more general approach, where λ is given a prior and inferred, in future work.

Although the model is straightforward as a generative process for the observed data, some aspects of inference are difficult because the observed variable yi is a deterministic ranking. Yao and Böckenholt (1999), however, have developed appropriate Markov chain Monte Carlo (MCMC) methods. We used an MCMC sampling procedure that allowed us to estimate the posterior distribution over the latent variables xij, σi, and µ, given the observed orderings yi. We use Gibbs sampling to update the mental samples xij, and Metropolis-Hastings updates for σi and µ. Details of the MCMC inference procedure are provided in the appendix.

Results

We first describe how we measure the accuracy of a rank order provided by a participant, as a ground truth assessment of their expertise. We then examine the correlations between this ground truth and their pre- and post-reported self-assessments, and the model-based measure.

[Figure 3: Results comparing the relationship between the three measures of expertise and the accuracy of individual answers. The plots are organized with the measures in rows, and the problems in columns.]

Ground Truth Accuracy

To evaluate the performance of participants, we measured the distance between their provided order and the correct orders given in Table 1. A commonly used distance metric for orderings is Kendall's τ, which counts the number of adjacent pairwise disagreements between orderings. Values of τ range from 0 to n(n−1)/2, where n = 10 is the number of items. A value of zero means the ordering is exactly right, a value of one means that the ordering is correct except for two neighboring items being transposed, and so on, up to the maximum possible value of 45.

Relationship Between Expertise and Accuracy

Figure 3 presents the relationship between the three measures of expertise (pre-reported expertise, post-reported confidence, and the mean of the σ parameter inferred in the Thurstonian model) and the τ measures of accuracy. In each plot, a point corresponds to a participant. The plots are organized with the six problems in columns, and the three measures as rows. The Pearson correlations are also shown. Note that, for the self-reported measures, the goal is for higher levels of rated expertise to correspond to lower (more accurate) values of τ, and so a negative correlation would mean the measure was effective.
For the model-based σ measure, smaller values correspond to higher expertise, and so a positive correlation means the measure is effective.

Figure 3 shows that the six different problems ranged in difficulty. Looking at the maximum τ values needed to show the results, the Holidays, Amendments, US Cities and Presidents questions were more accurately answered than the Landmass and World Cities questions. This finding accords with our intuitions about the difficulty of the topic domains and the experience of our participant pool.

More importantly, there is a clear pattern, for all six problems, in the way the three expertise measures relate to accuracy. The correlations are generally in the right direction, but small in absolute size, for the pre-reported expertise. They continue to be in the right direction, and have larger absolute values, for the post-reported confidence measure of expertise. But the correlations are in the right direction, and strongest, for the model-based σ measure of expertise.

Perhaps most importantly, it is also clear that the model-based measure improves upon the self-reported measures. It achieves, for all but the World Cities problem, an impressively high level of correlation with accuracy. With correlations around 0.9, the σ measure of expertise explains about 80% of the variance between people in their accuracy in completing the rank orderings.[2]

[2] A legitimate concern is that the correlations for the Thurstonian model benefit from σ being continuous, whereas the pre- and post-report measures are binned. To check this, we also calculated correlations for the Thurstonian model using 5 binned values of σ, and found correlations of 0.88, 0.88, 0.80, 0.77, 0.92 and 0.54 for the six problems in the order shown in Figure 3. While slightly reduced, these correlations clearly support the same conclusions.
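For reference, the τ distance used throughout the Results is straightforward to compute by counting inversions. The following is a small sketch; the item names in the usage example are drawn from Table 1 but the orderings are invented:

```python
def kendall_tau(order, truth):
    """Kendall's tau distance: the number of adjacent pairwise
    disagreements (equivalently, inversions) between a submitted
    ordering and the correct one.  Ranges from 0 (exactly right)
    to n*(n-1)/2, which is 45 for the ten-item problems used here."""
    pos = {item: i for i, item in enumerate(truth)}
    ranks = [pos[item] for item in order]
    return sum(1 for i in range(len(ranks))
                 for j in range(i + 1, len(ranks))
                 if ranks[i] > ranks[j])


# One adjacent transposition costs exactly 1.
tau = kendall_tau(['Tokyo', 'New York', 'Mexico City'],
                  ['Tokyo', 'Mexico City', 'New York'])
```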

Discussion

We first discuss the advantages of the modeling approach we have explored for measuring expertise, then acknowledge some of its limitations, before finally mentioning some possible extensions.

Advantages

Our results could be used to make a strong case for the assessment of expertise, at least in the context of rank order questions, using the Thurstonian model. We have shown that, by having a group of participants complete the ordering task, the model can infer an interpretable measure of expertise that correlates highly with the actual accuracy of the answers.

One attractive feature of this approach is that it does not require self-ratings of expertise. It simply requires people to do the ordering task. Our results indicate that the model-based measure is much more useful than self-reported assessments taken before doing the task, focusing on general domain knowledge, or confidence ratings given after having done the task, focusing on the specific answer provided.

An even more attractive feature of the modeling approach is that it does not require access to the ground truth to assess expertise. We used ground truth accuracies to assess whether the measured expertise was useful, but we did not need the τ values to estimate the σ measures themselves. The model-based expertise emerges from the patterns of agreement and disagreement across the participants, under the assumption that there is some fixed (but unknown) ground truth, as per the wisdom of the crowd origins of the model.

A natural consequence is that the approach developed here could be applied to prediction tasks, where there is not (yet) a ground truth. For example, we could ask people to predict the end-of-season rankings of sports teams, and potentially use the model to assess their expertise ahead of time.
If the model-based approach continues to perform well with prediction, it would be especially valuable, since standard measures of expertise based on self-report have often been found to be unreliable predictors of forecasting accuracy (e.g., Tetlock, 2006).

Limitations

A basic property of the approach we have presented is that it involves assessing the relative expertise of a large group of people. There are two inherent limitations with this. One is that a possibly quite large number of participants need to complete the

