

Proceedings of the Society for Computation in Linguistics
Volume 4, Article 20
2021

Modeling cross-linguistic production of referring expressions

Brandon Waldon, Stanford University, bwaldon@stanford.edu
Judith Degen, Stanford University, jdegen@stanford.edu

Follow this and additional works at: https://scholarworks.umass.edu/scil
Part of the Computational Linguistics Commons

Recommended Citation
Waldon, Brandon and Degen, Judith (2021) "Modeling cross-linguistic production of referring expressions," Proceedings of the Society for Computation in Linguistics: Vol. 4, Article 20.
Available at:

This Paper is brought to you for free and open access by ScholarWorks@UMass Amherst. It has been accepted for inclusion in Proceedings of the Society for Computation in Linguistics by an authorized editor of ScholarWorks@UMass Amherst. For more information, please contact scholarworks@library.umass.edu.

Modeling cross-linguistic production of referring expressions

Brandon Waldon
Stanford University
bwaldon@stanford.edu

Judith Degen
Stanford University
jdegen@stanford.edu

Abstract

We present a novel probabilistic model of referring expression production, synthesizing recent analyses proposed within the Rational Speech Act (RSA) framework (Frank and Goodman, 2012). Our model makes incremental utterance choice predictions (Cohn-Gordon et al. 2018a; Cohn-Gordon et al. 2018b) and assumes a non-deterministic semantics for adjectives in referring expressions (Degen et al. 2020). The model captures previously attested production patterns in reference game experiments, including English speakers’ tendency to produce redundant color adjectives more frequently than redundant size adjectives, as well as Spanish speakers’ tendency to employ redundant color adjectives less frequently than English speakers. We report the predictions made by the model under various parameter regimes, motivating future empirical work. [1]

[1] We thank the audience at the interActive Language Processing Lab at Stanford (ALPS), Vera Gribanova, and Beth Levin for helpful feedback and discussion. We also gratefully acknowledge three anonymous SCiL reviewers and the Spanish judgments provided by four informants: Evelyn Rocio Fernandez-Lizarraga, Sabrina Grimberg, Adolfo Hermosillo, and Erika Petersen.

1 Using language to refer

A key communicative use of language is to refer. Understanding the constraints on referring expression production has therefore been a key enterprise in experimental and computational psycholinguistics alike (Pechmann, 1989; Sedivy, 2003; Gatt et al., 2011; van Deemter et al., 2012; Dale and Reiter, 1995). Here we focus on reference to objects presumed to be in visual common ground between speaker and listener.

Figure 1 contains two theoretically-relevant types of referring contexts which will be the focus of this paper. Their respective names, the size-sufficient (SS) scene and the color-sufficient (CS) scene, derive from expectations of how pragmatically competent speakers can use language to unambiguously establish reference to the target object o_small_blue (highlighted by green border). Grice (1975) proposed that in order to recover speaker meaning, listeners employ interpretive heuristics that can be formulated in terms of assumptions about how cooperative speakers behave in conversation, including that they should be as informative, but no more informative, than required. This has been interpreted as an expectation that speakers produce the minimally informative expressions that meet the standards of communicative sufficiency in context. In the SS scene, one can establish reference to o_small_blue using just a size adjective plus a head noun (e.g. the small pin). In the CS scene, one can refer to that object using just a color adjective plus a head noun (e.g. the blue pin).

Contra what we might expect in light of the above discussion, speakers routinely produce redundant adjectival modifiers in referential contexts (Pechmann, 1989; Nadig and Sedivy, 2002; Maes et al., 2004; Engelhardt et al., 2006; Arts et al., 2011; Koolen et al., 2011). For example, speakers produce the small blue pin to refer to o_small_blue in the SS scene, where the modifier blue is an instance of redundant color modification. The small blue pin in the CS scene is an instance of redundant size modification, which is much more rarely attested.

In addition, rates of redundant modification appear to vary cross-linguistically. Languages such as Spanish, in which modification tends to occur post-nominally, exhibit lower rates of redundant color modification than does English, in which the canonical adjective placement is pre-nominal (Rubio-Fernández, 2016).
As Rubio-Fernández (2016) and Cohn-Gordon et al. (2018b) discuss, this result suggests the need to design theories of referring expression production that are sensitive to the linear order of words within those expressions, and to the inherently incremental nature of linguistic production and comprehension.

Proceedings of the Society for Computation in Linguistics (SCiL) 2021, pages 206-215. Held on-line February 14-19, 2021

Figure 1: Scenes of interest, utterance alternatives in English, and “Englishified” utterance alternatives in our three hypothetical Spanish idiolects. Bolded utterances include a redundant adjectival modifier for the purposes of referring to the target object (highlighted in green border).

Size-sufficient (SS) scene
Objects: o_big_blue, o_big_red, o_small_blue (target)
English: blue pin, red pin, big pin, small pin, big blue pin, big red pin, small blue pin
Spanish idiolect, both adjectives postnominal: pin blue, pin red, pin big, pin small, pin blue big, pin red big, pin blue small
Spanish idiolect, size prenominal and color postnominal: pin blue, pin red, pin big, pin small, big pin blue, big pin red, small pin blue
Spanish-postnom.-conj.: pin blue, pin red, pin big, pin small, big pin blue, big pin red, pin small and blue, pin blue and small

Color-sufficient (CS) scene
Objects: o_small_red, o_big_red, o_small_blue (target)
English: blue pin, red pin, big pin, small pin, small red pin, big red pin, small blue pin
Spanish idiolect, both adjectives postnominal: pin blue, pin red, pin big, pin small, pin red small, pin red big, pin blue small
Spanish idiolect, size prenominal and color postnominal: pin blue, pin red, pin big, pin small, small pin red, big pin red, small pin blue
Spanish-postnom.-conj.: pin blue, pin red, pin big, pin small, small pin red, big pin red, pin small and blue, pin blue and small

In this paper, we present a novel computational model of speakers’ choice of referring expression, synthesizing recent analyses proposed within the Rational Speech Act (RSA) framework. In Section 2, we review relevant findings from the experimental literature on linguistic reference, including within-language and cross-linguistic patterns in production choice that inform our desiderata of a successful computational model. In Section 3, we examine the properties of existing models and argue that a synthesis of those models is necessary to meet our desiderata. We present that synthesis in Section 4 and report the cross-linguistic predictions made by the model under various parameter regimes.
Section 5 extends the analysis to various possible Spanish idiolects, which vary according to their preferred complex multi-adjectival determiner phrase (DP) structures.

2 Previous experimental findings

A theory of referring expression production should explain observed human production choices. We focus on two phenomena that such a theory should capture: the color/size asymmetry in overmodification observed in English, and cross-linguistic variation in overmodification.

2.1 The color/size asymmetry

Redundant modification is attested in both SS and CS-like contexts. However, in English, Dutch, and German (the prenominal adjective languages which have received the most attention), redundant color modification is much more frequent than redundant size modification (Degen et al., 2020; Gatt et al., 2011; Koolen et al., 2013; Pechmann, 1989; Sedivy, 2003).

2.2 Cross-linguistic variation

Linear ordering of DPs varies cross-linguistically, and there is empirical evidence that this variation patterns with differing rates of overmodification across languages. In particular, Rubio-F
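
To make the color/size asymmetry of Section 2.1 concrete, here is a minimal Python sketch of a one-shot RSA-style speaker with graded (non-deterministic) adjective meanings, applied to the SS and CS scenes of Figure 1. It is only an illustration under stated assumptions and not the model proposed in this paper. It is not incremental. The fit values in `FIT`, the rationality parameter `alpha`, and `cost_per_adjective` are hypothetical numbers chosen for the example, and all function names are invented here.

```python
import math

# Hypothetical semantic fit values (assumptions, not taken from the paper):
# how reliably an adjective of each kind applies to an object that has the property.
FIT = {"color": 0.99, "size": 0.80}
SIZE_WORDS = {"big", "small"}

# Objects are (size, color) pairs; both scenes share the same target.
SCENES = {
    "SS": [("big", "blue"), ("big", "red"), ("small", "blue")],
    "CS": [("small", "red"), ("big", "red"), ("small", "blue")],
}
TARGET = ("small", "blue")

# Candidate utterances as tuples of adjectives; the head noun "pin" is
# omitted because it applies equally to every object.
UTTERANCES = [("small",), ("blue",), ("small", "blue")]


def meaning(utterance, obj):
    """Graded truth value: product of per-adjective fit values."""
    value = 1.0
    for word in utterance:
        fit = FIT["size" if word in SIZE_WORDS else "color"]
        value *= fit if word in obj else 1.0 - fit
    return value


def literal_listener(utterance, scene):
    """Distribution over objects, proportional to graded meaning."""
    total = sum(meaning(utterance, o) for o in scene)
    return {o: meaning(utterance, o) / total for o in scene}


def speaker(target, scene, alpha=5.0, cost_per_adjective=0.1):
    """Soft-max speaker over utterances (alpha and cost are assumed values)."""
    scores = {}
    for u in UTTERANCES:
        informativity = math.log(literal_listener(u, scene)[target])
        utility = informativity - cost_per_adjective * len(u)
        scores[u] = math.exp(alpha * utility)
    total = sum(scores.values())
    return {u: s / total for u, s in scores.items()}


ss = speaker(TARGET, SCENES["SS"])
cs = speaker(TARGET, SCENES["CS"])
print("P(small blue | SS scene):", round(ss[("small", "blue")], 3))
print("P(small blue | CS scene):", round(cs[("small", "blue")], 3))
```

Under these assumed values, the two-adjective utterance is more probable in the SS scene, where the color adjective is redundant, than in the CS scene, where the size adjective is redundant. The reason is that a noisy size adjective leaves more uncertainty for an added color word to resolve than the reverse. Other parameter choices can shrink or reverse the gap, so the numbers carry no empirical weight.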
