
11 J.A. Gray’s Reinforcement Sensitivity Theory (RST) of Personality

Alan Pickering and Philip Corr

Jeffrey Gray’s (1976, 1982) behavioural inhibition system (BIS) theory of anxiety has stood well the test of time. This theory of personality – which is now widely known as reinforcement sensitivity theory (RST) – has gradually evolved over the past 30 years, seeing its major revision in 2000 by Gray and McNaughton, and even further elaborations and refinements subsequently (McNaughton and Corr, 2004, 2008; Corr and McNaughton, 2008). However, recent data that have strengthened the general foundations of the neural basis of the theory have also forced significant modifications of, and additions to, its superstructure. These changes are not inconsequential; as such, predictions cannot now be based on prior knowledge of the 1982 version. These changes, we contend, have the potential to lead to confusion. A major purpose of this chapter is to review the current scientific status of Gray’s RST and draw out some of its major implications for future research.

RST is built upon a state description of neural systems and associated, relatively short-term, emotions and behaviours, which, according to the theory, give rise to longer-term trait dispositions of emotion and behaviour. This theory argues that statistically defined personality factors are sources of variation that are stable over time and that derive from underlying properties of an individual; it is these, and current changes in the environment, that comprise the neuropsychological foundations of ‘personality’. This assertion is demanded by the fact that personality traits account for behavioural differences between individuals presented with identical environments; also, behavioural differences show consistency across time. Thus, the ultimate goal of personality research is to identify the relatively static (underlying) biological variables that determine the (superficial) factor structure measured in behaviour. It would, of course, be a mistake to deny the relevance of the environment in controlling behaviour, but to produce consistent long-term effects, environmental influences must be mediated by, and instantiated in, biological systems.

THE SAGE HANDBOOK OF PERSONALITY THEORY AND ASSESSMENT

Gray’s approach to the biological basis of personality followed a particular pattern: (a) first identify the fundamental properties of brain-behavioural systems that might be involved in the important sources of variation observed in human behaviour and (b) then relate variations in these systems to known measures of personality. Central to this approach is the assumption that the variation observed in the functioning of these brain-behavioural systems comprises what we term ‘personality’. As discussed below, relating (a) to (b) has proved the major challenge to RST researchers.

Now, most RST studies have tested the unrevised (pre-2000) version of RST. But, as we shall see, in many crucial respects, the revised Gray and McNaughton (2000) theory of the underlying neural systems and their function is very different, leading to the formulation of new personality hypotheses, some of which stand in opposition to those generated from the unrevised theory (for more detailed discussion of these matters, see Corr, 2004, 2008; Corr and McNaughton, 2008; McNaughton and Corr, 2004, 2008).

‘CLASSIC’ (1970–2000) AND REVISED (2000–) REINFORCEMENT SENSITIVITY THEORY

Today, in personality research, it is common to relate personality factors to emotion and motivational systems, but this consensus did not prevail before the time of Gray’s original work. It is a mark of achievement that Gray’s (1970, 1982) approach is today so widely accepted, and the emergence of a neuroscience of personality can be seen to be largely shaped by his work. In a similar vein to Hans Eysenck’s (1957, 1967) theories before him, Gray’s innovation was to put together the existing pieces of the scientific jigsaw in order to provide the foundations of a general theory of personality. Gray, like Pavlov (1927) before him, advocated a twin-track approach: the conceptual nervous system (cns), and the central nervous system (CNS) (cf. Hebb, 1955). That is, the approach covers both the cns components of personality (e.g. learning theory; see Gray, 1975) and the component brain systems underlying systematic variations in behaviour (ex hypothesi, personality). As noted by Gray (1972a), these two levels of explanation must be compatible, but given a state of imperfect knowledge it would be unwise to abandon one approach in favour of the other. Gray used the language of cybernetics, in the form of a cns–CNS bridge, to show how the flow of information and control of outputs is achieved (e.g. the Gray and Smith, 1969, ‘arousal-decision’ model).

Theoretical origins of RST

In contrast to Gray’s bottom-up general approach, Hans Eysenck adopted a very different ‘top-down’ method. His search for causal systems was determined by the structure of statistically derived personality factors/dimensions. In an important respect, Eysenck’s approach was viable: this was to understand the causal bases of observed personality structure, defined as a unitary whole (e.g. extraversion and neuroticism). For this very reason, it is perhaps not surprising to learn that Eysenck’s causal systems never developed beyond the postulation of a small number of very general brain processes, principally the ascending reticular activating system (ARAS), underlying the dimension of introversion–extraversion and cortical arousal (for a summary see Corr, 2004). A second dimension, neuroticism (N), was related to activation of the limbic system and emotional instability (see Eysenck and Eysenck, 1985). Taken together, Gray’s and Eysenck’s approaches are complementary, tackling important problems at different levels of analysis.

Eysenck’s (1967) arousal theory of extraversion hypothesized that introverts and extraverts differ with respect to the sensitivity of their cortical arousal system; and this is in consequence of differences in response thresholds of their ARAS. According to this theory, compared with extraverts, introverts have lower response thresholds and thus higher cortical arousal. In general, introverts were said to be more cortically aroused and more arousable when faced with sensory stimulation. However, the extraversion-arousal champions marched under a banner upon which was blazoned an inverted-U symbol – chosen, in large measure, by virtue of the Pavlovian notion of transmarginal inhibition (TMI; a protective mechanism that breaks the link between increasing stimulus intensity and behaviour at high intensity levels – in the Hullian learning literature this effect went under the name of ‘stimulus intensity dynamism’). It was against this theoretical backdrop that RST developed.

Gray’s (1970, 1972b, 1981) modification of Eysenck’s theory proposed changes: (a) to the position of extraversion (E) and neuroticism (N) in Eysenckian factor space; and (b) to their neuropsychological bases. Gray argued that E and N should be rotated by approximately 30 degrees to form the more causally efficient axes of ‘punishment sensitivity’, reflecting anxiety (Anx), and ‘reward sensitivity’, reflecting impulsivity (Imp) (Figure 11.1; see Pickering et al., 1999).

This modification stated that Imp+ individuals are more sensitive to signals of reward, relative to Imp− individuals, and Anx+ individuals are more sensitive to signals of punishment, relative to Anx− individuals. The proposed independence of the axes suggested that (a) responses to reward should be the same at all levels of Anx and (b) responses to punishment should be the same at all levels of Imp – this position was dubbed the ‘separable subsystems hypothesis’ by Corr (2001, 2002). According to RST, Eysenck’s E and N dimensions are derivative secondary factors of these more fundamental punishment and reward sensitivities: E reflects the balance of punishment and reward sensitivities; N reflects their joint strengths (Gray, 1981).

Figure 11.1 Position in factor space of the fundamental punishment sensitivity and reward sensitivity (unbroken lines) and the emergent surface expressions of these sensitivities, viz. extraversion (E) and neuroticism (N) (broken lines). The current working hypothesis is that ‘punishment sensitivity’ – which, in the unrevised model, was labelled ‘anxiety’ – relates to both the FFFS and BIS. [Labels recoverable from the figure: FFFS (BIS), PUN: punishment sensitivity (‘anxiety’); BAS, REW: reward; neuroticism; stability.]

Clinical neurosis

Eysenck’s taxonomic model of personality was based on the factor analysis of the symptoms of war ‘neurotics’ (1944, 1947), and his 1957 and 1967 causal theories were designed to explain the genesis of these neuroses; it is, thus, on these grounds that the theory is critically tested. In brief, Eysenck postulated that introverts are more prone to suffer from anxiety disorders by virtue of their greater conditionability, especially of emotional responses. This theory was later elaborated to include the notion of incubation effects in conditioning (Eysenck, 1979), in order to account for the ‘neurotic paradox’ (i.e. the failure of extinction with continued non-reinforcement of the CS). Coupled with emotional instability, reflected in N, this made the introverted neurotic (E−/N+) particularly prone to anxiety disorders.

However, from the very beginning of this arousal-based theory of personality, a number of problems refused to be silenced. For one, introverts show weaker classical conditioning under conditions conducive to high arousal (which, we must assume, is also induced by aversive UCSs), as seen in eyeblink conditioning studies (Eysenck and Levey, 1967). This finding supports Eysenck’s own theory that introverts are transmarginally inhibited by high arousal, but at the very same moment fails to explain adequately the genesis of clinical neurosis. Other problems also screamed out to be heard. For example, impulsivity (inclined into the N plane; see Figure 11.1), not sociability (defining the extraversion axis), is often found to be associated with conditioning effects (Eysenck and Levey, 1972), but this places high arousability, and thus high conditionability, along an axis that is orthogonal to the one which has its high pole in the neurotic-introvert quadrant where clinical neurosis is located. Thus, Eysenck’s own theory seems unable to explain the development of anxiety in neurotic-introverts. Time-of-day effects further undermine the central postulates of Eysenck’s personality theory of clinical neurosis (see Gray, 1981).

In addition to the above problems, Gray cited a further reason to prefer a non-conditioning explanation (Corr, 2008). Now, classical conditioning theory states that as a result of the conditioned stimulus (CS) and unconditioned stimulus (UCS) being systematically paired, the CS comes to take on many of the eliciting properties of the UCS. That is, when presented alone after conditioning, the CS produces a response (i.e. the conditioned response, CR) that resembles the unconditioned response (UCR) elicited by the UCS. However, the CR does not substitute for the UCR – in several important respects, the CR does not even resemble the UCR. For example, a pain UCS will elicit a wide variety of reactions (e.g. vocalization and behavioural excitement) which are quite different to those elicited by a CS signalling pain, which consist of a quite different set of behaviours (e.g. quietness and behavioural inhibition). We thus have a theory that does not seem fit for purpose: classical conditioning cannot explain the pathogenesis or phenomenology of neurosis, although it can explain how initially neutral stimuli (CSs) acquire the motivational power to elicit this state. Gray asked the crucial question: if classical conditioning does not account for the generation of the negative emotional state that characterises neurosis, then what does? His answer – based upon extensive animal research (e.g. behavioural, pharmacological, lesion, and electrical stimulation studies) – was an innate mechanism, namely the behavioural inhibition system (BIS; Gray, 1976, 1982).

Three systems of ‘classic’ RST

RST gradually developed over the years to include three major systems of emotion:

1 The behavioural inhibition system (BIS) was postulated to be sensitive to conditioned aversive stimuli (i.e. signals of both punishment and the omission/termination of reward) relating to Anx, but also to extreme novelty, high-intensity stimuli, and innate fear stimuli (e.g. snakes, blood), which are more related to fear.

In addition, two other systems were postulated:

2 The fight/flight system (FFS) was postulated to be sensitive to unconditioned aversive stimuli (i.e. innately painful stimuli), mediating the emotions of rage and panic. This system was related to the state of negative affect (NA) (associated with pain) and speculatively associated by Gray with Eysenck’s trait of psychoticism.

3 The behavioural approach system (BAS) was postulated to be sensitive to conditioned appetitive stimuli, forming a positive feedback loop, activated by the presentation of stimuli associated with reward and the termination/omission of signals of punishment. This system was related to the state of positive affect (PA) and the trait of Imp.

The BIS was modelled on the detailed pattern of behavioural effects of classes of drugs known to affect anxiety in human beings. By this route, Gray argued, anxiety could be operationally specified as those behaviours changed by anxiolytic drugs. Of course, there exists here the danger of circularity of argument; this was avoided by the postulation that anxiolytic drugs do not simply reduce anxiety (itself a vacuous tautology), but could be shown to have a number of behavioural effects in typical animal learning paradigms. Experimental evidence showed that anti-anxiety drugs affected responses to conditioned aversive stimuli, the omission of expected reward and conditioned frustration, all of which Gray postulated were mediated by a BIS, which was responsible for suppressing ongoing operant behaviour in the face of threat, as well as enhancing information processing and vigilance. (We shall see that in the revised theory, these effects can be reclassified as conflict effects.) Later, the BAS was added to account for behavioural reactions to rewarding stimuli – these were largely unaffected by anti-anxiety drugs. The danger of a circularity of argument was further reduced by the behavioural profile of the newer classes of anxiolytics which, it turned out, had the same behavioural effects and acted on the same neural systems as the older class of drugs, despite the fact that they had different psychopharmacological modes of action and side-effects (Gray and McNaughton, 2000).

Revised (2000–) RST

The Gray and McNaughton (2000) revised theory updates and extends the ‘classic’ version. These changes are, in parts, substantial but, in other parts, more a clarification of the 1982 theory. Revised RST postulates three systems.

1 The fight–flight–freeze system (FFFS) is responsible for mediating reactions to aversive stimuli of all kinds, conditioned and unconditioned. It further proposes that there exists a hierarchical array of neural modules, responsible for avoidance and escape behaviours. Now, the FFFS mediates the emotion of fear, not anxiety. The associated personality factor comprises fear-proneness and avoidance, which is clinically mapped onto such disorders as phobia and panic.

2 The BAS mediates reactions to all appetitive stimuli, conditioned and unconditioned. This system generates the appetitively hopeful emotion of ‘anticipatory pleasure’, and hope itself. The associated personality comprises optimism, reward-orientation and impulsiveness, which clinically maps onto addictive behaviours (e.g. pathological gambling) and various varieties of high-risk, impulsive behaviour, and possibly the appetitive component of mania. The BAS is largely unchanged in the revised Gray and McNaughton version of RST.

3 The BIS is responsible, not, as in the 1982 version, for mediating reactions to conditioned aversive stimuli and the special class of innate fear stimuli, but for the resolution of goal conflict in general (e.g. between BAS-approach and FFFS-avoidance, as in foraging situations – but it is also involved in BAS–BAS and FFFS–FFFS conflicts). The BIS generates the emotion of anxiety, which entails the inhibition of prepotent conflicting behaviours, the engagement of risk assessment processes, and the scanning of memory and the environment to help resolve concurrent goal conflict.

The BIS resolves conflicts by increasing, through recursive loops, the negative valence of stimuli (these are adequate inputs into the FFFS), until behavioural resolution occurs in favour of approach or avoidance. Subjectively, this state is experienced as worry and rumination. The associated personality comprises worry-proneness and anxious rumination, leading to being constantly on the look-out for possible signs of danger, which map clinically onto such conditions as generalized anxiety and obsessional-compulsive disorder (OCD). There is an optimal level of BIS activation: too little leads to risk seeking (e.g. psychopathy) and too much to risk aversion (generalized anxiety), both reflecting suboptimal conflict resolution.

NEUROPSYCHOLOGICAL STRUCTURE OF THE REVISED THEORY

Revised RST agrees with the classical version in its assertion that substantive affective events fall into just two distinct major classes: positive and negative (Gray, 1975; Gray, 1982; Gray and McNaughton, 2000). Rewards and punishments are the obvious exemplars of positive and negative events, respectively. But, importantly for human experiments, the absence of an expected positive event is functionally the same as the presence of a negative event and vice-versa (Gray, 1975). Omission of expected reward is thus punishing. Similarly, the absence of an expected negative event is functionally the same as the presence of a positive event. Omission of punishment is rewarding. This basic scheme gives rise to a two-dimensional model of the neuropsychology of emotion, motivation, and personality that simplifies the theory, as well as serving as a point of unification of the otherwise complex arrangement of the separate neural modules underlying behaviour (McNaughton and Corr, 2004).

Fear and anxiety – defensive direction

The first dimension, ‘defensive direction’, is categorical. It rests on a functional distinction between behaviours that remove an animal from a source of danger (FFFS-mediated) and those that allow it cautiously to approach a source of potential danger (BIS-mediated). These functions are ethologically and pharmacologically distinct and, on each of these separate grounds, can be identified with fear and anxiety, respectively. The revised theory treats fear and anxiety as not only quite distinct but also, in a sense, as opposites.

The categorical separation of fear from anxiety as classes of defensive responses has been demonstrated by Robert and Caroline Blanchard (Blanchard and Blanchard, 1988, 1990; Blanchard et al., 1997).

The Blanchards used ‘ethoexperimental analysis’ of the innate reactions of rats to cats to determine the functions of specific classes of behaviour. One class of behaviours was elicited by the immediate presence of a predator. This class could clearly be attributed to a state of fear. The behaviours, grouped into the class on purely ethological grounds, were sensitive to panicolytic drugs but not to drugs that are specifically anxiolytic. This is consistent with the insensitivity to anxiolytic drugs of active avoidance in a wide variety of species, and phobia in humans is also insensitive to anxiolytic
