Journal of Vision (2014) 14(7):4, 1–16

Modeling visual clutter perception using proto-object segmentation

Chen-Ping Yu, Department of Computer Science, Stony Brook University, Stony Brook, NY, USA
Dimitris Samaras, Department of Computer Science, Stony Brook University, Stony Brook, NY, USA
Gregory J. Zelinsky, Department of Psychology and Department of Computer Science, Stony Brook University, Stony Brook, NY, USA

We introduce the proto-object model of visual clutter perception. This unsupervised model segments an image into superpixels, then merges neighboring superpixels that share a common color cluster to obtain proto-objects—defined here as spatially extended regions of coherent features. Clutter is estimated by simply counting the number of proto-objects. We tested this model using 90 images of realistic scenes that were ranked by observers from least to most cluttered. Comparing this behaviorally obtained ranking to a ranking based on the model's clutter estimates, we found a significant correlation between the two (Spearman's ρ = 0.814, p < 0.001). We also found that the proto-object model was highly robust to changes in its parameters and was generalizable to unseen images. We compared the proto-object model to six other models of clutter perception and demonstrated that it outperformed each, in some cases dramatically. Importantly, we also showed that the proto-object model was a better predictor of clutter perception than an actual count of the number of objects in the scenes, suggesting that the set size of a scene may be better described by proto-objects than objects.
We conclude that the success of the proto-object model is due in part to its use of an intermediate level of visual representation—one between features and objects—and that this is evidence for the potential importance of a proto-object representation in many common visual percepts and tasks.

Citation: Yu, C.-P., Samaras, D., & Zelinsky, G. J. (2014). Modeling visual clutter perception using proto-object segmentation. Journal of Vision, 14(7):4, 1–16, http://www.journalofvision.org/content/14/7/4, doi:10.1167/14.7.4.
Received December 14, 2013; published June 5, 2014. ISSN 1534-7362 © 2014 ARVO

Introduction

Behavioral studies of visual clutter

Clutter is defined colloquially as "a crowded or disordered collection of things". More operational definitions have also been proposed, defining clutter as "the state in which excess items, or their representation or organization, lead to a degradation of performance at some task" (Rosenholtz, Li, & Nakano, 2007, p. 3). Whatever definition one chooses, visual clutter is a perception that permeates our lives in an untold number of ways. It affects our ability to find things (e.g., Neider & Zelinsky, 2011), how products are marketed and sold to us (Pieters, Wedel, & Zhang, 2007), the efficiency with which we interact with devices (Stone, Fishkin, & Bier, 1994), and even whether we find displays aesthetically pleasing or not (Michailidou, Harper, & Bechhofer, 2008). For these reasons, clutter and its consequences have been actively researched over the past decade in fields as diverse as psychology and vision science, marketing, visualization, and interface design. The goal of this study is to apply techniques from computer vision to better quantify the behavioral perception of clutter, not only to make clutter estimates available to these widely varying domains but also to more fully understand this ubiquitous and important percept.

The effects of visual clutter have been studied most aggressively in the context of a search task, where several studies have shown that increasing clutter
negatively impacts the time taken to find a target in a scene (Mack & Oliva, 2004; Rosenholtz et al., 2007; Bravo & Farid, 2008; Henderson, Chanceaux, & Smith, 2009; van den Berg, Cornelissen, & Roerdink, 2009; Neider & Zelinsky, 2011). Fueling this interest in clutter among visual search researchers is the set size effect—the finding that search performance degrades as objects are added to a display. Many hundreds of studies have been devoted to understanding set size effects (e.g., Wolfe, 1998), but the vast majority of these have been in the context of very simple displays consisting of well-segmented objects. Quantifying set size in such displays is trivial—one need only count the number of objects. But how many objects are there in an image of a forest, or a city, or even a kitchen (Figure 1)? Is each tree or window a different object? What about each branch of a tree or each brick in a wall? It has even been argued that the goal of quantifying set size in a realistic scene is not only difficult, it is ill-conceived (Neider & Zelinsky, 2008). As the visual search community has moved over the past decade to more realistic scenes (Eckstein, 2011), it has therefore faced the prospect of abandoning its most cherished theoretical concept—the set size effect.

Figure 1. What is the set size of these scenes? Although quantifying the number of objects in realistic scenes may be an ill-posed problem, can you make relative clutter judgments between these scenes?

The quantification of visual clutter offers a potential solution to this problem. Given that search performance also degrades with increasing clutter (e.g., Henderson et al., 2009; Neider & Zelinsky, 2011), clutter has been proposed as a surrogate measure of the set size effect, one that can be applied to images of realistic scenes (Rosenholtz et al., 2007).
The logic here is straightforward: if it is not possible to quantify the number of objects in a scene, find a correlate to set size that can be quantified and use it instead.

Although models of clutter will be reviewed in the following section, one of the earliest attempts to model visual clutter used edge density—the ratio of the number of edges in an image to the image size (Mack & Oliva, 2004). This edge density model was followed shortly after by the more elaborate feature congestion model, which estimates clutter in terms of the density of intensity, color, and texture features in an image (Rosenholtz et al., 2007). Despite other more recent modeling efforts (Bravo & Farid, 2008; Lohrenz, Trafton, Beck, & Gendron, 2009; van den Berg et al., 2009), the simplicity and early success of the feature congestion model, combined with the fact that the code needed to run the model was available for public download, led to its adoption as a clutter quantification benchmark by the community of visual clutter researchers.

The feature congestion model has been extensively evaluated in studies of visual clutter. Prominent among these was a study by Henderson et al. (2009), who measured the effect of visual clutter on search behavior in terms of manual and oculomotor dependent variables and using images of real-world scenes as stimuli, which marked a departure from previous work that used simpler chart and map stimuli. They found that increasing visual clutter indeed negatively impacted search performance, both in terms of longer search times and a less efficient direction of gaze to targets, thereby supporting the claim that clutter can be used as a surrogate measure of set size in real-world scenes. However, they also found that the feature congestion model was no better than a simpler measure of edge density in predicting this effect of visual clutter on search.
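The edge density measure discussed above lends itself to a very short implementation. The sketch below is a minimal approximation, assuming a plain gradient-magnitude edge operator and an illustrative threshold; Mack and Oliva (2004) used a standard edge detector, so exact counts will differ.

```python
import numpy as np

def edge_density(gray, threshold=0.1):
    """Approximate edge density: edge pixels / total pixels.

    `gray` is a 2-D array of intensities in [0, 1]. Pixels whose gradient
    magnitude exceeds `threshold` count as edges. The operator and the
    threshold here are illustrative assumptions, not the published ones.
    """
    gy, gx = np.gradient(gray.astype(float))
    magnitude = np.hypot(gx, gy)
    return float((magnitude > threshold).mean())
```

On this measure a uniform image scores 0 and a densely textured image approaches 1, matching the intuition that more edges per unit area means more clutter.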
Building on this work, Neider and Zelinsky (2011) sought again to quantify effects of clutter on manual and oculomotor search behavior, this time using scenes that were highly semantically related to each other (thereby ruling out semantic contributions to any observed clutter effects). They did this by using the game SimCity to obtain a database of city images that grew visually more cluttered over the course of game play. Their results largely replicated those from the earlier Henderson et al. (2009) study, finding that edge density was at least as good as feature congestion in predicting the effect of clutter on search.

Computational models of clutter

In one of the earliest studies relating clutter to visual search, Bravo and Farid (2004) used "simple" stimuli, defined as objects composed of one material, and "compound" stimuli, defined as objects having two or more parts, and found an interesting interaction between
this manipulation and clutter. Search performance for simple and compound stimuli was roughly comparable when these objects were arranged into sparse search displays. However, when these objects were densely packed in displays, a condition that would likely be perceived as more cluttered, search efficiency was found to degrade significantly for the compound objects. This observation led to their quantification of clutter using a power law model (Bravo & Farid, 2008). This model uses a graph-based image segmentation method (Felzenszwalb & Huttenlocher, 2004) and a power law function having the form y = c·x^k, where x is a smallest segment-size parameter. Setting the exponent k to 1.32, they find the best-fitting c and use it as the clutter estimate for a given image. Using 160 "what's-in-your-bag" images (http://whatsinyourbag.com/), they reported a correlation of 0.62 between these clutter estimates and behavioral search time (Bravo & Farid, 2008).

As Bravo and Farid (2004) were conducting their seminal clutter experiments, Rosenholtz, Li, Mansfield, and Jin (2005) were developing their aforementioned feature congestion model of visual clutter. This influential model extracts color, luminance, and orientation information from an image, with color and luminance obtained after conversion to CIElab color space (Pauli, 1976) and orientation obtained by using orientation-specific filters (Bergen & Landy, 1991) to compute oriented opponent energy. The local variance of these features, computed through a combination of linear filtering and nonlinear operations, is then used to build a three-dimensional ellipse. The volume of this ellipse therefore becomes a measure of feature variability in an image, which is used by the model as the clutter estimate—the larger the volume, the greater the clutter (Rosenholtz et al., 2007).
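The core idea of feature congestion can be caricatured in a few lines. The sketch below is a deliberately simplified stand-in: it takes one three-dimensional feature vector per pixel (e.g., luminance plus two color-opponent channels) and scores clutter by the volume of the covariance ellipsoid, whereas the published model computes local feature variances over multiple scales through linear filtering and nonlinear pooling.

```python
import numpy as np

def covariance_ellipsoid_volume(features):
    """Clutter as feature variability (caricature of Rosenholtz et al., 2007).

    `features` is an (n_pixels, 3) array of per-pixel feature vectors.
    The determinant of the 3x3 covariance matrix is proportional to the
    squared volume of the covariance ellipsoid, so sqrt(det) serves as a
    scalar measure of how spread out the features are.
    """
    cov = np.cov(features, rowvar=False)
    return float(np.sqrt(np.linalg.det(cov)))
```

An image whose features are tightly clustered (e.g., a blank wall) yields a small volume; one whose features vary in all three dimensions yields a large volume and hence a larger clutter estimate.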
Using a variety of map stimuli, they tested their model against the edge density model (Mack & Oliva, 2004) and found that both predicted search times reasonably well (experiment 1; r = 0.75 for feature congestion, r = 0.83 for edge density). However, when the target was defined by the contrast threshold needed to achieve a given level of search performance (experiment 2), the feature congestion model (r = 0.93) outperformed the edge density model (r = 0.83).

More recently, Lohrenz et al. (2009) proposed their C3 (Color-Cluster Clutter) model of clutter, which derives clutter estimates by combining color density with global saliency. Color density is computed by clustering into polygons those pixels that are similar in both location and color. Global saliency is computed by taking the weighted average of the distances between each of the color density clusters. They tested their model in two experiments: one using 58 displays depicting six categories of maps (airport terminal maps, flowcharts, road maps, subway maps, topographic charts, and weather maps) and another using 54 images of aeronautical charts. Behavioral clutter ratings were obtained for both stimulus sets. These behavioral ratings were found to correlate highly with clutter estimates from the C3 model (r = 0.76 and r = 0.86 in experiments 1 and 2, respectively), more so than correlations obtained from the feature congestion model (r = 0.68 and r = 0.75, respectively).

Another recent approach, the crowding model of visual clutter, focuses on the density of information in a display (van den Berg et al., 2009). Images are first converted to CIElab color space, then decomposed using oriented Gabors and a Gaussian pyramid (Burt & Adelson, 1983) to obtain color and luminance channels. The luminance channel of the image is then filtered with difference-of-Gaussian filters to obtain a contrast image, and all of the channels are post-processed with local averaging.
It is this local averaging that is hypothesized to be the mechanism of crowding under this model. The channels are then "pooled" by taking a weighted average with respect to the center of the image, resulting in a progressive blurring radiating out from the image's center. Pooled results are compared to the original channels using a sliding window that computes the KL-divergence between the two, thereby quantifying the loss of information due to possible crowding. This procedure is repeated over all scales and features, and the results are finally combined by a weighted sum to produce the final clutter score. They evaluated their model on the 25 map images used to test the original version of the feature congestion model (Rosenholtz et al., 2005) and found a comparable correlation with the behavioral ratings (r = 0.84; van den Berg, Roerdink, & Cornelissen, 2007).

Image segmentation and proto-objects

Motivating the study of clutter is the assumption that objects cannot be meaningfully segmented from images of arbitrary scenes, but is this true? The computer vision community has been working for decades on this problem and has made good progress. Of the hundreds of scholarly reports on this topic, the ones that are most relevant to the goal of quantifying the number of objects in a scene (i.e., obtaining a set size) are those that use an unsupervised analysis of an image that requires no prior training or knowledge of particular object classes. Among these methods, the most popular have been normalized cut (Shi & Malik, 2000), mean-shift image segmentation (Comaniciu & Meer, 2002), and a graph-based method developed by Felzenszwalb and Huttenlocher (2004). However, despite clear advances and an impressive level of success, these methods are still far from perfect. Crucially, these methods are typically evaluated against a ground truth of object segmentations obtained from
human raters (the Berkeley segmentation dataset; Martin, Fowlkes, Tal, & Malik, 2001; Arbelaez, Maire, Fowlkes, & Malik, 2011), which, as already discussed, is purely subjective and also imperfect. This reliance on a human ground truth means that image segmentation methods, regardless of how accurate they become, will not be able to answer the question of how many objects exist in a scene, as this answer ultimately depends on what people believe is, and is not, an object.

Recognizing the futility of obtaining objective and quantifiable counts of the objects in scenes, the approach taken by most existing models of clutter (reviewed above) has been to abandon the notion of objects entirely. The clearest example of this is the feature congestion model, which quantifies the feature variability in an image irrespective of any notion of an object. Abandoning objects altogether, however, seems to us an overly extreme conceptual movement in the opposite direction; we believe there exists an alternative that finds a middle ground: rather than attempting to quantify clutter in terms of features or objects, attempt this quantification using something between the two—proto-objects.

The term proto-object, or pre-attentive object (Pylyshyn, 2001), was coined by Rensink and Enns (1995, 1998) and elaborated in later work by Rensink on coherence theory (Rensink, 2000). Coherence theory states that proto-objects are low-level representations of feature information computed automatically by the visual system over local regions of space, and that attention is the process that combines or groups these proto-objects to form objects. Under this view proto-objects are therefore the representations from which objects are built, with attention being the metaphorical hand that holds them together. Part of the appeal of proto-objects is that they are biologically plausible—requiring only the grouping of similar low-level features from neighboring regions of space.
This is consistent with the integration of information over increasingly large regions of space as processing moves farther from the feature detectors found in V1 (Olshausen, Anderson, & Van Essen, 1993; see also Eckhorn et al., 1988).

Since their proposal, proto-objects have appeared as prominent components in several models of visual attention. Orabona, Metta, and Sandini (2007) proposed a model based on proto-objects that are segmented using blob detectors, operators that extract blobs with Difference-of-Gaussian (Collins, 2003) or Laplacian-of-Gaussian (Lindeberg, 1998) filters, which are combined into a saliency map for their visual attention model. A similar approach was adopted by Wischnewski, Steil, Kehrer, and Schneider (2009), who proposed a model of visual attention that uses a color blob detector (Forssén, 2004) to form proto-objects. These proto-objects are then combined with the Theory of Visual Attention (TVA; Bundesen, 1990) to produce a priority map that captures both top-down and bottom-up contributions of attention, with the bottom-up contribution being the locally grouped features represented by proto-objects. Follow-up work has since extended this proto-object based model from static images to video, thereby demonstrating the generality of the approach (Wischnewski, Belardinelli, Schneider, & Steil, 2010).

The proto-object model of clutter perception

Underlying our approach is the assumption that, whereas quantifying and counting the number of objects in a scene is a futile effort, quantifying and counting proto-objects is not. We define proto-objects as coherent regions of locally similar features that can be used by the visual system to build perceptual objects. While conceptually related to other proto-object segmentation approaches reported in the behavioral vision literature, our approach differs from these in one key respect.
Although previous approaches have used blob detectors to segment proto-objects from saliency maps (Walther & Koch, 2006; Hou & Zhang, 2007), from bottom-up representations of feature contrast in an image (Itti, Koch, & Niebur, 1998; Itti & Koch, 2001), or have applied color blob detectors directly to an image or video (Wischnewski et al., 2010), this reliance on blob detection likely results in only a rough approximation of the information used to create proto-objects. Blob detectors, by definition, constrain proto-objects to have an elliptical shape, and this loss of edge information might be expected to lower the precision of any segmentation. The necessary consequence of this is that approaches using blob detection will fail to capture the fine-grained spatial structure of irregularly shaped real-world objects. It would be preferable to extract proto-objects using methods that retain this spatial structure so as to better approximate the visual complexity of objects in our everyday world. For this we turn to image segmentation methods from computer vision.

We propose the proto-object model of clutter perception, which combines superpixel image segmentation with a clustering method (mean-shift; Comaniciu & Meer, 2002) to merge featurally similar superpixels into proto-objects. These methods from computer vision are well suited to the goal of creating proto-objects, as they address directly the problem of grouping similar image pixels into larger contiguous regions of arbitrary shape. However, our proto-object model differs from standard image segmentation methods in one important respect. Standard methods aim to match extracted segments to a labeled ground truth segmentation of objects, as determined by human observers, where each segment corresponds to a complete and (hopefully) recognizable object. One example of this is the Berkeley Segmentation Dataset (Arbelaez et al., 2011), a currently
popular benchmark against which image segmentation methods can be tested and their parameters tuned. However, proto-objects are the fragments from which objects are built, making these object-based ground truths not applicable. Nor is it reasonable to ask observers to reach down into their mid-level visual systems to perform a comparable labeling of proto-objects. For better or for worse, there exists no ground truth for proto-object segmentation that can be used to evaluate models or tune parameters. We therefore use as a ground truth behaviorally obtained rankings of image clutter and then determine how well our proto-object model, and the models of others, can predict these rankings. Our approach is therefore interdisciplinary, applying superpixel segmentation and clustering methods from computer vision to the task of modeling human clutter perception.

Figure 2. Left: one of the images used in this study. Right, top row: a SLIC superpixel segmentation using 200 (left) and 1,000 (right) seeds. Right, bottom row: an entropy rate superpixel segmentation using 200 (left) and 1,000 (right) seeds. Notice that the superpixels generated by SLIC are more compact and regular, whereas those generated by the entropy rate method have greater boundary adherence but are less regular.

Methods

Computational

The proto-object model of clutter perception consists of two basic stages: a superpixel segmentation stage to obtain image fragments, followed by a clustering and merging stage to assemble these fragments into proto-objects. Given that proto-objects are then simply counted to estimate clutter, the core function of the model is captured by these two stages, which are detailed in the following sections.

Superpixel segmentation

We define an image fragment as a set of pixels that share similar low-level color features in some color space, such as RGB, HSV, or CIElab.
This makes an image fragment computationally equivalent to an image superpixel: an atomic region of an image containing pixels that are similar in some feature space, usually intensity or color (Veksler, Boykov, & Mehrani, 2010). Superpixels have become very popular as a preprocessing stage in many bottom-up image segmentation methods (Wang, Jia, Hua, Zhang, & Quan, 2008; Yang, Wright, Ma, & Sastry, 2008; Kappes, Speth, Andres, Reinelt, & Schnörr, 2011; Yu, Au, Tang, & Xu, 2011) and object detection methods (Endres & Hoiem, 2010; van de Sande, Uijlings, Gevers, & Smeulders, 2011) because they preserve the boundaries between groups of similar pixels. Boundary preservation is a desirable property, as it enables object detection methods to be applied to oversegmented images (i.e., many fragments) rather than individual pixels, without fear of losing important edge information. The first and still very popular superpixel segmentation algorithm is normalized cut (Shi & Malik, 2000). This method takes an image and a single parameter value, the number of desired superpixels (k), and produces a segmentation by analyzing the eigenspace of the image's intensity values. However, because the run time of this method increases exponentially with image resolution, it is not suitable for the large images (e.g., 800 × 600) used in most behavioral experiments, including ours. We therefore experimented with two more recent and computationally efficient methods for superpixel segmentation: the SLIC superpixel (Achanta et al., 2012) and the entropy rate superpixel (Liu, Tuzel, Ramalingam, & Chellappa, 2011).

Figure 2 shows representative superpixel segmentations using these two methods. Both methods initially distribute "seeds" evenly over an input image, the number of which is specified by a user-supplied input parameter (k), and these seeds determine the number of superpixels that will be extracted from an image. The algorithms then iteratively grow each seed's pixel
coverage by maximizing an objective function that considers edge strengths and local affinity until all the seeds have converged to a stationary segment coverage. It is worth noting that this approach of oversegmenting an image also fragments large uniform areas into multiple superpixels, as multiple seeds would likely have been placed within such regions (e.g., the sky is segmented into multiple superpixels in Figure 3). Because superpixel segmentation is usually used as a preprocess, this oversegmentation is not normally a problem, although clearly it is problematic for the present purpose. More fundamentally, because the parameter k determines the number of superpixels that are created, and the number of proto-objects will be used as our estimate of clutter, this user specification of k makes superpixel segmentation wholly inadequate as a direct method of proto-object creation and clutter estimation.

Figure 3. The computational procedure illustrated for a representative scene. Top row (left to right): a SLIC superpixel segmentation using k = 600 seeds; 51 clusters of median superpixel color using mean-shift (bandwidth = 4) in HSV color space; 209 proto-objects obtained after merging, normalized visual clutter score = 0.345; a visualization of the proto-object segmentation showing each proto-object filled with the median color from the corresponding pixels in the original image. Bottom row (left to right): an entropy rate superpixel segmentation using k = 600 seeds; 47 clusters of median superpixel color using mean-shift (bandwidth = 4) in HSV color space; 281 proto-objects obtained after merging, normalized visual clutter score = 0.468; a visualization of the proto-object segmentation showing each proto-object filled with the median color from the corresponding pixels in the original image.
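The superpixel stage can be reproduced with off-the-shelf implementations. The sketch below uses scikit-image's SLIC purely as an illustration; the image and parameter values are placeholders, not those of the study (which used 800 × 600 photographs with k = 600 seeds, and also tested entropy rate superpixels).

```python
import numpy as np
from skimage.segmentation import slic

# Placeholder image standing in for a scene photograph.
image = np.random.rand(120, 160, 3)

# SLIC (Achanta et al., 2012): k seeds are placed on a regular grid and grown
# by local k-means-style clustering in combined color + position space.
labels = slic(image, n_segments=100, compactness=10, start_label=0)

# One median color per superpixel, later fed to the clustering stage.
n_superpixels = labels.max() + 1
medians = np.array([np.median(image[labels == i], axis=0)
                    for i in range(n_superpixels)])
```

Note that the number of superpixels actually returned can deviate slightly from the requested k, and that this stage alone cannot serve as a clutter estimate, because k is fixed by the user rather than determined by the image.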
For these reasons we need a second clustering stage that uses feature similarity to merge these superpixel image fragments into coherent regions (proto-objects).

Superpixel clustering

To merge neighboring superpixels having similar features, we perform a cluster analysis on the color feature space. Given the singular importance placed on color in this study, three different color spaces are explored: RGB, HSV, and CIElab. In this respect, our approach is related to the C3 clutter model, which groups pixels by spatial proximity and color similarity if they fall under a threshold (Lohrenz et al., 2009). However, our work differs from this previous model by using mean-shift (Cheng, 1995; Comaniciu & Meer, 2002) to find the color clusters in a given image, then assigning each superpixel to one of these clusters based on the median color over the image fragment in the color feature space. We then merge adjacent superpixels (ones that share a boundary) falling within the same color cluster into a larger region, thereby forming a proto-object.

We should note that the mean-shift algorithm has itself been used as an image segmentation method (Comaniciu & Meer, 2002) and indeed is one of the methods that we evaluate in our comparative analysis. Mean-shift clusters data into an optimal number of groups by iteratively shifting every data point to a common density mode, with a bandwidth parameter determining the search area for the shift directions; the data that converge to the same density mode are considered to belong to the same cluster. This clustering algorithm has been applied to image segmentation by finding a density mode for every image pixel, then assigning pixels that converge to a common mode to the same cluster, again based on spatial proximity and color similarity. Doing this for all common modes results in a segmentation of pixels into coherent regions.
Our approach differs from this standard application of mean-shift in that we use the algorithm not for segmentation, but only for clustering. Specifically, mean-shift is applied solely to the
space of color medians in an image (i.e., using only the feature-space bandwidth parameter and not both the feature-space and spatial bandwidth parameters, as in the original formulation of the algorithm), where each median corresponds to a superpixel, and it returns the optimal number of color clusters in this space. Having clustered the data, we then perform the above-described assignment of superpixels to clusters, followed by merging, outside of the mean-shift segmentation method. By applying mean-shift at the level of superpixels, and by using our own merging method, we will show that our proto-object model is a better predictor of human clutter perception than standard mean-shift image segmentation.

Summary of the proto-object model

Figure 3 illustrates the key stages of the proto-object model of clutter perception, which can be summarized as follows:

1. Obtain superpixels for an image and find the median color for each. We will argue that our model is robust with respect to the specific superpixel segmentation method used and will show that the best results were obtained with entropy rate superpixels (Liu et al., 2011) using k = 600 initial seeds.
2. Apply mean-shift clustering to the color space defined by the superpixel medians to obtain the optimal number of color clusters in the feature space. We will again argue that our model is robust with respect to the specific color space that is used but that slightly better correlations with human clutter rankings were found using a bandwidth of four in an HSV color feature space.
3. Assign each superpixel to a color cluster based on median color similarity and merge adjacent superpixels falling into the same cluster to create a proto-object segmentation.
4. Normalize the proto-object quantification between zero and one by dividing the final number of proto-objects computed for an image by the initial number k of superpixel seeds.
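The steps above can be sketched schematically as follows. This is an illustration under several assumptions: scikit-learn's MeanShift stands in for the mean-shift implementation used in the study, the superpixel label map is taken as given (e.g., from SLIC), simple 4-neighborhood connected components define which same-cluster superpixels share a boundary, and the effect of the bandwidth value depends on the numeric range of the chosen color space.

```python
import numpy as np
from sklearn.cluster import MeanShift
from scipy.ndimage import label as cc_label

def proto_object_clutter(image, labels, bandwidth=4.0):
    """Merge superpixels by color cluster and count proto-objects.

    `image`:  (H, W, 3) array in the chosen color space (HSV in the study);
    `labels`: (H, W) integer superpixel map with labels 0..k-1.
    Returns the normalized clutter score, #proto-objects / k.
    """
    k = labels.max() + 1
    # Step 1 (tail): median color of each superpixel.
    medians = np.array([np.median(image[labels == i], axis=0) for i in range(k)])
    # Step 2: mean-shift over the color medians (feature space only).
    clusters = MeanShift(bandwidth=bandwidth).fit_predict(medians)
    # Step 3: map each pixel to its superpixel's color cluster, then merge
    # adjacent same-cluster superpixels via connected-component labeling.
    cluster_map = clusters[labels]
    n_proto = 0
    for c in np.unique(clusters):
        _, n = cc_label(cluster_map == c)   # 4-connected regions of cluster c
        n_proto += n
    # Step 4: normalize by the initial number of superpixel seeds.
    return n_proto / k
```

With real scenes, the count of merged regions serves as the clutter estimate; dividing by k simply maps it to [0, 1] so that scores are comparable across different seed counts.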
Higher normalized values indicate more cluttered images.

Behavioral

Behavioral data collection was limited to the creation of a set of clutter-ranked images. We did this out of concern that the previous image sets used to evaluate models were limited in various respects, especially in that some of these sets contained only a small number of images and that some scene types were disproportionately represented among these images—both factors that might severely