Cluster Analysis - Cleveland State University


Cluster Analysis
Presented by: Lauren Franklin and Maria Bakarman
COM 631
April 2017

I. Model

Data Set: Film and TV Usage National Survey 2015 (Jeffres & Neuendorf)

Internal/clustering variables (4 scales from 25 items total):

Tech Savvy: a 6-item additive scale (alpha = .770) consisting of:
  Q28A - I often watch videos on my cellphone
  Q28B - I often search for videos on YouTube to watch
  Q28C - I often share videos via Facebook
  Q28D - I often share videos on Instagram
  Q28E - I like to watch TV shows on laptop/tablets/phone
  Q28F - I like to make short videos that I can share with others
  (All measured on a 7-point response scale, where 1 = Not at all like and 7 = Very much like)

Traditionalist: a 4-item additive scale (alpha = .612) consisting of:
  Q29B - I am more traditionalist, preferring to read physical copies
  Q29C - I like the variety of entertainment available today but sometimes I feel it is too much
  Q29D - I think that the new technologies have begun to dominate our lives
  Q29G - I would still rather talk to people over the phone than text
  (All measured on a 7-point Likert-like response scale, where 1 = completely disagree and 7 = completely agree)

Leisure Tech Savvy: a 6-item additive scale (alpha = .525) consisting of:
  Q3G - watch a film not at a theater
  Q3H - surf the internet for pleasure, not work
  Q3I - check my email
  Q3J - go on Facebook
  Q3K - play video games on some device
  Q3O - text family and friends rather than calling them on phones
  (All measured on an 8-point response scale, where 1 = Never and 8 = Several times each day)

Leisure Traditionalist: an 8-item additive scale (alpha = .695) consisting of:
  Q3B - listen to the radio
  Q3C - read a magazine
  Q3D - read a book
  Q3E - read a newspaper
  Q3F - go out to see a film in a theater
  Q3L - go to see live musical concerts/events
  Q3A - watch television
  Q3M - go to see live plays performed in the theater
  (All measured on an 8-point response scale, where 1 = Never and 8 = Several times each day)

External Variables/Profiling Variables:

Income:
  1 = 15,000 or less
  2 = 15,001 to 20,000
  3 = 20,001 to 30,000
  4 = 30,001 to 40,000
  5 = 40,001 to 50,000
  6 = 50,001 to 75,000
  7 = 75,001 to 100,000
  8 = 100,001 to 125,000
  9 = 125,001 to 150,000
  10 = 150,001 or more
G1: Male = 0, Female = 1
Q18d: how often watch sci-fi genre
Q18dd: how often watch superhero
Q18q: how often watch chick flicks
Q18g: how often watch film noir
Q18b: how often watch western
(All Q18 items measured on a 6-point Likert-like response scale, where 1 = never and 6 = all the time)
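The additive scales and alpha values reported above can be illustrated with a small sketch. It is not the authors' SPSS procedure: it assumes the usual definition of Cronbach's alpha (item variances compared with the variance of the summed scale), and the function names and the response data are hypothetical.

```python
from statistics import variance


def additive_scale(rows):
    """Sum each respondent's item scores into one scale score."""
    return [sum(row) for row in rows]


def cronbach_alpha(rows):
    """Cronbach's alpha for respondents (rows) by items (columns)."""
    k = len(rows[0])                      # number of items in the scale
    items = list(zip(*rows))              # one tuple of scores per item
    item_variance_sum = sum(variance(col) for col in items)
    total_variance = variance(additive_scale(rows))
    return (k / (k - 1)) * (1 - item_variance_sum / total_variance)


# Hypothetical responses: 5 respondents x 4 items on a 7-point scale.
responses = [
    [2, 3, 2, 3],
    [5, 5, 6, 5],
    [7, 6, 7, 7],
    [3, 3, 4, 2],
    [4, 5, 4, 5],
]
scores = additive_scale(responses)
alpha = cronbach_alpha(responses)
print(scores)
print(round(alpha, 3))
```

A scale built from items that move together gives an alpha near 1; the lower alphas reported above (for example .525) indicate items that are only loosely related.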

II. Running SPSS

1- Analyze > Classify > Hierarchical Cluster.
2- Select your Internal Variables for analysis. The four scales: Techsavvy, Traditionalists, Leisure Techsavvy, and Leisure Traditionalists.
3- Click “Statistics” Box.
4- Make sure that the “Agglomeration Schedule” box is checked.
5- Then, under Cluster Membership, check the circle “Range of Solutions”.
6- Indicate your chosen minimum number of clusters and the maximum number of clusters (e.g., 3 to 6, or 4 to 7).
7- Then click “Continue”.
8- Click “Plots” Box.
9- Note that you must select either the “Dendrogram” box or something under “Icicle”. We ran Icicle, All Clusters.
10- Then click “Continue”.
11- Click “Method” Box.
12- From the “Cluster Method” drop-down arrow, select “Ward’s Method”.
13- Under “Measure”, select the “Interval” circle.
14- From the drop-down arrow, select “Squared Euclidean Distance”.
15- Then click “Continue”.
16- Click “Save” Box.
17- Under “Cluster Membership”, select the circle “Range of Solutions”. Type your chosen minimum (e.g., 4) into the “Minimum number of clusters” box and type your chosen maximum (e.g., 7) into the “Maximum number of clusters” box.
18- Then click “Continue”.
19- Click “OK” Box (or “Paste” to save syntax and then run).

Note: This point marks the end of the actual Cluster procedure in SPSS. The Hierarchical Cluster Analysis procedure has produced an Agglomeration Schedule and a Cluster Membership Table in SPSS output. This procedure has also created and saved at the end of the dataset new nominal variables. In our specific example, a 4-cluster variable, a 5-cluster variable, a 6-cluster variable, and a 7-cluster variable have all been produced and added to the end of the dataset.
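To make the choices in steps 12 and 14 concrete, here is a minimal sketch of agglomerative clustering with Ward's criterion on squared Euclidean distances. It is an illustration under stated assumptions, not SPSS's implementation: the function name `ward_cluster` and the four data points are hypothetical, and the running total it records is the cumulative within-cluster sum of squares, which is assumed here to play the role of the coefficient column in the Agglomeration Schedule.

```python
def ward_cluster(points, n_clusters=1):
    """Merge clusters by Ward's criterion until n_clusters remain.

    Returns (memberships, schedule). Each schedule row is
    (members_a, members_b, cumulative_within_cluster_ss).
    """
    # cluster id -> (size, centroid, member case indices)
    clusters = {i: (1, list(p), [i]) for i, p in enumerate(points)}
    schedule = []
    within_ss = 0.0
    next_id = len(points)

    while len(clusters) > n_clusters:
        best = None
        ids = sorted(clusters)
        for pos, a in enumerate(ids):
            for b in ids[pos + 1:]:
                size_a, cen_a, _ = clusters[a]
                size_b, cen_b, _ = clusters[b]
                sq_euclid = sum((x - y) ** 2 for x, y in zip(cen_a, cen_b))
                # Increase in within-cluster sum of squares if a and b merge.
                increase = size_a * size_b / (size_a + size_b) * sq_euclid
                if best is None or increase < best[0]:
                    best = (increase, a, b)

        increase, a, b = best
        size_a, cen_a, mem_a = clusters.pop(a)
        size_b, cen_b, mem_b = clusters.pop(b)
        size = size_a + size_b
        centroid = [(size_a * x + size_b * y) / size for x, y in zip(cen_a, cen_b)]
        clusters[next_id] = (size, centroid, mem_a + mem_b)
        next_id += 1

        within_ss += increase
        schedule.append((sorted(mem_a), sorted(mem_b), round(within_ss, 4)))

    memberships = sorted(sorted(members) for _, _, members in clusters.values())
    return memberships, schedule


# Hypothetical data: four cases measured on two variables.
cases = [(0, 0), (0, 1), (10, 10), (10, 11)]
two_clusters, _ = ward_cluster(cases, n_clusters=2)
_, full_schedule = ward_cluster(cases, n_clusters=1)
print(two_clusters)
print([row[2] for row in full_schedule])
```

Stopping at different values of `n_clusters` corresponds to the “Range of Solutions” option: each solution is a different cut of the same merge sequence.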

Next: Further Frequencies and ANOVA analysis procedures will help decide which cluster solution to ultimately select.

Now we examine the cluster groupings.
1- Analyze > Descriptive Statistics > Frequencies.
2- Select the cluster variables. These are the newly created variables that will be at the bottom of the SPSS variable list:
  “Ward Method [Clus7_1]” (label changed to Ward Method 7 Cluster)
  “Ward Method [Clus6_1]” (label changed to Ward Method 6 Cluster)
  “Ward Method [Clus5_1]” (label changed to Ward Method 5 Cluster)
  “Ward Method [Clus4_1]” (label changed to Ward Method 4 Cluster)
  (Note: we changed the name in each label so that the solutions are easier to tell apart in the SPSS output charts.)
3- Click “OK” Box.

Next: Run Means (with ANOVA tests) to compare means among the clusters.
Analyze > Compare Means > Means.
4- Select the four scales (Internal Variables) and enter into the “Dependent List”.
5- Select the 7 total External Variables and enter into the “Dependent List”.
NOTE: Actions that follow are based on the decision to use only the 4-cluster solution for further analyses.
6- Select “Ward Method 4 Cluster” and enter into the “Independent List”.
NOTE: You could run all the cluster-created variables by also including “Ward Method 5”, “Ward Method 6” and “Ward Method 7” in the Independent List, to see the ANOVA means comparison based upon the various cluster solutions.
7- Click “Options” Box.
8- Click “Anova table and eta” to make sure you get an F-test comparing the means.

III. SPSS Output

COMPUTE LesiureTechSavvyRev = 63 - LesiureTechsavvy.
EXECUTE.
COMPUTE LesiureTradtionalistRev = 72 - LesiureTradtionalist.
EXECUTE.
CLUSTER LesiureTradtionalistRev LesiureTechSavvyRev Technologysavvy Tradtionalist
  /METHOD WARD
  /MEASURE=SEUCLID
  /PRINT SCHEDULE CLUSTER(4,7)
  /PLOT VICICLE
  /SAVE CLUSTER(4,7).

Cluster: Case Processing Summary
  Valid:    N = 326 (60.0%)
  Missing:  N = 217 (40.0%)
  Total:    N = 543 (100.0%)
  a. Squared Euclidean Distance used
  b. Ward Linkage

Ward Linkage: Agglomeration Schedule
  Columns: Stage; Cluster Combined (Cluster 1, Cluster 2); Coefficients; Stage Cluster First Appears (Cluster 1, Cluster 2); Next Stage.
  [Table values are not recoverable from this transcription.]

Cluster Membership
  Columns: Case; 7 Clusters; 6 Clusters; 5 Clusters; 4 Clusters.
  [Table values are not recoverable from this transcription.]

FREQUENCIES VARIABLES=CLU7_1 CLU6_1 CLU5_1 CLU4_1
  /STATISTICS=STDDEV VARIANCE MEAN MEDIAN MODE SKEWNESS SESKEW KURTOSIS SEKURT
  /ORDER=ANALYSIS.

Frequencies: Statistics
  Columns: Ward Method 7 Clusters; Ward Method 6 Clusters; Ward Method 5 clusters; Ward Method 4 clusters.
  Rows: N (Valid, Missing); Mean; Median; Mode; Std. Deviation; Variance; Skewness; Std. Error of Skewness; Kurtosis; Std. Error of Kurtosis.
  Std. Error of Kurtosis = .269 for all four variables. [Other values are not recoverable from this transcription.]

Frequency Tables
  Ward Method 7 Clusters, Ward Method 6 Clusters, Ward Method 5 clusters: [values not recoverable from this transcription].

  Ward Method 4 clusters (cluster sizes as reported in Table 1):
    Cluster   Frequency   Valid Percent   Cumulative Percent
    1         109         33.4            33.4
    2         69          21.2            54.6
    3         96          29.4            84.0
    4         52          16.0            100.0
    Total     326         100.0
    Missing (System): 217 (40.0%)
    Total: 543 (100.0%)

MEANS TABLES=LesiureTechSavvyRev LesiureTradtionalistRev Technologysavvy Tradtionalist BY CLU7_1 CLU6_1 CLU5_1 CLU4_1
  /CELLS=MEAN COUNT STDDEV
  /STATISTICS ANOVA.

Case Processing Summary
  Each of the four scales (LesiureTechSavvyRev, LesiureTradtionalistRev, Technologysavvy, Tradtionalist) by each cluster solution (Ward Method 7, 6, 5, and 4 clusters):
  Included N = 326 (60.0%); Excluded N = 217 (40.0%); Total N = 543 (100.0%).

Report: LesiureTechSavvyRev, LesiureTradtionalistRev, Technologysavvy, Tradtionalist * Ward Method 4 clusters
  Cells: Mean, N, Std. Deviation for clusters 1 to 4 and Total.
  [Table values are not recoverable from this transcription; the cluster means are reported in Table 1.]

ANOVA Table (four scales * Ward Method 4 clusters)
  Variable                  Source           Sum of Squares   df    Mean Square
  LesiureTechSavvyRev       Between Groups   7239.032         3     2413.011
                            Within Groups    10454.627        322   32.468
  LesiureTradtionalistRev   Between Groups   7085.444         3     2361.815
                            Within Groups    10586.044        322   32.876
                            Total            17671.488        325
  Technologysavvy           Between Groups   13010.649        3     4336.883
  Tradtionalist             Between Groups   821.216          3     273.739
                            Within Groups    7275.876         322   22.596
                            Total            8097.092         325
  [Remaining cells are not recoverable from this transcription.]

Measures of Association (four scales * Ward Method 4 clusters)
  Variable                  Eta     Eta Squared
  LesiureTechSavvyRev       .640    .409
  LesiureTradtionalistRev   .633    .401
  Technologysavvy           .787    .619
  Tradtionalist             .318    .101

MEANS TABLES=Income Age Q18g Genderdummy LesiureTechSavvyRev LesiureTradtionalistRev Technologysavvy Tradtionalist Q18d Q18dd Q18b Q18q BY CLU4_1
  /CELLS=MEAN COUNT STDDEV
  /STATISTICS ANOVA.

Case Processing Summary (each variable * Ward Method 4 clusters)
  Income, Age, Genderdummy: Included N = 325 (59.9%); Excluded N = 218 (40.1%); Total N = 543 (100.0%).
  Q18g, LesiureTechSavvyRev, LesiureTradtionalistRev, Technologysavvy, Tradtionalist, Q18d, Q18dd, Q18b, Q18q: Included N = 326 (60.0%); Excluded N = 217 (40.0%); Total N = 543 (100.0%).

ANOVA Table (external variables * Ward Method 4 clusters)
  Variable                 Source           Sum of Squares   df    Mean Square   F       Sig.
  Income                   Between Groups   57.429           3     19.143
  Q18d. Science fiction    Between Groups   5.570            3     1.857         .944    .420
                           Within Groups    633.623          322   1.968
  Q18dd. Super Hero films  Between Groups                                        3.073   .028
  Q18b. Westerns           Between Groups   10.636           3     3.545         3.539   .015
                           Within Groups    322.606          322   1.002
                           Total            333.242          325
  Q18q. Chick flicks       Between Groups   16.689           3     5.563         2.649   .049
                           Within Groups    676.235          322   2.100
  [Rows for Age, Q18g, and Genderdummy and the remaining cells are not recoverable from this transcription. The rows for the four scales repeat the values in the ANOVA table above.]

Measures of Association (all variables * Ward Method 4 clusters)
  Variable                     Eta     Eta Squared
  Income                       .181    .033
  Age                          .291    .085
  Q18g. Film noir films        .251    .063
  Genderdummy                  .091    .008
  LesiureTechSavvyRev          .640    .409
  LesiureTradtionalistRev      .633    .401
  Technologysavvy              .787    .619
  Tradtionalist                .318    .101
  Q18d. Science fiction        .093    .009
  Q18dd. Super Hero films      .167    .028
  Q18b. Westerns               .179    .032
  Q18q. Chick flicks           .155    .024
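The F ratio and the eta values in the output above follow directly from the sums of squares. The sketch below recomputes them for the Tradtionalist scale from the figures in the ANOVA table (between groups 821.216 with 3 df, within groups 7275.876 with 322 df). The function name `anova_summary` is hypothetical, and this is the standard one-way ANOVA arithmetic rather than anything specific to SPSS.

```python
from math import sqrt


def anova_summary(ss_between, df_between, ss_within, df_within):
    """One-way ANOVA F ratio, eta, and eta squared from sums of squares."""
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    eta_squared = ss_between / (ss_between + ss_within)
    return {
        "F": ms_between / ms_within,
        "eta": sqrt(eta_squared),
        "eta_squared": eta_squared,
    }


# Values taken from the Tradtionalist rows of the ANOVA table.
trad = anova_summary(821.216, 3, 7275.876, 322)
print({name: round(value, 3) for name, value in trad.items()})
```

Eta squared is the share of total variance in the scale that lies between the clusters, so the value of about .101 means cluster membership accounts for roughly 10% of the variance in Tradtionalist scores.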

IV. Tabling

Table 1. Cluster Profiling (4-cluster solution)

  Variable                 1: Average   2: Traditionalist   3: Yea-Sayers   4: Tech Savvy   Total       F         Sig.
                           (n = 109)    (n = 69)            (n = 96)        (n = 52)        (n = 326)
  4 Internal variables
  Tech Savvy               12.5596      11.4493             25.2292         23.9038         17.8650     174.572   < .001
  Traditionalist           16.1193      18.5652             18.0729         14.0192         16.8773     12.115    < .001
  Leisure Tech Savvy       43.8807      34.3188             46.3750         47.0769         43.1012     74.320    < .001
  Leisure Traditionalist   29.9083      35.9130             39.2813         27.1923         33.5061     71.840    < .001

  8 External variables
  Q34: What is your annual income?
  Q30: Male = 0, Female = 1
  Q3e1: Age
  Q18b: How often western
  Q18d: How often sci-fi
  Q18dd: How often superhero
  Q18g: How often film noir
  Q18q: How often chick flicks
  [Cell values for the external variables are not recoverable from this transcription.]

Note. Post hoc tests were not run, so differences in means across the four clusters should be interpreted with caution.

V. Write-up

The Film and TV Usage National Survey 2015 (Jeffres & Neuendorf) was chosen for cluster analysis. Four internal (clustering) variables were constructed as additive scales. Scale one, named Tech Savvy, includes six items, all measured on a 7-point response scale where 1 = Not at all like and 7 = Very much like: I often watch videos on my cellphone (Q28a), I often search videos on YouTube to watch (Q28b), I often share videos via Facebook (Q28c), I often share videos on Instagram (Q28d), I like to watch TV shows on a laptop/tablet/phone when I’m stuck somewhere (Q28e), and I like to make short videos that I can share with others (Q28f) (alpha = .770). Scale two, named Traditionalist, includes four items, all measured on a 7-point Likert response scale where 1 = completely disagree and 7 = completely agree: I’m more a traditionalist, preferring to read physical copies of books (Q29b), I like the variety of entertainment available today, but sometimes feel it’s too much (Q29c), I think that the new technologies have begun to dominate our lives (Q29d), and I would still rather talk to people over the phone than text (Q29g) (alpha = .612). Scale three, named Leisure Tech Savvy, includes six items, all measured on an 8-point response scale where 1 = never and 8 = several times each day: Watch a film not at a theater (Q3g), Surf the internet for pleasure, not work (Q3h), Check my email (Q3i), Go on Facebook (Q3j), Play video games on some device (Q3k), and Text family and friends rather than call them (Q3o) (alpha = .525). Scale four, named Leisure Traditionalist, includes eight items measured on an 8-point response scale where 1 = never and 8 = several times each day: listen to the radio (Q3b), read a magazine (Q3c), read a book (Q3d), read a newspaper (Q3e), go out to see a film in a theater (Q3f), go to see live musical concerts/events (Q3L), watch television (Q3a), and go to see a live play performed in a theater (Q3m) (alpha = .695).

The eight external or “profiling” variables include: income, age, gender (femaleness), how often film noir (Q18g), how often sci-fi (Q18d), how often superhero (Q18dd), how often western (Q18b), and how often chick flicks (Q18q). The Q18 items are all measured on a 6-point response scale, where 1 = never [watch] and 6 = [watch] all the time.

A hierarchical agglomerative cluster analysis was performed to discover the natural grouping of the participants. A four-cluster solution was chosen using Ward’s Method (with squared Euclidean distances). The choice of four clusters was supported by examination of changes in the agglomeration coefficients from the agglomeration table. Dendrogram and icicle plots were run to give a visual representation of the data clusters. MEANS with ANOVA analyses were conducted (a) to examine the cluster sizes to make sure all clusters had a reasonable n, and (b) to examine the differences among the four clusters with regard to all four internal variables. As expected, all internal/clustering variables were significantly different among the four clusters. The four clusters have been named “Average”, “Traditional”, “Yea-sayers”, and “Tech Savvy” (see Table 1).

To further profile the four clusters, a complementary set of ANOVA analyses was conducted to test the significance of the differences among the four clusters on the eight demographic/external variables. All four of the internal variables showed highly significant differences across the four clusters (p < .001). The external variables all showed significant differences (p < .05) across the four clusters, except for gender (femaleness) and sci-fi, which were not significant.

Cluster 1 (n = 109) is labeled “Average” because this group appeared to be average on each variable. Cluster 2 (n = 69) is labeled “Traditional” because of its high means on the traditional leisure and traditional media scales. This cluster also tends to be higher income and older, and to like film noir. Cluster 3 (n = 96) is labeled “Yea-sayers” because of its high means on all variables. This group tends to report liking everything except the western genre. Cluster 4 (n = 52) is labeled “Tech Savvy” because of its high means on the technology use and technology leisure scales. This group also tends to be the youngest and lowest income, and does not like the film noir or sci-fi genres.
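The write-up's use of changes in the agglomeration coefficients to support a four-cluster solution can be sketched as follows. This is one common heuristic, not necessarily the authors' exact procedure: within the candidate range, pick the number of clusters just before the largest jump in the coefficient. The function names and the coefficient values are hypothetical, since the actual schedule values are not available here.

```python
def merge_costs(tail):
    """Cost of reducing k clusters to k - 1, for each k.

    tail lists agglomeration coefficients for the final stages, in stage
    order, so tail[-j] is the coefficient of the j-cluster solution.
    """
    return {k: tail[-(k - 1)] - tail[-k] for k in range(2, len(tail) + 1)}


def suggest_k(tail, k_min, k_max):
    """Within k_min..k_max, return the k that is most costly to merge further."""
    costs = merge_costs(tail)
    return max(range(k_min, k_max + 1), key=lambda k: costs[k])


# Hypothetical coefficients for the 7-, 6-, 5-, 4-, 3-, 2-, and 1-cluster solutions.
tail = [9000.0, 9800.0, 10700.0, 12200.0, 15900.0, 21000.0, 30000.0]
costs = merge_costs(tail)
chosen = suggest_k(tail, k_min=4, k_max=7)
print(costs)
print(chosen)
```

Restricting the search to the saved range (4 to 7 here) matters because the last few merges almost always produce the largest jumps; the heuristic is a guide to be weighed alongside cluster sizes and interpretability.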
