Programming and Data Management for IBM SPSS Statistics 19

Programming and Data Management for IBM SPSS Statistics 19
A Guide for IBM SPSS Statistics and SAS Users
Raynald Levesque and SPSS Inc.

Note: Before using this information and the product it supports, read the general information under “Notices” on p. 435.

This document contains proprietary information of SPSS Inc, an IBM Company. It is provided under a license agreement and is protected by copyright law. The information contained in this publication does not include any product warranties, and any statements provided in this manual should not be interpreted as such.

When you send information to IBM or SPSS, you grant IBM and SPSS a nonexclusive right to use or distribute the information in any way it believes appropriate without incurring any obligation to you.

Copyright SPSS Inc. 1989, 2010.

Preface

Experienced data analysts know that a successful analysis or meaningful report often requires more work in acquiring, merging, and transforming data than in specifying the analysis or report itself. IBM SPSS Statistics contains powerful tools for accomplishing and automating these tasks. While much of this capability is available through the graphical user interface, many of the most powerful features are available only through command syntax, and you can make the programming features of its command syntax significantly more powerful by adding the ability to combine it with a full-featured programming language. This book offers many examples of the kinds of things that you can accomplish using command syntax by itself and in combination with other programming languages.

For SAS Users

If you have more experience with SAS for data management, see Chapter 32 for comparisons of the different approaches to handling various types of data management tasks. Quite often, there is not a simple command-for-command relationship between the two programs, although each accomplishes the desired end.

Acknowledgments

This book reflects the work of many members of the SPSS Inc. staff who have contributed examples here and in Developer Central, as well as that of Raynald Levesque, whose examples formed the backbone of earlier editions and remain important in this edition. We also wish to thank Stephanie Schaller, who provided many sample SAS jobs and helped to define what the SAS user would want to see, as well as Marsha Hollar and Brian Teasley, the authors of the original chapter “IBM SPSS Statistics for SAS Programmers.”

A Note from Raynald Levesque

It has been a pleasure to be associated with this project from its inception. I have for many years tried to help IBM SPSS Statistics users understand and exploit its full potential. In this context, I am thrilled about the opportunities afforded by the Python integration and invite everyone to visit my site at www.spsstools.net for additional examples. And I want to express my gratitude to my spouse, Nicole Tousignant, for her continued support and understanding.

Raynald Levesque

Contents

1 Overview
    Using This Book
    Documentation Resources

Part I: Data Management

2 Best Practices and Efficiency Tips
    Working with Command Syntax
        Creating Command Syntax Files
        Running Commands
        Syntax Rules
    Protecting the Original Data
        Do Not Overwrite Original Variables
        Using Temporary Transformations
        Using Temporary Variables
    Use EXECUTE Sparingly
        Lag Functions
        Using $CASENUM to Select Cases
        MISSING VALUES Command
        WRITE and XSAVE Commands
    Using Comments
    Using SET SEED to Reproduce Random Samples or Values
    Divide and Conquer
        Using INSERT with a Master Command Syntax File
        Defining Global Settings

3 Getting Data into IBM SPSS Statistics
    Getting Data from Databases
        Installing Database Drivers
        Database Wizard
        Reading a Single Database Table
        Reading Multiple Tables
    Reading IBM SPSS Statistics Data Files with SQL Statements
        Installing the IBM SPSS Statistics Data File Driver
        Using the Standalone Driver
    Reading Excel Files
        Reading a “Typical” Worksheet
        Reading Multiple Worksheets
    Reading Text Data Files
        Simple Text Data Files
        Delimited Text Data
        Fixed-Width Text Data
        Text Data Files with Very Wide Records
        Reading Different Types of Text Data
    Reading Complex Text Data Files
        Mixed Files
        Grouped Files
        Nested (Hierarchical) Files
        Repeating Data
    Reading SAS Data Files
    Reading Stata Data Files
    Code Page and Unicode Data Sources

4 File Operations
    Using Multiple Data Sources
    Merging Data Files
        Merging Files with the Same Cases but Different Variables
        Merging Files with the Same Variables but Different Cases
        Updating Data Files by Merging New Values from Transaction Files
    Aggregating Data
        Aggregate Summary Functions
    Weighting Data
    Changing File Structure
        Transposing Cases and Variables
        Cases to Variables
        Variables to Cases

5 Variable and File Properties
    Variable Properties
        Variable Labels
        Value Labels
        Missing Values
        Measurement Level
        Custom Variable Properties
        Using Variable Properties as Templates
    File Properties

6 Data Transformations
    Recoding Categorical Variables
    Binning Scale Variables
    Simple Numeric Transformations
    Arithmetic and Statistical Functions
    Random Value and Distribution Functions
    String Manipulation
        Changing the Case of String Values
        Combining String Values
        Taking Strings Apart
    Changing Data Types and String Widths
    Working with Dates and Times
        Date Input and Display Formats
        Date and Time Functions

7 Cleaning and Validating Data
    Finding and Displaying Invalid Values
    Excluding Invalid Data from Analysis
    Finding and Filtering Duplicates
    Data Preparation Option

8 Conditional Processing, Looping, and Repeating
    Indenting Commands in Programming Structures
    Conditional Processing
        Conditional Transformations
        Conditional Case Selection
    Simplifying Repetitive Tasks with DO REPEAT
        ALL Keyword and Error Handling
    Vectors
        Creating Variables with VECTOR
        Disappearing Vectors
    Loop Structures
        Indexing Clauses
        Nested Loops
        Conditional Loops
        Using XSAVE in a Loop to Build a Data File
        Calculations Affected by Low Default MXLOOPS Setting

9 Exporting Data and Results
    Exporting Data to Other Applications and Formats
        Saving Data in SAS Format
        Saving Data in Stata Format
        Saving Data in Excel Format
        Writing Data Back to a Database
        Saving Data in Text Format
    Reading IBM SPSS Statistics Data Files in Other Applications
        Installing the IBM SPSS Statistics Data File Driver
        Example: Using the Standalone Driver with Excel
    Exporting Results
        Exporting Output to Word/RTF
        Exporting Output to Excel
        Using Output as Input with OMS
        Adding Group Percentile Values to a Data File
        Bootstrapping with OMS
        Transforming OXML with XSLT
        “Pushing” Content from an XML File
        “Pulling” Content from an XML File
        XPath Expressions in Multiple Language Environments
        Layered Split-File Processing
    Controlling and Saving Output Files

10 Scoring data with predictive models
    Building a predictive model
        Evaluating the model
    Applying the model

Part II: Programming with Python

11 Introduction

12 Getting Started with Python Programming in IBM SPSS Statistics
    The spss Python Module
        Running Your Code from a Python IDE
    The SpssClient Python Module
    Submitting Commands to IBM SPSS Statistics
    Dynamically Creating Command Syntax
    Capturing and Accessing Output
    Modifying Pivot Table Output
    Python Syntax Rules
    Mixing Command Syntax and Program Blocks
    Nested Program Blocks
    Handling Errors
    Working with Multiple Versions of IBM SPSS Statistics
    Creating a Graphical User Interface
    Supplementary Python Modules for Use with IBM SPSS Statistics
    Getting Help

13 Best Practices
    Creating Blocks of Command Syntax within Program Blocks
    Dynamically Specifying Command Syntax Using String Substitution
    Using Raw Strings in Python
    Displaying Command Syntax Generated by Program Blocks
    Creating User-Defined Functions in Python
    Creating a File Handle to the IBM SPSS Statistics Install Directory
    Choosing the Best Programming Technology
    Using Exception Handling in Python
    Debugging Python Programs

14 Working with Dictionary Information
    Summarizing Variables by Measurement Level
