Day1 Basic Statistics Using R


Basic statistics using R
Jarno Tuimala (CSC)
Dario Greco (HY)

Day 1

Welcome and introductions

Learning aims
• To learn R
    - Syntax
    - Data types
    - Graphics
    - Basic programming (loops and stuff)
• To learn basic statistics
    - Exploratory data analysis
    - Statistical testing
    - Linear modeling (regression, ANOVA)

Schedule
• Day 1, 10-16: Basic R usage
• Day 2, 10-16: Descriptive statistics and graphics
• Day 3, 10-16: Statistical testing
• Day 4, 10-16: More advanced features of R

Installing R
http://www.r-project.org

On Windows, in general

Downloading R I/V

Downloading R II/V

Downloading R III/V

Downloading R IV/V

Downloading R V/V

Installing

Exercise I

Installing on this course
• On this course we are using an easier setup, where we copy an already created R installation to each person's computer.
• This is a version where certain settings have been slightly modified.
• Go to 8s, and click on the link Download R 2.7.0. Save the file on Desktop.
• Extract the zip-file to Desktop (right-click on the file, and select Winzip - Extract to here).
• Go to folder R-2.7.0c/bin and right-click on file Rgui.exe. Select Create Shortcut.
• Copy and paste the shortcut to Desktop.

Packages

What are packages?
• R is built around packages.
• R consists of a core (that already includes a number of packages) and contributed packages programmed by users around the world.
• Contributed packages add new functions that are not available in the core (e.g., genomic analyses).
• Contributed packages are distributed among several projects
    - CRAN (Comprehensive R Archive Network)
    - Bioconductor (support for genomics)
    - OmegaHat (access to other software)
• In computer terms, packages are ZIP-files that contain all that is needed for using the new functions.

How to get new packages?
• The easiest way is to:
    1. Packages - Select repository
    2. Packages - Install packages
       Select the closest mirror (Sweden probably)
• You can also download the packages as ZIP-files.
    - Save the ZIP-file(s) into a convenient location, and without extracting them, select Packages - Install from a local ZIP file.

How to access the functions in packages?
• Before using any functions in the packages, you need to load the packages in memory.
• On the previous step packages were just installed on the computer, but they are not automatically taken into use.
• To load a package into memory
    - Packages - Load Packages
    - Or as a command: library(rpart)
• If you haven't loaded a package before trying to access the functions contained in it, you'll get an error message:
    Error: could not find function "rpart"
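The difference between an installed package and a loaded one can be sketched as follows. This is only a minimal illustration: the package name rpart comes from the slide, while the variable names pkg and loaded are made up for this example.

```r
# Package name from the slide; it must be installed before it can be loaded
pkg <- "rpart"

if (requireNamespace(pkg, quietly = TRUE)) {
  # Installed: attach it so that its functions can be called directly
  library(pkg, character.only = TRUE)
  loaded <- TRUE
} else {
  # Not installed: loading would fail, so it has to be installed first
  message("Install it first with install.packages(\"", pkg, "\")")
  loaded <- FALSE
}
loaded
```

Checking first avoids the error that library() gives when the package is missing from the computer.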

Help facilities

HTML help
• To invoke a built-in help browser, select Help - HTML help.
    - Command: help.start()
• This should open a browser window with help topics:

A basic book

List of installed packages

Help for packages

Anatomy of a help file 1/2
• Function {package}
• General description
• Command and its arguments
• Detailed description of arguments

Anatomy of a help file 2/2
• Description of how the function actually works
• What the function returns
• Related functions
• Examples, can be run from R by: example(mas5)

Search help

Search results

Other search possibilities I/II
• Help - search.r-project.org

Other search possibilities II/II
• http://www.r-seek.org

Exercise II

Install packages and use help
1. Install the following package(s): car (can be found from CRAN)
2. Load the library into memory
3. Consult the help files for the car package.
   What does States contain?
   What does function scatterplot do?
4. What packages are available for data analysis in epidemiology?

Basic use and data import I

Interface
• Normal text: black
• User text: blue
• Prompt: > (this is where you type the commands)

R as a calculator
• R can be used as a calculator.
• You can just type the calculations on the prompt. After typing these, you should press Return to execute the calculation.
    2+1    # add
    2-1    # subtract
    2*1    # multiply
    2/1    # divide
    2^2    # power
• Note: # is a comment mark, nothing after it on the same line is executed
• Normal rules of calculation apply:
    2+2*3      # 8
    (2+2)*3    # 12

Anatomy of functions or commands
• To use a function in a package, the package needs to be loaded in memory.
• Command for this is library( ), for example: library(affy)
• There are three parts in a command:
    - The command: library
    - Brackets: ( )
    - Arguments inside brackets (these are not always present): affy
• Arguments modify or specify the commands
    - Command library() loads a library, but unless it is given an argument (name of the library) it doesn't know what to load.
• R is case sensitive!
    library(affy)    # works!
    Library(affy)    # fails

Mathematical functions
• R contains many mathematical functions, also.
    log(10)     # natural logarithm, 2.3
    log2(8)     # 3
    exp(2.3)    # 9.97
    sin(10)     # -0.54
    sqrt(9)     # square root, 3
    sum(v)
    diff(v)

Comparisons
• Is equal: ==
• Is larger than: >
• Is larger than or equal to: >=
• Smaller than or equal to: <=
• Is not equal to: !=
• Examples
    3 == 3    # TRUE
    2 != 3    # TRUE
    2 <= 3    # TRUE

Logical operators
• Basic operators are
    &    # and
    |    # or (press Alt Gr and < simultaneously)
• Examples
    2 == 3 | 3 == 3    # TRUE (if either is true then print TRUE)
    2 == 3 & 3 == 3    # FALSE (one statement is FALSE, so the result is FALSE)
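Comparisons and logical operators can be tried out together. The sketch below uses two illustrative values, a and b, and stores each result in a variable so that it can be inspected afterwards; none of these names come from the slides.

```r
a <- 2
b <- 3

is_equal     <- a == b            # FALSE
is_different <- a != b            # TRUE
at_most      <- a <= b            # TRUE

either <- (a == b) | (b == b)     # TRUE: one side is TRUE
both   <- (a == b) & (b == b)     # FALSE: one side is FALSE

# Comparisons also work on several numbers at once
c(1, 5, 10) > 4                   # FALSE TRUE TRUE
```

The parentheses around each comparison are not required here, but they make the grouping explicit.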

Creating vectors I/III
• So far, we've been applying the functions on only one number at a time.
• Typically we would like to do the same operation for several numbers at the same time.
    - Taking a log2 of several numbers, for instance
• First, we need to create a vector that holds those several numbers:
    v <- c(1,2,3,4,5)
    - Everything in R is an object
    - Here, v is an object used for storing these 5 numbers
    - <- is the operator that stores something
    - c( ) is a command for creating a vector by typing values to be stored.

Naming objects
• Never use command names as object names!
• If you are unsure whether something is a command name, type it to the command line first. If it gives an error message, you're safe to use it.
    data    # not good
    dat     # good
• Object names can't start with a number
    1a      # not good
    a1      # good
• Never use special characters, such as å, ä, or ö in object names.
• Object names are case sensitive, just like commands
    A1      # object no. 1
    a1      # object no. 2

Creating vectors II/III
• Vectors can also be created using : notation, if the values are consecutive:
    v <- c(1:5)
• For creating a vector of three 1s, four 2s, and five 3s, there are several options:
    v <- c(1,1,1,2,2,2,2,3,3,3,3,3)
    - Using rep( )
        v1 <- rep(1,3)    # Creates a vector of three ones
        v2 <- rep(2,4)
        v3 <- rep(3,5)
        v <- c(v1,v2,v3)
    - Putting the commands together:
        v <- c(rep(1,3), rep(2,4), rep(3,5))

Creating vectors III/III
• Let's take a closer look at the last command:
    v <- c(rep(1,3), rep(2,4), rep(3,5))
    - rep(1,3), rep(2,4), rep(3,5): creating individual vectors
    - c( ): putting the three vectors together
• So you can nest commands, and that is very commonly done!
• But nothing prevents you from breaking these nested commands down, and running them one by one
    - That's what we did on the last slide

Applying functions to vectors
• If you apply any of the previously mentioned functions to a vector, it will be applied separately for every observation in that vector:
    log2(v)
    [1] 0.000000 0.000000 0.000000 1.000000 1.000000 1.000000
    [7] 1.000000 1.584963 1.584963 1.584963 1.584963 1.584963
• When applied to a vector, the result is as long as the starting vector.
• When a function is applied to a vector this way, the calculation is said to be vectorized.
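The vector of three 1s, four 2s and five 3s, and the vectorized log2() call, can be checked with a short sketch. The single-call rep() variant and the names v_alt and l are additions for illustration, not part of the slides.

```r
# Three 1s, four 2s and five 3s, built from three rep() calls
v <- c(rep(1, 3), rep(2, 4), rep(3, 5))

# The same vector in a single rep() call (one repeat count per value)
v_alt <- rep(c(1, 2, 3), times = c(3, 4, 5))

length(v)          # 12
all(v == v_alt)    # TRUE

# log2() is applied to every element; the result is as long as v
l <- log2(v)
length(l) == length(v)   # TRUE
```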

Exercise III

Import data and some calculations
• A certain American car was followed through seven fill ups. The mileage was:
    65311, 65624, 65908, 66219, 66499, 66821, 67145, 67447
1. Enter the data in R.
2. How many observations are there in the data (what is the R command)?
3. What is the total distance driven during the follow up?
4. What are the fill up distances in kilometers (1 mile = 1.6 km)?
5. Use function diff() on the data. What does it tell you?
6. What is the longest distance between two fill ups (search for an appropriate command from the help)?

Basic use and data import II

Factors
• In vectors you have a list of values. Those can be numbers or strings.
• Factors are a different data type. They are used for handling categorical variables, i.e., nominal or ordered categorical variables.
    - Instead of simply having values, these contain levels (for that categorical variable)
• Examples:
    - Male, female
    - Fetus, baby, toddler, kid, teenager, young adult, middle-aged, senior, aged

Creating factors I/III
• Factors can be created from vectors, or from scratch.
• Here I present only the route from vectors.
• So, let's create a vector of numerical values (1 = male, 2 = female):
    v <- c(1,2,1,1,1,2,2,2,1,2,1)
• To convert the vector to a factor, you need to type:
    f <- as.factor(v)
• Check what R did:
    f
    [1] 1 2 1 1 1 2 2 2 1 2 1
    Levels: 1 2
• f is now a factor with two levels (1 and 2).

Creating factors II/III
• Levels of factors can also be labeled. This makes using them in statistical testing much easier.
    f <- factor(v, labels = c("male", "female"))    # labels is a string vector!
    f
    [1] male   female male   male   male   female female female male   female male
    Levels: male female
• In which order do you give the labels then?
    - Check how the values are printed in unique(sort(v))    # 1 2

Creating factors III/III
• Levels of a factor can also be ordered.
    - These are similar to the unordered factors, but statistical tests treat them quite differently.
• To create an ordered factor, add argument ordered = T:
    f <- factor(v, labels = c("male", "female"), ordered = T)
    f
    [1] male   female male   male   male   female female female male   female male
    Levels: male < female
    - Note the < sign! That identifies the factor as ordered.

Applying functions to factors
• You can't calculate, for example, log2 of every observation in a factor.
    log2(f)
    Error in Math.factor(f) : log2 not meaningful for factors
• There are separate functions for manipulating factors, such as:
    table(f)
    f
      male female
         6      5
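The factor slides can be condensed into one runnable sketch. It uses the same vector v as the slides; the explicit levels argument and the names counts and codes are additions made here for clarity.

```r
v <- c(1, 2, 1, 1, 1, 2, 2, 2, 1, 2, 1)   # 1 = male, 2 = female

# Labels are matched to the levels in the order given (1, then 2)
f <- factor(v, levels = c(1, 2), labels = c("male", "female"))

levels(f)        # "male" "female"
counts <- table(f)
counts           # male: 6, female: 5

# The underlying integer codes are still available if numbers are needed
codes <- as.integer(f)
```

Spelling out levels is optional in this case, since the default is the sorted unique values of v, but it makes the pairing between values and labels visible.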

Data frames
• Data frames are, well, tables (like in any spreadsheet program).
• In data frames variables are typically in the columns, and cases in the rows.
• Columns can have mixed types of data; some can contain numeric, yet others text
    - If all columns contain only character or only numerical data, then the data can also be saved in a matrix (those are faster to operate on).

          V1  V2  V3
    C1     1   0  one
    C2     2   1  two
    C3     3   0  three

Data frames
• From previous slides we have two variables, v and f.
• To make a data frame that contains both of these variables, one can use command:
    d <- data.frame(v, f)
• To bind the two variables into a table, one could also use
    d2 <- cbind(v, f)
• The difference between these methods is that the first creates a data frame and the second one a matrix.
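The practical difference between data.frame() and cbind() shows up in what happens to the factor. A minimal sketch, reusing v and f as defined on the earlier slides:

```r
v <- c(1, 2, 1, 1, 1, 2, 2, 2, 1, 2, 1)
f <- factor(v, labels = c("male", "female"))

d  <- data.frame(v, f)   # data frame: each column keeps its own type
d2 <- cbind(v, f)        # matrix: everything is forced to one type

is.data.frame(d)   # TRUE
is.factor(d$f)     # TRUE, the factor survives
is.matrix(d2)      # TRUE
is.numeric(d2)     # TRUE, the factor was reduced to its integer codes
```

So a matrix built with cbind() loses the labels "male" and "female", which is one reason to prefer a data frame for mixed data.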

Data frames and data import
• Usually when you import a data set in R, you read it in a data frame.
• This is assuming your data is in a table format.
• One can input the data in a table with some spreadsheet, but it should be saved as a tab-delimited text file to make importing easy.
• This text file should not contain any (unmatched) quotation marks (' or ").
• It is best to fill in all empty fields with some value (not leave them blank in the spreadsheet).
    - Missing values (no measurement): NA
    - Small values: 0?

Starting the work with R (browse to a folder)

Importing a tabular file
• Simply type:
    dat <- read.table("filename", header = T, sep = "\t", row.names = 1)
    - dat is the name of the object the data is saved in R
    - <- is the assignment operator
    - read.table( ) is the command that reads in tabular files
    - It needs to get a filename, complete with the extension (Windows hides those by default)
• If every column contains a title, then the argument should be header = TRUE (or header = T), otherwise header = F.
• If the file is tab-delimited (there is a tab between every column), then sep = "\t". Other options are, e.g., sep = "," and sep = " ".
• If every case (row) has its own unambiguous (non-repeating) title, and the first column of the file contains these row names, then row.names = 1, otherwise the argument should be deleted.
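The read.table() arguments can be tried without any course files. The sketch below first writes a small tab-delimited file to a temporary location and then reads it back; the file contents and the name tmp are invented for the illustration.

```r
# Write a small tab-delimited file to a temporary location (illustrative data)
tmp <- tempfile(fileext = ".txt")
writeLines(c("Name\tJan\tFeb",
             "Jarno\t1\t31",
             "Dario\t2\t12",
             "Panu\t3\t37"), tmp)

# header = TRUE: first line holds column titles
# sep = "\t": columns are separated by tabs
# row.names = 1: first column holds the (unique) row names
dat <- read.table(tmp, header = TRUE, sep = "\t", row.names = 1)

dim(dat)        # 3 rows, 2 columns
colnames(dat)   # "Jan" "Feb"
rownames(dat)   # "Jarno" "Dario" "Panu"
```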

Importing data from web
• Code can be downloaded and executed from the web with the command source( )
    urse data.txt")
• Files can be downloaded by download.file( )
    tot/rairuoho.txt", destfile = "rairuoho.txt")

Checking the objects and memory
• To see what objects are in memory:
    ls( )
• Length of a vector or factor
    length(v)
• Dimensions of a data frame or matrix:
    dim(d)
• Column and row names of a data frame or matrix
    colnames(d)
    row.names(d)

Exercise IV

Import tabular data
• Download the file from the Internet: t
• Put the file on desktop.
• See what the data looks like (use Excel and Wordpad):
    - Are there column headers?
    - What is the separator between the columns (space, tab, etc)?
    - Are there row names in the data?
• Now you should know what arguments to specify in the read.table() command, so use it for reading in the data.

Import the rest of the data
• I have prepared several datasets for this course.
• These can be downloaded from the web:
    urse data.txt")
• The datasets are written as R commands, so the command above downloads and runs this command file.
• Check what objects were created in R memory.
• Run the command showMetaData().
    - This should show some information about the datasets.
    - Note that the command is written for this course only (by me), and can't be used in R in general.

Object type conversions

Converting from a data type to another
• Certain data types can easily be converted to other data types.
    - Vector -> factor
    - Data frame -> matrix
    - Data frame -> vector / factor
    - Matrix -> vector / factor
• Typical need for converting a vector to a factor is when performing some statistical tests.
• Data frame might need to be converted into a matrix (or vice versa) when running some statistical tests or when plotting the data.
• Several vectors can be cleaved from a data frame or a matrix.
• Several vectors can be combined to a data frame or a matrix.

Converting from a vector to a factor
• To convert a vector to factor, do
    v2 <- factor(v, labels = c("Jan", "Feb"))                 # Unordered factor
    v2 <- factor(v, ordered = T, labels = c("Jan", "Feb"))    # Ordered factor
• The difference between ordered and unordered factors lies in the detail that if the factor is unordered, the values are automatically ordered in plots and statistical tests in lexical (alphabetical) order.
• If the factor is ordered, then the levels have an explicit meaning in the specified order, for example, January comes before February.
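The effect of ordering can be seen directly from the levels. This sketch starts from a character vector rather than the numeric v of the slide, and it sets the order with the levels argument; both choices, and all variable names, are assumptions made for the illustration.

```r
m <- c("Jan", "Feb", "Jan", "Feb")

# Unordered factor: levels default to alphabetical order
f_unordered <- factor(m)
levels(f_unordered)           # "Feb" "Jan"

# Ordered factor: levels are given explicitly, in a meaningful order
f_ordered <- factor(m, levels = c("Jan", "Feb"), ordered = TRUE)
levels(f_ordered)             # "Jan" "Feb"

# Order comparisons only make sense for ordered factors
jan_first <- f_ordered[1] < f_ordered[2]   # TRUE: Jan comes before Feb
```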

Extracting a vector from a data frame I/III
• As individual variables are stored in the columns of a data frame, it is typically of interest to be able to extract these columns from a data frame.
• Columns can be addressed using their names or their position (counted from left to right)
• Rows can be accessed similarly to columns.
• Remember how to check the names?

Extracting a vector from a data frame II/III
• This data frame is stored in an object called dat.

          Jan  Feb
    Jarno   1   31
    Dario   2   12
    Panu    3   37
    Vidal   4    8
    Max     5   11

• The first column is named Jan, so we can get the values in it by notation:
    dat$Jan
    - dat: name of the data frame
    - Jan: name of the column
• There are no brackets, so there is "no" command: we are accessing a data frame.

Extracting a vector from a data frame III/III
• This data frame is stored in an object called dat.
• To get the first column, one can also point to it with the notation:
    dat[,1]     # 1, 2, 3, 4, 5
• This is called a subscript. A subscript consists of square brackets. Inside the brackets there is at least one number. The number before the comma points to rows, the number after the comma to columns.
• The first row would be extracted by:
    dat[1,]     # 1, 31
• And the value on the first row of the first column:
    dat[1,1]    # 1
• Again, no round brackets, no commands, so we are accessing an object

Extracting several columns or rows
• One may want to extract several columns or rows from a table.
• This can be accomplished using a vector instead of a single number.
• For example, to get the rows 1 and 3 from the previous table:
    dat[c(1,3),]
• Or create the vector first, and extract after that:
    v <- c(1,3)
    dat[v,]
• These should give you:
          Jan  Feb
    Jarno   1   31
    Panu    3   37

Deleting a column or a row
• One can delete a row or a column (or several of them, using a vector in the place of a number) from a data frame by using a negative subscript. For example, dat[-c(1,3),] drops rows 1 and 3 and leaves:
          Jan  Feb
    Dario   2   12
    Vidal   4    8
    Max     5   11

Selecting a subset by some variable
• How to get those rows for which the value for February is below 20?
• Function which gives an index of the rows:
    which(dat$Feb < 20)
    [1] 2 4 5
• To get the rows, then use the index as a subscript:
    i <- which(dat$Feb < 20)
    dat[i,]
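The subscript notations from the last few slides can be run against the example table. The sketch below rebuilds that table with data.frame(); the construction itself and the name low_feb are not from the slides.

```r
# The example table from the slides
dat <- data.frame(Jan = c(1, 2, 3, 4, 5),
                  Feb = c(31, 12, 37, 8, 11),
                  row.names = c("Jarno", "Dario", "Panu", "Vidal", "Max"))

dat$Jan            # column by name: 1 2 3 4 5
dat[, 1]           # column by position: same values
dat[1, ]           # first row: Jan 1, Feb 31
dat[c(1, 3), ]     # rows 1 and 3: Jarno and Panu
dat[-c(1, 3), ]    # everything except rows 1 and 3

# Rows where February is below 20
i <- which(dat$Feb < 20)   # 2 4 5
low_feb <- dat[i, ]
rownames(low_feb)          # "Dario" "Vidal" "Max"
```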

Writing data to disk

Using sink
• Sink prints everything you would normally see on the screen to a file.
• Usage:
    sink("output.txt")        # Opens a file
    print("Just testing!")    # Commands
    sink()                    # Closes the file

Using write.table
• Writing a data frame or a matrix to disk is rather straightforward.
    - Command write.table()
• Usage:
    write.table(dat, "dat.txt", sep = "\t", quote = F, row.names = T, col.names = T)
    - dat: name of the table in R
    - "dat.txt": name of the file on disk
    - sep = "\t": use tabs to separate columns
    - quote = F: don't quote anything, not even text
    - row.names = T: write out row names (or F if there are no row names)
    - col.names = T: write out column names
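A round trip is a quick way to confirm that a table was written as intended. The following sketch writes a small illustrative data frame to a temporary file and reads it back; the data and the names out and back are assumptions for this example.

```r
dat <- data.frame(Jan = c(1, 2, 3), Feb = c(31, 12, 37),
                  row.names = c("Jarno", "Dario", "Panu"))

out <- tempfile(fileext = ".txt")   # temporary file, so nothing is overwritten
write.table(dat, out, sep = "\t", quote = FALSE,
            row.names = TRUE, col.names = TRUE)

# The header line has one field fewer than the data lines, so read.table()
# takes the first column as row names automatically
back <- read.table(out, header = TRUE, sep = "\t")

rownames(back)             # "Jarno" "Dario" "Panu"
all(back$Feb == dat$Feb)   # TRUE
```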

Quitting R

Quitting R
• Command q()
• Asks whether to save workspace image or not.
    - Answering yes would save all objects on disk in a file .RData.
    - Simultaneously all the commands given in this session are saved in a file .Rhistory.
• These workspace files can later be loaded back into memory from the File menu (Load workspace and Load history).

Exercise V

Extracting columns and rows I/II
• What is the size of the Students dataset (number of rows and columns)?
• What are the column names for the Students dataset?
• Extract the column containing data for population. How many students are from Tampere?
• Extract the tenth row of the dataset. What is the shoesize of this person?
• Extract the rows 25-29. What is the gender of these persons?
• Extract from the data only those females who are from Helsinki. How many observations (rows) are you left with?
• How many males are from Kuopio and Tampere?

Extracting columns and rows II/II
• Examine the Hygrometer dataset. Notice that the measurements were taken on two different dates (day1 and day2; each hygrometer was read before and after a few rainy days).
• Modify the dataset so that the order of the measurements is retained, but the measurements for day1 and day2 are in two separate columns in the same data frame.
• We will later on use this data frame for running certain statistical tests (e.g., paired t-test) that require the data in this format.

Recoding variables

Making new variables I/II
• There are several ways to recode variables in R.
• One way to recode values is to use command ifelse().
    ifelse(Students$shoesize < 40, "small", "large")
    1. Comparison: is shoesize smaller than 40
    2. If comparison is true, return "small"
    3. If comparison is false, return "large"
• You can combine several comparisons with logical operators
    ifelse((Students$shoesize < 37 & Students$gender == "female"), "small", "large")

Making new variables II/II
• If the coding needs to be done in several steps (e.g. we want to assign shoesizes to four classes), a better approach could be the following.
    s <- Students$shoesize
    s[Students$shoesize <= 37] <- "minuscule"
    s[Students$shoesize > 37 & Students$shoesize <= 39] <- "small"
    s[Students$shoesize > 39 & Students$shoesize <= 43] <- "medium"
    s[Students$shoesize > 43] <- "large"
• At each step we select only the observations that fulfill the comparison.
    - At the first step, all students who have a shoesize less than or equal to 37 are coded as minuscule.
    - At the second step, all students having shoesize larger than 37 but smaller than or equal to 39 are coded as small.
    - And so forth.
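Both recoding approaches can be tested on a handful of numbers. In the sketch below the vector shoesize is hypothetical and stands in for Students$shoesize; the cut() call at the end is an alternative not shown on the slides, included only to cross-check the step-by-step result.

```r
# Hypothetical shoe sizes, standing in for Students$shoesize
shoesize <- c(36, 37, 38, 39, 41, 43, 44)

# Two classes with ifelse()
two_classes <- ifelse(shoesize < 40, "small", "large")

# Four classes, one comparison per step
s <- rep(NA_character_, length(shoesize))
s[shoesize <= 37]                 <- "minuscule"
s[shoesize > 37 & shoesize <= 39] <- "small"
s[shoesize > 39 & shoesize <= 43] <- "medium"
s[shoesize > 43]                  <- "large"

# The same four classes in one call: cut() uses intervals (lower, upper]
s_cut <- cut(shoesize, breaks = c(-Inf, 37, 39, 43, Inf),
             labels = c("minuscule", "small", "medium", "large"))

all(s == as.character(s_cut))   # TRUE
```

Starting s as an empty character vector, rather than as a copy of the numbers, makes any observation missed by the comparisons show up as NA.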

Exercise VI

Making new variables
• Make a new vector of the shoesize measurements (extract that column from the data).
• Code the shoesize as it was done on the previous slides (in the range minuscule to large).
• Turn this character vector into a factor. Make the factor ordered so that the order of the factor levels is according to the size (minuscule, small, medium, large).
• Add this new factor to the Students dataset (make a new data frame).

Day 2

Topics
• Data exploration
• Graphics in R
• Wrap-up of the first half of the course

Exploration

Exploration - first step of analysis
• Usually the first step of a data analysis is graphical data exploration
• The most important aim is to get an overview of the dataset
    - Where is data centered?
    - How is the data spread (symmetric, skewed)?
    - Any outliers?
    - Are the variables normally distributed?
    - How are the relationships between variables:
        Between dependent and independents
        Between independents
• Graphical exploration complements descriptive statistics

Variable types
• Continuous (vectors in R)
    - Height
    - Age
    - Degrees in centigrade
• Categorical (factors in R)
    - Make of a car
    - Gender
    - Color of the eyes

Exploration - methods I/II
• Single continuous variable
    - Plots: boxplot, histogram (density plot, stem-and-leaf), normal probability plot, stripchart
    - Descriptives: mean, median, standard deviation, fivenum summary
• Single categorical variable
    - Plots: contingency table, stripchart, barplot
    - Descriptives: mode, contingency table
• Two continuous variables
    - Plots: scatterplot
    - Descriptives: individually, same as for a single variable
• Two categorical variables
    - Plots: contingency table, mosaic plot
    - Descriptives: individually, same as for a single variable

Exploration - methods II/II
• One continuous, one categorical variable
    - Plot: boxplot, histogram, but for each category separately
    - Descriptives: mean, median, sd, for each category separately
• Several continuous and / or categorical variables
    - Plots: pairwise scatterplot, mosaic plot
    - Descriptives: as for continuous or categorical variables

Descriptive statistics

Mean
• Mean = sum of all values / the number of values

Standard deviation and variance
• SD = square root of (the sum of each observation's squared difference from the mean, divided by the number of observations minus one).
    - Has the same unit as the original variable
• Var = SD*SD = SD^2
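The definitions of the mean, variance and standard deviation can be checked against R's built-in functions. The values in x are illustrative, and the names m, var_manual and sd_manual are made up for this sketch.

```r
x <- c(2, 4, 4, 4, 5, 5, 7, 9)   # illustrative values
n <- length(x)

m          <- sum(x) / n                  # mean
var_manual <- sum((x - m)^2) / (n - 1)    # variance
sd_manual  <- sqrt(var_manual)            # standard deviation

m                                  # 5
all.equal(var_manual, var(x))      # TRUE
all.equal(sd_manual, sd(x))        # TRUE
```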

Normal distribution I/III
• Some measurements are normally distributed in the real world
    - Height
    - Weight
• Means of observations taken from otherwise distributed data are also (approximately) normally distributed
• Hence, many descriptives and statistical tests have been devised on the assumption of normality

Normal distribution II/III
• Normal distributions are described by two statistics:
    - Mean
    - Standard deviation
• These two are enough to tell:
    - Where the peak (center) of the distribution is located
    - How the data are spread around this peak

Normal distribution III/III

Quartiles
• 1st quartile (25%), Median (50%), and 3rd quartile (75%)
• Example data: 1 2 3 4 5 6 7 8 9
    - 25% Qu = 3, Median = 5, 75% Qu = 7
    - Interquartile range (IQR): from the 25% Qu to the 75% Qu
• Fivenum summary:
    - Minimum (1), 1st Quartile (3), Median (5), 3rd Quartile (7), Maximum (9)
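The quartiles of the numbers 1 to 9 can be reproduced with a few built-in functions. This is a small sketch; the variable names x and q are chosen for the example.

```r
x <- 1:9

fivenum(x)                                 # 1 3 5 7 9
q <- quantile(x, probs = c(0.25, 0.75))
q                                          # 25%: 3, 75%: 7
IQR(x)                                     # 4 (= 7 - 3)
median(x)                                  # 5
```

For other data sets fivenum() and quantile() can differ slightly, because they use different rules for interpolating between observations.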

What if distribution is skewed or there are outliers/deviant observations?
• Use nonparametric alternatives to descriptives
    - Median instead of mean
    - Inter-quartile range instead of standard deviation

Summary of a continuous variable I/II
• summary( )
    x <- rnorm(100)
    summary(x)
    Min. 1st Qu. Median Mean 3rd Qu. Max.
    0.005561 0.079430 0.202900 0.310300 0.401000
• quantile(x, probs = c(0.25, 0.75))    # 1st and 3rd quartiles

Summary of a continuous variable II/II
• IQR(x)      # inter-quartile range
• mad(x)      # median absolute deviation, a robust alternative to the standard deviation
• sd(x)       # standard deviation
• var(x)      # variance, same as sd(x)^2
• table( )    # Makes a table (categ. var.)

Outliers and missing values

What are these outliers then?
• Outliers
    - Technical errors
        The measurement is too high, because the machinery failed
    - Coding errors
        Male = 0, Female = 1
        Data has some values coded with 2
• Deviant observations
    - Measurements that are somehow largely different from others, but can't be treated as outliers
    - If the observation is not definitely an outlier, better treat it as a deviant observation, and keep it in the data

Outliers
    gender
     0  1  2
    11  8  1
• What are those with gender coded as 2?
• Probably a typing error
    - What if they are missing values (gender is unknown)?
• If a typing error, should be checked from the original data
• If a missing value, should be coded as missing value
    - We will come to this shortly
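If the value 2 turns out to mean "gender unknown", it can be recoded as a missing value. The sketch below rebuilds a gender vector with the counts shown on the slide (11, 8 and 1); the vector itself and the name tab are assumptions for the illustration.

```r
gender <- c(rep(0, 11), rep(1, 8), 2)   # counts taken from the slide
table(gender)

# If the 2 means "gender unknown", recode it as a missing value
gender[gender == 2] <- NA
tab <- table(gender, useNA = "ifany")   # useNA keeps the NA count visible
tab
```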

Deviant observations

Missing values
• Missing values are observations that really are missing a value
    - Some samples were not measured during the experiment
    - Some students did not answer certain questions on the feedback form
• If the sample was measured, but the result was very low or not detectable, it should be coded with a small value (half the detection limit, or zero, or something)
• So, "no measurement" and "measurement, but a small result" should be coded separately

Missing values in R I/II
• In R missing values are coded with NA
    - NA = not available
• Although it is worth treating missing measurements as missing values, they tend to interfere with the analysis
    - Many graphical, descriptive, and testing procedures fail, if there are missing values in the data
• An example
    x <- c(NA, rnorm(10))
    mean(x)
    [1] NA

Missing values in R II/II
• The simplest way to treat missing values is to delete all cases (rows) that contain at least one missing value.
• For a vector this means just removing the missing values:
    x2 <- na.omit(x)
    mean(x2)
    [1] -0.1692371
• There are other ways to treat missing values, such as imputation, where the missing values are recoded with, e.g., the mean of the continuous variable, or with the most common observation, if the variable is categorical.
    x2 <- x
    x2[is.na(x2)] <- mean(na.omit(x))
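The three ways of handling a missing value (ignoring it, removing it, imputing it) can be compared on a tiny vector. The values and the names x_omit and x_imp are illustrative only.

```r
x <- c(NA, 2, 4, 6)   # illustrative values, one of them missing

mean(x)                       # NA: the missing value propagates
mean(x, na.rm = TRUE)         # 4: missing value ignored
x_omit <- na.omit(x)          # missing value removed
length(x_omit)                # 3

# Mean imputation: work on a copy that still contains the NA
x_imp <- x
x_imp[is.na(x_imp)] <- mean(x, na.rm = TRUE)
x_imp                         # 4 2 4 6
```

Imputation has to start from the vector that still contains the NA; after na.omit() there is nothing left to replace.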

Graphical methods

Continuous variables

Boxplot

Link between quartiles and boxplot

Histogram I/II
[Figure: histogram of rnorm(10000) next to its density plot (N = 10000, Bandwidth = 0.1432)]

Histogram II/II
[Figure: two histograms of x, with the x axis running from 9956 to 9964]

Link between histogram and boxplot

Stem-and-leaf plot
    The decimal point is at the |

    -2 | 90
    -1 | 88876664322221000
    -0 | 998886665555544444333322222211110
     0 | 001111111112222334445667778888899
     1 | 00112334455569
     2 | 3

Scatterplot

QQ-plot
• QQ-plot is a plot that can be used for graphically testing whether a variable is normally distributed.
    - Normal distribution is an assumption made by many statistical procedures.
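A QQ-plot is easiest to read when a normal and a non-normal sample are placed side by side. The following sketch uses simulated data; the seed, the sample sizes and the choice of an exponential sample as the skewed example are all assumptions made here.

```r
set.seed(1)                 # arbitrary seed, only to make the example repeatable
x_norm <- rnorm(100)        # normally distributed
x_skew <- rexp(100)         # clearly skewed

par(mfrow = c(1, 2))        # two plots side by side
qqnorm(x_norm, main = "Normal data"); qqline(x_norm)
qqnorm(x_skew, main = "Skewed data"); qqline(x_skew)
par(mfrow = c(1, 1))        # restore the default layout

# The plotted coordinates are also available without drawing
qq <- qqnorm(x_norm, plot.it = FALSE)
```

Points that follow the reference line drawn by qqline() suggest a normal distribution; systematic curvature away from the line suggests skewness or heavy tails.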

Pairwise scatterplot

Categorical variables

Stripchart

Barchart

Mosaicplot

Contingency table
January February March
dnesday5444

Exercise VII

Checking distributions
• Are these data normally distributed?


UCB admissions
• Claim: UCB discriminates against females.
    - I.e., more females than males are rejected, and don't get admitted to the university.
• Does UCB discriminate?

• Claim: UCB discriminates against females. Does it?
[Figure: one panel per department (Department A, Department B, ...), each showing Admit (Admitted / Rejected) by Sex]

Graphics in R

Basic idea
• All graphs in R are displayed on a graphical device.
• If no device is open when the plotting command is called, a new one is opened, and the image is displayed in it.
• Graphics device is simply a new window that displays the graphic.
• Graphic device can also be a file where the plot is written.
    - Open it
    - Make the plot
    - Close it
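The open, plot, close sequence for a file device looks like this in practice. The sketch writes to a temporary PDF file; the file name out and the plotted numbers are only illustrative.

```r
out <- tempfile(fileext = ".pdf")   # illustrative file name

pdf(out)                  # 1. open the device (a PDF file)
plot(1:10, (1:10)^2)      # 2. make the plot
dev.off()                 # 3. close the device, which finishes the file

file.exists(out)          # TRUE
```

Until dev.off() is called the file is incomplete, so forgetting the last step is a common reason for a PDF that will not open.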

Traditional graphics commands in R
• High level graphical commands create the plot
    plot( )          # Scatter plot, and general plotting
    hist( )          # Histogram
    stem( )          # Stem-and-leaf
    boxplot( )       # Boxplot
    qqnorm( )        # Normal probability plot
    mosaicplot( )    # Mosaic plot
• Low level graphical commands add to the plot
    points( )        # Add points
    lines( )         # Add lines
    text( )          # Add text
    abline( )        # Add lines
    legend( )        # Add legend
• Most commands also accept additional graphical parameters
    par( )           # Set parameters for plotting

Graphical parameters in R
• par( )
    cex      # font size
    col      # color of plotting symbols
    lty      # line type
    lwd      # line width
    mar      # inner margins
    mfrow    # splits plotting area (mult. figs. per page)
    oma      # outer margins
    pch      # plotting symbol
    xlim     # min and max of X axis range
    ylim     # min and max of Y axis range

A few worked examples

Drawing a scatterplot in R I/V
• Let's generate some data
    x <- rnorm(100)
    y <- rpois(100, 10)
    g <- c(rep("horse", 50), rep("hound", 50))
• Simple scatter plot
    plot(x, y)

Adding a title and axis labels II/V
• plot(x, y, main = "Horses and hounds", xlab = "Performance", ylab = "Races")

Drawing a scatterplot in R III/V
• Coloring spots according to the group (horse or hound) they belong to
    cols <- ifelse(g == "horse", "Black", "Red")
    plot(x, y, main = "Horses and hounds", xlab = "Performance", ylab = "Races", col = cols)

Drawing a scatterplot in R IV/V
• Changing the plotting symbol
    plot(x, y, main = "Horses and hounds", xlab = "Performance", ylab = "Races", col = cols, pch = 20)
    plot(x, y, main = "Horses and hounds", xlab = "Performance", ylab = "Races", col = cols, pch = " ")
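High-level and low-level commands are usually combined: one call creates the plot and further calls add to it. The sketch below repeats the horses-and-hounds plot and then adds a reference line and a legend; the added line and the legend position are choices made for this illustration, not part of the slides.

```r
x <- rnorm(100)
y <- rpois(100, 10)
g <- c(rep("horse", 50), rep("hound", 50))
cols <- ifelse(g == "horse", "black", "red")

# High-level command creates the plot
plot(x, y, main = "Horses and hounds", xlab = "Performance",
     ylab = "Races", col = cols, pch = 20)

# Low-level commands add to the existing plot
abline(h = mean(y), lty = 2)                      # dashed line at the mean of y
legend("topright", legend = c("horse", "hound"),
       col = c("black", "red"), pch = 20)
```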

Drawing a scatterplot in R V/V
• Saving the image
    - Menu: File - Save As - JPEG / BMP / PDF / postscript
• Directing the plotting to a file
    pdf("hnh.pdf")
    plot(x, y, main = "Horses and hounds", xlab = "Performance", ylab = "Races", col = cols, pch = 20)
    dev.off()
• Setting
