INTRODUCTION TO MULTIVARIATE ANALYSIS OF


INTRODUCTION TO MULTIVARIATE ANALYSIS OF ECOLOGICAL DATA
David Zelený & Ching-Feng Li

INTRODUCTION TO MULTIVARIATE ANALYSIS
Ecological similarity
- similarity and distance indices
Gradient analysis
- regression, calibration, ordination
- linear and unimodal methods
- unconstrained and constrained ordination
- eigenvalue-based and distance-based ordinations
- partial ordination, variation partitioning
- Monte-Carlo permutation tests, forward selection
Classification
- hierarchical and non-hierarchical algorithms
- cluster analysis, TWINSPAN

LITERATURE
- Gotelli & Ellison (2004) A Primer of Ecological Statistics. Sinauer Associates.
  well written, excellent for beginners; not too much about multivariate analysis
- Lepš & Šmilauer (2003) Multivariate Analysis of Ecological Data Using CANOCO. Cambridge.
  less theory, more practical use, focused on CANOCO users, case studies for independent work including training datasets
- Legendre & Legendre (1998) Numerical Ecology. 2nd English Edition, Elsevier.
  bible for numerical ecology, surprisingly also quite readable
- Zuur, Ieno & Smith (2007) Analysing Ecological Data. Statistics for Biology and Health. Springer.
  well explained basics of various methods used for analysis of ecological data, clever examples
- Ordination website of Mike Palmer (http://ordination.okstate.edu/)
  comprehensive introduction, but a bit out of date

ECOLOGICAL SIMILARITY
Similarity and distance indices

SIMILARITIES VS. DISTANCES
Similarity indices
- represent the similarity between samples, not their position in multidimensional space
- lowest value 0: samples have no species in common
- highest value (1 or other): samples are identical
Distances among samples
- allow samples to be located in multidimensional space
- lowest value 0: samples are identical (at the same location)
- the value increases with increasing dissimilarity between samples

PROBLEM OF "DOUBLE ZEROS"
The fact that a species is missing from both samples simultaneously (a double zero) can have several meanings:
- on the gradient, the samples are located outside the species' ecological niche
  - but we cannot say whether both samples are located at the same end of the ecological gradient (and are thus quite similar) or at opposite ends of the gradient (and are thus quite different)
- on the gradient, the samples are located inside the species' ecological niche, but the species is missing because
  - it did not get there (dispersal limitation)
  - we overlooked it (sampling bias)
  - it is currently in a dormant stage and we cannot see it (therophytes, geophytes)

PROBLEM OF "DOUBLE ZEROS"

            wet        wet        mesic      mesic      dry        dry
            species 1  species 2  species 1  species 2  species 1  species 2
sample 1    1          1          0          0          0          0
sample 2    0          1          1          1          1          0
sample 3    0          0          0          0          1          1

- symmetrical indices of similarity: double zeros in two samples increase the similarity of these samples
- asymmetrical indices: double zeros are ignored
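
The effect of double zeros can be checked on the table above. The sketch below is only an illustration: it uses the simple matching coefficient as one example of a symmetrical index (a choice made here, the slides do not name one) and the Jaccard index as an asymmetrical one. The function names are hypothetical.

```python
# Presence-absence data from the table above.
# Column order: wet 1, wet 2, mesic 1, mesic 2, dry 1, dry 2.
samples = {
    "sample 1": [1, 1, 0, 0, 0, 0],
    "sample 2": [0, 1, 1, 1, 1, 0],
    "sample 3": [0, 0, 0, 0, 1, 1],
}


def counts(x, y):
    """Return (a, b, c, d): shared presences, only in x, only in y, double zeros."""
    a = sum(1 for i, j in zip(x, y) if i and j)
    b = sum(1 for i, j in zip(x, y) if i and not j)
    c = sum(1 for i, j in zip(x, y) if j and not i)
    d = sum(1 for i, j in zip(x, y) if not i and not j)
    return a, b, c, d


def simple_matching(x, y):
    """Symmetrical index (assumed example): double zeros count as agreement."""
    a, b, c, d = counts(x, y)
    return (a + d) / (a + b + c + d)


def jaccard(x, y):
    """Asymmetrical index: double zeros are ignored."""
    a, b, c, _ = counts(x, y)
    return a / (a + b + c)


# Samples 1 and 3 share no species, only the two mesic double zeros.
print(simple_matching(samples["sample 1"], samples["sample 3"]))  # 2/6
print(jaccard(samples["sample 1"], samples["sample 3"]))          # 0.0
```

Samples 1 and 3 lie at opposite ends of the moisture gradient, yet the symmetrical index reports a non-zero similarity purely because both lack the mesic species.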

SIMILARITY INDICES
qualitative vs. quantitative
- qualitative: for presence-absence data
- quantitative: for counts, abundances etc.
symmetrical vs. asymmetrical
- symmetrical: treat double zeros in the same way as double presences (they contain information about the similarity of samples); rarely used in community ecology
- asymmetrical: ignore double zeros; the most common indices in community ecology
metrics vs. semimetrics
- semimetrics do not obey the triangle inequality and cannot be used to order points in Euclidean (metric) space

SIMILARITY INDICES
Qualitative data (presence-absence)
[Diagram: two overlapping samples; a = species present in both samples, b and c = species present in only one of the two samples]
- Jaccard index: $J = a / (a + b + c)$
- Sørensen index: $S = 2a / (2a + b + c)$
  - presence of the species in both samples (a) has double weight compared to the Jaccard index
- Simpson index: $Si = a / [a + \min(b, c)]$
  - suitable for samples with very different numbers of species
Quantitative data (cover, abundance)
- e.g. generalized Sørensen index (percentage similarity, PS)
  - quantitative variant of the Sørensen index
  - suitable for ecological data
- percentage dissimilarity (PD, Bray-Curtis index): $PD = 1 - PS$
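
A minimal sketch of the indices listed above. The qualitative indices follow the formulas on the slide; for percentage similarity the slide gives no formula, so the usual form 2 * sum(min(x, y)) / (sum(x) + sum(y)) is assumed here. The sample values and function names are made up for illustration.

```python
def abc(x, y):
    """Return (a, b, c): species in both samples, only in x, only in y."""
    a = sum(1 for i, j in zip(x, y) if i > 0 and j > 0)
    b = sum(1 for i, j in zip(x, y) if i > 0 and j == 0)
    c = sum(1 for i, j in zip(x, y) if i == 0 and j > 0)
    return a, b, c


def jaccard(x, y):
    a, b, c = abc(x, y)
    return a / (a + b + c)


def sorensen(x, y):
    a, b, c = abc(x, y)
    return 2 * a / (2 * a + b + c)


def simpson(x, y):
    a, b, c = abc(x, y)
    return a / (a + min(b, c))


def percentage_similarity(x, y):
    """Quantitative Sorensen index (assumed standard formula)."""
    shared = sum(min(i, j) for i, j in zip(x, y))
    return 2 * shared / (sum(x) + sum(y))


# Hypothetical cover values of five species in two samples.
sample_1 = [10, 5, 0, 0, 2]
sample_2 = [5, 5, 5, 0, 0]

print(jaccard(sample_1, sample_2))                    # 0.5
print(sorensen(sample_1, sample_2))                   # 4/6
print(simpson(sample_1, sample_2))                    # 2/3
print(percentage_similarity(sample_1, sample_2))      # 0.625
print(1 - percentage_similarity(sample_1, sample_2))  # Bray-Curtis: 0.375
```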

DISTANCE MEASURES
Euclidean distance
- the range of values depends strongly on the units used
- intuitive, but sensitive to outliers; not very suitable for ecological data
chord distance, relativized Euclidean distance
- Euclidean distance calculated on samples standardized by sample norm
chi-square distance
- usually not calculated explicitly
- distance among samples in unimodal ordination techniques (e.g. correspondence analysis)
all similarity indices can be transformed into distances
- $D = 1 - S$, or $D = \sqrt{1 - S}$
- the square-root formula is used e.g. for the Sørensen index
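
The following sketch illustrates Euclidean distance, chord distance (Euclidean distance after dividing each sample by its norm) and the two similarity-to-distance conversions. The abundance values and function names are hypothetical.

```python
import math


def euclidean(x, y):
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))


def chord(x, y):
    """Euclidean distance between samples standardized by their norm."""
    norm_x = math.sqrt(sum(xi ** 2 for xi in x))
    norm_y = math.sqrt(sum(yi ** 2 for yi in y))
    return euclidean([xi / norm_x for xi in x], [yi / norm_y for yi in y])


def similarity_to_distance(s, square_root=False):
    """D = 1 - S, or D = sqrt(1 - S) when square_root is True."""
    return math.sqrt(1.0 - s) if square_root else 1.0 - s


# Two samples with the same species proportions but different totals.
sparse = [1, 2, 0]
dense = [10, 20, 0]

print(euclidean(sparse, dense))  # large: driven by the difference in totals
print(chord(sparse, dense))      # close to 0: proportions are identical
print(similarity_to_distance(0.75))                    # 0.25
print(similarity_to_distance(0.75, square_root=True))  # 0.5
```

The example shows why raw Euclidean distance depends so strongly on units and totals, while the chord distance reacts only to differences in relative composition.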

GRADIENT ANALYSIS

ORDINATION
JUSTIFICATION OF THE PROBLEM
- an ecological gradient usually influences the response (abundance) of several species simultaneously
- species data are redundant: if I know the response of one species, I can (somehow) also predict the behavior of other species
- thanks to this redundancy, it makes sense to reduce the many dimensions of multidimensional space (spaces no. 1-4) into the few dimensions of ecological space (space no. 5)
- if species respond completely independently of each other, ordination (reduction of multidimensional space) is not worth trying; it does not bring anything new

[Figure. Source: http://ordination.okstate.edu/]

ORDINATION
FORMULATIONS OF THE PROBLEM
1) reduction of dimensionality: project the multidimensional space defined by particular species into a few-dimensional space defined by ordination axes
2) configuration of samples in ordination space: find a configuration of samples in the reduced ordination space such that their distances in this space correspond to their compositional dissimilarity
3) hidden variables (ordination axes): find hidden ("latent") variables that are the best predictors for the values of all the species

UNCONSTRAINED VS. CONSTRAINED ORDINATION
Unconstrained ordination (indirect gradient analysis)
- uses only the samples × species matrix
- searches for hidden variables (ordination axes) that best represent the variability in species data
- more for hypothesis generation, not testing
Constrained ordination (direct gradient analysis)
- needs two matrices: samples × species and samples × environmental variables
- constrained ordination axes represent the directions of the variability in species data that can be explained by known environmental variables
- more for hypothesis testing than generating

SPECIES RESPONSE ALONG AN ECOLOGICAL GRADIENT
[Figure: species response curves along a gradient; the surviving panel label is "unimodal"]

[Figure: species abundance along an environmental gradient, two panels: long ecological gradient and short ecological gradient]

BASIC ORDINATION TECHNIQUES

                           linear species response              unimodal species response
unconstrained ordination   PCA (Principal Component Analysis)   CA (Correspondence Analysis)
constrained ordination     RDA (Redundancy Analysis)            CCA (Canonical Correspondence Analysis)

PCA (PRINCIPAL COMPONENT ANALYSIS)

BROKEN-STICK MODEL
[Figure: a stick broken into pieces; axis scale 0-40]
- the stick will break randomly into 6 pieces
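
A minimal sketch of the broken-stick expectation, assuming the standard formula for the expected relative length of the k-th longest of p random pieces, b_k = (1/p) * sum(1/i for i = k..p). In ordination it is commonly compared with the proportion of variance carried by each axis; the function name is hypothetical.

```python
def broken_stick(p):
    """Expected relative lengths of p pieces, from longest to shortest."""
    return [sum(1.0 / i for i in range(k, p + 1)) / p for k in range(1, p + 1)]


# A stick broken randomly into 6 pieces, as on the slide.
expected = broken_stick(6)
for k, length in enumerate(expected, start=1):
    print(f"piece {k}: {100 * length:.1f} % of the stick")
```

An axis whose share of the total variance exceeds the corresponding broken-stick value carries more variation than expected from a random split.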

EXPLANATORY VARIABLES IN ORDINATION
TWO ALTERNATIVE APPROACHES
1. unconstrained ordination followed by correlation
- get sample scores on the main ordination axes
- correlate these sample scores with environmental variables
- will certainly capture the main gradients in species composition
- may miss the part of the variability that is directly related to the measured environmental factors
2. constrained ordination
- environmental variables enter the ordination as explanatory variables
- sample scores on the ordination axes are directly influenced by these variables
- will certainly capture the part of the variability in species composition that is directly related to the measured environmental factors
- may lose information about variability in the data that is not directly related to any environmental factor

ORDINATION DIAGRAM
[Figure: example ordination diagrams for linear and unimodal methods, unconstrained and constrained ordination. Source: Lepš & Šmilauer (2003)]

ORDINATION DIAGRAM
RULES FOR VISUALIZATION
display of samples
- points
display of species
- arrows (linear methods)
- points, centroids (unimodal methods)
display of environmental variables
- arrows (quantitative variables)
- centroids (categorical variables)
display of ordination axes
- the horizontal axis should be the axis of higher rank
- axis orientation is arbitrary
types of ordination diagrams
- scatterplot: 1 type of data (samples or species)
- biplot: 2 types of data (e.g. samples and species)
- triplot: 3 types of data (samples, species and environmental variables)
Source: Lepš & Šmilauer (2003)

ARTIFACTS IN ORDINATIONS
Arch effect
- Correspondence Analysis (CA)
- the order of samples along the first axis still reflects their real dissimilarity
- the second axis is a nonlinear function of the first axis
Horseshoe effect
- Principal Component Analysis (PCA)
- the order of samples along the first axis does not reflect their real dissimilarity
- in the extreme case, the ends of the horseshoe may cross
Source: http://ordination.okstate.edu

ARTIFACTS IN ORDINATIONS
HORSESHOE AND ARCH EFFECTS
Possible explanations:
- algorithm consequence: each higher ordination axis has to be linearly independent of the lower ones, but nonlinear dependence is not considered
- projection consequence: nonlinear relationships between species and environmental gradients are projected into linear space defined by Euclidean distances
Source: http://ordination.okstate.edu

ARTIFACTS IN ORDINATIONS
POSSIBLE SOLUTIONS
- detrending: removal of the trend from ordination axes
  - Detrended Correspondence Analysis (DCA, Hill & Gauch 1980)
  - detrending by segments (the most common)
  - detrending by polynomials (if there are covariables in the analysis)
- use of distance-based ordination techniques, which allow samples to be ordinated using distance coefficients other than Euclidean distance (PCA) or chi-square distance (CA)
  - Principal Coordinate Analysis (PCoA, synonym for Metric Multidimensional Scaling, MDS)
  - Non-metric Multidimensional Scaling (NMDS)

DISTANCE-BASED VS. EIGENVALUE-BASED ORDINATION METHODS
Distance-based ordination methods
- NMDS and PCoA
- based on a dissimilarity matrix
- interpretation is focused on the distances among samples in ordination space
Eigenvalue-based ordination methods
- PCA, CA or DCA, and their constrained twins RDA, CCA or DCCA, respectively
- based on the samples × species matrix, from which the main ordination axes (eigenvectors) are extracted
- interpretation is focused on the directions of variability in species data, expressed by particular ordination axes

DETRENDED CORRESPONDENCE ANALYSIS
THE PROCESS OF "DETRENDING"
Step 1: the first axis is divided into several segments
Step 2: samples in each segment are centered along the second axis
Source: http://ordination.okstate.edu

DETRENDED CORRESPONDENCE ANALYSIS
THE PROCESS OF "DETRENDING"
Step 3: nonlinear rescaling of the first axis, which removes the clumping of samples at the ends of the gradient
- the resulting ordination diagram has axes scaled in SD values (SD = standard deviation of the Gaussian curve)
- a half-change in species composition occurs along a gradient of length 1-1.4 SD
Sources: ter Braak (1987); http://ordination.okstate.edu
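
Steps 1 and 2 of detrending by segments can be sketched as below. This is a deliberately simplified illustration with fixed, equal-width segments and made-up scores; actual DCA implementations apply detrending inside the iterative correspondence analysis and use a more elaborate segment scheme. Step 3 (nonlinear rescaling) is not shown, and the function name is hypothetical.

```python
def detrend_by_segments(axis1, axis2, n_segments):
    """Center axis-2 scores within equal-width segments of axis 1."""
    lowest, highest = min(axis1), max(axis1)
    width = (highest - lowest) / n_segments
    # Step 1: assign every sample to a segment of the first axis.
    segment = [min(int((v - lowest) / width), n_segments - 1) for v in axis1]
    detrended = list(axis2)
    # Step 2: subtract the segment mean of axis 2 from each sample.
    for s in range(n_segments):
        members = [i for i, k in enumerate(segment) if k == s]
        if not members:
            continue
        mean = sum(axis2[i] for i in members) / len(members)
        for i in members:
            detrended[i] = axis2[i] - mean
    return detrended


# Hypothetical sample scores forming an arch: axis 2 is a quadratic
# function of axis 1.
axis1 = [0, 1, 2, 3, 4, 5, 6, 7, 8]
axis2 = [-(x - 4) ** 2 for x in axis1]

flattened = detrend_by_segments(axis1, axis2, n_segments=3)
print(flattened)
```

Changing `n_segments` changes the result noticeably, which mirrors the sensitivity of DCA to the number of segments.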

DETRENDED CORRESPONDENCE ANALYSIS
PROS AND CONS
Cons
- inelegant method, which is sometimes compared to using a hammer on data
- the result depends strongly on the decision about the number of segments (recommendation: do not stick to the default only)
- if there are two or more strong environmental gradients in the data, DCA cannot handle them (but this is similar for other ordination methods)
- the gradient of the second (and higher) ordination axis is distorted by detrending
Pros
- even a hammer, if used by an expert, can be an effective tool; the method often gives results with a good ecological interpretation
- the axes of DCA are in SD units, allowing the gradient length to be estimated

SELECTION OF ORDINATION METHOD
LINEAR OR UNIMODAL?
- linear methods require homogeneous data; unimodal methods can handle heterogeneous data
- recommendation of Lepš & Šmilauer (2003): use DCA (detrending by segments) to determine the gradient length; if the first DCA axis is
  - shorter than 3 SD: use a linear technique
  - longer than 4 SD: use a unimodal technique
  - between 3 and 4 SD: both techniques are OK
- however, this is just a "cookbook" recommendation, not based on research, and it does not have to apply in every case
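
The rule of thumb above can be written as a small helper. This is only a restatement of the cookbook thresholds; the function name and return strings are hypothetical, and the gradient length is assumed to come from a DCA run elsewhere.

```python
def suggest_ordination_type(first_dca_axis_length_sd):
    """Apply the 3 SD / 4 SD rule of thumb to the length of the first DCA axis."""
    if first_dca_axis_length_sd < 3:
        return "linear (PCA, RDA)"
    if first_dca_axis_length_sd > 4:
        return "unimodal (CA, CCA)"
    return "either linear or unimodal"


for length in (1.8, 3.5, 5.2):
    print(length, "SD ->", suggest_ordination_type(length))
```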

THREE APPROACHES TO UNCONSTRAINED ORDINATION ANALYSIS
[Figure. Source: Legendre & Legendre (2012)]

PCOA AND NMDS
DISTANCE-BASED ORDINATION METHODS
- calculation is based on a matrix of dissimilarities between samples
- the result depends on the choice of distance measure
PCoA: Principal Coordinate Analysis (Metric Multidimensional Scaling)
- if Euclidean distance is used, the result is identical to PCA
- if chi-square distance is used, the result is identical to CA
NMDS: Non-metric Multidimensional Scaling
- non-metric alternative to PCoA
- iterative method; each run can find a different solution
- the number of dimensions (k) needs to be chosen at the beginning
- VERY time consuming with larger datasets

COMPARISON OF DCA AND NMDS
DCA
- detrending in reality twists the ordination space, so it looks pretty in 2D, but not so in 3D and higher
- points will produce a triangular or diamond shape, which is actually an artifact of detrending!
NMDS
- the method tries to project samples into a 2D figure so that the distances between these samples correspond as closely as possible to the sample dissimilarities
- non-metric method: it does not assume a unimodal shape of the species response curves
- according to Minchin (1987), it is the most robust unconstrained ordination method in vegetation ecology

COMPARISON OF DCA AND NMDS
[Figure: ordination diagrams from DCA and NMDS]
- DCA: triangle-shaped artifact
- NMDS: tends to display results as a sphere

HOW TO READ RESULTS OF ORDINATION?
variability explained by the main ordination axes
- calculated as: axis eigenvalue / total variance
- indicates how successful the ordination process was
- the more the species are correlated with each other, the more variability will be explained by the few main ordination axes
- it makes sense to compare the variability explained by various ordination methods on the same dataset
- it does not make sense to compare the variability explained by the same ordination method applied to different datasets (eigenvalues depend on the number of "players in the game": species and samples)
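
As a small worked example of "axis eigenvalue / total variance": the eigenvalues below are invented, and the total variance is assumed to equal the sum of all eigenvalues, which holds for an unconstrained ordination when every axis is listed.

```python
# Hypothetical eigenvalues of four ordination axes.
eigenvalues = [0.52, 0.31, 0.10, 0.07]

total_variance = sum(eigenvalues)
explained = [ev / total_variance for ev in eigenvalues]

cumulative = []
running = 0.0
for share in explained:
    running += share
    cumulative.append(running)

for axis, (share, cum) in enumerate(zip(explained, cumulative), start=1):
    print(f"axis {axis}: {100 * share:.1f} % (cumulative {100 * cum:.1f} %)")
```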

HOW TO READ RESULTS OF ORDINATION?
sample scores on ordination axes
- in the ordination diagram, samples are represented by points (both linear and unimodal techniques)
- the distance between samples in ordination space is proportional to the dissimilarity in their species composition
scores of independent (environmental) variables *
- regression coefficients; their signs (positive / negative) are important
test of significance (Monte-Carlo permutation test) *
- indicates the statistical significance of the environmental variables used
* only constrained ordination techniques

PASSIVELY PROJECTED ENVIRONMENTAL VARIABLES IN UNCONSTRAINED ORDINATION
[Figure: DCA ordination diagram with axes DCA1 and DCA2]

Environmental variables in unconstrained ordination
[Figure: workflow. The species data matrix is ordinated by DCA, giving sample scores on the first and second DCA axes and the DCA ordination diagram; the relationship between the environmental variables and the ordination is then obtained by correlating the environmental variables with the scores on the ordination axes.]

PRINCIPLE OF CONSTRAINED ORDINATION (RDA)
[Figure: the matrix of samples × species (sam 1-7, spe 1-3) and the matrix of explanatory variables (env 1, env 2). The abundance of each species is regressed on the environmental gradient, which splits it into predicted values and residuals; panels show species 1, species 1 (predicted) and species 1 (residual) plotted against the gradient.]

PRINCIPLE OF CONSTRAINED ORDINATION
[Figure: the matrix of predicted values and the matrix of residuals (sam 1-7 × spe 1-3), each followed by an ordination]
- ordination of the matrix of predicted values gives the constrained ordination axes
- ordination of the matrix of residuals gives the unconstrained ordination axes
- number of constrained ordination axes = number of explanatory variables (if the explanatory variable is categorical, the number of constrained axes equals the number of categories - 1)
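
The regression step of the scheme above can be sketched as follows: each species is regressed on the gradient, and the fitted values and residuals form the two matrices that are then ordinated separately. The data are invented, only one explanatory variable is used (several variables would need multiple regression), and the ordination step itself is not shown.

```python
def fit_line(x, y):
    """Ordinary least-squares fit y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Hypothetical data: 7 samples, one gradient, 3 species.
gradient = [5, 10, 15, 20, 25, 30, 35]
species = {
    "spe 1": [10, 25, 30, 50, 55, 80, 90],
    "spe 2": [60, 50, 45, 30, 35, 10, 5],
    "spe 3": [20, 22, 18, 25, 19, 21, 23],
}

predicted, residuals = {}, {}
for name, abundance in species.items():
    intercept, slope = fit_line(gradient, abundance)
    predicted[name] = [intercept + slope * g for g in gradient]
    residuals[name] = [obs - fit for obs, fit in zip(abundance, predicted[name])]

# predicted -> input for the constrained axes,
# residuals -> input for the unconstrained axes.
print(predicted["spe 1"])
print(residuals["spe 1"])
```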

CONSTRAINED ORDINATION ANALYSIS
MONTE-CARLO PERMUTATION TEST
- tests the null hypothesis that the species composition is not related to any of the environmental variables
- test of the first canonical axis: tests the effect of only one (quantitative) variable
- test of all canonical axes: tests the effect of all environmental variables, or the effect of one categorical environmental variable (no. of axes = no. of categories - 1)
Symbols used in the significance calculation:
- nx: no. of permutations with Fperm ≥ Fdata
- N: no. of all permutations
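
A minimal sketch of the permutation logic, under two loudly stated assumptions: the significance level is computed with the commonly used formula p = (nx + 1) / (N + 1), which is not preserved in the transcript, and the test statistic is the squared correlation between one species and one gradient instead of the pseudo-F statistic of a constrained ordination. Data and names are hypothetical.

```python
import random


def r_squared(x, y):
    """Squared Pearson correlation, used here as a stand-in test statistic."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy ** 2 / (sxx * syy)


def permutation_p_value(x, y, n_permutations=999, seed=1):
    """Assumed formula: p = (nx + 1) / (N + 1)."""
    rng = random.Random(seed)
    observed = r_squared(x, y)
    shuffled = list(y)
    n_x = 0  # permutations with a statistic at least as large as observed
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        if r_squared(x, shuffled) >= observed - 1e-12:
            n_x += 1
    return (n_x + 1) / (n_permutations + 1)


# Hypothetical gradient values and species abundances in 8 samples.
gradient = [1, 2, 3, 4, 5, 6, 7, 8]
abundance = [2, 3, 5, 4, 7, 8, 8, 10]

p_value = permutation_p_value(gradient, abundance)
print(p_value)
```

With N = 999 permutations the smallest attainable value is 1/1000, which is why the number of permutations limits the resolution of the test.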

PARTIAL ORDINATION (E.G. PCCA)
- removes the part of the variability explained by environmental factors that are not relevant / interesting (e.g. the effect of block)
- the "not interesting" variables are defined as covariables
- after this, the remaining variability in species data is analyzed using unconstrained or constrained ordination
- if an unconstrained analysis follows: ordination axes represent the variability in species composition that remains after removing the effect of covariables
- if a constrained analysis follows: ordination axes represent the net effect of all other environmental variables without the effect of covariables

VARIANCE PARTITIONING
Borcard et al. 1992, Ecology 73: 1045–1055
Calculation procedure:

explanatory variable   covariable   explained variability
1 and 2                none         [a] + [b] + [c]
1                      2            [a]
2                      1            [c]

[Diagram: total variability divided into fractions [a], [b], [c] and [d]; variable 1 covers [a] + [b], variable 2 covers [b] + [c]]
- [a] + [b]: marginal effect of variable 1
- [a]: conditional effect of variable 1 (conditioned by variable 2)
- shared variance: $[b] = ([a] + [b] + [c]) - [a] - [c]$
- variability not explained by the model: $[d] = \text{Total inertia} - ([a] + [b] + [c])$
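
The arithmetic of the calculation procedure, as a short sketch. The three explained-variability values would come from three separate (partial) constrained ordinations; the numbers and the function name here are invented.

```python
def partition_variance(total_inertia, both, var1_given_var2, var2_given_var1):
    """Return fractions [a], [b], [c], [d] from three ordination results."""
    a = var1_given_var2          # variable 1 with variable 2 as covariable
    c = var2_given_var1          # variable 2 with variable 1 as covariable
    b = both - a - c             # shared variance
    d = total_inertia - both     # not explained by the model
    return {"a": a, "b": b, "c": c, "d": d}


# Hypothetical results: total inertia 2.0; variables 1 and 2 together
# explain 0.9; the conditional effects are 0.4 and 0.3.
fractions = partition_variance(2.0, 0.9, 0.4, 0.3)
print(fractions)
print("marginal effect of variable 1:", fractions["a"] + fractions["b"])
```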

VARIANCE PARTITIONING
Økland (1999) J. Veg. Sci. 10: 131-136
- the amount of compositional variability extracted by ecologically interpretable ordination axes, if calculated as the eigenvalue-to-total-inertia ratio, is underestimated due to lack of fit of the data to the model
- the common interpretation of unexplained variability as random variation (noise) in data is inappropriate
- recommendation: do not calculate the eigenvalue-to-total-inertia ratio; instead, focus on the relative amount of variation explained by different sets of explanatory variables

FORWARD SELECTION OF ENVIRONMENTAL VARIABLES
- from the available environmental variables, choose only those with a significant effect
- in each step, the significance of particular variables is tested by Monte-Carlo permutation tests
- it selects the variable that explains the most variability and is significant; this variable is included in the model as a covariable
- in the next step, it considers the other variables and continues the statistical testing
- significance tests suffer from the problem of multiple comparisons and are thus quite liberal (the number of significant variables included in the model is unrealistically high; Bonferroni correction is desirable)

WHY ORDINATION?
- it is impossible to visualize more than three dimensions, but ecological data have hundreds of dimensions
- the reduced low-dimensional ordination space represents the main ecological gradients and reduces the noise in data (ordination = noise reduction technique)
- in the case of statistical testing, ordination does not suffer from the problem of multiple comparisons
- we can determine the relative importance of different gradients; this is virtually impossible with univariate techniques
- some techniques (DCA) provide a measure of beta diversity
- the graphical results from most techniques often lead to ready and intuitive interpretations
Source: http://ordination.okstate.edu

THREE ALTERNATIVE APPROACHES TO CONSTRAINED ORDINATION
[Figure. Source: Legendre & Legendre (2012), following Legendre & Gallagher (2001)]

PROBLEM OF MULTIPLE TESTING
Simulation:
- 25 randomly generated variables
- test the significance of the correlation of each pair
- significant correlations (p < 0.05) are represented by dark squares in the figure
- total of 300 analyses, 16 significant
- solution: apply a correction for multiple testing (e.g. Bonferroni)
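
The numbers in the simulation can be checked with a few lines of arithmetic: 25 variables give 300 pairs, about 15 of which are expected to be "significant" by chance at the 0.05 level, close to the 16 reported. The Bonferroni threshold shown is the usual alpha divided by the number of tests; variable names are hypothetical.

```python
import math

n_variables = 25
alpha = 0.05

n_tests = math.comb(n_variables, 2)        # number of variable pairs
expected_false_positives = n_tests * alpha  # expected under the null hypothesis
bonferroni_alpha = alpha / n_tests          # per-test threshold after correction

print(n_tests)                   # 300
print(expected_false_positives)  # 15.0
print(bonferroni_alpha)          # about 0.000167
```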

MANTEL TEST
[Figure: correlation between the environmental distance matrix (De) and the distance matrix derived from the species × sample data; example result r = 0.965, p = 0.015]

CLASSIFICATION

CLASSIFICATION METHODS
[Diagram of classification methods; the first label is truncated in the source]
- ...iation analysis)
- polythetic (TWINSPAN)
- agglomerative (cluster analysis s.s.)
- non-hierarchical (K-means clustering)

CLUSTER ANALYSIS
The result of cluster analysis is influenced by a sequence of decisions made at different stages of data processing:
- data collection
  - choice of importance value (cover, abundance)
- raw data
  - transformation
  - standardization
  - distance measure (Euclidean, Bray-Curtis, Manhattan etc.)
- dissimilarity matrix
  - choice of cluster algorithm (single linkage, complete linkage etc.)
- resulting dendrogram

INFLUENCE OF DATA TRANSFORMATION
[Figure: two dendrograms. Single linkage / Euclidean distance / no transformation; Single linkage / Euclidean distance / log transformation]
- transformation of data (e.g. log transformation) may strongly influence the result of classification (this is especially true for Euclidean distance combined with the single linkage method)

SINGLE LINKAGE VS. COMPLETE LINKAGE
[Figure: two dendrograms. Bray-Curtis distance / Single linkage; Bray-Curtis distance / Complete linkage]
- single linkage has pronounced chaining

CLUSTERING ALGORITHMS
[Figure: two dendrograms. Euclidean distance / UPGMA; Euclidean distance / Ward's method]
Single linkage (nearest neighbor)
Complete linkage (furthest neighbor)
Average linkage
- includes several methods (e.g. UPGMA) standing between single and complete linkage
- more meaningful for ecological data
Ward's minimum variance method
- cannot be combined with semimetric similarity indices (e.g. Sørensen / Bray-Curtis similarity index)
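
How the first three algorithms differ can be shown by the distance they assign between two clusters. The sketch below computes it from a hypothetical distance matrix; "average" here means the unweighted mean of all between-cluster pairs, as in UPGMA. Ward's method is not covered, and the function name is made up.

```python
def cluster_distance(dist, cluster_a, cluster_b, method):
    """Distance between two clusters under a given linkage rule."""
    pairs = [dist[i][j] for i in cluster_a for j in cluster_b]
    if method == "single":      # nearest neighbor
        return min(pairs)
    if method == "complete":    # furthest neighbor
        return max(pairs)
    if method == "average":     # mean of all pairs (UPGMA)
        return sum(pairs) / len(pairs)
    raise ValueError(f"unknown method: {method}")


# Hypothetical symmetric distance matrix for four samples.
dist = [
    [0, 2, 6, 10],
    [2, 0, 5, 9],
    [6, 5, 0, 4],
    [10, 9, 4, 0],
]

for method in ("single", "complete", "average"):
    print(method, cluster_distance(dist, [0, 1], [2, 3], method))
```

Single linkage only needs one close pair to merge two clusters, which is the source of its chaining; complete linkage waits for the most distant pair.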

CLUSTERING ALGORITHMS
Flexible clustering (beta flexible)
- parameter β influences the chaining of the dendrogram
- highest chaining for β = 1, lowest chaining for β = -1
- optimal representation of distances among samples for β = -0.25
Source: Legendre & Legendre 1998
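
A sketch of how β enters the calculation, assuming the usual Lance-Williams form of flexible clustering: after clusters i and j are merged, the distance from another cluster k is alpha * d(k, i) + alpha * d(k, j) + beta * d(i, j), with alpha = (1 - beta) / 2. The slide does not show this formula, and the distances used are invented.

```python
def flexible_beta_update(d_ki, d_kj, d_ij, beta=-0.25):
    """Distance from cluster k to the merged cluster (i, j)."""
    alpha = (1.0 - beta) / 2.0
    return alpha * d_ki + alpha * d_kj + beta * d_ij


# Hypothetical distances before merging clusters i and j.
d_ki, d_kj, d_ij = 2.0, 4.0, 1.0

for beta in (-1.0, -0.25, 0.0, 0.5):
    print(beta, flexible_beta_update(d_ki, d_kj, d_ij, beta))
```

More negative values of β push the merged cluster further away from the others, which counteracts chaining.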

CLUSTER ANALYSIS VS. TWINSPAN
Cluster analysis
- agglomerative method: clusters are formed from the bottom, by clustering individual samples
- decisions: which distance measure and clustering algorithm to use
TWINSPAN (Two-Way INdicator Species Analysis)
- divisive method: cuts the data from the top
- suitable for data structured by one strong ecological gradient and for determining a few groups along this gradient
- results in a two-way sorted table, similar to the one used in phytosociology
- algorithm:
  - samples are sorted along the first axis of correspondence analysis (CA, DCA) and then divided into two groups (positive and negative scores)
  - the method has a complicated way of treating samples located close to the center of the axis, which have a high probability of being misclassified
  - includes a number of arbitrary numerical steps (often criticized, but still popular)
- decisions: stopping rules for division, pseudospecies

CLASSIFICATION IN GENERAL
Subjective
- based on subjective decisions of the researcher, not easy for somebody else to reproduce
Formalized (not objective!)
- selection of clearly defined classification criteria, easy to reproduce
- unsupervised
  - numerical methods of classification (e.g. cluster analysis, TWINSPAN)
  - allow only very rough control of the classification result (you can choose the method and set several parameters)
- supervised
  - ANN (artificial neural networks), classification trees, random forests, COCKTAIL
  - need to be trained first; then the method can reproduce the same classification structure
