
Model selection and validation 1: Cross-validation
Ryan Tibshirani
Data Mining: 36-462/36-662
March 26 2013

Optional reading: ISL 2.2, 5.1; ESL 7.4, 7.10

Reminder: modern regression techniques

Over the last two lectures we've investigated modern regression techniques. We saw that linear regression generally has low bias (zero bias, when the true model is linear) but high variance, leading to poor predictions. These modern methods introduce some bias but significantly reduce the variance, leading to better predictive accuracy.

Given a response y ∈ R^n and predictors X ∈ R^{n×p}, we can think of these modern methods as constrained least squares:

    \hat\beta = \mathrm{argmin}_{\beta \in \mathbb{R}^p} \|y - X\beta\|_2^2 \quad \text{subject to} \quad \|\beta\|_q \le t

- q = 1 gives the lasso (and \|\beta\|_1 = \sum_{j=1}^p |\beta_j|)
- q = 2 gives ridge regression (and \|\beta\|_2 = \sqrt{\sum_{j=1}^p \beta_j^2})
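As a concrete illustration (not part of the slides), both fits can be computed in R with the glmnet package; the simulated data below are an assumption, and note that glmnet solves the penalized form with the squared-error loss scaled by 1/(2n), so its lambda is not on the same scale as t or λ above.

# Sketch (assumes the glmnet package): lasso and ridge fits on simulated data.
# glmnet solves the penalized (Lagrangian) form of the problems above.
library(glmnet)

set.seed(0)
n = 50; p = 30
X = matrix(rnorm(n * p), n, p)
beta.true = c(rep(2, 10), rep(0, p - 10))   # assumed sparse true coefficients
y = drop(X %*% beta.true) + rnorm(n)

lasso.fit = glmnet(X, y, alpha = 1)   # alpha = 1: lasso (q = 1 penalty)
ridge.fit = glmnet(X, y, alpha = 0)   # alpha = 0: ridge (q = 2 penalty)

coef(lasso.fit, s = 0.5)   # coefficients at one (illustrative) lambda value
coef(ridge.fit, s = 0.5)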

Regularization at large

Most other modern methods for regression can be expressed as

    \hat\beta = \mathrm{argmin}_{\beta \in \mathbb{R}^p} \|y - X\beta\|_2^2 \quad \text{subject to} \quad R(\beta) \le t,

or equivalently,

    \hat\beta = \mathrm{argmin}_{\beta \in \mathbb{R}^p} \|y - X\beta\|_2^2 + \lambda \cdot R(\beta).

The term R is called a penalty or regularizer, and modifying the regression problem in this way is called applying regularization.

- Regularization can be applied beyond regression: e.g., it can be applied to classification, clustering, principal component analysis
- Regularization goes beyond sparsity: e.g., design R to induce smoothness or structure, instead of pure sparsity

Example: smoothing splines

Smoothing splines use a form of regularization:

    \hat{f} = \mathrm{argmin}_{f} \sum_{i=1}^n \big(y_i - f(x_i)\big)^2 + \lambda \cdot \underbrace{\int \big(f''(x)\big)^2 \, dx}_{R(f)}

Example with n = 100 points:

[Figure: three smoothing spline fits to the same data, with panels labeled "λ too small", "λ just right", and "λ too big"]
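As a quick illustration (not from the slides), R's built-in smooth.spline function fits exactly this kind of penalized criterion; the simulated data and the three lambda values below are assumptions chosen to mimic the three panels.

# Sketch: smoothing splines at three lambda values on simulated data
# (stats::smooth.spline solves the penalized criterion above)
set.seed(0)
n = 100
x = runif(n)
y = sin(12 * (x + 0.2)) / (x + 0.2) + rnorm(n)   # assumed wiggly true function plus noise

fit.small = smooth.spline(x, y, lambda = 1e-9)   # lambda too small: wiggly, overfits
fit.right = smooth.spline(x, y, lambda = 1e-5)   # lambda "just right"
fit.big   = smooth.spline(x, y, lambda = 1e-1)   # lambda too big: oversmooths

plot(x, y, col = "gray")
lines(predict(fit.right, seq(0, 1, length = 200)), col = "red")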

Choosing a value of the tuning parameter

Each regularization method has an associated tuning parameter: e.g., this was λ in the smoothing spline problem, and λ for the lasso and ridge regression in the penalized forms (or t in the constrained forms).

The tuning parameter controls the amount of regularization, so choosing a good value of the tuning parameter is crucial. Because each tuning parameter value corresponds to a fitted model, we also refer to this task as model selection.

What we might consider a good choice of tuning parameter, however, depends on whether our goal is prediction accuracy or recovering the right model for interpretation purposes. We'll cover choosing the tuning parameter for the purposes of prediction; choosing the tuning parameter for the latter purpose is a harder problem.

Prediction error and test error

Our usual setup: we observe

    y_i = f(x_i) + \epsilon_i, \quad i = 1, \ldots, n,

where x_i ∈ R^p are fixed (nonrandom) predictor measurements, f is the true function we are trying to predict (think f(x_i) = x_i^T β for a linear model), and ε_i are random errors.

We call (x_i, y_i), i = 1, ..., n the training data. Given an estimator f̂ built on the training data, consider the average prediction error over all inputs,

    \mathrm{PE}(\hat{f}) = \mathbb{E}\left[\frac{1}{n}\sum_{i=1}^n \big(y_i' - \hat{f}(x_i)\big)^2\right],

where y_i' = f(x_i) + ε_i', i = 1, ..., n, are another set of observations, independent of y_1, ..., y_n.

Suppose that f̂ = f̂_θ depends on a tuning parameter θ, and we want to choose θ to minimize PE(f̂_θ).

If in practice we actually had training data y_1, ..., y_n and test data y_1', ..., y_n' (meaning that we don't use the test data to build f̂_θ), then we could simply use

    \mathrm{TestErr}(\hat{f}_\theta) = \frac{1}{n}\sum_{i=1}^n \big(y_i' - \hat{f}_\theta(x_i)\big)^2,

called the test error, as an estimate of PE(f̂_θ). The larger that n is, the better this estimate.

We usually don't have test data. So what to do instead?

What's wrong with training error?

It may seem like

    \mathrm{TestErr}(\hat{f}_\theta) = \frac{1}{n}\sum_{i=1}^n \big(y_i' - \hat{f}_\theta(x_i)\big)^2 \quad \text{and} \quad \mathrm{TrainErr}(\hat{f}_\theta) = \frac{1}{n}\sum_{i=1}^n \big(y_i - \hat{f}_\theta(x_i)\big)^2

shouldn't be too different; after all, y_i and y_i' are independent copies of each other. The second quantity is called the training error: this is the error of f̂ as measured by the data we used to build it.

But actually, the training and test error curves are fundamentally different. Why?
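A minimal simulation sketch (not the slides' exact example) that computes both quantities for smoothing splines over a grid of λ; the true function and the λ grid are assumptions.

# Sketch: training vs. test error for smoothing splines across lambda,
# using an independent copy y0 of the responses as test data
set.seed(0)
n = 100
x = sort(runif(n))
f = function(x) sin(12 * (x + 0.2)) / (x + 0.2)   # assumed true function
y  = f(x) + rnorm(n)    # training responses
y0 = f(x) + rnorm(n)    # independent test responses at the same inputs

lam.grid = 10^seq(-9, -1, length = 30)
train.err = test.err = numeric(length(lam.grid))
for (j in seq_along(lam.grid)) {
  fhat = predict(smooth.spline(x, y, lambda = lam.grid[j]), x)$y
  train.err[j] = mean((y  - fhat)^2)
  test.err[j]  = mean((y0 - fhat)^2)
}

matplot(log10(lam.grid), cbind(train.err, test.err), type = "l", lty = 1,
        col = c("blue", "red"), xlab = "log10(lambda)", ylab = "Error")
legend("topleft", legend = c("Training error", "Test error"),
       col = c("blue", "red"), lty = 1)

In line with the curves on the following slides, the training error keeps shrinking as λ decreases, while the test error does not.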

Example: smoothing splines (continued)

Back to our smoothing spline example: for a small value of the tuning parameter λ, the training and test errors are very different.

[Figure: the fitted spline at a small λ, with training error ≈ 0.221 and test error ≈ 1.834]

Training and test error curves, averaged over 100 simulations:

[Figure: training and test error curves as functions of λ, on a log scale from roughly 1e-07 to 1e-01]

Cross-validation

Given training data (x_i, y_i), i = 1, ..., n, and an estimator f̂_θ depending on a tuning parameter θ, cross-validation is a simple, intuitive way to estimate prediction error.

Even if θ is a continuous parameter, it's usually not practically feasible to consider all possible values of θ, so we discretize the range and consider choosing θ over some discrete set {θ_1, ..., θ_m}.

[Figure: a prediction error curve as a function of θ]

For a number K, we split the training pairs into K parts or "folds" (commonly K = 5 or K = 10).

K-fold cross-validation considers training on all but the kth part, and then validating on the kth part, iterating over k = 1, ..., K.

(When K = n, we call this leave-one-out cross-validation, because we leave out one data point at a time.)

K-fold cross-validation procedure:

- Divide the set {1, ..., n} into K subsets (i.e., folds) of roughly equal size, F_1, ..., F_K
- For k = 1, ..., K:
  - Consider training on (x_i, y_i), i ∉ F_k, and validating on (x_i, y_i), i ∈ F_k
  - For each value of the tuning parameter θ ∈ {θ_1, ..., θ_m}, compute the estimate \hat{f}_\theta^{-k} on the training set, and record the total error on the validation set:
        e_k(\theta) = \sum_{i \in F_k} \big(y_i - \hat{f}_\theta^{-k}(x_i)\big)^2
- For each tuning parameter value θ, compute the average error over all folds,
        \mathrm{CV}(\theta) = \frac{1}{n}\sum_{k=1}^K e_k(\theta) = \frac{1}{n}\sum_{k=1}^K \sum_{i \in F_k} \big(y_i - \hat{f}_\theta^{-k}(x_i)\big)^2
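A minimal R sketch of this procedure (not from the slides): here x is an n × p predictor matrix, and fit.fun(x, y, theta) and predict.fun(fit, x) are hypothetical placeholders for whatever estimator f̂_θ is being tuned.

# Sketch: generic K-fold cross-validation. fit.fun and predict.fun are
# hypothetical placeholders for the estimator being tuned.
kfold.cv = function(x, y, theta.grid, fit.fun, predict.fun, K = 5) {
  n = length(y)
  folds = split(sample(1:n), rep(1:K, length = n))   # K roughly equal folds
  err = matrix(0, K, length(theta.grid))             # e_k(theta) for each fold
  for (k in 1:K) {
    val = folds[[k]]                                 # validation indices F_k
    for (j in seq_along(theta.grid)) {
      fit = fit.fun(x[-val, , drop = FALSE], y[-val], theta.grid[j])
      pred = predict.fun(fit, x[val, , drop = FALSE])
      err[k, j] = sum((y[val] - pred)^2)             # total error on fold k
    }
  }
  colSums(err) / n                                   # CV(theta), averaged over all folds
}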

Having done this, we get a cross-validation error curve CV(θ) (this curve is a function of θ), e.g.:

[Figure: a CV error curve over a grid of θ values]

and we choose the value of the tuning parameter that minimizes this curve,

    \hat{\theta} = \mathrm{argmin}_{\theta \in \{\theta_1, \ldots, \theta_m\}} \mathrm{CV}(\theta)
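Continuing the sketch above (with the same hypothetical fit.fun and predict.fun), the minimizer is just the grid point with the smallest CV error:

# Choose the tuning parameter value minimizing the CV error curve
cv.err = kfold.cv(x, y, theta.grid, fit.fun, predict.fun, K = 5)
theta.hat = theta.grid[which.min(cv.err)]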

Example: choosing λ for the lasso

Recall our running example from last time: n = 50, p = 30, and the true model is linear with 10 nonzero coefficients. We consider

    \hat\beta^{\mathrm{lasso}} = \mathrm{argmin}_{\beta \in \mathbb{R}^p} \|y - X\beta\|_2^2 + \lambda \|\beta\|_1

and use 5-fold cross-validation to choose λ. Sample code for making the folds (the code that actually fills the chunks is not shown on the slide; a sketch follows below):

K = 5
i.mix = sample(1:n)
folds = vector(mode = "list", length = K)
# divide i.mix into 5 chunks of size n/K = 10, and
# store each chunk in an element of the folds list

folds
[[1]]
 [1] 30 29  4  2 50 42 27 25 23 41

[[2]]
 [1] 21 44 45  9 10 26 16  6 24 48
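One way to fill in the folds, as described by the comment above but not shown on the slide (an assumed completion):

# Assumed completion: chunk k of the shuffled indices i.mix becomes fold k
# (each chunk has size n/K = 10 here)
for (k in 1:K) {
  folds[[k]] = i.mix[((k - 1) * (n / K) + 1):(k * (n / K))]
}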

We consider the values lam = seq(0, 12, length = 60). The resulting cross-validation error curve:

[Figure: the CV error curve over λ ∈ [0, 12], minimized at λ = 3.458]

How does this compare to the prediction error curve? Because this is a simulation, we can answer this question.

[Figure: the CV error curve (minimized at λ = 3.458) alongside the prediction error curve (minimized at λ = 4.068)]

Standard errors for cross-validation

One nice thing about K-fold cross-validation (for a small K relative to n, e.g., K = 5) is that we can estimate the standard deviation of CV(θ), at each θ ∈ {θ_1, ..., θ_m}.

First, we just average the validation errors in each fold:

    \mathrm{CV}_k(\theta) = \frac{1}{n_k} e_k(\theta) = \frac{1}{n_k}\sum_{i \in F_k} \big(y_i - \hat{f}_\theta^{-k}(x_i)\big)^2,

where n_k is the number of points in the kth fold.

We then compute the sample standard deviation of CV_1(θ), ..., CV_K(θ),

    \mathrm{SD}(\theta) = \sqrt{\mathrm{var}\big(\mathrm{CV}_1(\theta), \ldots, \mathrm{CV}_K(\theta)\big)}.

Finally, we estimate the standard deviation of CV(θ), called the standard error of CV(θ), by SE(θ) = SD(θ)/√K.
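A minimal sketch of these formulas in R, assuming err is the K × m matrix of fold errors e_k(θ_j) (as computed inside the earlier kfold.cv sketch) and n.k is the length-K vector of fold sizes:

# Sketch: fold-wise CV errors and the standard error of CV(theta)
cv.k  = err / n.k                 # CV_k(theta): row k divided by n_k
cv    = colSums(err) / sum(n.k)   # CV(theta)
sd.cv = apply(cv.k, 2, sd)        # SD(theta): sample sd over the K folds
se.cv = sd.cv / sqrt(K)           # SE(theta) = SD(theta) / sqrt(K)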

Example: choosing λ for the lasso (continued)

The cross-validation error curve from our lasso example, with standard errors:

[Figure: the CV error curve with standard error bars, minimized at λ = 3.458]

The one standard error rule

The one standard error rule is an alternative rule for choosing the value of the tuning parameter, as opposed to the usual rule

    \hat{\theta} = \mathrm{argmin}_{\theta \in \{\theta_1, \ldots, \theta_m\}} \mathrm{CV}(\theta).

We first find the usual minimizer θ̂ as above, and then move θ in the direction of increasing regularization (this may be increasing or decreasing θ, depending on the parametrization) as much as we can, such that the cross-validation error curve is still within one standard error of CV(θ̂). In other words, we maintain

    \mathrm{CV}(\theta) \le \mathrm{CV}(\hat{\theta}) + \mathrm{SE}(\hat{\theta}).

Idea: "All else equal (up to one standard error), go for the simpler (more regularized) model"
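Continuing the sketch from the standard error computation (the cv, se.cv, and theta.grid objects there), one way to implement this rule, assuming larger θ means more regularization, is:

# Sketch of the one standard error rule (assumes regularization increases
# with theta, as it does for the lasso's lambda)
i.min  = which.min(cv)              # usual rule: minimizer of CV(theta)
thresh = cv[i.min] + se.cv[i.min]   # CV(theta.hat) + SE(theta.hat)
i.1se  = i.min
while (i.1se < length(cv) && cv[i.1se + 1] <= thresh) {
  i.1se = i.1se + 1                 # move toward more regularization while
}                                   # the CV curve stays within one SE
theta.hat.1se = theta.grid[i.1se]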

Example: choosing λ for the lasso (continued)

Back to our lasso example: the one standard error rule chooses a larger value of λ (more regularization):

[Figure: the CV error curve with standard errors; usual rule: λ = 3.458, one standard error rule: λ = 6.305]

What happens if we really shouldn't be shrinking in the first place? Recall our other example from last time: n = 50, p = 30, and the true model is linear with all moderately large coefficients.

We'd like cross-validation, our automated tuning parameter selection procedure, to choose a small value of λ.

[Figure: the CV error and prediction error curves for this example, both minimized at λ = 0.407]

The standard errors here reflect uncertainty about the shape of the CV curve for large λ.

[Figure: the CV error curve with standard errors; usual rule: λ = 0.407, one standard error rule: λ = 1.017]

The one standard error rule doesn't move much from the minimum.

Prediction versus interpretation

Remember: cross-validation estimates prediction error at any fixed value of the tuning parameter, and so by using it, we are implicitly assuming that achieving minimal prediction error is our goal.

Choosing λ for the goal of recovering the true model, for the sake of interpretation, is somewhat of a different task. The value of the parameter that achieves the smallest cross-validation error often corresponds to not enough regularization for these purposes. But the one standard error rule is a step in the right direction.

There are other (often more complicated) schemes for selecting a value of the tuning parameter for the purposes of true model recovery.

Recap: cross-validation

Training error, the error of an estimator as measured by the data used to fit it, is not a good surrogate for prediction error. It just keeps decreasing with increasing model complexity.

Cross-validation, on the other hand, much more accurately reflects prediction error. If we want to choose a value for the tuning parameter of a generic estimator (and minimizing prediction error is our goal), then cross-validation is the standard tool.

We usually pick the tuning parameter that minimizes the cross-validation error curve, but we can also employ the one standard error rule, which helps us pick a more regularized model.

Next time: model assessment, more cross-validation

How would we do cross-validation for a structured problem like that solved by smoothing splines?

[Figure: a smoothing spline fit to example data]
