Machine Learning And Applied Econometrics


Machine Learning and Applied Econometrics
Tree-Based Models
4/23/2019, Machine Learning and Econometrics

Machine Learning and Econometrics
This introductory lecture is based on
– Kevin P. Murphy, Machine Learning: A Probabilistic Perspective, The MIT Press, 2017.
– Darren Cook, Practical Machine Learning with H2O, O'Reilly Media, Inc., 2017.
– Scott Burger, Introduction to Machine Learning with R: Rigorous Mathematical Analysis, O'Reilly Media, Inc., 2018.

Supervised Machine Learning
Regression-based Methods
– Generalized Linear Models
    Linear Regression
    Logistic Regression
– Deep Learning (Neural Nets)
Tree-based Ensemble Methods
– Random Forest (Bagging: Bootstrap Aggregation)
    Parallel ensemble to reduce variance
– Gradient Boosting Machine (Boosting)
    Sequential ensemble to reduce bias

Tree-Based Models
Random Forest (Bagging: Bootstrap Aggregation)
– Parallel ensemble to reduce variance
Gradient Boosting Machine (Boosting)
– Sequential ensemble to reduce bias

Trees
[Figure: Classification Tree and Regression Tree]

Random Forest
Random Forest is a bagging (bootstrap aggregation) of trees.
Given a set of data, each tree in the forest is a weak learner built on a subset of rows (data observations) and columns (features or variables).
Adding more trees reduces the variance of the ensemble's predictions, and because the trees are built independently of one another, they can be trained in parallel.
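The bagging idea can be illustrated with a minimal sketch. This is not H2O's implementation: the helper names `fit_stump` and `bagged_forest` are hypothetical, the weak learner is a one-split regression tree on a single feature, and column subsampling is left out because there is only one column. Each tree sees its own bootstrap sample of rows, and the forest averages the tree predictions.

```python
import random
from statistics import mean

def fit_stump(rows):
    """Fit a one-split regression tree (stump) on (x, y) pairs by least squares."""
    rows = sorted(rows)
    best = None
    for i in range(1, len(rows)):
        if rows[i - 1][0] == rows[i][0]:
            continue  # no threshold can separate equal x values
        left = [y for _, y in rows[:i]]
        right = [y for _, y in rows[i:]]
        lm, rm = mean(left), mean(right)
        sse = sum((y - lm) ** 2 for y in left) + sum((y - rm) ** 2 for y in right)
        if best is None or sse < best[0]:
            threshold = (rows[i - 1][0] + rows[i][0]) / 2
            best = (sse, threshold, lm, rm)
    if best is None:  # all x identical: fall back to a constant prediction
        m = mean(y for _, y in rows)
        return lambda x: m
    _, t, lm, rm = best
    return lambda x: lm if x < t else rm

def bagged_forest(rows, ntrees, rng):
    """Train ntrees stumps, each on a bootstrap sample (rows drawn with replacement)."""
    trees = []
    for _ in range(ntrees):
        sample = [rng.choice(rows) for _ in rows]
        trees.append(fit_stump(sample))
    # The ensemble prediction is the plain average over all trees.
    return lambda x: mean(tree(x) for tree in trees)

if __name__ == "__main__":
    rng = random.Random(0)
    # Noisy step function: y is about 0 for x < 0.5 and about 1 otherwise.
    data = [(i / 40, (1.0 if i >= 20 else 0.0) + rng.gauss(0, 0.1)) for i in range(40)]
    forest = bagged_forest(data, ntrees=25, rng=rng)
    print(forest(0.1), forest(0.9))
```

Because the loop body for one tree never reads another tree's result, the iterations could be distributed across workers, which is the sense in which the ensemble is parallel.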

Random Forest [figure]

Random Forest Modeling with H2O
Basic Model
– h2o.randomForest(x, y, training_frame, model_id = NULL, seed = -1, ...)
Model Specification Options
– ntrees = 50, max_depth = 20, mtries = -1,
– sample_rate = 0.632,
– sample_rate_per_class = NULL,
  col_sample_rate_change_per_level = 1,
  col_sample_rate_per_tree = 1,
– min_rows = 1, nbins = 20,
– nbins_top_level = 1024, nbins_cats = 1024,
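The row sampling rate of 0.632 coincides with the expected share of distinct rows in a classical bootstrap sample, 1 - (1 - 1/n)^n, which tends to 1 - 1/e ≈ 0.632 as n grows. That this is the motivation for the default is an assumption here, not something the slide states. The sketch below checks the number analytically and by simulation; both function names are hypothetical.

```python
import math
import random

def expected_unique_fraction(n):
    """Probability that a given row appears at least once in n draws with replacement."""
    return 1.0 - (1.0 - 1.0 / n) ** n

def simulated_unique_fraction(n, rng):
    """Draw n row indices with replacement and report the share of distinct rows."""
    drawn = {rng.randrange(n) for _ in range(n)}
    return len(drawn) / n

if __name__ == "__main__":
    rng = random.Random(42)
    print(expected_unique_fraction(10_000))
    print(simulated_unique_fraction(10_000, rng))
    print(1.0 - math.exp(-1.0))
```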

Random Forest Modeling with H2O
Model Specification Options (Continued)
– distribution = c("AUTO", "bernoulli", "multinomial", "gaussian", "poisson", "gamma", "tweedie", "laplace", "quantile", "huber"),
– histogram_type = c("AUTO", "UniformAdaptive", "Random", "QuantilesGlobal", "RoundRobin"),
– checkpoint = NULL,

Random Forest Modeling with H2O
Cross-Validation Parameters
– validation_frame = NULL,
– nfolds = 0, seed = -1,
– keep_cross_validation_models = TRUE,
– keep_cross_validation_predictions = FALSE,
– keep_cross_validation_fold_assignment = FALSE,
– fold_assignment = c("AUTO", "Random", "Modulo", "Stratified"),
– fold_column = NULL,
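The three named fold assignment schemes can be sketched as follows. This is an illustration of the general ideas (random assignment, row index modulo nfolds, and per-class balancing), assuming these are what the option names refer to; it is not H2O's code, and the function names are hypothetical.

```python
import random
from collections import defaultdict

def random_folds(n_rows, nfolds, rng):
    """Assign each row to a uniformly random fold."""
    return [rng.randrange(nfolds) for _ in range(n_rows)]

def modulo_folds(n_rows, nfolds):
    """Deterministic assignment: row i goes to fold i mod nfolds."""
    return [i % nfolds for i in range(n_rows)]

def stratified_folds(labels, nfolds, rng):
    """Spread the rows of each class evenly across folds."""
    by_class = defaultdict(list)
    for i, label in enumerate(labels):
        by_class[label].append(i)
    folds = [0] * len(labels)
    for indices in by_class.values():
        rng.shuffle(indices)  # randomize which rows of the class land in which fold
        for position, i in enumerate(indices):
            folds[i] = position % nfolds
    return folds

if __name__ == "__main__":
    labels = ["a"] * 6 + ["b"] * 3
    print(modulo_folds(7, 3))
    print(stratified_folds(labels, 3, random.Random(1)))
```

Stratified assignment matters most for classification with rare classes, where a purely random split can leave a fold with no examples of a class.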

Random Forest Modeling with H2O
Early Stopping
– stopping_rounds = 0,
– stopping_metric = c("AUTO", "deviance", "logloss", "MSE", "RMSE", "MAE", "RMSLE", "AUC", "lift_top_group", "misclassification", "mean_per_class_error", "custom", "custom_increasing"),
– stopping_tolerance = 0.001,
– max_runtime_secs = 0,
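How stopping_rounds and stopping_tolerance interact can be pictured with a simplified rule. The sketch below is an assumption for illustration, not H2O's exact criterion: it takes a positive metric where lower is better (such as logloss or RMSE) and stops when the best value in the last stopping_rounds scoring rounds fails to beat the earlier best by a relative margin of stopping_tolerance. The function name `should_stop` is hypothetical.

```python
def should_stop(history, stopping_rounds, stopping_tolerance):
    """Return True if the metric has stalled.

    history: metric value per scoring round, positive, lower is better.
    stopping_rounds <= 0 disables early stopping, matching the default of 0.
    """
    if stopping_rounds <= 0 or len(history) <= stopping_rounds:
        return False
    best_before = min(history[:-stopping_rounds])
    best_recent = min(history[-stopping_rounds:])
    # Require a relative improvement of at least stopping_tolerance.
    return best_recent > best_before * (1.0 - stopping_tolerance)

if __name__ == "__main__":
    stalled = [1.0, 0.8, 0.7, 0.6999, 0.6998, 0.6999]
    improving = [1.0, 0.8, 0.7, 0.6, 0.5, 0.4]
    print(should_stop(stalled, 3, 0.001))
    print(should_stop(improving, 3, 0.001))
```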

Random Forest Modeling with H2O
Other Important Control Parameters
– balance_classes = FALSE,
– class_sampling_factors = NULL,
– max_after_balance_size = 5,
– max_hit_ratio_k = 0,
– min_split_improvement = 1e-05,
– binomial_double_trees = FALSE,
– col_sample_rate_change_per_level = 1,
– col_sample_rate_per_tree = 1,

Gradient Boosting Machine
Gradient Boosting Machine (GBM) is a forward learning ensemble method. It combines gradient-based optimization and boosting.
– Gradient-based optimization uses gradient computations to minimize a model's loss function in terms of the training data.
– Boosting additively collects an ensemble of weak models to create a robust learning system for predictive tasks.
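The two ingredients can be seen together in a minimal sketch for squared-error loss, where the negative gradient of the loss with respect to the current prediction is simply the residual. This is an illustration under stated assumptions, not H2O's algorithm: the weak learner is a one-split stump on a single feature, and the names `fit_stump`, `gradient_boost`, and `mse` are hypothetical.

```python
from statistics import mean

def fit_stump(xs, targets):
    """Least-squares one-split regression tree on a single feature."""
    pairs = sorted(zip(xs, targets))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # equal x values cannot be separated by a threshold
        left = [t for _, t in pairs[:i]]
        right = [t for _, t in pairs[i:]]
        lm, rm = mean(left), mean(right)
        sse = sum((t - lm) ** 2 for t in left) + sum((t - rm) ** 2 for t in right)
        if best is None or sse < best[0]:
            best = (sse, (pairs[i - 1][0] + pairs[i][0]) / 2, lm, rm)
    if best is None:
        m = mean(targets)
        return lambda x: m
    _, threshold, lm, rm = best
    return lambda x: lm if x < threshold else rm

def gradient_boost(xs, ys, ntrees, learn_rate):
    """Sequentially add stumps, each fitted to the current residuals."""
    base = mean(ys)  # initial constant model
    stumps = []
    preds = [base] * len(xs)
    for _ in range(ntrees):
        # Negative gradient of 0.5 * (y - p)^2 with respect to p is y - p.
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + learn_rate * stump(x) for p, x in zip(preds, xs)]
    return lambda x: base + learn_rate * sum(s(x) for s in stumps)

def mse(model, xs, ys):
    return mean((y - model(x)) ** 2 for x, y in zip(xs, ys))

if __name__ == "__main__":
    xs = [i / 20 for i in range(20)]
    ys = [x ** 2 for x in xs]
    for n in (0, 1, 50):
        print(n, mse(gradient_boost(xs, ys, n, 0.1), xs, ys))
```

Unlike bagging, each stump depends on the predictions of all earlier stumps, so the trees have to be built one after another.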

Boosting [figure]

Gradient Boosting Machine [figure]

Gradient Boosting with H2O
Basic Model
– h2o.gbm(x, y, training_frame, model_id = NULL, seed = -1, ...)
Model Specification Options
– ntrees = 50, max_depth = 5, min_rows = 10,
– nbins = 20, nbins_top_level = 1024, nbins_cats = 1024,
– learn_rate = 0.1, learn_rate_annealing = 1,
– sample_rate = 1, sample_rate_per_class = NULL,
  col_sample_rate = 1,
  col_sample_rate_change_per_level = 1,
  col_sample_rate_per_tree = 1, max_abs_leafnode_pred = Inf, ...)
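The pair learn_rate and learn_rate_annealing can be read as a geometric schedule, assuming the annealing factor multiplies the learning rate once after every tree (an assumption made here for illustration; the slide gives only the defaults). Under that reading the default factor of 1 leaves the rate constant. The function name is hypothetical.

```python
def learn_rate_schedule(learn_rate, learn_rate_annealing, ntrees):
    """Effective learning rate applied to tree k, for k = 0 .. ntrees - 1."""
    return [learn_rate * learn_rate_annealing ** k for k in range(ntrees)]

if __name__ == "__main__":
    print(learn_rate_schedule(0.1, 1, 3))     # constant rate with the default factor
    print(learn_rate_schedule(0.1, 0.99, 3))  # slowly decaying rate
```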

Gradient Boosting with H2O
Model Specification Options (Continued)
– distribution = c("AUTO", "bernoulli", "quasibinomial", "multinomial", "gaussian", "poisson", "gamma", "tweedie", "laplace", "quantile", "huber"),
– quantile_alpha = 0.5,
– tweedie_power = 1.5,
– huber_alpha = 0.9,
– checkpoint = NULL
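For the "quantile" distribution, the role of quantile_alpha can be illustrated with the standard pinball (quantile) loss. Whether H2O uses exactly this parametrization is an assumption; the sketch only shows why the default of 0.5 targets the median and why a larger value targets a higher quantile. The function names are hypothetical.

```python
def pinball_loss(y, prediction, quantile_alpha):
    """Quantile loss: under-predictions cost alpha, over-predictions cost 1 - alpha."""
    error = y - prediction
    return quantile_alpha * error if error >= 0 else (quantile_alpha - 1.0) * error

def best_constant(ys, quantile_alpha):
    """Among the observed values, pick the constant prediction with the lowest total loss."""
    return min(ys, key=lambda c: sum(pinball_loss(y, c, quantile_alpha) for y in ys))

if __name__ == "__main__":
    ys = [1, 2, 3, 4, 100]
    print(best_constant(ys, 0.5))  # the median, unaffected by the outlier
    print(best_constant(ys, 0.9))  # a high quantile
```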

Gradient Boosting with H2O
Cross-Validation Parameters
– validation_frame = NULL,
– nfolds = 0, seed = -1,
– keep_cross_validation_models = TRUE,
– keep_cross_validation_predictions = FALSE,
– keep_cross_validation_fold_assignment = FALSE,
– fold_assignment = c("AUTO", "Random", "Modulo", "Stratified"),
– fold_column = NULL,

Gradient Boosting with H2O
Early Stopping
– stopping_rounds = 0,
– stopping_metric = c("AUTO", "deviance", "logloss", "MSE", "RMSE", "MAE", "RMSLE", "AUC", "lift_top_group", "misclassification", "mean_per_class_error", "custom", "custom_increasing"),
– stopping_tolerance = 0.001,
– max_runtime_secs = 0,

Gradient Boosting with H2O
Other Important Control Parameters
– min_split_improvement = 1e-05,
– histogram_type = c("AUTO", "UniformAdaptive", "Random", "QuantilesGlobal", "RoundRobin")
