
Predictive Learning from Data

LECTURE SET 2
Problem Setting, Basic Learning Problems and Inductive Principles

Electrical and Computer Engineering

OUTLINE

2.0 Objectives
    Background
    - formalization of inductive learning
    - classical statistics vs predictive approach
2.1 Terminology and Learning Problems
2.2 Basic Learning Methods and Complexity Control
2.3 Inductive Principles
2.4 Alternative Learning Formulations
2.5 Summary

2.0 Objectives

- To quantify the notions of explanation, prediction and model
- Introduce terminology
- Describe common learning problems

Correspondence between everyday notions and their formal counterparts:
- Past observations ~ data points
- Explanation (model) ~ function
- Learning ~ function estimation (from data)
- Prediction ~ using the model to predict outputs for new inputs

Example: Classification Problem

[Figure: training samples and an estimated model (decision boundary)]

- Goal 1: explain training data ~ minimize training error
- Goal 2: generalization (for future data)
- Learning (model estimation) is ill-posed

Mathematical Formalization

Learning machine ~ predictive system.

[Diagram: a Generator of samples produces inputs x; the System responds with outputs y; the Learning Machine observes (x, y) and produces estimates ŷ]

- Unknown joint distribution P(x, y)
- Set of functions (possible models) f(x, w)
- Pre-specified loss function L(y, f(x, w)) (by convention, a non-negative loss)

Inductive Learning Setting

- The learning machine observes samples (x, y) and returns an estimated response ŷ = f(x, w)
- Two types of inference: identification vs imitation
- Goal: minimize prediction risk

  R(w) = \int L(y, f(x, w)) \, dP(x, y) \to \min

Two Views of Empirical Inference

Two approaches to empirical (statistical) inference:

[Diagram: EMPIRICAL DATA and KNOWLEDGE, ASSUMPTIONS feed into STATISTICAL INFERENCE, which splits into PROBABILISTIC MODELING and the RISK-MINIMIZATION APPROACH]

These two approaches are different both technically and conceptually.

Classical Approaches to Inductive Inference

Generic problem: finite data → model

(1) Classical science: hypothesis testing
    - experimental data is generated by a given model (a single function ~ scientific theory)
(2) Classical statistics: maximum likelihood
    - data generated by a parametric model for density
    - Note: loss function ~ likelihood (not problem-specific)

The same solution approach is used for all types of problems.

R. Fisher: "uncertain inferences" from finite data; see R. Fisher (1935), The Logic of Inductive Inference, J. Royal Statistical Society, available at http://www.dcscience.net/fisher-1935.pdf

Discussion

The mathematical formulation is useful for quantifying:
- explanation ~ fitting error (on training data)
- generalization ~ prediction error

Natural assumptions:
- future similar to past: stationary P(x, y), i.i.d. data
- a known discrepancy measure or loss function, e.g. mean squared error (MSE)

What if these assumptions do not hold?

OUTLINE

2.0 Objectives
2.1 Terminology and Learning Problems
    - supervised / unsupervised
    - classification
    - regression etc.
2.2 Basic Learning Methods and Complexity Control
2.3 Inductive Principles
2.4 Alternative Learning Formulations
2.5 Summary

Supervised Learning: Regression

Data in the form (x, y), where
- x is a multivariate input (i.e. a vector)
- y is a univariate output ('response')

Regression: y is real-valued, with squared loss

  L(y, f(x)) = (y - f(x))^2

→ Estimation of a real-valued function

Regression Estimation Problem

Given: training data (x_i, y_i), i = 1, 2, ..., n

Find a function f(x, w) that minimizes the squared error for a large number (N) of future samples:

  \sum_{k=1}^{N} [y_k - f(x_k, w)]^2 \to \min

equivalently,

  \int (y - f(x, w))^2 \, dP(x, y) \to \min

BUT future data is unknown ~ P(x, y) unknown
→ All estimation problems are ill-posed
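Since P(x, y) is unknown, the prediction risk integral above can only be approximated when a data-generating process is assumed. A minimal Python sketch, where the target function, noise model, and candidate parameters w are all made up purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

def model(x, w):
    # A fixed linear candidate model f(x, w); assumed for illustration only.
    return w[0] + w[1] * x

w = np.array([0.0, 1.0])  # assumed parameter values

# Approximate the prediction risk by averaging squared error over a large
# sample of N "future" points drawn from an assumed P(x, y): x ~ U(0, 1),
# y = x^2 + noise.
N = 100_000
x = rng.uniform(0, 1, N)
y = x ** 2 + rng.normal(0, 0.5, N)
risk_estimate = np.mean((y - model(x, w)) ** 2)
print(f"Monte Carlo estimate of prediction risk: {risk_estimate:.4f}")
```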

Supervised Learning: Classification

Data in the form (x, y), where
- x is a multivariate input (i.e. a vector)
- y is a univariate output ('response')

Classification: y is categorical (a class label), with 0/1 loss

  L(y, f(x)) = 0 if y = f(x), and 1 if y ≠ f(x)

→ Estimation of an indicator function

Density Estimation

Data in the form (x), where
- x is a multivariate input (feature vector)

A parametric form of the density is given: f(x, w)

The loss function is the likelihood or, more commonly, the negative log-likelihood:

  L(f(x, w)) = -\ln f(x, w)

The goal of learning is minimization of

  R(w) = -\int \ln f(x, w) \, p(x) \, dx

from finite training data, yielding f(x, w_0)

Unsupervised Learning 1

Data in the form (x), where
- x is a multivariate input (i.e. a feature vector)

Goal: data reduction or clustering

Clustering ~ estimation of a mapping X → C, where C = {c_1, c_2, ..., c_m}, with loss

  L(x, f(x)) = \| x - f(x) \|^2

Unsupervised Learning 2

Data in the form (x), where
- x is a multivariate input (i.e. a vector)

Goal: dimensionality reduction

The mapping f(x) is a projection of the data onto a low-dimensional subspace, minimizing the loss

  L(x, f(x)) = \| x - f(x) \|^2

OUTLINE

2.0 Objectives
2.1 Terminology and Learning Problems
2.2 Basic Learning Methods and Complexity Control
    - Parametric modeling
    - Non-parametric modeling
    - Data reduction
    - Complexity control
2.3 Inductive Principles
2.4 Alternative Learning Formulations
2.5 Summary

Basic Learning Methods

General idea:
- Specify a wide set of possible models f(x, w), where w is an abstract set of 'parameters'
- Estimate model parameters by minimizing a given loss function on the training data (→ ERM, Empirical Risk Minimization)

Learning methods differ in:
- the chosen parameterization
- the loss function used
- the optimization method used for parameter estimation

Parametric Modeling (ERM)

Given training data (x_i, y_i), i = 1, 2, ..., n
(1) Specify a parametric model
(2) Estimate its parameters (via fitting to the data)

Example: linear regression f(x) = (w · x) + b

  R_{emp}(w, b) = \frac{1}{n} \sum_{i=1}^{n} (y_i - (w \cdot x_i) - b)^2 \to \min
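A minimal sketch of this ERM step in NumPy, solving the least-squares problem directly (the data here are synthetic, for illustration only):

```python
import numpy as np

def fit_linear_erm(X, y):
    """Minimize empirical squared-error risk for f(x) = (w . x) + b."""
    A = np.column_stack([X, np.ones(len(X))])   # append intercept column
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef[:-1], coef[-1]                  # (w, b)

# Toy usage with made-up data
rng = np.random.default_rng(1)
X = rng.uniform(0, 1, (20, 2))
y = X @ np.array([2.0, -1.0]) + 0.5 + rng.normal(0, 0.1, 20)
w, b = fit_linear_erm(X, y)
print(w, b)   # close to [2.0, -1.0] and 0.5
```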

Parametric Modeling: Classification

Given training data (x_i, y_i), i = 1, 2, ..., n
(1) Specify a parametric model
(2) Estimate its parameters (via fitting to the data)

Example: univariate classification data set
(a) linear decision boundary: f(x) = sign(wx + b)
(b) third-order polynomial: f(x) = sign(w_3 x^3 + w_2 x^2 + w_1 x + b)

Parametric Methods in Classical Statistics

- Learning ~ density estimation from i.i.d. data
- Maximum Likelihood inductive principle: given n training samples X, find w* maximizing

  P(data | model) = P(X | w) = \prod_{i=1}^{n} p(x_i; w)

  or, equivalently, minimizing the negative log-likelihood

- See textbook, Section 2.2, for an example: estimating the two parameters of a normal distribution from i.i.d. data samples via maximum likelihood yields the empirical mean and the empirical variance
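A short sketch of that normal-distribution example, assuming i.i.d. Gaussian samples; note that the maximum-likelihood variance estimate divides by n rather than n - 1:

```python
import numpy as np

def gaussian_mle(x):
    """Maximum-likelihood estimates for N(mu, sigma^2) from i.i.d. samples."""
    mu_hat = np.mean(x)                    # empirical mean
    var_hat = np.mean((x - mu_hat) ** 2)   # empirical variance (divides by n)
    return mu_hat, var_hat

rng = np.random.default_rng(2)
x = rng.normal(3.0, 2.0, 1000)
print(gaussian_mle(x))   # close to (3.0, 4.0)
```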

Maximum Likelihood (cont'd)

- Similar approach for regression: for a known parametric distribution (normal noise), maximizing likelihood ~ minimizing squared loss
- Similar approach for classification: for known class distributions (Gaussian), maximizing likelihood ~ second-order decision boundary

General approach (statistical decision theory):
- Start with a parametric form of a distribution
- Estimate its parameters via maximum likelihood
- Use the estimated distributions for making decisions (prediction)

Non-Parametric Modeling

Given training data (x_i, y_i), i = 1, 2, ..., n, estimate the model (at a given point x_0) as a 'local average' of the data.
Note: need to define 'local' and 'average'.

Example: k-nearest-neighbors regression

  f(x_0) = \frac{1}{k} \sum_{j=1}^{k} y_j

where the sum is over the k training samples nearest to x_0
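A minimal k-nn regression sketch, assuming Euclidean distance defines 'local' and a plain mean defines 'average':

```python
import numpy as np

def knn_regress(x0, X, y, k):
    """Predict at x0 as the average response of the k nearest training inputs."""
    dist = np.linalg.norm(X - x0, axis=1)   # Euclidean distance to each sample
    nearest = np.argsort(dist)[:k]          # indices of the k closest samples
    return np.mean(y[nearest])

# Toy usage with 1-d inputs
X = np.array([[0.0], [0.5], [1.0], [1.5]])
y = np.array([0.0, 0.25, 1.0, 2.25])
print(knn_regress(np.array([0.6]), X, y, k=2))   # mean of the two nearest y's
```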

Data Reduction Approach

Given training data, estimate the model as a 'compact encoding' of the data.
Note: 'compact' ~ # of bits to encode the model, or # of bits to encode the data (MDL)

Example: piecewise linear regression
How many parameters are needed for a two-linear-component model?

Data Reduction Approach (cont'd)

Data reduction approaches are commonly used for unsupervised learning tasks.

Example: clustering.
[Figure: training data encoded by 3 points (cluster centers)]

Issues:
- How to find the centers? (see the sketch below)
- How to select the number of clusters?
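One common (though not the only) answer to the first question is the k-means algorithm. A minimal sketch, assuming the number of clusters m is given:

```python
import numpy as np

def kmeans(X, m, n_iter=100, seed=0):
    """Plain k-means: encode the data by m cluster centers."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), m, replace=False)]  # initialize from data
    for _ in range(n_iter):
        # assign each point to its nearest center
        dist = np.linalg.norm(X[:, None] - centers[None], axis=2)
        labels = np.argmin(dist, axis=1)
        # move each center to the mean of its assigned points
        for j in range(m):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return centers, labels

# Toy usage: three well-separated blobs
rng = np.random.default_rng(5)
X = np.vstack([rng.normal(c, 0.1, (30, 2)) for c in (0.0, 2.0, 4.0)])
centers, labels = kmeans(X, m=3)
print(centers)
```

The second question (choosing m) is exactly the complexity-control problem discussed next.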

Diverse Terminology (for Learning Methods)

Many methods differ in the parameterization of admissible models or approximating functions ŷ = f(x, w):
- neural networks
- decision trees
- signal processing (e.g. wavelets)

How training samples are used:
- batch methods
- on-line or flow-through methods

Motivation for Complexity Control

Effect of model complexity on generalization:
[Figure: (a) Classification  (b) Regression]

Complexity Control: Parametric Modeling

Consider regression estimation:
- Ten training samples from y = x^2 + N(0, σ^2), where σ^2 = 0.25
- Fitting linear and 2nd-order polynomial models:
[Figure: linear vs second-order polynomial fits to the ten samples]
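The same experiment can be sketched in a few lines of NumPy (synthetic data matching the slide's setup). Training error necessarily decreases as the model gets more complex, which is exactly why training error alone cannot guide complexity control:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 10
x = rng.uniform(-1, 1, n)
y = x ** 2 + rng.normal(0, 0.5, n)   # noise variance sigma^2 = 0.25

for degree in (1, 2):
    w = np.polyfit(x, y, degree)          # least-squares polynomial fit
    resid = y - np.polyval(w, x)
    print(degree, np.mean(resid ** 2))    # training MSE per model
```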

Complexity Control: Local Estimation

Consider regression estimation:
- Ten training samples from y = x^2 + N(0, σ^2), where σ^2 = 0.25
- Using k-nn regression with k = 1 and k = 4:
[Figure: k-nn regression estimates for k = 1 and k = 4]

Complexity Control (Summary)

- Complexity (of admissible models) affects generalization (for future data)
- Specific complexity indices:
  - parametric models: # of parameters
  - local modeling: size of the local region
  - data reduction: # of clusters
- Complexity control ~ choosing the optimal complexity (→ good generalization) for a given (training) data set
- Not well understood in classical statistics

OUTLINE

2.0 Objectives
2.1 Terminology and Learning Problems
2.2 Basic Learning Methods and Complexity Control
2.3 Inductive Principles
    - Motivation
    - Inductive principles: penalization, SRM, Bayesian inference, MDL
2.4 Alternative Learning Formulations
2.5 Summary

Conceptual Motivation

Generalization from finite data requires:
- a priori knowledge ~ any info outside the training data (e.g.?)
- an inductive principle ~ a general strategy for combining a priori knowledge and data
- a learning method ~ a constructive implementation of an inductive principle

Example: Empirical Risk Minimization ~ the parametric modeling approach

Question: what are possible limitations of ERM?

Motivation (cont'd)

Need for flexible (adaptive) methods:
- wide (flexible) parameterization → ill-posed estimation problems
- need provisions for complexity control

Inductive principles originate from statistics, applied math, information theory and learning theory, and they adopt distinctly different terminology & concepts.

Inductive Principles

Inductive principles differ in terms of:
- representation of a priori knowledge
- the mechanism for combining a priori knowledge with training data
- applicability when the true model does not belong to the admissible models
- availability of constructive procedures (learning methods / algorithms)

Note: usually prior knowledge is about parameterization

PENALIZATION

Overcomes the limitations of ERM.

Penalized empirical risk functional:

  R_{pen}(w) = R_{emp}(w) + \lambda \, \phi[f(x, w)]

- \phi[f(x, w)] is a non-negative penalty functional specified a priori (independent of the data); its larger values penalize complex functions
- \lambda is a regularization parameter (a non-negative number) tuned to the training data

Example: ridge regression
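A minimal ridge-regression sketch, penalizing the squared norm of the weights (intercept omitted for brevity); for this choice of penalty the penalized risk has a closed-form minimizer:

```python
import numpy as np

def ridge(X, y, lam):
    """Penalized ERM: minimize ||y - Xw||^2 + lam * ||w||^2 (no intercept)."""
    d = X.shape[1]
    # Normal equations of the penalized risk: (X'X + lam*I) w = X'y
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)
```

Larger values of lam shrink the weights toward zero (a simpler model); lam = 0 recovers plain ERM.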

Structural Risk Minimization

Overcomes the limitations of ERM.

Complexity ordering on a set of admissible models, as a nested structure

  S_0 \subset S_1 \subset S_2 \subset \ldots

Examples: a set of polynomial models, Fourier expansion, etc.

Goal of learning ~ minimization of the empirical risk for an optimally selected element S_k

Bayesian Inference

- A probabilistic approach to inference
- Explicitly defines a priori knowledge as a prior probability (distribution) on a set of model parameters

Bayes formula for updating the prior probability using the evidence given by the training data:

  P(\text{model} \mid \text{data}) = \frac{P(\text{data} \mid \text{model}) \, P(\text{model})}{P(\text{data})}

- P(model | data) ~ posterior probability
- P(data | model) ~ likelihood (the probability that the data are generated by a model)

Bayesian Density Estimation

Consider parametric density estimation, where the prior probability distribution is

  P(\text{model}) = p(w)

Given training data X, the posterior probability distribution is updated:

  p(w \mid X) = \frac{P(X \mid w) \, p(w)}{P(X)}

[Figure: prior p(w) and posterior p(w | X) distributions over the parameter w]
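As a concrete illustration of this update (not from the slides), consider estimating the mean of a Gaussian with known variance under a Gaussian prior, one of the few cases where the posterior is available in closed form:

```python
import numpy as np

def gaussian_mean_posterior(x, sigma2, mu0, tau2):
    """Posterior over the mean w of N(w, sigma2), given prior w ~ N(mu0, tau2).

    Conjugate update:
      precision = 1/tau2 + n/sigma2
      mean      = (mu0/tau2 + sum(x)/sigma2) / precision
    """
    n = len(x)
    prec = 1.0 / tau2 + n / sigma2
    mean = (mu0 / tau2 + np.sum(x) / sigma2) / prec
    return mean, 1.0 / prec   # posterior mean and variance

rng = np.random.default_rng(4)
x = rng.normal(2.0, 1.0, 50)
print(gaussian_mean_posterior(x, sigma2=1.0, mu0=0.0, tau2=10.0))
```

As the sample size n grows, the posterior concentrates near the empirical mean and the prior matters less.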

Implementation of Bayesian Inference

Maximum likelihood, i.e. choose w* maximizing

  P(\text{data} \mid \text{model}) = P(X \mid w) = \prod_{i=1}^{n} p(x_i; w)

(equivalent to ERM)

True Bayesian inference (averaging):

  \hat{p}(x \mid X) = \int p(x; w) \, p(w \mid X) \, dw

where p(x; w) is a set of admissible densities and

  p(w \mid X) = \frac{P(X \mid w) \, p(w)}{P(X)}

Minimum Description Length (MDL)

Information-theoretic approach:
- any training data set can be optimally encoded
- code length ~ generalization capability

Related to the data reduction approach introduced (informally) earlier.

Two possible implementations:
- lossy encoding
- lossless encoding of the data (as in MDL)

Binary Classification under MDL

Consider a training data set X = {x_k, y_k}, k = 1, 2, ..., n, where y ∈ {0, 1}.
Given the data objects {x_1, ..., x_n}, the outputs form a binary string y_1, ..., y_n. Is this string random?

If there is a dependency, then the output string can be encoded by a shorter code:
- the model, having code length L(model)
- the error term L(data | model)

The total length of such a code for the string y is

  b = L(\text{model}) + L(\text{data} \mid \text{model})

and the compression coefficient is K = b / n
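The bookkeeping can be made concrete with hypothetical numbers (all values below are invented for illustration):

```python
# Hypothetical code lengths, only to make the arithmetic concrete:
n = 1000          # length of the output string y1..yn (one bit per sample)
L_model = 120     # bits to encode the model
L_error = 310     # bits to encode the prediction errors given the model

b = L_model + L_error   # total code length for the string y
K = b / n               # compression coefficient
print(b, K)             # K = 0.43 < 1: the string is compressible (non-random)
```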

Comparison of Inductive Principles

- Representation of a priori knowledge / complexity: penalty term, structure, prior distribution, codebook
- Formal procedure for complexity control: penalized risk, optimal element of a structure, posterior distribution
- Constructive implementation of complexity control: resampling, analytic bounds, marginalization, minimum code length

*** See Table 2.1 in [Cherkassky & Mulier, 2007] ***

OUTLINE

2.0 Objectives
2.1 Terminology and Learning Problems
2.2 Basic Learning Methods and Complexity Control
2.3 Inductive Principles
2.4 Alternative Learning Formulations
    - Motivation
    - Examples of non-standard formulations
    - Formalization of application domain
2.5 Summary

Motivation

Estimation of a predictive model:
- Step 1: problem specification / formalization
- Step 2: model estimation, learning, inference

Standard inductive formulation:
- usually assumed in all ML algorithms
- may well not be the best formalization for a given application problem

Standard Supervised Learning

[Diagram: a Generator of samples produces x; the System produces y; the Learning Machine outputs f(x, w), evaluated by the loss L(f(x, w), y)]

- Available (training) data format: (x, y)
- Test samples (x-values) are unknown
- Stationary distribution, i.i.d. samples
- A single model needs to be estimated
- Specific loss functions adopted for common tasks (classification, regression, etc.)

Non-Standard Learning Settings

Available data format:
- x-values of test samples are known during training → transduction, semi-supervised learning

Different (non-standard) loss function:
- see the later example, 'learning the sign of a function'

Univariate output (→ a single model):
- multiple outputs may need to be estimated from the available data

Transduction

Predicting function values at given points:
- Given: a labeled training set and the x-values of the test data
- Estimate (predict) the y-values for the given test inputs

[Diagram: a priori knowledge and training data feed transduction, which produces the predicted outputs]

Learning the Sign of a Function

Given training data (x_i, y_i), i = 1, 2, ..., n, with y-values in a bounded range, y ∈ [-2, 2]

Estimate a function f(x) predicting the sign of y, with loss

  L(y, f(x)) = -y f(x), where f(x) ∈ {-1, +1}

- If the prediction is wrong → real-valued loss |y|
- If the prediction is correct → real-valued gain |y|

Neither standard regression nor classification.
Practical application: frequent trading
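A small sketch of this loss, taking L(y, f(x)) = -y f(x) with f(x) ∈ {-1, +1}, so that a wrong sign costs |y| and a correct sign yields a gain of |y|:

```python
import numpy as np

def sign_loss(y, f_x):
    """Loss -y * f(x) for sign prediction: negative values are gains."""
    return -y * f_x

y = np.array([1.5, -0.3, 2.0])   # bounded real-valued outcomes
f_x = np.array([1, 1, -1])       # predicted signs
print(sign_loss(y, f_x))         # [-1.5, 0.3, 2.0]: one gain, two losses
```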

Multiple Model Estimation

Training data in the form (x, y), where
- x is a multivariate input
- y is a univariate real-valued output ('response')

Similar to standard regression, but subsets of the data may be described by different models

Formalization of Application Problems

- The problem specification step cannot be formalized
- But several guidelines can be helpful during the formalization process

Mapping process: application requirements → learning formulation

Specific components of this mapping process are shown next.

[Diagram: APPLICATION NEEDS (input, output and other variables; training/test data; loss function; admissible models) are mapped to a FORMAL PROBLEM STATEMENT, grounded in LEARNING THEORY]

Summary

- Standard inductive learning ~ function estimation
- Goal of learning (empirical inference): to act/perform well, not system identification
- Important concepts:
  - training data, test data
  - loss function, prediction error (~ prediction risk)
  - basic learning problems
- Complexity control
- Inductive principles: which one is the 'best'?

Summary (cont'd)

- Assumptions for inductive learning
- Non-standard learning formulations

Aside: predictive modeling of physical systems vs social systems.
Note: the main assumption (stationarity) does not hold in social systems (business data, financial data, etc.)

For discussion: think of an example application that requires a non-standard learning formulation.
Note: (a) do not use examples similar to the ones presented in my lectures and/or the textbook;
(b) you can email your example to the instructor (maximum half a page)

R. Fisher: "uncertain inferences" from finite data see: R. Fisher (1935), The Logic of Inductive Inference, . - Parametric modeling - Non-parametric modeling - Data reduction - Complexity control 2.3 Inductive Principles 2.4 Alternative Learning Formulations 2.5 Summary. 18
