ELEG/CISC 867: Advanced Machine Learning, Spring 2019
Lecture 3: PAC Learning
Lecturer: Xiugang Wu (02/21/2019, 02/26/2019 & 02/28/2019)

In the previous lecture, we introduced a simplified learning model and showed that, for a finite hypothesis class under the realizability assumption, if the ERM rule is applied to a sufficiently large training sample (whose size is independent of the underlying distribution or labeling function), then the output hypothesis will be probably approximately correct. This lecture discusses the PAC (Probably Approximately Correct) learning model in its full generality.

1 PAC Learning Model

Last lecture, we made several assumptions on the following learning model to simplify our discussion. We now relax these assumptions and introduce the general PAC learning framework.

Figure 1: PAC learning model. (Learning: the learner maps a training sample $\{(X_i, Y_i)\}_{i=1}^n$ to a hypothesis $h \in \mathcal{H}$. Prediction: $h$ is applied to a new input $X$ to produce $\hat{Y} = h(X)$.)

Label set. Last time, we assumed the label set $\mathcal{Y} = \{0, 1\}$, which corresponds to predicting a binary label for a given example $X$. However, many learning tasks take a different form. For example, one may wish to predict a real-valued number (say, the temperature at 9:00 p.m. tomorrow), or a label picked from a finite set of labels (like the topic of the main story in tomorrow's paper). One can easily extend our model to such regression or multiclass classification problems by relaxing the label set to be the set of real vectors or a finite set of multiple labels. This generalized set $\mathcal{Y}$ is also often referred to as the target set.

Data-Generation Mechanism. Last time, we assumed that the target variable $Y$ is fully determined by the input $X$, which may not be a realistic assumption in many cases; think, for example, of predicting which team will win a basketball game based on their history. From now on, we replace the "target labeling function" with a more flexible notion, namely a data-labels generating distribution. In particular, we consider a joint distribution $P_{XY}$, or simply $P$, over $\mathcal{X} \times \mathcal{Y}$. One can view such a distribution as being composed of two parts: a distribution $P_X$ over unlabeled domain points (sometimes called the marginal distribution) and a conditional distribution over labels for each domain point, $P_{Y|X}$.
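To make this data-generation mechanism concrete, here is a minimal Python sketch (the particular marginal, the conditional probabilities, and the function names are illustrative assumptions, not part of the lecture) that factors a joint distribution $P$ into a marginal $P_X$ and a conditional $P_{Y|X}$ and draws an i.i.d. training sample from it. Note that the label is no longer a deterministic function of the input.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative joint distribution P = P_X x P_{Y|X} over X = {0, 1, 2}, Y = {0, 1}.
p_y1_given_x = {0: 0.9, 1: 0.5, 2: 0.1}   # conditional P(Y = 1 | X = x)

def sample_training_set(n):
    """Draw n i.i.d. pairs (X_i, Y_i) from P: first X_i ~ P_X, then Y_i ~ P_{Y|X = X_i}."""
    x = rng.integers(0, 3, size=n)                        # marginal P_X: uniform on {0, 1, 2}
    probs = np.array([p_y1_given_x[xi] for xi in x])      # success probability for each X_i
    y = (rng.random(n) < probs).astype(int)               # Y_i ~ Bernoulli(P(Y = 1 | X = x_i))
    return x, y

x_train, y_train = sample_training_set(n=100)
```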

Performance Measure of a Predictor. We now introduce a general framework for quantifying the performance of a predictor. Given a target set $\mathcal{Y}$ and a reconstructed target set $\hat{\mathcal{Y}}$, let $\ell$ be any function from $\mathcal{Y} \times \hat{\mathcal{Y}}$ to the set of nonnegative real numbers, $\ell: \mathcal{Y} \times \hat{\mathcal{Y}} \to \mathbb{R}_+$. Such functions are referred to as loss functions, as they are used to quantify how bad we feel about our reconstruction $\hat{y}$ once we find out the ground truth $y$. Define the risk $L(h, P)$ associated with a predictor $h$ under the data-generating distribution $P$ as the expected loss when applying $h$ to $X$, i.e.

$$L(h, P) \triangleq \mathbb{E}_{(X,Y) \sim P}[\ell(Y, h(X))].$$

This risk is also called the true risk, as it statistically measures the true performance of the predictor $h$ on unseen data. In contrast, one can also consider the empirical risk that the predictor $h$ incurs over the training sample,

$$\frac{1}{n} \sum_{i=1}^{n} \ell(y_i, h(x_i)).$$

It can be readily seen that the empirical risk is simply the risk of $h$ evaluated under the empirical distribution $P_n$, i.e. $L(h, P_n)$.

Several widely used loss functions are as follows:

- 0-1 loss: The 0-1 loss, widely used in classification, is defined as $\ell_{0\text{-}1}(y, \hat{y}) = \mathbb{I}(y \neq \hat{y})$. The risk of $h$ under distribution $P$ and loss function $\ell_{0\text{-}1}$ is simply the probability of error, $\mathbb{E}_P[\ell_{0\text{-}1}(Y, h(X))] = P(Y \neq h(X))$.

- Square loss: The square loss, also known as the $\ell_2$ loss or quadratic loss, is usually used in the regression problem and is defined as $\ell_{\mathrm{sq}}(y, \hat{y}) = \|y - \hat{y}\|^2$. The risk of $h$ under distribution $P$ and loss function $\ell_{\mathrm{sq}}$ is generally known as the Mean Square Error (MSE), $\mathbb{E}_P[\ell_{\mathrm{sq}}(Y, h(X))] = \mathbb{E}_P[\|Y - h(X)\|^2]$.

- Logarithmic loss: The logarithmic loss, or log loss for short, is widely used in classification when the reconstruction is "soft" and $\hat{y}$ represents a distribution over $\mathcal{Y}$: $\ell_{\log}(y, \hat{y}) = \log \frac{1}{\hat{y}(y)}$. The risk of $h$ under distribution $P$ and loss function $\ell_{\log}$ is given by $\mathbb{E}_{(X,Y) \sim P}[\ell_{\log}(Y, h(X))] = \mathbb{E}_{(X,Y) \sim P}[-\log [h(X)](Y)]$. The empirical risk of $h$ under training sample $z^n$ and loss function $\ell_{\log}$ is given by

  $$\frac{1}{n} \sum_{i=1}^{n} \log \frac{1}{[h(x_i)](y_i)},$$

  which can also be written as

  $$\frac{1}{n} \sum_{i=1}^{n} H(p_i, q_i),$$

  where $H(p_i, q_i)$ is the cross entropy between the distributions $p_i$ and $q_i$ over $\mathcal{Y}$, defined as

  $$H(p_i, q_i) \triangleq \sum_{y \in \mathcal{Y}} p_i(y) \log \frac{1}{q_i(y)},$$

  and $p_i$ and $q_i$ are, respectively, the distribution induced by seeing $y_i$ and the distribution the predictor outputs, i.e. $p_i(y) = \mathbb{I}(y = y_i)$ and $q_i(y) = [h(x_i)](y)$. For this reason, the log loss is also widely known as the cross-entropy loss.
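The sketch below writes out these three loss functions and the empirical risk $L(h, P_n)$ in Python; the function names and the convention that a "soft" predictor returns a dictionary mapping labels to probabilities are my own illustrative assumptions.

```python
import numpy as np

def zero_one_loss(y, y_hat):
    """0-1 loss: indicator that the prediction disagrees with the label."""
    return float(y != y_hat)

def square_loss(y, y_hat):
    """Square loss ||y - y_hat||^2; averaging it over P gives the MSE."""
    return float(np.sum((np.asarray(y, dtype=float) - np.asarray(y_hat, dtype=float)) ** 2))

def log_loss(y, y_hat):
    """Log loss: y_hat is a distribution over labels, e.g. {0: 0.2, 1: 0.8}."""
    return float(-np.log(y_hat[y]))

def empirical_risk(h, xs, ys, loss):
    """Empirical risk L(h, P_n) = (1/n) * sum_i loss(y_i, h(x_i))."""
    return float(np.mean([loss(y, h(x)) for x, y in zip(xs, ys)]))
```

Averaging log_loss over the sample is exactly the average cross entropy between the one-hot distribution $p_i$ induced by $y_i$ and the predicted distribution $q_i = h(x_i)$, matching the rewriting above.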

Bayes Predictor. Suppose that one knows the underlying distribution $P$. Then the predictor

$$f = \operatorname*{argmin}_{h} L(h, P)$$

that minimizes the true risk is called the Bayes predictor, or Bayes estimator, or Bayes decision rule, and its resultant risk

$$\min_{h} L(h, P)$$

is called the Bayes risk. Under different loss functions, the Bayes predictor takes different forms:

- 0-1 loss: Under the 0-1 loss, the Bayes predictor $f$ is given by the well-known maximum a posteriori (MAP) rule, i.e.

  $$f(x) = \operatorname*{argmax}_{y \in \mathcal{Y}} p_{Y|X}(y|x),$$

  with the Bayes risk

  $$L(f, P) = 1 - \sum_{x \in \mathcal{X}} \max_{y \in \mathcal{Y}} p_{X,Y}(x, y).$$

- Square loss: Under the square loss, the Bayes predictor $f$ is given by the conditional expectation of $Y$ given $X = x$, i.e.

  $$f(x) = \mathbb{E}_P[Y \mid X = x],$$

  with the Bayes risk

  $$L(f, P) = \mathbb{E}_P[\mathrm{Var}(Y \mid X)].$$

- Log loss: Under the log loss, the Bayes predictor $f$ is given by the conditional distribution of $Y$ given $X = x$, i.e.

  $$[f(x)](y) = p_{Y|X}(y|x),$$

  with the Bayes risk being the conditional entropy of $Y$ given $X$:

  $$L(f, P) = \mathbb{E}_{(X,Y) \sim P}[-\log p_{Y|X}(Y|X)] = H_P(Y \mid X).$$
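The following sketch computes the Bayes predictors and the Bayes risk under the 0-1 and square losses for a small, fully known joint distribution over binary $\mathcal{X}$ and $\mathcal{Y}$; the probability table and function names are illustrative assumptions, since in practice $P$ is of course unknown.

```python
import numpy as np

# Known joint distribution p_xy[x, y] over X = {0, 1} and Y = {0, 1}; entries sum to 1.
p_xy = np.array([[0.30, 0.10],
                 [0.15, 0.45]])

def bayes_predictor_01(x):
    """MAP rule f(x) = argmax_y p_{Y|X}(y|x); the argmax of p_{XY}(x, .) picks the same y."""
    return int(np.argmax(p_xy[x]))

def bayes_risk_01():
    """Bayes risk under the 0-1 loss: 1 - sum_x max_y p_{XY}(x, y)."""
    return float(1.0 - p_xy.max(axis=1).sum())

def bayes_predictor_sq(x):
    """Bayes predictor under the square loss: f(x) = E[Y | X = x]."""
    p_y_given_x = p_xy[x] / p_xy[x].sum()
    return float(np.dot([0.0, 1.0], p_y_given_x))

print(bayes_predictor_01(0), bayes_predictor_01(1), bayes_risk_01())   # 0 1 0.25
```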

Unfortunately, since we do not know $P$, we cannot utilize the above Bayes predictors to achieve the minimal possible error. Instead, what the learner does have access to is the training sample. So we will choose some hypothesis class and require that the learner, based on the training sample, find a predictor whose error is not much larger than the best possible error achievable by any hypothesis within the class. This idea is formalized in the following definition of PAC learnability.

1.1 PAC Learnability

Given the above general formulation of a learning model, we are now ready to define PAC learnability.

Definition 1.1 (PAC Learnability) A hypothesis class $\mathcal{H}$ is PAC (Probably Approximately Correct) learnable if there exist a function $n_{\mathcal{H}}: (0,1) \times (0,1) \to \mathbb{N}$ and a learning algorithm with the following property: For every $\epsilon, \delta \in (0,1)$ and every distribution $P$, if $n \geq n_{\mathcal{H}}(\epsilon, \delta)$, then the hypothesis $h_{Z^n}$ output by the algorithm on the sample $Z^n$ satisfies

$$P_{Z^n \sim P^n}\left( L(h_{Z^n}, P) \leq \min_{h \in \mathcal{H}} L(h, P) + \epsilon \right) \geq 1 - \delta.$$

Several remarks about the above definition of PAC learnability are as follows:

- Accuracy and confidence parameters: The definition of PAC learnability contains the two approximation parameters mentioned before. The accuracy parameter $\epsilon$ determines how far the output predictor can be from the optimal one within the class (this corresponds to the "approximately correct" part), and the confidence parameter $\delta$ indicates how likely the output predictor is to meet that accuracy requirement (this corresponds to the "probably" part of "PAC").

- Sample complexity: The function $n_{\mathcal{H}}$ determines the sample complexity of learning $\mathcal{H}$, that is, how many examples are at least required to guarantee a probably approximately correct solution. The sample complexity is a function of the accuracy and confidence parameters. It also depends on properties of the hypothesis class $\mathcal{H}$; for example, in the last lecture we showed that for a finite class satisfying the realizability assumption, the sample complexity depends on the log of the size of $\mathcal{H}$. In fact, using the above definition of PAC learnability, one can rephrase the result we showed in the last lecture as follows: Every finite hypothesis class $\mathcal{H}$ satisfying the realizability assumption is PAC learnable with sample complexity $n_{\mathcal{H}}(\epsilon, \delta) \leq \left\lceil \log(|\mathcal{H}|/\delta) / \epsilon \right\rceil$.

- Agnostic PAC learning: This framework is also generally known as agnostic PAC learning, as it does not assume realizability. We will often omit the prefix "agnostic" in this course, in which case PAC learning refers to this general agnostic case.

2 Finite Hypothesis Class: Agnostic Case

We now revisit the finite hypothesis class and show that any finite hypothesis class is learnable with ERM in the agnostic case, i.e. even without the realizability assumption.

2.1 Uniform Convergence Is Sufficient For Learnability

For ERM to work, it suffices to ensure that the empirical risks of all hypotheses in $\mathcal{H}$ are good approximations of their true risks. In other words, we need that, uniformly over all hypotheses in $\mathcal{H}$, the empirical risk is close to the true risk. This is formalized in the following.

Definition 2.1 ($\epsilon$-representative sample) A training sequence $Z^n$ is called $\epsilon$-representative if

$$\forall h \in \mathcal{H}, \quad |L(h, P_n) - L(h, P)| \leq \epsilon.$$
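To make Definition 2.1 concrete, here is a small Python sketch that checks whether a sample is $\epsilon$-representative for a finite class. It assumes, purely for illustration, that the true risk $L(h, P)$ of each hypothesis can be computed (i.e. that $P$ is known), which is never the case for an actual learner.

```python
import numpy as np

def is_eps_representative(xs, ys, hypotheses, true_risk, loss, eps):
    """Definition 2.1: the sample (xs, ys) is eps-representative if
    |L(h, P_n) - L(h, P)| <= eps for every h in the (finite) class."""
    for h in hypotheses:
        empirical = np.mean([loss(y, h(x)) for x, y in zip(xs, ys)])
        if abs(empirical - true_risk(h)) > eps:
            return False
    return True
```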

Lemma 2.1 If $Z^n$ is $\epsilon/2$-representative, then the output $h_{Z^n}$ of $\mathrm{ERM}_{\mathcal{H}}(Z^n)$ satisfies

$$L(h_{Z^n}, P) \leq L(h^*, P) + \epsilon,$$

where we assume that $h^*$ achieves the minimum risk within the class $\mathcal{H}$.

Proof:

$$L(h_{Z^n}, P) \leq L(h_{Z^n}, P_n) + \epsilon/2 \leq L(h^*, P_n) + \epsilon/2 \leq L(h^*, P) + \epsilon/2 + \epsilon/2 = L(h^*, P) + \epsilon.$$

The above lemma implies that to ensure that ERM is a PAC learner, it suffices to show that with probability at least $1 - \delta$, $Z^n$ is $\epsilon$-representative. The uniform convergence condition formalizes this requirement.

Definition 2.2 (Uniform convergence) We say $\mathcal{H}$ has the uniform convergence property if there exists a function $n^{UC}_{\mathcal{H}}: (0,1)^2 \to \mathbb{N}$ such that for every $\epsilon, \delta \in (0,1)$ and every $P$, if $Z^n \sim P^n$ with $n \geq n^{UC}_{\mathcal{H}}(\epsilon, \delta)$, then with probability at least $1 - \delta$, $Z^n$ is $\epsilon$-representative.

Corollary 2.1 If $\mathcal{H}$ has the uniform convergence property with a function $n^{UC}_{\mathcal{H}}$, then the class is PAC learnable with sample complexity $n_{\mathcal{H}}(\epsilon, \delta) \leq n^{UC}_{\mathcal{H}}(\epsilon/2, \delta)$, and in this case ERM is a successful PAC learner for $\mathcal{H}$.

2.2 Finite Classes Are Agnostic PAC Learnable

We now show that finite classes are agnostic PAC learnable by showing that uniform convergence holds for any finite hypothesis class. For this, we first introduce a measure concentration inequality due to Hoeffding, which quantifies the gap between empirical averages and their expected value.

Lemma 2.2 (Hoeffding's Inequality) Let $X_1, X_2, \ldots, X_n$ be a sequence of independent random variables and assume that for all $i$, $\mathbb{E}[X_i] = \mu$ and $P(X_i \in [a, b]) = 1$. Then, for any $\epsilon > 0$,

$$P\left( \left| \frac{1}{n} \sum_{i=1}^{n} X_i - \mu \right| > \epsilon \right) \leq 2 e^{-\frac{2n\epsilon^2}{(b-a)^2}}.$$

Now consider the empirical risk of any $h \in \mathcal{H}$ under training sample $Z^n$,

$$L(h, P_n) = \frac{1}{n} \sum_{i=1}^{n} \ell(Y_i, h(X_i)),$$

where the $\ell(Y_i, h(X_i))$, $i \in [1:n]$, are i.i.d. with mean $L(h, P)$. Let us further assume that the range of $\ell$ is $[0, 1]$. Then, applying Hoeffding's inequality to the sequence $\ell(Y_i, h(X_i))$, we obtain

$$P^n\left( |L(h, P_n) - L(h, P)| > \epsilon \right) \leq 2 e^{-2n\epsilon^2},$$

and therefore, by the union bound over the finite class,

$$P^n\left( \exists h \in \mathcal{H} \text{ s.t. } |L(h, P_n) - L(h, P)| > \epsilon \right) \leq 2 |\mathcal{H}| e^{-2n\epsilon^2}.$$

This shows that if

$$n \geq \frac{\log(2|\mathcal{H}|/\delta)}{2\epsilon^2},$$

then

$$P^n\left( |L(h, P_n) - L(h, P)| \leq \epsilon, \ \forall h \in \mathcal{H} \right) \geq 1 - \delta,$$

and the corollary below follows immediately.
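As a quick numerical sanity check (the values of $\mu$, $n$, and $\epsilon$ below are arbitrary illustrative choices), the following sketch compares a Monte Carlo estimate of the deviation probability for i.i.d. $\{0,1\}$-valued losses with the Hoeffding bound $2e^{-2n\epsilon^2}$; multiplying that bound by $|\mathcal{H}|$ gives the union bound used above.

```python
import numpy as np

rng = np.random.default_rng(1)

def deviation_probability(mu, n, eps, trials=20000):
    """Monte Carlo estimate of P(|(1/n) sum_i X_i - mu| > eps) for X_i ~ Bernoulli(mu),
    i.e. for i.i.d. losses taking values in [0, 1] with mean mu."""
    losses = (rng.random((trials, n)) < mu).astype(float)
    deviations = np.abs(losses.mean(axis=1) - mu)
    return float((deviations > eps).mean())

n, eps, mu = 200, 0.1, 0.3
print("Monte Carlo estimate:", deviation_probability(mu, n, eps))
print("Hoeffding bound     :", 2 * np.exp(-2 * n * eps**2))   # about 0.0366
```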

Corollary 2.2 Let $\mathcal{H}$ be a finite hypothesis class and $\ell$ be a loss function with range $[0, 1]$. Then $\mathcal{H}$ enjoys the uniform convergence property with sample complexity

$$n^{UC}_{\mathcal{H}}(\epsilon, \delta) \leq \left\lceil \frac{\log(2|\mathcal{H}|/\delta)}{2\epsilon^2} \right\rceil.$$

Furthermore, the class is agnostically PAC learnable using the ERM algorithm with sample complexity

$$n_{\mathcal{H}}(\epsilon, \delta) \leq n^{UC}_{\mathcal{H}}(\epsilon/2, \delta) \leq \left\lceil \frac{2 \log(2|\mathcal{H}|/\delta)}{\epsilon^2} \right\rceil.$$
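Putting the pieces together, the sketch below evaluates the sample-complexity bound of Corollary 2.2 and implements the ERM rule over a finite class; the function names and the example values of $\epsilon$, $\delta$, and $|\mathcal{H}|$ are illustrative assumptions.

```python
import numpy as np

def agnostic_sample_complexity(eps, delta, class_size):
    """Corollary 2.2: n_H(eps, delta) <= ceil(2 * log(2|H|/delta) / eps^2)."""
    return int(np.ceil(2 * np.log(2 * class_size / delta) / eps**2))

def erm(hypotheses, xs, ys, loss):
    """ERM rule: return a hypothesis in the finite class with the smallest empirical risk."""
    risks = [np.mean([loss(y, h(x)) for x, y in zip(xs, ys)]) for h in hypotheses]
    return hypotheses[int(np.argmin(risks))]

# With eps = 0.1, delta = 0.05, and |H| = 1000, roughly 2120 examples suffice.
print(agnostic_sample_complexity(eps=0.1, delta=0.05, class_size=1000))
```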
