Learning From Data, Lecture 9: Logistic Regression and Gradient Descent


Slide 1/23: Learning From Data, Lecture 9
Logistic Regression and Gradient Descent
- Logistic Regression
- Gradient Descent
M. Magdon-Ismail, CSCI 4100/6100

Slide 2/23: Recap - Linear Classification and Regression
The linear signal: s = w^t x.
Algorithms:
- Linear Classification: the Pocket algorithm can tolerate errors; simple and efficient.
- Linear Regression: single-step learning, w = X†y = (X^t X)^{-1} X^t y; a very efficient O(N d^2) exact algorithm.
Good features are important: before looking at the data, we can reason that symmetry and intensity should be good features, based on our knowledge of the problem.
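A minimal sketch of the single-step linear regression fit recapped above (the function name and the use of NumPy are illustrative choices, not from the slides); X is the N x (d+1) input matrix with a leading column of ones, y the vector of targets:

import numpy as np

def linear_regression_fit(X, y):
    # w = X†y = (X^t X)^{-1} X^t y, assuming X^t X is invertible
    return np.linalg.solve(X.T @ X, X.T @ y)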

Slide 3/23: Predicting a Probability
Will someone have a heart attack over the next year?
    age          62 years
    gender       male
    blood sugar  120 mg/dL
    HDL          50
    LDL          120
    Mass         190 lbs
    Height       5′ 10″
    ...
Classification: yes/no.
Logistic regression: the likelihood of a heart attack,
    h(x) = θ( Σ_{i=0}^{d} w_i x_i ) = θ(w^t x),  with y ∈ [0, 1].

Slide 4/23: What is θ?
Same setup as the previous slide, with the logistic function
    θ(s) = e^s / (1 + e^s) = 1 / (1 + e^{-s}),
so that
    θ(-s) = e^{-s} / (1 + e^{-s}) = 1 / (1 + e^s) = 1 - θ(s).
(Figure: the logistic curve θ(s), increasing from 0 to 1.)
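A small sketch of θ and of the hypothesis h(x) = θ(w^t x) in NumPy (the function names are illustrative):

import numpy as np

def theta(s):
    # logistic function: theta(s) = 1 / (1 + e^{-s}) = e^s / (1 + e^s)
    return 1.0 / (1.0 + np.exp(-s))

def h(x, w):
    # hypothesis h(x) = theta(w^t x), a value in [0, 1] read as P[y = +1 | x]
    return theta(w @ x)

# sanity check of the identity theta(-s) = 1 - theta(s)
s = 1.7
assert abs(theta(-s) - (1 - theta(s))) < 1e-12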

Slide 5/23: The Data is Still Binary, ±1
D = (x_1, y_1), ..., (x_N, y_N), with y_n = ±1.
x_n = a person's health information; y_n = ±1: did they have a heart attack or not.
We cannot measure a probability. We can only see the occurrence of an event and try to infer a probability.

Slide 6/23: The Target Function is Inherently Noisy
f(x) = P[y = +1 | x].
The data is generated from a noisy target function:
    P(y | x) = f(x)       for y = +1;
    P(y | x) = 1 - f(x)   for y = -1.

Slide 7/23: What Makes an h Good?
'Fitting' the data means finding a good h.
h is good if:
    h(x_n) ≈ 1 whenever y_n = +1;
    h(x_n) ≈ 0 whenever y_n = -1.
A simple error measure that captures this:
    E_in(h) = (1/N) Σ_{n=1}^{N} ( h(x_n) - ½(1 + y_n) )².
Not very convenient (hard to minimize).
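As a quick illustration of this error measure, a sketch (the helper name and NumPy vectorization are assumptions of this sketch, not from the slide):

import numpy as np

def simple_error(h_vals, y):
    # E_in(h) = (1/N) * sum_n ( h(x_n) - (1 + y_n)/2 )^2
    # h_vals: the values h(x_n) in [0, 1]; y: the vector of +/-1 labels
    return np.mean((h_vals - (1 + y) / 2.0) ** 2)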

Slide 8/23: The Cross-Entropy Error Measure
    E_in(w) = (1/N) Σ_{n=1}^{N} ln(1 + e^{-y_n w^t x_n})
It looks complicated and ugly (ln, e^(·), ...), but:
- it is based on an intuitive probabilistic interpretation of h;
- it is very convenient and mathematically friendly ('easy' to minimize).
Verify: y_n = +1 encourages w^t x_n ≫ 0, so θ(w^t x_n) ≈ 1; y_n = -1 encourages w^t x_n ≪ 0, so θ(w^t x_n) ≈ 0.
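A sketch of this error in NumPy (the function name is an illustrative choice; log1p is used only for numerical stability):

import numpy as np

def cross_entropy_error(w, X, y):
    # E_in(w) = (1/N) * sum_n ln(1 + exp(-y_n w^t x_n))
    # X: N x (d+1) inputs, y: vector of +/-1 labels
    s = y * (X @ w)                       # the signals y_n w^t x_n
    return np.mean(np.log1p(np.exp(-s)))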

Slide 9/23: The Probabilistic Interpretation
Suppose that h(x) = θ(w^t x) closely captures P[y = +1 | x]:
    P(y | x) = θ(w^t x)       for y = +1;
    P(y | x) = 1 - θ(w^t x)   for y = -1.

Slide 10/23: The Probabilistic Interpretation (continued)
Since 1 - θ(s) = θ(-s), this is
    P(y | x) = θ(w^t x)    for y = +1;
    P(y | x) = θ(-w^t x)   for y = -1.

Slide 11/23: The Probabilistic Interpretation (continued)
...or, more compactly,
    P(y | x) = θ(y · w^t x).

Slide 12/23: The Likelihood
P(y | x) = θ(y · w^t x).
Recall: (x_1, y_1), ..., (x_N, y_N) are independently generated.
Likelihood: the probability of getting the y_1, ..., y_N in D from the corresponding x_1, ..., x_N:
    P(y_1, ..., y_N | x_1, ..., x_N) = Π_{n=1}^{N} P(y_n | x_n).
The likelihood measures the probability that the data were generated if f were h.

Slide 13/23: Maximizing the Likelihood
    max Π_{n=1}^{N} P(y_n | x_n)
  ⇔ max ln Π_{n=1}^{N} P(y_n | x_n)                      (why?)
  ⇔ max Σ_{n=1}^{N} ln P(y_n | x_n)
  ⇔ min -(1/N) Σ_{n=1}^{N} ln P(y_n | x_n)
  ⇔ min (1/N) Σ_{n=1}^{N} ln( 1 / P(y_n | x_n) )
  ⇔ min (1/N) Σ_{n=1}^{N} ln( 1 / θ(y_n · w^t x_n) )     (we specialize to our "model" here)
  ⇔ min (1/N) Σ_{n=1}^{N} ln(1 + e^{-y_n w^t x_n})       (since 1/θ(s) = 1 + e^{-s})
So:
    E_in(w) = (1/N) Σ_{n=1}^{N} ln(1 + e^{-y_n w^t x_n}).
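A quick numerical check of the last step (the synthetic data, the seed, and the variable names are assumptions for illustration only): since 1/θ(s) = 1 + e^{-s}, the negative mean log-likelihood and the cross-entropy error agree.

import numpy as np

rng = np.random.default_rng(0)
X = np.c_[np.ones(100), rng.normal(size=(100, 2))]   # synthetic inputs with a bias column
w = rng.normal(size=3)                               # an arbitrary weight vector
y = rng.choice([-1, 1], size=100)                    # arbitrary +/-1 labels

s = y * (X @ w)
neg_log_lik = -np.mean(np.log(1.0 / (1.0 + np.exp(-s))))  # -(1/N) sum ln theta(y_n w^t x_n)
cross_ent = np.mean(np.log1p(np.exp(-s)))                 # (1/N) sum ln(1 + e^{-y_n w^t x_n})
assert np.isclose(neg_log_lik, cross_ent)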

Slide 14/23: How To Minimize E_in(w)
- Classification: PLA/Pocket (iterative).
- Regression: pseudoinverse (analytic), from solving ∇E_in(w) = 0.
- Logistic regression: analytic won't work. Numerically/iteratively set ∇E_in(w) ≈ 0.

Slide 15/23: Finding the Best Weights - Hill Descent
A ball on a complicated hilly terrain rolls down to a local valley; this is called a local minimum.
Questions:
- How do we get to the bottom of the deepest valley?
- How do we do this when we don't have gravity?

Slide 16/23: Our E_in Has Only One Valley
...because E_in(w) is a convex function of w. (So, who cares if it looks ugly!)
(Figure: in-sample error E_in versus the weights w, a single valley.)

Slide 17/23: How to "Roll Down"?
Assume you are at weights w(t) and you take a step of size η in the direction v̂:
    w(t+1) = w(t) + η v̂.
We get to pick v̂. What's the best direction to take the step?
Pick v̂ to make E_in(w(t+1)) as small as possible.

Slide 18/23: The Gradient is the Fastest Way to Roll Down
Approximating the change in E_in (Taylor's approximation):
    ΔE_in = E_in(w(t+1)) - E_in(w(t))
          = E_in(w(t) + η v̂) - E_in(w(t))
          = η ∇E_in(w(t))^t v̂ + O(η²)
          ≥ -η ‖∇E_in(w(t))‖,
with the minimum attained at v̂ = -∇E_in(w(t)) / ‖∇E_in(w(t))‖.
The best (steepest) direction to move is the negative gradient:
    v̂ = -∇E_in(w(t)) / ‖∇E_in(w(t))‖.
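The slides do not write out ∇E_in for logistic regression; differentiating the cross-entropy error term by term gives ∇E_in(w) = -(1/N) Σ_n y_n x_n θ(-y_n w^t x_n). The sketch below implements that expression and checks it against a finite-difference estimate (the function names and the small random test data are assumptions of this sketch):

import numpy as np

def gradient_Ein(w, X, y):
    # grad E_in(w) = -(1/N) * sum_n y_n x_n / (1 + exp(y_n w^t x_n))
    s = y * (X @ w)
    return -(X * (y / (1.0 + np.exp(s)))[:, None]).mean(axis=0)

# finite-difference check on small random data
rng = np.random.default_rng(1)
X = np.c_[np.ones(20), rng.normal(size=(20, 2))]
y = rng.choice([-1, 1], size=20)
w = rng.normal(size=3)

def Ein(w):
    return np.mean(np.log1p(np.exp(-y * (X @ w))))

eps = 1e-6
numeric = np.array([(Ein(w + eps * e) - Ein(w - eps * e)) / (2 * eps) for e in np.eye(3)])
assert np.allclose(gradient_Ein(w, X, y), numeric, atol=1e-6)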

Slide 19/23: "Rolling Down" = Iterating the Negative Gradient
w(0) --negative gradient--> w(1) --negative gradient--> w(2) --negative gradient--> w(3) --> ...
(Figure: the iterates on the error surface; η = 0.5, 15 steps.)

Slide 20/23: The 'Goldilocks' Step Size
- η too small (small η): η = 0.1; 75 steps.
- η too large (large η): η = 2; 10 steps.
- Variable η_t, just right: variable η_t; 10 steps.
(Figures: in-sample error E_in versus the weights w for each of the three cases.)

Slide 21/23: Fixed Learning Rate Gradient Descent
Use a step size that scales with the gradient, η_t = η · ‖∇E_in(w(t))‖, since ‖∇E_in(w(t))‖ → 0 when closer to the minimum. Then
    η_t · v̂ = -η_t ∇E_in(w(t)) / ‖∇E_in(w(t))‖ = -η · ∇E_in(w(t)),
a fixed learning rate η on the raw gradient (Ex. 3.7 in LFD).

1: Initialize at step t = 0 to w(0).
2: for t = 0, 1, 2, ... do
3:     Compute the gradient g_t = ∇E_in(w(t)).
4:     Move in the direction v_t = -g_t.
5:     Update the weights: w(t+1) = w(t) + η v_t.
6:     Iterate 'until it is time to stop'.
7: end for
8: Return the final weights.

Gradient descent can minimize any smooth function, for example
    E_in(w) = (1/N) Σ_{n=1}^{N} ln(1 + e^{-y_n w^t x_n})    ← logistic regression.
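A runnable sketch of this algorithm applied to the logistic-regression E_in (the stopping rule, the default η, and the function name are choices of this sketch, not prescribed by the slide):

import numpy as np

def gradient_descent(X, y, eta=0.1, max_steps=10000, tol=1e-6):
    w = np.zeros(X.shape[1])                  # 1: initialize w(0)
    for t in range(max_steps):                # 2: for t = 0, 1, 2, ...
        s = y * (X @ w)
        g = -(X * (y / (1.0 + np.exp(s)))[:, None]).mean(axis=0)  # 3: g_t = grad E_in(w(t))
        v = -g                                # 4: move in the direction v_t = -g_t
        w = w + eta * v                       # 5: w(t+1) = w(t) + eta * v_t
        if np.linalg.norm(g) < tol:           # 6: one possible 'time to stop'
            break                             # 7: end for
    return w                                  # 8: return the final weights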

Slide 22/23: Stochastic Gradient Descent (SGD)
A variation of GD that considers only the error on one data point:
    E_in(w) = (1/N) Σ_{n=1}^{N} ln(1 + e^{-y_n w^t x_n}) = (1/N) Σ_{n=1}^{N} e(w, x_n, y_n).
Pick a random data point (x_*, y_*) and run an iteration of GD on e(w, x_*, y_*):
    w(t+1) = w(t) - η ∇_w e(w, x_*, y_*).
1. The 'average' move is the same as GD;
2. Computation: a fraction 1/N cheaper per step;
3. Stochastic: helps escape local minima;
4. Simple;
5. Similar to PLA.
For logistic regression:
    w(t+1) = w(t) + η · y_* x_* / (1 + e^{y_* w^t x_*}).
(Recall PLA: w(t+1) = w(t) + y_* x_*.)
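A sketch of SGD for logistic regression using the single-point update above (the function name, the default η, the fixed step budget, and the random seed are assumptions of this sketch):

import numpy as np

def sgd_logistic(X, y, eta=0.1, num_steps=10000, seed=0):
    # Repeatedly pick a random (x*, y*) and apply
    #   w <- w + eta * y* x* / (1 + exp(y* w^t x*))
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    for _ in range(num_steps):
        n = rng.integers(len(y))                        # pick a random data point
        s = y[n] * (X[n] @ w)
        w = w + eta * y[n] * X[n] / (1.0 + np.exp(s))   # one SGD step on e(w, x_n, y_n)
    return w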

Slide 23/23: GD versus SGD, a Picture
(Figures: GD with η = 6, 10 steps, versus SGD with η = 2, 30 steps; N = 10 data points.)

