CPSC 540: Machine Learning


Conditional Random Fields
Mark Schmidt, University of British Columbia
Winter 2020

3 Classes of Structured Prediction Methods

3 main approaches to structured prediction (predicting object y given features x):
1. Generative models use p(y | x) ∝ p(y, x), as in naive Bayes.
   - Turns structured prediction into density estimation.
   - But remember how hard it was just to model images of digits?
   - We have to model the features and solve the supervised learning problem.
2. Discriminative models directly fit p(y | x), as in logistic regression (next topic).
   - View structured prediction as conditional density estimation.
   - Just focuses on modeling y given x, not trying to model the features x.
   - Lets you use complicated features x that make the task easier.
3. Discriminant functions just try to map from x to y, as in SVMs.
   - Now you don't even need to worry about calibrated probabilities.

Motivation: Automatic Brain Tumour Segmentation

Task: identification of tumours in multi-modal MRI.
Applications:
- Radiation therapy target planning, quantifying treatment response.
- Mining growth patterns, image-guided surgery.
Challenges:
- Variety of tumour appearances, similarity to normal tissue.
- "You are never going to solve this problem."

Brain Tumour Segmentation with Label Dependencies

After a lot of pre-processing and feature engineering (convolutions, priors, etc.),
the final system used logistic regression to label each pixel as "tumour" or not,

  p(y_c | x_c) = 1 / (1 + exp(-2 y_c w^T x_c)) = exp(y_c w^T x_c) / (exp(w^T x_c) + exp(-w^T x_c)).

This gives a high "pixel-level" accuracy, but sometimes gives silly results:
classifying each pixel independently misses the dependence in the labels y_c.
We prefer neighbouring voxels to have the same value.

Brain Tumour Segmentation with Label Dependencies

With the independent logistic model, the conditional distribution over all labels in one image is

  p(y_1, y_2, ..., y_k | x_1, x_2, ..., x_k) = ∏_{c=1}^k exp(y_c w^T x_c) / (exp(w^T x_c) + exp(-w^T x_c))
                                             ∝ exp( Σ_{c=1}^k y_c w^T x_c ),

where here x_c is the feature vector for position c in the image.
We can view this as a log-linear UGM with no edges,

  φ_c(y_c) = exp(y_c w^T x_c),

so given the x_c there is no dependence between the y_c.
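As a concrete illustration, here is a minimal sketch (not the course code) of the independent per-pixel probabilities under this model, assuming labels y_c ∈ {-1, +1} and a feature matrix X whose rows are the x_c:

```python
import numpy as np

def independent_pixel_probs(X, w):
    """p(y_c = +1 | x_c) for every position c under the independent
    logistic model, using the 1 / (1 + exp(-2 w^T x_c)) form above."""
    return 1.0 / (1.0 + np.exp(-2.0 * (X @ w)))
```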

Brain Tumour Segmentation with Label Dependencies

Adding an Ising-like term to model dependencies between the y_c gives

  p(y_1, y_2, ..., y_k | x_1, x_2, ..., x_k) ∝ exp( Σ_{c=1}^k y_c w^T x_c + Σ_{(c,c') ∈ E} y_c y_{c'} v ).

Now we have the same "good" logistic regression model,
but v controls how strongly we want neighbours to be the same.
Note that we're going to jointly learn w and v:
we'll find the optimal joint logistic regression and Ising model.
When we model the conditional of y given x as a UGM,
we call it a conditional random field (CRF).
Key advantage of this (discriminative) approach:
we don't need to model the features x as in "generative" models.
We saw with MNIST digits that modeling images is hard.
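To make the model concrete, here is a minimal sketch of the unnormalized log-probability of a labelling under this pairwise CRF (my own illustration; the labels are assumed to be in {-1, +1}, v is a scalar, and the explicit edge list is an assumption not spelled out on the slide):

```python
import numpy as np

def unnormalized_log_prob(y, X, w, v, edges):
    """Unnormalized log-probability of a labelling y under the pairwise CRF:
    sum_c y_c w^T x_c + sum over edges (c,c') of y_c y_c' v.

    y:     length-k array with entries in {-1, +1}
    X:     (k, d) array whose rows are the per-position features x_c
    w:     (d,) node weight vector
    v:     scalar Ising edge weight
    edges: list of (c, c2) pairs of neighbouring positions (assumed given)
    """
    node_term = float(np.sum(y * (X @ w)))
    edge_term = v * sum(y[c] * y[c2] for c, c2 in edges)
    return node_term + edge_term
```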

Conditional Random Fields for Segmentation

Recall the performance with the independent classifier: the pairwise CRF better
modelled the "guilt by association" between neighbouring labels.
It was trained with pseudo-likelihood, with the constraint v ≥ 0 added to allow graph cut decoding.
(We were also using edge features x_{cc'}, see the bonus slide, and a different λ on the edges.)
CRFs are like logistic regression (no modeling of x) vs. naive Bayes (modeling x):
p(y | x) (discriminative) vs. p(y, x) (generative).
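Since the slide only names pseudo-likelihood, here is a minimal sketch of what that objective looks like for this particular logistic + Ising CRF (an illustration under the same assumptions as above, not the code used in the project): each term conditions y_c on its neighbours' current labels, so no partition function over the whole image is needed.

```python
import numpy as np

def neg_log_pseudolikelihood(y, X, w, v, neighbours):
    """Negative log pseudo-likelihood for the logistic + Ising CRF.

    Each term is -log p(y_c | y_{-c}, x), which for this model is
    log(1 + exp(-2 y_c a_c)) with a_c = w^T x_c + v * (sum of neighbouring labels).

    y:          length-k array with entries in {-1, +1}
    X:          (k, d) feature matrix
    v:          scalar edge weight
    neighbours: neighbours[c] lists the positions adjacent to c (assumed given)
    """
    k = len(y)
    a = X @ w + v * np.array([sum(y[c2] for c2 in neighbours[c]) for c in range(k)])
    return float(np.sum(np.log1p(np.exp(-2.0 * y * a))))
```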

Conditional Random Fields

The brain CRF can be written as a conditional log-linear model,

  p(y | x, w) = (1/Z(x)) exp(w^T F(x, y)),

for some parameters w and features F(x, y).
The NLL is convex and has the form

  -log p(y | x, w) = -w^T F(x, y) + log Z(x),

and the gradient can be written as

  -∇ log p(y | x, w) = -F(x, y) + E_{y|x}[F(x, y)].

Unlike before, we now have a Z(x) and a set of expectations for each x.
Train using gradient methods like quasi-Newton, SG, or SAG.
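A minimal sketch of this objective for a single training example, assuming hypothetical helpers F(x, y) (the feature function) and inference(w, x) (returning log Z(x) and the expected features E_{y|x}[F(x, y)] from exact or approximate inference):

```python
import numpy as np

def crf_nll_and_grad(w, F, y, x, inference):
    """NLL and gradient for one example of the conditional log-linear model
    p(y | x, w) = exp(w^T F(x, y)) / Z(x).

    F(x, y)         -> feature vector (hypothetical helper)
    inference(w, x) -> (log_Z, expected_F), i.e. log Z(x) and E_{y|x}[F(x, y)]
                       (hypothetical helper; exact or approximate inference)
    """
    f_xy = F(x, y)
    log_Z, expected_F = inference(w, x)
    nll = -w @ f_xy + log_Z          # -log p(y | x, w)
    grad = -f_xy + expected_F        # -F(x, y) + E_{y|x}[F(x, y)]
    return nll, grad
```

In training you would average these per-example values over the dataset and hand them to a quasi-Newton or stochastic gradient method.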

Rain Data without Month Information

Consider an Ising UGM model for the rain data with tied parameters,

  p(y_1, y_2, ..., y_k) ∝ exp( Σ_{c=1}^k y_c ω + Σ_{c=2}^k y_c y_{c-1} ν ).

The first term reflects that "not rain" is more likely.
The second term reflects that consecutive days are more likely to be the same.
This model is equivalent to a Markov chain model.
We could condition on the month to model "some months are less rainy".

Rain Data with Month Information using CRFs

Discriminative approach: fit a CRF model conditioned on the month x,

  p(y_1, y_2, ..., y_k | x) ∝ exp( Σ_{c=1}^k y_c ω + Σ_{c=2}^k y_c y_{c-1} ν + Σ_{c=1}^k Σ_{j=1}^{12} y_c x_j v_j ).

The conditional UGM given x has a chain structure,

  φ_i(y_i) = exp( y_i ω + Σ_{j=1}^{12} y_i x_j v_j ),   φ_{ij}(y_i, y_j) = exp(y_i y_j ν),

so inference can be done using forward-backward.
And it's log-linear, so the NLL will be convex.
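Since the conditional UGM is a chain, the normalizing constant and node marginals can be computed with forward-backward. A minimal sketch (my own illustration, assuming the edge potential is shared across all edges as in the rain CRF, and ignoring numerical underflow, which a real implementation would handle by normalizing the messages):

```python
import numpy as np

def chain_forward_backward(node_pot, edge_pot):
    """Forward-backward on a chain UGM.

    node_pot: (k, S) array of node potentials phi_c(y_c)
    edge_pot: (S, S) array of edge potentials phi(y_{c-1}, y_c),
              assumed the same on every edge.
    Returns the normalizing constant Z and the (k, S) node marginals.
    """
    k, S = node_pot.shape
    alpha = np.zeros((k, S))                      # forward messages
    beta = np.ones((k, S))                        # backward messages
    alpha[0] = node_pot[0]
    for c in range(1, k):
        alpha[c] = node_pot[c] * (alpha[c - 1] @ edge_pot)
    for c in range(k - 2, -1, -1):
        beta[c] = edge_pot @ (node_pot[c + 1] * beta[c + 1])
    Z = alpha[-1].sum()
    marginals = alpha * beta / Z                  # p(y_c = s | x)
    return Z, marginals
```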

Rain Data with Month Information

Samples from the CRF conditioned on x being December (left) and July (right) [figure].
The conditional NLL is 16.21, compared to the Markov chain which gets an NLL of 16.81.
This is also better than a mixture of 10 Markov chains (EM training), which gets 16.53.
This is probably due to finding the global minimum when fitting the CRF.

Rain Data with Month Information using CRFs

The CRF model conditioned on month x is

  p(y_1, y_2, ..., y_k | x) = (1/Z(x)) exp( Σ_{c=1}^k y_c ω + Σ_{c=2}^k y_c y_{c-1} ν + Σ_{c=1}^k Σ_{j=1}^{12} y_c x_j v_j ).

Comparing this to other approaches:
Generative: model p(y_1, y_2, ..., y_k, x).
- Have to model the distribution of x, and inference is more expensive (not a chain).
- Also uses the known clusters (months). Learning is still convex.
Mixture/Boltzmann: add latent variables z that might learn the month information.
- Have to model the distribution of z, and inference is more expensive (not a chain).
- Doesn't use the known clusters, so it needs more data.
- But it might learn a better clustering if months aren't great clusters.
- Learning is non-convex due to the sum over z values.

Outline

1. Conditional Random Fields
2. Beyond Basic CRFs

Modeling OCR Dependencies

What dependencies should we model for this problem?
- φ(y_c, x_c): potential of an individual letter given the image.
- φ(y_{c-1}, y_c): dependency between adjacent letters ('q-u').
- φ(y_{c-1}, y_c, x_{c-1}, x_c): adjacent letters and image dependency.
- φ_c(y_{c-1}, y_c): inhomogeneous dependency (French: 'e-r' ending).
- φ_c(y_{c-2}, y_{c-1}, y_c): third-order and inhomogeneous (English: 'i-n-g' ending).
- φ(y ∈ D): is y in the dictionary D?

Tractability of Discriminative Models

Features can be very complicated, since we just condition on the x_c.
Given the x_c, tractability depends on the conditional UGM on the y_c:
inference/decoding will be fast or slow depending on the y_c graph.
Besides "low treewidth", some other cases where exact computation is possible:
- Semi-Markov chains (allow dependence on the time you spend in a state).
- Context-free grammars (allow potentials on recursively-nested parts of the sequence).
- Sum-product networks (restrict potentials to allow exact computation).
- The "dictionary" feature is non-Markov, but exact computation is still easy.
We can alternatively use our previous approximations:
1. Pseudo-likelihood (what we used).
2. Monte Carlo approximate inference (eventually better but probably much slower).
3. Variational approximate inference (fast, quality varies).

CRF "Product of Marginals" Objective

In CRFs we typically optimize the likelihood p(y | x, w).
This focuses on getting the joint likelihood of the sequence y right.
What if we are interested in getting the "parts" y_c right?
In sequence labeling, your error is the "number of positions you got wrong" in the sequence,
as opposed to "did you get the whole sequence right?"
In this setting, it could make more sense to optimize the product of marginals:

  ∏_{c=1}^k p(y_c | x, w) = ∏_{c=1}^k Σ_{y' : y'_c = y_c} p(y' | x, w).

This is non-convex, but probably a better objective.
If you know how to do inference, this paper shows how to get gradients:
https://people.cs.umass.edu/~domke/papers/2010nips.pdf
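As a small illustration (my own, not from the paper), if you already have the node marginals p(y_c | x, w), e.g. from forward-backward on a chain, the negative log of this objective for one sequence is just:

```python
import numpy as np

def neg_log_product_of_marginals(marginals, y):
    """-log of prod_c p(y_c | x, w) for one sequence.

    marginals: (k, S) array of node marginals p(y_c = s | x, w)
    y:         length-k array of true labels, encoded as state indices 0..S-1
    """
    k = len(y)
    return -float(np.sum(np.log(marginals[np.arange(k), y])))
```

Getting gradients with respect to w requires differentiating through the inference that produced the marginals, which is what the linked paper addresses.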

Learning for Structured Prediction (Big Picture)

3 types of classifiers discussed in CPSC 340/540:

  Model                            | "Classic ML"         | Structured Prediction
  Generative model p(y, x)         | Naive Bayes, GDA     | UGM (or "MRF")
  Discriminative model p(y | x)    | Logistic regression  | CRF
  Discriminant function y = f(x)   | SVM                  | Structured SVM

Discriminative models don't need to model x:
no need for "naive Bayes" or Gaussian assumptions.
Discriminant functions don't even worry about probabilities.
They are based on decoding, which is different from inference in the structured case.
This is useful when inference is hard but decoding is easy;
examples include "attractive" graphical models, matching problems, and ranking.
I put my material on structured SVMs here:
https://www.cs.ubc.ca/~schmidtm/Courses/540-W19/L28.5.pdf

Feedforward Neural Networks

In 340 we discussed feedforward neural networks for supervised learning.
With 1 hidden layer, the classic model has this structure [figure].
Motivation:
- For some problems it's hard to find good features.
- This learns features z that are good for the particular supervised learning problem.

Neural Networks as DAG Models

It's a DAG model, but there is an important difference with our previous models:
the latent variables z_c are deterministic functions of the x_j.
This makes inference given x trivial: if you observe all x_j, you also observe all z_c.
In this case, y is the only random variable.

Deep Learning for Structured Prediction (Big Picture)

How is deep learning being used for structured prediction?
Discriminative approaches are most popular.
Typically you will send x through a neural network to get a representation z, then:
1. Perform inference on p(y | z) (backpropagate using exact/approximate marginals).
2. Run m approximate inference steps on p(y | z), and backpropagate through these steps.
   - The neural network learns features, and a CRF "on top" models the dependencies in the y_c.
   - "Learn to use the inference you will be using" (usually with variational inference).
3. Just model each p(y_c | z) (treat labels as independent given the representation).
   - Assume that structure is already captured in the neural network goo (no inference).
Current trend: less dependence on inference and more on learning the representation.
"Just use an RNN rather than thinking about stochastic grammars."
We're improving a lot at learning features, less so at inference.
This trend may or may not reverse in the future.

Summary

3 types of structured prediction:
- Generative models, discriminative models, discriminant functions.
Conditional random fields generalize logistic regression:
- Discriminative model allowing dependencies between labels.
- Log-linear parameterization again leads to convexity.
- But requires inference in a graphical model.
Reducing the reliance on inference is a current trend in the field:
- Rely on a neural network to learn the clusters and dependencies.
Next time: our (overdue) visit to the world of deep learning.

Bonus: Brain Tumour Segmentation with Label Dependencies

We got a bit more fancy and used edge features x_{ij},

  p(y_1, y_2, ..., y_d | x_1, x_2, ..., x_d) = (1/Z) exp( Σ_{i=1}^d y_i w^T x_i + Σ_{(i,j) ∈ E} y_i y_j v^T x_{ij} ).

For example, we could use x_{ij} = 1/(1 + |x_i - x_j|),
which encourages y_i and y_j to be more similar if x_i and x_j are more similar.
This is a pairwise UGM with

  φ_i(y_i) = exp(y_i w^T x_i),   φ_{ij}(y_i, y_j) = exp(y_i y_j v^T x_{ij}),

so it didn't make inference any more complicated.
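A minimal sketch of these edge features and potentials (the elementwise form of |x_i - x_j| is my assumption; the slide doesn't pin down the exact form):

```python
import numpy as np

def edge_feature(x_i, x_j):
    """Example edge feature from the slide: x_ij = 1 / (1 + |x_i - x_j|),
    applied elementwise here (an assumption), so similar positions get larger values."""
    return 1.0 / (1.0 + np.abs(x_i - x_j))

def edge_log_potential(y_i, y_j, x_ij, v):
    """log phi_ij(y_i, y_j) = y_i * y_j * v^T x_ij, for labels in {-1, +1}."""
    return y_i * y_j * float(v @ x_ij)
```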

Posterior Regularization

In some cases it might make sense to use posterior regularization:
regularize the probabilities in the resulting model.
Consider an NLP labeling task where:
- You have a small number of labeled sentences.
- You have a huge number of unlabeled sentences.
Maximize the labeled likelihood, plus a total-variation penalty on the p(y_c | x, w) values.
Give high regularization weights to words appearing in the same [...];
useful for "out of vocabulary" words (words that don't appear in the labeled data).
