CSE 5243 INTRO. TO DATA MINING - GitHub Pages

2y ago
14 Views
2 Downloads
4.18 MB
66 Pages
Last View : 13d ago
Last Download : 3m ago
Upload by : Julia Hutchens
Transcription

CSE 5243 INTRO. TO DATA MININGLocality Sensitive HashingYu Su, CSE@The Ohio State UniversitySlides adapted from UIUC CS412 by Prof. Jiawei Han and OSU CSE5243 by Prof.Huan Sun

MMDS Secs. 3.2-3.4.Slides adapted from: J. Leskovec, A. Rajaraman,J. Ullman: Mining of Massive Datasets,http://www.mmds.orgFINDING SIMILAR ITEMS

Scene Completion Problem33

Scene Completion Problem44

[Hays and Efros, SIGGRAPH 2007]Scene Completion Problem5510 nearest neighbors from a collection of 20,000 images

Scene Completion Problem6610 nearest neighbors from a collection of 2 million images

A Common Metaphor7 Many problems can be expressed asfinding “similar” sets: Find near-neighbors in high-dimensional spaceExamples: Pages with similar wordsn Customers who purchased similar productsn Products with similar customer setsImages with similar featuresn7For duplicate detection, classification by topicUsers who visited similar websites

Problem for Today’s Lecture8 Given: High dimensional data points 𝑥!, 𝑥", And some distance function 𝒅(𝒙𝟏 , 𝒙𝟐 ) For example: Image is a long vector of pixel colors1 2 10 2 1 [1 2 1 0 2 1 0 1 0]0 1 0Which quantifies the “distance” between 𝒙𝟏 and 𝒙𝟐Goal: Find all pairs of data points (𝒙𝒊 , 𝒙𝒋 ) that are within some distancethreshold 𝒅 𝒙𝒊 , 𝒙𝒋 𝒔Note: Naïve solution would take 𝑶 𝑵𝟐 Lwhere 𝑵 is the number of data points 8MAGIC: This can be done in 𝑶 𝑵 !! How?

Task: Finding Similar Documents Goal: Given a large number (𝑵 in the millions or billions) ofdocuments, find “near duplicate” pairsApplications:Mirror websites, or approximate mirrors à remove duplicates Similar news articles at many news sites à cluster 9

Task: Finding Similar Documents Goal: Given a large number (𝑵 in the millions or billions) ofdocuments, find “near duplicate” pairsApplications:Mirror websites, or approximate mirrors à remove duplicates Similar news articles at many news sites à cluster What are the challenges?10

Task: Finding Similar Documents Goal: Given a large number (𝑵 in the millions or billions) of documents,find “near duplicate” pairsApplications:Mirror websites, or approximate mirrors à remove duplicates Similar news articles at many news sites à cluster Problems:Many small pieces of one document can appear out of order in another Too many documents to compare all pairs Documents are so large or so many that they cannot fit in main memory 11

Two Essential Steps for Similar Docs1.2.Shingling: Convert documents to setsMin-Hashing: Convert large sets to short signatures, whilepreserving similarityHost of follow up applicationse.g. Similarity SearchData PlacementClustering etc.12

The Big PictureDocumentShinglingMinHashingThe setof stringsof length kthat appearin the document13Similarity SearchData PlacementClustering etc.Signatures:short integervectors thatrepresent thesets, andreflect theirsimilarityJ. Leskovec, A. Rajaraman, J. Ullman: Mining of Massive Datasets, http://www.mmds.org

DocumentShinglingThe setof stringsof length kthat appearin the documentSHINGLINGStep 1: Shingling: Convert documents to sets

Documents as High-Dim Data Step 1: Shingling: Convert documents to sets Simple approaches:Document set of words appearing in document Document set of “important” words Don’t work well for this application. Why? 15Need to account for ordering of words!A different way: Shingles!J. Leskovec, A. Rajaraman, J. Ullman: Mining of Massive Datasets, http://www.mmds.org

Define: Shingles A k-shingle (or k-gram) for a document is a sequence of k tokensthat appears in the docTokens can be characters, words or something else, depending on theapplication Assume tokens characters for examples 16J. Leskovec, A. Rajaraman, J. Ullman: Mining of Massive Datasets, http://www.mmds.org

Define: Shingles A k-shingle (or k-gram) for a document is a sequence of k tokensthat appears in the docTokens can be characters, words or something else, depending on theapplication Assume tokens characters for examples 17Example: k 2; document D1 abcabSet of 2-shingles: S(D1) {ab, bc, ca}J. Leskovec, A. Rajaraman, J. Ullman: Mining of Massive Datasets, http://www.mmds.org

Define: Shingles A k-shingle (or k-gram) for a document is a sequence of k tokensthat appears in the docTokens can be characters, words or something else, depending on theapplication Assume tokens characters for examples Example: k 2; document D1 abcabSet of 2-shingles: S(D1) {ab, bc, ca} 18Another option: Shingles as a bag (multiset), count ab twice: S’(D1) {ab, bc, ca, ab}J. Leskovec, A. Rajaraman, J. Ullman: Mining of Massive Datasets, http://www.mmds.org

Shingles: How to treat white-space chars?It makes sense to replace any sequence of one or more white-space characters (blank, tab,newline, etc.) by a single blank.This way distinguishes shingles that cover two or more words from those that do not.19J. Leskovec, A. Rajaraman, J. Ullman: Mining of Massive Datasets, http://www.mmds.org

How to choose K? Documents that have lots of shingles in common have similar text,even if the text appears in different orderCaveat: You must pick k large enough, or most documents will havemost shinglesk 5 is OK for short documents k 10 is better for long documents 20J. Leskovec, A. Rajaraman, J. Ullman: Mining of Massive Datasets, http://www.mmds.org

Compressing Shingles To compress long shingles, we can hash them to (say) 4 bytesLike a Code Book If #shingles manageable à Simple dictionary suffices e.g., 9-shingle bucket number [0, 2 32 - 1](using 4 bytes instead of 9)21J. Leskovec, A. Rajaraman, J. Ullman: Mining of Massive Datasets, http://www.mmds.org

Compressing Shingles To compress long shingles, we can hash them to (say) 4 bytesLike a Code Book If #shingles manageable à Simple dictionary suffices Doc represented by the set of hash/dict. values of its k-shingles 22Idea: Two documents could appear to have shingles in common, when thehash-values were sharedJ. Leskovec, A. Rajaraman, J. Ullman: Mining of Massive Datasets, http://www.mmds.org

Compressing Shingles To compress long shingles, we can hash them to (say) 4 bytesLike a Code Book If #shingles manageable à Simple dictionary suffices 23Doc represented by the set of hash/dict. values of its k-shinglesExample: k 2; document D1 abcabSet of 2-shingles: S(D1) {ab, bc, ca}Hash the singles: h(D1) {1, 5, 7}J. Leskovec, A. Rajaraman, J. Ullman: Mining of Massive Datasets, http://www.mmds.org

Similarity Metric for Shingles Document D1 is a set of its k-shingles C1 S(D1) Equivalently, each document is a 0/1 vector in the space of k-shingles 24 Each unique shingle is a dimension Vectors are very sparseA natural similarity measure is the Jaccard similarity:sim(D1, D2) C1ÇC2 / C1ÈC2 J. Leskovec, A. Rajaraman, J. Ullman: Mining of Massive Datasets, http://www.mmds.org

Motivation for Minhash/LSH Suppose we need to find similar documents among 𝑵 𝟏 milliondocumentsNaïvely, we would have to compute pairwise Jaccard similarities forevery pair of docs𝑵(𝑵 𝟏)/𝟐 5*1011 comparisons At 105 secs/day and 106 comparisons/sec,it would take 5 days 25For 𝑵 𝟏𝟎 million, it takes more than a year J. Leskovec, A. Rajaraman, J. Ullman: Mining of Massive Datasets, http://www.mmds.org

DocumentShinglingMin-HashingThe setof stringsof length kthat appearin the documentSignatures:short integervectors thatrepresent thesets, and reflecttheir similarityMINHASHINGStep 2: Minhashing: Convert large variable length sets toshort fixed-length signatures, while preserving similarity

Encoding Sets as Bit Vectors 27Many similarity problems can be formalized as finding subsets thathave significant intersectionJ. Leskovec, A. Rajaraman, J. Ullman: Mining of Massive Datasets, http://www.mmds.org

Encoding Sets as Bit Vectors Many similarity problems can be formalized as finding subsets thathave significant intersectionEncode sets using 0/1 (bit, boolean) vectors 28One dimension per element in the universal setInterpret set intersection as bitwise AND, andset union as bitwise ORJ. Leskovec, A. Rajaraman, J. Ullman: Mining of Massive Datasets, http://www.mmds.org

Encoding Sets as Bit Vectors Many similarity problems can be formalized as finding subsets thathave significant intersectionEncode sets using 0/1 (bit, boolean) vectors 29One dimension per element in the universal setInterpret set intersection as bitwise AND, andset union as bitwise ORExample: C1 10111; C2 10011 Size of intersection 3; size of union 4, Jaccard similarity (not distance) 3/4 Distance: d(C1,C2) 1 – (Jaccard similarity) 1/4J. Leskovec, A. Rajaraman, J. Ullman: Mining of Massive Datasets, http://www.mmds.org

From Sets to Boolean Matrices Rows elements (shingles)DocumentsColumns sets (documents) 30Note: Transposed Document Matrix1 in row e and column s if and only if e is a valid shingle ofdocument represented by sColumn similarity is the Jaccard similarity of the correspondingsets (rows with value 1)Typical matrix is sparse!Shingles 1110101100110001100111101010J. Leskovec, A. Rajaraman, J. Ullman: Mining of Massive Datasets, http://www.mmds.org

Outline: Finding Similar Columns So far:DocumentsA documents a set of shingles Represent a set as a boolean vector in a matrixShingles 311110101100110001100111101010J. Leskovec, A. Rajaraman, J. Ullman: Mining of Massive Datasets, http://www.mmds.org

Outline: Finding Similar Columns So far:DocumentsA documents a set of shingles Represent a set as a boolean vector in a matrix Next goal: Find similar columns while computingsmall signatures 32Similarity of columns similarity of signaturesShingles 1110101100110001100111101010J. Leskovec, A. Rajaraman, J. Ullman: Mining of Massive Datasets, http://www.mmds.org

Outline: Finding Similar Columns Next Goal: Find similar columns, Small signatures Naïve approach: 331) Signatures of columns: small summaries of columnsJ. Leskovec, A. Rajaraman, J. Ullman: Mining of Massive Datasets, http://www.mmds.org

Outline: Finding Similar Columns Next Goal: Find similar columns, Small signatures Naïve approach:1) Signatures of columns: small summaries of columns 2) Examine pairs of signatures to find similar columns n 34Essential: Similarities of signatures and columns are related3) Optional: Check that columns with similar signatures are really similarJ. Leskovec, A. Rajaraman, J. Ullman: Mining of Massive Datasets, http://www.mmds.org

Outline: Finding Similar Columns Next Goal: Find similar columns, Small signatures Naïve approach:1) Signatures of columns: small summaries of columns 2) Examine pairs of signatures to find similar columns n 3) Optional: Check that columns with similar signatures are really similarWarnings: Comparing all pairs may take too much time: Job for LSHn35Essential: Similarities of signatures and columns are relatedThese methods can produce false negatives, and even false positives (if the optional check isnot made)J. Leskovec, A. Rajaraman, J. Ullman: Mining of Massive Datasets, http://www.mmds.org

Hashing Columns (Signatures) : LSH principle Key idea: “hash” each column C to a small signature h(C), such that:(1) h(C) is small enough that the signature fits in RAM (2) sim(C1, C2) is the same as the “similarity” of signatures h(C1) and h(C2) 36J. Leskovec, A. Rajaraman, J. Ullman: Mining of Massive Datasets, http://www.mmds.org

Hashing Columns (Signatures) : LSH principle Key idea: “hash” each column C to a small signature h(C), such that:(1) h(C) is small enough that the signature fits in RAM (2) sim(C1, C2) is the same as the “similarity” of signatures h(C1) and h(C2) Goal: Find a hash function h(·) such that:If sim(C1,C2) is high, then with high prob. h(C1) h(C2) If sim(C1,C2) is low, then with high prob. h(C1) h(C2) 37J. Leskovec, A. Rajaraman, J. Ullman: Mining of Massive Datasets, http://www.mmds.org

Hashing Columns (Signatures) : LSH principle Key idea: “hash” each column C to a small signature h(C), such that:(1) h(C) is small enough that the signature fits in RAM (2) sim(C1, C2) is the same as the “similarity” of signatures h(C1) and h(C2) Goal: Find a hash function h(·) such that:If sim(C1,C2) is high, then with high prob. h(C1) h(C2) If sim(C1,C2) is low, then with high prob. h(C1) h(C2) 38Hash docs into buckets. Expect that “most” pairs of near duplicate docshash into the same bucket!J. Leskovec, A. Rajaraman, J. Ullman: Mining of Massive Datasets, http://www.mmds.org

Min-Hashing Goal: Find a hash function h(·) such that:if sim(C1,C2) is high, then with high prob. h(C1) h(C2) if sim(C1,C2) is low, then with high prob. h(C1) h(C2) Clearly, the hash function depends on the similarity metric: 39Not all similarity metrics have a suitable hash functionJ. Leskovec, A. Rajaraman, J. Ullman: Mining of Massive Datasets, http://www.mmds.org

Min-Hashing Goal: Find a hash function h(·) such that:if sim(C1,C2) is high, then with high prob. h(C1) h(C2) if sim(C1,C2) is low, then with high prob. h(C1) h(C2) Clearly, the hash function depends on the similarity metric: 40Not all similarity metrics have a suitable hash functionThere is a suitable hash function for the Jaccard similarity: It is calledMin-HashingJ. Leskovec, A. Rajaraman, J. Ullman: Mining of Massive Datasets, http://www.mmds.org

Min-Hashing 41Imagine the rows of the boolean matrix permuted under randompermutation pJ. Leskovec, A. Rajaraman, J. Ullman: Mining of Massive Datasets, http://www.mmds.org

Min-Hashing 42Imagine the rows of the boolean matrix permuted under randompermutation pDefine a “hash” function hp(C) the index of the first (in thepermuted order p) row in which column C has value 1:hp (C) minp p(C)J. Leskovec, A. Rajaraman, J. Ullman: Mining of Massive Datasets, http://www.mmds.org

Min-Hashing 43Imagine the rows of the boolean matrix permuted under randompermutation pDefine a “hash” function hp(C) the index of the first (in thepermuted order p) row in which column C has value 1:hp (C) minp p(C)Use several (e.g., 100) independent hash functions (that is,permutations) to create a signature of a columnJ. Leskovec, A. Rajaraman, J. Ullman: Mining of Massive Datasets, http://www.mmds.org

Min-Hashing 44Imagine the rows of the boolean matrix permuted under randompermutation pDefine a “hash” function hp(C) the index of the first (in thepermuted order p) row in which column C has value 1:hp (C) minp p(C)Use several (e.g., 100) independent hash functions (that is,permutations) to create a signature of a columnJ. Leskovec, A. Rajaraman, J. Ullman: Mining of Massive Datasets, http://www.mmds.org

Zoo example (shingle size k 1)Universe{ dog, cat, lion, tiger, mouse}[ cat, mouse, lion, dog, tiger][ lion, cat, mouse, dog, tiger]A { mouse, lion }45J. Leskovec, A. Rajaraman, J. Ullman: Mining of Massive Datasets, http://www.mmds.org

Zoo example (shingle size k 1)Universe{ dog, cat, lion, tiger, mouse}[ cat, mouse, lion, dog, tiger][ lion, cat, mouse, dog, tiger]A { mouse, lion }46mh1(A) min ({mouse, lion } ) mousemh2(A) min ({ mouse, lion } ) lionJ. Leskovec, A. Rajaraman, J. Ullman: Mining of Massive Datasets, http://www.mmds.org

Min-Hashing ExamplePermutation p47Input matrix (Shingles x ture matrix M2121J. Leskovec, A. Rajaraman, J. Ullman:Mining of Massive Datasets, http://www.mmds.org

Min-Hashing Example2nd element of the permutationis the first to map to a 1Permutation p48Input matrix (Shingles x ture matrix M2121J. Leskovec, A. Rajaraman, J. Ullman:Mining of Massive Datasets, http://www.mmds.org

Min-Hashing ExamplePermutation p49Input matrix (Shingles x Documents)Signature matrix M2 4101021213 210011011417 1026 301011 601015 710104 51010J. Leskovec, A. Rajaraman, J. Ullman:Mining of Massive Datasets, http://www.mmds.org

Min-Hashing Example2nd element of the permutationis the first to map to a 1Permutation p50Input matrix (Shingles x Documents)Signature matrix M2 4101021213 210011011417 1026 301011 601015 710104 510104th element of the permutationis the first to map to a 1J. Leskovec, A. Rajaraman, J. Ullman:Mining of Massive Datasets, http://www.mmds.org

Min-Hashing ExamplePermutation p51Input matrix (Shingles x Documents)Signature matrix M2 4 3101021213 2 410011011417 1 7021012126 3 2011 6 601015 7 110104 5 51010J. Leskovec, A. Rajaraman, J. Ullman:Mining of Massive Datasets, http://www.mmds.org

Note: Another (equivalent) way is tostore row indexesor raw shingles(e.g. mouse, lion):Min-Hashing Example1265341165342nd element of the permutationis the first to map to a 1Permutation p52Input matrix (Shingles x Documents)Signature matrix M2 4 3101021213 2 410011011417 1 7021012126 3 2011 6 601015 7 110104 5 510104th element of the permutationis the first to map to a 1J. Leskovec, A. Rajaraman, J. Ullman:Mining of Massive Datasets, http://www.mmds.org

Min-Hash Signatures Pick K 100 random permutations of the rowsThink of sig(C) as a column vectorsig(C)[i] according to the i-th permutation, the index of the firstrow that has a 1 in column Csig(C)[i] min (pi(C)) 53Note: The sketch (signature) of document C is small 𝟏𝟎𝟎 bytes!We achieved our goal! We “compressed” long bit vectors intoshort signaturesJ. Leskovec, A. Rajaraman, J. Ullman: Mining of Massive Datasets, http://www.mmds.org

Key FactFor two sets A, B, and a min-hash function mhi():Unbiased estimator for Sim using K hashes (notation policy – thisis a different K from size of shingle)54J. Leskovec, A. Rajaraman, J. Ullman: Mining of Massive Datasets, http://www.mmds.org

Min-Hashing ExamplePermutation p55Input matrix (Shingles x Documents)Signature matrix M2 4 3101021213 2 410011011417 1 7021012126 3 2011 6 601015 7 110104 5 51010Similarities:1-32-4 1-2Col/Col 0.75 0.75 0Sig/Sig 0.67 1.00 03-400J. Leskovec, A. Rajaraman, J. Ullman: Mining of Massive Datasets, http://www.mmds.org

The Min-Hash Property0000Choose a random permutation pClaim: Pr[hp(C1) hp(C2)] sim(C1, C2)Why?11000110 One of the two cols had to have 1 atposition y56J. Leskovec, A. Rajaraman, J. Ullman: Mining of Massive Datasets, http://www.mmds.org

The Min-Hash Property0000Choose a random permutation pClaim: Pr[hp(C1) hp(C2)] sim(C1, C2)Why?11000110 Let X be a doc (set of shingles), yÎ X is a shingleOne of the two cols had to have 1 atposition y57J. Leskovec, A. Rajaraman, J. Ullman: Mining of Massive Datasets, http://www.mmds.org

The Min-Hash Property0000Choose a random permutation pClaim: Pr[hp(C1) hp(C2)] sim(C1, C2)Why?11000110 Let X be a doc (set of shingles), yÎ X is a shingleThen: Pr[p(y) min(p(X))] 1/ X nIt is equally likely that any yÎ X is mapped to the min elementOne of the two cols had to have 1 atposition y58J. Leskovec, A. Rajaraman, J. Ullman: Mining of Massive Datasets, http://www.mmds.org

The Min-Hash Property0000Choose a random permutation pClaim: Pr[hp(C1) hp(C2)] sim(C1, C2)Why?11000110 Let X be a doc (set of shingles), yÎ X is a shingleThen: Pr[p(y) min(p(X))] 1/ X nIt is equally likely that any yÎ X is mapped to the min element Let y be s.t. p(y) min(p(C1ÈC2)) Then either:p(y) min(p(C1)) if y Î C1 , orp(y) min(p(C2)) if y Î C259One of the two cols had to have 1 atposition yJ. Leskovec, A. Rajaraman, J. Ullman: Mining of Massive Datasets, http://www.mmds.org

The Min-Hash Property0000Choose a random permutation pClaim: Pr[hp(C1) hp(C2)] sim(C1, C2)Why?11000110 Let X be a doc (set of shingles), yÎ X is a shingleThen: Pr[p(y) min(p(X))] 1/ X nIt is equally likely that any yÎ X is mapped to the min element Let y be s.t. p(y) min(p(C1ÈC2)) Then either:p(y) min(p(C1)) if y Î C1 , orp(y) min(p(C2)) if y Î C2 60One of the two cols had to have 1 atposition ySo the prob. that both are true is the prob. y Î C1 Ç C2Pr[min(p(C1)) min(p(C2))] C1ÇC2 / C1ÈC2 sim(C1, C2)J. Leskovec, A. Rajaraman, J. Ullman: Mining of Massive Datasets, http://www.mmds.org

The Min-Hash Property (Take 2: simpler proof) Choose a random permutation pClaim: Pr[hp(C1) hp(C2)] sim(C1, C2)Why? Given a set X, the probability that any one element is the minhash under p is 1/ X ß (0)nIt is equally likely that any yÎ X is mapped to the min elementGiven a set X, the probability that one of any k elements is themin-hash under p is k/ X ß (1) For C1 È C2, the probability that any element is the min-hashunder p is 1/ C1 È C2 (from 0)ß (2) For any C1 and C2, the probability of choosing the same min-hashunder p is C1ÇC2 / C1 È C2 ß from (1) and (2) 61

Similarity for Signatures 62We know: Pr[hp(C1) hp(C2)] sim(C1, C2)Now generalize to multiple hash functionsThe similarity of two signatures is the fraction of the hash functions inwhich they agreeNote: Because of the Min-Hash property, the similarity of columns isthe same as the expected similarity of their signaturesJ. Leskovec, A. Rajaraman, J. Ullman: Mining of Massive Datasets, http://www.mmds.org

Min-Hashing ExamplePermutation p63Input matrix (Shingles x Documents)Signature matrix M2 4 3101021213 2 410011011417 1 7021012126 3 2011 6 601015 7 110104 5 51010Similarities:1-32-4 1-2Col/Col 0.75 0.75 0Sig/Sig 0.67 1.00 03-400J. Leskovec, A. Rajaraman, J. Ullman: Mining of Massive Datasets, http://www.mmds.org

Min-Hash Signatures Pick K 100 random permutations of the rowsThink of sig(C) as a column vectorsig(C)[i] according to the i-th permutation, the index of the firstrow that has a 1 in column Csig(C)[i] min (pi(C)) 64Note: The sketch (signature) of document C is small 𝟏𝟎𝟎 bytes!We achieved our goal! We “compressed” long bit vectors intoshort signaturesJ. Leskovec, A. Rajaraman, J. Ullman: Mining of Massive Datasets, http://www.mmds.org

Implementation Trick Permuting rows even once is prohibitiveApproximate Linear Permutation HashingPick K independent hash functions (use a, b below) Apply the hash function on each column (document) for each hash function and get minhash signatureHow to pick a randomhash function h(x)?Universal hashing:ha,b(x) ((a·x b) mod p) mod Nwhere:a,b random integersp prime number (p N)65J. Leskovec, A. Rajaraman, J. Ullman: Mining of Massive Datasets, http://www.mmds.org

Summary: 2 Steps Shingling: Convert documents to sets 66We used hashing to assign each shingle an IDMin-Hashing: Convert large sets to short signatures, whilepreserving similarity We used similarity preserving hashing to generate signatures withproperty Pr[hp(C1) hp(C2)] sim(C1, C2) We used hashing to get around generating random permutationsJ. Leskovec, A. Rajaraman, J. Ullman: Mining of Massive Datasets, http://www.mmds.org

22 Compressing Shingles To compress long shingles, we can hashthem to (say) 4 bytes Like a Code Book If #shingles manageable àSimple dictionary suffices Doc represented by the set of hash/dict. values of its k-shingles Idea:Two documents could appear to have shingles in common, whenthe hash-values were shared J. Leskov

Related Documents:

92 vipul sharma it 93 rishabh jain cse 94 manik arora cse 95 nishant bhardwaj cse . 96 rajit shrivastava it 97 shivansh gaur cse 98 harsh singh cse 99 shreyanshi raj cse 100 rahul bedi cse 101 pallavi anand cse 102 divya cse 103 nihal raj it 104 kanak

cse-148 kuriakose jijo george n t george cse-149 kusum joshi ramesh chandra joshi cse-150 m mithun bose n k mohandasan cse-151 madhuri yadav rajbir yadav cse-152 malini shukla r s sharma cse-153 manisha khattar sunil kumar khattar cse-154 m

CSE 440: Introduction to HCI CSE 441: Advanced HCI CSE 510: Advanced Topics in HCI CSEP 510: Human-Computer Interaction CSE 332: Data Structures. Who We Are You Computing. Who We Are Eunice Jun Prefer: Eunice / She / Her Background: BS,Cognitive Studies & Computer Science Vanderbilt, 2016

1 CSE 474 Introduction 1 CSE 474 – Introduction to Embedded Systems n Instructor: q Bruce Hemingway n CSE 464, Office Hours: 11:00-12:00 p.m., Tuesday, Thursday n or whenever the door is open n bruceh@cs.washington.edu q Teaching Assistants: q Cody Ohlsen, Kendall Lowrey and Ying-Chao (Tony) Tung CSE 474 Introduction 2

F14 CSE550 45 Combinatorial Algorithms and Intractability S14 CSE 555 46 Advanced Theory of Computation S14 CSE591/MAT591 40 Combinatorial Design Theory F13 CSE 355 82 Intro Theory of Computation F13 CSE 552 32 Randomized and Approximation Algorithms S13 CSE 555 30 Advanced Theory of Computation S1

CSE Citation Style – Quick Guide 7th Edition 1 This guide outlines how to cite some of the more common information sources in the Council of Science Editor’s (CSE) Style Name-Year system. For a comprehensive listing, please consult: Scientific Style and Format: The CSE Manual for Authors, Editors, and Publishers, 7th edition

PowerPoint presentation CSE Training Handbook Contains the WAC 296-809-100 Outlines what you must do CSE Training Resource Tools for setting up your CSE program Confined Space Entry Program Resources Four Major Sections: WAC 296-809 –Confined Spaces Identify and Control permit-requi

ANSI A300 defines as a tree risk assess-ment: “A systematic process used to identify, analyze, and evaluate risk.” “Mitigation” is a term that I see com-monly used inappropriately. In the Standard, it is very clearly defined as the process of diminishing risk. We do not eliminate risk in trees when we perform some form of mitigation practice. We are minimizing the risk to some .