Learning Compact Visual Attributes for Large-scale Image Classification
Yu Su and Frédéric Jurie
Université de Caen Basse-Normandie, GREYC-CNRS UMR 6072
Outline
- Motivation
- Method
- Experiments
Image Classification
Assign one or multiple labels to an image based on its semantic content.
[Example image labeled: Motorbike, Person]
Image Classification
Assign one or multiple labels to an image based on its semantic content.
Small-scale datasets:
- 15Scene (15 classes, 5K images)
- PASCAL VOC (20 classes, 10K images)
- Caltech101 (101 classes, 8K images)
Large-scale datasets:
- SUN (397 classes, 100K images)
- LSVRC (1K classes, 1.2M images)
- ImageNet (10K classes, 9M images)
Image Representation
Fisher Vector [Perronnin et al., ECCV'10]: state-of-the-art image representation.

                          BoW      LLC      SuperVector   Fisher Vector
PASCAL VOC (20 classes)   56.1%    57.6%    58.2%         61.7%
SUN (397 classes)         27.9%    34.1%    35.5%         41.3%

- Bag of Words (BoW) [Sivic & Zisserman, ICCV'03]
- Locality-constrained Linear Coding (LLC) [Wang et al., CVPR'10]
- Super Vector [Zhou et al., ECCV'10]
Image Representation
Fisher Vector [Perronnin et al., ECCV'10]: state-of-the-art image representation.
[Bar chart: Large Scale Visual Recognition Challenge (LSVRC) results under flat and hierarchical cost for XRCE (Fisher Vector), UvA, ISI, and NII; XRCE ranks first.]
Image Representation
Fisher Vector [Perronnin et al., ECCV'10]: high dimensionality.
- GMM with 256 components; SIFT reduced to 64-d by PCA
- Spatial pyramid: 1x1, 2x2, 3x1 (8 cells)
- Fisher Vector: 256 x 64 x 2 x 8 = 262,144-d
[Diagram: descriptor x_t against Gaussian mixtures g1–g4 of SIFT; 1st-order and 2nd-order Fisher Vector.]
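The dimensionality stated on the slide can be checked with a few lines of arithmetic. This is just the calculation, using the values given above (256 Gaussians, 64-d SIFT, 1st- and 2nd-order gradients, 8 spatial-pyramid cells):

```python
# Fisher Vector dimensionality, using the values from the slide.
n_gaussians = 256
sift_dim = 64                      # SIFT reduced to 64-d by PCA
n_orders = 2                       # gradients w.r.t. GMM means and variances
pyramid_cells = 1 * 1 + 2 * 2 + 3 * 1   # spatial pyramid 1x1, 2x2, 3x1 -> 8 cells

fv_dim_per_cell = n_gaussians * sift_dim * n_orders   # 32,768
fv_dim_total = fv_dim_per_cell * pyramid_cells        # 262,144
print(fv_dim_per_cell, fv_dim_total)
```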
Image Representation
Fisher Vector [Perronnin et al., ECCV'10]: high dimensionality.
- GMM with 256 components; SIFT reduced to 64-d by PCA
- Spatial pyramid: 1x1, 2x2, 3x1
- Fisher Vector: 256 x 64 x 2 x 8 = 262,144-d
Compression:
- Product Quantization (PQ)
- Locality-Sensitive Hashing (LSH)
- Principal Component Analysis (PCA)
- Visual Attributes (our work)
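To make the Product Quantization baseline concrete, here is a minimal toy sketch (not the implementation used in the paper): the vector is split into sub-vectors, a small k-means is run per sub-space, and each sub-vector is stored as a one-byte centroid index. All sizes below are illustrative assumptions; a plain Lloyd iteration stands in for a real k-means.

```python
import numpy as np

def pq_encode(X, n_subvectors=8, n_centroids=16, n_iter=10, seed=0):
    """Toy product quantizer: split each vector into equal sub-vectors,
    run a small k-means per sub-space, keep only centroid indices."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    sub_dim = d // n_subvectors
    codebooks, codes = [], []
    for s in range(n_subvectors):
        Xs = X[:, s * sub_dim:(s + 1) * sub_dim]
        C = Xs[rng.choice(n, n_centroids, replace=False)]  # init from data
        for _ in range(n_iter):                            # plain Lloyd steps
            d2 = ((Xs[:, None, :] - C[None]) ** 2).sum(-1)
            assign = d2.argmin(1)
            for k in range(n_centroids):
                if (assign == k).any():
                    C[k] = Xs[assign == k].mean(0)
        codebooks.append(C)
        codes.append(assign.astype(np.uint8))
    return codebooks, np.stack(codes, axis=1)  # codes: n x n_subvectors bytes

X = np.random.default_rng(1).normal(size=(64, 128)).astype(np.float32)
_, codes = pq_encode(X)
print(codes.shape)  # each 128-d float vector stored as 8 bytes
```

With 8 sub-vectors and 16 centroids each, a vector costs 8 bytes instead of 128 floats; real PQ setups for Fisher Vectors use many more sub-quantizers.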
Visual Attributes
[Lampert et al., CVPR'09; Farhadi et al., CVPR'09; Li et al., NIPS'10; Su and Jurie, IJCV'12]
Compact image representation, BUT requires a large amount of human effort:
- Define attributes from expertise or an ontology
- Collect and annotate training images
Outline
- Motivation
- Method
- Experiments
Overview – Region Attributes
[Pipeline diagram, prediction phase: test image → randomly sampled image regions → Fisher Vectors → region attributes → attribute features.]
Image/Region Representation
Generate image regions: random sampling vs. image segmentation.
- Random sampling: simple, no parameters; less semantically meaningful
- Image segmentation: semantically meaningful; many parameters, slower
Each region is represented by a Fisher Vector.
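A minimal sketch of the random-sampling option, assuming axis-aligned rectangular regions with a minimum size; the exact sampling scheme (region shapes, sizes, counts) is not specified on the slide, so everything here is illustrative:

```python
import numpy as np

def sample_regions(img_h, img_w, n_regions=10, min_frac=0.3, seed=0):
    """Randomly sample rectangular regions (x, y, w, h) from an image.
    `min_frac` caps how small a region may be relative to each side;
    this is an assumption, not the paper's exact scheme."""
    rng = np.random.default_rng(seed)
    regions = []
    for _ in range(n_regions):
        h = int(rng.integers(int(min_frac * img_h), img_h + 1))
        w = int(rng.integers(int(min_frac * img_w), img_w + 1))
        y = int(rng.integers(0, img_h - h + 1))
        x = int(rng.integers(0, img_w - w + 1))
        regions.append((x, y, w, h))
    return regions

print(sample_regions(480, 640, n_regions=3))
```

No parameters beyond the region count and a size bound are needed, which is exactly the simplicity argument the slide makes against segmentation.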
Image/Region Clustering
Spectral clustering, multi-level:
- Suited to the high-dimensional Fisher Vector (32,768-d)
- Gaussian kernel as the similarity measure
- Number of clusters: 50, 100, ..., 500 (2,750 clusters in total)
Learn attribute (cluster) classifiers:
- SVM with linear kernel
- One-vs-rest strategy
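The clustering step can be sketched as normalized spectral clustering with a Gaussian-kernel affinity, run at several cluster counts whose labelings are all kept (the "multi-level" idea). This is a hedged, scaled-down illustration: the bandwidth choice, k-means details, and the paper's actual 50–500 schedule are not reproduced here.

```python
import numpy as np

def spectral_clustering(X, k, sigma=1.0, seed=0):
    """Minimal normalized spectral clustering: Gaussian-kernel affinity,
    normalized affinity matrix, top-k eigenvectors, then k-means."""
    rng = np.random.default_rng(seed)
    d2 = ((X[:, None] - X[None]) ** 2).sum(-1)
    W = np.exp(-d2 / (2 * sigma ** 2))            # Gaussian-kernel similarity
    Dm = 1.0 / np.sqrt(W.sum(1))
    L = Dm[:, None] * W * Dm[None, :]             # normalized affinity
    _, vecs = np.linalg.eigh(L)
    U = vecs[:, -k:]                              # top-k eigenvectors
    U /= np.linalg.norm(U, axis=1, keepdims=True) + 1e-12
    C = U[rng.choice(len(U), k, replace=False)].copy()
    for _ in range(20):                           # k-means in embedded space
        a = ((U[:, None] - C[None]) ** 2).sum(-1).argmin(1)
        for j in range(k):
            if (a == j).any():
                C[j] = U[a == j].mean(0)
    return a

# Multi-level clustering: cluster at several granularities, keep all labelings.
X = np.random.default_rng(1).normal(size=(60, 5))
levels = {k: spectral_clustering(X, k) for k in (2, 3, 4)}
print({k: len(set(v)) for k, v in levels.items()})
```

Each cluster at each level then becomes one attribute, trained one-vs-rest with a linear SVM as the slide states.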
Generate Attribute Features
Classifier-based soft assignment: the probability that attribute a appears in an image/region, where f is the linear classifier (SVM) of attribute a.
- Image attributes: attribute probabilities computed on the image's Fisher Vector
- Region attributes: attribute probabilities computed on the region Fisher Vectors
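One common way to realize this soft assignment is to pass each linear SVM score through a sigmoid and pool the region scores; the following sketch assumes sigmoid calibration and max-pooling over regions, neither of which is stated on the slide.

```python
import numpy as np

def sigmoid(s):
    return 1.0 / (1.0 + np.exp(-s))

def attribute_features(fv_image, fv_regions, W, b):
    """Classifier-based soft assignment (illustrative): each attribute's
    linear SVM score is mapped to a probability with a sigmoid.
    W: (n_attributes, d) SVM weights; b: (n_attributes,) biases.
    Max-pooling over regions is an assumption, not from the slides."""
    img_attr = sigmoid(W @ fv_image + b)           # image attributes
    region_probs = sigmoid(fv_regions @ W.T + b)   # per-region probabilities
    region_attr = region_probs.max(axis=0)         # pool over regions
    return img_attr, region_attr

rng = np.random.default_rng(0)
d, n_attr, n_reg = 32, 5, 4
img, reg = rng.normal(size=d), rng.normal(size=(n_reg, d))
W, b = rng.normal(size=(n_attr, d)), rng.normal(size=n_attr)
ia, ra = attribute_features(img, reg, W, b)
print(ia.shape, ra.shape)
```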
Compact Image Signature
Attribute selection:
- Objective: a compact set of attributes with low redundancy
- Algorithm: sequential greedy search [Peng et al., PAMI'05]
Binarization:
- Locality-Sensitive Hashing: random projection and thresholding,
  h_p(x) = 1 if p^T x > 0, and 0 otherwise,
  where p is a randomly generated projection.
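The LSH binarization above is simple enough to write out directly. A minimal sketch, assuming Gaussian random projections (the slide only says "randomly generated"):

```python
import numpy as np

def lsh_binarize(X, n_bits, seed=0):
    """LSH binarization as on the slide: h_p(x) = 1 if p^T x > 0 else 0,
    with one random projection p per output bit (Gaussian, an assumption)."""
    rng = np.random.default_rng(seed)
    P = rng.normal(size=(X.shape[1], n_bits))  # one projection per bit
    return (X @ P > 0).astype(np.uint8)

X = np.random.default_rng(1).normal(size=(3, 100))  # 3 attribute vectors
codes = lsh_binarize(X, n_bits=16)
print(codes.shape)  # (3, 16): each signature fits in 16 bits
```

Because random projections approximately preserve angles, Hamming distance between codes tracks the cosine similarity of the original attribute vectors.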
Outline
- Motivation
- Proposed method
- Experiments
Examples of Learned Attributes
"horizontal structure", "road/ground", "vertical structure", "group of persons", "circular object", "animal in the grass"
Databases
PASCAL VOC 2007 [Everingham et al., 2007]:
- 20 objects, 9,963 images
- Binary classification
- Performance measure: mean Average Precision (mAP)
Databases
Caltech-256 [Griffin et al., CIT-TR, 2007]:
- 256 objects, 30K images
- Multi-class classification
- Performance measure: mean accuracy
Databases
SUN-397 [Xiao et al., CVPR'10]:
- 397 scenes, 100K images
- Multi-class classification
- Performance measure: mean accuracy
Implementation Details
SIFT descriptors: densely sampled, reduced to 64-d by PCA.
Fisher Vector: GMM with 256 components; dimension 256 x 64 x 2 = 32,768.
Image classification: SVM with linear kernel; the regularization parameter is determined on the PASCAL train/val set.
Attribute learning (including clustering, feature selection, etc.) is performed ONLY on the PASCAL train/val set.
Learn & Predict Attributes
PASCAL VOC 2007 train/validation set.
[Plots: Spectral Clustering vs. K-means; different encoding methods.]
Real-valued Attribute Features
Caltech-256 (n_train = 30) and SUN-397 (n_train = 50):
- FV with SPM (1x1, 2x2, 3x1): 262,144-d
- FV+SPM+PCA: PCA is learned on PASCAL VOC
- Classemes [Torresani et al., ECCV'10]: multiple low-level features
- Our method: (1) 500x more compact than FV+SPM with 3% performance loss; (2) better than PCA and Classemes
Binary Attribute Features
Caltech-256 (n_train = 30) and SUN-397 (n_train = 50):
- FV+SPM: 262,144 x 4 bytes
- Classemes [Torresani et al., ECCV'10]: binarized by thresholding
- PiCoDes [Bergamo et al., NIPS'11]: learned by optimizing an independent classification task
- Our method: (1) 2048x more compact than FV+SPM with 3% performance loss; (2) better than Classemes and PiCoDes
Thanks for your attention!