Learning Compact Visual Attributes for Large-scale Image Classification


Learning Compact Visual Attributes for Large-scale Image Classification
Yu Su and Frédéric Jurie
Université de Caen Basse-Normandie, GREYC-CNRS UMR 6072

Outline
- Motivation
- Method
- Experiments

Image Classification
Assign one or multiple labels to an image based on its semantic content (example labels: Motorbike, Person).

Image Classification
Assign one or multiple labels to an image based on its semantic content.
Small-scale datasets:
- 15-Scene (15 classes, 5K images)
- PASCAL VOC (20 classes, 10K images)
- Caltech-101 (101 classes, 8K images)
Large-scale datasets:
- SUN (397 classes, 100K images)
- LSVRC (1K classes, 1.2M images)
- ImageNet (10K classes, 9M images)

Image Representation: Fisher Vector [Perronnin et al., ECCV'10]
State-of-the-art image representation:
                          BoW     LLC     SuperVector   FisherVector
PASCAL VOC (20 classes)   56.1%   57.6%   58.2%         61.7%
SUN (397 classes)         27.9%   34.1%   35.5%         41.3%
- Bag of Words (BoW) [Sivic & Zisserman, ICCV'03]
- Locality-constrained Linear Coding (LLC) [Wang et al., CVPR'10]
- Super Vector [Zhou et al., ECCV'10]

Image Representation: Fisher Vector [Perronnin et al., ECCV'10]
State-of-the-art image representation.
[Chart: Large Scale Visual Recognition Challenge (LSVRC) results for XRCE (Fisher Vector), UvA, ISI and NII, under flat and hierarchical cost.]

Image Representation: Fisher Vector [Perronnin et al., ECCV'10]
High dimensionality:
- GMM with 256 components, SIFT reduced to 64-d by PCA
- Spatial pyramid: 1x1, 2x2, 3x1
- Fisher Vector: 256x64x2x8 = 262,144-d
[Diagram: a descriptor x_t encoded against Gaussian mixtures of SIFT (g1-g4), producing 1st-order and 2nd-order Fisher Vector statistics.]

Image Representation: Fisher Vector [Perronnin et al., ECCV'10]
High dimensionality:
- GMM with 256 components, SIFT reduced to 64-d by PCA
- Spatial pyramid: 1x1, 2x2, 3x1
- Fisher Vector: 256x64x2x8 = 262,144-d
Compression:
- Product Quantization (PQ)
- Locality-Sensitive Hashing (LSH)
- Principal Component Analysis (PCA)
- Visual Attributes (our work)
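
A brief aside on how such a vector is computed: below is a minimal NumPy/scikit-learn sketch of Fisher Vector encoding with a diagonal-covariance GMM. It is not the authors' code; the GMM size (256) and descriptor dimension (64) follow the slide, the power and L2 normalizations follow the improved Fisher Vector, and the random data in the usage lines is purely illustrative.

```python
# Minimal Fisher Vector sketch (one spatial cell): 2 * K * D = 2*256*64 = 32,768-d.
# With the 8-cell spatial pyramid (1x1 + 2x2 + 3x1) this becomes 262,144-d.
import numpy as np
from sklearn.mixture import GaussianMixture

def fisher_vector(descriptors, gmm):
    """Encode local descriptors (T x D) as a 2*K*D Fisher Vector."""
    T, _ = descriptors.shape
    gamma = gmm.predict_proba(descriptors)                     # (T, K) soft assignments
    mu, var, w = gmm.means_, gmm.covariances_, gmm.weights_    # diagonal covariances
    diff = (descriptors[:, None, :] - mu[None, :, :]) / np.sqrt(var)[None, :, :]  # (T, K, D)

    g_mu = (gamma[:, :, None] * diff).sum(0) / (T * np.sqrt(w)[:, None])                  # 1st order
    g_var = (gamma[:, :, None] * (diff ** 2 - 1)).sum(0) / (T * np.sqrt(2 * w)[:, None])  # 2nd order

    fv = np.hstack([g_mu.ravel(), g_var.ravel()])
    fv = np.sign(fv) * np.sqrt(np.abs(fv))                     # power normalization
    return fv / (np.linalg.norm(fv) + 1e-12)                   # L2 normalization

# Usage (illustrative data): fit the GMM once on PCA-reduced SIFT, then encode.
gmm = GaussianMixture(n_components=256, covariance_type='diag', random_state=0)
gmm.fit(np.random.randn(10000, 64))
fv = fisher_vector(np.random.randn(500, 64), gmm)              # 32,768-d
```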

Visual Attributes
[Lampert et al., CVPR'09; Farhadi et al., CVPR'09; Li et al., NIPS'10; Su and Jurie, IJCV'12]
Compact image representation, BUT needs a large amount of human effort:
- Define attributes from expertise or an ontology
- Collect and annotate training images

Outline
- Motivation
- Method
- Experiments

Overview – Region Attributes
[Diagram: randomly sampled image regions.]

Overview – Region Attributes
[Diagram, prediction phase: test image → image regions → Fisher Vectors → region attributes → attribute features.]

Image/Region Representation
Generate image regions: random sampling vs. image segmentation.
- Random sampling: simple, no parameters, but less semantically meaningful.
- Image segmentation: semantically meaningful, but many parameters and slower.
Each image/region is then represented by a Fisher Vector.
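
Since the paper opts for random sampling, here is a hedged sketch of how such rectangular regions could be drawn; the number of regions and the scale range are illustrative assumptions, not values given on the slides.

```python
# Illustrative random-region sampler: rectangles with random position and scale.
import random

def sample_regions(img_w, img_h, n_regions=100, min_frac=0.2, max_frac=0.9, seed=0):
    """Return n_regions boxes (x, y, w, h); each box is later encoded by its own Fisher Vector."""
    rng = random.Random(seed)
    regions = []
    for _ in range(n_regions):
        w = int(rng.uniform(min_frac, max_frac) * img_w)
        h = int(rng.uniform(min_frac, max_frac) * img_h)
        x = rng.randint(0, img_w - w)
        y = rng.randint(0, img_h - h)
        regions.append((x, y, w, h))
    return regions
```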

Image/Region Clustering
- Spectral clustering: suits the high-dimensional Fisher Vector (32,768-d); Gaussian kernel as similarity measure.
- Multi-level clustering: number of clusters 50, 100, ..., 500 (2,750 clusters in total).
- Learn attribute (cluster) classifiers: SVM with linear kernel, one-vs-rest strategy.
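
A minimal sketch of the multi-level clustering and the one-vs-rest attribute classifiers, assuming the Fisher Vectors are stacked into a matrix and using scikit-learn's SpectralClustering with an RBF (Gaussian) affinity; the gamma and C values are assumptions.

```python
# Multi-level spectral clustering over image/region Fisher Vectors, then one
# linear SVM per cluster ("attribute") in a one-vs-rest fashion.
import numpy as np
from sklearn.cluster import SpectralClustering
from sklearn.svm import LinearSVC

def multilevel_clusters(fvs, levels=range(50, 501, 50), gamma=1.0):
    """One label vector per level; 50 + 100 + ... + 500 = 2,750 clusters in total."""
    return [SpectralClustering(n_clusters=k, affinity='rbf', gamma=gamma,
                               random_state=0).fit_predict(fvs)
            for k in levels]

def train_attribute_classifiers(fvs, labels_per_level, C=1.0):
    """One binary linear SVM per cluster: cluster members vs. the rest."""
    classifiers = []
    for labels in labels_per_level:
        for c in np.unique(labels):
            classifiers.append(LinearSVC(C=C).fit(fvs, (labels == c).astype(int)))
    return classifiers
```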

Generate Attribute Features
Classifier-based soft assignment: the probability that attribute a appears in an image/region, computed from f_a, the linear classifier (SVM) of attribute a.
- Image attributes: attribute probabilities computed on the whole-image Fisher Vector.
- Region attributes: attribute probabilities computed on the region Fisher Vectors.
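
A sketch of the classifier-based soft assignment. The slide does not show the formulas, so two assumptions are made here for illustration: the attribute probability is a sigmoid of the linear SVM score, and region attributes are max-pooled over the sampled regions.

```python
import numpy as np

def attribute_probabilities(fv, classifiers):
    """p_a(x) = sigmoid(f_a(x)) for every attribute classifier f_a (assumed calibration)."""
    scores = np.array([clf.decision_function(fv[None, :])[0] for clf in classifiers])
    return 1.0 / (1.0 + np.exp(-scores))

def image_and_region_attributes(image_fv, region_fvs, classifiers):
    image_attr = attribute_probabilities(image_fv, classifiers)
    # Assumed pooling: take, per attribute, the maximum probability over regions.
    region_attr = np.max([attribute_probabilities(r, classifiers) for r in region_fvs], axis=0)
    return image_attr, region_attr
```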

Compact Image Signature
Attribute selection:
- Objective: a compact set of attributes with low redundancy.
- Algorithm: sequential greedy search [Peng et al., PAMI'05].
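
For illustration, a simplified sketch of sequential greedy selection in the spirit of mRMR [Peng et al., PAMI'05]: prefer attributes relevant to the class labels but weakly correlated with those already selected. The relevance/redundancy measures below (absolute correlations) are stand-ins, not necessarily the paper's exact criterion.

```python
# Greedy relevance-minus-redundancy selection over attribute features.
import numpy as np

def greedy_attribute_selection(A, y, n_select):
    """A: (n_images, n_attributes) attribute features; y: image labels."""
    n_attr = A.shape[1]
    relevance = np.array([abs(np.corrcoef(A[:, j], y)[0, 1]) for j in range(n_attr)])
    selected = [int(np.argmax(relevance))]
    while len(selected) < n_select:
        best, best_score = None, -np.inf
        for j in range(n_attr):
            if j in selected:
                continue
            redundancy = np.mean([abs(np.corrcoef(A[:, j], A[:, s])[0, 1]) for s in selected])
            if relevance[j] - redundancy > best_score:
                best, best_score = j, relevance[j] - redundancy
        selected.append(best)
    return selected
```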

Compact Image Signature
Attribute selection (as above): objective is a compact set of attributes with low redundancy; algorithm is sequential greedy search [Peng et al., PAMI'05].
Binarization:
- Locality-Sensitive Hashing: random projection and thresholding.
- h_p(x) = 1 if p^T x >= 0, and 0 otherwise, where p is a randomly generated projection.
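
A minimal sketch of the binarization step: Gaussian random projections followed by thresholding at zero, i.e. h_p(x) = 1 if p^T x >= 0 and 0 otherwise; the number of bits is left as a parameter.

```python
import numpy as np

def lsh_binarize(features, n_bits, seed=0):
    """features: (n_samples, d) real-valued attribute features -> (n_samples, n_bits) bits."""
    rng = np.random.RandomState(seed)
    P = rng.randn(features.shape[1], n_bits)     # randomly generated projections p
    return (features @ P >= 0).astype(np.uint8)  # threshold at zero
```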

Outline
- Motivation
- Proposed method
- Experiments

Examples of Learned Attributes
"horizontal structure", "road/ground", "vertical structure", "group of persons", "circular object", "animal in the grass"

Databases
PASCAL VOC 2007 [Everingham et al., 2007]
- 20 objects, 9,963 images
- Binary classification
- Performance measure: mean Average Precision (mAP)

Databases
Caltech-256 [Griffin et al., CIT-TR, 2007]
- 256 objects, 30K images
- Multi-class classification
- Performance measure: mean accuracy

Databases
SUN-397 [Xiao et al., CVPR'10]
- 397 scenes, 100K images
- Multi-class classification
- Performance measure: mean accuracy

Implementation Details
- SIFT descriptors: densely sampled, reduced to 64-d by PCA.
- Fisher Vector: GMM with 256 components; dimension 256x64x2 = 32,768.
- Image classification: SVM with linear kernel; the regularization parameter is determined on the PASCAL train/val set.
- Attribute learning (including clustering, feature selection, etc.) is ONLY performed on the PASCAL train/val set.
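
A sketch of the final classification step: a linear SVM whose regularization parameter is chosen on the PASCAL train/val split; the candidate C values below are an assumption.

```python
import numpy as np
from sklearn.svm import LinearSVC

def pick_C_and_train(X_train, y_train, X_val, y_val, candidates=(0.01, 0.1, 1, 10, 100)):
    """Select C by validation accuracy, then retrain on train + val."""
    best_C, best_acc = None, -1.0
    for C in candidates:
        acc = LinearSVC(C=C).fit(X_train, y_train).score(X_val, y_val)
        if acc > best_acc:
            best_C, best_acc = C, acc
    X = np.vstack([X_train, X_val])
    y = np.concatenate([y_train, y_val])
    return LinearSVC(C=best_C).fit(X, y)
```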

Learn & Predict Attributes
[Plot: results on PASCAL VOC 2007 train/validation — Spectral Clustering vs. K-means.]

Learn & Predict Attributes
[Plot: results on PASCAL VOC 2007 train/validation — Spectral Clustering vs. K-means, and different encoding methods.]

Real-valued Attribute Feature
[Results on Caltech-256 (n_train = 30) and SUN-397 (n_train = 50).]
- FV with SPM (1x1, 2x2, 3x1): 262,144-d
- FV + SPM + PCA: PCA is learnt on PASCAL VOC
- Classemes [Torresani et al., ECCV'10]: multiple low-level features
- Our method: (1) 500 times more compact than FV + SPM with 3% performance loss; (2) better than PCA and Classemes


Binary Attribute Feature
[Results on Caltech-256 (n_train = 30) and SUN-397 (n_train = 50).]
- FV + SPM: 262,144 x 4 bytes
- Classemes [Torresani et al., ECCV'10]: binarized by thresholding
- PiCoDes [Bergamo et al., NIPS'11]: optimizing an independent classification task
- Our method: (1) 2,048 times more compact than FV + SPM with 3% performance loss; (2) better than Classemes and PiCoDes

Thanks for your attention!
