Fast R-CNN - University of California, Davis


Fast R-CNN
Ross Girshick, Facebook AI Research (FAIR); work done at Microsoft Research
Presented by: Nick Joodi, Doug Sherman

Fast Region-based ConvNets (R-CNNs)
(Sorry about the black background; Girshick's slides were all black.)

The PASCAL Visual Object Classes Challenge
Overview: classification, detection, segmentation. For each image:
- Does it contain the class? -> classification
- Where is it? -> detection, via bounding box
Evaluation: mean Average Precision (mAP)
- Participants submit results with confidence scores
- Produce precision-recall curves
- Compute average precision (AP) for each class
- Take the mean over classes to get mAP
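To make the evaluation concrete, here is a minimal sketch of AP and mAP computation over a ranked list of detections. The helper names and the simple "pre-matched true/false positive" setup are my own for illustration; the official VOC protocol additionally handles IoU matching and duplicate detections.

```python
import numpy as np

def average_precision(scores, labels):
    """AP for one class: area under the precision-recall curve obtained
    by sweeping a confidence threshold down the ranked detections.
    scores: confidence per detection; labels: 1 = true positive, 0 = false positive."""
    order = np.argsort(-np.asarray(scores))        # rank by descending confidence
    tp = np.asarray(labels, dtype=float)[order]
    cum_tp = np.cumsum(tp)
    precision = cum_tp / (np.arange(len(tp)) + 1)  # precision at each rank
    recall = cum_tp / max(tp.sum(), 1.0)           # recall at each rank
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precision, recall):
        ap += p * (r - prev_recall)                # precision * recall increment
        prev_recall = r
    return ap

def mean_average_precision(per_class_ap):
    """mAP: mean of the per-class APs."""
    return sum(per_class_ap) / len(per_class_ap)
```

For example, a ranking where every detection is correct yields AP = 1.0 for that class.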

Object detection renaissance (2013-Present)
[Figure slides, adapted from Fast R-CNN, R. Girshick (2015)]

Agenda
1. Pre-existing Models
   a. "Slow" R-CNN
   b. SPP-net
2. Ways to improve
   a. SGD Mini-Batch
   b. New Loss Function
3. Fast R-CNN
   a. Architecture
   b. Results & Future Work

Region-based ConvNets (R-CNNs)
- R-CNN (aka "slow R-CNN") [Girshick et al., CVPR 2014]
- SPP-net [He et al., ECCV 2014]

[Figure slides, adapted from Fast R-CNN, R. Girshick (2015)]

What's wrong with slow R-CNN?
- Ad hoc training objectives:
  - Fine-tune network with softmax classifier (log loss)
  - Train post-hoc linear SVMs (hinge loss)
  - Train post-hoc bounding-box regressors (L2 loss)
- Training is slow (84 h) and takes a lot of disk space
- Inference (detection) is slow: 47 s / image with VGG16 [Simonyan & Zisserman, ICLR 2015]
  - Fixed by SPP-net [He et al., ECCV 2014]

Agenda recap: next, 1b. SPP-net

[Figure slides, adapted from Fast R-CNN, R. Girshick (2015)]

Pyramid Pooling Layer
The region is max-pooled with windows at multiple scales (stride/window sizes w/4 x h/4, w/2 x h/2, w/1 x h/1, giving pooling outputs of 2 x 1, 4 x 2, and 8 x 2 in the slide's example); the pooled outputs are concatenated and fed to the FC layers. (The slide's worked example of pooled values did not survive transcription.)
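The idea of the pyramid pooling layer can be sketched as follows: max-pool one region of the conv feature map into several fixed grids and concatenate, so the output length is the same regardless of the region's width and height. This is an illustrative sketch of the concept, not the original SPP-net implementation; the grid sizes in `levels` are example values.

```python
import numpy as np

def spatial_pyramid_pool(feature_map, levels=(1, 2, 4)):
    """Max-pool a (C, H, W) region into fixed n x n grids and concatenate.
    Output length = C * sum(n*n for n in levels), independent of H and W.
    Assumes H and W are at least max(levels)."""
    C, H, W = feature_map.shape
    outputs = []
    for n in levels:                                   # pool into an n x n grid
        for i in range(n):
            for j in range(n):
                # window boundaries that tile the region (ceil on the upper edge)
                y0, y1 = (i * H) // n, ((i + 1) * H + n - 1) // n
                x0, x1 = (j * W) // n, ((j + 1) * W + n - 1) // n
                outputs.append(feature_map[:, y0:y1, x0:x1].max(axis=(1, 2)))
    return np.concatenate(outputs)
```

Because the concatenated length is fixed, arbitrarily sized regions can feed fixed-size FC layers, which is exactly the property SPP-net exploits.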


What's wrong with SPP-net?
- Inherits the rest of R-CNN's problems:
  - Ad hoc training objective
  - Training is slow (25 h) and takes a lot of disk space
- Introduces a new problem: cannot update parameters below the SPP layer during training


Agenda recap: next, 2a. SGD Mini-Batch

SGD Mini-Batch Method for RoIs
[Figure slides, adapted from Fast R-CNN, R. Girshick (2015); one slide notes the input size for SPP-net]
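A sketch of the hierarchical mini-batch sampling idea: sample a few images first, then many RoIs from each, so RoIs in one mini-batch share the same conv feature computation. The defaults (2 images per batch, 64 RoIs per image, 25% foreground) follow the Fast R-CNN paper; the `images` data layout and function name are invented for illustration.

```python
import random

def sample_rois_for_minibatch(images, rois_per_image=64, fg_fraction=0.25, seed=0):
    """Hierarchical sampling: for each sampled image, draw rois_per_image RoIs,
    a fg_fraction of them foreground (object) and the rest background.
    images: dict mapping image id -> {'fg': [...], 'bg': [...]} candidate RoIs."""
    rng = random.Random(seed)
    n_fg = int(rois_per_image * fg_fraction)           # e.g. 16 foreground RoIs
    batch = []
    for img_id, cand in images.items():
        fg = rng.sample(cand['fg'], min(n_fg, len(cand['fg'])))
        n_bg = rois_per_image - len(fg)
        bg = rng.sample(cand['bg'], min(n_bg, len(cand['bg'])))
        batch.extend((img_id, roi, 1) for roi in fg)   # label 1 = foreground
        batch.extend((img_id, roi, 0) for roi in bg)   # label 0 = background
    return batch
```

With 2 images this yields a mini-batch of 128 RoIs from only 2 forward passes through the conv layers, instead of 128 independent region crops.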

Agenda recap: next, 2b. New Loss Function

Revised loss function
A single multi-task loss covers both the classification and the bounding box:

L(p, u, t^u, v) = L_cls(p, u) + λ [u ≥ 1] L_loc(t^u, v)

where
- p: predicted RoI classification (per-class probabilities)
- u: true RoI class
- t^u = (t_x, t_y, t_w, t_h): predicted bounding box (for class u)
- v = (v_x, v_y, v_w, v_h): true bounding box
- λ: controls the balance between the two losses


Revised loss function
L_loc uses the smooth L1 loss, which is continuously differentiable:
smooth_L1(x) = 0.5 x^2 if |x| < 1, and |x| - 0.5 otherwise.
It is less sensitive to outliers than the L2 loss used by R-CNN's bounding-box regressors.
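The two pieces fit together as below: log loss on the predicted class, smooth L1 on the box targets, gated off for background RoIs (u = 0). This is a scalar sketch of the paper's Eq. 1 for clarity, not the released implementation (which works on batched tensors).

```python
import math

def smooth_l1(x):
    """Smooth L1: quadratic near zero (continuously differentiable at 0),
    linear for |x| >= 1, so large errors are penalized less than with L2."""
    return 0.5 * x * x if abs(x) < 1 else abs(x) - 0.5

def multitask_loss(p, u, t_u, v, lam=1.0):
    """L(p, u, t^u, v) = -log p[u] + lam * [u >= 1] * sum_i smooth_l1(t^u_i - v_i).
    p: predicted class probabilities; u: true class (0 = background);
    t_u, v: predicted / true box targets (x, y, w, h)."""
    loss_cls = -math.log(p[u])                 # log loss on the true class
    loss_loc = (sum(smooth_l1(ti - vi) for ti, vi in zip(t_u, v))
                if u >= 1 else 0.0)            # box loss only for foreground RoIs
    return loss_cls + lam * loss_loc
```

Note that at |x| = 1 both branches give 0.5, so the loss is continuous, and the gradient is bounded by 1 in magnitude, which helps training stability.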

Agenda recap: next, 3. Fast R-CNN (Architecture)

Fast R-CNN
- Fast test time, like SPP-net
- One network, trained in one stage
- Higher mean average precision than slow R-CNN and SPP-net
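Architecturally, Fast R-CNN simplifies the SPP pyramid to a single-level RoI pooling layer: each proposal is projected onto the shared conv feature map and max-pooled into one fixed grid. A sketch of that idea, assuming the projected RoI spans at least out_size cells in each dimension; the 7x7 grid and 1/16 spatial scale are typical VGG16-style values, not the exact released code.

```python
import numpy as np

def roi_pool(feature_map, roi, out_size=7, spatial_scale=1 / 16):
    """Project one RoI (image coordinates) onto the conv feature map and
    max-pool it into a single out_size x out_size grid.
    feature_map: (C, H, W); roi: (x0, y0, x1, y1) in image coordinates."""
    C, H, W = feature_map.shape
    x0, y0, x1, y1 = [int(round(c * spatial_scale)) for c in roi]
    region = feature_map[:, y0:y1, x0:x1]
    h, w = region.shape[1], region.shape[2]
    out = np.empty((C, out_size, out_size))
    for i in range(out_size):
        for j in range(out_size):
            # window boundaries that tile the projected region
            ys = slice((i * h) // out_size, ((i + 1) * h + out_size - 1) // out_size)
            xs = slice((j * w) // out_size, ((j + 1) * w + out_size - 1) // out_size)
            out[:, i, j] = region[:, ys, xs].max(axis=(1, 2))
    return out
```

Because the output grid is fixed, every RoI produces the same-shape feature that the FC layers expect, and (unlike SPP-net) gradients flow back through this layer to the conv layers during training.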

[Figure slides, adapted from Fast R-CNN, R. Girshick (2015)]

Agenda recap: next, 3b. Results & Future Work

[Figure slides, adapted from Fast R-CNN, R. Girshick (2015)]

What's still wrong?
- Out-of-network region proposals: selective search takes 2 s / image; EdgeBoxes, 0.2 s / image
- Fortunately, this has already been solved: S. Ren, K. He, R. Girshick & J. Sun, "Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks," NIPS 2015.

Fast R-CNN take-aways
- End-to-end training of deep ConvNets for object detection
- Fast training times
- Open source for easy experimentation
- A large number of ImageNet detection and COCO detection methods are built on Fast R-CNN
