Convolutional Neural Nets II Hands On


Convolutional Neural Nets II: Hands On
Oliver Dürr
Datalab-Lunch Seminar Series
Winterthur, April 22nd, 2015

Outline
- Motivation for CNNs (focus here on image classification)
- Frameworks: Caffe / Lasagne
- Recap MLP / Demo MLP
- Recap CNN / Demo CNN
- Some tricks: Dropout, training and test augmentation (learning symmetries)
- Demo CNN with augmentation
Code for demos: https://github.com/oduerr/dl_tutorial

History, milestones of CNNs
- 1980 Kunihiko Fukushima: introduction
- 1998 LeCun: backpropagation
- Many contests won: 2011 & 2014 MNIST handwritten digits, 201X Chinese handwritten characters, 2011 German traffic signs
- ImageNet success story: AlexNet (2012), winning solution of ImageNet

ImageNet 2012, 2013, 2014
Some examples with AlexNet results; 1000 classes.
[Figure: results 2010-2014] 2012 winner "SuperVision" (AlexNet, 7 layers deep); OxfordNet up to 19 layers; GoogLeNet 6.7% error.
Source: http://cs.nyu.edu/~fergus/presentations/nips2013 ...

A really convincing fact
Kaggle Plankton Competition (2015) [leaderboard figure]. "There is another bold one."

Frameworks

Overview of frameworks
Disclaimer: "This is a fast changing field. The list is not exclusive."
Survey from the participants of the Kaggle Challenge (Feb 2015).
Most mentioned:
- lasagne / nolearn: Python, based on Theano, very flexible (the winning team "Deep Sea" used it)
- caffe: C++ based library, python bindings, convenience functions, many existing / pretrained models
Also used:
- Theano (plain vanilla): symbolic computation of gradients and construction of numerical C code
- Torch (Lua)

Caffe vs. Lasagne / nolearn

Caffe:
- C++ library with python bindings
- Settings (network layout and others) via files
- Documentation: poor; feels like flying blind
- Up-to-date components: AlexNet, GoogLeNet
- Input: images or "strange" DBs
- Data augmentation: not possible from python
- Predefined models available

Lasagne / nolearn:
- Lasagne: python, using Theano (a library to define, optimize, and evaluate mathematical expressions on the GPU)
- nolearn is a wrapper around lasagne that provides an interface similar to scikit-learn
- Documentation: poor, but "use the source, Luke"
- Feels like you understand / control the bells and whistles
- Custom components possible (provided Theano has the functionality)
- Input: any numpy arrays
- Data augmentation: easy
- No predefined models (yet)

We will focus on lasagne.

Links for Caffe
We will focus on Lasagne, but here are some links to Caffe for reference (thanks to Gabriel):
- .../python/FaceCaffe
- .../slides/Caffe/caffe_tutorial.pdf
- https://docs.google.com/presentation/...

Recap Neural Nets

Recap Neural Networks: Basic Unit (n-dimensional logistic regression)
Activation function, a.k.a. nonlinearity:
- logistic: f(z) = \exp(z) / (1 + \exp(z))
- ReLU: f(z) = \max(0, z)
with z = x_1 W_1 + x_2 W_2 + W_3 = \theta^T x.
Motivation (figure, source: AlexNet, Krizhevsky et al. 2012): green = logistic regression, red = ReLU; ReLU converges faster.
For a more detailed explanation see: https://home.zhaw.ch/~dueo/bbs/files/ConvNets_17_Dec_1.pdf
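A minimal numpy sketch (not from the slides; weight values are made up) of this basic unit, computing z = theta^T x and applying either nonlinearity:

    import numpy as np

    def logistic(z):
        # f(z) = exp(z) / (1 + exp(z)) = 1 / (1 + exp(-z))
        return 1.0 / (1.0 + np.exp(-z))

    def relu(z):
        # f(z) = max(0, z)
        return np.maximum(0.0, z)

    x = np.array([0.5, -1.2, 1.0])       # inputs; the trailing 1.0 acts as the bias input
    theta = np.array([0.8, 0.3, -0.1])   # weights W1, W2, W3 (illustrative values)
    z = theta @ x                        # z = x1*W1 + x2*W2 + W3
    print(logistic(z), relu(z))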

Recap Neural Networks: Stacking things together
Output (softmax): f(z_i) = e^{z_i} / \sum_{j=1}^{N} e^{z_j}
The network contains many weights W^{(l)}_{ij}. It is just a complex function of these weights \theta = { W^{(l)}_{ij} } and the input, predicting the probability of a class given the input image X.
For a more detailed explanation see: https://home.zhaw.ch/~dueo/bbs/files/ConvNets_17_Dec_1.pdf
Figure taken from: .../MultiLayerNeuralNetworks/
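A short numpy sketch (illustrative, not from the slides) of the softmax turning the scores z_i of the last layer into class probabilities:

    import numpy as np

    def softmax(z):
        # subtract the max for numerical stability; the result is unchanged
        e = np.exp(z - np.max(z))
        return e / e.sum()

    z = np.array([2.0, 1.0, 0.1])   # made-up scores for three classes
    p = softmax(z)
    print(p, p.sum())               # class probabilities, summing to 1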

Recap Neural Networks: Training the NN
- Use the training data j = 1, ..., N_train to optimize a cost function J that is sensitive to misclassification.
- Usually a subset of size n (mini-batch) of the training data is taken for one optimization step: J(\theta) = (1/n) \sum_{i=1}^{n} cost of training example X_i (n = mini-batch size).
- The cost function is motivated by maximum likelihood.
- The optimal weights are found over many iterations of gradient descent (\alpha = learning rate): \theta_i \leftarrow \theta_i - \alpha \, \partial J(\theta) / \partial \theta_i
- Backpropagation (a.k.a. the chain rule) is used to calculate the gradient.
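A sketch of one mini-batch gradient-descent step in numpy. The function grad_J (a stand-in for what backpropagation would compute) and the toy quadratic cost are assumptions for illustration:

    import numpy as np

    def sgd_step(theta, batch, grad_J, alpha=0.01):
        # theta_i <- theta_i - alpha * dJ/dtheta_i, evaluated on one mini-batch
        return theta - alpha * grad_J(theta, batch)

    # toy usage with the cost J(theta) = mean((x - theta)^2)
    grad_J = lambda theta, batch: 2 * np.mean(theta - batch)
    theta = 5.0
    batch = np.random.randn(128)    # mini-batch of size n = 128
    theta = sgd_step(theta, batch, grad_J)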

Illustration of Gradient Descent
[Figure: contour plot of J(\theta) over two parameters \theta_1, \theta_2 (just two out of millions), with the update \theta_i \leftarrow \theta_i - \alpha \, \partial J(\theta) / \partial \theta_i.]

Demo MLP

Definition of Data / Network (MLP)
Images of 28x28 pixels give 784 input nodes, followed by hidden layers with 500 and 50 nodes and 10 output nodes (in reality many more nodes are used).
Now for the demo.
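A hedged sketch of how such a 784-500-50-10 MLP might be declared with nolearn/lasagne; the layer sizes are read off the slide, the optimizer settings are illustrative and not the exact demo code from the repository:

    from lasagne import layers
    from lasagne.nonlinearities import softmax
    from nolearn.lasagne import NeuralNet

    mlp = NeuralNet(
        layers=[
            ('input',   layers.InputLayer),
            ('hidden1', layers.DenseLayer),
            ('hidden2', layers.DenseLayer),
            ('output',  layers.DenseLayer),
        ],
        input_shape=(None, 784),          # 28x28 images, flattened
        hidden1_num_units=500,
        hidden2_num_units=50,
        output_num_units=10,              # 10 digit classes
        output_nonlinearity=softmax,
        update_learning_rate=0.01,
        update_momentum=0.9,
        max_epochs=10,
    )
    # mlp.fit(X_train, y_train) with X_train of shape (N, 784) and integer labels 0..9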

Too many weights!
But we want many layers. Remedy:
- weight sharing → convolution
- sparse connectivity → pooling
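A quick back-of-the-envelope comparison (numbers taken from the 28x28 MLP above and the LeNet example below) of why full connectivity blows up and what sharing a small kernel saves:

    # fully connected: each of the 784 input pixels connects to each of 500 hidden units
    fc_weights = 28 * 28 * 500          # = 392,000 weights for a single layer
    # convolutional: 20 kernels of 5x5 weights, shared across the whole image
    conv_weights = 20 * 5 * 5           # = 500 weights (plus biases)
    print(fc_weights, conv_weights)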

The convolutional layer, ingredient I: convolution
What is a convolution? The 9 weights W_ij are called the kernel. The weights are not fixed, they are learned!
Gimp documentation: http://docs.gimp.org/en/plug-in-convmatrix.html

The convolutional layer, ingredient I: convolution
The same weights are slid over the image.
Illustration: http://deeplearning.stanford.edu/tutorial/

Example of a kernel: an edge-enhance filter.
But again: the weights are not fixed, they are learned!
Gimp documentation: http://docs.gimp.org/en/plug-in-convmatrix.html
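A small scipy sketch (illustrative, not from the slides) applying a fixed 3x3 edge-enhancing kernel to an image; this is the same sliding-window operation a conv layer performs, except that in a CNN the nine weights would be learned:

    import numpy as np
    from scipy.signal import convolve2d

    # a hand-crafted edge-enhance style kernel (in a CNN these weights are learned)
    kernel = np.array([[ 0, -1,  0],
                       [-1,  5, -1],
                       [ 0, -1,  0]])

    image = np.random.rand(28, 28)              # stand-in for a grayscale input image
    feature_map = convolve2d(image, kernel, mode='valid')
    print(feature_map.shape)                    # (26, 26): 'valid' convolution shrinks the image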

The convolutional layer, ingredient II: max-pooling
Simply join e.g. 2x2 adjacent pixels into one (there are also sliding-window versions).
Hinton: "The pooling operation used in convolutional neural networks is a big mistake and the fact that it works so well is a disaster."
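A minimal numpy sketch of the simple non-overlapping 2x2 variant described above (not the sliding-window one):

    import numpy as np

    def max_pool_2x2(x):
        # x has shape (H, W) with H, W even; each output pixel is the max of one 2x2 block
        h, w = x.shape
        return x.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

    fm = np.arange(16).reshape(4, 4)    # toy 4x4 feature map
    print(max_pool_2x2(fm))             # [[ 5  7], [13 15]]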

A simple version of the CNN (LeNet5 architecture)
Conv (20 kernels of 5x5 weights to go from one layer to the next) → Max Pool → Conv → Max Pool → Fully Connected → Multinomial Logistic Regression

A typical recent architecture (AlexNet, 2012)
Seminal paper; brought the ImageNet error from 26.2% down to 16.5%. Key ingredients:
- Dropout (see below)
- ReLU instead of sigmoid
- Parallelisation on many GPUs
- Local Response Normalization (not widely used nowadays)
The figure is a bit of a simplification, since AlexNet is built for 2 GPUs and uses normalization. Caffe code from here.

Winning architecture (GoogLeNet, 2014)
The inception module (convolutions and max-pooling in parallel). Few parameters, but quite hard to train.
"Going deeper with convolutions", http://arxiv.org/abs/1409.4842 (its Figure 3 shows the GoogLeNet network with all the bells and whistles); comments see here.

A typical very recent architecture ("Oxford Net"(s), 2014)
- Small pooling
- More than one conv layer before max-pooling
- No strides (stride 1)
- ReLU after conv and FC layers
- More traditional, easier to [...]; more weights than GoogLeNet
- ImageNet Challenge 2014: 2nd place in classification (Caffe [...])

A typical very recent architecture ("Oxford Net"(s), 2014)
Definition of the 16-layer variant taken from: ...7b538e2d8#file-readme-md (gist)

Demo Lasagne II (Convolutional)

Demo: Nolearn/Lasagne (LeNet architecture)
(Conv2DLayer and DenseLayer have the ReLU nonlinearity by default.)

    from lasagne import layers
    from lasagne.nonlinearities import softmax
    from nolearn.lasagne import NeuralNet

    PIXELS = 28  # input images are 28x28

    net = NeuralNet(
        layers=[
            ('input',   layers.InputLayer),
            ('conv1',   layers.Conv2DLayer),
            ('pool1',   layers.MaxPool2DLayer),
            ('conv2',   layers.Conv2DLayer),
            ('pool2',   layers.MaxPool2DLayer),
            ('hidden4', layers.DenseLayer),
            ('output',  layers.DenseLayer),
        ],
        input_shape=(None, 1, PIXELS, PIXELS),
        conv1_num_filters=32, conv1_filter_size=(3, 3), pool1_ds=(2, 2),
        conv2_num_filters=64, conv2_filter_size=(2, 2), pool2_ds=(2, 2),
        hidden4_num_units=500,
        output_num_units=10, output_nonlinearity=softmax,
        # plus training parameters (learning rate, momentum, max_epochs, ...) not shown on the slide
    )

Image taken from: Master Thesis, Christopher Mitchell (.pdf)

(Some) tricks of the trade

Data Augmentation
- Create "new" training data by label-preserving transformations.
- Force invariance under translational symmetries (translation, rotation, ...).
[Figure: original vs. augmented plankton images]
Taken from the winning solution of the plankton challenge: http://benanne.github.io/2015/03/17/plankton.html
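A hedged sketch of such a label-preserving augmentation step in numpy/scipy (random rotation plus an occasional horizontal flip, applied on the fly to a mini-batch); the exact transformations used in the demo and in the plankton solution differ:

    import numpy as np
    from scipy.ndimage import rotate

    def augment(batch, max_angle=20):
        # batch has shape (n, H, W); each image gets a random rotation and maybe a flip
        out = np.empty_like(batch)
        for i, img in enumerate(batch):
            angle = np.random.uniform(-max_angle, max_angle)
            img = rotate(img, angle, reshape=False, mode='nearest')
            if np.random.rand() < 0.5:
                img = np.fliplr(img)     # label-preserving for plankton, not for digits!
            out[i] = img
        return out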

Dropout
At each mini-batch during training, randomly remove ("drop out") nodes; at test time the full network is used.
Idea: averaging over many different configurations (exact in the linear case). Typically a 10% performance increase.
Srivastava et al., Journal of Machine Learning Research 15 (2014) 1929-1958
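A minimal numpy sketch of dropout in the formulation of Srivastava et al.: units are dropped with a random mask at training time and activations are rescaled by the keep probability at test time. Most modern implementations instead rescale during training (the "inverted" variant); this is only an illustration:

    import numpy as np

    def dropout(a, p=0.5, train=True):
        # a: activations of one layer; p: probability of keeping a unit
        if train:
            mask = np.random.rand(*a.shape) < p   # a fresh random mask for every mini-batch
            return a * mask
        return a * p                              # test time: keep all units, rescale by p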

Demo Lasagne III (Convolutional with Training Data Augmentation)

Attic

Taking a closer look: Convolution
The convolutional layer is finished with a nonlinearity. Possible choices: identity (nothing, linear), rectify (default), tanh, softmax (good for the last layer), sigmoid.
Different output sizes, no padding / padding via conv1_border_mode:

    conv1_border_mode='valid'   # (None, 1, 28, 28) -> (None, 32, 26, 26)
    conv1_border_mode='same'    # (None, 1, 28, 28) -> (None, 32, 28, 28)
    conv1_border_mode='full'    # (None, 1, 28, 28) -> (None, 32, 30, 30)
    conv1_strides=(2, 2)        # (None, 1, 28, 28) -> (None, 32, 13, 13); stride (step size) defaults to (1, 1)

Observation: compiling seems to take quite a while.
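The output sizes above follow the usual shape formula; a tiny helper (illustrative, not part of the demo code) makes the arithmetic explicit:

    def conv_output_size(in_size, filter_size, pad=0, stride=1):
        # 'valid': pad=0; 'same': pad=(filter_size-1)//2; 'full': pad=filter_size-1
        return (in_size + 2 * pad - filter_size) // stride + 1

    print(conv_output_size(28, 3, pad=0))            # 26, matches border_mode='valid'
    print(conv_output_size(28, 3, pad=1))            # 28, matches border_mode='same'
    print(conv_output_size(28, 3, pad=2))            # 30, matches border_mode='full'
    print(conv_output_size(28, 3, pad=0, stride=2))  # 13, matches strides=(2, 2)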

Cost Functions: ReLU
Six times faster convergence than the traditional approach (sigmoid/tanh).
Intuition: during backpropagation the ReLU gradient does not saturate (it is 1 for positive inputs), unlike the sigmoid.
Source: Krizhevsky et al. 2012

General Notes on Optimization
- Gradient descent (only first order)
- Newton's method: 2nd-order Taylor expansion, using the Hessian
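For reference, the two update rules side by side (standard textbook forms, not taken from the slides):

    \theta \leftarrow \theta - \alpha \, \nabla_\theta J(\theta) \qquad \text{(gradient descent, first order)}

    \theta \leftarrow \theta - H^{-1} \nabla_\theta J(\theta), \qquad H_{ij} = \frac{\partial^2 J}{\partial \theta_i \, \partial \theta_j} \qquad \text{(Newton, second order)}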

Taking a closer look: Learning Rate & Momentum
Nice description: Caffe tutorial. Nice visualization: ...torial/
Problem with (stochastic) gradient descent: valleys. You bounce up and down the walls and don't descend the slope.
Solutions: momentum, Nesterov momentum (NAG). Set mu = 0 and we recover (S)GD.
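A numpy sketch (illustrative) of the classical momentum update described above; with mu = 0 it reduces to plain (stochastic) gradient descent:

    import numpy as np

    def momentum_step(theta, v, grad, lr=0.01, mu=0.9):
        # v accumulates an exponentially decaying average of past gradients
        v = mu * v - lr * grad
        return theta + v, v

    theta = np.zeros(3)
    v = np.zeros(3)
    grad = np.array([0.2, -0.1, 0.05])   # pretend gradient from backprop
    theta, v = momentum_step(theta, v, grad)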

Taking a closer look: Learning Rate & Momentum
- AdaGrad: uses all historic gradient information
- Hessian-free optimization

