Lecture 4 Fundamentals Of Deep Learning And Neural Networks


Lecture 4: Fundamentals of deep learning and neural networks
Serena Yeung
BIODS 388

Deep learning: Machine learning models based on “deep” neural networks comprising millions (sometimes billions) of parameters organized into hierarchical layers. Features are multiplied and added together repeatedly, with the outputs from one layer of parameters being fed into the next layer, before a prediction is made.

Contrast with linear regression:

Agenda for today
- More on the structure of neural network models
- Machine learning training loop and concept of loss, in the context of neural networks
- Minimizing the loss for complex neural networks: gradient descent and backpropagation

Let’s start by considering again logistic regression, for binary classification.

Nonlinear squashing to (0,1) with sigmoid nonlinearity

Also commonly used in modern neural networks!
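The sigmoid squashing described above can be sketched in a few lines (a minimal illustration, not code from the slides):

```python
import math

def sigmoid(z):
    # Squash any real-valued input into the interval (0, 1)
    return 1.0 / (1.0 + math.exp(-z))

sigmoid(0.0)    # 0.5: no evidence either way
sigmoid(10.0)   # very close to 1
sigmoid(-10.0)  # very close to 0
```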

The logistic regression with sigmoid that we just saw can be considered as a single “neuron” model:
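As a sketch (illustrative, not the lecture’s exact notation), a single sigmoid “neuron” is just a weighted sum of the inputs plus a bias, squashed through the sigmoid — exactly logistic regression:

```python
import math

def neuron(x, w, b):
    # Weighted sum of inputs plus bias, then sigmoid nonlinearity:
    # this single "neuron" is logistic regression on x
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))

neuron([1.0, 2.0], [0.0, 0.0], 0.0)  # 0.5: zero weights give no preference
```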

A layer of a neural network consists of a set of neurons that each take the same input!
Note: each neuron will have its own set of parameters that it learns, which will produce different outputs.
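A layer can be sketched as a list of such neurons applied to the same input, each with its own parameters (a minimal illustration):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def layer(x, weights, biases):
    # Every neuron sees the same input x, but each has its own weight
    # row and bias, so each produces a different output
    return [sigmoid(sum(w * xi for w, xi in zip(wrow, x)) + b)
            for wrow, b in zip(weights, biases)]

out = layer([1.0, -1.0], [[0.5, 0.5], [2.0, 0.0]], [0.0, 0.0])
# Same input, two different activations: [0.5, sigmoid(2) ≈ 0.88]
```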

Concatenate the multiple outputs from a layer of a neural network to be the input to the next layer.

Represents an increasingly complex (and hierarchical) function that is being computed!

Fully connected layer: all neurons in the layer take as input the full input to the layer (also called dense layer or linear layer).
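Stacking fully connected layers, with each layer’s outputs concatenated as the next layer’s input, can be sketched like this (sizes and weights chosen only for illustration):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def dense(x, weights, biases):
    # Fully connected (dense/linear) layer: every neuron takes the full input
    return [sigmoid(sum(w * xi for w, xi in zip(wrow, x)) + b)
            for wrow, b in zip(weights, biases)]

x = [1.0, -2.0]
h = dense(x, [[0.5, -0.5], [1.0, 1.0], [0.0, 2.0]], [0.1, 0.0, -0.3])  # 2 inputs -> 3 hidden
y = dense(h, [[1.0, -1.0, 0.5]], [0.0])                                # 3 hidden -> 1 output
```

Each extra layer composes another round of multiply-add-squash, which is what makes the overall function increasingly complex and hierarchical.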

How do we train neural networks to learn good values of the (many) parameters, to accurately map from inputs to desired outputs?

Optimization step

Periodically use the validation set to measure how the model will do “in the real world”. Save a version of the model if it gives the best validation performance seen so far.

Can also run the entire process for different training configurations, or hyperparameters, to choose the best ones. Referred to as “hyperparameter tuning”.
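The training loop above — optimize on training data, periodically check the validation set, keep the best model seen so far — can be sketched end-to-end on a toy one-parameter model (everything here, data included, is invented for illustration):

```python
import random

# Toy data: y ~ 2x plus noise. The "model" is y_hat = w * x, one parameter.
random.seed(0)
train = [(0.1 * i, 2 * (0.1 * i) + random.gauss(0, 0.1)) for i in range(20)]
val = [(0.05, 0.1), (0.55, 1.1), (1.05, 2.1)]

def mse(w, data):
    return sum((w * x - y) ** 2 for x, y in data) / len(data)

w, lr = 0.0, 0.05
best_w, best_val = w, mse(w, val)
for step in range(200):
    # Optimization step: move w against the gradient of the training loss
    grad = sum(2 * (w * x - y) * x for x, y in train) / len(train)
    w -= lr * grad
    if step % 10 == 0:
        # Periodically measure validation performance ("the real world")
        v = mse(w, val)
        if v < best_val:
            # Save the best model seen so far
            best_w, best_val = w, v
```

Hyperparameter tuning would repeat this whole loop for different choices of `lr` (and, for real networks, layer sizes, etc.) and keep the configuration with the best validation score.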

Agenda
- More on the structure of neural network models
- Machine learning training loop and concept of loss, in the context of neural networks
- Minimizing the loss for complex neural networks: gradient descent and backpropagation

Cross-entropy loss: 0.51

Cross-entropy loss: 0.15
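Binary cross-entropy can be computed directly; loss values of roughly this size arise from moderately vs. highly confident correct predictions (a minimal sketch — the specific probabilities are chosen for illustration, not taken from the slides):

```python
import math

def binary_cross_entropy(p, y):
    # p: predicted probability of the positive class; y: true label in {0, 1}
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

binary_cross_entropy(0.60, 1)  # ~0.51: mildly confident, correct
binary_cross_entropy(0.86, 1)  # ~0.15: more confident, correct -> lower loss
```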

Agenda
- More on the structure of neural network models
- Machine learning training loop and concept of loss, in the context of neural networks
- Minimizing the loss for complex neural networks: gradient descent and backpropagation

How can we find “good” values of many parameters?

One option: Try all combinations of possible weights and test how good each one is. But this would take forever, since there are infinitely many possibilities and no indication of how best to adjust.

Instead: the trick is that we need to have some idea of which “direction” to adjust the weights to reduce the loss function.

Analogy: the game of Marco Polo!
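The “direction” intuition can be sketched on a one-parameter loss: the derivative tells us which way is “warmer”, so we step the other way (a toy illustration):

```python
def loss(w):
    return (w - 3.0) ** 2     # minimized at w = 3

def grad(w):
    return 2.0 * (w - 3.0)    # derivative: its sign says which way the loss increases

w = 0.0
for _ in range(100):
    w -= 0.1 * grad(w)        # gradient descent: step opposite the gradient
# w is now very close to 3, with no exhaustive search needed
```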

Backpropagation: mathematical technique that breaks down a complex gradient computation into local gradient computations that are then combined together. Secret sauce for allowing us to obtain gradients for large neural network models! (with the help of graphics processing units, or GPUs)
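The local-gradients idea can be sketched on a single sigmoid neuron: compute each node’s local derivative on the backward pass, then multiply them along the chain (the squared loss and the specific values are illustrative choices, not from the slides):

```python
import math

# Forward pass: z = w*x + b, p = sigmoid(z), L = (p - y)**2
w, x, b, y = 0.5, 2.0, -1.0, 1.0
z = w * x + b                      # 0.0
p = 1.0 / (1.0 + math.exp(-z))     # 0.5
L = (p - y) ** 2                   # 0.25

# Backward pass: combine local gradients with the chain rule
dL_dp = 2.0 * (p - y)              # local gradient of the loss:   -1.0
dp_dz = p * (1.0 - p)              # local gradient of the sigmoid: 0.25
dL_dz = dL_dp * dp_dz              # -0.25
dL_dw = dL_dz * x                  # -0.5: how L changes per unit change in w
dL_db = dL_dz                      # -0.25
```

Backpropagation applies exactly this recipe node-by-node through a network of millions of parameters, reusing each intermediate gradient instead of recomputing from scratch.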

Now that we have a deeper understanding of neural networks, let’s look at how they work for common types of input data.

Some case studies of convolutional neural networks.

Gulshan et al. 2016
Task: Binary classification of referable diabetic retinopathy from retinal fundus photographs
Input: Retinal fundus photographs
Output: Binary classification of referable diabetic retinopathy (y in {0,1})
- Defined as moderate and worse diabetic retinopathy, referable diabetic macular edema, or both
Gulshan, et al. Development and Validation of a Deep Learning Algorithm for Detection of Diabetic Retinopathy in Retinal Fundus Photographs. JAMA, 2016.

Gulshan et al. 2016
Dataset:
- 128,175 images, each graded by 3-7 ophthalmologists.
- 54 total graders, each paid to grade between 20 and 62,508 images.
Data preprocessing:
- Circular mask of each image was detected and rescaled to be 299 pixels wide
Model:
- Inception-v3 CNN, with ImageNet pre-training
- Multiple binary cross-entropy losses corresponding to different binary prediction problems, which were then used for final determination of referable diabetic retinopathy
Note: pre-training means training first on a different (usually larger) dataset to learn generally useful visual features as a starting point.
Note: graders provided finer-grained labels which were then consolidated into (easier) binary prediction problems.
Gulshan, et al. Development and Validation of a Deep Learning Algorithm for Detection of Diabetic Retinopathy in Retinal Fundus Photographs. JAMA, 2016.

Gulshan et al. 2016
Results:
- Evaluated using ROC curves, AUC, sensitivity and specificity analysis
Gulshan, et al. Development and Validation of a Deep Learning Algorithm for Detection of Diabetic Retinopathy in Retinal Fundus Photographs. JAMA, 2016.

Gulshan et al. 2016
AUC 0.991
Looked at different operating points:
- High-specificity point approximated ophthalmologist specificity for comparison. Should also use high specificity to make decisions about high-risk actions.
- High-sensitivity point should be used for screening applications.
Gulshan, et al. Development and Validation of a Deep Learning Algorithm for Detection of Diabetic Retinopathy in Retinal Fundus Photographs. JAMA, 2016.

Gulshan et al. 2016
Q: What could explain the difference in trends for reducing # grades / image on training set vs. tuning set, on tuning set performance?
Gulshan, et al. Development and Validation of a Deep Learning Algorithm for Detection of Diabetic Retinopathy in Retinal Fundus Photographs. JAMA, 2016.

Esteva et al. 2017
- Two binary classification tasks on dermatology images: malignant vs. benign lesions of epidermal or melanocytic origin
- Inception-v3 (GoogLeNet) CNN with ImageNet pre-training
- Fine-tuned on dataset of 129,450 lesions (from several sources) comprising 2,032 diseases
- Evaluated model vs. 21 or more dermatologists in various settings
Esteva*, Kuprel*, et al. Dermatologist-level classification of skin cancer with deep neural networks. Nature, 2017.

Esteva et al. 2017
- Train on finer-grained classification (757 classes) but perform binary classification at inference time by summing probabilities of fine-grained sub-classes.
- The stronger fine-grained supervision during the training stage improves inference performance!
Esteva*, Kuprel*, et al. Dermatologist-level classification of skin cancer with deep neural networks. Nature, 2017.
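The fine-to-coarse trick can be sketched as follows; the class names, probabilities, and mapping here are invented for illustration and are not taken from the paper:

```python
# Hypothetical fine-grained softmax outputs (probabilities sum to 1)
class_probs = {"melanoma": 0.30, "basal cell carcinoma": 0.25,
               "benign nevus": 0.35, "seborrheic keratosis": 0.10}

# Hypothetical mapping of fine-grained sub-classes to the binary task
malignant = {"melanoma", "basal cell carcinoma"}

# Binary prediction at inference time: sum the sub-class probabilities
p_malignant = sum(p for c, p in class_probs.items() if c in malignant)
p_benign = 1.0 - p_malignant
```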

Esteva et al. 2017
- Evaluation of algorithm vs. dermatologists
Esteva*, Kuprel*, et al. Dermatologist-level classification of skin cancer with deep neural networks. Nature, 2017.

Rajpurkar et al. 2017
- Binary classification of pneumonia presence in chest X-rays
- Used ChestX-ray14 dataset with over 100,000 frontal X-ray images with 14 diseases
- 121-layer DenseNet CNN
- Compared algorithm performance with 4 radiologists
- Also applied algorithm to other diseases to surpass previous state-of-the-art on ChestX-ray14
Rajpurkar et al. CheXNet: Radiologist-Level Pneumonia Detection on Chest X-Rays with Deep Learning. 2017.

McKinney et al. 2020
- Binary classification of breast cancer in mammograms
- International dataset and evaluation, across UK and US
McKinney et al. International evaluation of an AI system for breast cancer screening. Nature, 2020.

Summary
Today we covered:
- Structure of neural network models
- Machine learning training loop and concept of loss, in the context of neural networks
- Minimizing the loss for complex neural networks: gradient descent and backpropagation
- Neural networks for a common type of input data: images (convolutional neural networks)
Next time: more on deep learning models for different types of input data and prediction tasks

