Detecting Time Series Anomalies at Uber Scale with Recurrent Neural Networks


Detecting time series anomalies at Uber scale with recurrent neural networks
Andrea Pasqua, Anny Chen
IDS Team - Uber

Anomaly Detection at Uber: the Business Angle
Our mission: more reliable and safer transportation everywhere, for everyone.

Anomaly Detection at Uber: the Business Angle
An important component: reliability of the app.

Anomaly Detection at Uber: the Business Angle
Uber's app is different:
- Nobody is "just browsing"
- Unusually high cost of outages: transactions are permanently lost
- Costs magnified by the scale of the business

Anomaly Detection at Uber: the Business Angle
Great opportunity for cost saving:
- About 8M saved last year
- Through intelligent, automated on-call alerting
- A conservative estimate

The Scale of the Problem
What does it take to ensure a reliable app?

The Scale of the Problem
An ecosystem of microservices.

The Scale of the Problem
An ecosystem of microservices:
- Each service has multiple traces to monitor
- Powerful combinatorics: x geo x product
- More than 1 billion traces

The Scale of the Problem
Compounded challenges: high cardinality.

The Scale of the Problem
Compounded challenges: a wide variety of patterns, and others.

The Scale of the Problem
Compounded challenges: speed of detection.
- 1-minute granularity in most situations, and whenever possible

Our Solution
The nature of the problem calls for a Machine Learning Platform.
Rationale:
- Data-rich situation
- Complex patterns
- Interrelated inputs
- Necessity of automation and speed

Anomaly Detection Platform
At the core, the platform implements a stream of binary classifiers.
- Inputs: detection window, history, additional inputs, and meta-information (e.g. events)
- Output: is it anomalous? Yes or no

Anomaly Detection Stack
Some models are indeed waveform binary classifiers:
- Backward looking
- Good for new traces
- Do not rely on meta-information

Anomaly Detection Stack
But most carry out a density forecast behind the scenes:
- Learn from the past
- Forecast our expectations and our uncertainty
- Compare with the actuals; actuals falling outside the forecast band are flagged as an anomaly

Anomaly Detection Stack
Two types of forecasting models, distinguished by type of input and by how they learn:
- Single time-series models: trained online
- Models that learn across multiple time series: training is slower

The Serving Layer
Even when the models require extensive training, serving needs to be rapid.
A Golang serving layer, for speed and maximum integration with Uber's stack.

Review of Forecasting Methods
Many methodologies for time series forecasting:
- Traditional models:
  - Exponential smoothing family: exponential smoothing, Holt-Winters
  - Decomposition-based models: Moving Average (MA), Autoregression (AR), ARMA, etc.
- Theta method
- Spline regression
- Prophet
- Proprietary models
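As a minimal sketch of the first family above, simple exponential smoothing can be written in a few lines. The smoothing factor `alpha` and the sample series below are illustrative, not values from the talk.

```python
# Simple exponential smoothing: a minimal sketch of one of the
# traditional methods listed above. `alpha` and the data are
# illustrative assumptions.

def exp_smooth_forecast(series, alpha=0.5):
    """Return the one-step-ahead forecast after smoothing `series`.

    level_t = alpha * y_t + (1 - alpha) * level_{t-1}
    """
    level = series[0]                      # initialize with the first value
    for y in series[1:]:
        level = alpha * y + (1 - alpha) * level
    return level                           # forecast for the next step

series = [10.0, 12.0, 11.0, 13.0]
print(exp_smooth_forecast(series, alpha=0.5))   # prints 12.0
```

Higher `alpha` weights recent observations more heavily; Holt-Winters extends this recursion with trend and seasonal terms.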

Forecasting with Neural Networks
Use recurrent neural network forecasting:
- Capable of dealing with huge amounts of data
- Has some memory of the past
- Not just univariate; can make use of other features
- Neural networks can adopt many model shapes

Recurrent Neural Networks
- Inputs are sequential: applies to cases like language processing, time series, etc.
- Model has some memory of the past: remembers previous look-back steps
Plots from: works-tutorial-part-1-introduction-to-rnns/

Recurrent Neural Networks
Long short-term memory (LSTM) cell, a special RNN cell:
- Capable of learning long-term dependencies
- Solves the vanishing gradient problem
Plots from: -LSTMs/

Inside the LSTM Cell
Three gates:
- Forget gate
- Input gate
- Output gate
Can accommodate both long- and short-term memory: selective memory.
Plots from: -LSTMs/
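To make the three gates concrete, here is one step of a scalar (one-unit) LSTM cell in plain Python. All weights and inputs are made-up illustrative numbers, not parameters from Uber's model.

```python
import math

# One step of a scalar LSTM cell, to make the three gates concrete.
# Weights and inputs are illustrative, not from the trained model.

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(x, h_prev, c_prev, W):
    """W maps gate name -> (w_x, w_h, b) for a 1-unit cell."""
    f = sigmoid(W["f"][0] * x + W["f"][1] * h_prev + W["f"][2])    # forget gate
    i = sigmoid(W["i"][0] * x + W["i"][1] * h_prev + W["i"][2])    # input gate
    o = sigmoid(W["o"][0] * x + W["o"][1] * h_prev + W["o"][2])    # output gate
    g = math.tanh(W["g"][0] * x + W["g"][1] * h_prev + W["g"][2])  # candidate
    c = f * c_prev + i * g      # long-term memory: selectively kept and updated
    h = o * math.tanh(c)        # short-term memory / output
    return h, c

W = {k: (0.5, 0.5, 0.0) for k in ("f", "i", "o", "g")}
h, c = lstm_step(x=1.0, h_prev=0.0, c_prev=0.0, W=W)
print(h, c)
```

The forget gate decides how much of the old cell state `c_prev` to keep, the input gate how much of the new candidate to write, and the output gate how much of the cell state to expose as `h`; this additive update of `c` is what avoids the vanishing gradient.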

Forecasting with Recurrent Neural Networks
Model:
- Two LSTM layers and one dense layer
- Window-wide scaling of input and output
- Adam optimization
- Minimizing absolute error instead of squared error
- Decaying learning rate

Scaling Inputs and Outputs
Window-wide scaling of input and output:
- Min-max range scaling
- Computed over a single window, rather than over the entire time series

Learning Rate
Decaying learning rate:
- Decay by epoch
- Learning rate becomes constant after 100 epochs
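The talk states only the decay-by-epoch shape and the 100-epoch cutoff; the base rate and decay factor below are illustrative assumptions, with exponential decay as one plausible form.

```python
# A decaying learning rate held constant after 100 epochs. The base
# rate, decay factor, and exponential form are assumptions; the talk
# only states the per-epoch decay and the 100-epoch cutoff.

def learning_rate(epoch, base_lr=0.01, decay=0.97, cutoff=100):
    """Exponential decay per epoch, frozen once `cutoff` is reached."""
    return base_lr * decay ** min(epoch, cutoff)

print(learning_rate(0))                          # prints 0.01
print(learning_rate(100) == learning_rate(500))  # prints True
```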

Training Input and Output
V0:
- Input: multiple time series (time series of different topics, at minute granularity, treated as different samples)
- Features: last 30 minutes
- Output: next 30 minutes
V1:
- Input: same as V0 except for the look back
- Features: last 30 minutes; last 30 minutes one day before; last 30 minutes last week at the same time as the prediction window
- Output: next 30 minutes
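The construction of (input, target) pairs from one trace can be sketched as a sliding window; the window sizes are shrunk to 3 here for brevity (they are 30 minutes in the V0 setup above).

```python
# Building (features, target) training pairs from one time series:
# a trailing look-back window as input and the following horizon as
# target, as in the V0 setup. Sizes shrunk to 3 for brevity.

def make_samples(series, look_back=3, horizon=3):
    samples = []
    for t in range(len(series) - look_back - horizon + 1):
        x = series[t : t + look_back]                        # e.g. last 30 min
        y = series[t + look_back : t + look_back + horizon]  # next 30 min
        samples.append((x, y))
    return samples

series = list(range(8))        # stand-in for a 1-minute-granularity trace
pairs = make_samples(series)
print(len(pairs))              # prints 3
print(pairs[0])                # prints ([0, 1, 2], [3, 4, 5])
```

In the multi-series setting, pairs built this way from every trace are pooled and treated as different samples of one training set.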

Model Performance
Performance measured out-of-sample; each example predicts 30 minutes.

         wMAPE   sMAPE
Median    7.38    6.60
Mean     25.06   18.02

wMAPE: weighted mean absolute percentage error
sMAPE: symmetric mean absolute percentage error
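The two metrics in the table, written out in their common textbook forms (the exact variants used at Uber may differ slightly); the sample numbers are illustrative.

```python
# The two error metrics from the table, in their common textbook
# forms; exact variants used at Uber may differ.

def wmape(actual, forecast):
    """Weighted MAPE: total absolute error over total actual volume."""
    num = sum(abs(a - f) for a, f in zip(actual, forecast))
    den = sum(abs(a) for a in actual)
    return 100.0 * num / den

def smape(actual, forecast):
    """Symmetric MAPE: error relative to the mean of actual and forecast."""
    terms = [2.0 * abs(a - f) / (abs(a) + abs(f))
             for a, f in zip(actual, forecast)]
    return 100.0 * sum(terms) / len(terms)

actual, forecast = [100.0, 200.0, 300.0], [110.0, 190.0, 330.0]
print(round(wmape(actual, forecast), 2))
print(round(smape(actual, forecast), 2))
```

wMAPE weights errors by volume, so a large miss on a small trace cannot dominate; sMAPE stays bounded even when the actuals approach zero.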

Anomaly Detection using RNN
With forecasting, we still need to:
- Decide on the desired level of confidence
- Estimate the prediction interval at a given confidence level
- Choose the confidence level to adjust sensitivity
Next, let's focus on the prediction interval at a given confidence level.

What's the Prediction Interval?
Prediction intervals quantify prediction uncertainty. What do we mean by uncertainty?
- Model uncertainty: our ignorance of the model parameters
- Inherent noise: the irreducible noise level from the random process
Together these make up the prediction uncertainty.

Prediction Uncertainty
- Model uncertainty
- Inherent noise
Plot from: https://analyse-it.com/docs/220/standard/multiple_linear_regression.htm

Estimating Inherent Noise
What does this noise mean?
- Uncertainty produced even if we know the true underlying distribution
- Example: generate 100 data points as Y = 5 + ε, where ε is Normal(0, 1)
- Even if the model predicts exactly 5 with no model variance, there is still inherent noise in ε
How to estimate the noise?
- One possible way: compute the residual sum of squares (RSS) to estimate the noise
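The toy example above runs in a few lines: draw 100 values of Y = 5 + ε, predict the constant 5, and use the residual sum of squares to recover the noise level.

```python
import random, math

# Estimating the inherent noise as in the toy example above: 100 draws
# of Y = 5 + eps, eps ~ Normal(0, 1), with a model that predicts the
# constant 5. The residual-based estimate recovers sigma.

random.seed(7)
y = [5.0 + random.gauss(0.0, 1.0) for _ in range(100)]
prediction = 5.0                                # the "true" model

rss = sum((v - prediction) ** 2 for v in y)     # residual sum of squares
noise_std = math.sqrt(rss / len(y))             # estimate of sigma

print(round(noise_std, 2))                      # close to the true value 1.0
```

In practice the same computation is done on held-out validation data, with the trained network's forecasts playing the role of `prediction`.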

Estimating Model Uncertainty
Random dropout during serving (illustrated over successive stochastic passes).
Input: X*, dropout probability p, number of repetitions T = 500
Algorithm:
1. Repeat T stochastic feed-forward passes
2. Collect predictions Y1, ..., YT
Output: the sample variance σ²_M (the model uncertainty)
Methodology: Gal (2016), Uncertainty in Deep Learning, PhD Thesis
Plot: http://mlg.eng.cam.ac.uk/yarin/blog_3d801aa532c1ce.html
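The Monte Carlo dropout procedure above can be sketched in plain Python. The tiny fixed "network" and its weights are illustrative stand-ins for the trained LSTM forecaster.

```python
import random, statistics

# Monte Carlo dropout at serving time, following the algorithm above
# (Gal, 2016). The tiny fixed network is an illustrative stand-in
# for the trained LSTM forecaster.

random.seed(0)
WEIGHTS = [0.4, 0.8, -0.3, 0.5]     # one hidden "layer" of 4 units

def stochastic_forward(x, p=0.5):
    """One feed-forward pass with dropout kept on at serving time."""
    kept = [w * x / (1 - p) if random.random() > p else 0.0
            for w in WEIGHTS]       # drop each unit with probability p
    return sum(kept)

def mc_dropout(x, p=0.5, T=500):
    preds = [stochastic_forward(x, p) for _ in range(T)]  # T stochastic passes
    mean = statistics.fmean(preds)
    var = statistics.pvariance(preds)   # sample variance = model uncertainty
    return mean, var

mean, var = mc_dropout(x=1.0)
print(round(mean, 2), round(var, 2))
```

The spread of the T predictions reflects how sensitive the output is to which units are dropped, which is the dropout-based proxy for our ignorance of the model parameters.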

Flow of Forecasting and Uncertainty Estimation
- Step 1 (training data): train the neural network; fit the weights
- Step 2 (validation data): use the trained model to estimate the inherent noise level
- Step 3 (new data): obtain the prediction and the model uncertainty by repeating random dropout; combine the two into the prediction uncertainty
Methodology: Gal (2016), Uncertainty in Deep Learning, PhD Thesis
Plots: http://mlg.eng.cam.ac.uk/yarin/blog_3d801aa532c1ce.html
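Combining the two uncertainty sources from the flow above into an interval can be sketched as follows. Treating the sources as independent so their variances add, and using z = 1.96 for a 95% interval, are standard Gaussian assumptions; the input numbers are illustrative.

```python
import math

# Combining model uncertainty and inherent noise into a prediction
# interval. Independence of the two sources and the Gaussian z-value
# are assumptions; the inputs are illustrative.

def prediction_interval(forecast, model_var, noise_var, z=1.96):
    total_std = math.sqrt(model_var + noise_var)   # independent variances add
    return forecast - z * total_std, forecast + z * total_std

lo, hi = prediction_interval(forecast=100.0, model_var=4.0, noise_var=5.0)
print(round(lo, 2), round(hi, 2))   # prints 94.12 105.88
```

An observed value falling outside [lo, hi] is what the detection layer flags as an anomaly; raising the confidence level widens the band and lowers sensitivity.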

Forecasting Daily Trips with Uncertainty
Input:
- Look back: 28 days
- Features: trip value, holiday info, calendar features
Output: next 5 days
Uber Blog: Engineering Uncertainty Estimation in Neural Networks for Time Series Prediction at Uber

Forecasting Daily Trips with Uncertainty
Prediction with a 95% prediction interval.
Uber Blog: Engineering Uncertainty Estimation in Neural Networks for Time Series Prediction at Uber

Future Developments
Model improvements:
- Truncated backpropagation through time: longer memory without vanishing gradients
- More feature engineering:
  - Summary features, e.g. mean or quantiles
  - Calendar features: hour of day, day of week
- Additional methods to deal with seasonality within NNs:
  - Per hour-of-day / day-of-week models
- Transfer learning

Thank you! Any questions?
Learn more about Anomaly Detection at Uber:
- Engineering Uncertainty Estimation in Neural Networks for Time Series Prediction at Uber
- Engineering Extreme Event Forecasting at Uber with Recurrent Neural Networks
- Anomaly Detection
- Identifying Outages with Argos, Uber Engineering's Real-Time Monitoring and Root-Cause Exploration Tool

Proprietary and confidential. © 2017 Uber Technologies, Inc. All rights reserved. No part of this document may be reproduced or utilized in any form or by any means, electronic or mechanical, including photocopying, recording, or by any information storage or retrieval systems, without permission in writing from Uber. This document is intended only for the use of the individual or entity to whom it is addressed and contains information that is privileged, confidential or otherwise exempt from disclosure under applicable law. All recipients of this document are notified that the information contained herein includes proprietary and confidential information of Uber, and recipient may not make use of, disseminate, or in any way disclose this document or any of the enclosed information to any person other than employees of addressee to the extent necessary for consultations with authorized personnel of Uber.
