Bayesian Networks For Departure Delay Prediction


Bayesian Networks for Departure Delay Prediction
NASA Ames Research Center Airline Operations Workshop
Alex Cosmas, Chief Scientist, Booz Allen Hamilton
In support of: FAA NextGen Advanced Concepts and Technology Development Group
SE2020 Task Order No. 67, TORP 1543
Booz Allen Hamilton and Client proprietary and business confidential

Agenda
- Project Overview
- Bayesian Networks
- SMDP Model Development
- Questions

Research Overview

Most existing models employed in practice (for instance, by the FAA) use simulation techniques, which are based on:
- Regression / stochastic / behavioral models
- "Causal patterns" that are based on theoretical knowledge
- Iterative, manual, and time-consuming calibration processes

Several academic studies propose the use of Bayesian modeling techniques for predicting flight delays.

Bayesian belief networks (BBNs) represent a paradigm shift as they:
- Have a structure that is machine-learned from data and does not require assumptions about "causal" patterns
- Can produce estimates even in situations with sparse or limited data
- Can be used well in advance of the actual flight, as they can predict based on only partial evidence

SMDP represents a paradigm shift in solving the problem of predicting departure time.

Statistical Methods for Departure Prediction (SMDP)

GOAL: Develop a probabilistic model using machine learning algorithms and data mining techniques to improve departure time predictions for real-time traffic flow management (TFM) in the National Airspace System (NAS).

PHASE 1 (2013-2014):
- Developed a proof of concept for Boston Logan Intl Airport.
- Used machine learning techniques and 52M flight records to predict departure delays utilizing 47 different variables.

PHASE 2 (2015-2016):
- Update the BOS model with additional datasets: TFMS and CCFP.
- Develop individual models for the Core 30 airports.

PHASE 3 (TBD):
- Identify use cases and carry out field tests.
- Develop and test multiple BBN model network.
- Operationalize tool with incoming data feed (e.g. SWIM data) and real-time capabilities.


BBNs have historically been a tool for the researcher; their potential is extraordinary as a tool for the business
- 90% of the world's data was created in the past 2 years
- That metric is expected to hold true in another 2 years
- Data miners produce Snapple cap facts
- Data scientists produce insights; they require the intellectual curiosity to ask "why" and "so what?"

Real fact #855: Animals that lay eggs do not have belly buttons

Moneyball 2.0: The data revolution is enabling real-time predictive analysis

BBN Overview and Use Cases

What are Bayesian Belief Networks?
- A BBN is a graphical model representing the conditional relationships between variables
  - All variables (continuous or discrete) are modeled in terms of probability distributions
  - Relationships between the variables are modeled in terms of the conditional probability tables

Illustration of a Simple BBN
- In the illustrative BBN, the variables Quarterback and Victory have two states each, with corresponding probabilities. For example, there is a 95% chance of the first-choice QB opening the game.
- When the status of the playing QB is "known", the distribution for Quarterback changes, and the effect is propagated through the arc, influencing the distribution of the Victory variable
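The two-node illustration can be sketched in a few lines of Python. Only the 95% prior for the first-choice quarterback comes from the slide. The win probabilities in the conditional probability table are hypothetical values chosen for illustration, as are all variable and function names.

```python
# Two-node network: Quarterback -> Victory.
# Prior for the Quarterback node (the 95% figure is from the slide).
p_qb = {"first_choice": 0.95, "backup": 0.05}

# Conditional probability table P(Victory = win | Quarterback).
# These numbers are assumptions for illustration only.
p_win_given_qb = {"first_choice": 0.75, "backup": 0.40}


def p_victory(qb_dist):
    """Marginal P(win) for a given distribution over the Quarterback node."""
    return sum(qb_dist[qb] * p_win_given_qb[qb] for qb in qb_dist)


# Before kickoff: the quarterback is uncertain, so both states contribute.
prior_win = p_victory(p_qb)

# Evidence arrives: the backup is known to be playing. The Quarterback
# distribution collapses and the change propagates to Victory.
win_if_backup = p_victory({"first_choice": 0.0, "backup": 1.0})

print(f"P(win) with no evidence: {prior_win:.4f}")
print(f"P(win) given backup QB:  {win_if_backup:.4f}")
```

Setting evidence on the parent node simply replaces its distribution with a certain one, which is the propagation step the illustration describes.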

BBN Overview and Use Cases

BBNs offer significant advantages over traditional models:

Traditional models: the form of the model has to be selected by the modeler; only the parameters, not the model form, can be identified based on the data
- For example, linear regression, joint probability analysis, etc.
- One-size-fits-all solution
- Observations with missing data are thrown away
- Variables with non-numerical values such as "Color of car" cannot be modeled

BBNs: parameters and form are both learned from the data
- Optimized network structure (form) learned from the data
- Missing values are inferred [1] during the machine learning process
- Discretization of variables allows for non-numerical variables to be modeled

[1] Missing values of a variable are inferred from the known values for the same variable from other "similar" observations
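As a rough sketch of the discretization point, the Python below maps a numeric delay into named bins and turns a non-numerical variable into a state distribution by counting. The bin edges, labels, and function names are hypothetical; the slides do not state how SMDP discretizes its variables.

```python
from bisect import bisect_right
from collections import Counter

# Hypothetical bin edges (minutes) and labels; not taken from the slides.
DELAY_EDGES = [0, 15, 60]
DELAY_LABELS = ["early", "0-14 min", "15-59 min", "60+ min"]


def discretize_delay(minutes):
    """Map a numeric delay to the label of the bin that contains it."""
    return DELAY_LABELS[bisect_right(DELAY_EDGES, minutes)]


def state_distribution(values):
    """Relative frequency of each discrete state, numeric or not."""
    counts = Counter(values)
    total = sum(counts.values())
    return {state: n / total for state, n in counts.items()}


print(discretize_delay(-3), discretize_delay(20), discretize_delay(75))
print(state_distribution(["red", "red", "blue", "white"]))
```

Once every variable is a set of named states, a category such as car color is handled the same way as a binned delay.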

BBN Overview and Use Cases

Operationalized BBNs:
- General Electric (failure detection based on sensor data)
- Intel Corporation (processor fault diagnosis)
- Procter and Gamble (market research and consumer loyalty)
- SABRE Online Reservation System (bug detection)
- Ministry of Defence, UK (TRACS, military vehicle location software)
- Philips Consumer Electronics (testing process quality and software product quality)
- Inrix Traffic (predicting road traffic flows)
- Microsoft Office Assistant (enabling proactive tips based on user usage)
- Reasoning Under Uncertainty, Monash University (missing person search and rescue)
- National Institute of Water and Atmospheric Research, New Zealand (forest resources management)

Risk Visualization and Prediction
- Goal: Predict likely locations of serious incidents arising from the transport of hazardous material
- Challenge: Historical incident data spread across disparate sources had many missing values
- Data: 113 variables, 225,000 rows, source data size 3 GB

Asset Management
- Goal: Estimate the reliability of thousands of assets
- Challenge: Sparse information on individual assets
- Data: 49 variables, 83,000 rows, source data size 10 GB

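BBN Application – Solving The Monty Hall Problem

For two events A and B:

$$p(A, B) = p(B \mid A)\,p(A) = p(A \mid B)\,p(B)$$

$$p(A \mid B) = \frac{p(B \mid A)\,p(A)}{p(B)}$$

For evidence E and a partition A_1, ..., A_6 of the sample space:

$$p(A_i \mid E) = \frac{p(E \mid A_i)\,p(A_i)}{p(E)} = \frac{p(E \mid A_i)\,p(A_i)}{\sum_j p(E \mid A_j)\,p(A_j)}$$

When the problem first appeared in Parade, approximately 10,000 readers, including 1,000 PhDs, wrote claiming the solution was wrong.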

BBN Application – Solving The Monty Hall Problem
- The BBN model for the Monty Hall Problem has three nodes
  - Door with the car
  - Door chosen by the contestant
  - Door opened by Monty Hall
- Since Monty Hall knows the door with the car and the contestant's choice, the door he opens is affected by the other two variables, as signified by the arcs

Network states shown on the slide:
- At the start of the show, all three variables have an equal chance of being door 1, 2 or 3
- After the contestant's selection, Monty Hall may choose either of the other two doors
- Monty Hall's selection of Door 2 pushes its 1/3 probability to Door 3, which then has a 2/3 chance of hiding the car
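The three-node model is small enough to evaluate by direct enumeration. The sketch below assumes the standard rules (the car is placed uniformly at random, and Monty never opens the contestant's door or the car's door, choosing uniformly when he has two options). The function names are illustrative.

```python
from fractions import Fraction

DOORS = (1, 2, 3)


def p_monty(opened, car, choice):
    """CPT of the 'door opened by Monty' node given its two parent nodes."""
    allowed = [d for d in DOORS if d != car and d != choice]
    return Fraction(1, len(allowed)) if opened in allowed else Fraction(0)


def posterior_car(choice, opened):
    """P(car door | contestant's choice, door opened by Monty) via Bayes' rule."""
    prior = Fraction(1, 3)  # the car is equally likely behind each door
    joint = {car: prior * p_monty(opened, car, choice) for car in DOORS}
    evidence = sum(joint.values())  # p(E), the normalizing constant
    return {car: p / evidence for car, p in joint.items()}


# Contestant picks Door 1, Monty opens Door 2.
posterior = posterior_car(choice=1, opened=2)
for door, p in posterior.items():
    print(f"Door {door}: {p}")
```

With Door 1 chosen and Door 2 opened, the posterior is 1/3 for Door 1, 0 for Door 2 and 2/3 for Door 3, so switching is the better strategy.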

Application of the Model for Delay Prediction
- Variables are modeled in terms of probability distribution functions
- The probabilistic relationships between the variables are represented in terms of the conditional probability tables defined by the arcs in the network
- Any change in the information about a variable propagates through the arcs of the connected model network and alters the distribution of other variables
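A minimal sketch of this propagation, using a hypothetical three-node network (Weather and Peak hour as parents of Delay): the variables, states, and every probability below are invented for illustration and are not SMDP model values. Unobserved parents are marginalized out, which is how a prediction can be made from partial evidence.

```python
from itertools import product

# Hypothetical priors and CPT; none of these numbers come from SMDP.
p_weather = {"clear": 0.8, "storm": 0.2}
p_peak = {"off_peak": 0.6, "peak": 0.4}
p_delay = {  # P(delayed | weather, peak)
    ("clear", "off_peak"): 0.10,
    ("clear", "peak"): 0.25,
    ("storm", "off_peak"): 0.50,
    ("storm", "peak"): 0.70,
}


def p_delayed(evidence=None):
    """P(delayed | evidence), summing over any parent that is not observed."""
    evidence = evidence or {}
    numerator = 0.0
    normalizer = 0.0
    for weather, peak in product(p_weather, p_peak):
        # Skip parent states that contradict the observed evidence.
        if evidence.get("weather", weather) != weather:
            continue
        if evidence.get("peak", peak) != peak:
            continue
        weight = p_weather[weather] * p_peak[peak]
        numerator += weight * p_delay[(weather, peak)]
        normalizer += weight
    return numerator / normalizer


print(p_delayed())                                      # no evidence
print(p_delayed({"weather": "storm"}))                  # partial evidence
print(p_delayed({"weather": "storm", "peak": "peak"}))  # full evidence
```

Each additional piece of evidence narrows the set of parent states that contribute, so the delay estimate sharpens as the departure approaches.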


SMDP machine learning is based on an iterative process that tests thousands of alternative model structures
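The slides do not say which search or scoring method SMDP uses. As one common way to compare alternative structures, the sketch below scores every candidate parent set for a single target node with the Bayesian information criterion (BIC) and keeps the best one. The toy records, variable names, and functions are all hypothetical.

```python
import math
from collections import Counter
from itertools import combinations

# Hypothetical toy records: delay depends on weather only, not on peak hour.
rows = [
    {"weather": w, "peak": p, "delay": "yes" if w == "storm" else "no"}
    for w in ("clear", "storm")
    for p in ("yes", "no")
    for _ in range(10)
]


def bic_score(rows, target, parents):
    """Log-likelihood of target given parents, minus a complexity penalty."""
    n = len(rows)
    parent_counts = Counter(tuple(r[p] for p in parents) for r in rows)
    joint_counts = Counter((tuple(r[p] for p in parents), r[target]) for r in rows)
    log_lik = sum(
        count * math.log(count / parent_counts[parent_state])
        for (parent_state, _), count in joint_counts.items()
    )
    n_target_states = len({r[target] for r in rows})
    n_params = len(parent_counts) * (n_target_states - 1)
    return log_lik - 0.5 * n_params * math.log(n)


def best_parents(rows, target, candidates):
    """Exhaustively test every subset of candidates as the parent set."""
    options = [
        subset
        for size in range(len(candidates) + 1)
        for subset in combinations(candidates, size)
    ]
    return max(options, key=lambda subset: bic_score(rows, target, subset))


print(best_parents(rows, "delay", ["weather", "peak"]))
```

Exhaustive enumeration only works for a handful of variables; with dozens of variables the number of candidate structures grows too quickly, which is why practical learners search iteratively instead.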

The datasets acquired in Phase I (dark green) have been integrated with additional years and types of data in Phase II (light green)

[Chart: data source, access, and SMDP data range by year, 2008 to 2015, shaded by Phase I / Phase II]

Data sources:
- Bureau of Transportation Statistics (BTS)
- Aviation System Performance Metrics (ASPM)
- Aggregate Demand List (ADL)
- Traffic Management System (ETMS/TFMS)
- National Traffic Management Log (NTML)
- Weather Data (CCFP)

SMDP BBN Model Data Elements

ASPM: Departure YYYYMM, Departure Day, Departure Hour, Departure QTR, Arrival YYYYMM, Arrival Day, Arrival Hour, Arrival QTR, OFF YYYYMM, OFF Day, OFF Hour, OFF QTR, ON YYYYMM, ON Day, ON Hour, ON QTR, FAACARRIER, Flight Number, Tail Number, ETMS EQPT, Departure Airport, Arrival Airport, Flight Type, OAGACID, USER CLASS, Scheduled OUT Time, Flight Plan Departure Time, Actual OUT Time, Nomina
