Causal Inference Primer


EXECUTIVE PRIMER
CAUSAL INFERENCE
July 2020 // Peter Gratzke // causalsg.com

CONTENTS

Overview
The next AI evolution
Why AI is not enough
The causal inference toolkit
Causal inference at the tipping point
Takeaway for executives

Causal inference has seen a dramatic increase in attention. Leading researchers have pronounced it the next evolution of AI, tech companies have started investing in the underlying technology, and businesses across a range of industries are in the process of applying it towards commercial objectives. Some examples of the trend include:

- Facebook,[1] Google,[2] LinkedIn,[3] Netflix,[4] and Airbnb[5] are using causal inference to make better decisions and improve their products.
- Microsoft,[6] Uber,[7] IBM,[8] McKinsey[9] and others have built and open-sourced platforms and toolkits for causal inference.
- Startups such as Optimizely[10] and CausaLens[11] are leveraging causal inference to cater to early adopter industries such as marketing analytics, financial services and the pharmaceutical and healthcare [...].[12]

Amid its increasing importance for business, executives should understand what causal inference is, where its value lies and what to look out for in the next 18 months as commercial applications accelerate.

THE NEXT AI EVOLUTION

Advances in AI - broadly defined as machine learning and its applications across areas such as predictive analytics, natural language processing and image recognition - have yielded incredible commercial value for business.

Using AI, many businesses have gained superior analytics insights into customer behaviors, sales performance or operational issues. Others have automated simple manual processes or developed AI-based products and services that serve entirely novel customer needs.

Since being adopted in business, the value delivered by AI has been immense. McKinsey estimates that the technology has the potential to reach between $3.5 trillion and $5.8 trillion in value annually.[13] However, as of today, AI falls short in one critical way: it cannot inform actions.

The technologies and methods underlying AI are about identifying relationships and correlations in data. That’s good enough (and often very good) to make predictions, but it’s insufficient to derive any [...]s, which would be necessary for effective interventions.

This shortcoming leaves businesses wanting in many ways. Some questions business leaders fail to answer with AI include: Why are customers behaving a certain way? How will changes to our products affect sales performance? What would happen if we made certain changes to operational processes?

This gap is what Judea Pearl, a leading causal inference researcher, calls “the difference [...]ons into the future, causal inference parses out the effects that actions will have. Only [...] insight can businesses confidently develop strategies and initiatives designed to meet commercial objectives. In other words, without causal knowledge, there is no effective [...]
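The gap between prediction and action described above can be illustrated with a small simulation. This is a minimal sketch under invented assumptions, not anything taken from the primer: the scenario (a discount, a hidden "loyal customer" trait) and every probability in it are hypothetical. It compares the purchase-rate gap seen in passively collected data with the gap measured when the discount is assigned by coin flip.

```python
import random

def purchase_rate_gap(n, randomized, seed=0):
    """Return (purchase rate with discount) - (purchase rate without discount)."""
    rng = random.Random(seed)
    with_discount, without_discount = [], []
    for _ in range(n):
        loyal = rng.random() < 0.5  # hidden confounder (hypothetical)
        if randomized:
            # Experiment: a coin flip decides, so the discount is independent of loyalty.
            discount = rng.random() < 0.5
        else:
            # Business as usual: loyal customers are far more likely to get a discount.
            discount = rng.random() < (0.8 if loyal else 0.2)
        # Assumed ground truth: loyalty adds 0.40 to the purchase probability,
        # the discount itself adds only 0.05.
        p_buy = 0.10 + 0.40 * loyal + 0.05 * discount
        bought = rng.random() < p_buy
        (with_discount if discount else without_discount).append(bought)
    return (sum(with_discount) / len(with_discount)
            - sum(without_discount) / len(without_discount))

observed = purchase_rate_gap(200_000, randomized=False)
experimental = purchase_rate_gap(200_000, randomized=True)
print(f"gap in observational data: {observed:.3f}")
print(f"gap under randomization:   {experimental:.3f}")
```

Under these assumed numbers the observational gap is about 0.29 in expectation, while the effect of the discount itself is 0.05. A model trained on the observational data would predict purchases well, yet acting on the raw correlation would overstate what the discount achieves.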

WHY AI IS NOT ENOUGH

The AI community has caught on to this gap. Led by academics such as Judea Pearl, who argues that AI can’t be truly intelligent until it has a rich [...] “causal machine learning” is now firmly on the research agenda of most leading computer science departments.

Yoshua Bengio, who worked on some of the foundational research underpinning neural networks and deep learning, believes that “deep learning won’t realize its full potential, and won’t deliver a true AI revolution, until it can go beyond pattern recognition and learn more about cause and effect.”[14] Bengio concludes that “causality is very important for the next steps of progress of machine learning.”

As it stands, the efforts to teach AI causality are still in their infancy. But the field of causal inference already has a deep repertoire of [...] businesses can leverage.

Those companies that can best take advantage of causal inference will be able to seize significant advantages in how they interpret and analyze data, develop strategic initiatives and allocate their resources towards the highest impact actions.

Causal inference refers to the use and development of methods and technologies that are able to identify how and to what degree actions will impact the world.

Box 1: Example

AI is not enough to bridge the gap from prediction to action. An example of this can be seen in how companies use AI to manage customer churn.

Predictive AI models are frequently used to identify customers who are at the highest risk [...] a service. Once identified, companies then target at-risk customers with retention efforts and proactively engage them with the goal of swaying customers from dropping.

Standard machine learning algorithms are very good at this type of prediction. However, it turns out that knowing which customers are at risk of churning is not enough to develop effective strategies to reduce churn. In order to do that, companies need to know why those customers are at risk and how to target them in order to reduce the [...] necessary to understand for which customers the causal effect of retention strategies on their churn behavior are the [...]

Recent research has shown that customers with the highest risk of churning are different from customers who would respond best to retention initiatives. Therefore, companies need to go beyond predicting churn, and understand the causal drivers affecting churn, in order to develop effective strategies to lower churn.[16]
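The distinction drawn in the churn example, between customers most likely to churn and customers most responsive to retention efforts, can be sketched in a few lines. The segments, churn probabilities and offer effects below are hypothetical values chosen for illustration, not figures from the cited research. The sketch simulates a randomized retention offer and ranks segments once by churn risk and once by the churn reduction the offer causes.

```python
import random

# Hypothetical segments: (baseline churn probability, true reduction caused by the offer).
SEGMENTS = {
    "high_risk": (0.60, 0.02),
    "medium_risk": (0.30, 0.15),
    "low_risk": (0.05, 0.01),
}

def run_randomized_campaign(n_per_segment=50_000, seed=1):
    """Simulate a randomized retention offer; return {segment: (risk, uplift)}."""
    rng = random.Random(seed)
    results = {}
    for name, (base, reduction) in SEGMENTS.items():
        churned = {True: [], False: []}  # keyed by "received the offer"
        for _ in range(n_per_segment):
            offered = rng.random() < 0.5  # random assignment of the offer
            p_churn = base - (reduction if offered else 0.0)
            churned[offered].append(rng.random() < p_churn)
        risk = sum(churned[False]) / len(churned[False])         # churn rate without the offer
        uplift = risk - sum(churned[True]) / len(churned[True])  # churn avoided by the offer
        results[name] = (risk, uplift)
    return results

results = run_randomized_campaign()
by_risk = max(results, key=lambda s: results[s][0])
by_uplift = max(results, key=lambda s: results[s][1])
print("segment a churn-prediction model would target:", by_risk)
print("segment where the offer reduces churn most:   ", by_uplift)
```

With these assumed inputs the two rankings disagree: the riskiest segment barely responds to the offer, so a budget allocated purely by predicted risk would be spent where it changes the least.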

THE CAUSAL INFERENCE TOOLKIT

The goal of causal inference is to generate and analyze data that provides insights to allow for actions that will have the desired effect on the world. The toolkit for causal inference is large, [...] experiments and inference from observational data.[17] Each comes with its own advantages and disadvantages, and not all are equally popular amongst practitioners yet.

[...] approach to identifying causal relationships is to simply try out an action and see what happens. In reality, this type of trial-and-error is complicated. [...] confounding factors have to be separated from the effect of our actions. That’s why controlled experiments are used to isolate the impact of an action, while controlling for any alternative factors that could have an influence.

The most commonly used form of experiment in business is “A/B testing” or “split testing.” In A/B testing, users are randomly split into groups and each is shown a different version of a website or app to measure which version performs best.[18]

For example, at Netflix, every product change goes through a rigorous A/B testing process before becoming the default user experience. This applies to major redesigns as well as minute details. A/B testing on the images associated with movie titles has shown that they can result in as much as a 20% to 30% increase in views for a given title.[19]

Most experiments that businesses run are still online, but many have demonstrated that randomized controlled trials (or RCTs) can deliver extraordinary value outside of the digital realm as well.

For example, a retail chain wanted to explore the impact of bonus payments to employees and randomly selected sales teams to receive a bonus. The experiment showed that the bonus increased both sales and number of customers dealt with by 3% and each dollar spent on the bonus generated $3.80 in sales, and $2.10 in profit.[20]

The Harvard Business Review found that [...] Facebook, Amazon, and Google - each conduct more than 10,000 online [...] experiments annually, with many tests engaging millions of users. Start-ups and companies [...]re Airlines, also run them regularly, though on a smaller scale. These organizations have discovered that an ‘experiment with everything’ approach has surprisingly large payoffs.”[21]

Experiments as outlined above have the advantage of delivering direct observations of the impact of an action on intended results. The downside to experiments is that they are often expensive, or even impossible to perform in many cases.
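How the result of an A/B test is typically read can be sketched as follows. This is an illustrative example rather than the procedure used by any company named above: it applies a standard two-proportion z-test, and the visitor and conversion counts are made up.

```python
from math import sqrt
from statistics import NormalDist

def ab_test(conversions_a, users_a, conversions_b, users_b):
    """Two-proportion z-test: return (absolute lift of B over A, z statistic, two-sided p-value)."""
    rate_a = conversions_a / users_a
    rate_b = conversions_b / users_b
    # Pooled conversion rate under the null hypothesis that A and B perform the same.
    pooled = (conversions_a + conversions_b) / (users_a + users_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / users_a + 1 / users_b))
    z = (rate_b - rate_a) / std_err
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return rate_b - rate_a, z, p_value

# Hypothetical counts: version A converts 1,000 of 10,000 users, version B 1,120 of 10,000.
lift, z, p_value = ab_test(1_000, 10_000, 1_120, 10_000)
print(f"lift: {lift:.3f}, z: {z:.2f}, p-value: {p_value:.4f}")
```

For these made-up counts the lift is 1.2 percentage points, with z of roughly 2.76 and a p-value below 0.01. Because users were assigned to versions at random, the difference can be attributed to the version shown rather than to differences between the two groups of users.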

Inferring cause and effect from observational data

Where experiments are not possible at all, causal relationships can be inferred from existing data, often referred to as “observational” data. This is, in fact, the most common scenario in business settings.

To date, this type of data is mostly analyzed with standard AI methodologies, such as regression or classification algorithms. This yields insights and predictions, but is rarely enough to inform impactful actions.

The objective of causal inference is to isolate direct (or indirect) causal relationships from confounding factors. There are many statistical methods that data scientists use, ranging from various types of “matching methods” that aim to reduce confounders between the treated and control groups after interventions,[22] to “structural causal models” using graphs that aim to explicate causal relationships.[23]

Uber is one of the leading proponents of causal inference, and uses a technique called mediation modeling to go from “know whether” [...] evolutions, ranging from evaluating promotional campaigns to comparing feature designs.[24] Amongst many applications, the company has used causal inference to understand how purchase rates are impacted by dynamic pricing, and how Uber Eats customer engagement is affected by delivery delays.[25]

LinkedIn leverages a number of different causal inference methods on observational data to measure the effect of contributions (e.g., post, comment, like or send private messages) on engagement metrics.[26] The company found that “observational causal inference methods can estimate the impacts [...] approach,” [...] an internal observational study platform integrated with its existing experimentation platform to allow for the democratization of causal inference at LinkedIn.[27]

Those examples show that even where true experiments are not possible, statistical tools [...] relationships in data that is collected through business-as-usual. It also means that where a company has enough data to use standard AI methods, typically the data will allow for deeper analysis into causal relationships.

CAUSAL INFERENCE AT THE TIPPING POINT

Causal inference is now on the cusp of broad adoption in business. There are three main factors that are driving the emergence of causal inference as an accelerating category now.

First, the limitations of current AI are becoming evident. The hype for AI for the past decade has laid much of the groundwork for data-driven business, by fueling investments in the infrastructure to collect, process, analyse and leverage large quantities of data.

But executives are now finding many of the insights produced are not actionable. When McKinsey launched its own causal inference product, the company highlighted that “helping organizations [...] depends on understanding and addressing the underlying causes of a situation.”[28] This realization is spreading fast not only amongst consultants, but also amongst management teams, and subsequently driving demand for causal inference.

Second, the use of causal inference in academia has made significant progress over the past decade, following a series of “cross-disciplinary advances in algorithms for identifying causal relations and effect sizes from observational data or mixed experimental and observational [...] physics, clinical medicine, ecology, neuroscience and many other domains.[29]

While the conceptual landscape has reached maturity, especially with the formalization of graphical and structural representations of [...] will increasingly see machine learning techniques applied to causal modeling and discovery.[30]

Third, open source platforms and libraries have reached a tipping point, by democratizing the application of causal inference. Tech leaders such as Facebook[31] and Google[32] have open-sourced libraries to design experiments and determine the effects and degree of impact of actions. Microsoft,[33] IBM,[34] and McKinsey[35] have all [...] libraries to infer causal relationships in existing data sets. Additionally, startups are also active in [...] Optimizely, which offers A/B testing tools, to causaLens, which is applying “causal AI” to use cases across finance, IoT, energy and telecoms.[36]

Taken together, demand for insights from causal inference is growing, while it’s never been easier for practitioners to leverage causal inference toolkits. All three of the above trends are set to accelerate through the next 18 months.

THE TAKEAWAY FOR EXECUTIVES

Causal inference is on the verge of broad adoption in business. Various forms of causal inference, especially randomized controlled trials, have long been in use in life sciences for medical interventions, and have recently found adoption for a number of use cases by leading tech companies.

Early adopters have used causal inference to achieve benefits, including:

- Higher accuracy predictions over standard machine learning methods
- Higher confidence insights [...] decisions and take actions
- Better [...] marketing campaigns, product decisions or strategic initiatives
- Better resource allocation towards highest impact [...]

As startups and vendors emerge to fulfill the growing demand, executives should watch out for three main categories that help support businesses to leverage causal inference.

First, platforms that provide general purpose causal inference tooling will grow exponentially. Those platforms will serve as the workbench for businesses to customize causal inference to their unique business needs. Tech companies are already providing a range of open source products targeted to data scientists. Expect the next generation of platforms to be geared to business analysts.

Second, point solutions for business problems that are better addressed using causal inference will grow in number. A thriving ecosystem of vendors already exists providing software for A/B testing of online product features and digital campaigns. Expect the universe of use cases that causal inference addresses to [...]

[...] integrated as a feature in existing products. Some analytics vendors have integrated forms of causal analysis into their offering.[37] Expect vendors across a range of categories, from CRM to accounting software, to roll out forms of causal features. As the next evolution of AI takes hold, it will pay to be at the forefront of this development.

Causal inference will deliver on many of the promises where AI has failed to date. For executives to take advantage of the early-adopter premium, the time to act is now.

ENDNOTES

1. Facebook Research, Efficient tuning of online systems using Bayesian [...], accessed on 6 June 2020
2. Google, Robust Causal Inference for Incremental Return on Ad Spend with Randomized Paired Geo Experiments, https://arxiv.org/pdf/1908.02922.pdf [PDF], accessed on 6 June 2020
3. LinkedIn, Causal inference from observational data, https://arxiv.org/pdf/1903.07755.pdf, accessed 6 June 2020
4. Netflix, A/B Testing and [...]ith-experimentation-and-data5b0ae9295bdf, accessed 6 June 2020
5. AirBnb, Experiments at [...]ments-at-airbnbe2db3abf39e7, accessed 6 June 2020
6. Microsoft, DoWhy: Making causal inference easy, https://github.com/microsoft/dowhy, accessed 6 June 2020
7. Uber, Causal ML: A Python Package for Uplift Modeling and Causal Inference with ML, https://github.com/uber/causalml, accessed 6 June 2020
8. IBM, Causal Inference 360, https://github.com/IBM/causallib, accessed 6 June 2020
10. McKinsey, Causalnex: A Python library that helps data scientists to infer causation rather than observing [...]usalnex, accessed 6 June 2020
11. https://www.optimizely.com/
12. https://www.causalens.com/
13. McKinsey Global Institute, Notes from the AI frontier: Insights from hundreds of use cases, https://www.mckinsey.com/[...]-use-cases-discussionpaper.ashx [PDF], accessed 6 June 2020
14. Wired, An AI Pioneer Wants His Algorithms to Understand the [...]thms-understand-why, accessed 6 June 2020
15. IEEE Spectrum, Yoshua Bengio, Revered Architect of AI, Has Some Ideas About What to Build Next, [...]t, accessed 6 June 2020
16. Eva Ascarza, Retention Futility: Targeting High Risk Customers Might Be [...]0.1509/jmr.16.0163 [PDF], accessed 6 June 2020
17. Wicaksono Wijono, Beyond A/B Testing: Primer on Causal [...]btesting-primer-on-causal-inferenced8e462d90a0b, accessed 6 June 2020
18. Optimizely, What Is A/B [...]ossary/ab-testing/, accessed 6 June 2020

19. Netflix, It’s All A/Bout Testing: The Netflix Experimentation Platform, [...]458c15, accessed 6 June 2020
20. Guido Friebel et al., Team Incentives and Performance: Evidence from a Retail [...]ract id 2649884, accessed 6 June 2020
21. Ron Kohavi and Stefan Thomke, The [...] Power of Online [...]of-online-experiments, accessed 6 June 2020
22. Gary King and Richard Nielsen, Why Propensity Scores Should Not Be Used for [...]t.pdf [PDF], accessed 6 June 2020
23. Judea Pearl, The Seven Tools of Causal Inference with Reflections on [...]/stat ser/r481.pdf [PDF], accessed 6 June 2020
24. Uber, Mediation Modeling at Uber: Understanding Why Product Changes Work (and Don’t [...], accessed 6 June 2020
25. Uber, Using Causal Inference to Improve the Uber User [...]tuber/, accessed 6 June 2020
26. Iavor Bojinov et al., Causal inference from observational data: Estimating the effect of contributions on visitation frequency at LinkedIn, https://arxiv.org/abs/1903.07755?, accessed 6 June 2020
27. As above.
28. McKinsey, Back to New at McKinsey Blog, Meet CausalNex, our new open-source library [...] causal reasoning and “what if” [...], [...]mckinsey-blog/causalnex-our-new-open-[...]relationships-in-data, accessed 6 June 2020
29. Kun Zhang et al., Learning causality and causality-related learning: some recent progress, [...]r-6051411/, accessed 6 June 2020
30. Yoshua Bengio et al., A Meta-Transfer Objective for Learning to Disentangle Causal [...].pdf [PDF], accessed 6 June 2020
31. Facebook, PlanOut: A Framework for Online [...], accessed 6 June 2020
32. Google, CausalImpact: An R package for causal inference using Bayesian structural time-series models, [...]ml, accessed 6 June 2020
33. Microsoft, DoWhy: Making causal inference easy, https://github.com/microsoft/dowhy, accessed 6 June 2020
34. IBM, Causal Inference 360, https://github.com/IBM/causallib, accessed 6 June 2020
35. McKinsey, Causalnex: A Python library that helps data scientists to infer causation rather than observing correlation, [...]x, accessed 6 June 2020
36. https://www.causalens.com/
37. Clearbrain, Introducing Causal [...]ucingcausal-analytics, accessed 6 June 2020
