About this Talk
While the world of predictive analytics gives us valuable tools, it’s completely unable to answer very fundamental questions about cause and effect.
This session examines how the open-source ecosystem and emerging technologies can be used for causal machine learning and better decision-making.
In this masterclass for data practitioners, you’ll learn how to create and incorporate causal graphs into your predictive workflow to improve solutions.
We'll delve into a practical example using open-source tools like PyWhy and emerging solutions to evaluate data and identify possible interventions to improve outcomes.
Hands-on notebooks will explore using causal-learn, LLMs, and other tools that expedite the complex process of modeling a problem as a causal graph. You’ll also learn methods for working with categorical data and dealing with confounding variables.
Description
In a world obsessed with predictions and generative AI, we often mistake predicting the future for choosing the best possible action. While the world of predictive analytics gives us valuable tools, it’s completely unable to answer very fundamental questions:
“What will happen to my KPIs if I perform a certain action?” “What was the root cause of this event?” “What would have happened had we taken a different course?”
Central to all these questions is the idea of understanding cause-and-effect relationships in the systems in which we operate. This session looks at using the open-source ecosystem and emerging technologies in the space of causal machine learning and better decision-making.
In the realm of predictions and causality, graphs have emerged as a potent model. These purposefully designed graphs capture and represent the connections between entities, providing a framework for understanding complex systems. Today, leading teams use this framework to surface directional patterns, compute complex logic, and provide a foundation for causal inference.
In this masterclass, you’ll learn how to create and incorporate causal graphs into your predictive workflow to improve solutions. You'll gain a deep understanding of foundational concepts such as Judea Pearl's "do" operator, causal discovery, and how to keep domain expertise in the loop.
We'll delve into a practical example using open-source tools like PyWhy as well as emerging solutions to evaluate traffic data and identify possible interventions to improve safety. The hands-on notebooks will explore using causal-learn, LLMs, and other tools that expedite the complex process of modeling a problem as a causal graph. You’ll also learn methods for working with categorical data and dealing with confounding variables.
Join us as we examine graphs' transformative potential and profound impact on predictive modeling, explainability, and causality in the era of generative AI. This is an exciting time for our field, and we're thrilled to share our insights with you.
Key Topics
- Causal ML
- Causal graphs and how to create them
- The DoWhy Process
- PyWhy libraries, including DoWhy, EconML, and causal-learn
- Challenges of causality
Target Audience
- Data Scientists and Machine Learning Engineers
- Data Analysts
- Managers of the above
Goals
This masterclass is not just about theory but about practical application. You’ll learn the uses and foundations of causality in machine learning, pick up tips for dealing with the current limitations of causal ML, and gain experience with various open-source and emerging tools.
We’ll convey strategies for the most challenging part of the workflow: getting your data into an accurate causal graph. Attendees can actively participate in this process by following along via notebooks that walk through the complete PyWhy process: modeling the causal relationships, identifying the effect, estimating the effect, and refuting the estimate.
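To make the four steps concrete before the notebooks, here is a minimal sketch that mirrors them in plain Python. It deliberately does not use the DoWhy API; the graph, the variable names (`rain`, `congestion`, `accident`), the probabilities, and all helper functions are hypothetical and chosen only for illustration. The data is simulated so that congestion raises accident risk by 0.05, while rain confounds the relationship.

```python
import random

# Step 1 - Model: a hypothetical causal graph written as {node: [parents]}.
# rain -> congestion, rain -> accident, congestion -> accident
GRAPH = {"rain": [], "congestion": ["rain"], "accident": ["rain", "congestion"]}


def simulate(n=50000, seed=0):
    """Generate synthetic rows that follow GRAPH (all numbers are made up)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        rain = rng.random() < 0.3
        congestion = rng.random() < (0.7 if rain else 0.2)
        p_accident = 0.05 + 0.10 * rain + 0.05 * congestion
        rows.append({"rain": rain, "congestion": congestion,
                     "accident": rng.random() < p_accident})
    return rows


def identify(graph, treatment):
    """Step 2 - Identify: if all parents of the treatment are observed,
    they form a valid backdoor adjustment set."""
    return list(graph[treatment])


def naive_difference(rows, treatment, outcome):
    """Difference in outcome rate between treated and untreated rows.
    Assumes both groups are non-empty."""
    treated = [r[outcome] for r in rows if r[treatment]]
    control = [r[outcome] for r in rows if not r[treatment]]
    return sum(treated) / len(treated) - sum(control) / len(control)


def estimate(rows, treatment, outcome, adjust_for):
    """Step 3 - Estimate: stratify on the adjustment set and average the
    within-stratum differences, weighted by stratum size."""
    strata = {}
    for r in rows:
        strata.setdefault(tuple(r[z] for z in adjust_for), []).append(r)
    return sum(naive_difference(group, treatment, outcome) * len(group)
               for group in strata.values()) / len(rows)


def refute_with_placebo(rows, treatment, outcome, adjust_for, seed=1):
    """Step 4 - Refute: swap the treatment for random noise; a sound
    estimator should then report an effect close to zero."""
    rng = random.Random(seed)
    placebo = [dict(r, **{treatment: rng.random() < 0.5}) for r in rows]
    return estimate(placebo, treatment, outcome, adjust_for)


if __name__ == "__main__":
    rows = simulate()
    adjust_for = identify(GRAPH, "congestion")
    print("adjust for:", adjust_for)
    print("naive:", naive_difference(rows, "congestion", "accident"))
    print("adjusted:", estimate(rows, "congestion", "accident", adjust_for))
    print("placebo:", refute_with_placebo(rows, "congestion", "accident", adjust_for))
```

In this simulation the naive comparison overstates the effect because rainy days have both more congestion and more accidents, while the adjusted estimate lands near the simulated 0.05. The PyWhy libraries used in the notebooks handle far more general graphs, estimators, and refutation tests than this toy version.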
Session outline:
- Why should you care about cause and effect?
  - Moving beyond predictions
  - Descriptive vs. prescriptive: the shortfalls of ML and the rise of causality
  - The language of counterfactuals, interventions, and optimal actions
- Causal graphs and the do-operator
  - Causal graphs as a unifying model
  - The do-operator and DoWhy process
  - The pitfalls: confounders, colliders, and adjustments
- Overview of tools and resources
  - PyWhy and its available libraries, including DoWhy, EconML, and causal-learn
  - Emerging tools
- Hands-on walkthrough using the DoWhy process
  - Motivational questions about neighborhood safety and traffic accidents
  - PyWhy libraries: Macro scenario to estimate the influence of different variables on neighborhood safety ratings
  - Various tools: A deeper dive into traffic accident data to identify what impacts safety
- Strategy for creating a causal graph
  - Common hazards
  - Iteration and human in the loop
  - Pros and cons of different methods for generating causal graphs
- Learning, cautions, and tips
Format
The class begins in a lecture format and then moves to hands-on notebook exercises that walk through city and safety data. You can also simply watch the class and work through the notebook examples later at your own pace.
We will work in Google Colab notebooks with standard Python packages to import city data and use PyWhy libraries to explore confounding variables and estimate causal influence. We’ll also employ Ergodic to investigate dependencies and create our traffic accident graph using embeddings for categorical data.
Level
Beginner - Intermediate
Prerequisite Knowledge
Basic Python and familiarity with graph concepts.