Causal Graphs in Action: Navigating the Arrow of Why

A Talk by Amy Hodler and Andre Franca

About this Talk

While the world of predictive analytics gives us valuable tools, it’s completely unable to answer very fundamental questions about cause and effect.

This session examines how the open-source ecosystem and emerging technologies can be used for causal machine learning and better decision-making.

In this masterclass for data practitioners, you’ll learn how to create and incorporate causal graphs into your predictive workflow to improve solutions.

We'll delve into a practical example using open-source tools like PyWhy and emerging solutions to evaluate data and identify possible interventions to improve outcomes.

Hands-on notebooks will explore using causal-learn, LLMs, and other tools that expedite the complex process of modeling a problem as a causal graph. You’ll also learn methods for working with categorical data and dealing with confounding variables.

Description

In a world obsessed with predictions and generative AI, we often mistake predicting the future for choosing the best possible action. While the world of predictive analytics gives us valuable tools, it’s completely unable to answer very fundamental questions:

“What will happen to my KPIs if I perform a certain action?” “What was the root cause of this event?” “What would have happened had we taken a different course?”

Central to all these questions is the idea of understanding cause-and-effect relationships in the systems in which we operate. This session looks at using the open-source ecosystem and emerging technologies in the space of causal machine learning and better decision-making.

In the realm of predictions and causality, graphs have emerged as a potent model, leading to significant breakthroughs. These purposefully designed graphs capture and represent the intricate connections between entities, providing a comprehensive framework for understanding complex systems. Today, leading teams use this framework to surface directional patterns, compute complex logic, and build a foundation for causal inference.

In this masterclass, you’ll learn how to create and incorporate causal graphs into your predictive workflow to improve solutions. You'll gain a deep understanding of foundational concepts such as Judea Pearl's "do" operator, causal discovery, and how to keep domain expertise in the loop.
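To make the "do" operator concrete, here is a minimal sketch in plain Python. It is not taken from the session notebooks: the toy model (a confounder Z that influences both X and Y), the probabilities, and the function names are all assumptions chosen for illustration. It contrasts what you see when you merely observe X with what happens when you set X by intervention.

```python
import random

def simulate(n, do_x=None, seed=0):
    """Sample n rows from a toy model with edges Z -> X, Z -> Y, X -> Y.

    If do_x is given, X is set by intervention and no longer listens to Z.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        z = int(rng.random() < 0.5)                      # hidden common cause
        if do_x is None:
            x = int(rng.random() < (0.8 if z else 0.2))  # observed: Z drives X
        else:
            x = do_x                                     # do(X = do_x): Z -> X is cut
        p_y = 0.1 + 0.2 * x + 0.5 * z                    # true effect of X on Y is +0.2
        y = int(rng.random() < p_y)
        rows.append((z, x, y))
    return rows

def mean_y(rows, x_value):
    """Average outcome among rows where X equals x_value."""
    ys = [y for _, x, y in rows if x == x_value]
    return sum(ys) / len(ys)

# Observational contrast: P(Y=1 | X=1) - P(Y=1 | X=0)
observed = simulate(100_000)
naive_diff = mean_y(observed, 1) - mean_y(observed, 0)

# Interventional contrast: P(Y=1 | do(X=1)) - P(Y=1 | do(X=0))
do_diff = (mean_y(simulate(100_000, do_x=1, seed=1), 1)
           - mean_y(simulate(100_000, do_x=0, seed=2), 0))

print(f"observational difference: {naive_diff:.3f}")
print(f"interventional difference: {do_diff:.3f}")
```

In this toy model the observational difference is inflated by the confounder, while the interventional difference stays close to the +0.2 effect built into the simulation.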

We'll delve into a practical example using open-source tools like PyWhy as well as emerging solutions to evaluate traffic data and identify possible interventions to improve safety. The hands-on notebooks will explore using causal-learn, LLMs, and other tools that expedite the complex process of modeling a problem as a causal graph. You’ll also learn methods for working with categorical data and dealing with confounding variables.

Join us as we examine graphs' transformative potential and profound impact on predictive modeling, explainability, and causality in the era of generative AI. This is an exciting time for our field, and we're thrilled to share our insights with you.

Key Topics

  • Causal ML
  • Causal graphs and how to create them
  • The DoWhy Process
  • PyWhy libraries, including DoWhy, EconML, and causal-learn
  • Challenges of causality

Target Audience

  • Data Scientists and Machine Learning Engineers
  • Data Analysts
  • Managers of the above

Goals

This masterclass is not just about theory but about practical application. You’ll learn the uses and foundations of causality in machine learning, pick up tips for dealing with the current limitations of causal ML, and gain experience with various open-source and emerging tools.

We’ll convey strategies for the most challenging part of the workflow: getting your data into an accurate causal graph. Attendees can actively participate in this process by following along via notebooks that walk through the complete PyWhy process of modeling causality, identifying impact, estimating impact, and refuting estimates.
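The four steps named above (model, identify, estimate, refute) can be sketched without any library. The following is a simplified illustration under stated assumptions, not DoWhy's API and not the notebook code: the variables (busy_road, camera, accident), the simulated probabilities, and every function name are hypothetical. Identification here simply adjusts for the treatment's parents, which is a valid backdoor set when all of them are observed, and the refutation is a basic placebo-treatment check.

```python
import random

# 1. Model: write down the assumed graph as child -> parents (hypothetical variables).
GRAPH = {"busy_road": [], "camera": ["busy_road"], "accident": ["busy_road", "camera"]}
TREATMENT, OUTCOME = "camera", "accident"

def make_data(n, seed=0):
    """Simulated observations; cameras truly lower accident risk by 0.1."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        busy = int(rng.random() < 0.5)
        camera = int(rng.random() < (0.7 if busy else 0.2))   # busy roads get more cameras
        accident = int(rng.random() < 0.2 + 0.4 * busy - 0.1 * camera)
        data.append({"busy_road": busy, "camera": camera, "accident": accident})
    return data

# 2. Identify: adjust for the parents of the treatment (a simple backdoor set).
def identify(graph, treatment):
    return list(graph[treatment])

# 3. Estimate: stratify on the adjustment set, then average the per-stratum
#    treated-minus-control differences, weighted by stratum size.
def estimate(data, treatment, outcome, adjust):
    strata = {}
    for row in data:
        strata.setdefault(tuple(row[v] for v in adjust), []).append(row)
    effect = 0.0
    for rows in strata.values():
        treated = [r[outcome] for r in rows if r[treatment] == 1]
        control = [r[outcome] for r in rows if r[treatment] == 0]
        if treated and control:
            diff = sum(treated) / len(treated) - sum(control) / len(control)
            effect += diff * len(rows) / len(data)
    return effect

# 4. Refute: swap the treatment for a random placebo; the estimate should vanish.
def placebo_refute(data, outcome, adjust, seed=1):
    rng = random.Random(seed)
    fake = [dict(row, placebo=int(rng.random() < 0.5)) for row in data]
    return estimate(fake, "placebo", outcome, adjust)

data = make_data(100_000)
adjustment_set = identify(GRAPH, TREATMENT)
naive = estimate(data, TREATMENT, OUTCOME, [])            # no adjustment at all
adjusted = estimate(data, TREATMENT, OUTCOME, adjustment_set)
placebo = placebo_refute(data, OUTCOME, adjustment_set)

print(f"naive: {naive:+.3f}  adjusted: {adjusted:+.3f}  placebo: {placebo:+.3f}")
```

In this simulation the naive comparison makes cameras look harmful because they are concentrated on busy roads. Adjusting for busy_road recovers an effect close to the -0.1 built into the data, and the placebo estimate stays near zero.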

Session outline:

  • Why should you care about cause and effect?
      • Moving beyond predictions
      • Descriptive vs. prescriptive: the shortfalls of ML and the rise of causality
      • The language of counterfactuals, interventions, and optimal actions
  • Causal graphs and the do-operator
      • Causal graphs as a unifying model
      • The do-operator and DoWhy process
      • The pitfalls: confounders, colliders, and adjustments
  • Overview of tools and resources
      • PyWhy and its available libraries, including DoWhy, EconML, and causal-learn
      • Emerging tools
  • Hands-on walkthrough using the DoWhy process
      • Motivational questions about neighborhood safety and traffic accidents
      • PyWhy libraries: a macro scenario to estimate the influence of different variables on neighborhood safety ratings
      • Various tools: a deeper dive into traffic accident data to identify what impacts safety
  • Strategy for creating a causal graph
      • Common hazards
      • Iteration and human in the loop
      • Pros and cons of different methods for generating causal graphs
  • Learning, cautions, and tips
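The pitfalls listed in the outline (confounders, colliders, and adjustments) come down to where a variable sits in the causal graph. As a rough sketch, assuming a hypothetical traffic graph that is not the one used in the session, the following plain-Python helper labels each variable relative to a treatment and an outcome. The labels are deliberately simplified: a confounder is a common cause, a collider is a common effect, and a mediator lies on a directed path from treatment to outcome.

```python
def descendants(graph, node, blocked=()):
    """Nodes reachable from `node` along directed edges, never entering `blocked`."""
    found, stack = set(), [node]
    while stack:
        for child in graph.get(stack.pop(), []):
            if child not in found and child not in blocked:
                found.add(child)
                stack.append(child)
    return found

def classify(graph, treatment, outcome):
    """Label every other node by its (simplified) role for treatment -> outcome."""
    roles = {}
    for node in graph:
        if node in (treatment, outcome):
            continue
        reach = descendants(graph, node)
        if treatment in reach and outcome in descendants(graph, node, blocked={treatment}):
            roles[node] = "confounder"  # common cause: adjust for it
        elif (node in descendants(graph, treatment, blocked={outcome})
              and node in descendants(graph, outcome)):
            roles[node] = "collider"    # common effect: adjusting opens a spurious path
        elif node in descendants(graph, treatment) and outcome in reach:
            roles[node] = "mediator"    # carries part of the effect: keep for total effect
        else:
            roles[node] = "other"
    return roles

# Hypothetical graph, written as node -> direct children.
traffic_graph = {
    "traffic_volume": ["speed_limit", "accidents"],
    "speed_limit": ["average_speed", "police_reports"],
    "average_speed": ["accidents"],
    "accidents": ["police_reports"],
    "police_reports": [],
}

roles = classify(traffic_graph, treatment="speed_limit", outcome="accidents")
for name, role in roles.items():
    print(f"{name}: {role}")
```

A check like this is only as good as the graph it is given, which is why the graph itself needs iteration and domain experts in the loop.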

Format

The class will begin in a lecture format and then move to hands-on notebook exercises that walk through city and safety data. You can also simply watch the class and work through the notebook examples later at your own pace.

We will work in Google Colab notebooks with standard Python packages to import city data and use PyWhy libraries to explore confounding variables and estimate causal influence. We’ll also employ Ergodic to investigate dependencies and create our traffic accident graph using embeddings for categorical data.

Level

Beginner - Intermediate

Prerequisite Knowledge

Basic Python and familiarity with graph concepts.

11 December 2024, 02:00 PM

Network Science & DataViz Stage

02:00 PM - 04:00 PM

About The Speakers

Amy Hodler

Graph Advisor and Consultant, GraphGeeks

Stage Host

Amy is highlighted as a distinguished speaker by G-Research and has authored/contributed to several books including Graph Algorithms (O’Reilly).

Andre Franca

Co-Founder and CTO, ergodic.ai, London, United Kingdom

Andre Franca is the co-founder and CTO of ergodic.ai, developing the next generation of AI for decision-making and action planning.

Location

Convene 133 Houndsditch

133 Houndsditch, London

