Lecture 1: Causal Inference in Data Science
Menoufia University
Dr. Marwa Sharaf
Summary
This lecture introduces causal inference for data science, with a focus on machine learning applications. It defines causality, contrasts predicting an outcome with understanding why it happens, and uses examples from healthcare and customer retention to show how causal inference helps identify the factors that drive the outcomes we care about.
Full Transcript
Presented by: Dr. Marwa Sharaf
Chapter 1 – Lecture 1

Outline
◼ What is causality?
◼ What is causal inference?
◼ Why causal inference?
◼ Steps of causal inference.
◼ Types of association.
◼ Characteristics of a cause.
◼ Types of causal relationship.
◼ Types of a cause.
◼ Criteria for assessing causes.
◼ How causal inference works.
◼ A mathematical definition of a causal model.
◼ Causal inference vs. prediction.
◼ Causal inference vs. association.
◼ Causal inference examples.

What is Causality?

What is Causal Inference in Data Science?
◼ Causal Inference for Data Science introduces data-centric techniques and methodologies you can use to estimate causal effects.
◼ With causal inference, it's all about figuring out why something happens.
◼ It identifies the relationship between two variables: A causes B if a change in A produces a change in B.

Why Causal Inference?
◼ In many businesses and organizations, when we use machine learning, our goal is to make educated guesses about what will happen in the future.
◼ For example, a hospital might want to guess which patients will become very sick soon, so it can treat these patients first. Often, just being able to make these guesses is enough; understanding why they happen isn't always necessary.
◼ With causal inference, it's all about figuring out why something happens. More than that, it's about asking what can be done to change an outcome.
◼ For instance, a hospital might want to understand which factors cause a certain illness. If it knows these causes, it can take steps such as making public health policies or developing drugs to prevent the illness, aiming to reduce the number of people who get sick.

Why Causal Inference for Data Scientists?
◼ As data scientists or analysts, the questions we're most interested in often involve understanding cause and effect. We say that X causes Y if, when we change X, Y also changes.
◼ For example, if your goal is to keep your customers longer, you might want to know what actions could make them stay. This is a causal question: you're trying to find out what's behind your customer retention rates so you can improve them.
◼ This idea applies to many areas, like designing marketing strategies, setting prices, adding new features to an app, making changes in an organization, introducing new policies, or developing medications. Understanding causality helps us see the effects of our decisions and identify which factors influence the outcomes we care about.
◼ Understanding causality isn't straightforward. For example, imagine you're trying to figure out why some people get sick more than others. When looking at the data, you notice that people living in the country seem to get sick more often than city dwellers. Does this mean that living in the country makes people sick? If that were true, moving to a city should mean you get sick less often. But is that the whole story?
◼ Living in a city has its own challenges, like more pollution, lack of fresh food, and stress. So the fact that city dwellers get sick less often might not come from the city itself: city residents usually have higher incomes, which afford them better healthcare, more nutritious food, and gym memberships. If this is what's really happening, then moving from the country to a city might not actually make you healthier. In fact, without the income to afford healthcare or mitigate new city-based health risks, you might even get sicker.
◼ This is the reason learning about causal inference is so important for data scientists: it gives us tools to estimate causal effects. That is, it helps us distinguish mere correlations from true causes (causation), allowing us to identify the actual factors that lead to certain outcomes.
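
The country-versus-city example can be made concrete with a small simulation. The sketch below is only an illustration under stated assumptions: all probabilities are invented, income is assumed to be the only confounder, and the function names (simulate_population, naive_difference, adjusted_difference) are hypothetical. It builds a population in which living in a city slightly raises the risk of sickness, while higher income both makes city living more likely and lowers that risk. It then compares the raw city-minus-country difference in sickness rates with the same difference computed within income groups.

```python
import random

def simulate_population(n, seed=0):
    """Simulate n people as (lives_in_city, high_income, sick) tuples.

    All probabilities are made up for illustration only.
    """
    rng = random.Random(seed)
    people = []
    for _ in range(n):
        high_income = rng.random() < 0.5
        # Income influences where people live (assumed values).
        lives_in_city = rng.random() < (0.8 if high_income else 0.2)
        # Income influences health (assumed values).
        p_sick = 0.10 if high_income else 0.30
        # Assumed true causal effect of the city: +5 points of risk.
        if lives_in_city:
            p_sick += 0.05
        sick = rng.random() < p_sick
        people.append((lives_in_city, high_income, sick))
    return people

def sick_rate(rows):
    """Fraction of people in rows who are sick."""
    return sum(sick for _, _, sick in rows) / len(rows)

def naive_difference(people):
    """City sickness rate minus country sickness rate, ignoring income."""
    city = [p for p in people if p[0]]
    country = [p for p in people if not p[0]]
    return sick_rate(city) - sick_rate(country)

def adjusted_difference(people):
    """Compare city and country within each income group, then average
    the two differences weighted by each group's share of the population."""
    total = 0.0
    for income_level in (True, False):
        stratum = [p for p in people if p[1] == income_level]
        city = [p for p in stratum if p[0]]
        country = [p for p in stratum if not p[0]]
        weight = len(stratum) / len(people)
        total += weight * (sick_rate(city) - sick_rate(country))
    return total

people = simulate_population(200_000)
naive = naive_difference(people)
adjusted = adjusted_difference(people)
print(f"naive city - country difference: {naive:+.3f}")
print(f"income-adjusted difference:      {adjusted:+.3f}")
```

With these assumed parameters the naive difference comes out negative (city dwellers look healthier), while the income-adjusted difference is positive and close to the +0.05 effect built into the simulation. Comparing like with like within income groups works here only because income is, by construction, the single common cause; in real data the relevant confounders are usually not known in advance.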