Sold out, but you can still join the waitlist!

Causal Inference for Data Science

This course teaches you how to answer the most fundamental question in data science: Does X cause Y? We’ve designed the course to cover the essential causal inference techniques while applying them to relevant, real-world examples. By the end of this course, you will know how to identify causal inference problems, pick the right tool for the job, and understand how these tools can break. You will be well-equipped to get maximum value from observational data sets and have the foundation to expand your causal inference toolkit with cutting-edge methods.

Vinod Bakthavachalam
Data Scientist, Netflix
US$ 400
or included with membership
Coming soon

Course taught by expert instructors


Vinod Bakthavachalam

Data Scientist, Netflix

Vinod works at Netflix, where he builds causal inference and machine learning models to understand subtitle and dubbing quality. Before Netflix, Vinod worked at Pinterest and Coursera. He holds degrees in Economics, Biology, and Statistics from UC Berkeley and began his career in quantitative finance, where he disliked the soul-crushing work. In addition to his professional experience, Vinod is a political junkie who builds election forecasting models, with his work appearing in the NY Times among other places. When he is not analyzing data on his computer, Vinod is creating cocktail recipes (without using ChatGPT).

The course

Learn and apply skills with real-world projects.

Who is it for?
  • Data practitioners who want to increase their impact by getting more value from observational data and rigorously analyzing complex business questions.

  • Data analysts who want to expand their analytics toolkit and transition into more methodologically demanding roles.

Prerequisites
  • Ability to use Python for data munging, visualization, and basic statistical inference

  • Knowledge of statistical inference at the 101 level (e.g., at the level of Uplimit's Applied Statistics for Data Science)

  • Some familiarity with A/B testing and linear regression

Week 1
  • How to use linear regression for causal inference
  • Ways that linear regression can go wrong
  • The importance of simple models and data visualizations when investigating causal relationships
Throughout this course, we will work with a synthetic data set of a retailer that sells goods online and in physical stores. This week, we will get familiar with the data set and begin to answer our central causal question: Does convincing a customer to shop in person increase their lifetime value?
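To give a flavor of this kind of question, here is a minimal Python sketch. It is not the course data set or course material: the variable names (`engagement`, `shops_in_person`, `ltv`), the effect size, and the data-generating process are all assumptions made up for illustration. It simulates a confounder that drives both in-person shopping and lifetime value, then compares a naive regression with one that adjusts for the confounder.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20_000
TRUE_EFFECT = 50.0  # hypothetical lift in lifetime value from shopping in person

# Hypothetical confounder: engaged customers shop in person more often
# AND spend more regardless of where they shop.
engagement = rng.normal(size=n)
shops_in_person = (engagement + rng.normal(size=n) > 0).astype(float)
ltv = (
    200
    + TRUE_EFFECT * shops_in_person
    + 30 * engagement
    + rng.normal(scale=20, size=n)
)


def ols(y, *columns):
    """Least-squares coefficients for y ~ intercept + columns."""
    X = np.column_stack([np.ones_like(y), *columns])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


# Naive model: ltv ~ shops_in_person (ignores the confounder).
naive_effect = ols(ltv, shops_in_person)[1]

# Adjusted model: ltv ~ shops_in_person + engagement.
adjusted_effect = ols(ltv, shops_in_person, engagement)[1]

print(f"true effect:     {TRUE_EFFECT:.1f}")
print(f"naive estimate:  {naive_effect:.1f}")
print(f"adjusted estimate: {adjusted_effect:.1f}")
```

In this simulation the naive estimate overstates the effect because engaged customers would have spent more anyway. The adjusted estimate recovers it only because the confounder is observed and the model is correctly specified, which real data rarely guarantees.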

Week 2
  • How to use instrumental variable analysis and regression discontinuity designs for causal inference
  • The assumptions underlying each method and how to interrogate them
  • How IV and RDD interrelate and enrich your causal inference toolkit
We will apply IV analysis to our causal question of interest to see how it enables richer inference than regression alone. Along the way, we will become more comfortable thinking through the assumptions of both IV and RDD.
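As a rough illustration of why IV can go further than regression, here is a minimal Python sketch under strong assumptions. The instrument (`coupon`, a randomly mailed in-store coupon) is hypothetical, as are the other names and the simulated data. The sketch assumes the coupon is randomly assigned, shifts in-person shopping, and affects lifetime value only through in-person shopping. With a single binary instrument and no covariates, the IV estimate reduces to a ratio of covariances.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50_000
TRUE_EFFECT = 50.0  # hypothetical, constant across customers

# Unobserved confounder: we simulate it but never use it in estimation.
engagement = rng.normal(size=n)

# Hypothetical instrument: a randomly mailed in-store coupon.
coupon = rng.integers(0, 2, size=n).astype(float)

# The coupon nudges customers into stores; engagement does too.
shops_in_person = (
    0.8 * coupon + engagement + rng.normal(size=n) > 0.4
).astype(float)
ltv = (
    200
    + TRUE_EFFECT * shops_in_person
    + 30 * engagement
    + rng.normal(scale=20, size=n)
)

# Naive regression slope of ltv on shops_in_person (confounded).
naive_effect = (
    np.cov(shops_in_person, ltv)[0, 1] / np.var(shops_in_person, ddof=1)
)

# IV (Wald) estimate: effect of the coupon on ltv, divided by
# the effect of the coupon on in-person shopping.
first_stage = np.cov(coupon, shops_in_person)[0, 1]
reduced_form = np.cov(coupon, ltv)[0, 1]
iv_effect = reduced_form / first_stage

print(f"naive estimate: {naive_effect:.1f}")
print(f"IV estimate:    {iv_effect:.1f}")
```

The IV estimate is noisier than the regression estimate, and it is only as credible as the assumptions about the instrument, which is why interrogating those assumptions matters.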

Week 3
  • How to use difference-in-differences to leverage time for causal inference
  • How to think of panel data analysis as a generalization of DD
  • The importance of adjusting your standard errors when time is a variable
We will enrich our analysis further by re-conceiving our data set as a panel and applying the techniques we learned this week. We will then put our analyses together in a final, executive-friendly report or presentation summarizing our findings.
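For a sense of what treating the data as a panel buys us, here is a minimal two-period difference-in-differences sketch in Python. Everything in it is an assumption for illustration: the treatment (`treated`, a customer living near a hypothetical newly opened store), the spend columns, and the simulated data, which build in parallel trends by construction.

```python
import numpy as np

rng = np.random.default_rng(2)
n_customers = 5_000
TRUE_EFFECT = 50.0   # hypothetical effect of the new store on spend
COMMON_TREND = 15.0  # change over time shared by both groups

# Hypothetical treatment: customer lives near a newly opened store.
treated = rng.integers(0, 2, size=n_customers).astype(bool)

# Treated customers already spend more at baseline, so a simple
# after-period comparison would be misleading.
baseline = 200 + 40 * treated + rng.normal(scale=30, size=n_customers)
spend_before = baseline + rng.normal(scale=20, size=n_customers)
spend_after = (
    baseline
    + COMMON_TREND
    + TRUE_EFFECT * treated
    + rng.normal(scale=20, size=n_customers)
)

# Naive comparison: treated vs. untreated in the after period only.
naive_gap = spend_after[treated].mean() - spend_after[~treated].mean()

# Difference-in-differences: compare the change over time across groups,
# which removes both the baseline gap and the common trend.
change = spend_after - spend_before
did_effect = change[treated].mean() - change[~treated].mean()

print(f"naive after-period gap: {naive_gap:.1f}")
print(f"DD estimate:            {did_effect:.1f}")
```

The sketch reports point estimates only. With more than two periods, repeated observations of the same customer are correlated, so standard errors need the kind of adjustment this week covers.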

Week 4
  • How machine learning is changing the field of causal inference
  • High-level summaries of advanced causal inference techniques and when they might be helpful
  • Where to go from here
You now have the complete causal inference toolkit! There is no limit to the questions you can answer (well, depending on available data). We will spend this week wrapping up our report. For those of us who finish early, there is an optional task: find a data set in the wild and apply our favorite technique to answer a question we’re passionate about.

A course you'll actually complete. AI-powered learning that drives results.

AI-powered learning

Transform your learning programs with personalized learning. Real-time feedback, hints at just the right moment, and support for learners when they need it drive 15x engagement.

Live courses by leading experts

Our instructors are renowned experts in AI, data, engineering, product, and business. Dive deep with always-current live sessions and round-the-clock support.

Practice on the cutting edge

Accelerate your learning with projects that mirror the work done at industry-leading tech companies. Put your skills to the test and start applying them today.

Flexible schedule for busy professionals

We know you’re busy, so we made it flexible. Attend live events or review the materials at your own pace. Our course team and global community will support you every step of the way.


Completion certificates

Each course comes with a certificate for learners to add to their resume.

Best-in-class outcomes

15-20x engagement compared to async courses

Support & accountability

You are never alone: we provide support throughout the course.

Get reimbursed by your company

More than half of learners get their courses and memberships reimbursed by their company.

Hundreds of companies have dedicated L&D and education budgets that have covered the costs.


Frequently Asked Questions

Still not sure?

Get in touch and we'll help you decide.