Space is limited

Data Centric Deep Learning

Learn to build, improve, and repair deep learning models with a data-centric approach. This course puts you in the shoes of a deep learning engineer and simulates the real-world challenges of improving data quality, building and testing deep learning models, and improving performance with a human in the loop. Week by week, we will develop an understanding of the critical role of data in deep learning operations, from integration tests to deep learning tooling to iterative annotation. Learn the best practices for deep learning in the real world.

Andrew Maas
Co-founder and CEO of Pointable
Mike Wu
PhD Scholar at Stanford
US$400
or included with membership
4 weeks

Course taught by expert instructors

Andrew Maas

Co-founder and CEO of Pointable

Andrew Maas is co-founder and CEO of Pointable, a platform for metrics-driven development of RAG-LLM conversational agents. He previously led teams developing data-centric deep learning approaches at Apple and was a co-founder of Roam Analytics, a natural language extraction platform for healthcare (acquired by Parexel). Andrew earned a PhD in computer science from Stanford University, advised by Andrew Ng and Dan Jurafsky, where his work focused on large-scale deep learning for spoken and written language. Andrew also advises machine learning startups and teaches a graduate course on spoken language processing at Stanford.

Mike Wu

PhD Scholar at Stanford

Mike Wu is currently a fifth-year PhD student at Stanford University, advised by Noah Goodman. His research spans the fields of inference algorithms, deep generative models, and unsupervised learning. Mike’s research has appeared in NeurIPS, ICLR, AISTATS, and other top ML conferences, with two best paper awards, and his work has been featured in the New York Times. Mike previously worked as a software engineer at an AI startup called Lattice Data, and as a research engineer at Meta’s applied machine learning group. Mike and Andrew designed and taught a new version of Stanford’s CS224S: Spoken Language Processing in 2022.

The course

Learn and apply skills with real-world projects.

Who is it for?
  • Students who want to learn the infrastructure and operations behind practical deep learning for real-world applications.

  • Students who have taken the first two courses in the Uplimit ML foundations track.

  • Data scientists and research engineers looking for best practices in building and maintaining deep learning models.

  • Students curious about the new data-centric approach to ML and AI.

Prerequisites

  • Familiarity with Python and comfort reading documentation to learn new tools. Uplimit's Python for Machine Learning course or equivalent.

  • Experience in basic machine learning and data science. Uplimit Introduction to Applied ML: Supervised Learning course or equivalent.

  • Basic web development with tools like Flask. Students do not need to be experts at building web applications.

  • Basic experience in deep learning, including using PyTorch. Uplimit Deep learning essentials, ML Coursera course, or equivalent.

Course syllabus

  • How to inspect and improve data quality and annotation quality.
  • How to identify and remove data anomalies or outliers.
  • The types of annotation errors and their effects on model performance.
  • Data analysis in NLP and computer vision.
  • Simulations of annotation errors and a model evaluation framework.
  • Annotation analysis for (1) a bounding-box task for object detection and (2) a text span task for entity recognition.
  • Train deep learning models in two different modalities: text and images.
  • How to construct reproducible end-to-end machine learning workflows.
  • How to fine-tune small networks on top of foundation models in computer vision.
  • Post-training processing (such as exporting, tracking, compression) of deep learning models for deployment.
  • Best practices for continuous testing of deep learning models.
  • Comfort with popular deep learning tools like Weights & Biases, ONNX, and FastAPI.
  • Integration tests, regression tests, and directionality tests for model quality assurance.
  • A Metaflow pipeline that chains together training, evaluation, and deployment on a benchmark dataset of handwritten digits.
  • The role of active learning and self-learning in a deep learning framework.
  • How to use unlabeled data and model uncertainty to improve performance.
  • Best practices for designing web applications with embedded ML models.
  • Tools to identify which examples to prioritize for labeling.
  • Tools to noisily label large batches of data quickly without a third party service.
  • A lightweight web application in Flask that supports human-in-the-loop labeling.
  • How to identify and handle distribution shift and adversarial examples.
  • The different types of distribution shift in NLP and computer vision.
  • Data augmentation techniques for model robustness.
  • Leverage the implemented workflows to quickly retrain and deploy a model.
  • Pipeline to handle the appearance of a new label class.
  • Repair models in response to adversarial examples in a visual classification task with outlier image watermarks.
  • Monitoring tools to track model performance and detect distribution shifts.

A course you'll actually complete. AI-powered learning that drives results.

AI-powered learning

Transform your learning programs with personalized learning. Real-time feedback, hints at just the right moment, and support for learners when they need it drive 15x engagement.

Live courses by leading experts

Our instructors are renowned experts in AI, data, engineering, product, and business. Deep dive through always-current live sessions and round-the-clock support.

Practice on the cutting edge

Accelerate your learning with projects that mirror the work done at industry-leading tech companies. Put your skills to the test and start applying them today.

Flexible schedule for busy professionals

We know you’re busy, so we made it flexible. Attend live events or review the materials at your own pace. Our course team and global community will support you every step of the way.


Completion certificates

Each course comes with a certificate for learners to add to their resume.

Best-in-class outcomes

15-20x engagement compared to async courses

Support & accountability

You are never alone: we provide support throughout the course.

Get reimbursed by your company

More than half of learners get their courses and memberships reimbursed by their company.

Hundreds of companies have dedicated L&D and education budgets that have covered the costs.


Course success stories

Learn together and share experiences with other industry professionals

This course is incredibly important and useful! I believe it should be required in any data-science curriculum. We gained practical skills to tackle problems that data scientists and machine learning engineers often face when dealing with real-world messy data. I learned so much more than the course material due to the encouragement and guidance of Mike Wu!

Mateo Ibarguen, Data Scientist

DCDL has taken my experience with ML from modeling datasets in Colab notebooks to working in a full ML system in a codebase. We touched upon the full lifecycle of ML — from annotating and cleaning data, to model training, to evaluation and testing, deployment, and monitoring. What an incredibly insightful 4 weeks of learning!

Max Allen, Risk Engineering @ Ramp

This final course in the ML track series provided a realistic framework bridging the concepts we have covered in all 3 classes into a more productionalized format. This course has given a real insight into what a real ML backend may look like and the steps required to get there.

Josh Feagans, DevOps Engineer at Charles Schwab

Still not sure?

Get in touch and we'll help you decide.