SRE Fundamentals with Google
Unlock the core principles of Site Reliability Engineering (SRE) in this hands-on, four-week course. Designed for individuals of all backgrounds, this course equips you with the skills to collect performance data, manage incidents, streamline operations, and confidently build and deploy services. Dive into the world of SRE and deepen your understanding of system reliability, preparing you to explore careers in the field.
Course taught by expert instructors
Site Reliability Engineer, Google
Salim Virji develops reliable engineering practices and processes for Google’s SRE organization, and has built consensus and storage products for Google infrastructure. Salim’s interests include distributed systems and machine learning. He has contributed to several books on SRE, including The Site Reliability Workbook and Implementing Service Level Objectives. Salim received an AB in Classics from the University of Chicago and is a New York City Master Composter.
Learn and apply skills with real-world projects.
This course is for people with a general interest in Site Reliability Engineering, whether or not they have any formal background in information technology.
This course offers rigorous grounding in SRE fundamentals, and will appeal to people seeking to build on existing SRE skills.
Comfort with high-school-level algebra
Try these prep courses first
- Practical Alerting
- Monitoring: Four Golden Signals, Principles, and Tools
- Expand your knowledge of working with alerts to consider ad-hoc queries on data in your systems
- Use tracing tools to identify issues within a service
- Evaluate the cost of adding monitoring and tracing to your existing systems
- Incident Management and Response / Anatomy of an Incident
- Postmortems and Postmortem Culture: Learning from Failure
- Extract useful information from tools, documents, and logs
- Write a clear narrative explaining the scenario, its root cause, and suggested course of action
- Focus on addressing the problem so that it does not recur
- Reducing toil: what constitutes toil, and how can I effectively reduce it in my role?
- How on-call shifts can lead to service improvements
- Identify the sources of bugs, feature requests, and interrupts
- Consider ways to address these comprehensively
- Write out a process to systematically triage and report on these requests
- The value of automation
- Models of release engineering
- Build a release candidate
- Deploy the release candidate
- Roll back the binary or service to a previous version
A course you'll actually complete. AI-powered learning that drives results.
Transform your learning programs with personalized learning. Real-time feedback, hints at just the right moment, and support for learners when they need it drive 15x engagement.
Live courses by leading experts
Our instructors are renowned experts in AI, data, engineering, product, and business. Deep dive through always-current live sessions and round-the-clock support.
Practice on the cutting edge
Accelerate your learning with projects that mirror the work done at industry-leading tech companies. Put your skills to the test and start applying them today.
Flexible schedule for busy professionals
We know you’re busy, so we made it flexible. Attend live events or review the materials at your own pace. Our course team and global community will support you every step of the way.
Each course comes with a certificate for learners to add to their resume.
15-20x engagement compared to async courses
Support & accountability
You are never alone: we provide support throughout the course.
Get reimbursed by your company
More than half of learners get their Courses and Memberships reimbursed by their company.
Hundreds of companies have dedicated L&D and education budgets that have covered the cost.