#36 CrowdStruck
CrowdStruck: CrowdStrike Outage Analysis & Lessons in Reliability
In this episode, Arun and Jake analyze the recent CrowdStrike outage that affected millions of devices and critical industries worldwide. They discuss the technical reasons behind the incident, including configuration update issues with the Falcon sensor's content interpreter, potential lapses in testing procedures, and the broader economic impact. The conversation also touches on the importance of robust software testing and deployment practices, particularly for companies that serve as foundational platforms in the tech ecosystem. Finally, they consider the repercussions for CrowdStrike's market position and the remediation steps that may follow.
00:00 Introduction and Episode Setup
00:12 Discussing Past Episodes and Future Plans
00:48 CrowdStrike Outage Overview
02:10 Technical Breakdown of the Outage
04:20 Testing and Deployment Practices
15:04 Impact and Consequences of the Outage
27:03 Recovery and Future Prevention
36:07 Closing Thoughts and Hot Take Segment