A curated list of Site Reliability and Production Engineering resources.
A collection of postmortems. Sorry for the delay in merging PRs!
A curated collection of publicly available resources on how technology companies and tech-savvy organizations around the world practice Site Reliability Engineering (SRE)
Compilation of public failure/horror stories related to Kubernetes
A collection of postmortem templates
A curated list of Site Reliability and Production Engineering Tools
An Incident Management Process / Post Mortem Template
Selection of Development Templates
How to run effective incident post-mortems
💀 🔥 ❄️ A basic analyzer for memory dumps containing managed code
Compilation of public incident/interesting/horror stories related to Kafka operations
Perform post-mortem Linux baselining and forensic analysis.