# SRE

Site reliability engineering (SRE) is a set of principles and practices that applies aspects of software engineering to infrastructure and operations problems. Its main goal is to create scalable and highly reliable software systems. SRE is closely related to DevOps, a set of practices that combines software development and IT operations, and has also been described as a specific implementation of DevOps.

bregman-arie/devops-exercises

#Interview#DevOps interview questions covering Linux, Jenkins, AWS, SRE, Prometheus, Docker, Python, Ansible, Git, Kubernetes, Terraform, OpenStack, SQL, NoSQL, Azure, GCP, DNS, Elastic, networking, virtualization, and more

Python 73.98 k
21 days ago
https://static.github-zh.com/github_avatars/awesome-foss?size=40

#Awesome#A curated list of amazingly awesome open-source sysadmin resources.

28.58 k
10 days ago
https://static.github-zh.com/github_avatars/dastergon?size=40
12.31 k
10 months ago
upgundecha/howtheysre

A curated collection of publicly available resources on how technology and tech-savvy organizations around the world practice Site Reliability Engineering (SRE)

JavaScript 9.29 k
2 months ago
runatlantis/atlantis

Terraform Pull Request Automation

Go 8.18 k
6 hours ago
https://static.github-zh.com/github_avatars/isno?size=40

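⭐ [Open-source book] An in-depth look at kernel networking, Kubernetes, ServiceMesh, containers, and other cloud-native technologies. A practice-tested guide to DevOps and SRE. If you find an error, please open an issue.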

JavaScript 8.05 k
10 days ago
linkedin/school-of-sre

At LinkedIn, we are using this curriculum for onboarding our entry-level talents into the SRE role.

HTML 7.94 k
8 months ago
https://static.github-zh.com/github_avatars/k8sgpt-ai?size=40
Go 6.48 k
4 hours ago
https://static.github-zh.com/github_avatars/StackStorm?size=40

StackStorm (aka "IFTTT for Ops") is event-driven automation for auto-remediation, incident responses, troubleshooting, deployments, and more for DevOps and SREs. Includes rules engine, workflow, 160 integration packs with 6000+ actions (see https://exchange.stackstorm.org) and ChatOps. Installer at https://docs.stackstorm.com/install/index.html. Questions? https://forum.stackstorm.com/.

Python 6.23 k
3 days ago
https://static.github-zh.com/github_avatars/hjacobs?size=40

Compilation of public failure/horror stories related to Kubernetes

HTML 6.22 k
5 years ago
https://static.github-zh.com/github_avatars/chaosblade-io?size=40

An easy to use and powerful chaos engineering experiment toolkit, open-sourced by Alibaba.

Go 6.1 k
14 days ago
https://static.github-zh.com/github_avatars/rundeck?size=40

Enable Self-Service Operations: Give specific users access to your existing tools, services, and scripts

Groovy 5.74 k
8 hours ago
litmuschaos/litmus

Litmus helps SREs and developers practice chaos engineering in a Cloud-native way. Chaos experiments are published at the ChaosHub (https://hub.litmuschaos.io). Community notes is at https://hackmd....

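Litmus is a toolset for practicing chaos engineering in a Kubernetes-native way. It provides chaos CRDs that let cloud-native developers and SREs inject, orchestrate, and monitor chaos to find weaknesses in production Kubernetes deployments.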

Go 4.66 k
8 hours ago
https://static.github-zh.com/github_avatars/jonmosco?size=40
Shell 3.65 k
1 month ago
https://static.github-zh.com/github_avatars/leandromoreira?size=40

CDN Up and Running - Building a CDN from Scratch to Learn about CDN, Nginx, Lua, Prometheus, Grafana, Load balancing, and Containers.

Lua 3.47 k
1 year ago