97 Things Every SRE Should Know: Collective Wisdom from the Experts
Emil Stolarsky, Jaime Woo
- Publisher: O'Reilly
- Publication date: 2020-12-29
- List price: $1,700
- Sale price: $1,496 (12% off)
- Language: English
- Pages: 252
- Binding: Quality paper (also called trade paper)
- ISBN: 1492081493
- ISBN-13: 9781492081494
- Related categories: DevOps
- Related translations: SRE工程師應知應會97件事 (Simplified Chinese edition)
Product description
When your system goes down, every minute means lost business and angry customers venting frustration on social media. You may be at wits' end, wishing you knew more about the problem. Enter site reliability engineering (SRE). This practical book takes you through actionable advice on a wide range of topics including how to adopt SRE, where DevOps and SRE overlap, and how monitoring and observability differ.
Editors Jaime Woo and Emil Stolarsky, cofounders of Incident Labs, have collected 97 concise and useful tips from various colleagues and fellow professionals to help you expand your SRE skills through trusted best practices and new approaches to knotty problems. You'll hone your SRE skills through sound advice, including how to ask thought-provoking questions that will drive the direction of the field.
- Learn how SRE relates to concepts including DevOps and resilience engineering
- Assess how SRE is implemented across companies of different sizes
- Implement foundational concepts of SRE, including SLOs, error budgets, incident response, game days, and post-mortems
- Build and scale an SRE team for your organization's changing needs
- Evaluate the progress of SRE adoption and strategies and relate them back to stakeholders
About the authors
Emil Stolarsky is a site reliability engineer who previously worked on caching, performance, and disaster recovery at Shopify and on the internal Kubernetes platform at DigitalOcean. He is the program co-chair for SREcon EMEA 2019 and SREcon Americas West 2020, and contributed a chapter to the O'Reilly book "Seeking SRE."
Jaime Woo is an award-nominated writer and a frequent speaker at SREcon EMEA, Americas West, and Americas East. He spent three years as a molecular biologist before working at DigitalOcean, Riot, and Shopify, where he launched the engineering communications function.