Implementing Service Level Objectives: A Practical Guide to SLIs, SLOs, and Error Budgets

Hidalgo, Alex

  • Publisher: O'Reilly
  • Publication date: 2020-10-06
  • List price: $2,200
  • Sale price: $2,090 (5% off)
  • Language: English
  • Pages: 404
  • Binding: Quality Paper (also called trade paper)
  • ISBN: 1492076813
  • ISBN-13: 9781492076810
  • Ships immediately (fewer than 3 in stock)

Product description

Although service-level objectives (SLOs) continue to grow in importance, there's a distinct lack of information about how to implement them. Practical advice that does exist usually assumes that your team already has the infrastructure, tooling, and culture in place. In this book, recognized SLO expert Alex Hidalgo explains how to build an SLO culture from the ground up.

Ideal as a primer and daily reference for anyone creating both the culture and tooling necessary for SLO-based approaches to reliability, this guide provides detailed analysis of advanced SLO and service-level indicator (SLI) techniques. Armed with mathematical models and statistical knowledge to help you get the most out of an SLO-based approach, you'll learn how to build systems capable of measuring meaningful SLIs with buy-in across all departments of your organization.

  • Define SLIs that meaningfully measure the reliability of a service from a user's perspective
  • Choose appropriate SLO targets, including how to perform statistical and probabilistic analysis
  • Use error budgets to help your team have better discussions and make better data-driven decisions
  • Build supportive tooling and resources required for an SLO-based approach
  • Use SLO data to present meaningful reports to leadership and your users

About the author

Alex Hidalgo is a Site Reliability Engineer and expert at all things related to Service Level Objectives. He developed an interest in computers at a young age, started writing his first BASIC programs at around the age of nine, and remembers the Internet when it was all still text. He eventually turned his hobby into a career, working in various capacities as a network engineer, security engineer, and systems administrator, and in many roles within the world of IT support. After moving to New York, he joined Admeld as a Technical Operations Engineer, only to find himself employed by Google a few months later due to an acquisition.

At Google, Alex was first introduced to the discipline of Site Reliability Engineering, which connected so closely with him that he wonders how he ever did anything else. Eventually, he found his other calling as an educator, writer, and speaker, traveling all over the world training other Site Reliability Engineers, becoming one of the primary developers of the Coursera Google IT Professional Certification, and contributing to multiple chapters of The Site Reliability Workbook -- most notably "Implementing SLOs" and "SLO Engineering Case Studies."

He recently joined Squarespace, where his focus is now on spreading the concepts of SLO-based approaches to service reliability -- both internally and across the entire industry. When he is not sharing his passion for error budgets with others, you can find him scuba diving or watching college basketball. He lives in Park Slope, Brooklyn, with his partner Jen and a rescue dog named Taco. He thinks about SLOs so much that he once had a dream about defining some for Taco. Twitter handle: @ahidalgosre
