Software at Scale

Software at Scale 7 - Charity Majors: CTO, Honeycomb

Charity Majors is the CTO of Honeycomb, a platform that helps engineers understand their own production systems through observability. Honeycomb differs from traditional monitoring tools like Wavefront in that it is built for data with high cardinality and high dimensionality, which can greatly speed up the debugging of many problems.

Apple Podcasts | Spotify | Google Podcasts

NOTE: This episode has some explicit language.

We talk about observability, monitoring, building your own database for a particular use case, starting a developer tool startup, having the right oncall culture, getting to fifteen-minute deployments, and more.

Highlights

Notes are italicized

05:00 - High cardinality and high dimensionality in Honeycomb. Data retention in Honeycomb - 60 days. Many monitoring systems, like Dropbox’s Vortex, downsample data after two weeks

13:00 - Observability-driven development. The impact of deploying code within 15 minutes of it being merged. Synchronous and asynchronous engineering workflows

19:00 - Setting up oncall rotations. What the size of a rotation should be

21:00 - How often should someone on a 24/7 oncall rotation be woken up? Once or twice, with exceptions. Why parts of the Google SRE book’s “Being Oncall” chapter are impractical. Oncall for managers

Utsav (@utsav_sha): There should be a version of Google SRE Book's "Being Oncall" (sre.google/sre-book/being…) for companies that don't have money printing machines. At least 8 people for every rotation? I wish

31:00 - Why are monitoring tools so ubiquitous compared to observability tools?

36:00 - Observability & Tracing. What the future of observability infrastructure might look like

40:00 - What will the job of an SRE look like in the future? The split of roles in software engineering organizations in the future

43:00 - Shipping code faster makes engineers happier. How to ensure your engineering organization is healthy, and which metrics to use. Learned helplessness in engineering organizations, and leadership failures

51:00 - Building internal tools in-house vs using external tools. The large impact that designers at Honeycomb have had on the product.

58:00 - The story of starting Honeycomb. Creating a “Minimum Lovable Product”. A description of Honeycomb's internal architecture. Dealing with tail latencies.

71:00 - Continuous Deployment and releasing code quickly. Use calendly.com/charitym if you want to chat with Charity about continuous deployment best practices or anything else.
