Featured Posts

Hand-picked, artisanally curated and lovingly chosen by yours truly. Guaranteed to tickle the tastebuds of your mind and slowly set you off to the land of nod.

  • The Theory Behind Understanding Failure

    In the last 100 years, there’s been a lot of intense and distributed advancement in technology, and…

  • 4d6 Psychic Damage: The effects of meaningless work

    At the centre of the story of society is a moral: If you love your job, you won’t work a day in…

  • CI/CD Best Practises: Scaling A Delivery Platform

    After 1.5 years managing the Delivery team at Squarespace, it’s highlighted some things I’ve…

  • The Importance Of A Golden Path

    In software engineering, the Golden Path is all about opinions and assumptions in a company. As we…

  • The Incident Response Lifecycle

    This document is about the theory of incident response, it is not a prescription for how to do…

  • Psychological Safety and the Only Pyramid Scheme That Works

    It's a strange phenomenon that I've seen time and time again where if you lay out processes and…

  • Improving Reliability by Splitting Up API Breaking Changes

    Often, when someone works on changes that span multiple services, they think of it as a separate…

  • Snippet: Repairing a Degraded Raid Array

    RAID arrays are a way we make data robust but what happens when they fail? Learn how to repair a…

  • ChatOps: Building Someone You’d Want To Have A Beer With

    As Operations Engineers, we often overlook the user experience of tooling in favour of…

The Theory Behind Understanding Failure
By Evan Smith

In the last 100 years, there’s been a lot of intense and distributed advancement in technology and in our use of it as a species. As technology advanced, it also brought catastrophic and costly failure. There’s a lot to be learned about the theory behind failure, safety and resiliency from the events of the 20th century.

4d6 Psychic Damage: The effects of meaningless work
By Evan Smith

At the centre of the story of society is a moral: If you love your job, you won’t work a day in your life. So what happens if you believe you work a bullshit job? When the meaning has evaporated and left behind only questions and uncertainty? Join me in exploring the intersection of our identity and work.

CI/CD Best Practises: Scaling A Delivery Platform
By Evan Smith

After 1.5 years managing the Delivery team at Squarespace, I’ve had a chance to reflect on some things I’ve learned about CI/CD throughout my career. If you’re part of the team that manages CI/CD at your company, hopefully this helps you understand the practical steps to run things quickly, some cultural values that underpin what you do, and how to scale your platform.

The Importance Of A Golden Path
By Evan Smith

In software engineering, the Golden Path is all about opinions and assumptions in a company. As we grow as engineers and organisations, we build opinions on how to write, build, test and deploy code.

The Incident Response Lifecycle
By Evan Smith

This document is about the theory of incident response; it is not necessarily a prescription for how to do incident response. Its aim is to familiarise you with the lifecycle of an incident and give you general advice. If you are not the Incident Commander (IC) for an incident, the information in this document is still useful for understanding the current priorities of the incident, what can be done to help, and how to hold the IC accountable.

Inbox Zero: How I Handle Email
By Evan Smith

The central idea behind all of this is that you treat your email inbox like a real inbox tray: Once something is dealt with, it's archived. Only the things you are dealing with right now stay in your inbox, as a reminder that they are to-dos.

Psychological Safety and the Only Pyramid Scheme That Works
By Evan Smith

It's a strange phenomenon that I've seen time and time again where if you lay out processes and tools that make things like software deployments safer, the effects continue to compound long after the change has happened. As people feel more and more secure in doing deployments, raising issues and speaking confidently in a company, the amount of failure goes down.

Customer Communication During Incidents: The How-to of Status Page Updates
By Evan Smith

Something we don't spend nearly enough time trying to master as engineers is external communication. When all hell's broken loose, how do you calm and reassure thousands of customers that you're on the case?

Improving Reliability by Splitting Up API Breaking Changes
By Evan Smith

Often, when someone works on changes that span multiple services, they think of it as a separate Pull Request for every project. Then, when it comes to deploy day, there’s a concern: We want to make a change to X but Y also needs that change to work - how do we deploy these at the same time?

How to Move From Dublin to Berlin
By Evan Smith

In October 2019, I moved from Dublin to Berlin. As a Worrier In Residence employed at the Life Of Evan, I planned a lot around the move, the new culture, the new city and all the wonderful things that come with a new adventure. In doing that, I found a common theme: All the resources about moving to

Snippet: Repairing a Degraded Raid Array
By Evan Smith

RAID arrays are a way we make data robust, but what happens when they fail? Learn how to repair a degraded/failing RAID array step by step.

ChatOps: Building Someone You’d Want To Have A Beer With
By Evan Smith

As Operations Engineers, we often overlook the user experience of tooling in favour of functionality. CLIs end up with vast sprawling seas of flags and nested commands requiring a minotaur to traverse. UX is an important part of tooling. For a user, well-thought-out interfaces reinforce confidence.
