A lot of my job is about safety. Safety prevents errors from happening but, more importantly, when people feel safe, things become safer.
It’s a strange phenomenon that I’ve seen time and time again: when you put in place processes and tools that make things like software deployments safer, the benefits keep compounding long after the change itself.
Something often overlooked during an incident is how we communicate with our customers and reassure them about the situation. How you convey an incident to the people paying for your service can make all the difference when contract renewal comes around. At a past company, our sales team regularly reported back from client meetings that clients consistently mentioned how helpful and reassuring the SRE team’s status page updates were, even when our service was fully down.