Operational Excellence Means Automation

Share on:

People use the term “operational excellence” in a lot of different ways. In its vaguest sense, it means continuous improvement as applied to operations. But you’re interested in what it means in the context of technology operations. And I’m here to tell you that it means automation.

Operational Excellence is one of the five pillars of the AWS Well-architected Framework. The AWS whitepaper lists six design principles for achieving operational excellence. I’ve paraphrased these principles for clarity. Here they are:

This is easily the most obvious. Turn everything into code that can be automatically executed by a machine. This includes the building of infrastructure, application deployments, testing, recovery, and anything that requires or benefits from being defined in a runbook. If it’s a repeatable process, code it and let a machine do it.

The delightful side-effect of defining everything as code is that code can serve as documentation. It becomes trivial to have a machine take code as input, execute it, and then generate some pretty documentation based on a template. The resulting documentation can then be used by another machine. All automatically, of course.

Without getting into the rationale behind this, the point is that the only way to make small changes as frequently as possible is to use automation. Pushing a code change to a repo whence it’s automagically built, deployed, and documented is faster than doing any of that manually. Reversing a change automatically is faster, too.

If you’re not automating something, and you can, then do it. Of course, you should avoid automating a bad process. Fix the process and automate it. And if there’s nothing to automate right now, keep looking, because changes will inevitably bring opportunities for automation.

Break things to cause failures. If recovering from those failures requires manual intervention, automate the recovery steps.

The idea is to share what you’ve learned with others. Of course, what you’ve learned is that automation is the key to achieving operational excellence. So just keep it simple and tell them to automate.

What operational excellence actually looks like depends on the organization. But no matter how you slice it, you’re closer to operational excellence if you automate than you are if you don’t. So yes, there is more to it than just automation, just as there’s more to driving than going from point A to point B. But if operational excellence is the goal, you need the vehicle to get there, and the only vehicle that will do it is automation.