Justify Multicloud Workload Automation – Doing the Maths

How many times have you wondered:

Should I automate this task? What about the end-to-end process?
How many times do I need to use the automation to justify the work needed to set it up and maintain it?

As an engineer myself, I have asked these questions many times. Can I write a script to do the repetitive work? What if I only plan to run the script 10 times this year? Or should I just perform the task manually? Again?

Both individuals and organizations need to understand how to answer these questions. Typically, it is the individual (administrator, operator, architect, developer, etc.) who understands the technical tasks required, and it is the organization (multi-function teams, IT operations, datacenter engineering, systems reliability engineering) that understands the impact on business processes and funds strategic initiatives such as increased automation.

Generally, it makes sense to automate an IT task or process if:

a) You can run the automation enough times to spread out the cost of creating the automation capability (hours of development, testing, and ongoing maintenance).

Pattern – “I’m going to do this task X times in the next year and it will take me a couple hours of writing, testing, and updating a script to add to my bag of tools.”

b) Each automated execution has a lower cost than performing the work manually.

Anti-Pattern – every time I run the automation I end up spending more time tweaking things because the environment or process keeps changing.

The Maths

I built a simple model to help you evaluate your decision methodically, based on work originally developed by Brown and Hellerstein at the IBM Thomas J. Watson Research Center. With a mathematical model, you can “tweak the knobs” and consider the impact of different variables on your decision. A main consideration is your investment of time and effort as a quantifiable cost. Time is money, right?

The model compares the fixed and variable costs of a manual task or process with those of the automated version of the same task or process. The cost calculation is based on the variable N, which represents the number of times the process will be executed.

Consider 4 key variables

1 – Manual fixed costs – the cost of developing the manual process. How much does it cost to produce a box of ball bearings? First you need a factory (manual fixed cost), then you need some steel rod and 5 minutes of labor (manual variable cost). Do not spend too much time defining the “factory cost”: it is a foundational requirement for both the manual and the automated process, so it drops out of your final decision calculation. (If you are thinking “Yeah, but I can do things with automation I can’t do manually”, you are on the right track – see the footnote.)

2 – Manual variable costs – the amount of time and effort involved each time you do the work manually. Think of this as the pain you are trying to eliminate. Every time X lands on your task queue, you are about to experience manual variable cost.

3 – Automated fixed costs – the cost of the tools, which may be shared across multiple processes, plus the time and effort to build and maintain the automation artifacts for each workload, task, or process you automate. Python or Perl = fun little programming effort. Orchestration tool or cloud management platform = need to get a PO approved. Key – if you are automating deployments to 3 environments and have 3 automation artifacts to build and maintain, that automated fixed cost multiplier works against you.

4 – Automated variable costs – the amount of time and effort involved in each automated execution. Ideally it is close to zero if the task or process is fully automated. But automation does not always work, so you may have to consider what percentage of automated deployments fail, and include the cost of manually debugging and reworking those exceptions or failures.
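One way to fold failures into the automated variable cost is to treat it as an expected value: the cost of a clean run plus the failure rate times the cost of manual rework. The sketch below assumes that simple form; the function name, parameters, and numbers are illustrative and not part of the original model.

```python
def expected_automated_variable_cost(run_cost, failure_rate, rework_cost):
    """Expected cost of one automated execution, including manual rework.

    Assumes a failed run costs `rework_cost` on top of the normal run cost.
    """
    if not 0.0 <= failure_rate <= 1.0:
        raise ValueError("failure_rate must be between 0 and 1")
    return run_cost + failure_rate * rework_cost


# Illustrative numbers only (hours): 0.1 h to kick off a run,
# 10% of runs fail, and each failure takes 2 h to debug by hand.
cost = expected_automated_variable_cost(
    run_cost=0.1, failure_rate=0.10, rework_cost=2.0
)
print(f"{cost:.2f} hours per automated run")
```

Even a modest failure rate can dominate here: in this made-up case, two thirds of the per-run cost comes from rework rather than from the automation itself.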

Calculate all of the things

We can then solve for N_T, which we call the Automation Tipping Point. This represents the crossover point at which it becomes cost-effective to automate the process. Keep reading. This will get interesting.

Automation is justified when N, the number of executions, is greater than N_T. The tipping point N_T is the ratio of automation fixed costs to the per-run savings: automation fixed costs divided by (manual variable costs minus automated variable costs).
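Written out, the tipping point follows from comparing total costs over N executions. The symbols below are my own shorthand and an assumption, not notation from the original work: F_M for manual fixed costs, V_M for manual variable costs, F_A for automation fixed costs, and V_A for automated variable costs.

```latex
\begin{align*}
C_{\text{manual}}(N)    &= F_M + N \, V_M \\
C_{\text{automated}}(N) &= F_M + F_A + N \, V_A \\
C_{\text{automated}}(N) < C_{\text{manual}}(N)
  &\iff F_A + N \, V_A < N \, V_M \\
  &\iff N > \frac{F_A}{V_M - V_A} = N_T \qquad (V_M > V_A)
\end{align*}
```

The manual fixed cost F_M appears on both sides and cancels, which is why the “factory cost” drops out. If V_M is not greater than V_A, each automated run costs at least as much as a manual one, and no number of executions justifies the automation.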

Tweaking the knobs

If you could wave a magic wand and create an “easy button” for everything you do, automation fixed costs would vanish, N_T would be effectively zero, and automation would pay off from the very first run. However, consider the impact of your reality on this ratio. IF the numerator goes down (automation fixed costs), THEN the tipping point goes down and it is easier to justify automation. IF the denominator goes down (manual is easy, or automation has lots of errors), THEN the tipping point goes up, i.e. automation is harder to justify. Things you can tweak include:

Automation fixed costs – if the cost of automation platform goes down, then you can more easily spread that cost out and justify automating a specific task. If you automate many different workloads, the cost is also spread across all those deployment processes.
Manual variable costs – if performing tasks manually is becoming increasingly difficult, e.g. because it requires deep, specific skills in a particular cloud provider (AWS, GCP, Azure), then you can more easily justify the cost of automating that task.
Automated variable costs – if your automation artifact is effective across multiple environments, then you achieve consistency, predictability, and repeatability that also reduces expensive errors. That is good.
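To see the knobs move, here is a minimal sketch of the tipping point calculation. The function name and all of the numbers (in hours) are made up for illustration; plug in your own estimates.

```python
import math


def tipping_point(auto_fixed, manual_variable, auto_variable):
    """Return N_T = auto_fixed / (manual_variable - auto_variable).

    Returns math.inf when automation saves nothing per run,
    because no number of executions would ever pay it back.
    """
    savings_per_run = manual_variable - auto_variable
    if savings_per_run <= 0:
        return math.inf
    return auto_fixed / savings_per_run


# Illustrative estimates in hours, not measured data.
baseline = tipping_point(auto_fixed=40, manual_variable=2.0, auto_variable=0.5)
# Knob 1: cheaper tooling halves the automation fixed cost.
cheaper_tooling = tipping_point(auto_fixed=20, manual_variable=2.0, auto_variable=0.5)
# Knob 2: error-prone automation raises the automated variable cost.
flaky_automation = tipping_point(auto_fixed=40, manual_variable=2.0, auto_variable=1.5)

for label, n_t in [
    ("baseline", baseline),
    ("cheaper tooling", cheaper_tooling),
    ("flaky automation", flaky_automation),
]:
    print(f"{label}: automate if you expect more than {n_t:.1f} runs")
```

With these numbers, halving the fixed cost halves the tipping point, while the flaky automation case triples it, because the per-run savings shrink from 1.5 hours to 0.5 hours.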

Applied to Multicloud Workload Deployment

When applying this decision model to the automated deployment of applications in a multicloud management scenario, you should assume:

The person who creates the automation is not necessarily the user of the automation. If you want to scale, one person does the automation work to build an easy button, and many people use the easy button. So the automation requires guardrails: someone has to determine policy and implement governance rules that guide user decisions. That raises automation fixed costs, and thus can raise the tipping point.
Different environments may require unique automation tools. This reality drives up automated fixed costs even further, and likewise raises the tipping point.
Different environments may require multiple environment specific automation artifacts. If you have multiple automation artifacts that are environment specific (e.g. a script for workload 1 in vSphere and another script for workload 1 in AWS), fixed costs go up. More is worse. Configuration tools help here.
Automation should drive consistency, predictability, and repeatability that also reduces expensive errors. But the variability of different datacenter and cloud environments may increase the number of exceptions that need to be handled manually, which raises automated variable costs and can raise the tipping point significantly.
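The multicloud assumptions above can be dropped into the same model. The sketch below is one hypothetical way to do it: automation fixed cost is a shared tool cost plus a per-artifact cost, and automated variable cost includes an exception rate that triggers manual rework. The function, its parameters, and every number are illustrative assumptions, and N is taken to be the total number of deployments across all environments.

```python
def multicloud_tipping_point(tool_cost, artifact_cost, artifacts,
                             manual_variable, run_cost,
                             exception_rate, rework_cost):
    """Tipping point when fixed costs scale with the number of artifacts.

    Assumed structure (not from the original model):
      auto_fixed    = tool_cost + artifacts * artifact_cost
      auto_variable = run_cost + exception_rate * rework_cost
    """
    auto_fixed = tool_cost + artifacts * artifact_cost
    auto_variable = run_cost + exception_rate * rework_cost
    savings_per_run = manual_variable - auto_variable
    if savings_per_run <= 0:
        return float("inf")  # automation never pays back
    return auto_fixed / savings_per_run


# One portable artifact that works in every environment, few exceptions.
portable = multicloud_tipping_point(
    tool_cost=60, artifact_cost=20, artifacts=1,
    manual_variable=3.0, run_cost=0.2,
    exception_rate=0.05, rework_cost=4.0,
)

# Three environment-specific artifacts, more exceptions to handle by hand.
per_environment = multicloud_tipping_point(
    tool_cost=60, artifact_cost=20, artifacts=3,
    manual_variable=3.0, run_cost=0.2,
    exception_rate=0.20, rework_cost=4.0,
)

print(f"portable artifact: N_T = {portable:.1f}")
print(f"per-environment artifacts: N_T = {per_environment:.1f}")
```

In this made-up comparison, the per-environment approach roughly doubles the tipping point: more artifacts push the numerator up while more exceptions pull the denominator down.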

Footnote: Thoughts from Automation Guru Jeremy Guthrie

Jeremy Guthrie, Technical Architect for Orchestration at CDW, has three tips for success with automation projects in this 3 min video.

He approaches automation with his customers from a different angle, and offers these additional considerations:

Error checking – automation allows error checking to be built in. Manual processes never have that option. I might be building a process that runs only 10-12 times per year, but it really needs to work and/or recover. Unit tests for scripts can be built early on or later, but their value is immense in assuring that systems still work as planned after upgrades. For example, Amazon upgrades and promotes new code every 11 minutes because that unit testing is central to their velocity.
Don’t over think it – don’t get too hung up on calculating all of the things [above]. If you try to model all the costs and benefits, you will get overwhelmed and not know where to start. Instead, just start by automating something, get familiar, and as you gain some successes, then you can start to strategize on what’s next and what’s worth it.
Don’t focus on the dollars – as a datacenter engineer working on automation, you need operations and developer buy-in. The automation has to look attractive on multiple fronts (easy to use, reliable, makes work faster); otherwise, in my experience, either Ops or Dev is capable of burying the project.

Wise words of advice!

Bottom line

A multicloud management platform like Cisco CloudCenter can demonstrate real value when you are automating application deployment and management across different datacenter, private and public cloud environments.
