When Should You Automate?

Automation is the way forward for the tech industry, but it can be a double-edged sword. It can cut costs and reduce maintenance when done right, but it can also convolute processes and balloon budgets when done wrong. Like any technology or process, it has a time and place where it is effective, and areas where it is less effective, ineffective, or even detrimental. Automation is powerful; automation is efficient; automation is not magic.

Until the singularity comes for us all, automation is limited compared to a human. This gap is narrowing, and the line between where a human is most effective and where a machine is most effective is getting fuzzier. The limitation was once what a computer could effectively do; now the primary limitation is the cost-effectiveness of automating a solution.

Automate Your Boring Problems Away

Image by Gaertringen from Pixabay

How many times have you had to go through piles of logs or spreadsheets to do the same basic thing over and over for your job? How many times have you been asked to transform data from one format into a slightly different format? When a task is highly repetitive and formulaic, it is low-hanging fruit for automation.

I got started automating because I was asked to open hundreds of log files, find something on the Xth line, and compare it to something on the Yth line. My manager expected it to take almost a week to go through by hand. It did take me almost a week, but not because I did it all by hand. I started learning the basics of Bash to do it because that was what was available on all of the machines involved. It took me about a week the first time. The next few runs took me hours, and most of that was transferring files and prettying up the data.
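The original work was done in Bash. Purely as an illustration, here is a minimal Python sketch of the same kind of task. The function names, the `*.log` file pattern, and the choice to compare the two lines for equality are all assumptions, since the article does not say what the comparison actually was.

```python
from pathlib import Path


def compare_lines(path, x, y):
    """Return (line_x, line_y, equal) for 1-based line numbers x and y.

    Returns None when the file is too short, so the caller can flag it
    instead of silently skipping it.
    """
    lines = Path(path).read_text(errors="replace").splitlines()
    if len(lines) < max(x, y):
        return None
    a, b = lines[x - 1].strip(), lines[y - 1].strip()
    return a, b, a == b


def scan_logs(directory, x, y, pattern="*.log"):
    """Run compare_lines over every matching file in a directory (hypothetical layout)."""
    results = {}
    for path in sorted(Path(directory).glob(pattern)):
        results[path.name] = compare_lines(path, x, y)
    return results
```

Because every file that matches the pattern ends up in the result, a skipped file shows up as a missing or `None` entry rather than going unnoticed.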

Automate Your Continuous Problems Away

Automation has an associated cost to develop and a technical debt to maintain. The cost depends on many factors, including familiarity with the problem, familiarity with automation solutions (programming languages, libraries, algorithms, etc.), and the consistency of the input or raw data. The technical debt depends on how often the data changes and how much upkeep is needed to keep the solution relevant.

Problems that come up constantly typically account for the largest share of the time spent on the overall process. Even if the whole problem can’t be automated, there may be large parts which are common enough to be good targets for automation. If one action takes up 75% of the process, you can potentially automate 75% of the process with one investment. Even if automation doesn’t buy much time, it can provide more accuracy than a human is capable of, making the entire process more efficient.
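To put rough numbers on the 75% example, the sketch below estimates how much of the original process time remains when only part of it is automated. The function name and the 10x speedup are assumptions chosen for illustration, not figures from any real project.

```python
def remaining_time_fraction(automatable_share, speedup):
    """Fraction of the original time left after automating part of a process.

    automatable_share: portion of the process that can be automated (0 to 1).
    speedup: how many times faster the automated portion runs (assumed value).
    """
    if not 0 <= automatable_share <= 1:
        raise ValueError("automatable_share must be between 0 and 1")
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    manual_part = 1 - automatable_share
    automated_part = automatable_share / speedup
    return manual_part + automated_part
```

With `automatable_share=0.75` and an assumed `speedup=10`, about a third of the original time remains. The manual 25% sets a floor that no amount of speedup on the automated part can get below.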

Automate Your Easy Problems Away

Image by TeroVesalainen from Pixabay

There may be actions which are a drop in the bucket for the process, but are also easy to automate. If they’re common, you might as well automate them: it’s cheap productivity as long as the time to develop the automation won’t outpace the time to complete the tasks by hand. My task of getting specific lines from a file might have been worth automating even if it were less frequent or covered a much smaller data set. It is an extremely easy task to automate, and it is easy to miss something when doing it manually, like skipping a file (which definitely happened when doing it by hand). It was the first thing I learned to automate, and it took me barely any time to learn from scratch.
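The break-even condition described above can be written down as a short calculation. This is a minimal sketch; the function name and the hour figures in the usage note are assumptions for illustration.

```python
import math


def break_even_runs(dev_hours, manual_hours_per_run, automated_hours_per_run):
    """Number of runs after which automation has paid for its development time.

    Returns None when the automated run is no faster than the manual one,
    because in that case the investment never pays off on time alone.
    """
    saved_per_run = manual_hours_per_run - automated_hours_per_run
    if saved_per_run <= 0:
        return None
    return math.ceil(dev_hours / saved_per_run)
```

For example, assuming 40 hours to build the script, 40 hours per manual run, and 4 hours per automated run, `break_even_runs(40, 40, 4)` returns 2: the investment is recovered by the second run. This only counts time; it ignores the accuracy gains, which can justify automation even when the number of runs is small.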

One of the largest barriers to automation is fear of the unknown and unfamiliarity with the subject. Easy problems provide an easy way to test the effectiveness of automation, and their results are easier to check for errors. They also require a much lower investment of both time and skill, which makes it even easier to get started. Easy problems are also the ones people tend to get wrong when the work is extremely monotonous.

Where Does Automation Fall Short?

Image by Peggy und Marco Lachmann-Anke from Pixabay

Automation relies on either being able to control for inputs, or being able to handle errors. Sterile input means that there may be extra steps to process the data. Error handling typically implies extra code, which is costly, time-consuming, and can impact performance. If you ask for a phone number, you might get any of the following:
1235551234
123-555-1234
(123) 555-1234
123.555.1234
123 5551234

These all make plenty of sense to a human, but how do you pass them to a computer? The answer is surprisingly simple in some languages, but can be much harder in others. Handling has to be built to process these variations, which requires knowledge both of the tool set and of the fact that the variations are possible in the first place. What does the program do if it expects only 1235551234 but gets any one of the others?
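One common way to handle the variations listed above is to strip everything that is not a digit and then validate what is left. The sketch below assumes ten-digit numbers with no country code or extension; the function name is hypothetical.

```python
import re


def normalize_phone(raw):
    """Reduce a phone number to its digits, assuming a ten-digit format.

    Raises ValueError for anything that does not contain exactly ten digits,
    so bad input is rejected loudly instead of passed along.
    """
    digits = re.sub(r"\D", "", raw)  # drop spaces, dots, dashes, parentheses
    if len(digits) != 10:
        raise ValueError(f"expected 10 digits, got {len(digits)}: {raw!r}")
    return digits
```

All five formats in the list normalize to `1235551234`, while something like `555-1234` raises an error. Real-world numbers with country codes or extensions would need rules beyond this sketch.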

Controlling for Input

Image by Steve Buissinne from Pixabay

Different groups, different companies, and different regions all have different standards for data-keeping. A date in the US is going to be formatted differently than a date in Europe. When you automate, you need a way to assess whether the data is good or not, and you need to know what inputs are possible in order to scope the solution properly.

A date of 11/12/2019 can mean two different dates depending on who you ask, while 11/30/2019 is valid in the US and makes no real sense in the EU. If you standardize the date format, you avoid most of these problems. You also need a way to make sure that the standard is enforced.
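The ambiguity can be detected mechanically by trying both conventions and seeing how many valid dates come out. This is a minimal sketch assuming only the two slash-separated formats discussed here; the function name is hypothetical.

```python
from datetime import date, datetime

# Month-first (US) and day-first (EU) readings of the same slash-separated text.
CANDIDATE_FORMATS = ("%m/%d/%Y", "%d/%m/%Y")


def interpretations(text):
    """Return every distinct date the text could mean, in chronological order."""
    found = set()
    for fmt in CANDIDATE_FORMATS:
        try:
            found.add(datetime.strptime(text, fmt).date())
        except ValueError:
            pass  # not a valid date under this convention
    return sorted(found)
```

`interpretations("11/12/2019")` yields two dates (November 12 and December 11), so the value should be flagged for review. `interpretations("11/30/2019")` yields only November 30, and an empty list means the text is not a date under either convention. Storing dates in an unambiguous form such as ISO 8601 (`date.isoformat()`, for example `2019-11-12`) is one way to apply a standard.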

Controlling for input is one of the easiest ways to make automation more scalable and practical overall, but it can be one of the hardest hurdles to get over. Every bit of old data which is relevant needs to be converted to fit the new format. All the new data needs to be standardized and someone has to check it, especially in the beginning. If the data is bad, the output will be bad too. This is the concept of garbage in, garbage out (GIGO, or any number of joke acronyms).

I had to throw out a module in an ERP I built because the client’s workers refused to use standard terminology in certain fields. The client asked why I couldn’t just make it “read”. I told them that if I could pull it off, I’d be a billionaire. They didn’t fully understand why the program could make sense of certain kinds of data and not others, but they trusted me. Ultimately, a trivial input field became a point of contention for automation because of a lack of standard input.

Error Handling

A good automation solution is highly fault tolerant and can easily flag data which is either malformed or which doesn’t fit the solution’s parameters. Building in this kind of handling gets more complicated over time and as the data gets messier. If we’re not controlling for input, how do we ensure the input is right? What do we do for a date of 11/12/2019 if we operate in both the EU and the US?

On top of this, there needs to be error handling to make sure that we know when the automation is not functioning as expected. An automation solution which pulls data from an external API might crash when something goes wrong, but it might also silently fail and continue on. It’s easy to write a script to collect a bunch of data, but what happens if the data changes or there’s a bad character in it? How can we be sure our job ran, and how do we know what it did?

I have seen a single bad character in an unexpected field cause an automation solution to fail silently and unpredictably for months. The job might get 99% of the way through or it might die on the first try. There was no error handling and no logging to show what was breaking or why, and the job acted as though it had run perfectly each time. It took months of phantom inconsistencies to nail down what in the pipeline was broken.
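A small amount of structure goes a long way against this kind of silent failure. The sketch below shows one possible pattern, not the pipeline from the anecdote: process records one at a time, log and set aside the ones that fail, and always report a summary. The logger name and function names are assumptions.

```python
import logging

logger = logging.getLogger("pipeline")  # hypothetical logger name


def run_job(records, process):
    """Apply process() to each record, collecting failures instead of hiding them.

    Returns (results, failures), where each failure is a tuple of
    (index, original record, error message).
    """
    results, failures = [], []
    for index, record in enumerate(records):
        try:
            results.append(process(record))
        except Exception as exc:  # broad on purpose: one bad record must not stop the job
            logger.warning("record %d rejected: %r (%s)", index, record, exc)
            failures.append((index, record, str(exc)))
    logger.info("job finished: %d ok, %d failed", len(results), len(failures))
    return results, failures
```

For example, `run_job(["1", "2", "x"], int)` returns the two parsed numbers plus one failure entry pointing at index 2, so the bad value is visible on the first run rather than after months of guessing.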

These all sound like deal breakers for automation, but they can be prevented by properly scoping a solution and thinking through the workflow before building it. No plan survives an encounter with the enemy, but it’s a lot easier to ad lib when you have half of the words on a page already. Something out of the ordinary will come up, but knowing what, why, and how you’re automating a task turns most of these into minor issues.

Making Automation Work for You

Image by Arek Socha from Pixabay

Best practices rely on planning and testing a solution before fully integrating it. You want to plan the scope and use case for the automation solution as soon as possible and stick to it. You also want to thoroughly test so you know you can trust the solution.

Scope and Use Case

Image by Ag Ku from Pixabay

The scope and use case of an automation solution must be fully developed and understood before it can be relied on. Why does this solution exist and what does it do? These questions need answers in full to actually mean anything.

Knowing the scope and use case prunes the project to shape growth. Feature creep is a sign of poor planning (it wasn’t thought of before) or poor implementation (not enough control on where the project is going). Nailing down the scope and use case means you have to know the purpose and what affects the automation process. How expensive is it to change how data is collected versus fixing the data in the automation solution? If you don’t know, your project is doomed to cost more than expected.

An application which crawls out of its scope gets more and more expensive to build and maintain. An application where the use case isn’t defined becomes at best another “me too” product, and at worst a confused amalgam of ideas and code which resembles a fever dream. The scope dictates what it should ever be expected to do and the use case dictates what the purpose is. Features may develop organically, but you should adhere to the scope as much as possible.

Testing

You don’t let a car on the road without testing impact and build quality, but many people are fine letting an unvetted automation solution into the core of their business. It can be a black box, but it has to be a black box with consistent output. Automation which is wrong is worse than having no automation whatsoever, and it is worst of all when you don’t know it is wrong. A solution must be tested with sane data, bad data, service disruptions, etc. to really know whether it can handle the load or not. Testing means you know what the real-world limitations are.
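As a small illustration of testing with both sane and bad data, the sketch below checks a toy parser against inputs it should accept and inputs it must reject. Both `parse_quantity` and `run_checks` are hypothetical names invented for this example.

```python
def parse_quantity(text):
    """Parse a non-negative whole number from text, rejecting everything else."""
    value = int(text.strip())  # raises ValueError on non-numeric input
    if value < 0:
        raise ValueError(f"quantity cannot be negative: {value}")
    return value


def run_checks():
    """Exercise the parser with sane and bad data; raise AssertionError on any surprise."""
    # Sane data: these must parse to the expected values.
    assert parse_quantity("42") == 42
    assert parse_quantity(" 7 ") == 7

    # Bad data: each of these must be rejected, never silently accepted.
    for bad in ["", "abc", "-1", "4.2", "12a"]:
        try:
            parse_quantity(bad)
        except ValueError:
            continue
        raise AssertionError(f"bad input was accepted: {bad!r}")
    return True
```

The bad-data loop is the important half: a parser that quietly accepts `"4.2"` or `"12a"` would pass a test suite made only of sane inputs. Service disruptions need their own tests, which this sketch does not cover.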

Moving Forward With Automation

Image by Nattanan Kanchanaprat from Pixabay

Automation isn’t something to fear, but bad automation should be. A little bit of clean automation can save a huge amount of time and effort, while harebrained automation can become a financial black hole. The singularity is coming for us, but until then you can stay ahead of it by knowing how and what to automate and, most importantly, when and why.

There are also times automation plain doesn’t work. If automation requires too large an investment in infrastructure, it may be out of reach. If the technology just isn’t there, you may be unable to make the solution functional. A network-based solution which uses a lot of bandwidth is nearly useless at a site with satellite internet. A solution which requires digital data accomplishes nothing in an analog shop unless the owners are willing and able to afford digitizing the data.

Automation is not the solution to everything. Sometimes it becomes a hammer looking for a nail when all you have are screws. Hammering the screws in can work in a pinch if you only need it to last a little while, though. Everything has to be weighed to make sure that automation doesn’t become a massive money sink or hassle. A little planning saves a lot of wasted time and money.

Analyze the problem, scope the solution to the perceived use case, and build a solution which works for you rather than against you. If you understand the foundation of the problem, you can build a far better solution which is easier and cheaper to create and maintain.

Featured image by Michal Jarmoluk from Pixabay