The Programmer’s Guide to Optimizing for Usability

When you design software, you can’t just focus on the functionality of the product; you have to also focus on how it’s going to be used. What are you developing and who’s going to be using it? Most software engineers can answer the former, but struggle a bit more with the latter. Your users will ultimately determine how your program is actually used.

What do you optimize for with each piece of code? If speed is important, you can sacrifice RAM to keep things in memory to make access quicker. If you’re developing for an embedded board, that same optimization may not work. You can optimize for CPU, RAM, IO, bandwidth, etc. depending on what the bottleneck is for your program and your users. Sometimes you’ll need to sacrifice speed in the parts your users rarely touch so that the parts they use constantly stay responsive.

Balancing Users and Resources

While hardware usually isn’t the biggest technical limitation anymore (“just throw better hardware at it” is apparently an acceptable design principle at many shops), it’s still a factor in designing user interfaces and making a system usable. To know which piece is more valuable to optimize, you have to know how often it’s used and how big a bottleneck it is for the general functioning of your system. A single function that runs only once per user in the workflow can be such a tight bottleneck that it’s the most valuable part to optimize. Other times, you’re shaving nearly imperceptible hairs of runtime off at a time because the function is so common.

What is your user’s workflow? What pieces affect that workflow? You don’t need a full Gantt chart, but you do need to know the basic dependencies in each process. Which pieces are limited by the user, and which pieces keep the user from getting to the next user-limited piece? You want to be waiting on your user and not vice versa.

You also have to be aware of the environment your user accesses your product from and the environment your product may end up running in. If you’re pushing an on-premises solution, you may be limited to hardware which is just barely above the minimum you even tested. Do you think your user will blame their ancient hardware or your program when it’s slow?

A lot of your efficiency is going to be rooted in perception. Even if your program is amazing overall, an extra check in between extremely common operations in a workflow will anger your users. I ended up on the receiving end of this type of phone call when a client called, furious about how slow their server had become because of a change made for a different use case. They blamed IT until they realized the vendor was at fault, then they moved products.

SAGE Timeslips

Let’s start with an example: a theoretical bug report along the lines of:

We have a bug where the input drop down cache does not lock on user write operations. We received a user report of a user updating this table while a coworker was searching. The cached results did not refresh when the user went to select the field and they had to log out and log back in to see the changes.

What would you do? Would you make the cache reread on each operation, would you create an operation to reread occasionally, or would you train the users to inform others they’ve made a change and to refresh? There are other options, but these were the first few I thought up, and they include the one the vendor implemented. How would you proceed?

The right answer is going to be determined by what and where the input is and how often the fields will be refreshed. For this specific issue, the input is used multiple times every few minutes of the users’ day, and the update is something most companies only do once every month or quarter. Is it better for the users to be inconvenienced for 2 seconds every minute of their day, or for the update to be a little slower once in a blue moon?
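A rough back-of-the-envelope version of that question can be sketched in a few lines. The 2 second delay and the three users come from this story; the lookup rate, the working hours, and the 30 seconds I charge for a manual refresh are my own assumptions, so treat the output as an illustration and not a measurement.

```python
# The 2-second delay and 3 users come from the story; everything else is assumed.
DELAY_SECONDS = 2          # extra wait per lookup if the cache rereads every time
LOOKUPS_PER_HOUR = 60      # assumed: about one lookup a minute per user
HOURS_PER_DAY = 8          # assumed working day
WORK_DAYS_PER_MONTH = 21   # assumed working month
USERS = 3
REFRESH_SECONDS = 30       # assumed cost of one manual refresh per user


def monthly_wait_seconds(delay, lookups_per_hour, hours, days, users):
    """Total seconds all users spend waiting on the delay in one month."""
    return delay * lookups_per_hour * hours * days * users


# Option A: reread the cache on every lookup.
reread_cost = monthly_wait_seconds(
    DELAY_SECONDS, LOOKUPS_PER_HOUR, HOURS_PER_DAY, WORK_DAYS_PER_MONTH, USERS
)
# Option B: everyone refreshes once after the monthly table update.
refresh_cost = REFRESH_SECONDS * USERS

print(f"Reread on every lookup: {reread_cost / 3600:.1f} hours per month")
print(f"Refresh after the update: {refresh_cost / 60:.1f} minutes per month")
```

With these made-up numbers the always-reread option costs the office 16.8 hours a month against a minute and a half for the occasional refresh. Your numbers will differ, but the shape of the comparison is the point.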

SAGE Timeslips (years ago when their name was all caps) made a change which sounds perfectly reasonable, but I ended up spending 10 weeks fighting the client and vendor about whether the change would impact operations. One of their techs eventually slipped up and told me they had made the change for larger companies with more frequent updates, which were hitting the locking issue, to prevent the cache from being incorrect. The result was that each of those inputs took a few seconds extra per operation. The client I was supporting had three people using the software total, but this delay added up quickly for them.

The change debilitated smaller businesses using SAGE Timeslips when it first dropped. They fixed a rare bug by introducing a fix which had a massive impact on usability. A simple cache clear button or refresh could have averted this. At the least, they could have had a configuration option to choose which method to use, but they opted not to do anything. Ultimately, it probably made more sense for some of their largest clients, but it hurt a huge part of their user base as well.

What Causes These Moments?

The principle behind SAGE’s fix was sound, but it ignored how the program was being used. They introduced a massive delay in the trivial forms most users work in almost exclusively, all to prevent a bug which was rare. Making the fix toggleable or keeping a legacy client without it would have helped many businesses. The company I supported ultimately ended up moving to something else because of the experience. They weren’t the only ones to do so either.

SAGE had disconnected from exactly what their users were doing and how. They saw that part of the user base potentially had an issue and fixed it without taking into account the ramifications of their change. 2 seconds isn’t a lot of time to lose, but it is when you click a drop down every half second without the delay. The potential issue was extremely rare even in large environments. If you don’t look at the big picture, even a little change has the power to do this.

Making Your Unit Functional

When you have to optimize for usability, you need to look at more than just results from unit tests. Unit tests are great for figuring out what needs to be fixed where, but they don’t help with usability without a real understanding of the user’s workflow. This is the principal weakness of relying on unit tests as they’re commonly taught. They are a first step, but not a last.

You need to go deeper than just Big O notation for how fast something is. If something is slow while waiting on user interaction, does it really matter? You can focus on optimizing for something else your software is hungry for to provide relief.

I often sacrifice RAM for CPU when I’m making my user wait. I don’t care if I have to load a whole database table into RAM as long as it’s reasonable and it makes their experience that much better. It may only add up to half a second of saved time per operation, but if the user is doing this even 5 to 10 times an hour, they notice the difference. Other parts are slower or less efficient, but perception is key. Make the parts your user doesn’t notice slow; don’t slow down the parts they’re trying to get through.
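As a small sketch of what loading a table into RAM can look like, here is one way to do it with Python’s built-in `sqlite3` module. The `clients` table and the `load_lookup` helper are made up for the example, and it assumes the table is small enough to hold in memory comfortably.

```python
import sqlite3


def load_lookup(conn: sqlite3.Connection) -> dict:
    """Read the whole (small) table once and keep it in a dict keyed by id."""
    rows = conn.execute("SELECT id, name FROM clients").fetchall()
    return {row_id: name for row_id, name in rows}


# Hypothetical table, built in memory so the example is self-contained.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE clients (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO clients (id, name) VALUES (?, ?)",
    [(1, "Acme"), (2, "Globex")],
)

clients = load_lookup(conn)  # one query up front, paid for in RAM
print(clients[2])            # later lookups never touch the database
```

The cost is memory and a slower startup. The payoff is that every lookup the user is actually waiting on becomes a dictionary access.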

The Needs of the Many Outweigh the Needs of the Few

Focus on keeping their experience smooth. Push their changes into a queue and let it go; don’t make them wait unless there’s a reason to do so. Sacrifice what you need to keep everything running smoothly on background tasks. My dispatch queues may take hours to work through everything, but they don’t affect the user’s operational capacity or user experience even when processing.
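Here is a minimal sketch of the queue idea using only the standard library. It is not my actual dispatch code; `save_change`, `worker`, and the uppercase “processing” are placeholders for whatever slow work your application does.

```python
import queue
import threading

work_queue = queue.Queue()
processed = []


def worker() -> None:
    """Background thread: drains the queue so the user never waits on it."""
    while True:
        change = work_queue.get()
        if change is None:                # sentinel value: shut the worker down
            work_queue.task_done()
            break
        processed.append(change.upper())  # stand-in for the slow part
        work_queue.task_done()


threading.Thread(target=worker, daemon=True).start()


def save_change(change: str) -> None:
    """Called from user-facing code; returns as soon as the change is queued."""
    work_queue.put(change)


save_change("update invoice 17")
save_change("update invoice 18")

work_queue.join()  # only here so the example can show the result
print(processed)   # ['UPDATE INVOICE 17', 'UPDATE INVOICE 18']
```

In a real application you would not call `join()` on the user’s path; the whole point is that the user moves on while the queue catches up. You also need a plan for changes that fail in the background, since the user is no longer there to see the error.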

Make your program a good restaurant: the front is calm and flows smoothly, while the back spikes with every order or request from the front. A good waitperson will read the room and read the tickets to know how long something will take from the kitchen. If a table is chatting away over an appetizer and the kitchen is buried, why would they go take orders? Presented correctly, your waitperson is showing social grace rather than avoiding their job. This is how your user views your software; they’re here to enjoy and want a good experience even if it sucks for you. If they have to fill out 20 forms, why not pre-crunch some of the data in the background?

If one table has something which requires a lot of prep, you can usually push that table’s other orders back with it. This frees up resources, and the other dishes can still be made earlier if time permits. The table won’t notice, and they’ll probably understand that the food is taking longer because of the special prep for the person in their party. When you approach software design like this, you can explain away certain slowness or inefficiency. The process which runs the least can be slow or do more to lighten the load of the rest of the program.

Application to Implementation

When I wrote a small ERP suite, I applied this philosophy religiously. All input had more of a delay than output and reporting. I sacrificed a lot of efficiency with input specifically so that the output would be fast. I precomputed certain data on insertion too so that I didn’t have to do it on the front end. My users didn’t notice the input was slow; they just knew that when they needed to see something, it took seconds, most of which was waiting on them.
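A stripped-down illustration of precomputing on insertion might look like the following. It is not the ERP code itself; `HoursLedger` and its methods are invented for the example.

```python
from collections import defaultdict


class HoursLedger:
    """Hypothetical ledger that does extra work on write so reads are cheap."""

    def __init__(self) -> None:
        self.entries = []
        self.total_by_client = defaultdict(float)

    def add_entry(self, client: str, hours: float) -> None:
        # Slower write: store the entry AND update the running total.
        self.entries.append((client, hours))
        self.total_by_client[client] += hours

    def report_total(self, client: str) -> float:
        # Fast read: no scan over entries, just a dictionary lookup.
        return self.total_by_client.get(client, 0.0)


ledger = HoursLedger()
ledger.add_entry("Acme", 1.5)
ledger.add_entry("Acme", 2.5)
ledger.add_entry("Globex", 3.0)
print(ledger.report_total("Acme"))  # 4.0
```

Every insert pays a little extra so that the report never has to add anything up. That trade only makes sense when reads heavily outnumber writes.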

This would have been completely backwards if I were working with something intended largely for massive amounts of data entry. I chose this design philosophy specifically because my user base was all going to read much, much more than they wrote. They expected it to work well and be fast. The reports took a lot of time to put in (the process not necessarily the form), so users didn’t notice the slowness.

A lot of users were limited by their connectivity as well, which meant the site was kept light at the expense of aesthetics. Each decision to shape any given unit was based on how the user interacted with it as well as how it impacted performance. There were slow spots, but they were largely overlooked because what users did the most was lightning fast. One thing you do once a month being slow is a lot more palatable than what you do every second of every day being slow.

Look beyond unit testing, algorithm speed, and all of that to see how your software is going to be used. What does your user’s workflow look like and how can you help shuffle it along? Make your code work for your users and not against them, even if you have to make sacrifices elsewhere.

Image by diegoxue from Pixabay