With the sheer volume of ransomware attacks in the past couple of weeks, I decided to write an article about what works and what doesn’t. I’m writing about ransomware attacks, but this advice ultimately applies to most types of compromises, viruses, or malware attacks. These steps are going to be fairly generic because there is no “one size fits all” approach to resolving an infection or compromise.
What To Do First
The very first thing to do is to remain calm. Don’t let the situation blindside you and don’t let fear and emotions take control. You need to keep control of the situation so you can do your best work.
When you get scared and freak out, so do your customers, but far, far worse. You are the expert to your client. If you show fear, they assume they should be more scared than you are; when you relax, they assume you have it under control (or at least they should calm down). You have it under control, or you will. Just stay calm and do what you can do. Research and respond where and how you can.
What To Look For
The very first thing after the simple act of staying calm is to assess the situation. What do you have? Is it a compromise or is it ransomware? If it’s ransomware, what has it hit and what hasn’t it hit? Are server shares still online? Shut down as much as you safely can as quickly as you can, unless you identify (and test) that you are dealing with something old enough to easily recover from. Shut down the “Server” service on any Windows machine hosting file shares and turn off pretty much any service or app you can, like QuickBooks or SharePoint. It’s okay to be a little paranoid at this point.
Try to identify accounts or events which can account for this attack. You need to know what you’re dealing with to know how to respond. This isn’t the stage to try and learn everything about the attack, but you need to know at least the basics in order to respond without wasting time. Look for unnecessary administrator accounts, open ports for commonly exploited services (RDP, SSH, etc.), shares which are too loose with permissions, etc. These will often lead you to your exploit, or at least allow you to prevent further compromise and infection at this stage.
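As a rough illustration of the open-port check, here is a minimal Python sketch that tests whether a host accepts TCP connections on a handful of commonly exploited ports. The port list and the function name `open_ports` are my own assumptions for the example, not a standard tool; a real assessment would normally use a proper scanner and the firewall’s own logs.

```python
import socket

# Hypothetical starting list of commonly exploited services; adjust per site.
COMMON_PORTS = {22: "SSH", 445: "SMB", 3389: "RDP", 5900: "VNC"}

def open_ports(host, ports=COMMON_PORTS, timeout=0.5):
    """Return the subset of `ports` that accept a TCP connection on `host`."""
    found = {}
    for port, name in ports.items():
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            sock.settimeout(timeout)
            # connect_ex returns 0 on success instead of raising an exception
            if sock.connect_ex((host, port)) == 0:
                found[port] = name
    return found
```

Running `open_ports("192.0.2.10")` against a server (the address here is a placeholder) returns a dictionary of whichever listed ports answered, which is enough to flag an exposed RDP or SSH service for a closer look.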
Putting Together a Plan
Now that you know what you’re dealing with, or at least have an idea, you need to formulate a plan. Your plan should include two primary parts: the reactionary response (triage), and the normalizing response (getting back to normal). This should also be followed up with a plan in order to secure the site, but unless it is something easy to do while restoring the site or is a vulnerability which was exploited, this shouldn’t be done until you stem the bleeding. There’s no point trying to diet and exercise if you’re just going to go step in front of a bus the next time you walk outside.
Reactionary Response – Triage
This is where you triage the site and focus on quarantining and cleaning it. Shutting off the “Server” service on Windows servers and similar steps belong in the triage stage. You want to be reactionary and cut access for whatever infection or compromise may be rampant. You’re probably going to cause issues, but at this stage, it’s usually okay. Anything non-essential that is even potentially compromised can be sacrificed if necessary. Some clients might fight this, but you can usually win them over by explaining that everything is on the line and that stopping everything now is better than letting everything become completely infected.
Another thing to do at this phase is to begin segmenting the network. I typically aim for 3 classes of VLANs: infected, quarantine and remediation, and clean. You can have more of these as necessary and as equipment permits, but ultimately, you want a group of infected machines with the bare minimum of access. You then move them over to quarantine in small groups, remediate them as far as possible, and see if they manage to reinfect each other. From there, you move them to the clean group once they are shown to be safe. This minimizes reinfection and also gives the environment the best chance of recovering quickly.
If you can’t do this, you can approximate it by pulling the machines from the network and cleaning them up a few at a time, then putting them back on the network once they are clean. This is not ideal, but neither is being infected or compromised. Some clients may resist this, however, as it is much more disruptive.
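The staged movement between the infected, quarantine, and clean groups can be tracked with something as simple as the following Python sketch. The names `Stage`, `promote_batch`, and `settle_quarantine` are hypothetical and only model the bookkeeping; the actual VLAN changes (or pulling cables) still happen on your switches and machines.

```python
from enum import Enum

class Stage(Enum):
    INFECTED = "infected"
    QUARANTINE = "quarantine"  # quarantine and remediation
    CLEAN = "clean"

def promote_batch(stages, batch_size):
    """Move up to `batch_size` machines from INFECTED into QUARANTINE."""
    batch = [m for m, s in stages.items() if s is Stage.INFECTED][:batch_size]
    for machine in batch:
        stages[machine] = Stage.QUARANTINE
    return batch

def settle_quarantine(stages, still_infected):
    """After remediation and observation, release quarantined machines to
    CLEAN, or send any that show reinfection back to INFECTED."""
    for machine, stage in stages.items():
        if stage is Stage.QUARANTINE:
            reinfected = machine in still_infected
            stages[machine] = Stage.INFECTED if reinfected else Stage.CLEAN
```

Keeping the batch size small is the point: if one machine in the batch reinfects the others, only that small group goes back to the infected pool.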
Your goal at this phase is to get the site into a state where you can stop the bleeding and start rectifying the issues. Certain actions may not be ideal, but will be the easiest way to set yourself up for long-term success at the site for remediating. Make sure to communicate with the client and explain why certain things are going down. It may be a deal-breaker to some, but most clients will be a bit understanding once you explain the risks of not following such a plan. This phase should be as short as possible since it harms the business which is being affected.
You may need to push different antivirus solutions or other security solutions in the interim to get the site clean. You may need to add more aggressive rules to the firewall. There are all sorts of potential changes which can be made here, but the goal right now is to minimize further damage: contain the infection if it hasn’t yet spread across the site, or else secure the site in order to begin restoration. Try to save evidence of what happened where you can, but don’t do it at the expense of the client.
Normalizing Response – Getting Back to Normal
The normalizing response is where we start trying to get the site back to normal. In the last phase, we talked about VLANing the site into pieces; this phase starts once we have gotten the site mostly under control. Go in order of most to least important services and assets. Start with the common servers, move on to the machines of the highest-level people at the company, then work your way down the org chart unless something sticks out as a necessity sooner.
Identify, or ideally already know, what systems are the most essential to the business and work from there. File servers, Active Directory, web servers, etc. are going to be the most important things, depending on the business. If your attack can be confirmed to be limited to a specific OS (Windows, macOS, Linux; not individual versions of an OS, since a lot of these attacks have a separate backdoor that a malicious actor can use to push out a different exploit), you can move everything not running that OS back over. The goal is to get enough of the business back up that they can function at some capacity instead of having to stay closed.
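One way to turn that ordering into a worklist is sketched below in Python. The `System` record, its `priority` field, and `restore_order` are made up for the example; the priorities themselves have to come from knowing the business.

```python
from dataclasses import dataclass

@dataclass
class System:
    name: str
    os: str
    priority: int  # 1 = most essential to the business

def restore_order(systems, affected_os=None):
    """Order systems for restoration: anything on an OS confirmed unaffected
    goes first, then everything else, each group sorted by priority."""
    def key(system):
        on_affected_os = affected_os is not None and system.os == affected_os
        return (on_affected_os, system.priority)
    return sorted(systems, key=key)
```

With `affected_os="Windows"`, a Linux web server would come back before a Windows file server even if the file server matters more, simply because it can be moved back over without remediation.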
You should begin undoing some of the triage steps once the agents are cleaned and in their own VLAN. Some security remediation can be done at this step if necessary as well, but that is not the focus right now. We want to see the site back up and running first, then use this as leverage for better security practices at the site, while fixing the low-hanging fruit and obvious, serious issues along the way. At this point, if there is a compromise or signs of an infection which could compromise the site (something running a keylogger or some other account-compromising tool), every account (including service accounts and similar) should be completely reset. It’s inconvenient, but so is getting reinfected due to laziness.
Moving Forward
This is the stage we want to begin addressing both the cause of our infection as well as how to prevent similar types of attacks going forward. Depending on how you handled the previous phases and how the client has reacted, this is the best opportunity to push new security best practices which the client has resisted. This is also a chance to either upgrade or sell new infrastructure pieces to prevent further attacks. Don’t use this to oversell or capitalize on your client’s misfortune, but use it to shore up issues which need addressing.
You also want to perform a full audit and see if you can figure out exactly what happened. Ideally you saved evidence as you went along, but anyone who has been in these scenarios knows that hindsight is 20/20 for what to save and that sometimes it just isn’t possible without causing issues for the client. If you have data to audit off of, you can better protect yourself and your client.
What Should I Do If It Was Because of Me?
A lot of attacks are caused by a lack of security concern either by the client or the IT department or company. Try to shore everything up and have a reason why every decision was made. Make sure the client knows what you did, or didn’t, and exactly why. I try to always build a paper trail whenever a client wants me to not follow best practices. Make sure to reiterate your concerns and cover yourself in case they refuse to do things the way they should.
If it was actually because of you and an oversight, own up and explain to the client what has happened. You can wait it out and hope they go under, but going down with some dignity is better than trying to lie and cheat your way through service. It might work once, but it won’t work forever. Own mistakes, shore them up, apologize, and hope you get out with your reputation intact if you are the cause and there is no good reason for what you did.
Preventing Attacks
The most useless question I have ever been asked on a job interview is: “Assume a standard business, how would you secure their server assuming it is a new [OS of choice]?” This question is impossible to answer without understanding the client. There is no “one size fits all” approach to security. You need to know what the client needs, what they want, and what they can tolerate to know what needs to be done. You need to try and lock down as much as possible without crippling the client, but the line between ease of use and crippled is a blurry one.
Three clients can have the same workflows and be in the same industry, but if one is paranoid, one is apathetic for better and for worse, and one is hostile to change, you may (and probably will) end up with three entirely different security policies and setups for these clients. Your clients are a bottleneck to how you proceed with security unless you can make them understand why they need it. If they are resistant to change, cover yourself.
When you try to prevent attacks and secure a site, you need to consider the overall cost, the client themselves, the client’s workflow, and the risk. Research what security measures exist and weigh them against these factors. A top of the line firewall may be great, but if the client won’t pay, it is pointless. On the other hand, a top of the line security solution which blocks scripts may be a disaster for a coding shop. Security always comes with a price, and sometimes the price is too steep to pay.
Securing Yourself
Assuming you are running an IT department or an IT company, you are probably running some kind of RMM (Remote Monitoring and Management). RMMs are powerful but also extremely popular to exploit as of writing. Many attacks nowadays rely on either RMM platforms or email. Long gone are the more traditional worms and viruses, since it’s easier to phish or hack an RMM and effectively hack dozens of companies or more in one go.
2FA (2 Factor Authentication) should be on for every single platform which offers it. This is just a nice way to say that when someone logs in, they either need to enter another code from a text or app, or provide some other kind of proof that they are who they say they are. This prevents password hacking and similar from being as effective. It also limits API access for RMMs and other tools which facilitate massive pushes of exploits.
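For the curious, the “code from an app” is usually a time-based one-time password (TOTP, RFC 6238). The Python sketch below shows roughly how such a code is derived and checked using only the standard library; the function names are my own, and a real deployment should rely on the platform’s built-in 2FA rather than hand-rolled code.

```python
import hashlib
import hmac
import struct
import time

def totp(secret, at=None, step=30, digits=6):
    """Derive a time-based one-time password (RFC 6238, HMAC-SHA1)."""
    now = time.time() if at is None else at
    counter = int(now // step)  # number of 30-second steps since the epoch
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F  # dynamic truncation, as in RFC 4226
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)

def verify(secret, submitted, at=None):
    """Compare the submitted code to the expected one in constant time."""
    return hmac.compare_digest(totp(secret, at), submitted)
```

Because the code depends on a shared secret and the current time, a stolen password alone is not enough to log in. With the RFC 6238 test secret `b"12345678901234567890"` and `at=59`, the 8-digit code is `94287082`.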
If you are a business owner, make sure to ask if your IT department or company is using 2FA, and if not, do they have some other kind of security measures in place. This is the quickest, easiest, safest way to make an RMM tool much less risky and is applicable to 100% of modern situations (if the tool permits). If the tool does not offer 2FA, it is probably not worth using to be honest.
Common Problems During Response
The most common problem I see during a response scenario is that people freak out and scare their clients, even if the scenario was originally salvageable. You lead your clients or your company with how you react. It’s okay to be scared, but show them no fear and get them as stable as possible, or pass the torch to someone who can do right by them.
The second most common problem I see is rallying for assistance. Having assistance is great, but a lot of times, people either fail to clearly delineate roles or fail to properly plan for the sheer volume of assistance and so the extra people become literally less than worthless and become a burden. I have seen home run scenarios turn into clients going under due to a lack of proper planning and proper resource allocation. A classic case of too many cooks in the kitchen shows up when there isn’t a good use for each resource.
I have recently worked on several response cases where I am a resource, but the head is unsure what to do and has too many resources to allocate. Another common theme with these scenarios is that the person leading is unable to maintain control, so you see several sub-factions develop among the external assistance, which causes the authority of the leading agent to fall apart. When roles are clearly delineated and each resource has a purpose, this simply does not happen.
Another issue is that sometimes the response team’s leader cannot admit they are in over their head, and they fail to properly plan or allocate resources. You don’t want the high-level security tech fixing AD issues that aren’t security related and don’t impact security, and you don’t want the system imaging tech on the firewall making changes. This type of misallocation usually comes from ignorance of what each job requires and entails, though it can also come from a combination of the previous issues.
Conclusion
This is just the tip of the iceberg of how to survive a ransomware attack or some other infection or compromise. These sorts of scenarios are almost all identical for the overall, abstract process, but the devil’s in the details. One must assess the situation, triage to stem the bleeding, then work to get healthy, and finally work to prevent further attacks. Make sure that any known insecurity has a good reason and that best practices are followed on your end to prevent attacks in the first place.
MSPs and anyone using RMM tools are a large target for modern hackers and modern ransomware, since an RMM affords root-level access to machines along with an easy way to deploy attacks. Make sure to secure your own inlets into the client’s site so that you are not the weakest link in their security. 2FA and similar measures are easy additions to any security policy and cause minimal issues, especially on the backend. Security is usually painful, but careful balancing means reduced risk for both your clients and yourself.
Image by Andrew Martin from Pixabay