Getting started with malware analysis for absolute beginners
A few months ago I did some basic malware analysis as part of an incident responder training course. As I was completely new to the field of malware analysis, I learned a lot of new things. This primer is my way of sharing my experiences as a beginner in the field. In this write-up, I want to describe some of the basics and point enthusiastic beginners like myself in the direction of more self-education resources.
I use the following broad, pragmatic definition of malware: malware is basically anything malicious that potentially has a negative impact on a system.
Why learn about malware analysis?
The goal of malware analysis is to answer questions about a piece of malware by examining the sample and its behaviour. But why learn it in the first place? If you know how to analyse a piece of malware, it becomes much easier to discover malware-related problems. It allows you to find out the cause of a problem and assess what damage has been done. It also dramatically improves your chances of solving the problem!
As such, malware analysis is a useful skill for anyone who manages a system that might be exposed to malware. Think of security professionals such as incident responders and SOC analysts, but also system administrators and web administrators.
It also gives you an edge when talking to your boss. If malware pops up in a business environment, your manager will want to know whether the company will lose customers, money or its good name. When you know malware analysis, you'll be able to assess the situation and answer these questions better.
It's all about hygiene
For people that know the foundations of computer science or computer security, malware analysis is not necessarily hard. But it requires some basic understanding of computer hygiene. It is important that you understand the risks of working with malware samples and know how to take proper precautions against the most common risks.
The environment in which you do malware analysis is often called a malware lab. The term evokes images of researchers with safety goggles and lab coats, like in a pathogen lab, where they research diseases. The image fits: a malware sample can be as dangerous to computer systems as a pathogen sample is to people.
Like a pathogen lab, a malware lab needs a set of hygiene rules. We want to prevent the samples from spreading, while at the same time, we want them to infect the right hosts so we can observe the sample in action.
Learning the right programming or reverse-engineering skills is easy, but without proper security awareness, you'll get nowhere.
How to build a malware lab
To get good at malware analysis, you need to practice: experiment with tools, look at different kinds of malware and get familiar with working in an isolated environment. The best way to do that is, of course, to build your own malware lab!
To build a malware lab, I suggest you start with dedicated hardware. If you're just practicing, like me, just one system will do fine.
Just make sure the system has plenty of memory so virtualisation won't be a problem. Take 8 GB of memory as a minimum. You will probably want to run multiple virtual machines at the same time.
You also need enough disk space to store all the different virtual machines and the snapshots of those machines, so you can return them to a clean state after running malware.
Virtualisation has two advantages that make it very useful for malware analysts. First, it allows you to quickly load a malware sample on multiple operating systems. Second, you can save so-called snapshots of virtual machines to quickly restore a virtual machine to a previous, uninfected state.
Make sure your malware system is isolated ("air-gapped") from all other digital systems and networks. You don't want to accidentally infect your personal computer, web server or backup drive. Don't connect your malware lab to the internet, because active malware can do all kinds of nasty stuff online. It can try to connect back to the system of its original creator, contact a command and control server or take part in a DDoS attack on a website.
Static and dynamic analysis
There are two styles of malware analysis. Which one you choose depends on the questions you want to answer and the tools and techniques you're familiar with.
We can investigate malware by looking at the sample, its code and its properties without running it. This is called static analysis. Some examples of static analysis:
- Comparing the hash of the malware sample to a database of known malware samples (such as VirusTotal)
- Looking at the file type of the sample
- Extracting all ASCII strings from the sample and searching them for clues
- Putting the sample in a disassembler and going through the whole assembly by hand to see what it does
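The first three examples above are simple enough to sketch in a few lines of Python. This is a minimal sketch, not a replacement for real tools: the function names and the small table of magic numbers are my own choices for illustration, and the demo bytes are made up and harmless. The hash it produces is what you would look up in a database of known samples; the lookup itself is left out.

```python
import hashlib
import re


def sha256_of_file(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# A handful of well-known magic numbers (illustrative, far from complete).
MAGIC_NUMBERS = {
    b"MZ": "Windows PE/DOS executable",
    b"\x7fELF": "ELF executable",
    b"%PDF": "PDF document",
    b"PK\x03\x04": "ZIP archive",
}


def guess_filetype(data):
    """Guess the file type from the first bytes, ignoring the file extension."""
    for magic, name in MAGIC_NUMBERS.items():
        if data.startswith(magic):
            return name
    return "unknown"


def ascii_strings(data, min_length=4):
    """Return all runs of printable ASCII characters of at least min_length."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_length)
    return [match.group().decode("ascii") for match in pattern.finditer(data)]


if __name__ == "__main__":
    # Made-up, harmless bytes that merely look like the start of an executable.
    demo = b"MZ\x90\x00http://example.invalid/gate.php\x00\x01ab\x00cmd.exe /c whoami\x00"
    print(guess_filetype(demo))   # Windows PE/DOS executable
    for text in ascii_strings(demo):
        print(text)               # the URL, then the command line
```

Note that none of these functions execute the sample; they only read its bytes, which is what makes them static analysis. Short runs such as "MZ" and "ab" fall below the minimum length and are filtered out as noise.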
The other form of malware analysis is dynamic analysis or behavioural analysis. This form tries to answer questions about the malware by running it in a controlled environment and looking at its behaviour. Some examples:
- Running the malware and comparing the 'before' and 'after' states of the virtual machine
- Running the malware while tracking its activities with a process monitor and a network sniffer
- Hooking up an infected virtual machine to a fake virtual network to trick the malware into connecting to a fake server that you control
- Going through the malware step-by-step with a debugger to investigate its internal states while running
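The 'before' and 'after' comparison from the first example can be sketched as follows. This is a minimal sketch under my own assumptions: it only looks at files below one directory, the function names are made up for illustration, and it says nothing about the registry, running processes or network traffic. The idea is to record a hash for every file before running the sample, record them again afterwards, and list what was created, deleted or modified.

```python
import hashlib
import os


def snapshot_tree(root):
    """Map each file's path (relative to root) to the SHA-256 of its contents."""
    state = {}
    for folder, _subdirs, files in os.walk(root):
        for name in files:
            full_path = os.path.join(folder, name)
            try:
                with open(full_path, "rb") as handle:
                    digest = hashlib.sha256(handle.read()).hexdigest()
            except OSError:
                digest = None  # unreadable file: still record that it exists
            state[os.path.relpath(full_path, root)] = digest
    return state


def diff_snapshots(before, after):
    """Compare two snapshots and report created, deleted and modified files."""
    common = set(before) & set(after)
    return {
        "created": sorted(set(after) - set(before)),
        "deleted": sorted(set(before) - set(after)),
        "modified": sorted(path for path in common if before[path] != after[path]),
    }
```

A typical use would be to call snapshot_tree on a directory of interest inside the virtual machine, run the sample, call snapshot_tree again and pass both results to diff_snapshots. Reading whole files into memory keeps the sketch short; for large files, hashing in chunks would be the better choice.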
To do a complete analysis of a sample, you will have to use both static and dynamic analysis techniques. Alternate between the two, moving from the easy basics to the more complex stuff.
Additional resources for learning
The following resources can be helpful if you want to learn more about malware analysis:
- Practical Malware Analysis, by Sikorski & Honig (book and practice material)
- A curated list of awesome malware analysis tools and resources
- Open Courseware by RPISEC is a course on malware analysis based on the book 'Practical Malware Analysis'
- Anything written by Lenny Zeltser (webcasts, blog articles)
- The SANS Digital Forensics Blog has some nice introductory articles
- Tuts4you has a few tutorials on reverse engineering
- On Crackmes.de you can find crackme and reverseme executables to reverse
Final remarks
When you start out doing malware analysis, most of your time will be spent tweaking your lab and setup. You will struggle with dependency hell and badly maintained GitHub repos, you will fight with your virtualisation software... Don't give up. Once you have your infrastructure in place, malware analysis can be a lot of fun and very rewarding.
At this point, you know the basics of malware analysis. If you want to learn more, it's time to start practicing.