Building Dynamic Analysis Tools with Docker

After running my SSH honeypot for several months, I have a collection of ELF executables built for several different architectures that I don’t know what to do with. I am in the process of reverse engineering them, but it is taking a long time because I am still learning about ARM, MIPS, and Linux malware. I wanted a way to dynamically analyze the malware but didn’t find a tool that did what I wanted, so I decided to make my own by combining Docker with a few existing tools.

Warning

I am currently running this tool in a virtual machine that I was already using for malware analysis without Internet connectivity. I am still looking at the security of this tool before connecting it to the Internet.

Features I want

- Analysis of many different architectures to simulate IoT devices (x86, ARM, MIPS, PowerPC, etc.)
- Ability to capture network traffic
- Capture DNS requests
- Monitor the file system for changes
- Quickly start and stop the tool

Design

This tool will consist of two separate Docker containers. One will contain the malware samples and will be interactive, so I can execute the malware and see the changes to the system in real time. The other will handle the networking, including the packet capture and logging. The sandbox container will forward all of its traffic through the network container so that traffic from the malware is captured. A single bash script will start both containers, handle the data generated from running them, and shut them down.

Network Diagram of Docker system
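The lifecycle described above can be sketched as an ordered list of docker commands. This is only an illustration, not the actual start-up script: the network name, container names, image names, and the two IP addresses are all assumptions made for the example. Only the 10.10.10.0/24 subnet comes from this post.

```python
from typing import List

NETWORK = "analysis-net"    # hypothetical network name
SUBNET = "10.10.10.0/24"    # subnet used in this post
NET_IP = "10.10.10.2"       # assumed address of the network container
SANDBOX_IP = "10.10.10.3"   # assumed address of the sandbox container


def lifecycle_commands(net_image: str, sandbox_image: str) -> List[List[str]]:
    """Return the docker commands for one analysis session, in order."""
    return [
        ["docker", "network", "create", "--subnet", SUBNET, NETWORK],
        # The network container runs detached and only answers traffic.
        ["docker", "run", "-d", "--name", "netsim",
         "--network", NETWORK, "--ip", NET_IP, net_image],
        # The sandbox is interactive so samples can be launched by hand.
        ["docker", "run", "-it", "--name", "sandbox",
         "--network", NETWORK, "--ip", SANDBOX_IP, sandbox_image, "/bin/bash"],
        # Once the interactive shell exits, list the file system changes.
        ["docker", "diff", "sandbox"],
        # Tear everything down again.
        ["docker", "rm", "-f", "sandbox", "netsim"],
        ["docker", "network", "rm", NETWORK],
    ]
```

Each inner list could be handed to subprocess.run one after another; keeping them as data makes the sequence easy to log or review before anything is executed.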

Sandbox

The sandbox container doesn’t need a lot of special features; it mainly just needs to act like a regular Linux machine. I used Ubuntu as the base image. The Dockerfile for this container mostly installs packages that are common on the kinds of devices that get compromised, such as busybox. It also copies the scripts and configuration files that are needed at runtime into the container.

To handle different architectures, I can use QEMU. With QEMU’s user-mode emulators registered through the kernel’s binfmt_misc feature, the kernel recognizes the architecture of an executable from its ELF header and automatically runs it under the matching emulator, which allows binaries for other architectures to run on my x86 machine. To set it up, I used this guide. https://www.stereolabs.com/docs/docker/building-arm-container-on-x86/

Qemu running process
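To make the architecture detection more concrete, here is a minimal sketch that reads the same ELF header fields the emulator selection depends on. It is not part of the tool; the function name is made up for this example, and the table only covers a handful of e_machine values from the ELF specification.

```python
import struct

# A few e_machine values from the ELF specification.
MACHINES = {
    3: "x86",
    8: "MIPS",
    20: "PowerPC",
    40: "ARM",
    62: "x86-64",
    183: "AArch64",
}


def elf_architecture(header: bytes) -> str:
    """Identify the target architecture from the first 20 bytes of an ELF file."""
    if len(header) < 20 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    # EI_DATA (byte 5) gives the byte order: 1 = little-endian, 2 = big-endian.
    if header[5] == 1:
        endian = "<"
    elif header[5] == 2:
        endian = ">"
    else:
        raise ValueError("unknown byte order")
    # e_machine is a 16-bit field at offset 18 in both 32-bit and 64-bit ELF.
    (machine,) = struct.unpack_from(endian + "H", header, 18)
    return MACHINES.get(machine, f"unknown ({machine})")
```

Reading the first 20 bytes of each collected sample and passing them to this function is enough to sort a honeypot collection by architecture before running anything.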

The sandbox’s networking changes the default route so that all of its network traffic goes to the network container. This is done by modifying /etc/network/interfaces to change the gateway. After the container starts, a script brings the eth0 interface down and back up so the change takes effect. The rest of the networking is handled by Docker itself: I created a network with the subnet 10.10.10.0/24 and then set a static IP for each container with docker run.
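As an illustration of the routing idea, the sketch below builds a command that points the default route at the network container. This is a different technique from editing the interfaces file: it assumes iproute2 is installed in the sandbox and that the container was started with the NET_ADMIN capability. The gateway address in the usage note is an assumed value.

```python
import ipaddress
from typing import List


def default_route_command(subnet: str, gateway_ip: str) -> List[str]:
    """Build an iproute2 command that sends all traffic via gateway_ip."""
    network = ipaddress.ip_network(subnet)
    gateway = ipaddress.ip_address(gateway_ip)
    # A gateway outside the Docker subnet would not be reachable on eth0.
    if gateway not in network:
        raise ValueError(f"{gateway} is not inside {network}")
    return ["ip", "route", "replace", "default", "via", str(gateway)]
```

For example, default_route_command("10.10.10.0/24", "10.10.10.2") produces the arguments for "ip route replace default via 10.10.10.2", which would be run inside the sandbox after it starts.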

Network

The network simulation is handled by FakeNet-NG, created by the FLARE team at Mandiant. This tool answers incoming network traffic with a valid response for the protocol in use. For example, if it receives an HTTP request, it sends back some data along with a 200 response, so I can see how the malware reacts without it being able to connect to the Internet. I am most interested in its DNS server, to see where the malware is trying to connect; its Telnet and IRC servers, to view exfiltrated data; and its HTTP server, to capture attempts to exploit routers.

The Dockerfile is again based on Ubuntu and just installs packages and copies scripts and configuration files into place. The rest of the container is almost identical to the sandbox. The difference is that this one runs in the background, using the -d flag of docker run, and starts only FakeNet-NG instead of /bin/bash. The end user never has to interact with the network container. It simply starts, responds to network requests, and writes a .pcap file and logs.

Results

The prototype of this tool, even with basic functionality, is able to capture a lot of useful information about malware samples. Many of the samples I have collected are heavily obfuscated and require a fair amount of work to figure out what they are doing. Just in testing, I have found a lot of interesting functionality, new executable files, and network command and control behavior, which allowed me to identify a few botnets I have been seeing for a few months.

DNS and Command and Control IPs

Malware, especially the botnet samples I have, needs to send data somewhere or get instructions from a server. The addresses it contacts can be very helpful for identifying a sample when the malware has been seen before but has been changed so that hashes or other IOCs no longer match.

DNS request to C2 server

Traffic from IRC C&C server
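To show what a captured DNS request contains, here is a minimal sketch that pulls the queried name out of a raw DNS message. It assumes the UDP payload has already been extracted from the packet capture by some other means, and the function name is made up for this example.

```python
import struct


def dns_query_name(payload: bytes) -> str:
    """Return the first question name from a raw DNS message."""
    if len(payload) < 12:
        raise ValueError("shorter than a DNS header")
    # QDCOUNT is the 16-bit field at offset 4 of the 12-byte header.
    (qdcount,) = struct.unpack_from("!H", payload, 4)
    if qdcount == 0:
        raise ValueError("message has no question section")
    labels = []
    pos = 12
    while True:
        if pos >= len(payload):
            raise ValueError("truncated name")
        length = payload[pos]
        if length == 0:
            break  # the root label ends the name
        if length & 0xC0:
            raise ValueError("compressed names are not handled in this sketch")
        label = payload[pos + 1:pos + 1 + length]
        if len(label) < length:
            raise ValueError("truncated label")
        labels.append(label.decode("ascii", errors="replace"))
        pos += 1 + length
    return ".".join(labels)
```

Collecting these names across a run gives a short list of domains that can be compared between samples, even when the binaries themselves hash differently.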

Scanning Methods

Some of the samples I find have built-in functionality to spread to other devices by scanning the Internet for new devices to infect. This sample sends requests to random IP addresses on port 22 to see if it gets a response. It is similar to how the Nmap FIN scan (“-sF”) works.

Fin Scanning
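A FIN probe can be recognized from the TCP header alone: only the FIN flag is set. The sketch below is an illustration under the assumption that the raw TCP header bytes have already been pulled out of the capture; the function name and the default port are chosen for this example.

```python
import struct

# TCP flag bits as defined in the TCP header.
FIN, SYN, RST, PSH, ACK, URG = 0x01, 0x02, 0x04, 0x08, 0x10, 0x20


def is_fin_probe(tcp_header: bytes, port: int = 22) -> bool:
    """Return True if the segment targets `port` with only the FIN flag set."""
    if len(tcp_header) < 20:
        raise ValueError("shorter than a minimal TCP header")
    # Source and destination ports are the first two 16-bit fields.
    _src_port, dst_port = struct.unpack_from("!HH", tcp_header, 0)
    # The flag bits live in byte 13; mask off the bits above URG.
    flags = tcp_header[13] & 0x3F
    return dst_port == port and flags == FIN
```

Counting how many distinct destination addresses receive such a segment in a short window is a simple way to tell a scanner from a sample that only talks to its C2 server.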

Persistence

In order to gain persistence on a device, the malware has to make some changes to the system configuration, and those changes can be caught with docker diff. The most common ones I see are changes to cron jobs and startup processes. A cron job executes a command on a schedule, such as at noon every day or at the 30th minute of every hour. Malware creators use them so that even if someone deletes or kills the malware, the machine is reinfected the next time the cron job runs. This can also help malware survive reboots. Startup processes are simply the programs that start when a machine boots. The administrator can change passwords, close the vulnerability that let the attackers in, and restart the machine, but the malware will still come back.
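The output of docker diff is one change per line: a letter (A for added, C for changed, D for deleted) followed by a path. A minimal sketch of filtering that output for persistence-related locations might look like the following. The list of path prefixes is my own assumption about where to look and is certainly not complete.

```python
from typing import List, Tuple

# Assumed locations for cron jobs and startup scripts; extend as needed.
PERSISTENCE_PREFIXES = (
    "/etc/cron",
    "/var/spool/cron",
    "/etc/init.d",
    "/etc/rc",
    "/etc/systemd",
)


def parse_diff(output: str) -> List[Tuple[str, str]]:
    """Parse `docker diff` output into (change_type, path) pairs."""
    changes = []
    for line in output.splitlines():
        kind, _, path = line.strip().partition(" ")
        if kind in ("A", "C", "D") and path:
            changes.append((kind, path))
    return changes


def persistence_changes(output: str) -> List[Tuple[str, str]]:
    """Keep only the changes that touch cron or startup locations."""
    return [
        (kind, path)
        for kind, path in parse_diff(output)
        if path.startswith(PERSISTENCE_PREFIXES)
    ]
```

Because docker diff also reports every parent directory as changed, filtering by prefix keeps the noise down and leaves only the entries worth opening.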

Dropped Files

Sometimes an obfuscated sample will write the actual malware to storage and then execute it. By tracking the changes made to the file system, I can find these dropped files and analyze them more easily than the original files.

Processes showing running commands
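Once dropped files have been copied out of the sandbox (for example with docker cp), they can be triaged automatically. The sketch below is a hypothetical helper that walks a directory, keeps only files that start with the ELF magic bytes, and records a SHA-256 hash for each.

```python
import hashlib
from pathlib import Path
from typing import Dict


def hash_dropped_elfs(directory: str) -> Dict[str, str]:
    """Map each ELF file under `directory` to its SHA-256 hex digest."""
    results = {}
    root = Path(directory)
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        data = path.read_bytes()
        # Every ELF file begins with the four magic bytes 0x7f 'E' 'L' 'F'.
        if data[:4] == b"\x7fELF":
            results[str(path.relative_to(root))] = hashlib.sha256(data).hexdigest()
    return results
```

The hashes can then be compared against the original samples to confirm that a dropped binary really is new.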

Process Structure

Often the malware doesn’t have just one process but spawns many child processes. Right now I don’t automatically record process history, but just running ‘ps’ in the terminal is enough to see the process structure. The command column in the output of “ps faux” can give insight into what the malware is doing by showing which commands are running, although malware authors can obfuscate it.

Processes showing running commands
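Recording the process structure could be automated by saving the output of ps and rebuilding the tree from the parent PIDs. This is a minimal sketch that assumes the output comes from "ps -eo pid,ppid,args" rather than "ps faux", because that column layout is simpler to parse; the function names are made up for this example.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def parse_ps(output: str) -> Dict[int, List[Tuple[int, str]]]:
    """Map each parent PID to its (pid, command) children."""
    children = defaultdict(list)
    for line in output.splitlines()[1:]:  # the first line is the header
        parts = line.strip().split(None, 2)
        if len(parts) < 3 or not (parts[0].isdigit() and parts[1].isdigit()):
            continue
        pid, ppid, command = int(parts[0]), int(parts[1]), parts[2]
        if pid != ppid:  # guard against a self-parented entry
            children[ppid].append((pid, command))
    return dict(children)


def render_tree(children: Dict[int, List[Tuple[int, str]]],
                root: int = 0, depth: int = 0) -> List[str]:
    """Return indented lines showing the tree below `root`."""
    lines = []
    for pid, command in children.get(root, []):
        lines.append("  " * depth + f"{pid} {command}")
        lines.extend(render_tree(children, pid, depth + 1))
    return lines
```

Inside a container the first process has parent PID 0, so rendering from root 0 shows the shell at the top with everything the sample spawned nested below it.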

Device Exploits

A feature I didn’t plan for when creating this tool is that I can easily see whether malware is using an exploit, and which one. Just by searching for part of the network request, I can quickly find a proof of concept for the exploit, analyze it, and figure out which devices are being targeted.

Packets containing exploits
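The part of a request that is most useful to search for is usually the request line. As a small illustration, the sketch below extracts the method and the URL-decoded target from a captured HTTP payload. It assumes the payload bytes have already been taken from the capture, and the function name is made up for this example.

```python
from typing import Optional, Tuple
from urllib.parse import unquote


def request_line(payload: bytes) -> Optional[Tuple[str, str]]:
    """Return (method, decoded target) from an HTTP request, or None."""
    first_line = payload.split(b"\r\n", 1)[0].decode("latin-1")
    parts = first_line.split(" ")
    # A request line has exactly three parts: method, target, version.
    if len(parts) != 3 or not parts[2].startswith("HTTP/"):
        return None
    # Decode percent-escapes so injected shell characters become visible.
    return parts[0], unquote(parts[1])
```

Decoding the target first matters because exploit payloads are often percent-encoded on the wire, while write-ups and proofs of concept tend to show the decoded form.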

Next Steps

Clean up code in the prototype
Make installation easier
Add more tools, such as memory analysis and static analysis tools
Allow the sandbox to safely connect to C2 servers