JavaScript Backdoor Malware Challenge
A few days ago I saw this tweet with a malware sample to decode. The task is to decode the second stage of a JavaScript backdoor and figure out how the malware turns a string into the JavaScript code for that second stage.
Setup
I want to be able to use dynamic analysis to decode this sample, so I will use the developer tools in Firefox to get access to a debugger. I’ll make an HTML file and put the code in script tags. Then I’ll add two p tags, one to click to run the code and another to hold the decoded second stage. I’ll wrap the main code in a function so that it only runs when the first p tag is clicked. That way I can load the page, set breakpoints before anything starts running, and then execute it. I’ll also change the eval statement so that it prints to the page inside the second p tag instead of executing the decoded code. Finally, I’ll remove the WScript arguments and hard-code the one value that is needed later on.
Code modified to run in function, eval() removed, and hardcoded arguments.
P tags to start running code, and to hold decoded stage 3.
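The eval swap described above can be sketched roughly as follows. This is only an illustration, not the exact code from my harness: the function name `showInsteadOfEval` and the element id mentioned in the comment are made-up names, and a plain object stands in for the p tag so the snippet also runs outside a browser.

```javascript
// Sketch only: `output` stands in for the second <p> tag. In the browser it
// would be something like document.getElementById("stage2") (hypothetical id).
function showInsteadOfEval(decoded, output) {
  // textContent treats the value as plain text, so nothing in it is
  // parsed as HTML or executed as script.
  output.textContent = String(decoded);
  return output;
}

// Usage with a plain object in place of a DOM element:
const fakeParagraph = { textContent: "" };
showInsteadOfEval("WScript.Echo('stage 2');", fakeParagraph);
console.log(fakeParagraph.textContent);
```

Writing to textContent rather than innerHTML is a deliberate choice here, since the whole point is to look at the decoded code without letting any of it run.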
Challenge Questions
What variable holds the next stage of code?
The variable “ES3c” will hold the next stage after it has been decoded.
What function is responsible for executing the de-obfuscated second stage?
The function eval() is passed the ES3c variable to execute the second stage.
What is the “key” needed to unravel the second stage?
To find the key, I’ll need information beyond the single sample provided. I know the sample is Turla malware, so I’ll search there first. In this securelist post, I found a potential key. The code was structured exactly the same, but the variable names were different, so I didn’t expect the key to work. Sure enough, with this key the decoded second stage was not valid JavaScript. This suggests to me that there are multiple variants of this malware that have changed over time to make detection harder.
Next I looked for posts specifically about this exact sample of Turla. I added “eval(ES3c)” to my search to make sure the exact variable name is in the post. I found a key of “EzZETcSXyKAdF_e5I2i1” in another blog post, added it to the sample, and executed it to find the next stage.
What does the function LXv5 do?
The function takes a string, decodes it, and converts it to an array that is passed to CpPT, which decodes it further to produce the second stage. I could go line by line and look closely at the code, but the input string gives away what the code is doing. It contains only ASCII characters: uppercase and lowercase letters, the digits 0-9, “/”, “+”, and “=” at the end for padding. That is the Base-64 alphabet, so I can throw the string into CyberChef and Base-64 decode it. The output is still nonsense, but that is expected, because another function decodes it again before it is executed. Comparing the array returned from the function with the result in CyberChef, I see they are exactly the same.
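To make the Base-64 step concrete, here is a minimal hand-rolled decoder that returns a byte array, in the same spirit as LXv5. It is a sketch, not the malware's code: the name `base64ToBytes` and the test string are my own, and I am assuming the standard alphabet with “=” padding.

```javascript
// Hypothetical stand-in for LXv5: decode standard Base-64 into a byte array.
const ALPHABET =
  "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

function base64ToBytes(input) {
  const bytes = [];
  let buffer = 0; // accumulates 6-bit groups
  let bits = 0;   // number of valid bits currently held in buffer
  for (const ch of input) {
    if (ch === "=") break;               // padding marks the end of the data
    const value = ALPHABET.indexOf(ch);
    if (value === -1) continue;          // skip whitespace or unknown characters
    buffer = (buffer << 6) | value;
    bits += 6;
    if (bits >= 8) {
      bits -= 8;
      bytes.push((buffer >> bits) & 0xff); // emit the top 8 bits
      buffer &= (1 << bits) - 1;           // keep only the leftover bits
    }
  }
  return bytes;
}

// "SGVsbG8=" is the Base-64 encoding of "Hello".
console.log(String.fromCharCode(...base64ToBytes("SGVsbG8="))); // prints "Hello"
```

Comparing the output of a function like this against CyberChef, or against a built-in decoder, is the same check I did on the array returned by LXv5.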
What does the function CpPT do?
CpPT takes the array from the function LXv5 and the key I found in the blog post to decode the second stage. Other research has identified this function as RC4 encryption, so I will try to confirm that is the case here. Refactoring the code in the malware shows it is nearly identical to the pseudocode presented in the Wikipedia article for RC4. The first two for loops make up the key-scheduling algorithm (KSA). The last for loop is the pseudo-random generation algorithm (PRGA).
KSA Comparison
PRGA Comparison
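For reference, the Wikipedia pseudocode translates to JavaScript roughly as below. This is my own sketch of textbook RC4, not the refactored malware code. The names `rc4`, `toBytes`, and `toHex` are made up for illustration, and the key and plaintext are the "Key" / "Plaintext" test vector from the Wikipedia article, not anything from the sample.

```javascript
// Textbook RC4 over byte arrays (sketch; not the malware's CpPT).
function rc4(keyBytes, dataBytes) {
  const S = [];
  let i = 0;
  let j = 0;

  // KSA, first loop: start from the identity permutation.
  for (i = 0; i < 256; i++) S[i] = i;

  // KSA, second loop: shuffle S using the key.
  for (i = 0; i < 256; i++) {
    j = (j + S[i] + keyBytes[i % keyBytes.length]) % 256;
    [S[i], S[j]] = [S[j], S[i]];
  }

  // PRGA: generate one keystream byte per input byte and XOR it in.
  const out = [];
  i = 0;
  j = 0;
  for (const byte of dataBytes) {
    i = (i + 1) % 256;
    j = (j + S[i]) % 256;
    [S[i], S[j]] = [S[j], S[i]];
    out.push(byte ^ S[(S[i] + S[j]) % 256]);
  }
  return out;
}

// Helpers for the demo (hypothetical names).
const toBytes = (s) => Array.from(s, (c) => c.charCodeAt(0));
const toHex = (bytes) => bytes.map((b) => b.toString(16).padStart(2, "0")).join("");

// Wikipedia test vector: key "Key", plaintext "Plaintext".
console.log(toHex(rc4(toBytes("Key"), toBytes("Plaintext")))); // bbf316e8d940af0ad3
```

Because RC4 is a plain XOR with the keystream, running the same function on the ciphertext with the same key gives the plaintext back, which is why the malware only needs this one routine to decrypt.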
TLDR
- This stage of Turla is encoded JavaScript that takes a string, decodes it, and executes it with eval.
- It requires a password to increase the difficulty of analyzing the sample without additional data about where it came from.
- The malware Base-64 decodes the string and then decrypts the result with RC4 to get to the second stage.
- Both the RC4 and Base-64 algorithms are hand-coded in the malware rather than called from built-in functions, which makes the sample harder to understand.