Since late 2014 we have witnessed a resurgence of campaigns spamming malicious Office documents with VBA macros. Sometimes, however, we also see malicious Office documents exploiting relatively recent vulnerabilities.
In this blog post we look at a malicious MS Office document that uses an exploit instead of VBA.
The sample we received is 65495b359097c8fdce7fe30513b7c637. It exploits vulnerability CVE-2015-2545, which allows remote attackers to execute arbitrary code via a crafted EPS image, aka “Microsoft Office Malformed EPS File Vulnerability”. In this blog post we want to focus on extracting the payload.
A more detailed explanation of the exploit itself can be found here (pdf).
The sample we received is a .docx file. With oledump.py we can confirm it doesn’t contain VBA code:
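For readers without oledump.py at hand, the idea behind this check can be sketched in a few lines of Python. VBA macros in the MS Office 2007+ formats are stored in an OLE file inside the ZIP container (typically vbaProject.bin), so a document without any OLE member very likely has no VBA code. This is a minimal sketch and not how oledump.py is implemented; the function name find_ole_members is our own.

```python
import io
import zipfile

# Signature found at the start of every OLE2 compound file.
OLE_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")


def find_ole_members(docx_bytes):
    """Return the names of ZIP members that start with the OLE2 signature."""
    hits = []
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as archive:
        for info in archive.infolist():
            with archive.open(info) as member:
                if member.read(len(OLE_MAGIC)) == OLE_MAGIC:
                    hits.append(info.filename)
    return hits
```

An empty list for the sample would be consistent with what oledump.py reports here.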
With zipdump.py (remember that the MS Office 2007+ file format uses ZIP as a container) we can see what’s inside the document:
Looking at the extensions, we see that most files are XML files. There’s also a .gif and .eps file. The .eps file is unusual. Let’s check the start of each file to see if the extensions can be trusted:
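The same check can be approximated with Python's zipfile module: read the first bytes of each member and compare them with a few well-known file signatures. This is only a sketch; the signature list is limited to the three types we expect in this sample, and the names classify and peek_members are our own.

```python
import io
import zipfile


def classify(header):
    """Guess a file type from its first bytes (only a few signatures)."""
    if header.startswith(b"\xef\xbb\xbf"):  # skip a UTF-8 byte order mark
        header = header[3:]
    if header.startswith(b"<?xml"):
        return "xml"
    if header.startswith((b"GIF87a", b"GIF89a")):
        return "gif"
    if header.startswith(b"%!PS"):
        return "eps"
    return "unknown"


def peek_members(docx_bytes, count=16):
    """Return (name, uncompressed size, guessed type) for each ZIP member."""
    rows = []
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as archive:
        for info in archive.infolist():
            with archive.open(info) as member:
                header = member.read(count)
            rows.append((info.filename, info.file_size, classify(header)))
    return rows
```

A member whose guessed type disagrees with its extension deserves a closer look.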
This confirms the extensions we see: 12 XML files, a GIF file and an EPS file. As we know there are exploits for EPS, we took a look at this file first:
The file contains a large number of lines like the one above, so let’s get some stats first:
byte-stats.py gives us all kinds of statistics about the content of the file. First of all, it’s a large file for MS Office documents (10MB). And it’s a pure text file (it only contains printable characters (96%) and whitespace (4%)).
A large share of the bytes are hexadecimal characters (66%) and BASE64 characters (93%). Since the hexadecimal character set is a subset of the BASE64 character set, we need more info to determine if the file contains hexadecimal strings or BASE64 strings. But it very likely contains some, as there are parts of the file (buckets) that contain only hexadecimal/BASE64 characters (10240 100%).
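To make these numbers less abstract, here is a rough sketch of how such percentages can be computed. It is not the byte-stats.py code: the exact character classes are assumptions on our part (for example, we count "=" as a BASE64 character and treat printable as 0x21 to 0x7E, without whitespace), and the bucket size of 10240 bytes is taken from the output above.

```python
import string

# Character classes as sets of byte values (assumed definitions).
HEX_CHARS = frozenset(b"0123456789abcdefABCDEF")
BASE64_CHARS = frozenset(
    (string.ascii_letters + string.digits + "+/=").encode("ascii"))
WHITESPACE = frozenset(b" \t\n\r\x0b\x0c")
PRINTABLE = frozenset(range(0x21, 0x7F))

CLASSES = {
    "hex": HEX_CHARS,
    "base64": BASE64_CHARS,
    "whitespace": WHITESPACE,
    "printable": PRINTABLE,
}


def percentage(data, charset):
    """Percentage of bytes in data that belong to charset."""
    if not data:
        return 0.0
    return 100.0 * sum(byte in charset for byte in data) / len(data)


def class_percentages(data):
    """Percentage per character class over the whole file."""
    return {name: percentage(data, charset)
            for name, charset in CLASSES.items()}


def best_bucket_percentage(data, charset, bucket_size=10240):
    """Highest percentage of charset bytes found in any single bucket."""
    return max(
        (percentage(data[start:start + bucket_size], charset)
         for start in range(0, len(data), bucket_size)),
        default=0.0,
    )
```

A bucket that reaches 100% for the hexadecimal class, while the whole file is at 66%, is what points us to long encoded strings.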
base64dump.py is a tool to search for BASE64 strings (and other encodings like hexadecimal). We will use it to search for a payload in the EPS file. Since the file is large, we can expect to have a lot of hits. So let’s set a minimum sequence length of 1000:
There are 4 large sequences of BASE64 characters in the document. But as the start of each sequence (field Encoded) contains only hexadecimal characters, it’s necessary to check for hexadecimal encoding too:
With this output, it’s clear that the EPS file contains 4 large hexadecimal strings.
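What this search does can be sketched with a regular expression: look for runs of at least 1000 hexadecimal characters and decode each run. This is a simplified stand-in for base64dump.py, not its implementation. It assumes each hexadecimal string is contiguous (no whitespace inside the run), and find_hex_strings is a name we made up.

```python
import binascii
import re


def find_hex_strings(data, min_length=1000):
    """Return (offset, decoded bytes) for each long run of hex characters."""
    pattern = re.compile(rb"[0-9A-Fa-f]{%d,}" % min_length)
    results = []
    for match in pattern.finditer(data):
        encoded = match.group()
        if len(encoded) % 2:  # an odd trailing digit cannot be decoded
            encoded = encoded[:-1]
        results.append((match.start(), binascii.unhexlify(encoded)))
    return results
```

Looking at the first bytes of each decoded string is then enough to spot candidates such as a PE file.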
Near the start of decoded strings 2 and 4, we can see the characters MZ: this could indicate a PE file. So let’s check:
This certainly looks like a PE file. Let’s pipe it through pecheck.py (we need to skip the first 8 bytes: UUUUffff):
This tells us that it is definitely a PE file. With more details from pecheck’s output, we can say it’s a 64-bit DLL. It has a small overlay:
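The three facts we take from pecheck.py here (bitness, DLL flag, overlay size) can be read directly from the PE headers. The sketch below does this with the struct module. It is a simplification and not what pecheck.py does internally: it only knows two machine types and it defines the overlay as everything after the end of the last section's raw data. The name describe_pe is our own.

```python
import struct

MACHINES = {0x014C: "32-bit (x86)", 0x8664: "64-bit (x64)"}
IMAGE_FILE_DLL = 0x2000  # Characteristics flag set for DLLs


def describe_pe(data):
    """Return machine type, DLL flag and overlay size of a PE file."""
    if data[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    # Offset 0x3C of the DOS header points to the PE signature.
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    if data[e_lfanew:e_lfanew + 4] != b"PE\x00\x00":
        raise ValueError("missing PE signature")
    # The 20-byte COFF file header follows the 4-byte PE signature.
    (machine, section_count, _timestamp, _symbol_table, _symbol_count,
     optional_size, characteristics) = struct.unpack_from(
        "<HHIIIHH", data, e_lfanew + 4)
    # Section headers (40 bytes each) follow the optional header.
    section_table = e_lfanew + 24 + optional_size
    end_of_image = 0
    for index in range(section_count):
        raw_size, raw_offset = struct.unpack_from(
            "<II", data, section_table + 40 * index + 16)
        end_of_image = max(end_of_image, raw_offset + raw_size)
    return {
        "machine": MACHINES.get(machine, hex(machine)),
        "is_dll": bool(characteristics & IMAGE_FILE_DLL),
        "overlay_size": max(0, len(data) - end_of_image),
    }
```

The input must be the decoded string without its first 8 bytes, so that it starts with MZ.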
Since this overlay is actually just 8 bytes (UUUUffff), it’s not really an overlay but a “sentinel”, like the one at the start of the hexadecimal sequence. So let’s remove it:
We submitted this DLL to VirusTotal: 30ec672cfcde4ea6fd3b5b14d6201c43.
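Removing both sentinels from a decoded string takes only a few lines of Python. This is a minimal sketch that assumes the sentinel is exactly the 8 bytes UUUUffff seen in this sample; strip_sentinels is a hypothetical helper name.

```python
SENTINEL = b"UUUUffff"  # 8-byte marker seen around the payload in this sample


def strip_sentinels(decoded, sentinel=SENTINEL):
    """Remove the sentinel from the start and the end of decoded, if present."""
    if decoded.startswith(sentinel):
        decoded = decoded[len(sentinel):]
    if decoded.endswith(sentinel):
        decoded = decoded[:-len(sentinel)]
    return decoded
```

What remains should start with MZ and no longer have the 8-byte overlay.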
It has some interesting strings:
One is the string of the PDB file: GetDownLoader. Another is a PowerShell command to download and execute an .exe (the URL is readable).
Also notice that the string “This program can not be run in DOS mode.” appears twice. This is a strong indication that this DLL contains another PE file.
Let’s search for it. By using the --cut option to search for another instance of the string MZ, we can cut out the embedded PE file:
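The idea of cutting at the second MZ can also be sketched in Python. Because the two bytes MZ can occur by chance, this sketch only accepts an MZ that is followed by a valid pointer to a PE signature. It cuts from that position to the end of the data, which is an assumption on our part: the result may include trailing bytes that belong to the outer DLL. Both function names are our own.

```python
import struct


def find_embedded_pe_offsets(data):
    """Return offsets (after 0) where an MZ header points to a PE signature."""
    offsets = []
    position = data.find(b"MZ", 1)
    while position != -1:
        if position + 0x40 <= len(data):
            # e_lfanew is relative to the start of the embedded file.
            (e_lfanew,) = struct.unpack_from("<I", data, position + 0x3C)
            signature_at = position + e_lfanew
            if data[signature_at:signature_at + 4] == b"PE\x00\x00":
                offsets.append(position)
        position = data.find(b"MZ", position + 1)
    return offsets


def cut_embedded_pe(data):
    """Return data from the first embedded PE header to the end, or None."""
    offsets = find_embedded_pe_offsets(data)
    return data[offsets[0]:] if offsets else None
```

The cut-out bytes can then be checked like any other PE file.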
We also submitted this file to VirusTotal: 2938d6eda6cd941e59df3dd54bf8dad8. It is a 32-bit EXE file.
The hexadecimal string with Id 4 found with base64dump also contains a PE file. It is a 32-bit DLL (ce95faf23621a0a705b796c19d9fec44), containing the same 32-bit EXE as the 64-bit DLL: 2938d6eda6cd941e59df3dd54bf8dad8.
With a steady flow of VBA maldocs for more than 2 years, one would almost forget that Office maldocs with exploits are found in the wild too. If you just look for VBA macros in documents you receive, you will miss these exploits.
In this sample, detecting a payload was not too difficult: we found an unusual file (large .eps file) with long hexadecimal strings that decode to PE files. It’s not always that easy, especially if we are dealing with binary MS Office files (like .doc).
In this post we focus on a static analysis method to extract the payload. When performing analysis on this file yourself, be aware that this maldoc also contains shellcode (strings 1 and 3 found by base64dump) and an exploit to break out of the sandbox (protected view).