Evidence of VBA Purging Found in Malicious Documents

TL;DR We have found malicious Office documents containing VBA source code only, and no compiled code. Documents like these are more likely to evade anti-virus detection due to a technique we dubbed “VBA Purging”.

VBA Purging technique
Malicious MS Office documents that leverage VBA store their VBA code inside streams of Compound File Binary Format files. Each VBA module, class, … is stored in its own Module Stream. A Module Stream contains PerformanceCache data (compiled VBA code, aka P-Code) that is version and implementation dependent, followed by CompressedSourceCode data (compressed VBA source code).

In 2016, Vesselin Bontchev hinted at a technique where the compressed VBA source code is removed or changed while the P-Code is still executed. At the end of 2018, this technique came to be known as “VBA stomping”.
Our research back then revealed that the reverse is also possible: the PerformanceCache data (compiled VBA code, aka P-Code) can be removed while the VBA source code is left intact, ready to be executed. We call this technique “VBA purging” (cf. cache purging).
The boundary between PerformanceCache data and CompressedSourceCode data is defined by a MODULEOFFSET for each Module Stream. This MODULEOFFSET is a record stored in the dir stream for each Module Stream.
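
As an illustration of how this boundary is used, here is a minimal Python sketch. It assumes the dir stream has already been decompressed and that the 10 bytes of one MODULEOFFSET record have been located; the layout used (Id 0x0031, a size field of 4, then a 4-byte little-endian offset) is our reading of the MS-OVBA specification. The function names are hypothetical and are not part of oledump.py.

```python
import struct

MODULEOFFSET_ID = 0x0031  # record Id, per our reading of MS-OVBA


def parse_moduleoffset(record: bytes) -> int:
    """Return the offset stored in a 10-byte MODULEOFFSET record."""
    if len(record) < 10:
        raise ValueError("MODULEOFFSET record is 10 bytes long")
    # Layout assumed: Id (2 bytes), Size (4 bytes), TextOffset (4 bytes),
    # all little-endian.
    rec_id, size, text_offset = struct.unpack_from("<HII", record, 0)
    if rec_id != MODULEOFFSET_ID or size != 4:
        raise ValueError("not a MODULEOFFSET record")
    return text_offset


def split_module_stream(stream: bytes, text_offset: int):
    """Split a Module Stream into (PerformanceCache, CompressedSourceCode)."""
    if not 0 <= text_offset <= len(stream):
        raise ValueError("offset lies outside the stream")
    return stream[:text_offset], stream[text_offset:]
```

The length of the first element returned by split_module_stream is the PerformanceCache size: if it is 0, the module carries no P-Code.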

To remove the PerformanceCache data, its bytes have to be removed, so that the CompressedSourceCode data starts at position 0x0000 (beginning of stream). The size of the Module Stream has to be reduced accordingly, and the MODULEOFFSET record has to point at position 0x0000.
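
The byte-level operation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the tool used on any sample: purge_module_stream is a hypothetical helper, the record layout is our reading of MS-OVBA, and the check on the first byte assumes that compressed source code starts with the signature byte 0x01.

```python
import struct
from typing import Tuple


def purge_module_stream(stream: bytes, text_offset: int) -> Tuple[bytes, bytes]:
    """Drop the PerformanceCache bytes from one Module Stream.

    Returns the shortened stream and a replacement MODULEOFFSET record
    that points at position 0x0000.
    """
    if not 0 <= text_offset <= len(stream):
        raise ValueError("offset lies outside the stream")
    # Everything before text_offset is PerformanceCache data; keep the rest.
    purged = stream[text_offset:]
    # Assumption: CompressedSourceCode begins with signature byte 0x01.
    if purged and purged[0] != 0x01:
        raise ValueError("unexpected data at the start of the source code")
    # Id 0x0031, size 4, offset 0 (little-endian).
    new_record = struct.pack("<HII", 0x0031, 4, 0)
    return purged, new_record
```

The sketch stops at producing the new bytes. Writing them back also means updating the stream size in the Compound File and storing the new record in the dir stream, which is itself compressed; neither step is shown here.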

PerformanceCache data is also present in the _VBA_PROJECT stream and in SRP streams. The VBA Purging technique implies removal of PerformanceCache data from the _VBA_PROJECT stream and complete removal of SRP streams.

VBA Purging in the wild
Over the years, we encountered Office documents without PerformanceCache data that were either benign or Proof-of-Concept documents. Recently however, we found malicious Office documents that are most likely real malware (i.e. not PoC documents).
“Voo Cancelado Localizador RR9N4V.ppam” (MD5 730a8401140edb4c79d563f306ca529e) is such a document. There are several interesting aspects to this malicious document, but for this blog post, we focus solely on the VBA Purging aspect.
The document we found is a PowerPoint Add-In (.ppam file extension). It is an Office Open XML file (OOXML) with VBA macros:

Inside this ZIP container, file vbaProject.bin contains VBA code.
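
Since an OOXML file is a ZIP container, the VBA project can be pulled out with Python's standard zipfile module. The following is a minimal sketch; extract_vba_parts is a hypothetical helper name, and matching on the file name vbaProject.bin is an assumption based on the sample discussed here.

```python
import zipfile


def extract_vba_parts(ooxml_file) -> dict:
    """Return {member name: raw bytes} for every vbaProject.bin in the ZIP.

    ooxml_file may be a path or a binary file-like object.
    """
    parts = {}
    with zipfile.ZipFile(ooxml_file) as container:
        for name in container.namelist():
            # Assumption: the VBA project is stored as .../vbaProject.bin
            if name.lower().endswith("vbaproject.bin"):
                parts[name] = container.read(name)
    return parts
```

The returned bytes are a Compound File Binary Format file that can be saved to disk and handed to an analysis tool.
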
The vbaProject.bin file can be analyzed with a tool like oledump.py:

Using option -i, we can visualize the MODULEOFFSET record values:

This option adds an extra column, with the size of the PerformanceCache data and the CompressedSourceCode data. Notice that for this sample, the size of the PerformanceCache data is 0: there is no P-Code.
And the size of the _VBA_PROJECT stream is just 7 bytes: that’s just a header without PerformanceCache data.
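
These two observations can be combined into a simple check. The sketch below is a hypothetical heuristic, not something oledump.py provides: it assumes the caller has already extracted the raw _VBA_PROJECT stream and the MODULEOFFSET value of each module, and it takes the 7-byte header size from the observation above.

```python
VBA_PROJECT_HEADER_SIZE = 7  # header only, no PerformanceCache data


def looks_vba_purged(vba_project_stream: bytes, text_offsets: dict) -> bool:
    """Heuristic: True when no PerformanceCache data seems to be present.

    text_offsets maps module names to their MODULEOFFSET values.
    """
    # The _VBA_PROJECT stream holds nothing beyond its header.
    header_only = len(vba_project_stream) <= VBA_PROJECT_HEADER_SIZE
    # Every module's source code starts at position 0: no P-Code before it.
    no_pcode = bool(text_offsets) and all(
        offset == 0 for offset in text_offsets.values()
    )
    return header_only and no_pcode
```

A positive result only indicates that PerformanceCache data is absent. It says nothing about whether the document is malicious, since legitimate tools also produce documents without this data.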

Compare this with a “normal” malicious Office document:

The VBA code in this “VBA Purged” document downloads and executes a VBS script hosted on GitHub:

VBA Purging Tools
There are tools to create Office documents without any dependencies on MS Office components, like the .NET library EPPlus. Excel documents created with this library contain no PerformanceCache data.
There are also commercial tools used by professional VBA developers to clean their documents prior to release, like these tools.
We don’t know exactly what tool was used to achieve VBA Purging on this document, but it was not simply saved with PowerPoint. When a document with VBA code is created with MS Office, it will contain PerformanceCache data and CompressedSourceCode data. The absence of PerformanceCache data means that the document was created/purged with another tool.

Removing PerformanceCache data from malicious documents reduces the chance of anti-virus detection. Here is an example.
Walmart’s Red Team used malicious document a668657023e2c9c12dabad14c8f905e4 in their presentation. It has 44/61 detections on VirusTotal (27th December 2019):

The module streams contain PerformanceCache and CompressedSourceCode data. Then Walmart’s Red Team “VBA stomped” this document by overwriting the CompressedSourceCode variable with null bytes. This resulted in a far lower detection rate on VirusTotal (7/58 vs 36/59 when they performed their tests in April 2018).
When we removed the PerformanceCache data from Walmart’s Red Team sample (i.e. when we “VBA purged” it), we also witnessed a lower detection rate (16/58 on 22nd December 2019) on VirusTotal:

Of course, this is just a single sample on VirusTotal. Your experience might be very different with your fully operational AV in a production environment, compared to the static AV scans on VirusTotal.
AV is not the only technology that can be negatively impacted by VBA Purging. There are IDS and YARA rules that rely on strings solely found in PerformanceCache data. For example, the string CreateObject, often used in the VBA source code of malicious Office documents to create ActiveX objects (e.g. HTTP objects), only appears as a complete string in PerformanceCache data. Although it is also present in the VBA source code, it does not appear as a complete string in the CompressedSourceCode data because of compression. Rules that rely on the presence of this string will therefore miss a purged document.

VBA Purging removes PerformanceCache data from Office documents with VBA code, without impairing code execution. It can have an impact on AV detection, and it certainly impacts the effectiveness of IDS and YARA rules that rely on strings solely found in PerformanceCache data. If you use or develop rules like these, we encourage you to review them in light of the information presented here.
Unlike VBA Stomping, VBA Purging is not a clear sign of malicious adversaries. There are legitimate tools that create documents without PerformanceCache data and/or remove it from existing documents.

About the authors
Didier Stevens is a malware expert working for NVISO. Didier is a SANS Internet Storm Center senior handler and Microsoft MVP, and has developed numerous popular tools to assist with malware analysis. You can find Didier on Twitter and LinkedIn.