Are threat actors turning to archives and disk images as macro usage dwindles?

Malicious macros in Office documents have long been a favorite tactic of threat actors. So Microsoft’s announcement in February 2022 that macros in documents originating from the internet would be blocked by default came as welcome news (despite a brief rollback in July). Excel 4.0 (XLM) macros were also disabled by default in Microsoft 365 as of February 2022.

But threat actors have always evolved in response to security developments, and this looks like it will be no exception. Following the rollout of Microsoft’s new policy, we’ve seen attacks using archive and disk image files – including the usual suspects (ZIP and RAR), but also more obscure formats like ARJ, ACE, LZH, VHD, and XZ – accompanying a decrease in detections of popular Office formats.

Archives can make it harder for detection products to inspect and flag malicious content – even more so with less-popular formats, as they tend to be less well-understood. They can also allow threat actors to bypass the ‘Mark of the Web’ (MOTW), the tag Microsoft inserts into files originating from the internet. While the MOTW is usually present in the archive file itself, it isn’t always propagated to an archive’s contents once extracted.

But there are some positives. Threat actors sometimes adopt more convoluted attack chains when using archives, which provides more opportunities to detect and block malicious activity. Archives are also likely to be less familiar to many users, which may make those users pause before opening them (although this cuts both ways, as users may be less aware of the associated risks).

Our Sophos X-Ops researchers have also been working hard to expand our coverage to protect against lesser-known archive types. And some archive software vendors have begun to offer MOTW support, so that extracted contents will also contain the tag.

Mark of the Web

MOTW was originally an Internet Explorer feature that forced saved webpages to run in the same security zone as the site they were saved from (it could also be added manually to HTML documents meant to be viewed locally, such as product manuals and help guides).

MOTW was designed to protect users by ensuring that locally saved webpages didn’t run in the ‘Local Machine zone,’ which has fewer security restrictions and would have given them access to the entire filesystem. Instead, those webpages were forced to run in the zone of the location the page was saved from.

In practice, this meant an HTML comment was added to a saved webpage, like this:

<!-- saved from url=(0028)https://www.news.sophos.com/ -->

Internet Explorer would parse the page for this comment and determine which security policy to apply to the contents, based on the user’s zone settings.
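As a rough illustration of that parsing step, here is a minimal Python sketch that pulls the URL out of a 'saved from url' comment. The function name and regular expression are our own, not Internet Explorer's actual logic. It assumes the four-digit number in parentheses is the character length of the URL that follows, which holds for the example above.

```python
import re

# Matches the comment written into saved pages, e.g.
# <!-- saved from url=(0028)https://www.news.sophos.com/ -->
MOTW_COMMENT = re.compile(r"<!--\s*saved from url=\((\d{4})\)(\S+)\s*-->")


def parse_motw_comment(html):
    """Return the URL recorded in a 'saved from url' comment, or None."""
    match = MOTW_COMMENT.search(html)
    if match is None:
        return None
    declared_length = int(match.group(1))
    url = match.group(2)
    # Assumption: the four-digit field is the character count of the URL,
    # so a mismatch suggests the comment was hand-edited or truncated.
    if len(url) != declared_length:
        return None
    return url


page = "<!-- saved from url=(0028)https://www.news.sophos.com/ -->\n<html></html>"
print(parse_motw_comment(page))  # https://www.news.sophos.com/
```
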
Microsoft later expanded MOTW to apply to files originating from the internet, including browser downloads and email attachments, and integrated MOTW handling throughout Windows. Instead of an HTML comment, an Alternate Data Stream (ADS) called Zone.Identifier is added to files, and an element of this stream, called ZoneId, contains a value indicating which zone the file came from.

Possible ZoneId values include:

0: Local machine
1: Intranet
2: Trusted sites
3: Internet
4: Untrusted sites

Interestingly, some users have reported that the URL is preserved in the ADS in Windows 10. It’s also worth bearing in mind that different applications handle MOTW in different ways (if at all), and the feature is not infallible; security researchers have identified methods to bypass it, and in some applications, files may or may not be tagged with a MOTW, depending on the user’s behavior – for instance, right-clicking and selecting ‘Save As,’ versus drag-and-drop.

Let’s look at an example. I’ll download a test file (a Word document containing a simple macro) from an external website, and we’ll take a look at its properties and ADS.

We see the following notification in the file’s properties:

Figure 1: Properties of our sample document

We can inspect the ADS with the following PowerShell command:

Get-Content .\remote_file.doc -Stream Zone.Identifier

Which gives us the following output:

Figure 2: Inspecting the ADS of the sample document

So we have a ZoneId of 3 (internet zone), and two values of interest, ReferrerUrl and HostUrl, which tell us where the file was downloaded from. These may be worth noting for incident responders!
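For responders who want to pull these values out programmatically, a minimal Python sketch follows. It assumes the stream text has already been read (for example, with the PowerShell command above) and uses the usual [ZoneTransfer] layout of key=value lines. The sample URLs are placeholders, not the values from our test file, and the function name is our own.

```python
# ZoneId values as listed earlier in this article.
ZONE_NAMES = {
    0: "Local machine",
    1: "Intranet",
    2: "Trusted sites",
    3: "Internet",
    4: "Untrusted sites",
}


def parse_zone_identifier(stream_text):
    """Parse the key=value lines of a Zone.Identifier stream into a dict."""
    fields = {}
    for line in stream_text.splitlines():
        line = line.strip()
        # Skip blank lines and the [ZoneTransfer] section header.
        if not line or line.startswith("["):
            continue
        # Split on the first '=' only, so URLs with query strings survive.
        key, sep, value = line.partition("=")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


# Placeholder stream contents (assumed layout, hypothetical URLs).
sample = (
    "[ZoneTransfer]\n"
    "ZoneId=3\n"
    "ReferrerUrl=https://example.com/downloads/\n"
    "HostUrl=https://example.com/downloads/remote_file.doc?id=1\n"
)

fields = parse_zone_identifier(sample)
zone_id = int(fields["ZoneId"])
print(ZONE_NAMES.get(zone_id, "Unknown"), fields.get("HostUrl"))
```

Keeping the raw fields in a dictionary means that any additional keys a particular application writes to the stream are preserved rather than silently dropped.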

As of version 2203, the default behavior of five Office applications (Access, Excel, PowerPoint, Visio, and Word) is to block macros in files originating from the internet, meaning that users see the following notification when opening this file:

Figure 3: A notification that macros have been blocked (credit: Microsoft)

Figure 4: Microsoft’s decision tree for blocking macros (credit: Microsoft)

Of course, organizations can configure their policies differently, but making this the default behavior is likely to frustrate many threat actors who rely on Office macros as an initial infection vector.

This could be one explanation for the increase in archive formats we’ve seen recently. Here’s why it’s a problem: Say you download an ‘XZ’ archive, created using 7-Zip, from the internet. The properties and ADS contain the MOTW:

Figure 5: Properties of our sample XZ archive

Figure 6: Examining the ADS of the XZ archive

So far so good. But if you then extract the archive and examine the document inside, the MOTW is not propagated:

Figure 7: MOTW not propagated to the extracted document

Figure 8: Examining the ADS of the extracted document

If an attacker persuades a victim to extract this file (perhaps by using some additional context in a malicious spam email, like “I’ve put this file in a password-protected archive for security reasons”), and the victim opens the document and enables macros (assuming that’s within an organization’s policy rules), the malicious macro will still run.

Most popular archiver products – including WinRAR, WinZip, and the built-in ‘extract all’ Windows function – support MOTW propagation, although depending on the product, this may be only for certain file extensions or ZoneIds (usually 3 and/or 4), and attackers may also be able to bypass MOTW propagation. As of version 22.00, 7-Zip also contains, for the first time, support for MOTW propagation – although it’s not enabled by default. To configure it, you’ll need to go to Tools > Options > 7-Zip and select either Yes or Office files only under Propagate Zone.Id stream.
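To make those propagation rules concrete, here is a hedged Python sketch of the kind of decision an archiver might make for each extracted file. It is purely illustrative: the mode names, the extension list, and the function are hypothetical and do not reflect the actual logic of 7-Zip or any other product.

```python
from pathlib import PurePath

# Hypothetical extension list for an "Office files only" style setting.
OFFICE_EXTENSIONS = {
    ".doc", ".docm", ".docx",
    ".xls", ".xlsm", ".xlsx",
    ".ppt", ".pptm", ".pptx",
}

# ZoneIds that are usually propagated (Internet and Untrusted sites).
PROPAGATED_ZONES = {3, 4}


def should_propagate(archive_zone_id, member_name, mode="all"):
    """Decide whether an extracted file should inherit the archive's MOTW.

    mode is a hypothetical setting: "off", "office", or "all".
    archive_zone_id is None when the archive itself carries no MOTW.
    """
    if mode == "off" or archive_zone_id not in PROPAGATED_ZONES:
        return False
    if mode == "office":
        return PurePath(member_name).suffix.lower() in OFFICE_EXTENSIONS
    return True


print(should_propagate(3, "invoice.docm", mode="office"))  # True
print(should_propagate(3, "payload.exe", mode="office"))   # False
print(should_propagate(1, "invoice.docm", mode="all"))     # False
```

The second call shows why an extension-limited setting leaves a gap: an executable extracted from a tagged archive would not inherit the tag.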

However, some archiver products and methods of extraction don’t propagate MOTW. In the screenshot below, for example, I’ve downloaded the same test file (this time archived as a ZIP file using WinZip). If I extract the contents with the Expand-Archive cmdlet in PowerShell, the MOTW is not propagated to the Word document. But if I unzip using Windows Explorer, and then check the ADS of the extracted document in my terminal, there’s an MOTW tag:

Figure 9: Examining the ADS of the document after extracting with PowerShell, versus extracting with Explorer

Of course, it’s unlikely that most users would use a PowerShell cmdlet to extract an archive – so in this particular scenario, the attacker’s job is more difficult, as most users would probably unzip the archive using Explorer or a popular archiver.

Developer Nobutaka Mantani maintains a list of archiver products, along with whether or not they support MOTW propagation, on a GitHub repository (last updated August 27th, 2022).

Archive formats

In malicious spam emails, threat actors often attach a password-protected archive, and include the password in the body of the email, as in this example:

Figure 10: An example of a ZIP archive used as a part of a malicious spam attack

Some attackers may send the password in a follow-up email, or reference the password obliquely (e.g., “the password is the current month and year, in the format MMYYYY”) to prevent email scanners and sandboxes from unpacking the archive. The archive itself then contains a malicious payload – which might be an Office document, an EXE disguised as a PDF, an ISO file, or something else.
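To illustrate why an oblique, date-based hint is only a modest hurdle, here is a small Python sketch of how a scanner or analyst might generate candidate passwords from the date an email was received. The function name and the set of formats tried are our own assumptions. A real pipeline would also need to attempt extraction with each candidate.

```python
from datetime import date, timedelta


def month_year_candidates(received):
    """Return date-derived password guesses around an email's received date."""
    # Also try the previous month, in case the email was written near a
    # month boundary and delivered or opened later.
    previous_month = received.replace(day=1) - timedelta(days=1)
    candidates = []
    for day in (received, previous_month):
        # MMYYYY is the format from the example lure; the others are guesses.
        for fmt in ("%m%Y", "%m%y", "%Y%m"):
            guess = day.strftime(fmt)
            if guess not in candidates:
                candidates.append(guess)
    return candidates


print(month_year_candidates(date(2022, 9, 14)))
```
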

In other cases, the email may contain an ISO file or another disk image file as an attachment, as in this case:

Figure 11: An ISO attachment in a malicious spam email

Whether threat actors use common archive formats like ZIP and RAR, disk image formats like ISO and VHD, or more obscure archive types like ACE, the intentions are the same: smuggling malicious code past gateway scanners and security systems, and executing it.

It’s worth noting that these aren’t new approaches – for example, researchers have observed ACE files in attack campaigns since at least 2015, and ARJ files since at least 2014. And threat actors have been using ZIP and ISO files for a while now, so it’s no surprise that there have been some notable attacks using these formats. Here’s a brief overview of some archive and disk image file types, along with instances of attacks in which they’ve been used:

Format	Released	MIME type	Read using	Notable past attack campaigns
7Z	2000	application/x-7z-compressed	7-Zip, WinZip, WinRAR, other third-party tools	Scarab ransomware (2017); possible Locky ransomware (2017); GlobeImposter ransomware (2018)
ACE	1998	application/x-ace-compressed	WinAce, various less common third-party tools	GuLoader Trojan (2020)
ARJ	1991	application/x-arj	7-Zip, WinZip	Spyware (2018); LokiBot (2020); keylogger/infostealer (2022); AgentTesla (2022)
CAB	1992	application/vnd.ms-cab-compressed	7-Zip, WinZip, WinRAR, other third-party tools, Windows Explorer	Loki (2020)
GZ	1992	application/gzip	7-Zip, WinZip, WinRAR, GNU tar, other third-party tools	AgentTesla (2022); GuLoader (2022)
ISO (and UDF, IMG)	1988	application/x-iso9660-image	7-Zip, WinZip, WinRAR, other third-party tools. Can also be mounted	NanoCore, Remcos, LokiBot (2019); Nobelium (2021); Bumblebee (2022); Vidar (2022). Various disk image files (2019-2020)
LZH	1988	application/x-lzh	7-Zip, WinZip, WinRAR, other third-party tools	LokiBot (2020); FormBook (2021); AsyncRAT (2022)
RAR	1993	application/x-rar-compressed	7-Zip, WinZip, WinRAR, other third-party tools	BazarBackdoor (2021); NanoCore (2021)
TAR	1979	application/x-tar	7-Zip, WinZip, WinRAR, other third-party tools	Unknown RAT (2022)
VHD	2003	application/x-vhd	7-Zip, WinZip, WinRAR, other third-party tools. Can also be mounted	Bumblebee (2022)
XZ	2009	application/x-xz	7-Zip, WinRAR, GNU tar	Possible Pony spyware (2017); ISR Stealer (2018); FormBook (2022)
ZIP	1989	application/zip	7-Zip, WinZip, WinRAR, most third-party tools	NanoCore (2019); Emotet (2021); Qakbot (2022); LockBit (2022)

Figure 12: Table showing details of a selection of archive and image formats, and notable attack campaigns in which they have been used

Of course, the above table isn’t exhaustive. There are a multitude of formats to choose from, although threat actors may limit themselves to those supported by popular archivers like WinZip, 7-Zip, and WinRAR. There’s also a wide variety of alternatives to ISO. Researchers reported that a recent Bumblebee campaign, for example, used a VHD (virtual hard disk) file.
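Because file extensions are trivial to change, scanners generally identify these formats by their signature bytes rather than by name. The Python sketch below shows the idea for a subset of the formats in the table above. The function and dictionary names are our own, and the list is deliberately incomplete (ACE, ARJ, LZH, and VHD are left out for brevity).

```python
import gzip
import lzma

# (offset, signature) pairs for a subset of the formats discussed above.
SIGNATURES = {
    "ZIP": (0, b"PK\x03\x04"),
    "RAR": (0, b"Rar!\x1a\x07"),
    "7Z": (0, b"7z\xbc\xaf\x27\x1c"),
    "GZ": (0, b"\x1f\x8b"),
    "XZ": (0, b"\xfd7zXZ\x00"),
    "CAB": (0, b"MSCF"),
    "ISO": (0x8001, b"CD001"),  # ISO 9660 volume descriptor
    "TAR": (257, b"ustar"),     # POSIX tar header magic
}


def identify_format(data):
    """Return the name of the first format whose signature matches, else None."""
    for name, (offset, signature) in SIGNATURES.items():
        if data[offset:offset + len(signature)] == signature:
            return name
    return None


print(identify_format(gzip.compress(b"hello")))  # GZ
print(identify_format(lzma.compress(b"hello")))  # XZ
print(identify_format(b"plain text"))            # None
```

Note that some signatures sit well past the start of the file (ISO 9660's is at offset 0x8001), so a scanner that reads only the first few bytes would miss them.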

Our Managed Detection and Response (MDR) team responded to several cases in August 2022 involving a behavioral detection for the Bumblebee Loader, where the initial access was via a VHD file with the naming schema [customerName].vhd. When we investigated further, we found that the overall infection chain was pretty complex: Phishing email > WeTransfer URL for a file download > .vhd file > LNK shortcut file > PowerShell file > Malicious DLL.

A shift in the threat landscape?

When we dug into our telemetry from the last few months, we spotted some interesting trends.

First, detections of popular Office formats, which often contain malicious macros, seem to be trending downwards (in a Twitter thread back in August, we noted that Excel 4.0 macros were also declining, after Microsoft disabled them by default in 365):

Figure 13: Detections for DOC, DOCM, XLS, or XLSM files between April and September 2022

So, are threat actors using archive formats to pick up the slack? Hard to say. We did notice that detections of more obscure archive formats (ACE, ARJ, XZ, GZ, and LZH) rose pretty sharply up until mid-June, but that trend seems to be less clear in the last few months (although detections have been on the rise again in the last few weeks, after a brief drop that coincided with the traditional summer holiday season).

Figure 14: Detections for ACE, ARJ, XZ, GZ, or LZH files between April and September 2022

With more common formats (ZIP, 7Z, CAB, TAR, and RAR), we haven’t seen much movement in the last few months, although there was a significant spike in early September.

Figure 15: Detections for ZIP, 7Z, CAB, TAR, or RAR files between April and September 2022

However, detections of disk image formats (ISO, VHD, and UDF) have trended upwards, peaking in July. These formats may be particularly attractive to threat actors, as they can be used to bypass MOTW.

Figure 16: Detections for ISO, VHD, or UDF files between April and September 2022

Other security firms have also noted a decrease in the use of macros and an increase in other file formats such as ISO and RAR.

At present, there’s no evidence to suggest that threat actors will stop using standalone Office files altogether and turn to other formats wholesale. Some organizations may still enable macros due to business needs, and threat actors may adopt more sophisticated pretexts to try to convince users to remove the MOTW attribute from files. And Microsoft’s rollout is still ongoing, so it may take a while before we see any kind of permanent, significant shift in the threat landscape.

But if this is the start of a long-term change, one positive for defenders and responders is that threat actors often adopt relatively convoluted infection chains when using archives and disk images, as our MDR team noted when investigating several Bumblebee campaigns.

Of course, attacks using traditional macros may also involve complex infection chains. In either case, each additional stage gives responders another opportunity to detect and block an attack in progress.

Detection and guidance

While it’s probably too early to say if archive formats will be adopted by the majority of threat actors long-term, the trends we’ve observed here are definitely worth keeping an eye on.

The use of malicious archives and disk images has three key implications for defenders and responders:

1. Threat actors will often adopt more complex infection chains. This can complicate analysis and investigation, but it also means more opportunities to stop infections in their tracks.

2. Configuration and development of scanners and automated security tools. It’s important to be able to inspect the contents of archives and containers, particularly those sent via email or downloaded via the internet. When it comes to email filtering and inbound traffic, defenders should consider blocking most or all of the file types we discuss here by default, unless there’s a specific business need to allow a particular one through.

3. Awareness and education. User awareness programs around email attachments and links are valuable, but should reflect changes in the threat landscape. While many users will be aware of the risks posed by macros in Office documents, they may be less familiar with archive and container formats. Defenders and responders also have a part to play here, in researching these formats and the ways threat actors use them.

We’ll continue to monitor our telemetry and threat intelligence sources for signs that threat actors are moving to different malware delivery techniques, and the Sophos X-Ops team is continually looking at what formats attackers are using, to ensure that we can protect against both new and old malware.
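The block-by-default approach described in point 2 above can be sketched as a simple attachment check. This is a minimal Python illustration rather than a recommendation for any particular mail gateway. The function name, extension list, and exception mechanism are our own assumptions, and extension checks should complement, not replace, content inspection.

```python
from pathlib import PurePath

# Extensions for the archive and disk image formats discussed in this article.
BLOCKED_EXTENSIONS = {
    ".7z", ".ace", ".arj", ".cab", ".gz", ".img", ".iso",
    ".lzh", ".rar", ".tar", ".udf", ".vhd", ".xz", ".zip",
}


def attachment_allowed(filename, allowed_exceptions=frozenset()):
    """Block listed formats unless every blocked suffix has an explicit exception."""
    # Check every suffix, so names like "report.tar.gz" or
    # "invoice.pdf.iso" are caught as well.
    suffixes = {s.lower() for s in PurePath(filename).suffixes}
    blocked = suffixes & BLOCKED_EXTENSIONS
    return not (blocked - set(allowed_exceptions))


print(attachment_allowed("notes.docx"))            # True
print(attachment_allowed("customer.vhd"))          # False
print(attachment_allowed("data.zip", {".zip"}))    # True
```

Passing the exceptions as an argument keeps the default restrictive, so a business need for one format (ZIP, say) doesn't quietly open the door to the rest.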

Acknowledgments

Sophos X-Ops thanks Richard Cohen of SophosLabs and Colin Cowie of Sophos’ Managed Detection and Response (MDR) team for their contributions to this report.