Microsoft Purview - Evading Data Loss Prevention policies

Introduction

Microsoft Purview is a comprehensive solution that helps organizations manage and protect their data across various environments, including on-premises, multi-cloud, and software-as-a-service (SaaS) platforms. It provides a unified data catalog, data classification, and data security capabilities, enabling organizations to gain insights into their data landscape, secure their data accordingly, and ensure compliance with regulatory requirements.

Illustration of Microsoft Purview Pillars, showcasing its capabilities in managing and protecting data across on-premises, multi-cloud, and SaaS environments. — Figure 1: Microsoft Purview pillars.

The data security toolset includes Sensitivity Labels and Data Loss Prevention, the two tools this blog post focuses on:

Sensitivity labels are used to classify and protect data based on its content. These labels can be applied to documents and emails to enforce protection settings such as encryption and content marking. Sensitivity labels help organizations classify their information and safeguard it by ensuring that only authorized users can access and share it. On the technical side, a sensitivity label is represented by additional metadata added to the file or email.
Data Loss Prevention (DLP) refers to a set of policies and technologies designed to detect and prevent the unauthorized transmission of potentially sensitive data inside and outside an organization’s boundaries.

While Microsoft Purview provides interesting features related to data security, the solution remains prone to some very old and always relevant vulnerabilities: the human factor and bad design decisions.

This blog post explores an exfiltration scenario that highlights the importance of robust DLP policies within Microsoft Purview. We will examine how Sensitivity Labels function, why inadequate DLP measures can lead to data leaks, and we’ll introduce best practices for strengthening an organization’s data protection measures. If you are not familiar with Microsoft Purview, Sensitivity Labels, Data Loss Prevention, or Insider Risk Management, I recommend having a look at this other blog post: Become Big Brother with Microsoft Purview – NVISO Labs

Evasion scenario

Context

Alex is an internal employee in the company Purview Territory. Alex has access to an Entra ID joined Windows 10 workstation that is managed in Intune. Through the device, Alex can access and sync files from SharePoint Online and emails from Exchange Online. Alex will try to exfiltrate docx and pdf files containing sensitive information via email and to upload the files to an online storage platform.

Depiction of Alex, an employee at Purview Territory, using a Windows 10 workstation managed by Intune to access SharePoint and Exchange Online, attempting to exfiltrate sensitive docx and pdf files via email and online storage. — Figure 2: Alex initial access.

Protection in place

In Microsoft Purview, the sensitivity label ‘Super Secret’ can be applied to files and emails. It represents the most sensitive level of information. Other sensitivity labels exist, and the policy that makes all labels available to Alex also requires a label to be applied before a file can be saved or an email sent, which in theory makes it impossible to remove a sensitivity label. Emails inherit the sensitivity label of an attachment if the attachment has a higher level of sensitivity. In addition, Insider Risk Management is used to investigate and raise alerts if users downgrade sensitivity labels.
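The inheritance behavior described above can be pictured as taking the most sensitive label among the email and its attachments. The snippet below is only a minimal sketch of that idea, not Purview's implementation: the priority table and every label name other than ‘Super Secret’ are assumptions made for illustration.

```python
# Hypothetical label priorities: a higher number means more sensitive.
# Only 'Super Secret' comes from the scenario; the other names are made up.
LABEL_PRIORITY = {
    "Public": 0,
    "Internal": 1,
    "Confidential": 2,
    "Super Secret": 3,
}


def effective_email_label(email_label, attachment_labels):
    """Return the label an email ends up with after inheriting from attachments.

    Unlabeled attachments are passed as None and are ignored here.
    """
    candidates = [email_label]
    candidates += [label for label in attachment_labels if label is not None]
    return max(candidates, key=LABEL_PRIORITY.__getitem__)
```

With this sketch, an email labeled 'Internal' carrying a 'Super Secret' attachment is treated as 'Super Secret', which is why DLP 1 applies to the whole message.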

Two Data Loss Prevention policies exist, each with a single rule configured as follows:

DLP 1 – Rule 1: Files labeled with ‘Super Secret’ cannot be shared externally, and emails labeled with ‘Super Secret’ cannot be sent to external domains. This policy is processed in the Microsoft 365 ecosystem.
DLP 2 – Rule 1: Files labeled with ‘Super Secret’ cannot be uploaded to unauthorized online storage. This policy is processed on the endpoint thanks to the Defender for Endpoint integration.

Image illustrating two Data Loss Prevention (DLP) policies: DLP 1 prevents sharing 'Super Secret' files and emails externally within Microsoft 365, while DLP 2 blocks uploading such files to unauthorized online storage via Defender for Endpoint. — Figure 3: Existing protection.

Failed attempt

Alex tries first to exfiltrate the two files via email. The files are added as attachments and then sent to an external email address:

GIF showing Alex attempting to exfiltrate two files by attaching them to an email and sending it to an external address. — Figure 4: Failed email exfiltration.

When Alex clicks “Send”, the action is blocked directly in the client and the policy tip defined in the DLP policy appears. Note: If the client cannot process the DLP rule for some reason, the rule is processed in Exchange Online and the email bounces back with a delivery failure message instead.

Alex tries a second time to exfiltrate the two files via an online storage platform. The files are simply dragged and dropped into the website:

GIF depicting Alex's second attempt to exfiltrate two files by dragging and dropping them into an online storage platform's website. — Figure 5: Failed online storage exfiltration.

When the file is dropped into the website, the Microsoft Purview Extension identifies the sensitivity label and prevents the action. This browser extension is enabled by default in Microsoft Edge and it can be added to Google Chrome and Firefox via Group Policy Object (GPO) or Intune Configuration Profile.

Overall, the exfiltration was prevented as the DLP policies were triggered according to the sensitivity labels. The diagram below presents the results of the failed attempts:

Diagram illustrating how Data Loss Prevention (DLP) policies successfully blocked exfiltration attempts by triggering based on sensitivity labels, showing failed attempts. — Figure 6: Exfiltration blocked by DLP 1 & 2.

Successful attempts

To successfully bypass the DLP policies, the sensitivity labels must be removed from the files. However, the label policy defined in Microsoft Purview makes it mandatory for users to select a sensitivity label when saving a file. Even with the Microsoft Purview Information Protection client, it is not possible to delete the label as the option is grayed out.

Image of the Information Protection File Labeler interface, showing that sensitivity labels cannot be removed due to policy enforcement in Microsoft Purview, with the delete option grayed out. — Figure 7: Information Protection File Labeler.

In order to get rid of the sensitivity labels by other means, we must understand how they actually work. As stated earlier, a sensitivity label is only made of metadata that is added to a file. If we open the SuperSafe.docx with a metadata reader, such as ExifTool, the following information is displayed:

Image showing the metadata of SuperSafe.docx opened in a metadata reader like ExifTool, illustrating how sensitivity labels are stored as metadata within the file. — Figure 8: Metadata of SuperSafe.docx seen in ExifTool.
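To make the "label is only metadata" point concrete, the sketch below lists label-related properties of a docx file using nothing but the Python standard library. It rests on two assumptions that should be checked against your own files: that the label is stored as custom document properties in the docProps/custom.xml part of the archive, and that the property names start with MSIP_Label_. Files that store label information in a different part would return an empty result here.

```python
import zipfile
import xml.etree.ElementTree as ET

# Assumptions: location and naming of the label properties inside the docx.
CUSTOM_PROPS_PART = "docProps/custom.xml"
LABEL_PREFIX = "MSIP_Label_"


def read_label_properties(docx):
    """Return {property name: value} for label-related custom properties.

    `docx` can be a path or a binary file-like object.
    """
    with zipfile.ZipFile(docx) as archive:
        if CUSTOM_PROPS_PART not in archive.namelist():
            return {}
        root = ET.fromstring(archive.read(CUSTOM_PROPS_PART))

    labels = {}
    for prop in root:
        name = prop.get("name", "")
        if name.startswith(LABEL_PREFIX):
            # The value is the text of the typed child element (e.g. vt:lpwstr).
            labels[name] = "".join(prop.itertext()).strip()
    return labels
```

Calling read_label_properties("SuperSafe.docx") on a labeled file would be expected to return the same kind of name/value pairs that a metadata reader displays. Nothing in the file format prevents a tool from rewriting that part.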

The first part describes the content marking, which does not offer any protection. The second part shows the sensitivity label that is applied. Now the question: what if we simply delete that metadata? Unfortunately, we cannot edit the metadata of docx files with ExifTool, but we can edit the metadata of pdf files. So first, we export the docx to pdf. This file will be called “SuperSafe – Exported.pdf” and here is its metadata:

Image displaying the metadata of "SuperSafe – Exported.pdf," highlighting the content marking and sensitivity label, and discussing the potential to edit metadata in PDF files using tools like ExifTool. — Figure 9: Metadata of SuperSafe – Exported.pdf seen in ExifTool before deletion.

And now with ExifTool, we can delete the metadata of “SuperSafe – Exported.pdf”:

Image showing the process of using ExifTool to delete the metadata from "SuperSafe – Exported.pdf," demonstrating the removal of sensitivity labels. — Figure 10: Deleting the metadata with ExifTool.

If we reopen “SuperSafe – Exported.pdf” with ExifTool, we can see that the sensitivity label is gone:

Image showing "SuperSafe – Exported.pdf" reopened in ExifTool, confirming the removal of the sensitivity label from the file's metadata. — Figure 11: Metadata of SuperSafe – Exported.pdf seen in ExifTool after deletion.

After removing the metadata of “SuperSafe.pdf”, Alex can upload the two pdf files to an online storage platform:

GIF depicting Alex successfully uploading two PDF files to an online storage platform after removing the metadata from "SuperSafe.pdf." — Figure 12: Successful online storage exfiltration.

The upload was successful. Alex now sends the same files via email:

GIF showing Alex successfully sending the files via email after a successful upload, indicating the completion of the exfiltration process. — Figure 13: Successful email exfiltration.

The exfiltration is now successful. The diagram below depicts the entire exfiltration:

Diagram illustrating the complete exfiltration process, showing how Alex successfully bypassed security measures to send files via email and upload them online. — Figure 14: Successful exfiltration after removing the labels.

Detection & potential mitigation

Now that the exfiltration has succeeded, it is time to dive into the Microsoft security & compliance solutions to identify and mitigate it.

Device timeline

The device timeline is a feature from Microsoft Defender for Endpoint (MDE) that provides the events that happened on the device onboarded to Defender for Endpoint. MDE is included in the Enterprise E5 license and therefore does not involve additional cost. The goal of this feature is to help analysts and administrators in researching and investigating anomalous behaviors. After reproducing the removal of metadata on a new file called “VerySuperSafe.pdf”, the following sequence can be observed in the device timeline:

The diagram below provides a higher-level view of the sequence:

Diagram offering a high-level overview of the sequence of events captured in the device timeline, illustrating the process of metadata removal from a file. — Figure 16: Metadata deletion overview.

The following information from the events could be used in hunting queries to identify similar behavior:

The process itself: ExifTool is originally a tool to modify metadata. Using it with arguments that overwrite the original metadata with nothing is a good indicator.
Renaming the file repeatedly: As part of the metadata overwriting process, the file is renamed twice in a very short timeframe (approximately 800 milliseconds).
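The two indicators above can be prototyped before being translated into a real hunting query. The sketch below works on plain Python dictionaries; the event fields (device, process, action, timestamp) and the "FileRenamed" value are a hypothetical schema chosen for illustration, not the exact Defender for Endpoint export format. The command-line check relies on ExifTool's documented syntax where assigning an empty value to a tag (for example -all=) deletes it. The exact arguments used in the scenario are not reproduced here.

```python
from datetime import datetime, timedelta


def is_metadata_wipe(command_line):
    """Flag ExifTool command lines that assign an empty value to a tag."""
    tokens = command_line.lower().split()
    if not any("exiftool" in token for token in tokens):
        return False
    # An argument such as "-all=" or "-keywords=" clears the tag(s).
    return any(t.startswith("-") and t.endswith("=") for t in tokens)


def rapid_renames(events, window=timedelta(seconds=1), minimum=2):
    """Return (device, process) pairs with `minimum` renames inside `window`."""
    by_source = {}
    for event in events:
        if event["action"] == "FileRenamed":
            key = (event["device"], event["process"])
            by_source.setdefault(key, []).append(event["timestamp"])

    flagged = set()
    for key, stamps in by_source.items():
        stamps.sort()
        # Slide over the sorted timestamps looking for a dense burst.
        for i in range(len(stamps) - minimum + 1):
            if stamps[i + minimum - 1] - stamps[i] <= window:
                flagged.add(key)
                break
    return flagged
```

The one-second window is a deliberately loose bound around the roughly 800 milliseconds observed above. Legitimate software also renames temporary files quickly, so the rename indicator is more useful when combined with the process check than on its own.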

Classifiers

While the existing protections use sensitivity labels, it is clear that this classification tool alone is not enough to properly identify and protect sensitive information. This is because the sensitivity label ‘Super Secret’ is only applied manually.

An important improvement would be to have one or more classifiers that help identify sensitive information. These classifiers can be Sensitive Information Types (SITs), which are based on keywords, RegEx, fingerprints or exact data match (EDM). Microsoft Purview also provides the ability to create Trainable Classifiers (TCs), which can be summarized as Sensitive Information Types powered by machine learning. Note that a Trainable Classifier must be trained thoughtfully to reach a sufficient level of confidence in its classification.
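As a rough illustration of how a keyword and RegEx based classifier reasons, the sketch below combines a primary pattern with supporting keywords found nearby and returns a confidence level. It is not Purview's detection engine: the project-code pattern, the keyword list, and the 300-character proximity are all hypothetical values picked for the example.

```python
import re

# Hypothetical primary pattern: project codes such as "PT-2024-0042".
PRIMARY = re.compile(r"\bPT-\d{4}-\d{4}\b")
# Hypothetical supporting keywords that raise confidence when found nearby.
KEYWORDS = ("super secret", "do not distribute")
PROXIMITY = 300  # characters searched on each side of a match


def classify(text):
    """Return 'high', 'low' or None depending on the evidence found."""
    lowered = text.lower()
    best = None
    for match in PRIMARY.finditer(text):
        start = max(0, match.start() - PROXIMITY)
        nearby = lowered[start:match.end() + PROXIMITY]
        if any(keyword in nearby for keyword in KEYWORDS):
            return "high"  # pattern plus supporting evidence
        best = "low"       # pattern alone
    return best
```

The important property is that the decision depends on the content itself, so it survives the removal of label metadata from the file.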

Once the sensitive information is translated into SITs or TCs, a sensitivity label policy can be created to automatically apply the sensitivity label ‘Super Secret’ based on the information found in the file.

Finally, the sensitivity label ‘Super Secret’ can be modified to include encryption. This ensures that no unauthorized user is able to open and read the documents even if the sensitivity label is removed.

Stronger DLP policies

To mitigate this exfiltration scenario, the DLP policies created in Microsoft Purview can be improved with additional rules. As a reminder, one DLP policy can contain multiple rules with different conditions and actions.

Starting with the DLP 1:

Rule 1: Files labeled with ‘Super Secret’ cannot be shared externally, and emails labeled with ‘Super Secret’ cannot be sent to external domains.
Rule 2 (new): Files without sensitivity label cannot be shared externally, and emails without sensitivity label cannot be sent to external domains. This is only applicable to files compatible with Sensitivity Labels.
Rule 3 (new): Emails and attachments containing information that match selected SIT or TC cannot be shared externally.

The second rule prevents emails from being sent to external domains when an attachment has no sensitivity label, even if the emails themselves are correctly labeled. Additionally, the third rule ensures that any information identified as super secret by the classifiers cannot be sent to external domains.

Regarding the DLP 2:

Rule 1: Files labeled with ‘Super Secret’ cannot be uploaded to unauthorized online storage.
Rule 2 (new): Files without sensitivity label cannot be uploaded to unauthorized online storage. This is only applicable to files compatible with Sensitivity Labels.
Rule 3 (new): Files containing information that matches the selected SIT or TC cannot be uploaded to unauthorized online storage.

Similarly to the improvement for emails, the second rule ensures that files that support sensitivity labels but carry none cannot be uploaded to unauthorized online storage. As with DLP 1, the third rule ensures that any information identified as super secret by the classifiers cannot be uploaded to unauthorized online storage. From a high-level point of view, these additional DLP rules ensure that everything that is explicitly ‘Super Secret’, contains information that can be identified as ‘Super Secret’, or has been tampered with (i.e., label removal), cannot be sent to external domains or uploaded to unauthorized online storage platforms.
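The combined effect of the three rules can be summarized as a small decision function. This is a simplified sketch of the logic described above, not a representation of how Purview evaluates rule priority. The Item fields are hypothetical names introduced for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Item:
    label: Optional[str]       # sensitivity label name, or None if unlabeled
    supports_labels: bool      # True for file types that can carry a label
    matches_classifier: bool   # True if a selected SIT or TC matched


def blocking_rule(item):
    """Return the name of the first rule that blocks the item, or None."""
    if item.label == "Super Secret":
        return "Rule 1"
    if item.label is None and item.supports_labels:
        return "Rule 2"
    if item.matches_classifier:
        return "Rule 3"
    return None
```

A pdf whose label metadata was stripped corresponds to Item(label=None, supports_labels=True, matches_classifier=True): it is caught by Rule 2, and Rule 3 would still catch it if the attacker re-labeled it with a less sensitive label.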

Insider Risk Management

The Insider Risk Management (IRM) solution from Purview can also help identify potential exfiltration activities by analyzing user activities. This can be achieved by creating the IRM policy ‘DEMO – Data exfiltration’, which does the following:

Monitor activities & sequences: Insider Risk Management analyses all user activity in real time. Whenever activities reach a specific threshold, or an exfiltration sequence is detected, an investigation is conducted to determine whether an actual data exfiltration is happening.
Risk assignment: Based on the investigation, a risk score is assigned to the user who performed the suspicious activities. The score is calculated based on the number of activities, sequences, or sensitive information found.
Alerts: When the investigation is completed, an alert is raised in the IRM portal. It contains the following information:
1. ID and name of the IRM policy: Provides primary information including the risk score
2. Triggering event: What triggered the investigation
3. Activity that generated this alert: What was found during the investigation
4. User details & history: The username and the history of alerts related to this user
5. User activity: A list and a scatter plot presenting the different activities identified

Image depicting the Insider Risk Management (IRM) solution interface, highlighting features such as real-time activity monitoring, risk score assignment, and alert details, including policy ID, triggering events, and user activity analysis. — Figure 17: Alert raised in Insider Risk Management.

Following a review, this alert can be dismissed or confirmed. Confirming the alert will escalate to an insider risk case, which we will cover in another article.

Overall, IRM policies provide valuable information about correlated exfiltration events rather than isolated activities. The risk assignment can be used to quickly highlight risky users and to configure dedicated risk-based protection through Adaptive Protection, a topic we will cover in another article.

Conclusion

It is clear that relying solely on manual Sensitivity Label application and basic Data Loss Prevention (DLP) rules for classifying and protecting information is insufficient. While it may seem better than having no measures in place, this approach can be counterproductive. It might satisfy compliance audits, creating a false sense of security as this protection can be evaded easily.

Here is a summary of the main topics to keep in mind when pursuing data security:

Identify what is sensitive: Often referred to as ‘know your data’, this essential first step produces a comprehensive overview of all data processed and stored within the organization. The protection enforced later can only be as good as the identification of sensitive information. This requires the combined use of Sensitivity Labels and Policies, Trainable Classifiers, and Sensitive Information Types.
Enforce strong data protection measures: Strong Data Loss Prevention (DLP) policies should be designed with multiple rules to utilize all available features and resist metadata tampering. Additionally, files labeled as sensitive should be encrypted to ensure that only authorized users have access to the content.
Ensure everything is monitored: Comprehensive monitoring is essential, covering everything from cloud activities to processes running on devices. Endpoint activities can be captured by Microsoft Defender for Endpoint and forwarded to a Security Information and Event Management (SIEM) solution for alerting or a Security Orchestration, Automation, and Response (SOAR) system for automated mitigation. User activities can be monitored through Insider Risk Management, which can investigate, generate alerts, and restrict user operations and access using Adaptive Protection.

If you are interested in implementing data security measures or reviewing your current controls, don’t hesitate to visit our website www.nviso.eu or connect with us on LinkedIn.

Resources

Should you have any questions or remarks, please feel free to contact me with the details at the end of the article.

While the topic tackled in this article reflects professional experience and expertise, the Microsoft resources I used below will also help you in understanding Microsoft Purview:

Microsoft Purview introduction:
https://learn.microsoft.com/en-us/purview/purview
Data Loss Prevention policy reference:
https://learn.microsoft.com/en-us/purview/dlp-policy-reference
Sensitive Information Type:
https://learn.microsoft.com/en-us/purview/sit-sensitive-information-type-learn-about
ExifTool:
https://ExifTool.org
Trainable Classifiers:
https://learn.microsoft.com/en-us/purview/trainable-classifiers-learn-about
Microsoft Purview Data Classification overview:
https://learn.microsoft.com/en-us/purview/data-classification-overview
Microsoft Purview Insider Risk Management overview:
https://learn.microsoft.com/en-us/purview/insider-risk-management
Microsoft Purview Data Loss Prevention overview:
https://learn.microsoft.com/en-us/purview/dlp-learn-about-dlp
Learn more about sensitivity labels:
https://learn.microsoft.com/en-us/purview/sensitivity-labels
Restrict access to content by using sensitivity labels to apply encryption:
https://learn.microsoft.com/en-us/purview/encryption-sensitivity-labels
Microsoft Defender for Endpoint device timeline:
https://learn.microsoft.com/en-us/defender-endpoint/device-timeline-event-flag

About the author

Mathéo Boute

Mathéo is a Senior Cybersecurity Consultant and member of the Cloud Security Team. His area of expertise revolves around Azure and Microsoft 365 with a focus on Microsoft Purview, Entra ID, Intune, and Defender, where he has an extensive experience of assessing, designing and implementing multiple security solutions in various industries.

Mathéo confirmed his skills by passing the CISSP and all the security-related Microsoft certifications, including the Microsoft Certified Cybersecurity Architect.
