Block Phishing Content with Proactive DLP

The key function of OPSWAT Proactive DLP (Data Loss Prevention) is to detect sensitive and confidential data and protect it from breaches. With a few configuration changes, you can also use the OCR (Optical Character Recognition) technology built into Proactive DLP to help detect phishing content in images.

What is Phishing?

Phishing is a cybercrime in which threat actors impersonate a legitimate person or organization to “bait” susceptible victims into taking an action (e.g., clicking on a link, opening a malicious attachment, or granting threat actors access to their computer or their organization’s network) that lets the attackers steal sensitive information.

Email is the most common channel for phishing. Cybercriminals send out fraudulent messages with enticing language to gain the receiver’s trust and eventually prompt them to take harmful actions.

Phishing Using Macros in Microsoft Office

Macros are a popular attack vector for threat actors. In this blog, we discussed how cybercriminals exploit Excel 4.0 macros to store hidden malware.

The attack chain usually starts with an email. First, the attacker sends out a deceptive email purporting to come from a reputable source. This email generally contains an attached document with an embedded malicious macro. Once the victim opens the document and enables the macro, the malware download starts and the infection process begins.


When you download a Word document (.doc) from the internet, it automatically opens in Protected View. This mode isolates untrusted web content to reduce the chance of inadvertently running malware, spyware, or other potentially harmful code.


Protected View lets you read the file without running any malware hidden in it. If you are confident that the file is safe and want to make changes, you can click “Enable Editing”. If the file contains macros, there is a second security gate: Microsoft Office asks whether to allow the embedded active content via the “Enable Content” button.


On one hand, threat actors have developed macro exploitation techniques that can bypass security defenses, including obfuscated code, VBA stomping, and password-protected files. On the other hand, they still need to persuade their targets that clicking “Enable Editing” and “Enable Content” is essential. Messages like “Please enable Editing and Content to see this document” serve this purpose well.

This social engineering tactic is simple yet effective for malware and payload delivery, so much so that it has been paired with advanced evasion methods such as VBA stomping and served as the primary distribution method for Emotet, often described as the world’s most dangerous malware.

Using Optical Character Recognition to Detect Phishing Images

OPSWAT Proactive DLP supports OCR, which can help block phishing documents. It combines OCR with regular expressions (RegEx) and keywords to detect phishing phrases in images.

OCR is a widely used technology for recognizing text inside images. It extracts handwritten or printed text from a scanned document or an image and converts it into a machine-readable format that can be used for later data processing. The more advanced the OCR system, the higher its recognition accuracy.

Here’s an example of the Proactive DLP configuration in OPSWAT MetaDefender Core. Under the “Check for Regular Expressions” section, you can use the “RegEx” field to input potential phishing keywords, and add any other relevant keywords (e.g., “decrypt,” “protected”) in the “Keywords” field to reduce false positives.
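To make the matching idea concrete, here is a minimal Python sketch of how a regular expression and a keyword list might be applied to text that OCR has already extracted from an image. It is not MetaDefender Core's implementation or configuration syntax. The pattern, the keyword list, the function name `looks_like_phishing`, and the rule that a RegEx hit only counts when a supporting keyword is also present are all assumptions made for illustration.

```python
import re

# Hypothetical pattern for lure text often shown in macro-laden documents.
LURE_PATTERN = re.compile(r"enable\s+(editing|content)", re.IGNORECASE)

# Hypothetical supporting keywords: a RegEx hit is only reported when at
# least one of these also appears, which is meant to cut false positives.
SUPPORTING_KEYWORDS = ("decrypt", "protected")


def looks_like_phishing(ocr_text: str) -> bool:
    """Return True when the text matches the lure pattern and a keyword."""
    # Collapse whitespace so line breaks introduced by OCR don't split phrases.
    normalized = " ".join(ocr_text.split())
    if not LURE_PATTERN.search(normalized):
        return False
    lowered = normalized.lower()
    return any(keyword in lowered for keyword in SUPPORTING_KEYWORDS)


if __name__ == "__main__":
    sample = "This document is protected.\nPlease Enable   Editing and Enable Content."
    print(looks_like_phishing(sample))
```

Under these assumptions, a harmless sentence that merely mentions “Enable Editing” is not flagged unless a supporting keyword such as “protected” also appears in the same text.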


If the file contains any phishing keywords, it will be blocked. Below is the test result:


OPSWAT Proactive DLP

OPSWAT Proactive DLP detects and automatically redacts sensitive and confidential data in files and emails—including credit card numbers, social security numbers, IPv4 addresses, CIDR (Classless Inter-Domain Routing), or any custom regular expressions. With the integration of OCR, Proactive DLP also helps to block phishing documents and flag personally identifiable information (PII) in images and non-searchable PDF files.
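As a rough illustration of pattern-based redaction in general, the sketch below replaces a few kinds of sensitive values in plain text with placeholders. It is not how Proactive DLP works internally. The patterns are deliberately simplified assumptions (for example, only 16-digit card numbers in groups of four, and no checksum or range validation), and the names `PATTERNS` and `redact` are hypothetical.

```python
import re

# Simplified, illustrative patterns; production detectors validate far more
# strictly (checksums, valid octet ranges, issuer prefixes, and so on).
PATTERNS = {
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "IPV4_OR_CIDR": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}(?:/\d{1,2})?\b"),
    "CARD": re.compile(r"\b\d{4}(?:[ -]?\d{4}){3}\b"),
}


def redact(text: str) -> str:
    """Replace every match of each pattern with a labeled placeholder."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label} REDACTED]", text)
    return text


if __name__ == "__main__":
    print(redact("SSN 123-45-6789, host 10.0.0.0/8, card 4111 1111 1111 1111"))
```

A custom regular expression could be added to the dictionary in the same way, which mirrors the idea of user-defined patterns mentioned above.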

Our technology also aids compliance with data protection regulations and industry-standard security requirements such as PCI, HIPAA, Gramm-Leach-Bliley, FINRA, and more. Proactive DLP is a key technology in many OPSWAT products: MetaDefender Core, MetaDefender ICAP Server, MetaDefender Email Gateway Security, MetaDefender Kiosk, and MetaDefender Vault.

To learn more about Proactive DLP and how OPSWAT can protect your organization, contact one of our critical infrastructure cybersecurity experts.
