An Important Piece of the Puzzle, but not a Panacea: ML and AI in Phishing Detection

No single approach can protect against today’s threats: Material combines threat research, custom detections, user report automation, and AI to detect and stop sophisticated attacks.

Engineering

July 24, 2025

9m read

authors

Gian Gonzaga

Introduction

The use of Machine Learning (ML) and Artificial Intelligence (AI) has exploded in recent years. And it’s not all hype: companies are developing remarkably useful models to solve some important problems across just about every industry and use case.

But AI and ML are not solutions unto themselves for critical security operations. They are incredibly useful tools to extend and augment other capabilities, and the nature of today’s threat landscape demands the power and flexibility of robust ML models as a piece of the puzzle. This post explores why solutions that rely on a single approach for phishing protection are insufficient, and why we believe a well-rounded approach that treats those models as a critical part of the whole is the only way forward.

The limitations of today’s AI

Developing, deploying, and maintaining ML models and infrastructure can be challenging and expensive. And, as the hilarious and occasionally terrifying images created by early generative models show, the results aren’t always worth it. ML has the potential to drive rapid advances, but it also brings the risk of unfairness, bias, discrimination, and outright hallucinations, among others.

Simply put, ML models are powerful tools, but they often cannot solve problems on their own. First, models must be used within a broader context of strong structural foundations and clear processes to monitor, measure, re-train, and maintain guardrails around those models. Without this structure, models can easily stray and lose effectiveness.

On a more fundamental level, ML models often must be deployed as one of many tools to address the issue at hand. The recent trend has been for companies to try to solve problems with AI alone. But because ML models have important limitations, blind spots, and weaknesses (as all tools do), using multiple tools that employ varied methodologies leads to the best result.

At Material we have been building and perfecting many tools that help us detect phishing emails, and we have found that you can’t solve that problem with any single tool. Combining multiple approaches, however, creates a force multiplier.

How we got here

Material was built with the ethos of delivering pragmatic security that we know is effective for our customers. When we first started out, we focused on the email problem that nobody else in the industry was focused on: protecting the sensitive data within the inbox itself. We know from decades of experience that even the best inbound protections are never perfect, so we developed an API-based solution that prevented a mailbox breach from giving attackers access to years of sensitive content.

While our customers loved that functionality, we realized that our approach and our technology could help with inbound protections as well. In taking on that challenge, one of our biggest questions seemed simple on the surface but became quite complex once we scratched just a bit deeper: how could we accurately classify emails as phishing (i.e., malicious)? As we started to think about how to solve this problem, we believed that any single method would be insufficient. This has proven to be the case. As we have built our detection system, we have found great value in different types of tools, and we use many of them together to solve this problem.

Developing herd immunity: the power of crowds

One of our earliest approaches was to develop what we called “herd immunity.” This approach leveraged the power of user-submitted reports by searching for similar messages across the environment to automatically protect other users in the organization based on a single report.

When a user reported a malicious message, we searched for similar messages knowing that attackers often send copies of their message to many recipients. We took the power of many eyes looking for threats and magnified it. Even if another person misses that same email, our system will pick it up. This method harnessed the power of many human observers to help detect and remediate malicious messages.

It has proven to be valuable and not only remains an important detection tool for Material today, but has been copied widely across the email security market.
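To make the idea concrete, here is a minimal sketch of similarity matching from a single user report. It is an illustration only and not Material's implementation: the function names, the word-shingle comparison, the Jaccard measure, and the 0.5 threshold are all assumptions chosen to keep the example small.

```python
import re


def shingles(text: str, k: int = 3) -> set[tuple[str, ...]]:
    """Return the set of k-word shingles (overlapping word runs) in a message."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    if len(words) < k:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity: size of the intersection over size of the union."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def find_similar(reported: str, mailbox: dict[str, str], threshold: float = 0.5) -> list[str]:
    """Return IDs of messages whose body closely resembles the reported one."""
    reported_shingles = shingles(reported)
    return [
        msg_id
        for msg_id, body in mailbox.items()
        if jaccard(reported_shingles, shingles(body)) >= threshold
    ]


# Hypothetical data: one reported message and two messages elsewhere in the org.
reported = "Your account is locked. Verify your password now at the link below."
mailbox = {
    "msg-1": "Your account is locked. Verify your password now at the link below, Alice.",
    "msg-2": "Lunch is at noon on Friday. Let me know if you can make it.",
}
print(find_similar(reported, mailbox))
```

A near-copy with a personalized name still matches because most shingles are shared, while an unrelated message shares none. A production system would likely compare more than body text (senders, links, attachments), which this sketch leaves out.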

However, herd immunity is not the complete solution because of a couple of key challenges.

The effort required for a security team to confirm that a message is malicious can be high, so not all user reports are reliably labeled. Moreover, well-written phishing emails are designed to evade detection, and sometimes even the most experienced and savvy users can be fooled. While user reporting is a necessary line of defense, relying on busy and distracted users to catch all threats is unsustainable. This meant that we were missing some of the most sophisticated and subtle attacks. We knew we had to go further.

Adding layers of protection

The second tool we deployed was our own set of rules to detect phishing messages. Our threat researchers wrote detection rules based on their deep knowledge of security and their research of emerging threats. Those rules detected the key elements in the emails and phishing campaigns that showed they were malicious. We tested those rules by running them against a large set of new emails to ensure that each rule captured only phishing emails and not legitimate ones.

Although it took time and effort to put the testing framework itself into place, that structure provided, and continues to provide, highly accurate rules that can be deployed quickly.

Moreover, our experts can distill the key elements that signal maliciousness rather than looking for exact matches. This method increases our ability to detect malicious emails, and provides a great balance between speed (to test and deploy a rule quickly) and recall (the proportion of actual threats correctly identified as such).
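As a rough sketch of what testing a rule against labeled mail can look like, the example below scores one hypothetical rule for precision (how many flagged emails were truly phishing) and recall. The rule, the `Email` type, and the five-message corpus are invented for illustration and do not reflect Material's rule language or test framework.

```python
import re
from dataclasses import dataclass


@dataclass
class Email:
    subject: str
    body: str
    is_phishing: bool  # ground-truth label assigned by a reviewer


def credential_lure_rule(email: Email) -> bool:
    """Hypothetical rule: urgent language plus a request to verify credentials."""
    text = f"{email.subject} {email.body}".lower()
    urgent = re.search(r"\b(urgent|immediately|suspended|locked)\b", text)
    credential = re.search(r"\b(verify|confirm|reset)\b.*\b(password|account|login)\b", text)
    return bool(urgent and credential)


def evaluate(rule, corpus):
    """Return (precision, recall) of a rule over a labeled corpus."""
    tp = sum(1 for e in corpus if rule(e) and e.is_phishing)
    fp = sum(1 for e in corpus if rule(e) and not e.is_phishing)
    fn = sum(1 for e in corpus if not rule(e) and e.is_phishing)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Tiny made-up corpus: three phishing messages and two legitimate ones.
corpus = [
    Email("Account suspended", "Verify your password immediately.", True),
    Email("Action required", "Your mailbox is locked, confirm your login here.", True),
    Email("Invoice attached", "Please see the attached invoice and wire payment today.", True),
    Email("Password reset", "You asked to reset your password. This link expires in one hour.", False),
    Email("Team lunch", "Lunch is at noon on Friday.", False),
]
precision, recall = evaluate(credential_lure_rule, corpus)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

On this toy corpus the rule flags no legitimate mail but misses the invoice lure, which is the kind of gap that prompts either a second rule or a different detection method.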

These rules increased our coverage significantly and have proven to be a powerful addition to our tool kit. And along with Material's rules, we added the ability to create custom rules, enabling customers to operationalize organization-specific threat intelligence and quickly respond to targeted attacks. But while detection rules are important and incredibly powerful, they still didn’t solve the whole problem even when combined with similarity matching.

Rules are more flexible while they are being tested than once they are deployed, where they stay static until updated. Because of this, over time, even the most effective rules degrade without consistent supervision. Moreover, as our corpus of rules grows, so does the effort to maintain it. We have found that rule maintenance doesn’t scale: the maintenance costs grow faster than the recall added by new rules. Relying on rules alone is something of a Faustian bargain: rules can appear relatively easy to create at first, but maintenance costs rise so steeply that they hinder a team’s ability to keep the rules current, let alone address other issues, without scaling the team significantly.

The next evolution: ML and AI in threat detection

The final tool in our kit is ML/AI. A model can incorporate hundreds of predictors (i.e., features) and empirically determine which are the most important based on a training data set. By harnessing many features simultaneously, it maximizes the ability to predict when a message is phishing and makes it much more difficult for an attacker to counteract the model. In a relatively short time, our models have become a powerful tool to detect malicious emails.
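The sketch below shows, at toy scale, what it means for a model to determine feature importance from training data. It fits a plain logistic regression by gradient descent over four invented binary features and eight invented examples, then ranks features by learned weight. The feature names, data, and hyperparameters are assumptions for illustration; real models use far more features and data, and this is not a description of Material's models.

```python
import math

# Hypothetical binary features extracted from each email.
FEATURES = ["urgent_language", "link_domain_mismatch", "new_sender", "has_attachment"]

# Made-up training rows: (feature vector, label), where label 1 means phishing.
TRAINING = [
    ([1, 1, 1, 0], 1),
    ([1, 1, 0, 1], 1),
    ([0, 1, 1, 0], 1),
    ([1, 1, 1, 1], 1),
    ([0, 0, 0, 1], 0),
    ([0, 0, 1, 0], 0),
    ([1, 0, 0, 0], 0),
    ([0, 0, 0, 1], 0),
]


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights: list[float], bias: float, x: list[int]) -> float:
    """Estimated probability that a feature vector is phishing."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))


def train(rows, epochs: int = 2000, lr: float = 0.5):
    """Fit logistic regression with batch gradient descent, starting from zeros."""
    weights = [0.0] * len(FEATURES)
    bias = 0.0
    n = len(rows)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in rows:
            error = predict(weights, bias, x) - y
            for j, xi in enumerate(x):
                grad_w[j] += error * xi
            grad_b += error
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


weights, bias = train(TRAINING)

# Rank features by the magnitude of their learned weight.
ranking = sorted(zip(FEATURES, weights), key=lambda pair: abs(pair[1]), reverse=True)
for name, weight in ranking:
    print(f"{name:22s} {weight:+.2f}")
```

In this toy data the link-mismatch feature appears in every phishing example and no legitimate one, while attachments appear equally in both classes, so the model learns to lean on the former and largely ignore the latter without anyone telling it which matters.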

Additionally, we’ve built models to alleviate the shortcomings of other tools. Material’s ML automatically investigates and remediates user reports, for example, alleviating the significant effort previously needed to handle that volume.

But there are important limitations with these tools as well. Most importantly, properly retraining (i.e., updating) a model is an intensive process that takes time. A model is therefore slow to adjust to a new phishing campaign or a new threat, which means proper protection is delayed. In addition, the infrastructure needed to develop, test, deploy, and maintain models is significant.

Many tools are better than one

We have come to realize that each tool has limitations, so if you bet your entire detection system on one tool (i.e., only rules or only ML/AI models), you will struggle with those limitations. If your system is based entirely on rules, you will take on ever-expanding maintenance and complexity costs that require an ever-growing team. If your system is based only on models, you will be slow to react to new threats. If your system is based only on user reports, you will be blind to the most sophisticated and well-written attacks.

But by combining these methods and using them in careful coordination, Material is able to provide state-of-the-art phishing detection.

Our threat researchers research, write, and deploy rules responding to the latest threat intel,
Our automated user report response leverages the eyes of the entire company to add a human line of defense, while minimizing the burden on the security team, and
Our AI and ML capabilities add flexibility and power to both of the above capabilities, giving our protections the ability to respond and scale rapidly.

Put another way: the combination of traditional detection rules with ML allowed both our security team and the model to adapt to an attacker’s changing tactics. Our use of models as an effective security measure was made possible by the advances we put into place as a part of our previous phishing detection work. All incoming messages are evaluated using our ML models and the detections created by our threat research team. User reports are triaged via our ML models, and when malicious emails are found, similar messages are identified across the entire environment and remediated.

The strength of one method compensates for the weakness of others. When our models cannot react fast enough, we can write a rule. When many incoming threats create a backlog for our threat research team, we can leverage the power of user reports to identify new threats in that company's environment. The combination of these tools has proven to be even more powerful than we anticipated.
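One simple way to picture the layering is a verdict function that lets any layer flag a message and records which layer did so. The sketch below is purely illustrative: the `Verdict` type, the `classify` function, the 0.8 score threshold, and the stand-in rule, model, and report matcher are hypothetical and say nothing about how Material actually weighs or coordinates its detections.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Verdict:
    malicious: bool
    reasons: list[str] = field(default_factory=list)


def classify(
    message: str,
    rules: dict[str, Callable[[str], bool]],
    model_score: Callable[[str], float],
    matches_confirmed_report: Callable[[str], bool],
    threshold: float = 0.8,
) -> Verdict:
    """Flag a message if any layer fires, and record which layers fired."""
    reasons = [f"rule:{name}" for name, rule in rules.items() if rule(message)]
    score = model_score(message)
    if score >= threshold:
        reasons.append(f"model:{score:.2f}")
    if matches_confirmed_report(message):
        reasons.append("similar-to-report")
    return Verdict(malicious=bool(reasons), reasons=reasons)


# Stand-ins for the three layers (all hypothetical).
rules = {"gift-card-lure": lambda m: "gift card" in m.lower()}
model_score = lambda m: 0.9 if "verify your password" in m.lower() else 0.1
matches_confirmed_report = lambda m: False

for text in [
    "Please buy a gift card for the CEO",
    "Verify your password now",
    "Lunch is at noon on Friday",
]:
    print(text, "->", classify(text, rules, model_score, matches_confirmed_report))
```

Here the rule catches a lure the stand-in model scores low, and the model catches a message no rule covers. Recording the reasons alongside the verdict also shows which layer is carrying the load for a given campaign.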

Greater than the sum of its parts

We started this post by asking about the value of using ML/AI in phishing detection. While we have seen these models be powerful tools for detecting malicious messages for our customers, they are not the whole picture. But when added to the other detection methods we use, they complete our toolbox and give Material enormous flexibility and power to detect phishing attempts in any context.

We leverage the eyes of all the employees of our customers (user reports), rapidly and reliably incorporate the expertise of deeply experienced security professionals (rules written by experts), and we deploy models that can incorporate hundreds of features to detect malicious emails at great scale.

We know that ML/AI is not a panacea that solves all of our problems. But we believe models in coordination with other detection methods give us the best opportunity to detect malicious messages and help keep our customers safe from phishing threats and constantly evolving phishing tactics.

This blog is part of a series on Material’s approach to the responsible use of AI in cybersecurity. Stay tuned for more.
