
Classifying Chaos: How Material Automates User-Reported Phishing at Scale

While automated detection systems catch many threats, user reports remain a vital defense layer – often catching sophisticated attacks that slip through automated filters. See how Material solves this problem.

Engineering
January 10, 2025
10m read
authors
Eddie Conk

It's 9 AM on a Monday, and your security team is already navigating through dozens of user-reported emails. Some are legitimate phishing attempts that need immediate attention. Others are marketing emails that persistent vendors refuse to stop sending. Most fall somewhere in between, each requiring precious minutes of analysis that add up to hours of work better spent elsewhere. You might be wondering how, despite repeated training, some of your end-users treat phishing reports as a dumping ground for any unwanted email and what you can do about it while staying on top of real threats.

This scenario plays out daily in organizations worldwide, highlighting a critical challenge in email security: managing user-reported phishing messages effectively. While automated detection systems catch many threats, user reports remain a vital defense layer – often catching sophisticated attacks that slip through automated filters. However, the sheer volume of reports creates a significant operational burden for security teams.

Material solves this problem with User Report Auto Classification, which automatically investigates and classifies user-reported messages as safe, spam, or malicious. 

Security teams can customize the remediation response for each outcome, including the rare instances when Material is unable to make a high-confidence determination. This capability can transform how organizations handle user reports, enabling real-time triage and response while maintaining the high accuracy standards required for effective email security.

Why user reports matter

The evolution of phishing attacks has created an arms race between attackers and detection systems. Threat actors continuously adapt their techniques to evade automated detection, leading to a fundamental truth in email security: human reporting remains a critical defense layer. A single user report can be the difference between a prevented incident and a successful breach.

However, the operational reality of managing these reports presents several challenges:

  1. Volume and Variety: Organizations face a daily deluge of reports ranging from legitimate threats to benign marketing emails. Each report requires individual attention, creating a significant time sink for security teams.
  2. Speed vs. Accuracy: Security teams face an impossible trade-off between thorough investigation and rapid response. Delay could mean a threat remains active longer, while hasty analysis risks missing crucial details.
  3. Resource Allocation: Many security teams spend a disproportionate amount of time triaging user reports – time that could be better spent on proactive security measures or investigating genuine incidents.
  4. Default Policy Limitations: Organizations without resources for manual review often implement blanket policies: either overly permissive (risking security) or overly restrictive (disrupting business operations).

Many security teams recognize that handling user-reported messages is critical, and Material enables organizations to do so effectively. In the next section, we’ll dive into the technical capabilities that allow User Report Auto Classification to achieve this.

The technical foundation

Material approaches the challenges above through a multi-layered analysis system that mirrors and enhances the investigation process of expert security analysts. With an ever-shifting threat landscape, our job becomes a lot harder than checking the message body for “PayPal” or urgent requests (“kindly”, of course). 

At the heart of the system is a predictor, designed to make accurate determinations about user-reported messages. Our machine learning models tackle this by analyzing thousands of signals across three key dimensions that can generalize across evolving campaigns:

  1. Organizational Context: By syncing every single email that goes in and out of your tenant, Material builds a comprehensive analysis of your email environment.
  2. Message Analysis: We employ sophisticated natural language processing techniques to understand each message’s content from both statistical and semantic perspectives.
  3. External Intelligence: The system correlates internal findings with external data sources, examining things like domain registration details to round out the picture of every user report.

Organizational context

Material's data infrastructure maintains a comprehensive view of email communication patterns within your organization. Its output describes any existing relationship between an external sender and the recipient organization, and we’ve observed these relationship signals to be among the most effective features for our models. The system continuously analyzes both historical and real-time data to understand:

Historical Sender Relationship: For every sender address and domain, we track detailed statistics including:

  • Volume of messages sent to your organization
  • Bi-directional communication patterns
  • Distribution of recipients within your organization
  • Historical spam report rates
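
To make these statistics concrete, here is a minimal sketch of how they could be aggregated from a message log. It is an illustration under stated assumptions, not Material's pipeline: the `SenderStats` schema, the `build_sender_stats` helper, and the message dictionary shape are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class SenderStats:
    """Aggregate statistics for one external sender (hypothetical schema)."""
    inbound: int = 0        # messages the sender sent to the organization
    outbound: int = 0       # messages the organization sent to the sender
    spam_reports: int = 0   # inbound messages that users reported as spam
    recipients: set = field(default_factory=set)  # distinct internal recipients

    @property
    def is_bidirectional(self) -> bool:
        return self.inbound > 0 and self.outbound > 0

    @property
    def spam_report_rate(self) -> float:
        return self.spam_reports / self.inbound if self.inbound else 0.0


def build_sender_stats(messages, org_domain):
    """Fold message dicts into per-sender statistics.

    Each message is assumed to look like:
    {"from": str, "to": [str, ...], "reported_spam": bool}
    """
    suffix = "@" + org_domain.lower()
    stats = defaultdict(SenderStats)
    for msg in messages:
        sender = msg["from"].lower()
        recipients = [r.lower() for r in msg["to"]]
        if sender.endswith(suffix):
            # Outbound mail: credit each external recipient.
            for r in recipients:
                if not r.endswith(suffix):
                    stats[r].outbound += 1
            continue
        internal = [r for r in recipients if r.endswith(suffix)]
        if not internal:
            continue  # not addressed to the organization
        entry = stats[sender]
        entry.inbound += 1
        entry.recipients.update(internal)
        if msg.get("reported_spam", False):
            entry.spam_reports += 1
    return dict(stats)
```

A long-standing vendor would typically show high inbound volume, some outbound replies, and a low spam report rate, while a first-time sender has none of that history.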

Real-time Sender Data: We maintain real-time sender novelty status, identifying when an address or domain is communicating with your organization for the first time. To achieve this, we supplement our real-time message processing engine with a scheduled job framework built on top of Google’s Datastore and BigQuery, plus Redis, to provide:

  • Streaming updates as new messages arrive
  • Distributed caching for efficient lookups
  • Historical interaction data for contextual analysis and recovery
  • Scale across any organization size
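
The novelty check itself can be sketched in a few lines. This is a simplified illustration: an in-memory dictionary of sets stands in for the distributed cache, the `SenderNoveltyTracker` class and its method names are hypothetical, and the `historical` argument plays the role of data backfilled by a scheduled job.

```python
class SenderNoveltyTracker:
    """Tracks first-seen sender addresses and domains per tenant.

    Plain dicts of sets stand in for the cache and datastore layers here;
    no real Redis, Datastore, or BigQuery client calls are shown.
    """

    def __init__(self, historical=None):
        # historical: iterable of (tenant_id, address) pairs from a batch job
        self._seen_addresses = {}
        self._seen_domains = {}
        for tenant_id, address in historical or []:
            self.observe(tenant_id, address)

    @staticmethod
    def _domain(address):
        return address.rsplit("@", 1)[-1].lower()

    def observe(self, tenant_id, address):
        """Record a message and report whether its address and domain were new."""
        address = address.lower()
        domain = self._domain(address)
        addresses = self._seen_addresses.setdefault(tenant_id, set())
        domains = self._seen_domains.setdefault(tenant_id, set())
        result = {
            "new_address": address not in addresses,
            "new_domain": domain not in domains,
        }
        # Streaming update: the next lookup sees this sender as known.
        addresses.add(address)
        domains.add(domain)
        return result
```

Novelty is tracked per tenant, so a sender that is well known to one organization still counts as new the first time it writes to another.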

Organizational context data is particularly useful because it captures behavioral patterns that are hard for attackers to fake. A legitimate business partner will typically show consistent communication patterns, while attack infrastructure often exhibits telltale signs like sudden appearance or unusual recipient patterns. We believe organizational context is a key piece of the picture in discerning malicious messages from safe ones, and we’re constantly exploring new ways to make this data more robust.

Advanced semantic analysis

Another key source of performance in our classification system is its ability to understand the semantic meaning of email content, going far beyond simple keyword matching. Our approach combines multiple sophisticated techniques:

Deep semantic understanding 

We employ advanced embedding models that transform email text into high-dimensional vector representations, capturing subtle patterns in:

  • Message intent and tone
  • Complex relationships between concepts
  • Writing style and sophistication
  • Technical indicator patterns

In-house email-optimized models

While we leverage state-of-the-art pre-trained models from Google, we've also developed specialized embedding models trained specifically on email data. These models:

  • Learn patterns specific to email communication
  • Use efficient architectures optimized for CPU inference
  • Enable real-time analysis at scale

Our embedding models are designed for efficient inference on CPU hardware while maintaining high accuracy. Rather than relying on heavy transformer models at every step, they use an architecture that leverages subword information and the hierarchical structure of text, allowing us to deploy these models effectively at scale. On many email-related downstream tasks, our domain-specific embedding model outperforms those trained by frontier AI labs. Additionally, these models provide a competitive alternative for extremely data-sensitive customers that must limit third-party processing for regulatory reasons. We see a great deal of potential in further investment here and are excited to push the state of the art for textual representation learning in the email domain.
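
For readers unfamiliar with subword representations, the sketch below shows one common, CPU-friendly way to use them: hashing character n-grams into a fixed-width vector. It is not the architecture described above, which is not specified in detail. The vectors here are untrained, and the `char_ngrams`, `embed`, and `cosine` helpers and the 64-dimension width are illustrative assumptions; a production model would learn a vector per n-gram bucket.

```python
import math
import zlib

DIM = 64  # embedding width, chosen only for illustration


def char_ngrams(word, n_min=3, n_max=5):
    """Yield character n-grams of a word padded with boundary markers."""
    padded = f"<{word}>"
    for n in range(n_min, n_max + 1):
        for i in range(len(padded) - n + 1):
            yield padded[i:i + n]


def embed(text, dim=DIM):
    """Map text to a unit-length vector via signed hashing of character n-grams."""
    vec = [0.0] * dim
    for word in text.lower().split():
        for gram in char_ngrams(word):
            h = zlib.crc32(gram.encode("utf-8"))  # deterministic across runs
            index = h % dim
            sign = 1.0 if (h >> 31) & 1 else -1.0
            vec[index] += sign
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


def cosine(a, b):
    """Cosine similarity for vectors that are already unit length."""
    return sum(x * y for x, y in zip(a, b))
```

Because look-alike strings such as "paypal" and "paypa1" share most of their character n-grams, subword features tend to place them close together, which whole-word vocabularies cannot do for unseen spellings.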

Automated Classification Workflow

Our multi-faceted approach allows us to achieve high precision in identifying malicious messages – a critical requirement for any automated security system.

To build on the technical foundation, we’re shipping enhancements to make the user experience around managing user reports streamlined and intuitive. To start, each classification comes with clear, natural language explanations that help security teams understand exactly why a message received its classification – no machine learning expertise required.

When configured to “Automatically classify,” Material transforms the user-report workflow:

  1. As soon as a user reports a suspicious message, the system automatically creates a case with a classification (safe, spam, malicious, or unknown).
  2. Each classification includes plain-language explanations that highlight key factors behind the decision. For instance, you might see reasons like “Sender domain registered 2 days before message date” or “Sender address is in the 98th percentile of external senders to your tenant in the past 35 days.”
  3. Based on your configured settings, the system automatically applies the appropriate remediation action for each classification type. You maintain full control by mapping specific classifications to your preferred remediation approaches.
  4. Security teams can review and override any classification or remediation action at any time, maintaining the flexibility needed for edge cases or changing circumstances.
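
The workflow above can be sketched as a small data model. Everything in this example is an assumption made for illustration: the action names in `REMEDIATION_POLICY` are placeholders rather than actual product settings, and the 30-day threshold in `domain_age_reason` is arbitrary.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum


class Classification(Enum):
    SAFE = "safe"
    SPAM = "spam"
    MALICIOUS = "malicious"
    UNKNOWN = "unknown"


# Hypothetical policy mapping each classification to a remediation action.
REMEDIATION_POLICY = {
    Classification.SAFE: "restore_to_inbox",
    Classification.SPAM: "move_to_spam",
    Classification.MALICIOUS: "quarantine_all_copies",
    Classification.UNKNOWN: "escalate_to_analyst",
}


@dataclass
class Case:
    message_id: str
    classification: Classification
    reasons: list
    action: str
    overridden: bool = False


def domain_age_reason(registered_on, message_date, max_days=30):
    """Return a plain-language reason if the sender domain is very new."""
    days = (message_date - registered_on).days
    if 0 <= days <= max_days:
        unit = "day" if days == 1 else "days"
        return f"Sender domain registered {days} {unit} before message date"
    return None


def open_case(message_id, classification, reasons, policy=REMEDIATION_POLICY):
    """Create a case and attach the action configured for its classification."""
    kept = [r for r in reasons if r]  # drop signals that produced no reason
    return Case(message_id, classification, kept, policy[classification])


def override(case, new_classification, policy=REMEDIATION_POLICY):
    """Apply an analyst override and re-derive the remediation action."""
    case.classification = new_classification
    case.action = policy[new_classification]
    case.overridden = True
    return case
```

Keeping the policy as plain data separate from the classifier means a team can change what happens to each classification without touching the model.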

Manual Review with Smart Assistance

For teams that prefer a more hands-on approach, User Report Auto Classification can operate in a manual classification mode:

  • Cases are created as before, but now include a suggested classification along with the supporting explanations
  • Accept accurate suggestions with a single click, streamlining the review process
  • Maintain full manual control while benefiting from intelligent classification assistance

Intelligent Case Management

User Report Auto Classification streamlines Material's case management even further. Whether set to automatic or manual classification, Material keeps your caseload to a minimum by ensuring:

  • Similar messages are automatically grouped together
  • Classification consistency is maintained across related messages to prevent conflicting remediation actions - even when automated classifications are overridden by your team
  • Case relationships are intelligently managed to ensure that different classifications for similar messages don't result in unintended consequences
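
One way to picture the grouping and consistency guarantees is the sketch below. The similarity key (sender domain plus a normalized subject) is a deliberately naive stand-in chosen for illustration, and the `group_key` function and `CaseBook` class are hypothetical; a real system would likely rely on far richer similarity signals.

```python
import hashlib
import re
from collections import defaultdict


def group_key(sender, subject):
    """Hypothetical similarity key: sender domain plus normalized subject."""
    domain = sender.rsplit("@", 1)[-1].lower()
    # Strip leading reply/forward prefixes, then collapse whitespace.
    normalized = re.sub(r"^\s*((re|fwd?)\s*:\s*)+", "", subject, flags=re.IGNORECASE)
    normalized = re.sub(r"\s+", " ", normalized).strip().lower()
    digest = hashlib.sha256(f"{domain}|{normalized}".encode("utf-8"))
    return digest.hexdigest()[:16]


class CaseBook:
    """Groups similar reports and keeps one classification per group."""

    def __init__(self):
        self.members = defaultdict(list)  # group key -> reported message ids
        self.classification = {}          # group key -> current label
        self.overridden = set()           # group keys changed by an analyst

    def report(self, message_id, sender, subject, predicted):
        key = group_key(sender, subject)
        self.members[key].append(message_id)
        # The first report sets the label; later reports inherit it so that
        # similar messages never receive conflicting remediation actions.
        self.classification.setdefault(key, predicted)
        return key, self.classification[key]

    def override(self, key, label):
        """Analyst override applies to the whole group, including future reports."""
        self.classification[key] = label
        self.overridden.add(key)
```

In this sketch an analyst override sticks: later reports that land in the same group inherit the overridden label instead of the model's fresh prediction.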

This combination of automated intelligence and manual control means security teams can process user reports more efficiently without sacrificing accuracy or oversight. Whether you choose full automation or assisted classification, Material provides the context and control needed to make confident decisions about user-reported messages.

Conclusion

Security teams can finally say goodbye to spending hours triaging user reports every week. With User Report Auto Classification, reports transform from a manual burden into automated, real-time protection that adapts to your organization's needs. By combining sophisticated machine learning with transparent explanations and flexible controls, security teams can confidently automate user report triage without sacrificing visibility or control over their security programs. Your team's time is now free for the strategic security work that matters most.

It's the latest example of Material's commitment to developing solutions that eliminate security bottlenecks without compromising the control and visibility that modern security programs demand. Stay tuned for more!
