As we’ve written previously, bringing Microsoft 365 and Google Workspace data into a security data lake can unlock many new use cases for security teams. This post continues our series of examples showing how to use this data to measure and improve your security posture.
In this blog post, we’ll walk through how to use Material and Snowflake to detect sensitive workflows happening over email, along with steps to mitigate the risks they pose.
What are sensitive workflows and why should you move them out of email?
A sensitive email workflow is the handling of emails containing sensitive information such as legal matters, employee data, financial details, patient records, passwords, trade secrets, and classified government information.
While email is the most commonly used business communication tool, the suitability of handling sensitive workflows solely through email depends on the specific requirements and risk factors involved. Email itself presents a number of risks as the incumbent collaboration platform. Here are just a few:
- Lack of control: Once an email is sent, the sender or owner loses control over distribution. Recipients can share the information with others, intentionally or inadvertently, increasing the risk of unauthorized access to sensitive information.
- Legal and compliance issues: Depending on the nature of the sensitive content, sending it via email may violate regional or industry-specific regulations.
- Phishing and social engineering attacks: Email is a spoofable channel that is open to the internet. By processing sensitive content over email, organizations become more susceptible to impersonation, social engineering, and business email compromise.
- Data loss: Email is one of the datasets attackers target most; if a bad actor gains unauthorized access to one or more email accounts, they can quickly exfiltrate years of archived content.
While email offers convenience, there are alternative methods that may provide enhanced security for sensitive workflows. These include secure file transfer, secure collaboration platforms or messaging applications, and other dedicated secure systems. Ultimately, the choice of the most appropriate method depends on several factors; however, the optimal choice is rarely email. So let’s walk through how to move these workflows out of email.
Moving sensitive workflows out of email with Material and Snowflake
Solving this problem is multi-faceted. You need to start by getting a baseline of what types of sensitive content exist in your organization's email today. Next, you need to ensure that your policies make clear what types of sensitive content should not appear in email. The next step is to prioritize which types of sensitive content to target and in what order. Lastly, you need to operationalize the enforcement of these policies to ensure you maintain correct posture and continuously measure impact. If you are successful, you will see a downward trend for new sensitive content appearing in email.
1. Get a Snapshot & Understand Baseline
In order to solve this issue, it’s important to first get a baseline of where data sits. The pipeline here looks something like the following:
- Process all emails up to a certain historical date. This generally would require exporting via Graph or Gmail APIs, transforming the data, and loading that information into a tool like Snowflake.
- Develop tagging of sensitive content within this dataset. Techniques might include OCR, text matching, and third-party tools like Google’s DLP API.
- Spot check your tagging for accuracy and precision across all relevant categories.
- Aggregate counts of sensitive content by category, and potentially other breakdowns such as by group, job function, etc.
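To make the tagging and aggregation steps above a bit more concrete, here is a minimal Python sketch. It assumes messages have already been exported into simple dictionaries, and it uses a few toy regular expressions as stand-ins for real detectors. The category names, the patterns, and the `department` and `body` keys are all hypothetical and chosen for illustration; this is not how Material or any DLP tool tags content.

```python
import re
from collections import Counter

# Hypothetical category patterns. Real detectors are far more robust
# (checksums, context words, OCR output, third-party DLP results).
CATEGORY_PATTERNS = {
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "credit_card": re.compile(r"\b(?:\d{4}[ -]?){3}\d{4}\b"),
    "password": re.compile(r"\bpassword\s*[:=]\s*\S+", re.IGNORECASE),
}


def tag_message(body):
    """Return the set of category names whose pattern matches the body."""
    return {name for name, pattern in CATEGORY_PATTERNS.items() if pattern.search(body)}


def baseline_counts(messages):
    """Count tagged messages per (department, category) pair.

    `messages` is an iterable of dicts with the assumed keys
    "department" and "body".
    """
    counts = Counter()
    for msg in messages:
        for category in tag_message(msg["body"]):
            counts[(msg["department"], category)] += 1
    return counts


# Made-up sample data, for illustration only.
sample = [
    {"department": "HR", "body": "Employee SSN is 123-45-6789."},
    {"department": "HR", "body": "Lunch at noon?"},
    {"department": "Finance", "body": "Card: 4111 1111 1111 1111"},
    {"department": "IT", "body": "temp password: hunter2, SSN 987-65-4321"},
]
counts = baseline_counts(sample)
```

Spot checking then amounts to pulling a sample of tagged and untagged messages per category and reviewing them by hand before trusting the aggregate counts.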
This is quite a lot of work, but luckily Material handles the first three steps out of the box. The results might look something like the following:
Material also makes it easy to get this data into Snowflake. I talk a bit about the integration in a previous blog post, How to Monitor Shadow IT using Material Security and Snowflake. Once the data is there, you might build a visual like the following:
Now that we have a good grip on our baseline, let’s move on to the next step.
2. Communicate Policy & Prioritize Types of Data
Once the baseline is established, it is time to introduce and communicate policies aimed at securing sensitive workflows. Announcing these policies to the organization is crucial for ensuring awareness and buy-in from all stakeholders. It is also important to prioritize the types of data you will target. Each organization will have a different tolerance for the risk each category of sensitive content represents, depending on its region and industry. For example, healthcare companies may want to prioritize healthcare records to honor their BAA (business associate agreement) commitments, whereas European companies may want to address sensitive employee and customer information first to remain compliant with GDPR.
Once you’ve selected your categories, you can easily visualize them in Snowflake.
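The selection behind such a view is simple enough to sketch in Python. The example below assumes you already have per-category baseline counts and an ordered list of the categories you chose to tackle first. The function name, category names, and numbers are hypothetical and only illustrate the idea.

```python
def prioritized_view(category_counts, priority_order):
    """Return (category, count) pairs for the selected categories only,
    in the order the organization chose to address them."""
    return [(category, category_counts.get(category, 0)) for category in priority_order]


# Hypothetical baseline counts and priority order, for illustration only.
baseline = {
    "patient_records": 1200,
    "passwords": 300,
    "employee_data": 4500,
    "legal": 80,
}
healthcare_priorities = ["patient_records", "employee_data"]
view = prioritized_view(baseline, healthcare_priorities)
```

Keeping the priority order as explicit data, rather than sorting by volume, reflects that the ranking is a policy decision and not just a function of how much content exists.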
3. Operationalize Enforcement and Measure Impact
Once policies are in place, a robust measurement and reporting framework becomes vital. Tracking time-based metrics, such as week-over-week (WoW) and month-over-month (MoM) comparisons, enables organizations to evaluate the impact of the policies. This data-driven approach allows for continuous improvement and a better understanding of progress.
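As a rough illustration of a MoM comparison, the Python sketch below buckets detection dates by month and computes the percentage change against the previous month that has data. The function names and sample dates are assumptions made for this example; in practice the same calculation would more likely run as a query in Snowflake.

```python
from collections import Counter
from datetime import date


def monthly_counts(detection_dates):
    """Bucket detection dates into (year, month) and count each bucket."""
    return Counter((d.year, d.month) for d in detection_dates)


def month_over_month(counts):
    """Return (month, count, pct_change) rows sorted by month.

    pct_change compares against the previous month present in `counts`.
    It is None for the first month or when the previous count is zero.
    Months with no detections at all do not appear in `counts`, so fill
    them with zeros beforehand if gaps matter for your reporting.
    """
    rows = []
    previous = None
    for month in sorted(counts):
        current = counts[month]
        change = (current - previous) / previous * 100 if previous else None
        rows.append((month, current, change))
        previous = current
    return rows


# Made-up detection dates: 4 in January, 2 in February, 1 in March.
dates = [date(2024, 1, d) for d in (3, 10, 17, 24)]
dates += [date(2024, 2, 5), date(2024, 2, 20), date(2024, 3, 8)]
rows = month_over_month(monthly_counts(dates))
```

A steadily negative percentage change is the downward trend you are hoping to see once policies take effect.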
Gauging MoM metrics and aggregating data by group or department provides insight into specific areas that require attention. By monitoring group-based progress, organizations can identify patterns and address compliance gaps promptly. Furthermore, setting up alerts for repeat offenders ensures that enforcement remains effective. One such example is a recurring job that queries for the users with the largest MoM increase in sensitive content handling. An alert on these users could help surface a malicious actor, or at the very least lead to better monitoring of sensitive data usage. Here’s a visual representation in Snowflake:
This operationalization of policies helps establish a culture of accountability and reinforces security practices throughout the organization.
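The recurring job for users with the largest MoM increase can be sketched in Python as follows. This assumes per-user counts of sensitive messages are already available for two consecutive months. The function name, the `min_increase` and `top_n` parameters, and the sample users are hypothetical, and the thresholds would need tuning for a real environment.

```python
def largest_increases(previous_month, current_month, min_increase=10, top_n=5):
    """Return up to top_n (user, increase) pairs, largest increase first.

    previous_month and current_month map a user to their count of
    sensitive messages. Users absent from previous_month count as zero.
    Only increases of at least min_increase are reported.
    """
    increases = []
    for user, current in current_month.items():
        delta = current - previous_month.get(user, 0)
        if delta >= min_increase:
            increases.append((user, delta))
    # Sort by increase descending, then by user for a stable order on ties.
    increases.sort(key=lambda pair: (-pair[1], pair[0]))
    return increases[:top_n]


# Made-up counts, for illustration only.
prev = {"alice@example.com": 5, "bob@example.com": 40, "carol@example.com": 2}
curr = {
    "alice@example.com": 60,
    "bob@example.com": 42,
    "carol@example.com": 30,
    "dave@example.com": 12,
}
flagged = largest_increases(prev, curr)
```

Using an absolute increase with a minimum threshold keeps low-volume users from dominating the alert. A percentage-based rule is a reasonable alternative, but it tends to over-flag users who go from one message to three.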
Next Steps
The resources in this blog post are a great starting point for getting sensitive workflows out of email, but it is crucial to acknowledge that external factors, such as third parties sending you sensitive content, might make complete elimination impossible. While organizations strive for zero tolerance of sensitive content in email, realistic targets range from 0.5% to 2% of all email containing sensitive content. For many organizations, that percentage still translates to millions of sensitive messages. To protect these messages effectively, organizations can turn to Material's Data Protection, which offers comprehensive security controls and advanced monitoring.
To learn more about our unique approach, request time with our team here.
Gist of Queries:
https://gist.github.com/maxpollard/fafdbd6460b125d7582b5f595ef97d59