
Canopy Talks PII, Data Breach Response, & Privacy on BarCode Podcast

Canopy Team April 04, 2022


Canopy COO Adi Elliott recently joined Chris Glanden, host of BarCode podcast, to chat about data breach response, incident response, preventing data loss, the application of AI in cybersecurity fields, and more.

Here’s a quick look at what they discussed:


Where Does Data Breach Response Fit in the IR Process?

When we talk about data breach response, we’re typically referring to business email compromises (BECs) and ransomware. If you look at the NIST Cybersecurity Framework, you’ll just see one little line under Respond titled “Analysis” with very little explanation — even though assessing these incidents and adhering to data privacy & breach notification regulations is expensive and difficult to execute.

[Figure: NIST Framework. Source: An Introduction to the Components of the Framework (NIST)]

To elaborate on this IR step a bit, data breach response encompasses:

  • Figuring out who was impacted by a security/compromise event
  • Determining whether the incident legally constitutes a breach
  • Sending out notifications (if applicable) 


What Problem Does Canopy Solve?

Canopy’s Data Breach Response software essentially takes a bag of data and transforms it into a list of names and PII using artificial intelligence (AI). To explain, let’s use a common example: a business email compromise (BEC) caused by an employee clicking a link in a phishing email. 

A digital forensic/incident response (DFIR) team loads the compromised PST into Canopy’s software, which uses AI and machine learning to data mine for two things: personally identifiable information (PII) and people. Reviewers then comb through the flagged documents to link each PII element to its respective person using a streamlined, AI-powered workflow. And finally, Canopy’s powerful technology helps deduplicate the list of identified people quickly, generating a consolidated entity list for breach notifications.
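The final deduplication step can be pictured with a small sketch. This is not Canopy's implementation: the `normalize` and `consolidate` helpers, the record layout, and the sample names are all hypothetical, and a real tool would need far more robust matching than simple name normalization.

```python
from collections import defaultdict


def normalize(name: str) -> str:
    """Reduce a name to a comparable key (hypothetical, deliberately simple)."""
    # Turn "Last, First" into "First Last" before comparing.
    if "," in name:
        last, first = name.split(",", 1)
        name = f"{first} {last}"
    # Lowercase, drop periods, and collapse repeated whitespace.
    return " ".join(name.lower().replace(".", "").split())


def consolidate(records):
    """Merge (name, pii_type) pairs into one entry per normalized person."""
    merged = defaultdict(set)
    for name, pii_type in records:
        merged[normalize(name)].add(pii_type)
    return {person: sorted(types) for person, types in merged.items()}


# Made-up reviewer output: the same person appears under three spellings.
records = [
    ("Jane Doe", "ssn"),
    ("DOE, Jane", "credit_card"),
    ("jane  doe", "ssn"),
    ("John Smith", "date_of_birth"),
]

entities = consolidate(records)
for person, pii_types in entities.items():
    print(person, pii_types)
```

Here four reviewed records collapse into two people, each with a combined list of exposed PII types, which is the shape a notification list needs.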


Data Breach Response: How It Started & How It’s Going

Because this space is relatively new, there isn’t a long progression of software companies solving the problem. When legislation like the EU General Data Protection Regulation (GDPR) and the California Consumer Privacy Act (CCPA) began rolling out across the globe, IR teams needed an immediate solution, so they turned to the best available option: ediscovery.

When it comes to PII detection, ediscovery’s data mining techniques like keyword search/regular expressions (regex) are both over- and under-inclusive. This means that IR teams would waste time looking at a ton of documents that don’t actually contain PII, and at the same time miss important data. The inefficiencies trickled down from there.
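A toy example makes the over- and under-inclusion problem concrete. The pattern, document names, and sample text below are illustrative assumptions, not anything taken from a real ediscovery tool.

```python
import re

# Hypothetical regex for U.S. Social Security numbers written with hyphens.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

# Made-up documents: one real hit, one look-alike, one SSN in another format.
documents = {
    "hr_form": "Employee SSN: 078-05-1120",
    "parts_list": "Replacement part no. 481-22-9043 ships Friday",
    "scanned_note": "Her social is 078 05 1120, per the intake call",
}


def flag_documents(docs):
    """Return the names of documents the pattern flags, in sorted order."""
    return sorted(name for name, text in docs.items() if SSN_PATTERN.search(text))


flagged = flag_documents(documents)
print(flagged)
```

The pattern flags the parts list, which contains no PII (over-inclusive), and misses the note where the number is written with spaces (under-inclusive). Loosening the pattern to catch the second case would flag even more look-alikes.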

Download our case study to see how one Partner saved its data breach response client over $300,000 by switching from traditional data mining to Canopy.

Initially, it was nearly impossible to comply with strict requirements like GDPR’s non-negotiable 72-hour breach notification rule. But these regulations were necessary to protect people’s data, and they set the North Star. Now, Canopy’s purpose-built technology is helping Data Breach Response teams zero in on PII and assess cyber incidents fast, so that compromised organizations can affordably comply with applicable breach notification mandates.


What Is “Sensitive Data”?

We’ve already talked about PII, but there are a variety of terms used to refer to sensitive data, including:

  • Protected health information (PHI)
  • Financial information
  • Medical information
  • Intellectual property (IP)
  • Religious affiliation

Regulations define specific data elements and dictate what Canopy’s algorithms are trained to search for, with some (like GDPR) being more vague than others (like HIPAA). Interestingly, regulations apply based on where affected people live, not where a breach occurs — so if a Florida-based company breach affects more than 500 California residents and California-recognized PII is compromised, the company must comply with CCPA.


How Much Can We Rely on AI?

When you try to throw AI at something complex and ill-defined, like self-driving cars, it’s difficult to achieve success. But AI is game-changing when it’s tightly defined and implemented correctly. 

Each of Canopy’s machine learning models does something very specific: there are individual models to detect each category of PII (one for Social Security numbers, one for credit card numbers, one for names, etc.), and some of those even have separate models to account for different languages. Each is very accurate and much faster than manual processes.

The other differentiator with our software is that it isn’t solely reliant on AI. Humans validate Canopy’s decisions and help the software continuously improve. So if the models mistakenly identify the religious affiliation “Christian” as a name on one project, reviewers correct it, and the models learn from that correction to the benefit of future projects.


Data Privacy Best Practices & Recommendations

The onus to solve privacy problems is mostly on businesses, and it’s in their best interest to protect customers’ personal data — if they’re breached, their reputation is on the line and they risk losing this valuable asset.

BECs from phishing attacks are the biggest culprits for unintentional data loss. People are primarily focused on getting their jobs done, and they don’t think about putting PII on file shares and sending it through emails. 

To combat this, businesses are currently doing three things: 

  1. Sending out surveys asking employees how they handle sensitive data.
  2. Requiring employees to complete generic cybersecurity training.
  3. Setting up heavy endpoints-everywhere software to supposedly scour the enterprise for PII. 

We know from our work in the Data Breach Response space that these three tactics range from ineffective to impractical. So we started to wonder: how can we help companies proactively mitigate privacy risk before a breach occurs? That’s where the idea for our second product came from. With Canopy’s Privacy Audit software, enterprises can analyze how a sample set of employees actually handles PII, not just how they report handling it. They can then use these insights to change human behavior by developing customized cybersecurity training and policies that empower employees to get their jobs done safely.



Want to hear more? You can listen to the full episode on BarCode podcast’s website, or tune in wherever you listen to podcasts. Episode 51: Proceed Into Independence originally aired on February 3, 2022.


About BarCode

Cybersecurity with 1337% ABV. BarCode is a place where Cybersecurity professionals can unite in a relaxed atmosphere while getting to hear experts opensource their wisdom and insight... outside of conference walls. Untap the knowledge of an industry guru, find out what fuels their drive, or simply kick back, relax, and listen to their story. Due to COVID-19 restrictions, most bars are limited or closed for on-prem service. Therefore, each episode will feature Tony, a virtual bartender who will greet and walk us through making an exceptional yet easy-to-make beverage right from the comfort of your own home. It's Cybersecurity straight up, no chaser. Winner of a 2021 People's Choice Podcast Award (Technology Category).