Unpacking the Jargon of Data Breach Response

Canopy Team March 31, 2020

Canopy is the core of a unique process in the ever-growing field of data breach response. A delicate balance of people, process, and technology comes together on every project to ensure that the response to every data breach is efficient, accurate, and affordable. Partnerships with companies such as The Crypsis Group and Integreon, Inc. allow data breaches to be investigated using our custom-designed software, with higher-quality results than traditional ediscovery methods. This combination of people and machine learning creates the ideal mechanism for any data breach response.

Although data breach response closely parallels ediscovery, the two processes differ significantly. Nevertheless, their terminology is often erroneously used interchangeably, so fully understanding our data breach process requires some new terminology. This article defines four phases of data breach response, exploring both the terminology in those phases and the technology behind the labels.

The first phase is Processing. Before human reviewers ever lay eyes on the documents, Canopy’s software processes the data, including all extraction and data detection, and produces an Impact Assessment Report. Canopy’s automated impact assessment gives our partners the scope of the breach in 24 to 72 hours, eliminating the 1 to 2 weeks traditionally set aside for a preliminary manual assessment. The report provides a processing overview of the data uploaded to Canopy and of the Personally Identifiable Information (PII) and Protected Health Information (PHI) elements contained within that data. Our machine-learning-based software detects over 60 unique data elements, from contact information to Social Security Numbers to medical diagnoses, and everything in between. These elements help our partners ensure compliance with a wide range of global privacy regulations, including CCPA, FERPA, HIPAA, GDPR, and PIPEDA.
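To make the idea of data detection concrete, here is a minimal Python sketch that counts a few element types across a set of documents. It is not Canopy's implementation: the article describes machine-learning-based detection of over 60 elements, while this sketch substitutes three simple regular expressions. The names `PATTERNS`, `detect_elements`, and `impact_summary` are hypothetical and exist only for illustration.

```python
import re
from collections import Counter

# Hypothetical, deliberately simple patterns (assumption: US-style formats).
# Real detection would need far broader coverage and validation.
PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "phone": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
}


def detect_elements(text):
    """Return a list of (element_type, matched_text) pairs found in text."""
    hits = []
    for name, pattern in PATTERNS.items():
        hits.extend((name, match.group()) for match in pattern.finditer(text))
    return hits


def impact_summary(documents):
    """Count detected elements per type across a {doc_id: text} mapping."""
    counts = Counter()
    for text in documents.values():
        counts.update(name for name, _ in detect_elements(text))
    return dict(counts)


if __name__ == "__main__":
    sample = {
        "a.txt": "SSN 123-45-6789, contact jane@example.com",
        "b.txt": "Quarterly meeting notes, nothing sensitive.",
    }
    print(impact_summary(sample))
```

A per-type count like this is roughly the kind of overview an impact assessment gives: how much of each data element is present, before anyone opens a document.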

The second phase is Data Mining: sorting documents that contain PII or PHI from the larger pool of documents, thereby reducing the number of documents that go to review. Canopy developed data analytics tools designed specifically for protected data to separate documents that require review from those that do not. This step ensures that valuable reviewer hours are spent only on documents containing information relevant to the review, dramatically reducing response time and review cost.
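The sorting step described above can be sketched as a simple partition of the document pool. This is an illustrative assumption about the general shape of the step, not Canopy's analytics tooling: `partition_for_review` is a hypothetical function, and the `SSN` regular expression is a stand-in for whatever detector decides that a document holds protected data.

```python
import re

# Hypothetical stand-in detector: flags text containing an SSN-like string.
SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def partition_for_review(documents, contains_protected_data):
    """Split a {doc_id: text} mapping into (needs_review, excluded).

    contains_protected_data is any callable that takes the document text
    and returns True when the document should go to human review.
    """
    needs_review, excluded = {}, {}
    for doc_id, text in documents.items():
        target = needs_review if contains_protected_data(text) else excluded
        target[doc_id] = text
    return needs_review, excluded


if __name__ == "__main__":
    pool = {
        "hr_form.txt": "Employee SSN: 123-45-6789",
        "lunch_menu.txt": "Soup of the day: tomato",
    }
    review, skipped = partition_for_review(pool, lambda t: bool(SSN.search(t)))
    print(sorted(review), sorted(skipped))
```

Passing the detector in as a parameter keeps the partition logic independent of how protected data is recognized, so a stronger detector can be swapped in without changing the sorting code.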

In the third phase, the Review, human reviewers put eyes on the documents to identify entities and reportable data within them. Reviewers work alongside counsel to identify data elements that, in combination with contact information, are reportable in the context of each individual case. During Canopy’s processing phase, machine learning and other computer-based techniques have already identified and highlighted reportable elements within the documents. Reviewers therefore only need to verify the elements and link them to the appropriate entity. This approach dramatically reduces both false positives and human error. Ideally, the reviewer never has to touch their keyboard! Our partners estimate that Canopy’s review technology reduces false positives by over 43%.

Following the review, the fourth phase is Entity Management, in which the list of entities identified by reviewers is consolidated into a single, de-duplicated list for notification. Canopy’s technology automatically merges exact-duplicate entities based on the names and PII contained in the entity list. It also groups near-duplicate entities, allowing our partners to merge related entities seamlessly and intuitively.
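As a rough illustration of the two steps in this phase, the sketch below first merges exact duplicates and then groups near-duplicates for a person to confirm. Everything here is an assumption made for the example: the entity records are plain dictionaries with `name` and `ssn` keys, similarity is measured with `difflib.SequenceMatcher` from the Python standard library, and the 0.85 threshold is arbitrary. Canopy's actual matching logic is not described in this article.

```python
from difflib import SequenceMatcher


def normalize(entity):
    """Build a comparison key: lowercased, whitespace-collapsed name plus SSN."""
    name = " ".join(entity["name"].lower().split())
    return (name, entity.get("ssn"))


def merge_exact(entities):
    """Keep the first entity seen for each normalized (name, ssn) key."""
    merged = {}
    for entity in entities:
        merged.setdefault(normalize(entity), entity)
    return list(merged.values())


def group_near_duplicates(entities, threshold=0.85):
    """Greedily group entities whose names resemble a group's first member."""
    groups = []
    for entity in entities:
        name = normalize(entity)[0]
        for group in groups:
            first_name = normalize(group[0])[0]
            if SequenceMatcher(None, name, first_name).ratio() >= threshold:
                group.append(entity)
                break
        else:
            # No existing group was similar enough, so start a new one.
            groups.append([entity])
    return groups


if __name__ == "__main__":
    found = [
        {"name": "Jane Doe", "ssn": "123-45-6789"},
        {"name": "jane  doe", "ssn": "123-45-6789"},
        {"name": "Jane Do", "ssn": None},
        {"name": "Robert Smith", "ssn": "987-65-4321"},
    ]
    unique = merge_exact(found)
    for group in group_near_duplicates(unique):
        print([e["name"] for e in group])
```

Exact duplicates are merged automatically because the match is unambiguous, while near-duplicates are only grouped, leaving the final merge decision to a person.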

With Canopy’s purpose-built machine-learning technology at the core, Data Breach Notification projects are more efficient and effective. At each phase, Canopy’s technology makes humans’ lives easier and expedites the process, ultimately ensuring that clients can notify affected individuals quickly.