Privacy-preserving data collaboration zone

Project Overview

A bank launched a strategic initiative to use its partners as a source of lead-generation data for promoting its products. The bank needed a compliant way to access partner data so that it could conduct pre-analysis, measure potential audience overlap, and predict look-alike audiences. A paramount requirement was that partner data remain private and that no Personally Identifiable Information (PII) be exposed. To support this, a dedicated “collaboration zone” was envisioned to enable secure data sharing between the bank and its partners.

The Challenge

The central challenge was how to analyze customer data joined from both the bank and its partners without exposing any PII to either party. The bank also needed to run its own code on partner data while preventing the partner from accessing that code.

Our Solution

We devised a data sharing solution pack that combined several privacy-preserving techniques: PII encryption and hashing, robust key management, a standardized common ID structure, and SQL query monitoring with audit capabilities. Together, these measures allowed data from the bank and its partners to be joined securely while keeping the underlying PII private.

Key Data Platform (DP) components, including MLOps (Machine Learning Operations), a Lakehouse architecture, and an API Gateway, were designed to be reusable within the collaboration zone. This ecosystem gave the bank's data analysts a familiar environment in which they could work in a manner consistent with their own Data Warehouse (DWH) practices. The hardened environment also kept the bank's code confidential, meeting the bank’s privacy requirements.
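
As an illustration of how PII hashing and a common ID can support overlap measurement, the following is a minimal Python sketch, not the project's actual implementation. It assumes both parties normalize an identifier (here an email address) the same way and apply HMAC-SHA256 with a shared secret key, so that only the resulting pseudonymous IDs enter the collaboration zone. The function names, the sample data, and the hard-coded key are hypothetical; in practice the key would come from a key management service.

```python
import hashlib
import hmac
import unicodedata


def normalize_identifier(raw: str) -> str:
    """Canonicalize an identifier so both parties hash identical bytes."""
    return unicodedata.normalize("NFKC", raw).strip().lower()


def common_id(raw: str, shared_key: bytes) -> str:
    """Derive a pseudonymous join key with HMAC-SHA256 (keyed hashing)."""
    message = normalize_identifier(raw).encode("utf-8")
    return hmac.new(shared_key, message, hashlib.sha256).hexdigest()


def overlap_count(bank_ids: set, partner_ids: set) -> int:
    """Count customers present on both sides, using only hashed IDs."""
    return len(bank_ids & partner_ids)


# Hypothetical key and sample data, for illustration only.
shared_key = b"demo-key-from-a-key-management-service"
bank_emails = ["Alice@example.com", "bob@example.com "]
partner_emails = ["alice@example.com", "carol@example.com"]

# Each party hashes its own data before it enters the collaboration zone.
bank_ids = {common_id(e, shared_key) for e in bank_emails}
partner_ids = {common_id(e, shared_key) for e in partner_emails}

print(overlap_count(bank_ids, partner_ids))  # prints 1
```

A keyed hash is used here rather than a plain hash because identifiers such as emails and phone numbers are easy to guess; without the secret key, anyone holding the hashed IDs could recompute hashes for candidate values and reverse the mapping.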

The Outcome

The bank achieved its objectives, developing pre-approval mechanisms and identifying look-alike audiences within its partners' customer bases. Crucially, it did so without the risk of PII leakage. The privacy-preserving measures kept all data collaboration and analysis inside the collaboration zone, preserving both data security and privacy throughout the process.