Today’s privacy-conscious environment presents mounting challenges for enterprise data processing infrastructure.
The confluence of legislation and consumer sentiment is pressing both digital natives and transforming enterprises to reshape their customer data strategies to embed data privacy in their operations. In reaction, enterprises are reorienting how they collect and process customer data to optimize around consent, ownership and control.
While presenting new technical challenges to data teams, the data privacy-centric outlook gives cloud data platforms a profound market opportunity to promote warehouse-native applications. These are applications where data processing happens in the customer’s cloud data warehouse instead of being ingested or stored by the application. In addition to presenting organizations with a single source of truth for enterprise data, these platforms also provide an emergent de facto reference architecture for governing data privacy at scale throughout the data processing lifecycle.
Anticipation of warehouse-native applications replacing standalone vertical software-as-a-service solutions has existed for some time. Data platforms themselves have invested heavily in their application-layer ecosystems, tools and accelerators. Likewise, vast sums of venture capital dollars have funded the data tools to support these applications. Now, data privacy is proving to be the catalyst for accelerated adoption of warehouse-native applications.
Data privacy: friction to customer data analytics
The use of customer data has delivered profound economic impact and delighted consumers by powering the analytics, personalization and recommendations that underpin much of the digital economy. However, data innovators must now pause to rethink their analytics and machine learning strategies. Growing regulatory momentum, spanning the EU’s General Data Protection Regulation, the California Consumer Privacy Act and beyond, coupled with headline-grabbing fines, makes third-party tracking and the exchange of customer data untenable in the future.
Legislative barriers are compounded by growing consumer mistrust. Upon introduction of the consent management framework within Apple’s App Tracking Transparency program, three of four Apple device users worldwide opted out of sharing their personal data. The inability to track consumers across applications was forecast to cost Facebook’s parent company Meta Platforms Inc. $10 billion in forgone ad revenue in the second half of 2022.
Data privacy practices: permission, ownership and control
To deliver competitive customer data analytics, innovators are centering their data strategies around the governance pillars of permission, ownership, and control with respect to customer data. Operationally, organizations are doubling down on a raft of policies to support governance at large scale. They include zero-party and first-party data that is intentionally created with transparency and consent from customers, exclusive ownership of all customer data to maintain accountability as to its privacy, and complete control over where customer data resides, by whom it may be accessed and the period of time for which it persists.
A reference architecture for governing data privacy
Europe’s GDPR mandates an individual’s “right to be forgotten,” where upon request an organization has 30 to 90 days to delete a customer’s data entirely. The organization that collected the customer data, deemed the “controller,” is responsible for compliance. However, without a holistic toolset to manage how data is classified, anonymized, deleted and conserved for audit, the data controller’s obligations are infeasible at large scale.
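As a rough illustration of what handling a single erasure request involves, the sketch below removes one customer's rows from a set of in-memory tables and returns an audit record with a due date. Everything in it is a hypothetical stand-in: the function name, the `customer_id` field, the table layout and the 30-day default are assumptions for illustration, not a description of any platform's tooling or of legal requirements.

```python
import hashlib
from datetime import date, timedelta


def process_erasure_request(tables, customer_id, received, deadline_days=30):
    """Delete one customer's rows from every table and return an audit record.

    `tables` is a hypothetical mapping of table name -> list of row dicts,
    standing in for warehouse tables. `deadline_days` is an assumed default.
    """
    removed = {}
    for name, rows in list(tables.items()):
        kept = [row for row in rows if row.get("customer_id") != customer_id]
        removed[name] = len(rows) - len(kept)
        tables[name] = kept

    return {
        # A plain hash is used only for illustration; it is not a robust
        # anonymization technique for real identifiers.
        "subject_ref": hashlib.sha256(customer_id.encode("utf-8")).hexdigest(),
        "received": received.isoformat(),
        "due_by": (received + timedelta(days=deadline_days)).isoformat(),
        "rows_removed": removed,
    }


tables = {
    "orders": [{"customer_id": "c1"}, {"customer_id": "c2"}],
    "events": [{"customer_id": "c1"}, {"customer_id": "c1"}],
}
audit = process_erasure_request(tables, "c1", date(2024, 1, 1))
```

Even in this toy form, the controller has to know every table that holds the customer's data, which is far harder once copies have been sent to outside applications.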
The challenge is best illustrated with the use of enterprise SaaS applications as third-party “processors” of customer data. Organizations that relinquish control by transferring customer data to third parties, as processors, forgo their ability to underwrite data privacy throughout the data processing lifecycle. Most visibly, data regulators in Austria, France, the Netherlands, Italy and Denmark have ruled the data encryption and anonymization measures used by Google Analytics to be inadequate.
Cloud data platforms, as alternate processors of customer data, are competing feverishly to position a data privacy reference architecture and tooling to help data controllers to effect data governance at large scale. This consists of building native applications on top of the warehouse, where the data persists, to reduce the transfer and replication of customer data to third-party applications and partners — without stifling analytics output.
A core tenet of the data privacy reference architecture is the introduction of data clean rooms in the warehouse. These services provide a privacy-proofed environment where multiple participants can join their first-party data for analytics without risk of exposing that data to other participants.
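The clean room idea can be sketched in a few lines of Python. This is a minimal toy, not how any particular data platform implements the feature: two parties' records are matched on salted hashes of an identifier, and only aggregate counts above a minimum group size are released. The names `blind` and `clean_room_overlap`, the shared salt and the threshold of 3 are all assumptions made for illustration.

```python
import hashlib
from collections import Counter

MIN_GROUP_SIZE = 3  # hypothetical suppression threshold for small groups


def blind(identifier: str, salt: str) -> str:
    """Normalize an identifier and replace it with a salted SHA-256 digest."""
    normalized = identifier.strip().lower()
    return hashlib.sha256((salt + normalized).encode("utf-8")).hexdigest()


def clean_room_overlap(party_a, party_b, salt, min_group_size=MIN_GROUP_SIZE):
    """Count matched customers per segment without returning row-level data.

    party_a: dict mapping email -> segment (e.g. an advertiser's audience)
    party_b: iterable of emails (e.g. a retailer's purchasers)
    """
    blinded_a = {blind(email, salt): segment for email, segment in party_a.items()}
    blinded_b = {blind(email, salt) for email in party_b}

    counts = Counter(
        segment for key, segment in blinded_a.items() if key in blinded_b
    )
    # Release only aggregates large enough that no individual stands out.
    return {seg: n for seg, n in counts.items() if n >= min_group_size}


advertiser = {
    "a@x.com": "sports",
    "b@x.com": "sports",
    "c@x.com": "sports",
    "d@x.com": "travel",
}
retailer = [" A@x.com ", "b@x.com", "c@x.com", "d@x.com", "z@x.com"]
result = clean_room_overlap(advertiser, retailer, salt="shared-secret")
```

Here `result` contains a count for the "sports" segment only; the single "travel" match is suppressed because it falls below the threshold, and neither party sees the other's raw list.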
Warehouse-native applications supporting data privacy
While the “pull” of a data privacy reference architecture is a compelling pitch for warehouse-native applications, increasingly there is a “push” factor stemming from concerns and headlines about relinquishing control of data privacy to third-party SaaS applications.
Industry commentary is rife with predictions that warehouse-native applications will augment and upgrade standalone SaaS applications. Primarily the prediction is based on performance. Those applications built to leverage the data engineering tools of the warehouse platforms should deliver enhanced reliability and processing speed for embedded analytics and search. Early cohorts of warehouse-native applications have already emerged across data-intensive use cases in security (with security information and event management data), marketing (with customer relationship management) and product (with product usage telemetry).
However, in the current privacy-conscious environment, it is the necessity for governance at scale that is driving data strategies and accelerating investments in warehouse-native applications.
Conor Doyle is a partner at Atlantic Bridge Capital, investing in data infrastructure startups across Europe and the United States. He wrote this article for SiliconANGLE.