In a new paper, Institute Fellow Aaron Fluitt and co-authors Aloni Cohen, Micah Altman, Kobbi Nissim, Salome Viljoen, and Alexandra Wood argue that data protection regulators and policymakers are failing to address the accumulation of subtle, often unintuitive privacy erosions across multiple data uses — and warn that overlooking these cumulative effects has already led to startling privacy failures.
The paper introduces the concepts of “composition” and “composition effects” from computer science into the data protection law lexicon, in order to describe the cumulative erosions of privacy that inhere in every piece of data about people. Because these composition effects are an intrinsic and unavoidable property of data, privacy erosions “occur no matter how aggregated, insignificant, or anonymized the data may seem, and even small erosions can combine in unanticipated ways to create big risks.”
The authors describe several real-world privacy attacks that have leveraged composition effects, including the recent revelation “that the underlying confidential data from the 2010 US Decennial Census could be reconstructed using only the statistical tables published by the Census Bureau.” In that test attack, researchers were able to reconstruct the sex, race, ethnicity, and block-level geographic location for 71% of the population, and could completely re-identify more than 50 million people.
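Reconstruction attacks of this kind work because each published aggregate constrains the underlying records, and enough overlapping statistics can leave only one consistent solution. The toy sketch below (made-up numbers for a hypothetical block of three residents; the real attack solved vastly larger systems, and these are not the Bureau's actual tables) shows how a few "harmless" summary statistics can pin down every individual's age exactly:

```python
from itertools import combinations_with_replacement

# Hypothetical published statistics for one small block of 3 residents
# (illustrative numbers, not real Census tables):
published = {"count": 3, "mean_age": 30, "min_age": 18, "max_age": 45}

# Exhaustively search every multiset of ages (0-115) that is consistent
# with all of the published aggregates at once.
candidates = [
    ages
    for ages in combinations_with_replacement(range(116), published["count"])
    if min(ages) == published["min_age"]
    and max(ages) == published["max_age"]
    and sum(ages) / len(ages) == published["mean_age"]
]

# Exactly one multiset survives: the "aggregate" tables, taken together,
# reveal every resident's age.
print(candidates)  # [(18, 27, 45)]
```

No single statistic here is sensitive on its own; it is their composition that uniquely determines the microdata.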
The authors argue that “Privacy and data protection failures from unanticipated composition effects reflect a type of data myopia—a short-sighted approach toward addressing increasingly-ubiquitous surveillance and privacy risks from Big Data analytics, characterized by a near total focus on individual data processors and processes and by pervasive underestimation of systemic risks accumulating from independent data products.”
The authors are continuing to develop their Composition Theory for privacy law as one part of a broader effort to bridge the conceptual and linguistic divides between how computer scientists understand privacy risks and how law and policy treat those risks.
Having noted that the “failure to recognize accumulation of risk in the information ecosystem reflects a more general societal blind spot to cumulative systemic risks, with parallels in collective failures to foresee or forestall global financial crises, and to adequately address mounting risks to the natural environment,” the authors intend to explore whether privacy law may be able to learn valuable lessons from other regulatory spheres that have grappled with systemic risk for decades.
In the paper, the authors contend that the “prevailing approach to protecting personal data at the level of individual data processors, rather than treating risks to personal data at the macro or systemic level, typifies the tyranny of small decisions: although each step seems small, together they can bring society over a cliff.” Some recent technical innovations, like differential privacy, offer techniques that are resilient against cumulative composition effects for some data uses, but many of the looming privacy threats identified by the authors “will be impossible to overcome unless regulations are designed specifically to regulate cumulative risk in a manner that is consistent with the science of composition effects.”