What We Gained and Lost by Updating Census Privacy


Too Long; Didn't Read

The adoption of differential privacy in the U.S. Census altered how expertise, responsibility, and transparency are distributed. Formerly led by statisticians, the process now prioritizes computer science, creating both new privacy benefits and stakeholder challenges. Despite efforts to include the public through educational tools and demos, key data was initially withheld, reflecting deeper tensions in tech-driven governance transitions.


Abstract and 1. Introduction

2. Related Work

3. Theoretical Lenses

3.1. Handoff Model

3.2. Boundary objects

4. Applying the Theoretical Lenses and 4.1 Handoff Triggers: New tech, new threats, new hype

4.2. Handoff Components: Shifting experts, techniques, and data

4.3. Handoff Modes: Abstraction and constrained expertise

4.4 Handoff Function: Interrogating the how and 4.5. Transparency artifacts at the boundaries: Spaghetti at the wall

5. Uncovering the Stakes of the Handoff

5.1. Confidentiality is the tip of the iceberg

5.2. Data Utility

5.3. Formalism

5.4. Transparency

5.5. Participation

6. Beyond the Census: Lessons for Transparency and Participation and 6.1 Lesson 1: The handoff lens is a critical tool for surfacing values

6.2 Lesson 2: Beware objects without experts

6.3 Lesson 3: Transparency and participation should center values and policy

7. Conclusion

8. Research Ethics and Social Impact

8.1. Ethical concerns

8.2. Positionality

8.3. Adverse impact statement

Acknowledgments and References

4.3 Handoff Modes: Abstraction and constrained expertise

The handoff model pays particular attention to differences between modes of interaction between components. Attending to how these modes shift with the introduction of DP reveals how underlying values shift as well.



The changes to the DAS also reconfigured the relevance of disciplinary expertise. Many kinds of pre-DP census expertise (such as that of demographers and political advocates) were no longer sufficient to afford a confident understanding of, or even engagement with, how the DAS operates [95, 97]. That is to say, the responsibility for effectively designing privacy protections was displaced from statisticians and the DAS’s prior experts onto computer scientists. This reorientation leads not only to a new balance of power across the landscape of Census experts, but also to a fundamental shift in the rhetorical and epistemological configuration of the Census and the DAS [19, 92].


4.4 Handoff Function: Interrogating the how

At first blush, it seems that the disclosure avoidance system’s primary function remained the same before and after DP: preserving the confidentiality of census responses. However, the handoff model encourages a richer understanding of a system’s function, including not only its “goals, purposes, or [...] values” but also “how it does what it does, as a designer or engineer might explain it” [91, p. 6, original emphasis]. The handoff lens ultimately reveals that DP, by shifting how the function of confidentiality preservation was enacted, expanded the function of the DAS in many value-laden ways. In particular, DP (1) created new opportunities for transparency between the Bureau and interested publics; (2) allowed for formal, quantifiable validation of the privacy and confidentiality commitments actualized by the Bureau; and (3) replaced one form of expertise with another, precipitating the rise of theoretical computer science professionals and the decline of statisticians in the design, operation, and evaluation of disclosure avoidance. We explore these shifting functions in greater detail throughout Section 5.

4.5 Transparency artifacts at the boundaries: Spaghetti at the wall

To seize the benefits of the transparency that DP allows, and to enable stakeholder participation in the DAS design, the Bureau created many new artifacts to facilitate public understanding of, and input on, the DAS. First, the Bureau released an unprecedented degree of technical detail, sharing the DAS source code via GitHub [54]. Realizing that code alone was not sufficient to provide transparency given stakeholder capacity, the Bureau also released demonstration data that would allow demographers and social scientists who use census products to interact with the new system in a way that was familiar to them. The Bureau ultimately released six sets of demonstration data between 2018 and 2021; these public datasets were the result of applying the Bureau’s 2020 DP algorithm to data from the 2010 Census.


The Bureau also engaged external experts in formal and informal co-design processes. Specifically, the Bureau solicited written comments from data users via the Federal Register [120]; encouraged user feedback after publishing each round of demonstration data; hosted and participated in workshops devoted to discussing the use of DP [45, 93]; and held multiple consultations with tribal leadership [121]. Reaching beyond those experts, the Bureau provided an impressive array of educational resources about the new DAS, designed for more diverse stakeholders and the interested public. Blogs narrating the Bureau’s plans and progress as it worked to implement DP were authored by the Census’ chief scientist himself [5, 9, 10]. To build up stakeholders’ understanding of what DP is and why it is a worthwhile tool, the Bureau developed interactive Python Jupyter notebooks [36], webinars [127], handbooks [124], and videos [87] designed for a lay audience.


However, while the Bureau created many artifacts and processes to bolster the public’s understanding of and participation in the DAS design, it withheld one artifact that external DP experts needed to evaluate the DAS. Noisy measurement files are an interim data product that contains the census data after the application of DP, but before postprocessing removes negative or non-integer values. These files were not originally released by the Bureau but became an object of great interest. In 2021, a group of over 50 researchers, technologists, and city officials wrote to the Bureau requesting publication of the noisy measurement file, arguing that the release of this noisy data would expedite evaluations of the downstream effects of DP while still adhering to Title 13 privacy requirements [41]. Initially, the Bureau denied a FOIA request to release this data, citing concerns about confusing the public by revealing the existence of more than “one ‘true’ data set” [19, p. 15]; researchers were further frustrated [60]. Yet a year and a half later, in April 2023, the Bureau did release noisy measurements of 2010 demonstration data [126].
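To make the distinction between noisy measurements and postprocessed output concrete, the following is a minimal sketch and not the Bureau's algorithm. It assumes a textbook Laplace mechanism with sensitivity 1, a hypothetical list of block-level counts, and a deliberately naive postprocessing step (clip to zero, then round); the function names and the value of epsilon are illustrative only. The production DAS uses a considerably more elaborate mechanism and postprocessing procedure.

```python
import random


def laplace_noise(scale, rng):
    """Draw one Laplace(0, scale) sample as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_measurements(true_counts, epsilon, rng):
    """Add Laplace noise of scale 1/epsilon to each count.

    Assumption: each person contributes to exactly one count,
    so the sensitivity of the query is taken to be 1.
    """
    scale = 1.0 / epsilon
    return [count + laplace_noise(scale, rng) for count in true_counts]


def postprocess(noisy):
    """Naive postprocessing (an assumption): clip negatives to 0, round to integers."""
    return [max(0, round(value)) for value in noisy]


rng = random.Random(0)                 # fixed seed so the sketch is repeatable
true_counts = [0, 1, 3, 12, 250]       # hypothetical block-level counts
noisy = noisy_measurements(true_counts, epsilon=0.5, rng=rng)
published = postprocess(noisy)

print("noisy measurements:", noisy)    # may contain negative, fractional values
print("postprocessed:     ", published)  # non-negative integers only
```

In this toy setting the noisy values are unbiased around the true counts, whereas clipping at zero pushes small counts upward on average. That difference is one reason evaluators might prefer to work from the intermediate file rather than only the postprocessed tables.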


While the transparency artifacts listed in this section originated in the Bureau’s interest in helping engage, educate, and involve diverse stakeholder groups, they ultimately ended up being sites of active negotiation within the handoff to DP. This reflects an implicit understanding within the Bureau of the need for boundary objects [116] to bridge the many stakeholder groups it hoped to engage in the design of the DAS. We do not intend to interrogate the degree to which individual artifacts were or were not successful in acting as boundary objects. Rather, we present and emphasize the volume and variety of artifacts that the Bureau developed to facilitate stakeholder participation. Building upon this foundation, in Section 5 we will evaluate the artifacts’ overall effectiveness in doing boundary work [77], that is, allowing a variety of stakeholders to participate in and negotiate the handoff from SDL to DP.


Authors:

(1) AMINA A. ABDU, University of Michigan, USA;

(2) LAUREN M. CHAMBERS, University of California, Berkeley, USA;

(3) DEIRDRE K. MULLIGAN, University of California, Berkeley, USA;

(4) ABIGAIL Z. JACOBS, University of Michigan, USA.


This paper is available on arXiv under a CC BY-NC-SA 4.0 DEED license.

[4] To be sure, there are still many fundamental choices that remain for designers of the DAS under DP. Researchers must draw block and tract boundaries, determine which data will be kept invariant, and decide which kinds of geographies will and will not be included in the hierarchy known as the ‘spine,’ to name a few [7, 94, 124]. Nevertheless, epsilon-DP greatly reduces the dimensionality of disclosure avoidance decisions left to Bureau experts.
