Open Data and Privacy – Most Definitely Not Mutually Exclusive
As governments open more data to the public, watchdog groups and think tanks are closely eyeing privacy protections to ensure that released data does not violate citizens’ privacy. That’s a good thing, because the focus on privacy has spawned innovations in the way data is protected and released. And as the 2020 Census rolls out, innovations in differential privacy by the Census Bureau are expected to offer other governments opportunities to learn new best practices. In short – there’s a lot happening in the privacy space in terms of data sharing.
New Practices in the 2020 US Census
“We’ve seen differential privacy on a limited basis, but seeing it roll out for the entirety of the 2020 Census will be really exciting,” says Kelsey Finch, senior counsel for the Future of Privacy Forum. “Lots of lessons to be learned in terms of how to operationalize these kinds of tools and how we communicate with the public about these kinds of tools.”
danah boyd (who keeps her name in lowercase), founder and president of Data & Society Research Institute and partner researcher at Microsoft Research, also says the 2020 Census will offer innovations.
“We should be celebrating the Census Bureau for recognizing that it must innovate in order to protect the confidentiality of the data it collects,” says boyd. “Their innovation is going to change the future of data production, dissemination, and use – not just for the Census Bureau, but for all groups invested in open data.”
What Exactly Is Differential Privacy?
Under differential privacy, data is released to the public, but in a way designed to keep individuals from being identified, typically by adding carefully calibrated statistical noise to the published figures. It’s become a nuanced process, since data might still be identifiable when overlaid against other datasets. Local governments are aware of this possibility, and are careful to imagine and work through possible data overlays. The 2020 Census is doing the same, and given the resources of the census, its innovations could be transformative.
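To make the idea concrete, here is a minimal sketch of one textbook differential privacy technique, the Laplace mechanism, applied to a simple count. It is an illustration only and is not the Census Bureau’s production system; the function names, the sample population figure, and the epsilon values are all assumptions chosen for the example. A smaller epsilon means more noise and stronger privacy, while a larger epsilon means less noise and more accurate figures.

```python
import random


def laplace_noise(scale, rng):
    """Draw one sample from a Laplace distribution centered at zero."""
    # The difference of two independent exponential draws, each with
    # mean `scale`, follows a Laplace(0, scale) distribution.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_count(true_count, epsilon, rng=None):
    """Return a count with Laplace noise calibrated to the privacy budget epsilon."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.Random()
    # Adding or removing one person changes a count by at most 1,
    # so the sensitivity of a counting query is 1.
    sensitivity = 1.0
    return true_count + laplace_noise(sensitivity / epsilon, rng)


if __name__ == "__main__":
    rng = random.Random(2020)  # fixed seed so the demo is repeatable
    block_population = 37      # hypothetical figure, not real census data
    for epsilon in (0.1, 1.0, 10.0):
        released = noisy_count(block_population, epsilon, rng)
        print(f"epsilon={epsilon}: released count = {released:.1f}")
```

Running the script prints three released counts for the same underlying number. The value released with the smallest epsilon will usually stray furthest from 37, which is the trade-off between privacy and accuracy that any agency adopting this kind of tool has to weigh.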
Innovation in Data Privacy Across the Board
But it’s not just the 2020 Census that is driving privacy innovations. Local governments across the country are developing risk assessment processes, tools, and standards to ensure their open data protects privacy. Some have established data privacy officers.
Finch says establishing guidelines and processes is critical, as is communicating policies to the public. She worked with the city of Seattle to do just that.
“They take privacy very seriously in that city,” Finch says of Seattle. “They recognize the need to develop real robust policy safeguards and to consider more holistically the potential impact of making data available…We tried to develop tools and standards for more transparent decision-making.”
Transparency with the public about data privacy policies and decision-making frameworks is crucial, experts say.
“(Governments) need to be really clear and transparent with the public about why and how particular data sets are being released,” adds Finch. “Make sure people understand that when they provide info to the government, 311 calls, or writing an email, filling out a form…make sure people know and understand if that info will later be available on a public portal.”
Finch adds: “There might be situations where data is so sensitive and so identifiable that it shouldn’t be released through open data programs. On the other hand, there might be information that has such compelling public interest, that notwithstanding privacy risks it should be put out there. For example, salaries, to look for inequities.”
Why Not Releasing Certain Data Can Also Carry a Risk
But some experts say not releasing data also carries risk.
“There are also risks involved in not opening data,” says Stefaan Verhulst, co-founder and chief research and development officer of the Governance Laboratory at New York University. “What are the opportunity costs to society if you don’t open the data?”
Verhulst stresses that governments need to examine privacy and data quality issues across the entire data chain – not just the release. They need to take an “end-to-end approach,” he says.
“It’s important for local governments to be aware that indeed risks exist at the moment you open the data, but also across the value chain,” he says.
“Data lineage” – or where the data originated – is important to understand. His team at the Governance Lab advocates for labeling of data – he calls it a kind of “nutritional” labeling – that discloses where the data originated and what its uses will be. He advocates for a new profession as well, that of data steward. The data steward would be charged with assessing data quality across the value chain and communicating with key constituencies, including the public.
“That should be the next stage of the open data movement,” Verhulst says.
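As a rough illustration of the “nutritional” labeling idea, a label could be a small structured record published alongside each dataset. The sketch below is hypothetical: the `DatasetLabel` class, its field names, and the sample values are assumptions made for this example, not a format proposed by the Governance Lab.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class DatasetLabel:
    """Hypothetical 'nutrition label' describing a published dataset."""
    name: str
    origin: str                                        # where the data originated
    intended_uses: list = field(default_factory=list)  # what the data will be used for
    steward: str = ""                                  # who answers for data quality

    def to_json(self):
        """Serialize the label so it can be posted next to the dataset."""
        return json.dumps(asdict(self), indent=2)


if __name__ == "__main__":
    # Sample values are invented for illustration.
    label = DatasetLabel(
        name="Service requests",
        origin="311 calls and online forms",
        intended_uses=["response-time reporting", "public dashboard"],
        steward="City data steward",
    )
    print(label.to_json())
```

Keeping the label machine-readable would let a portal display it to the public and let a data steward check it at each step of the value chain.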
Finch says she sees tremendous gains being made at the local level in terms of data privacy safeguards.
“More cities are passing privacy policies and ordinances that are professionalizing the protection of privacy and making it more systematic,” Finch says. “There is momentum growing.”
Finch adds that people need to see that open data and privacy “are not mutually exclusive.”
“Having real safeguards in place actually often helps the innovation of the data-driven work go faster,” she says, “because you’ve built processes and infrastructure.”