A group of European privacy experts has proposed a decentralized system for Bluetooth-based COVID-19 contacts tracing which they argue offers greater protection against abuse and misuse of people’s data than apps which pull data into centralized pots.
The protocol — which they’re calling Decentralized Privacy-Preserving Proximity Tracing (DP-PPT) — has been designed by around 25 academics from at least seven research institutions across Europe, including the Swiss Federal Institute of Technology, ETH Zurich and KU Leuven in Belgium.
They’ve published a white paper detailing their approach.
The key element is that the design entails local processing of contacts tracing and risk on the user’s device, based on devices generating and sharing ephemeral Bluetooth identifiers (referred to as EphIDs in the paper).
A backend server is used to push data out to devices — i.e. when an infected person is diagnosed with COVID-19 a health authority would sanction the upload from the person’s device of a compact representation of EphIDs over the infectious period which would be sent to other devices so they could locally compute whether there is a risk and notify the user accordingly.
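The flow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the construction specified in the white paper: the hash-chained daily key, the HMAC-based derivation, the 16-byte identifier length, the 96 identifiers per day and every function name here are hypothetical choices made for the sketch. The "compact representation" is modelled as a single secret key from which a whole infectious period of EphIDs can be regenerated.

```python
import hashlib
import hmac
import os
from typing import List, Set

EPHIDS_PER_DAY = 96  # assumption: one ephemeral ID per 15 minutes


def next_day_key(key: bytes) -> bytes:
    """Derive the next day's secret key by hashing the current one (one-way)."""
    return hashlib.sha256(key).digest()


def ephids_for_day(key: bytes, count: int = EPHIDS_PER_DAY) -> List[bytes]:
    """Derive the day's ephemeral IDs from the day key (illustrative derivation)."""
    return [
        hmac.new(key, i.to_bytes(4, "big"), hashlib.sha256).digest()[:16]
        for i in range(count)
    ]


def ephids_for_period(first_key: bytes, days: int) -> Set[bytes]:
    """Regenerate every ephemeral ID for `days` days, starting from one key."""
    ids: Set[bytes] = set()
    key = first_key
    for _ in range(days):
        ids.update(ephids_for_day(key))
        key = next_day_key(key)
    return ids


def is_at_risk(observed: Set[bytes], published_key: bytes, days: int) -> bool:
    """Runs on the user's own device: compare locally observed IDs with the
    IDs regenerated from a key that the backend server pushed out."""
    return not observed.isdisjoint(ephids_for_period(published_key, days))


# Usage: Alice is diagnosed and, with health authority approval, uploads her key.
alice_key = os.urandom(32)
bob_observed = {ephids_for_day(alice_key)[10]}  # Bob's phone heard Alice's broadcast
print(is_at_risk(bob_observed, alice_key, days=14))  # True
```

In this sketch the server only relays the uploaded key. The list of identifiers each phone has observed never leaves the device, which is the property the decentralized design is built around.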
Under this design there’s no requirement for pseudonymized IDs to be centralized, where the pooled data would pose a privacy risk. That, in turn, should make it easier to persuade EU citizens to trust the system — and voluntarily download a contacts tracing app using this protocol — given it’s architected to resist being repurposed for individual-level state surveillance.
The group does discuss some other potential threats — such as posed by tech savvy users who could eavesdrop on data exchanged locally, and decompile/recompile the app to modify elements — but the overarching contention is such risks are small and more manageable vs creating centralized pots of data that risk paving the way for ‘surveillance creep’, i.e. if states use a public health crisis as an opportunity to establish and retain citizen-level tracking infrastructure.
The DP-PPT has been designed with its own purpose-limited dismantling in mind, once the public health crisis is over.
“Our protocol is demonstrative of the fact that privacy-preserving approaches to proximity tracing are possible, and that countries or organisations do not need to accept methods that support risk and misuse,” writes professor Carmela Troncoso, of EPFL. “Where the law requires strict necessity and proportionality, and societal support is behind proximity tracing, this decentralized design provides an abuse-resistant way to carry it out.”
In recent weeks governments all over Europe have been leaning on data controllers to hand over user data for a variety of coronavirus tracking purposes. Apps are also being scrambled to market by the private sector — including symptom reporting apps that claim to help researchers fight the disease. While tech giants spy PR opportunities to repackage persistent tracking of Internet users for a claimed public healthcare cause, however vague the actual utility.
The next big coronavirus tech push looks likely to be contacts-tracing apps: Aka apps that use proximity-tracking Bluetooth technology to map contacts between infected individuals and others.
This is because without some form of contacts tracing there’s a risk that hard-won gains to reduce the rate of infections by curtailing people’s movements will be reversed, i.e. once economic and social activity is opened up again. Although whether contacts tracing apps can be as effective at helping to contain COVID-19 as policymakers and technologists hope remains an open question.
What’s crystal clear right now, though, is that without a thoughtfully designed protocol that bakes in privacy by design contacts-tracing apps present a real risk to privacy — and, where they exist, to hard-won human rights.
Torching rights in the name of combating COVID-19 is neither good nor necessary: that is the message from the group backing the DP-PPT protocol.
“One of the major concerns around centralisation is that the system can be expanded, that states can reconstruct a social graph of who-has-been-close-to-who, and may then expand profiling and other provisions on that basis. The data can be co-opted and used by law enforcement and intelligence for non-public health purposes,” explains University College London’s Dr Michael Veale, another backer of the decentralized design.
“While some countries may be able to put in place effective legal safeguards against this, by setting up a centralised protocol in Europe, neighbouring countries become forced to interoperate with it, and use centralised rather than decentralised systems too. The inverse is true: A decentralised system puts hard technical limits on surveillance abuses from COVID-19 bluetooth tracking across the world, by ensuring other countries use privacy-protective approaches.”
“It is also simply not necessary,” he adds of centralizing proximity data. “Data protection by design obliges the minimisation of data to that which is necessary for the purpose. Collecting and centralising data is simply not technically necessary for Bluetooth contact tracing.”
Last week we reported on another EU effort — by a different coalition of technologists and scientists, led by Germany’s Fraunhofer Heinrich Hertz Institute for telecoms (HHI) — which has said it’s working on a “privacy preserving” standard for COVID-19 contacts tracing which they’ve dubbed: Pan-European Privacy-Preserving Proximity Tracing (PEPP-PT).
At the time it wasn’t clear whether or not the approach was locked to a centralized model of handling the pseudonymized IDs. Speaking to TechCrunch today, Hans-Christian Boos, one of the PEPP-PT project’s co-initiators, confirmed the standardization effort will support both centralized and decentralized approaches to handling contacts tracing.
The effort had faced criticism from some in the EU privacy community for appearing to favor a centralized rather than decentralized approach — thereby, its critics contend, undermining the core claim to preserve user privacy. But, per Boos, it will in fact support both approaches — in a bid to maximize uptake around the world.
He also said it will be interoperable regardless of whether data is centralized or decentralized. (In the centralized scenario, he said the hope is that the not-for-profit that’s being set up to oversee PEPP-PT will be able to manage the centralized servers itself, pending proper financing — a step intended to further shrink the risk of data centralization in regions that lack a human rights framework, for example.)
“We will have both options — centralized and decentralized,” Boos told TechCrunch. “We will offer both solutions, depending on who wants to use what, and we’ll make them operable. But I’m telling you that both solutions have their merits. I know that in the crypto community there is a lot of people who want decentralization — and I can tell you that in the health community there’s a lot of people who hate decentralization because they’re afraid that too many people have information about infected people.”
“In a decentralized system you have the simple problem that you would broadcast the anonymous IDs of infected people to everybody — so some countries’ health legislation will absolutely forbid that. Even though you have a cryptographic method, you’re broadcasting the IDs to all over the place — that’s the only way your local phone can find out have I been in contact or no,” Boos went on.
“That’s the drawback of a decentralized solution. Other than that it’s a very good thing. On a centralized solution you have the drawback that there is a single operator, whom you can choose to trust or not to trust — has access to anonymized IDs, just the same as if they were broadcast. So the question is you can have one party with access to anonymized IDs or do you have everybody with access to anonymized IDs because in the end you’re broadcasting them over the network [because] it’s spoofable.”
“If your assumption is that someone could hack the centralized service… then you have to also assume that someone could hack a router, which stuff goes through,” he added. “Same problem.
“That’s why we offer both solutions. We’re not religious. Both solutions offer good privacy. Your question is who would you trust more and who would you un-trust more? Would you trust more a lot of users that you broadcast something to or would you trust more someone who operates a server? Or would you trust more that someone can hack a router or that someone can hack the server? Both is possible, right. Both of these options are totally valid options — and it’s a religious discussion between crypto people… but we have to balance it between what crypto wants and what healthcare wants. And because we can’t make that decision we will end up offering both solutions.
“I think there has to be choice because if we are trying to build an international standard we should try and not be part of a religious war.”
Boos also said the project aims to conduct research into the respective protocols (centralized vs decentralized) to compare and conduct risk assessments based on access to the respective data.
“From a data protection point of view that data is completely anonymized because there’s no attachment to location, there’s no attachment to time, there’s no attachment to phone number, MAC address, SIM number, any of those. The only thing you know there is a contact — a relevant contact between two anonymous IDs. That’s the only thing you have,” he said. “The question that we gave the computer scientists and the hackers is if we give you this list — or if we give you this graph, what could you derive from it? In the graph they are just numbers connected to each other, the question is how can you derive anything from it? They are trying — let’s see what’s coming out.”
“There are lots of people trying to be right about this discussion. It’s not about being right; it’s about doing the right thing — and we will supply, from the initiative, whatever good options there are. And if each of them have drawbacks we will make those drawbacks public and we will try to get as much confirmation and research in on these as we can. And we will put this out so people can make their choices which type of the system they want in their geography,” he added.
“If it turns out that one is doable and one is completely not doable then we will drop one — but so far both look doable, in terms of ‘privacy preserving’, so we will offer both. If one turns out to be not doable because it’s hackable or you could derive meta-information at an unacceptable risk then we would drop it completely and stop offering the option.”
On the interoperability point Boos described it as “a challenge” which he said boils down to how the systems calculate their respective IDs — but he emphasized it’s being worked on and is an essential piece.
“Without that the whole thing doesn’t make sense,” he told us. “It’s a challenge why the option isn’t out yet but we’re solving that challenge and it’ll definitely work… There’s multiple ideas how to make that work.”
“If every country does this by itself we won’t have open borders again,” he added. “And if in a country there’s multiple applications that don’t share data then we won’t have a large enough set of people participating who can actually make infection tracing possible — and if there’s not a single place where we can have discussions about what’s the right thing to do about privacy well then probably everybody will do something else and half of them will use phone numbers and location information.”
The PEPP-PT coalition has not yet published its protocol or any code. Which means external experts wanting to chip in with informed feedback on specific design choices related to the proposed standard haven’t been able to get their hands on the necessary data to carry out a review.
Boos said they intend to open source the code this week, under a Mozilla licence. He also said the project is willing to take on “any good suggestions” as contributions.
“Currently only beta members have access to it because those have committed to us that they will update to the newest version,” he said. “We want to make sure that when we publish the first release of code it should have gone through data privacy validation and security validation — so we are as sure as we can be that there’s no major change that someone on an open source system might skip.”
The lack of transparency around the protocol had caused concern among privacy experts — and led to calls for developers to withhold support pending more detail. And even to speculation that European governments may be intervening to push the effort towards a centralized model — and away from core EU principles of data protection by design and default.
I read this as saying that the PEPP-PT enables different configurations, depending on what the ‘user’ (government, platform) prefers. That is not DPbDD. Also I got no answer to the question who are the partners, what NDAs are involved and what downstream data-flows are enabled.
— Mireille Hildebrandt (@mireillemoret) April 6, 2020
As it stands, the EU’s long-standing data protection law bakes in principles such as data minimization. Transparency is another core requirement. And just last week the bloc’s lead privacy regulator, the EDPS, told us it’s monitoring developments around COVID-19 contacts tracing apps.
“The EDPS supports the development of technology and digital applications for the fight against the coronavirus pandemic and is monitoring these developments closely in cooperation with other national Data Protection Supervisory Authorities. It is firmly of the view that the GDPR is not an obstacle for the processing of personal data which is considered necessary by the Health Authorities to fight the pandemic,” a spokesman told us.
“All technology developers currently working on effective measures in the fight against the coronavirus pandemic should ensure data protection from the start, e.g. by applying data protection by design principles. The EDPS and the data protection community stand ready to assist technology developers in this collective endeavour. Guidance from data protection authorities is available here: EDPB Guidelines 4/2019 on Article 25 Data Protection by Design and by Default; and EDPS Preliminary Opinion on Privacy by Design.”
We also understand the European Commission is paying attention to the sudden crop of coronavirus apps and tools — with effectiveness and compliance with European data standards on its radar.
However, at the same time, the Commission has been pushing a big data agenda as part of a reboot of the bloc’s industrial strategy that puts digitization, data and AI at the core. And just today Euractiv reported on leaked documents from the EU Council which say EU Member States and the Commission should “thoroughly analyse the experiences gained from the COVID-19 pandemic” in order to inform future policies across the entire spectrum of the digital domain.
So even in the EU there is a high level appetite for data that risks intersecting with the coronavirus crisis to drive developments in a direction that might undermine individual privacy rights. Hence the fierce push back from certain pro-privacy quarters for contacts tracing to be decentralized — to guard against any state data grabs.
For his part Boos argues that what counts as best practice ‘data minimization’ boils down to a point of view on who you trust more. “You could make an argument [for] both [decentralized and centralized approaches] that they’re data minimizing — just because there’s data minimization at one point doesn’t mean you have data minimization overall in a decentralized system,” he suggests.
“It’s a question who do you trust? It’s who would you trust more — that’s the real question. I see the critical point of data as not the list of anonymized contacts — the critical data is the confirmed infected.
“A lot of this is an old, religious discussion between centralization and decentralization,” he added. “Generally IT oscillates between those tools; total distribution, total centralization… Because none of those is a perfect solution. But here in this case I think both offer valid security options, and then they have both different implications on what you’re willing to do or not willing to do with medical data. And then you’ve got to make a decision.
“What we have to do is we’ve got to make sure that the options are available. And we’ve got to make sure there’s sound research, not just conjecture, in heavyweight discussions: How does what work, how do they compare, and what are the risks?”
In terms of who’s involved in PEPP-PT discussions, beyond direct project participants, Boos said governments and health ministries are involved for the practical reason that they “have to include this in their health processes”. “A lot of countries now create their official tracing apps and of course those should be connected to the PEPP-PT,” he said.
“We also talk to the people in the health systems — whatever is the health system in the respective countries — because this needs to in the end interface with the health system, it needs to interface with testing… it should interface with infectious disease laws so people could get in touch with the local CDCs without revealing their privacy to us or their contact information to us, so that’s the conversation we’re also having.”
Developers with early (beta) access are kicking the tyres of the system already. Asked when the first apps making use of PEPP-PT technologies might be in general circulation Boos suggested it could be as soon as a couple of weeks.
“Most of them just have to put this into their tracing layer and we’ve already given them enough information so that they know how they can connect this to their health processes. I don’t think this will take long,” he said, noting the project is also providing a tracing reference app to help countries that haven’t got developer resource on tap.
“For user engagement you’ll have to do more than just tracing — you’ll have to include, for example, the information from the CDC… but we will offer the skeletal implementation of an app to make starting this as a project [easier],” he said.
“If all the people that have emailed us since last week put it in their apps [we’ll get widespread uptake],” Boos added. “Let’s say 50% do I think we get a very good start. I would say that the influx from countries and I would say companies especially who want their workforce back — there’s a high pressure especially to go on a system that allows international exchange and interoperability.”
On the wider point of whether contacts tracing apps are a useful tool to help control the spread of this novel coronavirus — which has shown itself to be highly infectious, more so than flu, for example — Boos said: “I don’t think there’s much argument that isolating infection is important, the problem with this disease is there’s zero symptoms while you’re already contagious. Which means that you can’t just go and measure the temperature of people and be fine. You actually need that look into the past. And I don’t think that can be done accurately without digital help.
“So if the theory that you need to isolate infection chains is true at all, which many diseases have shown that it is — but each disease is different, so there’s no 100% guarantee, but all the data speaks for it — then that is definitely something that we need to do… The argument [boils down to] if we have so many infected as we currently have, does this make sense — do we not end up very quickly, because the world is so interconnected, with the same type of lockdown mechanism?
“This is why it only makes sense to come out with an app like this when you have broken these R0 values [i.e. how many other people one infected person can infect] — once you’ve got it under 1 and got the number of cases in your country down to a good level. And I think that in the language of an infectious disease person this means going back to the approach of containing the disease, rather than mitigating the disease — what we’re doing now.”
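The significance of the "under 1" threshold Boos mentions can be seen with some simple arithmetic. The Python sketch below assumes a deliberately simplified model, in which every case infects the same average number of people and nothing else changes between generations; the function name and the figures are illustrative, not drawn from the interview or from any epidemiological dataset.

```python
from typing import List


def expected_new_cases(initial_cases: float, r: float, generations: int) -> List[float]:
    """Expected new cases per generation if each case infects r others on average.

    Simplified geometric model: generation g has initial_cases * r**g new cases.
    """
    return [initial_cases * r ** g for g in range(generations + 1)]


growing = expected_new_cases(100, 1.5, 5)    # reproduction number above 1
shrinking = expected_new_cases(100, 0.8, 5)  # reproduction number below 1

print(f"{growing[-1]:.0f}")    # 759
print(f"{shrinking[-1]:.0f}")  # 33
```

Above 1, each generation of infections is larger than the last, so the number of contacts to trace keeps growing. Below 1, each generation is smaller, which is the situation in which tracing individual infection chains becomes manageable.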
“The approach of contact chain evaluation allows you to put better priorities on testing — but currently people don’t have the real priority question, they have a resource question on testing,” he added. “Testing and tracing are independent of each other. You need both; because if you’re tracing contacts and you can’t get tested what’s that good for? So yes you definitely [also] need the testing infrastructure for sure.”
This article was updated with a correction — we originally stated KU Leuven is in the Netherlands; in fact it’s in Belgium.