Anonymizing mobile data no longer enough to ensure user privacy

One of the researchers said that it was time to "reinvent what [mobile data] anonymization means".
1 February 2022

Companies are legally bound to make this mobile data anonymous, but the study says this is no longer enough to keep identities private. (Photo by JOEL SAGET / AFP)

Privacy measures that are meant to preserve the anonymity of smartphone users are no longer suitable for the digital age, a study suggested last week. Vast quantities of mobile data are scooped up from smartphone apps by firms looking to develop products, conduct research, or target consumers with adverts.

In Europe and many other jurisdictions, companies are legally bound to make this mobile data anonymous, often doing so by removing telltale details like names or phone numbers. But the study in the Nature Communications journal says this is no longer enough to keep identities private.

The researchers say people can now be identified with just a few details of how they communicate with an app like WhatsApp. One of the paper’s authors, Yves-Alexandre de Montjoye of Imperial College London, told AFP it was time to “reinvent what anonymization means”.

‘Rich’ data

De Montjoye’s team took anonymized data from more than 40,000 mobile phone users, most of it “interaction” data from messaging apps and similar services. They then “attacked” the data, searching for patterns in those interactions, a technique that malicious actors could also employ.

With just a target’s direct contacts included in the dataset, they found they could identify that person 15% of the time. When interactions between those direct contacts were also included, they could identify over half (52%) of the people involved.
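As an illustration only: the study’s actual attack uses far richer graph data and statistical modelling, but the basic idea of matching a known interaction pattern against a pseudonymized dataset can be sketched in a few lines. All names and counts below are invented for demonstration.

```python
# Toy sketch of a pattern-matching re-identification attack.
# A pseudonymized dataset records how often each (pseudonymous) user
# messages each contact; an attacker with outside knowledge of one
# person's messaging habits looks for the closest-matching profile.

from collections import Counter

# Pseudonym -> messages-per-contact profile (all data invented)
anonymized = {
    "u1": Counter({"a": 12, "b": 3, "c": 7}),
    "u2": Counter({"a": 2, "b": 9}),
    "u3": Counter({"a": 12, "b": 3, "c": 6}),
}

# Auxiliary knowledge: roughly how often the target messages
# each of three known contacts.
target_profile = Counter({"a": 12, "b": 3, "c": 7})

def match_score(profile, candidate):
    """L1 distance between two interaction profiles (lower = closer)."""
    keys = set(profile) | set(candidate)
    return sum(abs(profile[k] - candidate[k]) for k in keys)

# The pseudonym whose behaviour best matches the target
best = min(anonymized, key=lambda u: match_score(target_profile, anonymized[u]))
print(best)
```

Even though no names or phone numbers appear anywhere in the dataset, the distinctive shape of a person’s interactions is enough to single them out.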

“Our results provide evidence that disconnected and even re-pseudonymized interaction data remain identifiable even across long periods of time,” wrote the researchers, who are based in the UK, Switzerland, and Italy.

“These results strongly suggest that current practices may not satisfy the anonymization standard set forth by (European regulators) in particular with regard to the linkability criteria.” De Montjoye stressed that the intention was not to criticize any individual company or legal regime.

Rather, he said their algorithm simply provided a more robust way of testing data we regard as anonymized. “This dataset is so rich that the traditional way we used to think about anonymization […] doesn’t really work anymore,” he said. “That doesn’t mean we need to give up on anonymization.”

He said one promising new method was to heavily restrict access to large datasets, allowing only simple question-and-answer interactions. That would remove the need to classify a dataset as “anonymized” or not.
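The article does not name a specific mechanism, but question-and-answer access of this kind is often built on differential privacy, in which analysts never see raw records and every answer carries calibrated noise. A minimal sketch under that assumption (the class name, dataset, and parameters below are all ours, not the researchers’):

```python
# Minimal sketch of query-only access to a sensitive dataset, in the
# spirit of differential privacy. Analysts may only ask aggregate
# questions; the raw records never leave the interface, and each
# answer is perturbed with Laplace noise.

import math
import random

class QueryInterface:
    def __init__(self, records, epsilon=1.0):
        self._records = records  # raw data stays private
        self._epsilon = epsilon  # smaller epsilon => more noise, more privacy

    def _laplace_noise(self, sensitivity=1.0):
        # Laplace(0, sensitivity/epsilon) via inverse-transform sampling
        b = sensitivity / self._epsilon
        u = random.random() - 0.5
        return -b * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

    def count(self, predicate):
        """Noisy answer to 'how many records satisfy this predicate?'."""
        true_count = sum(1 for r in self._records if predicate(r))
        return true_count + self._laplace_noise()

# Usage: the analyst asks a question but never sees individual records.
records = [{"app": "chat", "msgs": 40}, {"app": "chat", "msgs": 5},
           {"app": "mail", "msgs": 12}]
iface = QueryInterface(records, epsilon=1.0)
answer = iface.count(lambda r: r["app"] == "chat")
```

Because analysts only ever receive noisy aggregates, the question of whether the underlying dataset counts as “anonymized” no longer has to be settled up front.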