When Big Data Marketing Becomes Stalking

Data brokers cannot be trusted to regulate themselves

Many of us now expect our online activities to be recorded and analyzed, but we assume that the physical spaces we inhabit are different. The data-broker industry does not see it that way. To it, even the act of walking down the street is a legitimate data set to be captured, catalogued and exploited. This slippage between the digital and physical matters not only because of privacy concerns—it also raises serious questions about ethics and power.

The Wall Street Journal recently published an article about Turnstyle, a company that has placed hundreds of sensors throughout businesses in downtown Toronto to gather signals from smartphones as they search for open Wi-Fi networks. The signals are used to uniquely identify phones as they move from street to street, café to cinema, work to home. The owner of the phone need not connect to any Wi-Fi network to be tracked; the entire process occurs without the knowledge of most phone users. Turnstyle anonymizes the data and turns them into reports that it sells back to businesses to help them “understand the customer” and better tailor their offers.
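
To see why anonymizing such signals may protect less than it sounds, consider a minimal sketch in Python. It assumes, hypothetically, that a sensor network swaps each phone's hardware address for a salted hash. The salt, the addresses and the place names are all invented for illustration, and nothing here describes Turnstyle's actual system. Because the same phone always maps to the same pseudonym, its trail from place to place survives the anonymization step intact.

```python
import hashlib
from collections import defaultdict

# Hypothetical salt, invented for this illustration only.
SALT = b"example-salt"


def pseudonymize(mac: str) -> str:
    """Replace a device address with a truncated salted SHA-256 digest."""
    return hashlib.sha256(SALT + mac.lower().encode()).hexdigest()[:12]


# Invented (sensor location, device address) pairs, as a sensor
# network might log them when phones search for Wi-Fi.
sightings = [
    ("cafe", "AA:BB:CC:00:11:22"),
    ("cinema", "aa:bb:cc:00:11:22"),
    ("office", "AA:BB:CC:00:11:22"),
    ("cafe", "DE:AD:BE:EF:00:01"),
]


def build_trails(records):
    """Group sensor locations by pseudonym, in the order they were seen."""
    trails = defaultdict(list)
    for place, mac in records:
        trails[pseudonymize(mac)].append(place)
    return dict(trails)


trails = build_trails(sightings)
for pseudonym, places in trails.items():
    print(pseudonym, "->", " > ".join(places))
```

The hardware address never appears in the output, yet each pseudonym still carries one device's full sequence of locations, which is exactly the kind of distinguishing detail that can later be matched to a person.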

Prominent voices in the public and private sectors are currently promoting boundless data collection as a way of minimizing threats and maximizing business opportunities. Yet this trend may have unpleasant consequences. Mike Seay, an OfficeMax customer, recently received a letter from the company that had the words “Daughter Killed in Car Crash” following his name. He had not shared this information with OfficeMax. The company stated that it was an error caused by a “mailing list rented through a third-party provider.”



Clearly, this was a mistake, but it was a revealing one. Why was OfficeMax harvesting details about the death of someone's child in the first place? What limits, if any, will businesses set with our data if this was deemed fair game? OfficeMax has not explained why it bought the mailing list or how much personal data it contains, but we know that third-party data brokers sell all manner of information to businesses—including, as Pam Dixon, executive director of the World Privacy Forum, testified before the U.S. Senate last December, “police officers' home addresses, rape sufferers..., genetic disease sufferers,” as well as suspected alcoholics and cancer and HIV/AIDS patients.

In the absence of regulation, there have been some attempts to generate an industry code of practice for location-technology companies. One proposal would have companies de-identify personal data, limit the amount of time they are retained, and prevent them from being used for employment, health care or insurance purposes. But the code would only require opt-out consent—that is, giving your details to a central Web site to indicate that you do not want to be tracked—when the information is “not personal.”

The trouble is, almost everything is personal. “Any information that distinguishes one person from another can be used for re-identifying anonymous data,” wrote computer scientists Arvind Narayanan, now at Princeton University, and Vitaly Shmatikov of the University of Texas at Austin in a 2010 article in Communications of the ACM. This includes anonymous reviews of products, search queries, anonymized cell-phone data and commercial transactions. The opt-out-via-our-Web-site model also compels customers to volunteer yet more information to marketers. And it is not clear that self-regulation will ever be sufficient. Most industry models of privacy assume that individuals should act like businesses, trading their information for the best price in a frictionless market where everyone understands how the technology works and the possible ramifications of sharing their data. But these models do not reflect the reality of the deeply unequal situation we now face. Those who wield the tools of data tracking and analytics have far more power than those who do not.

A narrow focus on individual responsibility is not enough: the problem is systemic. We are now faced with large-scale experiments on city streets in which people are in a state of forced participation, without any real ability to negotiate the terms and often without the knowledge that their data are being collected.

This article was originally published with the title “Big Data Stalking” in Scientific American Magazine Vol. 310 No. 4, p. 14
doi:10.1038/scientificamerican0414-14