‘The data hasn’t gone away’: How Facebook opened Pandora’s box of…

Years ago, Facebook made it possible for companies like Cambridge Analytica to access information like users’ full names, birthdays, religious affiliations, political views and work histories, whether a Facebook user had directly consented to share it with an outside company or not.

It’s unclear how many of those categories are included in Cambridge Analytica’s ill-gotten trove of Facebook user data on more than 50 million people. Facebook did not respond to a request for comment before press time.

But documents detailing the developer tools that were used to collect the data Cambridge Analytica received make clear how easily a company could have collected such personal information through Facebook. And even as Facebook moved to lock down outside access to users’ information, the data it had previously made available likely still exists and has become even more valuable.

“There are people still sitting on most of this data,” said Jonathan Albright, research director at the Tow Center for Digital Journalism. “The data hasn’t gone away.”

How apps gained access to Facebook’s user data
In April 2010, Facebook CEO Mark Zuckerberg unveiled Facebook’s Graph API, a tool for developers to receive data from Facebook to incorporate into their own apps and sites. For example, Pandora could ask people to sign in using their Facebook accounts in order to receive music recommendations based on their friends’ musical interests. But in opening up its user data to the likes of Pandora, Facebook was opening a Pandora’s box of people’s personally identifiable information.

For roughly five years, anyone with a little coding know-how could operate an app or site to collect data from people’s Facebook profiles and their friends’ profiles. So long as a person signed in to the app or site with their Facebook account, that app or site would receive personal information by default, including that person’s full name and gender, plus the full name and gender of each of their Facebook friends. The app or site could also ask for information like the person’s email address and relationship details as well as their friends’ birthdays and political views. Only if a person agreed to provide access to that information would they be able to log in and use the app or site.

This is akin to someone showing their driver’s license to get into a bar and that bar receiving a list of names and genders for every one of that person’s friends. The bar could ask for more information, like when each friend was born, where they work, their political views and their hometown. A person could decline to share that information, but then they wouldn’t be allowed in the bar.
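
The split between what an app received by default and what it could request can be sketched as a toy model. This is a simplified illustration and not Facebook’s Graph API: the `Profile` class, the `login` function and the `user_`/`friends_` permission labels are hypothetical names invented for the example.

```python
from dataclasses import dataclass, field

# Fields the toy model hands over by default, per the article: name and gender.
DEFAULT_FIELDS = ("name", "gender")


@dataclass
class Profile:
    name: str
    gender: str
    extras: dict = field(default_factory=dict)   # e.g. {"birthday": "1985-03-02"}
    friends: list = field(default_factory=list)  # other Profile objects


def _view(profile, extra_fields):
    """Default fields plus any requested extras the profile actually has."""
    data = {f: getattr(profile, f) for f in DEFAULT_FIELDS}
    for f in extra_fields:
        if f in profile.extras:
            data[f] = profile.extras[f]
    return data


def login(user, requested=(), user_agrees=True):
    """Return what a hypothetical app receives when `user` signs in.

    `requested` holds made-up permission labels such as "user_email" or
    "friends_birthday". Declining a request means no login at all.
    """
    if requested and not user_agrees:
        return None
    user_extras = [p[len("user_"):] for p in requested if p.startswith("user_")]
    friend_extras = [p[len("friends_"):] for p in requested if p.startswith("friends_")]
    return {
        "user": _view(user, user_extras),
        "friends": [_view(f, friend_extras) for f in user.friends],
    }


bob = Profile("Bob", "male", extras={"birthday": "1985-03-02"})
alice = Profile("Alice", "female", extras={"email": "alice@example.com"}, friends=[bob])

basic = login(alice)
extended = login(alice, requested=("user_email", "friends_birthday"))
declined = login(alice, requested=("friends_birthday",), user_agrees=False)
```

In this sketch Bob never signs in to anything, yet `extended["friends"]` contains his birthday because Alice agreed on his behalf. When Alice declines, `login` returns `None`, the equivalent of being turned away at the door.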

The data Facebook made available
The images below list all the Facebook information, beyond public profile information like name and gender, that a developer could request from the person signing into their account. The first image is from a video uploaded to the Facebook Developers channel on YouTube in June 2013. The second image is from a book published by O’Reilly Media in October 2013 titled “Mining the Social Web: Data Mining Facebook, Twitter, LinkedIn, Google+, Github, and More.”

Source: YouTube/Facebook Developers
Source: Safari Books Online

In many ways, the data that Facebook allowed developers to access is not so different from the data that companies like credit-reporting firms collect and make available for ad targeting. Experian’s marketing division, for example, provides brands with audience segments that group people by political views, age range and education level. But a major difference is the data from those companies is aggregated and anonymized before brands can access it; the data that Facebook made available through its Graph API was not.

“Having direct individual-level [personally identifiable information] is unlike anything we’d even want access to within the agency,” said one agency executive.
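
The difference between aggregated segments and individual-level records can be shown with a small sketch. The records, field names and the minimum group size of 2 are invented for illustration, and this does not describe how Experian or any other data company actually builds its segments.

```python
from collections import Counter

# Made-up individual-level records of the kind the article describes.
records = [
    {"user_id": "1001", "name": "A. Smith", "politics": "liberal", "age_range": "25-34"},
    {"user_id": "1002", "name": "B. Jones", "politics": "liberal", "age_range": "25-34"},
    {"user_id": "1003", "name": "C. Lee", "politics": "conservative", "age_range": "35-44"},
]


def aggregate_segments(rows, keys=("politics", "age_range"), min_size=2):
    """Collapse rows into segment counts, dropping names and IDs.

    Segments smaller than `min_size` are suppressed so that a group of one
    cannot be traced back to a single person.
    """
    counts = Counter(tuple(row[k] for k in keys) for row in rows)
    return {segment: n for segment, n in counts.items() if n >= min_size}


segments = aggregate_segments(records)
```

An advertiser given `segments` learns that a group exists and how large it is. An app given `records` learns who is in it.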

By June 2013, getting information about people’s friends had become so simple that the limitation Facebook cited was that developers could get information about a person’s friends but not about those friends’ friends.

“The amount of data we can access about the current user’s friends is very limited. For example, we can’t get the friends of these friends,” said Facebook product manager Simon Cross in the video uploaded to the company’s YouTube channel for developers in June 2013. The video demonstrates exactly how developers could use Facebook’s Graph API to pull information from the Facebook profiles of not only the person who signed into the developer’s app or site using their Facebook account but also that person’s friends.

Less than a year after that video was uploaded, and three years after settling with the Federal Trade Commission over a failure to protect users’ privacy, Facebook decided to limit the amount of data that could be accessed about a user’s friends. The change was announced in April 2014 and took effect a year later: starting April 30, 2015, Facebook no longer allowed developers to receive the list of a person’s friends by default when that person signed into their site or app using Facebook. Even if the person agreed to give the app or site permission to access their friends list, the app would only be able to collect public profile information, like a friend’s name and gender.

Data in the wild
By that time, it was too late. The data was already out in the world, and Facebook relied largely on an honor system for developers to abide by its policy against sharing the data they had collected with others who had not been given permission to access it. That honor system has holes, as the Cambridge Analytica scandal proves. As reported by The Intercept, The Guardian and The New York Times, a researcher named Aleksandr Kogan used the original, leaky version of Facebook’s Graph API to collect Facebook information from people and their friends and then was paid to share that data with Cambridge Analytica, which used it for President Donald Trump’s 2016 election campaign.

Just as alarming as how easily companies could access this data is its lasting value. Not only was the data not anonymized, but it included people’s actual Facebook user IDs, which are Facebook’s version of Social Security numbers.

“Not having a substitute ID for what amounts to a Social Security number on Facebook is a problem,” said Albright.

It’s problematic because those IDs are more broadly applicable than the app-specific IDs that Facebook now provides developers in lieu of the actual user ID, another switch that happened in April 2014. For example, Cross displayed his Facebook user ID in the 2013 YouTube video, and that ID is still connected to his account. If it were an app-specific ID, then only the app he had consented to provide it to could use it. But because it’s the actual ID, anyone with the ID can use it to do things like target Cross or people like Cross with ads using Facebook’s Custom Audiences and Lookalike Audiences ad-targeting tools. For a time, they could have even used it to acquire Cross’s phone number.
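
Why an app-specific ID is less dangerous than a global one can be sketched with a keyed hash. This is a hypothetical scheme chosen for illustration, not a description of how Facebook derives its app-specific IDs; `PLATFORM_SECRET`, `app_scoped_id` and the sample IDs are all assumptions.

```python
import hashlib
import hmac

# Hypothetical secret held only by the platform; never shared with apps.
PLATFORM_SECRET = b"example-secret-for-illustration-only"


def app_scoped_id(global_user_id: str, app_id: str) -> str:
    """Derive an ID that is stable for one (user, app) pair and differs across apps."""
    message = f"{app_id}:{global_user_id}".encode("utf-8")
    digest = hmac.new(PLATFORM_SECRET, message, hashlib.sha256).hexdigest()
    return digest[:16]  # shortened for readability


# The same made-up global user ID, as seen by two different apps.
id_in_app_a = app_scoped_id("1234567890", "app_a")
id_in_app_b = app_scoped_id("1234567890", "app_b")
```

Two apps holding `id_in_app_a` and `id_in_app_b` cannot tell from the values alone that both refer to the same person, and neither value means anything to a third party. A global ID offers no such separation: whoever holds it can match it against any other data set keyed on the same ID.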

For nearly two years, advertisers could use a Facebook-provided tool to compare how different lists of people’s information, such as their email addresses or phone numbers, overlapped and determine the phone numbers of people who had not previously provided them to an advertiser, according to a Wired report published in January. Facebook fixed the issue in December 2017 and said it hadn’t found anyone using it to extract people’s information. But for a time, the company also wasn’t aware how Cambridge Analytica accessed its data, and Facebook has not yet been able to ascertain how much of that data Cambridge Analytica still holds.

Data wrangling
Facebook is beginning to address this legacy data issue. The company announced March 21 that it plans to scrutinize apps that had accessed large amounts of user data before the Graph API change announced in 2014, audit those suspected of misusing that information and ban the ones found to have misused it. Facebook will also notify people whose information was collected and misused by apps such as the one at the center of the Cambridge Analytica controversy, and it will disable data access for apps that a person hasn’t used within the last three months. In addition, apps will only be allowed to request people’s full names, profile photos and email addresses when people log in using their Facebook accounts, unless the apps submit to a review and Facebook approves them to request more information.

However, turning off access to this data is not the same as undoing that access. For example, the data that Kogan’s app had acquired was handed off to Cambridge Analytica. Facebook’s audits may be able to trace these exchanges. But they may not.

Facebook could disarm this data, to an extent, by retiring the Facebook user IDs that outside apps have collected. It could refresh all of its user IDs, or at least those of people whose information was collected and misused. And, as people review the apps that accessed their information, Facebook could give individuals the option to request that their Facebook user IDs be reset, even if none of those apps are found to have misused their data. A Facebook spokesperson did not address by press time whether the company will enable these user ID resets.