While criticizing social media companies for invading user privacy, some university researchers and nonprofit organizations are pushing for access to the same data collected by those companies—arguing that such information is crucial to understanding how to combat domestic extremism.
Researchers and activists made their case for more access to social media data at an Oct. 28 Senate Homeland Security Committee hearing on social media and domestic extremism. Their calls follow the release of the Facebook Files—a trove of internal records that reportedly show, among other things, the company’s failure to control the spread of radicalizing content.
The researchers say they need to see how users of Facebook and other social media platforms interact with content so they can better understand the phenomenon of domestic extremism.
“The key types of datasets that should be made available relate to ‘who’ viewed or engaged with what content, when and how,” Nathaniel Persily, co-director of the Stanford Cyber Policy Center, said at the Oct. 28 hearing. “In other words, to answer the most pressing questions relating to social media, we need data that can assess which types of people—though not individuals themselves—were seeing certain online content at certain times.”
In the wake of the Cambridge Analytica scandal—in which researchers scraped billions of Facebook data points to target users with political ads ahead of the 2016 elections—companies have been hesitant to provide access to their data. In August, for instance, Facebook blocked New York University researchers from scraping publicly accessible data from its API, saying “research cannot be the justification for compromising people’s privacy.”
At the hearing, the researchers insisted they could study user data without compromising privacy.
Persily proposed that the government act as the gatekeeper for access to the data. Under his plan, the Federal Trade Commission would be responsible for vetting researchers and research projects and for specifying the conditions under which research is conducted.
He also said strict rules would govern the use of social media data.
“Researchers may not take any data out of the research environment without a privacy review being conducted,” Persily said in his opening statement. “We need to make sure measures are in place that reassure the public that no individual’s data is of interest to the research project, just the aggregated findings derived from them.”
Persily further said the data should be “anonymized or pseudonymized”—meaning that personal information would be taken out of datasets to protect individual privacy.
However, privacy activists have expressed doubt that data anonymization and pseudonymization techniques truly protect individuals. Past studies have shown that just a few data points can be enough to re-identify individuals in anonymized datasets.
Witnesses at the hearing did concede that giving the government access to internal social media data would be a grave threat to civil liberties, as “government officials would see this research environment as a honey pot for intelligence and law enforcement activities,” Persily said.
Nevertheless, Anti-Defamation League Vice President David Sifry still proposed a government-funded nonprofit center to “track online extremist threat information in real-time and make referrals to social media companies and law enforcement agencies when appropriate.”
Committee members offered little pushback to the researchers’ proposals. Some Republican lawmakers, such as Sen. Ron Johnson (R-Wis.), criticized the witnesses and other members for concentrating only on right-wing misinformation, but none raised privacy concerns about universities having access to social media data.
The committee’s ranking Republican, Sen. Rob Portman (R-Ohio), expressed agreement with the researchers.
“We need to be able to look under the hood and figure out what the issues are to be able to regulate them properly,” Portman said. “If they want to ensure that Congress pursues evidence-based policy solutions, I think it’s incumbent upon the platforms to provide quality data.”