HYDERABAD, INDIA/SAN FRANCISCO: Over the past year, a team of as many as 260 contract workers in Hyderabad, India has ploughed through millions of Facebook Inc photos, status updates and other content posted since 2014.
The workers categorise items according to five “dimensions,” as Facebook calls them.
These include the subject of the post – is it food, for example, or a selfie or an animal? What is the occasion – an everyday activity or major life event? And what is the author’s intention – to plan an event, to inspire, to make a joke?
The work is aimed at understanding how the types of things users post on its services are changing, Facebook said. That can help the company develop new features, potentially increasing usage and ad revenue.
Details of the effort were provided by multiple employees at outsourcing firm Wipro Ltd over several months. The workers spoke on condition of anonymity due to fear of retaliation by the Indian firm. Facebook later confirmed many details of the project. Wipro declined to comment and referred all questions to Facebook.
The Wipro work is among about 200 content labelling projects that Facebook has at any time, employing thousands of people globally, company officials told Reuters. Many projects are aimed at “training” the software that determines what appears in users’ news feeds and powers the artificial intelligence underlying many other features.
The labelling efforts have not previously been reported.
“It’s a core part of what you need,” said Nipun Mathur, the director of product management for AI at Facebook. “I don’t see the need going away.”
The content labelling program could raise new privacy issues for Facebook, according to legal experts consulted by Reuters. The company is facing regulatory investigations worldwide over an unrelated set of alleged privacy abuses involving the sharing of user data with business partners.
The Wipro workers said they gain a window into users’ lives as they view a vacation photo or a post memorialising a deceased family member. Facebook acknowledged that some posts, including screenshots and those with comments, may include user names.
The company said its legal and privacy teams must sign off on all labelling efforts, adding that it recently introduced an auditing system “to ensure that privacy expectations are being followed and parameters in place are working as expected.”
But one former Facebook privacy manager, speaking on condition of anonymity, expressed unease about users’ posts being scrutinised without their explicit permission. The European Union’s year-old General Data Protection Regulation (GDPR) has strict rules about how companies gather and use personal data and in many cases requires specific consent.
“One of the key pieces of GDPR is purpose limitation,” said John Kennedy, a partner at law firm Wiggin and Dana who has worked on outsourcing, privacy and AI.
If the purpose is looking at posts to improve the precision of services, that should be stated explicitly, Kennedy said. Using an outside vendor for the work could also require consent, he said.
It remains unclear exactly how GDPR will be interpreted and whether regulators and consumers would see Facebook’s internal labelling practices as problematic. Europe’s top data privacy official declined to comment on possible concerns.
A Facebook spokeswoman said: “We make it clear in our data policy that we use the information people provide to Facebook to improve their experience and that we might work with service providers to help in this process.”
US Senator Mark Warner, a Democrat and leading critic of social media, told Reuters in a statement that large platforms increasingly are “taking more and more data from users, for wider and more far-reaching uses, without any corresponding compensation to the user.”
Warner said he is drafting legislation that would require Facebook to “disclose the value of users’ data, and tell users exactly how their data is being monetised.”
Human-powered content labelling, also referred to as “data annotation,” is a growth industry as companies seek to harness data for AI training and other purposes.
Self-driving car companies such as Alphabet Inc’s Waymo have labellers identify traffic lights and pedestrians in videos to fortify their AI. Voice assistant developers including Amazon.com Inc have people annotate customer audio to improve AI’s ability to decipher speech.
Facebook launched the Wipro project in April last year. The Indian firm received a $4 million contract and formed a team of about 260 labellers, according to the workers. Last year, the work consisted of analysing posts from the prior five years.
After completing that work, the team was cut to about 30 in December and shifted to labelling, each month, the posts from the prior month. Work is expected to last through at least the end of 2019, they said.
Facebook confirmed the staffing changes but declined to comment on financial details.
The company said its analysis is ongoing so it could not provide any findings from the labelling or resulting product decisions. It has not told labellers the purpose or results of the project, and the workers said all they have inferred from their limited view is that selfies are increasingly popular.
The Wipro labellers and Facebook said the posts are a random sampling of text-based status updates, shared links, event posts, Stories feature uploads, videos and photos, including user-posted screenshots of chats on Facebook’s various messaging apps. The posts come from Facebook and Instagram users globally, in languages including English, Hindi and Arabic.
Each item goes to two labellers as an accuracy check, and to a third if they disagree, Facebook said. Workers said they see on average 700 items per day. Facebook said the target average is lower.
Facebook confirmed labellers in Timisoara, Romania and Manila, the Philippines are involved in the same project.
Among Facebook’s other labelling projects, one worker in Hyderabad for outsourcing vendor Cognizant Technology Solutions Corp said he and at least 500 colleagues look for sensitive topics or profane language in Facebook videos.
The aim is to train an automated Facebook tool that enables advertisers to avoid sponsoring videos that are, for example, adult or political, Facebook said. Cognizant did not respond to a request for comment.
Another application of labelling involved the social network’s Marketplace shopping feature, where it automated category recommendations for new listings by first having labellers and product experts categorise some existing listings, Facebook’s Mathur said.
Facebook users are not offered the chance to opt out of their data being labelled.
At Wipro, the posts being examined include not only public posts but also those that are shared privately to a limited set of a user’s friends. That ensures the sample reflects the range of activity on Facebook and Instagram, said Karen Courington, director of product support operations at Facebook.
Facebook’s data policy does not explicitly mention manual analysis.
“We provide information and content to vendors and service providers who support our business, such as by providing technical infrastructure services, analysing how our products are used, providing customer service, facilitating payments or conducting surveys,” the policy states.
Europe’s GDPR also requires companies to delete user data upon request. Facebook said it has the technology to routinely sync labelled posts with both deletion requests and changes to content privacy settings.
Facebook and other companies are testing techniques to curtail the need for outsourced labelling, in part to analyse more data faster and cheaper. For instance, AI training data for news feed rankings and photo descriptions for the blind came from hashtags on Instagram posts, Facebook’s Mathur said.
“We try to minimise the amount of things we send out,” he said.