LinkedIn starts training GenAI on users' personal data - without notification

Anyone who doesn't want the Microsoft-owned social network and its "affiliates" to harvest their posts will have to open up settings and specifically opt out.

Some users are likely to say "get on your bike" to LinkedIn. Read on to find out how (Photo by Greg Bulla on Unsplash)

Go into your LinkedIn settings today and you may notice a small but potentially concerning new option.

The social network has quietly introduced a feature that trains Generative AI (GenAI) models with users' personal data - and switched it on by default.

This means that a large share of LinkedIn's 800 million users are now letting LinkedIn, Microsoft and its "affiliates" use their data and content to train GenAI models.

Anyone living in the EU, EEA or Switzerland will not have their data harvested. LinkedIn has not yet confirmed why it has spared the citizens of Europe, but it may be due to rules introduced under the EU AI Act.

We have asked LinkedIn if the data involved includes DMs and will update this article when we hear back.

"Where LinkedIn trains generative AI models, we seek to minimize personal data in the data sets used to train the models, including by using privacy-enhancing technologies to redact or remove personal data from the training dataset," LinkedIn wrote in its terms and conditions, which were updated seven days ago.

It added: "As with most features on LinkedIn, when you engage with our platform we collect and use (or process) data about your use of the platform, including personal data. This could include your use of the generative AI (AI models used to create content) or other AI features, your posts and articles, how frequently you use LinkedIn, your language preference, and any feedback you may have provided to our teams."

LinkedIn and its owner, Microsoft, will not be the only organisations allowed to use personal data to train GenAI models.

"The artificial intelligence models that LinkedIn uses to power generative AI features may be trained by LinkedIn or another provider," the social network added. "For example, some of our models are provided by Microsoft’s Azure OpenAI service."

Why are EU citizens not subject to LinkedIn's GenAI training?

The EU AI Act sets out data governance and management practices for the data sets used in the training, validation and testing of high-risk AI systems. Although content-producing models are not likely to be considered high-risk, LinkedIn may have decided to simply sidestep any potential problems in Europe.

Heather Burns, a tech policy expert based in Glasgow, Scotland, told The Stack: "The fact that LinkedIn is not rolling out this generative AI model in the EU, EEA, or Switzerland - places which unconditionally require opt-in consent, while also giving users the right to call out incorrect information that companies hold about them - perhaps says more about the accuracy of LI's generative AI model than it does about EU privacy regulations."

Discussing how the UK should respond to LinkedIn's update, Burns added: "The fundamental principle of the EU privacy model is 'get the person's consent first, then use their data.' The UK, of course, is no longer in the EU, and has been moving worryingly fast towards the US model of 'do whatever you want with people's data, their consent doesn't matter!'

"That being said, the UK's data protection standards are still derived from the EU model. LinkedIn still needs to get opt-in consent to use this data. So I'd be curious to see their homework on this, including their data protection impact assessment.

"The Information Commissioner's Office (ICO) will be even more curious than I am to see that working too. Meta recently got in some hot water with the ICO about its own use of user data to train generative AI, and so LinkedIn really has no excuse for not knowing how this story always ends.

"It's a good reminder that so many privacy issues, including the one LinkedIn has just created, could be easily avoided if companies just do their basic due diligence and engage in good faith with regulators before they decide to move fast and break things."

Ellen Keenan-O'Malley, senior associate solicitor at the leading IP law firm EIP, said the ICO's decision not to stop companies like Meta from training GenAI tools on user data "will make some companies think they can follow suit."

"However, the ICO made it explicitly clear that such processing has not received regulatory approval and the ICO will continue monitoring to ensure Meta is 'demonstrating ongoing compliance'," she said. "Therefore, I think LinkedIn’s decision to adopt the approach of opt-out versus opt-in to users' data being used to train AI models could still face some backlash, be that from its users and/or from the ICO."

The risks of training GenAI models on enterprise data

For enterprises, the risks of allowing data to be scraped and used for AI training include the very real possibility of corporate secrets being leaked by the models which have learned from them.

Vanessa Barnett, technology partner at Keystone Law, told The Stack: “Most AI housekeeping – inside and outside the EU – is being driven by the EU AI Act. Enterprises (and people) will now increasingly need to pay attention to how their personal and corporate data is being used.

“From an enterprise point of view, a model training on data means that it can potentially recreate that data based on prompts. So, being able to protect business and personal data from model leakage is vital. Apart from that, there is still UK GDPR compliance to do: LinkedIn can only use personal data for a new purpose if either this is compatible with the original purpose, or there is consent from users.”

For Lillian Tsang, Senior Data Protection and Privacy Solicitor at the law firm Harper James, "businesses that rely heavily on LinkedIn" should be "concerned" about the exposure of proprietary information, business strategies and other competitive secrets.

She said: “Could it even harm business? It could, if it is known that the business data that is scraped from LinkedIn is being used to train generative AI. This may harm the reputation of the businesses to the point that clients or stakeholders might lose trust in a company that appears not to prioritise data protection, which is fundamentally what the GDPR is all about."

Jeremy Bradley, COO, Zama, advised businesses to focus on exploring "emerging solutions that protect their customers and their personal information".

“Enterprises should be concerned about LinkedIn's use of personal data to train generative AI models by default, and without explicit user consent, as it raises serious data privacy issues," he said.

"Advancements in encryption - such as Fully Homomorphic Encryption (FHE) - could help enterprises comply with global data protection laws while also allowing businesses to leverage the benefits of AI technologies responsibly."

How can I opt out of LinkedIn GenAI personal data training?

Anyone who doesn't want to hand over their data to AI trainers can opt out by going to the "data privacy" section of settings, clicking "Data for Generative AI improvement," and then moving the slider on an option marked "Use my data for training content creation AI models."

A LinkedIn spokesperson told The Stack: “We’re making changes that give people using LinkedIn even more choice and control when it comes to how we use data to train our generative AI technology. We’re introducing new tools with AI that benefit all members by default while also making sure that those who have specific privacy preferences have an easy way to opt out. People can choose to opt-out but they come to LinkedIn to be found for jobs and networking and generative AI is part of how we are helping professionals with that change.

"At this time, we will not be enabling training for generative AI on member data from the European Economic Area or Switzerland, and will not provide the setting to members in those regions until further notice."

You can find a LinkedIn FAQ on Generative AI here.

In Q4, LinkedIn’s revenue increased 10% year-over-year as it saw "accelerated member growth and record engagement."

"Members engage with 1.5 million pieces of content every minute on the platform, and video is now the fastest growing format on LinkedIn, with uploads up 34% year-over-year," it wrote.
