Is Meta using works from Irish authors for AI training?

Irish authors have joined forces to campaign against the alleged use of their copyrighted work to train Meta AI models.

A petition was launched by the Irish Writers Union (IWU) after searches revealed that the work of prominent Irish authors, including President Michael D Higgins, Joseph O’Connor and Anne Enright, appears in a database used to develop the software.

While concern about the unauthorised use of content for AI development has been growing in recent years, an article published by The Atlantic has put a spotlight on the issue. The American magazine and multi-platform publisher was recently in the headlines for the US security story relating to the use of the Signal messaging app by Trump administration officials.

The Atlantic reports that court documents unsealed this month in the US in a class-action lawsuit filed against Meta, by the novelists Richard Kadrey and Christopher Golden and comedian Sarah Silverman, give more details and insights into their allegations of Meta’s use of copyright-protected works in order to train its AI model, Llama 3.

Meta said in court in San Francisco on Monday that it made “fair use” of the books in developing its large language model Llama, arguing that the authors’ lawsuit should be thrown out.

“The allegations have profound implications for Irish authors,” said Conor McAnally, Chair of the IWU.

In the article, The Atlantic published a link to a searchable database of more than 7.5 million books and 81 million research papers.

This database is the front end of Library Genesis, or LibGen for short, which is described as a decades-old shadow library full of material made available without copyright authorisation.

Writers and publishers have been quick to use The Atlantic’s search tool to check if their name and texts have been part of this process.

Searches show that Irish authors including President Higgins, Joseph O’Connor, Anne Enright, John Boyne, Colm Tóibín and Sally Rooney all appear in the database.

Acclaimed Irish authors including Emma Donoghue have confirmed that their work has been targeted too, with Ms Donoghue posting screenshots of her search results, saying “91 results, that’s every book I’ve published since 1993 in multiple languages scraped without permission to train AI.”

The award-winning author added “they’re robbing us in hopes of replacing us.”

The IWU say works have been uploaded to a large language model (LLM) without the authors’ knowledge to help train AI models, without due credit – either financially or through attribution.

In recent years, individual publishers have taken cases against LibGen, but it is still unclear who runs the system.

The digital warehouse LibGen is reported to be registered in both Russia and the Netherlands, making the appropriate jurisdiction for legal action unclear.

The LibGen address has been blocked by internet service providers in some domains.

This week, in the letter signed by Mr McAnally, writers whose work appears to be on the database are encouraged to make a formal complaint to Meta.

The union offered a potential template that writers could use to outline their concerns including a request for “compensation for past unauthorized use in your AI model training programmes”.

The IWU sent the letter to its 500 members and within a day, 53 members responded, reporting a total of 325 unauthorised uses.

To raise awareness, the group has now posted a petition on Uplift addressed to Niamh Smyth, Minister of State with responsibility for AI and Digital Transformation.

Controversy over Meta’s alleged use of large quantities of copyright-protected texts to train its AI has been growing internationally, with writers becoming more vocal in their efforts to restrain the company.

The European Writers Council is promoting a #dontzuckourbooks campaign.

“The union takes this infringement very seriously and will be working with other writers groups including the European Writers Council to insist that any unlicensed use of our works is piracy and seeking compensation for writers whose work has been used to train AI,” Mr McAnally adds.

Conor Kostick, Advocacy Officer with the IWU, told RTÉ News that “tech billionaires are dominating this planet and behaving like overlords… we might not be marching around brandishing pitchforks, but together we can oblige them to stop stealing from us.”

Mr Kostick is calling on the minister to seek a detailed response from Meta to the allegations that “they have engaged in wholesale copyright infringement and to provide unequivocal assurances that they will respect the copyright of authors, not engage in unlawful conduct and will pay authors for all historic infringements.”

RTÉ News put a series of questions to Meta, including confirmation of the use of President Higgins’ writing.

In response, a Meta spokesperson said “We respect third-party intellectual property rights and believe our use of information to train AI models is consistent with existing law.”