
Investigation · Xavier Vinaixa & Marçal Font Espí

The silent plunder

How a mysterious company buys second-hand books from bookstores across half the world to train AI models — then physically destroys the copies.

This page collects the joint investigation by Xavier Vinaixa (xaviviro) and Marçal Font Espí, UB professor, poet and bookseller at the Llibreria Fènix bookshop in Badalona. It began with anomalous orders from a foreign company that was systematically buying Catalan non-fiction, often copies that had been sitting in the back room for years.

What looked like an unusual buyer turned out to be an industrial chain: the firms buy the books and ship them to a plant where the spines are cut off, the pages are scanned automatically, and the volumes are finally pulped. All of this serves to train artificial intelligence models on human-written text before the web is overrun by AI-generated content, what specialists call the «data wall».

What we uncovered

An international network quietly acquires lots of old books of low commercial value from second-hand bookshops. They are chosen precisely because they are human-written texts barely present on the internet: no digital footprint, no editorial scrutiny and no effective copyright protection.

The consequence is not only economic for authors and publishers: it is cultural. Minoritised languages such as Catalan lose written heritage not through human neglect but through algorithmic voracity. Our investigation documents the mechanism and proposes protection and licensing pathways in the Substack trilogy.

Cedulari · Our proposal

The solution we propose: Cedulari

We don't stop at denouncing the problem. For over a year we have been building a constructive answer: Cedulari, the documented memory of Catalan publishing. It is both a database of authors, books and editions and a set of tools that give AI models provenance, traceability and verifiable sources instead of the usual hallucinations. The database focuses on nearly a century of Catalan publishing, from the 1920s to the adoption of the ISBN, a heritage that today is recorded nowhere.

It was precisely this knowledge (of the second-hand book trade, the print runs, the pseudonyms and the clandestine editions held by only five or six antiquarian booksellers in the country) that let us uncover the silent plunder. Cedulari exists to preserve that heritage and make it available to AI lawfully and traceably. The project, led by Marçal Font, Xavier Vinaixa and Sorensen.ai with advice from the University of Barcelona's Department of Literary Studies, is now seeking funding.

Discover Cedulari.cat

Who investigated

01

Xavier Vinaixa Roselló

AI expert, CTO at Sorensen.ai. Pulled the technical thread: identifying the purchase pattern, analysing the logic of the «data wall» and framing it within the wider debate on the real cost of training large language models.

02

Marçal Font Espí

Professor in the Department of Literary Studies at the UB, poet and bookseller, owner of the Llibreria Fènix bookshop in Badalona. Spotted the anomalous orders, documented the requests and contributed first-hand knowledge of the second-hand book trade.

The series: «The silent plunder»

All four instalments are co-authored by Xavier Vinaixa and Marçal Font Espí and published across the two authors' Substacks, xaviviro.substack.com and maralfontesp.substack.com:

  1. The silent plunder

    xaviviro.substack.com →

  2. The silent plunder II: the erased confession

    xaviviro.substack.com →

  3. The silent plunder III: protect and license

    xaviviro.substack.com →

  4. And we'll make drums out of the books

    maralfontesp.substack.com · 2026-05-18 →

TV3 · 3Cat · May 2026

On Catalan television (TV3)

«Tot es mou» — 27 May 2026

TV3's flagship news magazine «Tot es mou» invited Marçal Font (Llibreria Fènix bookshop in Badalona) and Xavier Vinaixa (CTO at Sorensen) to the studio to explain how they spotted the anomalous purchases, how they traced the thread back to Anthropic and the story originally broken by the Washington Post, and why preserving the physical document is a matter of cultural sovereignty, not just economics. The programme aired in Catalan.

«Tot es mou» · TV3 · 27 May 2026 · 3cat.cat →

Telenotícies (TV3 newscast) — 31 May 2026

TV3's flagship newscast Telenotícies ran a piece on the case: booksellers denounce the destruction of books to train AI models and call on the Ministry of Culture to step in to protect literary heritage. The report picks up the case uncovered by the investigation. Aired in Catalan.

Telenotícies · TV3 · 31 May 2026 · 3cat.cat →

Press coverage

The investigation has been picked up by television, the press, national radio and international outlets.

Read the investigation

Read on Substack