Investigation · Xavier Vinaixa & Marçal Font Espí
The silent plunder
How a mysterious company buys second-hand books from bookstores across half the world to train AI models — then physically destroys the copies.
This page collects the joint investigation by Xavier Vinaixa (xaviviro) and Marçal Font Espí, UB professor, poet and bookseller at the Llibreria Fènix bookshop in Badalona. It began with anomalous orders from a foreign company that was systematically buying Catalan non-fiction, often copies that had been sitting in the back room for years.
What looked like an unusual buyer turned out to be an industrial chain: the firms buy the books and ship them to a plant where the spines are cut, the pages are automatically scanned, and the volumes are finally pulped. All of this is done to train artificial intelligence models on human-written text before the web is overrun by AI-generated content, what specialists call the «data wall».
What we uncovered
An international network silently acquires batches of old books of low commercial value from second-hand bookshops. They are chosen precisely because they are human-written texts barely present on the internet: no digital footprint, no editorial scrutiny and, in practice, no effective copyright protection.
The consequence is not only economic for authors and publishers: it is cultural. Minoritised languages such as Catalan lose written heritage not through human neglect but through algorithmic voracity. Our investigation documents the mechanism and proposes protection and licensing pathways in the Substack series.
Cedulari · Our proposal
The solution we propose: Cedulari
We don't stop at denouncing the problem. For over a year we have been building a constructive answer: Cedulari, the documented memory of Catalan publishing. It is both a database of authors, books and editions —focused on nearly a century of Catalan publishing, from the 1920s to the adoption of the ISBN, a heritage that today is recorded nowhere— and a set of tools that give AI models provenance, traceability and verifiable sources instead of the usual hallucinations.
It was precisely this knowledge —of the second-hand book trade, the print runs, the pseudonyms and the clandestine editions held by only five or six antiquarian booksellers in the country— that let us uncover the silent plunder. Cedulari exists to preserve that heritage and make it available to AI lawfully and traceably. The project, led by Marçal Font, Xavi Vinaixa and Sorensen.ai with advice from the University of Barcelona's literary studies department, is now seeking funding.
Who investigated
01
Xavier Vinaixa Roselló
AI expert, CTO at Sorensen.ai. Pulled the technical thread: identifying the purchase pattern, analysing the logic of the «data wall» and framing it within the wider debate on the real cost of training large language models.
02
Marçal Font Espí
Professor in the Department of Literary Studies at the UB, poet and bookseller, owner of the Llibreria Fènix bookshop in Badalona. Spotted the anomalous orders, documented the requests and contributed first-hand knowledge of the second-hand book trade.
The series: «The silent plunder»
All four instalments are co-authored by Xavier Vinaixa and Marçal Font Espí and published on both authors' Substacks, xaviviro.substack.com and maralfontesp.substack.com.
TV3 · 3Cat · May 2026
On Catalan television (TV3)
«Tot es mou» — 27 May 2026
TV3's flagship news magazine «Tot es mou» hosted Marçal Font (Llibreria Fènix bookshop in Badalona) and Xavier Vinaixa (CTO at Sorensen) on the studio set to explain how they spotted the anomalous purchases, how they followed the thread to Anthropic and the story originally broken by the Washington Post, and why preserving the physical document is a matter of cultural sovereignty, not just economics. The programme aired in Catalan.
Telenotícies (TV3 newscast) — 31 May 2026
TV3's flagship newscast Telenotícies ran a piece on the case: booksellers denounce the destruction of books to train AI models and call on the Ministry of Culture to step in to protect literary heritage. The report picks up the case uncovered by the investigation. Aired in Catalan.
Press coverage
The investigation has been picked up by television, the press, national radio and international outlets:
- VIA Empresa Opinion ES
Anthropic 451
Opinion piece by Josep M. Ganyet citing the investigation as its starting point.
- elDiario.es Press ES
The mysterious company buying used books to train AI and destroying them
Investigation by Pol Pareja. Quotes Xavier Vinaixa as an AI expert and one of the people who pulled the thread on the case.
- Cadena SER Radio ES
Bread today, hunger tomorrow: a bookseller warns about the AI plan
National radio coverage. Xavier Vinaixa is cited in the audio segment.
- 3Cat / TV3 — «Tot es mou» TV ES
Suspicious second-hand book purchases to train artificial intelligence
Report on «Tot es mou» (TV3) with Marçal Font (Llibreria Fènix) and Xavier Vinaixa (Sorensen) presenting the investigation on the studio set.
- Diario Socialista Press ES
The antiquarian book trade warns about mass purchases of copies to train AI models
Report featuring the warning from UNILIBER (the antiquarian book association). Cites Xavier Vinaixa as an «AI expert who investigated the case» for the «data wall» concept.
- 3CatInfo Press ES
Buying old books to train an AI and destroying them: booksellers' alert over suspicious orders
Report by Toni Noguera Martínez on 3CatInfo. Quotes Xavi Vinaixa (Sorensen.AI) on the «data wall»: the digitized data available for training is estimated to have already hit its ceiling.
- Montevideo Portal Press UY
The race to train AI threatens to destroy thousands of old books
Latin American pickup. Cites Xavier Vinaixa as an AI specialist for the concept of the «data wall».
- La Voz del Interior Opinion AR
Artificial intelligence: the new digital extractivism
Opinion piece by Francesc-Xavier Soria Jofra in the Córdoba (Argentina) daily. Extensively cites Xavier Vinaixa and Marçal Font's investigation and reproduces their «Venice phase» passage.
- IADE — Realidad Económica Press AR
The mysterious company buying used books to train AI and destroying them: «It's literary plunder»
The Argentine Institute for Economic Development (IADE / Realidad Económica) republishes Pol Pareja's report (elDiarioAR). Cites Marçal Font and, as an AI expert, Xavier Vinaixa.
- 3Cat / TV3 — Telenotícies TV ES
«Literary plunder»: booksellers denounce the destruction of books to train AIs
Telenotícies (TV3) news piece on the booksellers' complaint over the destruction of books to train AIs; it picks up the case uncovered by the investigation and calls on the Ministry of Culture to step in to protect literary heritage.
- Telecinco — Informativos Telecinco Press ES
From the bookshop shelf to feeding the algorithm: the controversial purchase of old books to train AI
Report in the Culture section of Informativos Telecinco. Quotes Xavier Vinaixa, technical director of Sorensen.ai: the companies already hold the information from many books but lack the legal permission to train on it, and all they need is the invoice. Also features Miguel Ángel Ortega (UNILIBER) and Marçal Font (UB).
- RTVA — «Les coses grans» Radio AD
Booksellers sound the alarm: US companies buy old books to train AI and destroy them
Panel discussion on «Les coses grans» (Ràdio i Televisió d'Andorra) with Marçal Font (Llibreria Fènix), Toni Noguera (3Cat) and Oliver Vergés (Andorran Publishers' Association). Though he did not take part, Xavier Vinaixa is cited several times as the author of the investigation that broke the case.
- Badalona Comunicació Press ES
Badalona's Fènix bookshop has received several suspicious book orders from a Canadian company
Report by Enric Garcia Torrell in Badalona's public media outlet on the case at the Llibreria Fènix. Quotes Xavi Vinaixa, technical director of Sorensen.AI, warning that the real danger will come when private companies digitize poorly protected books and that information vanishes from accessible databases.
- La Vanguardia Press ES
Destroying books to train artificial intelligence
Report by Lara Gómez Ruiz in the Culture section of La Vanguardia. Extensively quotes Xavier Vinaixa, technical director of Sorensen.ai, and bookseller Marçal Font (Llibreria Fènix) on the investigation published on Substack. Features Vinaixa's proposal to charge these companies for the data and keep the heritage, and presents the Cedulari.cat project.