An AI-generated book
A few days ago, I found a newly published book about Starlark (a language derived from Python, intended to be embedded). This caught my attention since I was responsible for Starlark during my time at Google.
I was shocked. Starlark is too niche to warrant a full book, and too simple to fill one. I wouldn’t know how to write 100 pages about it. So I took a closer look at the book.
It looked decent and legit. It’s available in multiple bookstores, has a nice cover, and the description said what I would expect.
But it felt strange. I didn’t know the author, and I couldn’t find any additional context or advertisements about the book. I asked Gemini, which suggested that “the book is almost certainly a low-quality, possibly AI-generated, product”. A friend of mine pointed out the irony: “isn’t it suspicious for an LLM to say that a book was probably written by an LLM?”
(Notice how it uses the expression “delving into”? It’s a word that ChatGPT loves.)
Of course, in 2025, I’m not surprised if someone uses LLM assistance to write a book. Even with assistance, it would take me a long time to write an entire book, proofread it, and get it published. But my suspicion is that the entire book is generated... What if the book is being sold and no one has ever read it?
Actually, more than one
As I was using Google book search with “starlark” as a query, I noticed other suspicious books:
- “Practical Jsonnet for Configuration Engineering”
- “Carvel Ytt in Action - The Complete Guide for Developers and Engineers”
Again, it seemed surprising that someone would spend significant time writing books on these niche topics. Still, that alone isn’t proof. All three books were written by the same author, William Smith, in 2025. Maybe he’s just a prolific author?
But it got me curious: who is this William Smith, and what about the publisher, HiTeX Press? Gemini mentioned that "HiTeX Press is not a known or reputable technical publisher."
Using book search, I found lots and lots of books written by William Smith and his colleague, Richard Johnson. The list is long and covers many technical topics. Who can write so many books in a year?
Using ChatGPT, I found this link that's more comprehensive: https://www.overdrive.com/publishers/hitex-press. I scrolled through many pages, and it turns out all 800+ books were written within one year by just two authors. All of them are technical books, as if someone took a list of computer project names and generated a book for each of them.
At that scale, it’s obvious the books are generated. No one could even read that much content.
But what if the books are good?
AI is getting better every day. AI can beat humans at a large number of tasks. So can it write a decent book? I certainly don’t want to give money to a spamming book factory. But stores offer a free preview, so let me take a closer look at the Starlark book and be the judge.
At first glance, the content looks legit. It explains the language goals and history, and it contains lots of things I could have written. This is not surprising: I wrote many of the texts about Starlark that you can find on the Internet, so it’s probably an LLM-rewritten version of what I wrote.
So is the content any good? No. Definitely not.
Starlark has 3 interpreters (in Java, in Go, in Rust). So I was curious to see which one was used as reference in the book. I scrolled a bit and found a code sample:
It’s using… a C++ implementation of Starlark that doesn’t exist. The API is entirely hallucinated. This is garbage.
The table of contents is also dubious. It contains sections on features that don’t even exist. More generally, the table of contents shows that the book doesn’t have a purpose. It doesn’t have a clear use case. It reads like a meaningless collection of facts about Starlark, some real, some invented.
So no, do not buy this book. Do not buy any book by HiTeX Press.
A spamming factory
With 800 books in a year, this is industrial-scale spam. I saw lots and lots of their books on Amazon. It looks like they are all sold for around 8€. Fortunately, I have not seen any reviews yet, but I still wonder how many books were sold.
What I find concerning is that it can be difficult for a non-expert to identify which books are AI-generated garbage.
Take for example this “Go programming language reference”, self-described as an “authoritative guide”. Yes, the description is dubious (error handling is an advanced chapter? why would you list trivial things like “tokens”, “literals”, “operators”?), but it can take time to identify that the book will be garbage. How many people will fall into the trap? Even if you avoid them, how much time will you waste filtering the garbage?
HiTeX Press isn’t publishing books, it’s publishing spam. I’m afraid this problem will get worse and worse. Let’s hope bookstores will find an antispam solution.