Google Books has reportedly been indexing low-quality AI-generated books, a discovery made in the same manner that unearthed AI-generated products on Amazon.
The spread of AI-generated books could negatively affect Google Ngram Viewer, a tool researchers use to analyze word usage over time. It follows the growth of generative AI tools, which has led to unskilled people using them to compose books, lyrics, poetry, and videos.
Fishing out the “garbage”
Google Books is a service that shows books matching or related to a keyword that a user enters. It has been discovered that AI-generated books, mostly of low quality, are being indexed on Google Books.
According to reports, these were unearthed using the same method that uncovered AI-generated product reviews on Amazon, online articles, and academic papers. The discovery has shocked experts, particularly the possibility that Google was not aware that AI-generated books were being indexed.
The experts have, however, suggested that Google Books clearly label AI-generated content so that users can make informed decisions, which would benefit both Google and its users.
“There’s no way Google doesn’t know what books they’re adding to their Google Books index,” said Gary Price, editor of the library journal infoDOCKET.
“They appear to index every book that’s ever published, no matter what it is. I’d like to see Google do some kind of labeling for their books that their AI generates.”
A search for the phrase “As of my last knowledge update” helped identify the AI-generated books. The phrase is associated with answers generated by ChatGPT and Gemini.
According to Newsbytes, this search uncovered “dozens of books containing that phrase on Google Books.”
While some of these books discuss topics like ChatGPT, machine learning, and AI and seem to be human-written, most appear to be generated by artificial intelligence.
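The phrase search described above can be approximated in a few lines of Python. This is only a sketch: the `find_flagged` helper, the sample texts, and the phrase list are assumptions made for illustration, and it does not reflect how Google Books search works internally. A match only flags a book for human review, since some books contain the phrase while legitimately discussing ChatGPT.

```python
import re

# Tell-tale phrase reported in the article; the list itself is an assumption
# and could be extended with other chatbot boilerplate.
TELLTALE_PHRASES = ["As of my last knowledge update"]


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so line breaks do not hide a phrase."""
    return re.sub(r"\s+", " ", text).strip().lower()


def find_flagged(books: dict[str, str], phrases=TELLTALE_PHRASES) -> dict[str, list[str]]:
    """Return {title: [matched phrases]} for every book containing a phrase."""
    needles = [normalize(p) for p in phrases]
    flagged = {}
    for title, text in books.items():
        haystack = normalize(text)
        hits = [p for p, needle in zip(phrases, needles) if needle in haystack]
        if hits:
            flagged[title] = hits
    return flagged


# Hypothetical sample texts, invented for this example.
sample_books = {
    "Sample Book A": "As of my last knowledge\nupdate in September 2021, the rules may have changed.",
    "Sample Book B": "This chapter covers the history of the printing press.",
}

print(find_flagged(sample_books))
```

Normalizing whitespace and case before matching is a deliberate choice here, because scanned or reflowed book text often breaks a phrase across lines.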
The AI books
Books like Bear, Bull, and Wolves: Stock Trading for the Twenty-Year-Old by Tristin Mclver and Maximize Your Twitter Presence: 101 Strategies for Marketing Success by Shu Chen Hou were among the AI-generated books identified.
The two books, published in 2024, both read like ChatGPT-generated text “with superficial analysis complex topics.”
The latter, according to Newsbytes, “appears outdated at the time of publication due to being generated with old version of ChatGPT.”
Maximize Your Twitter Presence: 101 Strategies for Marketing Success, which was published last month, describes how to get a verified mark on Twitter, now X, “but since Elon Musk’s acquisition of Twitter in 2022, it has become relatively easy to get a verified mark.”
The book was written using information from 2021. According to Gigazine, the book indicates that “At the time of last update in September 2021, Twitter was in the process of its verification standards and processes, so the procedures and requirements may have changed since then.”
The possible impact on Ngram Viewer
One of the concerns raised is the impact that AI-generated books will have on the Ngram Viewer. Alex Hanna, director of research at the Distributed AI Research Institute (DAIR), expressed concern that the tool may become unusable once it is affected by AI-generated books.
“AI-generated content will be pulled into Google Books, and Google will use that content to train new AI models; it’s like an Ouroboros. Google will say it has ‘quality filters,’ but the details of these will never be revealed,” pointed out Hanna.
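To make the concern concrete, the sketch below shows how a phrase's relative frequency shifts once AI-generated text enters a corpus. It assumes, as a simplification, that the relative frequency of a phrase is its count divided by the count of all phrases of the same length. The `relative_frequency` helper and the sample sentences are hypothetical and invented for illustration; this is not Google's implementation.

```python
from collections import Counter


def ngrams(text: str, n: int) -> list[tuple[str, ...]]:
    """Split text on whitespace and return every run of n consecutive words."""
    tokens = text.lower().split()
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def relative_frequency(texts: list[str], phrase: str) -> float:
    """Share of all n-grams in texts that equal phrase (n = words in phrase)."""
    target = tuple(phrase.lower().split())
    counts = Counter()
    for text in texts:
        counts.update(ngrams(text, len(target)))
    total = sum(counts.values())
    return counts[target] / total if total else 0.0


# Hypothetical corpora, invented for this example.
human_texts = [
    "the market fell sharply last year",
    "prices rose again last year",
]
ai_texts = ["as of my last knowledge update the market fell"]

before = relative_frequency(human_texts, "last knowledge")
after = relative_frequency(human_texts + ai_texts, "last knowledge")
print(before, after)
```

In this toy corpus the phrase goes from never appearing to a measurable share of all two-word sequences, which is the kind of distortion a researcher tracking real language change would have to separate from chatbot boilerplate.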
While Google has not said whether it will implement a policy to tackle AI-generated books, a spokesperson said: “We are continually adapting our systems and policies to help users find useful and relevant books within the Google Books corpus.”
Amazon has also been plagued by AI-generated books and has removed thousands of them from its shelves.