A recently exposed database has revealed a list of up to 16,000 artists, including children, whose work Midjourney allegedly used to train its AI art-generating tools.
The database, circulated over the New Year on social media, comes in the form of a Google Sheets spreadsheet, detailing various time periods, styles, genres, movements, mediums, and techniques purportedly utilized in the training of the program.
This comes as artists have been disgruntled with AI models like Stable Diffusion, accusing them of “theft” and unethically scraping data from sites like DeviantArt.
Prominent figures named
According to an article by Dazed, the long list was extracted from a 24-page document submitted as part of an amended class-action lawsuit against Midjourney, Stability AI, and DeviantArt. It includes digital artists, game systems, commercial illustrators, and well-known modern and contemporary artists.
Prominent figures including Damien Hirst, Anish Kapoor, Harmony Korine, and HR Giger have also been linked to the list, stoking the already fierce debate.
The list in question includes other renowned persons such as Yayoi Kusama, Banksy, Frida Kahlo, and Andy Warhol.
Their work was also among the datasets that were used to train the AI models.
This revelation has escalated the ongoing discussion surrounding generative AI, with A-list authors and musicians accusing the technology of “systematic theft on a mass scale.”
Historical artists not spared
The artists in the leaked database range from historical icons like Picasso, Vincent van Gogh, Egon Schiele, Matisse, and Monet to contemporary figures.
The database also covers various art periods and styles, including cottagecore, glitchcore, gorpcore, and gorecore. Among the listed artists was six-year-old Hyan Tran, who contributed artwork to a Magic: The Gathering fundraiser for the Seattle Children’s Hospital in 2021. The inclusion upset users on the X platform.
“Great, now they can’t say they don’t know who they stole from. They had the list this whole time,” wrote Orrus Fellin on the X platform.
The revelation spurred debates over the moral implications of drawing on such a diverse pool of artists and the degree to which AI systems should be permitted to use the work of artists of all ages.
Other users on X feel it is “morally justified to destroy this AI industry even more, now,” for “stealing children’s artwork for profit.”
The Midjourney Devs got caught with lists of who they scraped for training their datasets and there's a TON of Magic the Gathering artists… Of which it includes play test card arts by game devs AND Secret Lair cards that *literal children* drew.
Fuck these entitled tech bros. https://t.co/4z6tuenn9W
— Ian D (@dixonij) January 1, 2024
Legal implications
This disclosure is only one of numerous legal disputes pertaining to AI image production. In September 2023, artist Jason Allen’s copyright appeal for his internationally acclaimed AI-generated piece Théâtre d’Opéra Spatial was denied because of the work’s alleged lack of human creativity.
The ongoing legal battles demonstrate how difficult it is to define intellectual property rights in the era of AI-generated art. Like other AI artists, Allen has pledged to fight for a change in copyright laws.
Apart from artists, The New York Times recently filed a lawsuit against OpenAI and Microsoft, alleging that they used its articles to train ChatGPT.
According to Dazed, more traditional artists are also expected to keep piling pressure on AI firms like Midjourney and Stability AI until a concrete legal decision is made.
Others think otherwise
But elsewhere, a Beijing internet court has ruled in favor of AI, declaring that AI-generated content can be “copyright protected.”
A similar stance has been voiced in the US: “if a human was the driving force of the creative work and they happened to use AI to produce it, it can be protected by copyright,” according to The Register.
Meanwhile, artists have been asked to check databases for their names and take legal action if necessary.
Since the disputed spreadsheet was first shared on social media, access to it has been restricted. However, it has been preserved on the Internet Archive and added to the vast amount of material submitted after the artists’ claims against Midjourney and Stability AI were dismissed in October 2023.