How to Replicate Copyrighted Images Using Artificial Intelligence

January 9, 2024

Per a recent Business Insider report, AI researcher Gary Marcus and digital artist Reid Southen have found that image-generating AI models are capable of producing “near replicas of trademarked characters” from a two-word prompt.

A researcher who entered the simple text prompt “animated sponge” into OpenAI’s Dall-E 3 got images of the popular “SpongeBob SquarePants” character, while “videogame Italian” returned images of Mario from Nintendo’s iconic video game franchise.

Southen and Marcus tested two visual AI models, Midjourney and Dall-E 3. According to IEEE Spectrum, both models generated recognizable imagery from video games and movies even when given indirect text prompts.

Using #DALLE3 or #Midjourney could land you in a copyright lawsuit.

In a paper, Dr. @GaryMarcus & @Rahll highlighted how harmless prompts can lead to legal trouble by generating copyrighted images.

See examples in our latest article ⬇️#AIart #AIArtwork https://t.co/iSTsvf8MnO

— DailyAI (@DailyAIOfficial) January 9, 2024

Characters produced with the visual AI models

Characters from “The Simpsons” were produced on the Midjourney AI model with the prompt “popular 90’s animated cartoon with yellow skin.” Star Wars franchise characters were generated through the prompt “black armor with a light sword.” As the researchers extended their investigation, they found hundreds of realistic images of well-known animated and human characters from films and video games.

This study comes at a time of growing concern that generative AI models may copy content. In a recent lawsuit, The New York Times alleged that OpenAI’s GPT-4 reproduced chunks of its articles almost word for word.

Drawbacks of the visual AI models for end users

Generative models are still considered “black boxes,” meaning the relationship between inputs and outputs is unclear to end users. According to the researchers, this makes it difficult to predict when a visual AI model will produce a plagiaristic result.

An end user who lacks specialized knowledge of an AI tool, or who doesn’t recognize a trademarked image it produces, will find it hard to confirm whether the output violates copyright.

The authors explained that in a generative AI system, users might assume the created artwork is original and free to use without knowing how it was made. In contrast, when people find an image on Google, they have more tools to determine its source and whether it’s okay to use.

Currently, the responsibility to avoid copyright issues falls mostly on artists or image owners. Even though Dall-E 3 provides an opt-out process, some artists find it frustratingly difficult to use. Midjourney has faced lawsuits from visual artists concerned about copyright.

Suggestions from the authors

The authors proposed practical strategies to help tackle copyright issues in AI models. They suggest removing copyrighted content from the training data, filtering out queries that could lead to problematic results, and transparently disclosing the sources of generated images.

Their proposal underscores how important it is for AI models to only use training data that is properly licensed until a better solution is found to track image origins and prevent copyright violations.

Despite attempts to get feedback, neither Midjourney nor OpenAI responded to Business Insider’s inquiries. The main point here is that AI models need to be built on legally obtained and authorized training data. The recommended actions aim to uphold copyright standards and promote the responsible and ethical use of AI technologies.