A recent study has shown that watermarking AI-generated content to tackle misinformation has loopholes, as the watermarks can be forged or tampered with.
With the advent of generative AI, spurred by the launch of ChatGPT last November, the internet has been increasingly inundated with AI-generated content that is often mistaken for human-made work. The European Union's law enforcement agency has warned that 90% of internet content may be AI-generated or AI-edited by 2026.
This has resulted in growing calls for online platforms to watermark AI-generated content, but researchers at Nanyang Technological University in Singapore, as well as Chongqing University and Zhejiang University in China, recently revealed that watermarks are not the ultimate solution.
Effectiveness of watermarks
The researchers set out to assess the usefulness of watermarking images, videos, and other AI-generated content as a way of limiting the spread of deepfakes.
Their study, which was published on the pre-print server arXiv, shows that there are two ways attackers can forge watermarks or remove them from AI-generated content.
“One night we discussed whether we could explore a new advanced watermarking for generative models,” Guanlin Li, co-author of the paper, told Tech Xplore.
This comes as companies and individuals have also been tagging their content with watermarks to protect the IP or “restrict illegal usage.”
“I just said, Hey, why not attack the existing watermarking scheme? If we can remove the watermark, some illegal AIGC (AI-generated content) will not be treated as AI-generated. That could cause a lot of chaos on the internet,” said Li.
Instead, the study presents a computational approach for removing or forging watermarks in AI-generated photos.
Cleaning the data
The researchers first gather watermarked data from a target AI company, application, or content-generating service, then use a publicly available denoising model to ‘clean’ it. In the final stage, a generative adversarial network (GAN) is trained on this cleaned data.
Remarkably, the researchers found that the GAN-based model could effectively remove or fake watermarks after training.
Li elaborated on their approach, saying, “If we want to identify the watermarked content, the distribution of watermarked content should be different from the original one.”
“Based on it, if we can learn a projection between these two distributions, we will be able to remove or forge a watermark,” added Li.
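Li's description maps onto a standard GAN setup: a generator learns a projection from the watermarked distribution to the clean one, while a discriminator tries to tell the two apart. The sketch below is a hypothetical, minimal PyTorch illustration of that idea, not the authors' actual code; the architecture, the 32×32 image size, and the loss choices are all assumptions made for brevity.

```python
import torch
import torch.nn as nn

class Generator(nn.Module):
    """Projects watermarked images toward the clean-image distribution."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 3, 3, padding=1), nn.Sigmoid(),  # keep pixels in [0, 1]
        )

    def forward(self, x):
        return self.net(x)

class Discriminator(nn.Module):
    """Scores whether an image looks like it came from the clean distribution."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Flatten(), nn.Linear(16 * 16 * 16, 1),  # logit for 32x32 input
        )

    def forward(self, x):
        return self.net(x)

def train_step(gen, disc, g_opt, d_opt, watermarked, cleaned):
    """One adversarial update: 'real' = denoised images, 'fake' = generator output."""
    bce = nn.BCEWithLogitsLoss()
    # Discriminator update: separate cleaned images from generator output.
    d_opt.zero_grad()
    fake = gen(watermarked).detach()
    d_loss = (bce(disc(cleaned), torch.ones(len(cleaned), 1))
              + bce(disc(fake), torch.zeros(len(fake), 1)))
    d_loss.backward()
    d_opt.step()
    # Generator update: make watermark-free-looking output that fools the discriminator.
    g_opt.zero_grad()
    g_loss = bce(disc(gen(watermarked)), torch.ones(len(watermarked), 1))
    g_loss.backward()
    g_opt.step()
    return d_loss.item(), g_loss.item()
```

Trained this way, the generator's output distribution is pushed toward the cleaned-image distribution, which is what removes (or, with the roles of the datasets swapped, forges) the watermark signal.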
Early experiments showed that their approach worked very well for both forging and removing watermarks from a variety of AI-generated photos.
Nothing new under the sun
According to Vox, the problem of misinformation on the internet existed long before generative AI tools like ChatGPT went viral. But these tools, including DALL-E, Midjourney, and Photoshop, have made it easier to create fake images, videos, and text.
The EU has also called on online platforms to watermark AI-generated content to allow users to distinguish between real and AI-made content.
But the results of the study raise concerns over the credibility of employing watermarking to protect AIGC's copyrights. Li emphasized that their approach relies on the data's distribution, suggesting that the security of the watermarking schemes in use today may not be as strong as previously thought.
Although the study exposes difficulties in watermarking AI content, it also leaves room for improvement. Li and his colleagues hope that by sharing their research, generative AI businesses and developers will be motivated to create more sophisticated watermarking techniques, or to investigate alternative methods to better secure AIGC.
“We are now primarily focused on developing new watermarking schemes for generative models, not only for image generation but also for other AI models,” added Li.