Google’s research division today detailed just how easy it is for computer algorithms to bypass standard photo watermarking practices, stripping those images of copyright protection and leaving them vulnerable to being reposted across the internet without credit. The research, presented at a leading computer vision conference in Hawaii in July, is described in a paper titled “On the Effectiveness of Visible Watermarks.”
“As often done with vulnerabilities discovered in operating systems, applications or protocols, we want to disclose this vulnerability and propose solutions in order to help the photography and stock image communities adapt and better protect its copyrighted content and creations,” Tali Dekel and Michael Rubinstein, Google research scientists, explain in a post published on Google’s research blog earlier today.
Dekel and Rubinstein say the core problem with current photo watermarking processes is the high level of consistency in style. “We show that this consistency can be used to invert the watermarking process — that is, estimate the watermark image and its opacity, and recover the original, watermark-free image underneath,” the duo explain. “This can all be done automatically, without any user intervention or prior information about the watermark, and by only observing watermarked image collections publicly available online.”
The team behind the watermark-removal algorithm trained software on enough public examples to identify watermark patterns and then, through a process called “multi-image matting,” separate the watermark’s components from the rest of the image. Because the software then knows the properties of the watermark, such as its opacity, structure, and shadow or color gradient effects, Google’s algorithm can remove it from any photo containing that specific watermark or a similar one.
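To make the removal step concrete, here is a minimal sketch that assumes the watermark is applied by ordinary alpha blending, J = αW + (1 − α)I, where I is the original image, W the watermark and α its per-pixel opacity. That blending model, the function names and the toy rectangular logo are assumptions for illustration, not details taken from Google’s paper. The sketch also skips the hard part, estimating W and α from a collection of images, and shows only that once both are known the blend can be inverted pixel by pixel.

```python
import random

HEIGHT, WIDTH = 16, 24

def composite(image, watermark, alpha):
    """Blend per pixel: J = alpha * W + (1 - alpha) * I."""
    return [[a * w + (1.0 - a) * i for i, w, a in zip(ri, rw, ra)]
            for ri, rw, ra in zip(image, watermark, alpha)]

def remove_watermark(watermarked, watermark, alpha):
    """Invert the blend per pixel: I = (J - alpha * W) / (1 - alpha).

    Assumes alpha < 1 everywhere, i.e. the watermark is semi-transparent.
    """
    return [[(j - a * w) / (1.0 - a) for j, w, a in zip(rj, rw, ra)]
            for rj, rw, ra in zip(watermarked, watermark, alpha)]

random.seed(0)
# Toy grayscale "photo" with pixel values in [0, 1).
image = [[random.random() for _ in range(WIDTH)] for _ in range(HEIGHT)]
# Hypothetical watermark: a white rectangle at 40% opacity.
watermark = [[1.0] * WIDTH for _ in range(HEIGHT)]
alpha = [[0.4 if 6 <= y < 10 and 4 <= x < 20 else 0.0 for x in range(WIDTH)]
         for y in range(HEIGHT)]

watermarked = composite(image, watermark, alpha)
recovered = remove_watermark(watermarked, watermark, alpha)

# Largest per-pixel difference between the recovered and original image.
max_error = max(abs(r - i)
                for rr, ri in zip(recovered, image)
                for r, i in zip(rr, ri))
print(max_error)
```

With an exact watermark and opacity map, the recovered image matches the original up to floating-point rounding, which is why a watermark that is identical across thousands of photos is such a weak defense.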
To fix this and strengthen copyright protection for images on the web, the team suggests introducing a specific kind of randomness into the watermark. Simply changing the watermark’s location or opacity from image to image is not enough, Dekel and Rubinstein explain. Instead, the changes need to leave visible artifacts behind after the removal process. This includes adding “random geometric perturbations to the watermark,” effectively warping the text and logos being used. That way, when algorithms like the one Google developed try to scrub the watermark out, they leave visible outlines of the watermark behind, because these systems depend on consistency across images and work by exploiting it.
“In a nutshell, the reason this works is because removing the randomly-warped watermark from any single image requires to additionally estimate the warp field that was applied to the watermark for that image — a task that is inherently more difficult,” the duo write. “Therefore, even if the watermark pattern can be estimated in the presence of these random perturbations (which by itself is nontrivial), accurately removing it without any visible artifact is far more challenging.”
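The effect of such a warp can be illustrated with a small sketch. It assumes the same alpha-blending model, J = αW + (1 − α)I, and a naive attacker who knows the unwarped opacity map exactly but not the warp applied to a given image. The sinusoidal displacement field, the shift range and all function names are hypothetical choices for the demonstration, not the perturbation used by the researchers.

```python
import math
import random

HEIGHT, WIDTH = 16, 24

def base_matte():
    """Opacity map of a hypothetical rectangular logo (0.4 inside, 0 outside)."""
    return [[0.4 if 6 <= y < 10 and 4 <= x < 20 else 0.0
             for x in range(WIDTH)] for y in range(HEIGHT)]

def warp(matte, rng, min_shift=1.0, max_shift=2.0):
    """Resample the matte through a smooth random displacement field."""
    amp_x = rng.uniform(min_shift, max_shift)
    amp_y = rng.uniform(min_shift, max_shift)
    phase_x = rng.uniform(0, 2 * math.pi)
    phase_y = rng.uniform(0, 2 * math.pi)
    warped = []
    for y in range(HEIGHT):
        row = []
        for x in range(WIDTH):
            # Horizontal shift varies with the row, vertical shift with the column.
            sx = round(x + amp_x * math.sin(2 * math.pi * y / HEIGHT + phase_x))
            sy = round(y + amp_y * math.sin(2 * math.pi * x / WIDTH + phase_y))
            sx = min(max(sx, 0), WIDTH - 1)   # clamp to the image bounds
            sy = min(max(sy, 0), HEIGHT - 1)
            row.append(matte[sy][sx])         # nearest-neighbour sampling
        warped.append(row)
    return warped

def blend(image, alpha, colour=1.0):
    """Apply a single-colour watermark: J = alpha * colour + (1 - alpha) * I."""
    return [[a * colour + (1 - a) * i for i, a in zip(ri, ra)]
            for ri, ra in zip(image, alpha)]

def unblend(marked, alpha, colour=1.0):
    """Invert the blend using the attacker's (possibly wrong) opacity map."""
    return [[(j - a * colour) / (1 - a) for j, a in zip(rj, ra)]
            for rj, ra in zip(marked, alpha)]

def max_error(a, b):
    return max(abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb))

rng = random.Random(0)
image = [[rng.random() for _ in range(WIDTH)] for _ in range(HEIGHT)]
matte = base_matte()

# The attacker removes the watermark with the unwarped matte in both cases.
consistent_error = max_error(unblend(blend(image, matte), matte), image)
warped_error = max_error(unblend(blend(image, warp(matte, rng)), matte), image)
print(consistent_error, warped_error)
```

In this toy setup the consistent watermark is removed almost perfectly, while the warped one leaves errors wherever the warped and unwarped mattes disagree, mostly along the logo’s edges. A more capable attacker would try to estimate the warp for each image, which is the harder task the researchers describe.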
The team admits that the defense isn’t perfect. More sophisticated algorithms will likely always be developed to bypass current practices, in a cat-and-mouse struggle similar to the one in cybersecurity. However, watermarking as currently practiced leaves images poorly protected, they say, and even a little of the right kind of randomness can go a long way toward keeping photographs safe from theft in the short term.