You can unzip this tiny image on Twitter to reveal the complete works of Shakespeare

Shakespeare was a prolific playwright, penning some 37 plays and 154 sonnets over his career. Now, his life’s work has been packed into a tiny image on Twitter by computer science researcher David Buchanan. When unzipped, the image yields further ZIP and RAR files, which in turn contain an HTML copy of Project Gutenberg’s The Complete Works of William Shakespeare, a 6.7MB page.

Buchanan explained to Motherboard that the trick came to him while he was testing how much raw data he could cram into a tweet. “I wrote a script which parses a JPG file and inserts a big blob of ICC metadata,” he said. “The metadata is carefully crafted so that all the required ZIP headers are in the right place.”

Twitter strips most metadata from uploaded images, but Buchanan was able to store the Shakespeare data in the image’s ICC profile (embedded color-management data), which survives Twitter’s scaling, compression, and thumbnailing.
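To make the idea concrete, here is a minimal sketch of how arbitrary bytes can be carried in a JPEG’s ICC profile segments. It is not Buchanan’s script: the function names `embed_icc_blob` and `extract_icc_blob` are hypothetical, and it does not attempt the careful placement of ZIP headers he describes. It only shows the container: ICC data travels in APP2 segments tagged `ICC_PROFILE`, each holding at most 65,519 payload bytes, so a larger blob has to be split across numbered chunks.

```python
import struct

ICC_TAG = b"ICC_PROFILE\x00"                # 12-byte identifier for ICC APP2 segments
MAX_CHUNK = 65535 - 2 - len(ICC_TAG) - 2    # payload bytes that fit in one segment


def embed_icc_blob(jpeg: bytes, blob: bytes) -> bytes:
    """Insert `blob` as ICC profile APP2 segments right after the SOI marker."""
    if jpeg[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG: missing SOI marker")
    chunks = [blob[i:i + MAX_CHUNK] for i in range(0, len(blob), MAX_CHUNK)] or [b""]
    if len(chunks) > 255:
        raise ValueError("blob too large for 255 ICC chunks")
    segments = []
    for seq, chunk in enumerate(chunks, start=1):
        # Body: tag, 1-based chunk number, total chunk count, then the data.
        body = ICC_TAG + bytes([seq, len(chunks)]) + chunk
        # The 2-byte length field counts itself plus the body.
        segments.append(b"\xff\xe2" + struct.pack(">H", len(body) + 2) + body)
    return jpeg[:2] + b"".join(segments) + jpeg[2:]


def extract_icc_blob(jpeg: bytes) -> bytes:
    """Walk the marker segments and reassemble any ICC profile chunks."""
    pos, parts = 2, {}
    while pos + 4 <= len(jpeg) and jpeg[pos] == 0xFF:
        marker = jpeg[pos + 1]
        if marker in (0xD9, 0xDA):  # end of image or start of scan data
            break
        (length,) = struct.unpack(">H", jpeg[pos + 2:pos + 4])
        body = jpeg[pos + 4:pos + 2 + length]
        if marker == 0xE2 and body.startswith(ICC_TAG):
            parts[body[12]] = body[14:]  # key by chunk number
        pos += 2 + length
    return b"".join(parts[k] for k in sorted(parts))
```

The sketch ignores length-less markers and makes no claim about what any particular image host preserves; that the ICC profile survives Twitter’s processing is Buchanan’s finding, not something this code checks.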

Buchanan says he tried reporting the technique to Twitter’s bug bounty program, but the company responded that it wasn’t a bug. That’s a little concerning, considering that the technique has the potential to hide malware within an image that’s in a tweet. It’s been done before on platforms like WhatsApp and Telegram, where researchers found that hackers could use malware-laced images to hijack users’ accounts.

I had to do a bit of work to follow the instructions Buchanan tweeted, but I was able to unzip the image to access the complete works of Shakespeare. It’s possible to rename the .JPG file with a .zip extension to unzip it, but unzipping the lol.zip file it produces will just create a copy of that file. Instead, open up Terminal, type “unzip” and a space, and drag the lol.zip file into the Terminal window. This will produce a bunch of copies of the same RAR file, but you only need to unrar one (I used The Unarchiver to do this) to access the HTML page.
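Why does an unzip tool accept a file that begins with image data? ZIP readers find the archive’s index by searching from the end of the file, so leading bytes they don’t recognize are generally tolerated. The sketch below shows this with Python’s standard `zipfile` module. It uses a simpler stand-in for Buchanan’s image: a few fake JPEG-like bytes glued in front of a small ZIP. The helper names and the `hamlet.txt` payload are made up for illustration, and this is plain concatenation rather than his ICC-based method.

```python
import io
import zipfile


def make_polyglot() -> bytes:
    """Build a tiny ZIP in memory and put non-ZIP bytes in front of it."""
    payload = io.BytesIO()
    with zipfile.ZipFile(payload, "w") as zf:
        zf.writestr("hamlet.txt", "To be, or not to be")
    fake_image_prefix = b"\xff\xd8" + b"\x00" * 64  # stand-in for real image data
    return fake_image_prefix + payload.getvalue()


def read_archive(data: bytes) -> dict:
    """Open the bytes as a ZIP and return {member name: decoded text}."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        return {name: zf.read(name).decode() for name in zf.namelist()}


contents = read_archive(make_polyglot())
```

With a real downloaded file, passing its path to `zipfile.ZipFile` and calling `extractall()` should play the same role as the `unzip` command above, assuming the image was saved at full size with its embedded data intact.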

Malicious possibilities aside, the technique still allows for some fun experimenting with other files that can be embedded in an image. If you’d like to view the source code to try out the process for yourself, Buchanan has nestled the PDF within the image of this tweet.