
‘May be an image’: what it’s like browsing Instagram while blind

The ridiculous sounds of social sites

Animation of a blank Instagram post on the left emitting an audio waveform that breaks apart before reaching an ear on the right. Illustration by William Joel / The Verge

Using a screen reader to navigate Instagram, as some people with low vision do, is a strange patchwork of sounds. It can be overwhelming, especially if you’re used to quickly scanning information with your eyes, to hear a synthetic voice clunkily rattle off usernames, timestamps, and like counts as though they’re all just as important as the actual content of the post. Among all that auditory stimulation, if someone added alt text to their photo, you might hear something like “John and I standing with our ankles in the water at the beach. John is making a distressed face while I menacingly hold out a dead crab and laugh.”

The image descriptions read by screen readers have to be added by users, and like many accessibility features on social media, those fields are regularly neglected. In those cases, the voice will sometimes recite alt text that Instagram or the user’s device generates automatically. The result, Danielle McCann, the social media coordinator for the National Federation of the Blind, tells me, can be pretty funny. Despite years of machine learning behind them, the automatically generated descriptions still often misidentify what’s happening in photos.

Recently, she was scrolling through Instagram when her screen reader said there was a photo of “two brown cats lying on a textured surface.” Her husband informed her that it was actually a bridal shop ad featuring a woman in a wedding dress. “Thank goodness I wasn’t [commenting] like, ‘Oh those cats are cute,’ you know?”

These kinds of algorithmic misinterpretations are pretty common. Here’s a sampling of descriptions I heard while I browsed Instagram with VoiceOver on my phone: “red polo, apple, unicorn” (a photo of a T-shirt with a drawing of a couch on it), “may be an image of indoor” (a photo of a cat next to a house plant), “may be an image of food” (a photo of seashells), “may be a cartoon” (almost every illustration or comic panel), and a whole lot of “may be an image of one person” (a variety of photos featuring one or more people).

As devices have gained accessibility settings like magnification, high contrast, and built-in screen readers, social media has also slowly become more accessible for people who are blind or have low vision: many sites and apps respond to users’ device settings, have options to toggle light and dark modes, and enable users to compose image descriptions. But the existence of those features doesn’t guarantee people with disabilities won’t be excluded online. Social media accessibility is a group effort. People have to know about the features, understand what they are, and actually remember to use them. A platform can have a hundred accessibility options, but without buy-in from every user, people are still left out. 

Even when people use alt text, they often don’t fully think through what’s important to convey to someone who can’t see photos. Some people will write overly simplistic descriptions like “red flower” or “blonde girl looking at sky,” without actually describing what it is about the images that makes them worth sharing. At the other extreme, multiple paragraphs describing a single image can be annoying to navigate with a screen reader. McCann tells friends to think of alt text as a writing exercise: “How do you provide as much information in as few words as possible?”

“The general rule is to be informative, not poetic,” says the American Foundation for the Blind (AFB). “But on social media, feel free to add some personality — you’re probably sharing that picture of your dog because he has a hilarious quizzical expression, for example, not because he is a black-and-white pitbull mix.”

While automated image descriptions might eventually improve beyond the level of mistaking a woman in a wedding dress for some cats, they can’t replace the human element. Facebook had an image outage in 2019 that showed all of its users the photo tags that are usually hidden, displaying machine-assigned descriptors like “Image may contain: people standing.” Are the people contained in that image embracing and making goofy faces? Are they standing in front of a breathtaking vista? Social media can feel a lot less social if your access to the content shared within it relies on computers’ conservative interpretations. 

Advocates stress that accessibility should always be a consideration from the start, “not as an add-on to an already-existing platform well after the fact,” says AFB. But most popular platforms, including Twitter, Instagram, and TikTok, didn’t take that route during initial development, and are instead constantly playing catch-up to improve their accessibility. When those improvements roll out, it’s never guaranteed that people will consistently use them.

One of the biggest barriers is the assumption that blind people just won’t be interested in visual media. “Just because they’re visual doesn’t mean that they’re immediately not attractive to people who are blind or low vision,” says McCann. “I think that’s one big misconception: ‘Oh, well, they don’t care about pictures.’ But we do.” When culture is molded on social networks, it sucks to lose out on a shared social language because you can’t see the images everyone is talking about. 

Christy Smith Berman, a low vision editor at Can I Play That, replied to a TT Games tweet that announced the delay of a Lego Star Wars game using text embedded in an image. When she asked for alt text, Smith Berman was met with responses from people expressing disbelief that blind people would even be on Twitter to begin with, let alone care about video games.

Those false assumptions often mean that people are left out of fun cultural moments on social media. Memes usually involve rapidly evolving iterations of undescribed images with tiny words in weird fonts. Viral videos are reposted and shared without any kind of description, through audio or text, of what’s happening on-screen. “Oh, that must be somebody dancing,” thinks McCann when she encounters a TikTok with no audio besides music. “Well, no, it’s actually somebody making a cheesesteak. But I didn’t know that because there’s no audio indication.”

“A lot of the memes that people share, they don’t add alt text to it,” says Steven Aquino, a legally blind journalist. Aquino doesn’t use a screen reader, instead relying on magnification, but he’s still sometimes left wondering what’s going on in memes. “It’s really hard because I can’t see so well, and I just feel like, ‘Okay, it’s supposed to be funny, but I can’t tell.’”

Beyond a simple neglect of accessibility features, conveying visual humor through text isn’t something everyone has a knack for. The funniest images rely on comedic timing through careful visual composition, prior knowledge of a specific meme, or familiarity with several different cultural references. Writing an image description for an esoteric meme can feel like explaining internet culture to your grandparents: you suddenly don’t know how to describe what exactly made you laugh. The complicated nature of meme literacy isn’t something we can blame on platforms — it’s just not something the average person is used to putting into words. 

But there are other, less complicated factors that can impact the online experiences of people who are blind or have low vision. Aquino points out that people will use special Unicode characters in their Twitter display names that are harder to read and aren’t interpreted as letters by screen-reading software. A screen reader isn’t technically incorrect if it reads a character as “mathematical bold capital,” but most sighted people will read it simply as a letter with different formatting.

“For people that do use screen readers, this software is only so smart,” says Aquino. “So if you’ve got a clever name, your VoiceOver or whatever it is that you use is going to fail.” Tweets that include rows of emojis, or a lot of special characters to create an image or convey cursive script, can be hellish to listen to when they’re read by a screen reader. Posting a screenshot of the tweet with alt text is a workable alternative, but people rarely know to do so.
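To make that concrete, here’s a minimal sketch in Python (the three-letter name is a hypothetical stand-in for any stylized display name) of what screen-reading software is actually handed. Each “fancy” character is a distinct Unicode codepoint with its own formal name, not the plain letter it happens to resemble on screen:

```python
import unicodedata

# "Ste" written in "mathematical bold script" characters, the kind of
# styled Unicode that fancy-text generators produce for display names.
# (Hypothetical example; any stylized name behaves the same way.)
fancy = "\U0001D4E2\U0001D4FD\U0001D4EE"  # renders as: 𝓢𝓽𝓮
plain = "Ste"

for char in fancy:
    # unicodedata.name() returns the character's formal Unicode name --
    # roughly the mouthful a screen reader falls back on when it has
    # no friendlier way to announce an unfamiliar character.
    print(unicodedata.name(char))
# MATHEMATICAL BOLD SCRIPT CAPITAL S
# MATHEMATICAL BOLD SCRIPT SMALL T
# MATHEMATICAL BOLD SCRIPT SMALL E

for char in plain:
    print(unicodedata.name(char))
# LATIN CAPITAL LETTER S
# LATIN SMALL LETTER T
# LATIN SMALL LETTER E

# The stylized string also isn't equal to the plain one, so a text
# search for the real name won't find the decorated version.
print(fancy == plain)  # False
```

A sighted reader’s brain smooths those symbols back into ordinary letters; a screen reader has no such shortcut, so a decorated name comes out as a recitation of codepoint names instead of a word.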

McCann is glad that many sites have improved their accessibility options over the years, but she wishes they were more widely used and wonders why they aren’t better promoted. TikTok has text-to-speech and warns people when flashing effects in their videos might trigger seizures, so why can’t all social sites have better prompts for encouraging users to add captions, visual descriptions, and alt text?

“The onus is on the disability community to educate,” she says. “Why isn’t there more education from these mainstream companies?”

McCann wishes it were easier for her to join the party when things like TikTok videos go viral. “Unless I have someone sit with me and explain to me what’s going on, I definitely feel like I can’t have a conversation about it with someone,” she says. “It is exclusionary to a point, because I like jokes. I like pasta recipes. I want to know that stuff! I’m still a part of the social fabric.”