The Internet Archive is adding digital previews of book sources to Wikipedia articles

50,000 books are already available

Photo by Helen H. Richardson/The Denver Post via Getty Images

A new initiative from the Internet Archive makes it easier to check citations on Wikipedia by linking to digitized previews of the books being referenced. When a scan of a book is available, this should make it far easier to verify that a source says what the Wikipedia article claims.

While it’s always been possible to do the same thing by tracking down a physical copy of any books cited, this often isn’t practical for journalists or students working to tight deadlines, especially for hard-to-find books. In theory, the new initiative means that a source is just a click away.

Clicking a compatible citation (for example, from Martin Luther King’s Wikipedia page) brings you to a two-page preview of the book from the Internet Archive.
Screenshot by Jon Porter / The Verge

In practice, however, it’s going to take some time to match Wikipedia’s millions of citations with the relevant books. So far, the Internet Archive has linked a relatively small pool of 130,000 citations to 50,000 books. The plan also relies on Wikipedia’s authors citing books in the correct format, and they’ll need to specify an exact page number for the system to work. ISBNs are very helpful for finding matches, but not every book has one, according to Wired.
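
To make those requirements concrete, here is a minimal sketch of how a script might decide whether a citation carries enough detail to be linked. It is not the Internet Archive’s actual method: the function names, the matching rule, and the sample citation (including its ISBN) are hypothetical, and it assumes citations are written with Wikipedia’s `{{cite book}}` template using the `isbn` and `page` parameters.

```python
import re


def parse_cite_book(wikitext: str) -> dict:
    """Extract named parameters from the first {{cite book ...}} template.

    Simplification: nested templates inside the citation are not handled.
    """
    match = re.search(
        r"\{\{\s*cite book\s*\|(.*?)\}\}", wikitext, re.IGNORECASE | re.DOTALL
    )
    if match is None:
        return {}
    fields = {}
    for part in match.group(1).split("|"):
        if "=" in part:
            key, value = part.split("=", 1)
            fields[key.strip().lower()] = value.strip()
    return fields


def normalize_isbn(raw: str) -> str:
    """Drop hyphens and spaces so differently formatted ISBNs compare equal."""
    return re.sub(r"[-\s]", "", raw).upper()


def is_linkable(fields: dict) -> bool:
    """Hypothetical rule: require an ISBN-10 or ISBN-13 and one exact page."""
    isbn = normalize_isbn(fields.get("isbn", ""))
    page = fields.get("page", "")
    return len(isbn) in (10, 13) and page.isdigit()


# Made-up citations for illustration only.
with_page = "{{cite book |last=Smith |title=Example Title |isbn=978-0-00-000000-2 |page=42}}"
without_page = "{{cite book |last=Smith |title=Example Title |isbn=978-0-00-000000-2}}"

print(is_linkable(parse_cite_book(with_page)))     # True
print(is_linkable(parse_cite_book(without_page)))  # False
```

Under this rule, the second citation is skipped because it lacks a page number, which mirrors the limitation described above: a book can be scanned and still not be linkable from a loosely formatted citation.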

Away from the challenges of matching citations up with the right books, the Internet Archive has made good progress in digitizing books in the first place. Wired reports that the organization already has a database of 3.8 million scanned books, and that it’s scanning more at a rate of over 1,000 a day. The Internet Archive says it wants to bring 4 million more books online over the coming years.

Digitizing Wikipedia’s book citations is just one part of the Internet Archive’s attempts to make accurate information easier to find online. As well as its work on book citations, it has also been scraping Wikipedia to replace broken citations with links to pages it has archived in its Wayback Machine. As of the beginning of October, its InternetArchiveBot has fixed nearly 6 million broken citations across Wikipedia.

Correction: The Internet Archive is archiving books at a rate of 1,000 per day, not 10,000 as originally stated. Added a clarification to note that the organization is bringing an additional 4 million books online over the coming years, not 4 million total.