One data set doxxed every cab driver in New York

When civic hacker Chris Whong got a full year's worth of taxi logs from New York's taxi commission, it seemed like a victory for data visualizers everywhere. But a new piece from Vijay Pandurangan suggests the data set is more revealing than the city initially thought. Diving into the data, Pandurangan shows how easy it is to work back from the pickup and dropoff locations and times to build a comprehensive record for each driver: medallion number, name, and a rough estimate of annual income. The final lesson: anonymization is hard, and respecting privacy is a delicate balancing act in a data set large enough to cover every cab driver in New York City.