How does Facebook manage the insane amount of data that one billion users pour into the service nearly every day? Wired explains how the company stores and analyzes it using custom-built software such as Prism and Scuba. Facebook has the world's largest Hadoop cluster (a group of servers connected using Hadoop's open-source software), with more than 4,000 machines containing over 100 petabytes of data. Even more impressive, it isn't Facebook's only cluster. Managing the constantly swelling system takes some of the greatest engineering and computing minds, but as database administration and storage systems manager Santosh Janardhan told Wired, "if you're a technical guy, this is like Candy Land."
Facebook's data management system is 'like Candy Land' for engineers