IBM Builds Biggest Data Drive Ever (120 PB)

http://www.technologyreview.com/computing/38440/?nlid=nldly&nld=2011-08…
http://nextbigfuture.com/2011/08/ibm-builds-120-petabyte-drive.html
A 120-petabyte drive could hold 24 billion typical five-megabyte MP3 files or comfortably swallow 60 copies of the biggest backup of the Web, the 150 billion pages that make up the Internet Archive's Wayback Machine.
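
The capacity figures above can be checked with a few lines of arithmetic. This is only a sanity check, assuming decimal units (1 PB = 10^15 bytes, 1 MB = 10^6 bytes); the variable names are made up for illustration.

```python
# Sanity check of the quoted capacity figures.
# Assumption: decimal units, as storage vendors normally use.
PB = 10**15  # bytes in one petabyte
MB = 10**6   # bytes in one megabyte

drive_bytes = 120 * PB
mp3_bytes = 5 * MB

# How many five-megabyte MP3 files fit on the drive?
mp3_count = drive_bytes // mp3_bytes

# If 60 copies of the web archive fit, how big is one copy at most?
archive_copy_bytes = drive_bytes // 60

print(f"{mp3_count:,} MP3 files")
print(f"{archive_copy_bytes / PB:.0f} PB per archive copy")
```

So the "60 copies" claim implies an archive of about 2 PB per copy.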

The data storage group at IBM Almaden is developing the record-breaking storage system for an unnamed client that needs a new supercomputer for detailed simulations of real-world phenomena. However, the new technologies developed to build such a large repository could enable similar systems for more conventional commercial computing, says Bruce Hillsberg, director of storage research at IBM and leader of the project.

The largest arrays available today are about 15 petabytes in size.

IBM's engineers developed a series of new hardware and software techniques to enable such a large hike in data-storage capacity. Finding a way to efficiently combine the thousands of hard drives that the system is built from was one challenge. As in most data centers, the drives sit in horizontal drawers stacked inside tall racks. Yet IBM's researchers had to make the drawers significantly wider than usual to fit more disks into a smaller area. The disks must be cooled with circulating water rather than standard fans.
...

The new system also benefits from a file system known as GPFS that was developed at IBM Almaden to give supercomputers faster data access. It spreads individual files across multiple disks so that many parts of a file can be read or written at the same time. GPFS also enables a large system to keep track of its many files without laboriously scanning through every one. Last month a team from IBM used GPFS to index 10 billion files in 43 minutes, effortlessly breaking the previous record of one billion files scanned in three hours.
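
The idea of spreading one file across several disks can be illustrated with a toy round-robin striping sketch. This is not GPFS code and does not reflect its actual on-disk layout; the function names `stripe` and `read_striped`, the block size, and the use of in-memory lists as stand-ins for disks are all assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List


def stripe(data: bytes, num_disks: int, block_size: int) -> List[List[bytes]]:
    """Split data into fixed-size blocks and deal them out round-robin."""
    disks: List[List[bytes]] = [[] for _ in range(num_disks)]
    for offset in range(0, len(data), block_size):
        block_index = offset // block_size
        # Block k lands on disk k mod num_disks.
        disks[block_index % num_disks].append(data[offset:offset + block_size])
    return disks


def read_striped(disks: List[List[bytes]]) -> bytes:
    """Fetch every disk's blocks concurrently, then interleave them back."""
    # Each "disk" here is just a list; copying it stands in for a real read.
    with ThreadPoolExecutor(max_workers=len(disks)) as pool:
        per_disk = list(pool.map(list, disks))

    out = bytearray()
    longest = max(len(blocks) for blocks in per_disk)
    # Row r of disk d holds original block r * num_disks + d.
    for row in range(longest):
        for blocks in per_disk:
            if row < len(blocks):
                out += blocks[row]
    return bytes(out)


data = bytes(range(10))
disks = stripe(data, num_disks=3, block_size=2)
restored = read_striped(disks)
print(restored == data)
```

Because consecutive blocks sit on different disks, a large sequential read or write can keep all of the disks busy at once instead of queueing on one.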