Swarm is a storage system that provides scalable, reliable, and cost-effective data storage. Swarm is based on storage servers, rather than file servers; the storage servers are o...
Modern scientific experiments can generate hundreds of gigabytes to terabytes or even petabytes of data that may furthermore be maintained in large numbers of relatively small fil...
Wantao Liu, Brian Tieman, Rajkumar Kettimuthu, Ian...
“Garbage in, garbage out” is a well-known phrase in computer analysis, and one that comes to mind when mining Web data to draw conclusions about Web users. The challenge is th...
Large client-server data-intensive applications can place high demands on system and network resources. This is especially true when the connection between the client and server s...
The Sloan Digital Sky Survey (SDSS) Data Archive Server (DAS) provides public access to over 12 TB of data in 17 million files produced by the SDSS data reduction pipeline. Many tas...