A Quantitative Analysis of Storage Efficient Management for Distributed File System

Scalable distributed file systems are a major solution for managing exponentially growing large-scale data. The Google File System and the Hadoop Distributed File System are typical examples of distributed file systems that manage massive volumes of data. They provide high fault tolerance and availability by keeping multiple replicas of each file. However, this replication scheme incurs substantial storage overhead. We address this space overhead problem in this paper. We propose a practical solution that reduces the space overhead by combining distributed RAID techniques with a replication-based DFS. Distributed RAID provides availability through parities generated by erasure coding, such as Reed-Solomon coding, which requires less space overhead than replication. Our solution reduces storage overhead by substituting parities for the replicas of a certain group of data blocks. We also perform a quantitative analysis of distributed systems with RAID parity. Finally, we discuss issues related to decreasing the space overhead and increasing the availability of distributed file systems.
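The space-overhead trade-off described in the abstract can be illustrated with a short sketch. This is not the paper's analysis; it is a minimal back-of-the-envelope comparison, assuming HDFS-style 3-way replication and an illustrative Reed-Solomon parameter choice of RS(10, 4):

```python
# Illustrative sketch (not from the paper): extra storage required by
# n-way replication vs. Reed-Solomon erasure coding RS(k, m).

def replication_overhead(replicas: int) -> float:
    """Extra storage as a fraction of the original data size.

    With n replicas, n copies are stored, so the overhead is n - 1.
    """
    return float(replicas - 1)

def reed_solomon_overhead(data_blocks: int, parity_blocks: int) -> float:
    """RS(k, m): k data blocks are protected by m parity blocks,
    so the extra storage is m/k of the original data size."""
    return parity_blocks / data_blocks

# 3-way replication: 200% overhead, tolerates the loss of 2 copies.
print(replication_overhead(3))        # 2.0

# RS(10, 4) (illustrative parameters): 40% overhead,
# tolerates the loss of any 4 of the 14 blocks.
print(reed_solomon_overhead(10, 4))   # 0.4
```

Under these assumptions, replacing replicas with parities cuts the extra storage from 200% to 40% while still tolerating multiple block failures, which is the motivation the abstract describes.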
Publisher
KIISE
Issue Date
2011-08-25
Language
English
Citation

The Third International Conference on Emerging Databases (EDB 2011), v.3, pp.98-108

URI
http://hdl.handle.net/10203/170327
Appears in Collection
CS-Conference Papers (Conference Papers)
