Reducing I/O Cost in OLAP Query Processing with MapReduce

Cited 0 time in webofscience Cited 0 time in scopus
  • Hit : 319
  • Download : 0
This paper presents a method to reduce I/O cost in MapReduce when online analytical processing (OLAP) queries are used for data analysis. The proposed method consists of two basic ideas. First, to reduce network transmission cost, mappers are organized to receive only data necessary to perform a map task, not an entire set of input data. Second, to reduce storage consumption, only record IDs are stored for checkpointing, not the raw records. Experiments conducted with TPC-H benchmark show that the proposed method is about 40% faster than Hive, the well-known data warehouse solution for MapReduce, while reducing the size of data stored for checkpoining to about 80%.
Publisher
IEICE-INST ELECTRONICS INFORMATION COMMUNICATIONS ENG
Issue Date
2015-02
Language
English
Article Type
Article
Citation

IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, v.E98D, no.2, pp.444 - 447

ISSN
1745-1361
DOI
10.1587/transinf.2014EDL8143
URI
http://hdl.handle.net/10203/200071
Appears in Collection
CS-Journal Papers(저널논문)
Files in This Item
There are no files associated with this item.

qr_code

  • mendeley

    citeulike


rss_1.0 rss_2.0 atom_1.0