Parallel labeling of massive XML data with MapReduce

Authors: Choi, Hyebong / Lee, Kyong-Ha / Lee, Yoon Joon

The volume of XML data has become enormous and continues to grow quickly, as more and more data is encoded in XML by virtue of its simplicity and extensibility. Although tree labeling plays a crucial role in XML query processing, conventional labeling algorithms are all sequential, so they fail to label large volumes of XML data in a timely manner. To address this issue, we devise parallel tree labeling algorithms for massive XML data. Specifically, we focus on how to efficiently label a single large XML file in parallel. We first propose parallel versions of two prominent tree labeling schemes based on the MapReduce framework. We then present techniques for runtime workload balancing and data repartitioning to solve performance issues caused by data skew and by MapReduce's inherent limitations. Through extensive experiments with synthetic and real-world datasets on 15 nodes, we show that our parallel labeling algorithms are up to 17 times faster than conventional algorithms while remaining robust to data skew.
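
For readers unfamiliar with tree labeling, the sketch below shows a conventional sequential pass of the kind the abstract contrasts against. It is an illustration only: the abstract does not name the two schemes the paper parallelizes, so the interval (start, end, level) labeling used here is an assumption, and the function names `label_intervals` and `is_ancestor` are hypothetical. Each label depends on a counter advanced by every preceding element, which hints at why a single large file is hard to label in independent chunks.

```python
import io
import xml.etree.ElementTree as ET


def label_intervals(xml_text):
    """Sequentially assign (tag, start, end, level) to each element.

    Assumed scheme: a generic interval labeling, not necessarily one of
    the two schemes used in the paper.
    """
    counter = 0   # global position counter shared by all elements
    stack = []    # indices into `labels` for elements still open
    labels = []   # one [tag, start, end, level] entry per element
    source = io.BytesIO(xml_text.encode("utf-8"))
    for event, elem in ET.iterparse(source, events=("start", "end")):
        counter += 1
        if event == "start":
            stack.append(len(labels))
            # level is the depth of the element; the root has level 0
            labels.append([elem.tag, counter, None, len(stack) - 1])
        else:
            # closing tag: fill in the end position of the matching element
            labels[stack.pop()][2] = counter
    return [tuple(entry) for entry in labels]


def is_ancestor(a, b):
    """True if label a strictly contains label b (a is an ancestor of b)."""
    return a[1] < b[1] and b[2] < a[2]


if __name__ == "__main__":
    for label in label_intervals("<a><b><c/></b><d/></a>"):
        print(label)
```

With such labels, an ancestor-descendant test reduces to two integer comparisons, which is the usual reason labeling matters for XML query processing.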

Publisher: SPRINGER

Issue Date: 2014-02

Language: English

Article Type: Article

Keywords: ALGORITHM

Citation: JOURNAL OF SUPERCOMPUTING, v.67, no.2, pp.408 - 437

ISSN: 0920-8542

DOI: 10.1007/s11227-013-1008-6

URI: http://hdl.handle.net/10203/190076

Appears in Collection: CS-Journal Papers(저널논문)

