A distributed in-situ analysis method for large-scale scientific data

Han, Donghyoung / Nam, Yoon-Min / Kim, Min-Soo

In recent years, massive amounts of data have been generated in a wide range of scientific applications, such as NASA's satellites, the Large Hadron Collider, and the Large Synoptic Survey Telescope. Most scientific data follow the array model, and there are various standard array formats such as HDF, NetCDF, MDSplus, and ROOT. SciDB is the best-known DBMS that stores array-based scientific data and processes queries on it. Because SciDB is a distributed DBMS, it is scalable in terms of query performance. However, it has a severe drawback: loading a massive amount of scientific data into the DBMS takes a huge amount of time. That is, it is not scalable in terms of data loading. To overcome this problem, we propose a distributed in-situ analysis method that processes queries on raw scientific data in a distributed manner without explicit data loading. Specifically, we propose an in-situ scan operator that scans the necessary data in the array format and passes it to the upper operators in the pipeline of a query plan. The operator also repartitions the data during in-situ scanning, which is required for correct query results. Through experiments on real datasets, we show that SciDB using our method outperforms the original SciDB by orders of magnitude in terms of the performance of the first query. © 2017 IEEE.
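
The idea of scanning raw array data in place and repartitioning it during the scan can be illustrated with a minimal Python sketch. This is not the authors' implementation and does not use any SciDB API: the flat list `raw` stands in for a raw array file, and the function names (`in_situ_scan`, `chunk_owner`, `repartition`, `partial_sums`), the fixed square chunk size, and the round-robin chunk-to-worker mapping are all assumptions made only for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple

Cell = Tuple[int, int, float]  # (row, col, value)


def in_situ_scan(raw: List[float], n_cols: int,
                 row_range: Tuple[int, int],
                 col_range: Tuple[int, int]) -> Iterator[Cell]:
    """Yield only the cells inside the queried region of a row-major array.

    `raw` is a hypothetical stand-in for a raw file; nothing is loaded
    into a separate store before the query runs.
    """
    for r in range(row_range[0], row_range[1]):
        for c in range(col_range[0], col_range[1]):
            yield r, c, raw[r * n_cols + c]


def chunk_owner(r: int, c: int, chunk: int,
                chunks_per_row: int, n_workers: int) -> int:
    """Assumed placement rule: chunk ids are assigned to workers round-robin."""
    chunk_id = (r // chunk) * chunks_per_row + (c // chunk)
    return chunk_id % n_workers


def repartition(cells: Iterable[Cell], chunk: int,
                n_cols: int, n_workers: int) -> Dict[int, List[Cell]]:
    """Route each scanned cell to the worker that owns its chunk."""
    chunks_per_row = -(-n_cols // chunk)  # ceiling division
    parts: Dict[int, List[Cell]] = defaultdict(list)
    for r, c, v in cells:
        owner = chunk_owner(r, c, chunk, chunks_per_row, n_workers)
        parts[owner].append((r, c, v))
    return dict(parts)


def partial_sums(parts: Dict[int, List[Cell]]) -> Dict[int, float]:
    """An 'upper operator': each worker sums the cells it received."""
    return {w: sum(v for _, _, v in cells) for w, cells in parts.items()}


# A 4x4 array holding the values 0..15 in row-major order.
raw = [float(i) for i in range(16)]

# The query touches only columns 1 and 2, so only those cells are scanned.
cells = in_situ_scan(raw, n_cols=4, row_range=(0, 4), col_range=(1, 3))
parts = repartition(cells, chunk=2, n_cols=4, n_workers=2)
sums = partial_sums(parts)
total = sum(sums.values())
print(sums, total)
```

In this sketch the scan is a generator, so cells flow to the repartitioning step and the aggregate one at a time, which loosely mirrors passing scanned data to upper operators in a pipeline.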

Publisher: Institute of Electrical and Electronics Engineers Inc.

Issue Date: 2017-02

Language: English

Citation: 2017 IEEE International Conference on Big Data and Smart Computing, BigComp 2017, pp. 69-75

DOI: 10.1109/BIGCOMP.2017.7881718

URI: http://hdl.handle.net/10203/274439

Appears in Collection: RIMS Conference Papers
