PACC: Large scale connected component computation on Hadoop and Spark

DC Field | Value | Language
dc.contributor.author | Park, Ha-Myung | ko
dc.contributor.author | Park, Namyong | ko
dc.contributor.author | Myaeng, Sung-Hyon | ko
dc.contributor.author | Kang, U. | ko
dc.date.accessioned | 2020-06-22T10:20:30Z | -
dc.date.available | 2020-06-22T10:20:30Z | -
dc.date.created | 2020-06-15 | -
dc.date.issued | 2020-03 | -
dc.identifier.citation | PLOS ONE, v.15, no.3 | -
dc.identifier.issn | 1932-6203 | -
dc.identifier.uri | http://hdl.handle.net/10203/274779 | -
dc.description.abstract | A connected component in a graph is a set of nodes linked to each other by paths. The problem of finding connected components has been applied to diverse graph analysis tasks such as graph partitioning, graph compression, and pattern recognition. Several distributed algorithms have been proposed to find connected components in enormous graphs. Ironically, the distributed algorithms do not scale enough due to unnecessary data IO & processing, massive intermediate data, numerous rounds of computations, and load balancing issues. In this paper, we propose a fast and scalable distributed algorithm PACC (Partition-Aware Connected Components) for connected component computation based on three key techniques: two-step processing of partitioning & computation, edge filtering, and sketching. PACC considerably shrinks the size of intermediate data, the size of input graph, and the number of rounds without suffering from load balancing issues. PACC performs 2.9 to 10.7 times faster on real-world graphs compared to the state-of-the-art MapReduce and Spark algorithms. | -
dc.language | English | -
dc.publisher | PUBLIC LIBRARY SCIENCE | -
dc.title | PACC: Large scale connected component computation on Hadoop and Spark | -
dc.type | Article | -
dc.identifier.wosid | 000535300000026 | -
dc.identifier.scopusid | 2-s2.0-85081894429 | -
dc.type.rims | ART | -
dc.citation.volume | 15 | -
dc.citation.issue | 3 | -
dc.citation.publicationname | PLOS ONE | -
dc.identifier.doi | 10.1371/journal.pone.0229936 | -
dc.contributor.localauthor | Myaeng, Sung-Hyon | -
dc.contributor.nonIdAuthor | Park, Ha-Myung | -
dc.contributor.nonIdAuthor | Park, Namyong | -
dc.contributor.nonIdAuthor | Kang, U. | -
dc.description.isOpenAccess | Y | -
dc.type.journalArticle | Article | -
dc.subject.keywordPlus | FRAMEWORK | -
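
The abstract defines a connected component as a set of nodes linked to each other by paths. As a rough illustration of that definition only, the sketch below computes components on a single machine with a union-find structure. It is not PACC and does not reproduce the paper's partitioning, edge filtering, or sketching techniques; the function name `connected_components` and the edge-list input format are assumptions made for this example.

```python
from collections import defaultdict


def connected_components(edges):
    """Group nodes into connected components using union-find.

    `edges` is an iterable of (u, v) pairs with comparable node ids.
    Single-machine illustration only; this is NOT the PACC algorithm.
    """
    parent = {}

    def find(x):
        # Register unseen nodes as their own representative.
        parent.setdefault(x, x)
        # Path halving: point each visited node at its grandparent.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            # Use the smaller id as the component representative.
            parent[max(ru, rv)] = min(ru, rv)

    # Collect every node under its final representative.
    groups = defaultdict(set)
    for node in list(parent):
        groups[find(node)].add(node)
    return sorted(groups.values(), key=min)
```

For instance, calling `connected_components([(1, 2), (2, 3), (4, 5)])` groups nodes 1, 2, and 3 into one component and nodes 4 and 5 into another. Distributed algorithms such as those the abstract compares must reach the same grouping without holding the whole graph in one process, which is where the costs in intermediate data and rounds arise.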