Tri-Fly: Distributed Estimation of Global and Local Triangle Counts in Graph Streams

Cited 0 time in webofscience Cited 7 time in scopus
  • Hit : 200
  • Download : 0
Given a graph stream, how can we estimate the number of triangles in it using multiple machines with limited storage? Counting triangles (i.e., cycles of length three) is a classical graph problem whose importance has been recognized in diverse fields, including data mining, social network analysis, and databases. Recently, for triangle counting in massive graphs, two approaches have been intensively studied. One approach is streaming algorithms, which estimate the count of triangles incrementally in time-evolving graphs or in large graphs only part of which can be stored. The other approach is distributed algorithms for utilizing computational power and storage of multiple machines. Can we have the best of both worlds? We propose Tri-Fly, the first distributed streaming algorithm for approximate triangle counting. Making one pass over a graph stream, Tri-Fly rapidly and accurately estimates the counts of global triangles and local triangles incident to each node. Compared to state-of-the-art single-machine streaming algorithms, Tri-Fly is (a) Accurate: yields up to 4.5 × smaller estimation error, (b) Fast: runs up to 8.8 × faster with linear scalability, and (c) Theoretically sound: gives unbiased estimates with smaller variances.
Publisher
Springer
Issue Date
2018-06-05
Language
English
Citation

Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD), pp.651 - 663

DOI
10.1007/978-3-319-93040-4_51
URI
http://hdl.handle.net/10203/251574
Appears in Collection
RIMS Conference Papers
Files in This Item
There are no files associated with this item.

qr_code

  • mendeley

    citeulike


rss_1.0 rss_2.0 atom_1.0