Integrating outlier removal into existing histogram construction methods for geographic data

Histograms are widely used to estimate selectivity in query optimization. In this paper, we propose a new approach to improving the accuracy of histograms for multi-dimensional geographic data. The idea is to remove outliers from histogram buckets where appropriate, which makes the data distribution within each bucket's area more uniform and thus improves the histogram's accuracy. Although histogram construction and outlier detection have each been investigated extensively, no prior work has integrated the two to improve histogram accuracy. We therefore first explain why removing outliers benefits histograms. We then describe a simple yet effective algorithm for detecting and removing outliers from histogram buckets. The algorithm is designed specifically for histogram buckets and can be integrated easily into existing histogram construction methods. Through extensive experiments on real-life data sets, we show that the proposed approach improves the accuracy of existing histogram construction methods by a factor of two on average.
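The intuition behind the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm, which the abstract does not spell out. It assumes a bucket is summarized by its bounding rectangle and point count, that selectivity is estimated under the usual uniformity assumption, and that removed outliers are stored separately and checked exactly. The outlier rule in `split_outliers` (distance from the centroid greater than `k` times the median distance) and all function names are hypothetical choices made only for this example.

```python
from statistics import median

# Rectangles are (xmin, ymin, xmax, ymax); points are (x, y).

def bounding_box(points):
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs), min(ys), max(xs), max(ys))

def area(rect):
    # An inverted rectangle (no overlap) counts as zero area.
    return max(0.0, rect[2] - rect[0]) * max(0.0, rect[3] - rect[1])

def intersection(a, b):
    return (max(a[0], b[0]), max(a[1], b[1]), min(a[2], b[2]), min(a[3], b[3]))

def contains(rect, point):
    x, y = point
    return rect[0] <= x <= rect[2] and rect[1] <= y <= rect[3]

def split_outliers(points, k=3.0):
    """Hypothetical rule (an assumption, not the paper's method): a point is
    an outlier if it lies more than k times the median distance from the
    bucket centroid."""
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)
    dists = [((x - cx) ** 2 + (y - cy) ** 2) ** 0.5 for x, y in points]
    cutoff = k * median(dists)
    inliers = [p for p, d in zip(points, dists) if d <= cutoff]
    outliers = [p for p, d in zip(points, dists) if d > cutoff]
    return inliers, outliers

def estimate(points, outliers, query):
    """Uniformity-based estimate over the bucket's bounding box, plus an
    exact check of the separately stored outliers."""
    box = bounding_box(points)
    box_area = area(box)
    fraction = area(intersection(box, query)) / box_area if box_area > 0 else 0.0
    exact = sum(1 for p in outliers if contains(query, p))
    return len(points) * fraction + exact

# Toy bucket: a dense 10 x 10 grid plus two far-away points.
dense = [(float(x), float(y)) for x in range(10) for y in range(10)]
bucket = dense + [(100.0, 100.0), (95.0, 5.0)]
query = (0.0, 0.0, 4.0, 4.0)

true_count = sum(1 for p in bucket if contains(query, p))
before = estimate(bucket, [], query)          # outliers stretch the bucket
inliers, outliers = split_outliers(bucket)
after = estimate(inliers, outliers, query)    # tighter, more uniform bucket

print(true_count, before, after)
```

On this toy data the two distant points stretch the bucket's bounding box, so the uniformity assumption spreads the count over mostly empty space; after they are set aside, the estimate moves much closer to the true count. The numbers are illustrative only and say nothing about the results reported in the paper.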
Publisher
C R L PUBLISHING LTD
Issue Date
2015-03
Language
English
Article Type
Article
Citation
COMPUTER SYSTEMS SCIENCE AND ENGINEERING, v.30, no.2, pp.109-124
ISSN
0267-6192
URI
http://hdl.handle.net/10203/200782
Appears in Collection
CS-Journal Papers (Journal Papers)