Identification of Corrupted Data via k-Means Clustering for Function Approximation

In addition to measurement noise, real-world data are often corrupted by unexpected internal or external errors. Such corruption errors can be much larger than standard noise and can negatively affect data processing results. In this paper, we propose a method for identifying corrupted data in the context of function approximation. The method is a two-step procedure consisting of an approximation stage and an identification stage. In the approximation stage, we apply straightforward function approximation to the entire data set as preliminary processing. In the identification stage, a clustering algorithm is applied to the processed data to identify potentially corrupted data entries. In particular, we found the k-means clustering algorithm to be highly effective. Our theoretical analysis reveals that, under sufficient conditions, the proposed method can exactly identify all corrupted data entries. Numerous examples are provided to verify our theoretical findings and demonstrate the effectiveness of the method.
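The two-stage procedure described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the approximation basis, the quantity that is clustered, or the number of clusters, so everything concrete here is an assumption made for illustration: a least-squares polynomial fit of degree 5, clustering of absolute residuals, k = 2, a deterministic center initialization, and synthetic test data. The names `kmeans_1d` and `identify_corrupted` are hypothetical and do not come from the paper.

```python
import numpy as np
from numpy.polynomial import polynomial as P


def kmeans_1d(values, k=2, iters=100):
    """Plain Lloyd iteration on scalar values; returns (labels, centers)."""
    # Assumed initialization: centers spread evenly over the value range.
    centers = np.linspace(values.min(), values.max(), k)
    for _ in range(iters):
        # Assign each value to its nearest center.
        labels = np.argmin(np.abs(values[:, None] - centers[None, :]), axis=1)
        # Recompute centers; keep the old center if a cluster is empty.
        new_centers = np.array([
            values[labels == j].mean() if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers


def identify_corrupted(x, y, degree=5):
    """Return indices of entries flagged as potentially corrupted."""
    # Approximation stage: least-squares fit to the ENTIRE data set,
    # corrupted entries included (polynomial basis is an assumption).
    coeffs = P.polyfit(x, y, degree)
    residuals = np.abs(y - P.polyval(x, coeffs))
    # Identification stage: split residuals into two clusters and flag
    # the cluster with the larger center.
    labels, centers = kmeans_1d(residuals, k=2)
    return np.flatnonzero(labels == np.argmax(centers))


# Synthetic data (assumed for illustration): a smooth function with small
# measurement noise and a few much larger corruption errors.
rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 200)
y = np.sin(np.pi * x) + 0.01 * rng.standard_normal(x.size)
corrupted_idx = np.arange(20, 200, 20)
y[corrupted_idx] += 5.0 * (-1.0) ** np.arange(corrupted_idx.size)

flagged = identify_corrupted(x, y)
print(flagged)
```

In this sketch the corruption errors are far larger than the noise, so the residuals of corrupted and clean entries are well separated. The paper's actual sufficient conditions for exact identification are not reproduced here.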
Publisher
GLOBAL SCIENCE PRESS
Issue Date
2021-03
Language
English
Article Type
Article
Citation

CSIAM TRANSACTIONS ON APPLIED MATHEMATICS, v.2, no.1, pp.81 - 107

ISSN
2708-0560
DOI
10.4208/csiam-am.2020-0212
URI
http://hdl.handle.net/10203/297247
Appears in Collection
MA-Journal Papers (Journal Papers)
