Mitigating label sparsity for time series analysis (Korean title: Mitigating label scarcity for time series data scarcity)

A time series is a sequential set of data points collected from various sources such as sensors, mobility, and finance. Annotating every timestamp in a time series is costly because of its length and complexity, which makes it hard to recognize patterns in the series. Label sparsity in time-series data is regarded as a hurdle to its broad applicability, especially in deep learning, where a large number of labels is required. To overcome label sparsity, this dissertation aims to improve the efficiency with which the few available labels in a time series are used for time series analysis tasks such as classification.

The first chapter introduces an active learning algorithm called TCLP, which exploits temporal coherence. Active learning trains an initial model, queries human annotators for informative labels, and then re-trains the model with the additional labels. Because a time series is temporally coherent and the same class lasts for some duration, TCLP propagates each annotated instantaneous label to the other timestamps within that duration. The propagated labels accelerate re-training, so the model converges faster. TCLP estimates the duration of temporal coherence for each newly annotated label and accurately propagates the given labels.

The second chapter proposes CrossMatch, a semi-supervised learning method for the setting where only the initial labels are available and no additional labels can be obtained. CrossMatch is a consistency regularization framework that trains a model on unlabeled data points by minimizing the difference between the output for a data point and the output for its augmentation. CrossMatch introduces a novel data augmentation method called context-additive augmentation, which exploits the surrounding contexts of an instance sampled from a time series. Because the length of the surrounding contexts can be varied, multiple instances can be augmented while the original instance is not perturbed. Using this property, CrossMatch conducts consistency regularization in a more stable manner. In addition, reliability-weighted mixing in CrossMatch generates more accurate pseudo-labels, which become the target of each augmented instance.

The third chapter proposes a change point detection algorithm called RECURVE, which finds class changes when no labels are available for further analysis. A recent change point detection algorithm leverages a representation model that outputs a representation at each timestamp and detects change points by measuring the distance between the representations at two consecutive timestamps. RECURVE instead computes the curvature of the representation trajectory, focusing on the more sequential aspect of the representations. By using curvature, class changes can be detected even where neighboring timestamps have similar representations due to temporal coherence. The effectiveness of curvature is proven theoretically using random walk theory and verified empirically by extensive experiments on real datasets.

This dissertation is expected to pave the way to making the most of sparse labels and to mitigate the cost burden of annotating every timestamp in a time series, enabling efficient time-series analysis.
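The contrast the abstract draws between consecutive-representation distance and trajectory curvature can be illustrated with a small sketch. It is not taken from the dissertation. It assumes curvature is approximated by the turning angle between consecutive displacement vectors, and the function name `turning_angles` and the toy trajectory are hypothetical. How RECURVE actually defines curvature, and whether class changes correspond to high or low values, is not stated in the abstract and is left open here.

```python
import numpy as np


def turning_angles(reps: np.ndarray, eps: float = 1e-12) -> np.ndarray:
    """Turning angle (radians) between consecutive displacement vectors.

    Assumption: this angle is used as a simple discrete stand-in for the
    curvature of a representation trajectory; it is not RECURVE's definition.
    reps has shape (T, d), one representation per timestamp.
    Returns an array of length T - 2; entry i is the turn at timestamp i + 1.
    """
    steps = np.diff(reps, axis=0)                  # (T - 1, d) displacements
    norms = np.linalg.norm(steps, axis=1)          # length of each displacement
    dots = np.sum(steps[:-1] * steps[1:], axis=1)  # dot products of neighbors
    cos = dots / np.maximum(norms[:-1] * norms[1:], eps)
    return np.arccos(np.clip(cos, -1.0, 1.0))


# Hypothetical toy trajectory: unit steps along x, then unit steps along y.
reps = np.array(
    [[0, 0], [1, 0], [2, 0], [3, 0], [3, 1], [3, 2], [3, 3]], dtype=float
)

# Distance between consecutive representations (the baseline signal).
step_lengths = np.linalg.norm(np.diff(reps, axis=0), axis=1)

# Turning angle along the same trajectory (the curvature-style signal).
angles = turning_angles(reps)

print("step lengths:", step_lengths)
print("turning angles:", angles)
```

In this toy trajectory every consecutive distance is identical, so a distance-based score is flat, while the turning angle differs at the point where the direction of movement changes. The sketch only shows that the two signals carry different information; thresholds and the interpretation of the score would have to follow the dissertation itself.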
Advisors
이재길 (Jae-Gil Lee)
Description
Korea Advanced Institute of Science and Technology (KAIST): Graduate School of Data Science
Publisher
Korea Advanced Institute of Science and Technology (KAIST)
Issue Date
2024
Identifier
325007
Language
eng
Description

Doctoral dissertation (Ph.D.) - Korea Advanced Institute of Science and Technology (KAIST): Graduate School of Data Science, 2024.2, [vi, 74 p.]

Keywords

Time series; Active learning; Semi-supervised learning; Change point detection

URI
http://hdl.handle.net/10203/321985
Link
http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=1098140&flag=dissertation
Appears in Collection
IE-Theses_Ph.D. (doctoral theses)
Files in This Item
There are no files associated with this item.