DSpace at KOASAS: Sequential targeting: A continual learning approach for data imbalance in text classification

Sequential targeting: A continual learning approach for data imbalance in text classification

DC Field	Value	Language
dc.contributor.author	Jang, Joel	ko
dc.contributor.author	Kim, Yoonjeon	ko
dc.contributor.author	Choi, Kyoungho	ko
dc.contributor.author	Suh, Sungho	ko
dc.date.accessioned	2021-07-06T01:30:06Z	-
dc.date.available	2021-07-06T01:30:06Z	-
dc.date.created	2021-07-05	-
dc.date.issued	2021-10	-
dc.identifier.citation	EXPERT SYSTEMS WITH APPLICATIONS, v.179	-
dc.identifier.issn	0957-4174	-
dc.identifier.uri	http://hdl.handle.net/10203/286401	-
dc.description.abstract	Text classification has numerous use cases, including sentiment analysis, spam detection, document classification, and hate speech detection. In realistic settings, text classification faces imbalanced data, where the classes of interest usually make up a small fraction of the examples. Deep neural networks used for text classification, such as recurrent neural networks and transformer networks, lack efficient methods for handling imbalanced data. Traditional data-level methods for mitigating distributional skew include oversampling and undersampling. Oversampling degrades the quality of the original language representation of the sparse minority-class data, whereas undersampling fails to fully utilize the rich context of the majority classes. We address these issues in data-driven approaches by applying continual learning to imbalanced data: we partition the training data distribution into mutually exclusive subsets and perform continual learning over them, treating each subset as a distinct task. We demonstrate the effectiveness of our method through experiments on the IMDB dataset and on datasets constructed from real-world data. The experimental results show that, compared to the baseline method, the proposed method improves the F1-score by 56.38 percentage points (%p) on the IMDB dataset and by 16.89 %p and 34.76 %p on the constructed datasets.	-
dc.language	English	-
dc.publisher	PERGAMON-ELSEVIER SCIENCE LTD	-
dc.title	Sequential targeting: A continual learning approach for data imbalance in text classification	-
dc.type	Article	-
dc.identifier.wosid	000663549200004	-
dc.identifier.scopusid	2-s2.0-85105247196	-
dc.type.rims	ART	-
dc.citation.volume	179	-
dc.citation.publicationname	EXPERT SYSTEMS WITH APPLICATIONS	-
dc.identifier.doi	10.1016/j.eswa.2021.115067	-
dc.contributor.localauthor	Kim, Yoonjeon	-
dc.contributor.nonIdAuthor	Jang, Joel	-
dc.contributor.nonIdAuthor	Choi, Kyoungho	-
dc.contributor.nonIdAuthor	Suh, Sungho	-
dc.description.isOpenAccess	N	-
dc.type.journalArticle	Article	-
dc.subject.keywordAuthor	Continual learning	-
dc.subject.keywordAuthor	Data imbalance	-
dc.subject.keywordAuthor	Deep learning	-
dc.subject.keywordAuthor	Sentiment analysis	-
dc.subject.keywordAuthor	Text classification	-
dc.subject.keywordPlus	NETWORKS	-
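
The abstract describes partitioning the training data into mutually exclusive subsets and learning them as separate tasks. The following is only a minimal sketch of that idea, not the authors' implementation: the record does not state how the subsets are formed or ordered, so the partition rule here (one balanced subset plus the leftover majority examples, with the skewed subset visited first) is an assumption, and `partition_indices`, `train_sequentially`, and `model_update` are hypothetical names introduced for illustration.

```python
import random
from collections import Counter


def partition_indices(labels, seed=0):
    """Split example indices into two disjoint subsets (assumed rule).

    Returns (skewed, balanced). `balanced` holds an equal number of
    examples from every class, sized by the rarest class; `skewed`
    holds all remaining examples.
    """
    rng = random.Random(seed)
    counts = Counter(labels)
    minority_size = min(counts.values())
    skewed, balanced = [], []
    for cls in counts:
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        balanced.extend(idx[:minority_size])
        skewed.extend(idx[minority_size:])
    return sorted(skewed), sorted(balanced)


def train_sequentially(model_update, tasks):
    """Visit the subsets in order, treating each one as a distinct task.

    `model_update` is a hypothetical callback standing in for a real
    training routine; it receives the task id and the example indices.
    """
    for task_id, indices in enumerate(tasks):
        model_update(task_id, indices)


# Toy imbalanced label vector: 90 majority examples, 10 minority examples.
labels = [0] * 90 + [1] * 10
skewed, balanced = partition_indices(labels)

# Record what each task would see instead of training a real model.
seen = []
train_sequentially(lambda t, idx: seen.append((t, len(idx))), [skewed, balanced])
```

A real continual-learning setup would also need some mechanism that limits forgetting of earlier subsets while training on later ones; that part is omitted here because the record gives no detail about it.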

Appears in Collection: RIMS Journal Papers

Files in This Item: There are no files associated with this item.

This item is cited by 8 documents in WoS.

Knowledge Service Development Team, KAIST 291 Daehak-ro, Yuseong-gu, Daejeon 34141, Republic of Korea. T. 82-42-350-4493 Email. koasas@kaist.ac.kr
Copyright © 2016. Korea Advanced Institute of Science and Technology. All Rights Reserved.