DC Field | Value | Language |
---|---|---|
dc.contributor.author | Kim, Minseon | ko |
dc.contributor.author | Tack, Jihoon | ko |
dc.contributor.author | Shin, Jinwoo | ko |
dc.contributor.author | Hwang, Sung Ju | ko |
dc.date.accessioned | 2023-12-12T05:00:23Z | - |
dc.date.available | 2023-12-12T05:00:23Z | - |
dc.date.created | 2023-12-08 | - |
dc.date.issued | 2023-02-08 | - |
dc.identifier.citation | 2023 IEEE Conference on Secure and Trustworthy Machine Learning, SaTML 2023, pp.316 - 326 | - |
dc.identifier.uri | http://hdl.handle.net/10203/316276 | - |
dc.description.abstract | Adversarial training, which minimizes the loss of adversarially-perturbed training examples, has been extensively studied as a solution to improve the robustness of deep neural networks. However, most adversarial training methods treat all training examples equally, while each example may have a different impact on the model's robustness during the course of adversarial training. A couple of recent works have exploited such unequal importance of adversarial samples to the model's robustness by proposing to assign more weight to the misclassified samples or to the samples that violate the margin more severely, and these have been shown to obtain high robustness against untargeted PGD attacks. However, we empirically find that they make the feature spaces of adversarial samples across different classes overlap, and thus yield more high-entropy samples whose labels could be easily flipped. This makes them more vulnerable to adversarial perturbations, and their seemingly good robustness against PGD attacks is actually achieved by a false sense of robustness. To address such limitations, we propose a simple yet effective re-weighting scheme that weighs the loss for each adversarial training example proportionally to the entropy of its predicted distribution, in order to focus on examples whose labels are more uncertain. | - |
dc.language | English | - |
dc.publisher | Institute of Electrical and Electronics Engineers Inc. | - |
dc.title | Rethinking the Entropy of Instance in Adversarial Training | - |
dc.type | Conference | - |
dc.identifier.wosid | 001012311500019 | - |
dc.identifier.scopusid | 2-s2.0-85163190212 | - |
dc.type.rims | CONF | - |
dc.citation.beginningpage | 316 | - |
dc.citation.endingpage | 326 | - |
dc.citation.publicationname | 2023 IEEE Conference on Secure and Trustworthy Machine Learning, SaTML 2023 | - |
dc.identifier.conferencecountry | US | - |
dc.identifier.conferencelocation | Raleigh, NC | - |
dc.identifier.doi | 10.1109/SaTML54575.2023.00029 | - |
dc.contributor.localauthor | Shin, Jinwoo | - |
dc.contributor.localauthor | Hwang, Sung Ju | - |
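The re-weighting idea in the abstract (weigh each example's loss in proportion to the entropy of its predicted distribution) can be sketched as below. This is a minimal illustrative sketch only, not the paper's implementation: the function names, the normalization of the weights, and the plain cross-entropy loss are assumptions, and in the paper the scheme is applied to adversarially-perturbed examples inside a training loop.

```python
import math

def softmax(logits):
    """Convert a list of logits into a probability distribution."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

def entropy(probs):
    """Shannon entropy of a predicted distribution (higher = more uncertain)."""
    return -sum(p * math.log(p + 1e-12) for p in probs)

def entropy_weighted_loss(batch_logits, labels):
    """Per-example cross-entropy, weighted in proportion to the entropy of
    each example's predicted distribution, so that examples whose labels are
    more uncertain contribute more to the objective. The normalization of
    the weights here is one plausible choice, not the paper's."""
    probs = [softmax(l) for l in batch_logits]
    ce = [-math.log(p[y] + 1e-12) for p, y in zip(probs, labels)]
    ents = [entropy(p) for p in probs]
    total = sum(ents) + 1e-12
    weights = [e / total for e in ents]
    return sum(w * c for w, c in zip(weights, ce))
```

With this weighting, a confidently classified example (low-entropy prediction) receives a small weight, while an example near the decision boundary (high-entropy prediction) dominates the batch loss.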
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.