Recent machine learning (ML) methods have demonstrated impressive and versatile performance across a wide range of applications, including image and natural language processing. Encouraged by this performance, prior work has attempted to apply ML approaches to safety-critical applications such as user authentication. However, numerous studies have shown that ML methods are vulnerable to adversarial behavior and therefore ill-suited to such safety-critical tasks. This vulnerability has attracted considerable research interest, and many efforts have been made to devise attack and defense strategies.
However, the majority of this research focuses on feed-forward neural networks (FNNs), including multi-layer perceptrons (MLPs), convolutional neural networks (CNNs), and Transformers. In contrast, recurrent neural networks (RNNs) have received little research attention. We contend that RNN robustness is also crucial and needs to be studied, because many safety-critical applications, such as autonomous driving and medical monitoring, rely on RNNs. In studying RNN robustness, we identify challenges unique to attacking RNNs and propose three novel attacks to overcome them. The attacks exploit temporal dependence and hidden state transitions, which are features of the RNN's input and of the RNN itself, respectively.
The first attack is a new evasion attack for RNNs. The goal of an evasion attack is to alter the output of an ML model at test time, which it achieves by slightly perturbing the test input. Although various evasion attacks have been proposed to evaluate the robustness of FNNs, we find that naively applying these FNN attacks cannot fully assess RNN robustness because of the diverse output requirements that RNNs face in online tasks. To address this problem, we offer a general attack framework that can express different RNN output requirements. This framework also implies that hidden state transitions and temporal dependence can be used to realize RNN attacks, leading us to devise a Predictive attack.
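To make the test-time perturbation concrete, the sketch below applies a standard FGSM-style gradient-sign step, originally developed for FNNs, to an entire input sequence of a small GRU classifier. The model, tensor shapes, and step size are illustrative assumptions, and this whole-sequence attack is exactly the kind of naive FNN-style baseline discussed above, not the Predictive attack proposed in this thesis.

```python
import torch
import torch.nn as nn

class GRUClassifier(nn.Module):
    """A small sequence classifier: GRU over time, linear head on last state."""
    def __init__(self, n_features, n_hidden, n_classes):
        super().__init__()
        self.rnn = nn.GRU(n_features, n_hidden)   # expects [T, batch, features]
        self.head = nn.Linear(n_hidden, n_classes)

    def forward(self, x):
        _, h = self.rnn(x)                        # h: [num_layers, batch, n_hidden]
        return self.head(h[-1])                   # per-sequence logits

def fgsm_evasion(model, x, y, epsilon=0.05):
    """One gradient-sign step on the whole sequence to push logits away from y."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = nn.functional.cross_entropy(model(x_adv), y)
    loss.backward()
    return (x_adv + epsilon * x_adv.grad.sign()).detach()

model = GRUClassifier(n_features=8, n_hidden=32, n_classes=2)
x = torch.randn(20, 4, 8)                         # 20 time steps, batch of 4
y = torch.randint(0, 2, (4,))
x_adv = fgsm_evasion(model, x, y)                 # slightly perturbed sequences
```

Note that this baseline perturbs the complete sequence at once and targets only the final prediction; in an online task, where outputs are emitted at every time step and future inputs are unknown, this formulation cannot express the attacker's actual objective.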
The second attack is a backdoor attack for RNNs. A backdoor attack also manipulates the model at test time, but it modifies both the training data and the test input. Once the victim model has been trained on the poisoned data, an attacker can manipulate its output by adding a trigger pattern to a test input. While many backdoor attacks have been presented for non-temporal data, such as images, little is known about backdoor attacks on temporal data. We find that naively applying backdoor attacks to temporal data, for which RNNs are typically used, makes the trigger pattern detectable to the victim, causing the attack to fail. To make the trigger pattern undetectable, we propose a new backdoor attack that exploits the temporal dependence of the dataset.
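For concreteness, here is a minimal sketch of the naive poisoning scheme discussed above, with a fixed additive trigger stamped onto the last time steps of a small fraction of training sequences; all shapes, the poison rate, and the trigger values are illustrative assumptions. Because the constant pattern ignores the temporal dependence of the data, it is precisely the kind of trigger a victim can detect.

```python
import numpy as np

def poison(X, y, target_label, trigger, rate=0.05, seed=0):
    """Stamp an additive trigger onto the tail of a few training sequences.

    X: [N, T, F] training sequences; y: [N] labels.
    """
    rng = np.random.default_rng(seed)
    Xp, yp = X.copy(), y.copy()
    idx = rng.choice(len(X), size=int(rate * len(X)), replace=False)
    Xp[idx, -trigger.shape[0]:, :] += trigger   # fixed pattern on the last steps
    yp[idx] = target_label                      # relabel to the attacker's class
    return Xp, yp

X = np.random.randn(1000, 50, 3)                # synthetic [N, T, F] data
y = np.random.randint(0, 2, 1000)
trigger = 0.5 * np.ones((5, 3))                 # constant bump: conspicuous in time series
Xp, yp = poison(X, y, target_label=1, trigger=trigger)
# At test time, the attacker stamps the same trigger onto any input
# to steer the backdoored model toward target_label.
```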
The third attack is a missing-value-based backdoor attack, which can be a better option when a dataset contains many missing values. This attack exploits the missing-value patterns of medical data when choosing trigger values. Because it replaces input values with trigger values rather than adding to them, this attack is also more convenient: it does not require the post-processing step of the temporal-covariance-based attack that restricts input values to a valid range. To generate an undetectable missing-value-based trigger, we utilize a Variational Autoencoder (VAE) to capture temporal dependence.
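To illustrate the mechanics under stated assumptions, the sketch below stamps trigger values at the positions most frequently missing in the training data, drawing those values from a pre-trained VAE decoded at a fixed latent code so the pattern is consistent across inputs yet remains plausible. The `vae.decode` interface, the latent code `z_trigger`, and the top-k position rule are hypothetical, not the thesis's exact construction.

```python
import numpy as np

def stamp_missing_value_trigger(x, miss_rate, vae, z_trigger, k=5):
    """Overwrite the k cells of x that are most often missing in training data.

    x: [T, F] sequence; miss_rate: [T, F] empirical missingness frequencies;
    vae.decode(z) is assumed to map a latent code to a plausible [T, F] sequence.
    """
    flat = np.argsort(miss_rate, axis=None)[-k:]   # top-k most-missing cells
    pos = np.unravel_index(flat, miss_rate.shape)
    x_hat = vae.decode(z_trigger)                  # fixed latent -> consistent values
    x_trig = x.copy()
    x_trig[pos] = x_hat[pos]   # replacement keeps values in a valid range by design
    return x_trig
```

Since the trigger values are generated by a model trained on real data and placed where missingness (and hence imputation) is already common, the stamped cells blend in with ordinarily imputed entries rather than standing out as an added perturbation.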
Through the three proposed attacks, we expect ML practitioners to become more aware of RNN vulnerabilities and to make careful decisions when deploying RNNs in safety-critical applications. Furthermore, we hope this thesis paves the way for future research on RNN robustness, leading to improved resilience against adversarial attacks and more trustworthy ML methods.