Modern neural networks have achieved human-comparable performance on many offline tasks. In the real world, however, the training environment is not static, which breaks the i.i.d. assumption underlying Empirical Risk Minimization. Continual learning is a research area that aims to enable neural networks to train in a dynamically changing environment. Continual learning assumes that previously seen data cannot be reused, so that one can verify how well the network retains previous knowledge as the training environment changes. The most basic approach is to further train an already trained model on the new dataset, but this leads to so-called catastrophic forgetting, in which performance on previously learned data drops drastically. Many previous works attribute this to the absence of the loss function of previous tasks and construct a surrogate loss to approximate it. Although these works alleviate catastrophic forgetting, they suffer from several limitations, such as violating the continual learning assumption, unintended changes to the surrogate loss, and a large additional cost. In this thesis, we propose a new approach to overcoming catastrophic forgetting, Dataset Distillation for Continual Learning (D2CL), which avoids the above shortcomings. D2CL trains a small synthetic dataset that closely approximates the loss function of the original dataset. This thesis also introduces a new proposition that is useful for this loss approximation. Finally, several experimental results validate that the proposed method stores more informative data while using only a small amount of additional memory.
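As a rough illustration (a sketch of the standard dataset distillation objective, not necessarily the exact formulation developed in this thesis; the symbols below are illustrative), the synthetic dataset can be viewed as the solution of a bilevel optimization problem:
\[
\mathcal{S}^{*} = \arg\min_{\mathcal{S}} \mathcal{L}_{\mathcal{D}}\big(\theta^{*}(\mathcal{S})\big),
\qquad
\theta^{*}(\mathcal{S}) = \arg\min_{\theta} \mathcal{L}_{\mathcal{S}}(\theta),
\]
where $\mathcal{D}$ denotes the original dataset, $\mathcal{S}$ the small synthetic dataset, and $\mathcal{L}_{\mathcal{X}}(\theta)$ the empirical loss of a model with parameters $\theta$ on dataset $\mathcal{X}$; a model trained only on $\mathcal{S}$ should then approximately minimize the loss on $\mathcal{D}$.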