Boosting Adapter Transfer Learning via Weak Parameter Sharing

Adapter tuning is a parameter-efficient approach to transfer learning for natural language processing: adapter modules are inserted into a pretrained model and only those modules are updated, while the pretrained model stays fixed. This also makes continual learning easier, since the number of parameters grows only linearly with the number of tasks. In this paper, we build on adapter tuning and further improve it by sharing parameters among adapters plugged into different encoder layers. We explore many parameter-sharing configurations to find the optimal setup for each GLUE task, and identify settings in which the model matches or even outperforms default adapter tuning with just 1, 2, 3, or 4 adapters in a BERT-base model. We also analyze the training results to find patterns among the configurations and discuss what they mean for improving transfer learning.
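The abstract describes reusing adapter parameters across encoder layers instead of giving each layer its own adapter. Below is a minimal PyTorch-style sketch of that idea; the bottleneck adapter shape and the layer-to-adapter grouping shown here are illustrative assumptions, not the configurations evaluated in the paper.

```python
import torch
import torch.nn as nn

class Adapter(nn.Module):
    """Bottleneck adapter: down-project, nonlinearity, up-project, residual add."""
    def __init__(self, hidden_size=768, bottleneck=64):
        super().__init__()
        self.down = nn.Linear(hidden_size, bottleneck)
        self.up = nn.Linear(bottleneck, hidden_size)
        self.act = nn.GELU()

    def forward(self, x):
        return x + self.up(self.act(self.down(x)))

# Weak parameter sharing: instead of one adapter per encoder layer,
# a small pool of adapters is reused across the 12 layers of BERT-base.
# This hypothetical grouping maps layers 0-5 to adapter 0 and 6-11 to adapter 1;
# the paper searches over many such groupings (1 to 4 distinct adapters).
num_layers = 12
shared_adapters = nn.ModuleList([Adapter() for _ in range(2)])
layer_to_adapter = [0] * 6 + [1] * 6

hidden = torch.randn(8, 128, 768)  # (batch, seq_len, hidden)
for layer_idx in range(num_layers):
    # ... the frozen BERT encoder layer would run here ...
    hidden = shared_adapters[layer_to_adapter[layer_idx]](hidden)
```

Only the parameters in `shared_adapters` would be trained; the pretrained encoder weights stay frozen, so the trainable parameter count depends on the number of distinct adapters rather than the number of layers.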
Publisher
IEEE
Issue Date
2022-01
Language
English
Citation

IEEE International Conference on Big Data and Smart Computing (BigComp), pp. 382-384

ISSN
2375-933X
DOI
10.1109/BigComp54360.2022.00086
URI
http://hdl.handle.net/10203/298312
Appears in Collection
CS-Conference Papers (Conference Papers)
