Sequential item selection via reinforcement learning강화 학습을 통한 순차적 항목 선택

Cited 0 time in webofscience Cited 0 time in scopus
  • Hit : 72
  • Download : 0
Recently, the importance of the recommender system has grown as personal preferences become more diversified. This dissertation proposes reinforcement learning algorithms for the recommender system that the system improves the performance by itself over time. I firstly model recommender systems as Markov decision processes that the agent iteratively selects items to recommend among the possible candidates and receives the selection results. In the first part, I introduce a concept of bankruptcy in the recommender system, so the agent needs to consider a trade-off between avoiding bankruptcy and maximizing the benefits. I propose a variant of the multi-armed bandit algorithm, which achieves the asymptotically optimal performances without any bankruptcy. In the second part, I develop a deep reinforcement learning algorithm where the agent recommends multiple items simultaneously. The algorithm takes advantage of graph neural networks and the idea of greedy algorithm to relieve the scalability issue of selecting the best combination of the multiple items once. I theoretically prove that the applied ideas never degrades the performance. The proposed algorithms in the dissertation are tested in various environments, and the results insist that the algorithms successfully improve their item selection policy over time.
Advisors
Yi, Yungresearcher이융researcher
Description
한국과학기술원 :전기및전자공학부,
Publisher
한국과학기술원
Issue Date
2022
Identifier
325007
Language
eng
Description

학위논문(박사) - 한국과학기술원 : 전기및전자공학부, 2022.2,[iv, 64 p. :]

URI
http://hdl.handle.net/10203/309170
Link
http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=996247&flag=dissertation
Appears in Collection
EE-Theses_Ph.D.(박사논문)
Files in This Item
There are no files associated with this item.

qr_code

  • mendeley

    citeulike


rss_1.0 rss_2.0 atom_1.0