Improving Thompson sampling via information relaxation for budgeted multi-armed bandits. Jeong, Woojin; et al, Korea Advanced Institute of Science and Technology (KAIST), 2023
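Improving upper confidence reinforcement learning with bootstrapping (Korean title: Using bootstrap techniques for efficient exploration in reinforcement learning). Kim, Sanghwa; Min, Seungki; Kim, Kyoung-Kuk; et al, Korea Advanced Institute of Science and Technology (KAIST), 2022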
Thompson sampling with information relaxation penalties. Min, Seungki; Maglaras, Costis; Moallemi, Ciamac C., 33rd Annual Conference on Neural Information Processing Systems, NeurIPS 2019, Neural Information Processing Systems Foundation, 2019-12-08