Multi-armed bandit with additional observations

We study multi-armed bandit (MAB) problems with additional observations, where in each round the decision maker selects an arm to play and may also observe the rewards of additional arms (within a given budget) by paying certain costs. We propose algorithms whose regrets are asymptotically optimal under stochastic rewards and order-optimal under adversarial rewards, respectively.
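To make the problem setting concrete, the following is a minimal sketch of a UCB1-style policy extended with extra observations: each round one arm is played (its reward counts), and up to `budget` further arms are observed for free information only. This is a hypothetical illustration of the setting, not the algorithms proposed in the paper, and the observation cost is left out for simplicity.

```python
import math
import random

def ucb_extra_obs(means, horizon, budget, seed=0):
    """UCB1-style policy with `budget` extra observations per round.

    A hypothetical sketch of the MAB-with-additional-observations
    setting (not the paper's algorithm). `means` are Bernoulli arm
    means; observed-but-not-played rewards update the estimates but
    do not count toward the collected reward.
    """
    rng = random.Random(seed)
    k = len(means)
    counts, est = [0] * k, [0.0] * k
    collected = 0.0

    def sample(i):
        # Draw a Bernoulli reward for arm i.
        return 1.0 if rng.random() < means[i] else 0.0

    def update(i, r):
        # Incremental mean update for arm i.
        counts[i] += 1
        est[i] += (r - est[i]) / counts[i]

    for t in range(1, horizon + 1):
        def ucb(i):
            # Optimistic index; unseen arms are explored first.
            if counts[i] == 0:
                return float("inf")
            return est[i] + math.sqrt(2.0 * math.log(t) / counts[i])

        order = sorted(range(k), key=ucb, reverse=True)
        played = order[0]
        r = sample(played)
        collected += r          # only the played arm yields reward
        update(played, r)
        # Additionally observe the next `budget` arms by index
        # (the side channel the problem setting provides).
        for i in order[1:1 + budget]:
            update(i, sample(i))

    return collected, est

# Usage: extra observations speed up identifying the best arm.
reward, estimates = ucb_extra_obs([0.2, 0.5, 0.8], horizon=2000, budget=1)
```

The extra observations let suboptimal arms be ruled out without spending plays on them, which is why such side information can reduce regret relative to the standard MAB.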
Publisher
Association for Computing Machinery, Inc
Issue Date
2018-06-18
Language
English
Citation

2018 ACM International Conference on Measurement and Modeling of Computer Systems, SIGMETRICS 2018, pp.53 - 55

DOI
10.1145/3219617.3219639
URI
http://hdl.handle.net/10203/247468
Appears in Collection
EE-Conference Papers (Conference Papers)
