DSpace at KOASAS: Multi-armed Bandit with Additional Observations

Multi-armed Bandit with Additional Observations

DC Field	Value	Language
dc.contributor.author	Yun, Donggyu	ko
dc.contributor.author	Proutiere, Alexandre	ko
dc.contributor.author	Ahn, Sumyeong	ko
dc.contributor.author	Shin, Jinwoo	ko
dc.contributor.author	Yi, Yung	ko
dc.date.accessioned	2019-01-22T08:38:07Z	-
dc.date.available	2019-01-22T08:38:07Z	-
dc.date.created	2018-11-28	-
dc.date.issued	2018-03	-
dc.identifier.citation	Proceedings of the ACM on Measurement and Analysis of Computing Systems, v.2, no.1, pp.1 - 22	-
dc.identifier.issn	2476-1249	-
dc.identifier.uri	http://hdl.handle.net/10203/249061	-
dc.description.abstract	We study multi-armed bandit (MAB) problems with additional observations, where in each round, the decision maker selects an arm to play and can also observe rewards of additional arms (within a given budget) by paying certain costs. In the case of stochastic rewards, we develop a new algorithm KL-UCB-AO which is asymptotically optimal when the time horizon grows large, by smartly identifying the optimal set of the arms to be explored using the given budget of additional observations. In the case of adversarial rewards, we propose H-INF, an algorithm with order-optimal regret. H-INF exploits a two-layered structure where in each layer, we run a known optimal MAB algorithm. Such a hierarchical structure facilitates the regret analysis of the algorithm, and in turn, yields order-optimal regret. We apply the framework of MAB with additional observations to the design of rate adaptation schemes in 802.11-like wireless systems, and to that of online advertisement systems. In both cases, we demonstrate that our algorithms leverage additional observations to significantly improve the system performance. We believe the techniques developed in this paper are of independent interest for other MAB problems, e.g., contextual or graph-structured MAB.	-
dc.language	English	-
dc.publisher	ACM Association for Computing Machinery	-
dc.title	Multi-armed Bandit with Additional Observations	-
dc.type	Article	-
dc.type.rims	ART	-
dc.citation.volume	2	-
dc.citation.issue	1	-
dc.citation.beginningpage	1	-
dc.citation.endingpage	22	-
dc.citation.publicationname	Proceedings of the ACM on Measurement and Analysis of Computing Systems	-
dc.identifier.doi	10.1145/3179416	-
dc.contributor.localauthor	Yi, Yung	-
dc.contributor.nonIdAuthor	Yun, Donggyu	-
dc.contributor.nonIdAuthor	Proutiere, Alexandre	-
dc.contributor.nonIdAuthor	Ahn, Sumyeong	-
dc.contributor.nonIdAuthor	Shin, Jinwoo	-
dc.description.isOpenAccess	N	-
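
The abstract names KL-UCB-AO for the stochastic case, but this record does not describe the algorithm itself. As a rough aid to reading the abstract, the sketch below computes the standard KL-UCB index for Bernoulli rewards, on the assumption (not confirmed by this record) that KL-UCB-AO builds on an index of this kind. It does not model the additional observations, their costs, or the budget. The function names `bernoulli_kl` and `kl_ucb_index` and the plain `log(t)` exploration term are illustrative choices.

```python
import math


def bernoulli_kl(p, q, eps=1e-12):
    """KL divergence between Bernoulli(p) and Bernoulli(q), clamped away from 0 and 1."""
    p = min(max(p, eps), 1 - eps)
    q = min(max(q, eps), 1 - eps)
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))


def kl_ucb_index(mean, pulls, t, tol=1e-6):
    """Largest q in [mean, 1] such that pulls * KL(mean, q) <= log(t).

    Assumption: the exploration term is log(t) only; some variants add a
    c * log(log(t)) term, which is omitted here.
    """
    if pulls == 0:
        return 1.0  # an arm that was never observed gets the most optimistic index
    threshold = math.log(max(t, 1)) / pulls
    lo, hi = mean, 1.0
    # KL(mean, q) is increasing in q on [mean, 1], so bisection finds the boundary.
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if bernoulli_kl(mean, mid) <= threshold:
            lo = mid
        else:
            hi = mid
    return lo
```

In a plain KL-UCB loop, the arm with the largest index is played in each round; the index shrinks toward the empirical mean as the number of observations of that arm grows.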

Appears in Collection: EE-Journal Papers (Journal Papers)

Files in This Item: There are no files associated with this item.

Knowledge Service Development Team, KAIST 291 Daehak-ro, Yuseong-gu, Daejeon 34141, Republic of Korea. T. 82-42-350-4493 Email. koasas@kaist.ac.kr
Copyright © 2016. Korea Advanced Institute of Science and Technology. All Rights Reserved.
