Monte-Carlo Tree Search for Constrained MDPs

DC Field | Value | Language
dc.contributor.author | Lee, Jongmin | ko
dc.contributor.author | Kim, Geon-Hyeong | ko
dc.contributor.author | Poupart, Pascal | ko
dc.contributor.author | Kim, Kee-Eung | ko
dc.date.accessioned | 2019-03-19T01:38:59Z | -
dc.date.available | 2019-03-19T01:38:59Z | -
dc.date.created | 2019-03-09 | -
dc.date.issued | 2018-07-15 | -
dc.identifier.citation | ICML/IJCAI/AAMAS Workshop on Planning and Learning (PAL) | -
dc.identifier.uri | http://hdl.handle.net/10203/251744 | -
dc.description.abstract | Monte-Carlo Tree Search (MCTS) is the state-of-the-art online planning algorithm for very large MDPs. However, many real-world problems inherently have multiple goals, where multi-objective sequential decision models are more natural. The constrained MDP (CMDP) is such a model that maximizes the reward while constraining the cost. The common solution method for CMDPs is linear programming (LP), which is hardly applicable to large real-world problems. In this paper, we present CCUCT (Cost-Constrained UCT), an online planning algorithm for large constrained MDPs (CMDPs) that leverages the optimization of LP-induced parameters. We show that CCUCT converges to the optimal stochastic action selection in CMDPs and it is able to solve very large CMDPs through experiments on the multi-objective version of an Atari 2600 arcade game. | -
dc.language | English | -
dc.publisher | ICML/IJCAI/AAMAS Workshop on Planning and Learning (PAL) | -
dc.title | Monte-Carlo Tree Search for Constrained MDPs | -
dc.type | Conference | -
dc.type.rims | CONF | -
dc.citation.publicationname | ICML/IJCAI/AAMAS Workshop on Planning and Learning (PAL) | -
dc.identifier.conferencecountry | SW | -
dc.identifier.conferencelocation | Stockholm | -
dc.contributor.localauthor | Kim, Kee-Eung | -
dc.contributor.nonIdAuthor | Lee, Jongmin | -
dc.contributor.nonIdAuthor | Kim, Geon-Hyeong | -
dc.contributor.nonIdAuthor | Poupart, Pascal | -
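
The abstract refers to "optimal stochastic action selection" in CMDPs. As a rough illustration of why a cost constraint can make a randomized choice optimal, here is a minimal sketch for a single-state problem with two actions. It is not the CCUCT algorithm from the paper; the function name `best_mixture` and all numbers are hypothetical and chosen only for illustration.

```python
def best_mixture(r1, c1, r2, c2, budget):
    """Toy one-state, two-action constrained problem (illustrative only).

    Returns the probability p of playing action 1 that maximizes the
    expected reward p*r1 + (1-p)*r2 subject to the expected cost
    p*c1 + (1-p)*c2 <= budget, or None if no p in [0, 1] is feasible.
    """
    lo, hi = 0.0, 1.0          # feasible interval for p
    slope = c1 - c2            # expected cost is c2 + p * slope
    if slope > 0:
        hi = min(hi, (budget - c2) / slope)
    elif slope < 0:
        lo = max(lo, (budget - c2) / slope)
    elif c2 > budget:          # cost is constant in p and too high
        return None
    if lo > hi:
        return None
    # Expected reward is linear in p, so an endpoint of [lo, hi] is optimal.
    return hi if r1 >= r2 else lo


# Action 1: reward 1, cost 1. Action 2: reward 0, cost 0. Cost budget 0.5.
# Always playing action 1 violates the budget and always playing action 2
# earns nothing, so the best feasible choice mixes the two actions.
p = best_mixture(1.0, 1.0, 0.0, 0.0, 0.5)
print(p)
```

In this toy case neither deterministic choice is both feasible and optimal, which is the basic reason CMDP solutions are generally stated as stochastic policies.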
Appears in Collection
CS-Conference Papers (Academic Conference Papers)
Files in This Item
There are no files associated with this item.