DC Field | Value | Language |
---|---|---|
dc.contributor.author | Bae, Jeongmin | ko |
dc.contributor.author | Lee, Joohyun | ko |
dc.contributor.author | Chong, Song | ko |
dc.date.accessioned | 2021-05-25T08:50:24Z | - |
dc.date.available | 2021-05-25T08:50:24Z | - |
dc.date.created | 2021-05-25 | - |
dc.date.issued | 2021-04 | - |
dc.identifier.citation | IEEE-ACM TRANSACTIONS ON NETWORKING, v.29, no.2, pp.750 - 763 | - |
dc.identifier.issn | 1063-6692 | - |
dc.identifier.uri | http://hdl.handle.net/10203/285347 | - |
dc.description.abstract | As network architectures become more complex and user requirements grow more diverse, efficient network resource management becomes increasingly important. However, existing throughput-optimal scheduling algorithms such as the max-weight algorithm suffer from poor delay performance. In this paper, we present reinforcement learning-based network scheduling algorithms for a single-hop downlink scenario which achieve throughput-optimality and converge to minimal delay. To this end, we first formulate the network optimization problem as a Markov decision process (MDP) problem. Then, we introduce a new state-action value function called the Q+-function and develop a reinforcement learning algorithm called Q+-learning with UCB (Upper Confidence Bound) exploration, which guarantees small performance loss during the learning process. We also derive an upper bound on the sample complexity of our algorithm, which is more efficient than the best known bound for Q-learning with UCB exploration by a factor of gamma^2, where gamma is the discount factor of the MDP problem. Finally, via simulation, we verify that our algorithm achieves a delay reduction of up to 40.8% compared to the max-weight algorithm over various scenarios. We also show that Q+-learning with UCB exploration converges to an epsilon-optimal policy 10 times faster than Q-learning with UCB. | - |
dc.language | English | - |
dc.publisher | IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC | - |
dc.title | Learning to Schedule Network Resources Throughput and Delay Optimally Using Q+-Learning | - |
dc.type | Article | - |
dc.identifier.wosid | 000641964600020 | - |
dc.identifier.scopusid | 2-s2.0-85100475440 | - |
dc.type.rims | ART | - |
dc.citation.volume | 29 | - |
dc.citation.issue | 2 | - |
dc.citation.beginningpage | 750 | - |
dc.citation.endingpage | 763 | - |
dc.citation.publicationname | IEEE-ACM TRANSACTIONS ON NETWORKING | - |
dc.identifier.doi | 10.1109/TNET.2021.3051663 | - |
dc.contributor.localauthor | Chong, Song | - |
dc.contributor.nonIdAuthor | Lee, Joohyun | - |
dc.description.isOpenAccess | N | - |
dc.type.journalArticle | Article | - |
dc.subject.keywordAuthor | Network resource management | - |
dc.subject.keywordAuthor | throughput and delay optimality | - |
dc.subject.keywordAuthor | reinforcement learning | - |
dc.subject.keywordAuthor | upper confidence bound | - |
dc.subject.keywordPlus | ALLOCATION | - |
dc.subject.keywordPlus | OPTIMIZATION | - |
dc.subject.keywordPlus | MODEL | - |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
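The abstract describes Q-learning with UCB exploration, the baseline the paper improves upon. Below is a minimal, hypothetical sketch of plain tabular Q-learning with a UCB-style exploration bonus on a toy two-state MDP, just to illustrate the general technique; it is not the paper's Q+-learning, and the function name, bonus constant `c`, and the toy transition/reward matrices are all illustrative assumptions.

```python
import numpy as np

def q_learning_ucb(transitions, rewards, n_states, n_actions,
                   gamma=0.9, episodes=500, horizon=30, c=2.0, seed=0):
    """Tabular Q-learning with a UCB exploration bonus (illustrative sketch).

    Instead of epsilon-greedy action selection, each step picks
        argmax_a  Q[s, a] + c * sqrt(log(t) / N[s, a]),
    so rarely tried actions receive an optimism bonus that shrinks
    as their visit count N[s, a] grows.
    """
    rng = np.random.default_rng(seed)
    Q = np.zeros((n_states, n_actions))
    N = np.ones((n_states, n_actions))   # visit counts (start at 1 to avoid /0)
    t = 1
    for _ in range(episodes):
        s = 0
        for _ in range(horizon):
            bonus = c * np.sqrt(np.log(t + 1) / N[s])
            a = int(np.argmax(Q[s] + bonus))
            s_next = rng.choice(n_states, p=transitions[s, a])
            r = rewards[s, a]
            alpha = 1.0 / N[s, a]        # decaying step size
            Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
            N[s, a] += 1
            t += 1
            s = s_next
    return Q

# Toy 2-state MDP: in state 0, action 1 reliably reaches the rewarding state 1.
P = np.array([[[0.9, 0.1], [0.1, 0.9]],
              [[0.5, 0.5], [0.5, 0.5]]])   # P[s, a] = next-state distribution
R = np.array([[0.0, 0.0],
              [1.0, 1.0]])                 # reward 1 for acting in state 1
Q = q_learning_ucb(P, R, n_states=2, n_actions=2)
```

The paper's contribution is a modified value function (the Q+-function) and a tighter sample-complexity bound for this style of algorithm; the sketch above shows only the vanilla UCB-exploration loop it builds on.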