Learning to Schedule Network Resources Throughput and Delay Optimally Using Q(+)-Learning

Cited 12 times in Web of Science; cited 0 times in Scopus
  • Hit : 336
  • Download : 0
DC Field: Value (Language)
dc.contributor.author: Bae, Jeongmin (ko)
dc.contributor.author: Lee, Joohyun (ko)
dc.contributor.author: Chong, Song (ko)
dc.date.accessioned: 2021-05-25T08:50:24Z
dc.date.available: 2021-05-25T08:50:24Z
dc.date.created: 2021-05-25
dc.date.issued: 2021-04
dc.identifier.citation: IEEE-ACM TRANSACTIONS ON NETWORKING, v.29, no.2, pp.750 - 763
dc.identifier.issn: 1063-6692
dc.identifier.uri: http://hdl.handle.net/10203/285347
dc.description.abstract: As network architectures become more complex and user requirements grow more diverse, efficient network resource management becomes increasingly important. However, existing throughput-optimal scheduling algorithms such as the max-weight algorithm suffer from poor delay performance. In this paper, we present reinforcement learning-based network scheduling algorithms for a single-hop downlink scenario that achieve throughput optimality and converge to minimal delay. To this end, we first formulate the network optimization problem as a Markov decision process (MDP) problem. Then, we introduce a new state-action value function called the Q(+)-function and develop a reinforcement learning algorithm called Q(+)-learning with UCB (Upper Confidence Bound) exploration, which guarantees small performance loss during the learning process. We also derive an upper bound on the sample complexity of our algorithm, which is more efficient than the best known bound for Q-learning with UCB exploration by a factor of gamma(2), where gamma is the discount factor of the MDP problem. Finally, via simulation, we verify that our algorithm achieves a delay reduction of up to 40.8% compared to the max-weight algorithm over various scenarios. We also show that Q(+)-learning with UCB exploration converges to a gamma-optimal policy 10 times faster than Q-learning with UCB.
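The abstract's core ingredients — a queue-scheduling MDP and Q-learning driven by a UCB exploration bonus — can be illustrated with a minimal sketch. This is not the paper's Q(+)-learning; the toy two-queue downlink environment, the negative-backlog reward (a proxy for delay), and all constants below are assumptions made for the example.

```python
import math
import random

random.seed(0)

N_QUEUES = 2     # two users sharing one downlink transmission slot
MAX_Q = 3        # per-queue backlog cap (keeps the state space tiny)
ARRIVAL_P = 0.4  # Bernoulli packet-arrival probability per queue/slot
GAMMA = 0.9      # MDP discount factor
ALPHA = 0.1      # Q-learning step size
UCB_C = 2.0      # exploration-bonus scale (assumed)

def step(state, action):
    """Serve one packet from the chosen queue, then apply arrivals.
    Reward is the negative total backlog, so maximizing reward keeps
    queues (and hence delay) small."""
    queues = list(state)
    if queues[action] > 0:
        queues[action] -= 1
    for i in range(N_QUEUES):
        if random.random() < ARRIVAL_P:
            queues[i] = min(queues[i] + 1, MAX_Q)
    return tuple(queues), -sum(queues)

Q = {}       # (state, action) -> value estimate
counts = {}  # (state, action) -> visit count

def ucb_action(state, t):
    """Greedy in Q-value plus an optimism bonus that shrinks as the
    state-action pair is visited more often."""
    scores = []
    for a in range(N_QUEUES):
        n = counts.get((state, a), 0)
        bonus = UCB_C * math.sqrt(math.log(t + 2) / (n + 1))
        scores.append(Q.get((state, a), 0.0) + bonus)
    return scores.index(max(scores))

state = (0, 0)
for t in range(20000):
    a = ucb_action(state, t)
    nxt, r = step(state, a)
    counts[(state, a)] = counts.get((state, a), 0) + 1
    target = r + GAMMA * max(Q.get((nxt, b), 0.0) for b in range(N_QUEUES))
    old = Q.get((state, a), 0.0)
    Q[(state, a)] = old + ALPHA * (target - old)
    state = nxt

def greedy(s):
    """Exploitation policy after learning: pick the highest-value action."""
    return max(range(N_QUEUES), key=lambda a: Q.get((s, a), 0.0))
```

On this symmetric toy instance the learned greedy policy tends to serve the longer queue, which matches the delay intuition behind throughput-optimal scheduling; the paper's contribution is a principled Q(+)-function and sample-complexity bound rather than this ad hoc setup.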
dc.language: English
dc.publisher: IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
dc.title: Learning to Schedule Network Resources Throughput and Delay Optimally Using Q(+)-Learning
dc.type: Article
dc.identifier.wosid: 000641964600020
dc.identifier.scopusid: 2-s2.0-85100475440
dc.type.rims: ART
dc.citation.volume: 29
dc.citation.issue: 2
dc.citation.beginningpage: 750
dc.citation.endingpage: 763
dc.citation.publicationname: IEEE-ACM TRANSACTIONS ON NETWORKING
dc.identifier.doi: 10.1109/TNET.2021.3051663
dc.contributor.localauthor: Chong, Song
dc.contributor.nonIdAuthor: Lee, Joohyun
dc.description.isOpenAccess: N
dc.type.journalArticle: Article
dc.subject.keywordAuthor: Network resource management
dc.subject.keywordAuthor: throughput and delay optimality
dc.subject.keywordAuthor: reinforcement learning
dc.subject.keywordAuthor: upper confidence bound
dc.subject.keywordPlus: ALLOCATION
dc.subject.keywordPlus: OPTIMIZATION
dc.subject.keywordPlus: MODEL
Appears in Collection
AI-Journal Papers (Journal Papers)
Files in This Item
There are no files associated with this item.