Use of lagged information in partially observable Markov decision process

This thesis studies the control of a finite-state, discrete-time Markov process when the state can be observed only incompletely. This problem is generally called a Partially Observable Markov Decision Process (POMDP). The performance of such a system depends on the quality of the state observations, i.e., on the uncertainty about the state. To reduce this uncertainty, it is therefore worthwhile to obtain additional information about the state of the Markov process whenever this is possible and valuable. Among the various possible structures of additional information, this study focuses on the case in which an uncertain, delayed observation of the state becomes available after one transition. In other words, the aim is to reduce the state uncertainty inherent in a general POMDP by using lagged information, and to control a Markov process with two types of observation obtained from separate information sources. The problem can thus be regarded as a Markov Decision Process (MDP) with lagged and current partial observations. The thesis consists of three main parts. First, a finite-horizon POMDP with lagged and current partial observations is considered, and an algorithm is developed for finding an optimal policy and its minimum expected total cost. Second, the thesis considers a POMDP with only the current observation for the case in which the system has an infinite number of time periods. An algorithm is developed for finding an optimal stationary policy that minimizes the expected discounted cost. The algorithm is a modified version of the well-known policy iteration algorithm, and the modification concerns its value determination routine. Some properties of the approximate functions for the expected discounted cost of a stationary policy are investigated, and the expected discounted cost of a stationary policy is approximated on the basis of these properties. That is, the value determination step adopts the concept of successive approximation. Lastly, this ...
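To make the idea of combining a lagged and a current observation concrete, the following is a minimal sketch of a Bayesian belief update, not the algorithm developed in the thesis. It assumes that the delayed observation depends only on the previous state and that the current observation depends only on the new state. The function name `belief_update` and the matrices `T`, `O`, and `L` are hypothetical names introduced here for illustration.

```python
def belief_update(belief, T, O, o, L=None, z=None):
    """Return the updated belief over states after one transition.

    belief : list of probabilities over the previous state
    T[s][s2] : transition probability from s to s2 under the chosen action
    O[s2][o] : probability of current observation o in the new state s2
    L[s][z]  : probability of delayed observation z given the previous
               state s (hypothetical model, an assumption of this sketch)
    """
    n = len(belief)

    # Step 1: if a lagged observation of the previous state is available,
    # reweight the prior belief by its likelihood.
    if L is not None and z is not None:
        prior = [belief[s] * L[s][z] for s in range(n)]
    else:
        prior = list(belief)

    # Step 2: push the (unnormalized) prior through the transition matrix.
    predicted = [sum(prior[s] * T[s][s2] for s in range(n)) for s2 in range(n)]

    # Step 3: weight by the likelihood of the current observation.
    unnormalized = [predicted[s2] * O[s2][o] for s2 in range(n)]

    total = sum(unnormalized)
    if total == 0:
        raise ValueError("observations have zero probability under the model")
    return [u / total for u in unnormalized]


# Illustrative two-state example with made-up numbers.
T = [[0.9, 0.1], [0.2, 0.8]]
O = [[0.8, 0.2], [0.3, 0.7]]
L_perfect = [[1.0, 0.0], [0.0, 1.0]]  # lagged observation reveals the old state

b_current_only = belief_update([0.5, 0.5], T, O, o=0)
b_with_lag = belief_update([0.5, 0.5], T, O, o=0, L=L_perfect, z=0)
```

In this made-up example, adding the lagged observation concentrates the belief more strongly on state 0 than the current observation alone, which is the kind of uncertainty reduction the abstract describes.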
Advisors
Kim, Soung-Hie (김성희)
Publisher
Korea Advanced Institute of Science and Technology (한국과학기술원)
Issue Date
1989
Identifier
61352/325007 / 000835365
Language
eng
Description

Thesis (Ph.D.) - Korea Advanced Institute of Science and Technology : Department of Industrial Engineering, 1989.2, [ [iv], 124 p. ]

URI
http://hdl.handle.net/10203/40393
Link
http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=61352&flag=t
Appears in Collection
IE-Theses_Ph.D.(박사논문)
Files in This Item
There are no files associated with this item.