Maintenance strategies of multi-component system by reinforcement learning (강화학습을 활용한 다부품 시스템의 유지 보수 전략)

In this dissertation, we find optimal or suboptimal policies for different maintenance strategies, such as age-based preventive maintenance and opportunistic preventive maintenance, for a multi-component system composed of non-identical components. We model the system with the Markov Decision Process (MDP) formalism and solve it with model-free reinforcement learning algorithms. In the first part, we use an MDP to model preventive maintenance strategies for equipment composed of multiple non-identical components that have different time-to-failure probability distributions. The originality of this part lies in using a Monte Carlo Reinforcement Learning (MCRL) approach to find the optimal policy for each strategy. The approach is applied to a previously published application that deals with a fleet of military trucks. The fleet consists of similar trucks, each composed of non-identical components. The problem is formulated as an MDP and solved by an MCRL technique. The advantage of this modeling technique over the published one is that the main parameters of the model, for example the transition probabilities, do not need to be estimated. These parameters are treated as variables, and the technique finds them while searching for the optimal solution. Moreover, the technique is not bound by any explicit mathematical formula, and it converges to the optimal solution, whereas the previous model optimizes the replacement policy of each component separately, which leads to a local optimum. The results show that the reinforcement learning approach yields a solution that is 36.44% better, that is, with less downtime. In the second part, we note that equipment usually consists of many components arranged in a hierarchical structure. To achieve an efficient maintenance strategy, the system hierarchy should be taken into account.
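As a rough illustration of the Monte Carlo idea described above, the following sketch runs on-policy Monte Carlo control on a toy two-component system. It is not the model from the dissertation: the failure probabilities, the costs, the age cap, and all function names (`step`, `mc_control`, `greedy_action`) are hypothetical and chosen only for illustration. Note that the learner never sees the transition probabilities; it only samples them through `step`.

```python
import random
from collections import defaultdict

MAX_AGE = 5                                   # ages are capped here (assumption)
HAZARD = [(0.02, 0.05), (0.05, 0.10)]         # hypothetical (base, slope) per component
ACTIONS = [(0, 0), (1, 0), (0, 1), (1, 1)]    # 1 = preventively replace that component
C_PREV, C_FAIL = 1.0, 5.0                     # hypothetical downtime costs


def step(state, action, rng):
    """Simulate one period of the toy system; return (next_state, cost)."""
    cost, ages = 0.0, []
    for i, age in enumerate(state):
        if action[i]:                         # planned replacement
            cost += C_PREV
            ages.append(0)
            continue
        base, slope = HAZARD[i]
        if rng.random() < min(1.0, base + slope * age):
            cost += C_FAIL                    # failure, replaced correctively
            ages.append(0)
        else:
            ages.append(min(age + 1, MAX_AGE))
    return tuple(ages), cost


def greedy_action(q, state):
    """Action with the lowest estimated discounted cost in this state."""
    return min(ACTIONS, key=lambda a: q[(state, a)])


def mc_control(episodes=5000, horizon=30, gamma=0.95, eps=0.1, seed=0):
    """Every-visit, epsilon-greedy Monte Carlo control (cost minimization)."""
    rng = random.Random(seed)
    q = defaultdict(float)                    # estimated cost-to-go per (state, action)
    visits = defaultdict(int)
    for _ in range(episodes):
        state, trajectory = (0, 0), []
        for _ in range(horizon):
            if rng.random() < eps:
                action = rng.choice(ACTIONS)
            else:
                action = greedy_action(q, state)
            nxt, cost = step(state, action, rng)
            trajectory.append((state, action, cost))
            state = nxt
        g = 0.0
        for s, a, c in reversed(trajectory):  # accumulate discounted return backwards
            g = c + gamma * g
            visits[(s, a)] += 1
            q[(s, a)] += (g - q[(s, a)]) / visits[(s, a)]
    return q
```

Calling `q = mc_control()` and then `greedy_action(q, (3, 4))` returns the replacement decision the learned estimates prefer when the two components have ages 3 and 4.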
In this dissertation, we first give a nomenclature for describing a system composed of multiple non-identical components in a hierarchical structure. We then model the system under age-based and opportunistic preventive maintenance strategies using the MDP formalism, and find near-optimal policies that minimize the expected discounted cost with the SARSA algorithm from reinforcement learning (RL). We perform simulation experiments to compare the near-optimal policies obtained by SARSA for both strategies against corrective maintenance and against the age-based preventive maintenance policy obtained from renewal reward theory. We show that the proposed opportunistic preventive maintenance outperforms the other strategies.
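The SARSA step mentioned above can be sketched on a toy two-component system in which a shared setup cost makes it attractive to replace a working component while a failed one is being repaired. This is only an illustrative sketch under stated assumptions, not the dissertation's hierarchical model: the failure slopes, the costs, the `FAILED` marker, and the names `step`, `choose`, and `sarsa` are all hypothetical.

```python
import random
from collections import defaultdict

MAX_AGE = 6                                   # ages are capped here (assumption)
FAILED = -1                                   # marker for a failed component
FAIL_SLOPE = (0.04, 0.08)                     # hypothetical failure prob. per unit of age
ACTIONS = [(0, 0), (1, 0), (0, 1), (1, 1)]    # 1 = preventively replace that component
C_SETUP, C_PREV, C_CORR = 2.0, 1.0, 4.0       # hypothetical costs


def step(state, action, rng):
    """Apply maintenance, then let components age; return (next_state, cost)."""
    cost, ages = 0.0, []
    for age, replace in zip(state, action):
        if age == FAILED:                     # corrective replacement is forced
            cost += C_CORR
            age = 0
        elif replace:                         # preventive replacement
            cost += C_PREV
            age = 0
        ages.append(age)
    if cost > 0:
        cost += C_SETUP                       # setup is paid once per intervention
    nxt = []
    for age, slope in zip(ages, FAIL_SLOPE):
        if rng.random() < min(1.0, slope * (age + 1)):
            nxt.append(FAILED)
        else:
            nxt.append(min(age + 1, MAX_AGE))
    return tuple(nxt), cost


def choose(q, state, eps, rng):
    """Epsilon-greedy choice that minimizes the estimated discounted cost."""
    if rng.random() < eps:
        return rng.choice(ACTIONS)
    return min(ACTIONS, key=lambda a: q[(state, a)])


def sarsa(steps=200_000, alpha=0.1, gamma=0.95, eps=0.1, seed=0):
    """On-policy SARSA on one long simulated trajectory."""
    rng = random.Random(seed)
    q = defaultdict(float)
    state = (0, 0)
    action = choose(q, state, eps, rng)
    for _ in range(steps):
        nxt, cost = step(state, action, rng)
        nxt_action = choose(q, nxt, eps, rng)
        # The SARSA target uses the action actually selected in the next state.
        target = cost + gamma * q[(nxt, nxt_action)]
        q[(state, action)] += alpha * (target - q[(state, action)])
        state, action = nxt, nxt_action
    return q
```

After `q = sarsa()`, comparing `q[((FAILED, 4), a)]` across the actions `a` in `ACTIONS` shows whether the learned estimates favor replacing the aged second component while the first is already being repaired, which is the opportunistic decision in this toy setting.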
Advisors
Shin, Hayong (신하용)
Description
Korea Advanced Institute of Science and Technology (KAIST): Department of Industrial and Systems Engineering
Publisher
Korea Advanced Institute of Science and Technology (KAIST)
Issue Date
2017
Identifier
325007
Language
eng
Description

Thesis (Master's) - Korea Advanced Institute of Science and Technology (KAIST): Department of Industrial and Systems Engineering, 2017.2, [iv, 37 p.]

Keywords

Markov Decision Process; Preventive Maintenance; Opportunistic Maintenance; Reinforcement Learning; Multi-Component System

URI
http://hdl.handle.net/10203/243036
Link
http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=675233&flag=dissertation
Appears in Collection
IE-Theses_Master(석사논문)
Files in This Item
There are no files associated with this item.
