FIT2014: The 13th Forum on Information Technology

Abstract

RF-005
GA-Based Problem Decomposition for Reinforcement Learning in POMDP Environments

◎野村拓己, 加藤昇平 (Nagoya Institute of Technology)

In this paper, we propose a method that generates sub-goals for reinforcement learning
in POMDPs (partially observable Markov decision processes). In a POMDP, the agent can
receive the same observation in several different states, so it cannot tell those states
apart. To resolve this problem, we propose a genetic algorithm that dynamically generates
sub-goals for reinforcement learning. Our method does not require the number of sub-goals
to be tuned in advance, and because the agents behave independently, each of them arrives
at a different solution. We confirmed the effectiveness of our method through experiments
on partially observable mazes with HQ-Learning.
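
The state confusion described above can be illustrated with a small sketch. This is not the authors' code or maze: the corridor layout, the wall-based observation, and the names `MAZE`, `observe`, and `aliased_groups` are all assumptions made for illustration. The agent is assumed to sense only whether each of its four neighbouring cells is a wall, so several distinct cells produce an identical observation.

```python
from collections import defaultdict

# Hypothetical maze for illustration: '#' is a wall, everything else is open.
MAZE = [
    "#######",
    "#S   G#",
    "#######",
]


def observe(maze, row, col):
    """Return the assumed local observation: wall flags for (north, east, south, west)."""
    return (
        maze[row - 1][col] == "#",
        maze[row][col + 1] == "#",
        maze[row + 1][col] == "#",
        maze[row][col - 1] == "#",
    )


def aliased_groups(maze):
    """Group open cells by observation and keep only groups with more than one cell."""
    groups = defaultdict(list)
    for row, line in enumerate(maze):
        for col, cell in enumerate(line):
            if cell != "#":
                groups[observe(maze, row, col)].append((row, col))
    return {obs: cells for obs, cells in groups.items() if len(cells) > 1}


if __name__ == "__main__":
    for obs, cells in aliased_groups(MAZE).items():
        print(f"observation {obs} is shared by cells {cells}")
```

In this sketch the three interior corridor cells yield the same observation, so a memoryless policy must choose the same action in all of them. Splitting the task at sub-goals is one way to let separate policies handle such look-alike states differently, which is the motivation the abstract gives for generating sub-goals.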