Information Processing Society of Japan (IPSJ), 77th National Convention, Lecture Abstract

6R-03

Dynamic Hierarchization of Reinforcement Learning through Subgoal Emergence in POMDP Environments

○野村拓己, 加藤昇平 (Nagoya Institute of Technology)

In this paper, we propose a method that generates subgoals for reinforcement learning in a POMDP (partially observable Markov decision process). In a POMDP, an agent can confuse several distinct states because the same information is observed from the environment in each of them.
To resolve this problem, we propose a genetic algorithm that dynamically generates subgoals for reinforcement learning. Subgoals are constructed as conditional formulas. The number of subgoals does not have to be tuned in our method, and each agent arrives at a different solution because the agents behave independently. We confirmed the effectiveness of our method through experiments with partially observable mazes.
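The state confusion described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the corridor maze, the four-wall observation in `observe`, and the example condition `west_is_wall`, which only hints at how a subgoal might be written as a condition on observations.

```python
from collections import defaultdict

# Hypothetical corridor maze: '#' is a wall, '.' is a free cell.
MAZE = [
    "#######",
    "#.....#",
    "#######",
]


def observe(maze, row, col):
    """Return wall flags (north, east, south, west) for a free cell.

    Assumed observation model: the agent senses only adjacent walls.
    """
    return (
        maze[row - 1][col] == "#",
        maze[row][col + 1] == "#",
        maze[row + 1][col] == "#",
        maze[row][col - 1] == "#",
    )


def aliased_groups(maze):
    """Group free cells that yield the same observation."""
    groups = defaultdict(list)
    for r, line in enumerate(maze):
        for c, ch in enumerate(line):
            if ch == ".":
                groups[observe(maze, r, c)].append((r, c))
    # Keep only observations shared by more than one cell.
    return {obs: cells for obs, cells in groups.items() if len(cells) > 1}


def west_is_wall(obs):
    """Hypothetical subgoal condition: true when the west side is a wall."""
    return obs[3]


if __name__ == "__main__":
    for obs, cells in aliased_groups(MAZE).items():
        print(obs, "is observed in cells", cells)
```

In this sketch the three middle corridor cells return the same observation, so a policy that depends only on the current observation cannot tell them apart, while a condition such as `west_is_wall` holds only at the left end of the corridor.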
