6R-03

Dynamic Hierarchization of Reinforcement Learning through Subgoal Emergence in POMDP Environments

In this paper, we propose a method for generating subgoals for reinforcement learning in partially observable Markov decision processes (POMDPs). In a POMDP, an agent can confuse several distinct states because it observes the same information from the environment in each of them.
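The confusion described above can be illustrated with a minimal sketch. The corridor layout, the wall-only sensor, and the names `CORRIDOR`, `observe`, `best_action`, and `aliased_states` are assumptions made for illustration; they are not taken from the experiments in this paper.

```python
# Hypothetical 1-D corridor: '#' is a wall, '.' is a free cell, 'G' is the goal.
CORRIDOR = "#..G..#"

def observe(pos):
    """What the agent senses: whether its left and right neighbours are walls."""
    return (CORRIDOR[pos - 1] == "#", CORRIDOR[pos + 1] == "#")

def best_action(pos):
    """The action that moves the agent toward the goal from a free cell."""
    return "right" if pos < CORRIDOR.index("G") else "left"

def aliased_states():
    """Group free cells by observation; groups with several cells are aliased."""
    groups = {}
    for pos, cell in enumerate(CORRIDOR):
        if cell == ".":
            groups.setdefault(observe(pos), []).append(pos)
    return {obs: cells for obs, cells in groups.items() if len(cells) > 1}
```

In this sketch, cells 2 and 4 yield the same observation but require opposite actions, so a policy that maps each observation directly to one action cannot be correct in both cells.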

To resolve this problem, we propose a genetic algorithm that dynamically generates subgoals for reinforcement learning. Subgoals are constructed as conditional formulas. The number of subgoals does not have to be tuned for our method, and each agent arrives at a different solution because the agents behave independently. We confirmed the effectiveness of our method through experiments on partially observable mazes.
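One way to picture a subgoal written as a conditional formula is sketched below. The abstract does not specify the encoding or the genetic operators, so everything here is an assumption: the `Subgoal` class, the conjunction-of-sensor-terms form, the `stage_after` rule, and the `mutate` operator are hypothetical and should not be read as the authors' implementation.

```python
import random
from dataclasses import dataclass

@dataclass(frozen=True)
class Subgoal:
    # Hypothetical encoding: a conjunction of (sensor index, required value) terms.
    terms: tuple

    def holds(self, observation):
        """True when every term of the formula matches the observation."""
        return all(observation[i] == v for i, v in self.terms)

def stage_after(subgoals, stage, observation):
    """Advance to the next stage once the current subgoal's condition holds."""
    if stage < len(subgoals) and subgoals[stage].holds(observation):
        return stage + 1
    return stage

def mutate(subgoals, n_sensors, rng):
    """Assumed GA-style mutation: drop a subgoal or insert a random one,
    so the number of subgoals is free to change between generations."""
    subgoals = list(subgoals)
    if subgoals and rng.random() < 0.5:
        subgoals.pop(rng.randrange(len(subgoals)))
    else:
        term = (rng.randrange(n_sensors), rng.choice([True, False]))
        subgoals.insert(rng.randint(0, len(subgoals)), Subgoal((term,)))
    return subgoals
```

Under these assumptions, the stage index could select which value table the learner uses, so the same observation may lead to different actions before and after a subgoal is reached.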
