A study of human exploration-exploitation strategies in a restless two-armed bandit task
Studying and modeling human decision-making mechanisms makes it possible to understand and predict decision-making behavior. The core problem in an unstable environment is the exploration-exploitation dilemma. We divide decision-making models into two parts: (a) a value function learned from reward; (b) a strategy that balances exploitation and exploration based on the value function. We investigate decision-making in a restless two-armed bandit task and fit multiple methods for each part to the dataset. We use AIC and BIC to select the best method and find that the exploration-exploitation tradeoff parameters can better classify human choice patterns.
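To make the two-part decomposition concrete, the following is a minimal sketch and not the specific models fitted in this study. It assumes a delta-rule value update for part (a) and a softmax choice rule for part (b), two common choices in the bandit literature. The names `alpha` (learning rate) and `beta` (inverse temperature, the exploration-exploitation tradeoff parameter in this sketch) are illustrative assumptions.

```python
import math

def softmax_prob(q, beta, arm):
    """Part (b), assumed softmax rule: probability of choosing `arm` (0 or 1).

    With two arms, softmax reduces to a logistic function of the value difference.
    beta = 0 gives random choice (pure exploration); large beta gives greedy choice.
    """
    diff = q[arm] - q[1 - arm]
    return 1.0 / (1.0 + math.exp(-beta * diff))

def neg_log_likelihood(choices, rewards, alpha, beta):
    """Negative log-likelihood of a choice sequence under the assumed model."""
    q = [0.0, 0.0]  # initial value estimates for the two arms (assumed zero)
    nll = 0.0
    for c, r in zip(choices, rewards):
        nll -= math.log(softmax_prob(q, beta, c))
        # Part (a), assumed delta rule: move the chosen arm's value toward the reward.
        q[c] += alpha * (r - q[c])
    return nll

def aic(nll, k):
    """Akaike information criterion for k free parameters."""
    return 2 * k + 2 * nll

def bic(nll, k, n):
    """Bayesian information criterion for k free parameters and n trials."""
    return k * math.log(n) + 2 * nll
```

In a model comparison of this kind, each candidate pairing of value function and strategy would be fitted by minimizing the negative log-likelihood over its parameters, and the pairing with the lowest AIC or BIC would be preferred.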