Post-feedback frontal beta-band oscillations are commonly considered a mechanism that is triggered by positive outcomes and strengthens the action plan that led to the reward. However, a rewarded action is not always the truly optimal choice. In probabilistic gambling tasks, participants who have learned the optimal strategy still occasionally make exploratory choices, which are disadvantageous on average but may nonetheless be rewarded on some trials. We asked whether positive feedback for deliberately exploratory choices is followed by beta-band synchronization. Alternatively, a negative outcome, which matches the prediction of the acquired utility model, might induce beta synchronization and thus strengthen the probabilistically optimal action strategy. To answer this question, we recorded MEG while forty volunteers performed a probabilistic two-alternative gambling task. One alternative yielded wins more often than losses, while the other yielded losses more often than wins. We found that the commonly observed pattern of beta synchronization in response to feedback was paradoxically inverted for deliberate exploratory choices: frontal beta power increased after negative feedback but not after positive feedback. We hypothesize that, since exploratory choices are tentative quests for information made in violation of the learned utility model, negative feedback for such choices actually works to strengthen the utility model and to re-establish its advantage over its competitor, the exploratory decision.