Metacognition guides near-optimal exploration of a large state space with sparse rewards

Earlier studies have used reinforcement learning theory to explain how animals explore a task space to maximize reward. However, the majority of empirical tests rely heavily on simple task paradigms, which limits our understanding of the ability to explore an uncharted world with infinitely many options, a setting that inevitably entails the sparse-reward problem. Here, we test the theoretical idea that metacognition [1,2], the ability to introspect and estimate one's own level of uncertainty, guides efficient exploration. We designed a novel two-stage decision-making task with infinitely many choices and sparse rewards (on average, 90% of rewards fall within fewer than 8% of the options in reward states) and collected behavioral data from 88 subjects. First, we identified two key variables guiding exploration: uncertainty about the environmental structure (state-space uncertainty) and information about the reward structure (reward information). To further understand exploration dynamics, we dissociated the effects of the two variables as a function of learning stage. We found that state-space uncertainty is significantly correlated with individual metacognitive ability measured using an independent perception task [3]. Interestingly, highly metacognitive subjects act on state-space uncertainty throughout learning, while the effect of reward information on exploration behavior diminishes after the early learning stage. Note that this learning bias toward the environmental structure and away from the reward structure is a near-optimal exploration strategy for the sparse-reward problem. In contrast, the effects of both variables persist in the low-metacognition subject group. Our theory is further supported by the finding that the high-metacognition subject group showed higher task performance and sampling efficiency in the test phase following the learning phase. Taken together, our work elucidates a crucial role of metacognition in fostering a sample-efficient, near-optimal exploration strategy to resolve uncertainty about environmental and reward structures.
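
To make the exploration rule described above concrete, the sketch below is an illustrative reading of the abstract, not the authors' model: the function names, the softmax choice rule, and the decay constant are all assumptions. It scores each candidate option by combining state-space uncertainty with reward information, and lets the weight on reward information decay after an early learning stage, mirroring the near-optimal bias the abstract attributes to high-metacognition subjects.

```python
# Hypothetical sketch (not the authors' code): exploration guided by
# state-space uncertainty and reward information, with the reward-information
# weight decaying after early learning.
import numpy as np

def exploration_scores(state_uncertainty, reward_info, trial, decay_trial=50.0):
    """Score each candidate option for exploration.

    state_uncertainty : per-option uncertainty about the environmental (state-space) structure
    reward_info       : per-option information about the reward structure
    trial             : current trial index
    decay_trial       : assumed time scale after which reward information
                        stops driving exploration (illustrative value)
    """
    w_reward = np.exp(-trial / decay_trial)   # reward information fades after early learning
    w_state = 1.0                             # state-space uncertainty keeps driving exploration
    return w_state * np.asarray(state_uncertainty) + w_reward * np.asarray(reward_info)

def choose_option(state_uncertainty, reward_info, trial, temperature=1.0, rng=None):
    """Sample an option via a softmax over the exploration scores."""
    rng = np.random.default_rng() if rng is None else rng
    scores = exploration_scores(state_uncertainty, reward_info, trial) / temperature
    p = np.exp(scores - scores.max())
    p /= p.sum()
    return rng.choice(len(p), p=p)

# Example: early in learning both signals shape the choice;
# late in learning the choice is effectively uncertainty-driven.
u = np.array([0.9, 0.2, 0.5])
r = np.array([0.1, 0.8, 0.3])
print(exploration_scores(u, r, trial=1))
print(exploration_scores(u, r, trial=500))
```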
Publisher
Computational and Systems Neuroscience (COSYNE)
Issue Date
2021-02-24
Language
English
Citation
Computational and Systems Neuroscience, COSYNE 2021
URI
http://hdl.handle.net/10203/290697
Appears in Collection
BiS-Conference Papers (Conference Papers)