Adaptive decision-making is not only about choosing the correct option. It also requires choosing at the right speed. In uncertain environments, the brain must continuously manage the trade-off between accuracy and reaction time: waiting longer can improve decisions, but waiting too long may reduce reward rate. How cortico-basal ganglia-thalamic (CBGT) circuits learn to tune this balance remains a central question in computational neuroscience.
In this study, together with Timothy Verstynen and Jonathan Rubin in Pittsburgh, PA, we used a spiking neural network model of CBGT circuits to investigate how decision policies change during early reward-based learning. We simulated a two-choice task in which one target was rewarded and examined how feedback-driven dopaminergic plasticity at cortico-striatal synapses altered circuit dynamics and behavior. The model’s behavior was then interpreted through an evidence accumulation framework, linking biological circuit activity to decision variables such as evidence accumulation rate, decision threshold and choice bias.
A key contribution of the work is the identification of how low-dimensional CBGT “control ensembles” shape decision policy. These subnetworks, termed responsiveness, pliancy and choice, map onto distinct aspects of the evidence accumulation process. Responsiveness influences how quickly evidence evaluation gets underway; pliancy relates to the standard of evidence required before committing to a decision; and choice reflects the circuit’s commitment toward one of the available options.
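To make the decision variables concrete, here is a minimal sketch of a generic two-boundary drift-diffusion process in Python. It is not the spiking CBGT model used in the study, nor the fitting procedure applied to it. The function name `simulate_ddm`, all parameter names and all numerical values are illustrative assumptions, and the comments linking parameters to the three control ensembles are only a loose reading of the description above.

```python
import numpy as np


def simulate_ddm(drift, threshold, bias=0.5, onset=0.2, noise=1.0,
                 dt=0.001, max_time=5.0, n_trials=2000, seed=0):
    """Simulate a generic two-boundary drift-diffusion process (illustrative only).

    drift     : evidence accumulation rate toward the upper boundary
    threshold : distance between the two boundaries (standard of evidence;
                loosely, the aspect the text associates with pliancy)
    bias      : starting point as a fraction of the threshold (0.5 = unbiased)
    onset     : delay before accumulation starts, in seconds (loosely, the
                aspect the text associates with responsiveness)

    Returns (choices, rts): choices is 1 for the upper boundary, 0 for the
    lower boundary and -1 if no boundary was reached within max_time;
    rts are reaction times in seconds, including the onset delay.
    """
    rng = np.random.default_rng(seed)
    n_steps = int(max_time / dt)
    x = np.full(n_trials, bias * threshold, dtype=float)
    choices = np.full(n_trials, -1)
    rts = np.full(n_trials, np.nan)
    active = np.ones(n_trials, dtype=bool)

    for step in range(1, n_steps + 1):
        n_active = int(active.sum())
        if n_active == 0:
            break
        # Euler-Maruyama update for the trials that are still undecided.
        x[active] += drift * dt + noise * np.sqrt(dt) * rng.standard_normal(n_active)
        hit_upper = active & (x >= threshold)
        hit_lower = active & (x <= 0.0)
        finished = hit_upper | hit_lower
        choices[hit_upper] = 1
        choices[hit_lower] = 0
        rts[finished] = onset + step * dt
        active &= ~finished
    return choices, rts


# Hypothetical comparison: a cautious policy (high threshold) versus a hasty one.
cautious_choices, cautious_rts = simulate_ddm(drift=1.0, threshold=2.0)
hasty_choices, hasty_rts = simulate_ddm(drift=1.0, threshold=1.0)
print("cautious:", np.mean(cautious_choices == 1), np.nanmean(cautious_rts))
print("hasty:   ", np.mean(hasty_choices == 1), np.nanmean(hasty_rts))
```

In this toy setting, raising the threshold trades speed for accuracy, while shifting `bias` or `drift` favors one option over the other. That is the kind of low-dimensional policy change the control ensembles are proposed to implement.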
The study shows that reward prediction errors, transmitted through dopaminergic plasticity, shift the activity of these control ensembles over successive trials. As learning unfolds, these shifts improve reward rate by strategically changing how evidence is used: decisions become better tuned to the task, combining improved accuracy with reduced decision time. In this view, learning does not merely strengthen the “correct” action channel. It reshapes the policy by which the circuit evaluates and commits to evidence.
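The link between decision policy and reward rate can be illustrated with a small calculation. The sketch below uses the standard closed-form accuracy and mean decision time of an unbiased drift-diffusion process and defines reward rate as accuracy divided by total time per trial. The function names, the "early" and "late" parameter values and the fixed inter-trial interval are all hypothetical; they are not taken from the study.

```python
import math


def ddm_accuracy(drift, threshold, noise=1.0):
    """Probability of reaching the correct boundary, starting midway between them."""
    return 1.0 / (1.0 + math.exp(-drift * threshold / noise**2))


def ddm_mean_decision_time(drift, threshold, noise=1.0):
    """Mean time to reach either boundary, starting midway between them."""
    if drift == 0:
        # Limit of the expression below as drift approaches zero.
        return threshold**2 / (4.0 * noise**2)
    return threshold / (2.0 * drift) * math.tanh(drift * threshold / (2.0 * noise**2))


def reward_rate(drift, threshold, onset=0.2, inter_trial=1.0, noise=1.0):
    """Expected rewards per second: accuracy divided by total time per trial."""
    total_time = ddm_mean_decision_time(drift, threshold, noise) + onset + inter_trial
    return ddm_accuracy(drift, threshold, noise) / total_time


# Hypothetical policies: weak evidence and a cautious threshold early in
# learning, stronger evidence and a lower threshold later on.
early = reward_rate(drift=0.5, threshold=1.5)
late = reward_rate(drift=2.0, threshold=1.0)
print(f"early reward rate: {early:.3f}, late reward rate: {late:.3f}")
```

Under these assumed values the later policy is both more accurate and faster, so its reward rate is higher. This mirrors, in a simplified form, the combination of improved accuracy and reduced decision time described above.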
A useful way to summarize the result is: the basal ganglia do not simply select actions; they help tune the decision-making strategy itself. Through dopamine-dependent plasticity, CBGT subnetworks adjust how cautious, responsive or biased the system should be in order to gain rewards more efficiently.
This work provides a mechanistic bridge between synaptic plasticity, circuit-level subnetworks and cognitive decision policies. It shows how biologically grounded changes in cortico-striatal connections can be translated into interpretable shifts in evidence accumulation dynamics, offering a framework for understanding how the brain learns not only what to choose, but how to choose.
To learn more:
- Bahuguna, J., Verstynen, T., & Rubin, J. E. (2025). How cortico-basal ganglia-thalamic subnetworks can shift decision policies to increase reward rate. PLOS Computational Biology, 21(11), e1013712. https://doi.org/10.1371/journal.pcbi.1013712
