Dopamine-based error signals suggest a reinforcement learning algorithm during song acquisition in birds

Jesse Goldberg


Reinforcement learning enables animals to learn to select the most rewarding action in a given context. Edward Thorndike posed a simple solution to this problem in his Law of Effect: ‘Responses that produce a satisfying effect in a particular situation become more likely to occur again in that situation, and responses that produce a discomforting effect become less likely to occur again in that situation.’ This idea underlies stimulus-response, reinforcement, and instrumental learning. Implementing it requires three pieces of information: (1) the action (response) an animal makes; (2) the context (situation) in which the action is taken; and (3) an evaluation of the outcome (effect). In vertebrates, the basal ganglia (BG) have been proposed to integrate these three pieces of information: (1) the situation, or current context, is thought to be signaled by a massive projection from the cortex to the striatum, the input layer of the BG; (2) the chosen action is signaled by striatal medium spiny neurons (MSNs) that drive behavior via projections to downstream motor centers; and (3) the evaluation of the outcome is transmitted to the striatum by midbrain dopamine (DA) neurons. These signals underlie a simple ‘three-factor learning rule’: if a cortical input is active (signifying a context), the MSN discharges (driving the chosen action), and an increase in DA subsequently occurs (signifying a good outcome), then the strength of the connection from that cortical input to the MSN is increased. By controlling the strength of the corticostriatal synapse, this DA-modulated corticostriatal plasticity governs which action will be chosen in a given context, placing DA in a central position in determining what animals will learn and how they will behave.
Here, I will discuss how our recent identification of dopaminergic error signals in birdsong supports the potential generality of dopamine-modulated corticostriatal plasticity in implementing learning across a wide range of behaviors.
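
The ‘three-factor learning rule’ described above can be illustrated with a minimal sketch. This is not a model from the work discussed here: the function name, the learning rate, and the scalar encoding of the DA signal are all hypothetical choices made for illustration. The sketch also assumes that a dip in DA weakens the synapse, which the rule as stated above does not specify.

```python
def three_factor_update(weight, cortical_active, msn_fired, da_change,
                        learning_rate=0.1):
    """Return the updated strength of one corticostriatal synapse.

    weight          -- current synaptic strength (arbitrary units)
    cortical_active -- True if the cortical input (context) was active
    msn_fired       -- True if the MSN discharged (action was chosen)
    da_change       -- change in dopamine after the action; positive
                       values signify a better outcome
    learning_rate   -- hypothetical step size, chosen for illustration
    """
    # Factors 1 and 2: the synapse is eligible for change only when
    # the context input and the action-driving MSN were both active.
    eligible = cortical_active and msn_fired
    if not eligible:
        return weight

    # Factor 3: dopamine gates the change. An increase strengthens the
    # synapse, as in the rule described in the text. Treating a decrease
    # as weakening is an assumption of this sketch.
    return weight + learning_rate * da_change


# Context active, action taken, dopamine rises: the synapse strengthens.
w = three_factor_update(0.5, True, True, da_change=1.0)

# Same dopamine rise, but the MSN did not fire: no change.
w_unchanged = three_factor_update(0.5, True, False, da_change=1.0)
```

Requiring all three factors means that a DA increase only reinforces the specific context-action pairing that preceded it, so a synapse whose MSN did not fire is left untouched.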