Title: Causality, contingency, and dopamine: a Bayesian causal structure learning theory of associative learning experiments
Abstract: The canonical temporal difference (TD) learning account of associative learning experiments, which posits that phasic dopamine signals reward prediction errors (RPEs), appears to be in tension with experimental results showing that contingency (e.g., whether a cue is required for reward) impacts the learned value of a reward-associated cue. Attempts to explain these results generally invoke cause-effect learning, but either lack a rigorous theoretical foundation or are difficult to generalize. In this talk, I present a Bayesian causal structure learning theory of these results that is simple, rigorously formulated, and easy to generalize across associative learning paradigms. According to the theory, animals experience continuous-time event sequences (e.g., cue, reward, cue, reward, …) and use them to adjudicate between different possible underlying causal models (e.g., ‘cue causes reward’, versus ‘cue and reward occur independently’). Dopamine is still assumed to encode RPEs, but these RPEs are modulated by the learned causal model, which determines the animal’s belief about the ‘state’ of the environment. I will discuss how this theory generically (i.e., without parameter tuning) explains contingency degradation experiments, as well as a plethora of other puzzling results in associative learning, like the observation that associations can be learned in fewer trials if each trial is longer. Finally, I will discuss how this theory connects to causal reinforcement learning, and in particular defines a natural continuous-time analogue to the structural causal models (SCMs) usually used to formulate cause-effect learning.
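The core mechanism described above, adjudicating between candidate causal structures (e.g., 'cue causes reward' versus 'cue and reward occur independently') from observed event outcomes, can be illustrated with a minimal Bayesian model-comparison sketch. This is a hypothetical toy example, not the talk's actual continuous-time model: the discrete-trial framing, the variable names, and the probabilities `p_r` and `base_rate` are all illustrative assumptions.

```python
import math

# Hypothetical sketch (not the talk's actual model): Bayesian comparison of two
# candidate causal structures for a cue/reward stream.
#   H_causal: the cue causes reward, which follows with probability p_r.
#   H_indep:  rewards occur at a fixed base rate, independent of the cue.
# Illustrative data: 50 cue presentations, reward followed on 45 of them.
rewards = [1] * 45 + [0] * 5

def log_lik(outcomes, p):
    """Bernoulli log-likelihood of the reward outcomes under reward probability p."""
    return sum(math.log(p) if r else math.log(1.0 - p) for r in outcomes)

p_r = 0.9        # assumed reward probability given the cue, under H_causal
base_rate = 0.3  # assumed per-trial reward probability under H_indep

# With a flat prior over the two structures, the log posterior odds equal the
# log likelihood ratio; positive values favor 'cue causes reward'.
log_posterior_odds = log_lik(rewards, p_r) - log_lik(rewards, base_rate)
print(round(log_posterior_odds, 2))  # → 39.71
```

In this toy version, degrading contingency (raising `base_rate` toward `p_r`, i.e., delivering rewards just as often without the cue) shrinks the posterior odds toward zero, which is the qualitative behavior the theory uses to explain contingency degradation; the actual theory operates on continuous-time event sequences rather than discrete trials.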