Abstract
Task: We use the probabilistic learning task described in Pasquereau et al., 2007, in which four visual cues are associated with different reward probabilities (0.00, 0.33, 0.66 and 1.00). A trial consists of the simultaneous presentation of two random cues with equal salience at two random positions. Some time after the presentation, a switch in cortical activity is observed, which represents the decision taken. Once the model has chosen one of the two cues, a reward is delivered according to the probability associated with the chosen cue. Connections between the cortex and the striatum are then modified using a reinforcement learning rule based on the reward signal. The model is trained over 120 trials such that each combination of cues is presented an equal number of times at uniformly sampled positions, and its performance, measured as the ratio of optimal choices, reaches at least 0.9. The decision switch and the performance are identical to those observed when primates are tested on the same task. Learning is then disabled and the model is tested, always with the same pair of cues, A (P(R) = 1) and B (P(R) = 0.33), in the presence of external factors. We study how, despite reward-based learning, the visual salience of the stimuli and the temporal difference between stimulus presentations lead the model to take a sub-optimal decision.
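The trial structure described above can be illustrated with a minimal Python sketch. It covers only the task protocol (balanced cue pairs, random positions, probabilistic reward, ratio of optimal choices) and not the cortex-striatum model or its learning rule. The function names and the number of screen positions (`N_POSITIONS = 4`) are assumptions made for illustration; the abstract does not specify them.

```python
import itertools
import random

# Reward probabilities of the four cues, as stated in the abstract.
REWARD_PROBS = [0.0, 0.33, 0.66, 1.0]
N_TRIALS = 120
N_POSITIONS = 4  # assumption: the abstract does not give the number of positions


def build_schedule(rng):
    """Return a shuffled list of trials, each ((cue_a, cue_b), (pos_a, pos_b))."""
    # All unordered pairs of distinct cues: 6 combinations for 4 cues.
    pairs = list(itertools.combinations(range(len(REWARD_PROBS)), 2))
    repeats = N_TRIALS // len(pairs)  # 20 presentations per combination
    schedule = []
    for pair in pairs:
        for _ in range(repeats):
            # Two distinct positions sampled uniformly.
            positions = tuple(rng.sample(range(N_POSITIONS), 2))
            schedule.append((pair, positions))
    rng.shuffle(schedule)
    return schedule


def sample_reward(cue, rng):
    """Return 1 with the probability associated with the chosen cue, else 0."""
    return 1 if rng.random() < REWARD_PROBS[cue] else 0


def performance(choices):
    """Ratio of optimal choices; `choices` is a list of (pair, chosen_cue)."""
    optimal = sum(
        1
        for pair, chosen in choices
        if chosen == max(pair, key=lambda c: REWARD_PROBS[c])
    )
    return optimal / len(choices)
```

A hypothetical agent that always picks the cue with the higher reward probability would score a performance of 1.0 on such a schedule; the trained model is reported to reach at least 0.9.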