In serial recall tasks, presenting items in alternating female and male voices impairs performance relative to single-voice presentation. This phenomenon, termed the talker-variability effect (TVE), was recently reexamined by Hughes, Marsh, and Jones (2009, 2011), who used the effect as confirmatory evidence for their perceptual-gestural account of serial recall performance. Despite the authors’ claim of generalisability, the serial recall paradigm they employed did not reflect the standard parameters generally adopted in verbal short-term memory research. Specifically, the presentation rate of the stimuli was almost 3 times that typically used. We sought to determine whether the TVE, as observed by Hughes et al., generalises to the standard serial recall task by directly comparing recall performance in talker-variable conditions at fast and slow stimulus presentation rates. Experiment 1 was a systematic replication of the foundational study undertaken by Hughes et al. (2009). Using a novel stimulus set, Experiment 2 provided a further test of the generalisability of the TVE by examining the influence of item properties. Both experiments showed a robust TVE at the atypical fast presentation rate; at the slower presentation rate, however, the TVE was unreliable. Furthermore, error analysis suggested that item recall also contributes to the TVE, contrary to the current explanation proposed by Hughes et al. (2009, 2011). The challenge that the present data pose to the perceptual-gestural account of the TVE is explored, and alternative accounts that focus on the resource cost of categorical speech perception in the context of talker variability are proposed.