Chapter 11: Hearing in Complex Environments
11.7. Categorical Perception
One cue that our brains use to distinguish phonemes is voice onset time (VOT). This is the time between the onset of a phoneme's sound and the start of the formants created by the vibration of the vocal cords. If we look at Figure 11.11, we see that the spectrograms for “die” and “tie” look very similar in terms of spectral energy, but “die” has a shorter VOT (indicated by the peach color). Using a computer-generated voice, we can manipulate the VOT and change the perception of the sound from “die” to “tie” by lengthening the VOT. However, we only ever hear “die” or “tie”, never a blend of the two sounds. This is called categorical perception. Even though the VOT is changed continuously across a wide range, the listener perceives only two different sounds: “die” on one side of the phonetic boundary and “tie” on the other side. Since nothing in between would make sense, we don’t hear it! “A ‘phoneme boundary’ refers to the point at which one phoneme (a distinct unit of sound in a language) transitions into another. It is the point at which a change in sound can produce a change in meaning” (Glossary of Psychology, 2024).
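
The idea that a continuous change in VOT produces only two percepts can be illustrated with a small sketch. The code below is a toy model, not a description of how the brain computes anything: it assumes the probability of hearing “tie” follows a steep logistic curve over VOT. The names `BOUNDARY_MS`, `SLOPE_PER_MS`, `prob_tie`, and `perceived_word`, as well as the numeric values, are hypothetical and chosen only for illustration; they are not measurements from the text.

```python
import math

# Hypothetical parameters, chosen only for illustration (not measured values).
BOUNDARY_MS = 30.0   # assumed VOT at which "die" and "tie" are equally likely
SLOPE_PER_MS = 0.5   # assumed steepness of the identification curve


def prob_tie(vot_ms: float) -> float:
    """Toy model: probability of labeling the sound "tie" as a logistic function of VOT."""
    return 1.0 / (1.0 + math.exp(-SLOPE_PER_MS * (vot_ms - BOUNDARY_MS)))


def perceived_word(vot_ms: float) -> str:
    """Return the single category the listener reports; there is no in-between label."""
    return "tie" if prob_tie(vot_ms) >= 0.5 else "die"


if __name__ == "__main__":
    # Step VOT continuously; the reported word only ever takes one of two values.
    for vot in range(0, 61, 10):
        print(f"VOT {vot:2d} ms -> P(tie) = {prob_tie(vot):.3f} -> hears {perceived_word(vot)!r}")
```

In this sketch the VOT input varies smoothly, but the reported word switches abruptly at the assumed boundary, which mirrors the pattern described above: “die” on one side of the boundary and “tie” on the other.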

Cheryl Olman PSY 3031 Detailed Outline
Provided by: University of Minnesota
Download for free at http://vision.psych.umn.edu/users/caolman/courses/PSY3031/
License of original source: CC Attribution 4.0
Adapted by: Samira Moalim-Yusuf

OpenStax, Speech Perception
Provided by: Rice University
URL: https://cnx.org/contents/bwyYFEPa@10/Speech-Perception
License of original source: CC-BY 4.0