Any Sound has an Envelope and Fine Time Structure

Any sound can be mathematically factored into the product of a slowly
varying envelope (also called the modulation) and a rapidly varying fine
time structure (also known as the carrier). The figure shows an example of
this factorization for a filtered speech signal.
Cochlear implant processing strategies such as continuous
interleaved sampling (CIS) present only envelope information and
discard the fine time structure (Wilson et al., 1991). Some
patients show excellent performance with these processors.
However, performance might be further improved by modifying the
processors to deliver fine structure information. We are thus
interested in assessing the relative importance of envelope and fine
structure for speech perception.
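
As a rough illustration of the factorization described above, the sketch below splits a signal into an envelope and a fine time structure using the analytic signal (Hilbert transform). This is one common way to define the two factors and is an assumption here, since the page does not say which method was used. The function names and the amplitude-modulated test tone are hypothetical. A naive DFT keeps the example free of dependencies; it is far too slow for real speech recordings, where an FFT routine would be used instead.

```python
import cmath
import math


def analytic_signal(x):
    """Return the analytic signal of a real sequence x (naive O(n^2) DFT)."""
    n = len(x)
    # Forward DFT of the real input.
    spectrum = [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]
    # Keep DC and Nyquist, double positive frequencies, zero negative ones.
    for k in range(n):
        if k == 0 or (n % 2 == 0 and k == n // 2):
            gain = 1.0
        elif k < (n + 1) // 2:
            gain = 2.0
        else:
            gain = 0.0
        spectrum[k] *= gain
    # Inverse DFT gives a complex signal whose real part is x.
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def envelope_and_fine_structure(x):
    """Factor x into envelope (magnitude) and fine structure (cosine of phase)."""
    z = analytic_signal(x)
    envelope = [abs(v) for v in z]
    fine = [math.cos(cmath.phase(v)) for v in z]
    return envelope, fine


# Hypothetical test signal: a slow modulator multiplying a fast carrier.
n = 256
modulator = [1.0 + 0.5 * math.cos(2 * math.pi * 2 * t / n) for t in range(n)]
x = [modulator[t] * math.cos(2 * math.pi * 32 * t / n) for t in range(n)]

envelope, fine = envelope_and_fine_structure(x)

# The product of the two factors should reproduce the original signal.
print(max(abs(e * f - s) for e, f, s in zip(envelope, fine, x)))
```

For speech, the same steps would typically be applied to the output of a band-pass filter, as in the filtered speech example mentioned above.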

The Envelope is Important in Speech Perception

Speech intelligibility in noise and reverberation can be predicted
from how well envelope information from different frequency bands is
preserved (Houtgast and Steeneken, 1973).

In normal-hearing listeners, envelope information from as few as
3-4 frequency bands suffices for speech reception in quiet (Shannon
et al., 1995).
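
To make "how well envelope information is preserved" concrete, the sketch below evaluates a closed-form modulation transfer function that is often quoted in the literature on this approach. It is not taken from this page. It assumes an ideal exponential reverberant decay and stationary noise, and the function name, parameter names, and example values are hypothetical. The result is the factor (between 0 and 1) by which the depth of an envelope modulation at a given frequency is reduced.

```python
import math


def modulation_transfer(mod_freq_hz, rt60_s, snr_db):
    """Modulation reduction factor for one envelope modulation frequency.

    Assumes exponential reverberant decay (reverberation time rt60_s)
    and stationary noise at the given signal-to-noise ratio in dB.
    """
    # Reverberation acts as a low-pass filter on the envelope.
    reverb = 1.0 / math.sqrt(
        1.0 + (2.0 * math.pi * mod_freq_hz * rt60_s / 13.8) ** 2
    )
    # Noise reduces modulation depth equally at all modulation frequencies.
    noise = 1.0 / (1.0 + 10.0 ** (-snr_db / 10.0))
    return reverb * noise


# Illustrative values only: a 1 s reverberation time and a 10 dB SNR.
for f in [1.0, 2.0, 4.0, 8.0]:
    print(f, modulation_transfer(f, rt60_s=1.0, snr_db=10.0))
```

Values near 1 mean the envelope at that modulation frequency is well preserved; values near 0 mean it is largely lost.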

There is Scant Information on the Role of the Fine Time Structure in Speech Perception

Speech that has been peak-clipped so that broadband envelope
information is eliminated remains fairly intelligible (Licklider,
1946). In this condition, speech recognition must rely on the
fine structure in a broad frequency band.

The fine time structure of most speech sounds is precisely coded
in the temporal discharge patterns of auditory-nerve fibers (Young
and Sachs, 1979). The central nervous system is likely to use
some of this information.
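
The peak-clipping manipulation mentioned above can be sketched as follows, assuming the extreme case in which only the sign of each sample is kept (often called infinite peak clipping). The function names and the test tone are hypothetical. The example shows that the clipped signal has constant magnitude, so the broadband envelope is gone, while the zero crossings of the waveform are unchanged.

```python
import math


def infinite_peak_clip(x):
    """Keep only the sign of each sample (+1.0 or -1.0; zero maps to +1.0)."""
    return [1.0 if v >= 0 else -1.0 for v in x]


def zero_crossings(x):
    """Indices i where the signal changes sign between samples i and i + 1."""
    return [i for i in range(len(x) - 1) if (x[i] >= 0) != (x[i + 1] >= 0)]


# Hypothetical test signal: an amplitude-modulated tone.
n = 256
x = [
    (1.0 + 0.5 * math.cos(2 * math.pi * 2 * t / n))
    * math.cos(2 * math.pi * 30 * t / n)
    for t in range(n)
]

clipped = infinite_peak_clip(x)

# Every clipped sample has magnitude 1: the slow amplitude variation is removed.
print(set(abs(v) for v in clipped))
# The timing of sign changes is identical before and after clipping.
print(zero_crossings(clipped) == zero_crossings(x))
```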

Houtgast, T., and Steeneken, H.J.M. The modulation transfer function in room
acoustics as a predictor of speech intelligibility. Acustica 28: 66-73, 1973.

Licklider, J. Effect of amplitude distortion upon the intelligibility of speech.
J. Acoust. Soc. Am. 18: 429-434, 1946.

Shannon, R.V., Zeng, F.-G., Kamath, V., Wygonski, J., and Ekelid, M. Speech
recognition with primarily temporal cues. Science 270: 303-304, 1995.

Wilson, B.S., Finley, C.C., Lawson, D.T., Wolford, R.D., Eddington, D.K., and
Rabinowitz, W.M. Better speech recognition with cochlear implants.
Nature 352: 236-238, 1991.

Young, E.D., and Sachs, M.B. Representation of steady-state vowels in the
temporal aspects of the discharge patterns of populations of auditory nerve
fibers. J. Acoust. Soc. Am. 66: 1381-1403, 1979.