Abstract

Contrary to traditional thinking about hearing, in which the broadband audio spectrum is taken as a whole, modern hearing science has gradually uncovered how the channel-based temporal envelope and its own spectrum are often prioritized by the auditory system. This is achieved through various processing mechanisms at different stages between the auditory brainstem and cortex, which operate on the temporal envelopes both within single auditory channels and between channels of different frequencies. Without loss of generality, the temporal envelope can be formulated as a complex function that varies slowly around a fast center carrier frequency. By definition, the complex envelope includes all frequency and amplitude modulations, and hence the signal onset and offset cues. Tracking the transformations that the complex envelope undergoes between the acoustic source and the listener's brain should therefore be one of the key points of hearing theory. However, no systematic treatment of the complex envelope transformations relevant to hearing exists. Only fragmentary treatments are available, which rely primarily on empirical findings that pertain to particular stages of hearing.

The new theory of mammalian hearing that is presented here attempts to bridge this gap by drawing on the two disciplines that offer the most extensive analytical tools for dealing with complex envelope transformations. The first is imaging optics, which deals with the spatial envelope that propagates between an object and an image and undergoes diffraction and refraction, the process that forms the basis for vision. The second is communication theory, which devises various types of temporal modulations to transfer information between a transmitter and a receiver over a noisy channel.


Drawing from optical physics, it is argued that an auditory image of an object located in the listener's acoustical environment is formed in the midbrain (inferior colliculus). Using the space-time duality, it is shown that the ear is a temporal imaging system that comprises three transformations of the envelope functions: cochlear group-delay dispersion, cochlear time lensing, and neural group-delay dispersion. These elements are analogous to the familiar transformations of the visual system: diffraction between the object and the eye, spatial lensing by the crystalline lens, and a second diffraction between the lens and the retina. However, unlike the eye, it is established that the human auditory system is naturally defocused, so that coherent stimuli are unaffected by the defocus, whereas completely incoherent stimuli are affected by it and may be blurred by design. It is argued that the auditory system can use this differential focusing to enhance or degrade the images of real-world acoustical objects, which are predominantly partially coherent. In addition to the imaging transformations, the corresponding inverse-domain modulation transfer functions are derived and interpreted, taking into account the nonuniform neural sampling operation of the auditory nerve. These ideas are used to rigorously introduce the concepts of sharpness and blur in auditory imaging, auditory aberrations, and auditory depth of field.


In parallel, ideas from communication theory are invoked to show that the organ of Corti functions as a multichannel phase-locked loop (PLL) that constitutes the point of entry for auditory phase locking. This PLL provides an anchor for dual coherent and noncoherent auditory detection further downstream in the auditory brain. Phase locking enables conservation of coherence between the mechanical and neural domains.


Combining the logic of both imaging and phase locking, it is speculated that the auditory system should be able to dynamically adjust the proportions of coherent and noncoherent processing that make up the final image or detected product. This can be the basis for auditory accommodation, in analogy to the accommodation of the eye. Such a function may be achieved primarily through the olivocochlear efferent bundle, although additional accommodative brainstem circuits are considered as well.


The hypothetical effect of dispersion and synchronization anomalies in hearing impairments is considered. Although much of the evidence needed to make this account less speculative is still lacking, it is concluded that impairments resulting from accommodation dysfunction and excessive higher-order aberrations may play a role in known hearing-impairment effects.