Spectrum resolving power of hearing: measurements, baselines, and influence of maskers


Alexander Ya. Supin *
(*) Corresponding Author:
Alexander Ya. Supin | alex_supin@mail.ru


Contemporary methods of measurement of frequency tuning in the auditory system are reviewed. Most of them are based on the frequency-selective masking paradigm and require multi-point measurements (a number of masked thresholds must be measured to obtain a single frequency-tuning estimate). Therefore, they are rarely used for practical needs. As an alternative approach, frequency-selective properties of the auditory system may be investigated using probes with complex frequency spectrum patterns, in particular, rippled noise, which is characterized by a spectrum with periodically alternating maxima and minima. The maximal ripple density discriminated by the auditory system is a convenient measure of the spectrum resolving power (SRP). To find the highest resolvable ripple density, a phase-reversal test has been suggested. Using this technique, normal SRP and its dependence on probe center frequency, spectrum contrast, and probe level were measured. The results were not entirely predictable from frequency-tuning data obtained by masking methods. SRP is influenced by maskers, with on- and off-frequency maskers influencing SRP very differently. Dichotic separation of the probe and masker results in almost complete release of SRP from the influence of maskers.



The paper presents a review of both classical and some contemporary methods of measurement of frequency tuning and frequency-spectrum resolving power in the auditory system. The frequency-spectrum resolving power is the ability to discriminate (resolve) the fine pattern of the frequency spectrum of acoustic signals. The spectrum patterns and their variation in time, i.e., the spectral-temporal portraits, characterize all acoustic signals. The ability to discriminate the signals depends on the ability to discriminate their frequency spectra, i.e., on the spectrum resolving power of the auditory system. Degradation in frequency-spectrum resolution results in poor speech recognition.1-6 Since degradation in frequency resolution accompanies sensorineural hearing loss7-11 and hearing deterioration with age,4 measurement of frequency resolution is very important for characterization of hearing abilities. Measurements of frequency resolution may also be helpful in selecting appropriate hearing-aid characteristics.12-14

Masking methods of measurement of frequency tuning

There is a variety of methods of measuring the frequency selectivity of hearing. Most of them are based on the frequency-dependent masking paradigm. These methods allow assessment of the bandwidth and quality of frequency-selective channels (filters) in the auditory system or their psychophysical equivalent, the critical bands.

The most demonstrative version of frequency-dependent masking is tonal masking: measuring the masked thresholds of a tonal probe in the presence of a background tonal masker. The masked threshold depends on the frequency difference between the probe and masker, so the threshold-vs-frequency function (frequency-tuning curve) directly reflects the auditory filter form: the narrower the tuning curve, the sharper the filter tuning.15-17 Disadvantages of these methods include the influence of beats arising at small frequency spacing between the probe and masker tones and the off-frequency listening effect.18-20 Beats are absent, however, when forward rather than simultaneous masking is used, and the off-frequency listening effect can be avoided by the use of two-tone maskers that are symmetrical relative to the probe frequency.20-22

Other widely used versions of frequency-dependent masking are: i) the use of a narrow-band noise masker of variable bandwidth centered at the probe frequency. Variation of the noise bandwidth influences the masked threshold, and differentiation of the threshold-vs-bandwidth function returns the auditory filter form. This method is the basis of the critical band paradigm;23,24 ii) the use of comb-filtered (rippled) noise with a spectrum featuring periodically alternating peaks and valleys of spectral density, with either a peak or a valley centered at the probe frequency.7,25-27 Variation of the frequency spacing of ripples influences the masked threshold, so deconvolution of the threshold-vs-ripple spacing function allows derivation of the auditory filter form; iii) the use of a notch-noise masker with a spectrum featuring a stop band (notch) centered at the probe frequency.4,28-31 The notch width influences the masked threshold; therefore, differentiation of the threshold-vs-notch width function allows derivation of the auditory filter form.

Studies using the masking methods have yielded estimates of frequency tuning in normal listeners. At relatively high frequencies, the equivalent rectangular bandwidth (ERB) of the auditory frequency-tuned filters is around 10% of the center frequency; however, the ERB does not fall below 25 Hz at low frequencies. To describe ERB variation with center frequency, several analytical expressions have been suggested,23,30-32 for example, a simple equation given by Glasberg and Moore:31 ERB = 24.7(4.37F + 1), where ERB is given in Hz and F is the center frequency in kHz.
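As a worked illustration of the Glasberg and Moore expression quoted above, the following minimal Python sketch evaluates the ERB at a few center frequencies and expresses it as a fraction of the center frequency. The function name `erb_hz` and the list of frequencies are illustrative assumptions, not part of the reviewed work.

```python
def erb_hz(f_khz: float) -> float:
    """ERB in Hz for a center frequency given in kHz (expression quoted in the text)."""
    return 24.7 * (4.37 * f_khz + 1.0)


# Hypothetical set of center frequencies, chosen only for illustration.
for f_khz in (0.1, 0.5, 1.0, 2.0, 4.0, 8.0):
    erb = erb_hz(f_khz)
    relative = erb / (f_khz * 1000.0)  # ERB as a fraction of the center frequency
    print(f"{f_khz:4.1f} kHz: ERB = {erb:6.1f} Hz ({100.0 * relative:.1f}% of center frequency)")
```

With this expression the ERB tends to 24.7 Hz as the center frequency approaches zero, and the relative bandwidth settles at roughly 11% toward the high-frequency end, in line with the approximate figures given in the text.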

However, despite the availability of a few well-elaborated methods and the obvious importance of frequency-resolution measurements, their practical applications remain rare. To a significant extent, this is because these methods are rather time consuming. A common feature of all the methods listed above is that they use multi-point measures, i.e., the determination of a single frequency-tuning estimate requires several threshold measurements at various values of the masker parameter (tone frequency, bandwidth, notch width, ripple spacing) to obtain a function describing the dependence of the masked threshold on the masker parameter; a resolution value can then be computed from the obtained function. These time-consuming methods are appropriate for fundamental investigations of hearing in laboratory conditions; however, in clinical conditions where time is short, they are inconvenient.

Estimation of critical bands by comparison of AM and FM modulation thresholds33 also requires a large body of measurements.

Contrary to those methods, the critical ratio is a one-point measure that requires only one threshold measurement to obtain one resolution value. This measure is the ratio of the masked threshold to the spectral density of wide-band masking noise.34 However, the critical ratio is a poor estimate of frequency selectivity because it confounds the frequency tuning with the efficiency of signal detection in noise. Importantly, small changes in the signal detection ratio influence the critical ratio as much as large changes in frequency resolution. For example, a 3-dB or 10-dB shift corresponds to a two-fold or ten-fold change of frequency resolution, respectively. An inaccuracy of a few dB in threshold determination (which is quite possible in practical measurements) therefore produces an equally dramatic error in the estimate of frequency resolution. Modifications were suggested to improve the critical ratio method. Frequency resolution has been measured using notched noise in which the masker level and notch width were kept constant and only the probe level was varied (the notched-noise critical ratio by Patterson et al.4), or in which the notch width was varied while both probe and masker levels were kept constant.11 These one-point measures are more sensitive to frequency resolution than the standard critical ratio. However, they imply an arbitrarily chosen notch width or probe-to-masker ratio, and the results depend on these values. Another one-point measure is based on the frequency spacing between tones at which roughness of the sound disappears;35,36 however, it requires a well-trained listener who has been carefully instructed as to what sound quality must be detected. As a result, none of the methods described above is widely used for practical needs.

Apart from difficulties in the practical use of the masking methods for frequency resolution measurements, there is one more fundamental problem. These methods provide estimates of frequency tuning of the auditory filters. If the auditory system were linear, knowing the frequency tuning of the filters would allow easy prediction of the response to any complex sound signal. However, the auditory system is not linear in many respects. Therefore, knowing the auditory filter forms is not always sufficient to predict how well the auditory system is capable of discriminating complex sound signals.
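The decibel arithmetic behind the critical-ratio example given earlier (a 3-dB or 10-dB shift corresponding to a two-fold or ten-fold change) can be checked with a short Python sketch. It assumes that the threshold shift is a power ratio expressed in dB and that the estimated bandwidth scales in direct proportion to that ratio; the function name is hypothetical.

```python
def bandwidth_factor(threshold_shift_db: float) -> float:
    """Factor by which a critical-ratio bandwidth estimate changes for a given
    shift of the masked threshold in dB (assumes a power ratio: 10 dB per decade)."""
    return 10.0 ** (threshold_shift_db / 10.0)


# Hypothetical threshold errors of a few dB, as may occur in practical measurements.
for shift_db in (1.0, 2.0, 3.0, 5.0, 10.0):
    print(f"{shift_db:4.1f} dB shift -> bandwidth estimate changes {bandwidth_factor(shift_db):.2f}-fold")
```

Under these assumptions even a threshold error of 2 dB already shifts the bandwidth estimate by more than half of its value, which illustrates why the critical ratio is so sensitive to measurement inaccuracy.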

Measurement of frequency-spectrum resolution using complex-spectrum probes

Many of the problems listed above may be avoided by using sounds with complex spectra as probes for frequency resolution measurements. A typical version of such sounds is comb-filtered (rippled) noise (Figure 1). The frequency spectrum of rippled noise contains periodically alternating peaks and valleys. As mentioned above, this kind of noise was used as one of the masker versions for measurements of frequency tuning of the auditory filters.7,25-27 However, the spectral grid of rippled noise may also be used as a probe to estimate directly the ability of the auditory system to discriminate complex frequency spectra. The finer the ripple pattern that can be discriminated by the auditory system, the better the spectrum resolving power (SRP). The rippled spectrum pattern can be quantitatively characterized by ripple density (the number of ripples per frequency unit) and ripple depth (deviation of the spectrum maxima and minima from the middle level). In particular, the highest resolvable ripple density may be adopted as a reliable quantitative measure of SRP.

To use the rippled spectrum as a probe, a reliable test is necessary to show whether or not a certain rippled spectrum structure is resolvable. For this purpose, a ripple phase-reversal test was suggested.37 The test principle is simple. Rippled noise of a certain ripple depth and density is presented to a listener. At a certain instant, the noise is replaced by another one of the same intensity, ripple depth, and density but with the opposite position of spectral peaks and valleys (solid and thin lines in Figure 1A). This replacement is the phase reversal. At the phase reversal instant, the listener may detect a change in the noise timbre. This is only possible if the listener discriminates the fine spectrum structure. If the ripples are spaced too densely (Figure 1B) or their depth is too low (Figure 1C) to be discriminated, the phase reversal cannot be detected because the noise before and after the switch is the same in all respects except the peak and valley positions. Thus the highest ripple density at which the phase reversal is detectable can be taken as a measure of SRP. A more detailed study may include variation of both the ripple density and depth (spectral contrast). In this way, contrast thresholds at various ripple densities can be found using the phase-reversal test.

This method has several advantages over the majority of masking methods. i) It yields a one-point measure, since only one limit of the ripple pattern resolution has to be found to obtain one SRP value. ii) It does not confound the frequency tuning with the signal detection efficiency. iii) It characterizes the resolution of complex spectra as the result of all transforms of the signal in the auditory system, both linear and non-linear. iv) The listener does not need to be experienced or carefully instructed, since the only task is to report any detectable change in the probe noise.

Various versions of rippled noise were used to measure SRP. Originally, it was wide-band rippled noise with equally spaced ripples, i.e., the frequency intervals between adjacent ripples had a constant value δf. To a large extent, this was because this kind of rippled noise could be easily generated by mixing a noise with its delayed version. If the delay is δt, then the mixing results in a rippled frequency spectrum with a ripple spacing of δf = 1/δt. Accordingly, the ripple density (the number of ripples per frequency unit) is d = 1/δf = δt. Noise with equally spaced spectral ripples produces specific psychoacoustical effects, in particular a pitch sensation that depends on the ripple spacing, the time-separation pitch.38-41 This version of rippled noise, in conjunction with the phase-reversal test, was used in early attempts to measure SRP in normal listeners.37
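The relation δf = 1/δt for delay-and-add rippled noise can be illustrated with a minimal Python sketch that evaluates the power gain of the mix x(t) + x(t − δt) at selected frequencies. The 2-ms delay, the function name `power_gain`, and the `sign` parameter are illustrative assumptions; in particular, subtracting rather than adding the delayed copy is shown here only as one possible way to obtain the opposite ripple phase, not as the procedure used in the cited studies.

```python
import cmath
import math


def power_gain(f_hz: float, delay_s: float, sign: int = 1) -> float:
    """Power gain at f_hz of the mix x(t) + sign * x(t - delay_s).

    The transfer function is 1 + sign * exp(-j*2*pi*f*delay), so the power
    gain equals 2 + 2*sign*cos(2*pi*f*delay): a cosine ripple along frequency.
    """
    h = 1.0 + sign * cmath.exp(-2j * math.pi * f_hz * delay_s)
    return abs(h) ** 2


delay_s = 0.002                            # hypothetical delay of 2 ms
ripple_spacing_hz = 1.0 / delay_s          # delta_f = 1 / delta_t
ripple_density_per_khz = delay_s * 1000.0  # d = delta_t, expressed in ripples per kHz

print(f"ripple spacing {ripple_spacing_hz:.0f} Hz, density {ripple_density_per_khz:.1f} ripples/kHz")
for f_hz in (1000.0, 1250.0, 1500.0):
    added = power_gain(f_hz, delay_s)
    subtracted = power_gain(f_hz, delay_s, sign=-1)
    print(f"{f_hz:6.0f} Hz: gain (add) = {added:.3f}, gain (subtract) = {subtracted:.3f}")
```

Frequencies that are integer multiples of 1/δt fall on spectral peaks of the added mix and on valleys of the subtracted mix, so the two versions have the same ripple spacing but opposite peak and valley positions.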

However, probes with equally spaced ripples are not the best for testing SRP because frequency representation in the cochlea is closer to frequency-proportional than to frequency-constant: the representation of each frequency-proportional band (e.g., an octave) occupies an almost equal part of the cochlea (in humans, around 4 mm per octave).32 Therefore, probes with frequency-proportional ripple spacing are more adequate for testing SRP. In a frequency-proportional ripple pattern, the absolute ripple spacing δf and density d vary across the spectrum band, so for this pattern the more convenient measures of ripple spacing and density are relative (dimensionless) ones that are constant across the spectrum band: the relative spacing δf/f and the relative density D = f/δf, respectively. Modern digital technologies make it possible to synthesize signals with arbitrarily defined spectra, in particular probes with either equally spaced or frequency-proportionally spaced ripples.
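A frequency-proportional ripple pattern and its phase-reversed counterpart can be sketched in Python as follows. The cosine-of-log-frequency form, the function names, and the default parameter values are assumptions made for illustration (the text does not give the exact synthesis formula), and the band-limiting spectral envelope is omitted.

```python
import math


def rippled_power(f_hz: float, center_hz: float = 2000.0, density: float = 6.0,
                  depth: float = 1.0, reversed_phase: bool = False) -> float:
    """Relative power at f_hz of a spectrum rippled along the log-frequency axis.

    The ripple period is 1/density in units of ln(f), so the local relative
    spacing delta_f/f is approximately 1/density, i.e. D = f/delta_f.
    """
    phase = math.pi if reversed_phase else 0.0
    return 1.0 + depth * math.cos(2.0 * math.pi * density * math.log(f_hz / center_hz) + phase)


def peak_frequencies(center_hz: float, density: float, count: int) -> list:
    """Frequencies of successive spectral peaks at and above center_hz."""
    return [center_hz * math.exp(k / density) for k in range(count)]


peaks = peak_frequencies(2000.0, 6.0, 4)
for lower, upper in zip(peaks, peaks[1:]):
    # Spacing relative to the lower peak; close to 1/density for dense ripples.
    print(f"peak at {lower:7.1f} Hz, next peak {upper - lower:6.1f} Hz higher "
          f"(relative spacing {(upper - lower) / lower:.3f})")
```

In this sketch the relative spacing between adjacent peaks is the same everywhere in the band, and a peak of the standard pattern coincides with a valley of the phase-reversed pattern, which is the property exploited by the phase-reversal test.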

Spectrum resolving power estimates based on rippled-spectrum probes

Using various versions of rippled-spectrum probes in conjunction with the phase-reversal test has yielded basic data on spectrum-pattern resolution. The majority of the data were obtained in a group of 5 to 8 listeners, 25 to 55 years old, who had normal hearing thresholds and no signs of hearing loss. In those studies, a two-alternative forced-choice procedure was used. Each trial consisted of two stimuli (intervals I and II in Figure 2) with an interval between them. Two trial types alternated randomly: either the first stimulus within the trial contained several ripple phase reversals while the second one was constant (Figure 2A), or vice versa (Figure 2B). The listener was instructed to indicate which of the two noise bursts contained periodic changes of noise timbre.

First, the ripple-density resolution limit was found as a function of mean probe frequency.42 The measurements have shown that when the modulation depth of a rippled spectrum is 100% (i.e., the power in spectral valleys decreases to zero), the highest ripple density that can be discriminated by normal listeners ranges from 11 relative units at a mean frequency of 1 kHz to almost 16 units at 8 kHz; i.e., the threshold intervals between ripples range from 1/11 to 1/16 of the center ripple frequency, respectively (Figure 3, curve 1). At lower frequencies (below 1 kHz), it is not the relative but the absolute ripple-density resolution limit that is nearly constant, at a value of 16-20 ripples/kHz (a constant ripple spacing threshold of 50-60 Hz) (Figure 3, line 2).
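The conversion between the ripple densities quoted above and the corresponding ripple spacings in Hz is simple arithmetic, sketched here in Python. The function names are hypothetical; the numerical inputs are the values given in the text.

```python
def spacing_from_relative_density(f_hz: float, relative_density: float) -> float:
    """Ripple spacing in Hz at frequency f_hz for a relative density D = f/delta_f."""
    return f_hz / relative_density


def spacing_from_absolute_density(ripples_per_khz: float) -> float:
    """Ripple spacing in Hz for an absolute density given in ripples per kHz."""
    return 1000.0 / ripples_per_khz


# Relative resolution limits quoted in the text for 100% ripple depth.
for f_hz, density in ((1000.0, 11.0), (8000.0, 16.0)):
    spacing = spacing_from_relative_density(f_hz, density)
    print(f"{f_hz:6.0f} Hz, D = {density:4.1f}: threshold spacing {spacing:6.1f} Hz")

# Absolute resolution limits quoted for frequencies below 1 kHz.
for ripples_per_khz in (16.0, 20.0):
    spacing = spacing_from_absolute_density(ripples_per_khz)
    print(f"{ripples_per_khz:4.1f} ripples/kHz: spacing {spacing:5.1f} Hz")
```

The absolute densities of 16 and 20 ripples/kHz correspond to spacings of 62.5 and 50 Hz, matching the approximate range stated in the text.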

These data can be compared with data on frequency tuning obtained by classic masking methods. A computation presented by Supin et al.42 has shown that if the auditory system were linear, auditory filters with an equivalent rectangular bandwidth of 11-12% (as follows from the equation of Glasberg and Moore31) should provide a ripple-density resolution limit of 6-8 units (Figure 3, line 3). Thus, the actual ripple-density resolution limit of 11-15 units is almost twice as high as that predicted from the auditory filter frequency tuning. This disagreement, being one of many manifestations of the non-linearity of the auditory system, clearly indicates the importance of direct measurements of SRP.
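To give a feeling for how a linear filter bank limits ripple resolution, the sketch below smooths a log-frequency cosine ripple with a Gaussian weighting function of a given relative ERB and reports the remaining ripple contrast. This is an illustrative simplification and not the computation of the cited study: the Gaussian filter shape, its definition on a ln(f) axis, and the function name are all assumptions.

```python
import math


def ripple_attenuation(density: float, relative_erb: float = 0.115) -> float:
    """Fraction of ripple contrast that survives smoothing by a Gaussian filter.

    Assumptions (not from the reviewed work): the filter weighting function is
    Gaussian along ln(f); its ERB (area divided by peak height) equals
    sigma * sqrt(2*pi); the ripple is cos(2*pi*density*ln f). Convolving a
    cosine of that frequency with the Gaussian scales its amplitude by
    exp(-2 * (pi * sigma * density)**2).
    """
    sigma = relative_erb / math.sqrt(2.0 * math.pi)
    return math.exp(-2.0 * (math.pi * sigma * density) ** 2)


for density in (2.0, 4.0, 6.0, 8.0, 11.0, 15.0):
    remaining = ripple_attenuation(density)
    print(f"relative density {density:4.1f}: remaining ripple contrast {100.0 * remaining:6.2f}%")
```

Under these assumptions a fully modulated ripple at a relative density of 7 retains roughly 13% of its contrast after filtering, whereas at a density of 11 less than 1% remains, so a purely linear model of this kind places the resolution limit well below the densities that listeners actually resolve.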

Apart from ripple density, the ripple depth (spectral contrast) is an important parameter determining discrimination of complex spectra. Originally, discrimination of spectra of varying contrasts was investigated using spectral profiles composed of a number of harmonic components.43,44 It was found that non-uniformity of a spectrum is detectable if the deviation of components from the mean spectrum level exceeds –24.5 dB, which corresponds to an RMS deviation of ±6% or a peak deviation of around 10%. Similar results were obtained with rippled-spectrum probes of varying ripple depth and the ripple phase-reversal test.45 At low ripple densities, the ripple depth threshold was about 10% in the spectrum magnitude domain, which corresponds to about 20% in the power domain. With an increase in the ripple density, ripple depth thresholds increase until they reach the highest possible value of 100%, i.e., the ripple density resolution limit is achieved (Figure 4). The results may be satisfactorily explained by a model implying that the contrast of the internal spectrum representation in the auditory system decreases with increasing ripple density; therefore, the higher the ripple density, the greater the ripple depth necessary for the internal spectrum representation to exceed the contrast threshold.
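The unit conversions used in this paragraph can be verified with a brief Python sketch. It assumes that the –24.5 dB figure is a relative amplitude expressed as 20·log10 of the ratio, and that the power-domain depth is the relative change of the squared magnitude; both function names are hypothetical.

```python
def amplitude_ratio_from_db(level_db: float) -> float:
    """Relative amplitude deviation for a level in dB (assumes 20*log10 scaling)."""
    return 10.0 ** (level_db / 20.0)


def power_depth_from_magnitude_depth(magnitude_depth: float) -> float:
    """Relative power deviation produced by a relative magnitude deviation,
    taken as (1 + m)**2 - 1, which is close to 2*m for small m."""
    return (1.0 + magnitude_depth) ** 2 - 1.0


print(f"-24.5 dB -> relative amplitude deviation {100.0 * amplitude_ratio_from_db(-24.5):.1f}%")
print(f"10% magnitude depth -> {100.0 * power_depth_from_magnitude_depth(0.10):.0f}% power depth")
```

Both results agree with the rounded values in the text: about 6% for the –24.5 dB threshold and about 20% in the power domain for a 10% depth in the magnitude domain.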

One more important issue is how the spectrum pattern resolution depends on sound level. Measurements of the auditory filters by masking methods have shown that the filter bandwidth increases (the filter sharpness decreases) with increasing sound level.4,22,46 This occurs because the ratio between the sharper active mechanism of frequency tuning (based on electrokinetic activity of the outer hair cells) and the less sharp passive mechanism (based on hydromechanical properties of the cochlea) is level-dependent. It might be expected that this property of auditory filters should manifest itself in the ability to discriminate complex sound spectra. However, in conditions of negligible background noise, direct measurements of SRP by rippled-spectrum probes have shown no decrease of rippled-spectrum resolution with increasing probe level47 (Figure 5).

At first sight, this result seems paradoxical; however, it is easily explained. Indeed, the change of the ratio between the active and passive mechanisms of frequency tuning results not in widening of the filter peak (reflecting the active tuning mechanism) but in widening of the filter tail (reflecting the passive mechanism).48,49 It is the peak, though, that transfers the major part of the signal power. Therefore, as long as the tail remains at least 10-15 dB below the peak, the tail widening negligibly influences the transfer of complex spectrum patterns. A quantitative analysis of the process has been presented in the original paper describing this effect.47

Effects of background noise on spectrum pattern resolution

In natural conditions, a sound signal almost never appears in absolute silence. The presence of other sounds may significantly influence signal detection and recognition. These sounds overlapping the target signal may be considered as background (masking) noise. To a large extent, deteriorated signal recognition may be a result of poorer spectrum pattern discrimination. This was demonstrated by direct measurements of SRP against a background of masking noise, which have shown that the presence of masking noise results in poorer spectrum pattern discrimination. This effect depends on i) the relation between the probe and masker frequency bands, ii) the masker-to-probe ratio, and iii) the overall masker + probe level.47,50,51

When the frequency bands of the probe and masker coincide (on-frequency masker), the masker produces almost no effect while the masker level is below the probe level (negative masker/probe dB ratios). SRP does not differ from the no-masker condition and remains nearly independent of the probe level (Figure 6A). When the masker level approaches the probe level (zero masker/probe dB ratio), a small SRP reduction becomes noticeable; however, SRP is still negligibly dependent on the probe level. When the masker level exceeds the probe level (positive masker/probe dB ratios), SRP falls steeply, mostly at high probe levels (+5-dB masker/probe ratio in Figure 6A). Spectrum pattern discrimination becomes entirely impossible at masker/probe ratios of +10 dB and higher.

When the masker frequency band is below the probe band (the low-frequency masker, Figure 6B), SRP depends on the probe level and masker/probe ratio in a quite different manner. SRP decreases with increasing probe level. At high probe levels, a small but noticeable SRP reduction appears even at a very low masker/probe ratio, below –20 dB. As the masker/probe ratio increases, the deteriorating effect of the masker increases: the higher the masker/probe ratio, the lower the SRP and the steeper its decrease with increasing probe level. At high masker/probe ratios and high probe levels (more than 80 dB SPL at a 10-dB masker/probe ratio, more than 50 dB SPL at a 20-dB ratio), spectrum-pattern discrimination becomes completely impossible.

Contrary to the on- and low-frequency maskers, the high-frequency masker produces a very small effect on spectrum pattern discrimination: SRP remains nearly independent of the probe level, being almost the same as in the no-masker condition. Only at high masker/probe ratios (20-30 dB) and the lowest probe levels (40 dB SPL) does SRP slightly decrease (Figure 6C).

Traditionally, the effect of decreased spectrum pattern resolution produced by background noise was considered a result of superimposition of the noise on the probe signal, thus being a case of the classical energetic masking. This superimposition reduces the contrast of the internal spectrum representation of the probe, thus degrading the spectrum discrimination.46 Measurements with rippled-spectrum probes of various contrasts have confirmed that reduction of the ripple depth (spectral grid contrast) results in reduction of the resolvable ripple density.45

It should be noted that superimposition of the probe and masker takes place not only with on-frequency maskers but also with low-frequency maskers due to the effect of upward spreading of masking. This effect appears due to the asymmetric form of the auditory filters with low-frequency tails.31,52,53 Participation of the upward spreading of masking may explain why low-frequency but not high-frequency maskers effectively influence SRP. However, not all masking effects can be explained in this simple manner. In particular, at some combinations of probe and masker levels, low-frequency maskers produce a stronger deterioration of SRP than an on-frequency masker of the same level.50 This effect is clearly visible in Figure 6 at all masker/probe ratios of 0 dB and below when SRP values in graphs A (on-frequency masking) and B (low-frequency masking) are compared. This result cannot be explained by the upward spreading of masking alone, because the effect of upward spreading of masking can never exceed the effect of on-frequency masking. Therefore, the high effectiveness of low-frequency maskers indicates that, apart from classical energetic masking, some additional non-energetic mechanisms are involved in the deterioration of SRP. The nature of these mechanisms has not yet been investigated. We can only hypothesize that they represent a kind of lateral suppression or inhibition. The presence of non-energetic masking influencing discrimination of complex auditory stimuli, including speech, is known and has mostly been interpreted in terms of informational masking.54-56 The idea of informational masking implies that the addition of a masker (either on- or off-frequency) makes the discrimination of particular components of a complex signal more difficult due to informational competition. As shown by the data reviewed above, reduction of the spectrum discrimination ability is also an important factor in both on- and off-frequency masking.

Dichotic and binaural release of spectrum resolving power from masking

It was shown long ago that spatial separation of the signal and masker sources results in release from masking. A number of experiments have shown that the presence of a masker in one ear has little or no impact on a listener’s ability to recognize a target speech signal presented in the other ear.57-61 In free-field conditions, a similar effect known as the spatial release from masking appears when the sources of the probe and masker are spatially separated.62-71 A key part in this effect was assigned to the interaural level difference (ILD), which results in predominant presentation of the target signal in one ear and the noise in the other ear. Apart from ILD-based release from masking, there are releasing effects based on interaural phase relations. If a stimulus is presented to both ears in-phase and the masker is presented in counter-phase, or vice versa, the masking effect is weaker than when both the stimulus and masker are presented in the same mode: either in-phase, or counter-phase, or monaurally.72-79 This effect is known as the binaural masking level difference (BMLD). In the free field, BMLD may appear because of the phase difference of signals reaching the left and right ears. These findings mostly address the energetic masking that appears when the spectral bands of the signal and masker overlap. However, spatial release from masking, including both ILD and BMLD effects, is characteristic of informational masking too.55,80-85

Recent investigations have shown that the release of SRP is an important factor in both ILD-based and BMLD-based release from masking.86 In conditions of dichotic presentation (the probe in one ear and the masker in the other ear), SRP remained almost the same as in control no-masker conditions, even at very high masker/probe ratios (Figure 7). Thus, almost complete ILD-based release from masking took place for both on-frequency and off-frequency maskers. When the probe was presented binaurally in-phase and the masker binaurally in counter-phase, a smaller but noticeable BMLD-based release from low-frequency masking took place.

Implications for practical audiology

In addition to classic masking methods of measurement of frequency tuning in the auditory system, the data reviewed above present a method of SRP measurement based on the use of complex-spectrum probes. This method may be helpful in a number of respects, in particular for practical audiology, because of the following features. i) The method provides direct estimates of complex spectrum-pattern resolution, which cannot always be predicted from the frequency tuning of the auditory filters (e.g., effects of sound level in the presence and absence of background noise, the degree of deterioration produced by on-frequency and off-frequency maskers, the degree of dichotic and spatial release from masking, etc.). ii) The method may be appropriate for individual diagnostics, being less time consuming than the majority of masking methods. It takes little time because it is a one-point method (one determination of a ripple resolution limit yields one estimate of the spectrum resolution), in contrast to the majority of masking methods, which are multi-point (several masked threshold determinations are needed for one estimate of the filter tuning).

Nevertheless, the use of rippled-spectrum probes for the practical needs of audiology is still limited. Rippled-spectrum signals have been used to assess the effectiveness of cochlear implants, in particular, based on the listener’s ability to discriminate ripple spectra of opposite ripple phases.87-90 Comparison of normal-hearing, impaired-hearing, and cochlear-implant listeners revealed significant correlations between ripple-pattern resolution and speech recognition. Those studies were performed using classic experimental protocols in which the listener has to discriminate a difference between two or three noise bursts with different ripple phases. The method described above may promote wider use of rippled-spectrum signals as reliable spectrum-resolution tests.


Figure 1. Examples of rippled spectra used as probes for SRP measurements. Spectra are centered at 2 kHz and enveloped by a 1-octave cosine function. Note that the ripples look equally spaced on the log frequency scale, which is characteristic of frequency-proportional ripple spacing. A) Relative ripple density f/δf = 6 (resolvable by normal human hearing), ripple depth 100%; solid and thin lines represent two versions of the rippled spectrum which replace one another in the phase-reversal test. B) The same as A, ripple density f/δf = 18 (irresolvable by human hearing). C) Ripple density f/δf = 6, ripple depth 10% (irresolvable by human hearing).
Figure 2. Temporal diagram of the spectrum resolving power test based on ripple phase reversal. I-II) two successive stimulus intervals; solid and thin lines indicate probe signals with opposite peak-valley positions of spectral ripples; 1-2) periods of alternating presentation of these two spectra. A-B) diagrams of trials with opposite orders of presentation of stimuli with and without ripple phase reversals.
Figure 3. Spectrum resolving power dependence on probe center frequency. 1) experimental data (probe level 70 dB SPL). 2) slope corresponding to a constant absolute ripple density of 16 cycles/kHz. 3) linear prediction of spectrum resolving power based on frequency tuning dependence on center frequency, according to Glasberg and Moore (1990).
Figure 4. Ripple depth threshold dependence on ripple density at various probe center frequencies, from 0.5 to 8 kHz, as indicated in the legend. Probe level 70 dB SPL.
Figure 5. Spectrum resolving power dependence on probe level at three probe center frequencies (1, 2, and 4 kHz, as indicated in the legend).
Figure 6. Spectrum resolving power dependence on probe level at various masker/probe ratios. Diotic presentation of the probe and masker. Probe center frequency 2 kHz. Masker/probe ratios (dB) are indicated in the legends; Cont: control (no masker). A) On-frequency masker (the same center frequency as the probe). B) Low-frequency masker (masker center frequency ¾ octave below the probe). C) High-frequency masker (masker center frequency ¾ octave above the probe).
Figure 7. The same as Figure 6, but with dichotic presentation of the probe and masker (the probe in the left ear, the masker in the right ear).