Technical Articles and Editorials

What We Hear vs. What We Perceive

Audiophiles love to characterize the sonics of pretty much everything using words that really don't apply to sound. Terms range from the general, such as "tight," "bright," "laid back," and "warm," to the absurdly abstract, such as my all-time favorite, "pace." Audiophiles also tend to be a bunch of gear heads (including all of us here at Secrets) who have as much love for the equipment, and the operation of that equipment, as for the music they listen to, and so like to offer explanations as to why something sounds the way it does, explanations which may or may not be a bunch of hooey.

That's the goal of this somewhat informal jaunt: to explore the perception of sonics and the causes behind them. The descriptions that follow may or may not directly correlate with your own listening evaluations, as the nature of the subjective is, by definition, variable. But, as many have learned by traversing simple worlds, what is, and what seems to be, don't always fit in the same egg carton. I like to think that my experience, and that of my colleagues, who have spent a lot of time not only with audio equipment, but with recording and mixing techniques, may offer something to ponder the next time a feature in playback requires characterization, or provides an opportunity for speculation.

The following is an introductory list of audible artifacts commonly imposed on audio reproduction, either in the studio or at home. Each entry describes how we might interpret the artifact, followed by its common causes.


Dynamic Compression

Limiting the dynamic range limits dynamic contrast. It can consist of reducing the loudest peaks so that they are the same loudness as the not-so-loud regions. It can consist of raising the quietest portions so that they are as loud as the reasonably quiet portions. Or, it can be a combination of both. But it is not a simple adjustment of the volume control. "Microdynamics" (low level detail) can become more apparent with signal compression simply because the detail is louder compared to the average level. Strangely enough, because compression brings the average level up in relation to the peaks, it may also make the music sound powerful and superficially more dynamic, lending a sense of drive and impact. Dance club mixes are notorious for applying high amounts of compression. Too much compression can make music sound thick and heavy. Less compression, by comparison, sounds open and effortless, without much loudness, but can strain the dynamic capabilities of playback equipment at listening levels many would consider quite reasonable. Dynamic compression modes are popular now on the latest surround sound processors because DD and DTS give us such tremendous dynamics. Turning the compression mode on can reduce problems with small amplifiers, small speakers, and sleeping spouses. Possible causes include compressor/limiter components in recording studios (vinyl mastering requires LOTS of compression due to the limited dynamic range of the medium), "headroom" circuits, and to a minor degree, some tube circuits.
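To make the "not a simple volume control" point concrete, here is a minimal sketch of the gain rule at the heart of a compressor. It works on one sample at a time with no attack or release smoothing, which real units do have, and the function names, threshold, ratio, and amplitudes are all assumptions chosen for illustration.

```python
import math

def to_db(amplitude):
    """Convert a non-zero linear amplitude to decibels relative to full scale."""
    return 20.0 * math.log10(abs(amplitude))

def compress_sample(x, threshold_db=-20.0, ratio=4.0, makeup_db=0.0):
    """Instantaneous downward compression of one sample in the range [-1, 1]."""
    if x == 0.0:
        return 0.0
    level_db = to_db(x)
    if level_db > threshold_db:
        # Above the threshold, every `ratio` dB of input yields 1 dB of output.
        level_db = threshold_db + (level_db - threshold_db) / ratio
    # Makeup gain raises everything, quiet passages included.
    return math.copysign(10.0 ** ((level_db + makeup_db) / 20.0), x)

# Hypothetical amplitudes for a quiet passage and a full-scale peak.
quiet, loud = 0.05, 1.0
contrast_before = to_db(loud) - to_db(quiet)
contrast_after = to_db(compress_sample(loud)) - to_db(compress_sample(quiet))
print(f"dynamic contrast before: {contrast_before:.1f} dB, after: {contrast_after:.1f} dB")
```

With these assumed settings the peak is pulled down while the quiet passage is left alone, so the contrast between them shrinks from about 26 dB to about 11 dB. Adding makeup gain afterward is what raises the average level and produces the "louder, more powerful" impression.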

Harmonic Distortion

Adding harmonics to both the fundamentals and harmonics of musical content changes the character of instruments and voices, but does not necessarily sound distorted without an undistorted frame of reference. Furthermore, identifying the slightly distorted as opposed to the undistorted can be difficult, even if a difference between the two is audible. Many people prefer the sound of certain distortion, and may even find it clearer and/or more natural. Second-order harmonics (twice the frequency of the fundamental) can have a richening effect, adding an almost sweetened texture. Third-order harmonics (three times the fundamental) can add bite, heightening the perception of leading edge transients and dynamic shifts. Higher-order distortion (such as fifth), without lower-order distortion, can prove quite irritating. Harmonic distortion in relatively small amounts can add body and presence, making the undistorted sound in comparison seem dry, analytical, and sterile. Those accustomed to it may find the absence disturbing, sometimes even complaining of missing notes. Possible causes include pretty much everything (except for a banana split with extra nuts), but the biggest culprits are usually loudspeakers, vinyl (phonograph records), and under-powered amplifiers.
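For anyone who wants to see where second- and third-order harmonics come from, the sketch below runs a pure tone through a simple polynomial nonlinearity and measures what comes out. The polynomial and its coefficients are assumptions picked for illustration, not measurements of any real component, and the helper names are hypothetical.

```python
import math

def distort(x, second=0.0, third=0.0):
    """Memoryless nonlinearity: y = x + second * x^2 + third * x^3."""
    return x + second * x * x + third * x ** 3

def harmonic_amplitude(signal, harmonic, cycles):
    """Amplitude of one harmonic, found by a single-bin DFT.

    `signal` must contain exactly `cycles` whole periods of the fundamental.
    """
    n = len(signal)
    k = harmonic * cycles
    re = sum(s * math.cos(2.0 * math.pi * k * i / n) for i, s in enumerate(signal))
    im = sum(s * math.sin(2.0 * math.pi * k * i / n) for i, s in enumerate(signal))
    return 2.0 * math.hypot(re, im) / n

# Assumed test tone: 100 whole cycles in 4800 samples (1 kHz at 48 kHz for 0.1 s).
N, CYCLES = 4800, 100
tone = [math.sin(2.0 * math.pi * CYCLES * i / N) for i in range(N)]
shaped = [distort(x, second=0.05, third=0.02) for x in tone]

for h in (1, 2, 3, 4):
    print(f"harmonic {h}: amplitude {harmonic_amplitude(shaped, h, CYCLES):.4f}")
```

For a full-scale sine, the squared term lands at twice the fundamental with amplitude second/2, and the cubed term lands at three times the fundamental with amplitude third/4, while nothing appears at the fourth harmonic. Real equipment is messier, but the mechanism is the same.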

Frequency Response Peaks and Dips

Our primer (Secrets, Volume 1, Number 1, 1994) covers the general characterization of frequency response. Broadband peaks and dips change not only the tonality of the presentation, but the character as well. As certain sounds dominate specific areas of the audible spectrum, frequency response variations can highlight sounds with certain tonal characters and/or ranges. Narrow band peaks are usually less obvious at first, as they affect less musical material, since fundamentals and harmonics may straddle the peak undisturbed. Generally, they can initially impart the illusion of more detail and dynamic contrast, as they will suddenly boost an aspect of a signal, drawing attention where it might have otherwise gone unnoticed. Narrow peaks may at first impress across the entire spectrum with an artificial snap or somewhat edgy type of clarity. Narrow dips prove much more difficult to pinpoint, unless the listener is familiar enough with the program material to notice a subtle missing element. The usual culprits, aside from poorly implemented EQ, are loudspeakers with poorly tuned bass reflex systems, higher-order crossovers constructed with poor-tolerance drivers/crossover components, and/or drivers exhibiting significant diaphragm resonance and break-up.


Delay

This technique results in the sound getting to your ears at a later time than it would have without the delay (duh!). Although there is a delay between when the sound emerges from the speaker cone and when it arrives at the listening position, the term usually refers to an electronic delay added in the signal path, so that you can match the arrival of sound from one speaker with another that is at a different distance. For it to work properly and not sound like an echo, the gap between the initial and secondary arrivals must be too short for the ear to distinguish the two. This means about 30 milliseconds or less, but delayed arrivals can continue to pile up before they die out, resulting in reverberation. Delayed information in real environments, though redundant regarding the source of the sound, adds aural cues about ambient space, which we interpret as spatial presence, and even more detail. With artificially induced delay/reverb, music becomes, to a point, more lively and dimensional. With too high a ratio of reflected (delayed) sound, localization can become ambiguous and confusing, as exhibited by a very popular/notorious mass market speaker eschewed by audiophiles. Aside from acoustical information encoded by the recording process, artificial sources of delay include DSP (both in and out of the studio), turntable playback, loudspeaker/room interaction (especially apparent with dipolar/bipolar loudspeakers), and to a more limited extent, some microphonic tube equipment.
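The arithmetic behind matching arrival times is simple enough to sketch. The speed of sound used here (343 m/s, roughly room temperature), the speaker distances, and the function names are assumptions for illustration; the 30 millisecond limit is the rough figure quoted above.

```python
SPEED_OF_SOUND_M_PER_S = 343.0  # assumed: dry air at about 20 degrees C

def alignment_delay_ms(near_m, far_m):
    """Delay to add to the nearer speaker so both arrivals coincide."""
    return (far_m - near_m) / SPEED_OF_SOUND_M_PER_S * 1000.0

def heard_as_echo(gap_ms, fusion_limit_ms=30.0):
    """True when two arrivals are far enough apart to be heard separately."""
    return gap_ms > fusion_limit_ms

# Hypothetical room: surrounds 1.5 m from the listener, fronts 3.0 m away.
delay = alignment_delay_ms(near_m=1.5, far_m=3.0)
print(f"delay the surrounds by {delay:.2f} ms")
print("echo" if heard_as_echo(delay) else "fuses with the direct sound")
```

In this made-up room the surrounds need a little over 4 ms of delay, comfortably inside the window where the ear fuses the arrivals. It takes a path difference of roughly 10 meters before the gap reaches 30 ms.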

Low Frequency Roll-off

A sharp cut-off of the lowest bass frequencies can alter content to sound tighter, and subjectively faster, which explains the observations by many audiophiles that smaller mid-bass drivers produce a quicker sound than large woofers with more extension. Many ported mini-monitors take advantage of this perception, making up for the lack of extension with a slight rise before cut-off in order to fake the low-end bandwidth extension. Many recording studios (the majority actually) do this selectively to tracks in a mix to punch up or lean out kick drums or the like. Some engineers do it to entire projects so that small boom boxes and lesser car stereos can play the result at higher volumes. Vinyl mastering absolutely requires it so that the stylus can track the groove.

High Frequency Roll-off

Usually more gradual in nature, a roll-off on top can taper the highest frequencies, smoothing out recordings that otherwise would have an extra bit of tizz acquired from prior EQ or microphone techniques. Perceptually, it can take the edge off, make things sound less "digital" and more "analog," even more refined. It may also improve the perceived focus of the center soundstage. Typical causes are devices that limit bandwidth near the audible range, be it analog and/or digital electronics, loudspeakers, capacitive interconnects, or inductive loudspeaker cables. It's absolutely required for vinyl mastering if the engineer wishes to avoid burning the cutting head.
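As a rough illustration of how a capacitive interconnect can shave the top end, the sketch below treats the source's output impedance and the cable capacitance as a first-order RC low-pass filter. The component values and function names are assumptions, and the model ignores the load impedance and the cable's own inductance.

```python
import math

def rc_corner_hz(source_ohms, cable_farads):
    """Frequency at which a first-order RC low-pass is 3 dB down."""
    return 1.0 / (2.0 * math.pi * source_ohms * cable_farads)

def rolloff_db(freq_hz, corner_hz):
    """Attenuation of a first-order low-pass at freq_hz, in dB (zero or negative)."""
    return -10.0 * math.log10(1.0 + (freq_hz / corner_hz) ** 2)

# Assumed cable: 5 m at 100 pF per meter.
cable_farads = 5.0 * 100e-12

# Assumed sources: a 1 kilohm output stage and a 10 kilohm passive volume control.
for source_ohms in (1_000.0, 10_000.0):
    corner = rc_corner_hz(source_ohms, cable_farads)
    loss = rolloff_db(20_000.0, corner)
    print(f"{source_ohms:>8.0f} ohms: corner {corner / 1000.0:.1f} kHz, {loss:.2f} dB at 20 kHz")
```

With the lower source impedance the corner sits far above the audible band and the loss at 20 kHz is negligible. With the higher one the same cable is down by more than a decibel at 20 kHz, which is in the range of the gentle, gradual taper described above.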

High Frequency Rise

A gradual rise before the inevitable drop can lend a sense of excitement, snap, and dynamic energy. Often, the rise is mistaken for more extension or high-end detail. Usually it's either a result of narrowing high frequency dispersion in a loudspeaker (the output begins to beam due to the diaphragm increasing to a significant size in relation to the wavelength), ringing in tweeters with high resonance materials, the interaction between an amplifier with a high output impedance and the uncompensated inductance of a tweeter, or an amplifier with a poorly implemented negative feedback loop.
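The amplifier/tweeter interaction can be sketched as a voltage divider: the amplifier's output impedance sits in series with a tweeter whose impedance climbs with frequency because of voice coil inductance. The resistance, inductance, and output impedance values below are assumptions for illustration, and the model leaves out the crossover entirely.

```python
import math

def level_at_tweeter_db(freq_hz, amp_out_ohms, coil_ohms=6.0, coil_henries=50e-6):
    """Voltage reaching the tweeter, in dB relative to the amplifier's open-circuit output."""
    # Tweeter modeled as a resistance in series with an inductance.
    load = complex(coil_ohms, 2.0 * math.pi * freq_hz * coil_henries)
    divider = abs(load) / abs(amp_out_ohms + load)
    return 20.0 * math.log10(divider)

def treble_rise_db(amp_out_ohms):
    """How much louder 20 kHz arrives at the tweeter than 1 kHz, in dB."""
    return (level_at_tweeter_db(20_000.0, amp_out_ohms)
            - level_at_tweeter_db(1_000.0, amp_out_ohms))

# Assumed output impedances: near-zero (heavy feedback) versus 3 ohms.
for amp_out_ohms in (0.05, 3.0):
    print(f"output impedance {amp_out_ohms} ohms: rise {treble_rise_db(amp_out_ohms):+.2f} dB")
```

With a near-zero output impedance the rise is close to nothing. With a few ohms of output impedance, the tweeter's rising impedance claims a growing share of the voltage, and the top octave comes up by something on the order of a decibel or so in this simplified model.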

Phase Shifts

This is a slippery one. Some researchers argue that other than the most extreme cases, such as analog brick wall filters, phase shifts by themselves aren't very audible, if at all. Others claim that they're of utmost importance. It's a difficult phenomenon to evaluate in non-DSP settings since phase shifts are usually tied to frequency response deviations, which certainly have audible artifacts, and which may or may not be mistaken for effects of group delays (a group of frequencies is delayed in relation to others, hence a phase shift). What is pretty well accepted, though, is that in the stereo format, phase shifts between channels affect the perception of directionality. I once heard an acquaintance remark about how spacious and open a pair of speakers were. I thought they sounded weird, and looked to discover that one was out of phase in relation to the other. Inter-channel phase shifts, aside from polarity mistakes in setup, can occur with poor-tolerance drivers and/or crossover components, with higher-order crossover designs tending to be more susceptible. Another culprit is the listening position. The stereo format requires a center listening position to achieve a focused image with any reasonable depth rendition, regardless of dispersion properties . . . that's just how stereo is. Even though 5.1 can get away with a wider listening area, the same rules apply.
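To get a feel for why the listening position matters, the sketch below converts a difference in distance to the two speakers into an inter-channel phase shift at a given frequency. The 343 m/s speed of sound, the 17 cm offset, and the function name are assumptions for illustration.

```python
def interchannel_phase_deg(freq_hz, path_difference_m, speed_m_per_s=343.0):
    """Phase lag of the farther speaker's arrival, in degrees, wrapped to [0, 360)."""
    return (360.0 * freq_hz * path_difference_m / speed_m_per_s) % 360.0

# Hypothetical listener sitting 17 cm closer to one speaker than the other.
offset_m = 0.17
for freq_hz in (100.0, 1_000.0):
    phase = interchannel_phase_deg(freq_hz, offset_m)
    print(f"{freq_hz:>6.0f} Hz: {phase:.1f} degrees between channels")
```

The same small offset that barely matters at 100 Hz puts the two channels about 178 degrees apart at 1 kHz, nearly the same as wiring one speaker with reversed polarity at that frequency. Unlike a polarity mistake, though, the shift changes with frequency, which is part of why an off-center seat smears the image rather than simply hollowing it out.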

There are many audible artifacts, but these cover the most general, and most common of them. Intentionally introducing those artifacts to compare to one's perception can be a lot of fun, and quite surprising. In the long run, who knows? The experience may lead to a more educated selection of equipment.