
-- Hearing the Sound of the Midrange Colorations

In the Tamino's coloration, the affected midrange region sounds almost as if parts of it were phase inverted (hence the ghostly quality), and as if parts of it were destructively interfering with other parts, thereby decreasing the amplitude and causing the tonal balance recession here. You can hear this sonic quality for yourself by using your own voice to perform a simple exercise. First, pretend you're a famous opera singer, and sing out heartily the words, "The king sat upon the throne". Then sing again the word "sat" with the same hearty gusto, and stretch it out in time so you are singing "sa-a-a-at". Note that the "s" and "t" of "sat" involve strong treble sounds, in which you sharply exhale through your clenched teeth. On the other hand, the long central part of the musical tone, the "a-a-a-a" of "sa-a-a-at", contains a lot of midrange energy, right in the region where the Tamino has its midrange coloration.
Now, let's try something peculiar. Sing the word "sa-a-a-at" again, but this time inhale your breath (instead of exhaling) while singing the "a-a-a-a" part of "sa-a-a-at". Try this a few times and, with a little practice, you should be able to exhale normally while singing the "s" part of "sa-a-a-at", then quickly switch to inhaling for singing the "a-a-a-a" part of "sa-a-a-at", and then switch back again to exhaling normally for the "t" part. Notice how the midrange "a-a-a-a" portion has much less amplitude, and sounds recessed or far away (you simply can't sing it nearly as loudly while inhaling as you can while exhaling). And notice how this midrange "a-a-a-a" portion also sounds phantomlike or ghostly, instead of sounding like your full bodied, solid flesh and blood voice does when you are exhaling normally.
That, in exaggerated form, is how the Tamino's midrange coloration sounds, and how it affects music in general, and the human singing voice in particular. The Tamino's upper midrange and treble regions from the tweeter are exemplary in amplitude, and in tactile, articulate solidity, just like the "s" and "t" portions that you sang while exhaling normally. But the Tamino's colored midrange region, immediately adjacent to its wonderful tweeter output, sounds recessed and ghostlike, just like the "a-a-a-a" portion that you sang while inhaling.
By inhaling instead of normally exhaling, you were singing the midrange "a-a-a-a" portion in inverted phase polarity. During the very same musical note, however, you were normally exhaling instead of inhaling for the treble "s" and "t" portions, so you were singing these treble portions in correct absolute phase polarity. In other words, you were mixing phase polarities within the same musical note. That provides a crucial clue to explaining why the Tamino evinces this peculiar coloration in its midrange reproduction.

-- Cruel Choice for Third Order Crossover

Why, then, does the Tamino exhibit this peculiar coloration? This second story, like the first, again begins with the tweeter. Once again, we'll also find that the cause does not lie with the fault of the designers, but rather with the dictates of the laws of physics, and the cruel choices forced upon the designers by these dictates. And, once again, we'll find that it is the designers' praiseworthily purist pursuit of high end perfectionism which puts them in a difficult quandary.
As before, the Tamino's tweeter driver, in order to be so transparent, fast, and neutral, needs to have a low mass moving system, and this means that it needs to be well protected from the large excursions and heat dissipation problems that would be imposed if much energy from frequencies below the 3500 Hz crossover point were to get through to this tiny 0.75 inch tweeter driver. That in turn means that the crossover network must employ a steeper, higher order filter, in order to protect this tweeter. Evidently a second order filter, as employed in some other loudspeakers, is not steep enough to afford adequate protection. So the Tamino employs a third order filter. Now, most other loudspeakers employ a fourth order filter when they need a steeper filter than second order, thereby skipping over the third order filter. These popular fourth order filters have their own sets of sonic problems, which we won't get into here.
The unpopular third order filter, which Verity employs for the Tamino, is unpopular for a reason: it forces a very cruel choice upon the designer. A third order crossover filter goes through a sudden, steep phase rotation at the crossover frequency, going through a full 360 degrees (which is the equivalent of rotating all the way around the phase clock, within a short frequency span, and coming all the way around back to 0 degrees, which equals 360 degrees).
This gives the designer a cruel choice. If he chooses to connect the tweeter in correct absolute phase polarity, then the loudspeaker system as a whole will go through this sudden, steep phase rotation of a full 360 degrees at the crossover frequency. On the other hand, he can choose to connect the tweeter in inverted absolute phase polarity, and thereby get the benefit of having the loudspeaker system as a whole go through a more gentle and gradual phase rotation of only 180 degrees at the crossover frequency. Thus, the loudspeaker system as a whole would behave better in the immediate region of the crossover frequency. But, if he makes this choice, there's a different penalty to pay, namely that the tweeter will be playing in inverted absolute phase polarity throughout the remainder of its range, above the crossover frequency. Neither choice is the right one, each imposing different sonic benefits and penalties, as dictated by the laws of physics.
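The 360 degree versus 180 degree figures can be checked with a short numerical sketch. It assumes an idealized third order Butterworth crossover feeding perfect drivers, with frequency normalized so that the crossover sits at w = 1. The article does not say which filter alignment the Tamino actually uses, so the Butterworth alignment and the function names here are illustrative assumptions only.

```python
import cmath
import math

def butterworth3_sum(w, tweeter_sign=+1):
    """Summed woofer + tweeter output of an ASSUMED 3rd order Butterworth crossover.

    w is frequency normalized to the crossover frequency.
    tweeter_sign is +1 for a normally connected tweeter, -1 for an inverted one.
    """
    s = 1j * w
    denom = s**3 + 2 * s**2 + 2 * s + 1   # 3rd order Butterworth polynomial
    low = 1 / denom                        # low-pass branch (woofer/midrange)
    high = s**3 / denom                    # high-pass branch (tweeter)
    return low + tweeter_sign * high

def total_phase_rotation_deg(tweeter_sign, w_lo=1e-3, w_hi=1e3, steps=4000):
    """Accumulate the unwrapped phase change of the summed output, in degrees."""
    total = 0.0
    prev = cmath.phase(butterworth3_sum(w_lo, tweeter_sign))
    for i in range(1, steps + 1):
        w = w_lo * (w_hi / w_lo) ** (i / steps)   # logarithmic frequency sweep
        cur = cmath.phase(butterworth3_sum(w, tweeter_sign))
        delta = cur - prev
        # Undo the jump that occurs when the phase wraps past +/-180 degrees.
        if delta > math.pi:
            delta -= 2 * math.pi
        elif delta < -math.pi:
            delta += 2 * math.pi
        total += delta
        prev = cur
    return math.degrees(total)

normal_rotation = total_phase_rotation_deg(+1)     # close to 360 degrees of lag
inverted_rotation = total_phase_rotation_deg(-1)   # close to 180 degrees of lag
at_crossover = butterworth3_sum(1.0, +1)           # equals -1: fully inverted at w = 1
```

Under these assumptions the summed output stays flat in amplitude in both cases, and only the phase differs. With the tweeter connected normally, the sum is exactly polarity inverted at the crossover frequency itself, which is the halfway point of the 360 degree rotation.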
Most designers choose the latter tactic, feeling that inverted absolute phase polarity through the tweeter's range is not such a bad thing. Sonically, a tweeter playing in inverted absolute phase polarity sounds softer and sweeter, with attack transients that sound gentler and indirect rather than hard and direct. Now, most recordings are too closely miked, which makes attack transients on a recording sound harder and more direct than you would hear them live when sitting at a reasonable distance at a live concert. The softening and sweetening that is imposed by an inverted tweeter closely mimics what happens to sound at a live concert. When you listen to live music at a reasonable distance, the distance of air to the stage naturally softens attack transients, and the hall reverberations you hear from your far field concert hall listening seat naturally also soften and sweeten the treble sounds, making them sound indirect rather than hard and direct and in your face. Thus, the softening and sweetening coloration wrought by an inverted polarity tweeter can have euphonic benefits, making typically close miked recordings sound more natural, more like the type of sound you remember hearing from your reasonably distant seat at a live concert. This softening and sweetening coloration, wrought by inverted tweeter polarity, can also help offset and hide typical sonic flaws that many tweeters have, such as an edgy brightness due to diaphragm resonance and breakup.
But the excellent Tamino tweeter does not have these sonic flaws that need to be offset and hidden by polarity inversion. It sounds (and measures) remarkably smooth, and it is very transparent, fast, and uncolored just as it is. Given Verity's purist background in perfectionist high end audio, it is natural that they would want the Tamino's tweeter to be as accurate as possible throughout its operating range, so they would not want to impose the softening and sweetening coloration that is produced by connecting the tweeter in inverted polarity. So Verity chose to connect the tweeter in correct absolute polarity. This purist choice gives the Tamino tweeter a very accurate sound on all treble attack transients, accurately representing the recording just as it was miked. The Tamino tweeter is indeed a joy to hear in correct absolute polarity, as it reproduces treble transients with such accurate attack, and with such transparency, speed, and neutrality. So in this sense Verity's choice was very wise.
But, alas, the laws of physics dictate that there is still a piper to pay for this purist and otherwise wise choice. That piper is the very sudden and steep phase rotation of a large 360 degrees through the crossover region, just where we hear that peculiar coloration, from about 2 kHz to about 4 kHz. This large phase rotation within a small spectral region means that adjacent and nearby portions of the spectrum will be reproduced in different phases from one another. The most dramatic (and easily understandable) example occurs for the portion of the spectrum where this 360 degree phase rotation has reached the halfway mark, at 180 degrees. The portions of music (and voice and sound effects) in this central spectral portion will be reproduced in completely opposite, inverted polarity from the portions of music in nearby portions of the spectrum, both above and below in frequency, which will be reproduced in correct polarity.

-- Phase Rotation Seen in Frequency Domain

What would be the sonic consequences of this? Would this cause the tonal balance recession we heard in the midrange crossover region, and would it also cause that ghostlike, phasey quality we heard here? If yes, then how could it cause such sonic effects?
Let's first use the simplistic (and ultimately incorrect) sine wave model of music to look at this. Imagine a musical instrument, rich in overtones, playing a note at A, 440 Hz (it could even be a singer singing the "a" portion of "sat", as you did in the exercise above). If the musical instrument's overtones were harmonically related, then they would be spaced 440 Hz apart. Thus, it is very likely that at least one of these overtones would fall within that portion of the spectrum where the Tamino crossover's drastic phase rotation of 360 degrees takes place. To make this example easy, imagine that one of the overtones happens to fall at (or very near) the frequency where this 360 degree rotation has reached its halfway point of 180 degrees, and for argument's sake let's say that this happens to be at 3080 Hz. This means that the overtone at 3080 Hz will be reproduced by the Tamino in inverted phase polarity, while most of the remainder of this very same musical note, far from the phase rotation in the crossover region, will be reproduced in correct absolute phase polarity.
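The overtone arithmetic is easy to verify. The sketch below lists the harmonics of a 440 Hz note that land in the region of roughly 2 kHz to 4 kHz where the coloration is heard; the function name is made up for illustration.

```python
def harmonics_in_band(fundamental_hz, band_lo_hz, band_hi_hz):
    """Return (harmonic number, frequency) pairs that fall inside the band."""
    result = []
    n = 1
    while n * fundamental_hz <= band_hi_hz:
        freq = n * fundamental_hz
        if freq >= band_lo_hz:
            result.append((n, freq))
        n += 1
    return result

# Harmonics of A = 440 Hz inside the approximate 2 kHz to 4 kHz crossover region.
print(harmonics_in_band(440, 2000, 4000))
# prints [(5, 2200), (6, 2640), (7, 3080), (8, 3520), (9, 3960)]
```

Five harmonics of this one note fall in the affected region, and the 3080 Hz overtone used in the example is the seventh harmonic.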
Thus, part of the reproduced musical note will be inhaling (like the peculiar sounding "a-a-a-a" you sang above while inhaling), while most of the remainder of this same musical note will be normally exhaling. For example, the opening "s" and terminating "t" of the "sa-a-a-at" you sang are rich in correct polarity treble energy, assuming you correctly exhaled while singing the "s" and "t", so part of the musical note you sang as "sa-a-a-at" is at odds with another part of the very same musical note. The result will be a very peculiar sounding phasey coloration, with this one musical note seeming to fight itself, trying to exhale and inhale at the same time.
You have already heard how peculiar this phasey coloration sounds, because you performed an exaggerated version of it while completely inhaling when you sang the "a-a-a-a" above. The "a-a-a-a" you sang while inhaling sounded very, very different from the normal "a-a-a-a" you sang while exhaling. Now imagine further that some of this very different, very peculiar inhaling sound is mixed in with the normal exhaling version of singing "a-a-a-a", and that they are fighting each other, as a vocalist tries to exhale and inhale at the same time while singing "a-a-a-a". This gives you an idea of the peculiar phasey or ghostly quality of the Tamino's midrange coloration.
This also explains in part why the Tamino's tonal balance sounds recessed or diminished in this midrange crossover region, and why a bold singing voice is changed into a shy, withdrawn voice. Obviously, if you are partly inhaling while you are primarily exhaling, that inhaling portion will subtract from the total amount of air you exhale, and will subtract from the total power, boldness, directness, and loudness of sound you make by exhaling.

-- Looking at Time Domain Waveform

We can also see graphically how this happens, by looking at the time domain waveform of the signal representing music (including voices and sound effects). Let's look at just a positive half cycle of the music waveform (again just sticking with the simplistic sine wave model of music). This positive half cycle will look like a ragged mountain. Now, the loudness you hear from this mountain is determined by the total area within the profile of this mountain. If all frequencies represented in this mountain waveform shape were reproduced in correct phase, then the mountain would have its normal profile, and its correct full amplitude throughout, so you'd hear the musical instrument or bold singer at correct full volume, and you'd hear the correct tonal balance for the whole frequency spectrum, including all the overtones in the midrange crossover region.
But the Tamino reproduces the midrange crossover region with severe phase rotation, and this 360 degree rotation inevitably goes through a phase portion at and near 180 degrees, which represents completely opposite, inverted phase polarity. An opposite polarity waveform has the same full amplitude, but in a negative direction, below the zero amplitude line. Thus, that portion of the music signal seen as a mountain, which corresponds in the time domain to the Tamino's inverted phase polarity portion of the spectrum in the frequency domain, will want to dive down steeply, cutting a deep crevasse into the mountain profile. That portion will want to dive down toward and even below the zero amplitude line, until it reaches and is represented by its full amplitude, but in a negative direction. Thankfully, the musical overtones up around 3080 Hz are normally (for most musical instruments) much lower in amplitude than the fundamental of the musical note, so normally this steep crevasse or notch in the mountain profile doesn't make it all the way down to the zero amplitude line, or beyond it into negative territory (although there are a few musical instruments with fundamentals in the Tamino's crossover region, and they would sound really weird, since then the crevasse would plunge deeply into negative territory).
Even so, this crevasse cut into the mountain profile has several sonic consequences, which are easily audible, especially because human hearing has its maximum sensitivity in this midrange region around 3080 Hz. First, this crevasse lessens the overall area within the profile of the mountain, thus lessening the overall perceived loudness of the musical note, changing a bold singer into a shy one. Second, it dramatically lessens the perceived loudness in that portion of the mountain waveform that corresponds to the spectral region around 3080 Hz. Since there's a notch in this portion of the mountain profile, the amplitude of this portion is obviously less than it should be, compared to the rest of the mountain profile, which represents all the other spectral portions of the musical note. This makes the perceived tonal balance of the Tamino noticeably recessed in this midrange region, just as we heard. Third, it creates that peculiar phasey sound of the musical instrument trying to fight itself, trying to partially inhale while it simultaneously exhales.
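A toy version of the mountain and its crevasse can be sketched in a few lines. It assumes a note made of just a fundamental plus its seventh harmonic (3080 Hz for a 440 Hz note), both cosine-phased so that their peaks line up at the crest, with the overtone at an arbitrary 0.2 of the fundamental's amplitude. These values and names are illustrative assumptions, not measurements of the Tamino, and other phase relationships between the partials would give a different picture.

```python
import math

def one_period(overtone_polarity, overtone_number=7, overtone_level=0.2, samples=720):
    """One period of fundamental + one overtone, sampled from t = -pi to t = +pi.

    overtone_polarity is +1 for correct polarity, -1 for an inverted overtone.
    The 0.2 overtone level is an arbitrary illustrative value.
    """
    wave = []
    for i in range(samples):
        t = -math.pi + 2 * math.pi * i / samples
        wave.append(math.cos(t)
                    + overtone_polarity * overtone_level * math.cos(overtone_number * t))
    return wave

normal = one_period(+1)
inverted = one_period(-1)
crest = len(normal) // 2            # sample index where t = 0, the top of the mountain

print(round(normal[crest], 3))      # prints 1.2
print(round(inverted[crest], 3))    # prints 0.8
```

In this toy case, inverting the overtone pulls the crest down from 1.2 to 0.8: the overtone that used to add to the top of the mountain now subtracts from it. Because the overtone is much smaller than the fundamental, the dip stays well above the zero line.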
Incidentally, the existence of this notch is confirmed by the manufacturer's own measurements of the Tamino's time domain step response. A step response measurement essentially feeds a positive mountain profile into the loudspeaker. And Verity's own measurement of the signal coming out of the Tamino shows a deep, steep notch in this mountain profile. Moreover, the position of this notch in the measured time domain waveform confirms that it corresponds to the midrange crossover region in the frequency domain, and is therefore caused by the phase rotation of the crossover. The portion of the frequency spectrum, corresponding to the point in time of this anti-polarity phase notch, will sound recessed (since we hear waveforms as reproduced in time domain), and will also sound phasey (since its phase is polarity inverted).

-- True Transient Model of Music

The simplistic sine wave model of music is sufficient to demonstrate and account for the midrange coloration we heard from the Tamino, explaining both the recessed midrange tonal balance and the peculiar phasey quality. But the news gets even worse when we look at the true nature of most music, by discarding the simplistic sine wave model and employing instead the truer transient model. Engineers like to speak of music as being a collection of sine waves, because this makes their equations simpler to handle. But the actual truth is very different. In point of fact, the only music that comes close to qualifying as a collection of sine waves is a long clarinet note or a low organ note. A sine wave has the property of staying the same forever, and of therefore occurring only at a discrete frequency (its temporal behavior determines its frequency characteristics, since frequency is merely a construct or concept defined from time). But music does not stay the same forever. Music is transient, ephemeral, constantly changing. Indeed, the whole point and message of most music is how it changes through time.
Music (including voices and sound effects) consists of transients, and is constantly changing. This means that most transients are singular and non-repeating. That's especially true if more than a single instrument is playing, since the transients from multiple instruments will sum together in complex, ever changing ways, thereby ensuring that a given transient of this now complex signal waveform will probably never be identically repeated. Thus, every temporal slice of complex music's ever changing waveform is likely a unique transient. As a listener pays attention to this music, his ear and brain take in and comprehend each temporal slice, and his ear and brain hear, appreciate, and analyze how each temporal slice of the music sounds. Each temporal slice heard by the listener is evaluated in turn, so each evaluation by the listener also effectively regards that time slice as a single transient.
Now, each single transient has a very interesting property, in terms of its frequency spectral content. Each single transient has (by definition) an infinitely dense spectral content. Likewise, each temporal slice taken in and evaluated by the listener's ear/brain has an infinitely dense spectral content. And this is doubly true if, as is likely, each transient of the complex music waveform, and hence also each temporal slice taken in by the listener's ear/brain, is unique and non-repeating.
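The contrast between a single transient and a steady tone can be glimpsed with a small discrete Fourier transform. This is only a sketch: a finite transform has a limited number of frequency bins and so can merely hint at a truly continuous spectrum, and the 64 sample length, the 4 cycle tone, and the function names are all arbitrary choices made for illustration.

```python
import cmath
import math

def dft_magnitudes(x):
    """Magnitude of each bin of a plain discrete Fourier transform."""
    n = len(x)
    return [abs(sum(x[k] * cmath.exp(-2j * math.pi * m * k / n) for k in range(n)))
            for m in range(n)]

def occupied_bins(x, threshold=1e-6):
    """Count the frequency bins that hold a non-negligible amount of energy."""
    return sum(1 for mag in dft_magnitudes(x) if mag > threshold)

N = 64
click = [1.0] + [0.0] * (N - 1)                              # one isolated transient
tone = [math.sin(2 * math.pi * 4 * k / N) for k in range(N)]  # steady tone, 4 whole cycles

print(occupied_bins(click))   # prints 64
print(occupied_bins(tone))    # prints 2
```

The isolated click puts energy into every one of the 64 bins, while the steady tone occupies only two (its frequency and the mirror image bin that any real-valued signal produces).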
What do we mean by infinitely dense spectral content? It means that each musical transient, and each temporal slice taken in by the listener's ear/brain, actually contains energy at every frequency and every fraction of a frequency in its makeup. This is in contrast to the sine wave model of music, which posits that the spectral content of a given musical note merely contains energy at discrete
