We have already summed all the sample values we have gathered, say 100,000 of them, into a single number. The simple act of summing has already quieted the random noise by self cancellation. This single number we have as the sum is very close to the number we would obtain if the noise added to the sine wave had never even existed. In other words, it's virtually the same number we would obtain if we had had only the pure sine wave itself without noise from the beginning, and had simply added together the amplitudes of 100,000 identical positive peaks of this sine wave. Obviously, then, to derive the amplitude of just one of these 100,000 identical positive peaks, all we need to do is divide this sum by 100,000. And of course, when you add together the amplitudes of 100,000 samples and then divide the sum by the number of samples (100,000), the resulting number is called the average, and it represents the average value of all the samples you looked at and gathered together. In the case of the example we have been using, we know the sum of 100,000 samples of sine wave plus noise at +13, +8, etc. is very close to the sum of 100,000 samples of just the sine wave itself at hidden amplitude x, x, x, etc. And so, to discover the correct hidden amplitude x of the sine wave obscured by noise, all we need to do is divide the sum by 100,000.
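      To make the arithmetic concrete, here is a minimal sketch in Python. The hidden peak value of 10.0 and the noise span are illustrative assumptions of this sketch, not figures from the HFN article; only the count of 100,000 samples comes from our running example.

    import random

    N = 100_000          # number of positive-peak samples we gather
    TRUE_PEAK = 10.0     # the hidden amplitude x (assumed here purely for illustration)
    NOISE_SPAN = 5.0     # zero-mean random noise added to every sample (also assumed)

    # Each gathered sample is the fixed peak amplitude plus random noise,
    # like the +13, +8, etc. of our running example.
    samples = [TRUE_PEAK + random.uniform(-NOISE_SPAN, NOISE_SPAN) for _ in range(N)]

    total = sum(samples)    # the noise largely self-cancels in this sum
    average = total / N     # divide the sum by 100,000 to recover one peak

    print(average)          # comes out very close to TRUE_PEAK

      Run it a few times: the printed average lands within a tiny fraction of the noise span around 10.0, which is the self cancellation described above.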
      To make a long story short, we can accurately discover and reproduce the correct amplitude of just the hidden sine wave peak itself by calculating the long-run average of many, many samples of sine wave peak plus noise. We can recover the pure signal itself, plucking it out of the noise and quieting the noise, by the simple process of averaging together many, many samples. And, by quieting the noise and getting closer to accurately reproducing the correct amplitude of the pure sine wave signal itself, we have effectively improved the bit resolution of our digital system (in a crude digital system with fewer bits of resolution, the approximation or quantization error from its limited resolving power appears as quantization noise, and if this noise is quieted by our averaging technique, then the effective bit resolution is improved).
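      For the record, the textbook rule of thumb behind this resolution claim (our gloss, not a formula from the HFN article; it assumes the noise in successive samples is uncorrelated and zero-mean) is

    \sigma_{\text{avg}} = \frac{\sigma}{\sqrt{N}}, \qquad \text{extra effective bits} \approx \tfrac{1}{2} \log_2 N

so averaging N samples shrinks the noise amplitude by a factor of the square root of N, and each quadrupling of N buys roughly one extra bit of effective resolution. That is why the sample counts quoted below have to grow so enormous.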
      Averaging together these 100,000 samples of the sine wave peak plus noise essentially gives us the correct value of the sine wave peak without the noise contamination. That value is a single number, culled from 100,000 numbers. As we noted previously, that single number is all we need in this case. From that single number, accurately giving us the value of the sine wave positive peak, we can reconstruct the entire positive half of the sine wave, using the reconstruction filter that the Nyquist theorem requires in any case. Similarly, we can reconstruct the sine wave negative peak from a single number, derived as an average in the same way as the sine wave positive peak.
      We have now succeeded in discerning and reproducing a whole sine wave cycle very accurately, even though it was previously hidden by added noise. We looked at and gathered in 100,000 sine wave cycles in order to do this. It's worth noting that Fig. 5 in Keith Howard's HFN article, showing a sine wave recovered via this technique, required the gathering in and averaging of 24,000 samples for the peak value, and it still shows some visible waveform inaccuracies (which would be highly audible, since the ear is far more sensitive to hearing such distortions than the eye is to seeing them on a waveform plot). Also, Fig. 3, which shows how a 16-bit system can have its resolution enhanced up to 24 bits, employed the averaging of 1,048,576 samples to achieve this.

Creating the Output Signal

      In order to generate this one cycle of a sine wave free from noise, we had to gather in many, many cycles of the sine wave with the noise (in order to average them all together). In this example, our input signal was 100,000 cycles of a sine wave, and our output so far is just one single cycle of a sine wave.
      Suppose though that we want to be able to reproduce the whole span of the input signal, all 100,000 cycles worth. After all, a single cycle of a sine wave isn't going to do anyone much good, even if it is noise free. And our digital recording/reproduction systems are all about recording and reproducing the entire span or duration of an input signal, not just a single cycle fraction of it, not just 1/100,000 of the input signal's span. Is there any way, after averaging 100,000 cycles down to just one cycle, of somehow reconstituting all 100,000 cycles of the input signal? And is there any way of doing so while preserving the noise quieting benefits we have achieved by averaging?
      Yes, yes. It's really very simple. We already have derived a single cycle of a sine wave that is substantially free of the noise that plagued all 100,000 input samples. All we have to do is employ that single cycle as an ideal model and simply repeat it 100,000 times, to produce an output signal. This output signal will have the same span or duration as the input signal, so it will essentially reproduce the entire input signal, but without the noise that the input signal contained. Of course, this entire process is occurring while the signal is in digitized form, so it's a piece of cake to program our digital computer to first gather in 100,000 noisy samples, then calculate their average, and finally use this calculated average as a model to spit out 100,000 output samples.
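      Since the programming really is a piece of cake, here is a minimal sketch of that whole pipeline in Python. The 16 sample points per cycle and the noise level are assumptions of this sketch; the 100,000 cycle count is from our running example.

    import math
    import random

    CYCLES = 100_000    # input span: 100,000 cycles of sine wave plus noise
    PTS = 16            # sample points per cycle (an assumption for this sketch)

    # Step 1: gather in the noisy input signal, cycle by cycle.
    noisy_cycles = [
        [math.sin(2 * math.pi * p / PTS) + random.uniform(-0.5, 0.5)
         for p in range(PTS)]
        for _ in range(CYCLES)
    ]

    # Step 2: average each sample point across all cycles, yielding one
    # prototype cycle that is substantially free of the noise.
    prototype = [sum(cycle[p] for cycle in noisy_cycles) / CYCLES
                 for p in range(PTS)]

    # Step 3: use the prototype as the ideal model and repeat it 100,000
    # times, so the output span matches the input span; every output
    # cycle is an identical clone of every other.
    output = prototype * CYCLES

    print(prototype[PTS // 4])   # the positive peak, very close to 1.0
    print(len(output))           # same number of samples as we gathered in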
      Well, this is all wonderful. Everything Keith Howard's HFN article promised us has come true. By simply using averaging, we can clean the noise off an input signal and enhance our digital system's bit resolution, and we can deliver an output signal which is purer and more accurate than the input signal was. What a miracle! Sounds like the cue for the boys from Sony and Philips to enter stage left, so they can design a recording/reproduction system like DSD based on these miraculous techniques. Hey, they could even design a system with absurdly low intrinsic bit resolution like DSD, but then simply average the heck out of the input signal, to dramatically quiet noise and enhance bit resolution, enough to effectively provide a quiet, high resolution digital system. If you have a technique for miraculously making a silk purse from a sow's ear, then a digital system designed as a sow's ear will do just fine, thank you. With the promise of this miraculous signal enhancing averaging technology in hand, nothing can possibly go wrong, go wrong, go wrong, go wrong….

Hidden Assumptions

      Of course, something does go wrong. Horribly wrong. The miracle of the averaging technique is true, as far as the HFN article chooses to tell the story. The problem lies in what is missing from the story, in what is left untold.
      For, you see, the simple truth is that the averaging technique works great for a single sine wave signal, just as described in the HFN article. But the untold part of the story is that it doesn't work for a music signal. The averaging technique is almost totally irrelevant to a music signal, and indeed can do great harm to a music signal if it is applied aggressively enough to provide significant quieting of noise or enhancement of bit resolution. In other words, the chief reasons for wanting to employ the averaging technique for a sine wave are actually the worst reasons for a music signal, because if we use the averaging technique aggressively enough to provide lower noise or resolution enhancement benefits in significant measure (as in the HFN examples and DSD/SACD), then the averaging technique will harm the integrity of the very music signal it's supposedly trying to improve.
      When HFN applied the averaging technique with such success to the single sine wave plus noise input signal, they were actually relying on some hidden, unspoken assumptions. It's time to bring these tacit assumptions out of the closet and to expose them to the full light of critical analysis.
      The first hidden assumption is that we even know what the nature of the input signal is. In the example, we knew in advance that it was a single sine wave. Knowing this in advance then allowed us to select or design a technique for handling the input signal, and hopefully improving upon it. But with real music we don't know in advance what the signal will look like, so we can't even design a strategy for intelligently monkeying with it. Real music is ever changing, is unpredictable, and is chock full of singular events that never repeat themselves.
      Indeed, a real music signal is very much like random noise, and very different from a single sine wave. Since real music is like random noise, any technique (like averaging) that reduces noise will probably also reduce the legitimate information content of the music itself. And it seems prima facie obvious that a technique (like averaging) which might have worked well to separate two very different types of signals, like random noise vs. a single sine wave, should be far less successful at separating two signals if they are very similar in nature, like music and random noise.
      The second hidden assumption is that the nature of the input signal is uniformly repetitious. This assumption is true of the single sine wave, which is the simplest possible signal and which repeats itself endlessly with unvarying uniformity. But this assumption does not apply to real music, which keeps changing.
      Unfortunately, this assumption is required as the key lynchpin for the averaging technique to even work in the first place. The first step of the averaging technique was the summation of many, many samples. Any unwanted random noise added to a desired signal tends to cancel itself out in the long run in such a summation or average, because random noise tends in the long run to have equal energy above and below a baseline. But for this technique to even get off the ground, that baseline absolutely has to remain constant for all the samples we look at and add together. That baseline could be zero, as in the case where random noise is the only signal and we seek to quiet it. Or that baseline could be some other fixed amplitude, such as the positive peak of a sine wave that repeats itself uniformly and identically, over and over for all samples.
      The adding or averaging of many samples tends to reduce or eliminate fluctuations above and below such a fixed baseline. Random noise is one such fluctuation, and that's why long term summation or averaging tends to quiet random noise added to a baseline constant (such as the constant peak amplitude of an endlessly repeating sine wave). But if the desired signal itself were to fluctuate from sample to sample, then the summation or averaging technique would also cancel out part of the desired signal, since it can't tell whether fluctuations going up and down represent unwanted random noise or desired signal. The summation or averaging technique is dumb; it doesn't know what is desired signal and what is noise. The summation or averaging technique can only separate something that fluctuates from something that does not fluctuate. It can separate random noise from the desired signal only if one fluctuates and the other does not fluctuate at all. The summation or averaging technique worked great for the example of the single sine wave as an input signal, precisely and only because a sine wave repeats itself constantly and identically, over and over, for all the thousands or millions of samples you care to sum, in your quest to quiet the noise and enhance bit resolution. A single sine wave is ideal as a fixed baseline, since it is the simplest possible waveform, and its simple pattern remains constant as it repeats itself over and over.
      On the other hand, real music as an input signal is anything but a fixed, uniform, unchanging baseline. A real music signal keeps changing all over the map, and scarcely ever repeats itself even once identically, let alone repeating itself over and over. Furthermore, a real music signal is very complex, unlike the simple single sine wave, so even a musical passage that might be repeated doesn't furnish a simple, constant enough platform to use as a fixed baseline for the summation or averaging technique to use.
      Because real music as an input signal keeps changing, it doesn't give the summation or averaging technique the fixed baseline it requires to separate desired signal from unwanted noise. So the summation or averaging technique has very limited potential for improving a real music signal. Thus, it is very misleading for an article to demonstrate that there is great potential for improving an input signal that is a single sine wave, thereby implying that these great improvements seen with a single sine wave are relevant to improvements realizable with real music.
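      To see the failure mode in miniature, take the same averaging code as before and feed it a signal that fluctuates on purpose. The amplitude-modulated sine below is entirely our own crude stand-in for a changing music signal, not anything from the HFN article.

    import math
    import random

    CYCLES = 1_000
    PTS = 16

    # A crude stand-in for music: the waveform's amplitude keeps changing
    # from cycle to cycle, so there is no fixed baseline to average around.
    changing_cycles = [
        [random.uniform(0.2, 1.0) * math.sin(2 * math.pi * p / PTS)
         for p in range(PTS)]
        for _ in range(CYCLES)
    ]

    # The very same averaging step as before now cancels the desired
    # fluctuations too: each cycle's individual amplitude is collapsed
    # into one bland mean value.
    prototype = [sum(cycle[p] for cycle in changing_cycles) / CYCLES
                 for p in range(PTS)]

    print(max(prototype))   # about 0.6, no matter what any one cycle did

      The technique cannot tell the deliberate amplitude changes from noise, so it averages them away together with the noise, which is precisely the harm to the desired signal described above.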
      The third hidden assumption is that the output signal is to be uniformly constant, for example a sine wave that repeats itself endlessly. Recall that the summation or averaging process produces as a result just a single number, which will represent the average value of the many, many input signal samples looked at. That single number can then be output, and can be easily repeated, to produce as many identical clone samples as we want in our output signal. We of course want to output as many samples as we have previously just gathered as our input for averaging, so that the span or duration of the output signal matches the span or duration of the input signal (if it doesn't, then we can't claim to have a recording or reproduction system). Since we only get one number from a summation or averaging calculation, we have only one number from which to generate the entire span of our output signal, to match the span which we gathered from the input signal to average (which might be 100,000 samples). At the very least, we surely need to generate more than one output sample for this span, since we surely gathered in more than one sample to sum or average. Since we have to generate more than one output sample, but we only have one number (the sum or average) to generate all the output samples from, it follows necessarily that all of our output samples for this span will be identical clones of one another.
      There's no problem here, so long as the output signal is a single sine wave. Since a sine wave continues uniformly unchanged forever, the sample point at one positive peak is an identical clone of the sample point at every other positive peak (again, our example focuses on just the positive peak of the sine wave, but the same thing would be true for any other sampling point repeated at any other place in the sine wave cycle). Thus, if we know in advance that the output signal is to be only a single sine wave, then we can employ the averaging technique with aggressive abandon, and reap rich rewards in dramatic noise quieting and resolution enhancement. We can sum and average even more samples, indeed as many samples as we want (say 100 million instead of 100,000), to achieve even more impressive improvements in noise quieting and resolution enhancement. The more samples we gather into our average, the wider the span and longer the duration of our input sample gathering, and thus the wider the span and longer the duration of our generated output signal that consists of identical clones repeated ad nauseam. So long as the output signal is to be a single sine wave, that's fine, since a sine wave is itself a clone identically repeated forever.
      However, if the output signal is to be real music, this third hidden assumption gets us in big trouble. Real music keeps changing from one sample to the next. So if we're trying to output a real music signal, we can't output a long string of samples that are identical clones of each other. And if we output a string of identical clones to create an unchanging signal for some significant span or duration, that won't be real music. Gershwin's Rhapsody in Blue begins with a single clarinet note that is pretty close to a sine wave. Imagine what Gershwin's composition would sound like if we made the whole piece an identical clone of that first sine wave note.
      Thus, there are three hidden assumptions which all have to be true for the averaging technique to be able to work its magic as advertised, in quieting noise and enhancing resolution. All three assumptions do in fact apply to a single sine wave, so the averaging technique works great for a single sine wave, and produces dramatic improvements (naturally seducing people into believing that it will also then work for all signals). But none of these three assumptions apply to a music signal. So it won't work its magic as advertised for music.

Send in the Clones

      The miracle of the averaging technique seemed even better than magic. It seemed like a free lunch. But of course there is no such thing as a free lunch. There is always a piper to pay. We might make a desirable gain in quieting noise and enhancing effective bit resolution. But we also have to surrender and lose something desirable.
      Notice something important. When we gathered many, many samples and added them together to form a sum (thereby allowing the random noise to self-cancel), that number we got as a sum was just a single number. Likewise, when we calculated an average for these many, many samples, that number we got as an average was just a single number. We took in many, many numbers as input, and we produced as an output just one number.
      Thus, there has been a huge loss of information. We previously had individual information about each of say 100,000 samples, and in a Faustian bargain we have discarded all that information in order to obtain merely a single number. That single number is valuable to us, because it reduces noise and thereby describes more accurately the prototypical or average sample. But such a single number is also much more circumscribed and limited, because it only describes a prototypical or average sample, and surrenders all pretense of being able to describe each, every, or even any of the 100,000 individual member samples that were surveyed to reach the summary average description. It's like a mass market survey, looking only for the lowest common denominator. They'll survey 100,000 people, and conclude that the average person's favorite food is a hamburger. That might be an accurate overall average description, but it discards all the information about each individual person's actual characteristics (I like Chinese cuisine, you like Hungarian cuisine).
      Now, having just one number to describe an entire population can be sufficient under one very narrow circumstance. If all 100,000 members of the population just happen to be identical clones of one another, then and only then can a description of any one member also be sufficient to describe any other member, and furthermore a description of the average member would be identical to a description of any individual member.
      Of course, we all know that 100,000 members of virtually any population are not identical clones. Take 100,000 people, or 100,000 fleas, or 100,000 successive orchestral sounds in the progress of a musical symphony. In all these populations, each individual is noticeably different from every other, and usually in many complex ways. Indeed, the very notion that these populations could be composed of 100,000 identical clones strikes us as weirdly unusual.
      But there is one rare, unusual, weird entity on earth where the members of the population are indeed identical clones of each other. This is such a rare and weird phenomenon, and it is so irrelevant and inapplicable to everything else on earth, that it almost isn't even worth mentioning. Unfortunately, in the present context, we do have to mention this weird entity. We have to mention the single sine wave, the one population whose members, its endlessly repeating cycles, are indeed identical clones of one another.
