In May of 2001, Stanley Lipshitz and John Vanderkooy of the University of Waterloo, in Canada, published a paper titled “Why 1-bit Sigma-Delta Conversion is Unsuitable for High-Quality Applications”. In the paper’s Abstract (a kind of introductory paragraph summing up what the paper is all about) they made some unusually in-your-face pronouncements, including “We prove this fact.”, and “The audio industry is misguided if it adopts 1-bit sigma-delta conversion as the basis for any high quality processing, archiving, or distribution format…”. DSD had, apparently, met its Waterloo.
What was the basis of their arguments? Quite simple, really. They focussed on the problem of dither. As I mentioned in an earlier post, with a 1-bit system the quantization error is enormous. We rely on dither to eliminate it, and we can prove mathematically that TPDF dither at a depth of ±1LSB is necessary to deal with it. But with a 1-bit system, ±1LSB exceeds the full modulation depth. Applying ±1LSB of TPDF dither to a 1-bit signal will subsume not only the distortion components of the quantization error, but also the entire signal itself. Lipshitz and Vanderkooy study the phenomenon in some detail.
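The headroom problem can be put into numbers with a short Python sketch. It is my own simplified model, not taken from the paper: it assumes a uniform mid-rise quantizer with 2^bits levels, takes the "no-overload" input range to be ±(levels/2) LSB, and assumes TPDF dither with a peak amplitude of ±1LSB. The function name `dither_headroom` is hypothetical.

```python
def dither_headroom(bits):
    """Fraction of the quantizer's no-overload input range that is left
    for the signal once TPDF dither peaking at +/-1 LSB has been added.

    Assumes a uniform mid-rise quantizer with 2**bits levels (illustrative model).
    """
    levels = 2 ** bits
    half_range_in_lsb = levels / 2   # no-overload range is +/-(levels/2) LSB
    dither_peak_in_lsb = 1.0         # TPDF dither reaches +/-1 LSB
    return (half_range_in_lsb - dither_peak_in_lsb) / half_range_in_lsb


for bits in (1, 2, 3, 16):
    print(bits, dither_headroom(bits))
```

Under these assumptions a 1-bit quantizer has no room left for the signal at all, a 2-bit quantizer keeps half of its range, and a 16-bit quantizer loses almost nothing.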
They then go on to characterize the behaviour of SDMs. SDMs and noise shapers are more or less the same thing. I described how they work a couple of posts back, so you should read that if you missed it first time round. An SDM goes unstable (or ‘overloads’) if the signal presented to the quantizer is so large as to cause the quantizer to clip. As Lipshitz and Vanderkooy observe, a 1-bit SDM must clip if it is dithered at ±1LSB. In other words, if you take steps to prevent it from overloading, then those same steps will have the effect that distortions and other unwanted artifacts can no longer be eliminated.
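To make the overload argument concrete, here is a minimal Python sketch of a textbook first-order SDM with a 1-bit quantizer. It is not one of the modulators analysed by Lipshitz and Vanderkooy, and it is far simpler than any real DSD modulator. The function name `first_order_sdm`, the output levels of ±1 (so that 1LSB = 2), and the definition of overload as a quantizer input beyond ±1LSB are all my assumptions for illustration.

```python
import random


def first_order_sdm(samples, dither=False, seed=0):
    """Textbook first-order sigma-delta loop with a 1-bit quantizer (-1 / +1).

    Returns (bits, overloads), where overloads counts the samples at which
    the quantizer input fell outside the assumed no-overload range of +/-1 LSB.
    """
    rng = random.Random(seed)
    step = 2.0          # 1 LSB: the distance between the two output levels
    integrator = 0.0
    feedback = 0.0      # previous output, fed back and subtracted from the input
    bits = []
    overloads = 0
    for x in samples:
        integrator += x - feedback
        d = 0.0
        if dither:
            # TPDF dither: sum of two uniform values, each within +/-0.5 LSB,
            # giving a triangular distribution that peaks at +/-1 LSB
            d = rng.uniform(-step / 2, step / 2) + rng.uniform(-step / 2, step / 2)
        q_in = integrator + d
        if abs(q_in) > step:
            overloads += 1
        feedback = 1.0 if q_in >= 0.0 else -1.0
        bits.append(feedback)
    return bits, overloads


dc_input = [0.5] * 4000
plain_bits, plain_overloads = first_order_sdm(dc_input)
dithered_bits, dithered_overloads = first_order_sdm(dc_input, dither=True)
print(sum(plain_bits) / len(plain_bits), plain_overloads)
print(sum(dithered_bits) / len(dithered_bits), dithered_overloads)
```

In this toy model the undithered loop never overloads on a DC input of 0.5, and the average of its bitstream tracks the input. Once the ±1LSB dither is switched on, the quantizer input is repeatedly pushed outside the assumed no-overload range, which is the conflict described above.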
They also do some interesting analysis to counter some of the data shown by the proponents of DSD, which purport to demonstrate that by properly optimizing the SDM, any residual distortions will remain below the level of the noise. Lipshitz and Vanderkooy show that this apparent result is an artifact of the measurement technique rather than a property of the signal, and that if the signal is properly analyzed, the actual noise levels are found to be lower but the distortion levels are not, and do in fact stand proud of the noise.
Lipshitz and Vanderkooy do not suggest that SDMs themselves are inadequate. The quantizer at the output of an SDM is not constrained to being only a single-bit quantizer. It can just as easily have a multi-bit output. In fact they go on to state that “… a multi-bit SDM is in principle perfect, in that its only contribution is the addition of a benign … noise spectrum”. This, they point out, is the best that any system, digital or analog, can do.
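As a rough illustration of why a multi-bit quantizer changes the picture, here is a Python sketch of a first-order SDM whose quantizer has 2^bits levels and is dithered with TPDF noise at ±1LSB. Again this is a toy model of my own, not a modulator from the paper or from any commercial DAC chip. The function name `multibit_sdm`, the 3-bit default, the output levels spanning -1 to +1, and the DC test input of 0.25 are all assumptions chosen for illustration.

```python
import math
import random


def multibit_sdm(samples, bits=3, seed=0):
    """First-order sigma-delta loop with a TPDF-dithered multi-bit quantizer.

    Returns (output, overloads), where overloads counts the samples at which
    the dithered quantizer input fell outside the no-overload range.
    """
    rng = random.Random(seed)
    levels = 2 ** bits
    step = 2.0 / (levels - 1)       # 1 LSB, with output levels spanning -1 .. +1
    limit = (levels / 2) * step     # assumed no-overload range is +/-limit
    integrator = 0.0
    feedback = 0.0                  # previous output, subtracted from the input
    output = []
    overloads = 0
    for x in samples:
        integrator += x - feedback
        # TPDF dither peaking at +/-1 LSB (sum of two uniform values)
        d = rng.uniform(-step / 2, step / 2) + rng.uniform(-step / 2, step / 2)
        q_in = integrator + d
        if abs(q_in) > limit:
            overloads += 1
        # Mid-rise quantizer: pick the cell index, then clip it to the valid range
        k = math.floor(q_in / step)
        k = max(-(levels // 2), min(levels // 2 - 1, k))
        feedback = (k + 0.5) * step
        output.append(feedback)
    return output, overloads


out, overloads = multibit_sdm([0.25] * 4000, bits=3)
print(sum(out) / len(out), overloads)
```

With three bits and this modest input level, the full ±1LSB dither fits inside the quantizer's range, so in this model the quantizer never overloads and the average output still tracks the input. Setting `bits=1` in the same function reduces it to the 1-bit case, where the dither alone fills the whole range.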
The concept of a stable SDM with a multi-bit output is what underlies the majority of chipset-based DAC designs today, such as those from Wolfson, ESS, Cirrus Logic, and AKM. These types of DAC upsample any incoming signal - whether PCM or DSD - using a high sample rate SDM with a small number of bits in the quantizer - usually not more than three - driving a simplified multi-bit analog conversion stage.
Lipshitz and Vanderkooy’s paper was of course subjected to counter-arguments, mostly (but not exclusively) from within the Sony/Philips sphere of influence. This spawned a bit of thrust and counter-thrust, but by and large further interest within the academic community completely dried up within a very short time. The prevailing opinion appears to accept the validity of Lipshitz and Vanderkooy’s arguments from a mathematical perspective, but is willing to also accept that once measures are taken to keep any inherent imperfections of 1-bit audio below certain presumed limits of audibility, 1-bit audio bitstreams can indeed be made to work extremely well.
From a theoretical perspective, we have reached the point where our ability to actually implement DSD in the ADC and DAC domains is more of a limiting factor than our ability to understand the perfectibility (or otherwise) of the format itself. Most of the recently published research on 1-bit audio focuses instead on the SDMs used to construct ADCs. These are implemented in silicon on mixed-signal ICs, and are often quite stunningly complex. Power consumption, speed, stability, and chip size are the areas that interest researchers. From a practical perspective, 1-bit audio has broad applicability and interest beyond the limited sphere of high-end audio, which alone cannot come close to justifying such an active level of R&D. Interestingly though, few and far between are the papers on DACs.
For all that, the current resurgence of DSD which has swept the high-end audio scene grew up after the Lipshitz and Vanderkooy debate had blown over. Clearly, the DSD movement did NOT meet its Waterloo in Lipshitz and Vanderkooy. Its new-found popularity is based not on arcane adherence to theoretical tenets, but on broadly-based observations that, to many ears, DSD persists in sounding better than PCM. It is certainly true that the very best audio that I personally have ever heard was from DSD sources, played back through the Light Harmonic Da Vinci Dual DAC. However, using my current reference, a PS Audio DirectStream DAC, I do not hear any significant difference at all between DSD and the best possible PCM transcodes.
There is no doubt in my mind that we haven’t heard the last of this. We just need to be true to ourselves at all times and keep an open mind. The most important thing is to not allow ourselves to become too tightly wed to one viewpoint or another to the extent that we become blinkered.