How many of you have ever been to The Louvre in Paris? Next question, how many of you have been to Paris but have not visited The Louvre? In the humble opinion of your author, The Louvre is the finest museum and art gallery in the world. The scale, breadth, and sheer quality of The Louvre are quite breathtaking. This is not an attraction that you set an afternoon aside for to check off your bucket list. I don’t see how you can do it justice in less than two solid days.
The Louvre is a stunning experience. The exhibits are laid out with flawless vision, and a quality that places the world’s finest works of art in an appropriate setting, while avoiding the temptation of overblown in-your-face opulence. It also avoids that sense of tired dowdiness that mars so many of Europe’s oldest and most famous establishments. It is surprisingly spacious. Unlike Florence’s Uffizi, it manages to maintain a serene, contemplative, and unhurried ambiance, even during the busiest times. But that is with one glaring exception. Everybody who comes to The Louvre comes to see Leonardo Da Vinci’s masterpiece Mona Lisa; sometimes that is the only thing they enter the building to accomplish. Mona Lisa sits inside a room of its own, maybe 2,000 square feet in all. It is always packed. What do you do? You can elbow your way to the front, arrogantly and unashamedly, and people [insert your preferred stereotype here] do just that. Or you can just go with the flow, and gradually drift towards the front over a period of maybe 45 minutes. This is a great way to contemplate the work’s iconically enigmatic message. The third thing you could do is to check it off your bucket list from the back of the room and head for the exit and a nice tasse of French coffee.
I took the middle approach, allowing me to contemplate La Gioconda from a number of perspectives. One of the thoughts that occupied my mind was this one. How do I know I am looking at the real thing? Was that just a reproduction, with the original sealed deep in a climate-controlled nuclear-bomb-proof vault? How would I know? If I was in possession of an accurate replica, which I could hang above my fireplace, would I be able to tell that it wasn’t the original? We’ll set aside the logistical (not to mention legal) difficulties of explaining how it got from the Louvre to my Lounge.
Proving the provenance and authenticity of original art is a thorny problem. Currently disputed works include those of artists from Caravaggio to Jackson Pollock. Inevitably, the problem ends up being resolved based on the opinion of a single expert, or panel of experts. On occasion, even the opinions of those experts are hotly contested.
So if only a tiny panel of experts can tell the real Mona Lisa from a high quality fake, it follows that I can’t. If I had the money and the desire, I could commission my own Mona Lisa replica, and hang it in my lounge. It may as well be the real thing, as far as I would be concerned, because I would be quite unable to tell the difference. Of course, the more of an Art buff I was, and the greater my appreciation and knowledge of fine art, the more accurate the replica would have to be so that I could not tell it apart. And, of course, the more expensive it would be.
The parallels to high end audio are obvious. The concept of the “original” Mona Lisa corresponds to the original performance of musicians in a recording studio, and the concept of the “fake” corresponds to the distributed recording. There are two significant differences. The first is that in the music sphere there is an intent on the part of the artist to distribute accurate copies of the art to consumers. The second is that in music we all get to be members of the “select panel of experts” who get to weigh in on the authenticity debate. Even, unfortunately, the Internet Trolls who believe that their opinions count for more than yours.
Sunday, 29 June 2014
Wednesday, 18 June 2014
Snow Leopard Dropped
A quick update for BitPerfect's few remaining Snow Leopard users.
We were obliged to stop supporting Snow Leopard with the 2.0.1 release of BitPerfect when we discovered that the iTunes Library Framework implementation in iTunes 11.x for Snow Leopard was flawed. Recently, iTunes 11.2.2 was released and this bug was corrected, so we had real hopes of re-introducing support for Snow Leopard with the next release of BitPerfect.
Unfortunately, in extensive Beta testing, BitPerfect is still proving to be rather unstable under Snow Leopard, and the problems appear to be rather deep-seated so we don't anticipate a simple fix.
The bottom line for us is that the amount of effort required to attempt to continue supporting Snow Leopard is not justified by the hold-out user base (estimated at fewer than 100 users out of a total active user base of over 21,000). Therefore Snow Leopard users should no longer expect to see the return of Snow Leopard support in a future update.
It used to be that Snow Leopard was worth hanging on to from an audio perspective since it provided access to BitPerfect's Integer Mode. However, with Integer Mode support enabled in Mavericks this ceased to be a critical issue. Finally, extensive listening tests have revealed that Mavericks does in fact sound significantly better than Snow Leopard, with or without Integer Mode.
So, our apologies must go out to our Snow Leopard users, but, being realistic, you must surely have anticipated that this day would come at some point.
Wednesday, 11 June 2014
Solti’s Ring
Sir Georg Solti was one of the preeminent conductors of the latter part of the 20th Century. Many conductors are polarizing figures, particularly among the musicians over whom they hold sway. But polarized opinions of Solti are spread equally among musicians, critics, the general public, and even his fellow conductors. Most unusually, Solti’s reputation was built upon one particular recording, or rather one series of them, Wagner’s Ring Cycle which he recorded for Decca between 1958 and 1965.
Wagner’s Ring Cycle is a demanding undertaking. It is a series of four Operas of colossally long duration. Most of them go on for a good four hours. It is the very definition of Heavy Opera. Unlike most Operas, the Ring Cycle has no arias. The four Operas are constructed like vast tone poems, with the various themes and ideas upon which the plot is constructed being represented by ‘leitmotifs’ - recognizable music fragments - which appear throughout the entire cycle. Wagner wrote all his own libretti, which are abstruse and allegorical, and in German. The Ring Cycle is a compositional tour de force. Not only did Wagner turn the entire concept of what an Opera should be upside-down, he turned the whole idea of how an Opera should be experienced upside-down. He built a custom-designed Opera House in Bayreuth for the express purpose of performing the Ring Cycle. To this day, the Bayreuth Festspielhaus is used solely for the production of Wagner’s Operas. And if you should want to go there and hear one, you will have to enter a lottery!
The design of the Festspielhaus was totally radical in its day, and to some extent remains so today. It has no boxes or privileged seating. Wagner felt that all men were equals when it came to listening to his Opera, and there should be no special seating for those of high status, who would mingle with those of the lowest status who could still afford a ticket. There is not a single bad seat in the house. The orchestra pit is most unusual, not only in shape (a large part of it is in a shallow space underneath the stage), but also in the way its acoustics work. The pit, the orchestra, and the conductor are all invisible from the auditorium. It is designed to project the sound onto the stage, and from there to be reflected back to the audience. The violins in the pit sit to the right of the conductor, not to the left, so that their sound dispersion pattern favours projecting back over the stage. The idea was that the sound would seem to emanate from the stage itself, and (I haven’t been) by all accounts this still holds good today.
Wagner wished for his Ring Cycle to be experienced as one single entity. Das Rheingold, the first to be completed, and Die Walküre were each given their own premieres in Munich, but the other two, Siegfried and Götterdämmerung, were not performed until the opening of the Bayreuth Festspielhaus in 1876, as part of the first complete Ring Cycle.
Wagner was so determined to achieve the exact sound palette he had in his mind that he went so far as to design a whole raft of brass instruments specially for The Ring. He commissioned instrument makers to go away and make them. [Bruckner and Strauss are composers who went on to call for some of Wagner’s brass tubas in some of their own symphonic works.] Das Rheingold even demands an ensemble of 18 anvils with hammers, and Wagner went so far as to specify the dimensions and weight of each of them!
Into this musical context we must also thrust a heavy dose of political context. The Ring Cycle is dosed to the eyeballs with Germanic symbolism and mythology, expressed always in the most abstract of ideas. It is therefore manna from heaven for those who would wish for it to be construed to support their own extreme philosophies, particularly as they bear on race, history and culture. Wagner was also somewhat of an anti-semite, an attitude which seemed to develop relatively late in life in response to the perceived public adulation of Mendelssohn and Meyerbeer and not Wagner. Anti-semitism was also quite fashionably established across many walks of life in a mid-19th Century Europe which was still a Century away from discovering political correctness.
When Adolf Hitler and the Nazi movement adopted Wagner’s music as being at the heart and soul of what they perceived as their own philosophy, the linkage became somewhat symbolic for an entire generation of Europeans who faced death, destruction, and even extermination at the hands of Germany over the course of two world wars in rapid succession. The Nazis perceived in Wagner’s writings support for their most extreme ambitions, and many observers were willing to take these interpretations at face value. Even today there remains considerable disagreement over what Wagner’s personal beliefs may have been.
At the outbreak of WWII, Georg Solti was a young conductor seeking to embark upon a career in his native Hungary. He was also a Jew, and had the good fortune to find himself in Lucerne when the war broke out. He was wisely advised not to come home, and saw the war out in Switzerland. At the end of the war he was invited to participate in the reconstruction of post-war Germany by taking on the prestigious post of Director of the Bavarian State Opera in Munich. Although he had no real experience conducting Opera - and was a Jew, never mind not being a Catholic - he took up the post and held it for 5 years before moving on to a similar post in Frankfurt.
In 1956, John Culshaw of the Decca Record company in London was anxious to record a major classical work that would showcase the capabilities of the new stereophonic music systems now being sold. He knew that the drama and sonorities of Wagner’s Ring Cycle would be absolutely perfect for the task, but received a lot of resistance within Decca because of the political ramifications. This was, after all, only 11 years after the war, and in Britain the population was still on rations. Additionally, the complications - and cost - of recording such a work were not to be underestimated. But neither was Culshaw, and in 1958 the project got underway with the recording of Das Rheingold. Georg Solti, by this time gaining a reputation as a rising star in the world of Opera, but not yet having risen so high that his availability to commit to such an undertaking would be an issue, was contracted for the task.
Culshaw wanted to recreate for the home audience the experience of going to the Opera, but he did not wish to record a live performance, with all the ‘warts-and-all’ aspects of doing so. Also, he wanted the flexibility which a studio setting would give when it came to microphone placements. He even had 18 anvils custom-made to Wagner’s specifications. Das Rheingold, the first and the shortest of the four Ring Cycle Operas, was an ideal vehicle to test the waters. It would take less work to record it, and Decca could spend some time seeing how the record did before committing to the remainder of the Cycle. As it happened, Das Rheingold was a great success, outselling even Elvis Presley’s latest offering (to the enormous consternation of rivals EMI), but nonetheless it was not until 1962 that the forces were reassembled to record Siegfried. By the time Götterdämmerung was completed in 1964 the whole enterprise was beginning to take on almost legendary status. The BBC even sent a film crew out to Vienna to make a documentary about the recording of the Cycle which is still available on DVD as “The Golden Ring”.
Success was not an adequate word. Solti’s Ring Cycle was a stunning success. To this day this 15-hour exposition of some of the most introspective Opera on the standard repertoire remains the best selling classical music recording of all time. Let me give you an idea of how highly it is rated. In 2009, the Esoteric company of Japan performed a complete multi-channel digital remaster and released the result in a 14-SACD boxed set. They put it up for sale in December 2009 in a limited edition of 1,000 sets priced at $800 by mail order only. By April they had all sold out.
Solti’s recording of the Ring Cycle remains a tour de force, and is still regarded as probably the finest classical music performance - certainly the finest Opera performance - ever captured for posterity. The immediate success of the recording made Solti an international sensation. He became Director of the Royal Opera Company Covent Garden, and from there went on to hold the Directorship of the Chicago Symphony Orchestra for 25 years. In each case he took over a provincial ensemble of no great reputation and shaped it into one of the finest in the world, collecting both admirers and detractors along the way. Sir Georg Solti died in 1997, on the same day as Mother Teresa, and in the same week as Princess Diana. But his name will forever live on in association with his great recording of the Ring Cycle.
Here is a cool YouTube extract from the BBC Documentary “The Golden Ring” that I mentioned. The excerpt is Siegfried’s Funeral March from Act III of Götterdämmerung. I think it captures the essence of Georg Solti and his Ring Cycle rather wonderfully.
Wednesday, 4 June 2014
Bit Depth and Noise Floor
I thought I would write a post on the relationship between bit depth and signal-to-noise ratio. The two are related, and the relationships can get quite complicated depending on how deep into the analysis you want to go.
The simplest rule of thumb, one that most of us know, is that the signal-to-noise ratio (SNR) in a simple linear PCM representation is 6dB per bit. So a 16-bit system will have a signal-to-noise ratio of 96dB. This is a pretty good approximation to the real answer, which is that the SNR is 1.76dB plus 6.02dB per bit. So in reality a 16-bit system has a theoretical maximum SNR of 98.08dB. Why is that?
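To put numbers on the rule of thumb, here is a minimal Python sketch of the formula quoted above. The function name `theoretical_snr_db` is my own invention for illustration, and the figure it returns applies to an ideal quantizer fed with a full-scale sine wave, not to any real-world converter.

```python
def theoretical_snr_db(bits: int) -> float:
    """Theoretical maximum SNR (in dB) of ideal linear PCM with the given bit depth.

    Uses the formula from the text: 1.76dB plus 6.02dB per bit.
    """
    return 1.76 + 6.02 * bits


# Compare the quick "6dB per bit" estimate with the fuller formula.
for bits in (8, 16, 24):
    rough = 6 * bits
    exact = theoretical_snr_db(bits)
    print(f"{bits:>2}-bit: rule of thumb {rough}dB, formula {exact:.2f}dB")
```

For 16 bits this reproduces the 96dB and 98.08dB figures mentioned above.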
It is the nature of digital audio that when we digitize a signal we must unavoidably incur a quantization error. This is the difference between the actual instantaneous value of the signal and the nearest quantization level which is what we actually record. Sometimes we are going to round up to the nearest quantization level, and sometimes down. In any case, the magnitude of the quantization error will always lie between zero and one-half of the Least Significant Bit (LSB). It is the analysis of this quantization error that gives rise to the equation alluded to in the previous paragraph. The takeaway here is that any attempt to digitally encode a real-world signal must give rise to a background noise floor of quantization noise, and we can predict what its level ought to be in any given system.
These days anybody can download a high-quality, free, audio analysis program such as Audacity, which is available for both Mac and Windows. We can use that to perform a frequency analysis of any music track we like. Suppose you obtain a 16-bit test track with a single tone at some arbitrary frequency (1kHz is often chosen) at full scale (0dB) and plot the frequency response. What do you expect to see?
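If you would rather see the quantization error than take it on trust, the following Python sketch rounds a full-scale sine wave to the nearest 16-bit level and measures the resulting signal-to-noise ratio directly. The function name, the number of samples, and the test-tone frequency are all arbitrary choices of mine for illustration. The result should land close to the theoretical figure of 1.76dB plus 6.02dB per bit, though not exactly on it.

```python
import math


def measured_snr_db(bits: int, num_samples: int = 100_000) -> float:
    """Quantize a full-scale sine wave to `bits` bits and return the measured SNR in dB."""
    full_scale = 2 ** (bits - 1) - 1           # largest positive integer code
    cycles_per_sample = 0.1234567              # arbitrary tone frequency (assumption)
    signal_power = 0.0
    noise_power = 0.0
    for n in range(num_samples):
        x = full_scale * math.sin(2 * math.pi * cycles_per_sample * n)
        q = round(x)                           # nearest quantization level
        err = q - x                            # quantization error, at most half an LSB
        signal_power += x * x
        noise_power += err * err
    return 10 * math.log10(signal_power / noise_power)


print(f"Measured 16-bit SNR: {measured_snr_db(16):.2f}dB")
print(f"Theoretical figure:  {1.76 + 6.02 * 16:.2f}dB")
```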
It is easy to imagine that the answer would be a continuous background at -98dB, with a single spike at 1kHz going up to 0dB. But that is not what we see. Instead, the background will be a lot lower than -98dB, and we will have a few additional low level peaks that we can’t explain. For the purposes of this post I am not interested in the low level peaks. These are all related to the non-random nature of quantization noise; they can be eliminated by adding dither noise, and we can conveniently ignore them right now. Aside from that, what is happening? Why is the noise floor quite a long way below -98dB?
The answer is that it is the sum total of all of the quantization noise which amounts to -98dB. But that noise is distributed pretty much equally among all the different frequencies between zero and one-half of the sample rate. When we plot the frequency analysis using Audacity we see how that noise is distributed within the frequency space. Each fragment of noise at each frequency is well below -98dB, but taken together they will all add up more or less to -98dB. The next question is a little trickier. If the background noise on a frequency analysis is actually lower than -98dB, just how low should it be? And can we do anything useful with it?
The answer to this is not as cut and dried as you might like it to be. Remember that the noise is actually divided out, more or less evenly, among all of the different frequencies. Every point on that frequency analysis curve is NOT simply a measure of the noise at that particular frequency. What it actually is is a measure of the sum total of the noise at all frequencies in the immediate vicinity of that frequency. When we do the frequency analysis, we can stipulate how many frequencies we want to divide the audio band into. The more frequencies we choose, the fewer frequencies will lie within the immediate vicinity of each point, and the lower the noise floor will appear. So the apparent level of the noise floor will depend on the resolution with which we want to plot it. Surely that can’t make sense?
But it does make perfect sense, and here is why. The frequency response is the outcome of a Fourier Analysis, which takes a chunk of raw audio data and analyzes its frequency content. The number of frequencies it spits out is equal to one half of the number of audio samples that it analyzes. So if you want to increase the number of frequencies you have to increase the number of audio samples, which means you have to analyze a chunk of music of a longer duration. For example, if I perform a Fourier analysis with 16,384 samples I can reduce the noise floor by more than 20dB. But 16,384 samples is more than one third of a second of music, and this is important.
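Here is a rough Python sketch of that arithmetic. It assumes an idealized analysis in which the total quantization noise is shared out exactly equally among the frequency points, and it ignores the windowing and scaling that a real analyzer such as Audacity applies, so the figures are indicative only. The function names are hypothetical.

```python
import math


def bin_noise_floor_db(total_noise_db: float, fft_size: int) -> float:
    """Apparent noise floor per frequency point, for noise spread evenly over fft_size / 2 points."""
    num_frequencies = fft_size // 2            # one half of the number of samples analyzed
    return total_noise_db - 10 * math.log10(num_frequencies)


def chunk_duration_s(fft_size: int, sample_rate_hz: int = 44_100) -> float:
    """Length of audio, in seconds, needed to supply fft_size samples."""
    return fft_size / sample_rate_hz


for size in (1024, 4096, 16384):
    floor = bin_noise_floor_db(-98.08, size)
    print(f"{size:>6} samples: floor about {floor:.1f}dB, "
          f"needs {chunk_duration_s(size):.3f}s of audio")
```

Under this idealization the floor drops by about 3dB each time the number of samples is doubled, and the price is always a proportionally longer chunk of music.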
Let’s go back to the notion of the noise floor actually being well below the quantization noise limit, which is in principle the lowest level signal that can be digitally encoded. If it is, for example, 20dB below that limit, it implies that I should be able to encode a signal that is at a level approaching 20dB below the quantization limit. And this is quite correct - I am indeed able to do that. But there are swings and roundabouts to be negotiated. If I need one third of a second of music to get the noise floor down below the level of the sub-quantization signal that I want to encode, then it also follows that that signal must persist for a full third of a second in order for it to appear above that noise. So, to the extent that I can actually make this happen, it is a pure party trick and has no practical value. The constituent parts of real music signals that exist below the -98dB quantization limit of 16-bit audio are not pure tones of extended duration.
But the story doesn’t end there. Recall that I said that the quantization noise is more or less divided equally among all the frequencies below one-half of the sample rate. Well, it needn’t be. It is possible to skew the distribution such that more of it goes into some frequencies than others. This is only useful if there are “unused” frequencies available where there is no signal content, which we can filter out later. We can then reduce the amount of quantization noise at the frequencies of interest, at the expense of increasing it at the “unused” frequencies. With a sample rate of 44.1kHz, though, there are no such “unused” frequencies. The 44.1kHz sample rate was devised precisely because all of the audio frequency band (as we understood it at the time) was contained neatly within its encodable bandwidth. To address this we would need to increase the sample rate quite considerably so that a whole new range of “unused” high frequencies are now accessible. We can therefore pull some of the quantization noise out of the audio frequencies and put it into those high frequencies instead. This process is called “Noise Shaping”, and what is really interesting about it is that the noise shaping process itself can also be used to back-fill usable “signal” into the newly-created gap between the quantization noise limit and the locally reduced noise floor.
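The simplest way I know to illustrate the idea is a first-order "error feedback" quantizer, sketched below in Python. This is a toy of my own choosing, not the scheme used by any particular converter (real noise shapers use higher-order filters and a greatly increased sample rate). Each sample's quantization error is subtracted from the next sample before it is quantized, which pushes the error away from the lowest frequencies and up towards the highest ones. The test signal is a constant value of 0.3 LSB, which is well below the quantization limit.

```python
def quantize_plain(samples):
    """Ordinary quantization: round every sample to the nearest level."""
    return [round(x) for x in samples]


def quantize_noise_shaped(samples):
    """First-order error feedback: carry each quantization error into the next sample."""
    output = []
    previous_error = 0.0
    for x in samples:
        corrected = x - previous_error         # compensate for the error just made
        level = round(corrected)               # nearest quantization level
        previous_error = level - corrected     # error made on this sample
        output.append(level)
    return output


signal = [0.3] * 1000                          # a steady signal of 0.3 LSB
plain = quantize_plain(signal)
shaped = quantize_noise_shaped(signal)

print("Average of plain output: ", sum(plain) / len(plain))
print("Average of shaped output:", sum(shaped) / len(shaped))
```

Plain rounding loses the signal entirely, since every sample rounds to zero. The shaped output flips rapidly between 0 and 1, but its average comes out very close to 0.3. The low-frequency content has been preserved, and the error has been moved into that rapid flipping, which is to say into the high frequencies.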
Taken to its limits, this process can become very interesting. If we reduce the bit depth all the way down to 1 bit, the quantization noise itself becomes massive at -7.78dB. But by increasing the sample rate all the way up to 2.8MHz, we can create enough “unused” frequency space that we can “shape” an additional 110dB (or more) of noise out of the audio bandwidth and stick it where the sun don’t shine. Sound familiar?….
The simplest rule of thumb, one that most of us know, is that the signal-to-noise ratio (SNR) in a simple linear PCM representation is 6dB per bit. So a 16-bit system will have a signal-to-noise ratio of 96dB. This is a pretty good approximation to the real answer, which is that the SNR is 1.76dB plus 6.02dB per bit. So in reality a 16-bit system has a theoretical maximum SNR of 98.08dB. Why is that?
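As a quick check on that arithmetic, here is a minimal Python sketch of the rule. The function name is my own, and it assumes the standard textbook model of a full-scale sine wave measured against uniformly distributed quantization error, which is where the 1.76dB and 6.02dB figures come from.

```python
import math

def theoretical_snr_db(bits):
    """Ideal SNR of a full-scale sine wave in linear PCM at the given bit depth."""
    per_bit = 20 * math.log10(2)    # about 6.02dB for every bit
    offset = 10 * math.log10(1.5)   # about 1.76dB
    return per_bit * bits + offset

# Compare a few bit depths against the 6dB-per-bit rule of thumb.
for bits in (1, 16, 24):
    rule_of_thumb = 6 * bits
    print(f"{bits:2d}-bit: {theoretical_snr_db(bits):.2f}dB (rule of thumb: {rule_of_thumb}dB)")
```

With unrounded coefficients the 16-bit figure comes out at 98.09dB, a hair above the 98.08dB you get from the rounded 1.76 and 6.02 values.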
It is the nature of digital audio that when we digitize a signal we must unavoidably incur a quantization error. This is the difference between the actual instantaneous value of the signal and the nearest quantization level, which is what we actually record. Sometimes we are going to round up to the nearest quantization level, and sometimes down. In any case, the magnitude of the quantization error will always lie between zero and one-half of the Least Significant Bit (LSB). It is the analysis of this quantization error that gives rise to the equation alluded to in the previous paragraph. The takeaway here is that any attempt to digitally encode a real-world signal must give rise to a background noise floor of quantization noise, and we can predict what its level ought to be in any given system.
These days anybody can download a high-quality, free, audio analysis program such as Audacity, which is available for both Mac and Windows. We can use that to perform a frequency analysis of any music track we like. Suppose you obtain a 16-bit test track with a single tone at some arbitrary frequency (1kHz is often chosen) at full scale (0dB) and plot its frequency spectrum. What do you expect to see?
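If you would rather see this than take it on trust, the following Python sketch rounds a full-scale sine wave to the nearest quantization level and measures the resulting error directly. Everything in it is an assumption made for illustration: an idealized quantizer with no clipping and no dither, a full-scale range of -1.0 to +1.0, and a 997Hz tone chosen so that the samples do not repeat within one second at 44.1kHz.

```python
import math

def measure_quantization(bits=16, sample_rate=44100, freq=997.0, seconds=1.0):
    """Quantize a full-scale sine and return (SNR in dB, worst error in LSBs)."""
    step = 2.0 / (2 ** bits)   # one LSB when full scale spans -1.0 to +1.0
    signal_power = 0.0
    error_power = 0.0
    worst_error = 0.0
    for n in range(int(sample_rate * seconds)):
        x = math.sin(2 * math.pi * freq * n / sample_rate)
        q = round(x / step) * step   # nearest quantization level
        err = q - x                  # the quantization error for this sample
        signal_power += x * x
        error_power += err * err
        worst_error = max(worst_error, abs(err))
    snr_db = 10 * math.log10(signal_power / error_power)
    return snr_db, worst_error / step

snr_db, worst_lsb = measure_quantization()
print(f"measured SNR: {snr_db:.2f}dB, worst error: {worst_lsb:.3f} LSB")
```

The worst-case error should never exceed half an LSB, and the measured SNR should land very close to the theoretical figure of roughly 98dB.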
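If you don’t have such a test track to hand, one can be generated with nothing more than Python’s standard library. This is only a sketch: the function name, the five-second default duration, and the file name in the usage note are my own choices, and no dither is applied.

```python
import io
import math
import struct
import wave

def write_test_tone(target, freq=1000.0, seconds=5.0, sample_rate=44100):
    """Write a mono 16-bit full-scale sine tone to a file path or file-like object."""
    count = int(sample_rate * seconds)
    peak = 32767  # largest positive value a signed 16-bit sample can hold
    samples = [
        round(peak * math.sin(2 * math.pi * freq * n / sample_rate))
        for n in range(count)
    ]
    with wave.open(target, "wb") as wav:
        wav.setnchannels(1)            # mono
        wav.setsampwidth(2)            # 2 bytes per sample = 16 bits
        wav.setframerate(sample_rate)
        wav.writeframes(struct.pack(f"<{count}h", *samples))
    return count
```

Calling `write_test_tone("tone_1khz.wav")` (a hypothetical file name) produces a file that can be opened in Audacity and analyzed in the way described here.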
It is easy to imagine that the answer would be a continuous background at -98dB, with a single spike at 1kHz going up to 0dB. But that is not what we see. Instead, the background will be a lot lower than -98dB, and we will have a few additional low level peaks that we can’t explain. For the purposes of this post I am not interested in the low level peaks. These are all related to the non-random nature of quantization noise; they can be eliminated by adding dither noise, and we can conveniently ignore them right now. Aside from that, what is happening? Why is the noise floor quite a long way below -98dB?
The answer is that it is the sum total of all of the quantization noise which amounts to -98dB. But that noise is distributed pretty much equally among all the different frequencies between zero and one-half of the sample rate. When we plot the frequency analysis using Audacity we see how that noise is distributed within the frequency space. Each fragment of noise at each frequency is well below -98dB, but taken together they will all add up more or less to -98dB. The next question is a little trickier. If the background noise on a frequency analysis is actually lower than -98dB, just how low should it be? And can we do anything useful with it?
The answer to this is not as cut and dried as you might like it to be. Remember that the noise is actually divided out, more or less evenly, among all of the different frequencies. Every point on that frequency analysis curve is NOT simply a measure of the noise at that particular frequency. Rather, it is a measure of the sum total of the noise at all frequencies in the immediate vicinity of that frequency. When we do the frequency analysis, we can stipulate how many frequencies we want to divide the audio band into. The more frequencies we choose, the fewer frequencies will lie within the immediate vicinity of each point, and the lower the noise floor will appear. So the apparent level of the noise floor will depend on the resolution with which we want to plot it. Surely that can’t make sense?
But it does make perfect sense, and here is why. The frequency plot is the outcome of a Fourier Analysis, which takes a chunk of raw audio data and analyzes its frequency content. The number of frequencies it spits out is equal to one half of the number of audio samples that it analyzes. So if you want to increase the number of frequencies you have to increase the number of audio samples, which means you have to analyze a chunk of music of a longer duration. For example, if I perform a Fourier analysis with 16,384 samples I can reduce the noise floor by more than 20dB. But at a 44.1kHz sample rate, 16,384 samples is about 0.37 seconds of music, which is more than one third of a second, and this is important.
Let’s go back to the notion of the noise floor actually being well below the quantization noise limit, which is in principle the lowest level signal that can be digitally encoded. If it is, for example, 20dB below that limit, it implies that I should be able to encode a signal that is at a level approaching 20dB below the quantization limit. And this is quite correct; I am indeed able to do that. But there are swings and roundabouts to be negotiated. If I need one third of a second of music to get the noise floor down below the level of the sub-quantization signal that I want to encode, then it also follows that that signal must persist for a full third of a second in order for it to appear above that noise. So, to the extent that I can actually make this happen, it is a pure party trick and has no practical value. The constituent parts of real music signals that exist below the -98dB quantization limit of 16-bit audio are not pure tones of extended duration.
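The trade-off between resolution and duration can be put into numbers with a short Python sketch. It assumes an idealized model in which the noise is perfectly white and is shared equally among all of the frequency bins; a real analyzer such as Audacity applies a window function and its own scaling, so the absolute levels it shows will differ. The function names are my own.

```python
import math

def bin_floor_drop_db(fft_size):
    """How far below the total noise level each bin sits, assuming white noise."""
    bins = fft_size // 2           # the analysis yields half as many frequencies as samples
    return 10 * math.log10(bins)   # total noise power shared equally among the bins

def chunk_duration_s(fft_size, sample_rate=44100):
    """Length of audio, in seconds, needed to fill one analysis chunk."""
    return fft_size / sample_rate

for size in (1024, 4096, 16384):
    print(f"{size:6d} samples: floor drops {bin_floor_drop_db(size):.1f}dB, "
          f"needs {chunk_duration_s(size) * 1000:.0f}ms of audio")
```

In this model every doubling of the number of samples lowers the apparent floor by about 3dB and doubles the length of audio required. At 16,384 samples each bin sits roughly 39dB below the total, which is comfortably "more than 20dB".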
But the story doesn’t end there. Recall that I said that the quantization noise is more or less divided equally among all the frequencies below one-half of the sample rate. Well, it needn’t be. It is possible to skew the distribution such that more of it goes into some frequencies than others. This is only useful if there are “unused” frequencies available where there is no signal content, which we can filter out later. We can then reduce the amount of quantization noise at the frequencies of interest, at the expense of increasing it at the “unused” frequencies. With a sample rate of 44.1kHz, though, there are no such “unused” frequencies. The 44.1kHz sample rate was devised precisely because all of the audio frequency band (as we understood it at the time) was contained neatly within its encodable bandwidth. To address this we would need to increase the sample rate quite considerably so that a whole new range of “unused” high frequencies becomes accessible. We can then pull some of the quantization noise out of the audio frequencies and put it into those high frequencies instead. This process is called “Noise Shaping”, and what is really interesting about it is that the noise shaping process itself can also be used to back-fill usable “signal” into the newly-created gap between the quantization noise limit and the locally reduced noise floor.
Taken to its limits, this process can become very interesting. If we reduce the bit depth all the way down to 1 bit, the quantization noise itself becomes massive, at -7.78dB. But by increasing the sample rate all the way up to 2.8MHz, we can create enough “unused” frequency space that we can “shape” an additional 110dB (or more) of noise out of the audio bandwidth and stick it where the sun don’t shine. Sound familiar?
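To make the idea concrete, here is a minimal Python sketch of the simplest noise shaper I know of, a first-order error-feedback quantizer. It is an illustration under stated assumptions and not how any particular converter does it: the coarse step size, the 16x oversampling ratio, the seeded dither (added so that the error behaves like random noise), and all of the names are my own choices. It measures how much quantization noise lands in the low-frequency "band of interest" with and without shaping.

```python
import cmath
import math
import random

def quantize(value, step):
    """Round a value to the nearest multiple of the quantization step."""
    return round(value / step) * step

def inband_power(signal, num_bins):
    """Mean power in DFT bins 1..num_bins (the low-frequency band of interest)."""
    n = len(signal)
    total = 0.0
    for k in range(1, num_bins + 1):
        acc = sum(s * cmath.exp(-2j * math.pi * k * i / n) for i, s in enumerate(signal))
        total += abs(acc) ** 2
    return 2.0 * total / (n * n)   # factor of 2 counts the mirrored negative frequencies

def run_demo(n=2048, osr=16, step=1 / 128, seed=1):
    """Return in-band noise power for (plain, noise-shaped) quantization."""
    rng = random.Random(seed)
    tone = [0.5 * math.sin(2 * math.pi * 5 * i / n) for i in range(n)]
    plain_err, shaped_err = [], []
    prev_err = 0.0
    for sample in tone:
        dither = (rng.random() - 0.5) * step
        # Plain quantizer: the error is spread evenly across all frequencies.
        plain_err.append(quantize(sample + dither, step) - sample)
        # Error feedback: subtract the previous sample's error before quantizing,
        # so the output error becomes e[n] - e[n-1], which is small at low frequencies.
        target = sample - prev_err
        out = quantize(target + dither, step)
        prev_err = out - target
        shaped_err.append(out - sample)
    bins = n // (2 * osr)          # the band of interest is the lowest 1/osr of the spectrum
    return inband_power(plain_err, bins), inband_power(shaped_err, bins)

plain, shaped = run_demo()
print(f"in-band noise reduced by {10 * math.log10(plain / shaped):.1f}dB")
```

The shaped version carries more noise in total, but most of it has been pushed up into the high frequencies, so the noise left in the band of interest should come out well over 10dB lower. Higher-order shapers and higher oversampling ratios push it down much further.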
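It is worth separating out how much of this comes from the higher sample rate alone. The sketch below assumes the 2.8MHz figure is exactly 64 times 44.1kHz (2.8224MHz), and that un-shaped quantization noise is spread evenly across the whole widened band; both are my assumptions, as are the names.

```python
import math

def one_bit_snr_db():
    """Theoretical SNR of 1-bit linear PCM: 6.02dB per bit plus 1.76dB."""
    return 20 * math.log10(2) + 10 * math.log10(1.5)

def oversampling_gain_db(sample_rate, base_rate=44100):
    """In-band noise reduction from spreading white noise over a wider band."""
    return 10 * math.log10(sample_rate / base_rate)

assumed_rate = 64 * 44100   # 2,822,400Hz, assumed to be the "2.8MHz" in question
print(f"1-bit SNR: {one_bit_snr_db():.2f}dB")
print(f"gain from oversampling alone: {oversampling_gain_db(assumed_rate):.2f}dB")
```

Oversampling on its own only buys about 18dB in this model, so nearly all of the improvement has to come from the shaping itself.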