Friday 27 September 2013

Intern Opportunity

BitPerfect is looking to take on an Intern to work on one of our upcoming projects. We need somebody who can conceptualize and design user interfaces both on the Mac and on the iPad.  Programming expertise is not a requirement. The chosen candidate can work remotely from wherever they are located in the world.

Anybody interested is requested to e-mail me directly.

BitPerfect v1.0.8 Released!

BitPerfect version 1.0.8 is now available on Apple's Macintosh App Store.  It is a free upgrade for all existing BitPerfect users.  Version 1.0.8 is a maintenance release.

Version 1.0.8 fixes the looping/skipping problem that has been plaguing certain users with iTunes 11.x.

It is ready to support the new OS X 10.9 (Mavericks), which is expected to be announced soon.

Wednesday 25 September 2013

Not coming out

It was twenty years ago today … actually, no, it was thirty-nine years ago.  I visited my friend Paul Leech at his parents' house.  We were both 19 at the time.  He invited me up to his bedroom and introduced me to something that had an instant impact, and has been an important part of my life ever since.  No, we're not coming out - that would be as much of a shock to his wife as it would be to mine!

You see, Paul kept a primitive old gramophone in his bedroom.  That way his parents did not have to listen to his records.  Paul had just bought two new LPs, and was anxious to play them for me.  I am mighty glad he did.

First up was Eric Clapton's 461 Ocean Boulevard.  In 1974, Clapton was emerging from a period of heroin addiction, and this album was also to mark his emergence as a solo artist.  Despite the immediate and lasting impression it made on a pair of 19-year-olds, 461 was not well received by the critics of the day.  Most expressed disappointment at some level, but in general one senses that they were not ready to accept Clapton's new songwriting-based aesthetic and a more textured and phrasing-based guitar style, perhaps expecting an offering more in line with his guitar hero reputation.  One of its interesting aspects is that it is mainly an album of cover versions, Clapton contributing only three tracks.  His fellow musicians were a mostly unremarkable crew, although bassist Carl Radle did appear with him in Derek and the Dominos.  Today, 461 Ocean Boulevard is more broadly accepted as a masterpiece.

Those in search of a definitive digital version need to search out the Japanese SHM-SACD remastered version.  But you'll need (i) an SACD player, and (ii) somewhere between $60 and $80 burning a hole in your pocket.  The SHM-SACD is finely textured, dynamic, and beautifully captures Carl Radle's deepest and tastiest bass lines and Jamie Oldaker's precise, yet pounding drums.

By contrast, Supertramp's Crime of the Century is a Proggy classic, cut from the same cloth as Dark Side of the Moon.  Supertramp had been a pretty unremarkable band up until that time, with a changing line-up.  Although Breakfast in America has proven their most commercially successful album, Crime of the Century is their true Magnum Opus.  Although not constructed as a concept album, it plays that way with themes of loneliness, uncertainty, and mental instability.  It gives a good impression of being a cohesive story about the life of a young outcast.  Like the aforementioned DSOTM, Crime demands to be played at ear-splitting volume, under the influence of whatever it takes to chill you out.  Supertramp was already splitting apart at the seams when Crime was being recorded, and to this writer's ears, they came close to, but never managed to scale, the same creative peaks again.  They still tour, and are a great band to go see.

Unlike 461 Ocean Boulevard, no native high-resolution digital remaster of Crime of the Century has ever been released.  None of the CD releases are worth searching out, and your best bet is to seek out a good quality vinyl rip ('needle drop') from the nether regions of the internet.  Unless, of course, you have the wherewithal to track down and play the "Speakers Corner" 180g audiophile-grade LP.

So there you have it. Two albums which have remained near the top of my all-time favourites list for my entire adult life.  Thanks, Paul!

Friday 20 September 2013

Dither - some Hard Data.

Some of you are apparently quite skeptical about the claims I made regarding dither - even regarding the most basic claims.  This skepticism, it seems to me, is born mostly out of some false assumptions.  So I think this post might be useful in clearing those up.

I have posted below (sorry, but I cannot find a way to intersperse them throughout the text) some graphs I have prepared showing a selection of Fourier Transforms I put together.  What I did was to produce some WAV files containing a 0dB pure tone at a frequency of 344Hz, and added various levels of dither.  The files are all 16-bit, 44.1kHz, standard CD resolution.  The choice of frequency is very important, and I will come back to that later.  The important point is that it was chosen to enable me to illustrate the point I am going to make with the most clarity, and not to mislead.

Skip this paragraph if technical disclosure details bore you.  At BitPerfect we have developed our own audio analyzer, which has unusually high resolution.  Unfortunately, this analyzer is not available to you, so you cannot use it to confirm my results.  Also, its graphical output is somewhat utilitarian to say the least, and does not lend itself to illustrating nice on-line posts.  The FFTs I am showing were instead created using Audacity, which is a free program you can download yourself if you are interested.  For those who want to do just that, the FFTs were done with 16,384 samples using a Blackman-Harris window function (number of terms not specified), and exported to text files for graphing in MS Excel.

The first graph is labelled "Undithered".  This, as its name suggests, is a pure tone with no dithering applied.  I want to draw your attention to two features.

First, the background noise level is down at something like -180dB.  Many of you will be wondering how a 16-bit audio file can have a background noise level so low that it would take 30 bits to encode it.  Good question.  The answer is that the file itself contains no information whatsoever at those frequencies.  The correct value for background noise level would actually be minus infinity.  What you are seeing instead is the fact that the mathematics of the FFT are done by a computer whose calculations are performed using numbers with finite precision.  The thousands of calculations that go into producing every result each carry forward a tiny rounding error, and the accumulation of those errors is the result you see.  The background noise represents the mathematical limit of Audacity's FFT algorithm.  By comparison, BitPerfect's own analyzer has a background noise level which is 100dB lower!

Second, there are a whole bunch of spikes starting at about 1kHz, and stretching out up to 22.05kHz.  This, you might think, is not unusual.  These are the harmonics of the 344Hz base tone (the 2nd harmonic, and one or two others, are missing), and represent the harmonic distortion spectrum of the QE (Quantization Error).  It is interesting to note that the QE here comprises 100% harmonic components and no measurable un-correlated noise.  The QE can therefore be seen to be entirely a distortion-based problem.  The highest peaks are at -100dB, and most of the rest are between -100dB and -120dB.  The point has previously been made that Harmonic Distortion is more objectionable to the human ear than Noise, so I won't belabour that one.  However, this graph also lays bare the falsehood that 16-bit data encodes nothing below the -96dB level of 16-bit resolution.  For sure it cannot encode a SIGNAL at those low levels, but it certainly can encode the unwanted consequences of quantizing at the 16-bit level.

Time to expand on the carefully chosen frequency, 344Hz.  The period of the tone is an integer number of samples at 44.1kHz.  128 samples to be exact (strictly, a 128-sample period corresponds to 44,100/128 ≈ 344.5Hz, nominally 344Hz).  This is chosen precisely because at such frequencies the QE does indeed totally comprise harmonic distortion.  As we move away from these frequencies it becomes more of a mix of distortion-like components plus true noise.  I could likewise choose specific frequencies which have the property that the QE comprises 100% noise and no distortion-like components.  But I have chosen a frequency that best illustrates my point, and you need to be aware that not all frequencies behave the same way.
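For anyone who wants to check the periodicity claim without an FFT, here is a minimal sketch.  It assumes a full-scale sine rounded to the nearest 16-bit integer (my stand-in for the undithered test file, not the actual file) and simply confirms that the rounding error in every cycle is a copy of the error in the first cycle.  An error signal that repeats exactly once per cycle can only contain harmonics of the tone.

```python
import math

SAMPLE_RATE = 44100   # CD sample rate in Hz
PERIOD = 128          # samples per cycle, as stated in the text
FULL_SCALE = 32767    # assumed peak value of a 0dB tone in 16-bit audio

def quantization_error(n_samples):
    """Return the rounding error (in LSBs) for each sample of the tone."""
    errors = []
    for n in range(n_samples):
        exact = FULL_SCALE * math.sin(2 * math.pi * n / PERIOD)
        errors.append(round(exact) - exact)
    return errors

errors = quantization_error(4 * PERIOD)

# Compare each sample's error with the error one full cycle later.
max_difference = max(
    abs(errors[n] - errors[n + PERIOD]) for n in range(3 * PERIOD)
)

print(SAMPLE_RATE / PERIOD)  # exact frequency of a 128-sample period, in Hz
print(max_difference)        # effectively zero: the error repeats every cycle
```

Change PERIOD to a value such as 128.2 and the error no longer repeats from cycle to cycle, which is the "mix of distortion plus noise" case.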

In a moment, I will ask you to take a look at the second graph, entitled "LSB dither".  This is the exact same data, but this time with a dither signal added.  This dither is TPDF dither, with a peak-to-peak amplitude of one LSB (Least Significant Bit).  The LSB is the separation between individual quantization steps.  In other words, the magnitude of the dither is ±0.5 of one LSB, which means it is pretty much of the same magnitude as the QE itself.  Before you look at the graph, I would challenge you to ask yourself what you expect to see.  What do you think it will do to the background noise level?  And what do you think it will do to the QE distortion peaks?  OK, time to take a look.
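If you have never seen TPDF dither generated, a minimal sketch follows.  It assumes the textbook construction (the difference of two independent uniform random numbers) and is not taken from BitPerfect's code; the function name and the seed are just for illustration.

```python
import random

def tpdf_dither(peak_to_peak_lsb, rng):
    """Return one TPDF dither value, in LSBs.

    The difference of two independent uniform values has a triangular
    distribution; the scale factor sets the peak-to-peak span.
    """
    return (rng.random() - rng.random()) * peak_to_peak_lsb / 2.0

rng = random.Random(1)  # fixed seed so the run is repeatable
values = [tpdf_dither(1.0, rng) for _ in range(100_000)]

mean = sum(values) / len(values)
variance = sum((v - mean) ** 2 for v in values) / len(values)

print(min(values), max(values))  # stays inside -0.5 .. +0.5 LSB
print(variance)                  # theory: peak_to_peak**2 / 24
```

Small values are far more likely than values near the ±0.5 LSB extremes, which is what gives the distribution its triangular shape.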

The first obvious thing is that the background noise has increased to about -125dB.  This is real now, and is no longer an artefact of the FFT algorithm.  The thoughtful ones among you will immediately ask why the noise floor is not nearer to -96dB.  After all, this is roughly the magnitude of the dither signal we added in.  The answer is quite simple.  The Total Noise Power that we added may well be closer to -96dB, but that noise is spread out over all of the possible frequencies (0 - 22.05kHz).  The portion of the total dither noise found in each of the frequency bins is correspondingly lower.  Hence the -125dB noise floor.
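The spreading argument can be put into rough numbers.  The sketch below is an idealized estimate only: it assumes perfectly white noise shared equally among the bins between 0Hz and 22.05kHz, and it ignores the window function and whatever scaling and averaging Audacity applies.  For that reason it will not land exactly on the figure read off the graph, but it shows why the per-bin floor sits tens of dB below the total.

```python
import math

def per_bin_level_db(total_noise_db, fft_size):
    """Idealized level of white noise in a single FFT bin.

    Assumes the noise power is shared equally among the fft_size / 2
    bins below the Nyquist frequency, with no window correction.
    """
    n_bins = fft_size // 2
    return total_noise_db - 10 * math.log10(n_bins)

floor_16k = per_bin_level_db(-96.0, 16384)
floor_32k = per_bin_level_db(-96.0, 32768)

print(floor_16k)              # per-bin estimate for a 16,384-point FFT
print(floor_16k - floor_32k)  # each doubling of FFT size lowers it by about 3dB
```

The second number is the useful one: the noise floor you see in an FFT plot depends on the FFT size as well as on the noise itself.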

The next thing that you will notice is that the QE distortion peaks are still visible.  This shows that this amount of dither is not enough to entirely eliminate the distortion components.  However, if you look very closely, you will see that the magnitude of the distortion peaks has gone down by a decent 10-20dB.  So our dithering has at least reduced the existing distortion, and not just masked it.  That's a pretty interesting result.  What is happening here is that the dither is so small that for many of the samples it fails to actually change the QE.  Therefore the total QE is a mixture of enough unchanged values to still encode a (reduced) amount of harmonic distortion, and some dithered values which encode noise.  The magnitude of the QE distortion peaks falls, and the noise floor rises.

Now we'll increase the amount of dither to ± one LSB.  Will that manage to completely eliminate the QE distortion peaks?  What do you think?  Please look at the graph labelled "2LSB dither".  Compared to the previous graph, the background noise level has gone up by about 3dB.  But this time all of the QE distortion peaks have been completely eliminated.  The highest peaks have been suppressed by over 20dB compared to the undithered result.
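Here is a minimal sketch of the same experiment in code, under my own assumptions: a full-scale tone with a 128-sample period, a 4,096-sample analysis length, and a plain single-bin DFT rather than Audacity's windowed FFT.  Instead of plotting, it measures what fraction of the total error power sits exactly on the harmonics of the tone.

```python
import cmath
import math
import random

PERIOD = 128        # samples per cycle of the test tone
N = 4096            # analysis length: a whole number of cycles
FULL_SCALE = 32767  # assumed peak value of a 0dB 16-bit tone

def error_signal(dither_peak_to_peak, rng):
    """Quantize the tone to integers and return (output - exact input)."""
    errors = []
    for n in range(N):
        exact = FULL_SCALE * math.sin(2 * math.pi * n / PERIOD)
        dither = (rng.random() - rng.random()) * dither_peak_to_peak / 2.0
        errors.append(round(exact + dither) - exact)
    return errors

def harmonic_fraction(errors):
    """Fraction of the error power that sits on harmonics 1..63."""
    total = sum(e * e for e in errors)
    on_harmonics = 0.0
    for h in range(1, PERIOD // 2):
        k = h * N // PERIOD  # DFT bin that holds harmonic h
        bin_value = sum(
            e * cmath.exp(-2j * math.pi * k * n / N)
            for n, e in enumerate(errors)
        )
        # The factor 2 counts the mirror-image negative-frequency bin.
        on_harmonics += 2 * abs(bin_value) ** 2 / N
    return on_harmonics / total

rng = random.Random(2)
undithered = harmonic_fraction(error_signal(0.0, rng))
dithered = harmonic_fraction(error_signal(2.0, rng))

print(undithered)  # close to 1: the error is all harmonic distortion
print(dithered)    # small: the error is spread across every bin as noise
```

With ±1 LSB of TPDF dither, the harmonic bins hold no more than their fair share of a flat noise spectrum, which is the code equivalent of the peaks disappearing from the graph.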

What happens if we increase the dither even further?  I have added further graphs to illustrate that.  Again, before you look, ask yourself what you expect to see.  These graphs are labelled "3LSB dither" and "4LSB dither".  If you predicted that the background noise level would go up, and the QE distortion peaks would remain totally suppressed, well done.  You have learned well, Grasshopper.

Finally, just to make life easier I have added two final summary graphs.  "Combo" superimposes all the curves onto one graph to make comparisons easier, and "Zoom Combo" zooms in on the area between 15kHz and 20kHz and shows perfectly how the 2LSB (±1LSB) dither has completely suppressed the QE distortion peaks at the cost of very little additional noise.

That was a relatively simplistic analysis, but I think it gets the point across quite well.  Quantization Error can add distortion, and dithering can really show it the door.  It is all very clever stuff.  Just don't read anything more into it than I have tried to show you.  There are many more complexities lurking in the mathematical murk to trip up anyone who wants to use this type of data to make glib generalizations regarding the bigger picture.

Thursday 19 September 2013

iTunes 11.1

iTunes 11.1 is apparently available for download, although it is still refusing to show up in my App Store updates list.  However, we have been using the pre-release beta for several weeks now and have encountered no new problems.  I think BitPerfect users are safe to proceed with the update.

Signal-to-Noise Ratio and the Cocktail Party

I want to go over some potentially obvious stuff mainly because I want it to set the table for tomorrow's post.  You have all heard of the old chestnut where, if you focus clearly, it is often possible to pick out one individual conversation from among the hubbub of a noisy cocktail party.  There may be a hundred people all talking at the same volume.  Together, this forms the noise, and it sounds like we have a hundred times more Noise than the Signal which we are trying to extract from it.  Clearly, the Noise overwhelms the Signal.  Yet most of us have already performed this social experiment, so we know it is not that hard to do.  What, then, is going on here?

To understand this, we need to go back to the concept of noise.  What exactly is Noise, and what makes it different from Signal?  Basically, noise occurs whenever what we are observing appears to be random.  Consider a sequence of random numbers.  What makes them random is that we can discern no pattern or sequence within them, regardless of the level of analytical sophistication, whether real or hypothetical, that we can bring to bear upon them.  If any such pattern can be established, then the numbers are no longer random.  Actually, generating truly random numbers is an astonishingly challenging task, as any expert in cryptography will tell you.  In audio, if the signal - whether an analog signal or a digital representation thereof - is totally random, then it consists entirely of noise.

Having said that, there are many different flavours of random.  For example, we can generate a sequence of random numbers that lie between 0 and 1.  Or between -10 and +10.  The other interesting thing is that we can generate random numbers where all the different numbers do not actually have the same chance of appearing.  But is that random, you ask?  Yes it is, and here is an experiment you can do yourself.  Toss two coins (we will assume this to be a truly random process).  Repeat this as often as you like and make a tally of the outcomes.  Two heads or two tails will each appear about a quarter of the time.  But the combination of a head and a tail will appear about half the time.  In audio, the equivalent is what we call Noise Colours.  The noise signal itself may be random, but its frequency content can have any distribution that we like.  For example, White Noise has equal power at all frequencies, whereas Pink Noise has progressively less power the higher the frequency goes.
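If you would rather not toss real coins, the experiment can be simulated.  This is a minimal sketch that assumes Python's pseudo-random generator is a good enough stand-in for a fair coin; the seed and the number of trials are arbitrary.

```python
import random
from collections import Counter

def toss_two_coins(n_trials, rng):
    """Tally how many heads (0, 1 or 2) appear in each two-coin toss."""
    tally = Counter()
    for _ in range(n_trials):
        heads = sum(rng.choice((0, 1)) for _ in range(2))
        tally[heads] += 1
    return tally

N_TRIALS = 100_000
rng = random.Random(3)
tally = toss_two_coins(N_TRIALS, rng)

for heads in (0, 1, 2):
    # Roughly a quarter, a half and a quarter of the trials.
    print(heads, tally[heads] / N_TRIALS)
```

Every individual toss is random, yet the outcomes are not equally likely.  Incidentally, adding two independent random values together is also how a triangular (TPDF) distribution arises.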

A signal at a certain frequency can only be separated from the noise if its magnitude is higher than the magnitude of the fraction of the noise which is at that frequency.  Let's go back to the cocktail party.  A hundred people are talking, all at the same volume.  But you are only interested in your boss, who is talking to the company chairman.  You can hear him discussing his thoughts on the new Vice President, an appointment that everyone expects to be announced soon.  Your boss's voice is like one frequency component in an audio spectrum, where all the other people's voices represent other frequency components.  By concentrating only on your boss's frequency, you can tune out all the other frequencies and listen in on his conversation.  Provided, that is, his voice stays above the residual background noise at that frequency.  So, finally the boss leans forward, and lowering his voice, tells the chairman who the new Vice President will be.  But - dammit all! - by lowering his voice, he has reduced it below the level of the residual background noise.  And you can no longer make out what he says.  But at least there is a lesson to take away.  When the signal drops below the overall noise level, it is still possible to recover it.  But when it drops below the level of that component of the noise which is at the frequency of the signal, then it is irretrievably lost.  If the signal is fainter than the noise at its own frequency, it simply means that what you are listening to is indistinguishable from being random.  Your only option is to change the way you measure the signal.

So how do we know what the level of the signal is at a particular frequency, and how do we know what the background noise is?  The mathematical tool we use to analyze the frequency content of a signal is the Fourier Transform.  It is called a Transform, because the original audio data is transformed into something that bears no immediately obvious resemblance to it, and yet contains all of the information necessary to enable it to be transformed back into the exact original data.  If you want to see what the math looks like, look it up on Wikipedia!  The Fourier Transform of an audio signal turns out to be a representation of the frequency content of the audio signal.  It is a mathematically exact representation.  If there is any frequency information that cannot be precisely extracted from the Fourier Transform, this is simply because that information does not actually exist in the original signal.  Conversely, if you see something in the Fourier Transform, then, whether you like it or not, that means it is also in the original signal.

Taking our noisy cocktail party analogy, we can see what is necessary for us to identify a signal within a noisy environment.  We have to strip everything away that we can identify as not being part of the bit of the signal we are interested in, and focus just on those aspects of the data that could actually be the signal.  Provided we limit our thinking to the frequency domain, we can think this through quite nicely.  Within the data, we will be able to identify the presence of a signal at a certain frequency, but only if the magnitude of the signal is higher than the magnitude of all of the noise that is within a narrow band of frequencies surrounding the one we are looking for.  And we can use a Fourier Transform to see whether that is in fact the case.
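The cocktail party can be sketched in a few lines.  The numbers here are my own assumptions, chosen only for illustration: white Gaussian noise, a tone whose total power is about 17dB below that of the noise, and a single-bin Fourier Transform evaluated at the tone's frequency and at its twenty nearest neighbours.

```python
import cmath
import math
import random

N = 8192              # number of samples analysed
TONE_BIN = 400        # tone frequency, expressed as a DFT bin number
TONE_AMPLITUDE = 0.2  # the "boss", much quieter than the room
NOISE_STD = 1.0       # the "hubbub"

def bin_magnitude(samples, k):
    """Magnitude of one Fourier Transform bin, computed directly."""
    n_total = len(samples)
    return abs(sum(
        x * cmath.exp(-2j * math.pi * k * n / n_total)
        for n, x in enumerate(samples)
    ))

rng = random.Random(4)
samples = [
    TONE_AMPLITUDE * math.sin(2 * math.pi * TONE_BIN * n / N)
    + rng.gauss(0.0, NOISE_STD)
    for n in range(N)
]

# Tone power relative to the total noise power: well below 0dB.
overall_snr_db = 10 * math.log10((TONE_AMPLITUDE ** 2 / 2) / NOISE_STD ** 2)

tone_level = bin_magnitude(samples, TONE_BIN)
neighbours = [
    bin_magnitude(samples, TONE_BIN + offset)
    for offset in range(-10, 11) if offset != 0
]
neighbour_level = sum(neighbours) / len(neighbours)

print(overall_snr_db)                # negative: the noise swamps the tone overall
print(tone_level / neighbour_level)  # yet the tone's bin towers over its neighbours
```

Reduce TONE_AMPLITUDE far enough and the tone's bin sinks into its neighbours, which is the moment the boss lowers his voice.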

It sounds like a whole load of stuff and nonsense, but tomorrow we'll look at a practical example.

Sunday 15 September 2013

For Sale - Stello U3


We are selling our Stello U3 USB-to-SPDIF converter.  We bought it for our evaluation bench a little over 18 months ago.  We have been using the U3 whenever we evaluate a new DAC which supports a more limited range of sample rates over USB than over S/PDIF.  These days, there is no reason for anybody to do that any more.

The Stello U3 has proven itself to be at least the equal of anything similar we have evaluated at up to three times the price, and in conjunction with DAC hardware up to and including the Light Harmonic Da Vinci.  If you have a legacy DAC which is designed to give its best performance over an AES/EBU or Coaxial S/PDIF interface, you should be using a device such as the Stello U3.  The Stello connects to the Computer's USB output, and delivers the audio signal via your choice of AES/EBU or Coaxial S/PDIF.

Our U3 unit is in mint condition.  It only comes out of its box when needed for testing purposes.  We are offering it for sale in its original packaging for US$300, plus shipping.  (We are located in Montreal, Canada, so you can estimate shipping costs accordingly.)

If you are interested, please e-mail me.

Buy it new from here.

Read a review here.

Tuesday 10 September 2013

Everything You Always Wanted To Know About Dither (But Were Too Afraid To Ask)

Many digital audio Apps - BitPerfect included - have the ability to apply dither to the audio signal.  Many of those Apps provide a great deal of control over the type of dither employed, and at what stage it is added.  BitPerfect does not.  The reason is that when dither is applied it needs to be applied for very good reasons, in circumstances that indicate a requirement for dither, and using an appropriate choice of dithering algorithm.  BitPerfect allows you to choose between two types of dither - an unidentified algorithm provided by CoreAudio, and a BitPerfect-implemented TPDF (Triangular Probability Density Function) dither.  BitPerfect then decides when - and if - this dither should be applied.  It is done this way because experience shows very clearly that users of those Apps which offer in-depth user control over dithering routinely exercise that control unwisely.

So here is a brief tutorial on dither.  What it is, why we do it, and how it works.

Dither has its roots in dropping bombs during WWII.  Elaborate mechanical contraptions were devised to enable the bombardiers to aim their bombs as accurately as possible when dropping them from bombers while getting shot at.  In the safe confines of the engineering lab, the engineers could not get these devices to work with sufficient accuracy.  But wartime needs being expedient, they were installed into bombers anyway and pressed into service, where, much to the surprise of the engineers, they proved to be far more accurate than expected.  It turned out that the elaborate mechanisms were rather "sticky" in operation, and when installed on a bomber, the immense constant vibration jogged the mechanisms out of their "sticky" positions and caused them to function properly.  Those of you who still like to knock on an old analog meter before taking a reading are doing exactly the same thing.  Back in the lab, the engineers mounted the bombing aids onto a vibrating table, and all of a sudden were able to replicate the excellent in-service performance.  They termed this forced vibration "dither".

Dither is now a well-used art in digital signal processing, in video as well as audio.  I am going to focus on one particular aspect - the most important of the audio applications.  When a signal is digitized (and I will use the term 'quantized' from here on in), it is in effect assigned a very specific value, which, as often as not, is a measure of the amplitude of a voltage.  Because quantization means assigning the magnitude of the voltage to one of a limited number of fixed levels, it is inevitably the case that there is some residual error between the actual value of the voltage and the stored quantized value.  This error is called the quantization error, and what is typically done is to choose the quantization level which is closest to the actual value of the voltage, thereby minimizing the quantization error.  Are you with me so far?
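A minimal sketch of that rounding step follows.  It assumes a full-scale range of ±1.0 divided into 16-bit steps; the voltages are made-up values for illustration.

```python
def quantize(voltage, step):
    """Return the index of the quantization level nearest to the voltage."""
    return round(voltage / step)

STEP = 1.0 / 32768  # one LSB, assuming full scale is +/-1.0 at 16 bits

voltages = [0.1234567, -0.5000123, 0.9999, 0.00001]
errors_in_lsb = []
for v in voltages:
    level = quantize(v, STEP)
    # Quantization error: stored level minus the true value, in LSBs.
    errors_in_lsb.append(level - v / STEP)
    print(level, errors_in_lsb[-1])
```

Choosing the nearest level guarantees that the error never exceeds half an LSB in either direction.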

It turns out that minimizing the quantization error is not necessarily the best way to go.  This method produces a quantization error signal that correlates quite well with the original signal.  In plain English, this means that the quantization error signal looks more like distortion than it does noise.  And we know that the human ear is far more tolerant of noise than it is of distortion.  But let's stop to think about this.  Done this way, the quantization error has a magnitude which is never more than one half of the magnitude of the least significant bit.  So it will only correlate with the original signal (and therefore produce distortion) if the original signal is sufficiently clean that when looked at with a magnifying glass that sees all the way down to the level of the least significant bit, the signal contains no additional noise.  But if the original signal does contain noise, and if the noise is of a magnitude that swamps the least significant bit, then the quantization error can only correlate with the noise and not with the signal, and the resultant quantization error signal will only comprise noise and no distortion.  Still with me?

So, if the original signal is clean and contains no noise, all we need to do is add some noise of our own, and any distortion components present in the quantization error signal will be replaced by noise components.  Although the magnitude of the noise we need to add turns out to be larger than the magnitude of the original distortion components, this noise turns out to be more pleasing on the ear.  A lot more pleasing, actually.  This added noise is what we call dither.

There are actually many different types of noise.  We want to add the best type of noise for the particular circumstances, and this is where it can get a lot more complicated.  BitPerfect uses TPDF (Triangular Probability Density Function) noise.  This type of noise has been shown mathematically to maximally suppress quantization error distortion with the minimum amount of added noise.  Other types of noise have other properties.  One of the most interesting is "Noise Shaped" noise.  This type of noise is more complicated, and in order to work properly has to be added within a frequency-sensitive feedback loop.  It has the interesting property that the added noise is actively shaped away from one portion of the frequency range (where the ear is most sensitive) and into another (where the ear is less sensitive).  Surprisingly, in the right circumstances, noise shaped dither is capable of pushing the noise floor, over part of the frequency range, below the notional theoretical limit imposed by the bit depth (approx 6dB per bit).
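To make the feedback loop less abstract, here is a minimal sketch of the simplest possible noise shaper: first-order error feedback, in which the error made on each sample is subtracted from the next sample before it is quantized.  This is a textbook illustration, not the shaper used by any particular product, and it pushes noise away from the lowest frequencies rather than away from the ear's most sensitive band.  The test signal and function names are my own.

```python
import math
import random

def tpdf(rng, peak_to_peak=2.0):
    """TPDF dither value in LSBs (difference of two uniform values)."""
    return (rng.random() - rng.random()) * peak_to_peak / 2.0

def plain_dither(signal, rng):
    """Quantize with TPDF dither and no feedback."""
    return [round(x + tpdf(rng)) for x in signal]

def noise_shaped(signal, rng):
    """Quantize with TPDF dither inside a first-order feedback loop."""
    output = []
    previous_error = 0.0
    for x in signal:
        target = x - previous_error      # feed the last error back in
        y = round(target + tpdf(rng))
        previous_error = y - target      # error made on this sample
        output.append(y)
    return output

rng = random.Random(5)
signal = [1000.0 * math.sin(2 * math.pi * n / 100.3) for n in range(20_000)]

plain_errors = [y - x for x, y in zip(signal, plain_dither(signal, rng))]
shaped_errors = [y - x for x, y in zip(signal, noise_shaped(signal, rng))]

plain_power = sum(e * e for e in plain_errors) / len(plain_errors)
shaped_power = sum(e * e for e in shaped_errors) / len(shaped_errors)
shaped_drift = abs(sum(shaped_errors))   # the 0Hz component of the error

print(plain_power, shaped_power)  # shaping adds to the total noise power...
print(shaped_drift)               # ...but the error at 0Hz stays tiny
```

The total noise goes up, but it has been moved: the running sum of the shaped error never wanders, so there is almost nothing left at the bottom of the spectrum.  Practical shapers use higher-order filters to choose where the quiet region falls.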

It is important to appreciate that dithering the signal adds noise to it - noise that was not there before, and which can never be removed again afterwards.  If you dither an already-dithered music data stream, you will only be adding further noise to the already-added noise which is not normally of much benefit - in fact it will usually degrade the signal.  In particular, adding noise-shaped dither to a data stream that has already received noise-shaped dither can raise the noise to quite unpleasant levels at higher frequencies.  Many CDs are mastered with a final application of noise-shaped dither, so if you have ripped one of these, you don't really want to be applying more noise-shaped dither when it comes to playback.  Unfortunately it takes a suite of Analytical DSP Apps to determine this, and even those of us who do have these Apps generally cannot be bothered with doing it on any sort of routine basis!

I will comment specifically on two scenarios which will be of relevance to BitPerfect users.  Sample Rate Conversion and Digital Volume Control.

In BitPerfect's implementation of SRC, the 16-bit or 24-bit integer data is first transcoded to a 64-bit Float format.  SRC comprises some heavy mathematics operating on the 64-bit Float data, at the end of which we have a bunch of 64-bit Float numbers that need to be converted back to integers again.  64-bit Float numbers carry 53 bits of precision, and to convert them back to 16-bit (or 24-bit) integers we have to throw away the least significant 37 bits (or 29 bits, respectively) of data.  The difference between the new 16-bit (or 24-bit) value and the original 53-bit precision becomes the new quantization error.  So it is wise to apply some dither.  Easiest to do would be to choose TPDF, and this would be a good choice.  With a 16-bit output format, there is some potential benefit to be had in applying noise-shaped dither, but you would need to be confident that no further noise-shaped dither is being applied downstream.  With 24-bit data, there is a very fair argument to be made that it does not need any dithering at all, since nothing below the 22nd bit is ever audible anyway.  But dithering 24-bit data can't hurt either way.
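The final conversion step might look something like the sketch below.  It is an illustration only, not BitPerfect's code: it assumes Float samples normalized to the range -1.0 to +1.0, TPDF dither of ±1 LSB, and hard clipping at the ends of the 16-bit range.

```python
import random

def float_to_int16(samples, rng):
    """Convert floats in [-1.0, 1.0) to 16-bit integers with TPDF dither."""
    out = []
    for x in samples:
        scaled = x * 32768.0                  # one LSB becomes 1.0
        dither = rng.random() - rng.random()  # TPDF, +/-1 LSB
        value = round(scaled + dither)
        out.append(max(-32768, min(32767, value)))  # clip to 16-bit range
    return out

rng = random.Random(7)
converted = float_to_int16([0.0, 0.5, -1.0, 0.99999], rng)
print(converted)
```

Each output value lands within a step or so of the undithered result; what the dither changes is the statistical character of the error, not its size.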

With digital volume control, we are usually only talking about digital attenuation.  Some DACs can provide an amount of digital gain, but these are relatively few, and digital gain is not normally of interest to audiophiles, so I will ignore it here.  Digital attenuation effectively reduces the bit depth of the music data.  Every 6dB of attenuation loses you one bit of resolution.  So 24dB of attenuation loses you 4 bits of data.  Those lost bits of data drop off the bottom end, into the digital void below the LSB, and so any dither present in the signal gets lost in the process.  It is therefore advisable to re-dither the signal after performing volume control.  TPDF dither would again be a good choice here, but this could also be an ideal place to introduce an appropriate noise-shaped dither function.
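The 6dB-per-bit rule of thumb comes from the fact that halving the amplitude is a change of 20·log10(2), or about 6.02dB.  A minimal sketch of the arithmetic, with a function name of my own choosing:

```python
import math

DB_PER_BIT = 20 * math.log10(2)  # about 6.02dB for each halving of amplitude

def bits_lost(attenuation_db):
    """Bits of resolution pushed below the LSB by digital attenuation."""
    return attenuation_db / DB_PER_BIT

for db in (6, 12, 24, 48):
    print(db, round(bits_lost(db), 2))
```

So 24dB of attenuation costs very nearly 4 bits, and 48dB costs very nearly 8.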

In BitPerfect, we apply dither after all SRC operations, using either TPDF or CoreAudio according to the user selection, but we do not yet dither after volume control.  This is to do with limitations on the way we have written our audio engine, but version 1.1 of BitPerfect will include a completely new audio engine that can perform real-time dithering on the volume control.

So spare a thought for those valiant WWII bombardiers.  All this was furthest from their minds!

Wednesday 4 September 2013

Facts, Myths, Misconceptions, and Lies

When you want to pull the wool over someone's eyes, the easiest way to do it is to employ indisputable facts, and present them in such a way as to enable the listener to draw an inappropriate conclusion, usually with the aid of some logical legerdemain.  Sometimes they won't bite.  So you move on to myths.  You invoke something that has been repeated so often that, through plain and simple brainwashing, it has come to be regarded as fact even though it has no factual basis to support it.  Sometimes even the myth fails.  So you have little choice but to move on to misconceptions.  Misconceptions are a tricky animal, because they usually have their origins in indisputable facts.  What you do is take a fact, tack something onto it that is not itself supported by facts, and hope to get away with passing it off as a fact by association.  And if all else fails, you just have to lie.

Most of the time deceptions are manifestations of plain and simple dishonesty.  But sometimes misconceptions themselves worm their way into our knowledge base in such a way that we forget where the lines lie between the underlying facts and the remora-like add-ons that attached themselves like suckers.  To create a misconception you need three simple ingredients.  First, you need a subject matter which is inherently complicated, but which can be easily described in simple terms so that people can feel comfortable with it to a certain level.  Next, you need something which is actually incorrect.  Third, you need a logical link - an argument or demonstration by which the flawed conclusion becomes conflated with the factual aspects of the subject matter.

There are many misconceptions that plague the complex world of digital audio.  I am going to try and clarify one of them for you, because you will hear it repeated ad nauseam.  It is the one where you will be told that by careful application of dither, you can extract signals whose amplitude lies below the LSB (least significant bit), and which, apparently, cannot otherwise be encoded.

I want to introduce you first to the concept of image enhancement.  Many of you will have come across a scene in a movie or TV show where a blurred image is magically sharpened to incredible resolution with little more than a handful of keystrokes on a computer.  It is utter balderdash.  A misconception.  But behind it lie some real facts.  I have seen real demonstrations of blurry indistinct video of a military nature, where a computer is asked to uncover the presence of, say, a tank.  When the magic button is pressed, a tank does indeed appear out of the murk.  These demonstrations are very compelling and very convincing - and, yes, are factual.  The key element of what is happening is that in order to see a tank, you have to be looking for a tank.  If the murk hides a car, a hot-dog vendor, or even Osama Bin Laden holding a bazooka, you will never see them.  You need to be looking specifically for them.  How the technology works is that the complex object hiding in the murk of the image disturbs the murk ever so slightly, and by correlating the disturbance with what we know of the appearance of a tank, we can infer, to a greater or lesser degree, the presence of the tank.  It is important to note that, so long as we are looking for the tank we will never be able to perceive Osama and his bazooka.  We have to be looking specifically for him.  And his bazooka.

Applying this to audio dither, the same arguments hold true.  Yes, it is possible to take 16-bit audio data and apply the right sort of dither, and then observe a recorded pure tone at -120dB, which is about 24dB below the roughly -96dB limit set by the Signal-to-(Quantization)-Noise ratio of the 16-bit format.  That's a good 4 bits below the level of the 16th bit.  This is because we were looking for that specific pure tone using a Fourier Transform.  Music is NOT being encoded at -120dB.  We are simply inferring the presence of the -120dB tone through its residual interaction with the (dithered) noise.  In fact the residual evidence of the tone was only inserted in the first place during the dithering process.

Here is the true test of whether one can encode music below the 16th bit purely through the magic of dither.  Take an undithered 16-bit recording.  Apply 24dB of digital attenuation using whatever dithering technique you like.  Mathematics says that the 4 least significant bits of the music data will be pushed below the level of the 16th bit where they are simply lost forever.  However, according to the misconception, the 4 least significant bits of the 16-bit data stream have somehow been safely preserved by the dither, all ready to be re-constructed.  Now take the attenuated data stream and apply 24dB of gain.  Have you managed to reconstruct the original music data?  No, you haven't.  Not even close.
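You can run this test yourself.  The sketch below makes several assumptions of my own: random 16-bit integers stand in for the music data, the attenuation is 24dB (the figure that corresponds to 4 bits), the dither is TPDF at ±1 LSB, and the restored values are clipped to the 16-bit range.

```python
import random

def attenuate_then_restore(samples, attenuation_db, rng):
    """Attenuate with TPDF dither, requantize, then apply the same gain back."""
    gain = 10 ** (-attenuation_db / 20)
    restored = []
    for x in samples:
        dither = rng.random() - rng.random()  # TPDF, +/-1 LSB
        quiet = round(x * gain + dither)      # stored at 16-bit resolution
        loud = round(quiet / gain)            # undo the attenuation
        restored.append(max(-32768, min(32767, loud)))
    return restored

rng = random.Random(6)
original = [rng.randint(-32768, 32767) for _ in range(10_000)]
restored = attenuate_then_restore(original, 24.0, rng)

mismatches = sum(1 for a, b in zip(original, restored) if a != b)
print(mismatches / len(original))  # the great majority of samples differ
```

Roughly sixteen different input values collapse onto each attenuated value, so there is no way for the gain stage to know which one it started from, dither or no dither.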

This is not to say that dither is not useful or valuable.  The fact is that it CAN reduce the measured noise floor to below the theoretical quantization noise limit over a certain range of the audio bandwidth.  This is useful and valuable.  But to demonstrate the extraction of test tones from below that limit is only a mathematical party trick, and nothing more should be inferred from it.

I close with a rhyme, wherein the landlord of a country Inn uses his own form of dither to fit ten men into nine bedrooms:

Ten weary travellers, cold and wet
To a country Inn did come.
The night was cold, their clothes were damp,
Their hands and feet were numb.

"Come in, come in", the landlord cried,
"A room for all ye men,
"But I have only nine spare beds,
"And ye are numbered ten."

"Then one of us shall take the floor,
"For none of us are gay!"
"Nay, nay, my friends", the landlord cried,
"There is an easy way."

Two men he placed in room marked A,
The third he lodged in B,
The fourth and fifth in C and D,
The sixth in bedroom E.

Seven, eight, nine, in F, G, H,
Then back to A did fly,
Wherein remained the tenth and last,
And lodged him safe in I.

Nine beds had he, and yet had one
For each of travellers ten.
And this is it that puzzles me,
And many wiser men.

Monday 2 September 2013

Our Bits Are Perfect!

Not many people can make that proud boast.  But we can, and now you can too, with our new range of BitPerfect merchandise.  Buy a T-shirt, cap, tote bag, hoodie, mug, or even a skin for your iPhone or Samsung Galaxy, and let the world know that your Bits are Perfect too! 

Are your Bits Perfect?  Don't be shy about it…