Thursday 13 August 2015

Audio Files for Audiophiles

A few years back I purchased a Windows App called dBpoweramp.  It met my needs for a while.  Upon installation, I learned that the App supports a huge number of different music file formats.  Today, that list reads:  AIFF, ALAC, CDA, FLAC, MP3, WAV, AC3, AAC, SND, DFF, DSF, MID, APE, MPP, OGG, OPUS, WVC, W64, WMA, OFR, RA, SHN, SPX, TTA, plus a number of variants.  Who knew there were so many audio formats?  I for one have never heard of most of these.  Counting through them, I have only ever used eight of ’em, and of the rest I have only ever come across three.  Well, good for dBpoweramp!  I can sleep comfortably knowing that if I ever want to convert a TTA file to OFR I probably have just the tool for the job.  Today I use iTunes, and it only supports a handful of those file formats.  What's the story with that, I wonder?

Music file formats arise to fill a need, and each and every one of those file formats I mentioned represents a need which went unmet at the time the format was devised.  Actually, I even invented an audio file format of my own, way back in 1979.  In my lab at work I had a Commodore Pet computer which was attached to an X-Y graphic printer.  I used the Pet to control a laser test apparatus and had the printer output the results graphically.  As the printer’s two stepper motors (one for each axis) drove the pen holder across the paper, the tone of each motor would sound a certain note.  By having the printer draw out a certain pattern I could get it to play “God Save the Queen”.  Not very imaginative, I agree, but it was quite a party trick in its day.  I then wrote a program that would allow you to compose a tune which you could then play on the printer.  Finally, I devised a simple format with which to store those instructions in a file which the Commodore Pet saved on its audio-cassette tape drive.  I could conceivably claim to have developed one of the world’s first audio file formats!  Looking back, the Zeitgeist was quite delicious - a computer audio file stored in digital form on an analog audio cassette tape.

But back to the myriad file formats supported by dBpoweramp.  Each one has a purpose, and I suppose not all of those involve the distribution of music for commercial or recreational purposes.  For what it’s worth, the developers of iTunes could have arranged for it to support all of these weird and wonderful file formats too, but they didn’t.  In some cases there are good technical reasons why they would elect not to support a particular file type.  In others it is a matter of choice.  Some of those formats are Audio-Video formats, and iTunes is, after all, a multi-media platform.  But for the purposes of this post I am going to constrain the discussion to audio-only playback.

Not just the developers of iTunes, but every developer who writes an audio playback App has to decide for themselves which of those file formats (and perhaps others too) their App is going to support.  I am going to break these formats down into four camps - Uncompressed, Lossless Compressed, Lossy Compressed, and DSD.  Let's look at each one, and discuss how they handle the audio data.

The simplest audio file formats contain raw uncompressed audio data.  The actual audio data itself is written straight into the file.  It is not manipulated or massaged in any way.  The advantage of doing it this way is that the audio data can be both written and read with the minimum of fuss.  The two most commonly used examples of this type of file format are AIFF (released by Apple in 1988) and WAV (released by Microsoft and IBM in 1991).  iTunes will happily load either file type.
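To make "written straight into the file" concrete, here is a minimal sketch using Python's standard wave module.  The 440 Hz test tone, the 0.1 second duration and the helper names are my own assumptions for illustration, and the check that the payload matches the input byte for byte assumes a little-endian machine.

```python
import io
import math
import struct
import wave

SAMPLE_RATE_HZ = 44_100  # CD sample rate


def sine_pcm(frequency_hz, seconds, amplitude=16_000):
    """Return 16-bit little-endian mono PCM for a test tone (hypothetical helper)."""
    count = int(SAMPLE_RATE_HZ * seconds)
    samples = (
        round(amplitude * math.sin(2 * math.pi * frequency_hz * i / SAMPLE_RATE_HZ))
        for i in range(count)
    )
    return b"".join(struct.pack("<h", s) for s in samples)


def write_wav(pcm, channels=1, sample_width=2):
    """Wrap raw PCM bytes in a WAV container held in memory."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(sample_width)
        wav.setframerate(SAMPLE_RATE_HZ)
        wav.writeframes(pcm)
    return buffer.getvalue()


pcm = sine_pcm(440, 0.1)
wav_bytes = write_wav(pcm)

# For this simple PCM case the wave module writes a 44-byte header;
# everything after it is the audio data, untouched.
header, payload = wav_bytes[:44], wav_bytes[44:]
print(len(header), len(payload), payload == pcm)
```

Apart from the short header describing the sample rate, bit depth and channel count, the file is nothing but the samples themselves, which is why these formats are so easy to read and write.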

Back in those days the file size of an AIFF or WAV file was utterly prohibitive.  A five-minute track ripped from a CD would require a file size of 53MB, which represented something like three times the capacity of a good-sized hard disk drive at that time.  Clearly, if computers were going to be able to handle digital audio something needed to be done to reduce the file size.  To address this problem, during the early 1990s the Fraunhofer Institute in Germany developed what we now call the MP3 file format.  What this does is, effectively, to figure out which parts of an audio signal are the least audible and throw them away.  By throwing away more and more of the audio signal the file size can be reduced rather dramatically.  This approach is referred to as Lossy Compression, because it compresses the file size but loses data (and therefore sound quality) along the way.
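The 53MB figure falls straight out of the CD format's parameters (44,100 samples per second, 16 bits per sample, two channels).  The short sketch below works through the arithmetic.  The 128 kbps MP3 bit rate is purely my assumption, chosen as a typical setting for comparison; it is not a number from the discussion above.

```python
CD_SAMPLE_RATE_HZ = 44_100
CD_BYTES_PER_SAMPLE = 2   # 16-bit samples
CD_CHANNELS = 2           # stereo


def uncompressed_bytes(seconds):
    """Size of the raw audio data for a CD-quality track of the given length."""
    return seconds * CD_SAMPLE_RATE_HZ * CD_BYTES_PER_SAMPLE * CD_CHANNELS


def constant_bitrate_bytes(seconds, kilobits_per_second):
    """Size of a constant-bit-rate lossy file (assumed bit rate, headers ignored)."""
    return seconds * kilobits_per_second * 1000 // 8


track_seconds = 5 * 60
raw_size = uncompressed_bytes(track_seconds)
mp3_size = constant_bitrate_bytes(track_seconds, 128)  # 128 kbps is an assumption

print(raw_size)             # 52920000 bytes, i.e. roughly 53MB
print(mp3_size)             # 4800000 bytes at the assumed 128 kbps
print(raw_size / mp3_size)  # about 11 times smaller
```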

The first MP3 codec was released in 1995.  In 1997 AAC was standardized as the intended successor to MP3, and Apple later adopted it as its preferred lossy format.  Structurally, AAC is very similar to MP3 but has some significant differences aimed at improving the subjective audio quality.  However, each format requires a separate codec to be able to read it.

By the turn of the millennium, the confluence of the ubiquitous MP3 codec and the ready availability of hard discs with capacities exceeding 100MB had ushered in the age of computer audio.  As always, there was a fringe element who still preferred the improved sound quality of uncompressed WAV and AIFF files, but who were still troubled by the enormous file sizes.  Programs like PKZip proved that ordinary computer files could be compressed to a smaller file size and subsequently regenerated in their exact original form.  However, PKZip did not do a very good job of reducing the file size of audio files.  A dedicated lossless compressor was needed, one specifically optimized around the characteristics of audio data.  In 2001 the first FLAC format specification was released.  The FLAC codec could produce compressed files that are approximately 50% of the size of the original WAV or AIFF file.  Later, in 2004, Apple responded with their own lossless compression format ALAC (or Apple Lossless).
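The gap between a general-purpose compressor and an audio-aware one can be sketched in a few lines.  Below, zlib (the same family of compression used by Zip tools) is applied first to raw samples, and then to the differences between consecutive samples.  The delta step is only a crude stand-in for the prediction stage that dedicated audio compressors use; it is not FLAC's actual algorithm, and the two-tone-plus-noise test signal and all function names are my own assumptions.

```python
import math
import random
import struct
import zlib

SAMPLE_RATE_HZ = 44_100


def synthetic_samples(seconds=1.0, seed=0):
    """Two tones plus a little noise, as signed 16-bit values (made-up test signal)."""
    rng = random.Random(seed)
    samples = []
    for i in range(int(SAMPLE_RATE_HZ * seconds)):
        t = i / SAMPLE_RATE_HZ
        tone = 10_000 * math.sin(2 * math.pi * 440.0 * t)
        tone += 10_000 * math.sin(2 * math.pi * 554.37 * t)
        samples.append(round(tone) + rng.randint(-8, 8))
    return samples


def to_bytes(samples):
    """Pack samples as raw 16-bit little-endian PCM."""
    return struct.pack(f"<{len(samples)}h", *samples)


def delta_encode(samples):
    """Store each sample as its difference from the previous one (mod 2**16)."""
    previous = 0
    deltas = []
    for sample in samples:
        deltas.append((sample - previous) & 0xFFFF)
        previous = sample
    return struct.pack(f"<{len(deltas)}H", *deltas)


def delta_decode(data):
    """Undo delta_encode by accumulating the differences."""
    accumulator = 0
    samples = []
    for delta in struct.unpack(f"<{len(data) // 2}H", data):
        accumulator = (accumulator + delta) & 0xFFFF
        samples.append(accumulator - 0x10000 if accumulator >= 0x8000 else accumulator)
    return samples


samples = synthetic_samples()
raw = to_bytes(samples)
zipped_raw = zlib.compress(raw, 9)
zipped_delta = zlib.compress(delta_encode(samples), 9)
restored = delta_decode(zlib.decompress(zipped_delta))

print(len(raw), len(zipped_raw), len(zipped_delta), restored == samples)
```

Both routes regenerate the original samples exactly, which is what "lossless" means.  The difference is that neighbouring audio samples are strongly related to one another, and a compressor that knows this has less left over to store.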

Meanwhile, in 1999, Sony and Philips tried and failed to launch the SACD format as a successor to the ubiquitous CD.  SACD uses a radically different form of audio encoding called DSD.  Ultimately, the SACD launch flopped, although the format has never actually gone away, and the DSD format acquired its own band of loyal followers.  The developers of SACD each developed a file format that could handle DSD data - the DFF format developed by Philips, and the DSF format developed by Sony.  By 2011, DSD enthusiasts had demonstrated the ability to manage DFF and DSF files on their computers, and to transmit DSD data to a DAC, and the first DSD-compatible DACs trickled onto the market.  Consumer-level DSD recording equipment is also now available, and produces output files in either DSF or DFF format.

Today, although other file formats do persist, the computer audio market has more or less settled down to four format types, with two competing format offerings for each type.  AIFF (Apple) and WAV (everybody else) for uncompressed audio; ALAC (Apple) and FLAC (everybody else) for lossless compression; and AAC (Apple) and MP3 (everybody else) for lossy compression.  DSF and DFF continue to duke it out in the DSD world.  Note that, except for DSD which Apple does not support in any form, the formats have shaken down into pairs of Apple and everybody else.  Why is this?

Frankly, there is absolutely no reason why any software player should not be able to support all of these file formats.  The process of reading (or writing) any of them is quite straightforward.  Yet, Apple originally refused to support WAV and MP3 formats in its iTunes software and iPod players, instead requiring users to use its own AIFF and AAC formats.  In fact, to this day Apple products continue to refuse to support FLAC files, instead requiring its customers to use ALAC.  From a functionality viewpoint none of this really matters.  ALAC and FLAC can be seamlessly transformed from one to the other and back again using high quality free software (as can AIFF and WAV, AAC and MP3).  But this is not what customers want.  So why is it that Apple takes this unhelpful stance?

The reason is simple.  From a business perspective, Apple’s entire iTunes ecosystem exists not to provide you with a platform on which to manage and play your music, but as a platform to sell you the music that you listen to.  Apple’s business model is for you to buy your music from them rather than from anybody else.  Therefore when you buy music from the iTunes Store it comes in AAC format only and not in MP3 or FLAC.  But if you buy your music virtually anywhere else, it only comes in the MP3 and FLAC formats.  Virtually nobody outside of Apple is interested in selling AAC or ALAC files.

The situation is even more bizarre when it comes to lossless compressed audio.  Apple isn’t actually selling any ALAC files on its iTunes Store!  You really have to wonder what their thinking is.  Do they consider that they are motivating me to buy lossy AAC files from Apple instead of lossless FLAC files from someone else?  Really?  Hey, maybe they’re right - maybe that’s exactly what we do.  Consumers are a pretty dumb species after all.  It has also been suggested that Apple is scared of becoming the target of a patent troll if they start offering FLAC support, but that seems to be an even more feeble explanation.  Google have been supporting FLAC in Android for some time now, and have not attracted any trolls’ attention that we know of.  In any case, as far as I know, nobody has ever identified any significant patents FLAC might possibly be infringing, given that it is all open-source.  But given the size of Big Apple (even bigger than Big Google!), they would certainly make for a tasty target.

Interestingly, back in the early 2000s, with the overwhelming consumer embrace of MP3, Apple realized very early on that if they were going to continue refusing to support MP3 they could risk losing out on the whole mobile music opportunity to one of the competing platforms such as Rio, Zune, Nomad/Zen and others.  Deciding to support MP3 was a key tactical business decision that took the wind out of their competitors’ sails and ultimately paved the way for the total dominance of iPod and iTunes.  Today, despite the overwhelming consumer embrace of FLAC, there is no such pressure on Apple to encourage them towards supporting FLAC.

At one time there was an App called Fluke which allowed users to import FLAC files into iTunes.  Unfortunately, that loophole relied on a 32-bit OS kernel, and as a result Fluke no longer works with OS X 10.7 (Lion) and up.  Just to be clear, there are absolutely no technical reasons whatsoever that prevent Apple from supporting FLAC files.  It would be a trivial move for them to make, if they wanted to.  Their refusal to support FLAC is entirely a tactical decision on their part.

The situation with DSD is significantly different.  OS X and iOS are both fundamentally incapable of supporting DSD.  It would require significant changes to the way their audio subsystems work in order for that to happen, and, being honest, I see some fundamental issues that they would face if they ever considered doing that.  Consequently, I don’t see DSD being supported by Apple in any form for the foreseeable future.  The way the audio industry has got around that is with the DoP data transmission format.  This dresses up native DSD data so that it looks like PCM, which OS X can then be fooled into sending to your DAC, but it means that any Mac Apps which support DSD would have to be extremely careful how they went about it.  BitPerfect, for example, will do that for you, but iTunes won’t.  This is different from the situation with FLAC files.  Whereas iTunes would have no problems reading a FLAC file if Apple chose to let it, it would have absolutely no idea what to make of a DSD file.  You might as well ask it to load an Excel spreadsheet.
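The "dressing up" that DoP performs is simple enough to sketch.  As I understand the published DoP convention, each 24-bit PCM sample carries 16 DSD bits in its lower two bytes and an alternating marker byte (0x05, then 0xFA) in its top byte, so that a DAC can tell DoP apart from ordinary PCM.  The code below handles a single channel only, the function names are hypothetical, and it should be read as an illustration of the idea rather than a compliant implementation.

```python
DOP_MARKERS = (0x05, 0xFA)   # alternating marker bytes, per my reading of DoP
DSD64_BIT_RATE_HZ = 2_822_400
DSD_BITS_PER_DOP_SAMPLE = 16


def dsd_to_dop(dsd_bytes):
    """Pack a single-channel DSD byte stream into 24-bit DoP sample values."""
    if len(dsd_bytes) % 2:
        raise ValueError("need an even number of DSD bytes (16 bits per sample)")
    samples = []
    for index in range(0, len(dsd_bytes), 2):
        marker = DOP_MARKERS[(index // 2) % 2]
        # Top byte: marker.  Lower two bytes: 16 DSD bits, oldest bits first.
        samples.append((marker << 16) | (dsd_bytes[index] << 8) | dsd_bytes[index + 1])
    return samples


def dop_to_dsd(samples):
    """Recover the DSD bytes, checking the marker sequence along the way."""
    out = bytearray()
    for position, sample in enumerate(samples):
        if sample >> 16 != DOP_MARKERS[position % 2]:
            raise ValueError("marker mismatch: this does not look like DoP")
        out += bytes([(sample >> 8) & 0xFF, sample & 0xFF])
    return bytes(out)


dsd = bytes([0x69, 0x96, 0xAA, 0x55])
dop = dsd_to_dop(dsd)
print([hex(s) for s in dop])          # ['0x56996', '0xfaaa55']
print(dop_to_dsd(dop) == dsd)         # True

# The PCM "carrier" rate needed for DSD64: 2,822,400 bits/s at 16 bits per sample.
print(DSD64_BIT_RATE_HZ // DSD_BITS_PER_DOP_SAMPLE)   # 176400
```

The operating system sees what appears to be an ordinary 24-bit, 176.4kHz PCM stream and passes it along, which is exactly why the bits must arrive at the DAC completely unaltered: any volume adjustment or sample rate conversion along the way would scramble both the markers and the DSD data.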

In order for BitPerfect to manage DSD playback, we have created what we call the Hybrid-DSD file format.  Hybrid-DSD files are ALAC files that iTunes recognizes, and can import and play normally.  However, they also contain the native DSD audio data as a sort of “trojan horse” payload.  If iTunes plays a Hybrid-DSD file it plays the ordinary ALAC content.  But if BitPerfect plays the file it plays the DSD content.  We really like that system.  Other software players have instead adopted the idea of a “proxy” file.  This is a similar thing, but instead of containing ordinary ALAC music plus the DSD payload, a proxy file contains no music, only information that enables the playback software to locate the original DSF or DFF file.  Some may like the proxy file format, indeed some may prefer it, but we don’t, and this isn’t the place to discuss that.

It has often been suggested that BitPerfect could adopt a mechanism similar to either the Hybrid-DSD file or the proxy file to import FLAC files into iTunes.  And yes, we could do that.  But frankly, why bother converting from FLAC to "Hybrid-FLAC", when it is even easier to transcode FLAC files to ALAC using a free App such as XLD?  It is simple and effective, and the ALAC files can just as easily be transcoded back into FLAC form if needed, with zero loss of fidelity.

The final topic I want to cover in this post is Digital Rights Management (DRM).  This is a method by which the audio content in the file is encrypted in such a way as to prevent someone who does not “own” an audio file from playing it.  In other words, it is an anti-piracy technique.  Files containing DRM are pretty much indistinguishable from files that do not contain it, and most audio file formats support the inclusion of DRM (I am given to understand that FLAC does not, but I am not 100% sure).  For example, Apple included DRM in almost all of the music downloads sold on iTunes between 2004 and 2009.   

DRM is something that tends to get forced on the distributors (e.g. the iTunes Store) by content providers (e.g. the record labels), and is a major inconvenience for absolutely everybody involved in the playback chain.  Between 2004 and 2009 Apple had grown to hold sufficient clout that they could dictate to the content providers their intention to discontinue supporting DRM.  Today, DRM is a non-factor, although the new Apple Music service, plus TIDAL, and other streaming-based services which offer off-line storage, rely on their own versions of it.  The advance and retreat of DRM is an interesting barometer of who has the upper hand at any time in the music business between the distributors and the content providers.