Tuesday, 30 April 2013

Ripping your CD Collection - V. Storage

My own collection, which includes a healthy mix of CDs and downloaded high-resolution audio files, is now approaching the 30,000-file mark.  This occupies 1.2TB of disk space for the FLAC master library and 1.4TB for a replicated Apple Lossless library.  I am totally paranoid about the consequences of losing all of this to a HD failure.  I have endured several HD failures in my lifetime, so I know that they happen more often than you would like, and generally without warning.  I don’t want to imagine how much time it would take me to re-rip my CD collection and re-download all my downloaded music.  I don’t know if that would even be possible!

There are various ways to address this issue.  First is the obvious one: back it up!  I must confess I don’t like backup utilities much.  They seem to be overly complicated, designed as they are to deal with the myriad complexities of the gamut of computerized data.  I always suspect that when I need to use one, I won’t be able to work it!  Another option is to keep a nice simple copy of everything on another HD somewhere.  The problem is that you get lazy, and you don’t back up as often as you should.  But in general, backing up and copying strategies can be quite successful.

The solution I favour is to put everything on a Network Attached Storage (NAS) unit.  A NAS is a very powerful device, but it can also be very expensive.  There are cheap units at the $200 price point, and expensive units at the $1,000+ price point.  And that’s without any HDs!  The cheap units are more than likely going to be less reliable, and that completely misses the point of using a NAS for storing your audio data.  The most expensive units have a higher level of performance (read/write speeds, the ability to serve multiple users simultaneously with little performance loss, etc.) and are aimed mainly at corporate applications.  As usual, the best deal is to be found somewhere in the middle.  I bought a Synology DS411j unit a couple of years ago and it has worked flawlessly for me.  Tim has a similar one.

After you buy your NAS, you have to kit it out with multiple hard drives.  Mine takes four 3.5” hard disks.  These disks are then formatted into what is known as a RAID array.  RAID is a scheme whereby the data is distributed across the multiple disks in such a way that if one disk should suddenly fail, none of your data is lost.  You just replace the failed disk, rebuild the RAID, and you are back again at full operation.  Some NAS units will let you do that without even having to power them down (so-called “hot-swapping”).  There are different “levels” of RAID, and they all have different characteristics.  Some will even allow more than one disk to fail with no loss of data.  When I started out, I built my NAS with four 500GB HDs, but soon learned my lesson and swapped them out for four 2TB HDs.  Configured in “Synology Hybrid RAID” this gives me 5.8TB of storage, and I can survive the failure of any one HD without losing any data.  As well as my music, I can store a whole load of other mission-critical data on there too.  My NAS is also plugged into a UPS so that if the power fails it can shut itself down gracefully.  The whole shebang sits in a room in the basement, next to the network router, well out of harm’s way.
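If you want a rough feel for how much usable space a given set of disks will yield, here is a back-of-the-envelope sketch in Python.  It assumes a single-redundancy scheme (RAID-5, or Synology Hybrid RAID with equal-sized disks) where roughly one disk’s worth of capacity is given up to redundancy; the figure it prints is the raw number, and the formatted capacity your NAS reports will come out a little lower.

    def usable_capacity_tb(disk_sizes_tb):
        """Rough usable capacity for a single-redundancy array (RAID-5 or SHR
        with equal disks): the raw total minus roughly one disk's worth."""
        raw = sum(disk_sizes_tb)
        overhead = max(disk_sizes_tb)          # one disk's worth goes to redundancy
        return raw - overhead

    print(usable_capacity_tb([2, 2, 2, 2]))    # four 2TB disks -> ~6 TB raw usable
    print(usable_capacity_tb([0.5] * 4))       # the original four 500GB disks -> ~1.5 TB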

My paranoia leads me to also be very picky about which model of HD I put in my NAS.  Ordinary consumer grade HDs fail – in my experience – at a rate higher than I feel comfortable exposing myself to.  Therefore I only buy “Enterprise Class” HDs.  These have a much lower failure rate, but, as you might expect, cost a good 50% more.  My own HD of choice has been the Western Digital RE4 family, and I own several of these without having incurred a single failure.  At one time they were in short supply and I had to buy a pair of Seagate Constellation ES 2TB units.  One of these subsequently failed – actually, the Synology warned me of its impending failure before it actually died, and Seagate were happy to replace it under warranty based on the warning alone – and the replacement unit has so far functioned without further incident.

I can attach an external USB drive to my NAS, so I have plugged in a spare 1.5TB external LG unit.  I place a further double-paranoid copy of my FLAC library on there for safe keeping, just in case...

If you are keeping tally, that’s about $1,200-$1,500 spent on my file storage system.  6TB of cheap storage capacity will set you back about $600, so that’s an awful lot extra to budget for little more than data security. You may feel differently from me as to whether it is worth it.  But so long as you have taken the time to pause and think it through, then that’s good enough for me.

Back to Part IV.
Part VI can be found here.

Monday, 29 April 2013

Ripping your CD Collection – IV. Ripping

By now I have laid the groundwork for the things you need to be thinking about before you start ripping your CDs.  Now it’s time to look at the ripping process itself.  This is where we extract the audio data from the CD and turn it into a collection of files on your computer.  And the first thing you might be wondering is why that should be any sort of a problem.

The answer is that the music data on a CD is not stored as a nice neat collection of files.  Instead, the data is arranged more like a cassette tape, or an LP record.  It is designed to come off the CD in a continuous stream, and to be more or less played on the fly.  In the same way an LP has visible “gaps” between the tracks to guide you to drop the needle in the right place, so a CD has pointers in a “Table of Contents” to enable the CD player to find places in the stream where one track ends and the next one begins.

When the CD format was first established, the method of reading data off the disc – using a laser to detect the presence of microscopic pits embedded just below the disc’s surface – was truly revolutionary.  I was working in the laser industry at the time, and when the commitment was made to move forward with CD technology, laser technology had not yet produced a laser that would last anywhere near the desired lifetime of the player!  That aside, there were two problems that the project’s design goals sought to overcome.  The first was that since the disc was to be read on-the-fly, there was no possibility to go back and read a section again if you got a bad reading first time.  The second was that the discs were going into an uncontrolled consumer environment, and were likely to be subjected to physical damage and deterioration.  Taken together, this meant that the discs had to have – to as great a degree as was practical – the ability to survive a significant loss of data, or significant faults in the accuracy of the data.  It is a tribute to the success of the technologies put in place to address these problems that CD technology today is of archival quality, something which the more advanced DVD technology does not quite match.

So when you rip a CD, one thing that is of prime importance is to be assured that you have ripped it accurately.  Since you are going to rip the CD once (we hope), and thereafter only ever play the ripped tracks, you want to be confident that you ripped it right the first time.  Most reputable CD ripping tools have special capabilities that seek to assure you of this, with ever greater degrees of confidence, so let’s not lose sight of the fact that this is job #1.  I am going to mention three ripping tools which all have the compelling advantage of being free: the ubiquitous iTunes, for either Mac or Windows; XLD for the Mac platform; and EAC for the Windows platform.  Of the three, I personally use EAC.  I have never made a serious attempt to compare XLD with EAC, and have occasionally used iTunes as well, but I have the most personal experience with EAC.  XLD and iTunes both have the advantage that they can rip and import into iTunes in one step.  I will describe each one, and outline the main options that impact the quality of the resultant rips.

Starting with iTunes, it has only one option: “Use error correction when reading Audio CDs”.  Apple doesn’t actually tell us what this means, but it does seem to result in more accurate rips than with it turned off.  It probably invokes a number of re-reads if the data it’s getting looks at all dubious.  Checking this setting does slightly slow down the ripping process, which would be consistent with that.

XLD is the big daddy of ripping tools for the Mac, and it is 100% free.  It provides a plethora of settings to customize the ripping process to just what you want.  It has three ripping modes – in order of speed they are “Burst”, “XLD secure ripper” and “CD Paranoia”.  These trade ripping speed for progressively more thorough assurance of an accurate rip.  XLD can make use of the AccurateRip database, an on-line resource that cloud-sources data to permit XLD to verify the accuracy of the rip.  It also has the option to test the ripped track before committing it to file as a belt-and-braces check of the final rip.  Ripping a disc in CD Paranoia mode can often take well over half an hour.

EAC has an even more bewildering array of options and customizations, but at least it simplifies the ripping quality choices to “High”, “Medium”, and “Low”.  You should just use “High”.  EAC supports AccurateRip.  There is a good set of instructions from Carlton Bale here, although his graphics show a slightly older version of EAC than the most current one.  I don’t like his entry for Naming Convention – I prefer to use “%tracknr1%-%title%”, but you may prefer otherwise.  Once you have EAC set up, you won’t have to fiddle with the settings again.  EAC is the most thorough ripping tool I know, but its complexity may put you off.
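To make the naming-convention idea concrete, here is a small, hypothetical Python sketch of what a pattern along the lines of “%tracknr1%-%title%” amounts to once the metadata has been looked up.  The function, the two-digit padding, and the character clean-up are my own illustrative choices, not EAC’s.

    import re

    def filename_from_tags(track_number, title, extension="flac"):
        """Build a filename in the spirit of a "track number - title" pattern:
        a zero-padded track number, a dash, then the track title."""
        safe_title = re.sub(r'[\\/:*?"<>|]', "_", title)   # drop characters illegal in filenames
        return f"{track_number:02d}-{safe_title}.{extension}"

    print(filename_from_tags(1, "Five Years"))              # -> 01-Five Years.flac
    print(filename_from_tags(11, "Rock 'n' Roll Suicide"))  # -> 11-Rock 'n' Roll Suicide.flac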

You are going to spend many, many, many hours ripping a sizeable CD collection.  Do yourself a favour and don’t skimp on taking however much time is necessary to get your ripping process properly set up and streamlined before you start cranking the handle.

Back to Part III.
Part V can be found here.

Friday, 26 April 2013

Ripping your CD Collection – III. Data Grooming

I hope I have managed to get across in my previous posts that the central benefit of ripping your CDs and playing them from a computer lies in the ability to use the metadata to enhance your playback experience.  What you will be able to do with your music collection will be limited – to a large extent – by the quality of your metadata.  So I want to spend some time on what is often called “metadata grooming” before getting round to the actual process of ripping the CDs and embedding the metadata into the resultant files.

For most users, this will not present too much of a challenge.  The metadata structures, and the way that current software uses them, were pretty much defined back in the nineties by techno-geeks who did the work in order to fill unserviced gaps in their own needs.  So, if you listen mostly to rock, pop, and other modern musical genres, you too will probably find that the existing metadata structures meet most, if not all, of your needs.  But if you listen to classical music, you will find that the opposite is true.  For this reason, I will devote a separate post to data grooming for classical music listeners.  This current post considers only mainstream musical needs.

Metadata is just Fields and Content, and how the two match up.  The Fields are the names given to the specific categories of metadata.  Typical Fields are Album, Artist, Title, and so forth.  The Content is what goes in the Field.  So an item of Content which might be “The Rise And Fall Of Ziggy Stardust And The Spiders From Mars” needs to go into a Field called “Album” and not one called “Artist”.  Data Grooming is basically the process of adjusting the Content to make sure it is properly descriptive of the Field, and in a form that will provide the most utility to you when it comes time to use it to browse your music collection.  But don’t worry too much, because most of the time that is going to happen automatically without your having to think about it.
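To make the Field-and-Content idea concrete, here is a minimal sketch using the third-party mutagen Python library to write a few standard Fields into a FLAC file.  The file name is a placeholder, and this is just one way of doing it, not a recommendation of any particular tool.

    from mutagen.flac import FLAC

    track = FLAC("01-Five Years.flac")       # placeholder path to a ripped track
    track["album"] = "The Rise And Fall Of Ziggy Stardust And The Spiders From Mars"
    track["artist"] = "David Bowie"
    track["title"] = "Five Years"
    track["tracknumber"] = "1"
    track.save()                             # the Content is embedded in the file itself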

One thing that is very important to grasp is that Apps such as iTunes provide only perfunctory support for the richness that good metadata offers.  Here is an obvious example.  Most Beatles songs were written by Lennon & McCartney.  So what do you enter into the “Composer” Field?  You have several approaches you can take.  First, you can enter “Lennon & McCartney”.  Second, you can enter “John Lennon & Paul McCartney”.  Or, if your name is Paul McCartney, you can write “Paul McCartney & John Lennon”.  So what happens if you want to browse your music collection to find cover versions of Beatles songs?  If you use the Column Browser to list the “Composers”, you will find separate entries for all three of those variants, and they will not be adjacent to each other because they get listed in alphabetical order.  You might scroll down to “Lennon & McCartney” and not realize that the other entries exist further up and further down the Composers list.

Data Grooming is the process of finding and correcting these sorts of ambiguities.  And the first step in correcting them is to do your best to make sure they don’t happen in the first place, although if you buy downloaded music you don’t have any control over the metadata which has already been embedded into it.  When you rip your own CDs, you have the opportunity to perform a first pass over the metadata and make sure it conforms to one consistent standard.  Of course, you need to put some thought into what that standard should be.  Whatever you cannot correct at rip (or download) time, you will have to "groom" afterwards.

Think about how you want to use all that metadata.  When you browse through the list of Composers – and believe me, that list can quickly grow to be pretty darned big – how do you want all the entries to appear?  Do you want to see “Bob Dylan” or “Dylan, Bob”?  If the former, “Bob Dylan” will be listed between “Bob Crew” and “Bob Feldman”.  If the latter, he will appear between “Dvořák, Antonín” and “Earle, Steve”.  It’s all about what makes the most sense to you, and you really need to spend time thinking about it before you start ripping.  But at the same time, you should bear in mind that the most popular nomenclature is “Bob Dylan” and that this will be what is employed in most everything you download, so if you want to standardize on “Dylan, Bob”, you need to be prepared to do a lot of Data Grooming to correct these entries.  Of course, some of you are going to believe it is only right and proper to use “Zimmerman, Robert”...

Another important aspect to be aware of is that most metadata standards actually support multiple-valued entries.  So we can enter TWO items of Content for the “Composer” in our Beatles collection.  “John Lennon” and “Paul McCartney” can appear as two separate entries in the list of composers, and any search for songs written by “John Lennon” (or “Lennon, John”…) will come up with songs he co-wrote with others as well as songs he wrote by himself.  However – big however – you need to be aware that a simple software App such as iTunes does not support multiple-value fields.  “A Day In The Life” would show up in iTunes as being composed by “John Lennon/Paul McCartney” if the file was in Apple Lossless format, and “John Lennon;Paul McCartney” if the file was in AIFF format, since the two formats specify different delimiters to separate individual content items in a multi-valued field.  (Interestingly, the Apple Lossless specification means that the band AC/DC would be treated as two separate Artists, “AC” and “DC”.  Ha Ha!).

I have focused on Composers here, because it is convenient, but the same applies to Artists.  Take the album “Supernatural”.  This is a Santana album, and so the Album Artist would be “Santana”.  However, each track features a different guest vocalist.  Therefore a good strategy would be, for each track, to enter “Santana” as the Album Artist, and to have multiple values for the Artist field, “Santana” (or “Carlos Santana”, or “Santana, Carlos”, according to your personal preference) together with “Rob Thomas”, etc.  Note that there can be any number of multiple entries.
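As a hedged illustration of what that looks like inside a file, here is a short sketch using the mutagen Python library on a FLAC file; Vorbis Comments happily accept a list of values for a single Field.  The file name is a placeholder.

    from mutagen.flac import FLAC

    track = FLAC("01-Smooth.flac")                 # placeholder path to a track from "Supernatural"
    track["albumartist"] = "Santana"               # one value for the album as a whole
    track["artist"] = ["Santana", "Rob Thomas"]    # the Artist field holds two separate values
    track.save()

    print(FLAC("01-Smooth.flac")["artist"])        # -> ['Santana', 'Rob Thomas'], a real list, not a delimited string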

My view is that multiple value fields are a HUGE benefit.  The fact that iTunes doesn’t handle it properly today is NOT in my view sufficient reason not to take full advantage of it.  If you put it off until such time as Apps improve to support it, you may find the size of the task will have become daunting.

When you use a music player App (such as iTunes) to edit your metadata, one thing you need to be sure about is whether or not the App just updates the metadata within its own internal database, or if it then updates the metadata embedded within the individual files to reflect the changes you made.  It should be your objective to keep the metadata embedded within the files current, because you want the flexibility of being able to move from your existing music player App to any better one that comes along, without leaving your precious groomed metadata behind.  I don’t use iTunes to groom metadata, so I am not 100% sure, but I think it does update the embedded metadata whenever you make an edit to the “Get Info…” page.

My own practice is to use a totally separate App to perform Data Grooming.  That App is MusicBee, a free App that runs only on Windows.  I just like the convenience of its user interface.  Plus, I can use it to play music while I’m working!  My process for adding new music to my library is (1) rip it on my Windows machine using EAC (or download it); (2) groom the metadata on the Windows machine using MusicBee; (3) make an Apple Lossless copy on the Windows machine using dBpoweramp; (4) move everything to my NAS; and (5) import the Apple Lossless files into iTunes.  Again, not for everybody, but it’s what I do.

Back to Part II.
Part IV can be found here.

Thursday, 25 April 2013

Ripping your CD Collection – II. Which File Format?

The process of extracting the music from a CD and placing it in a set of computer files is called “ripping”.  When you come to rip a CD, the first decision you have to make is which file format you are going to use.  There are several of them.  All of them have both advantages and disadvantages.  It is useful to understand what these are so you can make an informed choice.

The first, and most dramatic, distinction is between lossless and lossy files.  This arises because, in order to minimize the amount of hard disk space taken up by your music files – or alternatively to maximize the number of music files you can fit on any given hard disk – you usually want to store your music in the smallest convenient file size.  Much like “zipping” a regular computer file to get it down to a small enough size to attach to an e-mail message, music files can be “compressed” down to a more manageable size.  This compression can be either lossy or lossless.  Lossy compression results in a much smaller file size, but at the cost of some loss in quality.  Generally, the more the compression, the smaller the file size and the lower the quality.  The term “lossy” is used because some of the musical data is irretrievably lost in the process and can never be recovered.  I never recommend ripping a CD to a lossy format unless you are very clear in your mind that you really want/need smaller file sizes, are prepared to accept compromised sound quality to get them, and accept that if you ever change your mind you will have to rip all of your CDs over again.

Lossless file formats store the music data in such a way that all of the music data that was on the original CD can be precisely recreated during playback, bit for bit, each and every time.  This can be done either with or without compression.  The music data on a CD comprises 16 bits (2 Bytes) of data, per channel, 44,100 times per second.  So every second of music requires 176.4kB of disk space.  Lossless compression techniques can reduce this disk space requirement.  The amount of compression that can be achieved will vary depending on the musical content.  Some tracks will compress more than others.  But a rough guideline is that a lossless compressed file will use about one-half to two-thirds of the disk space compared to an uncompressed file.  This allows you to make a rough estimate of how much disk space you will need to store your entire collection.  Another (very) rough guideline is to allow for 200-300MB per CD (if compressed) for rock and jazz, and 300-400MB per CD (if compressed) for classical.  YMMV.
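If you want to run the arithmetic yourself, here is a short sketch; the album length and compression ratio are illustrative assumptions, not measurements.

    BYTES_PER_SAMPLE = 2          # 16 bits
    CHANNELS = 2
    SAMPLE_RATE = 44_100          # samples per second, per channel

    bytes_per_second = BYTES_PER_SAMPLE * CHANNELS * SAMPLE_RATE
    print(bytes_per_second)                       # 176,400 bytes, i.e. 176.4kB per second

    album_minutes = 60                            # assumed typical CD length
    uncompressed_mb = bytes_per_second * album_minutes * 60 / 1_000_000
    print(round(uncompressed_mb))                 # ~635 MB uncompressed for the assumed 60-minute disc

    flac_ratio = 0.6                              # assumed "one-half to two-thirds" lossless compression
    print(round(uncompressed_mb * flac_ratio))    # ~381 MB losslessly compressed at that assumed ratio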

There are two major uncompressed formats in use today, WAV and AIFF.  Both are, to all intents and purposes, identical.  The differences are marginal.  The former was developed by Microsoft for use on Windows machines, and the latter by Apple for use on Macs.  In reality, there is nothing to stop a Mac from reading a WAV file and vice-versa.  It is just a question of whether or not the software you are running supports that file format.

There are also two major lossless compressed formats in use today, FLAC and Apple Lossless (also called ALAC, and sometimes ALE).  FLAC is an open-source format which has become widely adopted, and is now very close to being the de facto industry standard.  The latest version of the FLAC spec also includes an “uncompressed” option.  Apple Lossless, on the other hand, was developed by Apple for use with iTunes.  It was originally a private format, but has now been thoroughly hacked so third party software can support it.  But Apple has still not published a specification, and some minor incompatibility issues still surface from time to time.  It has no real use other than with iTunes, and lives on only because Apple still refuses to support the FLAC standard.  Apple Lossless files usually have the extension “.m4a”.

The two major lossy compression formats are MP3 (used everywhere – even in iTunes) and AAC (used only within iTunes).  I am not going to discuss lossy formats any further.  As they say at Ruth’s Chris Steak House, “customers wishing to order their steaks well done are invited to dine elsewhere”.

You will read in some places that music stored under various different lossless file formats actually sounds different.  This appears to stretch credibility somewhat.  Let me state for the record that if you are using BitPerfect there is absolutely no possibility of this happening.  At the start of playback, the file is opened, read, decoded, and loaded into memory.  This process normally takes less than five seconds (but can be longer for some higher-resolution music tracks).  Once that is done, precisely the same data will reside in precisely the same memory locations, regardless of what the file format was.  For the remainder of playback there is no possible mechanism by which the file format can influence the sound quality.  Arguably, if you use different software to play back the music, and the music is streamed from disk and not from memory, then the slight differences in specific disk and CPU activity needed to access the different file formats could conceivably be reflected in the resultant sound quality.  I have never personally heard any such differences, though.

It is really important to understand that the different file formats store their metadata in different ways.  WAV, for example – normally the first format that springs to people’s minds once they have dismissed MP3 – only supports a very limited number of metadata fields, few enough to be a serious strike against it in my view.  Some people modify the WAV format to include metadata in the ID3 format, which is a comprehensive metadata standard.  Unfortunately, this results in non-standard WAV files which your choice of playback software may have trouble reading.  Apple’s AIFF format supports ID3 out of the box, but Apple Lossless supports the QuickTime metadata format, a symptom of its “Apple proprietary” origins.  FLAC supports a comprehensive metadata format called Vorbis Comments, which are flexible and easy to read and write, but the standards that define what the fields should be and what should go in them are very lax indeed.  This is both an advantage (since you can define whatever metadata implementation you want) and a disadvantage (since the software that reads the metadata may not interpret it in the same way as the software that wrote it).  Having said that, this is only a problem if you want to store “extended” metadata that goes beyond the commonly implemented “standard” fields, in which case there are no existing standards that you can adhere to anyway, regardless of whichever file/metadata format you may choose.

Since having good metadata is in my view the principal raison d’être for moving to computer audio in the first place, this argues against using WAV files.  FLAC has become the de facto standard for lossless downloaded music, but the big strike against FLAC is that you cannot load FLAC files into iTunes.  So you cannot use FLAC with BitPerfect (for the moment).  AIFF and Apple Lossless therefore look like good bets, but in reality there is limited enthusiasm for these formats outside of the Apple ecosystem (although, to be fair, that is slowly changing).  At the root of this is a battle between Apple and the rest of the world for your music download dollars.

Please read the previous paragraph again.  There are no simple answers to the conundrum posed by it.

Most of the music I download is only available in FLAC format, but I cannot load these files into iTunes for playback using BitPerfect.  So my own approach is to transcode them to Apple Lossless immediately after downloading using a free App called XLD.  If I ever wanted to, it would be just as easy to convert them back to FLAC with absolutely zero loss of quality.  There are both free and paid Apps available on both Windows and Mac platforms which convert freely between lossless formats, so it is not really too big of a deal to convert an entire library from one format to another should the need arise.
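XLD handles the conversion through its graphical interface; if you would rather script it, a command-line encoder can perform the same lossless-to-lossless transcode.  Here is a rough sketch that shells out to ffmpeg (assuming it is installed; the file path is a placeholder).

    import subprocess
    from pathlib import Path

    def flac_to_alac(flac_path):
        """Transcode a FLAC file to Apple Lossless (.m4a) using ffmpeg.

        Lossless in, lossless out: the decoded audio is identical either way."""
        src = Path(flac_path)
        dst = src.with_suffix(".m4a")
        subprocess.run(["ffmpeg", "-i", str(src), "-c:a", "alac", str(dst)], check=True)
        return dst

    flac_to_alac("Downloads/SomeAlbum/01-SomeTrack.flac")   # placeholder path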

Back to Part I.
Part III can be found here

An expanded discussion of audio file formats can be found here.

Wednesday, 24 April 2013

Ripping your CD Collection – I. Metadata

It happens quite often.  People mention to me that they have started the process of ripping their CD collections to WAV files so they can start to play them through their computers.  And they haven’t paused to think it through before they start.  This is definitely an area where “look before you leap” or “an ounce of prevention is worth a pound of cure” can be held to apply.  Maybe I can help.

This is the first in a series of posts where I will talk about the real-world issues you will encounter when you take the plunge and commit to ripping your CD collection.  This mostly introductory post addresses the main predicament we face, and how we arrived at this juncture in the first place.

Part I    Metadata

I can’t think of anybody who has successfully made the transition from CDs to computer-based audio, and abandoned it to go back to CDs.  Once your music collection is safely tucked away on Hard Disk, the ability to navigate through it, to prepare playlists and collections, to browse intelligently – even to control it remotely using a mobile device such as an iPad or an iPhone – massively enriches the experience.  Even with a relatively mundane piece of software such as iTunes.  And serious high-end products such as Sooloos elevate the user experience much closer to the incredible possibilities that the brave new world of computer audio opens up for us.

My good friend ‘Richard H’ has something approaching 3,000 CDs in his collection.  They live in a selection of shelving units and cupboards that dot his listening room.  Richard knows pretty much where most of his CDs live, but occasionally some are hard to track down.  (Particularly if I’m the person who last put one away…)  Extracting full value from that collection involves not only knowing exactly where every disk sits, but also having a good memory for what tracks are on every one of those disks.  I’m sure many of you will identify with that.  But it is at least manageable.  It’s what we’ve all gotten used to.

On the other hand, my sister Barbara works for an NPR radio station, WKSU, which is one of the biggest classical music stations in the world.  Their music collection comprises MANY THOUSANDS of disks, and their ability to function as a station relies to a great degree on the people who work there knowing how to find every last piece of music they own.  It is a nightmare of a task, and I have no idea how they manage it, but it seems they do it very effectively!  The thing is, with classical music, how do you organize a library of thousands of CDs with the sole assistance of a BIG shelving unit?  Do you do it by composer, by musical style, by period, by performer, by record label … or do you just stack ’em up one by one in the order you bought ’em?  There is no natural solution.  Particularly since, with classical music, a single CD can contain works by different composers, in different musical styles, of different periods, by different performers, and so forth.

But once you rip that library into computer files, there is an immediate, and very natural solution.  All that information is just data, and computers handle data very, very well.  The challenge, then, is to get all that valuable data off the discs and into the computer.  And that is where the problems start.  Because the data isn’t on the discs in the first place.

All of the information that is relevant to the music on an audio disc is termed “metadata”.  Most of it is printed on the jewel case artwork, or in the enclosed booklet, but none of it is encoded on the disc itself.  Back when the format of the CD was devised, more than 30 years ago, the concept did not exist of wanting to read that information from the disc, and so nobody thought to standardize any method for putting it on there.  Finally, in the mid-1990’s, when a standard did emerge for combining audio and data onto the same disc, there was no interest – let alone any sort of agreement – in establishing a standard format for doing so.  So it never happened.

What did happen was what always happens when a stubborn industry fails to meet the needs of their customers.  The geeks step in and engineer a solution of their own.  In this case it was called MP3.  Techies realized that they could play their music on their computers, if only they could get their music in there in the first place.  The trouble was, music files were so darned HUGE that you couldn’t fit many on the size of hard drives that were available at the time.  It is easy to forget that way back then the capacity of a CD exceeded the capacity of most computer hard drives!  You had to do something to get the size of the files down.  That something was the MP3 format.

So it soon became possible to collect a fair-sized number of music files on your computer and play them using some custom software.  Of course, if you wanted to be able to properly manage the new music collection on your computer – or even just identify which tracks were which – you wanted access to some of that “metadata” that I described.  So the next thing the geeks developed was the ID3 “metadata” tagging system, which was a way to embed metadata into the same files that contained the music.  MP3 became a file format that would store not only the music, but also all of the information that describes the music.  It was a revolutionary development, to which the music industry responded with various enlightened practices including refusing to accept it, pretending it didn’t exist, and trying to ban it.

With the record industry standing off to one side with its head in the sand, the next thing the geeks did was to come up with huge on-line databases which “cloud-sourced” (as we would describe the activity today) all of this metadata, together with some very clever information that individual users could use to interact with it.  Using these on-line resources, you could insert a CD into your computer, some clever software would analyze the CD, correctly identify it, locate all of its metadata, and – Bingo! – automatically insert it into the resultant audio files as part of the ripping process.

The “end of the beginning” (if I may channel Churchill) came when the hard disk industry started manufacturing drives big enough to hold the contents of a large number of CDs, and in response the geeks started developing alternative formats to MP3 which could store the music in a lossless form – the FLAC file format is by far the most popular – thereby preserving intact all of the musical information.  These new formats would also support the new high-definition audio standards that were emerging at the same time.

Thus, with the support of an enthusiastic, geek-driven, audio hardware industry, the computer-based audio paradigm reached its first level of practical maturity.  The record industry at first refused staunchly to participate, and now that they are finally getting on board with downloading as a legitimate mainstream sales & marketing channel, they can no longer hope to control its de facto standards, which continue to evolve pretty much independently, for better or for worse.  Which is why we need a set of posts like this one – and the rest in this short series – to guide you through the perils of ripping a large collection.  Because it can be quite a frustrating business, and can take up an awful lot of your time.

Part II can be found here.

Tuesday, 23 April 2013

The Nutty Professor

From some of my recent posts you will have observed that BitPerfect has been heavily involved in DSD over the past several weeks.  DSD is a form of Sigma-Delta Modulation (SDM), which, as I have pointed out, is a mathematically challenging concept.  Just to grasp its most basic form is quite an achievement in its own right, but as soon as you think you have got your head around it you learn that there are yet further wrinkles you need to understand, and it just goes on and on and on.  It is very dense in its reliance on mathematics, and in fact you could earn a PhD studying and developing ever better forms of SDM, or coming up with newer and deeper understandings regarding distortion, stability, and the like.

For BitPerfect, we have been looking to find some “grown-up help”, in the form of a person or persons in the world of academia who can (a) help us to better understand the concepts; (b) help us to steer a path through the state-of-the-art in terms of both current implementations and the latest theoretical developments; (c) help us to avoid re-inventing wheels wherever possible; and (d) simply help to sort out facts from nonsense.  The last one of these is quite important – more so than you might imagine – because there is a lot of nonsense out there, mixed in with all the facts, and you really don’t want to waste brain cycles on any of it.

You would think it would be easy to develop the sort of relationships we are looking for, but not so.  Facts and nonsense still get in the way.  Take the Nutty Professor I recently met with.  This gentleman is head of a faculty which calls itself something along the lines of the Faculty of Digital Music Technology (I’m not going to identify this person).  Our conversation got off on the wrong foot when, right off the bat, he insisted that DSD and PCM were in essence the same thing, and that you could losslessly convert between one format and the other (as you can between FLAC and Apple Lossless, for example).  In his view, both were simply digital storage formats and so they HAD to have direct equivalence.  He was quite adamant about this, but didn’t want to justify it.  I was to accept it as a fact.  Since a significant element of what I was looking for was clarity of thought on matters precisely such as this, I came away from the encounter somewhat disappointed.  At the time I wished I had the necessary understanding to present at least a simple argument to the Nutty Professor to counter his position, but I didn’t have one.

Today, I do – which I think is sufficiently elegant that I want to share it with you.  And I don’t think you need a background in mathematics to grasp it.

Refer to the graph below.  I have plotted the noise floor – in effect the inverse of the signal-to-noise ratio (SNR) – as a function of frequency.  The red line is a curve which is typical of DSD.  The noise floor is very low across the frequency range that is important for high quality music playback (20Hz – 20kHz), and rises very dramatically at higher frequencies.  This is the famous Noise Shaping (that I described in yesterday’s post) in action.  Superimposed upon that is the blue line representing PCM in its 24-bit 88.2kHz form.  One simple way to interpret these curves is that each format is capable of fully encoding musical signals at any point above its line, and is incapable of fully encoding anything below it.
Suppose we have music encoded in the DSD format, and we convert it to 24/88.2 PCM format.  If we do this, all of the musical information represented by the hashed region labeled [A] must by necessity be lost.  This information is encoded into the DSD data stream, but cannot be represented by the PCM data stream.  Likewise, suppose we convert the 24/88.2 PCM data stream to DSD.  In this case, all of the musical information represented by the hashed region labeled [B] must by necessity be lost.  This information is encoded into the PCM data stream, but cannot be represented by the DSD data stream.  Regardless of whether we are converting DSD to PCM or the other way round, information is being lost.
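To put some very rough numbers on those two regions, the theoretical limits of the PCM side are easy to compute; the DSD figures in the comments below are commonly quoted ballpark values, not anything derived here.  A hedged sketch:

    def pcm_dynamic_range_db(bits):
        """Theoretical SNR of ideal PCM quantization: 6.02*N + 1.76 dB."""
        return 6.02 * bits + 1.76

    pcm_bits = 24
    pcm_sample_rate = 88_200                           # the 24/88.2 PCM of the graph

    print(round(pcm_dynamic_range_db(pcm_bits), 1))    # ~146.2 dB available below Nyquist
    print(pcm_sample_rate / 2)                         # 44,100 Hz: nothing above this survives in PCM

    # DSD64 runs at 2.8224 MHz with a 1-bit word, so it can carry content far above
    # 44.1 kHz (region [A]), but its noise-shaped in-band noise floor (an SNR commonly
    # quoted at roughly 120 dB) sits well above 24-bit PCM's theoretical floor (region [B]).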

Of course, there is an argument to be made regarding lost information: if it represents something inaudible, then we can afford to throw it away.  In the example I have shown, the information contained in both the [A] and [B] regions is arguably inaudible.  But don’t tell me that the conversion is lossless.  With a computer it is quite trivial to convert back and forth as often as you like between FLAC and Apple Lossless.  You can do it hundreds, thousands, even millions of times (if you are prepared to wait) and the music will remain unchanged.  Do the same thing between DSD and 24/88.2 PCM, and even after a hundred cycles the music will be all but unlistenable.

The Nutty Professor will not be advising BitPerfect.

Monday, 22 April 2013

Dither and Noise Shaping

People often ask what dithering does.  Most seem to know that it involves adding random noise to digital music for the apparently contradictory purpose of making it sound better, but don’t know how it accomplishes that.  On the other hand, very few people understand what Noise Shaping is and what it actually does – or that it is in reality a form of dithering.  Since neither concept is particularly difficult to grasp, I thought you might appreciate a short post on the subject.  I warn you, though, there are going to be some NUMBERS involved, so you might want to keep a pencil and a piece of paper to hand.

Suppose we encode a music signal as a PCM data stream (read my earlier post “So you think you understand Digital Audio?” if you are unsure how that works).  Each PCM “sample” represents the magnitude of the musical waveform at the instant it is measured, and its value is stored as that of the nearest “quantized level”.  These “quantized levels” are the limited set of values that the stored data can take, and so there will be some sort of error associated with quantization.  This “quantization error” is the difference between the actual signal value and the stored “quantized” value.  For example, suppose the value of the signal is 0.5004 and the two closest quantization levels are 0.500 and 0.501.  If we store the PCM sample as 0.500 then the associated quantization error is +0.0004.  If we had chosen to store the PCM sample as 0.501 then the associated quantization error would have been -0.0006.  Since the former is less of an error than the latter, it seems obvious that the former is the more accurate representation of the original waveform.  Are you with me so far?

One way to look at the PCM encoding process is to think of it as storing the exact music signal, plus an added error signal comprising all of the quantization errors.  The quantization errors are the Noise or Distortion introduced by the quantization process.  The distinction between noise and distortion is critically important here.  The difference between the two is that distortion is related to the underlying signal (the term we use is “Correlated”), whereas noise is not.

I am going to go out of my way here to give a specific numerical example because it is quite important to grasp the notion of Correlation.  I am going to give a whole load of numbers, and it would be best if you had a pencil and paper handy to sketch them out in rough graphical form.  Suppose we have a musical waveform which is a sawtooth pattern, repeating the sequence:
0.3000, 0.4002, 0.5004, 0.6006, 0.7008, 0.8010, 0.7008, 0.6006, 0.5004, 0.4002, 0.3000 …
Now, let’s suppose that our quantization levels are equally spaced every 0.001 apart.  Therefore the signal will be quantized to the following repeating sequence:
0.300, 0.400, 0.500, 0.601, 0.701, 0.801, 0.701, 0.601, 0.500, 0.400, 0.300 …
The resultant quantization errors will therefore comprise this repeating sequence:
0.0000, +0.0002, +0.0004, -0.0004, -0.0002, 0.0000, -0.0002, -0.0004, +0.0004, +0.0002, 0.0000 …
If you plot these repeating sequences on a graph, you will see that the sequence of Quantization Errors forms a pattern that is intriguingly similar to the original signal, but is not quite the same.  This is an example of a highly correlated quantization error signal.  What we want, ideally, is for the quantization errors to resemble, as closely as possible, a sequence of random numbers.  Totally random numbers represent pure noise, whereas highly correlated numbers represent pure distortion.
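If you would rather let the computer do the arithmetic than the pencil and paper, here is a tiny sketch that reproduces the sequences above, quantizing to the nearest 0.001 step and printing the resulting errors:

    from decimal import Decimal

    signal = [Decimal(s) for s in
              ("0.3000", "0.4002", "0.5004", "0.6006", "0.7008", "0.8010",
               "0.7008", "0.6006", "0.5004", "0.4002", "0.3000")]

    step = Decimal("0.001")                       # spacing of the quantization levels

    quantized = [(x / step).to_integral_value() * step for x in signal]
    errors = [x - q for x, q in zip(signal, quantized)]

    print(", ".join(str(q) for q in quantized))   # 0.300, 0.400, 0.500, 0.601, 0.701, 0.801, ...
    print(", ".join(str(e) for e in errors))      # 0.0000, 0.0002, 0.0004, -0.0004, -0.0002, 0.0000, ...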

In reality, any set of real-world quantization error numbers can be broken down into a sum of two components – one component which is reasonably well correlated, and another which is pretty much random.  Psychoacoustically speaking, the ear is far more sensitive to distortion than to noise.  In other words, if we can replace a small amount of distortion with a larger amount of noise, then the result may be perceived as sounding better.

Time to go back to our piece of hypothetical data.  Suppose I take a random selection of samples, and modify them so that we choose not the closest quantization level, but the second-closest.  Here is one example – the signal is now quantized to the following repeating sequence:
0.300, 0.400, 0.501, 0.600, 0.701, 0.801, 0.700, 0.601, 0.500, 0.401, 0.300 …
The resultant quantization errors now comprise this repeating sequence:
0.0000, +0.0002, -0.0006, +0.0006, -0.0002, 0.0000, +0.0008, -0.0004, +0.0004, -0.0008, 0.0000...

There are three things we can take away from this revised quantization error sequence.  The first is that it no longer looks as though it is related to the original data, so it is no longer correlated, and looks a lot more like noise.  The second is that the overall signal level has gone up, so we have replaced a certain amount of correlated signal with a slightly larger amount of noise signal.  Third, and this is where we finally get around to the second element of this post, the noise seems to have quite a lot of high-frequency energy associated with it.

So here we have the concepts of Dither and Noise Shaping in a nutshell.  By carefully re-quantizing certain selected samples of the music data stream in a pseudo-random way, we can replace distortion with noise.  Likewise, using what amounts to the same technique, we can do something very similar and replace an amount of noise in the portion of the frequency band to which we are most sensitive, with a larger amount of noise in a different frequency band to which we are less sensitive, or which we know can be easily filtered out at some later stage.
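The worked example above picked its re-quantized samples by hand for the sake of clarity.  In practice, dither is usually applied by adding a small amount of random noise – commonly with a triangular distribution spanning about ±1 quantization step – to every sample before rounding.  Here is a minimal sketch of that textbook approach; note that it demonstrates plain dither only, since true noise shaping additionally feeds each sample’s rounding error back into the samples that follow, which is beyond this little example.

    import random

    def dithered_quantize(sample, step=0.001):
        """Quantize one sample after adding TPDF (triangular) dither of about +/- one step.

        The added randomness decorrelates the rounding error from the signal,
        trading a little extra noise for the removal of correlated distortion."""
        tpdf = (random.random() - random.random()) * step   # triangular PDF, range (-step, +step)
        return round((sample + tpdf) / step) * step

    signal = [0.3000, 0.4002, 0.5004, 0.6006, 0.7008, 0.8010]
    print([round(dithered_quantize(s), 3) for s in signal])
    # e.g. [0.3, 0.4, 0.501, 0.6, 0.701, 0.801] - the exact pattern varies from run to run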

One thing needs to be borne in mind, though.  Dithering and Noise Shaping operate only on the noise which is being added to the signal as a result of a quantization process, and not on the noise which is already present in the signal.  After the Dithering and Noise Shaping, all of this new noise is now incorporated into the music signal, and is no longer separable.  So you have to be really careful about when you introduce Dither or Noise Shaping into the signal, and how often you do it, because its effects are cumulative.  If you do it too many times, it is easy to end up with an unacceptable amount of high frequency noise.

I hope you were able to follow that, and I apologize again for the ugly numbers :)

Friday, 19 April 2013

Surround Sound – Making a Comeback?

BitPerfect user John Bacon-Shone correctly pointed out in response to my recent musings on DSD that the SACD format delivers a huge amount of music in surround sound format, which is a particular boon to classical music listeners.  And not many people are aware of that.

Surround sound as a consumer format goes back to the 1970’s, although its roots precede that by several decades in cinematic applications, and even in concert performances such as Pink Floyd’s “Games for May” concert of 1967.  The appeal of surround sound is quite obvious – why constrain the sonic image to the traditional one of a stage set out in front of you?  Arguably, this idea was first reduced to practice by Hector Berlioz in his “Grande Messe des Morts”, or Requiem, waaaaay back in 1837, which called for four brass bands to be located at the front, the back, and the two sides of the performance venue.

In the 1970’s several consumer formats appeared, each aimed at extending the two-speaker stereo layout with an additional pair of rear speakers.  The term “Quadrophonic” was coined to describe this arrangement, and there was much enthusiasm in the music industry to support 4-channel technology with recorded material.  As we now know, when a new consumer technology tries to emerge, the major stakeholders take turns to shoot themselves in the foot.  In this case, the hardware manufacturers brought forth a plethora of incompatible solutions to deliver a four-channel experience.  QS, SQ, CD-4 (all LP-based formats), 8-track tape, and surprisingly many others, all came and went.  It was another 20 years before the movie industry, and its DVD technology, finally lit a fire under the surround sound concept.

One of the problems with surround sound is that it is much harder to create a solid 3-dimensional sonic image which creates the same soundfield for multiple listeners distributed throughout the listening room.  This problem is exacerbated for home theater applications where there is a physical image (the screen), and a need for much of the sound – particularly the dialogue – to appear to come from it.  This resulted in the adoption of the front center speaker, through which dialogue can be readily piped.  Also, in movie soundtracks the role of deep bass is dramatically different from that of pure audio, and so a special channel which provides only deep bass (the “Low Frequency Effects” channel) was specified.  This complete configuration is well known today as “5.1”.  Additional main speakers tend to be added from time to time, and today’s home theater receivers often support up to “7.1” channels.

Now that surround sound’s structural formats have at last become established, the music industry can focus on recording and delivering music in multi-channel formats.  The venerable CD is too old to be adapted to surround sound, and so SACD is now the only viable hardware format available for delivery of multi-channel audio content (its one-time competitor and supposed vanquisher, DVD-Audio, is all but extinct now).  Except that here in the West, as consumers, we omitted to climb on the SACD bandwagon.  If only Sony and Philips had marketed SACD as a surround-sound format rather than an audio quality format, things might have turned out differently.

Anyway, audiophiles being audiophiles, the surround sound debate is alive and well.  There is an emerging body of opinion that says the centre speaker is actually ruinous when it comes to creating a stable sonic image.  Additionally, sub-woofer advocates believe that a single LFE channel is inadequate, and that each full-range speaker needs its own sub-woofer.  There is also (thankfully) some agreement that, for classical music at least, the two rear speakers do not need the full 20Hz bass response.  What we used to refer to as “Quadrophonic” is now called 4.0 and one of its keenest advocates is Peter McGrath, a revered recording engineer whose day job is Sales Director for Wilson Audio Specialties.  Peter’s classical recordings are absolutely the finest I have ever heard, so his opinion counts for something!  But relatively few multichannel SACDs are presented in 4.0 format.

Thursday, 18 April 2013

DSD – The Next Big Thing?

The bleeding edge of the audiophile universe – inhabited by those of us who probe ever deeper into the outer reaches of diminishing returns in search of audio playback perfection – is strangely characterized by apparently outdated, abandoned, superseded technologies, shouting their last hurrahs in stunningly expensive Technicolor. Tubes and turntables are guilty as charged here, and I own both.

Why do some apparently stone-age technologies still persist, yet others less venerable vanish never to be heard from again (hello cassette tapes, receivers, and soon CD players)? In the cases of tubes and turntables, I venture to suggest that these are technologies which, at their zenith, were the products of craftsmanship and ‘black art’ rather than the concentrated application of science. Their full scientific potential was never truly reached, and they were replaced for reasons of practicality, convenience, and cost. But they still have not gone away.

A slightly different situation arose for the SACD, a technology developed by Sony and Philips as an intended replacement for the CD around the turn of the millennium. SACD was designed from the start to be a vehicle for delivering notably superior sound quality compared to the CD, which is strange, since the same two companies foisted CD on us under the pretext of “Pure Perfect Sound, Forever”. But whereas in the 1980’s they were able to create real consumer demand for a delivery platform which was convincingly marketed as being superior to the LP, with SACD they found that there was in fact no market interest in sound quality superior to CD. In fact, their customers were more preoccupied with a delivery format of demonstrably INFERIOR sound quality – the MP3 file. But that is another story.

The SACD fizzled upon launch, but thanks to the Japanese, it didn’t actually die. There is a healthy market for the SACD in Japan, and this is sufficient to keep the format alive, if not necessarily healthy. So what is it with the SACD? Does it actually sound better? And if so, how does it do that?

Well, yes, there is broadly held agreement that SACD does indeed sound markedly better than CD, and arguably even CD’s high-resolution PCM format cousins (with 24-bit bit depth and higher sampling rates). You see, SACD stores its digital music in a totally different way than CD. It uses a format called DSD, which I shall not go into here, save to say that conversion from DSD to PCM seems to consistently result in some significant sacrifice of sound quality.

Here in the West, where we never really adopted the SACD, we moved from listening to music on CDs to listening to music stored in computer files. So, instead of wondering whether or not to adopt the SACD, we ask whether or not we can store music in DSD format in computer files and have the best of both worlds. Well, of course we can! What did you think?...

Two file formats, one developed by Sony called DSF, and one developed by Philips called DFF, have emerged in recent years. If you have a PC, you can easily send DSD bitstreams from DSF and DFF files to DACs that support DSD. On the Mac, it is a little more complicated, and there is an emerging standard called DoP (DSD over PCM) which enables Mac users to transmit DSD over USB and other asynchronous communications interfaces. Boutique record labels are emerging, such as Blue Coast Records, which record exclusively in DSD and sell DSF/DFF files for download.
http://bluecoastrecords.com/
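To give a flavour of how DoP works: each 24-bit PCM sample carries 16 consecutive DSD bits in its lower two bytes, with a marker byte that alternates between 0x05 and 0xFA on top, so a DoP-aware DAC can recognize the stream as DSD rather than ordinary PCM audio. Here is a rough per-channel sketch of that packing, based on my reading of the openly published DoP proposal rather than any shipping implementation:

    DOP_MARKERS = (0x05, 0xFA)   # alternate frame by frame so the DAC can detect DoP

    def pack_dop_frames(dsd_bytes):
        """Pack a per-channel stream of DSD bytes (8 bits each, oldest bit first)
        into 24-bit frames: marker byte on top, 16 DSD bits below."""
        frames = []
        for i in range(0, len(dsd_bytes) - 1, 2):           # 16 DSD bits per frame
            marker = DOP_MARKERS[(i // 2) % 2]
            frames.append((marker << 16) | (dsd_bytes[i] << 8) | dsd_bytes[i + 1])
        return frames

    # Toy example: eight bytes of a made-up DSD bitstream -> four 24-bit frames
    print([hex(f) for f in pack_dop_frames([0x69, 0x96, 0xAA, 0x55, 0xF0, 0x0F, 0xCC, 0x33])])
    # ['0x56996', '0xfaaa55', '0x5f00f', '0xfacc33']  (hex() drops the leading zero of 0x05...)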

Perhaps most intriguing is that many of the major labels – but DON’T go looking for much in the way of public acknowledgement – have discovered a preference for using the DSD format for archival of their analog tape back catalog, having once already gone down the path of digitizing it to PCM and finding it to have been sadly lacking. Don’t look for this to happen any time soon, but this lays the groundwork for the major labels to finally release their back catalog in a format that truly captures the sound quality of the original master tapes. Before that happens, the labels are going to have to realize that the only sustainable format for music distribution is going to be one that works on-line, and they are going to have to find a way to make that work for them.

DSD could end up emerging as the format of choice for audiophile quality audio playback.

Monday, 1 April 2013

Naim DAC-V1

Naim Audio now provides setup instructions for using BitPerfect with their DAC-V1, which you can download from their web site.  A very nice route to obtaining "World Class Sound"!