It happens quite often. People mention to me that they have started the process of ripping their CD collections to WAV files so they can start to play them through their computers. And they haven’t paused to think it through before they start. This is definitely an area where “look before you leap” or “an ounce of prevention is worth a pound of cure” can be held to apply. Maybe I can help.
This is the first in a series of posts where I will talk
about the real-world issues you will encounter when you take the plunge and
commit to ripping your CD collection.
This mostly introductory post addresses the main predicament we face,
and how we arrived at this juncture in the first place.
Part I – Metadata
I can’t think of anybody who has successfully made the
transition from CDs to computer-based audio, and abandoned it to go back to
CDs. Once your music collection is
safely tucked away on Hard Disk, the ability to navigate through it, to prepare
playlists and collections, to browse intelligently – even to control it remotely
using a mobile device such as an iPad or an iPhone – massively enriches the
experience. Even with a relatively
mundane piece of software such as iTunes.
And serious high-end products such as Sooloos elevate the user
experience much closer to the incredible possibilities that the brave new world
of computer audio opens up for us.
My good friend ‘Richard H’ has something approaching 3,000
CDs in his collection. They live in a
selection of shelving units and cupboards that dot his listening room. Richard knows pretty much where most of his
CDs live, but occasionally some are hard to track down. (Particularly if I’m the person who last put
it away…) Extracting full value from
that collection involves not only knowing exactly where every disc sits, but
also having a good memory for what tracks are on every one of those discs. I’m sure many of you will identify with that. But it is at least manageable. It’s what we’ve all gotten used to.
On the other hand, my sister Barbara works for an NPR radio
station, WKSU, which is one of the biggest classical music stations in the
world. Their music collection comprises
MANY THOUSANDS of discs, and their ability to function as a station relies to a
great degree on the people who work there knowing how to find every last piece
of music they own. It is a nightmare of
a task, and I have no idea how they manage that, but it seems they do it very
effectively! The thing is, with
classical music, how do you organize a library of thousands of CDs with the
sole assistance of a BIG shelving unit?
Do you do it by composer, by musical style, by period, by performer, by
record label … or do you just stack ’em up one by one in the order you bought ’em? There is no natural solution. Particularly since, with classical music, a
single CD can contain works by different composers, in different musical
styles, of different periods, by different performers, and so forth.
But once you rip that library into computer files, there is
an immediate, and very natural solution.
All that information is just data, and computers handle data very, very
well. The challenge, then, is to get all
that valuable data off the discs and into the computer. And that is where the problems start. Because the data isn’t on the discs in the first place.
All of the information that is relevant to the music on an
audio disc is termed “metadata”. Most of
it is printed on the jewel case artwork, or in the enclosed booklet, but none
of it is encoded on the disc itself.
Back when the format of the CD was devised, more than 30 years ago, the
concept did not exist of wanting to read that information from the disc, and so
nobody thought to standardize any method for putting it on there. Finally, in the mid-1990s, when a standard
did emerge for combining audio and data onto the same disc, there was no interest
– let alone any sort of agreement – in establishing a standard format for doing
so. So it never happened.
What did happen was what always happens when a stubborn
industry fails to meet the needs of its customers. The geeks step in and engineer a solution of
their own. In this case it was called
MP3. Techies realized that they could
play their music on their computers, if only they could get their music in
there in the first place. The trouble
was, music files were so darned HUGE that you couldn’t fit many on the size of
hard drives that were available at the time.
It is easy to forget that way back then the capacity of a CD exceeded
the capacity of most computer hard drives!
You had to do something to get the size of the files down. That something was the MP3 format.
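To put rough numbers on that, here is a back-of-the-envelope sketch in Python. The CD audio parameters (44.1 kHz, 16-bit, stereo) are the standard ones; the 74-minute disc length and the 128 kbps MP3 bit rate are illustrative assumptions of mine, not figures taken from any particular rip, and the function names are made up for this example.

```python
# Back-of-the-envelope comparison of uncompressed CD audio and MP3.
# Assumptions (illustrative only): a 74-minute disc and a 128 kbps MP3.

SAMPLE_RATE_HZ = 44_100   # CD audio sample rate
BITS_PER_SAMPLE = 16      # CD audio bit depth
CHANNELS = 2              # stereo


def cd_bit_rate_bps() -> int:
    """Bit rate of uncompressed CD audio, in bits per second."""
    return SAMPLE_RATE_HZ * BITS_PER_SAMPLE * CHANNELS


def size_megabytes(bit_rate_bps: int, minutes: float) -> float:
    """Size in megabytes (10^6 bytes) of audio at a constant bit rate."""
    return bit_rate_bps * minutes * 60 / 8 / 1_000_000


if __name__ == "__main__":
    minutes = 74          # assumed disc length
    mp3_bps = 128_000     # assumed MP3 bit rate
    cd_bps = cd_bit_rate_bps()
    print(f"CD audio bit rate: {cd_bps / 1000:.1f} kbps")
    print(f"Uncompressed disc: {size_megabytes(cd_bps, minutes):.0f} MB")
    print(f"Same disc as MP3:  {size_megabytes(mp3_bps, minutes):.0f} MB")
    print(f"Reduction factor:  {cd_bps / mp3_bps:.1f}x")
```

Under those assumptions a full disc is roughly 783 MB uncompressed against about 71 MB as MP3, a reduction of about 11 to 1.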
So it soon became possible to collect a fair-sized number of
music files on your computer and play them using some custom software. Of course, if you wanted to be able to
properly manage the new music collection on your computer – or even just
identify which tracks were which – you wanted access to some of that “metadata”
that I described. So the next thing the
geeks developed was the ID3 “metadata” tagging system, which was a way to embed
metadata into the same files that contained the music. MP3 became a file format that would store not
only the music, but also all of the information that describes the music. It was a revolutionary development, to which
the music industry responded with various enlightened practices including
refusing to accept it, pretending it didn’t exist, and trying to ban it.
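To make the idea of metadata living inside the music file a bit more concrete, here is a minimal Python sketch that reads the earliest and simplest flavor of ID3, usually called ID3v1: a fixed 128-byte block at the very end of the file, beginning with the letters "TAG". I am assuming ID3v1 purely because it is the easiest to show; later versions of ID3 are considerably richer and are laid out differently. The function names are my own inventions for illustration, and for real work you would reach for an established tagging library rather than this.

```python
import struct
from typing import Optional

# ID3v1 layout: "TAG", title, artist, album, year, comment, genre byte.
# 3 + 30 + 30 + 30 + 4 + 30 + 1 = 128 bytes, stored at the end of the file.
ID3V1_LAYOUT = struct.Struct("<3s30s30s30s4s30sB")
ID3V1_SIZE = ID3V1_LAYOUT.size  # 128


def _text(raw: bytes) -> str:
    """Decode a fixed-width field, dropping NUL and space padding."""
    return raw.split(b"\x00", 1)[0].decode("latin-1").strip()


def parse_id3v1(data: bytes) -> Optional[dict]:
    """Return the ID3v1 fields at the end of `data`, or None if absent."""
    if len(data) < ID3V1_SIZE:
        return None
    marker, title, artist, album, year, comment, genre = ID3V1_LAYOUT.unpack(
        data[-ID3V1_SIZE:]
    )
    if marker != b"TAG":
        return None
    return {
        "title": _text(title),
        "artist": _text(artist),
        "album": _text(album),
        "year": _text(year),
        "comment": _text(comment),
        "genre": genre,  # numeric index into a fixed genre list
    }


def build_id3v1(title: str, artist: str, album: str, year: str) -> bytes:
    """Build a 128-byte ID3v1 block (hypothetical helper, for the demo only)."""
    return ID3V1_LAYOUT.pack(
        b"TAG",
        title.encode("latin-1"),
        artist.encode("latin-1"),
        album.encode("latin-1"),
        year.encode("latin-1"),
        b"",
        255,  # 255 is conventionally used for "no genre set"
    )


if __name__ == "__main__":
    # Stand-in for an MP3 file: some placeholder bytes followed by a tag.
    fake_file = b"\x00" * 1000 + build_id3v1("Some Title", "Some Artist", "Some Album", "1994")
    print(parse_id3v1(fake_file))
```

Note how little room there is: 30 characters each for title, artist and album, and nothing at all for composer or performer, which gives a hint of why classical collections in particular strain these schemes.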
With the record industry standing off to one side with its
head in the sand, the next thing the geeks did was to come up with huge on-line
databases which “crowd-sourced” (as we would describe the activity today) all
of this metadata, together with some very clever information that individual
users could use to interact with it.
Using these on-line resources, you could insert a CD into your computer,
some clever software would analyze the CD, correctly identify it, locate all of
its metadata, and – Bingo! – automatically insert it into the resultant audio files as
part of the ripping process.
The “end of the beginning” (if I may channel Churchill) came
when the hard disk industry started manufacturing drives big enough to hold the
contents of a large number of CDs, and in response the geeks started
developing alternative formats to MP3 which could store the music in a lossless
form – the FLAC file format is by far the most popular – thereby preserving intact
all of the musical information. These
new formats would also support the new high definition audio standards that
were emerging at the same time.
Thus, with the support of an enthusiastic, geek-driven,
audio hardware industry, the computer-based audio paradigm reached its first
level of practical maturity. The record industry
at first refused staunchly to participate, and now that they are finally getting
on board with downloading as a legitimate mainstream sales & marketing
channel, they can no longer hope to control its de facto standards, which
continue to evolve pretty much independently, for better or for worse. Which is why we need a set of posts like this
one – and the rest in this short series – to guide you through the perils of
ripping a large collection. Because it
can be quite a frustrating business, and can take up an awful lot of your time.
Part II can be found here.