There 
are two uses of the word ‘compression’ in the world of digital audio.  The
 first is dynamic compression.  This is where we want to increase the 
volume of a track, but in doing so we make the loudest bits so
 loud that their signal level is larger than the maximum value the 
format can encode.  Here we would use “dynamic compression” to 
selectively reduce the gain on those loudest passages so that they fit 
inside the available headroom.  This note is not about dynamic 
compression.  Instead it is all about file compression.
 
 File 
compression is a process that takes a computer file which takes up a 
certain number of Megabytes of storage space, and manipulates it so it 
takes up a lesser number of Megabytes.  Ideally, but not necessarily 
always, this compression is lossless, by which we mean that identical 
raw data can be extracted from both the original file and the compressed
 file.  There are two reasons for wanting to do this: to reduce the 
amount of storage space required to store the file, and to reduce the 
bandwidth required to transmit a file from one place to another within a
 constrained amount of time.
 
 Most of the time, we find that 
everyday computer files can be readily compressed.  Why is this?  In the
 software world, the format of a file is typically chosen so as to allow
 the computer to write data to the file, and read data from the file, in
 an efficient manner.  Scant regard is often paid to the resultant 
efficiency of data storage.  An example might be a simple text file.  The 
basic ASCII character set needs only 7 bits to encode each character.  However, 
computer files are typically written in chunks of 8 bits, called bytes. 
 So every time we want to write a character we use up 8 bits of storage 
when in practice we only needed 7 bits.  A simple file compression 
technique can use this observation to recover the unused storage space 
and reduce the file size by one eighth.  With more complex file 
structures, a general-purpose strategy is not so obvious.  Native music 
file formats are similarly inefficient.
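
The 7-bit observation above can be sketched in a few lines of Python.  This is only an illustration, not how any particular compression utility works; the function names pack_7bit and unpack_7bit are made up for this example.  It writes the 7 useful bits of each character back to back, so 8 characters fit into 7 bytes.

```python
def pack_7bit(text: str) -> bytes:
    """Pack ASCII characters back to back, 7 bits each (hypothetical helper)."""
    data = text.encode("ascii")      # raises if a character does not fit in 7 bits
    bits = 0                         # bit accumulator
    nbits = 0                        # number of valid bits held in the accumulator
    out = bytearray()
    for byte in data:
        bits = (bits << 7) | byte
        nbits += 7
        while nbits >= 8:            # emit every complete byte
            nbits -= 8
            out.append((bits >> nbits) & 0xFF)
            bits &= (1 << nbits) - 1
    if nbits:                        # pad the final partial byte with zero bits
        out.append((bits << (8 - nbits)) & 0xFF)
    return bytes(out)


def unpack_7bit(packed: bytes, count: int) -> str:
    """Recover `count` characters; the count is needed because of the padding."""
    bits = 0
    nbits = 0
    chars = []
    for byte in packed:
        bits = (bits << 8) | byte
        nbits += 8
        while nbits >= 7 and len(chars) < count:
            nbits -= 7
            chars.append(chr((bits >> nbits) & 0x7F))
            bits &= (1 << nbits) - 1
    return "".join(chars)


text = "Hello, world!!!!"            # 16 characters
packed = pack_7bit(text)
print(len(text), "bytes stored as", len(packed), "bytes")
```

The decoder has to be told how many characters to expect, which is a reminder that even a trivial scheme like this needs a little bookkeeping of its own.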
 
 Anybody who has used a 
zipping program to make a ZIP file to transmit a file over the Internet 
will be familiar with lossless compression.  ZIP is a 
general-purpose lossless file compression format.  Some files, for 
example Bitmap (BMP) image files, will compress very nicely into much 
smaller ZIP files.  On the other hand, files such as JPG images are very 
 seldom reduced at all in file size by zipping.  This is because the file
 format used for BMP files is particularly inefficient (it typically stores every pixel uncompressed), whereas by 
contrast the file format for JPG files is highly efficient, its data having already been compressed.  In 
principle, any computer file can be reduced in size by a well-chosen 
lossless compression utility, unless the file format was specified to be
 efficiently compressed in the first place.
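
The contrast is easy to see with Python's standard zlib module, which implements DEFLATE, the method most commonly used inside ZIP files.  In this rough sketch the two inputs are stand-ins chosen for illustration, not real image files: a highly repetitive byte pattern plays the part of a flat-coloured bitmap, and random bytes play the part of data that has already been efficiently compressed.

```python
import os
import zlib


def compressed_ratio(data: bytes) -> float:
    """Return compressed size divided by original size, checking the round trip."""
    packed = zlib.compress(data, 9)          # 9 = strongest compression level
    assert zlib.decompress(packed) == data   # lossless: we get identical bytes back
    return len(packed) / len(data)


# Stand-in for an uncompressed bitmap: the same 3-byte "pixel" repeated.
redundant = bytes([200, 30, 30]) * 100_000
# Stand-in for already-compressed data: bytes with no exploitable pattern.
random_like = os.urandom(300_000)

print("redundant   ->", f"{compressed_ratio(redundant):.4f}")
print("random-like ->", f"{compressed_ratio(random_like):.4f}")
```

The repetitive data should shrink to a tiny fraction of its size, while the random data should not shrink at all and may even grow by a few bytes.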
 
 In general, the 
more we know about a file, and about the data that the file contains, 
the more freedom we can have in selecting an optimum strategy to 
compress it.  With music files there are a number of attributes that can 
be exploited to effect lossless compression.  Here are two of the easier
 to describe attributes:  (i) Because music files encode a waveform, and
 because the waveform is not totally random (in which case it would be 
noise, not music), we can use the waveform’s immediate past to predict 
what its immediate future might look like, and encode instead the 
differences between the predictions and the actual values.  This is used
 very effectively in many well-known lossless encoders.  (ii) Stereo 
music content is often dominated by centred images, which contain identical 
information in the right and left channels.  If instead of encoding L 
and R, we encode L+R and L-R we find we end up with waveforms that are 
more readily susceptible to other compression methodologies.
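
Both ideas can be tried out on a synthetic signal.  The sketch below makes several assumptions for the sake of illustration: the "music" is a plain 440 Hz tone, the right channel is a near copy of the left, and the predictor simply extends a straight line through the previous two samples.  Real lossless encoders use their own, more refined predictors.

```python
import math


def predict_residual(samples):
    """Store the first two samples as-is, then only the prediction errors."""
    residual = list(samples[:2])
    for n in range(2, len(samples)):
        prediction = 2 * samples[n - 1] - samples[n - 2]   # straight-line guess
        residual.append(samples[n] - prediction)
    return residual


def reconstruct(residual):
    """Invert predict_residual exactly, sample by sample."""
    samples = list(residual[:2])
    for n in range(2, len(residual)):
        prediction = 2 * samples[n - 1] - samples[n - 2]
        samples.append(residual[n] + prediction)
    return samples


def to_sum_diff(left, right):
    """Replace L and R with L+R and L-R."""
    return ([l + r for l, r in zip(left, right)],
            [l - r for l, r in zip(left, right)])


def from_sum_diff(total, diff):
    """Recover L and R; (L+R)+(L-R) = 2L is always even, so // is exact."""
    left = [(s + d) // 2 for s, d in zip(total, diff)]
    right = [(s - d) // 2 for s, d in zip(total, diff)]
    return left, right


# Assumed test signal: a 440 Hz tone sampled at 44.1 kHz on a 16-bit scale.
left = [round(20000 * math.sin(2 * math.pi * 440 * n / 44100)) for n in range(1000)]
# Assumed centred image: the right channel differs from the left by at most 1.
right = [l + (n % 3 - 1) for n, l in enumerate(left)]

residual = predict_residual(left)
total, diff = to_sum_diff(left, right)

peak_sample = max(abs(s) for s in left)
peak_residual = max(abs(r) for r in residual[2:])
peak_diff = max(abs(d) for d in diff)
print("largest sample:  ", peak_sample)
print("largest residual:", peak_residual)
print("largest L-R:     ", peak_diff)
```

The residual and the L-R signal span a far smaller range of values than the original samples, so they need fewer bits to store, yet both transformations can be undone exactly.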
 
 
Despite the effectiveness of these methods, there are still realistic 
limits on how much a native music file can be compressed without losing 
data.  For most music this averages out at around 50%.  To reduce file 
sizes by more than that, it is necessary to adopt lossy compression 
features.  Lossy is exactly what it says it is.  In order to further 
reduce the file size, we take something that we think you probably can’t
 hear and we throw it away.  Lossy compression makes great use of the 
findings of the field of psychoacoustics in order to help us decide 
what, exactly, you ‘probably’ can’t hear.  Lossy compression technology 
is fabulously creative, extremely clever, and very interesting, but for all that it still makes your music sound worse.
 
 MP3 is the 
granddaddy of lossy audio compression technologies.  I do not propose to
 go into detail about how MP3 does its thing, but at its core it makes 
use of a key finding of psychoacoustics, that of ‘masking’.  Masking 
is the effect whereby one sound makes another harder, or impossible, to hear, and some sounds mask a given sound 
more effectively than others.  For example, a louder sound masks a quieter one (well, 
duh!).  Also, a sound at one frequency effectively masks other sounds at
 adjacent frequencies.  So if we can identify and extract one element
 of a waveform, and determine that it is ‘masked’ by another one, then 
we could, for example, encode the ‘masked’ element using a much lower 
bit depth.
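
To give a feel for what "a much lower bit depth" means, here is a deliberately crude sketch.  It does not attempt to detect masking, and it works on raw samples rather than on the frequency-domain data a real MP3 encoder manipulates; it only shows the last step, keeping fewer bits per value.  The function name reduce_bit_depth and the quiet 1200 Hz test tone are assumptions made for this example.

```python
import math


def reduce_bit_depth(samples, from_bits, to_bits):
    """Keep only the top `to_bits` bits of each `from_bits`-bit sample."""
    shift = from_bits - to_bits
    return [(s >> shift) << shift for s in samples]


# Assumed "masked" element: a quiet 1200 Hz tone on a 16-bit scale.
quiet = [round(300 * math.sin(2 * math.pi * 1200 * n / 44100)) for n in range(441)]
coarse = reduce_bit_depth(quiet, 16, 8)

worst_error = max(abs(a - b) for a, b in zip(quiet, coarse))
print("distinct values before:", len(set(quiet)))
print("distinct values after: ", len(set(coarse)))
print("worst-case error:      ", worst_error)
```

The coarse version takes half the bits per sample at the cost of an error of up to 255 steps on the 16-bit scale.  The bet that lossy encoders make is that this error is itself masked by the louder sound.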
 
 MP3 sets about breaking the music into as many as 
576 narrow frequency bands, the contents of which are then scaled up or down
 according to the aforementioned psychoacoustic principles, and end up 
being encoded using a technique called “Huffman Coding”, by which the 
most commonly-occurring values are encoded using fewer bits than the 
less-common values (quite simple, yet really rather clever).  Using this
 approach we can, in effect, controllably reduce the resolution of the 
encoded music, reducing it more for those elements in the music which 
are ‘masked’, and less for those doing the masking.  The Huffman Codes 
are typically stored in one or more look-up tables, and by choosing an 
appropriate table we can end up with a larger or smaller effective bit 
rate.
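
Huffman Coding itself is simple enough to sketch in Python.  Note the assumption: this example builds its code table from the data it is given, whereas, as described above, MP3 picks from look-up tables.  The sample values are invented to mimic data in which small numbers are far more common than large ones.

```python
import heapq
from collections import Counter


def huffman_code(values):
    """Build a prefix code in which frequent values get short bit strings."""
    counts = Counter(values)
    if len(counts) == 1:                       # a lone symbol still needs one bit
        return {next(iter(counts)): "0"}
    # Heap entries are (count, tie_breaker, {symbol: code_so_far}).
    heap = [(count, i, {symbol: ""})
            for i, (symbol, count) in enumerate(counts.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        # Merge the two rarest groups, prefixing their codes with 0 and 1.
        count_a, _, codes_a = heapq.heappop(heap)
        count_b, _, codes_b = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in codes_a.items()}
        merged.update({s: "1" + c for s, c in codes_b.items()})
        heapq.heappush(heap, (count_a + count_b, tie, merged))
        tie += 1
    return heap[0][2]


def encode(values, table):
    return "".join(table[v] for v in values)


def decode(bits, table):
    """No code is a prefix of another, so we can decode bit by bit."""
    reverse = {code: symbol for symbol, code in table.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in reverse:
            out.append(reverse[current])
            current = ""
    return out


# Invented data: 100 values, mostly zeros, with few large magnitudes.
values = [0] * 50 + [1] * 25 + [-1] * 15 + [2] * 7 + [-2] * 3
table = huffman_code(values)
bits = encode(values, table)
print(table)
print(len(bits), "bits, versus", 3 * len(values), "bits at a fixed 3 bits per value")
```

The most common value ends up with a one-bit code and the rarest values with four-bit codes, so the total comes out well below what a fixed-length encoding would need.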
 
 In effect, lossy compression techniques employ much more 
in the way of signal processing than lossless compression in order to 
identify and extract which components can be effectively thrown away 
while minimizing (note, never eliminating) the audible deterioration in 
the perceived sound quality.  For this reason, more recent encoders such
 as AAC (the format favoured by Apple), which are more elaborate and require more processing 
power than MP3, tend to sound better at equivalent bit rates.
 
 
