Sunday, 8 March 2015

The Story of The "New" Beethoven From Japan.

Here is a beautifully written piece about a present-day composing sensation from Japan. Deaf like Beethoven, a second-generation survivor of Hiroshima, and something of a cultish figure in his own right, Mamoru Samuragochi was a highly popular composer of music for video games who became a recent sensation with his 1st Symphony, a work hailed as true genius. Finally, Japan had produced a classical composer to be revered in the company of Mahler, Bruckner, Beethoven and the like.

Except that it wasn't actually like that at all. Samuragochi - a master of marketing and self-promotion - had in fact paid a musical prodigy with serious self-esteem problems to write his music for him, and to stand back while Samuragochi took all the credit. But more than that, Samuragochi's deafness was another fabrication - conjured up to spare him from having to answer awkward questions in press conferences.

In today's world, lies of this magnitude, coupled with both success and a huge public profile, cannot be kept under wraps for very long. Even so, many of the corporate and institutional organizations who had hitched their wagons to the Samuragochi juggernaut decided that turning a matching deaf ear of their own to the emerging shouts of protest would be their preferred course of action.

That this whole drama played out in Japan, a country whose culture is so distinctive, and so different from what we in the west like to think of as "normal", adds a unique spice to the whole story.

Please read Christopher Beam's well-written piece in "New Republic", which provides a combination of wonderfully intriguing characters, Shakespearean tragedy, cultural back story, and even the hints of a twisted coda yet to be played out. Frankly, I see this as magnificent, classical, absolutely first-rate operatic material. Thomas Adès, are you reading?....

http://www.newrepublic.com/article/121185/japans-deaf-composer-wasnt-what-he-seemed

Friday, 27 February 2015

DSD vs Class-D

There was a post the other day on the Audio Review web site by Brent Butterworth.  In it, he laments how on one hand audiophiles are falling over themselves swooning over DSD, while at the same time Class-D amplifiers receive short shrift.  After all, he tells us, DSD and Class-D are the same thing.  Reading between the lines, this is all some kind of giant marketing faux-pas.

Well, sorry, but DSD and Class-D are most definitely not the same thing.  In his post he suggests “Put bigger transistors, a bigger power supply, and larger filter components on the end of a DSD DAC …. and what have you got?  A Class-D Amp.”  Well, if life really were as simple as that, Class-D amplifiers might indeed sound great - or at least better than they do now.  But it is not at all the same thing.  Let me explain why.

I have already explained elsewhere how DSD DACs function, and what they have to do to sound as good as they do.  But, in order that this post can stand on its own two feet I am going to go over the key points once more.

DSD is a 1-bit binary bitstream running at a sample rate of 2.8224MHz.  In order to convert it to analog, all we have to do is pass the bitstream itself directly through an analog low-pass filter, and the result is music.  It really is as simple as that.  Except, that is, if you want the ultimate in sound quality.  In that case, what you find is that the filters required for that task are not as benign as you might like.  They tend to be similar to the anti-aliasing filters that have to be inserted into the signal path prior to 16/44.1 PCM encoding, and tend to endow the sound with many of the same characteristics.  To get around that, virtually all DSD DACs up-convert the incoming DSD to a ‘variant format’ of DSD with a higher sample rate, and quite often with an increased bit depth of 3-5 bits.  It is that ‘variant format’ of DSD which is passed into the low-pass filter during analog reconstruction.

With the ‘variant format’ of DSD, we can specify a filter whose characteristics are not so aggressive, and which has, as a result, better sound quality.  It was always so, even with the first SACD players introduced to the market some 15 years ago.

But what does it mean in practice, to pass a digital bitstream through an analog filter?  This is easiest to describe when we confine ourselves to a pure 1-bit bitstream.  The waveform is a pseudo-square wave, which is either at its maximum when the bitstream reads ‘1’, or at its minimum when the bitstream reads ‘0’.  In a DAC chip, those maxima are of the order of +1 Volt and the minima of -1 Volt.  So in a DSD DAC we would be generating a pseudo-square wave whose voltage varies rapidly between plus and minus one Volt.
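To make that concrete, here is a quick Python sketch (using NumPy and SciPy) of the whole chain just described.  A crude first-order sigma-delta modulator stands in for a real DSD encoder - actual encoders are far more sophisticated, higher-order noise-shaped designs, so this is purely illustrative - and the resulting ±1 "Volt" pseudo-square wave is then passed straight through a low-pass filter to recover the music.  The test tone, filter order, and corner frequency are my own arbitrary choices.

```python
import numpy as np
from scipy import signal

FS = 2_822_400                      # DSD64 sample rate, Hz
t = np.arange(28224) / FS           # 10 ms worth of samples

# Test signal: a 1 kHz tone at half of full scale.
x = 0.5 * np.sin(2 * np.pi * 1000 * t)

# Crude first-order sigma-delta modulator (illustrative only).
bits = np.empty_like(x)
acc, y = 0.0, 1.0
for i, s in enumerate(x):
    acc += s - y                    # integrate the error vs. the fed-back bit
    y = 1.0 if acc >= 0 else -1.0   # 1-bit quantizer: the +/-1 "Volt" levels
    bits[i] = y

# "Analog" reconstruction: low-pass filter the pseudo-square wave directly.
sos = signal.butter(4, 20_000, fs=FS, output="sos")
audio = signal.sosfilt(sos, bits)
```

Once the filter has settled, `audio` is a close approximation of the original 1 kHz tone: the modulator has pushed the quantization noise above the audio band, where the low-pass filter removes it.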

The differences between a DAC and a Power Amplifier are twofold.  A DAC is a low-power device whose output amounts to milliwatts at most, with a maximum voltage swing of the order of a volt.  By contrast, a Power Amplifier is a high-power device, whose output is of the order of hundreds of Watts, with a maximum voltage swing of many tens of volts.  This places some significant demands on the circuit whose job it is to feed the appropriate pseudo-square wave into the analog reconstruction filter.  Instead of switching the voltage between plus and minus one volt it now has to switch between plus and minus a hundred volts (give or take).  And instead of those voltages driving mere milliamps of current, they must now carry a few Amperes.

What happens when you start switching a 100-volt line carrying several amps of current on and off at a frequency of several MHz?  The answer is that you generate massive quantities of radio-frequency energy.  In fact, the chances are good that you will jam everybody’s radio for a distance of several blocks.  That can get you into a lot of trouble.  But even if you could fix that particular problem - which you can, at some cost - you still require some pretty sophisticated (read expensive) components to switch that kind of signal at that kind of frequency.  Frequency is the big problem here.  The higher the frequency, the worse the problem gets.  So, to answer Brent Butterworth’s question “Put bigger transistors, a bigger power supply, and larger filter components on the end of a DSD DAC …. and what have you got?”, the answer is: “A radio station.”

So how does a Class-D amplifier work, then?  The answer is that it grabs the bull by the horns and, instead of moving the frequency UP, it moves it DOWN.  Using a sigma-delta modulator, a Class-D amplifier remodulates the incoming signal to a lower sample rate than that of DSD, but to preserve the integrity of the signal it uses a bit depth of more than 1 bit.  I don’t want to get too technical at this point, but how it does that takes it into a different realm.  It conveys the extra bit depth not by encoding the intensity of each pulse in the waveform, but its width.  In effect it encodes the output of the sigma-delta modulator not with a “Pulse Density Modulation” (PDM) representation (which is what DSD is) but with a “Pulse Width Modulation” (PWM) representation.
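A toy sketch may help to nail down the PDM-vs-PWM distinction.  The 8-slot frame and 3-bit depth below are entirely my own illustrative choices, not anything a real Class-D chip does; the point is only that the value is carried by the width of a single pulse, while the frame's average stays the same as in a pulse-density representation:

```python
def pwm_frame(level, slots=8):
    """Encode a multi-bit level (0..slots) as one pulse whose WIDTH
    carries the value: 'level' ones followed by zeros."""
    assert 0 <= level <= slots
    return [1] * level + [0] * (slots - level)

# A PDM stream would spread the same average density across the frame
# (e.g. 3/8 might appear as 1,0,0,1,0,0,1,0); PWM concentrates it into
# a single pulse per frame, so far fewer switching transitions are needed.
frame = pwm_frame(3)                # [1, 1, 1, 0, 0, 0, 0, 0]
density = sum(frame) / len(frame)   # 0.375 - same average as the PDM form
```

Fewer transitions per frame is precisely why PWM lets the output stage switch at a far lower rate for the same information content.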

By using this PWM approach at a much lower frequency, usually in the high hundreds of kHz, the switching can be accomplished using affordable components, and we have less of a problem with RF emissions.  But we still have the analog filtering to do, and this remains an area of concern for ultimate sound quality.  The other issue is whether or not the PWM switching can maintain the linearity and distortion performance necessary for high-end audio applications, while still delivering the amount of current that loudspeakers typically consume.  It is this area which is ripest for exploitation through intelligent and innovative engineering solutions.

Today, Class-D amplifier technology is making serious inroads.  For non-audiophile applications it is beginning to rule the roost.  Even in the audiophile sphere, a number of quality Class-D amplifiers are now on the market, and while there are notable exceptions which do deliver seriously impressive sound quality (I’m looking at you, Devialet!), they remain largely confined to the low-to-mid range price/performance tier.  But note that even for serious audiophile applications, Class-D amplifiers have totally revolutionized powered subwoofer technology.  All things considered, at the present rate of progress, I can see Class-D dominating even the high-end market before too long.  But not for the reasons Butterworth suggests.

All that said, we’re not there yet.  And don’t forget that, even today, the market is still very much alive with vacuum tube power amplifiers.

Thursday, 26 February 2015

Not Patently Obvious

It is surprising how little-understood patents are.  Ever since 1989 I have managed to find myself responsible, at one time or another, for the patent portfolios of each of the companies I have worked for.  I am also an inventor on several issued patents.  So I know a little bit about the subject - at least enough to know that most people hold onto a number of misconceptions.  I thought it would be instructive to post an overview of the key issues pertaining to patents.

First of all, it is important to understand what can be patented and what can’t.  A patent must describe either a specific thing, or a specific method of making a thing.  By specific, I mean that the inventor must clearly describe exactly what constitutes that thing, and provide clear criteria that permit the reader to distinguish the invention from something which is not covered by the patent, leaving as little as possible in the way of a "grey area".  Generally, there must be an ‘inventive step’ - a critical point at which the invention departs from what was previously known (what we call ‘prior art’).  Also, a patent must contain ‘full disclosure’.  In other words, it must contain everything that a person skilled in the art would need to know to be able to successfully replicate the invention.  The inventor must not withhold some key “secret sauce” from the patent disclosure.  Weaknesses in these areas can result in a patent having its applicability restricted, or even being invalidated, at some point in the future.

Patents can only cover something which has been invented rather than created, so, for example, you can patent a clever new way of making words appear on paper, but you cannot patent the words themselves - those you might consider copyrighting.  You cannot patent your logo or business name - those you might consider trademarking.  You cannot patent your customer list - that would be a trade secret.  And you cannot patent all the brilliant things that could be done "if only someone would invent such-and-such a thing".  Those you could write a science-fiction novella about.

Finally, the patent should disclose who invented the invention.  There can be multiple inventors, but if so, each listed inventor must be able to point to a critical aspect of the inventive step for which they are responsible, and all of the actual inventors must be included in the patent.  Just being the owner of the company for whom the inventor worked does not entitle you to be listed as an inventor [It is said that Elena Ceaușescu, wife of the communist dictator of Romania, had her name included as an inventor on all Chemistry patents issued in Romania].  It is not unusual for all rights in the patent to be assigned to a third party, usually the employer of the inventor(s), although the inventors’ explicit consent is required for this to happen.

The structure of a patent comprises two parts, the Specification and the Claims.  The Claims are the most important section of the patent.  Only what is claimed in the Claims is protected by patent law, and this is a very important distinction.  The claims describe, in a manner set forth both by statute and by common practice, exactly what has been invented.  Someone reading them might well be unclear on exactly what those dry and clipped claims actually describe.  Taken on their own, there may be some ambiguity about what the specific language in the claims refers to.  Therefore, patents also include a Specification section in which the invention is described in detail, in the context of the pre-existing state-of-the-art, and some examples of specific embodiments of the invention are provided.  The Claims are then designed to be read and interpreted in the light of the Specification.

It is important that the claims of a patent describe only the new inventions for which the inventor is seeking protection.  If a claim describes something which already existed before the patent application was submitted - a circumstance known as ‘prior art’ - then that claim, and in some circumstances the entire patent, can be held to be invalid.  This will be the case regardless of whether or not the inventor was aware of the prior art.  Often there may be some ambiguity regarding whether a key aspect of a claim is or is not anticipated by a certain item of prior art, and this may be known and understood by the inventor.  In that case, the Specification will typically include material identifying such prior art and explaining why and how the inventor’s claims are distinct and different.

Once you have written your patent, it must be submitted to the patent office for approval.  The patent office will assign it to a patent examiner who will make a cursory, but intelligent, examination of your patent and will attempt to establish whether or not the submitted document meets all of the requirements to be granted as a patent.  He may question whether the disclosure is complete.  Or he may raise specific objections based on existing patents or other publications which he considers may describe prior art.  You will then have the opportunity to address those objections and either re-submit the patent or provide clarifying information to the examiner.  This can go back and forth many times, or not at all.  The details of all such back-and-forth dialog will stay in the patent’s history file, and may be referred to in future if the patent is ever challenged in court.  In any case, the end result of the process is that in due course the patent is usually issued.  A patent only comes into force after it has been issued.

Once issued, the patent has a severely restricted lifetime.  In the US this is 20 years from the date when the patent was first filed with the patent office, regardless of how long it may have spent going back and forth with the examiner’s office prior to issuance.  Once the patent expires, it no longer conveys any protection whatsoever.  There are no ways to get around that.

One of the big mistakes that people make in regard to patents is to over-estimate the value of an issued patent.  All issuance demonstrates is that the examiner has been persuaded that the inventor has met the requirements for a valid patent.  It does NOT guarantee that the patent actually does meet those requirements!  That can come as a big surprise to someone who has shelled out a lot of money to get to that point.

So what use is a patent then?  In reality, if you are the owner of a valid, issued patent, it gives you a legal basis on which to approach a third party who may be infringing that patent against your wishes and ask them to either stop doing so or purchase a license.  Generally, what happens next depends on whether the third party is bigger than you, and has greater financial clout.  If the party continues to ignore your entreaties, you will have the right to sue them for patent infringement.  Always bear in mind that knowing someone is infringing your patent rights is one thing - but it may be rather more difficult to prove it in a court of law.

A court of law is the only place where the ultimate blessing of validity can be bestowed upon a patent.  This is where you end up if you sue somebody - or if somebody sues you - for patent infringement.  A court of law can do what the patent examiner does not.  It can examine the patent in minute detail and pronounce with finality on whether the patent is or is not valid.  It can choose to limit the patent’s validity or invalidate it entirely.  In rare cases it can order the patent to be re-submitted with additional material in order to expand its validity.  A patent whose claims have been upheld in court can no longer be challenged.  The owner of a patent whose claims have been upheld (or even declared invalid, for that matter) in a court of law will also be up to $10M lighter in the wallet.  Yes, Doris, that was an ‘M’.  A patent infringement lawsuit is not for the faint of heart.

In the US, the doctrine of triple damages applies.  This means that if you infringe on the patent rights of a third party, in full awareness of those rights, then you will be liable for not only the damages you are held to have caused, but triple the damages.  The fear of incurring triple damages ensures that even large and powerful entities will take a patent infringement lawsuit seriously, because triple damages presents even a penurious client with an opportunity to seek serious legal representation on a contingency basis.

We’re still not done yet.  You have yet to decide where, geographically, you want your patent to have force.  If you have a US patent, for example, your competitors in Germany, Japan, China, etc., can freely and legally enjoy full use of your patented inventions.  Your only remedy may be to stop them from importing infringing products into the US.  If you want the protection of your patent to extend to other countries of the world, then you have to file for patent protection in those countries too.  But be aware that your patent rights, the degree of protection offered, and the remedies available against infringement, may be different in each country.  Filing internationally gets to be very expensive, since your patent will usually need to be translated into each country’s native language, and rendered fully compliant with each country’s patent codes.  Also, you cannot sit on your hands and see how things work out before deciding whether to file internationally.  You have to make that decision up front (in practice, there is a mechanism that can give you up to 12 months of breathing room for some countries, but that’s all, and it's not much).

Finally, what does all this cost?  First of all, you will benefit from the services of a good patent attorney.  Yes, they charge up to $400 an hour, but there are good reasons for that.  I wouldn’t dream of filing a patent without the assistance of top quality counsel, and indeed I never have.  There are so many ifs and buts when it comes to costs, but I will give you two pegs in the ground that I think are fair.  To get to the point where you have a high-quality issued US patent will cost you $10k to $20k.  If your ambitions are international, a fully issued patent portfolio in a basket of countries in which a technology-oriented company might wish to do business will set you back $150k to $250k per patent.  That’s not chump change.  Remember, this is to arm yourself with a single issued patent, which may be willfully ignored by someone who doesn’t think you have the balls to sue them, or which may be shown to be invalid - whether in court, or in one of those “oh dear” moments when you open a letter containing a sheaf of technical documents that you wish had come to your attention before committing to all the expense of filing the patent in the first place!

At this point, a quick detour into good business practice.  Because of the doctrine of triple damages, companies and individuals should always make it strict policy that nobody (but NOBODY, other than in-house counsel) should ever read the Claims of any patent of which they are not the author.  If such a policy is carefully implemented in practice, then it follows that, legally, neither the person nor the entity can possibly be aware of any infringement of any patent.  Since only the Claims describe what is patented, even if you have read the Specification section of a patent which you are accused of infringing, if you haven’t read the Claims you cannot know what has actually been patented.  This may sound devious - let's be honest here, it IS devious - but if you retain a blue-chip patent attorney this is the first lesson that he will hammer into you.  Practically it is not that hard to do, since claims make for very dry reading anyway.

Patents exist purely as a ‘barrier to entry’.  They are a barrier that obstructs your competitors from entering your line of business.  In that sense they are no different than the padlock you put on the factory’s front door when you go home for the weekend, or the insurance policies that you pay for once a year.  Like the padlock and the insurance, you need to understand what you are protecting yourself against, what the costs are of indulging yourself in that degree of protection, and what risks you run in not doing so.  There are exceptions to every rule, but for most small businesses - and I think all audio businesses are small businesses - patents are very rarely a justifiable form of protection.  But when telecom giant Nortel’s assets were sold off in 2011 following their bankruptcy, their patent portfolio was sold for $4.5B, in cash.  Yes, Doris, that’s a ‘B’.

Tuesday, 24 February 2015

On the Audibility of Phase

This post is nothing so much as some extended thinking aloud on the subject of the audibility of phase.  I have written before about how phase relationships can profoundly affect the actual waveform of a complex sound even though the frequency content remains unaltered.  Experiments to determine whether those phase-induced changes are actually audible, using synthesized sounds, are unsatisfactory.  I personally am totally unable to hear any difference between the sounds of different tracks where all I do is vary the phase content.  But this doesn’t really prove much, because the human brain is not well adapted to discern subtle differences in synthetic non-real-world sounds.  Remember - the EARS listen, but the BRAIN hears.

A great, and very valid, point of reference is the range of ultra high-end loudspeakers from Wilson Audio (whose “entry-level” models cost more than my daughter paid for her two-year-old Ford).  These top-end models include a facility for adjusting the positioning of the mid-range and tweeter units.  The idea, as claimed by Wilson, is to permit fine adjustment of the “time alignment” between the treble, mid, and bass drivers.  That such adjustments should have an audible effect is not surprising since, most obviously in the crossover regions, the signal reaching the listener is a combination of signals emitted by more than one driver.  The “time alignment” of those signals can make the difference between those signals reinforcing one another, or trying to cancel one another out.  Those effects will manifest themselves in aspects of the speakers’ measured frequency response.  But beyond that, these adjustments have the effect of fine tuning the phase response of the speakers, at least to some degree.

What effect do these adjustments actually have?  I can tell you from personal experience that they are most effective.  And it is not a question of optimizing the tone colour for ‘naturalness’, as you might presume if the effect you were hearing were that of the phase reinforcement/cancellation effects alone.  No, what I heard, and what everyone else I have spoken to who has listened for themselves has reported, is that when the adjustments are ‘just right’ the whole soundstage seems to suddenly snap into focus in a way that only the Big Wilsons seem able to command.

This is personally interesting because a good 30 years ago I bought my first ever pair of seriously high-end loudspeakers, the Advanced Acoustic Designs ‘Solstice’ model produced by Colin Brett, a one-man operation whose day job was as owner of the local shaver repair shop.  Inspired by the now-legendary Dahlquist DQ-10, Colin designed a speaker with a sealed bass unit, above which he mounted an open-frame midrange unit and tweeter.  The open-frame units were progressively set back from the front of the bass unit’s baffle in order to provide a degree of time-alignment.  By the time I came on the scene he had completed that phase of the design by mounting a selection of differently cut frames and listening to how they each sounded.  I, on the other hand, wanted to hear this for myself, and suggested that he repeat the experiment this time using a pair of staggeringly precise piezo-electric slides, which I could conveniently borrow from where I worked.  Sadly, that experiment never came about.  I still have my pair of Solstice loudspeakers in my basement, although one of the mid-range units, long since out of production, has gone to meet its maker.

Just how much ‘time alignment’ do the Big Wilsons provide for, and how significant might you expect that to be?  The full range of adjustment is confined to something like a couple of millimetres (by my estimation).  That’s about one tenth of the wavelength of a 20kHz sound wave.  The process of homing in on the ‘right’ position involves setting it to within what looks like less than 1/10 of a millimetre.  It seems a little surprising that mechanical adjustments of that order are necessary to fine tune the temporal response of a loudspeaker, but for the sake of this discussion let's take it at face value.  The adjustable Wilsons make me yearn for what Colin Brett might have heard if he had voiced the Solstice with a precision positioner instead of the much cruder and significantly less precise method he chose!  Although I wonder whether he would have been able to maintain such tolerances in manufacture, given the technology of loudspeaker cabinets in the early 1980s.

Phase and Time Alignment are different ways of looking at the same thing.  Phase is measured in fractions of the period of an oscillation, and Time Alignment in fractions of a second.  A fixed Phase error corresponds to a progressively smaller time error as the frequency gets higher and higher.  Alternatively, as the frequency gets higher, a fixed amount of time represents a progressively larger fraction of the period of the oscillation, and therefore of its Phase.  So ‘Time Alignment’ is more critical at higher frequencies than at lower ones, because it induces - or corrects for - a larger Phase error.
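That relationship can be put into numbers with a back-of-envelope Python sketch.  I am assuming a speed of sound of roughly 343 m/s (room-temperature air); the 0.1mm figure is just the adjustment resolution I estimated above:

```python
C = 343.0  # speed of sound in air, m/s (approximate, room temperature)

def phase_error_deg(offset_m, freq_hz):
    """Phase error produced by a driver offset of 'offset_m' metres at
    'freq_hz': the delay is offset/c, and phase = 360 * f * delay."""
    return 360.0 * freq_hz * offset_m / C

# A 0.1 mm tweeter offset is worth roughly 2 degrees of phase at 20 kHz,
# but only about a tenth of a degree at 1 kHz.
hi = phase_error_deg(0.0001, 20_000)
lo = phase_error_deg(0.0001, 1_000)
```

The twenty-fold ratio between `hi` and `lo` is exactly the point made above: the same physical offset is a twenty-times-larger phase error at twenty times the frequency.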

So to the extent that the Big Wilsons provide a crude “Phase Response” correction tool, and to the extent that the audible changes heard by the listener in response to those corrections represent the audibility of phase, we can look at various processes that affect the phase response of an audio signal and compare those to the magnitude of the phase errors which are ‘audible’ on the Big Wilsons.  There are a lot of ‘ifs’ in there, but if you bear with me it might be instructive.

I like digital filters when it comes to this sort of discussion, because digital filters can - if designed properly - have a known and precisely constrained effect.  By constrained, I mean that all of their effects are knowable and are precisely quantifiable, even if, like the ‘phase response’ we may have trouble knowing what they all mean in terms of audibility.  By contrast, in an analog filter, both capacitors and inductors are in reality complex physical constructs whose behaviour we can only ever approximate, and can never precisely know.

I want to look at a simple low-pass filter and try to draw some very general conclusions regarding the audibility (or otherwise) of its phase response.  I am going to choose a filtering operation I know quite well - a digital low-pass filter designed to convert a DSD source signal to PCM.  Filters similar to these are used in virtually all modern PCM ADCs.  Let's make some simplistic assumptions for the design of that filter.  We’ll specify the low-pass corner frequency to be 20kHz, the accepted upper limit of human audibility.  In order to eliminate any aliasing effects the filter needs to eliminate all signals above one half of the PCM sampling rate.  If the PCM bit depth is 24 bits, then we need to attenuate such frequencies by at least 144dB.  Finally, we want the character of the filter to have a Pass Band (the region below the corner frequency) with a frequency response as flat as possible.  There are some other parameters I won’t trouble you with.  Let's go away and design some filters and see how they look.

We’ll start by designing filters for 24/88.2, 24/176.4, and 24/352.8 PCM formats.  We’ll come back to 16/44.1 PCM later because, as we’ll see, it is a lot more complicated.  The first decision we need to make is regarding the type of filter we want to use.  There are two types of filter that we would ideally prefer to choose from, both of which have a flat frequency response characteristic in the Pass Band.  Those are the Butterworth and Type-II Chebyshev filters.  Butterworth has the advantage that the attenuation keeps increasing the higher the frequency gets, whereas the Type-II Chebyshev only provides a minimum guaranteed attenuation.

With each filter design we are going to look at two things.  First the ‘order’ of the filter.  This is something I am not going to get too deeply into, save to say that if the ‘order’ is too high then the filter may become unstable or inaccurate.  You’re going to have to take my word for it if I say the order of the filter is too high.  Second, we’re going to look at the ‘Group Delay’ of the filter.  This is a calculation that takes the Phase Response, corrects for the phase-vs-frequency relationship, and spits out the corresponding time delay.  In essence, if we had a hypothetical loudspeaker that had one drive unit for every frequency, ‘Group Delay’ would tell us how far forward or backward we would have to adjust the position of the drive unit - Wilson style - to correct for it.  The important thing here is the difference between corrected positions of the bass unit (the ‘lowest frequency driver’) and the 20kHz unit (the ‘highest frequency driver’).  I will call that the ‘Wilson Length’, which is the distance by which the tweeter position would have to be adjusted in order to correct for it.  This is the result that I will report.  I hope that makes sense.
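For those who want to play along at home, here is one way such a ‘Wilson Length’ might be computed with SciPy.  I should stress that the definition below - the spread of group delay across the audio band, converted into a distance at the speed of sound - is simply my reading of the description above, and the absolute numbers it produces depend heavily on the filter design details and on the convention chosen for the band edges, so treat it as a method sketch rather than a reference calculation.  The 12th-order Butterworth here is just an arbitrary example design.

```python
import numpy as np
from scipy import signal

C = 343.0                 # speed of sound in air, m/s
FS = 2_822_400            # the filter runs at the DSD64 rate

# Example design: a 12th-order Butterworth low-pass with a 20 kHz corner.
sos = signal.butter(12, 20_000, fs=FS, output="sos")

# Group delay from the unwrapped phase response: gd = -d(phase)/d(omega),
# in seconds, evaluated on a dense frequency grid.
f, h = signal.sosfreqz(sos, worN=1 << 15, fs=FS)
phase = np.unwrap(np.angle(h))
gd = -np.gradient(phase, 2 * np.pi * f)

# 'Wilson Length': the spread of group delay across the audio band,
# expressed as the driver offset that would compensate for it.
band = (f > 20) & (f < 20_000)
spread_s = gd[band].max() - gd[band].min()
wilson_length_mm = 1000 * C * spread_s
```

The same few lines can be re-run for any of the filters discussed here simply by swapping out the `signal.butter` design line.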

We’ll start with a Butterworth filter for 24/88.2 PCM.  After doing my Pole Dance, what I come up with is a 31st-order filter, whose ‘Wilson Length’ is 14mm.  That 31st-order filter is a non-starter to begin with.  For 24/176.4 PCM the filter is 17th-order, which ought to be acceptable, and its Wilson Length is 3.5mm.  For 24/352.8 PCM, the filter is 12th-order, which is fine, and the Wilson Length is 1.3mm.  Given that experience with the Big Wilsons suggests that the Wilson Length needs to be optimized to within a fraction of a millimetre, it implies that the phase distortions of ALL of these filters could well result in audible deterioration of the perceived sound quality.

Type-II Chebyshev filters are the traditional workhorse for low-pass audio filters because they give good frequency response without requiring as high an order as the equivalent Butterworth.  For the three applications above, the filter orders work out to be 18th, 12th and 9th respectively, all of which ought to be acceptable.  Their Wilson Lengths work out to be 7.6mm, 1.8mm, and 0.6mm respectively.  In all, the Type-II Chebyshev filters seem to be slightly better than their Butterworth counterparts, although without really knocking the Wilson Length parameter out of the park.  Only the 24/352.8 filter appears to have a shot at being ‘inaudible’.  Bear in mind, though, that the specific filter designs I described may not be optimal for those applications.  They were just chosen for illustrative purposes.

At this time it is instructive to look at the 16/44.1 variant of this filter.  With only 16-bits of bit depth we can reduce the attenuation requirement to 96dB, but with the Nyquist frequency of 22.05kHz so close to the corner frequency of 20kHz this places great demands on the filter.  With a Butterworth design what we get is a 192nd-order filter which is a total non-starter.  With the Type-II Chebyshev it is a 44th-order filter, which, despite being a much smaller number, is still of no practical value.  To get the level of performance we require will need what is called an Elliptic filter.  This can actually be achieved with an acceptable filter order, but an analysis of its ‘Wilson Length’ behaviour is more complicated, and in any case the result will be much poorer than any of those obtained above.
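The same scipy estimate illustrates just how brutal the 16/44.1 case is.  Again a sketch with illustrative ripple and attenuation targets (0.1dB passband, 96dB stopband), so the exact orders will differ from mine:

```python
# Sketch: order estimates for the 16/44.1 case - 20kHz passband edge,
# 22.05kHz stopband edge, 96dB (16-bit) stopband attenuation.  The 0.1dB
# passband ripple target is an illustrative assumption.
from scipy import signal

wp, ws = 20_000, 22_050
gpass, gstop = 0.1, 96.0
n_butter, _ = signal.buttord(wp, ws, gpass, gstop, analog=True)
n_cheby2, _ = signal.cheb2ord(wp, ws, gpass, gstop, analog=True)
n_ellip,  _ = signal.ellipord(wp, ws, gpass, gstop, analog=True)
print(f"Butterworth: {n_butter}, Chebyshev-II: {n_cheby2}, "
      f"Elliptic: {n_ellip}")
```

The elliptic design gets the order down to something buildable, which is exactly why it is the type pressed into service here - at the cost, as noted, of even worse phase behaviour.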

The above analysis seriously reduces a complex subject to an unfairly simple catch-all number, but I think it has some value if taken on its own terms.  I hold the view that the sound of PCM is the sound of the anti-aliasing filters to which the source signal has to be subjected prior to being encoded in the PCM format.  We understand those filters very well, and in terms of frequency response we know that those filters ought to be inaudible, but we are less clear on whether their phase responses are in any way audible.  I personally suspect that the things we don’t like about PCM are the artifacts of the phase response of their anti-aliasing filters, which are baked into the signal.  If we are willing to accept at face value that (i) the ‘time-alignment’ capability of the Big Wilsons provides an audible and beneficial optimization; (ii) the underlying cause of such optimizations are changes in signal phase; and (iii) the amount of adjustment needed to bring the Big Wilsons into their ‘optimum alignment’ reflects the sensitivity of the human brain to phase errors; then this would seem to be a good basis for arguing that the phase distortions induced by anti-aliasing filters are more than capable of adversely impacting the sound of PCM audio - particularly so in the 16/44.1 format.  By contrast, DSD requires no anti-aliasing filters in the encoding path.

I think that’s rather interesting.  While I recognize that there are a lot of broad sweeps and generalizations involved in all this, I think it has significant validity, provided it is confined to being taken on its own terms.

I want to conclude by commenting on ‘time alignment’ in the specific context of speaker design.  Clearly, if you apply the same signal to each drive unit of a loudspeaker, there can be only one position at which the drive units are correctly time-aligned to one another.  Any other position would be, by definition, out of alignment.  So why offer the possibility of adjusting that alignment?  The answer lies in the caveat “… if you apply the same signal …”, because we don’t.  Different drive units receive different signals, each contoured to the drive unit’s needs by the loudspeaker’s crossover.  Crossovers are filters, and yes, they too have a phase response.  Those phase responses mean that there is usually no one fundamentally correct time alignment.  Wherever the alignment is set there are going to be some frequencies for which the alignment is ideal, and others for which it is less than ideal, and this may change with, for example, the relative listener position.  Whether or not an audibly optimum position exists at all will vary from speaker to speaker, according to its design.  So it doesn’t necessarily follow that you will be able to replicate the “Wilson Effect” by jury-rigging some sort of alignment capability on your own loudspeakers, although, as I have mentioned in a previous post, simply tilting the speaker can have a surprising effect.  I suspect the speakers have to be designed from the ground up to take full advantage of this design approach.

Tuesday, 10 February 2015

Musicians, Restaurants, and Plumbers

An opinion piece this week designed to get your backs up and make you think.  You read a lot of brouhaha these days about how musicians are not making any money out of streaming services.  There are so many streaming services available - some even offer high-resolution lossless content - and much like Netflix in the video domain, we as consumers can now access a lot of content for a nominal (i.e. affordable) outlay.  How, you might wonder, can the musicians who create the music in the first place be making any money out of it?

Recently, a study has been doing the rounds which purports to analyze the revenues of the streaming service Spotify, and indicates how that revenue is divvied up among the Streaming Service itself, the Record Labels, the Writers/Composers, and the Artists.  The report is available on the web site of Music Business World, and was prepared by the accounting firm Ernst & Young, so it has at least a minimum acceptable level of credibility.

Ask yourself this - according to the report, for every dollar you spend on Spotify, just how much of it ends up in the pocket of the artist whose music you are listening to?  Before you go on to read the answer, I want you to ponder the issue for a moment and ask yourself how much you think OUGHT to go to the artist?  Also, stop for a moment to consider the rationale behind your calculation, so that it is a little bit more than a number you pulled out of thin air.  On what basis should the artist receive whatever it was that you thought was appropriate?

So what did you come up with…?  50 cents?  20 cents?  10 cents?  The actual answer is less than 7 cents.  Not seven cents every time you listen to a track, but 7 cents out of every dollar you spend.  If you subscribe to their premium service that’s about $10 a month.  So your subscription to Spotify generates 70 cents a month to be shared among all of the artists that you listen to.  Let’s imagine that you listen to 20-25 tracks a day, and let’s assume that the money gets split evenly among the artists on a per-play basis.  In that scenario you are playing about 700 tracks a month.  So each time you play a track, the artist you are listening to earns something like one tenth of one cent.
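For the spreadsheet-inclined, that back-of-envelope arithmetic looks like this (all figures being the illustrative numbers above, not audited data):

```python
# The back-of-envelope arithmetic from the paragraph above, as code.
# All figures are illustrative, per the text - not audited data.
subscription = 10.00        # dollars per month (premium tier)
artist_share = 0.07         # about 7 cents of every dollar reaches artists
plays_per_month = 700       # roughly 20-25 tracks a day

to_artists = subscription * artist_share     # 70 cents a month
per_play = to_artists / plays_per_month      # dollars per play
print(f"about {per_play*100:.2f} cents per play")
```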

In some circles, this has aroused the anger of musicians who feel that the Spotifys of this world are screwing them out of their rightful earnings.  First Napster, then bit torrents and file sharing, and now this!

There are two problems with this.  The first is that, as best as anyone can tell, none of these streaming services are actually making any money!  It is one thing to argue a case against someone who is making scads of money off the backs of others, but another thing entirely to vent your spleen at someone who isn’t even profitable - unless your complaint is about the lack of any profit itself, which isn’t the case here.

Can it really cost that much money to run Spotify?  Which brings me to the second problem.  What happens to all the money that you pay to Spotify?  The answer is that Spotify in turn pays the majority of it in fees to the Labels.  Spotify pays about 17 cents on the dollar in taxes, and uses 20 cents to run its own operations.  The rest - amounting to nearly two-thirds of their revenues - is paid directly to the Record Labels who manage the distribution to the Artists.  In other words, Spotify doesn’t get to decide how much of their take goes to the Artists - that is entirely within the purview of the Record Labels themselves.

So now let’s take a look at the money that the Labels receive - how do they distribute that?  According to the Music Business World report only 10% of what they receive goes to the Artists, and 15% goes to the Songwriters and Publishers, which means that the Labels keep a whopping 75% of the pie for themselves.  That’s a lot of pie.
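Chaining the two paragraphs together reconciles nicely with the “less than 7 cents” figure (again, these splits are simply the report’s numbers as quoted above):

```python
# Tracing one Spotify dollar through the splits quoted in the text.
revenue = 1.00
taxes = 0.17                  # Spotify's tax bill
operations = 0.20             # running the service
to_labels = revenue - taxes - operations   # ~63 cents, "nearly two-thirds"

artists     = to_labels * 0.10     # the Labels pass on 10% to Artists
songwriters = to_labels * 0.15     # and 15% to Songwriters/Publishers
label_keep  = to_labels * 0.75     # keeping 75% for themselves
print(f"Artists see {artists*100:.1f} cents of every dollar")
```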

It is therefore wrong-headed for the Angry Artists to get all stroppy about Spotify eating their lunch.  It is the Labels who are doing all the munching.  And it has been thus for as long as there has been a music industry.  But, the argument goes, it is a different world in 2015.  Labels used to have to pay for record stamping plants, or even CD stamping plants.  They had to maintain a sales force to get their product stocked by the music stores, and a promotional force to get their customers into the stores.  Plus the costs of transporting the product internationally.  Today this doesn’t happen any more.  All of the above is theoretically replaced by an “Upload” button that someone has to punch.

But even taking all of that into account, it still misses the point entirely for the Artists to be taking pot shots at the Labels.  If the Artist feels that the Label is charging too much for what they provide, then their solution is simple - they don’t have to sign with a Label.  Like just about any transaction, if you don’t like the price, you don’t have to make the purchase.  Unfortunately, though, the majority of Artists don’t even have the option of not signing with a Label.  The reality is that as an Artist you hope to generate enough buzz that a Label - any freakin’ Label will do! - will deign to offer you a deal.  The idea that you can shop around and choose the one that offers you the best deal is a pipe dream for all but the privileged few.

For the Artist, what are the alternatives?  The obvious one is that they can start their own Label.  Sure they can … there’s nothing to stop them.  Well, except one thing.  You’ll need some money.  And as an Artist without a record deal capable of putting one tenth of a cent in your pocket every time someone plays one of your tunes on Spotify, you won’t actually have any money.

The view from the other side of the fence is not all roses either.  As a Label, you are hopefully making money from your roster of Artists.  But they come and go, as do their sales.  You always need to be replenishing your portfolio.  For every new Artist that you have a budget to take on there are a hundred who are convinced that they are The One.  You need to be really smart about which ones you sign and which ones you pass on.  After all, you’re not as dumb as the Decca executive who passed on The Beatles because guitar music was going out of style, are you?

Once you’ve signed a new Artist you are going to need to pay for some studio time to record their new album.  You’ll need to pay people to design the cover work and take publicity shots.  You’ll probably need professional video work doing.  You’ll need to schedule radio and TV spots if you’re sufficiently gung-ho about their prospects.  And you’ll need to cut that deal with Spotify.  All that expense must be incurred without any guarantee that you’ll ever get a penny in return.  And for every Artist who generates a handy revenue stream for you, there will be four or five who fail to make any sort of impact at all.  On top of it all, it may be you who screws up.  The Artist may leave you and sign for another Label, and under their guidance hit the big time.

For this reason, most Labels are very controlling when it comes to their stable of Artists.  They will control a large part of the product, how it sounds, whose arrangements are used - they’ll even kick out members of the band and bring in better session musicians.  If they don’t like your songs they’ll use their own songwriters.  The Labels are in the business of knowing what will sell and what won’t.  They won’t always get it right, but like a professional stock trader, they’ll get it right way more often than you will.  Even the poor sod at Decca who turned down The Beatles (his name was Dick Rowe) went on to sign The Rolling Stones.  Consequently, the Artists very soon find out exactly where on the totem pole a place has been reserved for them, even as their backs are being patted and their egos pumped.

So, as a musician, if you can do all that then you don’t need the services of a Label, and you’ll make ten times as much from Spotify as you might otherwise have done.  If not, then you have little choice but to work within the established Label system if you can get one sufficiently interested.  Otherwise, as a certain Norman Tebbit might have put it, you should consider getting ‘on yer bike’ and finding a proper job :)

Here’s the thing about musicians in particular, but Artists generally.  You are only an Artist while you are creating art for your own personal satisfaction.  As soon as you aim to sell it for even a modest profit you become a businessperson, no different from a restaurant owner or a plumber.  It’s a dog-eat-dog world, whether you’re selling art or amplifiers, and you need to have a minimum of business savvy if you are going to survive in it.  You need to identify smart things to do and dumb things to avoid doing.  The world has little sympathy for poor businesspeople.  And it won’t pay $10 for something if there is something else it thinks might be just as good priced at $9.95.  Don’t take my word for it.  Spend some time in Walmart.

My advice to musicians who fret about how much they are getting from Spotify is simple.  You are businesspeople first and foremost, and you had better start looking at yourselves in that light.  Would you open a paint store that only sold green paint?  I know I wouldn’t.  The thing about business is that sometimes the best thing to do is not the same as the thing you really wanted to do.  If you can’t - or won’t - see that, then your prospects for success will have a lot in common with buying a lottery ticket.  Which is fine, because most of us do not make particularly good businesspeople, and rarely win the lottery.  In which case you should go back to being an artist, and create art for no purpose other than your own satisfaction - in your spare time of course, since you’ll have a ‘proper’ job to do as well.

Tuesday, 3 February 2015

iTunes 12.1.0

I have been using iTunes 12.1.0 for a couple of days now.  It seems to work fine with BitPerfect.  The only issues I am aware of are those which also affected previous versions of iTunes and represent mostly edge cases and minor inconveniences.  BitPerfect users can feel comfortable upgrading from iTunes 12.0.x.

Tuesday, 20 January 2015

High-Order DSD

As support for regular DSD (aka DSD64) becomes close to a requirement not only for manufacturers of high-end DACs, but also for a number of entry-level models, so the cutting edge of audio technology moves ever upward to more exotic versions of DSD denoted by the terms DSD128, DSD256, DSD512, etc.  What are these, why do they exist, and what are the challenges faced in playing them?  I thought a post on that topic might be helpful.

Simply put, these formats are identical to regular DSD, except that the sample rate is increased.  The benefit in doing so is twofold.  First, you can reduce the magnitude of the noise floor in the audio band.  Second, you can push the onset of undesirable ultrasonic noise further away from the audio band.

DSD is a noise-shaped 1-bit PCM encoding format (Oh yes it is!).  Because of that, the encoded analog signal can be reconstructed simply by passing the raw 1-bit data stream through a low-pass filter.  One way of looking at this is that at any instant in time the analog signal is very close to being the average of a number of consecutive DSD bits which encode that exact moment.  Consider this: the average of the sequence 1,0,0,0,1,1,0,1 is exactly 0.5 because it comprises four zeros and four ones.  Obviously, any sequence of 8 bits comprising four zeros and four ones will have an average value of 0.5.  So, if all we want is for our average to be 0.5, we have many choices as to how we can arrange the four zeros and four ones.

That simplistic illustration is a good example of how noise shaping works.  In effect we have a choice as to how we can arrange the stream of ones and zeros such that passing it through a low pass filter recreates the original waveform.  Some of those choices result in a lower noise floor in the audio band, but figuring out how to make those choices optimally is rather challenging from a mathematical standpoint.  Theory, however, does tell us a few things.  The first is that you cannot just take noise away from a certain frequency band.  You can only move it into another frequency band (or spread it over a selection of other frequency bands).  The second is that there are limits to both how low the noise floor can be depressed at the frequencies where you want to remove noise, and how high the noise floor can be raised at the frequencies you want to move it to.
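To make this concrete, here is a toy noise shaper: a first-order sigma-delta modulator, vastly simpler than the high-order modulators real DSD encoders use, but enough to show a 1-bit stream being recovered by a low-pass filter.  The 64-bit moving average is the same crude averaging as in my illustration above:

```python
# Toy sketch of 1-bit noise shaping: a first-order sigma-delta modulator
# (far simpler than real DSD modulators), followed by a crude 64-sample
# moving-average low-pass to recover the encoded signal.
import numpy as np

fs = 64 * 44_100                       # the DSD64 sample rate
t = np.arange(16_384) / fs
x = 0.5 * np.sin(2 * np.pi * 1_000 * t)   # a 1kHz test tone

bits = np.empty_like(x)
err = 0.0
for i, s in enumerate(x):
    v = s + err                        # feed the accumulated error back in
    bits[i] = 1.0 if v >= 0 else -1.0  # the 1-bit quantizer
    err = v - bits[i]                  # quantization error, carried forward

# Low-pass: average 64 consecutive bits, as in the illustration above
recovered = np.convolve(bits, np.ones(64) / 64, mode='same')
corr = np.corrcoef(x, recovered)[0, 1]
print(f"correlation of recovered signal with original: {corr:.3f}")
```

Even this crude first-order loop tracks the tone closely; the quantization error has been shuffled up to frequencies the moving average then throws away.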

Just like digging a hole in the ground, what you end up with is a low frequency area where you have removed as much of the noise as you can, and a high frequency area where all this removed noise has been piled up.  If DSD is to work, the low frequency area must cover the complete audio band, and the noise floor there must be pushed down by a certain minimum amount.  DSD was originally developed and specified to have a sample rate of 2,822,400 samples per second (2.8MHz) as this is the lowest convenient sample rate at which we can realize those key criteria.  We call it DSD64 because 2.8224MHz is exactly 64 times the standard sample rate of CD audio (44.1kHz).  The downside is that the removed noise starts to pile up uncomfortably close to the audio band, and it turns out that all the optimizing in the world does not make a significant dent in that problem.

This is the fundamental limitation of DSD64.  If we want to move the ultrasonic noise further away from the audio band we have to increase either the bit depth or the sample rate.  Of the two, there are, surprisingly enough, perhaps more reasons to want to increase the bit depth than the sample rate.  However, these are trumped by the great advantages in implementing an accurate D/A converter if the ‘D’ part is 1-bit.  Therefore we now have various new flavours of DSD with higher and higher sample rates.  DSD128 has a sample rate of 128 times 44.1kHz, which works out to about 5.6MHz.  Likewise we have DSD256, DSD512, and even DSD1024.

Of these, perhaps the biggest bang for the buck is obtained with DSD128.  Already, it moves the rise in the ultrasonic noise to nearly twice as far from the audio band as it was with DSD64.  Critical listeners - particularly those who record microphone feeds direct to DSD - are close to unanimous in their preference for DSD128 over DSD64.  The additional benefits in going to DSD256 and above seem to be real enough, but definitely fall into the realms of diminishing returns.  However, even though the remarkably low cost and huge capacity of hard disks today makes the storage of a substantial DSD library a practical possibility, if this library were to be DSD512 for example, this would start to represent a significant expense in both disk storage and download bandwidth costs.  In any case, as a result of all these developments, DSD128 recordings are now beginning to be made available in larger and larger numbers, and very occasionally we get sample tracks made available for evaluation in DSD256 format.  However, at the time of writing I don’t know where you can go to download samples of DSD512 or higher.

In the Apple World where BitPerfect users live, playback of DSD requires the use of the DoP (“DSD over PCM”) protocol.  This dresses up a DSD bitstream in a faux PCM format, where a 24-bit PCM word comprises 16 bits of raw DSD data plus an 8-bit marker which identifies it as such.  Windows users have the ability to use an ASIO driver which dispenses with the need for the 8-bit marker and transmits the raw DSD data directly to the DAC in its “native” format.  ASIO for Mac, while possible, remains problematic.
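The framing itself is simple enough to sketch.  Per the DoP specification, each 24-bit PCM word carries 16 DSD bits in its lower bytes and the marker (0x05 and 0xFA, alternating frame by frame) in the top byte.  This toy packer handles one channel only and ignores everything a real driver must also deal with (channel pairing, bit ordering, the USB transport):

```python
# Toy sketch of DoP framing for a single channel: 16 DSD bits per 24-bit
# PCM word, with the 8-bit marker (0x05/0xFA, alternating each frame) in
# the most significant byte.  Real drivers also handle channel pairing,
# bit ordering, and transport details omitted here.
MARKERS = (0x05, 0xFA)

def dop_frames(dsd_bytes):
    """Pack a channel's DSD byte stream into 24-bit DoP words."""
    words = []
    for i in range(0, len(dsd_bytes) - 1, 2):
        marker = MARKERS[(i // 2) % 2]        # alternates every frame
        word = (marker << 16) | (dsd_bytes[i] << 8) | dsd_bytes[i + 1]
        words.append(word)
    return words

frames = dop_frames(bytes(range(8)))          # 8 DSD bytes -> 4 DoP words
print([hex(w) for w in frames])
```

A DAC that sees the alternating 0x05/0xFA pattern knows it is looking at DSD dressed up as PCM; genuine PCM audio essentially never produces that exact repeating top byte.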

As mentioned, DoP encoding transmits the data to the DAC using a faux PCM stream format.  For DSD64 the DAC’s USB interface must provide 24-bit/176.4kHz support, which is generally not a particularly challenging requirement.  For DSD128 the required PCM stream format is 24-bit/352.8kHz which is still not especially challenging, but is less commonly encountered.  But if we go up to DSD256 we now have a requirement for a 24-bit/705.6kHz PCM stream format.  The good news is that your Mac can handle it out of the box, but unfortunately, very few DACs offer this.  Inside your DAC, if you prise off the cover, you will find that the USB subsystem is separate from the DAC chip itself.  USB receiver chipsets are sourced from specialist suppliers, and if you want one that will support a 24/705.6 format it will cost you more.  Additionally, if you are currently using a different receiver chipset, you may have a lot of time and effort invested in programming it, and you will have to return to GO if you move to a new design (do not collect $200).  The situation gets progressively worse with higher rate DSD formats.

Thus it is that we see examples of DSD-compatible DACs such as the OPPO HA-1 which offers DSD256 support, but only in “native” mode.  What this means is that if you have a Mac and are therefore constrained to using DoP, you need access to a 24/705.6 PCM stream format in order to deliver DSD256, and the HA-1 has apparently been designed with a USB receiver chipset that does not support it.  It may not be as simple as that, and there may be other considerations at play, but if so I am not aware of them.

Interestingly, the DoP specification does offer a workaround for precisely this circumstance.  It provides for an alternative to a 2-channel 24/705.6 PCM format using a 4-channel 24/352.8 PCM format.  The 8-bit DoP marker specified is different, which enables the DAC to tell 4-channel DSD128 from 2-channel DSD256 (they would otherwise be indistinguishable).  Very few DAC manufacturers currently support this variant format.  Mytek is the only one I know of - as I understand it their 192-DSD DAC supports DSD128 using the standard 2-channel DoP over USB, but using the 4-channel variant DoP over FireWire.

Because of its negligible adoption rate, BitPerfect currently does not support the 4-channel DoP variant.  If we did, it would require some additional configuration options in the DSD Support window.  I worry that such options are bound to end up confusing people.  For example, despite what our user manual says, you would not believe the number of customers who write to me because they have checked the “dCS DoP” checkbox and wonder why DSD playback isn’t working!  Maybe they were hoping it would make their DACs sound like a dCS, I dunno.  I can only imagine what they will make of a 2ch/4ch configurator!!!

As a final observation, some playback software will on-the-fly convert high-order DSD formats which are not supported by the user’s DAC to a lower-order DSD format which is.  While this is a noble solution, it should be noted that format conversion in DSD is a fundamentally lossy process, and that all of the benefits of the higher-order DSD format - and more - will as a result be lost.  In particular, the ultrasonic noise profile will be that of the output DSD format, not that of the source DSD format.  Additionally, DSD bitstreams are created by Sigma-Delta Modulators.  These are complex and very challenging algorithms which are seriously hard to design and implement successfully, particularly if you want anything beyond modest performance out of them.  The FPGA-based implementation developed for the PS Audio DirectStream DAC is an example of a good one, but there are some less-praiseworthy efforts out there.  In general, you can expect to obtain audibly superior results pre-converting instead to 24/176.4 (or even 24/352.8) PCM using DSD Master, which will retain both the extended frequency response and the lower ultrasonic noise floor of the DSD256 original.