You can’t get far into an audio forum or the comments sections of audio websites without encountering the statement “Some products that measure well sound bad, and some products that measure poorly sound good.” Depending on who said it, it’s at best uninformed and at worst a lie. And it’s a lie that sometimes sticks listeners with underperforming audio gear.

The inaccuracy of the first half of that sentence can be shown in scientific papers and in the absence of documented examples. The second half could be true if the words “. . . to me” were added, but to the best of my recollection, I’ve always seen it presented as a universal statement, in which case it’s false. This platitude reflects not wisdom, but a rejection of science by people who, as far as I can tell, haven’t bothered to look into the science and have no measurement experience.

The Biggest Lie

One of the most glaring examples of this sentiment appeared just this month, in a review of the Tannoy Revolution XT 6 speaker by Herb Reichert in the July 2020 issue of Stereophile. The first sentence of the review reads, “I’ve been wrestling with my elders about new ways to measure loudspeakers, lobbying for methods that might collaborate [sic] more directly with a listener’s experience.” In another article, the same writer states his opinion more directly: “As a tool for evaluation, or as a predictor of user satisfaction, today’s measuring procedures are almost useless.” As we’ll see, this review clearly shows why measurements are so essential in the evaluation of audio products.

Both of the author’s statements reflect ignorance of the subject. In the case of speakers, measurement methods that have been shown to predict user satisfaction with 86% correlation were established more than 30 years ago. They were developed largely through extensive research led by Dr. Floyd Toole, conducted at Canada’s National Research Council (NRC) in Ottawa, and continued at Harman International. Countless speaker companies now use these methods as a design guideline. That’s because they know that speakers that measure well according to these principles will sound good to most listeners.

Some might point out that the model fails 14% of the time, but it’s unlikely that the 14% of speakers that measure well but didn’t win universal love from the listening panel sound “bad,” unless they have, say, high distortion -- which a different set of measurements could easily detect. Regardless, it’s absurd to proclaim an 86% success rate “almost useless.”
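To make the correlation figure concrete, here is a minimal Python sketch of how agreement between predicted and observed preference ratings can be scored. The ratings are invented for illustration, and the function name `pearson_r` is hypothetical; it is not the model used in the NRC or Harman research. If the 86% figure is read as a Pearson correlation coefficient of 0.86 (an assumption; the text above does not specify), squaring it suggests that roughly 74% of the variance in listener ratings is accounted for by the measurements.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical data: ratings predicted from measurements vs. mean ratings
# given by a blind listening panel (one pair per speaker).
predicted = [3.1, 4.0, 5.2, 5.9, 6.4, 7.1, 7.8]
observed = [3.5, 3.6, 5.8, 5.1, 6.9, 6.6, 8.0]

r = pearson_r(predicted, observed)
print(f"r = {r:.2f}, r squared = {r * r:.2f}")
```

A value of r near 1 means the speakers that the measurements rank highly are also the ones the panel ranks highly.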

More recently, scientific research has produced headphone and earphone measurements that predict user satisfaction about as accurately. For example, in AES paper 9878, “A Statistical Model that Predicts Listeners’ Preference Ratings of In-Ear Headphones: Part 2 -- Development and Validation of the Model,” a Harman International research team of Dr. Sean Olive, Todd Welti, and Omid Khonsaripour reports a 91% correlation between measurements and listener preferences in an evaluation of 30 earphones using 71 listeners.


I’ll agree that measurements don’t predict which amps, DACs, and other electronics people will like. But that’s not because of flaws in the measurements -- it’s because listeners rarely agree about which audio electronics they like. Blind tests seldom show clear differences between, or preferences for, certain models, brands, or types of amplifiers, for instance. Reviews of these products do not indicate preference trends among reviewers; they tend to rave about all sorts of amps and DACs. If a statistically significant number of participants in controlled listening tests don’t express affection for some audio electronics and disdain for others, there’s no way measurements or subjective reviews can predict listener preference.

What about the idea that “some products that measure poorly sound good”? A solid argument against this notion came from Stereophile technical editor (and former editor-in-chief) John Atkinson, who, in a summary of his 1997 AES presentation, stated, “. . . once the response flatness deviates above a certain level -- a frequency-weighted standard deviation between 170Hz and 17kHz of approximately 3.5dB, for example -- it’s unlikely the speaker will either sound good or be recommended.” And he’s talking here about the speakers recommended by Stereophile writers. Research shows that a panel of multiple listeners in blind tests would likely be even less forgiving of speakers that measure poorly.
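The kind of flatness metric Atkinson describes can be sketched in a few lines of Python. This is only a rough illustration: the response data are invented, the function name `band_std_db` is hypothetical, and the exact frequency weighting behind the 3.5dB figure is not given in the text above. As a stand-in assumption, the sketch uses octave-spaced points and weights each one equally.

```python
import statistics

def band_std_db(response, low_hz=170.0, high_hz=17000.0):
    """Population standard deviation (dB) of the levels inside [low_hz, high_hz]."""
    levels = [db for hz, db in response if low_hz <= hz <= high_hz]
    if len(levels) < 2:
        raise ValueError("need at least two points inside the band")
    return statistics.pstdev(levels)

# Hypothetical on-axis response: (frequency in Hz, level in dB relative to 1kHz).
# Octave spacing with equal weights is an assumption, not Atkinson's method.
response = [
    (100, -2.0), (200, -0.5), (400, 0.5), (800, 0.0), (1600, -1.0),
    (3200, 1.5), (6400, 3.0), (12800, 4.0), (20000, 1.0),
]

spread = band_std_db(response)
print(f"Standard deviation, 170Hz-17kHz: {spread:.2f}dB")
print("Above the 3.5dB figure" if spread > 3.5 else "Within the 3.5dB figure")
```

Points outside the 170Hz to 17kHz band (here, 100Hz and 20kHz) are ignored, so deep-bass rolloff does not count against the speaker in this sketch.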


Of course, even a clearly flawed audio product might sound good to somebody. To find an example, look no further than the very same Tannoy review. Atkinson’s measurements show that, as he puts it, “. . . the tweeter appears to be balanced between 3dB and 5dB too high in level,” which creates an “excess of energy in the presence region, which I could hear with the MLSSA pseudorandom noise signal when I was performing the measurements.”

To get a rough idea of what this sounds like, turn the treble knob on an audio system up by 4dB. It’s far from subtle, and it’s not pleasant. I can’t look at that measurement without thinking the factory used the wrong tweeter resistor. In a blind test with multiple listeners, such as the evaluations conducted by the NRC or Harman, this speaker would almost certainly score poorly.
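For a sense of scale, decibels convert to linear ratios with standard formulas: a 4dB boost is roughly 1.6 times the voltage and 2.5 times the power. The short Python sketch below applies those formulas to the 3dB to 5dB range quoted above; the function names are chosen here for illustration.

```python
def db_to_amplitude_ratio(db):
    """Voltage (amplitude) ratio for a level change in dB: 10^(dB/20)."""
    return 10 ** (db / 20)

def db_to_power_ratio(db):
    """Power ratio for a level change in dB: 10^(dB/10)."""
    return 10 ** (db / 10)

for boost in (3.0, 4.0, 5.0):
    amp = db_to_amplitude_ratio(boost)
    power = db_to_power_ratio(boost)
    print(f"+{boost:.0f}dB: x{amp:.2f} amplitude, x{power:.2f} power")
```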

Yet I find no mention of this flaw in the subjective review. In fact, the reviewer describes the speaker’s sound as “slightly soft,” and concludes with the words “Highly recommended.” Based on this review, at least, it seems likely that if a measurement technique could be found that reliably predicts which speakers this reviewer likes, most listeners won’t like those same speakers.

Fortunately, those who read the measurements got the real story. Those who ignored the measurements because they’ve been told they’re “almost useless” may end up buying a speaker with an obvious tonal-balance error.

Don’t get me wrong -- I don’t mind if someone raves about an audio product with a huge, demonstrable flaw, just as I’d hope no one minds if I occasionally enjoy listening to Kiss’s Alive! album. I’ve read many such reviews, and rarely felt inspired to comment on them. But dismissing decades of work by some of the world’s most talented audio scientists just because it doesn’t fit your narrative is as frivolous as claiming that Gene Simmons is the greatest bass player of all time.

I would hope that audio writers would be curious about their avocation and want to learn everything they can about it, but a huge percentage of them have shut themselves off from any new information that might cast some of their beliefs in doubt. In their rejection of science, they’ve mired their readers and their industry in nonsense -- and in many cases, they’ve stuck their readers in the infinite loop of buying underperforming products and then selling those to buy other flawed products, instead of simply learning key facts about audio so they can buy good gear the first time.


I’m encouraged, though, because the headphone community isn’t burdened with an anti-science attitude. On the contrary, headphone enthusiasts are putting together measurement rigs, reading the research, and working to understand how their headphones and amps work and interact. Yet they understand that science provides only guidelines, and that they ultimately have to listen for themselves and trust their ears to make the final judgment. Most important, they are getting better reproduction of, and more enjoyment from, their music. I think and hope that this is the future of audio.

. . . Brent Butterworth

    Todd · 11 hours ago
    Question for Brent,

    Brent, I recently heard Andrew Jones remark that it is a misnomer that the majority of speaker damage is caused by amplifier clipping. Rather, he said speaker damage is almost always caused by too much power. I was surprised, and would like to hear your view on this if you care to opine.
    Dustin · 1 days ago
    Great article, Brent. There was also a fair bit of back and forth on this topic in the comments section in another recent Stereophile article (Totem Skylight speakers). I tried to argue in favour of what the science has demonstrated (username: buckchester). I even got some replies from Jim Austin, the new editor. I was disappointed with his responses. It’s frustrating when so many people in this hobby are so obviously irrational.

    Floyd Toole posts quite often on AVS Forum and he has actually stated that when speakers of similar bass capability were used, the correlation actually increased from 86% to 99%. I can find you the exact quote if you’d like.
    Dr. Ears · 2 days ago
    The biggest lie in audio is, "I think your system sounds better than mine".
    I have been buying & selling New Old Stock audio tubes for four decades.
    Whenever I buy a decent size lot, I take the best and worst testing pairs and listen to them, I have never heard an audible difference, so I concluded long ago that whatever we are testing for cannot be heard.
    As components have gotten better with the notable exception of audio vacuum tubes, we can now reproduce a flat frequency curve better than ever.
    However, I believe that most of us find a flat frequency curve to sound harsh with listening fatigue occurring fairly quickly.
    John Mayberry · 2 days ago
    Measurements are important. Yet they are not always definitive and don't necessarily tell a story accurately.

    I remember Dick Heyser and his knuckle test. He'd knock his knuckles on the side of a speaker. He said, "if you like that note you're going to love this speaker".

    The simple fact is there are only a few speakers which provide even a passable waterfall response or an impedance measurement without major phase related anomalies. A great many of them are truly a dog's breakfast.

    We still don't have a musical transfer function 50 years after first being postulated.

    That's before we even consider their interaction with the acoustic environment.

    Yes, testing is important. But 99% of the speakers out there don't test well. That may be the gist of the issue.

      Brent Butterworth · 2 days ago
      Hi, John. Much of what you've said is new to me. With waterfall responses, have we determined what "passable" is? Is there research that ties these to blind listening test results?

      Ditto for impedance -- my measurements demonstrate the correlation of headphone impedance curves with sensitivity to output impedance of the source device, but I don't know of any research that ties speaker impedance curves to listening test results, other than that a <4-ohm impedance is more than a lot of amps can handle.

      I measure only about 15-20 speakers a year right now, but I did a lot more when I worked for Sound & Vision. Off the top of my head, I'd guess that a third of them measured pretty well. Maybe even half of them. Of course, those were mostly fairly mainstream products; if you measured all the speakers at a high-end audio show, I expect the percentage wouldn't be as high.
    peter lyngdorf · 2 days ago
    Great article, and very timely. I recently heard the argument from an audio designer that if a power amplifier measures well then it cannot possibly sound good. I tried to argue, to no avail.
    Erin · 3 days ago
    I appreciate this article. I believe that *proper measurements* and *accurate measurements* can do considerably more in a review than someone's subjective impression. Unfortunately, these two criteria are often failed. A measurement isn't useful if it doesn't meet these criteria. In my humble opinion. I believe, at the very least, objective measurements should be taken and an attempt to correlate with what was heard should be made. I'm trying to do that at my site. Here's an example review: https://www.erinsaudiocorner.com/loudspeakers/neumi_bs5/
    no ne · 3 days ago
    I'm naturally inclined towards the scientific method, so want to go by measurements, but have found both measuring and listening important (and not always in agreement).

    Would Brent please comment on how a company like B&W makes wildly-popular speakers that sound good to so many people (including Mr. Atkinson) yet measure so poorly? It almost seems like they use their engineering talents to tease Mr. Atkinson and the rest of us measurement types. I must assume that their creation is intentional, given their resources, so they've intentionally made a recessed-mid-range and hot tweeter sound good to a majority of listeners (both professional and consumer).

    I've personally struggled with this so am trying to understand. I've tried to gain a decent understanding of speaker science and measurement. From that I respect Revel more than B&W, for instance. Yet when I spend hours in the same room auditioning both brands I can't get my ears to agree with my intellect. I haven't had the pleasure of blind tests at Harman's facility, but nor would I be using the speakers I buy there. So I enjoyed this article, am interested in the topic, yet want more information to corroborate what I hear with what I understand. Thanks.
      Brent Butterworth · 2 days ago
      Great question, and one that real speaker aficionados (e.g., people who measure, design, build, etc.) often discuss. The key here is "blind listening tests." B&Ws enjoy an esteem that almost certainly helps them in sighted testing; I suspect a desire to maintain that brand identification is the reason they stuck with Kevlar diaphragms long after most others abandoned them. But I cannot recall a situation where B&Ws excelled in a multi-listener blind test. In fact, most of the times when manufacturers have demoed their speakers for me in blind tests, a B&W model was among those they chose to demo against.

      I haven't measured or evaluated a B&W model in three or four years, but I have reviewed many, going back to the early 1990s. From what I have observed, the company's speaker lines are inconsistent. There seem to be great and mediocre models within each line. I cannot identify a common design philosophy running through them. In comparison, I have done a blind test with multiple models from the Revel Performa2 series, and they sound (and measure) almost the same.

      Speaker popularity from a sales standpoint has little to do with performance -- if memory serves, Bose was the most popular speaker brand from the late 1980s until a few years ago, when they were replaced by Amazon. Neither is revered for sound quality.
        Dr. Ears · 2 days ago
        My gutted and re-done Green Mountain Audio speakers sound great for two main reasons: first, they are time-aligned and phase-coherent; second, they are 4-ways using only first-order crossovers. Matching drivers is a bitch and a once in a lifetime achievement.
    gzost · 3 days ago
    Thank you for the article! There are still way too many people out there who ignorantly deny the value of measurements, and this is a good intro text to point them to.

    Also: I am continually amazed that publications like Stereophile exist and find readers. Their reviews provide no measurable value to the reader.
    gabs · 3 days ago
    Well, I discovered that on every headphone and earphones I have, I find that music is "better" through my Apple type-c dongle than my Chord Mojo. Measurements prove that. But the Mojo is beautiful and has taps you know...
    todd · 3 days ago
    Thank you brent, fantastic
