Compared to much of the animal kingdom, human beings have pretty terrible hearing. We’re poor at localizing sounds, especially ones that come from behind us; we can only hear a relatively narrow bandwidth of roughly 20 Hz to 20 kHz; and we’re easily fooled by illusions.
Some illusions, like those discovered by Diana Deutsch throughout her career, play on our brain’s inability to process too much information at once, and are a lot like the illusion of moving pictures in a film. Other illusions take advantage of our mind’s propensity to pick up on several competing cues at once, giving us results that are like an audio version of an M.C. Escher painting.
But perhaps the most convincing tricks are the ones we play on ourselves. Expectation is one of the most powerful forces in shaping our perception, and it’s the reason that the same wine tastes better if we’re told it costs $90 instead of $10. The same goes for our stereo systems.
Because of this, I’m a strong believer in the value of blind listening tests. Engineers tend to love them, because they help us to test our ears and to identify which of our choices have meaningful effects and which amount to audio placebo.
Marketers of expensive audiophile snake oil, on the other hand, tend to hate blind listening tests. This is because the tests tend to show that dubious products like specialty power cords and bags full of magic pebbles have about as much impact on your sound as, well, a bag full of rocks.
But to be fair, what’s impressive about these products is that, in a sense, they do work, in much the same way that placebo medical treatments work. They work because we believe the story. Our minds long for convincing narratives, and we find them everywhere we look. This isn’t a matter of willpower, either. It’s a matter of science. Even the smartest and most objective among us are amazingly susceptible to suggestion.
But good engineers, musicians and producers want to get results that work by themselves: results that don’t need a story to work. We want to make records that are so incredible they are the story. Blind listening is one of the most effective weapons in our arsenal to help us do just that.
Technically, there is still some debate about the merits of blind listening tests. But that’s only true in the same sense that there’s technically still some debate about whether evolution is true, whether HIV causes AIDS, or whether God personally brought on the 2011 earthquakes in Japan for a variety of specific religious and political reasons.
Overall, blind listening is one of those rare subjects where one side is supported by the full weight of the scientific method, and the other side is full of superstitious hucksters with a bridge to sell you.
With that said, there are a few cases where there’s potential to misread the results of blind listening tests, and there are other times when these tests won’t tell us everything we want to know. But more on that later. First, let’s see how well you do.
Test Yourself: Do MP3 Bit-Rates Matter?
If you’re like the average music fanatic, you probably have an opinion on the quality of MP3s, even though you’ve never tested yourself blind to find out if you can hear the difference.
Well, today we’re going to do just that. Click over to MP3orNot.com for a couple of minutes and take a quick listening test to see if you can tell the difference between a lower-resolution 128kbps MP3 and a higher-resolution 320kbps file.
Go ahead, click here. We’ll wait. And we promise not to judge.
How Did You Do?
If you’re anything like the general population, the answer is probably “not so good.”
If you scored anything less than 4-out-of-5 on this double-blind test, or if you can’t repeat a high score, I’m sorry to say that your results are no better than chance.
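For the curious, here is a rough sketch of the arithmetic behind that cutoff. It assumes each of the five trials is a two-way choice that a pure guesser gets right half the time (that is my assumption about the setup, not something the site states), and the function name `chance_of_at_least` is made up for illustration.

```python
from math import comb


def chance_of_at_least(correct: int, trials: int, p_guess: float = 0.5) -> float:
    """Probability of getting `correct` or more answers right out of
    `trials` by guessing alone (a binomial tail probability).

    p_guess is the assumed chance of guessing one trial correctly;
    0.5 models a two-way A/B choice.
    """
    return sum(
        comb(trials, k) * p_guess**k * (1 - p_guess) ** (trials - k)
        for k in range(correct, trials + 1)
    )


# How often would a pure guesser reach each score on a five-trial test?
for score in range(6):
    print(f"{score}/5 or better by guessing: {chance_of_at_least(score, 5):.3f}")
```

Under those assumptions, a pure guesser lands 4-out-of-5 or better about 19% of the time and a perfect 5-out-of-5 about 3% of the time, which is why being able to repeat a high score tells you more than getting one once.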
A similar test on NoiseAddicts.com reveals that listeners are pretty evenly split as to which choice is the superior file. The NoiseAddicts test even shows a slight tendency for music fans to prefer the lower-quality MP3 file. I’m not surprised by that finding. I’ve heard plenty of anecdotal evidence from college professors around the country suggesting that when they blind-test their young students, the majority of them prefer the MP3 versions to full-resolution WAV files.
If you found this test frustrating, you’re not alone. To restore your faith in yourself, read this brief and fantastic article on MaximumPC. It describes the results and reactions of four hardcore music fans to a similar test.
And if you want to try a test that’s a little easier to pass, try this one, which compares a fairly poor 128kbps MP3 encode to a full-resolution WAV file. But keep in mind that most people still can’t get it right.
Some Of Us Can Hear The Difference
I’ll admit it: When I took the test on MP3orNot.com I got a perfect score on my first pass: 5-out-of-5.
I could continue replicating that score for you all day long, if you wanted me to. This is probably partly because I have a propensity to do well on these kinds of tests, but mostly because I’m one of those giant nerds who has spent thousands of hours of his life listening for subtle sonic changes as he moves a microphone or tweaks an EQ.
This doesn’t mean I can explain exactly how you can learn to hear these differences for yourself. Taking tests like these is kind of like staring at one of those 3D posters with a hidden image of a unicorn. You relax, spread your focus, try not to think too much, and then BAM! There it is: A clear image of a team of anthropomorphic dinosaurs playing roller-hockey in full relief.
But knowing how brilliant and attractive the average TMimaS listener is, I’m sure some of you guys saw the dinosaurs too.
The Sins and Salvation of Blind Listening
As we’ve seen on NoiseAddicts and heard from professors at SUNY Purchase and NYU, there appear to be times when a small majority of young listeners report that they’re unsure if they can hear a difference, but end up showing a preference for lower-resolution files at a rate that seems higher than chance.
I haven’t seen great science on the subject, but it’s plausible to imagine this is a simple matter of listeners showing a subconscious preference for the types of sounds they’re already accustomed to. Or, because lower-resolution MP3s sometimes sound slightly brighter than their high-res counterparts, the brightness itself could be skewing the results, since brighter sounds tend to win out in quick “sip-tests” of two signals.
I have some anecdotal evidence of my own that could lead me to believe either of those theories. Years ago, I participated in a blind listening test on a website that will remain anonymous. This A/B test pitted a famous vintage hardware compressor against its digital plug-in counterpart. I had used each on many occasions and was certain I could tell which was which.
It turns out that I was right about that. But what amazed me were the results of the poll. As the votes came in, the crowd was split at first, and then began to veer in favor of the software plug-in. Not only did a small-but-significant majority of listeners show a preference for the sound of the plug-in, they also believed that they had selected the hardware version, because they believed the hardware version should sound better.
Once the curtain was lifted and the results were revealed, the listeners’ stories immediately began to change. The majority of participants now began to prefer the hardware version of the equipment in the non-blind version of the test.
Now this is where blind tests get messy. Did the test successfully show that the plug-in version sounded “better” or that there was no discernible difference between the two?
No, not at all. We can’t come to that conclusion because a minority of trained listeners were able to blindly pick out the hardware version from the software version, and consistently preferred the hardware version they had selected. We also can’t come to that conclusion because, with practice, listeners were able to improve their results in future blind tests of the same two devices.
Blind tests are valid and incredibly useful, but in aggregate, they also have the potential to lead us in the wrong direction. If you look at the average of a test’s results too hastily, you run the risk of missing when there’s room for personal taste, for participants to improve their scores, and for a meaningful minority of demonstrated preferences to develop around informed experience rather than conjecture.
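To make the point about averages concrete, here is a minimal sketch of looking at each listener separately rather than only at the pooled score. Everything in it is hypothetical: the listener names, the scores, the 20-trial length, the 1% cutoff and the function names are invented for illustration and are not data from the test described above.

```python
from math import comb

TRIALS = 20  # hypothetical number of A/B trials per listener

# Hypothetical scores: correct identifications out of TRIALS for each listener.
panel = {
    "listener_a": 19, "listener_b": 18, "listener_c": 10, "listener_d": 9,
    "listener_e": 11, "listener_f": 8, "listener_g": 10, "listener_h": 9,
    "listener_i": 10, "listener_j": 9,
}


def guess_probability(correct: int, trials: int, p_guess: float = 0.5) -> float:
    """Chance of scoring `correct` or better out of `trials` by guessing alone."""
    return sum(
        comb(trials, k) * p_guess**k * (1 - p_guess) ** (trials - k)
        for k in range(correct, trials + 1)
    )


def pooled_rate(scores: dict, trials: int) -> float:
    """Fraction of all trials answered correctly across the whole panel."""
    return sum(scores.values()) / (trials * len(scores))


def reliable_listeners(scores: dict, trials: int, alpha: float = 0.01) -> list:
    """Listeners whose individual score would be very unlikely from guessing."""
    return sorted(
        name for name, correct in scores.items()
        if guess_probability(correct, trials) < alpha
    )


print(f"Pooled hit rate: {pooled_rate(panel, TRIALS):.1%}")
print("Unlikely to be guessing:", reliable_listeners(panel, TRIALS))
```

With these made-up numbers the pooled hit rate sits only a little above a coin flip, yet two listeners clear the cutoff on their own. A real analysis would also have to account for testing many listeners at once, so treat this as an illustration of the idea rather than a finished method.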
Does Any of This Really Matter?
Now, lest I come across as cocky about my bat-like hearing, I want you to rest assured that A) I have the bat-like vision to match, and B) I know there are plenty of other blind tests where I’d never hear a difference myself.
In the compressor test there was a portion of “golden ears” who could consistently differentiate between sources and showed a clear preference based on experience. But there are plenty of other blind tests where no “golden ear” can tell the difference reliably. Those differences are just too slight.
In test after test, even trained listeners have trouble telling codecs apart when they’re created at 160kbps and up, and I’m not yet aware of a study where blind listeners were able to tell 320kbps files apart from CDs.
With that said, I embrace the trend toward higher-resolution multi-tracks, higher-resolution masters and higher-resolution consumer playback for many reasons, and consumer sound quality is just one of them. It’s also plausible that differences we can’t hear in an instant in an A/B “sip test” can become apparent over longer periods of time.
It’s not wise to trust superstition over science, but it’s important that we avoid becoming reductionist in our reliance on data, and that we remember not to totemize our current trials when better ones can still be imagined and designed.
What I recommend is twofold:
1) We should listen for ourselves and discover what differences we can and can’t hear, without the luxury of telling ourselves the stories that can convince our ears of almost anything. Too many smart and well-meaning people make exaggerated claims about audio that they just can’t back up.
2) While we pursue better and better results, let’s also remember to be happy with what we’ve got. Technology can sound good, but music sounds best, and sometimes the story we tell ourselves is the most important thing.