The Compact Disc format that we know and love today dates to around 1980, when Philips and Sony settled on the ‘Red Book’ standard, specifying a 120mm optical disc containing two channels of digitally encoded audio, with 16bit values sampled at 44,100Hz. The format could store 74–80 minutes of music on a single disc, almost double the capacity of a typical long-playing record.
The CD has been with us now for 42 years, and in that time it’s persisted as a relevant format despite the arrival of others such as SACD, DVD-Audio and Blu-ray Audio, and indeed the download and streaming revolution, with its promises of music delivered in up to 24bit 192kHz resolution.
It’s this latter revolution that has probably posed the biggest threat to the CD’s existence, both because of the ease with which downloads and streams can be distributed, and because their on-paper promise of much improved audio resolution tantalises audiophiles with the prospect of music at a quality far higher than our ageing silver disc can offer.
Internet forums, online music groups and hifi blogs are awash with praise for so-called high-definition formats, with countless personal testimonies claiming that an HD music download offers far better quality than a CD.
Nothing could be further from the truth, however. In fact, there is good evidence that sample rates up to 192kHz can actually be detrimental to sound quality, due to factors like intermodulation distortion: ultrasonic content that playback equipment can’t reproduce cleanly generates distortion products down in the frequencies we can hear.
To understand why 16bit 44.1kHz is all we need to enjoy audio at the best quality, we need to delve a little into the science behind what exactly those numbers mean, namely the bit depth and the sample rate. Let’s start with the sample rate:
CD audio is encoded at 44.1kHz following the Nyquist–Shannon sampling theorem, which states that the sample rate must be at least twice the highest frequency being captured. Since human hearing tops out at around 20kHz, 44.1kHz comfortably covers everything our ears can detect. Before sampling, an anti-aliasing filter blocks any audio above the Nyquist frequency (22.05kHz); without it, ultrasonic content would be folded down and incorrectly reproduced as frequencies in the audible range. On playback, a reconstruction filter smooths the sampled waveform back into a nice ‘ear friendly’ curve. In the days of analogue filters, much higher sample rates (up to 96kHz and beyond) were sometimes used. Importantly, this wasn’t done to increase the resolution of the audio, but to give the gentle slope of an analogue anti-aliasing filter more working space above the audible band.
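To make that folding-down effect concrete, here’s a minimal Python sketch (the tone frequencies are chosen purely for illustration). A 25kHz tone sampled at 44.1kHz produces exactly the same samples as an audible 19.1kHz tone with inverted phase, which is why the anti-aliasing filter has to remove it before sampling:

```python
import math

FS = 44_100  # CD sample rate (Hz)

def sample_tone(freq_hz, n_samples, fs=FS):
    """Return n_samples of a sine wave at freq_hz, sampled at fs."""
    return [math.sin(2 * math.pi * freq_hz * n / fs) for n in range(n_samples)]

# 25kHz is above the Nyquist frequency (22.05kHz). Once sampled, it is
# indistinguishable from its alias at 44.1kHz - 25kHz = 19.1kHz, an
# entirely audible frequency (here with inverted phase).
ultrasonic = sample_tone(25_000, 64)
alias      = sample_tone(19_100, 64)

worst = max(abs(u + a) for u, a in zip(ultrasonic, alias))
print(f"max sample difference: {worst:.2e}")  # effectively zero
```

Once the samples are identical, no downstream processing can tell the two tones apart; the damage has to be prevented before the converter, not repaired after it.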
Modern digital to analogue converters (DACs) and analogue to digital converters (ADCs) have much more capable digital filters, meaning there is no longer any need to include these higher frequencies, and doing so can actually be detrimental. In short, sample rates of 44.1kHz and 48kHz offer all the resolution we could ever need, so the 44.1kHz sampling rate of CD audio is more than adequate to correctly capture any frequency that we can hear.
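As an illustration of how intermodulation distortion does its damage, here’s a small Python sketch (pure standard library; the 30kHz/33kHz tones and the mild square-law nonlinearity are my own assumptions, standing in for a real amplifier or tweeter). Two ultrasonic tones, inaudible on their own, pass through a slightly nonlinear playback chain and generate a difference tone at an entirely audible 3kHz:

```python
import math

FS = 192_000  # a 'hi-res' sample rate (Hz)
N = 9_600     # 50 ms of audio

def goertzel_mag(samples, freq, fs=FS):
    """Magnitude of one frequency component (Goertzel algorithm)."""
    w = 2 * math.pi * round(len(samples) * freq / fs) / len(samples)
    coeff = 2 * math.cos(w)
    s1 = s2 = 0.0
    for x in samples:
        s1, s2 = x + coeff * s1 - s2, s1
    return math.sqrt(s1 * s1 + s2 * s2 - coeff * s1 * s2)

# Two ultrasonic tones: neither is audible by itself.
clean = [0.5 * math.sin(2 * math.pi * 30_000 * n / FS)
         + 0.5 * math.sin(2 * math.pi * 33_000 * n / FS)
         for n in range(N)]

# A mildly nonlinear playback chain (hypothetical square-law term).
distorted = [x + 0.2 * x * x for x in clean]

# The nonlinearity creates a difference tone at 33 - 30 = 3kHz: audible.
print(f"3kHz in clean signal:     {goertzel_mag(clean, 3_000):8.2f}")
print(f"3kHz after nonlinearity:  {goertzel_mag(distorted, 3_000):8.2f}")
```

The 3kHz energy simply doesn’t exist in the clean signal; it is manufactured by the playback hardware from ultrasonic content it was never designed to handle, which is exactly why filtering that content out at 44.1kHz loses nothing and can gain a lot.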
Removing things from audio never sits well with those who expect removal to mean reduced quality. But you have to remove the bad stuff to hear the good stuff, which, ironically, is a principle we also employ in mixing.
I’ve heard some people claim that capturing audio in the region above 22.05kHz somehow magically imbues the frequencies in the audible range with extra ‘specialness’, though what exactly that consists of is never explained. There is simply no credible evidence to support that theory, yet the assertion that higher sample rate audio is ‘just better’ still persists despite the science stating otherwise.
Let’s look next at 16bit vs 24bit.
The terms 16bit and 24bit are sometimes incorrectly described as bit rates, but they are actually bit depths, and describe the resolution of the quantisation process used to represent the volume dynamics in audio. By the traditional calculation of roughly 6dB per bit, 16bit audio can represent a dynamic range of about 96dB, but in reality the perceived dynamic range can be higher still thanks to dithering, a process applied whenever we reduce the bit depth of audio from, say, 24bit to 16bit. Noise-shaped dither pushes the quantisation noise into frequency ranges where the ear is least sensitive, which extends the perceived range to around 120dB.
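Those numbers fall out of a standard formula: the theoretical signal-to-noise ratio of an ideal N-bit quantiser, measured with a full-scale sine wave, is roughly 6.02 × N + 1.76 dB, which is where the rule of thumb of about 6dB per bit comes from. A quick Python check:

```python
def dynamic_range_db(bits):
    """Theoretical SNR of an ideal quantiser for a full-scale sine wave."""
    return 6.02 * bits + 1.76

for bits in (16, 24):
    print(f"{bits}bit: {dynamic_range_db(bits):.1f} dB")
```

This gives about 98dB for 16bit and about 146dB for 24bit; the commonly quoted 96dB figure for CD is the simpler 16 × 6dB, and either way noise-shaped dither pushes the audible limit further still.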
Bit depth has no bearing on frequency response or detail; it simply sets the available dynamic range. In practical terms, 120dB represents the difference between the quietest sound perceptible to the human ear in a perfectly silent room and a sound level that would cause hearing damage, akin to standing next to a running jet engine.
I think we can safely say that 16bit is more than anyone needs in audio playback.
And in summary, the humble CD can convey all the dynamic range any human ever needs.
So if we only need 16bit to represent any music in playback, why do audio engineers record and mix in 24bit? Well, there are still good reasons why working in 24bit is desirable. When recording, a 24bit depth gives a lower noise floor, more headroom and a larger tolerance for inadvertently introducing clipping. Whilst the noise on a single 16bit audio file is insignificant, we typically work with hundreds of tracks in a single session, using huge numbers of plugins. Across thousands of operations the summed noise would eventually become noticeable, so we use the best available bit depth to minimise it.
Once the final track is mixed, we dither down to 16bit for CD distribution, where there is no reason to keep any more than 16 bits.
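For the curious, that dithering step can be sketched in a few lines of Python. This is a simplified illustration with hypothetical sample values, not production mastering code: before truncating a 24bit sample to 16bit, we add a tiny amount of triangular (TPDF) noise so that the quantisation error becomes benign, signal-independent noise rather than distortion correlated with the music:

```python
import random

def dither_to_16bit(sample_24bit):
    """Reduce a 24bit integer sample to 16bit using TPDF dither.

    Adding triangular-distribution noise of about one 16bit step before
    rounding decorrelates the quantisation error from the signal.
    """
    lsb = 2 ** (24 - 16)  # one 16bit step measured in 24bit units (256)
    # Sum of two uniforms gives a triangular PDF spanning (-lsb, +lsb).
    noise = random.uniform(-lsb / 2, lsb / 2) + random.uniform(-lsb / 2, lsb / 2)
    quantised = int(round((sample_24bit + noise) / lsb))
    return max(-32768, min(32767, quantised))  # clamp to the 16bit range

print(dither_to_16bit(1_000_000))  # a mid-level 24bit sample, now 16bit
```

A real mastering chain would typically use noise-shaped dither, which additionally steers that noise towards frequencies where the ear is least sensitive; the plain TPDF version above just shows the principle.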
In recent years the channels for distribution have increased beyond CD audio, and we also have streaming and downloads to consider. These platforms tend to accept the audio natively in 24bit 96kHz or 24bit 44.1kHz, so there is no need to dither down to 16bit, which I think is where the myth and legend around ‘hi-def’ formats has grown. Dithering is now an additional process needed only for CD, whereas it’s actually simpler to issue music in 24bit 44.1kHz or 96kHz. And who wouldn’t be impressed by files promising 24bit 96kHz quality rather than boring old 16bit 44.1kHz?
Well as I’ve explained, there’s no difference that you can hear.
But don’t just take my word for it: many established music journals, hifi magazines and forums have conducted scientific trials on whether audiophiles can really hear the difference between 24bit 96kHz and standard CD-quality 16bit 44.1kHz. In 2004, Mix magazine ran a properly controlled double-blind trial, asking a large group of audiophiles, audio engineers and college students whether they could correctly distinguish 96kHz audio from the same recording converted to 44.1kHz. The audiophiles failed to identify the higher-resolution file more than 50% of the time; in other words, they were just guessing.
So finally, why do some people still say they can ‘hear’ a difference? Well, as we’ve established, the ear can only hear what it is capable of detecting: frequencies from 20Hz to 20kHz. Claims of having ‘golden ears’ are numerous, but you can’t deny physiology. You can, however, train yourself to hear very precise differences in audio, and there are people who can detect very subtle changes.
There is some very new evidence that our brains are influenced by audio outside our hearing range (infrasonic and ultrasonic). But I think what is largely happening comes down to factors like cognitive bias, the placebo effect and the absence of correctly set up, objective double-blind testing.
I see a lot of people writing on forums and social media saying that the 24bit 96kHz version of so and so’s album sounds so much better than the CD.
Well, it’s difficult to unpack just how anecdotal and incorrect that statement is. For a start, the master is likely to be completely different, as any ‘high definition’ release will almost certainly have been re-mastered. Then there is the fact that, regardless of file formats and audio resolutions, a CD player will never sound the same as a file played on a streaming device: the audio path is completely different, from the digital-to-analogue conversion onwards, and each playback device has its own sound signature. Another popular misconception is that every CD player sounds the same. They can differ considerably, particularly in the analogue stages after the digital-to-analogue conversion, where the sonic signature of the player colours the audio.
I think what is clear is that, whilst our ears have very prescribed limits, our brains, on the other hand, are completely gullible.
The picture, by the way, is of my own Marantz CD53. It cost me about £40 from eBay and sounds utterly brilliant. Somewhere between the 1990s and now, consumers were hoodwinked into thinking that such a machine is obsolete. Their loss is your gain: those in the know can enjoy stunning audio at bargain prices.
But there will always be those that believe that only 24 bit 192kHz will do.