If you read the post about how average LUFS are irrelevant to you as a mixing engineer, you may remember that I closed with the promise to write about a question I've heard (and replied to) quite a few times.
Why do my tracks, when they get normalized to whatever target LUFS the platform <insert platform here> uses, sound quieter than the commercial track <insert track name here> on the same platform?
Or, in short: why the heck don't two songs that are supposed to have the same loudness actually sound equally loud?
It's a legitimate question. Since LUFS is supposed to be measuring loudness, same LUFS should mean same loudness.
But obviously it doesn't. What gives?
Let's have a look.
It sounds louder because it is louder!
Let's get the first thing out of the way: the second song sounds louder because it is louder... at certain times.
In rough terms (see below), with loudness normalization you can imagine that you have a certain amount of total energy that you can use over the 3 minutes of your track or whatever.
That's the same for every track of the same duration. In other words, for every time unit (say, a second) you get a little more energy to use. No more, no less. All three-minute songs will have the same amount of energy available, all three-and-a-half-minute songs will have the same (slightly higher) amount, and so on.
Everybody is given the same amount of Monopoly money at the start of the producing game.
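To see how the platform hands everybody the same budget, here is a minimal sketch of the normalization step itself: one fixed gain for the whole track, equal to the difference between the target and the measured loudness. The -14 LUFS target and the function names are assumptions made up for illustration; real platforms use different targets, and some never turn quiet tracks up.

```python
def normalization_gain_db(measured_lufs, target_lufs=-14.0):
    """Gain in dB that moves a track from its measured loudness to the target.

    target_lufs=-14.0 is only an illustrative default, not any platform's spec.
    """
    return target_lufs - measured_lufs


def db_to_linear(gain_db):
    """Convert a gain in dB to the factor applied to every sample."""
    return 10 ** (gain_db / 20)


# Two hypothetical masters with different integrated loudness.
for name, measured in [("hot master", -8.0), ("dynamic master", -16.0)]:
    gain = normalization_gain_db(measured)
    print(f"{name}: {gain:+.1f} dB (sample multiplier {db_to_linear(gain):.2f})")
```

The gain is a single number per track, so nothing changes inside the song: loud parts stay louder than quiet parts by exactly the same amount as before.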
As in Monopoly, the trick is how you use it - in particular, how you distribute the energy over the duration of the song. In other words, you don't have to use all the energy you get every second. You can save it, and use what you have saved in specific moments.
It's like saving money for a middle-class two-week vacation. You spend less for eleven months and two weeks, but in the remaining two weeks you can splurge much more than you normally would. :D
In musical terms, this translates to basically two things:
the arrangement (that is, the amount of different sounds you mix at any given moment). The sparser the arrangement, the less of the available energy you will use at any given moment, so the more you'll save for when you want to go "bang".
the frequency content (that is, the specific sounds that you use at any given moment).
Over the course of the song, if you stay at the same level all the time, your instantaneous level will always be the same as the average. If you stay quiet most of the time, in certain short periods you will be able to use much more energy than average - i.e. you will be much louder in those short periods.
All this considered, it's easy to see how a track whose momentary use of energy is the same as average will be perceived as much flatter and quieter than one which is mostly "quiet" but occasionally very loud.
Perception does the rest: our brain works more by differences than by absolute levels (in hearing as in vision), so we will remember a track that can go much louder for a short time as... well, louder :)
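Here is the budget idea in numbers, as a rough sketch. Each track is reduced to one made-up power value per second, and the "loudness" is just the average power in dB. That is a deliberate simplification: it ignores the frequency weighting and the gating that a real LUFS meter applies. The two example tracks and all function names are invented for illustration.

```python
import math


def level_db(power):
    """Level in dB of a mean-square power value (relative to power 1.0)."""
    return 10 * math.log10(power)


def average_level_db(powers_per_second):
    """Level of the average power over the whole track (simplified 'loudness')."""
    return level_db(sum(powers_per_second) / len(powers_per_second))


# Both tracks last 180 seconds and spend the same total energy (180 units).
flat = [1.0] * 180                    # same level from start to finish
dynamic = [0.5] * 150 + [3.5] * 30    # quiet verses, 30 seconds of "bang"

print("flat average:   ", round(average_level_db(flat), 2), "dB")
print("dynamic average:", round(average_level_db(dynamic), 2), "dB")

# How far above its own average does the dynamic track go at its loudest?
headroom = max(level_db(p) for p in dynamic) - average_level_db(dynamic)
print("dynamic track's loud part vs. average:", round(headroom, 2), "dB")
```

Both tracks come out with the same average, so a normalizer working on this simplified measure would leave them at the same gain. The dynamic one still hits about 5.4 dB above that average for 30 seconds, while the flat one never moves.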
More tweaks
Things are a little bit more complicated in practice. The algorithm that calculates the amount of energy available has to account for unusual sections - like all silence - which are discarded (it's wasted time :)). It also tries to guess which energy is meaningful to consider as "used", given how our ears and brains work, using a psychoacoustic model to translate the specific frequencies in a certain window of time into an "energy usage" number.
But the gist is that you have a certain amount of "loud" over the entire song, and how you use it will determine the perception of loudness.
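For the curious, the "discard the silence" part can be sketched in a few lines. This is loosely modelled on the two-stage gating of ITU-R BS.1770 (the standard behind LUFS): first drop blocks below an absolute threshold of -70, then drop blocks more than 10 below the average of what remains. It is a simplification, not a meter: the input is assumed to be a list of already-computed block powers, and the frequency weighting, channel summing and block overlap of the real algorithm are left out. Function names are my own.

```python
import math

ABSOLUTE_GATE_DB = -70.0   # blocks below this count as silence
RELATIVE_GATE_DB = -10.0   # blocks this far below the average are ignored too


def to_db(power):
    """Level in dB of a mean-square power value; silence maps to -infinity."""
    return 10 * math.log10(power) if power > 0 else float("-inf")


def mean(values):
    return sum(values) / len(values)


def gated_level_db(block_powers):
    """Average level after discarding silent and very quiet blocks."""
    audible = [p for p in block_powers if to_db(p) > ABSOLUTE_GATE_DB]
    if not audible:
        return float("-inf")  # the whole track is silence
    threshold = to_db(mean(audible)) + RELATIVE_GATE_DB
    kept = [p for p in audible if to_db(p) > threshold]
    return to_db(mean(kept))


# A made-up song: 100 blocks of music at -20 dB, then 50 blocks of pure silence.
song = [0.01] * 100 + [0.0] * 50

print("without gating:", round(to_db(mean(song)), 2), "dB")
print("with gating:   ", round(gated_level_db(song), 2), "dB")
```

Without gating, the trailing silence drags the average down to roughly -21.8 dB; with gating, the measurement stays at the -20 dB of the music itself. In other words, padding a track with silence does not buy you extra budget.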
Do I still need to worry about loudness?
Well, yes and no.
Remember: the point of loudness normalization is not to make every song sound the same (that'd be boring!) but to give control of the overall level back to the listener (as opposed to the mastering engineer).
That's because, given that every song will have more or less the same amount of energy, the listener can decide - by turning the volume knob - how that amount will be reproduced by the speakers.
Within a song, however, the specific arrangement and choice of sounds will matter - and it will make songs momentarily different in loudness.
In general, the sparser the arrangement (and the better the mixing engineer is at cutting stuff which does not contribute to the final sound but uses energy), the louder the song will be perceived compared to another.
Making a rock epic very loud, for example, is harder than doing the same with a sparse hip hop song with "only" a vocal and a beat, and will require much more work by the mixing engineer: relentlessly eliminating all the cruft to leave only the components that make the sound.
Most songs, however, will not be made with a specific loudness in mind (or limited to death), but simply to sound as good as possible at the reference level the mixer likes.
When the listener selects their own reference level, the same goodness should come out.
So differences, while still there, will be much smaller than in traditional peak-normalized mastering.
More details
When I started writing this post, I began looking at the LN algorithm itself and how it does what it does, but quickly found there was a lot of mathematical groundwork that couldn't be taken for granted and that made the whole thing too technical and boring to read.
So I decided to stay at a higher level and jump directly to the conclusions. But if you want a more technical description of the whys and hows of the algorithm, just leave a comment and I'll duly oblige!
Happy mixing! PS: you may also want to have a look at how low-end rumble may make your mix much quieter than it could be, without you being able to understand why - check out the "Shall I keep the low end?" post.