Audio Codecs for Intermediate Use

I’ve previously ranted about the use of MP3 for anything but final delivery, and here at IT Conversations, Team ITC has been experimenting with the best codec to use for exchanging in-progress audio files. Under consideration have been:

1. MPEG-1 Layer III (MP3, at 192kbps or 256kbps)
2. MPEG-1 Layer II
3. MPEG-2
4. FLAC (lossless)

We’ve been using MPEG-1 Layer II (MP2) based on my own empirical tests, in which I encoded a WAV/PCM original using one of the above, then decoded the result and re-encoded it as MP3 for release. Using MP3 as the intermediate format (even at a high bitrate) causes annoying artifacts once the file is decoded and re-encoded. FLAC doesn’t offer much compression, although the audio is perfect. MPEG-2 has a boatload of options, and the ones I tested weren’t any better than high-bitrate MP3. (I didn’t test them all.)
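
For anyone who wants to repeat that kind of generation test, here is a minimal sketch of the procedure in Python. It assumes the ffmpeg command-line tool (built with libmp3lame) stands in for whatever encoder you actually use. The file names and the 128kbps release bitrate are made up for illustration, and MPEG-2 is left out because its settings vary so much.

```python
import subprocess

# Candidate intermediate codecs: (ffmpeg encoder name, file extension, bitrate).
# Encoder names are ffmpeg's; bitrates mirror the ones discussed in this post.
CANDIDATES = {
    "mp3": ("libmp3lame", "mp3", "256k"),
    "mp2": ("mp2", "mp2", "256k"),
    "flac": ("flac", "flac", None),  # lossless, so no bitrate setting
}


def generation_test_commands(original_wav, candidate, release_bitrate="128k"):
    """Build the two ffmpeg commands for one pass of the test:
    original WAV -> intermediate codec -> MP3 release file.

    The output file names and the default release bitrate are
    assumptions for this sketch, not values from the post.
    """
    encoder, ext, bitrate = CANDIDATES[candidate]
    intermediate = f"intermediate_{candidate}.{ext}"
    release = f"release_via_{candidate}.mp3"

    # Step 1: encode the original with the candidate codec.
    first = ["ffmpeg", "-y", "-i", original_wav, "-c:a", encoder]
    if bitrate:
        first += ["-b:a", bitrate]
    first.append(intermediate)

    # Step 2: decode the intermediate and re-encode it as the MP3 release.
    second = ["ffmpeg", "-y", "-i", intermediate,
              "-c:a", "libmp3lame", "-b:a", release_bitrate, release]
    return [first, second]


def run_generation_test(original_wav, candidate):
    """Run both steps; raises CalledProcessError if ffmpeg fails."""
    for cmd in generation_test_commands(original_wav, candidate):
        subprocess.run(cmd, check=True)
```

Calling `run_generation_test("original.wav", "mp2")` and then again with `"mp3"` and `"flac"` leaves three release files to compare by ear. The listening is still the real test; the script only keeps the encoding steps consistent between candidates.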

I then went out and got some advice from the real experts in public radio. Here are some excerpts from their comments. Stephen Hill wrote:

The quick answer is MP2…[When] the MPEG 1 spec was designed, precisely the problems you are facing were addressed. The solution was a 3-tier system for origination, production, and release formats, which correspond to Layers 1, 2 and 3. Layer 2 is an intermediate production format specifically designed to limit artifacts when re-encoded or transcoded into the final Layer 3 release format. You are correct that NPR standardized on MPEG-1 Layer II at 128kbps (mono) or 256kbps (stereo) some ten years ago, and continues to use them today. The latest distribution architecture, which they call the ContentDepot, is built on this. You can see the details at http://www.prss.org.

Steve Schultze of prx.org had this to say:

Basically MP2 survives much better than other codecs when reencoded. In codec-greek MP2 is recommended for “contribution (i.e. link between broadcasting studios with provisions for post processing)” vs. MP3 which is only recommended for “commentary links, i.e. a link for speech signals which are transmitted to the broadcasting station using e.g. one B-channel of an ISDN line.” Read about it here if you’re so inclined.

NPR came to the same conclusion: “Now NPR does not accept audio feeds in MP3 format, because employing this algorithm can lead to degraded audio in fewer generations than does high-bit rate MPEG Layer II.”

For PRX we chose MP2 at 128kbps mono / 256kbps stereo because it can be used fairly reliably for re-encoding later, is 1/5 the size of an uncompressed WAV, and it is the NPR standard. We generate Real and MP3 files from these MP2s and will likely create Windows Media, AAC and others in the future. Our producers would not have been able to reliably upload hour-long WAVs (~600 MB) even on broadband connections.

FLAC is non-lossy and is about 60% the size of the WAV, so that’s an option worth considering. However…MP2 would make it easier to get into the radio broadcast chain. Also, FLAC is less widely supported in sound editing software (MP2 *should* work in anything that supports MP3s although that’s not always true) and MP2s can be played in iTunes etc. for convenience. MP2s at 256kbps are also still ~half the size of FLAC. If you settle on MP2 then you could give your producers/engineers our mp2 encoder which is drag-and-drop with no settings, meaning that everyone is using the same standard.

MPEG 2 frankly still confuses the heck out of me because of the varied video/audio layers and settings.

For our purposes, it ultimately made sense to go with MPEG-1 Layer II as our “archive” and “interchange” format. It sounds like one of the better options for you.
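
The size figures quoted above are easy to sanity-check with a little arithmetic. The sketch below assumes the uncompressed original is CD-quality stereo (44.1 kHz, 16-bit), which none of the comments actually state, and takes the "about 60%" FLAC figure at face value even though real FLAC ratios depend on the material.

```python
def pcm_bitrate_bps(sample_rate_hz=44_100, bits_per_sample=16, channels=2):
    """Bitrate of uncompressed PCM audio in bits per second.
    The defaults (CD-quality stereo) are an assumption, not from the post."""
    return sample_rate_hz * bits_per_sample * channels


def size_mb(bitrate_bps, seconds):
    """File size in megabytes (10^6 bytes), ignoring container overhead."""
    return bitrate_bps * seconds / 8 / 1_000_000


HOUR = 3600  # seconds

wav_mb = size_mb(pcm_bitrate_bps(), HOUR)  # about 635 MB
mp2_mb = size_mb(256_000, HOUR)            # about 115 MB at 256kbps stereo
flac_mb = wav_mb * 0.60                    # about 381 MB, using the quoted 60%

print(f"WAV:  {wav_mb:.0f} MB per hour")
print(f"MP2:  {mp2_mb:.0f} MB per hour ({mp2_mb / wav_mb:.0%} of WAV)")
print(f"FLAC: {flac_mb:.0f} MB per hour (MP2 is {mp2_mb / flac_mb:.0%} of this)")
```

Under those assumptions an hour of WAV lands close to the "~600 MB" quoted, and 256kbps MP2 comes out a little under a fifth of that. Against a 60% FLAC file the MP2 is nearer a third than a half, so the "~half" figure presumably reflects material that FLAC compresses better than 60%.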
