MPEG Audio: high level bitstream overview

MPEG Audio compression was first standardized as part of the MPEG-1 specification. The first version was released in 1993 as the international standard ISO/IEC 11172-3, known as MPEG-1 Audio or MPEG-1 Part 3. A later extension of the MPEG-2 Audio specification, known as MPEG-2.5 Audio, was developed at Fraunhofer IIS. Note that MPEG-2.5 was not developed by MPEG and was never approved as an international standard.

Name           | Standard                        | Release date
MPEG-1 Audio   | ISO/IEC 11172-3 (MPEG-1 Part 3) | 1993
MPEG-2 Audio   | ISO/IEC 13818-3 (MPEG-2 Part 3) | 1995
MPEG-2.5 Audio | proprietary                     | 2000
MPEG Audio standards

This article gives a high-level overview of the MPEG Audio bitstream and shows where some audio parameters (channel count, frame size, sampling rate) are located in an encoded elementary audio stream. To analyze MPEG Audio we will use Virinext Bitstream Analyzer. You can download the evaluation version from the Download page. To purchase a license, see the Buy license page.

High level MPEG Audio bitstream overview

At a high level, an MPEG Audio stream is a sequence of audio frames. Each frame starts with a header that describes the basic parameters of the encoded audio. The header begins with an 11-bit syncword equal to 0x7FF (all bits set). The sections below show where some of these parameters are located in an encoded MPEG Audio elementary stream. As an example we will use a file with the parameters shown in the screenshot below.
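As a rough illustration of the header layout, the Python sketch below splits the first four bytes of a frame into the fields discussed in this article. The function name parse_header, the dictionary keys, and the sample bytes FF FB E0 00 are assumptions made for illustration: the bytes were built by hand and were not read from the example file. The bit positions follow the commonly documented 32-bit MPEG Audio header layout.

```python
SYNCWORD_11_BITS = 0x7FF  # 11 set bits that open every frame header


def parse_header(data: bytes) -> dict:
    """Split a 4-byte MPEG Audio frame header into its raw field values."""
    if len(data) < 4:
        raise ValueError("a frame header is 4 bytes long")
    word = int.from_bytes(data[:4], "big")
    if (word >> 21) != SYNCWORD_11_BITS:
        raise ValueError("syncword not found at the start of the data")
    return {
        "version_bits": (word >> 19) & 0x3,        # 2-bit version value
        "layer": (word >> 17) & 0x3,               # layer field
        "protection_bit": (word >> 16) & 0x1,
        "bitrate_index": (word >> 12) & 0xF,
        "sampling_frequency": (word >> 10) & 0x3,
        "padding_bit": (word >> 9) & 0x1,
        "mode": (word >> 6) & 0x3,
    }


# Hypothetical header bytes, hand-built for illustration only.
fields = parse_header(bytes([0xFF, 0xFB, 0xE0, 0x00]))
print(fields)
```

The sketch only checks the syncword at the start of the buffer. A real parser would also have to scan for the syncword and reject false matches, which is outside the scope of this overview.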

Layer

The layer identifies which of the coding schemes is used. Encoder complexity and potential performance increase from Layer I to Layer III. The layer in use is stored in the layer field.

Value of layer field | Used layer
3                    | Layer I
2                    | Layer II
1                    | Layer III
Semantics of the layer field

In the example file layer = 1, which means Layer III is used.

Version

The specification version is signaled by the syncword and ID fields. For both MPEG-1 and MPEG-2 Audio the syncword is 12 bits long and must equal 0xFFF, and the ID field is 1 bit long. For MPEG-2.5 Audio the syncword is 11 bits long and the ID field is 2 bits long. In either case the version is given by a 2-bit value: the last bit of the syncword followed by the 1-bit ID for MPEG-1 and MPEG-2 Audio, or the 2-bit ID alone for MPEG-2.5.

Value of ID field | Version
0                 | MPEG-2.5
1                 | Reserved
2                 | MPEG-2
3                 | MPEG-1
Semantics of the ID field

In the example file ID = 3, which means the file is encoded as MPEG-1 Audio.
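To make the two ways of reading the version bits concrete, here is a small Python sketch. The names VERSIONS, LAYERS, version_bits_from_legacy_fields and describe are hypothetical helpers introduced for this illustration. The sketch also assumes that layer value 0 is reserved, which the layer table above does not list.

```python
VERSIONS = {0: "MPEG-2.5", 2: "MPEG-2", 3: "MPEG-1"}    # value 1 is reserved
LAYERS = {3: "Layer I", 2: "Layer II", 1: "Layer III"}  # value 0 assumed reserved


def version_bits_from_legacy_fields(syncword12: int, id_bit: int) -> int:
    """Combine the last bit of a 12-bit syncword with the 1-bit ID field."""
    return ((syncword12 & 0x1) << 1) | (id_bit & 0x1)


def describe(version_bits: int, layer_field: int) -> str:
    """Turn the raw 2-bit version and layer values into readable names."""
    if version_bits not in VERSIONS or layer_field not in LAYERS:
        raise ValueError("reserved version or layer value")
    return f"{VERSIONS[version_bits]} {LAYERS[layer_field]}"


# MPEG-1 and MPEG-2 view: 12-bit syncword 0xFFF plus a 1-bit ID.
print(describe(version_bits_from_legacy_fields(0xFFF, 1), 1))  # MPEG-1 Layer III
print(describe(version_bits_from_legacy_fields(0xFFF, 0), 2))  # MPEG-2 Layer II
# MPEG-2.5 view: 11-bit syncword followed by a 2-bit ID equal to 0.
print(describe(0, 1))                                          # MPEG-2.5 Layer III
```

Both views read the same two bits of the header, so a parser can treat the syncword as 11 bits in all cases and look up the following 2-bit value.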

Sampling rate

The sampling rate is derived from the sampling_frequency field and the version using the following algorithm:

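// Base sampling rates in Hz, indexed by sampling_frequency (value 3 is reserved)
Frequence_Tab = { 44100, 48000, 32000 }
// MPEG-2 halves and MPEG-2.5 quarters the MPEG-1 rates
Frequence_Divider = 1
IF version == MPEG-2 Audio
    Frequence_Divider = 2
IF version == MPEG-2.5 Audio
    Frequence_Divider = 4

Sample_Rate = Frequence_Tab[sampling_frequency] / Frequence_Divider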

In the example file sampling_frequency = 0 and the version is MPEG-1 Audio, so the algorithm above gives 44100 / 1 = 44100 Hz, that is, a sampling rate of 44.1 kHz.
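The same lookup can be written as a short Python sketch. The names BASE_RATES_HZ, DIVIDERS and sample_rate_hz are assumptions for this illustration, and the version is passed as a plain string for readability rather than as the raw 2-bit value.

```python
BASE_RATES_HZ = (44100, 48000, 32000)  # indexed by sampling_frequency; 3 is reserved
DIVIDERS = {"MPEG-1": 1, "MPEG-2": 2, "MPEG-2.5": 4}


def sample_rate_hz(version: str, sampling_frequency: int) -> int:
    """Return the sampling rate in Hz for a version and sampling_frequency value."""
    if sampling_frequency not in (0, 1, 2):
        raise ValueError("sampling_frequency 3 is reserved")
    return BASE_RATES_HZ[sampling_frequency] // DIVIDERS[version]


print(sample_rate_hz("MPEG-2", 1))    # 24000
print(sample_rate_hz("MPEG-2.5", 2))  # 8000
```

Three table entries and three dividers give nine possible rates in total, from 8000 Hz (MPEG-2.5) up to 48000 Hz (MPEG-1).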

Channel count

The channel count is given by the mode field.

Value of mode field | Mode
0                   | stereo
1                   | joint_stereo
2                   | dual_channel
3                   | single_channel
Semantics of the mode field

In the example file mode = 0, which means the audio is stereo.
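The mode names do not state the channel count directly, so the mapping can be sketched in Python as follows. MODE_NAMES and channel_count are hypothetical names. The sketch assumes that dual_channel carries two independent mono channels and therefore counts as two channels, like stereo and joint_stereo.

```python
MODE_NAMES = ("stereo", "joint_stereo", "dual_channel", "single_channel")


def channel_count(mode: int) -> int:
    """Return the number of audio channels for a 2-bit mode field value."""
    if mode not in range(4):
        raise ValueError("mode is a 2-bit field")
    # Only single_channel (mode 3) is mono; every other mode carries two channels.
    return 1 if mode == 3 else 2


for mode, name in enumerate(MODE_NAMES):
    print(mode, name, channel_count(mode))
```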

Frame size and bitrate

Frame size and bitrate are derived from bitrate_index and depend on the layer, version, sampling rate and padding_bit. They are calculated with the following algorithm:

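// Bitrates in kbit/s, indexed as [versionId][layerId][bitrate_index].
// bitrate_index 0 means free format; the table has no entry for bitrate_index 15.
Bitrate_Tab[2][3][15] =
{
  {  // MPEG-1: Layer I, Layer II, Layer III
    {0, 32, 64, 96, 128, 160, 192, 224, 256, 288, 320, 352, 384, 416, 448},
    {0, 32, 48, 56,  64,  80,  96, 112, 128, 160, 192, 224, 256, 320, 384},
    {0, 32, 40, 48,  56,  64,  80,  96, 112, 128, 160, 192, 224, 256, 320}
  },
  {  // MPEG-2 and MPEG-2.5: Layer I, Layer II, Layer III
    {0, 32, 48, 56,  64,  80,  96, 112, 128, 144, 160, 176, 192, 224, 256},
    {0,  8, 16, 24,  32,  40,  48,  56,  64,  80,  96, 112, 128, 144, 160},
    {0,  8, 16, 24,  32,  40,  48,  56,  64,  80,  96, 112, 128, 144, 160}
  }
}

versionId = version == VERSION_1 ? 0 : 1
layerId = layer == LAYER_I ? 0 : layer == LAYER_II ? 1 : 2
Bitrate = Bitrate_Tab[versionId][layerId][bitrate_index] * 1000  // bit/s
// FrameSize starts as the bitrate in kbit/s; the switch below converts it to bytes
FrameSize = Bitrate / 1000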

switch (layer) 
{
case LAYER_I:
      FrameSize = (FrameSize  * 12000) / Sample_Rate;
      FrameSize = (FrameSize + padding_bit) * 4;
      break;
case LAYER_II:
      FrameSize = (FrameSize  * 144000) / Sample_Rate;
      FrameSize += padding_bit;
      break;
case LAYER_III:
      FrameSize = (FrameSize * 144000) / (Sample_Rate << versionId);
      FrameSize += padding_bit;
      break;
}

The example file is MPEG-1 Audio, Layer III, with bitrate_index = 14, so the bitrate is 320 kbit/s.
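For readers who want to try the calculation, the algorithm above can be sketched in Python. The names BITRATE_KBPS and frame_size_bytes are assumptions for this illustration. The layer_number argument is the layer number (1, 2 or 3), not the raw value of the layer field. The sketch rejects bitrate_index 0 (free format) and 15, which the algorithm above does not cover.

```python
# Bitrates in kbit/s: [version row][layer row][bitrate_index].
BITRATE_KBPS = (
    (  # MPEG-1: Layer I, Layer II, Layer III
        (0, 32, 64, 96, 128, 160, 192, 224, 256, 288, 320, 352, 384, 416, 448),
        (0, 32, 48, 56, 64, 80, 96, 112, 128, 160, 192, 224, 256, 320, 384),
        (0, 32, 40, 48, 56, 64, 80, 96, 112, 128, 160, 192, 224, 256, 320),
    ),
    (  # MPEG-2 and MPEG-2.5: Layer I, Layer II, Layer III
        (0, 32, 48, 56, 64, 80, 96, 112, 128, 144, 160, 176, 192, 224, 256),
        (0, 8, 16, 24, 32, 40, 48, 56, 64, 80, 96, 112, 128, 144, 160),
        (0, 8, 16, 24, 32, 40, 48, 56, 64, 80, 96, 112, 128, 144, 160),
    ),
)


def frame_size_bytes(is_mpeg1: bool, layer_number: int, bitrate_index: int,
                     sample_rate: int, padding_bit: int) -> int:
    """Return the frame size in bytes, using integer arithmetic throughout."""
    if layer_number not in (1, 2, 3):
        raise ValueError("layer_number must be 1, 2 or 3")
    if not 1 <= bitrate_index <= 14:
        raise ValueError("bitrate_index 0 and 15 are not handled by this sketch")
    version_id = 0 if is_mpeg1 else 1
    kbps = BITRATE_KBPS[version_id][layer_number - 1][bitrate_index]
    if layer_number == 1:
        # Layer I frames are counted in 4-byte slots.
        return (kbps * 12000 // sample_rate + padding_bit) * 4
    if layer_number == 2:
        return kbps * 144000 // sample_rate + padding_bit
    # Layer III: the divisor doubles for MPEG-2 and MPEG-2.5.
    return kbps * 144000 // (sample_rate << version_id) + padding_bit


print(frame_size_bytes(True, 3, 14, 48000, 0))  # 960
print(frame_size_bytes(True, 3, 14, 44100, 0))  # 1044
print(frame_size_bytes(True, 3, 14, 44100, 1))  # 1045
```

At 320 kbit/s, an MPEG-1 Layer III frame is 960 bytes at 48000 Hz. At 44100 Hz it is 1044 bytes, or 1045 bytes when padding_bit is set.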

Conclusion

In this article we explored the MPEG Audio compression standard. We examined it by locating key audio parameters within the encoded MPEG Audio bitstream: format version, layer, sampling rate, channel count, audio frame size and bitrate. Using Virinext Bitstream Analyzer, we demonstrated how these parameters are stored in the coded bitstream.
With Virinext Bitstream Analyzer you can analyze MPEG Audio files and gain insight into low-level audio encoding and bitstream parameters. If you are interested in exploring further, you can try the free evaluation version of Virinext Bitstream Analyzer. It is a GUI tool for both in-depth and high-level analysis of many encoding standards, including MPEG Audio.