The structure of MPEG-2 Video

The MPEG-2 Video compression standard (also known as H.262 or MPEG-2 Part 2) was released in 1995. It is very similar to MPEG-1 Video but provides some extra features, such as support for interlaced video. MPEG-2 Video is backward compatible with MPEG-1 Video, so a compliant MPEG-2 Video decoder can also decode MPEG-1 Video bitstreams. This article is a high-level overview of the MPEG-2 Video bitstream (or elementary stream) syntax. We will describe the main syntax elements of MPEG-2 Video. To analyze MPEG-2 Video we will use Virinext Bitstream Analyzer. You can download the evaluation version on the Download page. To purchase a license, please check the Buy license page.

High level MPEG-2 Video bitstream overview

At a high level, an MPEG-2 Video bitstream is a sequence of syntactic elements. Each syntactic element starts with the three-byte prefix 0x000001 followed by a start code byte that identifies the type of the element. The following types of syntactic elements exist:

  • Sequence header
  • Group of pictures
  • Picture
  • Extension
  • Slice
  • User data
  • Sequence end 
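To make the list above concrete, here is a minimal Python sketch that scans a byte buffer for the 0x000001 prefix and names each element by its start code byte. The start code values are the ones defined in the H.262 specification; the function and table names (find_start_codes, classify_start_code, START_CODE_NAMES) are hypothetical helpers written for this article and are not part of Virinext Bitstream Analyzer.

```python
# Start code byte values as defined by H.262 (slices use the range 0x01..0xAF).
START_CODE_NAMES = {
    0x00: "picture",
    0xB2: "user_data",
    0xB3: "sequence_header",
    0xB4: "sequence_error",
    0xB5: "extension",
    0xB7: "sequence_end",
    0xB8: "group_of_pictures",
}


def classify_start_code(code: int) -> str:
    """Map a start code byte to the name of the syntactic element."""
    if 0x01 <= code <= 0xAF:
        return "slice"
    return START_CODE_NAMES.get(code, "other")


def find_start_codes(data: bytes):
    """Yield (offset, start_code_byte, name) for every 0x000001 prefix."""
    prefix = b"\x00\x00\x01"
    pos = data.find(prefix)
    while pos != -1 and pos + 3 < len(data):
        code = data[pos + 3]
        yield pos, code, classify_start_code(code)
        # Continue searching right after the prefix we just consumed.
        pos = data.find(prefix, pos + 3)
```

For a real file, something like `for offset, code, name in find_start_codes(open(path, "rb").read())` would print a flat list of elements, which is roughly the top-level view a bitstream analyzer builds its tree from.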

An MPEG-2 Video elementary stream starts with a sequence header and a sequence extension, followed by a group of pictures header and then by one or more coded frames. Each coded frame starts with a picture header followed by a picture coding extension and consists of one or more slices. A slice is a run of consecutive macroblocks (squares of 16×16 pixels) within a single macroblock row, so a slice is always 16 pixels high.
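The relation between frame size, macroblocks and slices can be sketched with a little arithmetic. The snippet below assumes a progressive frame picture and a picture height of at most 2800 lines (taller pictures use an additional slice_vertical_position_extension field); the function names are hypothetical and chosen for illustration only.

```python
def macroblock_grid(width: int, height: int) -> tuple[int, int]:
    """Return (columns, rows) of 16x16 macroblocks covering the frame.

    Sizes that are not multiples of 16 are rounded up, because the
    encoder pads the picture to whole macroblocks.
    """
    return (width + 15) // 16, (height + 15) // 16


def slice_row_from_start_code(code: int) -> int:
    """Return the zero-based macroblock row encoded in a slice start code.

    Slice start codes 0x01..0xAF carry slice_vertical_position, which
    counts macroblock rows starting from 1.
    """
    if not 0x01 <= code <= 0xAF:
        raise ValueError("not a slice start code")
    return code - 1
```

For a 720×576 frame this gives a grid of 45×36 macroblocks. Because a slice never spans more than one macroblock row, a frame in which every macroblock is covered by a slice contains at least as many slices as it has macroblock rows.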

Screenshot of Virinext Bitstream Analyzer with an opened MPEG-2 Video file
Screenshot from Virinext Bitstream Analyzer

There are the following picture types:

  • I-frame (intra-coded) – picture with only intra prediction
  • P-frame (predictive-coded) – picture with inter prediction from one previous I- or P-frame
  • B-frame (bidirectionally predictive-coded) – picture with inter prediction from a previous and a following (in display order) I- or P-frame

An I-frame exploits spatial redundancy only. P-frames and B-frames can use intra prediction as well as inter prediction, which exploits temporal redundancy between frames.

A group of pictures (GOP) is a sequence of pictures that starts with an I-frame in coding order. A GOP is called closed when no picture in it is predicted from pictures outside the GOP, so the GOP can be decoded independently. A GOP is called open when the B-frames immediately following the first coded I-frame are predicted from an I- or P-frame of the previous GOP.

In the screenshot below, the first GOP is closed and the second GOP is open.

Screenshot of GOP structure displayed in Virinext Bitstream Analyzer
First GOP is closed, second GOP is open

Sequence header and sequence extension

Both the sequence header and the sequence extension contain coding parameters that apply to a series of consecutive coded video frames.

The sequence header carries the frame size, bitrate, frame rate, aspect ratio and, optionally, custom quantization matrices.
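As an illustration of how these fields are laid out, the following Python sketch decodes the bytes that follow the sequence header start code (0x000001B3). The field widths and lookup tables follow the H.262 syntax; the function names and the shape of the returned dictionary are assumptions made for this example. Note that the header only holds the lower bits of the frame size and bitrate; for MPEG-2 streams the upper bits come from the sequence extension.

```python
# frame_rate_code and aspect_ratio_information tables from H.262.
FRAME_RATES = {1: 24000 / 1001, 2: 24.0, 3: 25.0, 4: 30000 / 1001,
               5: 30.0, 6: 50.0, 7: 60000 / 1001, 8: 60.0}
ASPECT_RATIOS = {1: "square samples", 2: "4:3", 3: "16:9", 4: "2.21:1"}


def read_bits(data: bytes, bit_offset: int, width: int) -> int:
    """Read `width` bits (MSB first) starting at `bit_offset`."""
    end = bit_offset + width
    if end > len(data) * 8:
        raise ValueError("payload too short")
    value = int.from_bytes(data, "big")
    return (value >> (len(data) * 8 - end)) & ((1 << width) - 1)


def parse_sequence_header(payload: bytes) -> dict:
    """Decode the bytes that follow the 0x000001B3 start code."""
    pos = 0

    def take(width: int) -> int:
        nonlocal pos
        value = read_bits(payload, pos, width)
        pos += width
        return value

    header = {
        "horizontal_size": take(12),
        "vertical_size": take(12),
        "aspect_ratio": ASPECT_RATIOS.get(take(4), "reserved"),
        "frame_rate": FRAME_RATES.get(take(4)),  # None for reserved codes
        "bit_rate_bps": take(18) * 400,          # coded in units of 400 bit/s
    }
    take(1)  # marker_bit
    header["vbv_buffer_size_value"] = take(10)
    header["constrained_parameters_flag"] = take(1)
    # Each matrix is present only when its load flag is set
    # (64 eight-bit values, stored in zigzag scan order).
    for name in ("intra_quantiser_matrix", "non_intra_quantiser_matrix"):
        header[name] = [take(8) for _ in range(64)] if take(1) else None
    return header
```

For example, `parse_sequence_header(bytes.fromhex("2d0240230ea62380"))` describes a 720×576 stream at 25 frames per second with a 4:3 aspect ratio and no custom matrices.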

Sequence Header screenshot from Virinext Bitstream Analyzer

The sequence extension provides information about the profile and level, chroma format and scan type (progressive or interlaced).
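A similar sketch can decode the sequence extension, that is, the bytes following the extension start code (0x000001B5) when the first four bits equal 1. The profile, level and chroma format tables follow H.262; the function names and dictionary keys are hypothetical, and the escape bit of profile_and_level_indication is ignored here for brevity.

```python
PROFILES = {1: "High", 2: "Spatially Scalable", 3: "SNR Scalable",
            4: "Main", 5: "Simple"}
LEVELS = {4: "High", 6: "High 1440", 8: "Main", 10: "Low"}
CHROMA_FORMATS = {1: "4:2:0", 2: "4:2:2", 3: "4:4:4"}


def read_bits(data: bytes, bit_offset: int, width: int) -> int:
    """Read `width` bits (MSB first) starting at `bit_offset`."""
    end = bit_offset + width
    if end > len(data) * 8:
        raise ValueError("payload too short")
    value = int.from_bytes(data, "big")
    return (value >> (len(data) * 8 - end)) & ((1 << width) - 1)


def parse_sequence_extension(payload: bytes) -> dict:
    """Decode the bytes that follow the 0x000001B5 start code."""
    if read_bits(payload, 0, 4) != 1:  # extension_start_code_identifier
        raise ValueError("not a sequence extension")
    profile_and_level = read_bits(payload, 4, 8)
    return {
        "profile": PROFILES.get((profile_and_level >> 4) & 0x7, "reserved"),
        "level": LEVELS.get(profile_and_level & 0xF, "reserved"),
        "progressive_sequence": bool(read_bits(payload, 12, 1)),
        "chroma_format": CHROMA_FORMATS.get(read_bits(payload, 13, 2), "reserved"),
        # Upper bits that extend the sizes and bitrate from the sequence header.
        "horizontal_size_extension": read_bits(payload, 15, 2),
        "vertical_size_extension": read_bits(payload, 17, 2),
        "bit_rate_extension": read_bits(payload, 19, 12),
        "low_delay": bool(read_bits(payload, 40, 1)),
    }
```

The payload `148200010000` (hex), for instance, decodes to Main profile at Main level, 4:2:0 chroma and an interlaced sequence.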

Sequence Extension Header screenshot from Virinext Bitstream Analyzer

Group of pictures header

The group of pictures header contains the start timecode and a flag that describes the GOP type (open or closed GOP).
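The GOP header is small enough to decode in a few lines. The sketch below reads the 25-bit time_code followed by the closed_gop and broken_link flags from the bytes after the GOP start code (0x000001B8). Field widths follow H.262; the function names and the formatted timecode string are choices made for this example.

```python
def read_bits(data: bytes, bit_offset: int, width: int) -> int:
    """Read `width` bits (MSB first) starting at `bit_offset`."""
    end = bit_offset + width
    if end > len(data) * 8:
        raise ValueError("payload too short")
    value = int.from_bytes(data, "big")
    return (value >> (len(data) * 8 - end)) & ((1 << width) - 1)


def parse_gop_header(payload: bytes) -> dict:
    """Decode the bytes that follow the 0x000001B8 start code."""
    drop_frame = read_bits(payload, 0, 1)
    hours = read_bits(payload, 1, 5)
    minutes = read_bits(payload, 6, 6)
    # Bit 12 is a marker bit and carries no information.
    seconds = read_bits(payload, 13, 6)
    pictures = read_bits(payload, 19, 6)
    return {
        "drop_frame_flag": bool(drop_frame),
        "time_code": f"{hours:02d}:{minutes:02d}:{seconds:02d}:{pictures:02d}",
        "closed_gop": bool(read_bits(payload, 25, 1)),
        "broken_link": bool(read_bits(payload, 26, 1)),
    }
```

A closed_gop value of 1 marks a closed GOP. The broken_link flag signals that the leading B-frames of an open GOP cannot be decoded correctly because the previous GOP was removed, for example by editing.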

Group of Pictures Header screenshot from Virinext Bitstream Analyzer

Picture header and picture coding extension

The picture header and the picture coding extension describe frame-related parameters. These headers contain the picture coding type, picture structure (frame or field), field order (which field, top or bottom, should be displayed first), etc.
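The parameters mentioned above can be read as follows. This is a sketch that decodes only a subset of the fields: the start of the picture header (bytes after 0x00000100) and of the picture coding extension (bytes after 0x000001B5 when the first four bits equal 8). Bit positions follow the H.262 syntax; the function names and dictionary keys are hypothetical.

```python
PICTURE_CODING_TYPES = {1: "I", 2: "P", 3: "B"}
PICTURE_STRUCTURES = {1: "top field", 2: "bottom field", 3: "frame"}


def read_bits(data: bytes, bit_offset: int, width: int) -> int:
    """Read `width` bits (MSB first) starting at `bit_offset`."""
    end = bit_offset + width
    if end > len(data) * 8:
        raise ValueError("payload too short")
    value = int.from_bytes(data, "big")
    return (value >> (len(data) * 8 - end)) & ((1 << width) - 1)


def parse_picture_header(payload: bytes) -> dict:
    """Decode the first fields after the 0x00000100 start code."""
    return {
        # Display-order index of the picture within its GOP.
        "temporal_reference": read_bits(payload, 0, 10),
        "picture_coding_type": PICTURE_CODING_TYPES.get(
            read_bits(payload, 10, 3), "reserved"),
        "vbv_delay": read_bits(payload, 13, 16),
    }


def parse_picture_coding_extension(payload: bytes) -> dict:
    """Decode selected fields after the 0x000001B5 start code."""
    if read_bits(payload, 0, 4) != 8:  # extension_start_code_identifier
        raise ValueError("not a picture coding extension")
    return {
        # Bits 4..19 hold the four 4-bit f_code values (motion vector range).
        "picture_structure": PICTURE_STRUCTURES.get(
            read_bits(payload, 22, 2), "reserved"),
        "top_field_first": bool(read_bits(payload, 24, 1)),
        "repeat_first_field": bool(read_bits(payload, 30, 1)),
        "progressive_frame": bool(read_bits(payload, 32, 1)),
    }
```

Since temporal_reference gives the display position while pictures arrive in coding order, sorting the pictures of a GOP by this field is one way to recover the display order.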

Picture Header screenshot from Virinext Bitstream Analyzer

Conclusion

We have described the basic syntactic elements of MPEG-2 Video. There are a few other elements besides those described here. MPEG-2 Video bitstreams can be analyzed with Virinext Bitstream Analyzer. It is a GUI tool for both in-depth and high-level analysis of many encoding standards, including MPEG-2 Video.