VP9: high-level bitstream overview

The VP9 video compression standard was developed by Google and released in 2013. VP9 is an open and royalty-free video coding format. This article provides a high-level overview of the VP9 bitstream syntax: it describes the main syntax elements of a VP9 video and shows where some of the video parameters are located in the encoded bitstream. To analyze VP9 we will use Virinext Bitstream Analyzer. You can download the evaluation version on the download page. To acquire a license, please check the Buy license page or contact us by email.

High-level bitstream overview

At a high level, a VP9 video is a sequence of frames. Each frame starts with an uncompressed header that contains high-level frame information (frame width and height, bit depth, chroma subsampling, etc.). The compressed header follows the uncompressed header and specifies low-level encoding parameters; it is coded with VP9's boolean arithmetic coder. The headers are followed by one or more tiles, which contain the coded frame data. A tile is at most 4096 pixels wide, and a frame is split into several tile columns only when each column is at least 256 pixels wide.

The structure of a VP9 bitstream.
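As a rough illustration of the tile width limits, the sketch below computes how many tile columns a frame of a given width may be split into. It follows the tile count derivation of the VP9 specification as we understand it (the number of tile columns is a power of two, and widths are counted in 64x64 superblocks). The function names are ours and do not belong to any library.

```python
MIN_TILE_WIDTH_B64 = 4    # 4 superblocks * 64 pixels = 256 pixels
MAX_TILE_WIDTH_B64 = 64   # 64 superblocks * 64 pixels = 4096 pixels


def superblock_columns(frame_width):
    """Number of 64x64 superblock columns needed to cover the frame width."""
    mi_cols = (frame_width + 7) >> 3      # width in 8x8 blocks, rounded up
    return (mi_cols + 7) >> 3             # width in 64x64 superblocks, rounded up


def tile_column_range(frame_width):
    """Return (min_tile_columns, max_tile_columns) for the given frame width."""
    sb_cols = superblock_columns(frame_width)

    # Smallest power of two that keeps every tile within the maximum width.
    min_log2 = 0
    while (MAX_TILE_WIDTH_B64 << min_log2) < sb_cols:
        min_log2 += 1

    # Largest power of two that keeps every tile at least the minimum width.
    max_log2 = 1
    while (sb_cols >> max_log2) >= MIN_TILE_WIDTH_B64:
        max_log2 += 1
    max_log2 -= 1

    return 1 << min_log2, 1 << max_log2


print(tile_column_range(1920))  # (1, 4)
print(tile_column_range(3840))  # (1, 8)
```

Under these assumptions a 1920 pixel wide frame can use one, two or four tile columns, while a frame wider than 4096 pixels must use at least two.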

Along with conventional frames, VP9 introduces special frame types:

  • Hidden frame. A frame can be marked as not to be shown to the user after decoding (show_frame=0). Such a frame is called a hidden frame. A hidden frame can be used as a reference in inter prediction for later frames. A hidden frame can still be displayed later: the encoder sends a very short special frame (a few bytes, with show_existing_frame=1) that tells the decoder to show one of the stored reference frames.
  • Superframe. Several VP9 frames can be combined into one superframe. This can be used to emulate B-frames: the reference frame is marked as hidden and combined with the reordered frame in one superframe. The reference frame can be displayed after the superframe by a direct request (the short special frame described for hidden frames).
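The article does not go into how frames are packed inside a superframe. As a sketch, assuming the superframe index layout from Annex B of the VP9 bitstream specification (an index at the end of the chunk, opened and closed by the same marker byte whose top three bits are 110), the sizes of the packed frames can be recovered as follows. The function name parse_superframe_index is ours.

```python
from typing import List


def parse_superframe_index(chunk: bytes) -> List[int]:
    """Return the sizes of the frames packed in `chunk`.

    Returns an empty list when no superframe index is found, i.e. the chunk
    is treated as a single frame.
    """
    if not chunk:
        return []

    marker = chunk[-1]
    if (marker & 0xE0) != 0xC0:           # top three bits must be 110
        return []

    frame_count = (marker & 0x07) + 1     # frames_in_superframe_minus_1
    size_bytes = ((marker >> 3) & 0x03) + 1  # bytes_per_framesize_minus_1
    index_size = 2 + frame_count * size_bytes

    # The index starts with the same marker byte it ends with.
    if len(chunk) < index_size or chunk[-index_size] != marker:
        return []

    sizes = []
    pos = len(chunk) - index_size + 1
    for _ in range(frame_count):
        # Frame sizes are stored least significant byte first.
        sizes.append(int.from_bytes(chunk[pos:pos + size_bytes], "little"))
        pos += size_bytes
    return sizes


# Two dummy frames of 5 and 3 bytes followed by a four-byte index.
chunk = bytes(5) + bytes(3) + bytes([0xC1, 5, 3, 0xC1])
print(parse_superframe_index(chunk))  # [5, 3]
```

The frames themselves are stored back to back from the start of the chunk, so the returned sizes are enough to split it.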

Below we show where some of the video parameters are located in an encoded VP9 elementary bitstream. As an example we will use a file with the parameters shown in the screenshot below.

Screenshot of the uncompressed header in Virinext Bitstream Analyzer.

Frame type

The frame type is stored in the frame_type field of uncompressed_header. When frame_type=0 the frame is a key frame. When frame_type=1 the frame is a non-key frame.
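To make the field position concrete, here is a minimal sketch that reads the first bits of uncompressed_header and classifies the frame. It assumes the field order of the VP9 specification as we understand it (frame_marker, profile_low_bit, profile_high_bit, an optional reserved bit, show_existing_frame, frame_type, show_frame). BitReader and classify_frame are hypothetical helper names, not part of any library.

```python
class BitReader:
    """Reads unsigned values from a bytes object, most significant bit first."""

    def __init__(self, data):
        self.data = data
        self.pos = 0  # current position in bits

    def read(self, n):
        value = 0
        for _ in range(n):
            byte = self.data[self.pos >> 3]
            value = (value << 1) | ((byte >> (7 - (self.pos & 7))) & 1)
            self.pos += 1
        return value


def classify_frame(data):
    """Classify a VP9 frame from the first bits of its uncompressed header."""
    r = BitReader(data)
    if r.read(2) != 2:
        raise ValueError("frame_marker must be 2")
    profile_low_bit = r.read(1)
    profile_high_bit = r.read(1)
    profile = (profile_high_bit << 1) | profile_low_bit
    if profile == 3:
        r.read(1)  # reserved_zero
    if r.read(1):  # show_existing_frame: only a reference index follows
        return {"profile": profile, "kind": "show existing frame",
                "frame_to_show_map_idx": r.read(3)}
    frame_type = r.read(1)
    show_frame = r.read(1)
    return {"profile": profile,
            "kind": "key frame" if frame_type == 0 else "non-key frame",
            "hidden": show_frame == 0}


print(classify_frame(bytes([0x82]))["kind"])  # key frame
print(classify_frame(bytes([0x86]))["kind"])  # non-key frame
```

For profiles 0 to 2 all of these fields fit into the first byte of the frame, which is why a shown profile 0 key frame starts with the byte 0x82.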

Frame size

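The frame size is stored in the frame_size::frame_width_minus_1 and frame_size::frame_height_minus_1 fields in uncompressed_header; the actual width and height are these values plus one. The fields are always present in key frames (and in intra-only frames). An inter frame either reuses the size of one of its reference frames or codes the size explicitly in the same way.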

Aspect ratio

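The aspect ratio is signaled via the render_size syntax element in uncompressed_header. When render_size::render_and_frame_size_different=0 the render size equals the frame size and the pixels are square. When the pixels are not square, the intended display size is given by the render_size::render_width_minus_1 and render_size::render_height_minus_1 fields (each stores the value minus one), and the pixel aspect ratio follows from the ratio between the render size and the frame size.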
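The arithmetic can be sketched as follows, assuming the player scales the decoded frame to exactly the render size. The function name and the example dimensions are ours and are not taken from the sample file.

```python
from fractions import Fraction


def pixel_aspect_ratio(frame_width, frame_height, render_width, render_height):
    """Pixel (sample) aspect ratio implied by scaling the frame to the render size."""
    horizontal_scale = Fraction(render_width, frame_width)
    vertical_scale = Fraction(render_height, frame_height)
    return horizontal_scale / vertical_scale


# A 1440x1080 coded frame that should be displayed as 1920x1080.
print(pixel_aspect_ratio(1440, 1080, 1920, 1080))  # 4/3
```

When the render size equals the frame size the result is 1, which corresponds to square pixels.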

Profile

The profile is stored in the profile_low_bit and profile_high_bit fields in uncompressed_header, which are combined as Profile = (profile_high_bit << 1) + profile_low_bit. VP9 supports 4 profiles:

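Table with VP9 profiles description.

  Profile   Bit depth   Chroma subsampling
  0         8           4:2:0
  1         8           4:2:2, 4:4:0 or 4:4:4
  2         10 or 12    4:2:0
  3         10 or 12    4:2:2, 4:4:0 or 4:4:4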

Bit depth

When the profile is 0 or 1, the bit depth is 8 bits. Otherwise the bit depth is signaled by the color_config::ten_or_twelve_bit field in uncompressed_header: 0 means 10 bits and 1 means 12 bits.

Screenshot of Virinext Bitstream Analyzer displaying uncompressed_header with the ten_or_twelve_bit field.

Chroma format

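When the profile is 0 or 2, the chroma format is always 4:2:0. When the profile is 1 or 3, the chroma subsampling is described by the color_config::subsampling_x and color_config::subsampling_y fields in uncompressed_header. A value of 1 means that the chroma planes are subsampled by two in that direction, so (subsampling_x, subsampling_y) = (1, 0) is 4:2:2, (0, 1) is 4:4:0 and (0, 0) is 4:4:4.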
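The fields from the sections above can be read in order from the start of a key frame. The sketch below is a simplified reading of the uncompressed_header syntax in the VP9 specification as we understand it, not a complete or validated parser: it stops after render_size and handles key frames only. BitReader, parse_key_frame_header and the dictionary keys are hypothetical names chosen for this example.

```python
class BitReader:
    """Reads unsigned values from a bytes object, most significant bit first."""

    def __init__(self, data):
        self.data = data
        self.pos = 0  # current position in bits

    def read(self, n):
        value = 0
        for _ in range(n):
            byte = self.data[self.pos >> 3]
            value = (value << 1) | ((byte >> (7 - (self.pos & 7))) & 1)
            self.pos += 1
        return value


CS_RGB = 7  # color_space value that signals RGB
CHROMA_FORMATS = {(1, 1): "4:2:0", (1, 0): "4:2:2", (0, 1): "4:4:0", (0, 0): "4:4:4"}


def parse_key_frame_header(data):
    r = BitReader(data)
    if r.read(2) != 2:
        raise ValueError("frame_marker must be 2")
    profile_low_bit = r.read(1)
    profile_high_bit = r.read(1)
    profile = (profile_high_bit << 1) | profile_low_bit
    if profile == 3:
        r.read(1)  # reserved_zero
    if r.read(1):
        raise ValueError("show_existing_frame is set, no frame data follows")
    if r.read(1) != 0:
        raise ValueError("frame_type is not a key frame")
    r.read(1)  # show_frame
    r.read(1)  # error_resilient_mode
    if r.read(24) != 0x498342:
        raise ValueError("bad frame_sync_code")

    # color_config: bit depth and chroma subsampling
    bit_depth = 8
    if profile >= 2:
        bit_depth = 12 if r.read(1) else 10  # ten_or_twelve_bit
    color_space = r.read(3)
    if color_space != CS_RGB:
        r.read(1)  # color_range
        if profile in (1, 3):
            subsampling_x = r.read(1)
            subsampling_y = r.read(1)
            r.read(1)  # reserved_zero
        else:
            subsampling_x = subsampling_y = 1
    elif profile in (1, 3):
        subsampling_x = subsampling_y = 0
        r.read(1)  # reserved_zero
    else:
        # Choice of this sketch: reject RGB in the 4:2:0-only profiles.
        raise ValueError("RGB is not expected in profile 0 or 2")

    # frame_size
    frame_width = r.read(16) + 1
    frame_height = r.read(16) + 1

    # render_size
    render_width, render_height = frame_width, frame_height
    if r.read(1):  # render_and_frame_size_different
        render_width = r.read(16) + 1
        render_height = r.read(16) + 1

    return {"profile": profile, "bit_depth": bit_depth,
            "chroma_format": CHROMA_FORMATS[(subsampling_x, subsampling_y)],
            "frame_width": frame_width, "frame_height": frame_height,
            "render_width": render_width, "render_height": render_height}
```

Passing the first bytes of a key frame (13 bytes are enough for profile 0) returns the profile, bit depth, chroma format, frame size and render size, which can be compared with the values shown in the analyzer.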

Conclusion

We have provided a high-level overview of VP9 video and described the location of some video parameters in the encoded bitstream. VP9 video files can be analyzed with Virinext Bitstream Analyzer. It is a GUI tool for both in-depth and high-level analysis of many encoding standards, including VP9 video.