We can all agree that analyzing video quality is one of the biggest challenges when evaluating codecs. Companies use a combination of objective and subjective tests to validate encoder efficiency. In this post, I’ll explore why it is difficult to measure video quality with quantitative metrics alone: they fall short of the way the human eye perceives quality.
Furthermore, we’ll look at why it’s important to equip yourself with the best resources when doing subjective testing, and how Beamr’s VCT visual comparison tool can help you with video quality testing.
But first, if you haven’t done so already, be sure to download your free trial of VCT here.
The most common objective measurement used today is pixel-based Peak Signal to Noise Ratio (PSNR). PSNR is popular because it is easy to calculate and nearly everyone working in video is familiar with interpreting its values. But it does have limitations. Typically a higher PSNR value correlates to higher quality, while a lower PSNR value correlates to lower quality. However, because PSNR is derived from the pixel-based mean-squared error over an entire frame, it reduces the quality of a frame (or collection of frames) to a single number, and that number does not always parallel true subjective quality.
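To make the calculation concrete, here is a minimal sketch of PSNR for 8-bit frames. The function names `mse` and `psnr` and the toy frame are illustrative assumptions, not part of VCT or any Beamr SDK.

```python
import math

def mse(reference, distorted):
    """Mean squared error between two equally sized 2-D frames (lists of rows)."""
    total = 0.0
    count = 0
    for ref_row, dis_row in zip(reference, distorted):
        for r, d in zip(ref_row, dis_row):
            total += (r - d) ** 2
            count += 1
    return total / count

def psnr(reference, distorted, max_value=255):
    """PSNR in dB for frames whose peak sample value is max_value (255 for 8-bit)."""
    error = mse(reference, distorted)
    if error == 0:
        return math.inf  # identical frames: no noise at all
    return 10.0 * math.log10(max_value ** 2 / error)

# Toy 2x4 luma frame and a copy in which every pixel is off by 2 (MSE = 4).
reference = [[100, 110, 120, 130], [140, 150, 160, 170]]
distorted = [[p + 2 for p in row] for row in reference]
print(round(psnr(reference, distorted), 2))  # about 42.11 dB
```

Note that the whole frame collapses into one number: nothing in the result says where in the picture the error sits.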
PSNR gives equal weight to every pixel in the frame and each frame in a sequence, ignoring many factors that can affect human perception. For example, below are two encoded images of the same frame (1). Image (a) and Image (b) have the same PSNR, which in theory should mean the two encoded images are of the same quality. However, the difference in perceived quality is easy to see: viewers would rate Image (a) as far higher quality than Image (b).
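The same effect can be reproduced with a toy example. The sketch below is an illustration built on synthetic data, not the images from the post: it constructs two distortions of a flat 8x8 frame, one spread thinly over every pixel and one concentrated in a small patch, and shows that they score identically. The helper name `psnr` and the frame values are assumptions made for the example.

```python
import math

def psnr(reference, distorted, max_value=255):
    """PSNR in dB between two equally sized 2-D frames (lists of rows)."""
    squared = [(r - d) ** 2
               for ref_row, dis_row in zip(reference, distorted)
               for r, d in zip(ref_row, dis_row)]
    error = sum(squared) / len(squared)
    return math.inf if error == 0 else 10.0 * math.log10(max_value ** 2 / error)

SIZE = 8
reference = [[128] * SIZE for _ in range(SIZE)]

# Distortion A: a faint brightness shift of 1 level on every pixel.
spread = [[p + 1 for p in row] for row in reference]

# Distortion B: a 2x2 patch that is off by 4 levels; all other pixels untouched.
localized = [row[:] for row in reference]
for y in range(2):
    for x in range(2):
        localized[y][x] += 4

# Both distortions have MSE = 1, so PSNR cannot tell them apart.
print(psnr(reference, spread) == psnr(reference, localized))  # True
```

A viewer is far more likely to notice a visible blotch than a one-level global shift, yet the metric treats the two as equivalent.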
Because error-based methods like PSNR cannot adequately mimic human visual perception, other methods for analyzing video quality have been developed, including the Structural Similarity Index (SSIM), which measures structural distortion. Unlike PSNR, SSIM treats image degradation as a perceived change in three major aspects of an image: luminance, contrast, and structure. SSIM has gained popularity, but as with PSNR, it has its limitations. Studies have suggested that SSIM’s performance is equal to PSNR’s performance, and some have cited evidence of a systematic relationship between SSIM and Mean Squared Error (MSE) (2).
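For readers who want to see how SSIM combines its components, here is a simplified sketch. It applies the standard SSIM formula once over the whole input, using population statistics; practical implementations instead compute it over small local windows and average the results, so treat this as an assumption-laden illustration rather than a reference implementation. The name `ssim_global` is hypothetical; `k1 = 0.01` and `k2 = 0.03` are the commonly used stabilizing constants.

```python
def ssim_global(x, y, max_value=255, k1=0.01, k2=0.03):
    """Single-window SSIM over two equally sized flat lists of pixel values."""
    n = len(x)
    mean_x = sum(x) / n                      # luminance term uses the means
    mean_y = sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n   # contrast term uses the variances
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    # Structure term uses the covariance between the two signals.
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    c1 = (k1 * max_value) ** 2               # constants that avoid division by ~0
    c2 = (k2 * max_value) ** 2
    numerator = (2 * mean_x * mean_y + c1) * (2 * cov_xy + c2)
    denominator = (mean_x ** 2 + mean_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator

reference = [10, 50, 200, 240]
flattened = [125, 125, 125, 125]  # same mean brightness, all structure removed

print(ssim_global(reference, reference))  # identical inputs score 1 (up to rounding)
print(ssim_global(reference, flattened))  # close to 0: contrast and structure are gone
```

The second comparison keeps the average brightness intact, so the low score comes entirely from the lost contrast and structure.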
While SSIM and other quantitative measures, including multi-scale structural similarity (MS-SSIM) and the Sarnoff Picture Quality Rating (PQR), have made significant gains, none can truly deliver the same assurance as subjective evaluation by the human eye. It is also important to note that the two most widely used objective quality metrics mentioned above, PSNR and SSIM, were designed to evaluate static image quality. This means that neither algorithm provides meaningful information about motion artifacts, thereby limiting the effectiveness of both metrics for video.
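A small sketch shows why a frame-by-frame metric is blind to temporal artifacts. It compares a steady brightness error with one that flips sign on every frame, which a viewer would see as flicker. The sequences are synthetic and the names `frame_psnr` and `mean_psnr` are assumptions for the example; averaging per-frame PSNR is one common way of scoring a sequence, not the only one.

```python
import math

def frame_psnr(reference, distorted, max_value=255):
    """PSNR in dB between two frames given as flat lists of pixel values."""
    squared = [(r - d) ** 2 for r, d in zip(reference, distorted)]
    error = sum(squared) / len(squared)
    return math.inf if error == 0 else 10.0 * math.log10(max_value ** 2 / error)

def mean_psnr(ref_frames, dis_frames):
    """Score a sequence by averaging the PSNR of each frame pair."""
    scores = [frame_psnr(r, d) for r, d in zip(ref_frames, dis_frames)]
    return sum(scores) / len(scores)

# Six identical flat frames of 16 pixels stand in for a static shot.
ref_frames = [[128] * 16 for _ in range(6)]

# Steady: every frame is 2 levels too bright.
steady = [[p + 2 for p in frame] for frame in ref_frames]

# Flicker: the brightness error alternates +2, -2, +2, ... between frames.
flicker = [[p + (2 if i % 2 == 0 else -2) for p in frame]
           for i, frame in enumerate(ref_frames)]

# Each frame has MSE = 4 in both sequences, so the scores are identical.
print(mean_psnr(ref_frames, steady) == mean_psnr(ref_frames, flicker))  # True
```

Because each frame is scored in isolation, the metric never sees the frame-to-frame change that makes the second sequence flicker.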
While objective methods attempt to model human perception, there are no substitutes for subjective “golden-eye” tests. But we are all familiar with the drawbacks of subjective analysis, including the variance in how individuals perceive quality and the difficulty of running proper subjective tests in fully controlled viewing environments with a large number of testers. Evaluating video using subjective visual tests can reveal key differences that may not get caught by objective measures alone, which is why it is important to use a combination of both objective and subjective testing methodologies.
One of the logistical difficulties of performing subjective quality comparisons is coordinating simultaneous playback of two streams. Recognizing the drawbacks of current subjective evaluation methods, in particular single-stream playback and awkward dual-stream review workarounds, Beamr spent years in research and development to build a tool that offers simultaneous playback of two videos with various comparison modes, significantly improving the golden-eye testing needed to properly evaluate encoder efficiency.
Powered by our professional HEVC and H.264 codec SDK decoders, the Beamr video comparison tool VCT allows encoding engineers and compressionists to play back two frame-synchronized independent HEVC, H.264, or YUV sequences simultaneously and compare the quality of these streams in four modes:
- Split screen
- Side-by-side
- Overlay
- and the newest mode, Butterfly
MPEG2-TS and MP4 files containing either HEVC or H.264 elementary streams are also supported. Additionally, VCT displays valuable clip information such as bit-rate, screen resolution, frame rate, number of frames, and other important video information.
Developed in 2012, VCT was the industry’s first internal software player offered as a tool to help Beamr customers conduct subjective testing while evaluating our encoder’s efficiency. Today, VCT has been tested by many content and equipment companies from around the world in multiple markets including broadcast, mobile, and internet streaming, making it the de facto standard for subjective golden-eye video quality testing and evaluation.
VCT BENEFITS AND TIPS
Your FREE trial of VCT comes with an extensive user guide that contains everything you need to get started. But we know you are eager to begin your testing, so here are a few quick tips we trust you will find useful. Take advantage of this “golden” opportunity and get started today!
Note: use Command (⌘) instead of Ctrl for the OS X version of VCT.
- Split Screen Comparison Mode:
  - Great for viewing two clips when only one screen is available.
  - The movable slider bar lets you clearly see the quality difference between two streams in your desired region of interest. For example, you can move the slider bar back and forth across a face to see quality differences between two discrete files.
  - Pro Tips:
    - Use the keyboard shortcut Ctrl + \ to re-center the slider bar after it is moved.
    - Shortcut key Ctrl + Tab allows you to change which video appears on the left or right of the slider bar.
- Side-by-side Comparison Mode:
  - Great for tradeshows. Solves the lack of synchronization in side-by-side comparison tests when using two independent players.
  - Single control for both streams.
  - Pro Tip:
    - Shortcut key Ctrl + Tab allows you to change which video appears on which screen without moving the windows.
- Overlay Comparison Mode:
  - Great for viewing the full frame of one stream in a single window.
  - Shortcut key Ctrl + Tab allows you to cycle between the two videos. Cycling quickly is a great way to spot quality differences between the two streams that you might otherwise not have noticed.
- Butterfly Comparison Mode:
  - Very useful for determining the accuracy of the encoding process. The butterfly mode displays mirrored images of two sequences to help you assess whether an artifact occurs in the source when comparing an encoded sequence to the original.
  - In butterfly mode, use shortcut key Ctrl + \ to reset the frame to the leftmost view and Ctrl + Alt + \ to switch to the rightmost view.
  - Use shortcut keys Ctrl + [ and Ctrl + ] to move the image left or right in butterfly mode.
- Other Useful Tips:
  - Ctrl + m allows you to toggle through the four comparison modes.
  - Shift + Left Click opens the magnifier tool, which lets you zoom into hard-to-see areas of the video.
  - Easily scale frames of different resolutions to the same resolution by clicking “scale to same look” on the main menu.
  - The NEW automatic download feature on the splash screen notifies you of the latest version updates to ensure you’re always up to date.
  - For more great features, be sure to check out the VCT user guide at beamr.com/vct/userguide.com.
(1) P. M. Arun Kumar and S. Chandramathi. Video Quality Assessment Methods: A Bird’s-Eye View
(2) Richard Dosselmann and Xue Dong Yang. A Formal Assessment of the Structural Similarity Index