Live 4Kp60 Optimized Encoding with Beamr CABR and NVIDIA Holoscan for Media

This year at IBC 2024 in Amsterdam, we are excited to demonstrate live 4Kp60 optimized streaming with our Content-Adaptive Bitrate (CABR) technology on NVIDIA Holoscan for Media, a software-defined, AI-enabled platform that allows live video pipelines to run on the same infrastructure as AI. Using the CABR GStreamer plugin, premiered at the NAB Show earlier this year, we now support live, quality-driven optimized streaming for 4Kp60 video content.

It is no secret that savvy viewers have come to expect the high-quality experience of 4K Ultra-High-Definition streamed at 60 frames per second for premium events. What started as a trickle a few years back has become the high-end norm for recent events such as the 2024 Olympics, where enthusiasts traded tips on where the 4K feeds could be accessed.

Given that 4K means a whopping four times the pixels of full HD resolution, keeping up with live encoding of 4K at 60 fps can be quite challenging, and can also result in bitrates that are too high to manage.

One possible solution for broadcasters is to encode and transmit at 1080p and rely on the constantly improving upscalers in TVs to provide the 4K experience, but this of course means surrendering control of the user experience. A better solution is a platform fast enough to create live 4Kp60 encodes that combine excellent quality with an optimization process that minimizes the bitrate required for transmission.

Comparison of 4K Live video before and after optimization

Beamr CABR on Holoscan for Media offers exactly that, combining the fast data buses and easy-to-use architecture of Holoscan for Media with Beamr's hardware-accelerated, quality-driven optimized AV1 encoding. Together, they make it possible to stream super-efficient, lower-bitrate 4K encodes at top-notch quality.

Content-Adaptive Bitrate encoding, or CABR, is Beamr's patented and award-winning technology that uses a quality measure to select the candidate encode with the lowest bitrate that still has the same perceptual quality as the reference frame. In other words, users can enjoy 30-50% lower bitrate, faster delivery of files or live video streams, and an improved user experience – all with exactly the same quality as the original video.

To achieve aggressive bitrates that are feasible for broadcast of live events, we configure the system to use AV1 encoding. The advanced AV1 format has been around since 2018, yet many players in the video arena have not realized its full potential. AV1 raises the bar significantly over previous modern codecs such as AVC (H.264) and HEVC (H.265) in terms of efficiency, performance on GPUs, and high quality for real-time video, and when combined with CABR, AV1 offers even more. According to our tests, AV1 can reduce data by 50% compared to AVC and by 30% compared to HEVC. We have also shown that CABR-optimized AV1 is beneficial for machine learning tasks.

Putting all three of these technologies together, namely deploying Holoscan for Media with the Beamr CABR solution inside, which in turn uses NVIDIA's hardware-accelerated AV1 encoder, provides a platform with spectacular benefits. With the rise in demand for high-quality live streaming at high resolution, high frame rates, and manageable bitrates, all while keeping an eye on encoding costs, this solution is a compelling prospect for companies looking to boost their streaming workflows.

Real-time Video Optimization with Beamr CABR and NVIDIA Holoscan for Media

This year at the NAB Show 2024 in Las Vegas, we are excited to demonstrate our Content-Adaptive Bitrate (CABR) technology on the NVIDIA Holoscan for Media platform. By implementing CABR as a GStreamer plugin, we have, for the first time, made bitrate optimization of live video streams easily achievable in the cloud or on premises.

Building on the NVIDIA DeepStream software development kit, which extends GStreamer's capabilities, significantly reduced the amount of code required to develop the Holoscan for Media based application. Using DeepStream components for real-time video processing and NMOS (Networked Media Open Specifications) signaling, we were able to keep our focus on the CABR technology and video processing.

The NVIDIA DeepStream SDK provides an excellent framework for developers to build and customize dynamic video processing pipelines. DeepStream provides pipeline components that make it very simple to build and deploy live video processing pipelines that utilize the hardware decoders and encoders available on NVIDIA GPUs.

Beamr CABR dynamically adjusts video bitrate in real time, optimizing quality and bandwidth use. It reduces data transmission without compromising video quality, making video streaming more efficient. Recently we released our GPU implementation, which uses the NVIDIA NVENC encoder, providing significantly higher performance than our previous solutions.

Taking our GPU implementation for CABR to the next level, we have built a GStreamer Plugin. With our GStreamer Plugin, users can now easily and seamlessly incorporate the CABR solution into their existing DeepStream pipelines as a simple drop-in replacement to their current encoder component.

Holoscan For Media


A GStreamer Pipeline Example

To illustrate the simplicity of using CABR, consider a simple DeepStream transcoding pipeline that reads from and writes to files.


Simple DeepStream Pipeline:
gst-launch-1.0 -v \
  filesrc location="video.mp4" ! decodebin ! nvvideoconvert ! queue ! \
  nvv4l2av1enc bitrate=4500 ! mp4mux ! filesink location="output.mp4"

By simply replacing the nvv4l2av1enc component with our CABR component, the encoding bitrate is adapted in real-time, according to the content, ensuring optimal bitrate usage for each frame, without any loss of perceptual quality.


CABR-Enhanced DeepStream Pipeline:
gst-launch-1.0 -v \
  filesrc location="video.mp4" ! decodebin ! nvvideoconvert ! queue ! \
  beamrcabrav1 bitrate=4500 ! mp4mux ! filesink location="output_cabr.mp4"


Similarly, we can replace the encoder component used in a live streaming pipeline with the CABR component to optimize live video streams, dynamically adjusting the output bitrate and offering up to a 50% reduction in data usage without sacrificing video quality.


Simple DeepStream Live Pipeline:
gst-launch-1.0 -v \
  rtmpsrc location="rtmp://someurl live=1" ! decodebin ! queue ! \
  nvvideoconvert ! queue ! nvv4l2av1enc bitrate=3500 ! \
  av1parse ! rtpav1pay mtu=1300 ! srtsink uri=srt://:8888

CABR-Enhanced DeepStream Pipeline:
gst-launch-1.0 -v \
  rtmpsrc location="rtmp://someurl live=1" ! decodebin ! queue ! \
  nvvideoconvert ! queue ! beamrcabrav1 bitrate=3500 ! \
  av1parse ! rtpav1pay mtu=1300 ! srtsink uri=srt://:8888


The Broad Horizons of CABR Integration in Live Media

Beamr CABR, demonstrated using NVIDIA Holoscan for Media at the NAB Show, marks just the beginning. This technology is an ideal fit for applications running on NVIDIA RTX GPU-powered accelerated computing and sets a new standard for video encoding.

Lowering the video bitrate reduces the required bandwidth when ingesting video to the cloud, creating new possibilities where high resolution or quality were previously costly or not even possible. Similarly, reduced bitrate when encoding on the cloud allows for streaming of higher quality videos at lower cost.

From file-based encoding to streaming services — the potential use cases are diverse, and the integration has never before been so simple. Together, let's step into the future of media streaming, where quality and efficiency coexist without compromise.

How To Cut Cloud Gaming Bitrates In Half So That Twice As Many Users Can Play

TL;DR: Beamr CABR operating with the Intel Media SDK hardware encoder powered by Intel GPUs is the perfect video encoding engine for cloud gaming services like Google Stadia. The Intel GPU hardware encoder reaches real-time performance with a power envelope that is 90% less than a CPU-based software solution. When combined with Beamr CABR (Content-Adaptive Bitrate) technology, the bandwidth required for cloud gaming is reduced by as much as 49% while delivering higher quality 65% of the time. Using the Intel hardware encoder combined with Beamr CABR enables players to enjoy a gaming experience competitive with a console and able to be streamed by cloud gaming platforms. Get more information about how CABR works.

The era of cloud gaming.

With the launch of Google Stadia, we have entered a new era in the games industry called cloud gaming. Just as streaming video services opened media and entertainment content to a broader audience by freeing it from the fixed frameworks of terrestrial (over-the-air), cable, and satellite distribution, so too will cloud gaming open gameplay to a larger audience. Besides extending gameplay to virtually anywhere the user has a network-connected device, the ability for a player to access an extensive library of games without needing a specific piece of hardware will push 25.9 million players to cloud gaming platforms by 2023, according to the media research group Kagan.

In addition to opening up gameplay to an "anywhere/anytime" experience, a major user-experience benefit of cloud gaming is that players will not necessarily need to purchase a game; in many cases they will be free to access a vast library of their choosing instantaneously. Cloud gaming services promise the quality of a console or PC experience, but without the need to own expensive hardware or do the configuration and software installation work that comes with it.

The one constraint that could cause cloud gaming to never catch up with the console experience.

With the wholesale transition of video entertainment content from traditional broadcast and physical media to streaming distribution, it is not hard to project the same pattern for games. Except now, unlike the early days of video streaming, when a 3 Mbps home Internet connection was "high speed" and the number of devices able to decode and reliably play back H.264 video was limited, even the lowest-cost smartphone can stream video with acceptable quality.

Yet, there is a fundamental constraint that must be overcome for cloud gaming to reach its full market potential: the bandwidth required to deliver a competitive video experience at 1080p60 or 4Kp60 resolution. To better understand the bandwidth squeeze that is unique to cloud gaming, let's examine the data and signal flow.

In FIGURE 1 we see the cloud gaming architecture moves compute-intensive operations, like the graphics rendering engine, to the cloud.

FIGURE 1

Shifting the compute-intensive functions to the cloud eliminates device technical capability as a bottleneck. However, because the video rendering and encoding functions are no longer local to the user, the video stream must be delivered over the network with latency in the tens of milliseconds, and at a frame rate double the 24, 25, or 30 frames per second of entertainment video. Additionally, video game resolutions need to be HD, with 4K preferable, and HDR is an increasingly important capability for many AAA game titles.

None of these requirements is impossible to meet, but the need for fast encoding forces the encoder into a mode that makes it difficult to produce high quality with a small stream size. Without the added time needed to create B frames, and without the benefit of a look-ahead buffer, producing high quality at low bitrate is not possible. This is why cloud gaming services require a significantly higher bitrate than traditional video-on-demand streaming services.

Beamr has been innovating in the area of performance, allowing us to encode H.264 and HEVC in software with breathtaking speed, even when running our most advanced Content-Adaptive Bitrate (CABR) rate control. For video applications where a single encoded stream can serve hundreds of thousands or even millions of viewers, the compute required to do this in software is easy to justify, given the tremendous benefits of lower bitrate and higher quality. But in an application like cloud gaming, where a video encoder is matched 1:1 to every user, the computing cost makes a software-only approach uneconomical. The answer is a hardware encoder controlled by software, running a content-adaptive optimization process that delivers the additional bitrate savings needed.

FIGURE 2 illustrates the required Google Stadia bitrates.

FIGURE 2

The answer is to leverage hardware and software.

The Intel Media SDK and GPU engines occupy a well-established position in the market, with many video services relying on the included HEVC hardware encoder for real-time encoding. However, with the VBR rate control alone, there is a limit to the quality achievable when bitrate efficiency is essential. The combination of Beamr's next-generation rate control technology, CABR (Content-Adaptive Bitrate), with Intel GPUs is the secret to delivering bitrate efficiency and quality, in real time, with 90% less power than software alone.

In verified testing, Beamr has shown that the Intel Media SDK hardware encoder controlled by CABR produces the same perceptual quality as VBR encodes, with a confidence level greater than 95%. Using CABR has a meaningful impact on user experience: 65% of the time, the player will perceive better quality at the same bandwidth, even while the gaming platform sees up to a 49% reduction in the bandwidth required to provide the same quality level.

Watch Beamr Founder Sharon Carmel present Beamr CABR integrated with Intel Gen 11 hardware encoder at Intel Experience Day October 29, 2019 in Moscow.

Proof of performance.

As an image science company, Beamr is committed to proof of performance for all claims. For this reason, the industry recognizes that all technology, products, and solutions which carry the Beamr name represent the pinnacle of quality. It was therefore insufficient to integrate CABR with the Intel Media SDK without being able to prove that the original quality of the stream is always preserved and that the user experience is improved. Testing compared corresponding 10-second segments extracted from clips created with the Intel hardware encoder using VBR against clips encoded with the Intel hardware encoder using the integrated Beamr CABR rate control.

The only way to test perceptual quality is with subjective techniques. We used a process similar to forced-choice double stimulus (FCDS), closely approximating the ITU BT.500 method. Using the Beamr Auto-VISTA framework, we recruited anonymous viewers from Amazon Mechanical Turk; each viewer was shown corresponding segment pairs and asked to select which video had lower quality. The VBR and CABR encoded files were placed at random on the left and right sides. Validation pairs with visible artifacts inserted were used to verify each viewer's capabilities, and only results from users who correctly answered all four validation pairs were incorporated into the analysis. Viewers had up to five attempts to view each pair before making a decision. Each viewer watched 20 segment pairs, consisting of sixteen actual CABR and VBR encodes and four validation pairs.

Games used for testing were CSGO, Fallout, and GTA5. To reflect realistic bitrates, we tested only the middle four of the six bitrates provided, since the bitrate of the top layer was very high and the quality of the bottom layer was very low. The four bitrates tested were spaced one JND (just noticeable difference) apart. Each target test pair was viewed 13 to 21 times by valid users, for a total of 800 target pair viewings, or about 17 viewings per pair on average. The total number of valid test sessions was 50, completed by more than 40 unique viewers.
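
To illustrate how such forced-choice results are read, here is a small Python sketch that tallies, per pair, the fraction of viewings in which the CABR encode was judged the lower-quality one; a fraction hovering near 50% means viewers could not tell the encodes apart. The pair names and counts below are illustrative only, not data from the study:

# Hypothetical per-pair tallies: (times CABR was picked as lower quality, total viewings).
pairs = {
    "csgo_jnd2":    (8, 17),
    "fallout_jnd3": (9, 16),
    "gta5_jnd4":    (7, 15),
}

total_worse = total_views = 0
for name, (cabr_worse, views) in pairs.items():
    share = cabr_worse / views
    total_worse += cabr_worse
    total_views += views
    print(f"{name}: CABR judged worse in {share:.0%} of {views} viewings")

# Aggregating across pairs: a share near 50% indicates perceptual equivalence.
print(f"overall: {total_worse / total_views:.0%} of {total_views} viewings")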

Peeling back the data, you will notice that the per-pair statistical distribution is quite symmetrical above and below 50%. With this sampling base, that is no surprise; human perception varies. The overall results comprise 800 views of 48 pairs, which makes the statistical certainty higher, indicating that CABR does not compromise perceptual quality.

FIGURE 4 shows CABR encodes had the same perceptual quality as VBR and with a confidence level of more than 95%.

FIGURE 4

Better quality, lower bitrate.

Beamr CABR encoded streams offer higher quality when compared subjectively to an equivalent VBR encode, while offering bitrate savings of up to 49%. The benefits of CABR for cloud gaming, or any live streaming service, are better quality, greater bandwidth savings, and a reduction in storage cost. For the files that we tested, the aggregated metrics were as follows:

  • 65% of the time, users will experience better quality for a given bandwidth.
  • 40% bandwidth savings on average across all three titles (GTA5 had a savings of 49%).
  • 30% overall storage savings.

FIGURES 5, 6, and 7 illustrate, for the three video samples used, that for a given user bandwidth, CABR provides higher quality. Read the charts as follows: where VBR is blue, CABR is black (higher quality), and where VBR is turquoise, CABR is blue.

FIGURE 5
FIGURE 6
FIGURE 7

Conclusion.

Beamr CABR controlling the Intel Media SDK hardware encoder is the perfect video encoding engine for cloud gaming services like Google Stadia. The Beamr CABR rate control and optimization process works with all Intel codecs, including AVC, HEVC, VP9, and AV1. All bitstreams produced by the Intel + Beamr CABR solution are fully standard-compliant and work with every player in the field today. Beamr CABR is proven and protected by 46 international patents, meaning there is no other solution that can reduce bitrate by as much as 49% while working in real time, using a closed-loop, perceptually aligned quality measure to guarantee the original quality.

The single most important technical hurdle for anyone building or operating a cloud gaming service or platform is the bandwidth consumption required to deliver a player experience on par with the console. Now, with Intel + Beamr CABR, the ideal solution is here; one that can reach the performance and density needed for cloud gaming at scale, so that more players can enjoy a premium gaming experience. Streaming video upended the media and entertainment business, with the rise of Netflix, Hulu, Amazon Prime Video, Disney+, Apple TV Plus, and dozens of other tier-one streaming services. In the same way, cloud gaming will create new service platforms, gaming experiences, and business models. 

To experience the power of Beamr CABR controlling the Intel hardware encoder, send an email to info@beamr.com.

The Patented Visual Quality Measure that was Designed to Drive Higher Compression Efficiency

At the heart of Beamr’s closed-loop content-adaptive encoding solution (CABR) is a patented quality measure. This measure compares the perceptual quality of each candidate encoded frame to the initial encoded frame. The quality measure guarantees that when the bitrate is reduced the perceptual quality of the target encode is preserved. In contrast to general video quality measures – which aim to quantify any difference between video streams resulting from bit errors, noise, blurring, change of resolution, etc. – Beamr’s quality measure was developed for a very specific task. It reliably and quickly quantifies the perceptual quality loss introduced in a video frame due to artifacts of block-based video encoding. In this blog post, we present the components of our patented video quality measure, as shown in Figure 1. 

Pre-analysis

Before determining the quality of an encoded frame, the quality measure component performs pre-analysis on the source and initial encoded frames, both to extract data used in the quality measure calculation and to collect information used to configure it. The analysis consists of two parts: part I is performed on the source frame, and part II is performed on the initial encoded frame.


Figure 1. A block diagram of the video quality measure used in Beamr’s CABR engine

The goal of part I of the pre-analysis is to characterize the content, the frame, and areas of interest within a given frame. In this phase, we can determine whether the frame has skin and face areas, rich chroma information typical of 3D animation, or the highly localized movement against a static background found in cel animation content. The algorithms used are designed for low CPU overhead. For example, our facial detection algorithm applies a full detection mechanism at scene changes and a unique, low-complexity adaptive-tracking mechanism in other frames. For skin detection, we use an AdaBoost classifier, which we trained on a marked dataset we created; the classifier takes YUV pixel values and 4×4 luma variance values as input. At this stage, we also calculate the edge map, which we employ in the Edge-Loss-Factor score component described below.

Part II of the pre-analysis is used to analyze the characteristics of the frame after the initial encoding. In this phase, we may determine if the frame has grain and estimate the amount of grain, and use it to configure the quality measure calculation. We also collect information about the complexity of each block, which is indicated, for example, by the bit usage and block quantization level used to encode each block. At this stage, we also calculate the density of local textures in each block or area of the frame, which is used for the texture preservation score component described below.

Quality Measure Process and Components

The quality measure evaluates the quality of a target frame when compared to a reference frame. In the context of CABR, the reference frame is the initial encoded frame and the target frame is the candidate frame of a specific iteration. After performing the two phases of the pre-analysis, we proceed to the actual quality measure calculation, which is described next.

Tiling 

After completing the two phases of the pre-analysis stage, each of the reference and target frames is partitioned into corresponding tiles. The location and dimensions of these tiles are adapted according to the frame resolution and other frame characteristics. For example, we will use smaller tiles in a frame which has highly localized motion. Tiles are also sometimes partitioned further into sub-tiles, for at least some of the quality measure components. A quality metric score is calculated for each tile, and these per-tile scores are perceptually pooled to obtain a frame quality score.
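
A minimal sketch of this tiling step might look as follows (Python/NumPy; the tile size here is arbitrary, whereas the real engine adapts it per resolution and content, as described above):

import numpy as np

def tiles(frame: np.ndarray, tile_h: int = 128, tile_w: int = 128):
    """Yield (top, left, tile) views covering the frame; edge tiles may be smaller."""
    height, width = frame.shape[:2]
    for top in range(0, height, tile_h):
        for left in range(0, width, tile_w):
            yield top, left, frame[top:top + tile_h, left:left + tile_w]

# Per-tile scores would then be computed on corresponding reference/target tiles
# and perceptually pooled into a single frame score.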

The quality score for each tile is calculated as a weighted geometric average of the values calculated for each quality measure component. The components include a local similarity component which determines a pixel-wise difference, an added artifactual edges component, a texture distortion component, an edge loss factor, and a temporal component. We now provide a brief review of these five components of Beamr’s quality measure. 

Local Similarity

The local similarity component evaluates the level of similarity between pixels at the same position in the reference and target tiles. This component is somewhat similar to PSNR, but uses adaptive sub-tiling, pooling, and thresholding to provide results that are more perceptually oriented than regular PSNR. In some cases, such as when pre-analysis determines that the frame contains rich chroma content, pixel similarity for the chroma planes is also included in this component, but in most cases only luma is used. For each sub-tile, regular PSNR is calculated. To give greater weight to low-quality sub-tiles located within tiles of otherwise far superior quality (as can happen when changes are confined to a small area, even just a few pixels), we perform the pooling using only values below a threshold that depends on the lowest sub-tile PSNR values. We then scale the pooled value using a factor adapted to the brightness of the tile, since distortion in dark areas is more perceptually disturbing than in bright areas. Finally, we clip the local similarity component score so that it lies in the range [0,1], where 1 indicates that the target and reference tiles are perceptually identical.
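
In rough pseudo-code terms, the component might be sketched like this (Python/NumPy; the sub-tile size, pooling margin, brightness penalty, and dB-to-score mapping are placeholders we chose for illustration, not Beamr's tuned values):

import numpy as np

def local_similarity(ref_tile: np.ndarray, tgt_tile: np.ndarray,
                     sub: int = 16, margin_db: float = 3.0) -> float:
    """Sub-tile PSNR, pooling restricted to the worst sub-tiles, a penalty
    for dark tiles, and a clip to [0,1]."""
    ref = ref_tile.astype(np.float64)
    tgt = tgt_tile.astype(np.float64)
    psnrs = []
    for top in range(0, ref.shape[0], sub):
        for left in range(0, ref.shape[1], sub):
            r = ref[top:top + sub, left:left + sub]
            t = tgt[top:top + sub, left:left + sub]
            mse = np.mean((r - t) ** 2)
            psnrs.append(100.0 if mse == 0 else 10 * np.log10(255.0 ** 2 / mse))
    worst = min(psnrs)
    pooled = np.mean([p for p in psnrs if p <= worst + margin_db])  # worst-case pooling
    dark_penalty = 0.8 if ref.mean() < 64 else 1.0  # dark areas judged more strictly
    return float(np.clip(pooled * dark_penalty / 50.0, 0.0, 1.0))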

Added Artifactual Edges (AAE)

The Added Artifactual Edges score component evaluates additional blockiness introduced in the target tile compared to the reference tile. Blockiness in video coding is a well-known artifact introduced by the independent encoding of each block. Many attempts have been made to avoid this artifact, mainly using de-blocking filters, which are integral parts of modern video encoders such as AVC and HEVC. Our focus in the AAE component, however, is to quantify the extent of this artifact rather than eliminate it. Since we are interested only in the blockiness added in the target frame relative to the reference frame, we evaluate this component of the quality measure on the difference between the target and reference frames. For each horizontal and vertical coding-block boundary in the difference block, we evaluate the change, or gradient, across the coding-block border and compare it to the local gradient within the coding block on either side. For AVC encoding, for example, this is done along the 16×16 grid of the full frame. We apply soft thresholding to the blockiness value, using threshold values adapted according to information from the pre-analysis stage. For example, in an area recognized as skin, where human vision is more sensitive to artifacts, we use tighter thresholds so that mild blockiness artifacts are more heavily penalized. These calculations result in an AAE score map, containing values in the range [0,1] for each horizontal and vertical block border point. We average the values per block border, and then average these per-block-border averages, excluding or giving low weight to block borders with no added blockiness. The value is then scaled according to the percentage of extremely disturbing blockiness artifacts, i.e., cases where the original blockiness value prior to thresholding was very high, and is finally clipped to the range [0,1], with 1 indicating no added artifactual edges in the target tile relative to the reference tile.
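
The core border check can be sketched as follows (Python/NumPy, vertical borders only for brevity; the horizontal case is symmetric, and the adaptive thresholding and pooling described above are omitted):

import numpy as np

def vertical_border_blockiness(diff: np.ndarray, grid: int = 16):
    """For each vertical coding-block border in the target-minus-reference
    difference frame, compare the step across the border with the local
    gradients just inside the blocks; positive values suggest added blockiness."""
    d = diff.astype(np.float64)
    scores = []
    for x in range(grid, d.shape[1] - 1, grid):
        across = np.abs(d[:, x] - d[:, x - 1])              # step over the border
        inside = 0.5 * (np.abs(d[:, x - 1] - d[:, x - 2])   # gradient left of border
                        + np.abs(d[:, x + 1] - d[:, x]))    # gradient right of border
        scores.append(float(np.mean(across - inside)))
    return scores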

Texture Distortion

The texture distortion score component quantifies how well texture is preserved in the target tile. Most block-based codecs, including AVC and HEVC, use a frequency transform such as the DCT and quantize the transform coefficients, usually applying more aggressive quantization to the high-frequency components. This can cause two different textural artifacts. The first is a loss of texture detail, or over-smoothing, due to loss of energy in high-frequency coefficients. The second is known as "ringing," characterized by noise around edges or sharp changes in the image. Both artifacts change the local variance of the pixel values: over-smoothing decreases pixel variance, while added ringing or other high-frequency noise increases it. Therefore, we measure the local deviation in corresponding blocks of the reference and target frame tiles and compare the values. This process yields a texture tile score in the range [0,1], with 1 indicating no visible texture distortion in the target image tile.
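
A bare-bones version of that comparison (Python/NumPy; the block size and the mapping of the deviation ratio to a [0,1] score are illustrative choices of ours):

import numpy as np

def texture_score(ref_tile: np.ndarray, tgt_tile: np.ndarray, block: int = 8) -> float:
    """Compare local standard deviations of co-located blocks; both over-smoothing
    (deviation drops) and ringing (deviation rises) pull the ratio below 1."""
    ratios = []
    for top in range(0, ref_tile.shape[0] - block + 1, block):
        for left in range(0, ref_tile.shape[1] - block + 1, block):
            r = ref_tile[top:top + block, left:left + block].astype(np.float64)
            t = tgt_tile[top:top + block, left:left + block].astype(np.float64)
            sr, st = r.std() + 1e-3, t.std() + 1e-3   # epsilon avoids division by zero
            ratios.append(min(sr, st) / max(sr, st))  # 1.0 means deviation preserved
    return float(np.mean(ratios))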

Temporal consistency

The temporal score component evaluates the preservation of temporal flow in the target video sequence compared to the reference video sequence. This is the only component of the quality measure that also requires the preceding target and reference frames. In this component, we measure two kinds of changes: "new" information introduced in the reference frame which is missing in the target frame, and "new" information in the target frame where there was none in the reference frame. In this context, "new" information refers to information that exists in the current frame but not in the preceding frame. We calculate the Sum of Absolute Differences (SAD) between each co-located 8×8 block in the reference frame and the preceding reference frame, and the SAD between each co-located 8×8 block in the target frame and the preceding target frame. The local (8×8) score is derived from the relation between these two SAD values, and also from the value of the reference SAD, which indicates whether the block is dynamic or static in nature. Figure 2 illustrates the value of the local score for different combinations of the reference and target SAD values. After all local temporal scores are calculated, they are pooled to obtain a tile temporal score component in the range [0,1].
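
The SAD bookkeeping behind this component might be sketched as follows (Python/NumPy; the tuned function that maps the two SAD values to a local score, illustrated in Figure 2, is not reproduced here):

import numpy as np

def temporal_sads(ref_cur, ref_prev, tgt_cur, tgt_prev, block: int = 8):
    """Per co-located 8x8 block, return (reference SAD, target SAD): the change
    along the reference timeline vs. the change along the target timeline."""
    pairs = []
    h, w = ref_cur.shape[:2]
    for top in range(0, h - block + 1, block):
        for left in range(0, w - block + 1, block):
            s = (slice(top, top + block), slice(left, left + block))
            sad_ref = np.abs(ref_cur[s].astype(int) - ref_prev[s].astype(int)).sum()
            sad_tgt = np.abs(tgt_cur[s].astype(int) - tgt_prev[s].astype(int)).sum()
            pairs.append((int(sad_ref), int(sad_tgt)))
    return pairs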

Figure 2. Local temporal score as a function of reference SAD and target SAD values

Edge Loss Factor (ELF)

The Edge Loss Factor score component reflects how well edges in the reference image are preserved in the target image. This component uses the input image edge map, generated during part I of the pre-analysis. In part II of the pre-analysis, the strength of the edge at each edge point in the reference frame is calculated as the largest absolute difference between the edge pixel value and its 8 closest neighbors. We can optionally discard pixels considered false edges by comparing the reference frame edge strength of the pixel to a threshold, which can be adapted, for example, to be higher in a frame that contains film grain. Once values for all edge pixels have been accumulated, the final value is scaled to provide an ELF tile score component in the range [0,1], with 1 indicating perfect edge preservation.
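
The per-pixel edge strength described above reduces to a small neighborhood operation (Python/NumPy sketch; edge-map generation and the grain-adaptive false-edge threshold are omitted):

import numpy as np

def edge_strength(frame: np.ndarray, y: int, x: int) -> float:
    """Edge strength at (y, x): the largest absolute difference between the
    pixel and its 8 nearest neighbors. Assumes an interior pixel (y, x >= 1)."""
    center = float(frame[y, x])
    neighborhood = frame[y - 1:y + 2, x - 1:x + 2].astype(np.float64)
    return float(np.max(np.abs(neighborhood - center)))  # center itself contributes 0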

Combining the Score Components

The five tile score components described above are combined into a tile score using weighted geometric averaging, where the weights can be adapted according to the codec used or according to the pre-analysis stage. For example, in codecs with good in-loop deblocking filters we can lower the weight of the blockiness component, while in frames with high levels of film grain (as determined by the pre-analysis stage) we can reduce the weight of the texture distortion component. 
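
Numerically, the combination is a weighted geometric mean of the five components; a small sketch follows (Python/NumPy, with illustrative weights; real weights adapt to codec and pre-analysis as described above):

import numpy as np

def combine_tile_score(components: dict, weights: dict) -> float:
    """Weighted geometric average of per-tile score components, each in [0,1]."""
    total_weight = sum(weights.values())
    log_sum = sum(w * np.log(max(components[name], 1e-6))  # guard against log(0)
                  for name, w in weights.items())
    return float(np.exp(log_sum / total_weight))

score = combine_tile_score(
    {"similarity": 0.97, "aae": 0.98, "texture": 0.95, "temporal": 0.99, "elf": 0.96},
    {"similarity": 2.0, "aae": 1.0, "texture": 1.0, "temporal": 1.0, "elf": 1.0})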

Tile Pooling

In the final step of the frame quality score calculation, the tile scores are perceptually pooled to yield a single frame score value. The perceptual pooling uses weights which depend on importance (derived from the pre-analysis stages, such as the presence of face and/or skin in the tile) and on the complexity of the blocks in the tile compared to the average complexity of the frame. The weights also depend on the tile score values themselves: we give more weight to low-scoring tiles, in the same way that human viewers are drawn to quality drops even when they occur in isolated areas.

Score Configurator

The score configurator block is used to configure the calculations for different use cases. For example, in implementations where latency or performance are tightly bounded, the configurator can apply a fast score calculation which skips some of the stages of pre-analysis and uses a somewhat reduced complexity score. To still guarantee a perceptually identical result, the score calculated in this fast mode can be scaled or compensated to account for the slightly lower perceptual accuracy, and this scaling may in some cases slightly reduce savings.

To learn more about CABR, continue reading “A Deep Dive into CABR, Beamr’s Content-Adaptive Rate Control.”

Authors: Dror Gill & Tamar Shoham

A Deep Dive into CABR, Beamr’s Content-Adaptive Rate Control

Going Inside Beamr’s Frame-Level Content-Adaptive Rate Control for Video Coding

When it comes to video, the tradeoff between quality and bitrate is an ongoing dance. Content producers want to maximize quality for viewers, while storage and delivery costs drive the need to reduce bitrate as much as possible. Content-adaptive encoding addresses this challenge, by striving to reach the “optimal” bitrate for each unique piece of content, be it a full clip or a single scene. Our CABR technology takes it a step further by adapting the encoding at the frame level. CABR is a closed-loop content-adaptive rate control mechanism enabling video encoders to lower the bitrate of their encode, while simultaneously preserving the perceptual quality of the higher bitrate encode. As a low-complexity solution, CABR also works for live or real-time encoding. 

All Eyes are on Video

According to Grand View Research, the global video streaming market is expected to grow at a CAGR of 19.6% from 2019 to 2025. This shift, fueled by the increasing popularity of direct-to-consumer streaming services such as Netflix, Amazon, and Hulu; by the growth of video on social media networks and user-generated video platforms such as Facebook and YouTube; and by other applications like online education and video surveillance, has all eyes on video workflows. Efficient video encoding, in terms of encoding and delivery costs, and meeting viewers' rising quality expectations are therefore at the forefront of video service providers' minds. Beamr's CABR solution can reduce bitrates without compromising quality while keeping a low computational overhead to enhance video services.

Comparing Content-Adaptive Encoding Solutions

Instead of using fixed encoding parameters, content-adaptive encoding configures the video encoder according to the content of the video clip to reach the optimal tradeoff between bitrate and quality. Various content-adaptive encoding techniques have been used in the past to provide a better user experience at reduced delivery costs. Some have been entirely manual, with encoding parameters hand-tuned for each content category and sometimes, as in the case of high-volume Blu-ray titles, at the scene level. Manual content-adaptive techniques are restricted in the sense that they don't scale, and they don't provide granularity below the scene level.

Other techniques, such as those used by YouTube and Netflix, use “brute force” encoding of each title by applying a wide range of encoding parameters, and then by employing rate-distortion models or machine learning techniques, try to select the best parameters for each title or scene. This approach requires a lot of CPU resources since many full encodes are performed on each title, at different resolutions and bitrates. Such techniques are suitable for diverse content libraries that are limited in size, such as premium content including TV series and movies. These methods do not apply well to vast repositories of videos such as user-generated content, and are not applicable to live encoding.

Beamr’s CABR solution is different from the techniques described above in that it works in a closed-loop and adapts the encoding per frame. The video encoder first encodes a frame using a configuration based on its regular rate control mechanism, resulting in an initial encode. Then, Beamr’s CABR rate control instructs the encoder to encode the same frame again with various values of encoding parameters, creating candidate encodes. Using a patented perceptual quality measure, each candidate encode is compared with the initial encode, and then the best candidate is selected and placed in the output stream. The best candidate is the one that has the lowest bitrate but still has the same perceptual quality as the initial encode. 
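
Conceptually, the per-frame loop looks something like the Python sketch below. The names are ours for illustration: `encoder` and `quality_measure` are hypothetical stand-ins for the integrated encoder and the patented quality measure, and the real control logic for choosing candidate parameters is more elaborate, as described later in this post:

def cabr_encode_frame(encoder, quality_measure, frame,
                      max_iterations=3, quality_floor=0.9):
    """Sketch of the closed-loop search: start from the regular rate-control
    encode, try cheaper candidates, and keep the smallest one whose perceptual
    quality still matches the initial encode."""
    initial = encoder.encode(frame)              # regular rate-control encode
    best = initial
    qp = initial.average_qp
    for _ in range(max_iterations):
        qp += 1                                  # try more aggressive quantization
        candidate = encoder.reencode(frame, qp)  # cheap partial re-encode
        score = quality_measure(candidate.recon, initial.recon)
        if score < quality_floor:                # perceptual quality lost: stop
            break
        if candidate.bits < best.bits:
            best = candidate
    return best                                  # placed in the output stream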

Taking Advantage of Beamr’s CABR Rate Control

In order for Beamr’s CABR technology to encode video to the minimal bitrate and still retain the perceptual quality of a higher bitrate encode, it compresses each video frame to the maximum extent that provides the same visual quality when the video is viewed in motion. Figure 1 shows a block diagram of an encoding solution which incorporates CABR technology. 

Figure 1 – A block diagram of the CABR encoding solution

An integrated CABR encoding solution consists of a video encoder and the CABR rate control engine. The CABR engine comprises the CABR control module, responsible for managing the optimization process, and a module which evaluates video quality.

As seen in Figure 2, the CABR encoding process consists of multiple steps. Some of these steps are performed once for each encoding session, some are performed once for each frame, and some are performed for each iteration of candidate frame encoding. When starting a content-adaptive encoding session, the CABR engine and the encoder are initialized. At this stage, we set system-level parameters such as the maximum number of iterations per frame. Then, for each frame, the encoder rate control module selects the frame types by applying its internal logic.

Figure 2. A block diagram of a video encoder incorporating Content Adaptive Bit-Rate encoding.

The encoder provides the CABR engine with each original input frame for pre-analysis within the quality measure calculator. The encoder performs an initial encode of the frame, using its own logic for bit allocation, motion estimation, mode selections, Quantization Parameters (QPs), etc. After encoding the frame, the encoder provides the CABR engine with the reconstructed frame corresponding to this initially encoded frame, along with some side information – such as the frame size in bits and the QP selected for each MacroBlock or Coding Tree Unit (CTU). 

In each iteration, the CABR control module first decides if the frame should be re-encoded at all. This is done, for example, according to the frame type, the bit consumption of the frame, the quality of previous frames or iterations, and according to the maximum number of iterations set for the frame. In some cases, the CABR control module may decide not to re-encode a frame at all – in that case, the initial encoded frame becomes the output frame, and the encoder continues to the next frame. When the CABR control module decides to re-encode, the CABR engine provides the encoder with modified encoding parameters, for example, a proposed average QP for the frame, or the difference from the QP used for the initial encode. Note that the QP or delta QP values are an average value, and QP modulation for each encoding block can still be performed by the encoder. In more sophisticated implementations a QP map of value per encoding block may be provided, as well as additional encoder configuration parameters.

The encoder performs a re-encode of the frame with the modified parameters. Note that this re-encode is not a full encode, since it can utilize many encoding decisions from the initial encode. In fact, the encoder may perform only re-quantization of the frame, reusing all previous motion vectors and mode decisions. Then, the encoder provides the CABR engine with the reconstructed re-encoded frame, which becomes one of the candidate frames. The quality measure module then calculates the quality of the candidate re-encoded frame relative to the initially encoded frame, and this quality score, along with the bit consumption reported by the encoder is provided to the CABR control module, which again determines if the frame should be re-encoded. When that is the case, the CABR control module sets the encoding parameters for the next iteration, and the above process is repeated. If the control module decides that the search for the optimal frame parameters is complete, it indicates which frame, among all previously encoded versions of this frame, should be used in the output video stream. Note that the encoder rate control module receives its feedback from the initial encode of the current frame, and in this way the initial encode of the next frames (which determines the target quality of the bitstream) is not affected. 

The CABR engine can operate in either a serial iterative approach or a parallel approach. In the serial approach, the results from previous iterations can be used to select the QP value for the next iteration. In the parallel approach, all candidate QP values are provided simultaneously and encodes are done in parallel – which reduces latency.

Integrating the CABR Engine with Software & Hardware Encoders

Beamr has integrated the CABR engine into its AVC software encoder, Beamr 4, and into its HEVC software encoder, Beamr 5. However, the CABR engine can be integrated with any software or hardware video encoder, supporting any block-based video standard such as MPEG-2, AVC, HEVC, EVC, VVC, VP9, and AV1. 

To integrate the CABR engine with a video encoder, the encoder should support several requirements. First and foremost, the encoder should be able to re-encode an input frame (that has already been encoded) with several different encoding parameters (such as QP values), and save the “state” of each of these encodes, including the initial encode. The reason for saving the state is that when the CABR control module selects one of the candidate frame encodes (or the initial encode) as the one to use in the output stream, the encoder’s state should correspond to the state it was right after encoding that candidate frame. Encoders that support multi-threaded operation and hardware encoders typically have this capability, since each frame encode is performed by a stateless unit. 

Second, the encoder should support an interface to provide the reconstructed frame and the per block QP and bit consumption information for the encoded frame. To improve compute performance, we also recommend that the encoder supports a partial re-encode mode, where information related to motion estimation, partitioning and mode decisions found in the initial encode can be re-used for re-encoding without being computed again, and only the quantization and entropy coding stages are repeated for each candidate encode. This results in a minimal encoding efficiency drop for the optimized encoding result, with significant speed-up compared to full re-encode. As described above, we recommend that the encoder will use the initial encoded data (QPs, compressed size, etc.) for its Rate Control state update. However, the selected frame and accompanying data must be used for reference frames and other reference data, such as temporal MV predictors, as it is the only data available in the bitstream for decoding.
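
To make these requirements concrete, here is a hypothetical interface an encoder might expose to the CABR engine (a Python sketch; the names are ours for illustration, not an actual Beamr or encoder API):

from abc import ABC, abstractmethod

class CabrCapableEncoder(ABC):
    @abstractmethod
    def encode(self, frame):
        """Initial encode. Must return the reconstructed frame plus side
        information (per-block QPs, bit consumption) and snapshot encoder state."""

    @abstractmethod
    def reencode(self, frame, qp_delta):
        """Partial re-encode of the same frame: reuse motion estimation,
        partitioning and mode decisions; redo only quantization and entropy
        coding. Each candidate's state must also be saved."""

    @abstractmethod
    def select(self, candidate_id):
        """Commit the chosen candidate: restore its saved state so reference
        frames and predictors (e.g., temporal MV predictors) match the bitstream."""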

When integrating with hardware encoders that support parallel encoding with no increase in latency, we recommend using the parallel search approach where multiple QP values per frame are evaluated simultaneously. If the hardware encoder can perform parallel partial encodes (for example, re-quantization and entropy coding only), while all parallel encodes use the analysis stage of the initial encode, such as motion estimation and mode decisions, better CPU performance will be achieved. 

Sample Results

Below, we provide two sample results of the CABR engine, when integrated with Beamr 5, Beamr’s HEVC software encoder, each illustrating different aspects of CABR.

For the first example, we encoded various 4K 24 FPS source clips to a target bitrate of 10 Mbps. Sample frames from each of the clips can be seen in Figure 3. The clips vary in their content complexity: “Crowd Run” has very high complexity since it has great detail and very significant motion of the runners. “StEM” has medium complexity, with some video compression challenges such as different lighting conditions and reasonably high film grain. Finally, a promotional clip of JPEGmini by Beamr has low complexity due to relatively low motion and simple scenes.  

Figure 3. Sample frames from the test clips. Top: Crowd Run; bottom left: StEM; bottom right: JPEGmini.

We encoded 500 frames from each clip to a target bitrate of 10 Mbps, using the VBR mode of the Beamr 5 HEVC encoder, which performs regular encoding, and using the CABR mode, which creates a lower bitrate, perceptually identical stream. For the high complexity clip "Crowd Run," where providing excellent quality at such an aggressive bitrate is very challenging, CABR reduced the bitrate by only 3%. For the intermediate complexity clip "StEM," bitrate savings were higher and reached 17%. For the lowest complexity clip "JPEGmini," CABR reduced the bitrate by a staggering 45%, while still obtaining excellent quality that matches the quality of the 10 Mbps VBR encode. This wide range of bitrate reductions demonstrates the fully automatic content-adaptive nature of a CABR-enhanced encoder, which reaches a different final bitrate according to the content complexity.

The second example uses a 500 frame 1080p 24 FPS clip from the well-known “Tears Of Steel” movie by the Blender open movie project. The same clip was encoded using the VBR and CABR modes of the Beamr 5 HEVC software encoder, with three target bitrates: 1.5, 3 and 5 Mbps. Savings, in this case, were 13% for the lowest bitrate resulting in a 1.4 Mbps encode, 44% for the intermediate bitrate resulting in an encode of 1.8 Mbps, and 62% for the highest bitrate, resulting in a 2 Mbps encode. Figures 4 and 5 show sample frames from the encoded clips with VBR encoding on the left vs. CABR encoding on the right. The top two images are from encodes to a bitrate of 5 Mbps, while the bottom two were taken from the 1.5 Mbps encodes. As can be seen here, both 5 Mbps target encodes preserve the details, such as the texture of the bottom lip or the two hairs on the forehead above the right eye, while in the lower bitrate encodes these details are somewhat blurred. This is the reason that when starting from different target bitrates, CABR does not converge to the same bitrate. We also see, however, that the more generous the initial encoding, generally the more savings can be obtained. This example shows that CABR adapts not only to the content complexity, but also to the quality of the target encode, and preserves perceptual quality in motion while offering significant savings.

Figure 4. A sample from the “Tears of Steel” 1080p 24 FPS encode to 5 Mbps (top) and 1.5 Mbps (bottom), encoded in VBR mode (left) and CABR mode (right)

Figure 5. Closer view of the face in Figure 4, showing detail of lips and forehead from the encode to 5 Mbps (top) and 1.5 Mbps (bottom), encoded in VBR mode (left) and CABR mode (right).

To learn how our CABR solution leverages our patented quality measure, continue to “The patented visual quality measure that makes all the difference.”

Authors: Dror Gill & Tamar Shoham

How to deal with the tension on the mobile network – part 2 (VIDEO Interview)

In late July, I reported on the “news” that Verizon was throttling video traffic for some users. As usual, the facts around this seemingly punitive act were not fully understood, which triggered this blog post.

At IBC last month (September 2017), I was interviewed by RapidTV where much of the conversation was around the Apple news of their support for HEVC across the device ecosystem running iOS 11 and High Sierra. As I was reviewing this interview, it seemed natural to publish it as a follow up to the original post.

There is no doubt that mobile operators are under pressure as a result of the network-crushing volume of video traffic they are being forced to deliver. But the good news is that operators who adopt HEVC are going to enjoy significant bitrate efficiencies, possibly as high as 50%. And for many services, though they will choose to take some savings, this means they'll be able to upgrade their resolutions to full 1080p while simultaneously improving the video quality they deliver.

I hope you find this video insightful. Our team has a very simple evaluation offer to discuss with all qualified video services and video distributors. Just send an email to sales@beamr.com and we’ll get in touch with the details.

We Celebrate with Cake!

At Beamr, when we celebrate, we do it with cake!

Today's very special, and oh-so-yummy, cake celebration recognized the amazing milestone we reached on May 31, 2017, as the result of Beamr acquiring Vanguard Video on April 1, 2016. Our vision in buying Vanguard, a firmly entrenched leader in HEVC video encoding, was to combine Beamr's world-class content-adaptive optimization technology with the world's best HEVC encoder. The results, as we demonstrated at NAB 2017, are nothing short of breathtaking.

Can you imagine second-screen HD at 1.5 Mbps and 4K UHD with HDR at just 10 Mbps? With Beamr 5x, and now that Apple has enabled HEVC across its devices as announced at WWDC 2017, the time to move to HEVC is now, so your users can enjoy an enhanced UX and improved video quality.

Beamr 5x is available for private beta testing, contact us for more information.

Keep an eye out for all our news, because we’ve only just begun. The technology that we have introduced to the video encoding industry has set a new standard for performance and savings, and what the future holds is nothing short of earth shattering.

And yes, precisely 23 seconds after this picture was taken, this cake was unrecognizable!

How the Magic of Beamr Beats SSIM and PSNR

Every video encoding professional faces the dilemma of how best to detect artifacts and measure video quality. If you have the luxury of dealing with high-bitrate files, this is less of an issue, since for many videos throwing enough bits at the problem all but guarantees acceptably high video quality. However, for those living in the real world, where 3 Mbps is the average bitrate they must target, compressing at scale requires metrics (algorithms) to help measure and analyze the visual artifacts in a file after encoding. This process is becoming even more sophisticated as some tools enable a quality measure to feed back into the encoding decision matrix, but more commonly quality measures are used as part of the QC step. For this post, we are going to focus on quality measures used as part of the encoding process.

There are two common quality measures, PSNR and SSIM, that we will discuss, but as you will see there is a third: the Beamr quality measure, which the bulk of this article will focus on.

PSNR, the Original Objective Quality Measure

PSNR, or peak signal-to-noise ratio, represents the ratio between the highest power of an original signal and the power of the distortion. PSNR is one of the original engineering metrics used to measure the quality of image and video codecs. When comparing the quantitative quality of two files, such as an original and a compressed version, PSNR approximates the difference between the compressed version and the original. A significant shortcoming is that PSNR may indicate that the reconstruction is of suitably high quality when in some cases it is not. For this reason, a user must be careful not to hold its results in too high regard.
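
For reference, PSNR is simple to compute; a minimal sketch (Python/NumPy, assuming 8-bit frames):

import numpy as np

def psnr(reference: np.ndarray, distorted: np.ndarray, peak: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB between two same-sized 8-bit frames."""
    mse = np.mean((reference.astype(np.float64) - distorted.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # frames are identical
    return 10.0 * np.log10(peak ** 2 / mse)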

What is SSIM?

SSIM, or the structural similarity index, is a technique to predict the perceived quality of digital images and videos. The initial version was developed at the University of Texas at Austin, and the full SSIM routine was developed jointly with New York University's Laboratory for Computational Vision. SSIM is a perceptual-model-based algorithm that accounts for image degradation as a perceived change in structural information, while incorporating crucial perceptual details such as luminance and contrast masking. The difference compared with techniques like PSNR is that those approaches estimate absolute errors, whereas SSIM models perceived structural change.
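
The standard SSIM formula combines luminance, contrast, and structure terms; below is a single-window sketch (Python/NumPy). Production implementations compute it over small sliding windows and average the local scores:

import numpy as np

def ssim_single_window(x: np.ndarray, y: np.ndarray, peak: float = 255.0) -> float:
    """SSIM computed over the whole frame as one window, using the standard
    stabilizing constants C1 = (0.01*L)^2 and C2 = (0.03*L)^2."""
    c1, c2 = (0.01 * peak) ** 2, (0.03 * peak) ** 2
    x = x.astype(np.float64)
    y = y.astype(np.float64)
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov_xy = ((x - mu_x) * (y - mu_y)).mean()
    return (((2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)) /
            ((mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)))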

The basis of SSIM is the assumption that pixels have strong inter-dependencies, and these dependencies carry important information about the structure of the objects in the scene, the GOP, or adjacent frames. Put simply, structural similarity is used for computing the similarity of two images. SSIM is a full-reference metric: the computation of image quality is based on an uncompressed image as a reference. SSIM was developed as a step up from traditional methods such as PSNR (peak signal-to-noise ratio), which has proven to be uncorrelated with human vision. Yet unfortunately SSIM itself is not perfect and can be easily fooled, as shown by the following graphic, which illustrates a case where the original and compressed frames are visually very close, yet PSNR and SSIM score them as not similar, while Beamr and MOS (mean opinion score) show them as closely correlated.
Figure: original vs. compressed frame, scored by PSNR, SSIM, the Beamr quality measure, and MOS

Beamr Quality Measure

The Beamr quality measure is a proprietary, low-complexity, reliable, perceptually aligned quality measure. It enables controlling a video encoder to obtain an output clip with (near) maximal compression of the video input, while still maintaining the input video's resolution, format, and visual quality. This is performed by controlling the compression level of each frame, or GOP, in the video sequence so that each is compressed as deeply as possible while still resulting in a perceptually identical output.

The Beamr quality measure is also a full-reference measure, i.e., it indicates the quality of a recompressed image or video frame compared to a reference or original image or video frame. This matches the challenge our technology aims to tackle: reducing bitrates to the maximum extent possible without imposing any quality degradation from the original, as perceived by the human visual system. The Beamr quality measure calculation consists of two parts: a pre-process of the input video frames to obtain various score configuration parameters, and an actual score calculation performed per candidate recompressed frame. Following is a system diagram of how the Beamr quality measure interacts with an encoder.
Figure: system diagram of the Beamr quality measure interacting with a video encoder

Application of the Beamr Quality Measure in an Encoder

The Beamr quality measure, when integrated with an encoder, enables the bitrate of video files to be reduced by up to an additional 50% over current state-of-the-art, standard-compliant block-based encoders, without compromising image quality or changing the artistic intent. If you view a source video and a Beamr-optimized video side by side, they will look exactly the same to the human eye.

A question we get asked frequently is, "How do you perform the 'magic' of removing bits with no visual impact?"

Well, believe it or not, there is no magic here, just solid technology that has been in active development since 2009 and is now covered by 26 granted patents and over 30 additional patent applications.

When we first approached the task of reducing video bitrates based on the needs of the content and not a rudimentary bitrate control mechanism, we asked ourselves a simple starting question, “Given that the video file has already been compressed, how many additional bits can the encoder remove before the typical viewer would notice?”

There is a simple manual method of answering this question: take a typical viewer, show them the source video and the processed video side by side, and then start turning down the bitrate knob on the processed video by gradually increasing the compression. At some point, the viewer will say, "Stop! Now I can see the videos are no longer the same!"

At that point, turn the compression knob slightly backwards, and there you have it – a video clip with an acceptably lower bitrate than the source, just at the point before the average user can notice the visual differences.

Of course, I recognize what you are likely thinking: "Yes, this solution clearly works, but it doesn't scale!" And you are correct. Unfortunately, many academic solutions suffer from this problem. They make for good hand-built demos in carefully controlled environments with hand-picked content, but put them out in the "wild" and they fall down almost immediately. And I won't even go into the issues of varying perception among viewers of different ages, or across multiple viewing conditions.

Another problem with such a solution is that different parts of a video, such as different scenes and frames, require different bitrates. So the question is: how do you continually adjust the bitrate throughout the video clip, all the while confirming with your test viewer that the quality is still acceptable? Clearly, this is not feasible.

Automation to the Rescue

Today, it seems the entire world is being infected with artificial intelligence, which in many cases is not much more than automation that is smart and able to adapt to its environment. So we too looked for a way to automate this image analysis process: take a source video and find a way to reduce the "non-visible" bits in a fully automatic manner, with no human intervention involved. A suitable solution would enable the bitrate to vary continuously throughout the video clip based on the needs of the content at that moment.

What is CABR?

You’ve heard of VBR, or variable bitrate; Beamr has coined the term CABR, or content-adaptive bitrate, to describe the process just outlined, in which the encoder is adjusted at the frame level based on quality requirements rather than relying only on a bit budget to decide where bits are applied and how many are needed. But we understood that in order to accomplish the vision of CABR, we would need to be able to simulate the perception of a human viewer.

We needed an algorithm that would answer the question, “Given two videos, can a human viewer tell them apart?”  This algorithm is called a Perceptual Quality Measure and it is the very essence of what sets Beamr so far apart from every other encoding solution in the market today.

A quality measure is a mathematical formula that tries to quantify the differences between two video frames. To implement our video optimization technology, we could have used one of the well-known quality measures, such as PSNR (Peak Signal to Noise Ratio) or SSIM (Structural SIMilarity). But as already discussed, the problem with these existing quality measures is that they are simply not reliable enough: they do not correlate closely enough with human vision.
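Both of these classic measures are trivial to compute with off-the-shelf libraries, which is a large part of their popularity. For example, using scikit-image (the random frames here are just placeholder data):

```python
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

# Placeholder data: a synthetic "original" frame and a noisy "recompressed" one.
ref = np.random.randint(0, 256, (1080, 1920), dtype=np.uint8)
dist = np.clip(ref + np.random.normal(0, 3, ref.shape), 0, 255).astype(np.uint8)

print(peak_signal_noise_ratio(ref, dist, data_range=255))  # dB; higher is better
print(structural_similarity(ref, dist, data_range=255))    # 1.0 means identical
```

The catch, as noted above, is not the cost of computing PSNR or SSIM but their weak correlation with what viewers actually perceive.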

There are other, more sophisticated quality measures that correlate closely enough with human viewer opinions to be useful, but because they require extensive CPU power, they cannot be used in an encoding optimization process, which requires computing the quality measure several times for each input frame.

Advantages of the Beamr Quality Measure

Given the constraints of the available objective quality measures, we had no choice but to develop our own, and we developed it with a very focused goal: to identify and quantify the specific artifacts created by block-based compression methods.

All of the current image and video compression standards, including JPEG, MPEG-1, MPEG-2, H.264 (AVC) and H.265 (HEVC), are built upon block-based principles.

They divide an image into blocks, attempt to predict each block from previously encoded pixels, transform the block into the frequency domain, and quantize it.
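The sketch below shows the transform-and-quantize half of that pipeline on a single 8x8 block (prediction is omitted for brevity). The rounding in the quantization step is the point where information is discarded and where block-based artifacts such as blocking and ringing originate; the flat quantization step used here is a simplification of the per-frequency quantization matrices real codecs use.

```python
import numpy as np
from scipy.fft import dctn, idctn

def quantize_block(block: np.ndarray, qstep: float) -> np.ndarray:
    """Transform an 8x8 pixel block to the frequency domain, quantize the
    coefficients, and reconstruct -- the lossy core of block-based codecs."""
    coeffs = dctn(block.astype(np.float64) - 128.0, norm="ortho")
    quantized = np.round(coeffs / qstep)            # information is lost here
    return idctn(quantized * qstep, norm="ortho") + 128.0

block = np.random.randint(0, 256, (8, 8)).astype(np.float64)
coarse = quantize_block(block, qstep=40.0)          # large step -> visible artifacts
print(np.abs(block - coarse).max())                 # worst-case pixel error
```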

All of these steps create specific artifacts, which the Beamr quality measure is trained to detect and measure. So instead of looking for general deformations, such as out-of-focus images or missing pixels, as general quality measures do, we look specifically for the artifacts created by the video encoder.

This means that our quality measure is tightly focused and extremely efficient; as a result, its CPU requirements are much lower than those of quality measures that try to model the entire Human Visual System (HVS).

Beamr Quality Measure and the Human Visual System

After years of developing our quality measure, we put it to the test under the strict requirements of ITU-R BT.500, the international standard for subjective testing of image quality. We were happy to find that the correlation of our quality measure with subjective (human) results was extremely high.

When the testing was complete, we felt certain this revolutionary quality measure was ready for the task of accurately comparing two images for similarity, from a human point of view.

But compression artifacts are only part of the secret. When a human looks at an image or video, the eye and brain are drawn to particular places in the scene, for example, places where there is movement; in fact, we are especially “tuned” to capture details in faces.

Since our attention is focused on these areas, artifacts there are more disturbing than the same artifacts elsewhere in the image, such as in background regions or out-of-focus areas. For this reason, the Beamr quality measure takes attention into account, ensuring that when we measure quality, proper weight is given to the areas that require it.

Furthermore, the Beamr quality measure takes into account temporal artifacts introduced by the encoder: it is not sufficient to ensure that each individual frame is not degraded; it is also necessary to preserve the quality and feel of the video’s temporal flow.
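A heavily simplified sketch of those two ideas follows, assuming a per-pixel saliency map (faces, motion) is supplied by some detector; both the weighting scheme and the temporal term are illustrative stand-ins, not Beamr’s formulas. Spatial errors are pooled with saliency weights, and a temporal term penalizes candidates whose frame-to-frame changes diverge from the source’s.

```python
import numpy as np

def weighted_spatial_error(ref, cand, saliency):
    # Errors in salient regions (faces, motion) count more than the same
    # errors in background or out-of-focus areas.
    err = (ref.astype(np.float64) - cand.astype(np.float64)) ** 2
    return (saliency * err).sum() / saliency.sum()

def temporal_error(ref_prev, ref_cur, cand_prev, cand_cur):
    # Compare frame-to-frame change in the source against frame-to-frame
    # change in the candidate, to catch flicker and pulsing artifacts that
    # per-frame scores alone would miss.
    ref_delta = ref_cur.astype(np.float64) - ref_prev.astype(np.float64)
    cand_delta = cand_cur.astype(np.float64) - cand_prev.astype(np.float64)
    return np.mean((ref_delta - cand_delta) ** 2)
```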

The Magic of Beamr

With last year’s acquisition of Vanguard Video, many industry observers have gone public with the idea that the combination of our highly innovative quality measure, tightly integrated with the world’s best encoder, could lead to a real shake-up of the ecosystem.

We encourage you to see for yourself what is possible when the world’s most advanced perceptual quality measure becomes the rate-control mechanism for the industry’s best quality software encoder. Check out Beamr Optimizer.

Content-Adaptive Optimization is Bringing Next Level Performance to OTT and Broadcast Encoding Workflows

As the digital landscape continues to grow, it’s no surprise that the demand for a high-quality and reliable streaming video experience on mobile devices is increasing. In fact, Cisco reported that by 2019, video would represent 80% of all global consumer Internet traffic. That means that every second, nearly 1 million minutes of video content would cross the network, with more than 50% of this video traffic traversing content delivery networks (CDNs).

Given these trends, it’s more important than ever for video content distributors to pursue more efficient methods of encoding their video so they can adapt to the rapidly changing market, and this is where content-adaptive optimization can provide a huge benefit.

Recently popularized by Netflix and Google, content-adaptive encoding is built on the idea that not all videos are created equal in terms of their encoding requirements. I recently wrote a blog post on the subject; you can read it here:

https://www.linkedin.com/pulse/its-time-move-beyond-apple-tn2224-mark-donnigan

The concept is easy to explain but difficult to execute.   

Not every scene is created equal

Content-adaptive media optimization complements the encoding process by driving the encoder to the lowest bitrate possible based on the needs of the content, rather than a fixed target bitrate as used in a traditional encoding process.

This means that a content-adaptive solution can optimize more efficiently by analyzing already-encoded video at the frame and scene level, detecting areas of the video that can be further compressed without losing perceptual quality (e.g., slow-motion scenes, smooth surfaces).

Provided these calculations are performed at the frame level with an optimizer that contains a closed-loop perceptual quality measure, the output can be guaranteed to be the highest quality at the lowest possible bitrate.

Content-adaptive secret sauce

Reducing bitrate while maintaining perceptual quality may sound simple. In truth, it takes years of intensive research, extending from the innermost workings of encoding science to the study of block-based encoding artifacts. This is work Beamr has undertaken since 2009, and it is the reason why our method of reducing bitrate has been recognized by the industry as the highest quality and safest. The Beamr perceptual quality measure is so highly correlated with human vision that viewers cannot tell the difference between a source video file and one optimized with Beamr Video. Video samples can be seen on our homepage.

The magic of Beamr Video is that we apply the optimization process in a closed loop, making it possible to determine the subjective quality level of both the input and output video streams. As a result, the video encoding process is controlled by setting the compression parameters per video frame. This method guarantees the optimal bitrate for any type of content: for example, high-motion content and highly detailed textures receive more bits, whereas low-motion content with smoother textures receives fewer bits.
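Put together, the closed loop can be sketched as follows: for each frame, try a handful of candidate compression levels, score each decode against the source, and emit the smallest candidate that still clears the perceptual bar. Both encode_frame() and perceptual_score() are hypothetical stand-ins for a real encoder and a perceptual measure, and the QP range and 0.99 bar are arbitrary assumptions; high-motion, high-detail frames naturally keep more bits simply because they fail the bar at aggressive settings.

```python
def cabr_encode(frames, encode_frame, perceptual_score, bar=0.99,
                qps=range(20, 44, 4)):
    """Closed-loop, per-frame optimization sketch (illustrative only).

    encode_frame(frame, qp) -> (bitstream bytes, decoded frame)
    perceptual_score(ref, decoded) -> 1.0 means perceptually identical
    """
    for frame in frames:
        best = None
        for qp in qps:                              # coarser qp -> fewer bits
            bitstream, decoded = encode_frame(frame, qp)
            if perceptual_score(frame, decoded) >= bar:
                # Still perceptually identical to the source; keep the
                # cheapest such candidate seen so far.
                if best is None or len(bitstream) < len(best):
                    best = bitstream
        # If no candidate cleared the bar, fall back to the finest setting.
        yield best if best is not None else encode_frame(frame, min(qps))[0]
```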

Flexibility is key to content-adaptive media optimization technology, and it is what enables finding the best bitrate-quality balance. From a business perspective, the result is smaller files with the same (or better) quality, requiring less storage and enabling the delivery of high-quality video over congested networks.

How encoding and optimization work together

Since the content-adaptive optimization process is applied to files that have already been encoded, combining an industry-leading H.264 and HEVC encoder with the best optimization solution (Beamr Video) gives the market the highest quality video at the lowest possible bitrate. As a result, content providers can improve the end-user experience with high-quality video while meeting the growing network constraints caused by increased mobile consumption and general Internet congestion.

To dive deeper into the subject, we invite you to download The Case for Content-Adaptive Optimization whitepaper.