Connecting Virtual Reality with the History of Encoding Technology

This post reveals two fun and surprising facts about the brain that connect virtual reality with the history of encoding technology.

Bloomberg featured a Charlie Rose interview with Jeremy Bailenson, the founding director of Stanford University’s Virtual Human Interaction Lab. Not surprisingly, the lab houses some of the sharpest minds and most insightful datasets in its field: virtual reality.

It’s a 20-minute video that touches on some fascinating elements of VR – very few of which are about commercial television or sports entertainment experiences.

In fact, it is as much an interview about the brain, human interaction, and the physical body, as it is about media and entertainment.

As Jeremy says: “The medium [of VR] puts you inside of the media. It feels like you are actually doing something.”

Then, he states our first stunning fact about the brain, which illustrates why VR will be so impactful on modern civilization:

We can’t tell the difference!

Professor Bailenson: “The brain is going to treat that as if it is a real experience. Humans have been around for a very long time [evolving in the real world.] The brain hasn’t yet evolved to really understand the difference between a compelling virtual reality experience and a real one.”

The full video is here.

So there you have it. Our brains are nothing short of miraculous, but they’ve evolved some peculiar wiring to say the least. To put it bluntly, while humans are exceptionally clever in many ways, we’re not so much in others.

Which is the perfect segue into my second surprising factoid about the brain, and it’s taken 25 years for commercial video markets to exploit this fact!

To be fair, that’s not an exact statement, but here’s the timeline for reference.

According to Wikipedia, Cinepak was one of the very first commercial implementations of video compression technology. It made it possible to watch video from a CD-ROM. (Just typing the words taps into nostalgia.) Cinepak was released in 1991 and became part of Apple’s QuickTime toolset a year later.

It was 16 years later, in 2007, that the Video Quality Experts Group decided to create and benchmark a new metric that – while not perfect – served as a milestone for the video coding community. For the first time, there was a recognition that maximum compression required taking the biology of human vision into account when designing algorithms to shrink video files. Their perceptual metric was known as Perceptual Evaluation of Video Quality, and despite being impractical to implement, it became part of the International Telecommunication Union standards.

Then in 2009, Beamr was formed to solve the very real need to reduce file sizes while retaining quality. This need became evident after an encounter with a consumer technology company that indicated the massive cost of storage for digital media was inhibiting it from offering services that could extend the capacity of its devices. So we set out to solve the technical challenge of reducing redundant bits without compromising quality, and to do this in a fully automatic manner. The result? 50 patents have now been granted or are pending, and commercial implementations of our solution have been running on some of the largest new media and video distribution platforms for more than three years.

But beyond this, there is another subjective data point from Beamr’s experience over the last few quarters: many of the conversations and evaluations we are entering into about next-generation encoding are not limited to advanced codecs. Increasingly, they are about subjective quality metrics – leveraging our knowledge of the human vision system to remove bits from a compressed video file with no difference noticeable to the viewer.

As VR, 360, UHD, HDR and other exciting new consumer entertainment technologies are beginning to take hold in the market, never before has there been a greater need to advance the state of the art in the area of maximizing quality at a given bitrate. Beamr was the first company to step up to address and solve this challenge, and with our demonstrable quality, it’s not a stretch to suggest that we have the lead.

More information on Beamr’s software encoding and optimization solutions can be found at beamr.com

2016 Paves the Way for a Next-Gen Video Encoding Technology Explosion in 2017

2016 has been a significant year for video compression as 4K, HDR, VR and 360 video picked up steam, paving the way for an EXPLOSION of HEVC adoption in 2017. With its ability to reduce bitrates and file sizes by up to 50% compared with H.264, it is no surprise that HEVC has become the essential enabler of the high-quality, reliable streaming video powering all the new and exciting entertainment experiences being launched.
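To put that 50% figure in concrete terms, here is a quick back-of-the-envelope sketch. The bitrates chosen are illustrative assumptions, not measured values, and 50% is the upper bound quoted above; real savings vary by content.

```python
# Illustrative comparison of H.264 vs. HEVC delivery size for one title.

h264_bitrate_mbps = 16.0                       # assumed H.264 bitrate for a 4K stream
hevc_bitrate_mbps = h264_bitrate_mbps * 0.5    # up to 50% reduction with HEVC

movie_hours = 2.0
seconds = movie_hours * 3600

def stream_size_gb(bitrate_mbps: float) -> float:
    """Total data delivered for one viewing, in gigabytes."""
    return bitrate_mbps * seconds / 8 / 1000   # Mbit -> MB -> GB

h264_gb = stream_size_gb(h264_bitrate_mbps)
hevc_gb = stream_size_gb(hevc_bitrate_mbps)
print(f"H.264: {h264_gb:.1f} GB, HEVC: {hevc_gb:.1f} GB per 2-hour 4K title")
# H.264: 14.4 GB, HEVC: 7.2 GB
```

Multiply that per-title difference across a catalog and a subscriber base, and the motivation for the HEVC transition becomes obvious.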

Couple this with the latest announcement from HEVC Advance removing royalty uncertainties that plagued the market in 2016 and we have a perfect marriage of technology and capability with HEVC.

In this post we’ll look at 2016 through the lens of Beamr’s own product and company news, combined with notable trends that will shape 2017 in the advanced video encoding space.

>> The Market Speaks: Setting the Groundwork for an Explosion of HEVC

The State of 4K

With 4K content creation growing and the average selling price of UHD 4K TVs dropping (and being adopted faster than HDTVs were), 4K is here, and the critical mass of demand will follow closely. We recently did a little investigative research on the state of 4K and identified four of the most significant trends pushing its adoption by consumers:

  • The upgrade in picture quality is significant and will drive an increase in value to the consumer – and, most importantly, additional revenue opportunities for services as consumers are preconditioned to pay more for a premium experience. It only takes a few minutes viewing time to see that 4K offers premium video quality and enhances the entertainment experience.
  • Competitive forces are operating at scale – Service providers and OTT distributors will drive the adoption of 4K. MSOs are upping their game, and in 2017 you will see several deliver highly formidable services to take on pure-play OTT distributors. Who’s going to win, who’s going to lose? We think it’s going to be a win-win: services will be able to increase ARPU and reduce churn, while consumers will finally experience the full quality and resolution their new TV can deliver.
  • Commercially available 4K UHD services will be scaling rapidly –  SNL Kagan forecasts the number of global UHD Linear channels at 237 globally by 2020, which is great news for consumers. The UltraHD Forum recently published a list of UHD services that are “live” today numbering 18 VOD and 37 Live services with 8 in the US and 47 outside the US. Clearly, content will not be the weak link in UHD 4K market acceptance for much longer.
  • Geographic deployments — 4K is more widely deployed in Asia Pacific and Western Europe than in the U.S. today. But we see this as a massive opportunity: many people travel abroad, are exposed to the incredible quality, and return home to ask their service provider why they had to leave the country to see 4K. Which means that as soon as the planned services in the U.S. are launched, they will likely attract customers more quickly than we’ve seen in the past.

HDR adds WOW factor to 4K

High Dynamic Range (HDR) improves video quality by going beyond more pixels to increase the amount of data delivered by each pixel. HDR video is capable of capturing a larger range of brightness and luminosity to produce an image closer to what can be seen in real life. Show anyone HDR content encoded in 4K resolution, and it’s no surprise that content providers and TV manufacturers are quickly jumping on board to deliver content with HDR. Yes, it’s “that good.” There is no disputing that HDR delivers the “wow” factor that the market and consumers are looking for. But what’s even more promising is the industry’s overwhelmingly positive reaction to it. Read more here.

Beamr has been working with Dolby to enable Dolby Vision HDR support for several years now, even jointly presenting a white paper at SMPTE in 2014. The V.265 codec is optimized for Dolby Vision and HDR10 and takes into account all requirements for both standards including full support for VUI signaling, SEI messaging, SMPTE ST 2084:2014 and ITU-R BT.2020. For more information visit http://beamr.com/vanguard-by-beamr-content-adaptive-hevc-codec-sdk

Beamr is honored to have customers who are best in class and span OTT delivery, Broadcast, Service Providers and other entertainment video applications. From what we see and hear, studios are uber excited about HDR, cable companies are prepping for HDR delivery, Satellite distributors are building the capability to distribute HDR, and of course OTT services like Netflix, FandangoNow (formerly M-GO), VUDU, and Amazon are already distributing content using either Dolby Vision or HDR10 (or both). If your current video encoding workflow cannot fully support or adequately encode content with HDR, it’s time to update. Our V.265 video encoder SDK is a perfect place to start.

VR & 360 Video at Streamable Bitrates

360-degree video made a lot of noise in 2016.  YouTube, Facebook and Twitter added support for 360-degree videos, including live streaming in 360 degrees, to their platforms. 360-degree video content and computer-generated VR content is being delivered to web browsers, mobile devices, and a range of Virtual Reality headsets.  The Oculus Rift, HTC Vive, Gear VR and Daydream View have all shipped this year, creating a new market for immersive content experiences.

But, there is an inherent problem with delivering VR and 360 video on today’s platforms. In order to enable HD video viewing in your “viewport” (the part of the 360-degree space that you actually look at), the resolution of the full 360 video delivered to you should be 4K or more. On the other hand, the devices used to view this content today – desktops, mobile devices and VR headsets – only support H.264 video decoding. So delivering the high-resolution video content requires very high bitrates: twice as much as with the more modern HEVC standard.
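The arithmetic behind the viewport problem can be sketched in a few lines. The field of view and the bitrates below are illustrative assumptions, not measurements:

```python
# Back-of-the-envelope viewport math for 360 video (illustrative assumptions).
# A ~90-degree horizontal viewport covers about 1/4 of the 360-degree panorama,
# so a 4K-wide frame leaves well under HD resolution in front of the viewer.

full_width = 3840                    # width of the delivered 4K equirectangular frame
viewport_fov = 90                    # assumed headset horizontal field of view, degrees
viewport_px = full_width * viewport_fov // 360
print(viewport_px)                   # 960 pixels wide in the viewport -> below HD

# Assumed delivery bitrates for that full 4K frame (not measured values):
h264_mbps = 40.0
hevc_mbps = h264_mbps / 2            # HEVC needs roughly half the bits
print(f"H.264: {h264_mbps} Mbps vs. HEVC: {hevc_mbps} Mbps")
```

This is why "4K or more" is the floor for 360 delivery, and why halving the bitrate with HEVC matters so much in this use case.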

The current solution to this issue is to lower video quality in order to fit the H.264 video stream into a reasonable bandwidth. This creates a suboptimal experience for users, a factor that can discourage them from consuming this newly-available VR and 360 video content. But there’s one thing we know for sure – next-generation compression, including HEVC, content-adaptive encoding and perceptual optimization, will be a critical part of the final solution. Read more about VR and 360 here.

Patent Pool HEVC Advance Announces “Royalty Free” HEVC software

As 4K, HDR, VR and 360 video gather steam, Beamr has seen the adoption rate moving faster than expected. But with unanswered questions around royalties, and concerns over who would shoulder the cost burden, distributors have been tentative. The latest move by HEVC Advance to offer a royalty-free option is meant to encourage and accelerate the adoption (implementation) of HEVC by removing royalty uncertainties.

Internet streaming distributors and software application providers can be at ease knowing they can offer applications with HEVC software decoders without incurring onerous royalties or licensing fees. This is important as streaming app content consumption continues to increase, with more and more companies investing in its future.

By initiating a software-only royalty solution, HEVC Advance expects this move to push the rest of the market – i.e., device manufacturers and browser providers – to implement HEVC capability in their hardware and offer their customers the best and most efficient video experience possible.

 

>> 2017 Predictions

Mobile Video Services will Drive the Need for Content-adaptive Optimization

Given the trend toward better quality and higher resolution (4K), it’s more important than ever for video content distributors to pursue more efficient methods of encoding their video so they can adapt to the rapidly changing market, and this is where content-adaptive optimization provides a massive benefit.

The boundaries between OTT services and traditional MSO (cable and satellite) are being blurred now that all major MSOs include TVE (TV Everywhere streaming services with both VOD and Linear channels) in their subscription packages (some even break these services out separately, as is the case with SlingTV). And in October, AT&T CEO Randall Stephenson vowed that DirecTV Now would disrupt the pay-TV business with revolutionary pricing for an Internet-streaming service at a mere $35 per month for a package with more than 100 channels.

And get this – AT&T wireless is adopting the practice of “zero rating” for their customers, that is, they will not count the OTT service streaming video usage toward the subscriber’s monthly data plan. This represents a great value for customers, but there is no doubt that it puts pricing pressure on the operational side of all zero rated services.

2017 is the year that consumers will finally be able to enjoy linear as well as VOD content anywhere they wish even outside the home.

Beamr’s Contribution to MSOs, Service Providers, and OTT Distributors is More Critical Than Ever

When reaching consumers across multiple platforms, with different constraints and delivery cost models, Beamr’s content-adaptive optimizer tunes the encoding process to the most efficient quality and bitrate combination.

Whether you pay by the bit delivered to a traditional CDN provider, or operate your own infrastructure, the benefits of delivering less traffic are realized with improved UX such as faster stream start times and reduced re-buffering events, in addition to the cost savings. One popular streaming service reported to us that after implementing our content-adaptive optimization solution their rebuffering events as measured on the player were reduced by up to 50%, while their stream start times improved 20%.

Recently popularized by Netflix and Google, content-adaptive encoding is the idea that not all videos are created equal in terms of their encoding requirements. Content-adaptive optimization complements the encoding process by driving the encoder to the lowest bitrate possible based on the needs of the content, and not a fixed target bitrate (as seen in traditional encoding processes and products).

A content-adaptive solution can optimize more efficiently by analyzing already-encoded video on a frame-by-frame and scene-by-scene level, detecting areas of the video that can be further compressed without losing perceptual quality (e.g. slow motion scenes, smooth surfaces).

Provided the perceptual quality calculation is performed at the frame level with an optimizer that contains a closed loop perceptual quality measure, the output can be guaranteed to be the highest quality at the lowest bitrate possible. Click the following link to learn how Beamr’s patented content adaptive optimization technology achieves exactly this result.
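Beamr’s actual quality measure is proprietary, but the closed-loop idea described above can be sketched in a few lines. Here `encode()` and `perceptual_quality()` are hypothetical stand-ins for a real encoder and a real perceptual metric, and all numbers are illustrative:

```python
# Sketch of closed-loop content-adaptive optimization (hypothetical helpers).

def encode(source, bitrate_kbps):
    """Hypothetical: returns an encoded rendition at the given bitrate."""
    return {"source": source, "bitrate": bitrate_kbps}

def perceptual_quality(source, rendition):
    """Hypothetical 0..1 score; modeled here as rising with bitrate."""
    return min(1.0, rendition["bitrate"] / 4000.0)

def optimize(source, start_kbps=6000, quality_floor=0.95, step=0.85):
    """Lower the bitrate until perceptual quality would drop below the floor."""
    best = encode(source, start_kbps)
    while True:
        candidate = encode(source, best["bitrate"] * step)
        if perceptual_quality(source, candidate) < quality_floor:
            return best        # the previous rendition was the lowest safe bitrate
        best = candidate

result = optimize("clip.yuv")
print(round(result["bitrate"]))
```

The key property is the closed loop: the bitrate each clip ends up with is driven by the quality measure, not by a fixed target, so easy content lands at a much lower bitrate than complex content.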

Encoding and Optimization Working Together to Build the Future

Since the content-adaptive optimization process is applied to files that have already been encoded, combining an industry-leading H.264 and HEVC encoder with the best optimization solution (Beamr Video) gives the market the highest quality video at the lowest possible bitrate and file size. As a result, content providers can improve the end-user experience with high-quality video while meeting the growing network constraints caused by increased mobile consumption and general Internet congestion.

Beamr took a bold step toward delivering on this market requirement, disrupting the video encoding space in April 2016 when we acquired Vanguard Video – a premier video encoding technology company. This move will benefit the industry starting in 2017, when we introduce a new class of video encoder that we call a Content Adaptive Encoder.

As content-adaptive encoding techniques are adopted by major streaming services and video platforms like YouTube and Netflix, the market is gearing up for more advanced rate control and optimization methods, something that fits our perceptual quality measure technology perfectly. This trend, combined with Beamr having the best-in-class HEVC software encoder in the industry, will yield exciting benefits for the market. Read the Beamr Encoder Superguide, which details the most popular methods for performing content-adaptive encoding and how you can integrate them into your video workflow.

One Year from Now…

One year from now, when you read our post summarizing 2017 and heralding 2018, what you will likely hear is that 2017 was the year that advanced codecs like HEVC, combined with efficient perceptually-based quality measures such as Beamr’s, provided an additional 20% or greater bitrate reduction.

The ripple effect of this technology leap will be that services struggling to compete today on quality or bitrate may fall so far behind that they lose their ability to grow the market. We know of many MSO platforms that are gearing up to increase the quality of their video beyond the current best in class for OTT services. That’s right: they’ve watched the consumer response to new entrants offering superior video quality, and they are not sitting still. In fact, many are planning to leapfrog the competition through aggressive adoption of content-adaptive, perceptual-quality-driven solutions.

If any one service assumes they have the leadership position based on bitrate or quality, 2017 may prove to be a reshuffling of the deck.

For Beamr, the industry can expect to see an expansion of our software encoder line with the integration of our perceptual quality measure which has been developed over the last 7 years, and is covered by more than 50 patents granted and pending. We are proud of the fact that this solution has been shipping for more than 3 years in our stand-alone video and photo optimizer solutions.

It’s going to be an exciting year for Beamr and the industry and we welcome you to join us. If you are intrigued and would like to learn more about our products or are interested in evaluating any of our solutions, check us out at beamr.com.

Before you evaluate x265, read this!

Video consumption is rising and consumer preferences are shifting to 4K UHD, contributing to an even faster adoption rate than we saw with the move to HD TV. Consumer demand for a seamless (buffer-free) video experience is the new expectation, and with the latest announcement from HEVC Advance removing royalty uncertainties in the market, it’s time to start thinking about building and deploying an HEVC workflow – starting with a robust HEVC encoder.

As you may know, Beamr’s V.265 was the first commercially deployed HEVC codec SDK and it is in use today by the world’s largest OTT streaming service. Even still, we receive questions regarding V.265 in comparison to x265 and in this post we’d like to address a few of them.

In future posts, we will discuss the differences in two distinct categories, performance (speed) and quality, but in this post we’ll focus on feature-related differences between V.265 and x265.

Beginning with our instruction set, specifically support for x86/x64 SMP architecture: V.265 improves encoding performance by leveraging the resource-efficient symmetric multiprocessing architecture used by most multiprocessor systems today. In an SMP system, each processor can execute a different program on its own data set while sharing common resources (memory, the I/O and interrupt system, and so on) connected by a system bus or crossbar. The result is a notable increase in overall encoding speed for V.265 over x265. For any application where speed is important, V.265 will generally pull ahead as the winner.

Another area V.265 shines compared to x265 is with its advanced preprocessing algorithm support that provides resizing and de-interlacing. As many of you know, working with interlaced video can lead to poor video quality so to try and minimize the various visual defects V.265 uses a variety of techniques like line doubling where our smart algorithms are able to detect and fill in an empty row by averaging the line above and the line below. The advantages of having a resizing feature is recognizable, largely saving time and resources, and out of the box V.265 allows you to easily convert video from one resolution to another (i.e. 4K to HD). One note, we are aware that x265 supports these features via FFMPEG. However in the case that a user is not able to use FFMPEG, the fact that V.265 supports them directly is a benefit.
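The line-averaging idea can be illustrated with a toy example. This is a simplified sketch of the general technique, not V.265’s actual algorithm:

```python
# Toy illustration of deinterlacing by line averaging: fill each missing
# row with the average of the rows above and below it.

field = [              # one field: every other row of a tiny frame
    [10, 10, 10, 10],
    [30, 30, 30, 30],
]

frame = []
for i, row in enumerate(field):
    frame.append(row)
    nxt = field[i + 1] if i + 1 < len(field) else row   # last row: repeat itself
    frame.append([(a + b) // 2 for a, b in zip(row, nxt)])

print(frame[1])   # [20, 20, 20, 20] -- interpolated from the rows above and below
```

A production deinterlacer also has to detect whether the source is truly interlaced and handle motion between fields, which is where the "smart" part of the algorithm comes in.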

V.265 boasts an unmatched pre-analysis library with fading detection and complexity analysis capabilities not supported in x265. One application of the pre-analysis library is video segmentation, which is problematic with many encoders because of the different ways two consecutive shots may be linked. In V.265, the fading detection method identifies the type of gradual transition (fade type, etc.), which is needed to catch hard-to-recognize soft cuts. V.265’s complexity analysis can discriminate between temporal and spatial complexity in video sequences using patented multi-step motion estimation methods that are more advanced than standard “textbook” motion estimation algorithms. The information gained from complexity analysis is used during encoding to improve quality, especially during transitions between scenes.

One of the most significant features V.265 offers compared to x265 is multistreaming (ABR) support. V.265 can produce multiple GOP-aligned video output streams, which is extremely important when encoding for adaptive streaming. It is critical that all bitrates have IDR frames aligned to enable seamless stream switching, and V.265 provides exactly that.

Additionally, with V.265 users can produce multiple GOP-aligned HEVC streams from a single input. This is extremely important for use cases where a user has one chance to synchronize video of different resolutions and bitrates. Multistreaming helps provide encoded data to HLS or DASH packagers in an optimal way, and it delivers performance savings – especially when the service must output multiple streams of the same resolution at varying bitrates.
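The alignment requirement comes down to simple arithmetic: if every rendition in the ladder places an IDR frame at the same interval, segment boundaries line up and the player can switch streams cleanly. The frame rate, segment length, and bitrates below are assumed, typical values:

```python
# Why GOP alignment matters for ABR: IDR frames must land at the same
# frame indices in every rendition so segment boundaries line up.

fps = 30
segment_seconds = 2
keyint = fps * segment_seconds        # an IDR every 60 frames in every rendition

renditions = {"1080p": 6000, "720p": 3000, "480p": 1500}  # kbps, illustrative

def idr_positions(total_frames, interval):
    """Frame indices where an IDR is placed with a fixed keyframe interval."""
    return [f for f in range(total_frames) if f % interval == 0]

positions = {name: idr_positions(600, keyint) for name in renditions}

# Every rendition has IDRs at identical frame indices -> seamless switching.
aligned = all(p == positions["1080p"] for p in positions.values())
print(aligned)  # True
```

If any rendition drifted – say, because scene-cut detection inserted extra IDRs in one stream but not the others – segment boundaries would no longer match and mid-stream switching would break.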

Another significant feature V.265 has over x265 is its content-adaptive speed settings, which make codec configuration more convenient for different workflows, such as real-time versus VOD. Currently we offer presets ranging from ultra-fast, for extremely low-latency live broadcast streams, to the highest-quality VOD.

To combat packet loss and produce the most robust stream possible, V.265 supports slicing by compressed slice size, which produces encoded slices of limited size (typically the size of a network packet) for use on error-prone networks. This is an important feature for anyone distributing content on networks with highly variable QoS.

Continuing on to parallel processing features, V.265 offers support for tiles, which divide the frame into a grid of rectangular regions that can be independently encoded and decoded. Enabling this feature increases encoding performance.
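As a simple illustration of the tiling idea (the grid dimensions here are an arbitrary choice, not a V.265 default), a 4K frame split into a 4x2 grid yields eight regions that can be handed to separate worker threads:

```python
# Dividing a frame into a grid of independently codable tiles (illustrative).

width, height = 3840, 2160
tile_cols, tile_rows = 4, 2           # arbitrary example grid

# Each tile is (x, y, tile_width, tile_height).
tiles = [
    (col * width // tile_cols, row * height // tile_rows,
     width // tile_cols, height // tile_rows)
    for row in range(tile_rows)
    for col in range(tile_cols)
]

print(len(tiles))   # 8 tiles of 960x1080, one per worker
```

Because tiles break the intra-frame prediction dependencies between regions, each one can proceed in parallel, at a small cost in compression efficiency along the tile boundaries.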

V.265 is regarded as one of the most robust codecs in the market because of its ability to suit both demanding real-time and offline file-based workflows. To deliver the industry-leading quality that makes V.265 so powerful, it offers motion estimation features such as patented high-performance search algorithms and motion vectors over picture boundaries, providing additional quality improvements over x265.

For encoding by frame type, V.265 offers bi- and uni-directional non-reference P-frames, which are useful where low-delay encoding is needed and help improve temporal scalability.

As for encoding tools, V.265 offers a unique set of tools over x265:

  1. Joint bi-directional Motion Vector Search, an internal motion estimation technique that provides a better bi-directional motion vector search.
  2. Sub-LCU QP modulation, which allows the user to change QP from block to block inside an LCU to control in-frame bits and quality more precisely.
  3. Support for up to 4 temporal layers of multiple resolutions in the same bitstream, to help adapt to changing network conditions.
  4. Region of Interest (ROI) control, which allows encoding a specific ROI with a particular encoding parameter (QP) to add flexibility and improve encoding quality.

Another major advantage over x265 is the proprietary rate control implementation offered with V.265. This ensures target bitrates are always maintained.

Supplemental enhancement information (SEI) messages allow extra metadata to be carried to the decoder inside an encoded bitstream. For this reason, Beamr found it necessary to include in V.265 support for Recovery point, Field indication, Decoded Picture Hash, User data unregistered, and User data as specified by ITU-T T.35.

V.265’s ability to change encoding parameters on the fly is another extremely important feature that sets it apart from x265. With the ability to change encoder resolution, bitrate, and other key elements of the encoding profile, video distributors can achieve a significant advantage by creating recipes appropriate to each piece of content without needing to interrupt their workflows or processing cycles to reset and restart an encoder.

We trust this feature comparison was useful. If you require more information or would like to evaluate V.265, feel free to reach out to us at http://beamr.com/info-request and someone will get in touch to discuss your application and interest.

Patent Pool HEVC Advance Responds: Announces “Royalty Free” HEVC Software

HEVC Advance Releases New Software Policy

November 22nd, 2016 may be remembered as the day that wholesale adoption of HEVC as the preferred next-generation codec began. For companies like Beamr that are innovating on next-generation video encoding technologies such as HEVC, the news that HEVC Advance will drop royalties (license fees) on certain applications of its patents is huge.

In its press release, HEVC Advance, the patent pool for key HEVC technologies, stated that it will not seek a license fee or royalties on software applications that utilize the HEVC compression standard for encoding and decoding. The carve-out only applies to software that can run on commodity servers, but we think the restriction fits beautifully with where the industry is headed.

Did you catch that? NO HEVC ROYALTIES FOR SOFTWARE ENCODERS AND DECODERS!

Specifically, the policy will protect “application layer software downloaded to mobile devices or personal computers after the initial sales of the device, where the HEVC encoding or decoding is fully executed in software on a general purpose CPU” from royalty and licensing fees.

Requirements of Eligible Software

For those trying to wrap their heads around eligibility, the new policy outlines three requirements which the software products performing HEVC decoding or encoding must meet:

  1. Application layer software, or codec libraries used by application layer software, enabling software-only encoding or decoding of HEVC.
  2. Software downloaded after the initial sale of a related product (mobile device or desktop personal computer). If software that would otherwise qualify for the exclusion ships with a product, the manufacturer of that product must pay a royalty.
  3. Software must not be specifically excluded.

Examples of exempt software applications, where an HEVC decode royalty will likely not be due, include web browsers, personal video conferencing software, and video players provided by various internet streaming distributors or software application providers.

For more information check out https://www.hevcadvance.com/

Driven by the rise of private and public cloud encoding workflows, and provided an HEVC encoder meets the eligibility requirements, it appears that for many companies there will be no added cost to utilize HEVC in place of H.264.

A Much Needed Push for HEVC Adoption

As 4K, HDR, VR and 360 video gather steam, Beamr has seen the adoption rate moving faster than expected. But with unanswered questions around royalties, and concerns about the cost burden, even the largest distributors have been tentative. This move by HEVC Advance is meant to encourage and accelerate the adoption (implementation) of HEVC by removing uncertainties in the market.

Internet streaming distributors and software application providers can be at ease knowing they can offer applications with HEVC software decoders without incurring onerous royalties or licensing fees. This is important as streaming app content consumption continues to increase, with more and more companies investing in its future.

By initiating a software-only royalty solution, HEVC Advance expects this move to push the rest of the market – i.e., device manufacturers and browser providers – to implement HEVC capability in their hardware and offer their customers the best and most efficient video experience possible.

What this Means for a Video Distributor

Beamr is the leader in H.265/HEVC encoding. With 60 engineers around the world working at the codec level to produce the highest performing HEVC codec SDK in the market, Beamr V.265 delivers exceptional quality with much better scalability than any other software codec.

Industry benchmarks show that H.265/HEVC provides on average a 30% bitrate efficiency gain over H.264 at the same quality and resolution. Given the bandwidth pressure all networks are under to upgrade quality while minimizing the bits used, there is only one video encoding technology available at scale to meet the needs of the market, and that is HEVC.

The classic chicken and egg problem no longer exists with HEVC.

The challenge every new technology faces as it is introduced is the classic problem of needing to attract both implementers and users. In the case of a video encoding technology, no matter the benefits, it cannot be deployed without a sufficiently large playback ecosystem in the market.

But the good news is that over the last few years, as consumers have propelled the TV upgrade cycle forward, many have opted to purchase UHD 4K TVs.

Most of the 2015-2016 models of major-brand TVs have built-in HEVC decoders, and this trend will continue in 2017 and beyond. Netflix, Amazon, VUDU, and FandangoNow (formerly M-GO) ship their players on most models of UHD TVs, which are capable of decoding and playing back H.265/HEVC content from these services. These distributors were all able to utilize the native HEVC decoder in the TV, easing the complexity of launching a 4K app.

For those who wonder whether there is a sufficiently large ecosystem for HEVC playback, just look at the roughly 90 million 4K TVs in homes globally today (approximately 40 million of them in the US). And consider that in 2017 the number of 4K HEVC-capable TVs will nearly double to 167 million, according to Cisco, as illustrated below.

[Figure: Cisco VNI Global IP Traffic Forecast, 2015–2020]

The industry has spoken regarding the superior quality and performance of Beamr’s own HEVC encoder, and we will be providing benchmarks and documentation in future blog posts. Meanwhile our team of architects and implementation specialists who work with the largest service providers, SVOD consumer streaming services, and broadcasters in the world are ready to discuss your migration plans from H.264 to HEVC.

Just fill out our short Info Request form and the appropriate person will get in touch.

Immersive VR and 360 video at streamable bitrates: Are you crazy?

There have been many high-profile experiments with VR and 360 video in the past year. Immersive video is compelling, but large and unwieldy to deliver. This area will require huge advancements in video processing – including shortcuts and tricks that border on ‘magical’.

Most of us have experienced breathtaking demonstrations that provide a window into the powerful capacity of VR and 360 video – and into the future of premium immersive video experiences.

However, if you search the web for an understanding of how much bandwidth is required to create these video environments, you’re likely to get lost in a tangled thicket of theories and calculations.

Can the industry support the bitrates these formats require?

One such post on Forbes in February 2016 says No.

It provides a detailed mathematical account of why fully immersive VR will require each eye to receive 720 million pixels at 36 bits per pixel and 60 frames per second – or a total of 3.1 trillion bits per second. (1)

We’ve taken a poll at Beamr, and no one in the office has access to those kinds of download speeds. And some of these folks pay the equivalent of a part-time salary to their ISP!

Thankfully the Forbes article goes on to explain that it’s not quite that bad.

Existing video compression standards will, according to the author, improve this number by a factor of 300, and HEVC by a factor of 600, bringing it down to roughly 5.2 Gbps.
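Those figures multiply out cleanly. This snippet simply reproduces the article's stated assumptions, so you can check the arithmetic yourself:

```python
# Reproduce the Forbes article's raw-bandwidth arithmetic for uncompressed VR.
pixels_per_eye = 720e6    # pixels for full-coverage vision, per the article
bits_per_pixel = 36
frames_per_second = 60
eyes = 2

raw_bps = pixels_per_eye * bits_per_pixel * frames_per_second * eyes
print(f"Uncompressed: {raw_bps / 1e12:.1f} Tbps")   # 3.1 Tbps

hevc_compression = 600    # the author's assumed HEVC compression ratio
print(f"With HEVC: {raw_bps / hevc_compression / 1e9:.1f} Gbps")  # 5.2 Gbps
```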

The truth is, the calculations put forth in the Forbes piece are very ambitious indeed. As the author states:

“The ultimate display would need a region of 720 million pixels for full coverage because even though your foveal vision has a more narrow field of view, your eyes can saccade across that full space within an instant. Now add head and body rotation for 360 horizontal and 180 vertical degrees for a total of more than 2.5 billion (giga) pixels.”

A more realistic view of the way VR will roll out was presented by Charles Cheevers of network equipment vendor ARRIS at INTX in May of this year. (2)

Great VR experiences including a full 360 degree stereoscopic video environment at 4K resolutions could easily require a streaming bandwidth of 500 Mbps or more.

That’s still way too high, so what’s a VR producer to do?

Magical illusion, of course. 

In fact, just like your average Vegas magician, the current state of the art in VR delivery relies on tricks and shortcuts that leverage the imperfect way we humans see.

For example, Foveated Rendering can be used to aggressively compress the areas of a VR video where your eyes are not focused.

This technique alone, and variations on this theme, can take the bandwidth required by companies like NextVR dramatically lower, with some reports that an 8 Mbps stream can provide a compelling immersive experience. The fact is, there are endless ways to configure the end-to-end workflow for VR, and much will depend on the hardware, software, and networking environments in which it is deployed.
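To illustrate the idea (and only the idea; this is a simplified stand-in, not NextVR's actual method), a foveated scheme can weight each tile of the panorama by its angular distance from the gaze point, so tiles the viewer isn't looking at receive a much smaller share of the bit budget:

```python
import math

# Illustrative sketch of foveated bit allocation: tiles of a 360-degree
# panorama far from the viewer's gaze get a smaller share of the bit budget.
def angular_distance(a_deg, b_deg):
    """Shortest angular distance on a 360-degree circle."""
    d = abs(a_deg - b_deg) % 360
    return min(d, 360 - d)

def tile_weight(tile_center_deg, gaze_deg, fovea_deg=5.0, falloff_deg=20.0):
    """Full weight inside the foveal region, exponential decay outside it."""
    dist = angular_distance(tile_center_deg, gaze_deg)
    if dist <= fovea_deg:
        return 1.0
    return math.exp(-(dist - fovea_deg) / falloff_deg)

def allocate_bits(tile_centers_deg, gaze_deg, budget_kbps):
    """Split a fixed bit budget across tiles in proportion to their weights."""
    weights = [tile_weight(c, gaze_deg) for c in tile_centers_deg]
    total = sum(weights)
    return [budget_kbps * w / total for w in weights]

# Eight horizontal tiles across the panorama, gaze fixed at 90 degrees.
tiles = [22.5 + 45 * i for i in range(8)]
bits = allocate_bits(tiles, gaze_deg=90.0, budget_kbps=8000)
```

With this weighting, the two tiles nearest the gaze point absorb most of the 8 Mbps budget, which is exactly the kind of shortcut that makes a modest stream feel immersive.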

Other compression innovations are being tried as well: perceptual, frame-by-frame rate control methodologies, and approaches that map spherical images onto cubes and pyramids, transposing the image into 5 or 6 viewing planes so that the highest resolution is always on the plane where the eyes are most intensely focused. (3)

At the end of the day, it's going to be hard to pin down your nearest VR dealer on the amount of bandwidth required for a compelling VR experience. But one thing we know for sure: next-generation compression, including HEVC, content adaptive encoding, and perceptual optimization, will be a critical part of the final solution.

References:

(1) Found on August 10, 2016 at the following URL: http://www.forbes.com/sites/valleyvoices/2016/02/09/why-the-internet-pipes-will-burst-if-virtual-reality-takes-off/#ca7563d64e8c

(2) Start at 56 minutes: https://www.intxshow.com/session/1041/ — Information and a chart are also available online here: http://www.onlinereporter.com/2016/06/17/arris-gives-us-hint-bandwidth-requirements-vr/

(3) Facebook’s developer site gives a fascinating look at these approaches, which they call dynamic streaming techniques. Found on August 10, 2016 at the following URL:  https://code.facebook.com/posts/1126354007399553/next-generation-video-encoding-techniques-for-360-video-and-vr/

Translating Opinions into Fact When it Comes to Video Quality

This post was originally featured at https://www.linkedin.com/pulse/translating-opinions-fact-when-comes-video-quality-mark-donnigan 

In this post, we attempt to de-mystify the topic of perceptual video quality, which is the foundation of Beamr’s content adaptive encoding and content adaptive optimization solutions. 

National Geographic has a hit TV franchise on its hands. It's called Brain Games, starring Jason Silva, a talent described as "a Timothy Leary of the viral video age" by the Atlantic. Brain Games is accessible, fun, and accurate. It's a dive into brain science that relies on well-produced demonstrations of illusions and puzzles to showcase the power — and limitation — of the human brain. It's compelling TV that illuminates how we perceive the world. (Intrigued? Watch the first minute of this clip featuring Charlie Rose, Silva, and excerpts from the show: https://youtu.be/8pkQM_BQVSo )

At Beamr, we’re passionate about the topic of perceptual quality. In fact, we are so passionate, that we built an entire company based on it. Our technology leverages science’s knowledge about the human vision system to significantly reduce video delivery costs, reduce buffering & speed-up video starts without any change in the quality perceived by viewers. We’re also inspired by the show’s ability to turn complex things into compelling and accessible, without distorting the truth. No easy feat. But let’s see if we can pull it off with a discussion about video quality measurement which is also a dense topic.

Basics of Perceptual Video Quality

Our brains are amazing, especially in the way we process rich visual information. If a picture is worth 1,000 words, what is 60 frames per second in 4K HDR worth?

The answer varies based on what part of the ecosystem or business you come from, but we can all agree that it's really impactful. And data intensive, too. But our eyeballs aren't perfect, and neither are our brains, as Brain Games points out. That makes it odd that the TV business's established metrics for video compression quality have been built on the idea that human vision is mechanically perfect.

See, video engineers have historically relied on two key measures to evaluate the quality of a video encode: Peak Signal to Noise Ratio, or PSNR, and Structural Similarity, or SSIM. Both are 'objective' metrics. That is, we use tools to directly measure properties of the video signal and construct mathematical algorithms from that data to create metrics. But is it possible to really quantify a beautiful landscape with a number? Let's see about that.

PSNR and SSIM look at different physical properties of a video, but the underlying mechanics of both metrics are similar. You compress a source video, analyze specific properties of the original and of the encoded derivative, and calculate a score for the pair. The closer the two scores, the more similar the properties of the two videos, and the more confidently we can call our manipulation of the video, i.e. our encode, high or acceptable quality.
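PSNR, for instance, is nothing more than a function of the mean squared error between the source and encoded frames. A minimal version, shown here on flat 8-bit pixel lists for simplicity, makes the mechanics plain:

```python
import math

# PSNR for 8-bit video: a pure function of the mean squared error (MSE)
# between the source frame and the encoded frame, shown on flat pixel lists.
def psnr(original, encoded, max_value=255):
    mse = sum((o - e) ** 2 for o, e in zip(original, encoded)) / len(original)
    if mse == 0:
        return float("inf")  # identical frames: no measurable distortion
    return 10 * math.log10(max_value ** 2 / mse)

source  = [52, 55, 61, 59, 79, 61, 76, 61]   # made-up pixel values
encoded = [53, 55, 60, 59, 78, 62, 76, 60]
print(f"PSNR: {psnr(source, encoded):.1f} dB")  # 50.2 dB
```

Notice that the formula treats every pixel error identically, wherever it falls in the frame, which is precisely why the metric can disagree with what a human actually notices.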

Objective Quality vs. Subjective Quality


However, it turns out that these objectively calculated metrics do not correlate well with the human visual experience. In other words, in many cases humans cannot perceive variations that objective metrics highlight, while at the same time objective metrics can miss artifacts a human easily perceives.

The concept that human visual processing might be less than perfect is intuitive. It’s also widely understood in the encoding community. This fact opens a path to saving money, reducing buffering and speeding-up time-to-first-frame. After all, why would you knowingly send bits that can’t be seen?

But given the complexity of the human brain, can we reliably measure opinions about picture quality to know what bits can be removed and which cannot? This is the holy grail for anyone working in the area of video encoding.

Measuring Perceptual Quality

Actually, a rigorous, scientific, and peer-reviewed discipline has developed over the years to accurately measure human opinions about the picture quality on a TV. The math and science behind these methods are memorialized in an important ITU standard on the topic, ITU BT.500, whose most recent revisions were published in 2009 and 2012. (The International Telecommunication Union is the largest standards body in global telecom.) I'll provide a quick rundown.

First, a set of clips is selected for testing. A good test has a variety of clips with diverse characteristics: talking heads, sports, news, animation, UGC – the goal is to get a wide range of videos in front of human subjects.

Then, a subject pool of sufficient size is created and screened for 20/20 vision. They are placed in a light-controlled environment with a screen or two, depending on the set-up and testing method.

Instructions for one method are below, as a tangible example.

In this experiment, you will see short video sequences on the screen that is in front of you. Each sequence will be presented twice in rapid succession: within each pair, only the second sequence is processed. At the end of each paired presentation, you should evaluate the impairment of the second sequence with respect to the first one.

You will express your judgment by using the following scale:

5 Imperceptible

4 Perceptible but not annoying

3 Slightly annoying

2 Annoying

1 Very annoying

Observe carefully the entire pair of video sequences before making your judgment.
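The scores collected on that five-point impairment scale are typically reported as a Mean Opinion Score (MOS) with a 95% confidence interval. A minimal sketch of that reduction, using made-up ratings for a single clip:

```python
import statistics

# Mean Opinion Score (MOS) with a 95% confidence interval, as typically
# reported for BT.500-style subjective tests. The ratings below are made up.
def mos_with_ci(ratings):
    mean = statistics.mean(ratings)
    stdev = statistics.stdev(ratings)                 # sample standard deviation
    ci95 = 1.96 * stdev / len(ratings) ** 0.5         # normal-approximation CI
    return mean, ci95

ratings = [5, 4, 4, 5, 3, 4, 4, 5, 4, 4]   # hypothetical scores for one clip
mean, ci = mos_with_ci(ratings)
print(f"MOS = {mean:.2f} +/- {ci:.2f}")    # MOS = 4.20 +/- 0.39
```

In practice BT.500 also prescribes screening out inconsistent observers before this averaging step, which is part of what makes the methodology rigorous.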

As you can imagine, testing like this is an expensive proposition indeed. It requires specialized facilities, trained researchers, vast amounts of time, and a budget to recruit subjects.

Thankfully, the rewards were worth the effort for teams like Beamr that have been doing this for years.

It turns out that if you run these types of subjective tests, you'll find there are numerous ways to remove 20-50% of the bits from a video signal without losing the 'eyeball' video quality, even when objective metrics like PSNR and SSIM produce failing grades.

But most of the methods that have been tried remain stuck in academic institutions or research labs, because the complexity of upgrading or integrating them into the playback and distribution chain makes them unusable. Have you ever had to update 20 million set-top boxes? If you have, you know exactly what I'm talking about.

We know the broadcast and large-scale OTT industry, which is why, when we developed our approach to measuring perceptual quality and applied it to reducing bitrates, we insisted on staying 100% inside the AVC/H.264 and HEVC/H.265 standards.

By pioneering the use of perceptual video quality metrics, Beamr is enabling media and entertainment companies of all stripes to reduce the bits they send by up to 50%. This reduces re-buffering events by up to 50%, improves video start time by 20% or more, and reduces storage and delivery costs.

Fortunately, you now understand the basics of perceptual video quality. You also see why most of the video engineering community believes content adaptive encoding sits at the heart of next-generation encoding technologies.

Unfortunately, when we stated above that there were numerous ways to reduce bits by up to 50% without sacrificing 'eyeball' video quality, we skipped over some very important details, such as how to apply subjective testing techniques to an entire catalog of videos, at scale and cost efficiently.

Next time: Part 2 and the Opinionated Robot

Looking for better tools to assess subjective video quality?

You definitely want to check out Beamr's VCT, the best software player available on the market for judging HEVC, AVC, and YUV sequences in modes that are highly useful for a video engineer or compressionist.

VCT is available for Mac and PC. And best of all, we offer a FREE evaluation to qualified users.

Learn more about VCT: http://beamr.com/h264-hevc-video-comparison-player/


Will Virtual Reality Determine the Future of Streaming?

As video services take a more aggressive approach to virtual reality (VR), the question of how to scale and deliver this bandwidth intensive content must be addressed to bring it to a mainstream audience.

While we’ve been talking about VR for a long time you can say that it was reinvigorated when Oculus grabbed the attention of Facebook who injected 2 billion in investment based on Mark Zuckerberg’s vision that VR is a future technology that people will actively embrace. Industry forecasters tend to agree, suggesting VR will be front and center in the digital economy within the next decade. According to research by Canalys, vendors will ship 6.3 million VR headsets globally in 2016 and CCS Insights suggest that as many as 96 million headsets will get snapped up by consumers by 2020.

One of VR’s key advantages is the fact that you have the freedom to look anywhere in 360 degrees using a fully panoramic video in a highly intimate setting. Panoramic video files and resolution dimensions are large, often 4K (4096 pixels wide, 2048 pixels tall, depending on the standard) or bigger.

While VR is considered to be the next big revolution in the consumption of media content, we also see it popping up in professional fields such as education, health, law enforcement, defense, telecom, and media. It can provide a far more immersive live experience than TV by adding presence, the feeling that "you are really there."

Development of VR projects has already started to take off, and high-quality VR devices are surprisingly affordable. Earlier this summer, Google announced that 360-degree live streaming support was coming to YouTube.

Of course, all these new angles and this sharpness of imagery create new and challenging engineering hurdles, which we'll discuss below.

Resolution and Quality?

Frame rate, resolution, and bandwidth are all affected by the sheer volume of pixels that VR transmits. Developers and distributors of VR content will need to maximize frame rates and resolution throughout the entire workflow, while keeping up with the wide range of viewers' devices. Sporting events in particular demand precise detail and high frame rates, such as what we see with instant replay, slow motion, and 360-degree cameras.

In a recent Vicon industry survey, 28 percent of respondents stated that high-quality content was important to ensuring a good VR experience. Consider a simple file-size comparison: we already know that Ultra HD files take up considerably more storage space than SD, and the greater the file size, the greater the chance it will impede delivery. VR file sizes are no small potatoes. When you're talking about VR video, you're talking about four to six times the foundational resolution you are transmitting. And if you thought Ultra HD was cumbersome, think about how you're going to deal with resolutions beyond 4K for an immersive VR HD experience.

To catch up with these file sizes, we need to continue to develop video codecs that can quickly interpret the frame-by-frame data. HEVC is a great starting point, but frankly, given hardware device limitations, many content distributors are forced to continue using the H.264 codec. For this reason we must harness advanced tools in image processing and compression; one example of such an approach is content adaptive perceptual optimization.
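The "four to six times" figure mentioned above follows from simple geometry. If the headset shows only a slice of the sphere at a time, the full panorama must carry several times the pixels of a sharp viewport; the 90-degree field of view below is an assumed, typical value, not a spec from any particular device:

```python
# Rough arithmetic behind the "four to six times" figure: the viewer sees
# only a slice of the 360-degree panorama at once, so the full sphere must
# carry several times the pixels of a sharp viewport. The 90-degree field
# of view is an assumed, typical headset value.
viewport_width_px = 1920   # pixels across the visible field of view
fov_degrees = 90           # assumed horizontal field of view of the headset

panorama_width_px = viewport_width_px * 360 / fov_degrees
ratio = panorama_width_px / viewport_width_px
print(f"Full panorama width: {panorama_width_px:.0f} px ({ratio:.0f}x the viewport)")
```

Narrower fields of view push the multiplier toward six; and this only counts the horizontal dimension, before stereoscopic views double the payload again.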

I want my VR now! Reaching End Users

Video content comes in a variety of formats, including combinations of stereoscopic 3D, 360-degree panoramas, and spherical views, and they all bring obvious challenges such as added strain on processors, memory, and network bandwidth. Modern codecs use a variety of algorithms to quickly and efficiently detect similarities between frames, but they are usually tailored to 2D content. A content delivery mechanism must be able to send this video to every user, and it should be smart about optimizing the processing and transmission of the video.

Minimizing latency, how long can you roll the boulder up the hill?

We’ve seen significant improvements in the graphic processing capabilities of desktops and laptops. However, to take advantage of the immersive environment that VR offers, it’s important that high-end graphics are delivered to the viewer as quickly and smoothly as possible. The VR hardware also needs to display large images properly and with the highest fidelity and lowest latency. There really is very limited room for things like color correction or for adjusting panning from different directions for instance. If you have to stitch or rework artifacts, you will likely lose ground. You need to be smart about it. Typical decoders for tablets or smart TVs are more likely to cause latency and they only support lower framerates. This means how you build the infrastructure will be the key to offering image quality and life-like resolution that consumers expect to see.

Bandwidth, where art thou?

According to Netflix, an Ultra HD streaming experience requires an Internet connection of 25 Mbps or faster. However, according to Akamai, the average Internet speed in the US is only approximately 11 Mbps. Effectively, this prohibits live streaming on any typical mobile VR device, which may need a minimum of 25 Mbps to achieve the needed quality and resolution.

Most certainly, improvements in graphics processing and hardware will continue to drive forward the realism of immersive VR content, as the ability to render an image quickly becomes easier and cheaper. Just recently, Netflix jumped on the bandwagon and became the first of many streaming media apps to launch on Oculus' virtual reality app store. As soon as VR display devices are able to integrate these higher-resolution screens, we will see another step change in the quality and realism of virtual environments. But whether the available bandwidth will be sufficient is a very real question.

To understand the applications for VR, you really have to see it to believe it

A heart-warming campaign from Expedia recently offered children at a research hospital in Memphis, Tennessee the opportunity to be taken on a journey of their dreams through immersive, real-time virtual travel – all without getting on a plane: https://www.youtube.com/watch?time_continue=179&v=2wQQh5tbSPw

The National Multiple Sclerosis Society also launched a VR campaign that inventively used the tech to give two people with MS the opportunity to experience their lifelong passions. These are the type of immersive experiences we hope will unlock a better future for mankind. We applaud the massive projects and time spent on developing meaningful VR content and programming such as this.

Frost & Sullivan forecasts $1.5 billion in revenue from Pay TV operators delivering VR content by 2020. In my estimation, the adoption of VR is limited only by the quality of the user experience, as consumer expectations will no doubt be high.

For VR to really take off, the industry needs to address some of these challenges, making VR more accessible and, most importantly, filling it with unique and meaningful content. But it's hard to talk about VR without experiencing it. I suggest you try it – you will like it.

Applications for On-the-Fly Modification of Encoder Parameters

As video encoding workflows modernize to include content adaptive techniques, the ability to change encoder parameters “on-the-fly” will be required. With the ability to change encoder resolution, bitrate, and other key elements of the encoding profile, video distributors can achieve a significant advantage by creating recipes appropriate to each piece of content.

For VOD or file-based encoding workflows, the advantage of on-the-fly reconfigurability is that it enables content-specific encoding recipes without resetting the encoder and disrupting the workflow. At the same time, on-the-fly functionality is a necessary feature for supporting real-time encoding on a network with variable capacity: the application can take appropriate steps to react to changing bandwidth, network congestion, or other operational requirements.

Vanguard by Beamr V.264 AVC Encoder SDK and V.265 HEVC Encoder SDK have supported on-the-fly modification of the encoder settings for several years. Let’s take a look at a few of the more common applications where having the feature can be helpful.

On-the-fly control of Bitrate

Adjusting bitrate while the encoder is in operation is an obvious application. All Vanguard by Beamr codec SDKs allow the maximum bitrate to be changed via a simple "C-style" API. This enables bitrate adjustments based on available bandwidth, dynamic channel lineups, or other network conditions.
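As a sketch of how an application might drive such a control, consider the following. The Encoder class and set_max_bitrate name here are hypothetical stand-ins for a real SDK binding, not the actual Vanguard by Beamr API:

```python
# Illustrative sketch of driving an on-the-fly bitrate control. The Encoder
# class and set_max_bitrate() below are hypothetical stand-ins for a real
# SDK binding, not the actual Vanguard by Beamr API.
class Encoder:
    def __init__(self, max_bitrate_kbps):
        self.max_bitrate_kbps = max_bitrate_kbps

    def set_max_bitrate(self, kbps):
        """Apply a new bitrate cap without resetting the encoder session."""
        self.max_bitrate_kbps = kbps

def adapt_to_network(encoder, estimated_kbps, headroom=0.8):
    # Leave 20% headroom below the measured network throughput.
    encoder.set_max_bitrate(int(estimated_kbps * headroom))

enc = Encoder(max_bitrate_kbps=6000)
adapt_to_network(enc, estimated_kbps=4500)   # congestion detected mid-stream
print(enc.max_bitrate_kbps)                  # 3600
```

The key property, regardless of binding, is that the cap changes take effect mid-stream without restarting the encoder or interrupting the output.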

On-the-fly control of Encoder Speed

Encoder speed is an especially useful parameter to control, since it directly trades encoding quality against processing time. Calling this function triggers a different set of encoding algorithms and internal codec presets. This scenario applies to unicast transmissions, where a service may need to adjust the encoder speed for ever-changing network conditions and client device capabilities.

On-the-fly control of Video Resolution

A useful parameter to access on the fly is video resolution. One use case is in telecommunications, where end users may shift their viewing from a mobile device on a slow, congested cellular network to broadband WiFi or a hard-wired desktop computer. With control of video resolution, the encoder output can be changed during operation to accommodate the network speed or match the display resolution, all without interrupting the video program stream.

On-the-fly control of HEVC SAO and De-blocking Filter

HEVC presents additional opportunities for on-the-fly control of the encoder, and the Vanguard by Beamr V.265 encoder leads the market with the capability to turn SAO and de-blocking filters on or off to adjust quality and performance in real time.

On-the-fly control of HEVC multithreading

V.265 is recognized for its superior multithreading capability. The V.265 codec SDK provides access to add or remove encoding execution threads dynamically. This is an important feature for environments with a variable number of concurrent tasks, such as encoding that operates alongside a content adaptive optimization process or the ABR packaging step.

Beamr’s implementation of on-the-fly controls in our V.264 Codec SDK and V.265 Codec SDK demonstrate the robust design and scalable performance of the Vanguard by Beamr encoder software.

For more information on the Vanguard by Beamr codec SDKs, please visit the V.264 and V.265 pages, or visit http://beamr.com for more on the company and our technology.