Immersive VR and 360 video at streamable bitrates: Are you crazy?

There have been many high-profile experiments with VR and 360 video in the past year. Immersive video is compelling, but large and unwieldy to deliver. This area will require huge advancements in video processing – including shortcuts and tricks that border on ‘magical’.

Most of us have experienced breathtaking demonstrations that provide a window into the powerful capacity of VR and 360 video – and into the future of premium immersive video experiences.

However, if you search the web for an understanding of how much bandwidth is required to create these video environments, you’re likely to get lost in a tangled thicket of theories and calculations.

Can the industry support the bitrates these formats require?

One such post on Forbes in February 2016 says No.

It provides a detailed mathematical account of why fully immersive VR will require each eye to receive 720 million pixels at 36 bits per pixel and 60 frames per second – or a total of 3.1 trillion bits per second.1

We’ve taken a poll at Beamr, and no one in the office has access to those kinds of download speeds. And some of these folks pay the equivalent of a part-time salary to their ISP!

Thankfully the Forbes article goes on to explain that it’s not quite that bad.

According to the author, existing video compression standards can reduce this number by a factor of 300, and HEVC can compress it by a factor of 600 – down to what might be 5.2 Gbps.
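A quick back-of-the-envelope check of that arithmetic, using only the figures quoted above:

\[
\begin{aligned}
720\times10^{6}\ \text{pixels} \times 36\ \text{bits/pixel} \times 60\ \text{fps} &\approx 1.56\ \text{Tbps per eye}\\
2\ \text{eyes} \times 1.56\ \text{Tbps} &\approx 3.1\ \text{Tbps uncompressed}\\
3.1\ \text{Tbps} \div 600\ (\text{HEVC}) &\approx 5.2\ \text{Gbps}
\end{aligned}
\]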

The truth is, the calculations put forth in the Forbes piece are very ambitious indeed. As the author states:

“The ultimate display would need a region of 720 million pixels for full coverage because even though your foveal vision has a more narrow field of view, your eyes can saccade across that full space within an instant. Now add head and body rotation for 360 horizontal and 180 vertical degrees for a total of more than 2.5 billion (giga) pixels.”

A more realistic view of the way VR will roll out was presented by Charles Cheevers of network equipment vendor ARRIS at INTX in May of this year.2

A great VR experience – a full 360-degree stereoscopic video environment at 4K resolution – could easily require streaming bandwidth of 500 Mbps or more.

That’s still way too high, so what’s a VR producer to do?

Magical illusion, of course. 

In fact, just like your average Vegas magician, the current state of the art in VR delivery relies on tricks and shortcuts that leverage the imperfect way we humans see.

For example, Foveated Rendering can be used to aggressively compress the areas of a VR video where your eyes are not focused.

This technique alone – and variations on the theme – can take the bandwidth required by companies like NextVR dramatically lower, with some reports that an 8 Mbps stream can provide a compelling immersive experience (a rough sketch of the idea follows below). The fact is, there are endless ways to configure the end-to-end workflow for VR, and much will depend on the hardware, software, and networking environments in which it is deployed.
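To make the concept concrete, here is a minimal sketch of eccentricity-based bit allocation – not NextVR’s actual pipeline; the tile grid, QP range, and linear falloff are all invented for illustration:

```python
import math

def foveated_qp_map(tiles_x, tiles_y, gaze, qp_center=22, qp_edge=40, radius=0.25):
    """Assign a quantization parameter (QP) per tile: low QP (high quality)
    near the gaze point, high QP (aggressive compression) in the periphery.

    gaze is (x, y) in normalized [0,1] frame coordinates; radius controls
    how quickly quality falls off with distance from the gaze point."""
    qp_map = []
    for ty in range(tiles_y):
        row = []
        for tx in range(tiles_x):
            # Tile center in normalized coordinates.
            cx, cy = (tx + 0.5) / tiles_x, (ty + 0.5) / tiles_y
            dist = math.hypot(cx - gaze[0], cy - gaze[1])
            # Linear ramp from qp_center to qp_edge as distance grows.
            t = min(dist / radius, 1.0)
            row.append(round(qp_center + t * (qp_edge - qp_center)))
        qp_map.append(row)
    return qp_map

# Example: 8x4 tile grid, viewer looking slightly left of center.
for row in foveated_qp_map(8, 4, gaze=(0.4, 0.5)):
    print(row)
```

Lower QP means more bits; the periphery is quantized far more aggressively than the region the eyes are actually resolving.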

Other compression innovations are being tried as well: perceptual, frame-by-frame rate control methodologies, and mappings of spherical images onto cubes and pyramids that transpose the picture into 5 or 6 viewing planes, so that the highest resolution always lands on the plane where the eyes are most intensely focused.3
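For a feel of how the cube-mapping approach works (a simplified sketch, not Facebook’s actual implementation), the function below maps a 3D viewing direction to one of six cube faces plus in-face coordinates – the core step in any equirectangular-to-cubemap conversion:

```python
def direction_to_cube_face(x, y, z):
    """Map a 3D viewing direction to (face, u, v), where face is one of the
    six cube faces and u, v lie in [-1, 1] within that face.

    The face is chosen by the axis with the largest absolute component;
    dividing the other two components by it projects the direction onto
    that face's plane."""
    ax, ay, az = abs(x), abs(y), abs(z)
    if ax >= ay and ax >= az:
        face = "+x" if x > 0 else "-x"
        u, v = y / ax, z / ax
    elif ay >= az:
        face = "+y" if y > 0 else "-y"
        u, v = x / ay, z / ay
    else:
        face = "+z" if z > 0 else "-z"
        u, v = x / az, y / az
    return face, u, v

# A viewer looking mostly forward (+x) and slightly up lands on the +x face.
print(direction_to_cube_face(1.0, 0.1, 0.3))  # ('+x', 0.1, 0.3)
```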

At the end of the day, it’s going to be hard to pin down your nearest VR dealer on the amount of bandwidth required for a compelling VR experience. But one thing we know for sure: next-generation compression – HEVC, content adaptive encoding, and perceptual optimization – will be a critical part of the final solution.

References:

(1) Found on August 10, 2016 at the following URL: http://www.forbes.com/sites/valleyvoices/2016/02/09/why-the-internet-pipes-will-burst-if-virtual-reality-takes-off/#ca7563d64e8c

(2) Start at 56 minutes. https://www.intxshow.com/session/1041/ – Information and a chart are also available online here: http://www.onlinereporter.com/2016/06/17/arris-gives-us-hint-bandwidth-requirements-vr/

(3) Facebook’s developer site gives a fascinating look at these approaches, which they call dynamic streaming techniques. Found on August 10, 2016 at the following URL:  https://code.facebook.com/posts/1126354007399553/next-generation-video-encoding-techniques-for-360-video-and-vr/

4 Facts about 4K

We recently did a little investigative research on the state of 4K, and here are four highlights of what we found.

To start, as an industry, we’ve been anticipating 4K for a few years now, but it was just this past April that DIRECTV launched the first-ever Live 4K broadcast from the Masters Golf Tournament. Read more here:

http://ktla.com/2016/03/30/get-ready-for-4k-programming-with-directv/

In May Comcast EVP Matt Strauss spoke with Multichannel News about the company’s plans to begin distributing a 4K HDR capable Xi6 set-top box, but not until 2017.

http://www.multichannel.com/news/content/building-video-momentum/405085

And Comcast did broadcast the Olympics in 4K, but only to the Xfinity app built into a select set of smart TVs. Also, as with DIRECTV and DISH Network, the 4K signals were broadcast after a 24-hour delay, which I understand was caused mostly by content prep requirements.

Meanwhile, for VOD, Netflix and Amazon are in the game producing and delivering 4K content, while VUDU and FandangoNow also have a limited set of licensed content available for streaming delivery.

Watch Dave Ronca discuss Netflix 4K workflow and technology architecture at Streaming Media East.

As for linear 4K UHD options in the U.S. today, there are just a few TV channels available, and the only major operator offering a 24×7 4K UHD linear TV channel is DIRECTV. (There is also a small operator in Chattanooga, Tennessee with five 4K UHD channels.)

Given the seeming “lack of content,” and esoteric arguments that 4K isn’t easy to “actually see” because most screens are too small for the typical viewing distance in the home, you’d be excused for thinking that 4K is still a ways out.

But… our research took us to Best Buy, where the store is filled wall to wall with 4K UHD capable TVs.

Our conclusion?

Forget everything you’ve read: The upgrade in picture quality is real and it’s awesome.

And that brings us to the first key fact about 4K UHD:

  1. The upgrade in picture quality is significant – it will increase value to the consumer and drive additional revenue in return.

SNL Kagan released data in July 2016 showing that nearly two out of three service providers and content producers surveyed believe consumers are willing to pay more for 4K UHD content. (4K Global Industry Forecast, SNL Kagan, July 2016)

However, it’s important to note that this stunning picture quality isn’t simply resolution. In fact, as we’ll point out in an upcoming white paper, High Dynamic Range is probably as important a feature in today’s 4K UHD TVs as resolution.

HDR enables three key things. Most essentially, HDR captures the high contrast ratios – lighter lights and darker darks – that exist in the real world, so HDR images provide more ‘realism’ – to stunning effect. Second, HDR provides greater luminance (brighter images). And third, it offers a wider color gamut (redder reds and greener greens).

If that consumer benefit can translate into revenue impact, and we believe it will, it will drive accelerated service provider adoption – particularly given our second finding about 4K:

  2. Competitive forces operating at scale – among service providers and OTT providers – will drive the adoption of 4K.

Once 4K rollouts start, many in the business feel it will move lightning fast compared to the HD rollout. Why? Consolidation has created more scale in the TV market.

Add to the mix the competitive pressure from digital leaders like Netflix, which set a high video quality bar not only for OTT competitors but also for MVPDs.

Meanwhile, major video service providers have been aggressive in their efforts to dominate and extend their footprint into consumer homes. Fear and competition will drive decision making and actions at MVPDs as much as consumer delight.

All of the growth pressure described in #2 manifests itself in the growing forecasts for UHD linear TV channel launches.

  3. SNL Kagan forecasts 95 global UHD linear channels by the end of 2016 – and 237 globally by 2020.

Of course, this is a chicken-and-egg problem. Few consumers want to purchase 4K TVs if there isn’t enough content to be displayed on them.

But as Tim Bajarin of Creative Strategies points out, until 35-40% of homes have a 4K TV, the cable and broadcast networks won’t justify sizable numbers of 4K channel launches. [USA TODAY Jan 2 2016, “More 4K TV programming finally here in 2016”]

Which leads us to our fourth key fact about 4K UHD TV.

  4. Don’t forget about geography. 4K is already far more widely deployed in Asia Pacific and Western Europe than in the U.S.

It’s clear that 4K UHD is in the earliest stages of a commercial rollout. Yet it is surprising to see how far behind the U.S. is in 4K UHD channel launches, at least according to the SNL Kagan report previously referenced.

In that report, the North American region had just 12% of linear 4K UHD channels globally, compared with 42% in Asia Pacific, and 30% in Western Europe.

But as you think about the state of 4K and your company’s investment level – whether that be acquiring content rights, licensing HEVC encoders, or upgrading your network and streaming technologies to accommodate the increased bandwidth demands – don’t make the mistake of misreading the speed of adoption. Start acquiring content and building your 4K workflows now, because when the competitive pressure arrives to have a full 4K UHD offer (and it will come), you do not want to be scrambling.

Can we profitably surf the Video Zettabyte Tsunami?

Two key ingredients are in place. But we need to get started now.

In a previous post, we warned about the Zettabyte video tsunami – and the accompanying flood of challenges and opportunities for video publishers of all stripes, old and new. 

Real-life tsunamis are devastating. But California’s all about big wave surfing, so we’ve been asking this question: Can we surf this tsunami?

The ability to do so is going to hinge on economics. So a better phrasing is perhaps: Can we profitably surf this video tsunami?

Two surprising facts came to light recently that point to an optimistic answer, and so we felt it was essential to highlight them.

1. The first fact is about the Upfronts – and it provides evidence that 4K UHD content can drive growth in top-line sales for media companies.

The results from the Upfronts – the annual marketplace where networks sell ad inventory to premium brand marketers – provided TV industry watchers a major upside surprise. This year, the networks sold a greater share of ad inventory at their upfront events, and at higher prices too. As Brian Steinberg put it in his July 27, 2016 Variety article:1

“The nation’s five big English-language broadcast networks secured between $8.41 billion and $9.25 billion in advance ad commitments for primetime as part of the annual “upfront” market, according to Variety estimates. It’s the first time in three years they’ve managed to break the $9 billion mark. The upfront finish is a clear signal that Madison Avenue is putting more faith in TV even as digital-video options abound.”

Our conclusion? Beautiful, immersive content environments with a more limited number of high-quality ads can fuel new growth in TV. And 4K UHD, including the stunning impact of HDR, is where some of this additional value will surely come from.

Conventional wisdom is that today’s consumers are increasingly embracing ad-free SVOD OTT content from premium catalogs like Netflix, even when they have to pay for it. Since they are also taking the lead on 4K UHD content programming, that’s a great sign that higher value 4K UHD content will drive strong economics. But the data from the Upfronts also seems to suggest that premium ad-based TV content can be successful as well, especially when the Networks create immersive, clutter-free environments with beautiful pictures. 

Indeed, if the Olympics are any measure, Madison Avenue has received the message and turned up its game on the creative. I saw more than a few head-turning 30-second spots. Have you seen the Chobani ads in pristine HD? They’re as powerful as it gets.2

Check out this link to see the ads.

2. The second fact is about the operational side of the equation.

Can we deliver great content at a reasonable cost to a large enough number of homes?  On that front, we have more good news. 

The Internet in the United States is getting much faster. This, along with advanced methods of compression including HEVC, Content Adaptive Encoding and Perceptual Quality Metrics, will result in a ‘virtual upgrade’ of existing delivery network infrastructure. In particular, Ookla’s Speedtest.net published data on August 3, 2016 contained several stunning nuggets of information. But before we reveal the data, we need to provide a bit of context.

It’s important to note that 4K UHD content requires bandwidth of 15 Mbps or greater. Let’s be clear, this assumes Content Adaptive Encoding, Perceptual Quality Metrics, and HEVC compression are all used in combination. However, in Akamai’s State of the Internet report released in Q1 of this year, only 35% of the US population could access broadband speeds of 15 Mbps.

(Note: We have seen suggestions that 4K UHD content requires up to 25 Mbps. Compression technologies improve over time and those data points may well be old news. Beamr is on the cutting edge of compression and we firmly believe that 10 – 15 Mbps is the bandwidth needed – today – to achieve stunning 4K UHD audio visual quality.)

And that’s what makes Ookla’s data so important. Ookla found that in the first 6 months of 2016, fixed broadband customers saw a 42% year-over-year increase in average download speeds to a whopping 54.97 Mbps. Even more importantly, while 10% of Americans lack basic access to FCC target speeds of 25 Mbps, only 4% of urban Americans lack access to those speeds. This speed boost seems to be a direct result of industry consolidation, network upgrades, and growth in fiber optic deployments.

After seeing this news, we also decided to take a closer look at that Akamai data. And guess what we found? A steep slope upward from prior quarters (see chart below).

To put it back into surfing terms: Surf’s Up!
[Chart: Time-based trends in internet connection speeds and adoption rates]

References:

(1) “How TV Tuned in More Upfront Ad Dollars: Soap, Toothpaste and Pushy Tactics” Brian Steinberg, July 27, 2016: http://variety.com/2016/tv/news/2016-tv-upftont-networks-advertising-increases-1201824887/ 

(2)  Chobani ad examples from their YouTube profile: https://www.youtube.com/watch?v=DD5CUPtFqxE&list=PLqmZKErBXL-Nk4IxQmpgpL2z27cFzHoHu

Translating Opinions into Fact When it Comes to Video Quality

This post was originally featured at https://www.linkedin.com/pulse/translating-opinions-fact-when-comes-video-quality-mark-donnigan 

In this post, we attempt to de-mystify the topic of perceptual video quality, which is the foundation of Beamr’s content adaptive encoding and content adaptive optimization solutions. 

National Geographic has a hit TV franchise on its hands. It’s called Brain Games, starring Jason Silva, a talent described as “a Timothy Leary of the viral video age” by the Atlantic. Brain Games is accessible, fun and accurate. It’s a dive into brain science that relies on well-produced demonstrations of illusions and puzzles to showcase the power – and limitation – of the human brain. It’s compelling TV that illuminates how we perceive the world. (Intrigued? Watch the first minute of this clip featuring Charlie Rose, Silva, and excerpts from the show: https://youtu.be/8pkQM_BQVSo)

At Beamr, we’re passionate about the topic of perceptual quality. In fact, we are so passionate that we built an entire company on it. Our technology leverages science’s knowledge of the human vision system to significantly reduce video delivery costs, reduce buffering, and speed up video starts, without any change in the quality perceived by viewers. We’re also inspired by the show’s ability to make complex things compelling and accessible without distorting the truth. No easy feat. But let’s see if we can pull it off with a discussion of video quality measurement, which is also a dense topic.

Basics of Perceptual Video Quality

Our brains are amazing, especially in the way we process rich visual information. If a picture’s worth 1,000 words, what’s 60 frames per second in 4K HDR worth?

The answer varies based on what part of the ecosystem or business you come from, but we can all agree that it’s really impactful. And data intensive, too. But our eyeballs aren’t perfect and our brains aren’t either – as Brain Games points out. As such, it’s odd that established metrics for video compression quality in the TV business have been built on the idea that human vision is mechanically perfect.

See, video engineers have historically relied heavily on two key measures to evaluate the quality of a video encode: Peak Signal to Noise Ratio, or PSNR, and Structural Similarity, or SSIM. Both are ‘objective’ metrics. That is, we use tools to directly measure the physics of the video signal and construct mathematical algorithms from that data to create metrics. But is it possible to really quantify a beautiful landscape with a number? Let’s see about that.

PSNR and SSIM look at different physical properties of a video, but the underlying mechanics for both metrics are similar. You compress a source video, analyze specific properties of the “original” and the derivative, and calculate a metric for each. The more similar the two values, the more we can say that the properties of each video are similar – and the closer we can come to calling our manipulation of the video, i.e. our encode, high or acceptable quality.
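As a concrete, simplified illustration of those mechanics (a toy sketch, not Beamr’s production tooling), here is the standard PSNR calculation applied to a source frame and a noisy stand-in for its decoded encode:

```python
import numpy as np

def psnr(original, encoded, max_value=255.0):
    """Peak Signal to Noise Ratio between two frames, in dB.

    Higher PSNR means the encode is numerically closer to the original."""
    mse = np.mean((original.astype(np.float64) - encoded.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # identical frames
    return 10.0 * np.log10((max_value ** 2) / mse)

# Toy example: a synthetic "source" frame and a noisy "encode" of it.
rng = np.random.default_rng(0)
source = rng.integers(0, 256, size=(1080, 1920), dtype=np.uint8)
encode = np.clip(source + rng.normal(0, 3, source.shape), 0, 255).astype(np.uint8)
print(f"PSNR: {psnr(source, encode):.2f} dB")
```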

Objective Quality vs. Subjective Quality


However, it turns out that these objectively calculated metrics do not correlate well to the human visual experience. In other words, in many cases, humans cannot perceive variations that objective metrics can highlight while at the same time, objective metrics can miss artifacts a human easily perceives.

The concept that human visual processing might be less than perfect is intuitive. It’s also widely understood in the encoding community. This fact opens a path to saving money, reducing buffering and speeding-up time-to-first-frame. After all, why would you knowingly send bits that can’t be seen?

But given the complexity of the human brain, can we reliably measure opinions about picture quality to know what bits can be removed and which cannot? This is the holy grail for anyone working in the area of video encoding.

Measuring Perceptual Quality

Actually, a rigorous, scientific, and peer-reviewed discipline has developed over the years to accurately measure human opinions about the picture quality on a TV. The math and science behind these methods are memorialized in an important standard on the topic, ITU-R BT.500, most recently updated in 2012. (The International Telecommunication Union is the largest standards committee in global telecom.) I’ll provide a quick rundown.

First, a set of clips is selected for testing. A good test has a variety of clips with diverse characteristics: talking heads, sports, news, animation, UGC – the goal is to get a wide range of videos in front of human subjects.

Then, a subject pool of sufficient size is created and screened for 20/20 vision. They are placed in a light-controlled environment with a screen or two, depending on the set-up and testing method.

Instructions for one method are below, as a tangible example.

In this experiment, you will see short video sequences on the screen that is in front of you. Each sequence will be presented twice in rapid succession: within each pair, only the second sequence is processed. At the end of each paired presentation, you should evaluate the impairment of the second sequence with respect to the first one.

You will express your judgment by using the following scale:

5 Imperceptible

4 Perceptible but not annoying

3 Slightly annoying

2 Annoying

1 Very annoying

Observe carefully the entire pair of video sequences before making your judgment.
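Once the scores are collected, they are typically aggregated into a Mean Opinion Score (MOS) with a confidence interval. A minimal sketch of that aggregation step, with made-up ratings for illustration:

```python
import math

def mean_opinion_score(ratings):
    """Return (MOS, 95% confidence interval half-width) for a list of
    1-5 impairment ratings from a subjective test session."""
    n = len(ratings)
    mos = sum(ratings) / n
    variance = sum((r - mos) ** 2 for r in ratings) / (n - 1)
    ci95 = 1.96 * math.sqrt(variance / n)  # normal approximation
    return mos, ci95

# 24 viewers rate one encoded clip against its source.
ratings = [5, 4, 5, 4, 4, 5, 3, 4, 5, 4, 4, 5, 4, 3, 5, 4, 4, 4, 5, 4, 3, 5, 4, 4]
mos, ci = mean_opinion_score(ratings)
print(f"MOS = {mos:.2f} +/- {ci:.2f}")
```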

As you can imagine, testing like this is an expensive proposition indeed. It requires specialized facilities, trained researchers, vast amounts of time, and a budget to recruit subjects.

Thankfully, the rewards were worth the effort for teams like Beamr that have been doing this for years.

It turns out, if you run these types of subjective tests, you’ll find that there are numerous ways to remove 20 – 50% of the bits from a video signal without losing the ‘eyeball’ video quality – even when the objective metrics like PSNR and SSIM produce failing grades.

But most of the methods that have been tried are still stuck in academic institutions or research labs. This is because the complexities of upgrading or integrating the solution into the playback and distribution chain make them unusable. Have you ever had to update 20 million set-top boxes? Well if you have, you know exactly what I’m talking about.

We know the broadcast and large-scale OTT industry, which is why, when we developed our approach to measuring perceptual quality and applied it to reducing bitrates, we insisted on staying 100% inside the H.264/AVC and H.265/HEVC standards.

By pioneering the use of perceptual video quality metrics, Beamr is enabling media and entertainment companies of all stripes to reduce the bits they send by up to 50%. This reduces re-buffering events by up to 50%, improves video start time by 20% or more, and reduces storage and delivery costs.

Fortunately, you now understand the basics of perceptual video quality. You also see why most of the video engineering community believes content adaptive encoding sits at the heart of next-generation encoding technologies.

Unfortunately, when we stated above that there were numerous ways to reduce bits by up to 50% without sacrificing ‘eyeball’ video quality, we skipped over some very important details – such as how to apply subjective testing techniques to an entire catalog of videos, at scale and cost-efficiently.

Next time: Part 2 and the Opinionated Robot

Looking for better tools to assess subjective video quality?

You definitely want to check out Beamr’s VCT, the best software player available on the market for judging HEVC, AVC, and YUV sequences in modes that are highly useful for a video engineer or compressionist.

VCT is available for Mac and PC. And best of all, we offer a FREE evaluation to qualified users.

Learn more about VCT: http://beamr.com/h264-hevc-video-comparison-player/

 

VCT, the Secret to Confident Subjective Video Quality Testing

We can all agree that analyzing video quality is one of the biggest challenges in evaluating codecs. Companies use a combination of objective and subjective tests to validate encoder efficiency. In this post, I’ll explore why it is difficult to measure video quality with quantitative metrics alone, because those metrics fail to match the subjective quality perception of the human eye.

Furthermore, we’ll look at why it’s important to equip yourself with the best resources when doing subjective testing, and how Beamr’s VCT visual comparison tool can help you with video quality testing.

But first, if you haven’t done so already, be sure to download your free trial of VCT here.

OBJECTIVE TESTING

The most common objective measurement used today is pixel-based Peak Signal to Noise Ratio (PSNR). PSNR is a popular test because it is easy to calculate, and nearly everyone working in video is familiar with interpreting its values. But it does have limitations. Typically a higher PSNR value correlates to higher quality and a lower PSNR value to lower quality. However, since this test measures pixel-based mean-squared error over an entire frame, reducing the quality of a frame (or collection of frames) to a single number does not always parallel true subjective quality.
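For reference, PSNR is derived from the mean squared error (MSE) between the source frame I and the encoded frame K; for 8-bit video the peak value MAX is 255:

\[
\mathrm{MSE} = \frac{1}{MN}\sum_{i=1}^{M}\sum_{j=1}^{N}\bigl(I(i,j) - K(i,j)\bigr)^{2},
\qquad
\mathrm{PSNR} = 10\,\log_{10}\!\left(\frac{\mathrm{MAX}^{2}}{\mathrm{MSE}}\right)\ \text{dB}
\]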

PSNR gives equal weight to every pixel in the frame and each frame in a sequence, ignoring many factors that affect human perception. For example, below are two encoded images of the same frame.1 Image (a) and Image (b) have the same PSNR, which should theoretically correlate to two encoded images of the same quality. However, it is easy to see the difference in perceived quality in this example: viewers would rate Image (a) as significantly higher quality than Image (b).

Example: 

[Image: Two encodes with identical PSNR values but clearly different perceived quality – an example of why PSNR shouldn’t be the absolute measurement for assessing video quality]

Due to the failure of error-based methods like PSNR to adequately mimic human visual perception, other methods for analyzing video quality have been developed, including the Structural Similarity Index Metric (SSIM), which measures structural distortion. Unlike PSNR, SSIM addresses image degradation as the perceived change in three major aspects of an image: luminance, contrast, and structure. SSIM has gained popularity, but as with PSNR, it has its limitations. Studies have suggested that SSIM’s performance is equal to PSNR’s, and some have cited evidence of a systematic relationship between SSIM and Mean Squared Error (MSE).2
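For completeness, the commonly used single-scale form of SSIM combines those luminance, contrast, and structure comparisons over image windows x and y, with C1 and C2 as small stabilizing constants:

\[
\mathrm{SSIM}(x,y) = \frac{(2\mu_x\mu_y + C_1)(2\sigma_{xy} + C_2)}{(\mu_x^{2} + \mu_y^{2} + C_1)(\sigma_x^{2} + \sigma_y^{2} + C_2)}
\]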

While SSIM and other quantitative measures, including multi-scale structural similarity (MS-SSIM) and the Sarnoff Picture Quality Rating (PQR), have made significant gains, none can truly deliver the same assurance as subjective evaluation using the human eye. It is also important to note that the two most widely used objective quality metrics mentioned above, PSNR and SSIM, were designed to evaluate static image quality. This means that both algorithms provide no meaningful information regarding motion artifacts, thereby limiting their effectiveness for video.

SUBJECTIVE TESTING

While objective methods attempt to model human perception, there is no substitute for subjective “golden-eye” tests. But we are all familiar with the drawbacks of subjective analysis, including variance in individual quality perception and the difficulty of executing proper subjective tests in 100% controlled viewing environments so that a large number of testers can participate. Evaluating video using subjective visual tests can reveal key differences that may not get caught by objective measures alone, which is why it is important to use a combination of both objective and subjective testing methodologies.

One of the logistic difficulties of performing subjective quality comparisons is coordinating simultaneous playback of two streams. Recognizing some of the drawbacks of current subjective evaluation methods, in particular single-stream playback or awkward dual-stream review workarounds, Beamr spent years in research and development to build a tool that offers simultaneous playback of two videos with various comparison modes, to significantly improve the golden-eye test execution necessary to properly evaluate encoder efficiency.

Powered by our professional HEVC and H.264 codec SDK decoders, the Beamr video comparison tool VCT allows encoding engineers and compressionists to play back two frame-synchronized, independent HEVC, H.264, or YUV sequences simultaneously, and to compare the quality of these streams in four modes:

  1. Split screen
  2. Side-by-side
  3. Overlay
  4. Butterfly (the newest mode)

MPEG2-TS and MP4 files containing either HEVC or H.264 elementary streams are also supported. Additionally, VCT displays valuable clip information such as bit-rate, screen resolution, frame rate, number of frames, and other important video information.

Developed in 2012, VCT was the industry’s first internal software player offered as a tool to help Beamr customers conduct subjective testing while evaluating our encoder’s efficiency. Today, VCT has been tested by many content and equipment companies around the world in multiple markets, including broadcast, mobile, and internet streaming, making it the de facto standard for subjective golden-eye video quality testing and evaluation.

VCT BENEFITS AND TIPS

Your FREE trial of VCT will come with an extensive user guide that contains everything you need to get started. But we know you are eager to begin your testing, so following are a few quick tips we trust you will find useful. Take advantage of this “golden” opportunity and get started today!

Note: use Command (⌘) instead of Ctrl for the OS X version of VCT.

  1. Split Screen Comparison Mode:
    • Benefits:
      • Great for viewing two clips when only one screen is available.
      • A moving slider bar allows you to clearly see the quality difference between two streams in your desired region of interest. For example, you can move the slider bar back and forth across a face to see quality differences between two discrete files.
    • Pro Tips:
      • Use the keyboard shortcut Ctrl + \ to re-center the slider bar after it is moved.
      • Shortcut key Ctrl + Tab allows you to change which video appears on the left or right of the slider bar.

[Image: VCT split screen comparison mode for subjective video quality assessment]

 

  2. Side-by-side Comparison Mode:
    • Benefits:
      • Great for tradeshows; solves the lack of synchronization in side-by-side comparison tests that use two independent players.
      • Single control for both streams.
    • Pro Tip:
      • Shortcut key Ctrl + Tab allows you to change which video appears on which screen without moving the windows.

[Image: VCT side-by-side comparison mode for subjective video quality assessment]

 

  3. Overlay Comparison Mode:
    • Benefits:
      • Great for viewing the full frame of one stream on a single window.
    • Tips:
      • Shortcut key Ctrl + Tab allows you to cycle between the two videos. Cycling quickly is a great way to spot quality differences between the two streams that you might otherwise not notice.

[Image: VCT overlay comparison mode for subjective video quality assessment]

 

  4. Butterfly Comparison Mode:
    • Benefits:
      • Very useful for determining the accuracy of the encoding process. Butterfly mode displays mirrored images of two sequences to help you assess whether an artifact seen in an encoded sequence is also present in the source.
    • Tips:
      • Use shortcut key Ctrl + \ to reset the frame to the leftmost view, and shortcut Ctrl + Alt + \ to switch to the rightmost view in butterfly mode.
      • Use shortcut keys Ctrl + [ and Ctrl + ] to move the image left/right in butterfly mode.

[Image: VCT butterfly comparison mode for subjective video quality assessment]

  5. Other Useful Tips:
    • Ctrl + m allows you to toggle through the 4 comparison modes.
    • Shift + Left Click opens the magnifier tool that allows you to zoom into hard to see areas of the video.
    • Easily scale frames of different resolutions to the same resolution by clicking “scale to same look” on the main menu.
    • The NEW automatic download feature on the splash screen notifies you of the latest version updates to ensure you’re always up to date.
    • For more great features, be sure to check out the VCT user guide: beamr.com/vct/userguide.com.

 

Reference:

(1)   P. M. Arun Kumar and S. Chandramathi. Video Quality Assessment Methods: A Bird’s-Eye View

(2)   Richard Dosselmann and Xue Dong Yang. A Formal Assessment of the Structural Similarity Index

Will Virtual Reality Determine the Future of Streaming?

As video services take a more aggressive approach to virtual reality (VR), the question of how to scale and deliver this bandwidth intensive content must be addressed to bring it to a mainstream audience.

While we’ve been talking about VR for a long time, you could say it was reinvigorated when Oculus grabbed the attention of Facebook, which acquired the company for $2 billion on the strength of Mark Zuckerberg’s vision that VR is a future technology people will actively embrace. Industry forecasters tend to agree, suggesting VR will be front and center in the digital economy within the next decade. According to research by Canalys, vendors will ship 6.3 million VR headsets globally in 2016, and CCS Insight suggests that as many as 96 million headsets will be snapped up by consumers by 2020.

One of VR’s key advantages is the freedom to look anywhere in 360 degrees within a fully panoramic video, in a highly intimate setting. Panoramic video files and resolution dimensions are large – often 4K (4096 pixels wide by 2048 pixels tall, depending on the standard) or bigger.

While VR is considered the next big revolution in the consumption of media content, we also see it popping up in professional fields such as education, health, law enforcement, defense, telecom, and media. It can provide a far more immersive live experience than TV by adding presence – the feeling that “you are really there.”

Development of VR projects has already started to take off, and high-quality VR devices are surprisingly affordable. Earlier this summer, Google announced that 360-degree live streaming support was coming to YouTube.

Of course, all these new angles and this sharpness of imagery create new and challenging engineering hurdles, which we’ll discuss below.

Resolution and Quality

Frame rate, resolution, and bandwidth are all affected by the sheer volume of pixels that VR transmits. Developers and distributors of VR content will need to maximize frame rates and resolution throughout the entire workflow, and keep up with the wide range of viewers’ devices. Sporting events in particular demand precise detail and high frame rates – think instant replay, slow motion, and 360-degree cameras.

In a recent Vicon industry survey, 28 percent of respondents stated that high-quality content was important to ensuring a good VR experience. Let’s think about simple file size comparisons: we already know that Ultra HD file sizes take up considerably more storage space than SD, and the greater the file size, the greater the chance it will impede delivery. VR file sizes are no small potatoes. When you’re talking about VR video, you’re talking about four to six times the foundational resolution you are transmitting. And if you thought Ultra HD was cumbersome, think about dealing with resolutions beyond 4K for an immersive VR HD experience.
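That four-to-six-times figure is easy to sanity-check with rough, assumption-laden numbers: a full panorama spans 360°×180°, while a headset viewport covers something on the order of 100°×100°, so

\[
\frac{360^{\circ} \times 180^{\circ}}{100^{\circ} \times 100^{\circ}} \approx 6.5,
\]

meaning the full-sphere frame carries several times the pixels of what the viewer actually sees at any instant; practical projections and overlap bring the effective multiplier into the 4-6x range.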

To catch up with these file sizes, we need to continue developing video codecs that can quickly interpret the frame-by-frame data. HEVC is a great starting point, but frankly, given hardware device limitations, many content distributors are forced to continue using H.264. For this reason we must harness advanced tools in image processing and compression – content adaptive perceptual optimization is one example of such an approach.

I want my VR now! Reaching End Users

Video content comes in a variety of file formats, including combinations of stereoscopic 3D, 360-degree panoramas, and spherical views – all of which bring obvious challenges such as added strain on processors, memory, and network bandwidth. Modern codecs use a variety of algorithms to quickly and efficiently detect similarities across frames, but they are usually tailored to 2D content. A content delivery mechanism must be able to send this video to every user, and it should be smart enough to optimize the processing and transmission of that video.

Minimizing latency, how long can you roll the boulder up the hill?

We’ve seen significant improvements in the graphics processing capabilities of desktops and laptops. However, to take advantage of the immersive environment that VR offers, it’s important that high-end graphics are delivered to the viewer as quickly and smoothly as possible. The VR hardware also needs to display large images properly, with the highest fidelity and lowest latency. There really is very limited room for things like color correction or adjusting panning from different directions; if you have to stitch or rework artifacts, you will likely lose ground, so you need to be smart about it. Typical decoders for tablets or smart TVs are more likely to introduce latency, and they only support lower frame rates. How you build the infrastructure will be the key to offering the image quality and life-like resolution consumers expect to see.

Bandwidth, where art thou?

According to Netflix, an Ultra HD streaming experience requires an Internet connection of 25 Mbps or higher. However, according to Akamai, the average Internet speed in the US is only approximately 11 Mbps. Effectively, this prohibits live streaming on any typical mobile VR device, which may need a minimum of 25 Mbps to achieve the necessary quality and resolution.

Most certainly the improvements in graphic processing and hardware will continue to drive forward the realism of the immersive VR content, as the ability to render an image quickly becomes easier and cheaper. Just recently, Netflix jumped on the bandwagon and became the first of many streaming media apps to launch on Oculus’ virtual reality app store. As soon as all the VR display devices are able to integrate with these higher resolution screens, we will see another step change in the quality and realism of virtual environments. But will the available bandwidth be sufficient, is a very real question. 

To understand the applications for VR, you really have to see it to believe it

A heart-warming campaign from Expedia recently offered children at a research hospital in Memphis, Tennessee the opportunity to be taken on the journey of their dreams through immersive, real-time virtual travel – all without getting on a plane: https://www.youtube.com/watch?time_continue=179&v=2wQQh5tbSPw

The National Multiple Sclerosis Society also launched a VR campaign that inventively used the tech to give two people with MS the opportunity to experience their lifelong passions. These are the type of immersive experiences we hope will unlock a better future for mankind. We applaud the massive projects and time spent on developing meaningful VR content and programming such as this.

Frost & Sullivan forecasts $1.5 billion in revenue from pay TV operators delivering VR content by 2020. In my estimation, the adoption of VR is limited only by the quality of the user experience, as consumer expectations will no doubt be high.

For VR to really take off, the industry needs to address these challenges, making VR more accessible and – most importantly – pairing it with unique and meaningful content. But it’s hard to talk about VR without experiencing it. I suggest you try it – you will like it.

Applications for On-the-Fly Modification of Encoder Parameters

As video encoding workflows modernize to include content adaptive techniques, the ability to change encoder parameters “on-the-fly” will be required. With the ability to change encoder resolution, bitrate, and other key elements of the encoding profile, video distributors can achieve a significant advantage by creating recipes appropriate to each piece of content.

For VOD or file-based encoding workflows, the advantage of on-the-fly reconfigurability is enabling content-specific encoding recipes without resetting the encoder and disrupting the workflow. At the same time, on-the-fly functionality is a necessary feature for supporting real-time encoding on a network with variable capacity, so the application can react to changing bandwidth, network congestion, or other operational requirements.

Vanguard by Beamr V.264 AVC Encoder SDK and V.265 HEVC Encoder SDK have supported on-the-fly modification of the encoder settings for several years. Let’s take a look at a few of the more common applications where having the feature can be helpful.

On-the-fly control of Bitrate

Adjusting bitrate while the encoder is in operation is an obvious application. All Vanguard by Beamr codec SDKs allow the maximum bitrate to be changed via a simple “C-style” API, enabling bitrate adjustments based on available bandwidth, dynamic channel lineups, or other network conditions.
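To illustrate the general pattern of on-the-fly reconfiguration (a hypothetical sketch in Python – not the actual Vanguard by Beamr C API; every name below is invented for illustration), an application-facing wrapper might queue setting changes and apply them at the next frame boundary:

```python
class OnTheFlyEncoder:
    """Hypothetical encoder wrapper illustrating on-the-fly reconfiguration.

    Pending settings are applied between frames so the stream is never
    interrupted; a real SDK would expose analogous setters via its C API."""

    def __init__(self, width, height, max_bitrate_kbps):
        self.width, self.height = width, height
        self.max_bitrate_kbps = max_bitrate_kbps
        self._pending = {}

    def set_max_bitrate(self, kbps):
        self._pending["max_bitrate_kbps"] = kbps  # applied at next frame

    def set_resolution(self, width, height):
        self._pending["width"] = width
        self._pending["height"] = height

    def encode_frame(self, frame):
        # Apply any pending reconfiguration before encoding this frame.
        for key, value in self._pending.items():
            setattr(self, key, value)
        self._pending.clear()
        # ... actual encoding would happen here ...
        return f"encoded {self.width}x{self.height} @ {self.max_bitrate_kbps} kbps"

encoder = OnTheFlyEncoder(1920, 1080, max_bitrate_kbps=6000)
print(encoder.encode_frame(b"frame-1"))
encoder.set_max_bitrate(3500)       # e.g. congestion detected
encoder.set_resolution(1280, 720)   # e.g. viewer moved to a cellular network
print(encoder.encode_frame(b"frame-2"))
```

The key design point is applying pending changes only at frame boundaries, which is what keeps the video program stream uninterrupted.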

On-the-fly control of Encoder Speed

Encoder speed control is an especially useful parameter, as it translates directly into video quality and encoding processing time. Calling this function triggers a different set of encoding algorithms and internal codec presets. This scenario applies to unicast transmissions, where a service may need to adjust the encoder speed for ever-changing network conditions and client device capabilities.

On-the-fly control of Video Resolution

Another useful parameter to access on the fly is video resolution. One use case is in telecommunications, where the end user may shift viewing from a mobile device on a slow, congested cellular network to a broadband WiFi network or a hard-wired desktop computer. With control of video resolution, the encoder output can be changed during operation to accommodate the network speed or to match the display resolution, all without interrupting the video program stream.

On-the-fly control of HEVC SAO and De-blocking Filter

HEVC presents additional opportunities for on-the-fly control of the encoder, and the Vanguard by Beamr V.265 encoder leads the market with the ability to turn SAO and de-blocking filters on or off to adjust quality and performance in real time.

On-the-fly control of HEVC multithreading

V.265 is recognized for superior multithreading capability. The V.265 codec SDK provides access to add or remove encoding execution threads dynamically – an important feature for environments with a variable number of concurrent tasks, such as encoding running alongside a content adaptive optimization process or the ABR packaging step.

Beamr’s implementation of on-the-fly controls in our V.264 Codec SDK and V.265 Codec SDK demonstrates the robust design and scalable performance of the Vanguard by Beamr encoder software.

For more information on the Vanguard by Beamr codec SDKs, please visit the V.264 and V.265 pages, or visit http://beamr.com for more on the company and our technology.

How HDR, Network Function Virtualization, and IP Video are Shaping Cable

Beamr just returned from the Internet & Television Expo (INTX), previously known as the Cable Show, where we identified three technology trends that are advancing rapidly – and, in some cases, are already here: HDR, Network Function Virtualization, and IP video.

HDR (High Dynamic Range) is probably the most exciting innovation in display technology in recent years.

There is a raging debate about resolution – “are more pixels really better?” – but there is no debating the visual impact of HDR. That’s why it’s great to see TVs that can display HDR reaching lower and lower price points, with better and better performance. However, being able to display HDR is not enough: without content, there is no impact.

For this reason, Comcast EVP and CTO Tony Werner’s announcement at INTX – that on July 4th Comcast will ship its Xi5 STB to meet NBC Universal’s schedule of transmitting select Olympic events in HDR – is a huge deal. Though broadcast content available in HDR will be limited at first, once Comcast has a sufficiently high number of HDR set-top boxes in the field, and as consumers buy more HDR-enabled TVs, the HDR bit will flip from zero to one and we’ll wonder how we ever watched TV without it.

Virtualization is coming – and for some cable companies, it’s already here.

Though on the surface NFV (Network Function Virtualization) may be thought of as nothing more than the cable industry moving its data centers to the cloud, it’s actually much more than that. NFV offers an alternative way to design, deploy, and manage networking services by allowing network functions to run in software rather than in traditional, “purpose-built” hardware appliances. In turn, this alleviates the limitations of designing networks around “fixed” hardware appliances, giving network architects far more flexibility.

There are two places in the network where the efficiencies of virtualization can be leveraged: access and video. By digitizing access, the virtual CCAP removes the physical CCAP and CMTS completely, allowing the DOCSIS control plane to be virtualized. Distributing the PHY and the MAC is a critical step, but separating their functions is ground zero for virtualization.

Access virtualization is exciting, but what’s of great interest to those involved in video is virtualizing the video workflow from ingest to play out. This includes the encoding, transcoding, ad insertion, and packaging steps and is mainly tailored for IP video, though one cable operator took this approach to the legacy QAM delivery by leveraging converged services for IP and QAM. In doing this, the operator is able to simplify their video ingest workflow.

By utilizing a virtualized approach, operators can build more agile and flexible video workflows using “best of breed” components, meaning they can hand-pick the best transcoder, packager, etc. from separate vendors if needed. It also allows operators to select the best codec and video optimizer solutions – the processes considered the most crucial parts of the video ingest workflow, since the biggest IP (intellectual property) is in the video processing, not in packaging, DRM, etc. With content adaptive encoding and optimization solutions being introduced in the last few years, an operator with a virtualized video workflow is free to add innovations as they come to market. Gone are the days when service providers were forced to buy an entire solution from one vendor built on proprietary, customized hardware.

With the IT industry (CPU, networking, storage) having made tremendous progress in running video processors, packagers, and streamers as software-only solutions on standard COTS hardware, virtualization helps vendors focus on their core expertise, whether that is video processing, workflow, streaming, or advertising.

Virtualization can lower TCO, but it can also introduce operational and management challenges. Today, service providers buy “N” transcoders, “N” streamers, and so on to accommodate peak usage requirements. With virtualization, the main advantage is shared hardware: less hardware is needed overall, which can lower TCO, since file-based transcoders can run during off-peak times (the middle of the night) while more streamers run during peak times to accommodate a higher volume of unicast stream sessions (concurrency). This will require new pay-per-usage models, as well as sophisticated management and workflow solutions that spin instances up when demand is high and kill them when it drops.

For this reason we are seeing vendors align with this strategy. Imagine Communications is entering the market with workflow management tools that are agnostic to the video processing blocks, while Cisco and Ericsson provide open workflows capable of interoperating with their own transcoders, packagers, etc. while remaining open to third-party integration. This opens the door for vendors like Beamr to provide video processing applications for encoding and perceptual quality optimization.

It is an IP video world, and that is a good thing.

Once the network is virtual, the distribution architecture flattens: an operator no longer needs to maintain separate topologies for service delivery to the home, outside the home, fixed wire, wireless, and so on. The old days of having separate RF, on-net, and off-net (OTT) systems are quickly moving behind us.

IP video is the enabler that frees up new distribution and business models, but most importantly it meets the expectation of end users to access their content anywhere, on any device, at any time. Of course, there is that little thing called content licensing that can hold back the promise of anytime, anywhere – especially for sports. But as content owners adapt to the reality that opening up availability spurs rather than hampers consumption, it may not be long before users can enjoy entertainment content on the terms they are willing to pay for.

Could we be entering the golden age of cable? I guess we’ll have to wait and see. One thing is certain: vendors should ask themselves whether they can be the best in every critical path of the workflow, because service providers will be deciding for them – there is no single vendor whose solution is best of breed across today’s modern network and video architectures. Vendors who adapt to the changes virtualization brings to the market will be the leaders of the future.

At Beamr we have a 60-person engineering team focused solely on the video processing block of the virtualized network, specifically HEVC and H.264 encoding and content adaptive optimization solutions. Our team comes into the office every day with the single objective of pushing the boundary for delivering the highest quality video at the lowest bitrates possible. The innovations we are developing translate to improved customer experience and video quality, whether that is 4K HDR with Dolby Vision or reliable 1080p on a tablet.

IP Video is here, and in tandem with virtualized networks and the transition of video from QAM to the DOCSIS network, we are reaching a technology inflection point that is enabling better quality video than previous technological generations were able to deliver. We think it’s an exciting time to be in cable!

HDR adds ‘kapow’ to 4K

High Dynamic Range (HDR) improves video quality by going beyond more pixels to increase the amount of data delivered by each pixel. As a result, HDR video is capable of capturing a larger range of brightness and luminosity to produce an image closer to what can be seen in real life. Show anyone HDR content encoded in 4K resolution, and it’s no surprise that content providers and TV manufacturers are quickly jumping on board to deliver content with HDR. HDR definitely provides the “wow” factor that the market is looking for. But what’s even more promising is the industry’s overwhelmingly positive reaction to it.

Chicken and egg dilemma will be solved

TV giants Samsung, Sony, Panasonic, and LG have all launched HDR-capable 4K TVs in the premium price range. However, Vizio gets the credit for being the first to break through with low-cost UHD HDR TVs in its P-Series. Available now and starting at just $999, the P-Series removes the price objection for all but the most budget-conscious consumers. Check out the price chart below, referenced in a recent CNET article.

VIZIO P SERIES 2016 TVS

Model    Size        Price    Dimming zones   Refresh rate   Panel type
P50-C1   50 inches   $999     126             60Hz           VA
P55-C1   55 inches   $1,299   126             120Hz          IPS
P65-C1   65 inches   $1,999   128             120Hz          VA
P75-C1   75 inches   $3,799   128             120Hz          VA

The availability of affordable TVs is an extremely promising factor pushing the market to believe that HDR is here to stay. The fact that HDR sets start at such a low price this early in the market development of the technology is a good indicator that the category will grow quickly, allowing consumers to experience the enhancement of high dynamic range sooner than is normally possible when advanced new technologies are first introduced. In fact, some predict that these prices will fall to “as little as $600 to $700” for a 50-55 inch UHD TV with HDR capability – which, if true, brings HDR and UHD even closer to the price of current 1080p models.

Now all we need is content  

In January 2016, Netflix announced the availability of the Marco Polo series for streaming in Dolby Vision and HDR10. At CES 2016, Netflix also showed clips from the Daredevil series in Dolby Vision. Far from being demos only, Daredevil season 2 was released on March 18th, and Marco Polo season 2 will be released on July 1st. Thus, it’s safe to say that Netflix sees HDR as “the next generation of TV.”

HDR standards are emerging

Publishing guidelines to ensure compatibility and a consistent user experience for HDR content and displays across the device ecosystem is the next natural and significant step to ensure industry adoption of HDR, and on April 18th the UHD Forum announced the UHD Forum Guidelines. ITU-R Study Group 6 is working on recommendations for HDR, with publication expected in July 2016.

Surveying the current market, several HDR technologies exist, covering the spectrum of both dual- and single-layer HDR. The main ones are Dolby Vision (dual and single layer), HDR10, and a Technicolor-Philips single-layer solution.

What is the difference between dual- and single-layer HDR workflows? The dual-layer approach provides backward compatibility with legacy SDR systems (set-top boxes, TVs) but requires two decoders in endpoint devices. Single layer is not backwards compatible with SDR systems, but it makes TV sets and set-top boxes more economical and less complex.

Since there are multiple standards, an industry-wide rollout presents certain challenges. Dolby Vision is getting a lot of attention due to its well-recognized name and the Vizio and LG endorsements, while Ultra HD Premium (HDR10) is required by the Blu-ray Disc Association. All these competing standards make choosing the appropriate one challenging. But never fear: there is an encoder on the market today capable of generating Dolby Vision single- and dual-layer streams, as well as HDR10-compatible streams or files.

Meet V.265, Beamr’s HDR-optimized encoder

Beamr has been working with Dolby to enable Dolby Vision HDR support for several years now, even jointly presenting a white paper at SMPTE. The V.265 codec is optimized for Dolby Vision and HDR10 and takes into account all requirements of both standards, including full support for VUI signaling, SEI messaging, SMPTE ST 2084:2014, and ITU-R BT.2020.
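For context on what ST 2084 specifies: it defines the Perceptual Quantizer (PQ) transfer function, which maps a normalized code value \(E'\) in \([0,1]\) to absolute luminance up to 10,000 cd/m². As we understand the published curve, it takes the form

\[
Y = 10000 \cdot \left(\frac{\max\!\left(E'^{\,1/m_2} - c_1,\ 0\right)}{c_2 - c_3\,E'^{\,1/m_2}}\right)^{1/m_1} \text{cd/m}^2,
\]

with constants \(m_1 = 2610/16384\), \(m_2 = 2523/4096 \times 128\), \(c_1 = 3424/4096\), \(c_2 = 2413/4096 \times 32\), and \(c_3 = 2392/4096 \times 32\).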

Pesky stuff the industry is addressing

There are many commonalities between HDR technologies, but there are common challenges too. For example, SDR-to-HDR conversion, and conversion between HDR formats, can happen in various parts of the distribution chain, causing headaches on the metadata management side. Peak brightness management across the production chain and metadata propagation are known challenges as well; metadata propagation from content mastering to distribution is one more area that requires standardization. SMPTE will have a role in solving these, and the new IMF format may be a good candidate. Beamr welcomes all these challenges and recognizes that HDR is here to stay. Our engineering teams are well equipped to address them.

If you crave a deeper understanding of HDR I encourage you to read our white paper titled, “An Introduction to High Dynamic Range (HDR) and Its Support within the H.265/HEVC Standard Extensions.” It not only gives a great introduction to HDR, but also explains how the set of extensions approved by MPEG and VCEG in July 2014 provides the tools to support HDR functionality within the HEVC standard.

The future is HDR so you better wear shades

I can remember racing to my friend Craig’s house during our lunch break in high school, just so we could watch this new music TV channel, MTV. Being musicians and huge fans of music, we were mesmerized.

Though the television technology we watched MTV on was no match for the displays of today, the intersection of content which I highly valued, and an engaged entertainment experience (24/7 music video with cool VJ’s) – shaped – dare I say revolutionized, not just me and my friends, but an entire generation.  

I can’t say that high dynamic range (HDR) will have the same effect on millions of people – it’s highly unlikely. But I do think you’d better wear shades, because the future is bright for content distributors, television manufacturers, and anyone who cares about creating new and exciting entertainment.

Three possibilities for improving video quality

The first option for improving the perceived quality of video (as observed by the human eye) is to increase resolution.  

It stands to reason that if we encode a video file with 3,840 pixels across rather than 1,920, there should be a noticeable improvement in quality.  After all, you have more dots represented, and thus a greater chance of visual details not being lost.  

However, it turns out that in the real world, for a variety of reasons, higher resolution cannot always be noticed. For example: if you have a 60″ TV, to see the increased resolution of a 4K UHD panel you must sit less than 8′ away from the television, and in many homes that is simply not practical or even desired.

[Chart: TV resolution and viewing distance]

The second way to improve video quality is to transmit more frames. This method keeps the same resolution (typically HD) but doubles or quadruples the number of frames shown every second. For high-speed action – namely sports – increasing the frame rate of capture and display can offer a demonstrable improvement in quality. Still, this step alone is not sufficient to push the TV viewing experience to a “wow.”

HDR opportunity

In walks high dynamic range (HDR), the third option for improving video quality.  

Now let me just say, wide color gamut (WCG) is also a way to bring more realism to video, but we’ll cover WCG in a separate blog post.

The term dynamic range signifies the difference (or range) between the dark and light sections of an image. High dynamic range designates a television capable of displaying brightness closer to real life – the reason a reference to wearing shades seems apropos. (See the Vanguard Video white paper on HDR support within the HEVC standard.)
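As a quick aside, dynamic range is often quantified in “stops,” each stop representing a doubling of luminance; a display spanning minimum and maximum luminances covers

\[
\text{stops} = \log_2\!\left(\frac{L_{\max}}{L_{\min}}\right),
\]

so, as an illustrative example, a panel spanning 0.05 to 1,000 cd/m² covers roughly 14.3 stops.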

With the acquisition of Vanguard Video, Beamr now offers the first commercially available HEVC encoder in the market with HDR support for HDR-10 and Dolby Vision.

Chicken and the egg

As with any new technology, there is always an initial awkward period when the buzz is growing while the reality is quite different. For HDR, it’s not enough to have a standard, and it’s not sufficient to have an encoder: if HDR-compatible content isn’t being produced, and displays capable of supporting the format aren’t in enough homes, the technology cannot go mainstream.

After several years of wrangling in the various standards bodies and industry consortiums, two HDR standards have emerged – HDR-10 and Dolby Vision – and Beamr supports both. A third standard, which merges the HDR proposals of Technicolor and Philips, seems to be lagging behind, with initial demonstrations expected at NAB 2016 and commercial deployments in chip form expected by the end of the year.

Though each group can explain how its approach is best, and trade-offs exist between the front-runners, HDR-10 is the baseline offering the minimum support required for UHD Blu-ray. As such, many streaming content distributors are encoding with HDR-10 to satisfy playback ubiquity. Whether a display supports Dolby Vision or Technicolor’s proposed standard, the chances are high that a consumer who buys a TV anytime after today will experience an enhanced picture with HDR.

HDR content and televisions, available now

The first mover with fully compatible HDR televisions – and, most importantly, content that can play on them – is Vizio, with its new Reference Series: http://www.vizio.com/r-series

Vizio Reference Series supports native UHD + HDR streaming from Netflix and VUDU with Amazon Instant Video support coming.

VUDU’s catalog can be browsed here: http://www.vudu.com/movies/#featured/12434

Netflix is starting with Marco Polo season one, but told Engadget that more titles are coming soon, including Marvel’s Daredevil. One can assume next season’s House of Cards will give us an even more realistic view into the Underwoods’ political life.

Beamr HEVC encoders support Dolby Vision & HDR-10 today!

Be sure to visit us at NAB, April 18th to the 21st to check out the full line of Beamr H.264 and HEVC encoding and optimization solutions including our HEVC products that support Dolby Vision and HDR-10. Beamr encoding solutions including HEVC + HDR can be seen in the Las Vegas Convention Center South Hall Upper booth #SU11710.

You can see Beamr optimization products at the Beamr Video stand located in the Las Vegas Convention Center South Hall Upper booth #SU11902CM.