
Decoding the HTML 5 video codec debate

The HTML 5 video element has the potential to liberate streaming Internet …

The increasingly competitive browser market has at last created an environment in which emerging Web standards can flourish. One of the harbingers of the open Web renaissance is HTML 5, the next major version of the W3C's ubiquitous HTML standard. Although HTML 5 is still in the draft stage, several of its features have already been widely adopted by browsers like Safari, Chrome, and Firefox. Among the most compelling is the "video" element, which has the potential to free Web video from its plugin prison and make video content a native first-class citizen on the Web—if codec disagreements don't stand in the way.

In an article last month, we explored the challenges and opportunities associated with the HTML 5 video element. One of the most significant of these challenges is the lack of consensus around a standard media codec, a contentious issue that has rapidly escalated into a major controversy. The debate has now stalled without a clear resolution in sight.

The HTML 5 working group is split between supporters of Ogg Theora and H.264. Their inability to find a compromise that is acceptable to all stakeholders has compelled HTML 5 spec editor Ian Hickson to "admit defeat" and give up on the effort to define specific codecs and media formats in the standard itself. This is problematic because the lack of uniform codec availability will make it impossible for content creators to publish their videos in a single format that will be viewable through the HTML 5 video element in all browsers.

In an e-mail posted to the WHATWG mailing list, Hickson outlined the positions of each major browser vendor and explained how the present impasse will influence the HTML 5 standard. Apple and Google favor H.264 while Mozilla and Opera favor Ogg Theora. Google intends to ship its browser with support for both codecs, which means that Apple is the only vendor that will not be supporting Ogg.
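The vendor positions above can be written down as a small support table. The following is a minimal sketch, not anything from the spec or from Hickson's e-mail: the dictionary layout and the function names are illustrative assumptions, and the table only encodes the positions summarized in this article. It shows that no single codec is common to all four browsers, so a publisher would need both encodings to reach everyone.

```python
# Codec support per browser, as summarized in the article from Hickson's e-mail.
# The dictionary layout and function names are illustrative assumptions.
SUPPORT = {
    "Safari (Apple)": {"H.264"},
    "Chrome (Google)": {"H.264", "Ogg Theora"},
    "Firefox (Mozilla)": {"Ogg Theora"},
    "Opera": {"Ogg Theora"},
}


def common_codecs(support):
    """Return the codecs that every listed browser can play."""
    codec_sets = list(support.values())
    return set.intersection(*codec_sets) if codec_sets else set()


def browsers_left_out(support, published):
    """Return the browsers that can play none of the published formats."""
    return sorted(
        name for name, codecs in support.items() if not codecs & published
    )


print(common_codecs(SUPPORT))                               # set()
print(browsers_left_out(SUPPORT, {"H.264"}))                # ['Firefox (Mozilla)', 'Opera']
print(browsers_left_out(SUPPORT, {"Ogg Theora"}))           # ['Safari (Apple)']
print(browsers_left_out(SUPPORT, {"H.264", "Ogg Theora"}))  # []
```

Publishing in only one format always leaves at least one browser without playable video, which is the practical cost of leaving the codec undefined in the standard.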

"After an inordinate amount of discussions, both in public and privately, on the situation regarding codecs for <video> and <audio> in HTML5, I have reluctantly come to the conclusion that there is no suitable codec that all vendors are willing to implement and ship," Hickson wrote. "I have therefore removed the two subsections in the HTML5 spec in which codecs would have been required, and have instead left the matter undefined."

Ogg Theora is an open format that is thought to be unencumbered by patents. The primary reference implementation is distributed under an open source license and it is being developed by the non-profit Xiph.org with funding from Mozilla. Ogg is strongly preferred by the open source software community because it can be freely redistributed without requiring licensing fees.

H.264 is a high-performance codec that was developed jointly by the ITU-T Video Coding Experts Group and the ISO/IEC Moving Picture Experts Group (MPEG), and it is standardized as part of the MPEG-4 family. It is emerging as the dominant codec for both streaming video and optical media, as it is said to deliver the visual quality of MPEG-2 (used on DVDs) at roughly half the bitrate. The MPEG LA consortium manages licensing of the underlying patents that cover H.264 compression algorithms and other software methods needed to implement the codec. In order to use the format, adopters have to pay licensing fees to MPEG LA.
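To make the "roughly half the bitrate" claim concrete, here is a small arithmetic sketch. The 6 Mbit/s MPEG-2 bitrate and the two-hour running time are hypothetical values chosen for round numbers, not measurements, and the function name is an assumption of this example.

```python
def stream_size_bytes(bitrate_bps, duration_s):
    """Size of a constant-bitrate stream: bits per second times seconds, in bytes."""
    return bitrate_bps * duration_s // 8


# Hypothetical inputs, chosen only to illustrate the arithmetic.
MPEG2_BITRATE_BPS = 6_000_000              # assumed MPEG-2 bitrate
H264_BITRATE_BPS = MPEG2_BITRATE_BPS // 2  # the "roughly half" claim, taken literally
DURATION_S = 2 * 60 * 60                   # assumed two-hour video

mpeg2_size = stream_size_bytes(MPEG2_BITRATE_BPS, DURATION_S)
h264_size = stream_size_bytes(H264_BITRATE_BPS, DURATION_S)

print(mpeg2_size / 1e9)  # 5.4 (GB)
print(h264_size / 1e9)   # 2.7 (GB)
```

Under these assumptions, halving the bitrate halves both the storage and the bandwidth needed per viewer, which is why compression efficiency matters so much to large streaming sites.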

Patent problems

Patent encumbrance is one of the driving forces behind the HTML 5 video codec controversy. The patent licensing requirements mean that H.264 codecs can't be freely redistributed, making the format a non-starter for Mozilla and most other open source browser vendors. Opera also objects, saying that the licensing fees are too high. Mozilla and Opera strongly advocate Ogg Theora as an alternative because its freedom from known patents could ensure that there are no licensing barriers that prevent ubiquitous adoption.

Apple objects to Ogg Theora, claiming that the lack of known patents on Theora doesn't rule out the threat of submarine patents that could eventually be used against adopters. Apple is also concerned about the lack of widespread support for hardware-based Theora decoding, a factor that diminishes the format's viability on mobile devices. Google shares Apple's skepticism about the potential of Theora in the marketplace. The search giant claims that Theora's lack of quality relative to H.264 will make it an impractical choice for large-scale streaming video services such as YouTube.

Obtaining a license for H.264 from MPEG LA doesn't guarantee complete immunity from patent infringement liability, though. Although it is generally assumed that MPEG LA controls all of the relevant intellectual property pertaining to H.264 implementations, there is still the possibility that a third party that is not a member of the consortium holds a broad patent covering related compression technology and could enforce it independently against MPEG LA licensees.

Although Theora is not known to infringe any patents, critics fear that enhancing it to make it competitive with the most modern and efficient codecs will greatly increase its exposure to infringement risks. Some critics even contend that it's not possible to advance Theora without inevitably hitting a patent wall.

Another licensing issue that is often overlooked is the ambiguity of MPEG LA's future patent royalty collection plans. MPEG LA has established broadcast fees that licensees will be required to pay for distributing free (or ad-supported) streaming video content on the Internet. These fees will not take effect until the end of 2010, when the second H.264 licensing period begins. The language used in the current license treats Internet streaming just like over-the-air television, implying that licensees will have to pay broadcast fees per region. That could prove to be extremely costly for Internet video providers who make their content available around the world.

MPEG LA has provided no guidance, clarification, or insight into what the broadcast licensing fees will look like. When asked directly about the issue, MPEG LA representatives say that they haven't even decided yet themselves. The worry is that H.264 licensing for content distributors could potentially become too costly to sustain widespread use for streaming Internet video.

(For more details about MPEG LA licensing eccentricity, you can refer to Jan Ozer's concise overview at the Streaming Learning Center. Another good resource is The H.264 Licensing Labyrinth by Tim Siglin.)

The compression efficiency debate

The viability of Theora for large-scale streaming video web sites is Google's primary concern. Google is committed to shipping both Ogg and H.264 support in its browser, but intends to use the latter to power its popular YouTube video site. The direction that YouTube goes for streaming video will be enormously influential and could, by itself, play a very significant role in determining the outcome of the codec issue.

The extent to which Theora lags behind H.264 is often overstated, and the codec is in better shape than many of its critics assume. Google open source programs manager Chris DiBona is skeptical, however, and articulated the search giant's concerns about Theora's compression efficiency during the debate on the WHATWG mailing list.

"If [YouTube] were to switch to Theora and maintain even a semblance of the current YouTube quality it would take up most available bandwidth across the internet," DiBona said. "The most recent public number was just over 1 billion video streams a day, and I've seen what we've had to do to make that happen, and it is a staggering amount of bandwidth."

DiBona's quality claim was broadly disputed by Theora supporters on the mailing list. Mozilla's Mike Shaver encouraged DiBona to examine the most recent Theora developments, suggesting that the latest improvements have helped to significantly close the gap in compression efficiency.

"I don't think the bandwidth delta is very much with recent (and format-compatible) improvements to the Theora encoders," he wrote. "[Codec improvements] are a big part of what we've been funding, and the results have been great already. I'd like to demonstrate them to you, because I suspect that you'd be a better-armed advocate within Google for unencumbered video if you could see what it's really capable of now."

Xiph's Gregory Maxwell responded to DiBona's mailing list post by publishing a comparison that aims to demonstrate Theora's efficacy relative to H.264 in the context of YouTube-quality streaming video.

"Using a simple test case I show that Theora is competitive and even superior to some of the files that Google is distributing today on YouTube," he wrote. "Theora isn't the most efficient video codec available right now. But it is by no means bad, and it is substantially better than many other widely used options. By conventional criteria Theora is competitive. It also has the substantial advantage of being unencumbered, reasonable in computational complexity, and entirely open source. People are often confused by the correct observation that Theora doesn't provide the state of the art in bitrate vs quality, and take that to mean that Theora does poorly when in reality it does quite well."

Ogg has several high-profile supporters, including popular video streaming site DailyMotion and the Wikimedia Foundation, the organization behind the popular Wikipedia Internet encyclopedia. DailyMotion recently began the process of converting its video library to Ogg, which it plans to deliver through the HTML 5 video element. DailyMotion has acknowledged that its Ogg streams have some technical deficiencies compared to its current Flash-based video streaming solution, but is confident that it's the best approach in the long run.

The Wikimedia Foundation, which is a strong supporter of open technology and unencumbered accessibility to information, was already committed to Ogg even before the HTML 5 video element gained multiple browser implementations. The organization is collaborating with Mozilla in the effort to boost Theora quality. In an e-mail on the WHATWG list, Wikimedia Foundation volunteer media contact David Gerard said that the organization is also interested in helping Mozilla to raise general awareness of the advantages that unencumbered video would bring to the Internet.

"I'd also point out that Wikimedia has vast publicity abilities in this direction," he wrote. "And we're watching the progress of Theora and Dirac on a day-by-day basis, for obvious reasons. So if you need large charitable organisations to help you with making this the obvious publicity choice for a happy Internet with cute fluffy kitties, I can tell you we'll be right there!"

A Fluffy Kitty summary of the codec debate

The following images demonstrate what the future of the web might look like, depending on potential outcomes of the codec debate.


[Image: The Web, with an unencumbered video codec]

[Image: The Web, with many competing patent-encumbered codecs]

I, for one, welcome our fluffy kitty overlords.

The undesirable middle-ground

A solution that seems logical on the surface is to simply expose each platform's underlying media playback engine through the HTML 5 video element—DirectShow on Windows, GStreamer on Linux, and QTKit on Mac OS X. This would make it possible for the browser to play any video formats that are supported natively on the user's computer.

From a purely technical perspective, this is a solvable problem: existing libraries already wrap the native frameworks and provide a cohesive abstraction layer on top of them. One prominent option is Nokia's Phonon library. It could also possibly be done by using the QuickTime and DirectShow plugins for GStreamer.

Mozilla strongly opposes this approach because it would heighten the risk of fragmentation. Allowing content providers to use any codec that is available on the user's computer might undermine the advantages of the HTML 5 media element because there would be no consistency guarantee and content would not be able to work everywhere. That is, however, arguably the situation that already exists as a result of the impasse in the codec debate.
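The fragmentation concern can be illustrated with a toy model. This is a sketch under stated assumptions: the machine names and their codec inventories are hypothetical, invented only for illustration, and the functions do not correspond to any real browser or media framework API.

```python
# Hypothetical codec inventories for two machines. These are illustrative
# assumptions, not survey data about real systems.
MACHINE_CODECS = {
    "windows-desktop": {"WMV", "H.264"},
    "linux-desktop": {"Ogg Theora"},
}


def can_play(machine, video_codec, machine_codecs=MACHINE_CODECS):
    """A browser that defers to the platform plays only what the platform has."""
    return video_codec in machine_codecs[machine]


def playable_everywhere(video_codec, machine_codecs=MACHINE_CODECS):
    """True only if every machine has the codec installed."""
    return all(video_codec in codecs for codecs in machine_codecs.values())


print(can_play("windows-desktop", "H.264"))  # True
print(can_play("linux-desktop", "H.264"))    # False
print(playable_everywhere("H.264"))          # False
print(playable_everywhere("Ogg Theora"))     # False
```

When playback depends on whatever happens to be installed locally, the same page works for one visitor and fails for another, which is the lack of a consistency guarantee that Mozilla objects to.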

Conclusion

Hickson has clearly grown tired of the debate and has no interest in allowing the divisive issue to continue distracting HTML 5 stakeholders from their efforts to push the standard forward. He takes the view that documenting a codec in the standard will achieve nothing unless the browser vendors are willing to conform to what the standard says. Microsoft, which has no plans to implement the HTML 5 video element at all, is also still an impediment to bringing open Internet video to the masses.

It's unfortunate that this debate is threatening to derail the adoption of standards-based Internet video solutions. The dominant video solution today is Flash, a proprietary technology that is controlled by a single vendor and doesn't perform well on Linux or Mac OS X. There is a clear need for an open alternative, but the codec controversy could make it difficult. My inner pessimist suspects that Microsoft will finally get around to implementing HTML 5 video at the same time that the H.264 patents expire, in roughly 2025.
