binaryfreedom.com

Vorbis Recombinant

Sunday February 25, 2001 06:30pm PST
The guys at Xiphophorus have been very, very busy. Releasing beta4 of the Vorbis CODEC, moving to a BSD license and forming a non-profit corporation to champion free multimedia development and distribution efforts are just the beginning.
Tomorrow, the Xiph.org Foundation releases beta4 of the Ogg Vorbis CODEC. Binary Freedom got the opportunity to interview Jack Moffitt, the Executive Director of the new Xiph.org Foundation, and Christopher Montgomery, who serves as Technical Director of same. From updating the codec to beta4, to changing the license on the entire project to a BSD license to starting a non-profit corporation, the Xiph.org guys are kicking ass and taking names. On to the good stuff!

Binary Freedom: So, how does this new version differ from the last in terms of sound quality?

Christopher Montgomery: It kicks much booty. More booty than has been kicked in recent memory.

Binary Freedom: And by "booty", you mean what, exactly?

Christopher Montgomery: Much improved; more stable high end, much improved preecho, and the encoder is twice as fast. Pure tones are also much cleaner. Essentially at this point, if you find a sample that sounds a little dodgy in Vorbis, chances are it sounds like total crap in mp3. Booty being virtual ass, in this case.

Christopher Montgomery: beta4 was originally supposed to be an avalanche of features, it ended up being an avalanche of optimizations, quality tuning and [embarrassing to admit] bugfixes over beta3. beta3 actually had some subtle encoder bugs that didn't really break anything, they just resulted in what sounded like bad quality. We killed every analysis bug we found.

Jack Moffitt: It was as if emacs came in a spray can!

Christopher Montgomery: We sound better than WMA on Microsoft's own handpicked WMA demo samples. A xylophone? C'mon guys! A xylophone isn't hard! Even if you play it fast!

Binary Freedom: What about streaming?

Jack Moffitt: A pre-alpha version of IceCast 2 is available in the source repository.

Christopher Montgomery: IceCast 2.0 is up and running. Now that vorbis.com and icecast.org are back, you can go get it. Jack, is there a real release date planned?

Jack Moffitt: The Winamp and XMMS plugins have received big updates, to support streaming. No official date, but I will be packaging up the 2.0 alpha soon. We've also got a first cut RealPlayer plugin, and we'll hopefully be submitting our first IETF draft to start Vorbis and Icecast down the standards path.

Binary Freedom: How has the encoding technology changed for beta4?

Christopher Montgomery: It hasn't really; the encoder is simply tighter. For 1.0, we'll have finally added cascading and channel coupling. Cascading is the ability to make multiple passes through the frequency spectrum, iteratively filling in more detail, like a progressive jpeg. Channel coupling is a generalization of "joint stereo" in mp3/AAC.
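The two features named here can be illustrated with a small sketch. It is not taken from the Vorbis source: the function names are invented for illustration, cascading is modelled as plain multi-pass residual quantization, and coupling is shown as integer mid/side stereo, which is the simplest joint-stereo scheme and only one special case of what a general channel coupling could do.

```python
def cascade_encode(values, steps):
    """Quantize in several passes; each pass codes what the previous ones missed."""
    residual = list(values)
    passes = []
    for step in steps:  # coarse step first, finer steps afterwards
        codes = [round(r / step) for r in residual]
        residual = [r - c * step for r, c in zip(residual, codes)]
        passes.append(codes)
    return passes


def cascade_decode(passes, steps):
    """Sum the contribution of however many passes were received."""
    out = [0.0] * len(passes[0])
    for codes, step in zip(passes, steps):
        out = [o + c * step for o, c in zip(out, codes)]
    return out


def couple(left, right):
    """Turn integer left/right samples into mid/side form."""
    mid = [(l + r) >> 1 for l, r in zip(left, right)]
    side = [l - r for l, r in zip(left, right)]
    return mid, side


def decouple(mid, side):
    """Exactly invert couple(); the parity of side restores the bit mid dropped."""
    left = [m + ((s + 1) >> 1) for m, s in zip(mid, side)]
    right = [l - s for l, s in zip(left, side)]
    return left, right
```

Decoding only the first pass gives a rough version of the data and each further pass tightens it, which is the progressive-JPEG analogy. In the coupling half, two nearly identical channels produce a side signal close to zero, which is cheap to code.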

Binary Freedom: With a glut of personal digital audio devices flooding the market, is there any support for Vorbis?

Christopher Montgomery: Just the beginnings; Jack and I both have test versions of the Iomega HipZip with Vorbis support.

Binary Freedom: How is it?

Jack Moffitt: It works great!

Christopher Montgomery: It works like you'd expect it should. It's not finished, but there's less than a week of real development work in it to get it to release-candidate.

Jack Moffitt: Mostly the subtle features.

Christopher Montgomery: Well, "seek" isn't implemented. Easy to do, but a big usability deal. It only plays songs from the beginning because the seek function needs another five lines of code. It plays all ogg files, any bitrate, stereo/mono, etc. We're waiting on permission from iObjects to distribute an alpha of the user-installable firmware update. We've worked quite a bit with iObjects in Seattle to bring this into the hardware realm.

Binary Freedom: That's very cool. I'm told you have a very interesting licensing story; what's the deal?

Christopher Montgomery: Ah. We're going to the BSD license. We even have RMS's blessing to do so! It's not an official blessing, but he agreed with the reasoning.

Jack Moffitt: It turns out that this is a basic building block of the internet for multimedia, and the GPL/LGPL are too restrictive if we really want to put this building block in the hands of everyone. Let me get his response, I believe it was semi-official...

Jack quoting E-mail from Richard Stallman:

Anyway, if and when you announce a different license for the Vorbis code, feel free to mention that I agree with the decision, as long as you make it clear I support "Free Software" and not "Open Source", and don't imply I agree that there is such a thing as a "Linux operating system".

Binary Freedom: Why did you make the decision to change?

Jack Moffitt: The biggest influence was the game industry. They have to modify Vorbis for proprietary hardware systems, like the Playstation 2. This does not allow them to republish the code, but at the same time, republishing the code wouldn't gain the Open Source or Free Software community anything, as it's a specific implementation, and not something for general use. So they are trying to do something perfectly acceptable, but specifically not allowed under the LGPL. In order for game companies to accept Vorbis, they would have to reimplement the entire codebase, which takes time. In order to speed adoption by game companies, and for other companies that face similar licensing restrictions, we have adopted the BSD license. This is a very specific case of the LGPL being too restrictive, but it's one we ran into and needed a solution before. The BSD license, being as short as it is, makes it a lot easier on the lawyers.

Christopher Montgomery: The idea is to make this a real, universally acceptable building block... even for companies.

Jack Moffitt: We're not just making an alternative for an existing product. We're filling a hole that has always existed.

Binary Freedom: So, tell us about the Xiph.org Foundation.

Jack Moffitt: After iCast tanked in November, we began the search for a way to continue funding for the project.

Christopher Montgomery: Xiphophorus has always been a normal S Corporation, and we'd decided but never gotten around to doing things as a 501c3 Non-Profit.

Jack Moffitt: The goal was to find multiple sources of funding, and to find enough funding so that we can support ourselves, and hopefully build a small team of people making interoperable internet multimedia a reality.

Christopher Montgomery: The Xiph.org foundation is a new non-profit to replace the current Xiphophorus, Inc.

Jack Moffitt: Xiph.org's code benefits all internet users, so it makes sense that multiple people/companies/organizations help foot the bill for its continued development. We also want to be non-threatening to the industry as a whole. As a commercial organization, we are competing for the same dollars as Microsoft and RealNetworks. As a non-profit, we are solely engaged in pushing the limits of this technology and researching new problems, as well as working with the other companies in this industry.

Christopher Montgomery: Which is what we'd always been doing, we're just making it more official.

Jack Moffitt: We also get a surprising number of people who want to send us money. This makes it tax-deductible, and gives some officiality to the process. We hope to find grant money and donations from companies and individuals who believe in what we're doing and wish to see the projects continue and mature. This money will pay for us to dedicate our more-than-fulltime development hours solely to these projects and pay for expenses incurred, which right now are next to nil.

Christopher Montgomery: Well, bandwidth isn't totally negligible.

Jack Moffitt: That's true, and soon we'll be too big to host on any one single DSL line. It makes it much easier to get donated bandwidth if the ISP can write it off as a tax deduction.

Binary Freedom: Certainly.

Christopher Montgomery: And let's not forget the mantra: "Our goal with Ogg is not to get rich. It's to get paid, albeit paid well."

Binary Freedom: Have you guys seen a dramatic adoption of Vorbis in the past few months?

Jack Moffitt: There's more and more Vorbis everywhere. There's been a dramatic increase in related projects since beta3. There's a vorbis comment editor for Win32.

Christopher Montgomery: As far as software support, only Quicktime player and Windows Media Player are left.

Jack Moffitt: There's now official support for Vorbis in Sound Forge (from Sonic Foundry).

Christopher Montgomery: All the Mac players have announced support (except Apple itself), and N2mp3 on Mac will encode Vorbis.

Jack Moffitt: We have the first RealPlayer plugin which I mentioned before, which we're working with RealNetworks on. You can now find Vorbis files on Freenet and on several OpenNap servers. I noticed yesterday that Ximian GNOME has the beta3 libs.

Christopher Montgomery: The next real adoption watershed will have to be content providers, but we'll see that the instant Fraunhofer Gesellschaft follows through on its licensing demands.

Binary Freedom: Have you gotten anyone that's come to you and said, "These guys want money, I want to move to your standard?"

Jack Moffitt: That's essentially one of the reasons the game industry is interested; they don't want to pay for distributing commercial decoders at $0.25 a pop. If your game sells a million or so copies, that's $250K. Lots of broadcasters are waiting on more streaming stuff as well.

Christopher Montgomery: When Fraunhofer eventually asks for the broadcast/download royalty, and they've stated that this year they will, that pretty much singlehandedly kills net music, which is already scrapping for survival, unless there's a free alternative- and there is.

Binary Freedom: Have you guys thought about doing a service where you'll re-encode wavs and mp3's to Vorbis for a small donation?

Jack Moffitt: I have built a scalable system for encoding and reencoding while at Green Witch, but the basic limiting factor is network transport. If there was an easy way to transfer your whole mp3 collection and get a Vorbis collection back, we'd certainly like to try that, but the real issue there is that there's no reason to re-encode your mp3s, unless you just love Vorbis. We're not dealing with VCR's or CD players anymore. XMMS will play both, and so will the HipZip.

Christopher Montgomery: Yes, mp3 is moving from "nominally most free" (not considering Vorbis) to "as restrictive as most of the other choices". Not as bad as, say, Liquid Audio, but still not good.

Jack Moffitt: So it's only necessary to start encoding all your new stuff in the better format.

Christopher Montgomery: Unless you're a company.

Jack Moffitt: Yes, unless you have a financial reason to do so, or in the case of streaming, a technical one. But for the average consumer of music listening at home, they shouldn't have to care about format. In a way, the end goal of the Xiph.org projects is to make multimedia technology become an everyday thing, instead of something from which many news articles are made. No one writes articles about HTTP, and rarely about HTML. Hopefully in a few years, internet multimedia will just work.

Binary Freedom: Where are you guys located?

Christopher Montgomery: Jack is in SF, I'm in Silly Valley (hoping to escape again). Other developers are everywhere except Antarctica.

Jack Moffitt: Mike Person is in Boston, and there are active developers in Germany, the Netherlands, Australia, etc.

Binary Freedom: What are you guys doing in your spare time?

Christopher Montgomery: http://www.xiph.org/xiphmont/bike/project.html

Jack Moffitt: I have built a small studio and try to work on music; other than that, video games.

Christopher Montgomery: Yeah, but not with me! You and your bloody cat.

Binary Freedom: Huh?

Christopher Montgomery: I'm allergic to cats. Everyone moved in with Jack, and they all play games there. Jack has a cat. I can't attend.

Binary Freedom: Can you give me a scientific breakdown of why Vorbis sounds better?

Christopher Montgomery: Sure. First off, we're a psychoacoustic compression, like mp3 on the surface. Keep what's audible, throw away what isn't. Psychoacoustic models aren't perfect; you either end up throwing away audible things and getting artifacts, or you get too conservative and end up keeping a lot of unnecessary information. For the most part, both actually happen. To make a psychoacoustic compression sound better, you have to do two things. Improve the model so that it gets more accurate results, thus requiring fewer bits for the same quality, and improve the packing of the bits so that you can fit more information (as returned by the model) into less real space. Vorbis does both.

Christopher Montgomery: Our psychoacoustics model is actually using hard data from listening test research. No single/dual slope approximations; we're applying the real honest-to-god curve data, as collected in research experiments, line-by-spectral-line. This helps carve the audio turkey a bit more delicately, so that you get more of the meat off the sound without slicing off any bones in broad strokes. Secondly, the way we then pack the processed audio data into packets is more aggressive than mp3, so that we can fit more of the spectral data into each packet. Rather than using a fixed spectral normalization and lots of scalefactors, we encode the shape of the entire spectrum into a small LSP filter and pack that along. It's much more precise than the fixed normalization (because it reflects the quality of the sound in each frame) and eliminates scalefactors and the space they take up. Finally, we don't use a fixed, hardwired set of Huffman codes to do the last stage of coding. We use VQ (which is actually just a generalization of Scalar Huffman coding) and the encoder is allowed to construct any codebook set it wants; this is packed into the sample header. The encoder switches between its declared codebook set on the fly during encoding, picking the best one for each situation. So, we get more "audio bits" into the same number of physical bits.
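For readers unfamiliar with VQ, the codebook-switching idea can be sketched roughly as follows. This is a toy, not the Vorbis encoder: the function names and the two tiny codebooks in the usage note are made up, and the choice is made on squared error alone, whereas a real encoder would presumably also weigh how many bits each choice costs.

```python
def nearest(codebook, vec):
    """Return (index, squared_error) of the codeword closest to vec."""
    best_index, best_err = 0, float("inf")
    for i, word in enumerate(codebook):
        err = sum((a - b) ** 2 for a, b in zip(word, vec))
        if err < best_err:
            best_index, best_err = i, err
    return best_index, best_err


def encode(codebooks, vec):
    """Try every declared codebook and keep the one that fits vec best.

    Returns (codebook_number, codeword_index); that pair is what would be
    written to the stream instead of the raw values.
    """
    best = None
    for number, codebook in enumerate(codebooks):
        index, err = nearest(codebook, vec)
        if best is None or err < best[0]:
            best = (err, number, index)
    return best[1], best[2]
```

With a hypothetical "quiet" book `[(0, 0), (1, 0), (0, 1), (1, 1)]` and a "loud" book `[(0, 0), (8, 0), (0, 8), (8, 8)]`, the vector `(0.9, 0.1)` is coded from the quiet book and `(7.5, 8.2)` from the loud one. A codebook whose entries are one-dimensional reduces this to ordinary scalar quantization.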

Christopher Montgomery: mp3 first subbands the input audio, using a bank of filters, into a group of 32 separate frequency bands. This process is not 100% reversible. Then they run each subband through an MDCT! This makes the "not 100% reversible" even worse, leading to aliasing noise. Even with an infinite bitrate, mp3 would not be able to perfectly reproduce its input audio. The better technique is to abandon the filterbank and just use an MDCT directly, which is 100% reversible. mp3 doesn't do this because earlier MPEG audio layers used a filterbank and, by committee, they settled on a technical compromise to use the filterbank in mp3 as well, even though they knew this compromised the fidelity.
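The "100% reversible" claim about a directly applied MDCT can be checked numerically. The sketch below is a slow, textbook implementation with 50% overlapping blocks. The sine window is an assumption chosen for simplicity, not the window Vorbis uses, and all names are illustrative.

```python
import math


def sine_window(length):
    """Window with w[i]**2 + w[i + length // 2]**2 == 1 (Princen-Bradley)."""
    return [math.sin(math.pi * (i + 0.5) / length) for i in range(length)]


def mdct(block, window):
    """2N windowed samples in, N coefficients out."""
    n = len(block) // 2
    return [sum(window[i] * block[i]
                * math.cos(math.pi / n * (i + 0.5 + n / 2) * (k + 0.5))
                for i in range(2 * n))
            for k in range(n)]


def imdct(coeffs, window):
    """N coefficients in, 2N windowed (still aliased) samples out."""
    n = len(coeffs)
    return [2.0 / n * window[i]
            * sum(coeffs[k] * math.cos(math.pi / n * (i + 0.5 + n / 2) * (k + 0.5))
                  for k in range(n))
            for i in range(2 * n)]


def roundtrip(signal, n):
    """Transform overlapping blocks and overlap-add them back together.

    len(signal) must be a multiple of n. Each block alone comes back
    aliased; the aliasing cancels when neighbouring blocks are summed.
    """
    window = sine_window(2 * n)
    padded = [0.0] * n + list(signal) + [0.0] * n
    out = [0.0] * len(padded)
    for start in range(0, len(padded) - n, n):
        block = padded[start:start + 2 * n]
        for i, y in enumerate(imdct(mdct(block, window), window)):
            out[start + i] += y
    return out[n:-n]
```

With no quantization between `mdct` and `imdct`, `roundtrip` returns the input to within floating-point rounding error. That is the property the mp3 hybrid filterbank is said above to lack.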

Jack Moffitt: Most of their patents are specifically on the filterbank, which may or may not be a reason they kept it.

Christopher Montgomery: Well, some of the "ringers" are on the filterbank. Fraunhofer/Thomson have a veritable avalanche of mp3 related patents. That's just one of the very relevant ones stopping any truly free mp3 encoder.

Christopher Montgomery: That's a mid-level summary on "Vorbis vs mp3" points.

Binary Freedom: What about improvements to the existing Vorbis model in beta4?

Christopher Montgomery: There were two bugs in beta3 that slipped through beta4. If one channel was totally silent, but the other had even a trace of background hiss, beta3 would write a corrupt packet. This crept in a day before release. My fault. I'm suitably shamed. Also, beta3 overflowed its codebooks; that is, it would, on strong tones, try to encode too large a data range with the given codebook, and the codebook would "max out" early. This resulted in pure tones (like brass, or strong vocal) sounding gravelly/noisy. It wasn't a format artifact, it was just a bug. I added looking for that to the release verification stuff from now on.
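What a codebook "maxing out" does to a strong tone can be pictured with a saturating quantizer. This is only an analogy under assumed numbers; the real beta3 codebooks and the exact failure are not described in the interview, and the names below are hypothetical.

```python
def quantize(value, step, max_index):
    """Map value to the nearest codebook index, saturating at +/- max_index."""
    index = round(value / step)
    return max(-max_index, min(max_index, index))


def dequantize(index, step):
    """Reconstruct the value that a codebook index stands for."""
    return index * step


def reconstruction_error(value, step, max_index):
    """Absolute error after a quantize/dequantize round trip."""
    return abs(value - dequantize(quantize(value, step, max_index), step))
```

With a step of 1.0 and `max_index` of 15, a value of 12.0 comes back exactly, but a strong peak of 40.0 comes back as 15.0, an error of 25.0. That error is not a rounding compromise chosen by the model; it appears only because the input exceeded the range the codebook could represent.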

Christopher Montgomery: Beyond bug fixes, the improvements: Much better preecho control. In my tests, it's now at least as good as LAME, although only time will tell for sure. In beta3 and early prebeta4, it would miss things like bongos or kick-drums. It doesn't anymore. Also, finer grained codebooks for tones. Even after fixing the codebook overflow bug in beta3, strong tones picked up some noise with the widely-spaced beta3 codebooks; the intensity steps were too far apart. For a strong tone in beta4, the codebooks are about 12-15x more precise, and at no bitrate penalty.

Binary Freedom: What else?

Christopher Montgomery: Better noise threshold calculation; fewer noise-related artifacts. That is, at "normal" bitrates like 128kbps, it's still necessary to make some compromises, but the compromise should always be a graceful "fuzzing" and never something that jumps out and sounds like a shipload of cymbals being thrown off the roof. And then, of course, there's speed. Much faster, faster than LAME VBR.

Christopher Montgomery: Oh, oh... one more bug fixed... I typoed a "+.5f" in beta3 as "+5.f". The effect was that it increased clipping.
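The interview doesn't say where in the code that constant lived. One plausible reading, and it is only an assumption, is the usual round-to-nearest idiom `floor(x + 0.5)` used when converting floating-point samples to 16-bit integers. Under that assumption, the sketch below shows how a bias of 5.0 instead of 0.5 pushes near-full-scale samples over the limit.

```python
import math

PCM_MAX = 32767
PCM_MIN = -32768


def to_pcm16(sample, bias):
    """Scale a float in [-1, 1] to 16 bits, rounding via floor(x + bias).

    Returns (value, clipped); clipped is True if the value had to be
    clamped into the 16-bit range.
    """
    value = math.floor(sample * 32767.0 + bias)
    clipped = value > PCM_MAX or value < PCM_MIN
    return max(PCM_MIN, min(PCM_MAX, value)), clipped


def count_clipped(samples, bias):
    """Count how many samples get clamped for a given rounding bias."""
    return sum(1 for s in samples if to_pcm16(s, bias)[1])
```

For the hypothetical samples `[0.9999, -0.5, 0.25, 0.99995]`, a bias of 0.5 clips nothing, while a bias of 5.0 clips the two samples that sit just below full scale.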

Jack Moffitt: Yeah, I hated that bug. I found it right after I had encoded most of my music, bitch!

With continued development and the new direction of Xiphophorus, Ogg Vorbis isn't just an mp3 alternative, it's a building block to free multimedia creation and distribution on the net. Thanks to Christopher and Jack for taking time out of the rigorous development schedule to speak with us. As always, you can download all of the new goodies at http://www.vorbis.com/download.html.


Copyright © 2004, The Binary Freedom Project, LLC.