Speex Audio Codec, Version 1.2

Table of Contents

Identification and description
Local use
Sustainability factors
Quality and functionality factors
File type signifiers
Notes
Format specifications
Useful references

Format Description Properties

ID: fdd000259
Short name: SPX_1_2
Content categories: sound
Format Category: encoding
Other facets: unitary, binary, sampled
Last significant update: 2008-02-19
Draft status: Full

Identification and description

Full name	Speex Audio Codec, Version 1.2
Description	Speech codec designed for packet networks and voice over IP (VoIP) applications rather than for mobile phones. File-based compression is also supported. The codec is based on Code Excited Linear Prediction (CELP) and supports a wide range of speech quality levels and bit-rates. Because of its VoIP-oriented design, Speex is robust to lost packets but not to corrupted ones. Because Speex targets a wide range of devices, its memory footprint is modest and its complexity is variable, so it can also be kept modest.
Production phase	Generally used for final-state, end-user delivery.
Relationship to other formats
Used by	Ogg_SPX, Ogg Speex Audio Format
Affinity to	CELP, Code Excited Linear Prediction. Not documented at this Web site at this time.

Local use

LC experience or existing holdings	In 2007, consideration was being given to the use of Ogg_SPX for service copies of oral history recordings for access via the Web.
LC preference	LPCM preferred for master copies.

Sustainability factors

Disclosure	Fully documented. Developed by the Xiph.org Foundation as an open source, patent-free project.
Documentation	The Speex Codec Manual, Version 1.2 Beta 2, May 22, 2007.
Adoption	See Ogg.
Licensing and patents	The specification provides the license in Appendix D. It is inspired by the BSD (Berkeley Software Distribution) family of free, near-public-domain software licenses. Paraphrasing appendix D: redistributions of source code or binary versions are free but must retain the copyright notice and other wording; the name of the Xiph.org Foundation or of contributors may not be used to endorse or promote products without specific prior written permission.
Transparency	Reading the encoded data depends on algorithms and tools; building such tools requires sophistication.
Self-documentation	See Ogg.
External dependencies	None.
Technical protection considerations	See Ogg.

Quality and functionality factors

Sound
Normal rendering	Good support.
Fidelity (high audio resolution)	This is compression designed for comprehensible speech, not for a rich representation of the full audio spectrum and dynamic range. Paraphrased from the specification: CELP was selected as the encoding technique because it scales well to both low bit-rates (e.g. DoD CELP @ 4.8 kbps) and high bit-rates (e.g. G.728 @ 16 kbps). Speex is designed for three sampling rates: 8 kHz, 16 kHz, and 32 kHz, referred to as narrowband (telephone quality), wideband, and ultra-wideband respectively. The encoding process is controlled most of the time by a quality parameter that ranges from 0 to 10. In constant bit-rate (CBR) operation the quality parameter is an integer, while in variable bit-rate (VBR) operation it is a float. There is also an average bit-rate (ABR) mode that dynamically adjusts VBR quality to meet a specific target bit-rate. Managing the bit-rate is important in VoIP, where the maximum bit-rate must be low enough for the communication channel.
Multiple channels	Provides intensity stereo coding.¹
Support for user-defined sounds, samples, and patches	None.
Functionality beyond normal rendering	Not investigated at this time.
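
The sampling rates, quality parameter, and bit-rate constraint described in the Fidelity row can be illustrated with a little arithmetic. The following is a minimal sketch, not part of the Speex specification or library: the function names are hypothetical, the 20 ms frame duration is an assumption taken from the Speex manual rather than from this page, and the per-packet overhead is an illustrative parameter.

```python
# Sampling rates named in the Fidelity row above.
SPEEX_BANDS = {
    "narrowband": 8_000,       # telephone quality
    "wideband": 16_000,
    "ultra-wideband": 32_000,
}

FRAME_MS = 20  # assumption: one Speex frame covers 20 ms of audio


def samples_per_frame(sample_rate_hz, frame_ms=FRAME_MS):
    """Number of audio samples covered by one frame."""
    return sample_rate_hz * frame_ms // 1000


def payload_bytes_per_frame(bitrate_bps, frame_ms=FRAME_MS):
    """Encoded bytes per frame at a given bit-rate, rounded up."""
    bits_times_1000 = bitrate_bps * frame_ms
    return -(-bits_times_1000 // 8000)  # ceiling division


def fits_channel(bitrate_bps, channel_bps, overhead_bytes=0, frame_ms=FRAME_MS):
    """True if codec payload plus per-packet overhead fits the channel.

    overhead_bytes is a hypothetical per-packet cost (e.g. network headers),
    assuming one frame per packet.
    """
    packet_bytes = payload_bytes_per_frame(bitrate_bps, frame_ms) + overhead_bytes
    total_bps = packet_bytes * 8 * 1000 // frame_ms
    return total_bps <= channel_bps


def check_quality(mode, quality):
    """Quality ranges from 0 to 10: an integer for CBR, a float for VBR."""
    if mode == "cbr":
        ok_type = isinstance(quality, int) and not isinstance(quality, bool)
    elif mode == "vbr":
        ok_type = isinstance(quality, (int, float)) and not isinstance(quality, bool)
    else:
        raise ValueError("mode must be 'cbr' or 'vbr'")
    return ok_type and 0 <= quality <= 10
```

Under these assumptions, a narrowband stream at 8 kbps carries 20 bytes of payload per 160-sample frame; with 40 bytes of hypothetical header overhead per packet the stream needs 24 kbps of channel capacity, three times the codec bit-rate.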

File type signifiers

Tag	Value	Note
Internet Media Type	audio/x-speex	For Speex-in-Ogg, from the main part of the specification.
Internet Media Type	audio/speex	From the February 2008 draft of RTP Payload Format for the Speex Codec; link expires in August 2008, thus not active from this page: http://www.ietf.org/internet-drafts/draft-ietf-avt-rtp-speex-05.txt.
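
For Speex-in-Ogg, a tool might guess the first media type above by inspecting the start of a file. The sketch below is illustrative only and rests on assumptions not stated on this page: that an Ogg page begins with the capture pattern "OggS" followed by a 27-byte header and a segment table, and that the Speex header packet begins with the 8-byte string "Speex   " (the name padded with three spaces). The function name is hypothetical.

```python
from typing import Optional

OGG_CAPTURE = b"OggS"        # assumed Ogg page capture pattern
SPEEX_MAGIC = b"Speex   "    # assumed start of the Speex header packet
OGG_FIXED_HEADER_LEN = 27    # assumed length of the fixed Ogg page header


def guess_speex_media_type(head: bytes) -> Optional[str]:
    """Return 'audio/x-speex' if the bytes look like Speex-in-Ogg, else None."""
    if len(head) < OGG_FIXED_HEADER_LEN or not head.startswith(OGG_CAPTURE):
        return None
    # The last byte of the fixed header gives the segment table length.
    n_segments = head[OGG_FIXED_HEADER_LEN - 1]
    payload_start = OGG_FIXED_HEADER_LEN + n_segments
    if head[payload_start:payload_start + len(SPEEX_MAGIC)] == SPEEX_MAGIC:
        return "audio/x-speex"
    return None
```

In use, the first few hundred bytes of a file would be read in binary mode and passed to the function; a `None` result means only that this particular check did not match, not that the file is invalid.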

Notes

General
History

Format specifications

The Speex Codec Manual, Version 1.2 Beta 2, May 22, 2007 (http://www.speex.org/docs/manual/speex-manual.pdf).

Useful references

URLs

Xiph wiki (http://wiki.xiph.org/index.php/Main_Page).

¹Intensity stereo as explained in the Wikipedia article Joint (audio engineering) (consulted August 24, 2007): "More specifically, the dominance of inter-aural time differences (ITD) for sound localization by humans is only given for lower frequencies. That leaves inter-aural amplitude differences (IAD) as the dominant location indicator for higher frequencies. The idea of intensity stereo coding is to merge the upper spectrum into just one channel (thus reducing overall differences between channels) and to transmit a little side information about how to pan certain frequency regions to recover the IAD cues."
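
The idea in the quotation above, one merged channel plus a little side information about panning, can be sketched as follows. This is a deliberately simplified illustration, not the Speex algorithm: it works on a whole block of time-domain samples rather than only on the upper spectrum, and the function names and the single `pan` value are hypothetical.

```python
import math


def encode_intensity(left, right):
    """Merge two channels into one and keep a single panning value.

    pan is the share of total energy in the right channel:
    0.0 means hard left, 1.0 means hard right.
    """
    mono = [(l + r) / 2 for l, r in zip(left, right)]
    e_left = sum(l * l for l in left)
    e_right = sum(r * r for r in right)
    total = e_left + e_right
    pan = 0.5 if total == 0 else e_right / total
    return mono, pan


def decode_intensity(mono, pan):
    """Rebuild two channels whose energy ratio matches the original."""
    g_left = math.sqrt(2 * (1 - pan))
    g_right = math.sqrt(2 * pan)
    return [m * g_left for m in mono], [m * g_right for m in mono]
```

The decoded channels are scaled copies of the same signal, so only the amplitude difference between them survives; any time difference between the original channels is lost, which is why the quoted explanation restricts the technique to higher frequencies.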

Last Updated: Tuesday, 19-Feb-2008 15:49:05 EST

Sustainability of Digital Formats Planning for Library of Congress Collections
