Sustainability of Digital Formats
 Planning for Library of Congress Collections


DTB (Digital Talking Book)

Format Description Properties
• ID: fdd000053
• Short name: DTB
• Content categories: text, sound
• Format category: bitstream encoding, file format, bundle
• Last significant update: 2007-08-01
• Draft status: Full

Identification and description

Full name: Digital Talking Book. ANSI/NISO Z39.86-2002
Description: The NISO Digital Talking Book Standard, ANSI/NISO Z39.86-2002, defines the format and content of the electronic file set that comprises a digital talking book (DTB) and establishes a limited set of requirements for DTB playback devices. It uses established and new specifications to delineate the structure of DTBs whose content can range from XML text only, to text with corresponding spoken audio, to audio with little or no text. DTBs are designed to make print material accessible and navigable for blind or otherwise print-disabled persons.

The standard comprises a set of files, including a mandatory package file, which incorporates a manifest listing all the other component files and a spine, which indicates logical reading order. Component files can be of several types, including: textual content (in XML); audio files; image files; synchronization files (in SMIL); navigation control files (in XML); bookmark/highlight files (in XML).
  Production phase: This bundle of files is likely to be used primarily as a middle-state format, with dissemination to end users managed through publishers or aggregators who provide it in a form appropriate for viewers/players that enforce limitations on access and use consistent with terms imposed by copyright holders.
Relationship to other formats
  Contains: OEBF, Open Ebook Forum Publication Structure 1.0.1
  Based on: XML_DTD, XML Document Type Definition
  May contain: WAVE_LPCM, WAVE Audio File with LPCM Audio
  May contain: MP3_ENC, MP3 Audio Encoding
  May contain: AAC_MP4, Advanced Audio Coding (MPEG-4)
  Other: DTB_Ext, Library of Congress extension to include AMR-WB+ speech codec
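
The role of the package file described above (a manifest of component files plus a spine giving the reading order) can be illustrated with a minimal sketch. The sample package text, file names, and media types below are hypothetical and heavily simplified, and the helper names `local_name` and `read_package` are introduced only for this illustration; a real package file follows the structure defined in the standard and carries more information.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified package file. Real package files carry a DOCTYPE,
# required metadata, and many more manifest items.
SAMPLE_OPF = """<package unique-identifier="uid">
  <manifest>
    <item id="ncx" href="book.ncx" media-type="text/xml"/>
    <item id="text" href="book.xml" media-type="text/xml"/>
    <item id="smil1" href="ch1.smil" media-type="application/smil"/>
    <item id="smil2" href="ch2.smil" media-type="application/smil"/>
    <item id="audio1" href="ch1.mp3" media-type="audio/mpeg"/>
  </manifest>
  <spine>
    <itemref idref="smil1"/>
    <itemref idref="smil2"/>
  </spine>
</package>"""


def local_name(tag):
    """Drop any '{namespace}' prefix so lookups work with or without one."""
    return tag.rsplit("}", 1)[-1]


def read_package(opf_text):
    """Return (manifest, reading_order) parsed from package-file text."""
    root = ET.fromstring(opf_text)
    manifest = {}
    spine_ids = []
    for element in root.iter():
        name = local_name(element.tag)
        if name == "item":
            # Manifest entry: maps an id to a component file.
            manifest[element.get("id")] = element.get("href")
        elif name == "itemref":
            # Spine entry: points at a manifest item by id.
            spine_ids.append(element.get("idref"))
    # Resolve spine ids to file names, preserving reading order.
    return manifest, [manifest[item_id] for item_id in spine_ids]


manifest, reading_order = read_package(SAMPLE_OPF)
print(reading_order)  # ['ch1.smil', 'ch2.smil']
```

The manifest is kept as a plain dictionary so that every component file can be located by id, while the spine is resolved into an ordered list of file names.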

Local use

LC experience or existing holdings: The National Library Service for the Blind and Physically Handicapped (NLS), which is part of LC, acts as maintenance agency for this standard. The unit plans to employ the extended version, DTB_Ext.
LC preference: Preferred format for digital talking books. A DTB or DTB_Ext incorporating the full text of a work is among the preferred XML-based formats for textual works.

Sustainability factors

Disclosure: Open standard
  Documentation: ANSI/NISO Z39.86-2002. Specifications for the Digital Talking Book. ISSN: 1041-5653
Adoption: The National Library Service for the Blind and Physically Handicapped (NLS) is using DTB_Ext, the extended version of the standard, for the production of NLS talking books. As of July 2004, none of the NLS talking books include full text and there are no plans to produce books with full text.

Bookshare.Org, which provides full-text books to print-handicapped users, makes the full text (without audio) of books available in several formats, including the NISO DTB standard (aka DAISY 3).

A forty-member panel representing educators, publishers, technology specialists, and advocacy groups, sponsored by the Office of Special Education Programs at the U.S. Department of Education, has recommended, in a report released in July 2004, that a specific application (or profile) of the NISO DTB standard be adopted as version 1 of the National Instructional Materials Accessibility Standard (NIMAS).
  Licensing and patent claims: None
Transparency: See information on the encoding formats employed for the files that comprise a DTB file set: XML, LPCM, MP3, AAC_MP4. XML and LPCM both rate highly for transparency.
Self-documentation: The Package file can include Dublin Core metadata and an extended set of elements intended to record information about rights and the provenance and generation of the talking book from a source text.
External dependencies: None. The format is designed to support effective use of special hardware and software players for the visually impaired, but does not require them.
Technical protection considerations: None in relation to the sets of files that comply with the ANSI/NISO Z39.86-2002 specification. Digital Talking Books are likely to be distributed to end users via mechanisms that do impose technical protections. Hence it is probable that LC will need to receive such files direct from publisher or aggregator by a special transmittal process rather than by harvesting as if an end user.
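
As an illustration of the self-documentation point above, the following sketch collects Dublin Core elements and extended metadata from package-file text. The sample markup, including the element names, the namespace URI, and the `dtb:sourcePublisher` entry, is an assumption made for this example and should be checked against the standard; `read_metadata` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Hypothetical metadata fragment; names and namespace are illustrative only.
SAMPLE_METADATA = """<package>
  <metadata>
    <dc-metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
      <dc:Title>Example Book</dc:Title>
      <dc:Creator>Jane Author</dc:Creator>
      <dc:Language>en</dc:Language>
    </dc-metadata>
    <x-metadata>
      <meta name="dtb:sourcePublisher" content="Example Press"/>
    </x-metadata>
  </metadata>
</package>"""

# Matches any version of the Dublin Core element-set namespace.
DC_NAMESPACE_PREFIX = "{http://purl.org/dc/elements/"


def read_metadata(opf_text):
    """Return (dublin_core, extended) dictionaries from package-file text."""
    root = ET.fromstring(opf_text)
    dublin_core = {}
    extended = {}
    for element in root.iter():
        name = element.tag.rsplit("}", 1)[-1]
        if element.tag.startswith(DC_NAMESPACE_PREFIX):
            # Dublin Core elements may repeat, so values are kept in lists.
            value = (element.text or "").strip()
            dublin_core.setdefault(name.lower(), []).append(value)
        elif name == "meta" and element.get("name"):
            # Extended metadata is modeled here as name/content pairs.
            extended[element.get("name")] = element.get("content")
    return dublin_core, extended


dublin_core, extended = read_metadata(SAMPLE_METADATA)
print(dublin_core["title"])  # ['Example Book']
print(extended)              # {'dtb:sourcePublisher': 'Example Press'}
```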

Quality and functionality factors

Text
Normal rendering for text: Good support.
Integrity of structure: The logical structure of a document is an important part of a DTB with textual content. See DAISY Structure Guidelines.
Integrity of layout: This standard focuses on the textual content, the logical structure, and the synchronization of text with audio of the text being read. Layout is not of primary significance in rendering for the visually impaired.
Integrity of rendering of equations, etc.: Not supported.
Beyond normal rendering for text: Supports embedding of audio, images, and synchronization of text with audio of the text being read.
Sound
Normal rendering for sound: Good support.
Fidelity (support for high audio resolution): Not intended for audio quality beyond CD quality. DTB players must support sample rates of 44.1, 22.05, and 11.025 kHz at a depth of 16 bits per sample. Compressed audio must be encoded such that the output sampling rate is restricted to one of the above three rates and uses a constant bit rate.
Support for multiple sound channels: DTB players are not required to support multiple channels, but must recognize stereo and render at least as monaural.
Support for downloadable or user-defined sounds, samples, and patches: Not investigated at this time.
Beyond normal rendering for sound: Synchronization with text transcription.
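
The sample-rate and bit-depth requirements listed under Fidelity can be checked for uncompressed WAVE files with a short sketch using Python's standard `wave` module. The function names `check_wav` and `make_silent_wav` are hypothetical, and the check covers only PCM WAVE files, not the compressed encodings. Stereo is not flagged, because players must at least render it as monaural.

```python
import io
import wave

# Values taken from the Fidelity entry above.
ALLOWED_SAMPLE_RATES_HZ = {44100, 22050, 11025}
REQUIRED_SAMPLE_WIDTH_BYTES = 2  # 16 bits per sample


def check_wav(stream):
    """Return a list of problems; an empty list means the file fits."""
    problems = []
    with wave.open(stream, "rb") as wav:
        rate = wav.getframerate()
        width = wav.getsampwidth()
        if rate not in ALLOWED_SAMPLE_RATES_HZ:
            problems.append(f"unsupported sample rate: {rate} Hz")
        if width != REQUIRED_SAMPLE_WIDTH_BYTES:
            problems.append(f"unsupported sample depth: {width * 8} bits")
    return problems


def make_silent_wav(rate_hz, channels=1, width_bytes=2, frames=100):
    """Build a short silent WAVE file in memory, for demonstration only."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(width_bytes)
        wav.setframerate(rate_hz)
        wav.writeframes(b"\x00" * frames * channels * width_bytes)
    buffer.seek(0)
    return buffer


print(check_wav(make_silent_wav(22050)))  # []
print(check_wav(make_silent_wav(48000)))  # ['unsupported sample rate: 48000 Hz']
```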

File type signifiers

Tag type | Value | Note
Filename extension | opf | For the required Package file. Documented in NISO standard.
Filename extension | xml | For textual content files. Documented in NISO standard.
Filename extension | ncx | For navigation control files. Documented in NISO standard.
Filename extension | aac | For AAC_MP4 audio files. Documented in NISO standard.
Filename extension | mp3 | For MP3 audio files. Documented in NISO standard.
Filename extension | wav | For WAVE_LPCM audio files. Documented in NISO standard.
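
The extensions listed above can be turned into a simple lookup, sketched below. The mapping covers only the six extensions in this table; anything else (for example, synchronization files) is reported as unknown. The function name `component_type` is hypothetical, and an extension alone does not prove that a file conforms to the standard.

```python
from pathlib import PurePosixPath

# Extensions and descriptions taken from the table above.
COMPONENT_BY_EXTENSION = {
    "opf": "package file",
    "xml": "textual content file",
    "ncx": "navigation control file",
    "aac": "AAC_MP4 audio file",
    "mp3": "MP3 audio file",
    "wav": "WAVE_LPCM audio file",
}


def component_type(filename):
    """Guess a DTB component type from the filename extension alone."""
    extension = PurePosixPath(filename).suffix.lstrip(".").lower()
    return COMPONENT_BY_EXTENSION.get(extension, "unknown")


print(component_type("book.opf"))  # package file
print(component_type("CH1.MP3"))   # MP3 audio file
print(component_type("ch1.smil"))  # unknown
```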

Notes

General: The standard supports any of the following classes of digital talking book:
• Audio with Title element only: DTB without structure. This is the simplest class of DTB and is used for books where structure will not be applied. The XML textual content file may not be present, or if it is, contains only the title of the book, and other required notation. The book must be read linearly. Direct access to points within the DTB is not possible.
• Audio with NCX only (Navigation Control): DTB with structure. The XML textual content file, if present, contains only the structure of the book and may contain links to features such as narrated footnotes, etc. This is the most common form of DTB and is ideal for stand-alone players.
• Audio with NCX and partial text: DTB with structure and some additional text. The XML textual content file contains only the structure of the book and the text of components where keyword searching and direct access to the text would be beneficial, e.g., index, glossary, etc.
• Audio and full text: DTB with structure and complete text and audio. This form of a DTB is the most complex but provides the greatest level of access. The XML textual content file contains the structure and the full text of the book. The audio and the text are synchronized.
• Full text and some audio: DTB with structure, complete text and limited audio. The XML textual content file contains the structure and the text of the book. The audio files contain recordings of parts of the text. This type of DTB could be used for a dictionary where only pronunciations were provided in audio form.
• Text and no audio: E-text with structure. The XML textual content file contains the structure and text of the book. There are no audio files.
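
The six classes listed above differ in how much audio and how much text they carry, which can be sketched as a lookup on those two dimensions. This is a hypothetical simplification for illustration: the enumerations `Audio` and `Text` and the function `classify_dtb` are not defined by the standard, and combinations outside the six classes return `None`.

```python
from enum import Enum


class Audio(Enum):
    NONE = "none"
    PARTIAL = "partial"
    FULL = "full"


class Text(Enum):
    TITLE_ONLY = "title only"
    STRUCTURE_ONLY = "structure only"
    PARTIAL = "structure plus partial text"
    FULL = "structure plus full text"


# One entry per class in the list above, keyed by (audio, text) coverage.
DTB_CLASSES = {
    (Audio.FULL, Text.TITLE_ONLY): "Audio with Title element only",
    (Audio.FULL, Text.STRUCTURE_ONLY): "Audio with NCX only",
    (Audio.FULL, Text.PARTIAL): "Audio with NCX and partial text",
    (Audio.FULL, Text.FULL): "Audio and full text",
    (Audio.PARTIAL, Text.FULL): "Full text and some audio",
    (Audio.NONE, Text.FULL): "Text and no audio",
}


def classify_dtb(audio, text):
    """Return the class name, or None if the combination is not listed."""
    return DTB_CLASSES.get((audio, text))


print(classify_dtb(Audio.PARTIAL, Text.FULL))     # Full text and some audio
print(classify_dtb(Audio.NONE, Text.TITLE_ONLY))  # None
```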
History: This format is also known as DAISY 3, being the third in a sequence of talking book formats. From the point of view of long-term sustainability, the earlier DAISY formats are less likely to be appropriate for LC collections. The previous version is DAISY 2.02, which is based on XHTML rather than XML.

Format specifications

URLs

Print

Useful references

URLs

Print


Last updated 08/20/2007