Gapless playback is the uninterrupted playback of
consecutive audio tracks without intervening silence or clicks at
the point of the track change. Gapless playback is common with
compact discs, gramophone records, or tapes, but is not always
available with other formats that employ compressed digital audio.
This may be a source of annoyance to listeners of music where
tracks segue into each other, such as some classical music (opera
in particular), progressive rock, electronic music, and live
recordings with audience noise between tracks.
Causes of gaps
There are two main reasons why gaps occur during playback:
compression scheme artifacts, and delayed output.
Compression scheme artifacts
Most lossy audio compression schemes add a small amount of silence
to the beginning of a track. One reason is that many such schemes
involve a time/frequency domain transform (such as an MDCT), which
introduces a gap known as encoder delay. This gap can be enlarged
at decode time when the inverse MDCT is performed, because the
inverse transform introduces a gap (decoder delay) of its own.
Another factor is that transforms act on the data in units of
fixed-size blocks. In order for the audio signal to be encoded in
its entirety, a small amount of silence is appended to the input
before the transform. If this padding is not accounted for, it will
be decoded together with the audio data, also introducing gaps
between tracks. Because of these gaps, the playing time of the
audio data is often slightly increased.
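The arithmetic behind this can be sketched as follows. This is a
minimal illustration, not a description of any particular encoder:
the block size of 1152 samples (the MPEG-1 Layer III frame size) and
the encoder delay of 576 samples are assumed example values, and the
function name padded_length is hypothetical.

```python
def padded_length(num_samples, block_size=1152, encoder_delay=576):
    """Return (total_encoded_samples, end_padding) for a block-based encoder.

    Assumes the encoder prepends `encoder_delay` samples of silence and
    then pads the end with silence up to a whole number of blocks.
    """
    needed = encoder_delay + num_samples
    blocks = -(-needed // block_size)  # integer ceiling division
    total = blocks * block_size
    return total, total - needed


# One minute of 44.1 kHz audio (2,646,000 samples per channel).
total, padding = padded_length(60 * 44100)
extra_ms = (total - 60 * 44100) * 1000 / 44100
print(total, padding, round(extra_ms, 1))
```

Under these assumptions the decoded track is a few tens of
milliseconds longer than the source, which is enough to be heard as
a gap between tracks that should run together.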
This issue is technical but also standards-related. The popular
MP3 standard, for example, defines no way to
record the amount of delay or padding for later removal. Also, the
encoder delay may vary from encoder to encoder, making automatic
removal difficult. Even if two tracks are decompressed and merged
into a single track, a gap will usually remain between them. More
recent compressed audio formats (such as Ogg Vorbis) have been
designed to address this problem, and can
therefore produce gapless audio if played back correctly.
Delayed output
Even when the audio file itself does not contain undesirable gaps,
software, firmware and hardware design often add gaps during
playback. In some cases, software closes and re-opens the output
stream when switching tracks, causing the hardware to create a very
short "click". More sophisticated gapless designs avoid this by
keeping the output stream open across track changes.
A different design problem arises when the software, firmware or
hardware is not ready to move seamlessly to the next track by the
time the current track is complete. In this scenario, the listener
is left waiting in silence as the player locates the next file,
reads it, decodes the first blocks if necessary and then starts
loading the buffer for playback. The gap can be half a second or
more, which is very noticeable in "continuous" music such as
certain classical or dance genres.
Many older audio players on personal computers do not implement
the required buffering to play
gapless audio. Some of these rely on third-party gapless audio
plug-ins to buffer output. Some newer players and newer versions of
old players now support gapless playback directly.
Precise gapless playback
When gaps are caused by silence introduced during the compression
process, it is possible to store metadata in the file that
explicitly declares the amount of delay or padding that was
introduced. The audio playback software must be able to recognize
the metadata and trim the decoded audio accordingly; otherwise the
information is simply ignored. Uncompressed audio formats typically
do not require this because the start and end of the original audio
data are clearly defined.
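The trimming step can be sketched as follows, assuming the player
has already read the delay and padding values (in samples) from the
file's metadata. The function name trim_gapless and the plain list
of samples are illustrative choices, not part of any real player.

```python
def trim_gapless(decoded, delay, padding):
    """Drop `delay` samples from the start and `padding` from the end.

    `decoded` is the full decoder output for one track; `delay` and
    `padding` are the sample counts declared in the gapless metadata.
    """
    if delay < 0 or padding < 0:
        raise ValueError("delay and padding must be non-negative")
    end = len(decoded) - padding
    # Guard against metadata that claims more silence than was decoded.
    return decoded[delay:max(end, delay)]


decoded = [0, 0, 1, 2, 3, 0, 0, 0]   # 2 samples of delay, 3 of padding
print(trim_gapless(decoded, delay=2, padding=3))
```

A player that does not understand the metadata would pass the
decoded list through unchanged, silence included.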
Trimming alone may not remove every gap. It may also be necessary
to ensure that the audio hardware is not stopped and restarted
between tracks, which can add a click, and it may help to begin
processing the next track while the current one is still playing so
that the data is available as a continuous stream.
With such measures the software performs no guesswork: the playback
timing is identical to the source.
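One way to picture the "continuous stream" idea is a generator that
always holds one chunk in reserve, so the next track is opened and
its first chunk decoded before the last chunk of the current track
is handed to the output. This is a simplified sketch: decode is a
hypothetical callable returning an iterator of audio chunks, and a
real player would normally use a larger buffer filled by a
background thread.

```python
from itertools import chain

_END = object()  # sentinel marking the end of all tracks


def gapless_stream(tracks, decode):
    """Yield chunks from all tracks as one uninterrupted sequence.

    The chunk after the one being yielded is always fetched first, so
    the next track is already open when the current track runs out.
    """
    chunks = chain.from_iterable(decode(track) for track in tracks)
    lookahead = next(chunks, _END)
    while lookahead is not _END:
        current = lookahead
        lookahead = next(chunks, _END)  # may open the next track here
        yield current


def demo_decode(name):
    # Stand-in decoder: two labelled chunks per track.
    for i in range(2):
        yield f"{name}{i}"


# The consumer writes every chunk to a single output that stays open.
for chunk in gapless_stream(["a", "b"], demo_decode):
    print(chunk)
```

Because the consumer sees a single sequence of chunks, it has no
reason to close and reopen the output device at a track boundary.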
Alternative solutions
Digital signal processor (DSP) plugins can be used to detect
silence between tracks and trim
the audio as necessary on playback. This is not an optimal solution
because it does not always produce results identical to the source.
Sometimes an artist may intentionally leave silence at track
boundaries for dramatic effect; removing this silence also removes
that effect.
It can also be difficult to properly implement silence removal. If
the silence threshold is too low and the track contains decoder
artifacts, the software may not recognize some silences.
Conversely, if the threshold is too high, the software may remove
entire sections of quiet music at the beginning or end of a
track.
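The sensitivity to the threshold can be illustrated with a minimal
sketch. The function trim_silence and the sample values below are
made up for illustration: 0.001 stands in for a decoder artifact,
and 0.02 and 0.03 for quiet music at the track edges.

```python
def trim_silence(samples, threshold):
    """Remove leading and trailing samples quieter than `threshold`."""
    start = 0
    while start < len(samples) and abs(samples[start]) < threshold:
        start += 1
    end = len(samples)
    while end > start and abs(samples[end - 1]) < threshold:
        end -= 1
    return samples[start:end]


track = [0.0, 0.001, 0.02, 0.5, -0.4, 0.03, 0.0]

print(trim_silence(track, 0.0005))  # too low: the artifact survives
print(trim_silence(track, 0.01))    # removes silence and artifact only
print(trim_silence(track, 0.05))    # too high: quiet music is cut
```

No single threshold suits every recording, which is why this
approach cannot guarantee results identical to the source.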
DSP plugins can also be used to cross-fade between tracks. This
eliminates gaps that some listeners find distracting, but also
greatly alters the audio data and is not always desirable. In
particular, when tracks are meant to be played together and the
transition occurs at high volume, cross-fading results in a large
volume drop.
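A linear cross-fade can be sketched as below to show how it alters
the audio. This is one simple fade shape chosen for illustration;
actual plugins may use other curves, and the function name crossfade
is hypothetical.

```python
def crossfade(a, b, overlap):
    """Mix the last `overlap` samples of `a` with the first of `b`.

    Track `a` fades out linearly while track `b` fades in.
    """
    if not 0 < overlap <= min(len(a), len(b)):
        raise ValueError("overlap must fit inside both tracks")
    mixed = []
    for i in range(overlap):
        fade_in = (i + 1) / (overlap + 1)
        tail_sample = a[len(a) - overlap + i]
        mixed.append(tail_sample * (1 - fade_in) + b[i] * fade_in)
    return a[:len(a) - overlap] + mixed + b[overlap:]


a = [1.0, 1.0, 1.0, 1.0]
b = [0.0, 0.0, 0.0, 0.0]
out = crossfade(a, b, overlap=2)
print(len(a) + len(b), len(out))  # the result is shorter than a + b
```

The output is shorter than the two tracks played back to back, and
the samples in the overlap region match neither source, so the
result cannot be identical to the original recording.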
Both of these alternative solutions are typically used to address
compression methods that do not support the metadata for gapless
playback. Like the optimal solution, they still require buffering
and not closing the output stream; however, they require more
computation, making them less efficient. In portable digital audio
players, this can mean a reduced playing time on batteries.
Due to the drawbacks of the alternative solutions above, some
listeners dislike their negative effects more than the gap they
attempt to remove. Another problem is that the solutions above do
nothing to prevent the output stream from being closed and reopened
at track boundaries; some measures can be taken to simulate a
gapless output stream, but they are not always successful and
side-effects may occur.
Another alternative is to ignore track boundaries, encoding a
collection of tracks as a single compressed file and relying on cue
sheets (or something similar) for navigation. While this method
gives gapless playback when the tracks in the collection are played
consecutively, it can be unwieldy due to the possibly large size of
the resulting compressed file. Furthermore, unless the playback
software or hardware can recognize the cue sheets, navigating
between tracks may be difficult.
Lastly, with some implementations, it is possible to add gapless
metadata to existing files. If the encoder is known, it is possible
to guess the encoder delay. Assuming the files were created by
compressing CD audio, the original playback length will be an
integer multiple of 588 samples (one CD sector, 1/75 of a second at
44.1 kHz), so the total playback time can also be guessed. Adding
such information to audio files will work with implementations
that recognize the metadata.
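The guessing step can be sketched as follows. The sketch assumes the
end padding is always shorter than one block of 1152 samples, which
is an assumption made here for illustration and may not hold for
every encoder; the function name candidate_lengths is hypothetical.

```python
CD_SECTOR_SAMPLES = 588  # 44100 samples per second / 75 sectors per second


def candidate_lengths(decoded_total, encoder_delay, block_size=1152):
    """List original lengths consistent with a CD-sourced track.

    A candidate is a multiple of 588 samples that leaves an end
    padding of at least 0 and less than `block_size` samples.
    """
    available = decoded_total - encoder_delay
    lowest = max(available - block_size + 1, 0)
    # Round `lowest` up to the next multiple of one CD sector.
    first = -(-lowest // CD_SECTOR_SAMPLES) * CD_SECTOR_SAMPLES
    return list(range(first, available + 1, CD_SECTOR_SAMPLES))


# 52 blocks of 1152 samples were decoded; the encoder delay is taken
# to be 576 samples.
print(candidate_lengths(52 * 1152, encoder_delay=576))
```

Because one block is nearly twice as long as one CD sector, more
than one length can fit, which is why the result remains a guess
rather than a certainty.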
Format support
Since lossless data compression excludes the possibility of
introducing padding, all lossless audio file formats are inherently
gapless.
These lossy audio file formats have provisions for gapless
encoding:
Some other formats do not officially support gapless encoding, but
some implementations of encoders or decoders may handle gapless
metadata.
- LAME-encoded MP3 can be gapless with
players that support the LAME MP3 info tag.
- AAC in MP4 encoded with Nero Digital from Nero AG can be gapless with foobar2000, latest XMMS2,
and iTunes 7.1.1.5 onwards.
- AAC in MP4 encoded with iTunes (current and previous versions)
is gapless in iTunes 7.0 onwards, 2nd generation iPod nanos, all
video-capable iPods with the latest firmware, and recent versions
of foobar2000.
- iTunes-encoded MP3 is gapless when played back in iTunes 7.0
onwards, 2nd generation iPod nanos, and all video-capable iPods
with the latest firmware.
- ATRAC on both MiniDisc and NW Walkmans is gapless through the use
of time codes.
Player support
Optimal solutions:
Alternative or partial solutions:
- Amarok for Linux
- Windows Media Player: Has supported gapless ripping and playback
of WMA since Windows Media 9. Available on all current Windows
machines.
- XMMS2: Has native support for gapless MP3, Ogg Vorbis and FLAC.
Notes
References
- Despite this, there are encoders which store the amount of
delay and padding introduced in metadata to allow gapless playback.
This can only be used if the playback software is able to interpret
the metadata information.
- Features a table of encoder delay values.
- Vorbis and Speex feature gapless support through the
Ogg layer. The reference
implementation of Speex did not initially ship with gapless
metadata support.