Sample Format - Bit Depth
See also Sample Rates for help on choosing the appropriate sample rate to work with.
Contents
- Dynamic Range
- Audacity Defaults
- Effects on file size
- Which bit depth to use
- Bit depth equivalent noise
Dynamic Range
The difference between digital sample values and the analog waveform is the "quantization error". Quantization error is experienced as noise. Fewer bits means less amplitude precision, which means more digital noise. More bits means greater precision, which means less digital noise.
Bit depth is the number of binary digits ("bits") used to carry the data in each sample of audio. In PCM digital audio, the analog waveform is represented as a series of sample values where each sample is a measurement of the waveform amplitude (vertical position) at that point in time.
To represent the amplitude precisely requires that the range of valid sample values (maximum positive value to minimum negative value) is subdivided into a large number of discrete values. The larger the number of discrete values (more subdivisions) between maximum and minimum, the more precisely the digital form will represent the original analog waveform. More "bits per sample" means more subdivisions, thus greater accuracy representing the amplitude at each sample point.
The bit depth chosen for recording limits the dynamic range of the recording. (Other factors in the audio chain may also limit this, so more bits will not always produce a better recording.)
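The relationship between bit depth and quantization noise can be illustrated with a small sketch. It rounds a full-scale sine wave to a given number of bits and measures the resulting signal to noise ratio. The function names, the test tone and the symmetric scaling are assumptions made for this illustration and are not taken from Audacity. For a full-scale sine wave the result should land close to the commonly quoted rule of thumb of about 6.02 dB per bit plus 1.76 dB.

```python
import math

def quantize(x, bits):
    """Round a sample in [-1.0, 1.0] to the nearest value representable
    with the given number of bits (symmetric scaling, assumed here)."""
    scale = 2 ** (bits - 1) - 1      # e.g. 32767 for 16-bit
    return round(x * scale) / scale

def measured_snr_db(bits, n=20000):
    """Signal to noise ratio in dB of a full-scale sine after quantization."""
    signal_power = 0.0
    noise_power = 0.0
    for i in range(n):
        # Test tone whose frequency is not a simple fraction of the sample
        # rate, so the quantization error is spread evenly.
        x = math.sin(2 * math.pi * i * math.sqrt(2) / 100)
        err = quantize(x, bits) - x  # the quantization error for this sample
        signal_power += x * x
        noise_power += err * err
    return 10 * math.log10(signal_power / noise_power)

for bits in (8, 12, 16):
    print(bits, "bits:", round(measured_snr_db(bits), 1), "dB")
```

Each extra bit halves the size of the quantization step, which is why the measured figure rises by roughly 6 dB per bit.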
Audacity Defaults
The Audacity default quality settings are Sample Format 32-bit float (and Sample Rate 44100 Hz). It is strongly recommended that you use these settings unless you have good reasons to deviate from these. 32-bit float is chosen to give an extremely low noise floor and to provide good headroom to avoid sound distortion even when performing heavy editing and manipulation of the audio.
Audacity uses "float" format for 32-bit recording instead of fixed integer format because normalized floating point values are quicker and easier to process on computers than fixed integer values, and they allow greater dynamic range to be retained even after editing. Intermediate signals during audio processing can vary widely in level. If they are all truncated to a fixed integer format, you can't boost them back up to full scale without losing resolution (that is, without the data becoming less representative of the original than it was before). With floating point, rounding errors during intermediate processing are negligible.
The (theoretically audible) advantage of this is that 32-bit floating point format retains the original noise floor, and does not add noise. For example, with fixed integer data, applying a compressor effect to lower the peaks by 9 dB and separately amplifying back up would cost 9 dB (about 1.5 bits) of signal-to-noise ratio (SNR). If done with floating point data, the SNR of the peaks remains as good as before (except that the quiet passages are 9 dB louder and so 9 dB noisier due to the noise they had in the first place).
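The loss described above can be sketched in a few lines. This is a simplified model, not Audacity's processing: the compressor is replaced by a plain 9 dB gain reduction, the make-up step is a plain 9 dB gain, and all function names are hypothetical. The integer path rounds to whole 16-bit values after each step, while the float path keeps the intermediate values as they are.

```python
import math

GAIN_DOWN = 10 ** (-9 / 20)   # -9 dB expressed as a linear factor
GAIN_UP = 1 / GAIN_DOWN       # +9 dB, the matching make-up gain

def round_trip_int16(sample):
    """Lower by 9 dB, then raise by 9 dB, storing a 16-bit integer each time."""
    lowered = round(sample * GAIN_DOWN)      # resolution is lost here
    restored = round(lowered * GAIN_UP)      # the lost detail cannot come back
    return max(-32768, min(32767, restored))

def round_trip_float(sample):
    """The same two gain changes with no intermediate rounding."""
    return sample * GAIN_DOWN * GAIN_UP

def rms_error(round_trip, samples):
    """Root mean square difference between processed and original samples."""
    total = sum((round_trip(s) - s) ** 2 for s in samples)
    return math.sqrt(total / len(samples))

# An assumed test signal: a sine wave stored as 16-bit integer sample values.
samples = [round(20000 * math.sin(2 * math.pi * i * math.sqrt(2) / 100))
           for i in range(5000)]

int_error = rms_error(round_trip_int16, samples)
float_error = rms_error(round_trip_float, samples)
print("integer round trip error:", int_error)
print("float round trip error:", float_error)
```

In this model the integer path ends up with an error that is a sizeable fraction of one 16-bit step, while the float path returns values that are practically identical to the input.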
In many cases you will be exporting to a 16-bit format (for example if you are burning to a standard audio CD, that format is by definition 16-bit 44100 Hz). The advantage of using 32-bit float to work with holds even if you have to export to a 16-bit format. Using Dither on the Quality pane of Audacity Preferences will improve the sound quality of the exported file so there are only minimal (probably inaudible) effects of reducing the bit depth from 32-bit to 16-bit.
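To show why dither helps when reducing bit depth, here is a minimal sketch using triangular (TPDF) dither, which is one common choice. It is not a description of Audacity's own dither algorithms, and the function name, the scaling by 32767 and the test level are assumptions for illustration. A constant level of 0.3 of one 16-bit step is converted with and without dither.

```python
import random

def to_int16(x, rng=None):
    """Convert a float sample in [-1.0, 1.0] to a 16-bit integer.
    If rng is given, triangular dither is added before rounding."""
    scaled = x * 32767
    if rng is not None:
        # The difference of two uniform random numbers has a triangular
        # distribution between -1 and +1 (in units of one 16-bit step).
        scaled += rng.random() - rng.random()
    return max(-32768, min(32767, round(scaled)))

quiet = 0.3 / 32767        # a level smaller than one 16-bit step
rng = random.Random(0)     # fixed seed so the run is repeatable

plain = [to_int16(quiet) for _ in range(20000)]
dithered = [to_int16(quiet, rng) for _ in range(20000)]

mean_plain = sum(plain) / len(plain)
mean_dithered = sum(dithered) / len(dithered)
print("average without dither:", mean_plain)
print("average with dither:", mean_dithered)
```

Without dither every sample rounds to zero, so the quiet detail is lost entirely. With dither the individual samples are noisier, but their average stays close to the original level, which is the trade of a little added noise for the removal of quantization distortion.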
Effects on file size
Bit depth affects file size. All other things being equal, a 32-bit file is twice the size of a 16-bit file, and an 8-bit file half the size of a 16-bit one.
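As a rough worked example, the size of uncompressed PCM audio data can be estimated from the duration, sample rate, number of channels and bit depth. The function name below is hypothetical, and the estimate ignores file headers and any metadata.

```python
def pcm_size_bytes(seconds, sample_rate, channels, bits):
    """Approximate size in bytes of uncompressed PCM audio data."""
    bytes_per_sample = bits // 8
    return seconds * sample_rate * channels * bytes_per_sample

# One minute of stereo audio at 44100 Hz in three bit depths.
for bits in (8, 16, 32):
    print(bits, "bit:", pcm_size_bytes(60, 44100, 2, bits), "bytes")
```

One minute of 16-bit stereo audio at 44100 Hz comes to 10,584,000 bytes of sample data, and the 32-bit version is exactly double that.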
Which bit depth to use
The fewer bits that are available per sample, the less precisely the digital audio can match the analog waveform, so the more digital noise will be present in the recording. To avoid unnecessarily reducing the sound quality, the selected bit depth should have significantly higher dynamic range than the material being recorded.
32-Bit
If you want or need the highest standards (for example, if you operate a recording studio), expect to do a large amount of manipulation of the data before export, and have audio source equipment with an extremely low noise floor, then 32-bit recording (which is the default setting in Audacity) will give the best possible quality. It also avoids the bit depth having any effect on the sound even after heavy manipulation of the audio.
Finding audio sources capable of providing signals with better dynamic range than 24-bit resolution is a demanding task. In fixed integer terms, a 32-bit data stream has 65,536 times as many discrete sample values as 16-bit CD audio. In real world applications, many of those bits will normally be recording nothing but very low level background noise.
24-Bit
24-bit recording may be used for signals that will be manipulated but still need to maintain the full 16-bit quality of CD audio. 24-bit is good for mastering.
If you're merely listening to thousands of pounds of expertly chosen high end audio kit, and not doing large amounts of editing, there may be no real reason to exceed 24-bit depth.
16-Bit
16-bit matches audio CDs, and is thus suited where the better dynamic range and S/N ratio of CD quality audio is required. 16-bit is a good general purpose high quality setting. 16-bit recording is suitable for vinyl records.
8-Bit
8-bit resolution produces low quality audio, comparable to "Telephone quality". Audacity does not itself support 8-bit recording. 16-bit is the nearest option. It is possible to export files in an 8-bit format, though Audacity defaults to exporting as 16-bit.
If medium quality sources are to be manipulated before saving the recording, it may be preferable to record in 16-bit to avoid any possible quality loss during application of effects.
Bit Depth Equivalent Noise
The digital noise level for 16-bit or greater is extremely low. Audio CDs use 16-bit audio data, and at normal listening levels the digital noise level is too quiet to be audible.
32-bit (float): Recommended for audio processing. Allows very large amounts of amplification or other effects to be applied with no noticeable degradation of sound quality.
16-bit: Allows a dynamic range greater than 90 dB, which is roughly the difference between the sound level of a jackhammer at 1 meter distance, and normal breathing at 1 meter.
12-bit: Similar dynamic range to a top of the range cassette player when using high quality cassette tape and Dolby C-type noise reduction.
8-bit: Used in early PCs and video games. Similar dynamic range to analog telephone (landline), though note that telephone sound quality is also affected by a limited frequency range.
5-bit: Similar dynamic range to a 78 RPM phonograph disc after it has been played a few times.
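For comparison with the descriptions above, the theoretical dynamic range of fixed integer audio can be estimated as 20 times the base-10 logarithm of the number of available sample values, which works out to about 6 dB per bit. This is a sketch with a hypothetical function name. It applies to integer formats only, so 32-bit float is not covered by this formula, and real recordings will usually achieve less than these figures.

```python
import math

def dynamic_range_db(bits):
    """Theoretical dynamic range in dB of fixed integer audio with this many bits."""
    return 20 * math.log10(2 ** bits)

for bits in (5, 8, 12, 16, 24):
    print(bits, "bit:", round(dynamic_range_db(bits), 1), "dB")
```

The 16-bit result is consistent with the "greater than 90 dB" figure quoted above.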