Subjective tests performed on a sample of expert listeners have evaluated the quality and fidelity of the compression for several values of the bit rate. The European Broadcasting Union (EBU) defines audio quality for large-scale broadcasting as follows:
"The quality of the audio signal reproduced after encoding should be indistinguishable from that obtained from an audio Compact Disc. In practice, this means comparing the analog signal at the output of the decoder with a reference signal from a linear 16-bit system in a double-blind test of type A/B/C, in which the origin of the two signals is hidden to ensure objectivity. The encoder/decoder is considered acceptable if the evaluation of the sound sequences on a 5-grade scale of judgment (CCIR scale) shows coincidence within the 95% confidence interval for the original signal and the decoded one. The test must be repeated on critical signals, and the coincidence must occur for at least 70% of the sequences."
MPEG_Audio encoding meets these constraints with a compression factor of 4 for Layer I, 6 for Layer II and 8 for Layer III.
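To give a feeling for what these compression factors mean in terms of bit rate, the short sketch below divides the bit rate of a linear 16-bit PCM signal by each factor. The 48 kHz sampling rate used as default is an assumption (the text does not state the reference rate; a CD uses 44.1 kHz), and the function names are purely illustrative.

```python
# Compression factors per layer, as quoted in the text.
COMPRESSION_FACTOR = {"I": 4, "II": 6, "III": 8}


def pcm_bit_rate_kbit(sample_rate_hz=48_000, bits=16, channels=1):
    """Bit rate of uncompressed linear PCM in kbit/s.

    The 48 kHz default is an assumption; pass 44_100 for a CD source.
    """
    return sample_rate_hz * bits * channels / 1000


def coded_bit_rate_kbit(layer, sample_rate_hz=48_000, bits=16, channels=1):
    """Approximate coded bit rate: PCM bit rate divided by the layer's factor."""
    pcm = pcm_bit_rate_kbit(sample_rate_hz, bits, channels)
    return pcm / COMPRESSION_FACTOR[layer]


if __name__ == "__main__":
    for layer in ("I", "II", "III"):
        rate = coded_bit_rate_kbit(layer, channels=2)
        print(f"Layer {layer}: about {rate:.0f} kbit/s for two channels")
```

Under these assumptions a two-channel 48 kHz signal (1536 kbit/s uncompressed) comes out at about 384, 256 and 192 kbit/s for Layers I, II and III respectively.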
International organizations have verified that quality is preserved when critical audio sequences are encoded and decoded. Critical audio sequences are those whose signal structure is poorly suited to compression, so that high bit rates are required to maintain CD quality. The tests were performed at bit rates of 192, 128, 96 and 64 kbit/s per monophonic channel. Type A/B/C tests were used: the listener can compare at will the original, A, with two other sequences, B and C, one of which is again the original signal and the other the signal under test, without knowing which is which. The subject must decide whether the encoded signal is B or C and give a grade on the following 5-level scale (CCIR):
GRADE | IMPAIRMENT |
5 | NOT AUDIBLE |
4 | AUDIBLE |
3 | NOT VERY DISTURBING |
2 | DISTURBING |
1 | VERY DISTURBING |
Other tests were performed at the University of Hannover in November 1991, on the joint stereo mode, and at the BBC in 1990, on robustness to errors. Both confirmed the reliability of the encoding. Further tests were performed in early 1991 in Ottawa, Canada, on stereo image quality, error robustness, cascade encoding and comparison with high-quality FM broadcasting. For cascade encoding, bit rates of 192 kbit/s and 128 kbit/s were used. Four-stage cascade encoding at 192 kbit/s and two-stage cascade encoding at 128 kbit/s appear totally transparent, while five-stage cascade encoding at 128 kbit/s is no longer transparent. In the comparison between an FM signal and MPEG encoding at 128 kbit/s, MPEG encoding was judged the better of the two, though not by much; however, the FM signals were generated under ideal conditions that are not representative of normal broadcasting. As a general consideration, the listening quality of an encoded sequence will be closest to the original when:
The conclusion is that MPEG_Audio encoding fully satisfies the EBU constraints for good audio quality.
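The EBU acceptance rule quoted at the start of this section can be illustrated with a small sketch. It interprets "coincidence" as an overlap between the 95% confidence intervals of the mean grades given to the original and to the decoded signal, and it uses a normal approximation (factor 1.96) for the interval; both choices are assumptions made here for illustration, as are the function names and the sample grades.

```python
from math import sqrt
from statistics import mean, stdev

Z95 = 1.96  # normal approximation for a 95% interval (an assumption)


def ci95(grades):
    """Return (low, high) of the 95% confidence interval of the mean grade."""
    m = mean(grades)
    half_width = Z95 * stdev(grades) / sqrt(len(grades))
    return (m - half_width, m + half_width)


def intervals_overlap(a, b):
    """True if the two closed intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


def codec_acceptable(results, min_fraction=0.70):
    """results: list of (original_grades, decoded_grades), one pair per sequence.

    The codec passes if the intervals overlap for at least min_fraction
    of the test sequences.
    """
    hits = sum(intervals_overlap(ci95(o), ci95(d)) for o, d in results)
    return hits / len(results) >= min_fraction


if __name__ == "__main__":
    # Invented grades on the 5-level scale, for illustration only.
    transparent = ([5, 5, 4, 5, 5, 4, 5, 5], [5, 4, 5, 5, 4, 5, 5, 5])
    degraded = ([5, 5, 4, 5, 5, 4, 5, 5], [2, 1, 2, 2, 1, 2, 2, 2])
    print(codec_acceptable([transparent, transparent, transparent, degraded]))
```

A real listening test would normally use a Student t interval and the statistical procedure prescribed by the test method; the sketch only shows the shape of the decision.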
In the encoded sequence, the data that must be protected from errors, or on which errors must at least be identified, are limited in size but of great importance. The portions that must absolutely be error-free for correct decoding are the header, the bit allocation, the SSFEIs and the SCALE FACTORs; the samples, which make up the largest part of a frame, do not have to be protected.
The results of subjective tests evaluating the effect of deliberately introduced corruption in the sequences are reported below. Grades are expressed on the CCIR 5-value scale, and the errors considered are random. Their effects can be noticeably reduced by using the error detection method provided by MPEG, an optional 16-bit CRC code, and by taking some corrective measures. Effect of errors on an encoded MPEG_Audio frame and subjective evaluation on the CCIR scale:
CORRUPTED STRUCTURE | CORRUPTED BITS | SENSITIVITY |
BIT ALLOCATION | RANDOM | CATASTROPHIC |
SSFEI | RANDOM | CATASTROPHIC |
SCALE FACTOR | 5 (MSB) | VERY DISTURBING |
SCALE FACTOR | 4 | VERY DISTURBING |
SCALE FACTOR | 3 | VERY DISTURBING |
SCALE FACTOR | 2 | DISTURBING |
SCALE FACTOR | 1 | NOT VERY DISTURBING |
SCALE FACTOR | 0 (LSB) | AUDIBLE |
SUBBAND SAMPLES | 8-16 (MSB) | DISTURBING |
SUBBAND SAMPLES | 5-7 | NOT VERY DISTURBING |
SUBBAND SAMPLES | 3-4 | AUDIBLE |
SUBBAND SAMPLES | 0-2 (LSB) | NOT AUDIBLE |
Use of the CRC considerably reduces the negative effects. The Hamming distance of the code is d = 4, which allows detection of up to three single-bit errors or of a burst of up to 16 consecutive erroneous bits. Corrective measures when an error is detected consist, for instance, in muting the audio for the duration of the errors or in repeating the previous frame, if it is error-free.
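The detection-plus-concealment idea can be sketched as follows. The sketch assumes the generator polynomial x^16 + x^15 + x^2 + 1 (0x8005) with an all-ones initial register, neither of which is stated in the text, and it treats the protected part of a frame as an opaque byte string; which bits are actually covered depends on the layer. The names `crc16` and `conceal` are hypothetical.

```python
def crc16(data: bytes, poly: int = 0x8005, init: int = 0xFFFF) -> int:
    """Bitwise, MSB-first 16-bit CRC.

    Polynomial and initial value are assumptions, not taken from the text.
    """
    crc = init
    for byte in data:
        crc ^= byte << 8  # feed the next byte into the top of the register
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ poly) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def conceal(frames):
    """Apply the two corrective measures named in the text.

    frames: iterable of (protected_bytes, stored_crc, samples).
    A frame whose CRC does not match is replaced by the last error-free
    frame, or by silence if no error-free frame has been seen yet.
    """
    output = []
    last_good = None
    for protected, stored_crc, samples in frames:
        if crc16(protected) == stored_crc:
            last_good = samples
            output.append(list(samples))
        elif last_good is not None:
            output.append(list(last_good))      # repeat previous good frame
        else:
            output.append([0] * len(samples))   # mute
    return output


if __name__ == "__main__":
    side_info = b"\x12\x34\x56"          # invented protected data
    good_crc = crc16(side_info)
    frames = [
        (side_info, good_crc, [10, 20]),
        (side_info, good_crc ^ 1, [99, 99]),  # simulated corruption
    ]
    print(conceal(frames))
```

Repeating the last good frame is usually less noticeable than muting for isolated errors, while muting is the safer fallback when no valid frame is available.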
The quality of the compressed signal depends strongly on the bit rate available for the channel. As an indication, the following requirements on channel capacity should be observed.
HIGH QUALITY (quality with editing margin). 192 kbit/s for monophonic signals, or 320 kbit/s for two channels, produces high quality with ample margin for later postprocessing operations, such as studio signal processing, for example the addition of spoken comments.
CONVENTIONAL QUALITY (CD) (maximum listening quality). A bit rate of 192 kbit/s for mono or 256 kbit/s for stereo is sufficient. The quality obtained meets the CCIR recommendations for good quality in professional distribution.
GOOD QUALITY. With 96 kbit/s mono or 192 kbit/s on two channels it is possible to obtain encoding with high compression and only a very slight signal degradation. It is expected that joint_stereo encoding will become possible in the future, bringing a considerable improvement at the same bit rate and reaching the quality of 192 kbit/s joint_stereo and of 256 kbit/s stereo. This grade is intended for consumer distribution.
INTERMEDIATE QUALITY. This is obtained with 64 kbit/s for mono or 128 kbit/s for stereo. In such cases, the improvement introduced by joint_stereo encoding starts to be noticeable. Subjective tests are about to be performed by CCIR Task Group 10/2. This grade is intended for the distribution of voice-only signals.
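The four grades above can be read as a lookup table from channel capacity to achievable quality. The sketch below copies the bit rates from the list and picks the highest grade whose requirement fits a given capacity; the function name and the selection rule are illustrative assumptions, not part of any specification.

```python
# (grade, mono kbit/s, two-channel kbit/s), ordered from highest to lowest
# quality; the figures are those given in the list above.
QUALITY_TIERS = [
    ("HIGH QUALITY", 192, 320),
    ("CONVENTIONAL QUALITY (CD)", 192, 256),
    ("GOOD QUALITY", 96, 192),
    ("INTERMEDIATE QUALITY", 64, 128),
]


def best_tier(capacity_kbit, stereo=False):
    """Return the highest grade that fits the capacity, or None if none fits."""
    for name, mono_rate, stereo_rate in QUALITY_TIERS:
        needed = stereo_rate if stereo else mono_rate
        if capacity_kbit >= needed:
            return name
    return None


if __name__ == "__main__":
    for capacity in (320, 256, 192, 128, 64):
        print(capacity, "kbit/s stereo:", best_tier(capacity, stereo=True))
```

For example, a 128 kbit/s two-channel link only reaches the intermediate grade, while 256 kbit/s is enough for the CD grade.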
A capacity of about 2 kbit/s has been reserved for the transmission of PAD (Program Associated Data). This is information associated with the audio signal, and it becomes unusable if it is delayed in a queue or transmitted in a separate service channel. PAD is used optionally to transmit additional information, such as program texts. Another example of data associated with a program is the dynamic range of the broadcast, which the receiver can use to compress the dynamics of the audio signal. Obviously, such data become meaningless if they are delayed with respect to the encoded signal.
Another application of PAD is the transmission of programs with more than two channels, where the PAD capacity is used to accommodate the auxiliary channels.
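To see how little room the roughly 2 kbit/s reserved for PAD leaves in each frame, the following sketch converts the rate into bits per audio frame. It assumes a Layer II frame of 1152 samples at a 48 kHz sampling rate; these parameters are not given in the text and the function name is hypothetical.

```python
def pad_bits_per_frame(pad_rate_bit_s=2000, samples_per_frame=1152,
                       sample_rate_hz=48_000):
    """Number of PAD bits that fit in one audio frame.

    Defaults assume Layer II (1152 samples per frame) at 48 kHz, which
    gives a frame duration of 24 ms; both values are assumptions.
    """
    return pad_rate_bit_s * samples_per_frame / sample_rate_hz


if __name__ == "__main__":
    bits = pad_bits_per_frame()
    print(f"{bits:.0f} bits ({bits / 8:.0f} bytes) of PAD per frame")
```

Under these assumptions each frame carries about 48 bits (6 bytes) of PAD, which is why the data stay tightly synchronized with the audio they describe.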