Audio coding technology standards

(1) Telephone quality audio compression coding technology standard

Telephone-quality speech occupies the band from 300 Hz to 3.4 kHz and is digitized with standard pulse code modulation (PCM). Sampling at 8 kHz with 8-bit quantization gives a data rate of 64 kbit/s, which is one digital voice channel. In 1972, CCITT (now ITU-T) issued the PCM standard G.711 at 64 kbit/s. It uses non-linear (logarithmic) quantization, either μ-law or A-law, and its quality is roughly equivalent to 12-bit linear quantization.
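The idea behind non-linear quantization can be sketched in a few lines of Python. This is a minimal illustration that uses the continuous μ-law formula with μ = 255. G.711 itself specifies a segmented, piecewise-linear approximation of this curve, so the function names and the simple 8-bit mapping below are assumptions made for illustration and are not the bit-exact G.711 tables.

```python
import math

MU = 255.0                 # mu-law parameter (value used with 8-bit mu-law)
PCM_BIT_RATE = 8000 * 8    # 8 kHz sampling x 8 bits = 64000 bit/s

def mu_law_compress(x: float) -> float:
    """Map a sample in [-1, 1] to a companded value in [-1, 1]."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)

def mu_law_expand(y: float) -> float:
    """Inverse of mu_law_compress."""
    return math.copysign(((1.0 + MU) ** abs(y) - 1.0) / MU, y)

def encode_8bit(x: float) -> int:
    """Compand, then quantize uniformly to one of 256 codes (illustrative mapping)."""
    y = mu_law_compress(x)
    return int(round((y + 1.0) / 2.0 * 255))

def decode_8bit(code: int) -> float:
    """Map a code back to the companded domain and expand it."""
    y = code / 255.0 * 2.0 - 1.0
    return mu_law_expand(y)
```

Because the logarithmic curve spends more codes on small amplitudes, a quiet sample such as 0.01 is reconstructed with a much smaller error than the step of a uniform 8-bit quantizer (about 0.008) would allow.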

In 1984, CCITT published the adaptive differential pulse code modulation (ADPCM) standard G.721 at 32 kbit/s. ADPCM quantizes the difference between each sample and its predicted value, and adapts the quantization parameters to the behavior of the neighboring difference signals. This raises the compression ratio while maintaining acceptable signal quality. ADPCM therefore codes signals of medium (telephone-level) quality efficiently, and it is also used to compress audio for AM broadcasting and interactive compact discs (CD-I).
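The two ingredients described above, coding the prediction difference and adapting the quantizer step, can be shown with a toy coder. This is only a sketch under simplifying assumptions: the predictor is just the previous reconstructed sample, the 4-bit code range and the step multipliers (1.5 and 0.8) are made up for illustration, and none of it reproduces the actual G.721 predictor or adaptation tables.

```python
def _adapt(step: float, code: int) -> float:
    """Grow the step after large codes, shrink it after small ones (assumed rule)."""
    if abs(code) >= 6:
        step *= 1.5
    elif abs(code) <= 1:
        step *= 0.8
    return min(max(step, 1.0), 4096.0)

def adpcm_encode(samples, step: float = 16.0):
    """Return one 4-bit code (-8..7) per input sample."""
    codes = []
    predicted = 0.0
    for s in samples:
        diff = s - predicted                       # difference from the prediction
        code = max(-8, min(7, int(round(diff / step))))
        codes.append(code)
        predicted += code * step                   # track what the decoder will see
        step = _adapt(step, code)
    return codes

def adpcm_decode(codes, step: float = 16.0):
    """Rebuild the signal from the codes, mirroring the encoder's state updates."""
    out = []
    predicted = 0.0
    for code in codes:
        predicted += code * step
        out.append(predicted)
        step = _adapt(step, code)
    return out
```

The encoder updates its prediction from the quantized difference rather than from the true input, so encoder and decoder stay in step without any side information.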

Low-rate voice communication requires parametric or hybrid coding techniques such as linear predictive coding (LPC), vector quantization (VQ), and analysis-by-synthesis methods. A typical example is code-excited linear prediction (CELP), which is in effect a closed-loop LPC system: it derives the best predictor parameters from the input speech, then searches the codebook for the excitation vector that minimizes a chosen error criterion. CELP is robust against interference and delivers fairly high speech quality at transmission rates of 4 to 16 kbit/s. In 1992, CCITT issued G.728, the standard for low-delay code-excited linear prediction (LD-CELP) at 16 kbit/s, whose quality is essentially equivalent to that of G.721 at 32 kbit/s.
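The closed-loop search can be sketched as follows. The sketch assumes a plain squared-error criterion, a tiny hand-written codebook supplied by the caller, and an all-pole synthesis filter whose coefficients are already known. Real CELP coders add perceptual weighting, an adaptive (pitch) codebook, and much larger codebooks, none of which is shown here; the function names are illustrative.

```python
def synthesize(excitation, lpc_coeffs):
    """All-pole synthesis filter: y[n] = e[n] + sum_k a[k] * y[n-1-k]."""
    out = []
    for n, e in enumerate(excitation):
        acc = float(e)
        for k, a in enumerate(lpc_coeffs):
            if n - 1 - k >= 0:
                acc += a * out[n - 1 - k]
        out.append(acc)
    return out

def search_codebook(target, codebook, lpc_coeffs):
    """Return (index, gain, error) of the codebook vector that best matches target.

    Each candidate is passed through the synthesis filter (analysis by synthesis),
    scaled by its least-squares gain, and compared with the target speech frame.
    """
    best = None
    for index, vector in enumerate(codebook):
        y = synthesize(vector, lpc_coeffs)
        energy = sum(v * v for v in y)
        if energy == 0.0:
            continue                                # silent vector, no usable gain
        gain = sum(t * v for t, v in zip(target, y)) / energy
        error = sum((t - gain * v) ** 2 for t, v in zip(target, y))
        if best is None or error < best[2]:
            best = (index, gain, error)
    return best
```

Only the winning index and a quantized gain need to be transmitted, which is where the low bit rate comes from.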

In 1988, the European digital mobile working group (Groupe Spécial Mobile, GSM) adopted a speech coding standard based on regular pulse excitation with long-term prediction (RPE-LTP) at 13 kbit/s. In 1989, the United States adopted vector sum excited linear prediction (VSELP) for the CTIA digital mobile communication voice standard at 8 kbit/s. For secure communications, the National Security Agency (NSA) developed an LPC-based coding scheme at 2.4 kbit/s in 1982 and a CELP-based scheme at 4.8 kbit/s in 1989.

Other related standards include:

G.723: An ITU-T recommendation for a dual-rate speech coder (G.723.1) used in multimedia communication, operating at 5.3 and 6.3 kbit/s.

H.221: The framing part of the ITU-T H.320 family, formally titled "Frame structure for a 64 to 1920 kbit/s channel in audiovisual teleservices." It describes the synchronization operations that keep the encoder and decoder aligned in time.

H.222: An ITU-T recommendation that specifies the generic coding of moving pictures and associated audio information.

H.223: Part of the ITU-T H.324 family, a control/multiplexing protocol formally titled "Multiplexing protocol for low bit rate multimedia communication."

H.233: A multiplexing recommendation that is part of the ITU-T family of video interoperability recommendations. It specifies how individual frames of audiovisual information are multiplexed onto a digital channel.

H.231: A recommendation in the ITU-T H.320 family that specifies the multipoint control unit (MCU) used to bridge three or more H.320-compliant codecs together in a multipoint conference.

H.242: Part of the ITU-T H.320 family of video interoperability recommendations. It specifies the protocol for establishing an audiovisual session and for ending it once communication is finished.

H.245: Part of the ITU-T H.323 and H.324 families. It defines communication control between multimedia terminals.

H.261: An ITU-T recommendation that lets different video codecs interpret how a signal has been encoded and compressed, and how to decode and decompress it. It also defines two picture formats, CIF and QCIF.

H.263: The video codec included in the H.324 protocol family.

H.320: An ITU-T standard that groups many individual recommendations covering coding, framing, signaling, and connection establishment (H.221, H.230, H.321, H.242, and H.261). It applies to point-to-point and multipoint videoconference sessions and includes three audio algorithms: G.711, G.722, and G.728.

H.323: Extends H.320 to packet-switched networks such as intranets, extranets, and the Internet, including Ethernet, token ring, and other networks that may not guarantee quality of service (QoS). It also specifies videoconferencing over ATM, including ATM QoS, and supports both point-to-point and multipoint operation.

H.324: An ITU-T standard that provides point-to-point data, video, and audio conferencing over analog telephone lines (POTS). The H.324 family includes H.223 (a multiplexing protocol), H.245 (a control protocol), T.120 (a family of audiographic conferencing protocols), and V.34 (a modem specification).

T.120: The ITU-T multimedia data transmission protocol family, a data sharing and data conferencing specification that lets users share files in any H.32x videoconference.

(2) Audio compression coding standard for AM broadcast quality

AM-broadcast-quality audio occupies the band from 50 Hz to 7 kHz. CCITT issued the G.722 standard in 1988. G.722 samples at 16 kHz with 14-bit quantization, giving a source data rate of 224 kbit/s. It uses subband coding: a filter splits the input audio into a low subband and a high subband, each subband is coded with ADPCM, and the two coded streams are multiplexed to form the output bit stream. This compresses 224 kbit/s to 64 kbit/s. Auxiliary data can then be inserted at up to 16 kbit/s, so G.722 can carry one AM-broadcast-quality audio signal on a single B channel of narrowband ISDN (N-ISDN).
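The split into two subbands and the resulting rate arithmetic can be sketched as follows. The filter pair here is the simplest possible one (pairwise average and difference), chosen as an assumption to keep the example short; G.722 itself uses longer quadrature mirror filters. The 6-bit and 2-bit allocations reflect the 64 kbit/s mode of G.722, in which each subband runs at 8 kHz.

```python
SOURCE_RATE = 16000 * 14          # 16 kHz x 14 bits = 224000 bit/s
LOW_BAND_RATE = 8000 * 6          # low subband: 6 bits per sample at 8 kHz
HIGH_BAND_RATE = 8000 * 2         # high subband: 2 bits per sample at 8 kHz
CODED_RATE = LOW_BAND_RATE + HIGH_BAND_RATE   # 64000 bit/s

def split_two_bands(samples):
    """Split an even-length signal into half-rate low and high subbands."""
    if len(samples) % 2:
        raise ValueError("expected an even number of samples")
    low = [(samples[i] + samples[i + 1]) / 2 for i in range(0, len(samples), 2)]
    high = [(samples[i] - samples[i + 1]) / 2 for i in range(0, len(samples), 2)]
    return low, high

def merge_two_bands(low, high):
    """Recombine the two subbands into the full-rate signal."""
    out = []
    for l, h in zip(low, high):
        out.extend((l + h, l - h))
    return out
```

Each subband runs at half the input sampling rate, so the low band, where most of the signal energy lies, can be given more bits than the high band.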

(3) High-fidelity stereo audio compression coding technology standard

High-fidelity stereo audio covers 50 Hz to 20 kHz. Digitized at a 44.1 kHz sampling frequency with 16-bit quantization, it has a data rate of about 705.6 kbit/s per channel.

Speech has a relatively small dynamic range and frequency response. It is sampled at 8 kHz with 8 bits per sample, and current speech compression can reduce the bit rate from the original 64 kbit/s to about 4 kbit/s. Sound in multimedia communication is far more complex than speech: its dynamic range can reach 100 dB and its frequency response can span 20 Hz to 20 kHz, so digitized sound carries a great deal of data. For example, 6-channel surround sound digitized at 48 kHz per channel with 18 bits per sample has a data rate of 5.184 Mbit/s. Even two-channel stereo comes to about 1.5 Mbit/s after digitization, while a digitally compressed TV picture signal runs at about 1.5 to 10 Mbit/s. The bit rate of uncompressed sound is therefore disproportionately high, and sound must be digitally compressed to make efficient use of valuable channel resources.
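The uncompressed rates quoted in this section all follow from the same product of sampling rate, sample size, and channel count. The helper below is a small sketch for checking them; the function name is illustrative.

```python
def pcm_bit_rate(sample_rate_hz: int, bits_per_sample: int, channels: int = 1) -> int:
    """Uncompressed PCM data rate in bit/s."""
    return sample_rate_hz * bits_per_sample * channels

telephone = pcm_bit_rate(8000, 8)            # 64 kbit/s speech channel
cd_mono = pcm_bit_rate(44100, 16)            # one hi-fi channel
cd_stereo = pcm_bit_rate(44100, 16, 2)       # two-channel stereo
surround = pcm_bit_rate(48000, 18, 6)        # 6-channel surround example

# Compression ratio implied by coding 64 kbit/s speech at about 4 kbit/s.
speech_ratio = telephone / 4000
```

Stereo at 44.1 kHz works out to 1.4112 Mbit/s, which is the "about 1.5 Mbit/s" figure quoted above.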

To establish a common set of video and sound coding schemes, the ISO/IEC standards organization set up ISO/IEC JTC1/SC29/WG11, known as MPEG (Moving Picture Experts Group). The group is responsible for comparing and evaluating low-bit-rate digital sound coding technologies in order to produce international standards for moving pictures, the associated sound, and their combination, for storage and playback on digital storage media (DSM). The DSM targeted by MPEG include CD-ROM, DAT, magneto-optical disks, and computer disks. MPEG-based compression is also intended for a range of communication channels, such as ISDN, local area networks, and broadcasting. The international standard for coding moving pictures and associated audio for digital storage media at up to about 1.5 Mbit/s (MPEG-1) was completed in November 1992. Its audio part, ISO/IEC 11172-3, has become the internationally recognized high-fidelity stereo audio compression standard and is generally known as "MPEG-1 audio".

Layers I and II of MPEG-1 audio accept input sampled at 48 kHz, 44.1 kHz, or 32 kHz and split it into 32 subbands with a filter bank. From the properties of the signal, the encoder computes the masking threshold of the human ear for each frequency component and chooses the quantization parameters of each subband accordingly, which yields a high compression ratio. Layer III builds on this processing by adding auxiliary subbands, non-uniform quantization, and entropy coding to raise the compression ratio further. MPEG audio compression produces data rates of 32 to 448 kbit/s per channel, which suits CD-DA disc applications.
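How a masking threshold turns into quantization parameters can be illustrated with a simplified bit allocation. The sketch assumes that the signal level and masking threshold of each subband are already known in dB and uses the rule of thumb that each quantizer bit buys about 6.02 dB of signal-to-noise ratio. The standard's actual allocation is an iterative procedure constrained by the available bit pool, so the functions below are illustrative only.

```python
import math

DB_PER_BIT = 6.02      # approximate SNR gain per quantizer bit
NUM_SUBBANDS = 32      # Layers I and II split the input into 32 subbands

def subband_width_hz(sample_rate_hz: float) -> float:
    """Width of each of the 32 equal subbands covering 0 to half the sampling rate."""
    return sample_rate_hz / 2 / NUM_SUBBANDS

def allocate_bits(signal_db, mask_db, max_bits: int = 15):
    """Give each subband just enough bits to keep quantization noise under its mask."""
    bits = []
    for s, m in zip(signal_db, mask_db):
        smr = s - m                                   # signal-to-mask ratio in dB
        needed = math.ceil(smr / DB_PER_BIT) if smr > 0 else 0
        bits.append(min(needed, max_bits))
    return bits
```

A subband whose signal lies below its masking threshold gets no bits at all, which is where much of the saving comes from.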

MPEG-2 also defines audio standards in two parts: MPEG-2 Audio (ISO/IEC 13818-3) and MPEG-2 AAC (Advanced Audio Coding, ISO/IEC 13818-7). MPEG-2 Audio is a successor that is backward compatible with MPEG-1 and supports two to five channels. Its main design goals are high-quality 5+1 channel sound, low bit rates, and backward compatibility, so that an existing two-channel decoder can decode the corresponding stereo signal from a 5+1 multichannel stream. MPEG-2 AAC, by contrast, is the audio standard that is not backward compatible with MPEG-1 audio.

The MPEG-4 Audio standard (ISO/IEC 14496-3) integrates audio ranging from speech to high-quality multichannel sound and from natural to synthesized sound. Its coding methods include parametric coding, code-excited linear prediction (CELP) coding, time/frequency (T/F) coding, structured audio (SA) coding, text-to-speech (TTS) synthesis, and MIDI synthesis.

The MPEG-7 Audio standard (ISO/IEC 15938-3) provides audio description tools.
