Application of H.264's technical advantages in the H.323 system

This article focuses on the H.323 system, which is well suited to providing multimedia services over IP networks. H.264 is a new video coding standard developed by the JVT to achieve higher video compression, better image quality and good network adaptability. Practice has shown that H.264 saves bitrate, and its inherent resilience to packet loss and bit errors, together with its good network adaptability, make it very suitable for IP transmission; H.264 is therefore expected to become the preferred video standard in H.323 systems.

1. Requirements of the H.323 system for the video codec standard

The H.323 system places the following three main requirements on the video codec standard:

(1) Some IP access methods such as xDSL provide only limited bandwidth; after audio and data take their share, the bandwidth left for video is even smaller. The video codec therefore needs a high compression ratio, so that it can deliver better image quality at a low bit rate.

(2) Good resilience to packet loss and bit errors, so that the codec can adapt to a variety of network environments, including wireless networks with severe packet loss and bit errors.

(3) Good network adaptability, so that video streams can be transported conveniently over the network.

2. Three technical advantages of H.264 for the H.323 system

When H.264 was drafted, the various requirements that multimedia communication places on video coding and decoding were fully considered, and the research results of earlier video standards were drawn upon, so it has clear advantages. The three advantages of H.264 are explained below against the H.323 system's requirements for video codec technology.

1. Compression ratio and image quality

Improvements to the traditional intra-frame prediction, inter-frame prediction, transform coding and entropy coding algorithms have further raised the coding efficiency and image quality of H.264 beyond those of earlier standards.

(1) Variable block size: the block size can be chosen flexibly during inter prediction. For macroblock (MB) partitioning, H.264 uses 16 × 16, 16 × 8, 8 × 16 and 8 × 8; when the 8 × 8 mode is used, each 8 × 8 partition can be further divided into 8 × 4, 4 × 8 or 4 × 4 sub-macroblock partitions. This makes the segmentation of moving objects more accurate, reduces the prediction error and improves coding efficiency. Intra prediction generally uses two luminance prediction modes, Intra_4×4 and Intra_16×16: Intra_4×4 suits areas of the image rich in detail, while Intra_16×16 is better suited to smooth image areas.
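
As a rough illustration of how the partition choice determines the number of motion vectors, the following Python sketch enumerates the shapes listed above; the names and the helper function are hypothetical, not part of any encoder API.

```python
# Illustrative sketch of H.264 inter-prediction partition shapes (hypothetical names):
# a 16x16 macroblock may be split into 16x16, 16x8, 8x16 or 8x8 partitions; each 8x8
# partition may be further split into 8x4, 4x8 or 4x4 sub-blocks.

MB_PARTITIONS = {            # macroblock-level splits (width, height)
    "16x16": [(16, 16)],
    "16x8":  [(16, 8)] * 2,
    "8x16":  [(8, 16)] * 2,
    "8x8":   [(8, 8)] * 4,
}

SUB_PARTITIONS = {           # further splits of one 8x8 partition
    "8x8": [(8, 8)],
    "8x4": [(8, 4)] * 2,
    "4x8": [(4, 8)] * 2,
    "4x4": [(4, 4)] * 4,
}

def motion_vectors_per_mb(mb_mode, sub_mode="8x8"):
    """Each partition (or sub-partition) carries its own motion vector."""
    if mb_mode != "8x8":
        return len(MB_PARTITIONS[mb_mode])
    # in 8x8 mode, each of the four 8x8 partitions is split according to sub_mode
    return 4 * len(SUB_PARTITIONS[sub_mode])

if __name__ == "__main__":
    for mode in MB_PARTITIONS:
        print(mode, "->", motion_vectors_per_mb(mode), "motion vector(s)")
    print("8x8 with 4x4 sub-blocks ->", motion_vectors_per_mb("8x8", "4x4"))  # 16
```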

(2) High-precision motion estimation: the accuracy of motion-compensated prediction of the luminance signal in H.264 is 1/4 pixel. If the motion vector points to an integer-pixel position of the reference picture, the predicted value is simply the reference pixel at that position; otherwise, a 6-tap FIR interpolation filter is used to obtain the predicted value at the 1/2-pixel position, and the value at the 1/4-pixel position is obtained by averaging the integer-pixel and 1/2-pixel values. Clearly, high-precision motion estimation further reduces the inter-frame prediction error.
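
A minimal one-dimensional sketch of this interpolation, assuming 8-bit samples and the standard 6-tap filter (1, -5, 20, 20, -5, 1), might look as follows; the sample row is made up for illustration.

```python
# Minimal 1-D sketch of H.264 luma sub-pel interpolation (8-bit samples assumed).
# Half-pel positions use the 6-tap FIR filter (1, -5, 20, 20, -5, 1); quarter-pel
# positions average an integer-pel and a half-pel neighbour with rounding.

def clip255(x):
    return max(0, min(255, x))

def half_pel(samples, i):
    """Half-pel value between integer positions i and i+1 (needs samples i-2 .. i+3)."""
    e, f, g, h, j, k = (samples[i - 2], samples[i - 1], samples[i],
                        samples[i + 1], samples[i + 2], samples[i + 3])
    return clip255((e - 5 * f + 20 * g + 20 * h - 5 * j + k + 16) >> 5)

def quarter_pel(samples, i):
    """Quarter-pel value between integer position i and the half-pel to its right."""
    return (samples[i] + half_pel(samples, i) + 1) >> 1

row = [10, 20, 40, 80, 120, 160, 200, 220]
print("half-pel between index 3 and 4:", half_pel(row, 3))
print("quarter-pel right of index 3:  ", quarter_pel(row, 3))
```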

(3) Multi-reference-frame motion estimation: each M × N luma block is motion-compensated to obtain a motion vector and a reference picture index, and each sub-macroblock partition within a sub-macroblock can have its own motion vector. The selection of the reference picture, however, is performed at the sub-macroblock level, so all partitions within one sub-macroblock use the same reference picture during prediction, while different sub-macroblocks of the same slice may use different reference pictures; this is multi-reference-frame motion estimation.
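
In concept, the encoder simply runs its motion search in each available reference picture and keeps the cheapest match. A hypothetical sketch follows; the frame sizes, search range and SAD cost are illustrative assumptions, not how a real encoder is organised.

```python
# Hypothetical sketch of multi-reference-frame selection: for one luma block, try the
# same full-search window in each reference frame and keep the (reference index,
# motion vector) pair with the smallest SAD.
import numpy as np

def sad(a, b):
    return int(np.abs(a.astype(int) - b.astype(int)).sum())

def best_match(block, refs, x, y, search=4):
    """block: HxW array taken from position (x, y); refs: list of reference frames."""
    h, w = block.shape
    best = None
    for ref_idx, ref in enumerate(refs):
        for dy in range(-search, search + 1):
            for dx in range(-search, search + 1):
                yy, xx = y + dy, x + dx
                if 0 <= yy <= ref.shape[0] - h and 0 <= xx <= ref.shape[1] - w:
                    cost = sad(block, ref[yy:yy + h, xx:xx + w])
                    if best is None or cost < best[0]:
                        best = (cost, ref_idx, (dx, dy))
    return best  # (SAD, reference index, motion vector)

rng = np.random.default_rng(0)
refs = [rng.integers(0, 256, (32, 32), dtype=np.uint8) for _ in range(3)]
cur_block = refs[1][10:18, 12:20].copy()           # block that matches reference 1
print(best_match(cur_block, refs, x=12, y=10))     # expect ref_idx == 1, mv == (0, 0)
```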

(4) More flexible choice of reference pictures: the reference picture may even be a picture coded with bidirectional prediction, which allows a picture that matches the current picture more closely to be chosen as the reference for prediction, thereby reducing the prediction error.

(5) Weighted prediction: the encoder is allowed to weight the motion-compensated prediction value by a coefficient, which can improve image quality in certain scenes, such as fades.
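
A simplified sketch of the idea for a single sample is shown below; the weight-and-offset formula is the usual explicit form, and the syntax-element names in the comments are given only for orientation.

```python
# Simplified sketch of explicit weighted prediction for one sample (single reference,
# 8-bit depth). A weight and offset are applied to the motion-compensated value; the
# related syntax elements (luma_log2_weight_denom, luma_weight_l0, luma_offset_l0)
# are only mentioned here as context.

def weighted_pred(mc_sample, weight, offset, log_wd):
    if log_wd >= 1:
        val = ((mc_sample * weight + (1 << (log_wd - 1))) >> log_wd) + offset
    else:
        val = mc_sample * weight + offset
    return max(0, min(255, val))

# Fade-out example: the reference is roughly twice as bright as the current picture,
# so a weight of 16 with log_wd = 5 (i.e. a factor of 0.5) models the dimming better
# than a plain copy of the reference sample.
print(weighted_pred(mc_sample=200, weight=16, offset=0, log_wd=5))   # -> 100
```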

(6) Deblocking filter in the motion-compensation loop: to remove the blocking artifacts introduced by prediction and transform coding, H.264 also uses a deblocking filter. The difference is that in H.264 the deblocking filter sits inside the motion-estimation loop, so the deblocked picture can be used for motion prediction of other pictures, which further improves prediction accuracy.
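
The point is easiest to see in a toy decoding loop: the filtered reconstruction, not the raw one, is what later pictures predict from. The smoother below is a stand-in for illustration, not the real H.264 filter.

```python
# Toy sketch (not real H.264 filtering) of why the deblocking filter is "in the loop":
# the *filtered* reconstruction is stored as a reference, so later pictures are
# predicted from deblocked pictures rather than blocky ones.
import numpy as np

def deblock(pic):
    # stand-in smoother; the real filter adapts its strength per block edge
    out = pic.astype(float).copy()
    out[:, 1:-1] = (pic[:, :-2] + 2 * pic[:, 1:-1] + pic[:, 2:]) / 4.0
    return out

def decode_picture(residual, dpb):
    prediction    = dpb[-1]                    # predict from the last *filtered* reference
    reconstructed = prediction + residual
    reference     = deblock(reconstructed)     # filter sits inside the prediction loop
    dpb.append(reference)
    return reference

dpb = [np.zeros((4, 8))]                       # initial reference picture
for _ in range(3):
    decode_picture(residual=np.ones((4, 8)), dpb=dpb)
print(len(dpb), "reference pictures in the buffer")
```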

2. Resilience to packet loss and bit errors

The use of key techniques such as parameter sets, slices, FMO and redundant slices can greatly improve the system's resilience to packet loss and bit errors.

(1) Parameter sets: parameter sets and their flexible transmission greatly reduce the chance of errors caused by the loss of key header information. To ensure that a parameter set reaches the decoder reliably, the same parameter set can be sent several times, or multiple parameter sets can be transmitted.

(2) Use of slices: a picture can be divided into one or several slices. When the picture is split into multiple slices and one slice cannot be decoded correctly, the spatial extent of the visible damage is greatly reduced, and the slice also provides a resynchronization point.

(3) PAFF and MBAFF: when coding interlaced pictures, the large time interval between the two fields reduces the spatial correlation between adjacent lines of a frame (compared with progressive scanning) for moving content, so coding the two fields separately saves bits. A frame can therefore be coded in three ways: combine the two fields and code them as one frame, code the two fields separately, or combine the two fields into one frame but merge vertically adjacent macroblocks into macroblock pairs for coding. The first two are called PAFF coding: field coding is effective for moving areas, while non-moving areas have higher correlation between adjacent lines, so frame coding is more effective there. When a picture contains both moving and non-moving areas, it is more effective to choose, at the MB level, field coding for the moving areas and frame coding for the non-moving areas; this method is called MBAFF.

(4) FMO: FMO can further improve the slice's error-recovery capability. Through the use of slice groups, FMO changes the way a picture is partitioned into slices and macroblocks; the macroblock-to-slice-group map defines which slice group each macroblock belongs to. Using FMO, H.264 defines seven macroblock scan patterns.
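
To give a feel for such a map, here is a simplified sketch in the spirit of the interleaved map type; it is not the exact mapping equation from the standard, only an illustration of how macroblocks end up scattered across slice groups so that a lost packet removes interleaved stripes rather than one contiguous region.

```python
# Simplified macroblock-to-slice-group map in the spirit of FMO's interleaved type:
# runs of consecutive macroblocks are assigned to slice groups in a repeating cycle.

def interleaved_map(num_mbs, run_lengths):
    """run_lengths[g] = how many consecutive MBs go to slice group g in each cycle."""
    mapping = []
    while len(mapping) < num_mbs:
        for group, run in enumerate(run_lengths):
            mapping.extend([group] * run)
    return mapping[:num_mbs]

# 2 slice groups, runs of 2 MBs each, on a tiny 6x4-MB picture (24 macroblocks)
m = interleaved_map(24, [2, 2])
for row in range(4):
    print(m[row * 6:(row + 1) * 6])
```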

(5) Intra prediction: H.264 draws on the experience of earlier video coding standards for intra prediction. It is worth noting that in H.264 an IDR picture invalidates the reference picture buffer, and pictures after it are decoded without referring to any picture before the IDR picture, so the IDR picture provides a good resynchronization point. On channels with severe packet loss and bit errors, IDR pictures can be transmitted from time to time to further improve H.264's resilience to bit errors and packet loss.

(6) Redundant pictures: to improve the robustness of the H.264 decoder against data loss, redundant pictures can be transmitted; when the primary picture is lost, the picture can be reconstructed from the redundant picture.

(7) Data partitioning: since information such as motion vectors and macroblock types is more important than other information, H.264 introduces the concept of data partitioning: syntax elements of a slice whose semantics are related to each other are placed in the same partition. H.264 defines three different partition types, which are transmitted separately. If the information of the second or third partition is lost, the error-recovery tools can still use the information in the first partition to recover the lost data reasonably well.
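
The three partitions are carried in NAL unit types 2, 3 and 4 of the standard; the grouping of syntax elements shown below is a simplified summary for illustration, and the loss-handling helper is hypothetical.

```python
# Sketch of how data partitioning splits one slice into three NAL units. The NAL unit
# type values (2, 3, 4) come from H.264; the content lists are a simplified summary.

PARTITIONS = {
    2: ("Partition A", ["slice header", "macroblock types", "motion vectors", "QP"]),
    3: ("Partition B", ["intra coded block patterns", "intra residual coefficients"]),
    4: ("Partition C", ["inter coded block patterns", "inter residual coefficients"]),
}

def describe_loss(lost_nal_types):
    if 2 in lost_nal_types:
        return "Partition A lost: B and C cannot be used, conceal the whole slice."
    if lost_nal_types & {3, 4}:
        return "Only residual lost: reconstruct from partition A (modes + motion vectors)."
    return "Nothing lost."

for nal_type, (name, contents) in PARTITIONS.items():
    print(f"NAL type {nal_type}: {name} -> {', '.join(contents)}")
print(describe_loss({4}))
```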

(8) Multi-reference-frame motion estimation: multi-reference-frame motion estimation not only improves the coding efficiency of the encoder but also improves error recovery. In an H.323 system, by using RTCP, the encoder can learn that a reference picture has been lost and then select a picture that the decoder has received correctly as the reference.

(9) To prevent the spatial spread of errors, it can be specified that macroblocks in P slices or B slices must not use neighbouring non-intra-coded macroblocks as references when performing intra prediction.

3. Network adaptability

To adapt to various network environments and applications, H.264 defines a video coding layer (VCL) and a network abstraction layer (NAL). The VCL performs the video coding and decoding proper, including motion-compensated prediction, transform coding and entropy coding; the NAL packages the VCL video data in a format appropriate for the network.

(1) NAL units: video data is encapsulated in NAL units (NALUs) that are an integer number of bytes long, and the first byte of each unit indicates the type of data it carries. H.264 defines two framing formats. Packet-switched networks (such as H.323 systems) can carry NALUs in the RTP encapsulation format; other systems may require the NALUs to be transmitted as a continuous bit stream, and for this purpose H.264 defines a byte-stream format in which a start_code_prefix is placed before each NALU to mark the NAL boundaries.
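
Both points can be shown in a few lines: scanning for start codes in a byte stream and reading the first byte of each NAL unit. The sample bytes below are fabricated fragments, not real coded data.

```python
# Minimal sketch of the byte-stream (Annex B) framing and the one-byte NAL header:
# each NAL unit is preceded by a 00 00 01 (or 00 00 00 01) start code prefix, and its
# first byte carries forbidden_zero_bit, nal_ref_idc and nal_unit_type.

NAL_TYPE_NAMES = {1: "non-IDR slice", 5: "IDR slice", 7: "SPS", 8: "PPS"}

def split_annexb(buf):
    """Yield NAL unit payloads delimited by start code prefixes."""
    i, starts = 0, []
    while True:
        i = buf.find(b"\x00\x00\x01", i)
        if i < 0:
            break
        starts.append(i + 3)
        i += 3
    for s, e in zip(starts, starts[1:] + [len(buf) + 3]):
        yield buf[s:e - 3]

def parse_nal_header(nalu):
    b = nalu[0]
    return {"forbidden_zero_bit": b >> 7,
            "nal_ref_idc": (b >> 5) & 0x3,
            "nal_unit_type": b & 0x1F}

stream = (b"\x00\x00\x00\x01\x67\x42\x00\x1f"   # fabricated SPS fragment
          b"\x00\x00\x01\x68\xce"               # fabricated PPS fragment
          b"\x00\x00\x01\x65\x88")              # fabricated IDR slice fragment
for nalu in split_annexb(stream):
    h = parse_nal_header(nalu)
    print(h, "->", NAL_TYPE_NAMES.get(h["nal_unit_type"], "other"))
```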

(2) Parameter sets: in earlier video coding standards, header information such as that for GOBs, GOPs and pictures is very important, and the loss of packets carrying this information often makes the associated pictures undecodable. H.264 therefore places information that changes rarely and applies to a large number of VCL NALUs into parameter sets. There are two kinds of parameter set, sequence parameter sets and picture parameter sets. To adapt to a variety of network environments, parameter sets can be transmitted in-band or out-of-band.
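
On the decoder side the mechanism amounts to a small amount of bookkeeping: parameter sets arrive in-band or out-of-band, are cached by their ids, and each slice refers to them indirectly. A sketch, with plain dictionaries standing in for parsed structures and the field values chosen only for illustration:

```python
# Sketch of decoder-side parameter-set handling: SPS and PPS are cached by id, and a
# slice carries only a small pic_parameter_set_id that resolves to the full headers.

sps_cache, pps_cache = {}, {}

def on_sps(sps):  sps_cache[sps["seq_parameter_set_id"]] = sps
def on_pps(pps):  pps_cache[pps["pic_parameter_set_id"]] = pps

def on_slice(slice_header):
    pps = pps_cache[slice_header["pic_parameter_set_id"]]
    sps = sps_cache[pps["seq_parameter_set_id"]]
    return sps, pps          # everything needed to set up decoding of this slice

on_sps({"seq_parameter_set_id": 0, "pic_width_in_mbs": 22, "pic_height_in_mbs": 18})
on_pps({"pic_parameter_set_id": 0, "seq_parameter_set_id": 0, "entropy_coding_mode": "CAVLC"})
print(on_slice({"pic_parameter_set_id": 0, "slice_type": "P"}))
```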

3. Implementing H.264 in the H.323 system

Since H.264 is a new video coding standard, applying it in the H.323 system raises some issues, such as how an entity's H.264 capability is expressed during H.245 capability negotiation, so the H.323 standard must be supplemented and modified as necessary. To this end, ITU-T has developed the H.241 standard. This article only introduces the modifications related to H.323.

First, it must be specified how H.264 capabilities are expressed during H.245 capability negotiation. The H.264 capability set is a list of one or more H.264 capabilities. Each H.264 capability includes two mandatory parameters, Profile and Level, and several optional parameters such as CustomMaxMBPS and CustomMaxFS. In H.264, the Profile defines the coding tools and algorithms used to generate the bitstream, and the Level places requirements on certain key parameters. The H.264 capability is carried in a GenericCapability structure, in which the CapabilityIdentifier is of type standard with the value 0.0.8.241.0.0.1, identifying the H.264 capability, and MaxBitRate gives the maximum bit rate. The Collapsing field carries the H.264 capability parameters. Its first entry is Profile: the ParameterIdentifier is of type standard with the value 41, identifying the Profile, and the ParameterValue is of type booleanArray whose value, 64, 32 or 16, indicates the Baseline, Main or Extended profile respectively. The second entry is Level: the ParameterIdentifier is of type standard with the value 42, identifying the Level, and the ParameterValue is of type unsignedMin whose value indicates one of the 15 levels defined in H.264 Annex A. The remaining parameters are optional.
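
As a small illustration of the Profile entry just described, the bit values 64, 32 and 16 are taken from the description above; the helper function itself is hypothetical.

```python
# Decoding the Profile booleanArray of the H.264 GenericCapability entry: bits 64, 32
# and 16 indicate the Baseline, Main and Extended profiles respectively.

PROFILE_BITS = {64: "Baseline", 32: "Main", 16: "Extended"}

def decode_profile(boolean_array_value):
    return [name for bit, name in PROFILE_BITS.items() if boolean_array_value & bit]

print(decode_profile(64))        # ['Baseline']
print(decode_profile(64 | 32))   # a terminal advertising both Baseline and Main
```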

Secondly, because the way pictures are organized in H.264 differs from earlier standards, some of the original H.245 signalling, such as videoFastUpdateGOB in MiscellaneousCommand, does not apply to H.264; H.241 therefore redefines several messages to provide the corresponding functions.

Finally, the RTP encapsulation of H.264 refers to RFC 3550, and the payload type (PT) value is not fixed (a dynamic payload type is used).

4. Conclusion

As a new international standard, H.264 has succeeded in coding efficiency, image quality, network adaptability and error resilience. However, as terminals and networks develop rapidly, the demands on video coding keep rising, so H.264 continues to improve and evolve to meet new requirements. Current research on H.264 focuses mainly on further reducing codec delay, optimizing algorithms and further improving image quality. More and more video conferencing systems now use H.264 for coding and decoding, and most have achieved interoperability on the Baseline profile. With the continuous improvement of H.264 and the growing popularity of video communication, H.264 can be expected to find ever wider application.
