H.264/AVC standard PDF

2022.01.17 01:51

The design includes picture types that allow exact synchronization of the decoding process of some decoders with an ongoing video stream produced by other decoders, without penalizing all decoders with the loss of efficiency resulting from sending an I picture. This can enable switching a decoder between representations of the video content that used different data rates.

Some systems require the NAL unit stream to be delivered as an ordered stream of bytes. In the byte stream format, each NAL unit is prefixed by a specific pattern of three bytes called a start code prefix. The boundaries of the NAL unit can then be identified by searching the coded data for the unique start code prefix pattern. The use of emulation prevention bytes guarantees that start code prefixes are unique identifiers of the start of a new NAL unit. Additional data can also be inserted in the byte stream format that allows expansion of the amount of data to be sent and can aid in achieving more rapid byte alignment recovery, if desired.
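The start code search and emulation prevention described above can be sketched in a few lines of Python. This is a minimal illustration, not a conforming parser: the byte values (0x000001 for the start code prefix, 0x03 for the emulation prevention byte) come from the H.264 specification rather than from the text above, and the function names `split_nal_units` and `remove_emulation_prevention` are hypothetical.

```python
# Three-byte start code prefix (value taken from the H.264 spec, an assumption
# relative to the text above, which only says "three bytes").
START_CODE = b"\x00\x00\x01"


def split_nal_units(stream: bytes) -> list[bytes]:
    """Return the NAL units found between start code prefixes."""
    units = []
    pos = stream.find(START_CODE)
    while pos != -1:
        begin = pos + len(START_CODE)
        nxt = stream.find(START_CODE, begin)
        end = len(stream) if nxt == -1 else nxt
        # Zero bytes directly before the next start code are padding,
        # because a NAL unit never ends with a zero byte.
        units.append(stream[begin:end].rstrip(b"\x00"))
        pos = nxt
    return units


def remove_emulation_prevention(nal: bytes) -> bytes:
    """Drop each 0x03 byte that follows two zero bytes inside a NAL unit."""
    out = bytearray()
    zeros = 0
    for byte in nal:
        if zeros >= 2 and byte == 0x03:
            zeros = 0  # skip the emulation prevention byte itself
            continue
        out.append(byte)
        zeros = zeros + 1 if byte == 0 else 0
    return bytes(out)
```

Because the encoder inserts an emulation prevention byte wherever the payload would otherwise imitate a start code, a plain substring search such as `bytes.find` is enough to locate unit boundaries in this sketch.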

In other systems, the coded data is carried in packets that are framed by the system transport protocol, so the boundaries of NAL units can be identified without start code prefixes. The full degree of customization of the video content to fit the needs of each particular application is outside the scope of the standard.

Parameter Sets

A parameter set contains information that is expected to change rarely and that applies to the decoding of a large number of VCL NAL units. The sequence and picture parameter-set mechanism decouples the transmission of infrequently changing information from the transmission of coded representations of the values of the samples in the video pictures. Each VCL NAL unit contains an identifier that refers to the relevant picture parameter set, which in turn identifies the relevant sequence parameter set. In this manner, a small amount of data (the identifier) can be used to refer to a larger amount of information (the parameter set) without repeating that information within each VCL NAL unit. Parameter sets can be sent well ahead of the VCL NAL units that they apply to, and can be repeated to provide robustness against data loss.

Access Units

The decoding of each access unit results in one decoded picture. Each access unit contains a set of VCL NAL units that together compose a primary coded picture. It may also be prefixed with an access unit delimiter to aid in locating the start of the access unit. Supplemental information containing data such as picture timing information may also precede the primary coded picture.

[Figure: Structure of an access unit.]

The primary coded picture consists of a set of VCL NAL units consisting of slices or slice data partitions that represent the samples of the video picture. It may be followed by additional VCL NAL units that contain redundant representations of areas of the same video picture. These are referred to as redundant coded pictures, and are available for use by a decoder in recovering from loss or corruption of the data in the primary coded picture. Decoders are not required to decode redundant coded pictures if they are present. If the coded picture is the last picture of a coded video sequence (a sequence of pictures that is independently decodable and uses only one sequence parameter set), an end of sequence NAL unit may be present.

Coded Video Sequences

A coded video sequence consists of a series of access units. Each coded video sequence can be decoded independently of any other coded video sequence, given the necessary parameter set information, which may be conveyed in-band or out-of-band. At the beginning of a coded video sequence is an instantaneous decoding refresh (IDR) access unit. A NAL unit stream may contain one or more coded video sequences.

[Figure: Progressive and interlaced frames and fields.]

YCbCr Color Space and 4:2:0 Sampling

The human visual system seems to perceive scene content in terms of brightness and color information separately, and with greater sensitivity to the details of brightness than color. Video transmission systems can be designed to take advantage of this. This is true of conventional analog TV systems as well as digital ones. The video color space used by H.264/AVC separates a color representation into three components called Y, Cb, and Cr. Component Y is called luma and represents brightness. The two chroma components Cb and Cr represent the extent to which the color deviates from gray toward blue and red, respectively. Because the human visual system is more sensitive to luma than to chroma, H.264/AVC uses a sampling structure in which each chroma component has one fourth of the number of samples of the luma component (half the number of samples in both the horizontal and vertical dimensions). This is called 4:2:0 sampling with 8 bits of precision per sample.
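The sampling structure described above can be made concrete with a small calculation. The following Python sketch derives the plane dimensions and raw frame size for 4:2:0 sampling; the function names and the 1920×1080 example resolution are assumptions chosen for illustration only.

```python
def plane_sizes_420(width: int, height: int) -> dict[str, tuple[int, int]]:
    """Return (width, height) of the Y, Cb and Cr planes for 4:2:0 sampling."""
    if width % 2 or height % 2:
        raise ValueError("this sketch assumes even luma dimensions")
    # Chroma has half the luma samples horizontally and half vertically.
    chroma = (width // 2, height // 2)
    return {"Y": (width, height), "Cb": chroma, "Cr": chroma}


def raw_frame_bytes_420(width: int, height: int, bits_per_sample: int = 8) -> int:
    """Size in bytes of one uncompressed 4:2:0 frame."""
    samples = sum(w * h for w, h in plane_sizes_420(width, height).values())
    return samples * bits_per_sample // 8


sizes = plane_sizes_420(1920, 1080)   # hypothetical example resolution
total = raw_frame_bytes_420(1920, 1080)
```

Each chroma plane holds one quarter of the luma sample count, so the whole frame needs 1.5 times the storage of the luma plane alone.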

VCL

The basic source-coding algorithm is a hybrid of inter-picture prediction to exploit temporal statistical dependencies and transform coding of the prediction residual to exploit spatial statistical dependencies. There is no single coding element in the VCL that provides the majority of the significant improvement in compression efficiency in relation to prior video coding standards. It is rather a plurality of smaller improvements that add up to the significant gain.

Pictures, Frames, and Fields

A coded video sequence in H.264/AVC consists of a sequence of coded pictures. A coded picture in [1] can represent either an entire frame or a single field, as was also the case for MPEG-2 video. Generally, a frame of video can be considered to contain two interleaved fields, a top and a bottom field. The top field contains the even-numbered rows of the frame. The bottom field contains the odd-numbered rows (starting with the second line of the frame). If the two fields of a frame were captured at different time instants, the frame is referred to as an interlaced frame, and otherwise it is referred to as a progressive frame. The coding representation in H.264/AVC is primarily agnostic with respect to this timing characteristic. Instead, its coding specifies a representation based primarily on geometric concepts rather than being based on timing.

Division of the Picture Into Macroblocks

A picture is partitioned into fixed-size macroblocks that each cover a rectangular picture area of 16×16 samples of the luma component and 8×8 samples of each of the two chroma components. This partitioning into macroblocks has been adopted into all previous ITU-T and ISO/IEC video coding standards since H.261. Macroblocks are the basic building blocks of the standard for which the decoding process is specified. The basic coding algorithm for a macroblock is described after the explanation of how macroblocks are grouped into slices.

Slices and Slice Groups

Slices are a sequence of macroblocks which are processed in the order of a raster scan when not using flexible macroblock ordering (FMO). A picture may be split into one or several slices. A picture is therefore a collection of one or more slices in H.264/AVC. Slices are self-contained in the sense that, given the active sequence and picture parameter sets, their syntax elements can be parsed from the bitstream and the values of the samples in the area of the picture that the slice represents can be correctly decoded without use of data from other slices, provided that the reference pictures used are identical at encoder and decoder. Some information from other slices may be needed to apply the deblocking filter across slice boundaries.

[Figure: Subdivision of a picture into slices when not using FMO.]

FMO modifies the way pictures are partitioned into slices and macroblocks by utilizing the concept of slice groups. Each slice group is a set of macroblocks defined by a macroblock to slice group map, which is specified by the content of the picture parameter set and some information from slice headers. The macroblock to slice group map consists of a slice group identification number for each macroblock in the picture. Each slice group can be partitioned into one or more slices, such that a slice is a sequence of macroblocks within the same slice group that is processed in the order of a raster scan within that slice group. For more details on the use of FMO, see [14].

Each slice can be coded using one of several coding types. In an I slice, all macroblocks are coded using intra prediction. In a P slice, some macroblocks can also be coded using inter prediction with at most one motion-compensated prediction signal per prediction block. In a B slice, some macroblocks can also be coded using inter prediction with two motion-compensated prediction signals per prediction block. The above three coding types are very similar to those in previous standards with the exception of the use of reference pictures as described below. The following two coding types for slices are new: SP and SI slices. For details on the novel concept of SP and SI slices, the reader is referred to [5], while the other slice types are further described below.

Encoding and Decoding Process for Macroblocks

All luma and chroma samples of a macroblock are either spatially or temporally predicted, and the resulting prediction residual is encoded using transform coding. For transform coding purposes, each color component of the prediction residual signal is subdivided into smaller 4×4 blocks. Each block is transformed using an integer transform, and the transform coefficients are quantized and encoded using entropy coding methods.

[Figure: Basic coding structure for H.264/AVC for a macroblock.]

Each macroblock of each slice is processed as shown in the figure. An efficient parallel processing of macroblocks is possible when there are various slices in the picture.

To provide high coding efficiency, the H.264/AVC design allows encoders to make any of the following decisions when coding a frame:
1) To combine the two fields and to code them as one single coded frame (frame mode).
2) To not combine the two fields and to code them as separate coded fields (field mode).
3) To combine the two fields and compress them as a single frame, but when coding the frame to split the pairs of two vertically adjacent macroblocks into pairs of either field or frame macroblocks before coding them.

The choice between the first two options is referred to as picture-adaptive frame/field (PAFF) coding. When a frame is coded as two fields, each field is partitioned into macroblocks and is coded in a manner very similar to a frame, with some exceptions. For example, the strong deblocking strength is not used for filtering horizontal edges of macroblocks in fields, because the field rows are spatially twice as far apart as frame rows and the length of the filter thus covers a larger spatial area.

If a frame consists of mixed regions where some regions are moving and others are not, it is typically more efficient to code the nonmoving regions in frame mode and the moving regions in field mode. The frame/field decision can therefore also be made independently for each vertical pair of macroblocks in a frame, which is referred to as macroblock-adaptive frame/field (MBAFF) coding. For a macroblock pair that is coded in frame mode, each macroblock contains frame lines. For a macroblock pair that is coded in field mode, the top macroblock contains top field lines and the bottom macroblock contains bottom field lines.

[Figure: Conversion of a frame macroblock pair into a field macroblock pair.]

The frame/field decision is made at the macroblock pair level rather than within the macroblock. The reasons for this choice are to keep the basic macroblock processing structure intact, and to permit motion compensation areas as large as the size of a macroblock. The main idea is to preserve as much spatial consistency as possible. Sometimes PAFF coding can be more efficient than MBAFF coding (particularly in the case of rapid global motion, scene change, or intra picture refresh), although the reverse is usually true.
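The relationship between frame lines and field lines in a macroblock pair can be illustrated with a short Python sketch. It assumes that a macroblock pair consists of two vertically adjacent 16×16 luma macroblocks, that is, 32 rows, and it represents each row as an opaque list element; the function names are hypothetical and not part of the standard.

```python
def split_fields(rows):
    """Split frame rows into (top_field, bottom_field)."""
    top = rows[0::2]      # even-numbered rows: 0, 2, 4, ...
    bottom = rows[1::2]   # odd-numbered rows, starting with the second line
    return top, bottom


def frame_pair_to_field_pair(pair_rows, mb_height=16):
    """Convert the rows of a frame macroblock pair into a field macroblock pair.

    Returns (top_mb, bottom_mb): the top macroblock holds the top field lines
    and the bottom macroblock holds the bottom field lines.
    """
    if len(pair_rows) != 2 * mb_height:
        raise ValueError("expected the rows of two vertically adjacent macroblocks")
    return split_fields(pair_rows)


def field_pair_to_frame_pair(top_mb, bottom_mb):
    """Interleave field lines again to recover the frame-ordered rows."""
    rows = []
    for top_row, bottom_row in zip(top_mb, bottom_mb):
        rows.extend([top_row, bottom_row])
    return rows
```

Both macroblocks of the field pair still contain 16 rows each, which is consistent with the stated goal of keeping the basic macroblock processing structure intact.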

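Intra-Frame Prediction

Each macroblock can be transmitted in one of several coding types depending on the slice-coding type. In all slice-coding types, the following types of intra coding are supported: Intra_4×4 and Intra_16×16, together with chroma prediction and the I_PCM prediction mode. The Intra_4×4 mode is based on predicting each 4×4 luma block separately and is well suited for coding parts of a picture with significant detail. The Intra_16×16 mode, on the other hand, predicts the whole 16×16 luma block and is more suited for coding very smooth areas of a picture. In addition to these two types of luma prediction, a separate chroma prediction is conducted. As an alternative, the I_PCM coding type allows the encoder to simply bypass the prediction and transform coding processes and instead directly send the values of the encoded samples. The I_PCM mode serves the following purposes:
1) It allows the encoder to precisely represent the values of the samples.
2) It provides a way to accurately represent the values of anomalous picture content without significant data expansion.
3) It enables placing a hard limit on the number of bits a decoder must handle for a macroblock without harm to coding efficiency.

In contrast to some previous video coding standards, where intra prediction was conducted in the transform domain, intra prediction in H.264/AVC is always conducted in the spatial domain, by referring to neighboring samples of previously coded blocks.

When using the Intra_4×4 mode, each 4×4 block is predicted from spatially neighboring samples. The 16 samples of the 4×4 block are predicted using previously decoded samples in adjacent blocks. For each 4×4 block, one of nine prediction modes can be utilized. In addition to DC prediction (where one value is used to predict the entire 4×4 block), eight directional prediction modes are specified. Those modes are suitable to predict directional structures in a picture such as edges at various angles.

For mode 0 (vertical prediction), the samples above the 4×4 block are copied into the block. Mode 1 (horizontal prediction) operates in a manner similar to vertical prediction except that the samples to the left of the 4×4 block are copied. For mode 2 (DC prediction), the adjacent samples are averaged. The remaining six modes are diagonal prediction modes which are called diagonal-down-left, diagonal-down-right, vertical-right, horizontal-down, vertical-left, and horizontal-up prediction. As their names indicate, they are suited to predict textures with structures in the specified direction. When the samples E-H that are used for the diagonal-down-left prediction mode are not available, they are replaced by sample D.

When utilizing the Intra_16×16 mode, the whole luma component of a macroblock is predicted. Four prediction modes are supported. Prediction mode 0 (vertical prediction), mode 1 (horizontal prediction), and mode 2 (DC prediction) are specified similarly to the modes in Intra_4×4 prediction, except that 16 neighboring samples on each side are used to predict the 16×16 block. For the specification of the fourth mode (plane prediction), refer to [1]. The chroma samples of a macroblock are predicted using a technique similar to that used for the luma component in Intra_16×16 macroblocks.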
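The vertical, horizontal, and DC modes described above are simple enough to sketch directly. The following Python example covers only these three Intra_4×4 modes and assumes that both the row above and the column to the left are available. The DC rounding `(sum + 4) >> 3` follows the H.264 specification for that case and is not stated in the text above; the function name `predict_4x4` is hypothetical.

```python
def predict_4x4(mode, above, left):
    """Return a 4x4 prediction block as a list of four rows.

    above: the 4 reconstructed samples directly above the block
    left:  the 4 reconstructed samples directly to the left of the block
    """
    if len(above) != 4 or len(left) != 4:
        raise ValueError("expected 4 neighbouring samples on each side")
    if mode == 0:  # vertical: copy the samples above into every row
        return [list(above) for _ in range(4)]
    if mode == 1:  # horizontal: copy each left sample across its row
        return [[left[row]] * 4 for row in range(4)]
    if mode == 2:  # DC: one rounded average predicts the entire block
        dc = (sum(above) + sum(left) + 4) >> 3
        return [[dc] * 4 for _ in range(4)]
    raise ValueError("only modes 0, 1 and 2 are sketched here")
```

An encoder would typically evaluate each candidate mode, compare the resulting residual cost, and signal the chosen mode; that selection logic is outside the scope of this sketch.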

Inter-Frame Prediction

1) Inter-Frame Prediction in P Slices: In addition to the intra macroblock coding types, various predictive or motion-compensated coding types are specified as P macroblock types. Each P macroblock type corresponds to a specific partition of the macroblock into the block shapes used for motion-compensated prediction. Partitions with luma block sizes of 16×16, 16×8, 8×16, and 8×8 samples are supported by the syntax. In case partitions with 8×8 samples are chosen, one additional syntax element for each 8×8 partition is transmitted. This syntax element specifies whether the corresponding 8×8 partition is further partitioned into partitions of 8×4, 4×8, or 4×4 luma samples and corresponding chroma samples.

[Figure: Segmentations of the macroblock for motion compensation. Top: segmentation of macroblocks, bottom: segmentation of 8×8 partitions.]

The prediction signal for each predictive-coded M×N luma block is obtained by displacing an area of the corresponding reference picture, which is specified by a translational motion vector and a picture reference index. Thus, if the macroblock is coded using four 8×8 partitions and each 8×8 partition is further split into four 4×4 partitions, a maximum of 16 motion vectors may be transmitted for a single P macroblock.

The accuracy of motion compensation is in units of one quarter of the distance between luma samples. In case the motion vector points to an integer-sample position, the prediction signal consists of the corresponding samples of the reference picture; otherwise the corresponding sample is obtained using interpolation to generate noninteger positions. The prediction values at half-sample positions are obtained by applying a one-dimensional 6-tap FIR filter horizontally and vertically.

[Figure: Filtering for fractional-sample accurate motion compensation.]

The samples at half-sample positions are derived by first calculating intermediate values with the 6-tap filter. The final prediction values are obtained from these intermediate values by rounding and are clipped to the range of 0 to 255. The half-sample position that lies between integer samples in both dimensions is obtained by applying the filter to intermediate values, which are themselves obtained in the same manner. It can be computed by filtering either horizontally or vertically first, and these two alternative methods illustrate that the filtering operation is truly separable for the generation of the half-sample positions.

The samples at quarter-sample positions such as those labeled f, i, k, and q are derived by averaging with upward rounding of the two nearest samples at integer and half-sample positions. The samples at quarter-sample positions labeled e, g, p, and r are derived by averaging with upward rounding of the two nearest samples at half-sample positions in the diagonal direction.
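The interpolation steps above can be sketched in one dimension as follows. The tap values (1, -5, 20, 20, -5, 1) and the rounding offset are taken from the H.264 specification and are assumptions relative to the text above, which only names a 6-tap FIR filter; the function names are hypothetical, and the sketch assumes 8-bit samples.

```python
# 6-tap filter coefficients (from the H.264 spec; they sum to 32).
TAPS = (1, -5, 20, 20, -5, 1)


def clip255(value: int) -> int:
    """Clip to the 8-bit sample range 0 to 255."""
    return max(0, min(255, value))


def half_sample(samples) -> int:
    """Interpolate the half-sample position at the centre of six samples.

    samples: six integer-position samples along one row or column, three on
    each side of the half-sample position.
    """
    if len(samples) != 6:
        raise ValueError("the 6-tap filter needs exactly six samples")
    intermediate = sum(tap * s for tap, s in zip(TAPS, samples))
    # Add half of 32 for rounding, divide by 32, then clip.
    return clip255((intermediate + 16) >> 5)


def quarter_sample(first: int, second: int) -> int:
    """Average two neighbouring samples with upward rounding."""
    return (first + second + 1) >> 1


row = [100, 100, 100, 104, 104, 104]      # hypothetical luma samples
b = half_sample(row)                      # half-sample value between row[2] and row[3]
a = quarter_sample(row[2], b)             # quarter-sample value next to row[2]
```

The negative taps can push the intermediate value outside the sample range near sharp edges, which is why the result is clipped after rounding.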

Since the sampling grid of chroma has lower resolution than the sampling grid of the luma, the displacements used for chroma have one-eighth sample position accuracy.

The more accurate motion prediction using full-sample, half-sample, and one-quarter-sample prediction represents one of the major improvements of the present method compared to earlier standards, for the following two reasons.
1) The most obvious reason is more accurate motion representation.
2) The other reason is more flexibility in prediction filtering.

[Figure: Multiframe motion compensation.]

In the technical scheme of the present invention, all intra-prediction PU modes of the HEVC standard are still traversed. The invention defines test indices for evaluating this scheme. The proposed algorithm greatly reduces encoding time while keeping bit rate and PSNR essentially unchanged.

The invention is based on an analysis of H.264/AVC. The input is first fed into a standard H.264/AVC decoder. Because the CU and PU information is then already determined, the HEVC encoding process does not need to perform the CU quad-tree partitioning of its original coding scheme, nor does it need to traverse all possible PU modes at each CU depth. It only needs to evaluate the PU mode and intra prediction mode at the corresponding CU depth, which saves the computational cost of CU quad-tree partitioning and PU mode traversal during transcoding.

Experimental results show that the transcoding method of the invention greatly reduces encoding complexity with only a small loss in bit rate and video quality. The concrete results are given in Tables 1, 2, and 3 of the embodiment. The invention is described in further detail below with reference to an embodiment. The embodiment only serves to illustrate the invention further and must not be interpreted as limiting its scope. Nonessential improvements and adjustments made by those skilled in the art on the basis of the foregoing description still fall within the scope of protection of the invention.

For inter-frame transcoding to the HEVC standard, the method of the invention is compared with direct cascaded transcoding from H.264/AVC. Unlike direct cascaded transcoding from H.264/AVC to HEVC, the method of the invention does not need to traverse all PU modes one by one at each CU depth.

Therefore, the CU partition depth in the invention is set to 4. Overall, the inventive method is evaluated against direct cascaded transcoding from H.264/AVC to HEVC, and Tables 1, 2, and 3 give the comparison.
