CN118413674B - Adaptive coding method, apparatus, device and computer program product - Google Patents
- Publication number
- CN118413674B (application number CN202410859104.1A)
- Authority
- CN
- China
- Prior art keywords
- scene
- target
- data frame
- degree
- encoded
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/102—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
- H04N19/124—Quantisation
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/17—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
- H04N19/172—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a picture, frame or field
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N7/00—Television systems
- H04N7/14—Systems for two-way working
- H04N7/15—Conference systems
Abstract
The application provides an adaptive coding method, apparatus, device, and computer program product. The method comprises the following steps: acquiring data frames from the data to be encoded at a preset time interval; determining a target coding scene, when a scene switching condition is met, according to the current coding scene and the degree of change between adjacent acquired data frames; and encoding the data to be encoded according to the QP value corresponding to the target coding scene. The method improves the flexibility of the QP value configuration process.
Description
Technical Field
The present application relates to the field of data encoding technology, and in particular, to an adaptive encoding method, apparatus, device, and computer program product.
Background
With the rapid development of online work, users are placing higher demands on the online office experience, for example on the quality of video conferencing.
Currently, during a video conference, the data to be encoded on the local device must be encoded before being transmitted to the peer device. A video conference may take place under poor network conditions: the network bandwidth is limited, and the bit rate it can support (the number of data bits transmitted per unit time, referred to in the video field as the code rate) is low, for example 1024 Kbps or less. This situation may be called a low code rate scenario. In addition, to ensure that the coding format is compatible with all possible conference access devices (peer devices), a basic coding format such as H.264 BP (Baseline Profile) is generally used; the data size produced by this format is larger than that of other coding formats, which makes it easier to exceed the supported code rate during transmission.
However, in this low code rate scenario, conventional schemes fix the QP (Quantization Parameter) value used in encoding either to a large value, to prevent the encoded data from exceeding the supported code rate, or to a small value, to preserve image quality. Whatever the purpose, the configured QP value is fixed, which makes the configuration process inflexible.
Disclosure of Invention
In view of this, the present application provides an adaptive encoding method, apparatus, device and computer program product.
Specifically, the application is realized by the following technical scheme:
according to a first aspect of an embodiment of the present application, there is provided an adaptive coding method, including:
acquiring a data frame from data to be coded according to a preset time interval;
Determining a target coding scene under the condition that scene switching conditions are met according to the current coding scene and the acquired change degree between adjacent data frames in the data frames;
and encoding the data to be encoded according to the QP value corresponding to the target encoding scene.
According to a second aspect of an embodiment of the present application, there is provided a low-bitrate scene adaptive coding device, including:
the acquisition unit is used for acquiring a data frame from the data to be encoded according to a preset time interval;
The determining unit is used for determining a target coding scene under the condition that the scene switching condition is met according to the current coding scene and the acquired change degree between adjacent data frames in the data frames;
and the encoding unit is used for encoding the data to be encoded according to the QP value corresponding to the target encoding scene.
According to a third aspect of embodiments of the present application, there is provided an electronic device comprising a processor and a memory, wherein,
A memory for storing a computer program;
And a processor configured to implement the method provided in the first aspect when executing the program stored in the memory.
According to a fourth aspect of embodiments of the present application, there is provided a computer program product having a computer program stored therein, which when executed by a processor implements the method provided by the first aspect.
According to the adaptive coding method above, data frames are acquired from the data to be encoded at a preset time interval; a target coding scene is determined, when the scene switching condition is met, according to the current coding scene and the degree of change between adjacent acquired data frames; and the data to be encoded is encoded with the QP value corresponding to the target coding scene. The coding scene is thus determined automatically from the degree of change between adjacent data frames, and the QP value used is selected according to that scene, which improves the flexibility of the QP value configuration process.
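The scene-switching decision described above can be sketched as follows. This is a minimal illustration under assumptions: the single change-degree threshold, the scene names, and the symmetric switching rule are placeholders, not values or logic taken from the application, which leaves the concrete switching condition to the embodiments below.

```python
# Hypothetical switching threshold (illustrative only).
CHANGE_THRESHOLD = 0.3

def select_scene(current_scene: str, change_degree: float) -> str:
    """Switch scenes only when the observed change degree contradicts
    the current coding scene; otherwise keep the current scene."""
    if current_scene == "static" and change_degree > CHANGE_THRESHOLD:
        return "dynamic"      # static -> dynamic switching condition met
    if current_scene == "dynamic" and change_degree <= CHANGE_THRESHOLD:
        return "static"       # dynamic -> static switching condition met
    return current_scene      # switching condition not met
```

With this shape, a small change degree while already in a static scene leaves the scene (and hence the QP value) unchanged, so the QP is only reconfigured at an actual switch.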
Drawings
FIG. 1 is a flow chart of an adaptive coding method according to an exemplary embodiment of the present application;
FIG. 2 is a schematic diagram of a manner of calculating a degree of variation according to an exemplary embodiment of the present application;
FIG. 3a is a schematic diagram illustrating a manner of obtaining a target component value according to an exemplary embodiment of the present application;
FIG. 3b is a schematic diagram illustrating another manner of obtaining target component values in accordance with an exemplary embodiment of the present application;
FIG. 3c is a schematic diagram illustrating yet another way of obtaining a target component value according to an exemplary embodiment of the present application;
FIG. 4 is a schematic diagram of a distribution of data frames according to an exemplary embodiment of the present application;
fig. 5 is a schematic structural diagram of an adaptive coding apparatus according to an exemplary embodiment of the present application;
Fig. 6 is a schematic diagram of a hardware structure of an electronic device according to an exemplary embodiment of the present application.
Detailed Description
The embodiment of the application provides an adaptive coding method applied to a local device. The local device encodes the data to be encoded that is captured during a video conference and sends the encoded data to the peer device, so that the peer device can display the data to the other conference participants (other than the participant at the local device), allowing the video conference to proceed normally.
In practice, a fixed QP configuration makes the configuration process inflexible. Configuring a small QP value can push the code rate beyond the set range, causing packet loss at the peer device; configuring a large QP value can blur the picture displayed on the peer device. In other words, a fixed QP configuration not only lacks flexibility but also leads to unsatisfactory results in the video conference no matter which value is chosen, degrading the stability and quality of the conference.
To address these problems, the embodiment of the application provides an encoding scheme based on a correspondence between target coding scenes and QP values: each target coding scene is associated with a different QP value, so that once the target coding scene of the current video conference is determined, the matching QP value can be obtained. This improves the flexibility of the QP value configuration process, addresses both the excess-code-rate and the blurred-picture problems, and thereby improves the stability and quality of the video conference.
In order to enable those skilled in the art to better understand the technical solutions provided by the embodiments of the present application, the following explains some terms related to the embodiments of the present application:
Macroblock (Macroblock): the macroblock is a basic concept in video coding. It is a strategy of dividing a picture into blocks of different sizes so that different positions can be compressed differently. In video coding, an encoded image (data frame) is typically divided into macroblocks, where one macroblock consists of one luminance (luma) pixel block and two additional chrominance (chroma) pixel blocks. Typically the luma block is 16x16 pixels, while the size of the two chroma blocks depends on the sampling format of the encoded image; for a YUV420-sampled image, each chroma block is 8x8 pixels. Within each encoded image, the macroblocks are organized into slices, and a video coding algorithm encodes the image macroblock by macroblock, organizing the result into a continuous video code stream.
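As a rough illustration of the 16x16 macroblock partition just described (the padding of a frame whose height is not a multiple of 16 is glossed over; this sketch only enumerates block positions, it does not encode anything):

```python
def macroblock_grid(width: int, height: int, mb: int = 16):
    """Top-left coordinates of the macroblocks covering a frame, listed
    row by row, i.e. the order in which an encoder would visit them."""
    return [(x, y) for y in range(0, height, mb) for x in range(0, width, mb)]

# A 1080P frame: 120 columns x 68 rows of macroblocks (the last row
# extends past the 1080-pixel height and would be padded in practice).
grid = macroblock_grid(1920, 1080)
```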
YUV: YUV is a color encoding method often used in video processing components. YUV refers to a family of color spaces; names such as Y′UV, YUV, YCbCr, and YPbPr are all commonly called YUV. A YUV value consists of three components: the Y component, the U component, and the V component. "Y" represents brightness (Luminance or Luma), that is, the gray-scale value of a pixel, while "U" and "V" represent chromaticity (Chrominance or Chroma), which describe the color and saturation of the image and specify the color of the pixel.
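For reference, the Y (luma) component mentioned above can be derived from RGB with the standard BT.601 weights. This is general video knowledge rather than a formula from the application:

```python
def luma(r: float, g: float, b: float) -> float:
    """BT.601 luma: the gray-scale value of a pixel, as described above.
    Green contributes the most, blue the least."""
    return 0.299 * r + 0.587 * g + 0.114 * b
```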
In order to better understand the technical solution provided by the embodiments of the present application and make the above objects, features and advantages of the embodiments of the present application more obvious, the technical solution in the embodiments of the present application will be described in further detail below with reference to the accompanying drawings.
Referring to fig. 1, a flow chart of an adaptive coding method according to an embodiment of the present application is shown in fig. 1, where the adaptive coding method may include the following steps:
Step S100, acquiring a data frame from data to be encoded according to a preset time interval.
For example, the preset time interval may be set according to actual requirements, including but not limited to 200 ms, 300 ms, 500ms, or the like.
The data to be encoded is collected in the process of a video conference, and then transmitted between the conference devices (including the local end device and the opposite end device) after being encoded, wherein the code stream used in the process of the video conference includes but is not limited to single stream, auxiliary stream, double stream and multiple stream.
It should be noted that different preset intervals for acquiring data frames (hereinafter, frame sampling) yield different degrees of change, which may make the subsequently determined target coding scene inaccurate. If the sampling frequency is too high (the preset time interval is inversely proportional to the frequency: the larger the interval, the lower the frequency, and vice versa), the measured degree of change is too low; if the sampling frequency is too low, changes in the intermediate data frames between two sampled frames are missed. In either case the degree of change used to determine the target coding scene is inaccurate, which may cause a misjudgment, so that the QP value used for encoding does not match the QP value actually needed, and the encoded data fails to achieve the expected effect (for example, unclear image quality). The frame sampling frequency may be determined by tuning, or preset according to the desired effect (for example, any value in 2-5 Hz).
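A minimal sketch of frame sampling at the preset interval; the 200 ms interval (5 Hz) and the 30 FPS input stream are assumptions for illustration:

```python
def sample_frames(timestamps_ms, interval_ms=200):
    """Keep one data frame per preset time interval from a stream of
    frame timestamps (milliseconds), dropping the frames in between."""
    sampled, next_due = [], 0
    for t in timestamps_ms:
        if t >= next_due:
            sampled.append(t)
            next_due = t + interval_ms
    return sampled

# A ~30 FPS stream (one frame every 33 ms) sampled at 200 ms intervals.
stream = list(range(0, 1000, 33))
picked = sample_frames(stream)
```

Over one second this keeps roughly five of the thirty frames, which is the kind of reduction the preset interval is meant to achieve before any change-degree computation.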
Wherein the degree of change is for a pixel in a data frame, and is used to characterize the change in a parameter, such as brightness and/or color, associated with the pixel at the same location between two frames of data. The same position refers to the same position coordinates of the pixel in the data frame, for example, there are a data frame a and a data frame B, and the pixel a with the position coordinates of (1, 1) in the data frame a is the same as the pixel B with the position coordinates of (1, 1) in the data frame B.
In addition, the data frames in the data to be encoded have a determined frame rate, such as any frame rate in the range of 10-60 FPS. For lower frame rates, for example below 10 FPS, the effect of processing and displaying the data on the peer device would be too poor (for example, stuttering) for scene determination to be meaningful, so the embodiment of the application does not consider processing the data to be encoded in that case.
Step S110, determining a target coding scene under the condition that the scene switching condition is met according to the current coding scene and the obtained change degree between adjacent data frames in the data frames.
For example, for the acquired data frames, the degree of change between adjacent data frames in the data frames may be determined, and whether the scene switching condition is satisfied may be determined according to the current encoding scene, and further, in the case that the scene switching condition is determined to be satisfied, the target encoding scene may be determined.
For example, in the case where the current encoding scene is a static scene and it is determined that the scene switching condition is satisfied, the target encoding scene is a dynamic scene.
And under the condition that the current coding scene is a dynamic scene and the scene switching condition is met, the target coding scene is a static scene.
Here, adjacent data frames are data frames adjacent in time sequence, that is, closest to each other in time. Suppose frame sampling yields data frame 1, data frame 2, and data frame 3, with timestamps 0 ms, 200 ms, and 400 ms respectively. Then data frame 1 is closest in time to data frame 2; data frame 2 is closest to both data frame 1 and data frame 3; and data frame 3 is closest to data frame 2. That is, data frame 1 is temporally adjacent to data frame 2, data frame 2 is temporally adjacent to data frame 1 and data frame 3, and data frame 3 is temporally adjacent to data frame 2. Of course, temporal adjacency is described here only for ease of understanding and is symmetric: the adjacency of data frame 3 to data frame 2 is no different from the adjacency of data frame 2 to data frame 3.
It should be emphasized that the target coding scene is proposed by the embodiments of the present application to solve the problems above. For business scenarios in practice, such as playing video, PPT, or Excel, the applicant found that they can be divided by degree of change into static scenes, dynamic scenes, hybrid scenes, and so on.
Static scenes and dynamic scenes are a pair of relative concepts defined by the degree of change of the played media data (mainly its data frames, which dominate both the data amount of the data to be encoded and the playback quality). The degree of change in a static scene is generally smaller than in a dynamic scene. The degree of change represents how the information differs between data frames, where the information is the parameters associated with the pixels in a data frame (such as brightness).
In a hybrid scene, the degree of change alternates between large and small; it can be understood in terms of static and dynamic scenes, that is, a hybrid scene is equivalent to frequent switching between a static scene and a dynamic scene. A typical business scenario is searching for a target PPT page among many pages during PPT playback: while searching, the user alternates between pausing to read (corresponding to a static scene) and sliding up or down to search (corresponding to a dynamic scene).
For example, in a video-playback business scenario, most of the time has a large degree of change (such as when the background behind a character in the video switches), and only a small part of the time has a small degree of change (such as when characters are in dialogue and only the information corresponding to faces or bodies changes). When playing PPT, most of the time has a small degree of change (such as the host of the video conference underlining text on a PPT page), and only a small part of the time has a large degree of change (such as page turns). Likewise, when playing Excel, most of the time has a small degree of change (such as the host explaining the data recorded in the sheet), and only a small part has a large degree of change (such as the host sliding the sheet to display other data). That is, playing video is mostly a dynamic scene with occasional static periods, while playing PPT or Excel is mostly a static scene with occasional dynamic periods. Furthermore, an individual business scenario is rarely a hybrid scene for long: scene switching generally occurs between static and dynamic scenes rather than into a hybrid scene, since a hybrid scene indicates that the degree of change of the business scenario fluctuates frequently.
It should be noted that, each target encoding scene corresponds to a set of parameters (the parameters are used for encoding the data to be encoded and at least include QP values), the parameters are obtained through verification, and the verification process may include the following steps:
1) Setting a set of initial parameters for each target coding scene;
2) Coding the data to be coded under the corresponding target coding scene through the initial parameters;
3) Acquiring a corresponding coding effect, and adjusting initial parameters by analyzing the coding effect (such as whether the code rate is exceeded or not);
4) Taking the adjusted parameters as the initial parameters of step 2), and repeating from step 2) until the coding effect meets the requirement (for example, the code rate is no longer exceeded).
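The verification loop in steps 1)-4) can be sketched as below. The bitrate model passed in is a stand-in for actually encoding the data and measuring the coding effect, and the step size and round limit are assumptions:

```python
def tune_qp(initial_qp, bitrate_of, max_bitrate_kbps, step=2, max_rounds=20):
    """Repeat steps 2)-4): encode (here: model) with the current QP,
    check the coding effect, and raise the QP until the code rate fits."""
    qp = initial_qp
    for _ in range(max_rounds):
        if bitrate_of(qp) <= max_bitrate_kbps:   # effect meets the requirement
            return qp
        qp += step                               # adjust and try again
    return qp

# Toy model: bitrate falls linearly as QP rises (illustrative only).
model = lambda qp: 3000 - 100 * qp
tuned = tune_qp(10, model, 1024)
```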
Therefore, the problem of poor image quality is solved through the QP value corresponding to the static scene, and the problem that the code rate exceeds the setting range is solved through the QP value corresponding to the dynamic scene.
It should be noted that the QP value corresponding to the static scene is used to ensure that the sharpness of the encoded data exceeds a preset sharpness, so that the picture displayed on the peer device when it processes the encoded data is of good quality. Sharpness is affected by the QP value and by the screen resolution of the peer device; since screen resolution is an intrinsic attribute of the device and cannot be raised beyond the highest resolution the device supports, sharpness is changed mainly by changing the QP value.
Illustratively, since the code rate is affected by the QP value, sharpness can be adjusted by adjusting the QP value; that is, the sharpness level is related to the QP value. The relation may map sharpness levels, from high to low, onto QP value ranges from small to large. For example, high sharpness may correspond to the QP range (10, 20), medium sharpness to (20, 30), and low sharpness to (30, 40).
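The example ranges above can be expressed directly. Only the three example ranges from the text are used, and the half-open boundary handling is an assumption, since the text does not specify it:

```python
def sharpness_level(qp: int) -> str:
    """Map a QP value to the sharpness tier from the example ranges
    (10, 20], (20, 30], (30, 40] above."""
    if 10 < qp <= 20:
        return "high"
    if 20 < qp <= 30:
        return "medium"
    if 30 < qp <= 40:
        return "low"
    return "undefined"  # outside the example ranges
```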
Illustratively, sharpness may also be determined by other means, such as BIQA (Blind Image Quality Assessment).
The preset sharpness can be set according to requirements; if sharpness is determined through the code rate, the preset sharpness may be, for example, medium sharpness or low sharpness.
It should be noted that the QP value corresponding to the dynamic scene is used to ensure that the code rate of the encoded data does not exceed a preset code rate. The preset code rate can be set from the supported bandwidth of the video conference: the code rate is limited so that data transmission can be completed within the supported bandwidth, that is, the code rate must not exceed the supported bandwidth, otherwise data packets are lost. Accordingly, the preset code rate may be set to the supported bandwidth or slightly below it, for example so that the difference between the preset code rate and the supported bandwidth is smaller than a preset bandwidth threshold; this embodiment does not limit the choice.
It should be noted that, the QP value corresponding to the mixed scene is used to avoid frequent image quality conversion caused by frequent scene switching, so as to improve the stability of the video conference.
And step S120, encoding the data to be encoded according to the QP value corresponding to the target encoding scene.
The QP value may be obtained from a preset mapping table according to a corresponding target coding scene, where the preset mapping table records a correspondence between the target coding scene and the QP value. Of course, the QP value may be obtained by other means, such as from a cache in the local device.
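A sketch of the preset mapping table lookup described above; the scene names, QP values, and default are hypothetical placeholders:

```python
# Hypothetical preset mapping table: target coding scene -> QP value.
QP_BY_SCENE = {"static": 24, "dynamic": 36, "hybrid": 30}

def lookup_qp(scene: str, table=QP_BY_SCENE, default: int = 30) -> int:
    """Fetch the QP value for the target coding scene from the mapping
    table, falling back to a default for an unknown scene."""
    return table.get(scene, default)
```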
In summary, the embodiment of the present application may configure at least two sets of parameters (each including at least a QP value), for example one set for the static scene and another for the dynamic scene. Once the data frames determine whether the current scene is static or dynamic, the data to be encoded is encoded with the corresponding set of parameters. This improves the flexibility of the QP value configuration process, avoids the problems caused by fixing either a large or a small QP value, and improves the stability and quality of the video conference.
When data frames are acquired from the data to be encoded, the target coding scene may also be determined by an artificial-intelligence-based classification model: the model takes the data frames as input, and the target coding scene is determined from its output. The output is a probability value from which the target coding scene is determined; for example, if the possible scenes are static and dynamic, a probability value of at least 0.5 indicates a dynamic scene and a probability value below 0.5 indicates a static scene.
For example, for step S110, the degree of change between adjacent data frames in the acquired data frames may be determined by: determining a target pixel location in the data frame; determining the degree of change between adjacent data frames according to the difference between the target pixel positions in the adjacent data frames; and determining a corresponding target coding scene according to the change degree between the adjacent data frames.
A data frame contains multiple pixels, each with a determined position in the frame; for example, a 1080P data frame contains 1920x1080 pixels, with the top-left pixel at coordinates (1, 1) and the bottom-right pixel at (1920, 1080). A target pixel position is the position of a pixel or a macroblock in the data frame, where a macroblock contains multiple pixels and is obtained by partitioning the frame, that is, the frame is divided into several macroblocks. The degree of change between adjacent data frames is determined from the corresponding pixels, namely the pixels at the target pixel positions, and specifically from the differences between those pixels, i.e. the differences between pixels at the "same position" described above. For example, if the adjacent data frames are data frame 1, containing pixel 11 at top-left position (1, 1), and data frame 2, containing pixel 12 at top-left position (1, 1), the degree of change can be determined from the difference between pixel 11 and pixel 12. Since a macroblock consists of multiple pixels, determining the degree of change by macroblock amounts to determining it indirectly from pixels.
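Comparing pixels at the same positions can be sketched as follows, treating each frame as a 2D grid of Y (luma) values; the change threshold is an assumption, and counting changed pixels is only one of the change measures the text names:

```python
def changed_pixels(frame_a, frame_b, threshold=8):
    """Count pixels whose luma value differs by more than `threshold`
    between the same positions of two frames (2D lists of Y values)."""
    changed = 0
    for row_a, row_b in zip(frame_a, frame_b):
        for ya, yb in zip(row_a, row_b):
            if abs(ya - yb) > threshold:
                changed += 1
    return changed
```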
It should be noted that, when determining the degree of change between adjacent data frames according to the difference between the target pixel positions in the adjacent data frames, there are two cases:
Case one: the information change obtained by comparing one pair of adjacent data frames is used directly as the degree of change.

Case two: after one information change is obtained from a pair of adjacent data frames, subsequently sampled data frames continue to be compared pairwise, yielding two or more information changes; the degree of change is then determined from all of them, for example as the mean or the median of the two or more information changes.
For example, for the data frames in the preset statistical duration, the information change condition between each pair of adjacent data frames can be respectively determined, and the change degree between the adjacent data frames is determined according to the information change condition between each pair of adjacent data frames in the preset statistical duration.
For example, the average value or the median of the information change condition between each pair of adjacent data frames in the preset statistical time length is determined as the change degree between the adjacent data frames.
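As an illustrative sketch of the mean-or-median aggregation just described (the function name, parameters, and sample values are assumptions for illustration, not part of the embodiment):

```python
from statistics import mean, median

def degree_of_change(info_changes, use_median=False):
    """Aggregate the per-pair information-change values collected over the
    preset statistical duration (e.g. ~2 seconds) into one degree of change.

    info_changes: non-negative values, one per pair of adjacent data frames
    within the window. Name and signature are illustrative only.
    """
    if not info_changes:
        raise ValueError("need at least one adjacent-frame comparison")
    return median(info_changes) if use_median else mean(info_changes)
```

For example, `degree_of_change([2, 4, 3])` averages three adjacent-frame comparisons, while `use_median=True` makes the result robust to a single outlier pair.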
It should be noted that the preset statistical duration can be set as required. It should not be set too short, lest the determined degree of change fluctuate excessively and the real scene cannot be identified; nor should it be set too long, which would make the scene judgment too slow.
For example, the preset statistical duration may be 1.5 to 4 seconds, for example, the preset statistical duration may be 2 seconds.
The information change condition is a quantitative representation of the difference; comparing adjacent data frames means comparing the difference between the target pixel positions in the adjacent data frames, i.e., the difference between pixels.
Illustratively, the above-mentioned degree of change includes, but is not limited to, the degree of change of image information and the number of changed pixels.
For the degree of change of the image information, there is a different processing flow for the above-described case one and case two.
For the first case, the process of determining the degree of change of the image information between the adjacent data frames according to the difference between the target pixel positions in the adjacent data frames is as follows: determining a target component value for a pixel at a target pixel location in each of the adjacent data frames; and determining the degree of change of the image information according to the target component value.
For the second case, the process of determining the degree of change of the image information between the adjacent data frames according to the difference between the target pixel positions in the adjacent data frames is as follows: determining a target component value of a pixel at a target pixel position in each data frame of a current adjacent data frame, and determining a target component value of a pixel at a target pixel position in each data frame of a subsequent adjacent data frame; and determining the change degree of the image information according to the current target component value and the subsequent target component value.
The macroblock size may be n×m, where N and M are positive integers, e.g., N is 16 and M is 16. Generally, the macro blocks are square.
In determining the target component value of a pixel at a target pixel location in any data frame, the target component value of each pixel in the data frame may be determined, the target component value of one pixel in each macroblock may be determined, or the average of the target component values of all or part of the pixels (the number is greater than 1) in the macroblock may be determined.
If the size of the macro block is 4×4, the position coordinates of the upper-left pixel in the macro block are (1, 1) and those of the lower-right pixel are (4, 4). Accordingly, in order to represent the position of a pixel in the data frame and in its macro block at the same time, the position may be represented by a four-dimensional vector: for example, with 4×4 macro blocks, for the macro block in the upper-left corner of the data frame and the pixel in the lower-right corner of that macro block, the position coordinates of the pixel may be represented as (1, 1, 4, 4), where the element values 1 and 1 at element positions 0 and 1 represent the position of the macro block in the data frame, and the element values 4 and 4 at element positions 2 and 3 represent the position of the pixel in the macro block. The position may also be represented by a nested two-dimensional vector: for example, for the same macro block and the pixel in its upper-right corner, the position coordinates may be represented as ((1, 1), (1, 4)), where the element value (1, 1) at element position 0 represents the position of the macro block in the data frame and the element value (1, 4) at element position 1 represents the position of the pixel in the macro block. The present embodiment does not limit the manner of representing the position of a pixel.
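A minimal sketch of deriving both position representations from a pixel's 1-indexed frame coordinates (the function name and 4×4 default are illustrative assumptions):

```python
def pixel_position(frame_row, frame_col, mb_size=4):
    """Given a pixel's 1-indexed (row, col) in the data frame, return the
    four-dimensional vector (mb_row, mb_col, in_row, in_col) and the nested
    two-dimensional form ((mb_row, mb_col), (in_row, in_col)).
    Names are illustrative, not from the embodiment."""
    mb_row = (frame_row - 1) // mb_size + 1   # macro block position in frame
    mb_col = (frame_col - 1) // mb_size + 1
    in_row = (frame_row - 1) % mb_size + 1    # pixel position inside block
    in_col = (frame_col - 1) % mb_size + 1
    return (mb_row, mb_col, in_row, in_col), ((mb_row, mb_col), (in_row, in_col))
```

With 4×4 macro blocks, the frame pixel (4, 4) is the lower-right pixel of the upper-left macro block and maps to (1, 1, 4, 4), matching the example above.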
Illustratively, for case one above, a corresponding processing flow is as follows:
When determining the degree of change of the image information according to the target component values, it is determined from the target difference values corresponding to the target pixel positions in the adjacent data frames, i.e., a target difference value is obtained from the target component values at corresponding positions of the two data frames. The target difference value can be obtained from the target component values by the following methods:
In the first method, the target pixel positions are all pixel positions in the data frame, i.e., the target component value of each pixel in the data frame is determined, and the target difference value is calculated therefrom; as shown in fig. 3a, 301 is a data frame, 302 is a pixel, and the target difference value is calculated by determining the target component value of each pixel 302.
In the second method, the target pixel position is a designated pixel position in each macro block, i.e., the target component value of the pixel at the designated pixel position is taken from each macro block of the data frame, and the target difference value is calculated therefrom; as shown in fig. 3b, 301 is a data frame, 302 is a pixel, 303 is a macro block of size 4×4, 304 is the pixel 302 at the designated position in macro block 303, and the target difference value is calculated at least by determining the target component value of pixel 304. A 1080P data frame, for example, may be divided into 8000 macro blocks, i.e., one data frame corresponds to 8000 designated pixel positions.
In the third method, the target pixel position is the position of the macro block in the data frame, i.e., the target component values of all pixels or of a part of the pixels (more than 1) are taken from each macro block of the data frame, the average of those target component values is determined per macro block, and the target difference value is calculated therefrom; as shown in fig. 3c, 301 is a data frame, 302 is a pixel, 303 is a macro block of size 4×4 comprising 16 pixels, and the target difference value is calculated by determining the target component values of those 16 pixels.
It can be understood that different methods consume device computing resources to different degrees, and a technician can select a method according to device performance; ranked from low to high consumption, the methods are method two, method three, method one, or method two, method one, method three.
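The three methods above can be sketched over a single component plane as follows (frame layout, function name, and the upper-left choice of designated pixel are illustrative assumptions):

```python
def target_difference(frame_a, frame_b, method=1, mb=4):
    """Sum of absolute differences between two equal-size frames, per the
    three methods above. Frames are 2-D lists holding one target component
    value (e.g. Y) per pixel. Names/signature are illustrative.

    method 1: every pixel position;
    method 2: one designated pixel (here: upper-left) per mb×mb macro block;
    method 3: the mean component value of each macro block."""
    h, w = len(frame_a), len(frame_a[0])
    if method == 1:
        return sum(abs(frame_a[r][c] - frame_b[r][c])
                   for r in range(h) for c in range(w))
    total = 0
    for r0 in range(0, h, mb):
        for c0 in range(0, w, mb):
            if method == 2:  # designated pixel of the macro block
                total += abs(frame_a[r0][c0] - frame_b[r0][c0])
            else:            # method 3: compare macro block means
                n = mb * mb
                mean_a = sum(frame_a[r][c] for r in range(r0, r0 + mb)
                             for c in range(c0, c0 + mb)) / n
                mean_b = sum(frame_b[r][c] for r in range(r0, r0 + mb)
                             for c in range(c0, c0 + mb)) / n
                total += abs(mean_a - mean_b)
    return total
```

On a uniform 4×4 block whose values shift from 10 to 12, method 1 sums 16 per-pixel differences (32) while methods 2 and 3 each touch far less data (both yield 2), illustrating the resource-consumption ranking in the text.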
Illustratively, the target component values include, but are not limited to, at least one of YUV values or RGB values. A YUV value comprises three component values, namely a Y component value, a U component value and a V component value; an RGB value comprises three component values, namely an R component value, a G component value and a B component value. When the degree of change is obtained through the above three methods of deriving the target difference value from the target component values, different combinations of the three methods can be applied.
Taking the target component value as YUV value as an example:
for the first or second method:
Differences in the Y component value, U component value and/or V component value at each same target pixel position between the two adjacent data frames may be determined separately to yield a target difference value (referred to herein as a first target difference value). For example, if the two data frames are data frame 1 and data frame 2, the coordinates of a target pixel position are (1, 1), the Y component value at that position in data frame 1 is 30 and in data frame 2 is 50, then the first target difference value is 50-30=20. The sum of the absolute values of the first target difference values over all target pixel positions is then obtained, and the degree of change between the adjacent data frames is determined therefrom.
In one possible embodiment, for the method two:
The target component value includes at least two of the Y component value, the U component value and the V component value, and the target difference value includes a target sub-difference value corresponding to each component value. Since the target component value includes at least two components, the target difference value cannot be obtained by simply summing the signed differences between target component values; instead, for each component value, the sum of the absolute values of the target sub-difference values over the target pixel positions must be calculated. For example, if the two data frames are data frame 1 and data frame 2 and the target pixel positions have coordinates (1, 1), (5, 5), (9, 9), etc., with the Y component value in data frame 1 being 30 at (1, 1), 50 at (5, 5) and 20 at (9, 9), and in data frame 2 being 50 at (1, 1), 30 at (5, 5) and 20 at (9, 9), then the target sub-difference values are 50-30=20, 30-50=-20 and 20-20=0, respectively. If the signed sum of the target sub-difference values were calculated directly it would be 0, which obviously cannot reflect the difference between data frame 1 and data frame 2; hence the sum of absolute values must be calculated, i.e., the target sub-difference sum is 20+|-20|+0+……. The sum of the target sub-difference sums corresponding to the component values in the target component value is then calculated, thereby determining the degree of change between the adjacent data frames.
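The cancellation pitfall just described can be shown in a few lines (function name and the flat-list frame representation are illustrative assumptions):

```python
def component_sub_difference_sum(values_a, values_b):
    """Sum of absolute per-position sub-differences for one component value.
    Taking absolute values is essential: signed differences of equal size
    and opposite sign would otherwise cancel to zero."""
    return sum(abs(a - b) for a, b in zip(values_a, values_b))
```

Using the Y values from the example above, the signed sum `20 + (-20) + 0` is 0, while the absolute sum is 40, correctly reflecting that the frames differ.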
For method three:
For any macro block, the average of the target component values of all pixels or part of the pixels in the macro block is determined and taken as the target component value corresponding to the macro block, and a target difference value corresponding to the macro block in the adjacent data frames (referred to herein as a third target difference value) is determined, the third target difference value being the difference between the target component values. For example, if the adjacent data frames include data frame 1 and data frame 2, and data frame 1 contains a 4×4 macro block 1 comprising 16 pixels, the target component value of macro block 1 is the average of the target component values of all or part of those 16 pixels. The target component values comprise at least one of the Y, U and V component values; when the number of component values is greater than 1, the average corresponding to each component value is calculated, the absolute values of the resulting differences are taken, and those absolute values are summed. The degree of change between adjacent data frames is then determined from the sum of the absolute values of the third target difference values corresponding to the macro blocks.
In one possible embodiment, for method two+method three:
For any macro block, the average of the target component values of all pixels or part of the pixels in the macro block is determined, and a target difference value corresponding to the macro block in the adjacent data frames (referred to herein as a second target difference value) is determined from that average, the second target difference value being the difference between the target component values. The larger value between the first target difference value corresponding to the macro block (obtained via method two) and the second target difference value is determined, yielding an effective value corresponding to the macro block. The degree of change between adjacent data frames is then determined from the sum of the effective values corresponding to the macro blocks.
For example, in the case where the target component value includes at least two of the Y, U and V component values, for any macro block and any one of the component values: on the one hand, a target sub-difference value corresponding to that component value at the designated pixel position of the macro block in the adjacent data frames may be determined (referred to as a first target sub-difference value); on the other hand, the average of that component value over all pixels in the macro block may be determined, and a target sub-difference value corresponding to the macro block and that component value in the adjacent data frames determined from the average (referred to as a second target sub-difference value). The larger of the absolute values of the first and second target sub-difference values is determined as the effective target sub-difference value corresponding to the macro block and that component value, and the sum of the absolute values of the effective target sub-difference values over the macro blocks is determined as the effective target sub-difference sum corresponding to the component value.
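A sketch of the per-macro-block combination of method two and method three described above, for one component value (function name and arguments are illustrative assumptions):

```python
def effective_sub_difference(designated_a, designated_b, pixels_a, pixels_b):
    """For one macro block and one component value, take the larger of
    |designated-pixel difference| (method two) and |macro-block-mean
    difference| (method three), per the combined scheme above.
    pixels_a/pixels_b: that component's values for the block's pixels."""
    first = abs(designated_a - designated_b)      # method two
    mean_a = sum(pixels_a) / len(pixels_a)
    mean_b = sum(pixels_b) / len(pixels_b)
    second = abs(mean_a - mean_b)                 # method three
    return max(first, second)
```

Summing this effective value over all macro blocks (and over the component values in use) gives the effective target sub-difference sum from which the degree of change is determined.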
Further, the degree of change between adjacent data frames may be determined based on the sum of the effective target sub-difference values corresponding to each of the target component values.
Further, in the case where the target component value includes at least two of the Y component value, the U component value, and the V component value, for any one of the macro blocks and any one of the target component values, in determining the first target sub-difference value, a product of the component value of the macro block at the specified pixel position and a preset coefficient (e.g., 1.5) may be taken as a component value participating in subsequent calculation; or, in the case of calculating the first target sub-difference value, the product of the first target sub-difference value and a preset coefficient (such as 1.5) may be used as the first target sub-difference value that participates in the comparison of the effective target sub-difference values.
Similarly, in the process of determining the second target sub-difference value, the product of the component values of all pixels of the macro block and a preset coefficient (such as 1.5) can be used as a component value participating in subsequent calculation; or, in the case of calculating the second target sub-difference value, the product of the second target sub-difference value and a preset coefficient (such as 1.5) may be used as the second target sub-difference value that participates in the comparison of the effective target sub-difference values.
Or the product of the absolute value sum and a preset coefficient (such as 1.5) can be determined as the effective target sub-difference value sum corresponding to the component value under the condition that the absolute value sum of the effective target sub-difference values is obtained according to the original value.
Further, the degree of change between adjacent data frames may be determined based on the sum of the effective target sub-difference values corresponding to each of the target component values. For convenience of description, the following embodiments collectively refer to the sum of absolute values of the first target differences corresponding to the target pixel positions, the sum of target sub-difference sums corresponding to the component values in the target component values, the sum of absolute values of the third target differences corresponding to the macro blocks, and the sum of effective values corresponding to the macro blocks as the difference parameter.
In the case where the target component values are RGB values:
Similarly to the case where the target component value is a YUV value, the difference parameter and the degree of change can be obtained through method one, method two, method three, or method two plus method three. For details, reference may be made to the embodiments in which the target component value is a YUV value, which are not repeated herein.
How the degree of change is determined by the difference parameters:
When the degree of change of the image information is determined, it is determined by identifying the preset numerical range in which the difference parameter falls. The preset numerical ranges are set according to the number of pixels or the number of macro blocks: when the target difference value is calculated per pixel, they are set according to the number of pixels; when it is calculated per macro block, according to the number of macro blocks.
In one possible embodiment, the target difference value is calculated per macro block, and the preset numerical ranges are determined from the number of macro blocks. As shown in fig. 2, the value corresponding to each degree of change (degree of change of image information) is the product of the number of macro blocks N and a constant; for example, the value corresponding to degree of change 2 is 3×N. If the value corresponding to degree 0 < difference parameter ≤ the value corresponding to degree 1, the current degree of change is 1; if the value corresponding to degree 1 < difference parameter ≤ the value corresponding to degree 2, it is 2; if the value corresponding to degree 2 < difference parameter ≤ the value corresponding to degree 3, it is 3; if the value corresponding to degree 3 < difference parameter ≤ the value corresponding to degree 4, it is 4; and if the difference parameter > the value corresponding to degree 4, the current degree of change is 5.
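The threshold ladder above can be sketched as follows; the multipliers are illustrative placeholders (chosen so that degree 2 corresponds to 3×N as in fig. 2), not the patent's actual constants:

```python
def image_change_degree(diff_param, num_macroblocks, multipliers=(1, 2, 3, 4, 5)):
    """Map the difference parameter onto degrees 0..5. Each degree's
    corresponding value is a multiple of the macro block count N; a
    difference parameter in (value[d-1], value[d]] yields degree d, and
    anything above the last threshold yields degree 5."""
    thresholds = [k * num_macroblocks for k in multipliers]
    for degree, t in enumerate(thresholds):
        if diff_param <= t:
            return degree
    return 5
```

For N = 100 macro blocks, a difference parameter of 250 falls in (2×N, 3×N] and maps to degree 2.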
In the process of determining the first target difference value, it should be noted that the data volume of the U component value or the V component value is lower than that of the Y component value, so that during encoding the Y component value has a greater effect on the code rate and the image quality than the U or V component value does; that is, whether the picture at the opposite device is clear, and how large the code rate of the local device's encoding is, are affected more by the Y component value than by the U or V component value, and likewise the Y component value contributes more to the degree of change of the image information.
In addition, when the first target difference value is obtained from different component values, its magnitude differs accordingly, so the value corresponding to each degree of change needs to be adjusted correspondingly, for example by multiplying by a coefficient. For instance, when the first target difference value is obtained from the Y, U and V component values, the coefficient may be 1.5, 1.6, or the like; the specific value of the coefficient is not limited in this embodiment.
Accordingly, in the case where the target component value includes at least two of the Y, U and V component values, for any one of the component values, the sum of the absolute values of the target sub-difference values over the target pixel positions may be calculated, and the product of that sum and a preset coefficient (e.g., 1.5) determined as the target sub-difference sum corresponding to the component value; or, for any one of the component values, the product of the target component value and a preset coefficient (e.g., 1.5) may be used as the component value participating in subsequent calculation, the sum of the absolute values of the resulting target sub-difference values over the target pixel positions calculated, and that sum determined as the target sub-difference sum corresponding to the component value. The degree of change between adjacent data frames can then be determined from the sum of the target sub-difference sums corresponding to the component values.
In addition, in practical application, each pixel has a Y component value determined independently, but the U component value and the V component value may be shared by a plurality of pixels, for example, the ratio of YUV three components is 4:1:1, i.e. one U component value and one V component value are shared every 4 pixels.
A processing flow corresponding to case two:
And respectively calculating the difference parameters corresponding to the current adjacent data frame and the subsequent adjacent data frame, wherein the calculation mode is basically the same as the corresponding processing flow aiming at the first condition, and the details are not repeated. The calculation method of the difference parameters corresponding to the current adjacent data frame and the subsequent adjacent data frame should be the same, if the current adjacent data frame calculates to obtain the sum of absolute values of the first target differences corresponding to the target pixel positions, the subsequent adjacent data frame also calculates to obtain the sum of absolute values of the first target differences corresponding to the target pixel positions.
It should be noted that, after the difference parameters corresponding to the current adjacent data frames and the subsequent adjacent data frames are calculated, all the difference parameters are averaged, and the degree of change is then determined from that average. Accordingly, unlike the processing flow for case one, the difference parameter is replaced with the average value when determining the degree of change.
For the number of changed pixels, the difference is that the specific means for determining the degree of change corresponding to the data frame does not calculate a target difference value; instead, the number of pixels whose target component value changes between the adjacent data frames is counted, and that number is compared with a plurality of different preset number thresholds to determine which degree of change the data frame belongs to.
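A minimal sketch of this pixel-count variant (frame representation, function name, and the sample thresholds are illustrative assumptions):

```python
def pixel_count_degree(frame_a, frame_b, thresholds):
    """Degree of change via the number of changed pixels: count positions
    whose target component value differs between adjacent frames, then
    return how many of the ascending preset thresholds the count exceeds.
    Frames are flat sequences of one component value per pixel."""
    changed = sum(1 for a, b in zip(frame_a, frame_b) if a != b)
    return sum(1 for t in thresholds if changed > t)
```

With thresholds `(0, 1, 3)`, two changed pixels exceed the first two thresholds but not the third, giving degree 2.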
It should be noted that, in either the first or second case, when determining the target encoding scene according to the change degree, the determining means includes, but is not limited to, scene switching and direct setting.
Exemplary ways of scene switching include, but are not limited to, the following two:
in the first mode, after the degree of change is obtained, whether scene switching is performed is directly determined. The scene change implemented in this way may be referred to as an instant change.
For mode one, the process of comparing whether the degree of change meets the scene switching condition is as follows:
When the current scene is a static scene, comparing whether the change degree is larger than or equal to a first preset change degree, if so, determining that the scene switching condition is met, wherein the target coding scene is a dynamic scene; or when the current scene is a dynamic scene, comparing whether the change degree is smaller than or equal to a second preset change degree, if so, determining that the scene switching condition is met, and the target coding scene is a static scene.
If the first preset degree of change is the same as the second preset degree of change, i.e., the threshold triggering the switch from the static scene to the dynamic scene equals the threshold triggering the switch from the dynamic scene back to the static scene, small fluctuations in the degree of change between adjacent data frames would cause frequent switching between static and dynamic scenes, degrading video conference quality. Therefore, the first preset degree of change can be set larger than the second preset degree of change, avoiding the poor video conference quality caused by frequent scene switching.
For example, the first preset change degree and the second preset change degree may be set according to needs, which is not limited in this embodiment.
For example, the degree of change may be any value from 0 to 5, the first preset degree of change being 4 and the second preset degree of change being 2.
Accordingly, when the degree of change satisfies the following conditions, scene switching is not performed:
the first condition is that when the current scene is a static scene, the change degree is smaller than a first preset change degree;
and the second condition is that when the current scene is a dynamic scene, the change degree is larger than a second preset change degree.
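The hysteresis behaviour of mode one, using the example thresholds 4 and 2 above, can be sketched as follows (function and scene names are illustrative assumptions):

```python
def next_scene(current_scene, degree, high=4, low=2):
    """Mode-one (instant) switching with hysteresis: the threshold to leave
    the static scene (`high`) is larger than the threshold to return to it
    (`low`), so small fluctuations of the degree of change do not cause
    frequent back-and-forth switching."""
    if current_scene == "static" and degree >= high:
        return "dynamic"          # first preset degree of change reached
    if current_scene == "dynamic" and degree <= low:
        return "static"           # second preset degree of change reached
    return current_scene          # conditions one/two: no switch
```

A degree of 3 leaves either scene unchanged, which is exactly the dead band that prevents flapping.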
In the second mode, after the degree of change is obtained, subsequent degrees of change continue to be obtained, and whether to perform scene switching is determined from all the obtained degrees of change. Scene switching implemented in mode two may be referred to as non-instant switching.
It should be emphasized that the above-mentioned problems are as follows: a smaller QP value configuration can cause the code rate to exceed the set range, so that data packets received by the opposite-end device are lost; a larger QP value configuration can cause picture blurring at the opposite-end device.
In practical applications, the QP value may also be configured within a larger range, with adaptive adjustment of the QP value. For example, if the configurable range of QP values is 1-51, the QP range may be set to 10-40 and the QP value adaptively adjusted within 10-40. The specific means is that the video encoder monitors the code rate: when it detects a sudden increase in code rate, it directly configures a larger QP value to cope with the surge, and then gradually decreases the QP value to a suitable value in light of the coding effect under the configured QP value; when it detects a sudden decrease in code rate, it directly configures a smaller QP value to cope with the drop, and then gradually increases it to a suitable value in the same manner.
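A rough sketch of this conventional surge-then-ease adaptation (this is the behaviour the text goes on to criticize, not the improved method; all numbers, names, and the control law are illustrative assumptions):

```python
def adjust_qp(qp, bitrate, target, qp_min=10, qp_max=40, step=1, surge=1.5):
    """On a bitrate surge, jump QP to qp_max; on a slump, jump to qp_min;
    otherwise step gradually back toward a middle value. Illustrative only:
    real encoders use far more elaborate rate control."""
    if bitrate > surge * target:
        return qp_max                  # clamp hard on sudden increase
    if bitrate < target / surge:
        return qp_min                  # clamp hard on sudden decrease
    mid = (qp_min + qp_max) // 2
    if qp > mid:
        return qp - step               # ease back down toward a suitable QP
    if qp < mid:
        return qp + step               # ease back up toward a suitable QP
    return qp
```

The hard jumps are what produce the blur-then-sharpen effect described next: QP leaps to 40 on a page turn, then creeps back down frame by frame.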
However, in practical applications, adaptively adjusting the QP value still leads to a poor user experience during a video conference. For example, when the user turns a page during a PPT presentation, the code rate increases dramatically; to adapt, the video encoder configures the QP value in the above manner, and during this configuration process the picture perceived by the user first blurs and then sharpens. Similarly, when playing video there is the same blur-then-sharpen problem, as well as sharp and blurred pictures appearing alternately: because video scenes can change frequently, the picture blurs where the change is large and sharpens where it is small, and the user experience is poor.
In summary, QP value adaptive adjustment brings new problems in the case of solving the above-mentioned problems.
The applicant has found through research that these new problems are caused by dynamically adjusting the QP value in real time, and that mode two provided by the embodiment of the present application solves them well: instead of dynamic real-time adjustment, whether to switch scenes is judged comprehensively by combining at least two degrees of change.
For the second mode, the process of comparing whether the change degree meets the scene switching condition is as follows: determining that a scene switching condition is met under the condition that the current scene is a static scene and the number of times that the change degree between adjacent data frames is larger than or equal to the third preset change degree meets the first preset number of times requirement; or determining that the scene switching condition is met under the condition that the current scene is a dynamic scene and the number of times that the change degree between the adjacent data frames is smaller than or equal to the fourth preset change degree reaches the second preset number of times requirement. The third preset change degree and the fourth preset change degree can be set according to requirements, and the embodiment is not limited.
The first preset times requirement may include one of the following:
1. The condition that the degree of change between adjacent data frames is greater than or equal to the third preset degree of change occurs I consecutive times, where I is a positive integer that can be set as required; for example, if I is 3, the scene switching condition is determined to be met when the degree of change between adjacent data frames is greater than or equal to the third preset degree of change 3 times in a row.
2. The condition that the degree of change between adjacent data frames is greater than or equal to the third preset degree of change occurs J times within a count-judgment interval, where J is a positive integer that can be set as required, for example 3; the count-judgment interval specifies the starting position of the judgment and the number of degrees of change used for it. For example, judgment starts when the degree of change between a pair of adjacent data frames is greater than or equal to the third preset degree of change, and the number is 5: that occurrence is taken as the starting point of the count-judgment interval, i.e., the corresponding adjacent data frames are taken as the current adjacent data frames, and it is then judged whether the degrees of change between the subsequent 4 pairs of adjacent data frames are greater than or equal to the third preset degree of change. If, within the count-judgment interval, the degree of change between adjacent data frames is greater than or equal to the third preset degree of change at least J times, the scene switching condition is determined to be met.
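The two count requirements above can be sketched over a sequence of per-pair threshold flags (function name, signature, and the boolean-flag representation are illustrative assumptions):

```python
def times_requirement_met(flags, consecutive=None, window=None, min_hits=None):
    """Check the count requirements above on a sequence of booleans, where
    True means the degree of change for that frame pair crossed the preset
    threshold. Either `consecutive` = I consecutive hits (requirement 1),
    or `min_hits` = J hits in a `window` opening at the first hit
    (requirement 2)."""
    if consecutive is not None:                  # requirement 1: I in a row
        run = 0
        for f in flags:
            run = run + 1 if f else 0
            if run >= consecutive:
                return True
        return False
    for i, f in enumerate(flags):                # requirement 2: window
        if f:                                    # opens at the first hit
            return sum(flags[i:i + window]) >= min_hits
    return False
```

With I = 3, three consecutive hits satisfy requirement 1; with J = 3 and a window of 5, three hits scattered inside the window satisfy requirement 2 even if not consecutive.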
Accordingly, the scene is not switched when the change degree satisfies either of the following conditions:
Condition 3: the current scene is a static scene and the number of times does not meet the first preset number of times requirement.
Condition 4: the current scene is a dynamic scene and the number of times does not meet the second preset number of times requirement.
It should be noted that different service scenarios may correspond to different switching policies, where a switching policy controls the frequency of scene switching: a high frequency means frequent switching, and a low frequency means only occasional switching.
In one example, in a case where the static scene is a document presentation scene and the dynamic scene is a video presentation scene, the first preset number of times requirement is greater than the second preset number of times requirement.
By way of example, consider actual scenes such as video conferences, which are typically in, or expected to be in, a static scene (e.g., a document presentation scene such as a PPT presentation) most of the time; the scene determination strategy may therefore favor static scenes to some extent.
Correspondingly, when the preset number of times requirements are set, the first preset number of times requirement is made larger than the second, so that switching to a dynamic scene is triggered only after the change degree between adjacent data frames has been greater than or equal to the third preset change degree relatively many times in a static scene, while switching back to a static scene can be triggered after the change degree between adjacent data frames has been smaller than or equal to the fourth preset change degree relatively few times in a dynamic scene (such as a video presentation scene).
In one possible embodiment, the switching strategy is determined according to the proportions of static scenes, dynamic scenes, and mixed scenes in the service scene. When the proportion of static or dynamic scenes is high (e.g., static scenes account for 50%), the scene switching frequency is kept low, which avoids the visual impact of the picture alternating between clear and blurry; when the proportion of mixed scenes is high (e.g., 60%), the switching frequency is kept high. The frequency can be adjusted by setting the first and second preset change degrees in the first mode, and the counts I and J of the number-of-times judgment interval together with the third and fourth preset change degrees in the second mode. A large difference between the first and second preset change degrees gives a low frequency, and a small difference a high frequency; likewise, larger settings of I and J for the judgment interval and/or a large difference between the third and fourth preset change degrees give a low frequency, while a small difference gives a high frequency.
Furthermore, for the setting scenario:
That is, when the number of pixel changes is greater than the preset number of changes, the target encoding scene may be set as a dynamic scene, and when the number of pixel changes is less than or equal to the preset number of changes, the target encoding scene may be set as a static scene.
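The direct setting rule above amounts to a single threshold comparison; a minimal sketch (the function and parameter names are illustrative, not from the patent):

```python
def set_scene(changed_pixel_count, preset_change_count):
    """Direct setting: classify the target encoding scene from the number
    of pixel changes between adjacent data frames."""
    # strictly greater than the preset number of changes -> dynamic scene
    return "dynamic" if changed_pixel_count > preset_change_count else "static"
```

Note the boundary: a count exactly equal to the preset number of changes yields a static scene, matching the "less than or equal to" branch in the text.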
It should be noted that the direct setting scenario likewise includes instant setting and non-instant setting. The specific procedures for instant and non-instant setting can refer to the instant switching and non-instant switching schemes described above, and are not repeated in this embodiment.
In addition, the adaptive coding method provided by the embodiment of the application can further comprise the following steps:
When the current scene is a static scene, if the change degree between the current data frame to be encoded and its previous data frame in time sequence in the data to be encoded is greater than or equal to a fifth preset change degree, and the change degree between adjacent data frames does not meet the scene switching condition, at least one data frame subsequent in time sequence to the current data frame to be encoded in the data to be encoded is replaced with the current data frame to be encoded; the number of replaced data frames is related to the magnitude of the change degree.
For example, the numbers of data frames replaced at different change degrees may differ; the number of data frames replaced at a large change degree is not smaller than the number replaced at a small change degree.
In one example, the number of replaced data frames may be positively correlated with the degree of change, i.e., the greater the degree of change, the greater the number of replaced frames.
The extent of the change and the number of data frames replaced are illustrated by way of example below.
For example, assuming the fifth preset change degree is 1, when the change degree between the current data frame to be encoded and its previous data frame in time sequence in the data to be encoded is 2, the current video picture may be considered to have changed only slightly, for example during a still picture presentation (such as the host of a video conference underlining text on a PPT page). In this case the probability of exceeding the bit rate is small, and fewer frames may be replaced in the above manner.
When the change degree between the current data frame to be encoded and its previous data frame in time sequence in the data to be encoded is 3, the current video picture is considered to have changed greatly while still being in a static scene; the change is probably caused by a picture switching process (such as turning a PPT page). In this case, replacing the frames during the switch with the frame before the switch does not affect the display of normal picture content, whereas not replacing them easily leads to exceeding the bit rate, so more frames can be replaced in the above manner.
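One possible monotone mapping consistent with the two examples above (degree 2 replaces fewer frames, degree 3 replaces more) is sketched below; the exact mapping is not specified in the text, so the formula here is purely hypothetical:

```python
def frames_to_replace(change_degree, fifth_preset=1):
    """Hypothetical monotone mapping: the further the change degree is
    above the fifth preset change degree, the more subsequent frames
    are replaced (encoded as copies of the current frame)."""
    if change_degree < fifth_preset:
        return 0  # below the threshold: no replacement at all
    return change_degree - fifth_preset + 1
```

Any non-decreasing function of the change degree would satisfy the constraint stated in the text; a linear one is used here only for concreteness.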
For example, if the current scene is a dynamic scene, or the change degree between adjacent data frames satisfies the scene switching condition, or the change degree between the current data frame to be encoded and its previous data frame in time sequence in the data to be encoded is smaller than the fifth preset change degree, the replacement process is not performed.
It should be noted that, since the current scene is a static scene, it corresponds to a smaller QP value, which may cause the bit rate to exceed the set range. When the data frames in the data to be encoded are encoded with residual coding (Forward Binary Coding, FBC), replacing the at least one data frame with the current data frame to be encoded reduces the amount of data produced during encoding, lowers the bit rate, and reduces the probability of exceeding the bit rate.
The fifth preset change degree may be set as required; for example, if the change degree takes values in the range 0-5, the fifth preset change degree may be any one of the 6 change degrees 0-5.
For example, the comparison between the change degree and the fifth preset change degree may be made for every data frame, once every certain number of frames (e.g., every 3 frames), or conditionally depending on the data frame, which may improve coding efficiency.
For example, the frequency of comparing the variation between the current data frame to be encoded and the previous data frame in the time sequence of the data to be encoded with the fifth preset variation may be a fixed frequency, i.e. a fixed interval of several frames (e.g. 2 frames).
Or, the frequency of comparing the variation degree between the current data frame to be encoded and the previous data frame in the time sequence in the data to be encoded with the fifth preset variation degree may be related to the variation degree.
In one example, the comparison frequency may be positively correlated with the change degree: the greater the change degree, the higher the comparison frequency and the fewer the skipped video frames; the smaller the change degree, the lower the comparison frequency and the more the skipped video frames.
For example, when the change degree is change degree 1, the current video picture may be considered to have changed very little, for example during a still picture presentation (such as displaying a PPT page); in this case, the next two data frames may be skipped and the frame after them judged.
When the change degree is change degree 3, the current video picture may be considered to have changed to a certain extent; taking PPT presentation as an example, a page turn may be in progress. In this case, the next data frame may be skipped and the frame after it judged.
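A sketch of a skip schedule consistent with these two examples (degree 1 skips 2 frames, degree 3 skips 1); the concrete formula and the maximum degree are assumptions, not given in the text:

```python
def frames_to_skip(change_degree, max_degree=5):
    """Hypothetical inverse mapping: a small change degree means the picture
    is nearly still, so more frames can be skipped before the next
    comparison; a large change degree brings the next comparison sooner."""
    # degree 1 -> skip 2, degree 3 -> skip 1, degree >= 5 -> skip 0
    return max(0, (max_degree - change_degree) // 2)
```

This realizes the positive correlation between comparison frequency and change degree stated above: the skip count (the inverse of the frequency) decreases as the degree grows.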
Replacing at least one data frame subsequent in time sequence to the current data frame to be encoded with the current data frame to be encoded is performed according to a corresponding process, namely encoding the same frame, for example encoding PSkip frames.
Replacing at least one data frame subsequent in time sequence to the current data frame to be encoded with the current data frame to be encoded is equivalent to discarding the original frames of that at least one data frame, thereby avoiding exceeding the bit rate due to large changes between adjacent frames in a static scene.
For a dynamic scene, since the QP value is set relatively large, a large variation between adjacent data frames does not produce an excessive bit rate, and dropping frames in a dynamic scene easily causes picture stuttering. Thus, for dynamic scenes, the frame replacement operation in the manner described above may not be required.
In summary, take the fifth preset change degree as change degree 2 and detection at a fixed interval of 2 frames as an example. As shown in fig. 4, 401 is the time sequence direction (time increases from left to right); the data frames along the time sequence direction are numbered X to X+9; 402 is the data frame with sequence number X+1, 403 the data frame with sequence number X+4, and 404 the data frame with sequence number X+7. The change degree of data frame 402 relative to its previous data frame in time sequence is change degree 1, that of data frame 403 is change degree 3, and that of data frame 404 is change degree 5. On this basis, after the change degree corresponding to data frame 402 is determined to be change degree 1, the two subsequent data frames are skipped, i.e., neither the above judgment nor the corresponding processing is performed on them. Data frame 403 is then judged; after its change degree is determined to be change degree 3, the two data frames with sequence numbers X+5 and X+6 are not judged, and the corresponding processing is performed on the data frame with sequence number X+5. Data frame 404 is judged; after its change degree is determined to be change degree 5, the two data frames with sequence numbers X+8 and X+9 are not judged, but the corresponding processing is performed on both of them.
Based on the same application concept as the above method, an adaptive coding apparatus is provided in an embodiment of the present application, as shown in fig. 5, which is a schematic structural diagram of the apparatus, where the apparatus may include:
An obtaining unit 501, configured to obtain a data frame from data to be encoded according to a preset time interval;
A determining unit 502, configured to determine a target encoding scene when determining that a scene switching condition is satisfied according to a current encoding scene and an acquired degree of change between adjacent data frames in the data frames;
and the encoding unit 503 is configured to encode the data to be encoded according to the QP value corresponding to the target encoding scene.
In some embodiments, the determining unit 502 is specifically configured to determine a target pixel location in a data frame; and determining the change degree between the adjacent data frames according to the difference between the target pixel positions in the adjacent data frames.
In some embodiments, the data frame is divided into a plurality of macroblocks, and for any macroblock, the target pixel location in the data frame includes a specified pixel location in the macroblock;
the determining unit 502 is specifically configured to determine, for any target pixel position, a first target difference value corresponding to the target pixel position in an adjacent data frame; wherein the first target difference is a difference between target component values; and determining the change degree between adjacent data frames according to the first target difference value corresponding to each target pixel position.
In some embodiments, the target component values include at least two of a Y component value, a U component value, and a V component value; the first target difference value comprises a target sub-difference value corresponding to each component value;
The determining unit 502 is specifically configured to calculate, for any component value included in the target component values, a sum of absolute values of target sub-difference values corresponding to each target pixel position and the component value, to obtain a target sub-difference value sum; and determining the change degree between adjacent data frames according to the sum of target sub-difference values corresponding to the component values in the target component values.
In some embodiments, the determining unit 502 is specifically configured to determine, for any macroblock, a mean value of target component values of all pixels in the macroblock, and determine, according to the mean value, a second target difference value corresponding to the macroblock in an adjacent data frame; wherein the second target difference is a difference between the means; determining a larger value between the absolute value of the first target difference value corresponding to the macro block and the absolute value of the second target difference value to obtain an effective value corresponding to the macro block; and determining the change degree between adjacent data frames according to the sum of the effective values corresponding to the macro blocks.
In some embodiments, a frame of data is divided into a plurality of macroblocks;
The determining unit 502 is specifically configured to determine, for any macroblock, a mean value of target component values of all pixels in the macroblock, take the mean value as a target component value corresponding to the macroblock, and determine a third target difference value corresponding to the macroblock in an adjacent data frame; wherein the third target difference is a difference between target component values; and determining the change degree between adjacent data frames according to the sum of absolute values of third target difference values corresponding to the macro blocks.
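The macroblock-mean method just described (the third target difference) can be sketched as follows, assuming frames are 2-D arrays of a single target component such as Y; the function name and the 16x16 block size are illustrative choices, not fixed by the text:

```python
def macroblock_mean_degree(frame_a, frame_b, mb_size=16):
    """For each macroblock, take the mean target component value over all
    its pixels as the macroblock's target component value, difference it
    across the two adjacent frames, and sum the absolute differences to
    obtain the change degree. Frames are 2-D lists of one component."""
    h, w = len(frame_a), len(frame_a[0])
    total = 0.0
    for y in range(0, h, mb_size):
        for x in range(0, w, mb_size):
            block_a = [frame_a[r][c] for r in range(y, min(y + mb_size, h))
                       for c in range(x, min(x + mb_size, w))]
            block_b = [frame_b[r][c] for r in range(y, min(y + mb_size, h))
                       for c in range(x, min(x + mb_size, w))]
            # third target difference: difference between macroblock means
            total += abs(sum(block_a) / len(block_a) - sum(block_b) / len(block_b))
    return total
```

The variant of the preceding embodiment (the second target difference combined with per-pixel first target differences via a max) follows the same structure, with the per-macroblock term replaced by the larger of the two absolute differences.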
In some embodiments, the determining unit 502 is specifically configured to determine, for the data frames within the preset statistical duration, information change conditions between each pair of adjacent data frames respectively; and determining the change degree between the adjacent data frames according to the information change condition between each pair of adjacent data frames in the preset statistical time length.
In some embodiments, the scene cut includes an instant cut;
The determining unit 502 is specifically configured to determine that a scene switching condition is met when the current scene is a static scene and the degree of change between the adjacent data frames is greater than or equal to a first preset degree of change; the target coding scene is a dynamic scene; or determining that a scene switching condition is met under the condition that the current scene is the dynamic scene and the change degree between the adjacent data frames is smaller than or equal to a second preset change degree; the target coding scene is a static scene;
Wherein the first preset variation degree is greater than the second preset variation degree.
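The instant-switching rule with its two asymmetric thresholds (the first preset change degree greater than the second) behaves like hysteresis, so the scene does not flap around a single threshold. A minimal sketch with illustrative preset values:

```python
def instant_switch(current_scene, change_degree,
                   first_preset=4, second_preset=1):
    """Instant switching: a static scene switches to dynamic only when the
    change degree reaches the higher threshold; a dynamic scene switches
    back to static only when it drops to the lower threshold. The preset
    values here are illustrative, not from the patent."""
    if current_scene == "static" and change_degree >= first_preset:
        return "dynamic"
    if current_scene == "dynamic" and change_degree <= second_preset:
        return "static"
    return current_scene  # between the thresholds: keep the current scene
```

Because first_preset > second_preset, any change degree strictly between the two thresholds leaves the scene unchanged regardless of its current value.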
In some embodiments, the scene cuts include non-instant cuts;
The determining unit 502 is specifically configured to determine that a scene switching condition is met when the current scene is a static scene and the number of times that the degree of change between adjacent data frames is greater than or equal to a third preset degree of change meets a first preset number of times requirement; or determining that the scene switching condition is met under the condition that the current scene is a dynamic scene and the number of times that the change degree between adjacent data frames is smaller than or equal to the fourth preset change degree reaches the second preset number of times requirement;
Under the condition that the static scene is a document display scene and the dynamic scene is a video display scene, the first preset times requirement is larger than the second preset times requirement.
In some embodiments, the encoding unit 503 is further configured to, in a case where the current scene is a static scene, replace at least one data frame that is subsequent to the current data frame to be encoded in the time sequence in the data to be encoded with the current data frame to be encoded if the degree of change between the current data frame to be encoded and the previous data frame in the time sequence in the data to be encoded is greater than or equal to a fifth preset degree of change, and the degree of change between adjacent data frames does not satisfy the scene switching condition;
Wherein the number of the replaced data frames is related to the magnitude of the degree of variation between the current data frame to be encoded and the previous data frame in the time sequence of the current data frame in the data to be encoded;
the frequency of comparing the change degree between the current data frame to be encoded and the previous data frame in the time sequence of the current data frame to be encoded with the fifth preset change degree is related to the change degree between the current data frame to be encoded and the previous data frame in the time sequence of the current data frame to be encoded and the data to be encoded.
The embodiment of the application also provides an electronic device, which comprises a processor and a memory, wherein the memory is used for storing a computer program, and the processor is used for implementing the adaptive coding method described above when executing the program stored in the memory.
Fig. 6 is a schematic diagram of a hardware structure of an electronic device according to an embodiment of the present application. The electronic device may include a processor 601, a memory 602 storing machine-executable instructions. The processor 601 and memory 602 may communicate via a system bus 603. Also, the processor 601 may perform the adaptive encoding method described above by reading and executing machine-executable instructions corresponding to the adaptive encoding logic in the memory 602.
The memory 602 referred to herein may be any electronic, magnetic, optical, or other physical storage device that may contain or store information, such as executable instructions, data, and so on. For example, a machine-readable storage medium may be: RAM (Random Access Memory), volatile memory, non-volatile memory, flash memory, a storage drive (e.g., hard drive), a solid state disk, any type of storage disk (e.g., optical disk, DVD, etc.), a similar storage medium, or a combination thereof.
In some embodiments, a machine-readable storage medium, such as memory 602 in fig. 6, is also provided, having stored thereon machine-executable instructions that when executed by a processor implement the adaptive encoding method described above. For example, the machine-readable storage medium may be ROM, RAM, CD-ROM, magnetic tape, floppy disk, optical data storage device, etc.
Embodiments of the present application also provide a computer program product storing a computer program and causing a processor to perform the adaptive encoding method described above when the processor executes the computer program.
Claims (12)
1. An adaptive encoding method, comprising:
acquiring a data frame from data to be coded according to a preset time interval;
Determining a target coding scene under the condition that scene switching conditions are met according to the current coding scene and the acquired change degree between adjacent data frames in the data frames;
coding the data to be coded according to the QP value corresponding to the target coding scene;
wherein the method further comprises:
if the current scene is a static scene, if the change degree between the current data frame to be encoded and the previous data frame in the time sequence of the current data frame to be encoded is greater than or equal to a fifth preset change degree and the change degree between adjacent data frames does not meet the scene switching condition, replacing at least one subsequent data frame in the time sequence of the current data frame to be encoded in the data to be encoded with the current data frame to be encoded;
Wherein the number of the replaced data frames is related to the magnitude of the degree of variation between the current data frame to be encoded and the previous data frame in the time sequence of the current data frame in the data to be encoded;
the frequency of comparing the change degree between the current data frame to be encoded and the previous data frame in the time sequence of the current data frame to be encoded with the fifth preset change degree is related to the change degree between the current data frame to be encoded and the previous data frame in the time sequence of the current data frame to be encoded and the data to be encoded.
2. The method of claim 1, wherein the degree of variation between adjacent ones of the acquired data frames is determined by:
determining a target pixel location in the data frame;
and determining the change degree between the adjacent data frames according to the difference between the target pixel positions in the adjacent data frames.
3. The method of claim 2, wherein the data frame is divided into a plurality of macroblocks, and wherein for any macroblock, the target pixel location in the data frame comprises a specified pixel location in the macroblock;
the determining the variation degree between the adjacent data frames according to the difference between the target pixel positions in the adjacent data frames comprises the following steps:
For any target pixel position, determining a first target difference value corresponding to the target pixel position in the adjacent data frame; wherein the first target difference is a difference between target component values;
And determining the change degree between adjacent data frames according to the first target difference value corresponding to each target pixel position.
4. A method according to claim 3, wherein the target component values comprise at least two of Y component values, U component values, and V component values; the first target difference value comprises a target sub-difference value corresponding to each component value;
The determining the change degree between the adjacent data frames according to the first target difference value corresponding to each target pixel position includes:
For any component value included in the target component values, calculating the sum of absolute values of target sub-difference values corresponding to the component values of each target pixel position to obtain a target sub-difference value sum;
and determining the change degree between adjacent data frames according to the sum of target sub-difference values corresponding to the component values in the target component values.
5. A method according to claim 3, wherein determining the degree of variation between adjacent data frames based on the first target difference value corresponding to each target pixel position comprises:
for any macro block, determining the average value of target component values of all pixels in the macro block, and determining a second target difference value corresponding to the macro block in an adjacent data frame according to the average value; wherein the second target difference is a difference between the means;
Determining a larger value between the absolute value of the first target difference value corresponding to the macro block and the absolute value of the second target difference value to obtain an effective value corresponding to the macro block;
and determining the change degree between adjacent data frames according to the sum of the effective values corresponding to the macro blocks.
6. The method of claim 2, wherein the data frame is divided into a plurality of macroblocks;
the determining the variation degree between the adjacent data frames according to the difference between the target pixel positions in the adjacent data frames comprises the following steps:
for any macro block, determining the average value of target component values of all pixels in the macro block, taking the average value as the target component value corresponding to the macro block, and determining a third target difference value corresponding to the macro block in an adjacent data frame; wherein the third target difference is a difference between target component values;
and determining the change degree between adjacent data frames according to the sum of absolute values of third target difference values corresponding to the macro blocks.
7. The method of claim 1, wherein the degree of variation between adjacent ones of the acquired data frames is determined by:
for data frames in a preset statistical time length, respectively determining information change conditions between each pair of adjacent data frames;
And determining the change degree between the adjacent data frames according to the information change condition between each pair of adjacent data frames in the preset statistical time length.
8. The method of claim 1, wherein the scene cut comprises an instant cut;
and determining that the scene switching condition is met according to the current coding scene and the change degree between the adjacent data frames, wherein the method comprises the following steps:
determining that a scene switching condition is met under the condition that the current scene is a static scene and the change degree between the adjacent data frames is greater than or equal to a first preset change degree;
the target coding scene is a dynamic scene;
Or alternatively
Determining that a scene switching condition is met under the condition that the current scene is the dynamic scene and the change degree between the adjacent data frames is smaller than or equal to a second preset change degree;
the target coding scene is a static scene;
Wherein the first preset variation degree is greater than the second preset variation degree.
9. The method of claim 1, wherein the scene cuts include non-instant cuts;
and determining that the scene switching condition is met according to the current coding scene and the change degree between the adjacent data frames, wherein the method comprises the following steps:
determining that a scene switching condition is met under the condition that the current scene is a static scene and the number of times that the change degree between adjacent data frames is larger than or equal to the third preset change degree meets the first preset number of times requirement;
Or alternatively
Determining that the scene switching condition is met under the condition that the current scene is a dynamic scene and the number of times that the change degree between adjacent data frames is smaller than or equal to the fourth preset change degree reaches the second preset number of times requirement;
Under the condition that the static scene is a document display scene and the dynamic scene is a video display scene, the first preset times requirement is larger than the second preset times requirement.
10. An adaptive coding apparatus, comprising:
the acquisition unit is used for acquiring a data frame from the data to be encoded according to a preset time interval;
The determining unit is used for determining a target coding scene under the condition that the scene switching condition is met according to the current coding scene and the acquired change degree between adjacent data frames in the data frames;
the encoding unit is used for encoding the data to be encoded according to the QP value corresponding to the target encoding scene;
The encoding unit is further configured to, when the current scene is a static scene, replace at least one data frame that is subsequent to the current data frame to be encoded in the time sequence of the current data frame to be encoded with the current data frame to be encoded if the degree of change between the current data frame to be encoded and the previous data frame in the time sequence of the current data frame to be encoded is greater than or equal to a fifth preset degree of change, and the degree of change between adjacent data frames does not satisfy the scene switching condition;
Wherein the number of the replaced data frames is related to the magnitude of the degree of variation between the current data frame to be encoded and the previous data frame in the time sequence of the current data frame in the data to be encoded;
the frequency of comparing the change degree between the current data frame to be encoded and the previous data frame in the time sequence of the current data frame to be encoded with the fifth preset change degree is related to the change degree between the current data frame to be encoded and the previous data frame in the time sequence of the current data frame to be encoded and the data to be encoded.
11. An electronic device comprising a processor and a memory, wherein,
A memory for storing a computer program;
A processor for implementing the method of any of claims 1-9 when executing a program stored on a memory.
12. A computer program product, characterized in that the computer program product has stored therein a computer program which, when executed by a processor, implements the method of any of claims 1-9.
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202410859104.1A CN118413674B (en) | 2024-06-27 | 2024-06-27 | Adaptive coding method, apparatus, device and computer program product |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN118413674A CN118413674A (en) | 2024-07-30 |
| CN118413674B true CN118413674B (en) | 2024-08-23 |
Family
ID=91991032
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202410859104.1A Active CN118413674B (en) | 2024-06-27 | 2024-06-27 | Adaptive coding method, apparatus, device and computer program product |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN118413674B (en) |
Citations (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN101795415A (en) * | 2010-04-22 | 2010-08-04 | 杭州华三通信技术有限公司 | Method and device for controlling code rate in video coding |
| CN102625106A (en) * | 2012-03-28 | 2012-08-01 | 上海交通大学 | Scene-adaptive screen coding rate control method and system |
Family Cites Families (7)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US8654848B2 (en) * | 2005-10-17 | 2014-02-18 | Qualcomm Incorporated | Method and apparatus for shot detection in video streaming |
| KR101336445B1 (en) * | 2007-06-08 | 2013-12-04 | 삼성전자주식회사 | Method for rate control in video encoding |
| KR20140110221A (en) * | 2013-03-06 | 2014-09-17 | 삼성전자주식회사 | Video encoder, method of detecting scene change and method of controlling video encoder |
| CN110365983B (en) * | 2019-09-02 | 2019-12-13 | 珠海亿智电子科技有限公司 | Macroblock-level code rate control method and device based on human eye vision system |
| CN113766226B (en) * | 2020-06-05 | 2025-10-17 | 深圳市中兴微电子技术有限公司 | Image coding method, device, equipment and storage medium |
| CN117176955A (en) * | 2023-07-27 | 2023-12-05 | 浙江大华技术股份有限公司 | Video encoding method, video decoding method, computer device, and storage medium |
| CN116866604A (en) * | 2023-07-31 | 2023-10-10 | 展讯半导体(成都)有限公司 | Image processing method and device |
- 2024-06-27: application CN202410859104.1A filed, granted as CN118413674B (status: Active)
Also Published As
| Publication number | Publication date |
|---|---|
| CN118413674A (en) | 2024-07-30 |
Similar Documents
| Publication | Title |
|---|---|
| CN113766226B (en) | Image coding method, device, equipment and storage medium |
| CN101816182B (en) | Image encoding device, image encoding method, image encoding and decoding system |
| TWI743919B (en) | Video processing apparatus and processing method of video stream |
| US6430222B1 (en) | Moving picture coding apparatus |
| US12125171B2 (en) | Video denoising method and apparatus, and storage medium |
| JP4153202B2 (en) | Video encoding device |
| EP0449555B1 (en) | Image encoding apparatus |
| US6873654B1 (en) | Method and system for predictive control for live streaming video/audio media |
| US20050123282A1 (en) | Graphical symbols for H.264 bitstream syntax elements |
| JP5514338B2 (en) | Video processing device, video processing method, television receiver, program, and recording medium |
| WO2017160606A1 (en) | Opportunistic frame dropping for variable-frame-rate encoding |
| JP2018101866A (en) | Image processing apparatus, image processing method, and program |
| EP3718306A1 (en) | Cluster refinement for texture synthesis in video coding |
| JP2018101867A (en) | Image processing apparatus, image processing method, and program |
| CN110740316A (en) | Data coding method and device |
| CN118301361A (en) | Video compression method and device and electronic equipment |
| CN111770334B (en) | Data encoding method and device, and data decoding method and device |
| CN118413674B (en) | Adaptive coding method, apparatus, device and computer program product |
| US8934733B2 (en) | Method, apparatus, and non-transitory computer readable medium for enhancing image contrast |
| CN110381315A (en) | Bit rate control method and device |
| EP4013053A1 (en) | Adaptive quality boosting for low latency video coding |
| US8369423B2 (en) | Method and device for coding |
| CN117998087A (en) | Video coding parameter adjustment method, device and equipment based on content attribute |
| CN114040197A (en) | Video detection method, device, equipment and storage medium |
| CN113810692A (en) | Method for framing changes and movements, image processing apparatus and program product |
Legal Events
| Code | Title |
|---|---|
| PB01 | Publication |
| SE01 | Entry into force of request for substantive examination |
| GR01 | Patent grant |