CN108810537B - Picture transcoding method and device and image processing equipment

Info

Publication number: CN108810537B
Application number: CN201710283315.5A
Authority: CN
Inventors: 刘海军; 王诗涛; 罗斌姬; 黄晓政
Original assignee: Tencent Technology Shenzhen Co Ltd
Current assignee: Tencent Technology Shenzhen Co Ltd
Priority date: 2017-04-26
Filing date: 2017-04-26
Publication date: 2023-04-07
Anticipated expiration: 2037-04-26
Also published as: WO2018196614A1; TWI669947B; TW201907714A; CN108810537A

Abstract

The embodiments of the invention disclose a picture transcoding method, a picture transcoding device, and image processing equipment. The method comprises: after a target picture is input into a transcoding system, decoding the target picture; if the target picture is determined from the decoding result to be a natural picture, using a lossy compression mode as the input source for transcoding video encoding; and if the target picture is determined from the decoding result to be an unnatural picture, using a lossless compression mode as the input source for transcoding video encoding. This scheme can improve picture compression efficiency while retaining the necessary effective information of the target picture.

Description

Picture transcoding method and device and image processing equipment

Technical Field

The present invention relates to the field of image processing, and in particular, to a method and an apparatus for transcoding a picture, and an image processing device.

Background

With the development of the mobile internet, the download traffic of terminal devices has increased greatly, and pictures account for a large proportion of users' download traffic. Transmitting a large number of pictures places a heavy load on the network. The mainstream picture formats currently transmitted over the network include: JPEG (Joint Photographic Experts Group, the first international image compression standard), PNG (Portable Network Graphics), GIF (Graphics Interchange Format), and the like.

Current picture transcoding schemes for the JPEG, PNG, and GIF formats mentioned above suffer from either low compression efficiency or excessive loss of effective information.

The conventional picture transcoding scheme determines, in a predetermined or user-specified manner, which source format is used as input for video algorithm compression. For example, video algorithm compression is performed with the YUV420 source format as input, or alternatively with the YUV444 source format as input. The former may cause excessive loss of effective information, and the latter may cause low compression efficiency.

Therefore, existing picture transcoding schemes have the problem of low compression efficiency or excessive loss of effective information.

Disclosure of Invention

The embodiment of the invention provides a picture transcoding method, a picture transcoding device and image processing equipment, which are used for improving the picture compression efficiency on the premise of keeping necessary effective information of a picture.

In one aspect, an embodiment of the present invention provides a method for transcoding a picture, including:

under the condition that the target picture is a natural picture, a lossy compression mode is adopted as an input source to carry out transcoding video coding input;

and under the condition that the target picture is an unnatural picture, a lossless compression mode is adopted as an input source to carry out transcoding video coding input.

In an optional implementation, the method further includes:

after the target picture is input into a transcoding system, decoding the target picture;

the case that the target picture is a natural picture includes: determining, after decoding the target picture, that the target picture is a natural picture;

the case that the target picture is an unnatural picture includes: determining, after decoding the target picture, that the target picture is an unnatural picture.

In an optional implementation manner, the target picture being an unnatural picture includes:

the target picture contains text information, or the target picture is automatically generated by the device.

In an optional implementation manner, the target picture containing text information includes:

the format of the target picture is PNG, and the target picture comprises a text data block;

or the target picture is in a GIF format and comprises a picture file extension data block.
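The two metadata checks above can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes that the "text data block" corresponds to the PNG `tEXt`, `zTXt`, and `iTXt` chunk types and that the "picture file extension data block" corresponds to GIF comment or plain-text extension blocks. The function names are hypothetical.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
# Assumption: these chunk types are what the text calls a "text data block".
PNG_TEXT_CHUNKS = {b"tEXt", b"zTXt", b"iTXt"}
# Assumption: plain-text (0x01) and comment (0xFE) extensions count as text.
GIF_TEXT_EXTENSIONS = {0x01, 0xFE}


def png_has_text_chunk(data: bytes) -> bool:
    """Return True if a PNG byte stream contains a textual chunk."""
    if not data.startswith(PNG_SIGNATURE):
        return False
    pos = len(PNG_SIGNATURE)
    while pos + 8 <= len(data):
        # Each chunk: 4-byte big-endian length, 4-byte type, data, 4-byte CRC.
        length, chunk_type = struct.unpack(">I4s", data[pos:pos + 8])
        if chunk_type in PNG_TEXT_CHUNKS:
            return True
        if chunk_type == b"IEND":
            break
        pos += 12 + length
    return False


def _skip_sub_blocks(data: bytes, pos: int) -> int:
    """Skip GIF data sub-blocks and return the index after the terminator."""
    while pos < len(data) and data[pos] != 0:
        pos += 1 + data[pos]
    return pos + 1


def gif_has_text_extension(data: bytes) -> bool:
    """Return True if a GIF byte stream contains a comment or plain-text extension."""
    if len(data) < 13 or data[:6] not in (b"GIF87a", b"GIF89a"):
        return False
    flags = data[10]
    pos = 13  # 6-byte header plus 7-byte logical screen descriptor
    if flags & 0x80:  # global color table present
        pos += 3 * (2 ** ((flags & 0x07) + 1))
    while pos < len(data):
        block = data[pos]
        if block == 0x21:  # extension introducer
            if pos + 1 >= len(data):
                return False
            if data[pos + 1] in GIF_TEXT_EXTENSIONS:
                return True
            pos = _skip_sub_blocks(data, pos + 2)
        elif block == 0x2C:  # image descriptor
            if pos + 10 > len(data):
                return False
            local_flags = data[pos + 9]
            pos += 10
            if local_flags & 0x80:  # local color table present
                pos += 3 * (2 ** ((local_flags & 0x07) + 1))
            pos = _skip_sub_blocks(data, pos + 1)  # skip LZW code size byte first
        else:  # trailer (0x3B) or unrecognized byte: stop scanning
            break
    return False
```

Both functions only inspect container structure and never decompress pixel data, which is why a check of this kind can run before any more expensive content identification.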

In an optional implementation manner, the target picture containing text information, or the target picture being automatically generated by a device, includes:

performing content identification on the target picture, and determining, according to the result of the content identification, that the target picture contains text information or that the target picture is automatically generated by a device.

In an optional implementation manner, the target picture being a natural picture includes:

the target picture includes camera information.

In an alternative implementation, the lossy compression method includes: YUV420; the lossless compression method comprises the following steps: YUV444.

In an optional implementation manner, the decoding the target picture includes: decoding the target picture into RGB data.

In another aspect, an embodiment of the present invention provides a method for transcoding a picture, including:

under the condition that the high-frequency component of the target picture is smaller than a first threshold value, a lossy compression mode is adopted as an input source to carry out transcoding video coding input;

and under the condition that the high-frequency component of the target picture is larger than a second threshold value, a lossless compression mode is adopted as an input source to carry out transcoding video coding input.

In an optional implementation manner, the determining that the high-frequency component of the target picture is smaller than the first threshold value includes:

performing high-frequency component detection on the target picture, and determining that the high-frequency component of the target picture is smaller than a first threshold;

the high frequency component of the target picture being greater than a second threshold comprises:

and performing high-frequency component detection on the target picture, and determining that the high-frequency component of the target picture is greater than a second threshold value.

In an optional implementation manner, the step of determining that the high-frequency component of the target picture is greater than the second threshold includes:

In an optional implementation manner, the target picture containing text information includes:

the target picture includes camera information.

In an optional implementation manner, before performing content identification on the target picture, the method further includes:

decoding the target picture into RGB data.

The embodiment of the invention provides a device for transcoding pictures, which comprises:

the transcoding unit is used for transcoding video coding input by adopting a lossy compression mode as an input source under the condition that the target picture is a natural picture; and under the condition that the target picture is an unnatural picture, a lossless compression mode is adopted as an input source to carry out transcoding video coding input.

In an optional implementation, the apparatus further includes:

the decoding unit is used for decoding the target picture after the target picture is input into the transcoding system;

the transcoding unit is specifically configured to determine that the target picture is a natural picture after decoding the target picture; or determining that the target picture is an unnatural picture after decoding the target picture.

or the target picture is in a format of GIF, and the target picture comprises a picture file extension data block.

In an optional implementation manner, the apparatus further includes:

and the content identification unit is used for carrying out content identification on the target picture, determining that the target picture contains text information according to the result of the content identification, or determining that the target picture is automatically generated by equipment according to the result of the content identification.

In an optional implementation manner, the target picture being a natural picture includes:

the target picture includes camera information.

In an optional implementation manner, the decoding unit is specifically configured to decode the target picture into RGB data.

In a fourth aspect, an embodiment of the present invention provides an apparatus for transcoding a picture, including:

the transcoding unit is used for transcoding video coding input by adopting a lossy compression mode as an input source under the condition that the high-frequency component of the target picture is smaller than a first threshold value; and under the condition that the high-frequency component of the target picture is larger than a second threshold value, a lossless compression mode is adopted as an input source to carry out transcoding video coding input.

In an optional implementation, the apparatus further includes:

a component detection unit, configured to determine that the high-frequency component of the target picture is smaller than a first threshold or that the high-frequency component of the target picture is larger than a second threshold by performing high-frequency component detection on the target picture.

In an optional implementation manner, the high-frequency component of the target picture being greater than the second threshold includes:

In an optional implementation, the apparatus further includes:

the target picture includes camera information.

In an optional implementation manner, the transcoding unit is configured to decode the target picture into RGB data before the content identification unit performs content identification on the target picture.

In an alternative implementation, the lossy compression method includes: YUV420; the lossless compression mode comprises the following steps: YUV444.

An embodiment of the present invention provides an image processing apparatus, including: an input-output device, a processor, and a memory;

the processor is used for performing transcoding video coding input by adopting a lossy compression mode as an input source under the condition that the target picture is a natural picture; and under the condition that the target picture is an unnatural picture, a lossless compression mode is adopted as an input source to carry out transcoding video coding input.

In an optional implementation manner, the processor is further configured to decode the target picture after the target picture is input to the transcoding system;

the case that the target picture is an unnatural picture includes: and after the target picture is decoded, determining that the target picture is an unnatural picture.

In an optional implementation manner, the target picture is an unnatural picture and the method includes:

In an optional implementation manner, the processor is further configured to perform content identification on the target picture, determine that the target picture includes text information according to a result of the content identification, or determine that the target picture is automatically generated by a device according to a result of the content identification.

the target picture includes camera information.

An embodiment of the present invention further provides an image processing apparatus, including: an input-output device, a processor, and a memory;

the processor is used for performing transcoding video coding input by adopting a lossy compression mode as an input source under the condition that the high-frequency component of the target picture is smaller than a first threshold value; and under the condition that the high-frequency component of the target picture is larger than a second threshold value, a lossless compression mode is adopted as an input source to carry out transcoding video coding input.

In an optional implementation manner, the processor is further configured to perform high-frequency component detection on the target picture, and determine that the high-frequency component of the target picture is smaller than a first threshold value or that the high-frequency component of the target picture is larger than a second threshold value after performing the high-frequency component detection on the target picture.

In an optional implementation manner, the processor is further configured to perform content recognition on the target picture, determine that the target picture includes text information according to a result of the content recognition, or determine that the target picture is automatically generated by a device according to a result of the content recognition.

the target picture includes camera information.

In an optional implementation manner, the processor is further configured to decode the target picture into RGB data before performing content recognition on the target picture.

In a seventh aspect, an embodiment of the present invention further provides an image processing apparatus, including a picture decoder, a source converter, and a video encoder, and further including a source selector;

the target picture is decoded by the picture decoder and then output to the source selector;

the source selector selects a lossy compression mode under the condition that the target picture is determined to be a natural picture, and selects a lossless compression mode under the condition that the target picture is determined to be an unnatural picture;

and the source converter converts the decoded target picture according to the compression mode determined by the source selector and outputs the converted target picture to the video encoder.

In an optional implementation manner, the source selector determines that the target picture is an unnatural picture if it is determined that the target picture contains text information or if it is determined that the target picture is automatically generated by a device.

In an optional implementation manner, the determining that the target picture contains text information by the source selector includes: determining that the format of the target picture is PNG, wherein the target picture comprises text data blocks; or determining that the format of the target picture is GIF, wherein the target picture comprises a picture file extension data block.

In an optional implementation, the source selector determining that the target picture contains text information or is automatically generated by a device includes: performing content identification on the target picture, and determining, according to the result of the content identification, that the target picture contains text information or that the target picture is automatically generated by a device.

In an optional implementation manner, the determining, by the source selector, that the target picture is a natural picture includes: determining that the target picture contains camera information.

In an alternative implementation, the picture decoder is configured to decode the target picture into RGB data.

An embodiment of the invention also provides an image processing device, including a picture decoder, a source converter, and a video encoder, and further including a source selector;

the source selector selects a lossy compression mode under the condition that the high-frequency component of the target picture is smaller than a first threshold value, and selects a lossless compression mode under the condition that the high-frequency component of the target picture is larger than a second threshold value;

In an optional implementation manner, the source selector is further configured to perform high frequency component detection on the target picture, determine that the high frequency component of the target picture is smaller than a first threshold, or determine that the high frequency component of the target picture is larger than a second threshold.

In an optional implementation, the source selector is configured to determine that the high frequency component of the target picture is greater than a second threshold, and includes:

determining that the target picture contains text information, or determining that the target picture is automatically generated by a device.

In an optional implementation manner, the determining, by the source selector, that the target picture contains text information includes:

determining that the format of the target picture is PNG, wherein the target picture comprises a text data block; or determining that the format of the target picture is GIF, wherein the target picture comprises a picture file extension data block.

In an optional implementation manner, the source selector is further configured to perform content identification on the target picture; and determining that the target picture contains text information according to the result of the content identification, or determining that the target picture is automatically generated by equipment according to the result of the content identification.

In an optional implementation manner, the source selector being configured to determine that the high-frequency component of the target picture is smaller than a first threshold includes: determining that the target picture contains camera information.

According to the above technical solutions, the embodiments of the invention have the following advantages: target pictures are divided into natural pictures and unnatural pictures; an unnatural picture contains more high-frequency components, so a compression mode with a lower compression ratio is used to retain as much effective information as possible; a natural picture has fewer high-frequency components, so a compression mode with a higher compression ratio is used to achieve higher compression efficiency. The scheme of the embodiments can therefore improve picture compression efficiency while retaining the necessary effective information of the target picture.

Drawings

In order to illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from these drawings without creative effort.

FIG. 1 is a system architecture diagram of a transcoding system in accordance with an embodiment of the present invention;

FIG. 2A is a diagram illustrating a target picture according to an embodiment of the invention;

FIG. 2B is a diagram illustrating a target picture according to an embodiment of the invention;

FIG. 2C is a diagram illustrating a target picture according to an embodiment of the invention;

FIG. 3 is a schematic flow chart of a method according to an embodiment of the present invention;

FIG. 4 is a schematic structural diagram of an apparatus according to an embodiment of the present invention;

FIG. 5 is a schematic structural diagram of an apparatus according to an embodiment of the present invention;

FIG. 6 is a schematic diagram of an image processing apparatus according to an embodiment of the present invention;

FIG. 7 is a schematic structural diagram of an image processing apparatus according to an embodiment of the present invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the present invention clearer, the present invention will be described in further detail with reference to the accompanying drawings, and it is apparent that the described embodiments are only a part of the embodiments of the present invention, not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

FIG. 1 is an architecture diagram of a transcoding system according to an embodiment of the present invention. The system mainly includes the following parts: a picture decoder, a source selector, a source converter, and a video encoder, which may be connected through a bus. The picture decoder decodes the data input into the transcoding system and sends the decoded data to the source selector; the source selector determines from the decoded data whether lossy or lossless compression is to be used; the source converter then converts the decoded data into YUV format data accordingly and sends the YUV format data to the video encoder. The subsequent process follows the video encoding process and is not described here.
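The data flow through the four parts can be sketched as a small pipeline. This is only an illustration of the ordering described above; the class name, field names, and stage signatures are hypothetical, and each stage is supplied by the caller rather than implemented here.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class TranscodePipeline:
    """Hypothetical wiring of the four parts of the transcoding system."""
    decode: Callable[[bytes], Any]        # picture decoder: file bytes -> decoded data
    select_source: Callable[[Any], str]   # source selector: decoded data -> "YUV420" or "YUV444"
    convert: Callable[[Any, str], Any]    # source converter: decoded data -> YUV data
    encode: Callable[[Any], bytes]        # video encoder: YUV data -> bitstream

    def run(self, picture_bytes: bytes) -> bytes:
        decoded = self.decode(picture_bytes)
        source_format = self.select_source(decoded)
        yuv = self.convert(decoded, source_format)
        return self.encode(yuv)
```

Keeping the selector as a separate stage means the decision rule can be swapped (metadata checks, content identification, or a high-frequency measure) without touching the decoder, converter, or encoder.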

In a first aspect, an embodiment of the present invention provides a method for transcoding a picture, which is mainly used for intelligent source selection and, with reference to the system structure shown in FIG. 1, can be implemented mainly in the source selector. The method includes:

101A: under the condition that the target picture is a natural picture, a lossy compression mode is adopted as an input source to carry out transcoding video coding input;

102A: and under the condition that the target picture is an unnatural picture, a lossless compression mode is adopted as an input source to carry out transcoding video coding input.

It is understood that step 101A and step 102A have no logical order of precedence; exactly one of the two steps is executed in any given run, never both, because the choice depends on whether the target picture is a natural picture or an unnatural picture. Natural pictures and unnatural pictures are distinguished by how the picture is generated. A natural picture is a naturally generated picture, for example a photo taken by a camera or a natural scene image. An unnatural picture is typically a picture generated with human participation or generated automatically by a device, for example a screenshot of an instant messaging interface or a game picture. Such screenshots and game pictures may include a part in which people participate and a part generated automatically by the device. Taking a game picture as an example: in a real-time battle game, players can communicate using text, and that text is the part in which people participate; the rest is picture content drawn by the device according to the instructions of a computer program. Besides a screenshot of an instant messaging interface, the picture may be a screenshot of the whole desktop or of another application interface, and the game picture may be replaced by any other computer-drawn picture. In addition, in this embodiment the target picture is the picture input to the transcoding system, and it may be in any picture format, for example JPEG, PNG, GIF, or another format.

As an example, FIG. 2A, FIG. 2B, and FIG. 2C show three possible target pictures of the kinds mentioned above: FIG. 2A is a photograph taken by a camera, and FIG. 2B and FIG. 2C are a screenshot of an instant messaging interface and a game picture, respectively. FIG. 2A is a natural picture, which has fewer high-frequency components; that is, its image transitions are natural and its boundaries are soft. FIG. 2B and FIG. 2C are unnatural pictures, which have many high-frequency components; that is, they contain more abrupt boundaries. For example, the strokes of the text in FIG. 2B contrast sharply with the background and change abruptly at the boundaries, and the picture elements in FIG. 2C also have distinct boundaries that are not as soft as those in a natural picture.

It should be noted that, in this embodiment, the lossy compression mode and the lossless compression mode are compression schemes used for transcoding a picture, and image compression generally reduces the effective information of a picture. Therefore, in the embodiments of the present invention, "lossless" should be understood as relatively lossless: the lossless compression mode retains more effective information of the target picture than the lossy compression mode does. For example, YUV444 is a lossless compression mode relative to YUV420.
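The sense in which YUV444 is "relatively lossless" can be illustrated numerically. The sketch below assumes that YUV420 chroma is produced by averaging each 2x2 block, which is one common downsampling choice and is not specified in this document; the function names are hypothetical.

```python
def subsample_420(plane):
    """Average each 2x2 block of a chroma plane (even width and height assumed)."""
    height, width = len(plane), len(plane[0])
    return [
        [
            (plane[y][x] + plane[y][x + 1] + plane[y + 1][x] + plane[y + 1][x + 1] + 2) // 4
            for x in range(0, width, 2)
        ]
        for y in range(0, height, 2)
    ]


def sample_count(width, height, source_format):
    """Number of stored samples per picture for the two source formats."""
    luma = width * height
    if source_format == "YUV444":
        return 3 * luma  # full-resolution Y, U, and V
    if source_format == "YUV420":
        return luma + 2 * (width // 2) * (height // 2)  # U and V at quarter resolution
    raise ValueError(f"unsupported source format: {source_format}")


# A sharp vertical chroma edge inside one 2x2 block collapses to a single mid value.
edge = [[0, 255],
        [0, 255]]
print(subsample_420(edge))           # [[128]]
print(sample_count(4, 4, "YUV444"))  # 48
print(sample_count(4, 4, "YUV420"))  # 24
```

YUV420 stores half as many samples as YUV444, but a chroma edge that falls inside a 2x2 block is smeared, which is the kind of loss that affects text and computer-drawn boundaries.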

In an optional implementation manner, the embodiment of the present invention further provides an operation performed before determining the type of the target picture, and an influence of the operation on the type confirmation of the target picture, which is specifically as follows: the method further comprises the following steps:

the situation that the target picture is a natural picture includes: decoding the target picture and then determining that the target picture is a natural picture;

the case where the target picture is an unnatural picture includes: and after the target picture is decoded, determining that the target picture is an unnatural picture.

In this embodiment, the target picture is decoded first, in preparation for the subsequent identification of the picture type; the target picture may be decoded into RGB format data.

In an optional implementation manner, the embodiments of the present invention further provide a scheme for determining that a picture is an unnatural picture, as follows: the target picture being an unnatural picture includes:

the target picture contains text information, or the target picture is automatically generated by equipment.

An unnatural picture has certain characteristics, such as containing text information or being automatically generated by a device, and may have other characteristics that distinguish it from a natural picture; these characteristics allow the type of the target picture to be determined quickly.

In an optional implementation manner, an embodiment of the present invention further provides how to determine that a target picture contains text information, which is specifically as follows: the target picture including text information includes:

In an optional implementation manner, an embodiment of the present invention further provides another implementation scheme for determining an unnatural picture and a natural picture, including: the target picture contains text information, or the target picture is automatically generated by equipment and comprises:

Content identification refers to a technique for identifying specific content contained within a target picture, for example: text detection using a sliding window; text detection through connected domain analysis; and implementations that identify specific content based on deep learning methods, such as fully convolutional neural networks, pattern recognition, rate distortion optimization, and the like. How content identification is performed to distinguish the type of the target picture is described in detail in the following embodiments and is not repeated here.
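As a rough illustration of the connected domain analysis mentioned above, the sketch below performs 4-connected component labelling on a binarized picture and returns one bounding box per component. It covers only this first step; how the boxes would then be filtered into text candidates is not described in this document, and the function name is hypothetical.

```python
from collections import deque


def connected_components(binary):
    """Return bounding boxes (top, left, bottom, right) of 4-connected foreground regions.

    `binary` is a list of rows containing 0 (background) or 1 (foreground).
    """
    height = len(binary)
    width = len(binary[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for y in range(height):
        for x in range(width):
            if not binary[y][x] or seen[y][x]:
                continue
            # Breadth-first flood fill from this unvisited foreground pixel.
            seen[y][x] = True
            queue = deque([(y, x)])
            top, left, bottom, right = y, x, y, x
            while queue:
                cy, cx = queue.popleft()
                top, bottom = min(top, cy), max(bottom, cy)
                left, right = min(left, cx), max(right, cx)
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if 0 <= ny < height and 0 <= nx < width and binary[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes
```

A text detector built on this step would typically keep components whose size and aspect ratio resemble character strokes; those criteria are left open here.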

In an optional implementation manner, an embodiment of the present invention further provides a scheme capable of quickly determining that a target picture is a natural picture, as follows: the target picture being a natural picture includes:

the target picture comprises camera information.
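One way to check for camera information is sketched below, assuming that it is stored as an Exif APP1 segment in a JPEG file. That interpretation is an assumption, since this document does not say where the camera information is stored; the function name is hypothetical.

```python
import struct


def jpeg_has_exif(data: bytes) -> bool:
    """Return True if a JPEG byte stream has an APP1 segment starting with 'Exif'."""
    if not data.startswith(b"\xff\xd8"):  # SOI marker
        return False
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            return False  # not positioned on a marker: treat as malformed
        marker = data[pos + 1]
        if marker == 0xFF:  # fill byte before a marker
            pos += 1
            continue
        if marker in (0xDA, 0xD9):  # SOS or EOI: metadata segments are over
            return False
        # The 2-byte big-endian segment length includes the length field itself.
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        if marker == 0xE1 and data[pos + 4:pos + 10] == b"Exif\x00\x00":
            return True
        pos += 2 + length
    return False
```

The scan stops at the start-of-scan marker, so only the header segments are read and the compressed image data is never touched.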

In the foregoing embodiments, the implementation schemes that determine the type of the target picture more quickly may be used preferentially; if those schemes cannot determine the picture type, content identification is used to determine the type of the target picture.
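The preference for fast checks with a fallback to content identification can be sketched as a short decision cascade. The function and parameter names are hypothetical, and checking the text block before the camera information is an assumption, since no priority between the two fast checks is stated here.

```python
from typing import Callable


def classify_picture(
    has_text_block: bool,
    has_camera_info: bool,
    identify_content: Callable[[], str],
) -> str:
    """Return "natural" or "unnatural", using cheap metadata results first."""
    if has_text_block:       # fast check, order is an assumption
        return "unnatural"
    if has_camera_info:      # fast check
        return "natural"
    # Neither fast check was conclusive: fall back to the slower content identification.
    return identify_content()
```

Passing the content identifier as a callable means it runs only when both metadata checks are inconclusive.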

In an optional implementation manner, because video encoding is performed subsequently, this embodiment selects a YUV format as the target format for transcoding. The specific formats corresponding to the determined compression modes are as follows: the lossy compression mode includes YUV420; the lossless compression mode includes YUV444.
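For illustration, decoded RGB data can be converted to full-resolution Y, U, and V planes as sketched below. The conversion uses the full-range BT.601 coefficients (as in JFIF), which is an assumption because this document does not name a conversion matrix; the function names are hypothetical.

```python
def rgb_to_yuv(r, g, b):
    """Convert one 8-bit RGB pixel to 8-bit YUV (full-range BT.601, assumed)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    v = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b

    def clamp(value):
        return max(0, min(255, round(value)))

    return clamp(y), clamp(u), clamp(v)


def rgb_to_yuv_planes(rgb):
    """Split rows of (r, g, b) tuples into full-resolution Y, U, and V planes."""
    converted = [[rgb_to_yuv(*pixel) for pixel in row] for row in rgb]
    y_plane = [[p[0] for p in row] for row in converted]
    u_plane = [[p[1] for p in row] for row in converted]
    v_plane = [[p[2] for p in row] for row in converted]
    return y_plane, u_plane, v_plane
```

The three planes as returned correspond to YUV444; for YUV420 the U and V planes would additionally be downsampled by two in each direction before being passed to the video encoder.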

In an optional implementation manner, an embodiment of the present invention further provides a specific implementation scheme of decoding before determining the type of the target picture, as follows: the decoding the target picture includes: decoding the target picture into RGB data.

In a second aspect, an embodiment of the present invention provides a method for transcoding a picture in which the magnitude of the high-frequency component is used to distinguish picture types, unlike the previous embodiment, which distinguishes picture types as natural or unnatural. The method includes:

101B: under the condition that the high-frequency component of the target picture is smaller than a first threshold value, a lossy compression mode is adopted as an input source to carry out transcoding video coding input;

102B: and under the condition that the high-frequency component of the target picture is larger than a second threshold value, adopting a lossless compression mode as an input source to carry out transcoding video coding input.

Based on the foregoing, the more high frequency components, the more information that would be lost if lossy compression were used, e.g., the text in FIG. 2B would become unclear; in the case of a small high frequency component, a high compression ratio can be used.

In this embodiment, the first threshold and the second threshold are the critical points for determining the compression mode. To avoid conflict, the first threshold should be less than or equal to the second threshold. If the first threshold equals the second threshold, the high-frequency component of the target picture may be exactly equal to that threshold; in this case, the compression mode may be selected according to the technical requirements of picture compression: the lossy compression mode is selected to obtain a higher compression ratio, or the lossless compression mode is selected to retain more useful information of the target picture. If the first threshold is smaller than the second threshold, there is an interval between them, namely values greater than or equal to the first threshold and less than or equal to the second threshold; if the high-frequency component of the target picture falls within this interval, the handling of the case "the high-frequency component of the target picture is equal to the first threshold or the second threshold" applies, and details are not repeated here.
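The threshold rule, including the interval between the two thresholds, can be sketched as follows. The function name and the `prefer_lossless_in_gap` flag are hypothetical; the flag stands in for the "technical requirements of picture compression" that decide the in-between case.

```python
def select_compression_mode(
    high_frequency: float,
    first_threshold: float,
    second_threshold: float,
    prefer_lossless_in_gap: bool = True,
) -> str:
    """Choose "YUV420" (lossy) or "YUV444" (relatively lossless) from a high-frequency measure."""
    if first_threshold > second_threshold:
        raise ValueError("first_threshold must not exceed second_threshold")
    if high_frequency < first_threshold:
        return "YUV420"  # step 101B
    if high_frequency > second_threshold:
        return "YUV444"  # step 102B
    # Value lies in [first_threshold, second_threshold]: decided by requirements.
    return "YUV444" if prefer_lossless_in_gap else "YUV420"
```

For example, with thresholds 10 and 20, a measure of 5 selects YUV420, 25 selects YUV444, and 15 follows the flag.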

In an optional implementation manner, the high-frequency component of the target picture may be directly obtained through high-frequency component detection, which is specifically as follows: the step of determining that the high-frequency component of the target picture is smaller than the first threshold value includes:

detecting high-frequency components of the target picture, and determining that the high-frequency components of the target picture are smaller than a first threshold value;

the step of determining that the high-frequency component of the target picture is greater than the second threshold value includes:

and determining that the high-frequency component of the target picture is greater than a second threshold value through high-frequency component detection on the target picture.
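The text does not define how the high-frequency component is measured. As one possible proxy, assumed here purely for illustration, the mean absolute difference between neighbouring luminance samples can serve as a normalized measure that is then compared against the two thresholds. The function name `high_frequency_measure` is hypothetical.

```python
def high_frequency_measure(luma):
    """Mean absolute difference between neighbouring samples, scaled to [0, 1].

    `luma` is a list of rows of 8-bit luminance values (0..255).
    This gradient-energy proxy is an assumption of the sketch, not a
    measure defined in the text.
    """
    height = len(luma)
    width = len(luma[0]) if height else 0
    total = 0
    count = 0
    for y in range(height):
        for x in range(width):
            if x + 1 < width:  # horizontal neighbour
                total += abs(luma[y][x + 1] - luma[y][x])
                count += 1
            if y + 1 < height:  # vertical neighbour
                total += abs(luma[y + 1][x] - luma[y][x])
                count += 1
    return total / (255 * count) if count else 0.0
```

A flat picture scores 0.0 and a one-pixel black-and-white checkerboard scores 1.0; sharp text edges push the value up, while smooth photographic gradients keep it low.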

In an alternative implementation manner, whether the high-frequency component of the target picture is greater than the second threshold may also be determined without high-frequency component detection: before any high-frequency component detection is performed, information about the target picture is checked to distinguish its type, as follows: the step of determining that the high-frequency component of the target picture is greater than the second threshold value includes:

Non-natural pictures have relatively more high-frequency components, as shown in FIG. 2B and FIG. 2C. In this embodiment, non-natural pictures also have recognizable characteristics, such as containing text information or being automatically generated by a device, and may have other characteristics that distinguish them from natural pictures. These characteristics allow the type of the target picture to be determined quickly and, accordingly, whether its high-frequency component is higher than the second threshold.

In an optional implementation manner, an embodiment of the present invention further provides how to determine that a target picture contains text information, which is specifically as follows: the target picture containing text information includes:

In an optional implementation manner, an embodiment of the present invention further provides another implementation scheme for determining an unnatural picture and a natural picture, where the implementation scheme is as follows: the target picture contains text information, or the target picture is automatically generated by equipment and comprises:

and identifying the content of the target picture, and determining that the target picture comprises text information according to the result of the content identification, or determining that the target picture is automatically generated by equipment according to the result of the content identification.

Content identification refers to techniques for identifying specific content contained in a target picture, for example: text detection using a sliding window; text detection through connected-domain analysis; and other implementations that identify specific content, such as deep-learning methods (fully convolutional neural networks with pattern recognition) and rate-distortion optimization. How content identification is used to distinguish the type of the target picture is described in detail in the following embodiments and is not repeated here.

In an optional implementation manner, the embodiment of the present invention further provides a scheme for quickly determining that a target picture is a natural picture. Since a natural picture generally has few high-frequency components, identifying the target picture as a natural picture is sufficient to determine that its high-frequency component is smaller than the first threshold, specifically as follows: the step of determining that the high-frequency component of the target picture is smaller than the first threshold value includes:

the target picture comprises camera information.

In the foregoing embodiments, the schemes that determine the target picture type more quickly may be used preferentially; when these schemes cannot determine the picture type, high-frequency component detection is used to determine the type of the target picture.
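The "quick checks first, detection as fallback" ordering can be sketched as a simple cascade. The `classify` function and the check signature below are hypothetical; each check is assumed to return "natural", "unnatural", or None when it cannot decide.

```python
from typing import Callable, Iterable, Optional

# Each check returns "natural", "unnatural", or None when it cannot decide.
Check = Callable[[object], Optional[str]]


def classify(picture, fast_checks: Iterable[Check], fallback: Check) -> Optional[str]:
    """Run the quick checks in order; use the fallback only if none decides."""
    for check in fast_checks:
        verdict = check(picture)
        if verdict is not None:
            return verdict
    return fallback(picture)
```

With this arrangement the comparatively expensive high-frequency component detection runs only for pictures whose metadata gives no answer.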

In an optional implementation manner, an embodiment of the present invention further provides a specific implementation scheme of decoding before determining a type of a target picture, where the specific implementation scheme is as follows: before the content identification is performed on the target picture, the method further includes:

and decoding the target picture into RGB data.

Based on the above description, the embodiment of the present invention also provides a specific example, in which the target picture is decoded into RGB data and the lossy and lossless compression modes are YUV420 and YUV444, respectively. This embodiment can greatly reduce picture traffic bandwidth and storage cost, and it has a wide range of application scenarios, including social, news, and tool applications, for example: Mobile QQ, QQ Space (Qzone), QQ Browser, Tencent Sports, Tencent News, Tiantian Kuaibao, Tencent Mobile Manager, and the like. In such applications a device may hold many kinds of pictures waiting to be transcoded, with data transmission or storage performed after transcoding, so there is a demand for saving traffic or storage space. The pictures generated in the three application scenarios shown in FIG. 2A, FIG. 2B, and FIG. 2C are taken as examples; the many other application scenarios are not illustrated one by one.

In this embodiment, the source selection module uses the decoding information of the original picture to decide whether the video encoder should encode YUV420 or YUV444. The basic method flow, as shown in FIG. 3, includes:

301: Input the picture into the transcoding system, call the picture decoder to obtain the header information of the picture, and decode the original picture into RGB data.

302: Determine the picture type.

In this embodiment, determining the picture type is divided into two parts: part A may be executed first, and part B is executed if part A cannot determine the picture type.

Part A:

302A: Distinguish the format of the picture according to its header information.

In this embodiment, the Joint Photographic Experts Group (JPG), PNG, and GIF formats are taken as examples. It is understood that picture formats are not limited to these three; others exist, for example Bitmap (BMP). Therefore, the format of pictures input into the transcoding system is not limited to JPG, PNG, and GIF, and the above examples should not be construed as limiting the embodiments of the present invention.

302A1: If the picture is a JPG picture, the decoding information in the JPG decoder is read to judge whether camera Exchangeable Image File Format (Exif) information is present; if so, the picture is judged to be a natural photo, that is, a natural picture, and YUV420 is used as the input source for the transcoding video encoder input.

302A2: If the picture is a PNG picture, the PNG picture information is read to check whether it contains a text data block (tEXt or zTXt); if so, the picture can be determined to be an unnatural picture, and YUV444 is directly used as the input source for the transcoding video encoder input.

302A3: If the picture is a GIF picture, the GIF picture information is read to check whether it contains a picture file extension data block; if so, YUV444 is directly used as the input source for the transcoding video encoder input.
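The header checks of part A can be sketched for the JPG and PNG cases as follows. This is an illustrative sketch only: the function names are hypothetical, the parsers work directly on file bytes rather than on a decoder's output, CRCs are not verified, and the GIF case is omitted because the text does not say which GIF extension block is meant.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def jpeg_has_exif(data: bytes) -> bool:
    """Walk JPEG segments up to start-of-scan and look for an Exif APP1 segment."""
    if data[:2] != b"\xff\xd8":  # SOI marker
        return False
    pos = 2
    while pos + 4 <= len(data) and data[pos] == 0xFF:
        marker = data[pos + 1]
        if marker in (0xDA, 0xD9):  # start of scan / end of image
            break
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        if marker == 0xE1 and data[pos + 4:pos + 10] == b"Exif\x00\x00":
            return True
        pos += 2 + length  # marker bytes plus segment length
    return False


def png_has_text_chunk(data: bytes) -> bool:
    """Scan PNG chunks for a tEXt or zTXt chunk."""
    if data[:8] != PNG_SIGNATURE:
        return False
    pos = 8
    while pos + 8 <= len(data):
        length, chunk_type = struct.unpack(">I4s", data[pos:pos + 8])
        if chunk_type in (b"tEXt", b"zTXt"):
            return True
        if chunk_type == b"IEND":
            break
        pos += 12 + length  # length + type + data + CRC
    return False


def source_from_header(data: bytes):
    """Return "YUV420", "YUV444", or None when part A cannot decide."""
    if jpeg_has_exif(data):
        return "YUV420"
    if png_has_text_chunk(data):
        return "YUV444"
    return None
```

A `None` result corresponds to the case where part A cannot determine the picture type and part B takes over.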

Part B:

302B: Determine the picture type through content identification: if the picture contains text information or is a computer-generated picture, YUV444 is used as the input source for the transcoding video encoder input; otherwise, YUV420 is used as the input source for the transcoding video encoder input.

When part A above cannot directly determine the input source format of the transcoding video encoder, part B may analyze the decoded RGB data with a picture content recognition module to determine whether the picture contains text information or is a computer-generated picture. Specific picture content identification methods include, but are not limited to, the following:

1. Sliding-window text detection: the whole picture is scanned with a sliding window. For each window, a probability response for a text region is obtained from a binary classifier, such as AdaBoost (an iterative algorithm) or a Support Vector Machine (SVM), trained on texture features such as Histogram of Oriented Gradients (HOG), Scale-Invariant Feature Transform (SIFT), Gabor (windowed Fourier transform), and Local Binary Pattern (LBP); the final text box is then located using connected-domain analysis combined with morphological and other operations. If a text region is detected, the transcoding system automatically switches the source to YUV444 as the input for picture transcoding; if no text is detected, YUV420 is used as the input for picture transcoding.

2. Connected-domain text detection: this method is based on low-level rules. Connected small regions are first obtained by segmentation, for example with Maximally Stable Extremal Regions (MSER) or the Stroke Width Transform (SWT), and text regions are then separated from non-text regions by filtering or by a classifier, using information such as color, brightness, edges, and shape. If a text region is detected, the transcoding system automatically switches the source to YUV444 as the input for picture transcoding; if no text is detected, YUV420 is used as the input for picture transcoding.

3. A deep-learning method based on a Fully Convolutional Network (FCN): the picture content is analyzed by pattern recognition, and recognition is achieved through machine learning with training on large-scale Internet data. The main content to be recognized comprises text regions and computer-generated pictures; if either is detected, the transcoding system automatically switches the source to YUV444 as the input for picture transcoding, and if neither is detected, the transcoding system uses YUV420 as the input for picture transcoding.

4. A Rate-Distortion Optimization (RDO) method: this method relies mainly on the observation that, for text and computer-generated images, YUV420 coding causes a significant loss in BD-rate (a parameter representing the relationship between bit rate and performance) measured by Structural Similarity (SSIM) or by Peak Signal-to-Noise Ratio (PSNR), compared with YUV444. The transcoding system can therefore compare BD-rate SSIM and BD-rate PSNR over the full picture or a partial region of the picture to be transcoded, and then automatically determine which source input to use as the video coding input of the transcoding system.
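The idea behind method 4 can be illustrated with a much simpler stand-in: instead of computing BD-rate with a real encoder, the sketch below measures the PSNR of a chroma plane after a 4:2:0 down-and-up round trip and chooses YUV444 when the loss is large. The function names and the 40 dB threshold are assumptions made for illustration and do not appear in the text.

```python
import math


def subsample_420(plane):
    """Average each 2x2 block of a chroma plane (even dimensions assumed)."""
    return [
        [
            (plane[y][x] + plane[y][x + 1] + plane[y + 1][x] + plane[y + 1][x + 1]) / 4
            for x in range(0, len(plane[0]), 2)
        ]
        for y in range(0, len(plane), 2)
    ]


def upsample_nearest(small):
    """Repeat every sample 2x2 to restore the original plane size."""
    out = []
    for row in small:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def psnr(reference, test, peak=255.0):
    """Peak signal-to-noise ratio in dB; infinite when the planes are identical."""
    n = len(reference) * len(reference[0])
    mse = sum(
        (a - b) ** 2
        for ref_row, test_row in zip(reference, test)
        for a, b in zip(ref_row, test_row)
    ) / n
    return math.inf if mse == 0 else 10 * math.log10(peak * peak / mse)


def pick_source_by_chroma_loss(chroma, min_psnr_db=40.0):
    """Choose YUV444 when a 4:2:0 round trip degrades chroma below min_psnr_db."""
    round_trip = upsample_nearest(subsample_420(chroma))
    return "YUV420" if psnr(chroma, round_trip) >= min_psnr_db else "YUV444"
```

A chroma plane with sharp one-pixel color transitions, typical of rendered text, loses a great deal in the round trip and selects YUV444, whereas a smooth plane survives unchanged and selects YUV420.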

303: Based on the decision on the picture type: if the picture is an unnatural picture, that is, it contains text information or is a computer-generated picture, YUV444 is used as the input source for the transcoding video encoder input; if the picture is a natural picture, that is, it neither contains text information nor is a computer-generated picture, YUV420 is used as the input source for the transcoding video encoder input. YUV420 or YUV444 data is generated from the RGB data according to the required source format conversion.
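The source format conversion in step 303 can be sketched as below. The text does not specify the color matrix or the chroma downsampling filter; the sketch assumes the full-range BT.601 coefficients used by JFIF and a plain 2x2 average, and all function names are hypothetical.

```python
def rgb_to_yuv444(rgb):
    """Convert rows of (R, G, B) tuples into three full-resolution planes.

    Full-range BT.601 (JFIF) coefficients are an assumption of this sketch.
    """
    y_plane, u_plane, v_plane = [], [], []
    for row in rgb:
        y_row, u_row, v_row = [], [], []
        for r, g, b in row:
            y_row.append(0.299 * r + 0.587 * g + 0.114 * b)
            u_row.append(128 - 0.168736 * r - 0.331264 * g + 0.5 * b)
            v_row.append(128 + 0.5 * r - 0.418688 * g - 0.081312 * b)
        y_plane.append(y_row)
        u_plane.append(u_row)
        v_plane.append(v_row)
    return y_plane, u_plane, v_plane


def halve(plane):
    """Average each 2x2 block (even dimensions assumed)."""
    return [
        [
            (plane[y][x] + plane[y][x + 1] + plane[y + 1][x] + plane[y + 1][x + 1]) / 4
            for x in range(0, len(plane[0]), 2)
        ]
        for y in range(0, len(plane), 2)
    ]


def convert_source(rgb, source_format):
    """Produce the Y, U, V planes for the selected source format."""
    y_plane, u_plane, v_plane = rgb_to_yuv444(rgb)
    if source_format == "YUV444":
        return y_plane, u_plane, v_plane
    if source_format == "YUV420":
        return y_plane, halve(u_plane), halve(v_plane)
    raise ValueError("unsupported source format: " + source_format)
```

In both cases the luminance plane keeps full resolution; only the two chroma planes are reduced for YUV420, which is where fine colored detail such as text edges is lost.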

304: Invoke the video encoder to encode the YUV420 or YUV444 data.

The video encoder may use H.265, AVS (Audio Video coding Standard, a source coding standard), H.264, AVS2, or the like.

305: Package the bitstream output by the video encoder together with the related header information, transcoding the picture into a new picture format.

The embodiment of the invention provides an intelligent picture transcoding method implemented using video compression. Because the compression efficiency of video coding is superior to that of image compression standards such as JPEG, PNG, and GIF, video compression standards (such as H.265 and AVS2) can greatly reduce picture bandwidth. Converting traditional picture formats into a new picture format through the transcoding system can therefore effectively reduce the operating bandwidth cost of an application.

Meanwhile, the embodiment of the invention addresses the problem that a traditional video compression algorithm, being a lossy compression mode, can degrade the subjective experience of some pictures (such as pictures containing text and computer-generated pictures): a YUV444 video compression method is introduced, in which RGB is converted into YUV444 for video compression, reducing the subjective loss caused by converting RGB into YUV420. In addition, by introducing an intelligent source format selection module, the transcoding system can automatically identify which pictures should use YUV420, saving as much bandwidth cost as possible, and which pictures should be compressed using YUV444, so that subjective experience is not degraded. With this intelligent transcoding system, an application can save bandwidth to the greatest extent without the user perceiving a loss of subjective experience.

The method of the embodiment of the invention adopts a more efficient video coding standard together with lossy compression coding, so its compression efficiency is much higher than that of traditional picture formats such as JPEG, PNG, and GIF, which is significant for saving the bandwidth traffic and storage cost of pictures. At present, the method and the system have been tested in a number of services, such as Mobile QQ, QQ Space (Qzone), QQ Browser, Tencent Sports, Tencent News, Tiantian Kuaibao, and Tencent Mobile Manager. According to business statistics, the method saves more than 40% of traffic compared with JPEG compression, more than 60% for PNG pictures, and more than 80% for GIF pictures.

In a third aspect, an embodiment of the present invention provides a device for transcoding a picture, as shown in FIG. 4, including:

a transcoding unit 401, configured to perform transcoding video coding input by using a lossy compression method as an input source when the target picture is a natural picture; and under the condition that the target picture is an unnatural picture, a lossless compression mode is adopted as an input source to carry out transcoding video coding input.

In this embodiment, the lossy compression mode and the lossless compression mode are the compression algorithms used for transcoding a picture. In general, image compression reduces the effective information of a picture; therefore, in the embodiment of the present invention, "lossless" should be understood as relatively lossless, that is, the lossless compression mode retains more of the effective information of the target picture than the lossy compression mode. For example, YUV444 is a lossless compression mode relative to YUV420.
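The sense in which YUV444 is "relatively lossless" can be made concrete by counting samples: 4:4:4 keeps both chroma planes at full resolution, while 4:2:0 halves them horizontally and vertically. The helper below is a hypothetical illustration of that arithmetic.

```python
def samples_per_frame(width, height, chroma_format):
    """Number of Y, U and V samples in one frame (even dimensions assumed)."""
    luma = width * height
    if chroma_format == "YUV444":
        chroma = 2 * luma  # U and V at full resolution
    elif chroma_format == "YUV420":
        chroma = 2 * (width // 2) * (height // 2)  # U and V halved both ways
    else:
        raise ValueError("unsupported chroma format: " + chroma_format)
    return luma + chroma


print(samples_per_frame(4, 4, "YUV444"))  # 48
print(samples_per_frame(4, 4, "YUV420"))  # 24
```

YUV420 therefore hands the encoder half as many samples as YUV444 before any video coding takes place, with the reduction falling entirely on color information.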

In an optional implementation manner, the embodiment of the present invention further provides an operation performed before determining the type of the target picture, and the effect of that operation on determining the type of the target picture, which is specifically as follows: the device further includes:

a decoding unit 402, configured to decode the target picture after the target picture is input to a transcoding system;

the transcoding unit 401 is specifically configured to determine that the target picture is a natural picture after decoding the target picture; or determining that the target picture is an unnatural picture after decoding the target picture.

In an optional implementation manner, the embodiment of the present invention further provides an implementation scheme for specifically determining an unnatural picture: the target picture being an unnatural picture includes:

or the target picture is in the format of GIF and comprises the picture file extension data block.

A non-natural picture has certain characteristics, such as containing text information or being automatically generated by a device, and may have other characteristics that distinguish it from a natural picture; these characteristics allow the type of the target picture to be determined quickly.

In an optional implementation manner, an embodiment of the present invention further provides a scheme for determining that a target picture contains text information or is automatically generated by a device, which is specifically as follows: the device further includes:

a content recognition unit 403, configured to perform content recognition on the target picture, determine that the target picture includes text information according to a result of the content recognition, or determine that the target picture is automatically generated by a device according to a result of the content recognition.

Content identification refers to techniques for identifying specific content contained in a target picture, for example: text detection using a sliding window; text detection through connected-domain analysis; and other implementations that identify specific content, such as deep-learning methods (fully convolutional neural networks with pattern recognition) and rate-distortion optimization. How content identification is used to distinguish the type of the target picture is described in detail in the foregoing embodiments and is not repeated here.

the target picture comprises camera information.

In an optional implementation manner, because video coding is to be performed subsequently, this embodiment selects a YUV format as the target format during transcoding, and the specific formats corresponding to the determined compression modes are as follows: the lossy compression mode includes YUV420; the lossless compression mode includes YUV444.

In an optional implementation manner, an embodiment of the present invention further provides a specific implementation scheme for decoding before determining the type of the target picture, where the specific implementation scheme is as follows: the decoding unit 402 is specifically configured to decode the target picture into RGB data.

In a fourth aspect, an embodiment of the present invention provides an apparatus for transcoding a picture, as shown in fig. 5, including:

the transcoding unit 501 is configured to, when the high-frequency component of the target picture is smaller than a first threshold, perform transcoding video coding input by using a lossy compression mode as an input source; and under the condition that the high-frequency component of the target picture is larger than a second threshold value, adopting a lossless compression mode as an input source to carry out transcoding video coding input.

In an optional implementation manner, the high-frequency component of the target picture may be directly obtained through high-frequency component detection, which is specifically as follows: the device further includes:

a component detecting unit 502, configured to perform high frequency component detection on the target picture, and determine that the high frequency component of the target picture is smaller than a first threshold, or determine that the high frequency component of the target picture is larger than a second threshold.

In an alternative implementation manner, whether the high-frequency component of the target picture is greater than the second threshold may also be determined without high-frequency component detection: before any high-frequency component detection is performed, information about the target picture is checked to distinguish its type, as follows: the step of determining that the high-frequency component of the target picture is greater than the second threshold includes:

Non-natural pictures have relatively more high-frequency components, as shown in FIG. 2B and FIG. 2C. In this embodiment, non-natural pictures also have recognizable characteristics, such as containing text information or being automatically generated by a device, and may have other characteristics that distinguish them from natural pictures. These characteristics allow the type of the target picture to be determined quickly and, accordingly, whether its high-frequency component is higher than the second threshold.

In an optional implementation manner, an embodiment of the present invention further provides another implementation scheme for determining an unnatural picture and a natural picture, as follows: the device further includes:

a content recognition unit 503, configured to perform content recognition on the target picture, determine that the target picture includes text information according to a result of the content recognition, or determine that the target picture is automatically generated by a device according to a result of the content recognition.

the target picture comprises camera information.

In an optional implementation manner, an embodiment of the present invention further provides a specific implementation scheme for decoding before determining the type of the target picture, where the specific implementation scheme is as follows: the transcoding unit 501 is further configured to decode the target picture into RGB data before the content recognition unit 503 performs content recognition on the target picture.

In a fifth aspect, an embodiment of the present invention provides an image processing apparatus, as shown in FIG. 6, including: an input-output device 601, a processor 602, and a memory 603; wherein the input-output device 601, the processor 602, and the memory 603 may be connected by a bus;

the memory 603 includes, but is not limited to, a Random Access Memory (RAM), a Read-Only Memory (ROM), an Erasable Programmable Read-Only Memory (EPROM), or a Compact Disc Read-Only Memory (CD-ROM), and the memory 603 is used to store related instructions and data. The input-output device 601 is used to receive and transmit data.

The processor 602 may be one or more Central Processing Units (CPUs), and in the case that the processor 602 is one CPU, the CPU may be a single-core CPU or a multi-core CPU.

The processor 602 in the image processing apparatus is configured to read the program code stored in the memory 603, and perform the following operations: under the condition that the target picture is a natural picture, a lossy compression mode is adopted as an input source to carry out transcoding video coding input; and under the condition that the target picture is an unnatural picture, a lossless compression mode is adopted as an input source to carry out transcoding video coding input.

According to the technical scheme, the embodiment of the invention has the following advantages: dividing the types of the target pictures into natural pictures and non-natural pictures; the non-natural picture contains more high-frequency components, and effective information can be reserved as much as possible by using a compression mode with a lower ratio; natural pictures will have fewer high frequency components, and higher compression efficiency can be achieved using higher compression ratios; therefore, the scheme of the embodiment of the invention can improve the picture compression efficiency on the premise of retaining the necessary effective information of the target picture.

In an optional implementation manner, the embodiment of the present invention further provides an operation performed before determining the type of the target picture, and an influence of the operation on the type confirmation of the target picture, which is specifically as follows: the processor 602 is further configured to decode the target picture after the target picture is input to a transcoding system;

the case where the target picture is an unnatural picture includes: and after decoding the target picture, determining that the target picture is an unnatural picture.

In this embodiment, the target picture is decoded first, so as to prepare for the subsequent identification of the picture type; where the decoding may be decoded into RGB format data.

In an optional implementation manner, the embodiment of the present invention further provides an implementation scheme for specifically determining an unnatural picture: the target picture being an unnatural picture includes:

In an optional implementation manner, an embodiment of the present invention further provides a scheme for determining that a target picture contains text information or is automatically generated by a device, which is specifically as follows: the processor 602 is further configured to perform content recognition on the target picture, and to determine, according to the result of the content recognition, that the target picture includes text information or that the target picture is automatically generated by a device.

the target picture comprises camera information.

In an optional implementation manner, an embodiment of the present invention further provides a specific implementation scheme for decoding before determining the type of the target picture, where the specific implementation scheme is as follows: the decoding the target picture includes: and decoding the target picture into RGB data.

An embodiment of the present invention further provides an image processing apparatus, as shown in fig. 6, including: an input-output device 601, a processor 602, and a memory 603; wherein the input output device 601, the processor 602 and the memory 603 may be connected by a bus.

The processor 602 is configured to, when the high-frequency component of the target picture is smaller than a first threshold, perform transcoding video encoding input by using a lossy compression method as an input source; and under the condition that the high-frequency component of the target picture is larger than a second threshold value, adopting a lossless compression mode as an input source to carry out transcoding video coding input.

In this embodiment, the first threshold and the second threshold are the critical points for choosing the compression mode. To avoid conflicting decisions, the first threshold should be less than or equal to the second threshold. If the two thresholds are equal, the only undecided case is a high-frequency component exactly equal to that common value; in this case the compression mode may be selected according to the technical requirements of picture compression: the lossy compression mode may be used to obtain a higher compression ratio, or the lossless compression mode may be used to retain more of the useful information of the target picture. If the first threshold is smaller than the second threshold, there is an interval between them, that is, values greater than or equal to the first threshold and less than or equal to the second threshold; if the high-frequency component of the target picture falls within this interval, reference may be made to the foregoing processing for the case where "the high-frequency component of the target picture is equal to the first threshold or the second threshold", which is not repeated here.

In an optional implementation manner, the high-frequency component of the target picture may be directly obtained through high-frequency component detection, which is specifically as follows: the processor 602 is further configured to perform high frequency component detection on the target picture, and determine that the high frequency component of the target picture is smaller than a first threshold value or that the high frequency component of the target picture is larger than a second threshold value by performing high frequency component detection on the target picture.

In an optional implementation manner, an embodiment of the present invention further provides another implementation scheme for determining an unnatural picture and a natural picture, including: the processor 602 is further configured to perform content recognition on the target picture, determine that the target picture includes text information according to a result of the content recognition, or determine that the target picture is automatically generated by a device according to a result of the content recognition.

In an optional implementation manner, the embodiment of the present invention further provides a scheme for quickly determining that a target picture is a natural picture. Since a natural picture generally has few high-frequency components, identifying the target picture as a natural picture is sufficient to determine that its high-frequency component is smaller than the first threshold, specifically as follows: the step of determining that the high-frequency component of the target picture is smaller than the first threshold value includes:

the target picture comprises camera information.

In an optional implementation manner, an embodiment of the present invention further provides a specific implementation scheme for decoding before determining the type of the target picture, where the specific implementation scheme is as follows: the processor 602 is further configured to decode the target picture into RGB data before performing content recognition on the target picture.

For the image processing apparatuses provided in the following seventh and eighth aspects, reference may be made to the method flow in the foregoing embodiments; details are not repeated.

In a seventh aspect, an embodiment of the present invention further provides an image processing apparatus, as shown in FIG. 7, including a picture decoder 701, a source converter 702, and a video encoder 703, and further including a source selector 704; the target picture is decoded by the picture decoder 701 and then output to the source selector 704;

the source selector 704 selects a lossy compression method when determining that the target picture is a natural picture, and selects a lossless compression method when determining that the target picture is an unnatural picture;

the source converter 702 converts the decoded target picture according to the compression method determined by the source selector 704 and outputs the converted target picture to the video encoder 703.
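The data flow among the four components can be sketched as a small wiring class. The `Transcoder` class and its callable fields are hypothetical stand-ins for the picture decoder 701, source selector 704, source converter 702, and video encoder 703; real components would wrap actual codec libraries.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class Transcoder:
    decode: Callable[[bytes], Any]        # stands in for picture decoder 701
    select: Callable[[Any], str]          # stands in for source selector 704
    convert: Callable[[Any, str], Any]    # stands in for source converter 702
    encode: Callable[[Any, str], bytes]   # stands in for video encoder 703

    def transcode(self, picture: bytes) -> bytes:
        decoded = self.decode(picture)
        source_format = self.select(decoded)              # "YUV420" or "YUV444"
        converted = self.convert(decoded, source_format)  # RGB to chosen format
        return self.encode(converted, source_format)
```

Keeping the selector as a separate stage means the decision logic (header checks, content identification, or high-frequency detection) can be swapped without touching the converter or the encoder.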

In an alternative implementation manner, the source selector 704 determines that the target picture is an unnatural picture if it is determined that the target picture contains text information or if it is determined that the target picture is automatically generated by a device.

In an optional implementation manner, the determining that the target picture contains text information by the source selector 704 includes: determining the format of the target picture to be PNG, wherein the target picture comprises text data blocks; or determining the format of the target picture as GIF, wherein the target picture comprises a picture file extension data block.

In an optional implementation manner, the source selector 704 determining that the target picture is automatically generated by a device includes: performing content recognition on the target picture, and determining from the recognition result that the target picture contains text information, or determining from the recognition result that the target picture is automatically generated by a device.

In an optional implementation manner, the source selector 704 determining that the target picture is a natural picture includes: determining that the target picture contains camera information.
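One way to look for camera information is sketched below. This is an illustration under stated assumptions, not the method of the embodiment: it treats the presence of an Exif APP1 segment in a JPEG stream as the indication of camera information, and the function name `jpeg_has_exif` is hypothetical. It does not inspect individual Exif tags such as the camera make or model.

```python
import struct


def jpeg_has_exif(data: bytes) -> bool:
    """True if a JPEG byte string has an APP1 segment that starts with the Exif header."""
    if data[:2] != b"\xff\xd8":          # SOI marker
        raise ValueError("not a JPEG stream")
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:            # lost marker sync; give up
            break
        marker = data[pos + 1]
        if marker in (0xDA, 0xD9):       # SOS or EOI: no more metadata segments
            break
        # Segment length is big-endian and counts its own two bytes.
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        if marker == 0xE1 and data[pos + 4:pos + 10] == b"Exif\x00\x00":
            return True
        pos += 2 + length
    return False
```

Under this sketch, a true result would lead the source selector to treat the target picture as a natural picture and select the lossy compression method.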

In an alternative implementation manner, the picture decoder 701 is configured to decode the target picture into RGB data.

In an eighth aspect, an embodiment of the present invention further provides an image processing apparatus which, as shown in Fig. 7, includes a picture decoder 701, a source converter 702, and a video encoder 703, and further includes a source selector 704;

the target picture is decoded by the picture decoder 701 and then output to the source selector 704;

the source selector 704 selects a lossy compression method when determining that the high-frequency component of the target picture is smaller than a first threshold, and selects a lossless compression method when determining that the high-frequency component of the target picture is larger than a second threshold;

In an optional implementation manner, the source selector 704 is further configured to perform high frequency component detection on the target picture, determine that the high frequency component of the target picture is smaller than a first threshold, or determine that the high frequency component of the target picture is larger than a second threshold.
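The two-threshold decision can be sketched as below. The text does not specify how the high-frequency component is measured, so this sketch assumes a mean absolute 4-neighbour Laplacian over a luma plane; that measure, the function names, and the default threshold values are all hypothetical. The behaviour between the two thresholds is not stated in this passage, so the sketch returns `None` there.

```python
def high_frequency_energy(luma: list[list[int]]) -> float:
    """Mean absolute 4-neighbour Laplacian over the interior pixels (assumed measure)."""
    h, w = len(luma), len(luma[0])
    if h < 3 or w < 3:
        return 0.0
    total = 0
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            lap = (4 * luma[y][x] - luma[y - 1][x] - luma[y + 1][x]
                   - luma[y][x - 1] - luma[y][x + 1])
            total += abs(lap)
    return total / ((h - 2) * (w - 2))


def select_source_format(luma, first_threshold=8.0, second_threshold=24.0):
    """Return 'YUV420' below the first threshold, 'YUV444' above the second, else None."""
    energy = high_frequency_energy(luma)
    if energy < first_threshold:
        return "YUV420"   # lossy input source
    if energy > second_threshold:
        return "YUV444"   # lossless input source
    return None           # undecided: not covered by this passage
```

A flat or smoothly varying picture scores near zero and maps to YUV420, while sharp edges such as rendered text produce large Laplacian values and map to YUV444.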

In an optional implementation, the determining, by the source selector 704, that the high-frequency component of the target picture is greater than a second threshold includes:

determining that the target picture contains text information, or determining that the target picture is automatically generated by a device.

In an optional implementation manner, the determining that the target picture contains text information by the source selector 704 includes:

determining the format of the target picture to be PNG, wherein the target picture comprises a text data block; or determining the format of the target picture as GIF, wherein the target picture comprises a picture file extension data block.

In an optional implementation manner, the source selector 704 is further configured to perform content identification on the target picture; and determining that the target picture contains text information according to the content identification result, or determining that the target picture is automatically generated by equipment according to the content identification result.

In an optional implementation manner, the source selector 704 determining that the high-frequency component of the target picture is smaller than the first threshold includes: determining that the target picture contains camera information.

In an alternative implementation, the lossy compression method includes: YUV420; the lossless compression method includes: YUV444.
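The difference between the two input sources can be illustrated with a small conversion sketch. It assumes full-range BT.601 (JFIF) coefficients and 2x2 averaging for chroma subsampling, neither of which is specified in the text, and the function names are hypothetical. YUV444 keeps one U and one V sample per pixel, whereas YUV420 keeps one per 2x2 block, so chroma detail is discarded before the video encoder sees the data.

```python
def _to_byte(value: float) -> int:
    """Round to the nearest integer and clamp to the 8-bit range."""
    return max(0, min(255, int(round(value))))


def rgb_to_yuv444(rgb):
    """Convert rows of (R, G, B) tuples into full-resolution Y, U and V planes."""
    y_plane, u_plane, v_plane = [], [], []
    for row in rgb:
        y_plane.append([_to_byte(0.299 * r + 0.587 * g + 0.114 * b) for r, g, b in row])
        u_plane.append([_to_byte(-0.168736 * r - 0.331264 * g + 0.5 * b + 128)
                        for r, g, b in row])
        v_plane.append([_to_byte(0.5 * r - 0.418688 * g - 0.081312 * b + 128)
                        for r, g, b in row])
    return y_plane, u_plane, v_plane


def subsample_2x2(plane):
    """Average each 2x2 block; edge samples are repeated for odd dimensions."""
    h, w = len(plane), len(plane[0])
    out = []
    for y in range(0, h, 2):
        row = []
        for x in range(0, w, 2):
            block = [plane[yy][xx]
                     for yy in (y, min(y + 1, h - 1))
                     for xx in (x, min(x + 1, w - 1))]
            row.append((sum(block) + 2) // 4)
        out.append(row)
    return out


def rgb_to_yuv420(rgb):
    """Full-resolution Y plane with U and V reduced to a quarter of the samples."""
    y_plane, u_plane, v_plane = rgb_to_yuv444(rgb)
    return y_plane, subsample_2x2(u_plane), subsample_2x2(v_plane)
```

For a W x H picture with even dimensions, the YUV444 output holds 3·W·H samples and the YUV420 output holds 1.5·W·H, which is where the bandwidth saving for natural pictures comes from in this sketch.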

The above image processing device may be a mobile device, for example a mobile phone or a personal computer; it may also be a server device, for example a social platform server or an instant messaging server. The image processing device may therefore be a component of such a device, and its hardware structure may be based on the hardware structure of that device, which is not described in detail in the embodiments of the present invention. The components shown in Fig. 7 are implemented in hardware.

It should be noted that, in the above-mentioned embodiment of the image processing apparatus, the included units are merely divided according to the functional logic, but are not limited to the above-mentioned division as long as the corresponding functions can be realized; in addition, the specific names of the functional units are only for the convenience of distinguishing from each other, and are not used for limiting the protection scope of the present invention.

In addition, those skilled in the art will understand that all or part of the steps in the above method embodiments may be implemented by a program instructing related hardware, and the corresponding program may be stored in a computer-readable storage medium, where the storage medium may be a read-only memory, a magnetic disk, or an optical disk.

The above description is only for the preferred embodiment of the present invention, but the scope of the present invention is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the embodiment of the present invention are included in the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims

1. A method for transcoding pictures, comprising:

after a target picture is input into a transcoding system, decoding the target picture through a picture decoder in the transcoding system to obtain decoded data; the decoded data is used for instructing a source selector in the transcoding system to make a decision, the transcoding system comprises the picture decoder, the source selector, a source converter and a video encoder, and the source converter is used for converting the decoded data into YUV format data and sending the YUV format data to the video encoder for video encoding;

if the decoded data comprise exchangeable image file exif information of a camera, the source selector judges that the target picture is a natural picture;

if the decoded data comprises text information, the source selector judges that the target picture is an unnatural picture, the text information comprising a text information data block tEXt, a compressed text data block zTXt, and/or a graphic text extension data block;

under the condition that the source selector judges that the target picture is a natural picture, a lossy compression mode is adopted as an input source to carry out transcoding video coding input; the natural picture is a naturally generated picture, and the lossy compression mode comprises YUV420;

under the condition that the source selector judges that the target picture is an unnatural picture, a lossless compression mode is adopted as an input source to carry out transcoding video coding input, wherein the lossless compression mode comprises YUV444; the unnatural picture includes at least one of: a picture automatically generated by a device, or a picture whose characteristics are distinguishable from those of the natural picture.

2. The method of claim 1, wherein the target picture being an unnatural picture comprises:

the target picture contains the text information, or the target picture is automatically generated by a device.

3. The method of claim 2, wherein the target picture containing text information comprises:

the format of the target picture is PNG, and the target picture comprises the text information data block tEXt and/or the compressed text data block zTXt;

or the format of the target picture is GIF, and the target picture comprises the graphic text extension data block.

4. The method of claim 2, wherein the target picture contains text information, or wherein the target picture is automatically generated by a device comprising:

5. The method of claim 1, wherein the target picture is a natural picture comprising:

the target picture comprises exchangeable image file exif information of the camera.

6. The method of claim 1, wherein the decoding the target picture to obtain decoded data comprises: decoding the target picture into RGB data.

7. A method for transcoding pictures, the method being performed by a source selector in a transcoding system, the source selector being configured to receive decoded data obtained by a picture decoder in the transcoding system after decoding a target picture; if the decoded data comprises exchangeable image file exif information of a camera, the source selector judges that the target picture is a natural picture; if the decoded data comprises text information, the source selector judges that the target picture is an unnatural picture, the text information comprising a text information data block tEXt, a compressed text data block zTXt, and/or a graphic text extension data block; the transcoding system comprises the picture decoder, the source selector, a source converter and a video encoder, wherein the source converter is used for converting the decoded data into YUV format data and sending the YUV format data to the video encoder for video encoding; the target picture is a picture input into the transcoding system, and the method comprises the following steps:

in a case where it cannot be determined whether the decoded data comprises the camera exchangeable image file exif information and the text information, performing high-frequency component detection on the target picture based on the decoded data to obtain a high-frequency component of the target picture;

under the condition that the high-frequency component of the target picture is smaller than a first threshold value, a lossy compression mode is adopted as an input source to carry out transcoding video coding input, wherein the lossy compression mode comprises YUV420;

and under the condition that the high-frequency component of the target picture is larger than a second threshold value, a lossless compression mode is adopted as an input source to carry out transcoding video coding input, wherein the lossless compression mode comprises YUV444.

8. An apparatus for transcoding a picture, comprising:

the decoding unit is used for decoding the target picture through a picture decoder in the transcoding system after the target picture is input into the transcoding system, so as to obtain decoded data; the decoded data is used for instructing a source selector in the transcoding system to make a decision, the transcoding system comprises the picture decoder, the source selector, a source converter and a video encoder, and the source converter is used for converting the decoded data into YUV format data and sending the YUV format data to the video encoder for video encoding;

the transcoding unit is used for performing transcoding video coding input by adopting a lossy compression mode as an input source under the condition that the source selector judges that the target picture is a natural picture, wherein the lossy compression mode comprises YUV420; the natural picture is a naturally generated picture; if the decoded data comprises exchangeable image file exif information of a camera, the source selector judges that the target picture is a natural picture;

the transcoding unit is further configured to, when the source selector determines that the target picture is an unnatural picture, use a lossless compression mode as an input source to perform transcoding video coding input, where the lossless compression mode includes YUV444; the unnatural picture includes at least one of: a picture automatically generated by a device, or a picture whose characteristics differ from those of the natural picture; if the decoded data includes text information, the source selector determines that the target picture is an unnatural picture, and the text information includes a text information data block tEXt, a compressed text data block zTXt, and/or a graphic text extension data block.

9. The apparatus of claim 8, wherein the target picture is an unnatural picture comprising:

10. The apparatus of claim 9, wherein the target picture containing text information comprises:

the format of the target picture is PNG, and the target picture comprises the text information data block tEXt and/or the compressed text data block zTXt;

11. The apparatus of claim 9, further comprising:

12. The apparatus of claim 8, wherein the target picture is a natural picture comprising:

the target picture comprises the camera exchangeable image file exif information.

13. An apparatus for transcoding a picture, the apparatus being operable in a source selector in a transcoding system, the source selector being configured to receive decoded data obtained by a picture decoder in the transcoding system after decoding a target picture; if the decoded data comprises exchangeable image file exif information of a camera, the source selector judges that the target picture is a natural picture; if the decoded data comprises text information, the source selector judges that the target picture is an unnatural picture, the text information comprising a text information data block tEXt, a compressed text data block zTXt, and/or a graphic text extension data block; the transcoding system comprises the picture decoder, the source selector, a source converter and a video encoder, wherein the source converter is used for converting the decoded data into YUV format data and sending the YUV format data to the video encoder for video encoding; the target picture is a picture input into the transcoding system, and the apparatus comprises:

a component detection unit configured to, in a case where it cannot be determined whether the decoded data includes the camera exchangeable image file exif information and the text information, perform high-frequency component detection on a target picture based on the decoded data, determine that a high-frequency component of the target picture is smaller than a first threshold, or determine that the high-frequency component of the target picture is larger than a second threshold;

the transcoding unit is used for performing transcoding video coding input by adopting a lossy compression mode as an input source under the condition that the high-frequency component of the target picture is smaller than a first threshold, wherein the lossy compression mode comprises YUV420; and under the condition that the high-frequency component of the target picture is larger than a second threshold value, a lossless compression mode is adopted as an input source to carry out transcoding video coding input, wherein the lossless compression mode comprises YUV444.

14. An image processing apparatus comprising a processor configured to invoke stored program code to perform the method of any of claims 1 to 6 or the method of claim 7.

15. A computer-readable storage medium, characterized in that the computer-readable storage medium stores program instructions that, when executed by a processor, cause the processor to perform the method of any of claims 1-6 or the method of claim 7.