CN121120810A - JPEG image coding and decoding method, device and medium - Google Patents
- Publication number
- CN121120810A
- Authority
- CN
- China
- Prior art keywords
- data block
- component
- transparency
- mode
- data
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Landscapes
- Compression Or Coding Systems Of Tv Signals (AREA)
Abstract
A JPEG image encoding and decoding method includes: obtaining an image to be processed that contains a luminance component, a chrominance component, and a transparency component; dividing the image into a plurality of data block groups, each containing a luminance component data block, a chrominance component data block, and a transparency component data block; and encoding the luminance, chrominance, and transparency component data blocks in each data block group to obtain a target code stream. The target code stream is then parsed to obtain decoded data corresponding to the luminance, chrominance, and transparency component data blocks, and an output image is obtained from the decoded data of each component data block. By acquiring an image that contains luminance, chrominance, and transparency components and dividing it into corresponding data block groups for encoding and decoding, the application enables JPEG to natively support transparency while retaining its encoding efficiency, and avoids compatibility problems.
Description
Technical Field
The embodiments of the application relate to, but are not limited to, the technical field of image processing, and in particular to a JPEG image encoding and decoding method, device, and medium.
Background
The JPEG standard is widely used in image processing thanks to its efficient encoding and broad hardware support, but it has a core limitation: it does not support an Alpha channel (transparency information). In scenarios that require transparency, such as digital cultural product creation software (for example, graphical user interface development tools and web design tools), game textures in animation and game engine software, and scene assets in virtual reality processing software, developers can only choose the PNG format. Although PNG supports lossless compression and full transparency, it decodes slowly and produces large files, which easily causes performance bottlenecks and bandwidth pressure on resource-constrained embedded devices, such as those running home entertainment product software (for example, embedded entertainment devices).
To make up for JPEG's lack of Alpha channel support, the prior art has made various attempts at improvement, each with clear drawbacks. Some JPEG extension standards (such as JPEG XR) support an Alpha channel but use a transform other than the conventional DCT, which greatly increases computational complexity and limits support on the general-purpose hardware that digital cultural and creative software relies on. Formats such as WebP and HEIC are built on video coding standards; their coding logic is complex and increases hardware design difficulty and cost. In addition, some proprietary extension schemes break compatibility with standard JPEG decoders, so images cannot be displayed normally on decoders that have not been upgraded, which limits their range of application.
Disclosure of Invention
The following is a summary of the subject matter described in detail herein. This summary is not intended to limit the scope of the claims.
The embodiments of the application provide a JPEG image encoding and decoding method, device, and medium that enable JPEG to natively support transparency while retaining efficient encoding, avoid compatibility problems, and address the pain points of the prior art.
An embodiment of the application provides a JPEG image encoding and decoding method, comprising: obtaining an image to be processed, the image including a luminance component, a chrominance component, and a transparency component; dividing the image into a plurality of data block groups, each data block group including a luminance component data block, a chrominance component data block, and a transparency component data block; encoding the luminance component data block, the chrominance component data block, and the transparency component data block in each data block group to obtain a target code stream; parsing the target code stream to obtain decoded data corresponding to the luminance component data block, the chrominance component data block, and the transparency component data block; and obtaining an output image according to the decoded data of each component data block.
In an embodiment of the present application, the encoding of the luminance component data block, the chrominance component data block, and the transparency component data block in each of the data block groups to obtain a target code stream includes encoding the luminance component data block and the chrominance component data block in each of the data block groups according to a JPEG standard format to obtain a corresponding luminance component encoding result and a corresponding chrominance component encoding result, encoding the transparency component data block in each of the data block groups according to a preset encoding mode to obtain a corresponding transparency component encoding result, and splicing the luminance component encoding result, the chrominance component encoding result, and the transparency component encoding result of the same data block group, and obtaining the target code stream according to the splicing result of a plurality of data block groups.
In an embodiment of the application, the encoding of the transparency component data blocks in each data block group according to the preset encoding mode to obtain a corresponding transparency component encoding result includes determining a target encoding mode of the transparency component data blocks from a plurality of preset encoding modes according to the data characteristics of the transparency component data blocks, and obtaining a transparency component encoding result including a control header and a data portion based on the target encoding mode, wherein the control header is used for identifying the target encoding mode and the length of the data portion.
In an embodiment of the present application, the determining the target coding mode of the transparency component data block from a plurality of preset coding modes according to the data characteristics of the transparency component data block includes selecting a first mode from the plurality of coding modes as a target coding mode if all pixel values of the transparency component data block are 0, wherein when the target coding mode is the first mode, the control header in the transparency component coding result is 0x3f, and the length of the data portion is 0.
In an embodiment of the present application, the determining the target encoding mode of the transparency component data block from a plurality of preset encoding modes according to the data characteristics of the transparency component data block includes selecting a second mode from the plurality of encoding modes as the target encoding mode if all pixel values of the transparency component data block are 255, wherein when the target encoding mode is the second mode, the control header in the transparency component encoding result is 0x7f, and the length of the data portion is 0.
In an embodiment of the present application, the determining a target encoding mode of the transparency component data block from a plurality of preset encoding modes according to the data characteristics of the transparency component data block includes: if the transparency component data block contains both a pixel value other than 0 and a pixel value other than 255, performing run-length encoding on the transparency component data block to obtain a run-length block; if the total length of the run-length block is less than or equal to a preset byte length, selecting a third mode from the plurality of encoding modes as the target encoding mode, wherein when the target encoding mode is the third mode, the data portion of the transparency component encoding result is the run-length block; and if the total length of the run-length block is greater than the preset byte length, selecting a fourth mode from the plurality of encoding modes as the target encoding mode, wherein when the target encoding mode is the fourth mode, the data portion of the transparency component encoding result is the original pixel data of the data block.
In an embodiment of the present application, the parsing the target code stream to obtain decoded data corresponding to the luminance component data block, the chrominance component data block, and the transparency component data block includes: decoding the luminance component encoding result and the chrominance component encoding result in the target code stream according to the JPEG standard format to obtain luminance component decoded data and chrominance component decoded data; reading the control header of the transparency component encoding result in the target code stream; if the control header is 0x3f, decoding the transparency component encoding result according to the first mode to obtain transparency component decoded data whose pixel values are all 0; if the control header is 0x7f, decoding the transparency component encoding result according to the second mode to obtain transparency component decoded data whose pixel values are all 255; if the control header identifies the third mode, decoding the data portion of the transparency component encoding result according to the third mode to obtain the transparency component decoded data; and if the control header identifies the fourth mode, reading the data portion of the transparency component encoding result according to the fourth mode to obtain the transparency component decoded data.
In an embodiment of the present application, the obtaining an output image according to the decoded data of each component data block includes performing color space conversion on the decoded luminance component data block and the decoded chrominance component data block to obtain RGB component data, and correspondingly combining the RGB component data and the decoded transparency component data block according to pixels to obtain an output image in RGBA format.
In another aspect, an embodiment of the application provides an electronic device comprising at least one processor, at least one memory, and computer program instructions stored in the memory, which when executed by the processor, implement a method as described above.
In another aspect, embodiments of the present application provide a computer readable storage medium having stored thereon computer program instructions which, when executed by a processor, implement a method as described above.
The embodiments of the application provide a JPEG image encoding and decoding method, an electronic device, and a computer-readable storage medium. The method first obtains an image to be processed, the image including a luminance component, a chrominance component, and a transparency component; then divides the image into a plurality of data block groups, each including a luminance component data block, a chrominance component data block, and a transparency component data block; and then encodes the luminance, chrominance, and transparency component data blocks in each data block group to obtain a target code stream. The target code stream is then parsed to obtain decoded data corresponding to the luminance, chrominance, and transparency component data blocks, and an output image is obtained according to the decoded data of each component data block.
On the one hand, the transparency component is incorporated into the JPEG processing flow and transparency information is carried directly through the division into data block groups, so that the JPEG format can natively support transparency without relying on alternative formats such as PNG. This meets needs such as layered rendering of characters and scenes in animation and game engine software, transparent background processing in digital cultural product creation software, and scene asset overlay in virtual reality processing software, while retaining the efficiency of JPEG encoding. On the other hand, encoding each data block group together yields a compact code stream that fits the JPEG framework, which effectively reduces file size and lowers the storage footprint and transmission bandwidth pressure on embedded devices. The whole process extends JPEG's existing data block processing logic and introduces no complex algorithms; hardware only needs to add the ability to parse transparency component data blocks on top of existing JPEG decoding, with no architectural redesign, which lowers the adaptation difficulty for the general-purpose hardware that digital cultural and creative software relies on and suits low-cost scenarios. In addition, the basic image can still be displayed normally, which meets the cross-device display requirements of digital publishing software and broadens the range of application.
Drawings
Fig. 1 is a flowchart of a JPEG image encoding and decoding method provided in an embodiment of the present application;
FIG. 2 is a flowchart illustrating step 130 of FIG. 1 according to an embodiment of the present application;
FIG. 3 is a flowchart illustrating step 220 of FIG. 2 according to an embodiment of the present application;
FIG. 4 is a schematic diagram of a transparency component encoding result structure according to an embodiment of the present application;
FIG. 5 is a flow chart of coding decisions for transparency class data blocks provided by an embodiment of the present application;
FIG. 6 is a flowchart illustrating step 150 of FIG. 1 according to an embodiment of the present application;
Fig. 7 is a decoding flow chart provided by an embodiment of the present application.
Detailed Description
The present application will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present application more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the application.
It should be noted that although a logical order is shown in the flowcharts, in some cases the steps shown or described may be performed in a different order. The terms "first", "second", and the like in the description, the claims, and the above drawings are used to distinguish similar elements and do not necessarily describe a particular order or sequence. It should be understood that the structures, proportions, and sizes shown in the drawings are for illustration only and do not limit the conditions under which the application can be practiced; any structural modification, change in proportion, or adjustment of size that does not affect the effects and purposes achievable by the application still falls within its scope. Likewise, terms such as "upper", "lower", "left", "right", "middle", and "a" used in this specification are for descriptive convenience only and do not limit the practicable scope of the application; changes or adjustments to the relative relationships they describe, without substantive change to the technical content, are also regarded as within the practicable scope of the application.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. The terminology used herein is for the purpose of describing embodiments of the application only and is not intended to be limiting of the application.
In the field of image processing, the JPEG (Joint Photographic Experts Group) standard has become a mainstream format in many scenarios thanks to its high coding efficiency and broad hardware support, but the conventional JPEG standard has a key limitation: it does not support an Alpha channel (transparency information). In scenarios that need transparency, such as digital cultural and creative design (for example, graphical user interfaces and web design), game textures in animation and game engine software, and scene assets in virtual reality processing software, developers have to choose an alternative format that supports an Alpha channel, such as PNG. However, although PNG provides full transparency, it decodes slowly and produces large files, which easily causes noticeable performance bottlenecks and transmission bandwidth pressure on the resource-constrained embedded devices on which the software runs.
To overcome JPEG's lack of Alpha channel support, the prior art has made various attempts at improvement, each with clear drawbacks. Some JPEG extension standards (such as JPEG XR) support an Alpha channel but use a transform other than the conventional DCT, which greatly increases computational complexity and limits support on the general-purpose hardware that digital cultural and creative software relies on. Formats such as WebP and HEIC are built on video coding standards; their coding logic is complex and increases hardware design difficulty and cost in fields such as game and animation software and digital publishing software. In addition, some proprietary extension schemes break compatibility with standard JPEG decoders, so images cannot be displayed normally on decoders that have not been upgraded, which limits their use in fields that require broad compatibility, such as education and news industry software.
In view of this, embodiments of the present application provide a JPEG image encoding and decoding method, an electronic device, and a computer-readable storage medium. The method first obtains an image to be processed, the image including a luminance component, a chrominance component, and a transparency component; then divides the image into a plurality of data block groups, each including a luminance component data block, a chrominance component data block, and a transparency component data block; and then encodes the luminance, chrominance, and transparency component data blocks in each data block group to obtain a target code stream. The target code stream is then parsed to obtain decoded data corresponding to the luminance, chrominance, and transparency component data blocks, and an output image is obtained according to the decoded data of each component data block.
Embodiments of the present application will be further described below with reference to the accompanying drawings.
Referring to fig. 1, fig. 1 is a flowchart of a JPEG image encoding and decoding method according to an embodiment of the present application. The process may include, but is not limited to, steps 110 through 150.
Step 110, obtaining an image to be processed, wherein the image comprises a luminance component, a chrominance component and a transparency component;
Step 120, dividing the image into a plurality of data block groups, wherein each data block group comprises a luminance component data block, a chrominance component data block and a transparency component data block;
Step 130, encoding the luminance component data block, the chrominance component data block and the transparency component data block in each data block group to obtain a target code stream;
Step 140, parsing the target code stream to obtain decoded data corresponding to the luminance component data block, the chrominance component data block and the transparency component data block;
Step 150, obtaining an output image according to the decoded data of each component data block.
Steps 110 to 150 are described in detail below.
In a possible embodiment, in step 110, the image to be processed may be acquired from an image capture device (such as a camera or scanner) or from a storage medium (such as a hard disk or USB drive). The image includes three core kinds of components. The luminance component (usually denoted Y) reflects the brightness of the image and is the basis of its visual presentation. The chrominance components, a first chrominance component (usually denoted Cb) and a second chrominance component (usually denoted Cr), carry the color information of the image: Cb represents the difference between blue and luminance, and Cr represents the difference between red and luminance. The transparency component (also called the Alpha component) controls the transparency of each pixel; its value usually lies between 0 and 255, where 0 conventionally indicates a fully transparent pixel and 255 a fully opaque one. Combining these components meets the needs of subsequent image processing that must present transparency.
In a possible embodiment, in step 120, the image to be processed may be evenly divided into a plurality of data block groups according to rules that fit JPEG encoding logic; a data block group is the minimum coded unit (MCU) of the JPEG standard. To preserve image quality, this embodiment combines the Alpha channel with the YUV444 format, so each data block group (that is, each MCU) contains four 8×8 data blocks: one Y (luminance) component data block, one Cb (blue chrominance) component data block, one Cr (red chrominance) component data block, and one Alpha (transparency) component data block. In other words, this embodiment defines, within the standard JPEG YUV444 framework, an MCU structure containing Y, Cb, Cr, and A data blocks, which integrates Alpha information while maintaining high quality.
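The division into data block groups can be illustrated with a minimal Python sketch. It assumes the four components are already available as equally sized planes (lists of rows); the function name `split_into_block_groups` and the edge handling (repeating the last row or column when the image size is not a multiple of 8) are assumptions of this sketch, not details given in the application.

```python
from typing import Dict, List

BLOCK = 8  # side length of one data block, as described in the text

Plane = List[List[int]]  # one component plane, indexed [row][col]


def split_into_block_groups(planes: Dict[str, Plane]) -> List[Dict[str, List[int]]]:
    """Split equally sized Y/Cb/Cr/A planes into groups of four 8x8 blocks.

    Each group maps a component name to its 64 samples in row-major order.
    Edge padding by replicating the last row/column is an assumption.
    """
    height = len(planes["Y"])
    width = len(planes["Y"][0])
    groups = []
    for top in range(0, height, BLOCK):
        for left in range(0, width, BLOCK):
            group = {}
            for name in ("Y", "Cb", "Cr", "A"):
                plane = planes[name]
                group[name] = [
                    plane[min(top + r, height - 1)][min(left + c, width - 1)]
                    for r in range(BLOCK)
                    for c in range(BLOCK)
                ]
            groups.append(group)
    return groups
```

For an image 16 pixels wide and 8 pixels high, this yields two groups, each holding one Y, one Cb, one Cr, and one A block of 64 samples.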
In a possible embodiment, in step 130, a differentiated encoding strategy may be applied to the component data blocks in each data block group. The luminance and chrominance component data blocks use standard JPEG encoding logic, which keeps them compatible with existing JPEG decoders. The transparency component data block uses a preset encoding mode chosen according to its data characteristics, which achieves efficient compression, in particular extreme compression of blocks that are entirely 0 or entirely 255. Finally, the encoding results of the components are combined into a target code stream that carries all of the image's information: in addition to the luminance and chrominance information of conventional JPEG, the code stream carries the newly added transparency information, and its overall structure conforms to the JPEG code stream framework.
In a possible embodiment, so that the decoder can quickly tell whether the image contains an Alpha component, the number of components (comp_num) in the frame header (Start of Frame, SOF) of the JPEG file may be set to 4. This identifies the file as YUVA format and distinguishes it from YUV formats that contain only 3 components (such as YUV444 and YUV420).
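As an illustration of the frame-header signalling, the sketch below builds a baseline SOF0 segment whose component count is 4 and checks that count on the reading side. The segment layout (marker 0xFFC0, 16-bit length, sample precision, height, width, component count, then three bytes per component) follows the JPEG standard; the component identifiers 1 to 4 and the quantization table selectors are assumptions made for this example, since the application does not specify them.

```python
import struct


def build_sof0(width: int, height: int, num_components: int = 4) -> bytes:
    """Build a baseline SOF0 segment with 1x1 sampling for every component."""
    # Precision (8 bits), height, width, number of components.
    body = struct.pack(">BHHB", 8, height, width, num_components)
    for comp_id in range(1, num_components + 1):
        # Assumption: Cb/Cr select table 1, Y and Alpha select table 0.
        table = 1 if comp_id in (2, 3) else 0
        body += bytes([comp_id, 0x11, table])  # 0x11 = H1V1 (4:4:4)
    # The length field counts itself (2 bytes) plus the body.
    return b"\xff\xc0" + struct.pack(">H", len(body) + 2) + body


def has_alpha(sof: bytes) -> bool:
    """Return True when the frame header declares four components (YUVA)."""
    comp_num = sof[9]  # marker(2) + length(2) + precision(1) + h(2) + w(2)
    return comp_num == 4
```

A decoder that finds a component count of 3 would follow the ordinary YUV path, while a count of 4 would enable the Alpha decoding path.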
In a possible embodiment, in step 140, because comp_num=4 is set in the JPEG frame header to identify the YUVA format and the encoding result of the Alpha component has been added, the decoder must add the ability to parse and process the fourth component (Alpha) on top of a standard JPEG decoder. The decoder reads comp_num=4 from the frame header, recognizes the YUVA format, and starts the corresponding Alpha component decoding flow. Specifically, the decoder first reads the basic identification information in the target code stream (such as the number of components, the data block size, and the length of each component's encoding result) and determines the positions and formats of the four component encoding results, luminance (Y), chrominance (Cb, Cr), and transparency (Alpha), in the code stream. It then applies the decoding logic appropriate to each component: the luminance and chrominance components are decoded according to the standard JPEG flow, while for the transparency component the decoder reads the control header of the encoding result to determine the encoding mode and then parses the data portion according to that mode. The result is the decoded data of each of the four component data blocks, which correspond to the original component data blocks before encoding and provide accurate base data for subsequent image synthesis.
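The transparency branch of this decoding flow can be sketched as follows. The control header values 0x3f and 0x7f and the split into a 2-bit mode and a 6-bit length field come from the application; the run-length layout used for the third mode (repeated value/run-length byte pairs) is an assumption of this sketch, because the exact layout is not given here, and `decode_alpha_block` is a hypothetical helper name.

```python
from typing import List, Tuple

MODE_RLE, MODE_RAW = 2, 3  # high two header bits: 10 and 11


def decode_alpha_block(stream: bytes, pos: int = 0) -> Tuple[List[int], int]:
    """Decode one transparency block; return (64 samples, next position)."""
    header = stream[pos]
    pos += 1
    if header == 0x3F:  # first mode: every sample is 0, no data portion
        return [0] * 64, pos
    if header == 0x7F:  # second mode: every sample is 255, no data portion
        return [255] * 64, pos
    mode = header >> 6
    length = (header & 0x3F) + 1  # low 6 bits store (length - 1)
    data = stream[pos:pos + length]
    if len(data) != length:
        raise ValueError("truncated transparency block")
    pos += length
    if mode == MODE_RAW:
        samples = list(data)  # fourth mode: raw 8x8 samples
    elif mode == MODE_RLE:
        if length % 2:
            raise ValueError("run-length data must be (value, run) pairs")
        samples = []
        # Assumed layout: byte i is the value, byte i + 1 is its run length.
        for i in range(0, length, 2):
            samples.extend([data[i]] * data[i + 1])
    else:
        raise ValueError(f"unexpected control header 0x{header:02x}")
    if len(samples) != 64:
        raise ValueError("decoded block does not contain 64 samples")
    return samples, pos
```

Returning the next read position lets a caller walk through consecutive blocks without knowing their lengths in advance.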
In a possible embodiment, in step 150, the decoded luminance component data block and the decoded chrominance component data block may first be restored to a data format that meets the input requirements of color space conversion. Color space conversion then converts the data into a format that the display device can recognize. Finally, the converted color data and the transparency component data block are merged pixel by pixel, so that the color information of each pixel is matched exactly with its transparency information, producing an output image that can present transparency.
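A minimal sketch of the conversion and merge is shown below. The application does not state which conversion matrix is used; the full-range YCbCr-to-RGB equations commonly used with JFIF files are assumed here, and the function names are hypothetical.

```python
from typing import List, Tuple


def clamp(value: float) -> int:
    """Round to the nearest integer and clip to the 8-bit range."""
    return max(0, min(255, int(round(value))))


def ycbcra_to_rgba(y: int, cb: int, cr: int, a: int) -> Tuple[int, int, int, int]:
    """Convert one YCbCr sample plus Alpha to RGBA (assumed JFIF equations)."""
    r = y + 1.402 * (cr - 128)
    g = y - 0.344136 * (cb - 128) - 0.714136 * (cr - 128)
    b = y + 1.772 * (cb - 128)
    return clamp(r), clamp(g), clamp(b), a  # Alpha passes through unchanged


def merge_block(ys: List[int], cbs: List[int], crs: List[int],
                alphas: List[int]) -> List[Tuple[int, int, int, int]]:
    """Merge four decoded component blocks pixel by pixel into RGBA."""
    return [ycbcra_to_rgba(*sample) for sample in zip(ys, cbs, crs, alphas)]
```

Because the Alpha samples are copied through untouched, each output pixel keeps exactly the transparency value decoded for its position.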
In a possible embodiment, as shown in fig. 2, the execution of step 130 may include, but is not limited to, steps 210 through 230.
Step 210, encoding the luminance component data block and the chrominance component data block in each data block group according to the JPEG standard format to obtain a corresponding luminance component encoding result and chrominance component encoding result;
Step 220, encoding the transparency component data block in each data block group according to a preset encoding mode to obtain a corresponding transparency component encoding result;
Step 230, splicing the luminance component encoding result, the chrominance component encoding result and the transparency component encoding result of the same data block group, and obtaining a target code stream according to the splicing results of the plurality of data block groups.
In a possible embodiment, in step 210, the encoding process of the JPEG standard format includes the discrete cosine transform (DCT), quantization, and entropy encoding (for example, Huffman encoding). The discrete cosine transform converts the 8×8 pixel data in the spatial domain into 64 coefficients in the frequency domain (1 DC coefficient and 63 AC coefficients), which makes it possible to compress the data by quantizing the high-frequency coefficients. Quantization reduces data redundancy by dividing the coefficients by the entries of a preset quantization table and keeping the integer result; the degree of quantization can be adjusted to the required image quality, and the coarser the quantization table, the higher the compression ratio but the greater the image distortion. Entropy encoding then compresses the quantized coefficients losslessly (for example, Huffman encoding builds a frequency table and assigns shorter codes to values that occur more often), further reducing the amount of data.
In a possible embodiment, in step 220, the preset encoding modes are four encoding modes designed around a characteristic of transparency component data blocks: most pixel values are 0 or 255, and only a small portion take other values. The core aim is to use a differentiated encoding strategy to maximize compression efficiency while keeping the transparency information accurate, avoiding the redundancy of a conventional uniform encoding scheme.
In a possible embodiment, in step 230, the chrominance component encoding result includes a Cb component encoding result and a Cr component encoding result. Splicing may follow the fixed order "luminance component encoding result → chrominance component encoding result (Cb component encoding result → Cr component encoding result) → transparency component encoding result". Component identification information (such as the encoding result length of each component, the number of data blocks, and the total number of MCU rows and columns) is added to the code stream at the same time, so that the decoder can quickly locate the start and end of each component encoding result and avoid confusing information during decoding. This also keeps the overall structure of the target code stream consistent with the JPEG code stream specification and improves compatibility with various decoding devices.
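The transform and quantization stages can be illustrated with a direct (unoptimized) Python sketch of the 8×8 forward DCT followed by table-based quantization. The input block is assumed to be level-shifted already (128 subtracted from each sample), and the flat quantization table used in the usage note is a hypothetical example rather than a table from the application.

```python
import math
from typing import List

N = 8
Block = List[List[float]]


def dct_8x8(block: Block) -> Block:
    """Forward 2-D DCT-II of a level-shifted 8x8 block (direct formula)."""
    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            cu = 1 / math.sqrt(2) if u == 0 else 1.0
            cv = 1 / math.sqrt(2) if v == 0 else 1.0
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / 16)
                              * math.cos((2 * y + 1) * v * math.pi / 16))
            out[u][v] = 0.25 * cu * cv * total
    return out


def quantize(coeffs: Block, table: List[List[int]]) -> List[List[int]]:
    """Divide each coefficient by its table entry and round to an integer."""
    return [[int(round(coeffs[u][v] / table[u][v])) for v in range(N)]
            for u in range(N)]
```

For a constant block whose level-shifted samples are all 100, the DC coefficient is 800 and every AC coefficient is effectively zero; with a flat table of 16s the quantized block holds 50 in the DC position and 0 elsewhere, which is the kind of sparse output that entropy encoding compresses well.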
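The fixed splicing order can be sketched as a simple concatenation. This is a simplification: it treats each component encoding result as an already byte-aligned `bytes` value and omits the component identification information, whose exact layout the application does not specify here; the function names are hypothetical.

```python
from typing import Iterable, Tuple

# One data block group: (Y, Cb, Cr, Alpha) encoding results, in that order.
GroupResult = Tuple[bytes, bytes, bytes, bytes]


def splice_group(y: bytes, cb: bytes, cr: bytes, alpha: bytes) -> bytes:
    """Concatenate one group's results in the order Y, Cb, Cr, Alpha."""
    return y + cb + cr + alpha


def build_stream(groups: Iterable[GroupResult]) -> bytes:
    """Concatenate the spliced results of all groups, in group order."""
    return b"".join(splice_group(*group) for group in groups)
```

Keeping the Alpha result last in each group means a reader always meets the luminance and chrominance data of a group before its transparency data.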
In a possible embodiment, as shown in fig. 3, the execution of step 220 may include, but is not limited to, step 310 and step 320.
Step 310, determining a target coding mode of the transparency component data block from a plurality of preset coding modes according to the data characteristics of the transparency component data block;
Step 320, obtaining a transparency component encoding result comprising a control header and a data portion based on the target encoding mode, the control header identifying the target encoding mode and the length of the data portion.
In a possible embodiment, in step 310, the data features refer to the distribution of pixel values in the transparency component data block. They specifically include whether the pixel values are all 0, whether they are all 255, whether the block includes non-0 and non-255 pixel values, and, in that last case, how the values are distributed in runs (such as the number of consecutive pixels with the same value). These features are the core basis for selecting the encoding mode, ensuring that the selected mode matches the actual data and achieves the best compression effect.
In a possible embodiment, in step 320, the transparency component encoding result of one data block consists of a 1-byte control header and a data portion of at most 64 bytes. As shown in FIG. 4, bits 7-6 (the upper 2 bits) of the control header identify the target encoding mode (type): 00 is used for all-0 data blocks (first mode); 01 is used for all-255 data blocks (second mode); 10 is used for data blocks that are not all 0 or all 255 (third mode, in which the pixel values, including values other than 0 and 255, are stored as 1-byte data values after the control header and compression is achieved together with the run information); and 11 indicates that uncompressed data follows the control header (fourth mode, i.e. the original 8×8 pixel data). Bits 5-0 (the lower 6 bits) identify the length of the data portion; the value range is 0-63, indicating an actual data portion length of 1-64 bytes (in the first and second modes the lower 6 bits are fixed at 111111 and there is no data portion). Together with the 1-byte control header, the total length of a single block encoding result is 1-65 bytes. The data portion is the compressed data obtained by processing the transparency component data block in the corresponding encoding mode, and its content varies with the encoding mode.
In a possible embodiment, if all pixel values of the transparency component data block are 0, the first mode is selected from the multiple encoding modes as the target encoding mode. In the first mode, when the entire 8×8 block is all 0, the control header is 0x3f (binary 00111111: the upper 2 bits 00 identify the first mode, and the lower 6 bits 111111 are a fixed value corresponding to a data portion length of 0 bytes) and the length of the data portion is 0, so only 1 byte is used to represent the information of an 8×8 block. That is, when the target encoding mode is the first mode, the control header in the transparency component encoding result is 0x3f and the length of the data portion is 0. This mode suits a completely transparent transparency component data block (all pixel values are 0): no transparency data needs to be stored, the 1-byte control header fully represents the data block, data redundancy is minimized, and compression efficiency is highest.
In a possible embodiment, if all pixel values of the transparency component data block are 255, the second mode is selected from the multiple encoding modes as the target encoding mode. In the second mode, when the entire 8×8 block is all 255, the control header is 0x7f (binary 01111111: the upper 2 bits 01 identify the second mode, and the lower 6 bits 111111 are a fixed value corresponding to a data portion length of 0 bytes) and the length of the data portion is 0, so only 1 byte is used to represent the information of an 8×8 block. That is, when the target encoding mode is the second mode, the control header in the transparency component encoding result is 0x7f and the length of the data portion is 0. This mode suits a completely opaque transparency component data block (all pixel values are 255); as in the first mode, the 1-byte control header alone represents the data block without storing additional transparency data, so extremely high compression efficiency is achieved in this scene.
In a possible embodiment, if the transparency component data block includes non-0 and non-255 pixel values, run-length encoding is performed on the transparency component data block to obtain a run-length block (i.e., a Run-Level block, where "Run" is the number of consecutive identical pixel values and "Level" is that pixel value). If the total length of the run-length block is less than or equal to a preset byte length, the third mode is selected from the multiple encoding modes as the target encoding mode. The third mode applies when the 8×8 block contains differing values: the data portion consists of multiple Run-Level blocks, each containing a 1-byte run count followed by a 1-byte pixel value, and the control header gives the length of the data portion. That is, when the target encoding mode is the third mode, the data portion of the transparency component encoding result is the run-length block. If the total length of the run-length block is greater than the preset byte length, the fourth mode is selected from the multiple encoding modes as the target encoding mode. The fourth mode applies when the length of the run-length block (the data portion) would exceed 64 bytes: in view of the compression rate, the block is stored uncompressed, and the total length of the encoding result is 65 bytes (1-byte control header + 64 bytes of original pixel data). That is, when the target encoding mode is the fourth mode, the data portion of the transparency component encoding result is the original pixel data of the image.
Run-length encoding compresses data by recording the number of consecutive identical pixel values (the run) together with the pixel value (the level). For example, the pixel value sequence "0,0,0,128,128,255" can be expressed as "3-0,2-128,1-255" after run-length encoding, which effectively reduces the storage required for consecutive repeated data. The length of a run-length block, i.e. of the data unit obtained after run-length encoding, is the total number of bytes of the run-length encoded data. The preset byte length is usually set to 64 bytes, a value determined by the value range of the lower 6 bits of the control header (0-63, corresponding to a data portion length of 1-64 bytes). When the length of the run-length block is less than or equal to 64 bytes, run-length encoding (third mode) takes no more space than storing the original data; when it is greater than 64 bytes, run-length encoding brings no compression benefit, and directly storing the 64 bytes of original pixel data (fourth mode) is more efficient and also avoids the more complex run-length decoding process.
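The control header layout described above can be sketched in Python as follows. The names `pack_header`, `unpack_header` and the `MODE_*` constants are hypothetical and are not part of the application; the sketch only assumes the bit layout stated in the text (mode in bits 7-6, data length minus one in bits 5-0, and the fixed value 111111 for the two modes without a data portion).

```python
# Hypothetical names for the four 2-bit mode identifiers described in the text.
MODE_ALL_ZERO = 0b00   # first mode: all pixels are 0
MODE_ALL_255 = 0b01    # second mode: all pixels are 255
MODE_RUN_LEVEL = 0b10  # third mode: Run-Level pairs follow
MODE_RAW = 0b11        # fourth mode: 64 raw bytes follow


def pack_header(mode, data_len):
    """Build the 1-byte control header from a mode and a data portion length."""
    if mode in (MODE_ALL_ZERO, MODE_ALL_255):
        if data_len != 0:
            raise ValueError("modes 00 and 01 carry no data portion")
        return (mode << 6) | 0x3F  # lower 6 bits fixed at 111111
    if mode not in (MODE_RUN_LEVEL, MODE_RAW):
        raise ValueError("unknown mode")
    if not 1 <= data_len <= 64:
        raise ValueError("data portion must be 1-64 bytes")
    return (mode << 6) | (data_len - 1)  # field value 0-63 means 1-64 bytes


def unpack_header(header):
    """Return (mode, data_len) for a 1-byte control header."""
    if not 0 <= header <= 0xFF:
        raise ValueError("header must fit in one byte")
    mode = header >> 6
    if mode in (MODE_ALL_ZERO, MODE_ALL_255):
        return mode, 0
    return mode, (header & 0x3F) + 1
```

With this layout, the raw mode with 64 data bytes packs to 0xFF, matching the header value given for the fourth mode.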
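As an illustration of the run/level representation, the following minimal Python sketch turns a pixel sequence into 1-byte run count + 1-byte pixel value pairs. The function name `run_level_encode` is hypothetical, and capping a run at 255 so that it fits in one byte is an assumption (an 8×8 block never needs a run longer than 64).

```python
def run_level_encode(pixels):
    """Encode a sequence of 0-255 values as (run, level) byte pairs."""
    pairs = bytearray()
    i = 0
    while i < len(pixels):
        run = 1
        # Extend the run while the next pixel repeats the current value.
        while (i + run < len(pixels)
               and pixels[i + run] == pixels[i]
               and run < 255):
            run += 1
        pairs += bytes((run, pixels[i]))
        i += run
    return bytes(pairs)


# The example sequence from the text: "3-0, 2-128, 1-255".
encoded = run_level_encode([0, 0, 0, 128, 128, 255])
```

Each pair costs 2 bytes, so a 64-pixel block stays within the 64-byte limit only if it has at most 32 runs.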
Referring to fig. 5, fig. 5 is a flowchart of the encoding decision for classifying transparency data blocks according to an embodiment of the present application. The specific implementation process is as follows: the data characteristics of the input 8×8 Alpha data block are analyzed; if all pixel values are 0, the all-0 block encoding of the first mode is adopted; otherwise, it is judged whether all pixel values are 255, and if so, the all-255 block encoding of the second mode is adopted; if not (i.e., the block is neither all 0 nor all 255), Run-Level encoding is applied to the data block and the total length of the encoded Run-Level blocks is calculated; if the total length is less than or equal to 64 bytes (excluding the control header), the Run-Level encoding of the third mode is adopted, and if the total length is greater than 64 bytes (excluding the control header), the uncompressed encoding of the fourth mode is adopted.
In a feasible embodiment, parsing the target code stream to obtain the decoded data corresponding to the luminance component data block, the chrominance component data block and the transparency component data block comprises: decoding the luminance component encoding result and the chrominance component encoding result in the target code stream in the JPEG standard format to obtain luminance component decoded data and chrominance component decoded data; and reading the control header of the transparency component encoding result in the target code stream and determining the decoding mode according to the identifier in the control header. Specifically:
The decoder reads the number of components COMP_NUM=4 from the frame header (SOF) and thereby identifies the image as being in YUVA format. It then determines the start positions and lengths of the Y, Cb, Cr and Alpha component encoding results in the code stream according to the component identification information in the target code stream (such as the start offset of each component in the SOS segment and the number of data blocks), and splits the encoding results of the four components from the code stream, so that the four components in each MCU can be decoded in turn according to the MCU structure (1 Y + 1 Cb + 1 Cr + 1 Alpha). Next, the Y, Cb and Cr component encoding results in the target code stream are decoded in the JPEG standard format (the standard YUV444 decoding flow): Huffman entropy decoding first parses the quantized coefficient data from the code stream, and then inverse quantization (multiplying the coefficient data by the quantization table values to restore the frequency-domain coefficients) and the inverse discrete cosine transform (IDCT, converting frequency-domain coefficients into spatial-domain pixel data) are performed to obtain the final Y, Cb and Cr component decoded data. This decoding process is the inverse of JPEG standard encoding and restores the compressed frequency-domain coefficient data to 8-bit spatial-domain pixel data, so that the luminance and chrominance information is consistent with that before encoding, apart from the loss introduced by quantization. Finally, the control header of the transparency component encoding result in the target code stream is read and the decoding mode is determined from its identifier; specifically, the 1-byte control header is read and the decoding mode is judged from its upper 2 bits:
If the control header is 0x3f (upper 2 bits 00), the transparency component encoding result is decoded according to the first mode to obtain transparency component decoded data in which all pixel values are 0; in this case no data portion needs to be read, and an 8×8 transparency data block with all values 0 is generated directly;
If the control header is 0x7f (upper 2 bits 01), the transparency component encoding result is decoded according to the second mode to obtain transparency component decoded data in which all pixel values are 255; in this case no data portion needs to be read either, and an 8×8 transparency data block with all values 255 is generated directly;
If the control header is 0x80-0xBF (upper 2 bits 10, binary 10000000-10111111), the data portion in the transparency component encoding result is decoded according to the third mode to obtain the transparency component decoded data. This process is the inverse of run-length encoding: the length of the data portion is read from the lower 6 bits of the control header, the run-length blocks are parsed in the format "1-byte run count + 1-byte pixel value", and the 8×8 pixel block is filled according to the run counts and pixel values to restore the original transparency data block;
If the control header is 0xFF (upper 2 bits 11; binary 11111111, the lower 6 bits 111111 being a fixed value), the original data of the data portion in the transparency component encoding result is decoded according to the fourth mode to obtain the transparency component decoded data; that is, the subsequent 64 bytes of original data are read and filled into the 8×8 pixel block in row-major order (left to right, top to bottom), and the result is used directly as the transparency component decoded data block.
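The decision flow of fig. 5 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation of the application: the function name `encode_alpha_block` is hypothetical, and the 64 pixels are assumed to be supplied in row-major order as integers in the range 0-255.

```python
from itertools import groupby


def encode_alpha_block(pixels):
    """Encode one 8x8 Alpha block (64 values, 0-255) into header + data."""
    if len(pixels) != 64:
        raise ValueError("an Alpha data block must contain 64 pixels")
    if all(p == 0 for p in pixels):
        return b"\x3f"            # first mode: header only
    if all(p == 255 for p in pixels):
        return b"\x7f"            # second mode: header only
    # Run-Level encode: one (run count, pixel value) byte pair per run.
    data = bytearray()
    for level, group in groupby(pixels):
        data += bytes((len(list(group)), level))
    if len(data) <= 64:
        # Third mode: upper bits 10, lower 6 bits hold (length - 1).
        return bytes([0x80 | (len(data) - 1)]) + bytes(data)
    # Fourth mode: header 0xFF followed by the 64 raw pixel bytes.
    return b"\xff" + bytes(pixels)


half_block = encode_alpha_block([0] * 32 + [128] * 32)
noisy_block = encode_alpha_block(list(range(64)))
```

In this sketch a block that is half transparent and half semitransparent shrinks from 64 bytes to 5, while a block in which every pixel differs falls back to the 65-byte uncompressed form.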
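The four decoding branches listed above can be sketched in Python as follows. The function name `decode_alpha_block`, the convention of returning the next read position, and the validation errors are assumptions added for illustration; the text itself does not specify how malformed input is handled.

```python
def decode_alpha_block(stream, pos=0):
    """Decode one Alpha block starting at stream[pos].

    Returns (pixels, next_pos), where pixels is a list of 64 values
    in row-major order.
    """
    header = stream[pos]
    mode, low6 = header >> 6, header & 0x3F
    pos += 1
    if mode == 0b00:              # first mode: fully transparent block
        return [0] * 64, pos
    if mode == 0b01:              # second mode: fully opaque block
        return [255] * 64, pos
    length = low6 + 1             # field value 0-63 means 1-64 bytes
    data = stream[pos:pos + length]
    if len(data) != length:
        raise ValueError("truncated data portion")
    pos += length
    if mode == 0b11:              # fourth mode: 64 raw pixel bytes
        if length != 64:
            raise ValueError("raw block must carry 64 bytes")
        return list(data), pos
    # Third mode: expand (run count, pixel value) pairs.
    if length % 2:
        raise ValueError("Run-Level data must come in byte pairs")
    pixels = []
    for i in range(0, length, 2):
        run, level = data[i], data[i + 1]
        pixels.extend([level] * run)
    if len(pixels) != 64:
        raise ValueError("runs do not add up to 64 pixels")
    return pixels, pos


decoded, next_pos = decode_alpha_block(bytes([0x83, 32, 0, 32, 128]))
```

Returning the next read position lets a caller walk through consecutive Alpha blocks without knowing their lengths in advance, since each block's length follows from its own control header.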
In a possible embodiment, as shown in fig. 6, the execution of step 150 may include, but is not limited to, step 610 and step 620.
Step 610, performing color space conversion on the decoded brightness component data block and the decoded chromaticity component data block to obtain RGB component data;
Step 620, combining the RGB component data with the decoded transparency component data block according to the pixel correspondence to obtain an output image in RGBA format.
In a possible embodiment, in step 610, the color space conversion converts the YUV color space commonly used in JPEG encoding (composed of luminance Y and two chrominance components Cb and Cr) into the RGB color space commonly used by display devices (composed of red R, green G and blue B). The conversion may be implemented by a preset mathematical formula (such as a conversion formula in the ITU-R BT.601 or ITU-R BT.709 standard). For example, a conversion formula conforming to the ITU-R BT.601 standard is R = Y + 1.402 × (Cr - 128), G = Y - 0.34414 × (Cb - 128) - 0.71414 × (Cr - 128), and B = Y + 1.772 × (Cb - 128). Through this formula the luminance and chrominance information is integrated into 8-bit RGB color information (R, G, B, each in the range 0-255) that can be used directly for display.
In a possible embodiment, in step 620, the RGB component data and the decoded transparency component data block are combined pixel by pixel to obtain an output image in RGBA format. The RGBA format adds an Alpha component to the RGB color space: each pixel carries four channels R, G, B and A, each with a value of 0-255. "Combining according to pixel correspondence" means matching the R, G, B values of each pixel in the RGB component data with the Alpha value of the pixel at the corresponding position in the transparency component data block. For example, if the pixel at position (1, 1) in the RGB component has R=255, G=0 and B=0 (red) and the Alpha value at the same position is 128, the RGBA value of the combined pixel is (255, 0, 0, 128), representing a semitransparent red pixel. Through this combination, the finally generated RGBA image accurately presents both the color information and the transparency effect.
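The conversion formula quoted above can be written as a short Python sketch. The function name `ycbcr_to_rgb` is hypothetical, and rounding and clamping the results to 0-255 is an assumption added here, since the formula can produce values outside the 8-bit range.

```python
def ycbcr_to_rgb(y, cb, cr):
    """Convert one 8-bit Y/Cb/Cr sample to an 8-bit (R, G, B) tuple."""
    def clamp(value):
        # Round to the nearest integer and keep the result within 0-255.
        return max(0, min(255, int(round(value))))

    r = y + 1.402 * (cr - 128)
    g = y - 0.34414 * (cb - 128) - 0.71414 * (cr - 128)
    b = y + 1.772 * (cb - 128)
    return clamp(r), clamp(g), clamp(b)


gray = ycbcr_to_rgb(128, 128, 128)
```

When Cb and Cr are both 128, all three chrominance terms vanish and R, G and B equal Y, so neutral grays pass through unchanged.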
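The pixel-wise combination can be sketched in Python as follows. The function name `merge_rgba` and the flat-list data layout are assumptions for illustration; both inputs are assumed to list the pixels in the same order.

```python
def merge_rgba(rgb_pixels, alpha_pixels):
    """Pair each (R, G, B) tuple with the Alpha value at the same index."""
    if len(rgb_pixels) != len(alpha_pixels):
        raise ValueError("RGB and Alpha data must cover the same pixels")
    return [(r, g, b, a) for (r, g, b), a in zip(rgb_pixels, alpha_pixels)]


# A red pixel with Alpha 128 becomes a semitransparent red RGBA pixel.
rgba = merge_rgba([(255, 0, 0)], [128])
```

The length check guards against a mismatch between the decoded color planes and the decoded transparency plane, which would otherwise be silently truncated by `zip`.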
Referring to fig. 7, fig. 7 is a decoding flowchart provided by an embodiment of the present application. The decoding process is a JPEG picture decoding process with an Alpha channel. After a JPEG picture to be decoded is received, the component number identifier (COMP_NUM) in the picture frame header is read first to judge whether the picture is in the YUVA format containing an Alpha channel; for example, if the identifier is 4, it is confirmed that the four components luminance (Y), chrominance (Cb, Cr) and transparency (Alpha) need to be processed. The compressed data of the Y, Cb and Cr components in the code stream are then processed in turn according to the JPEG standard flow: the quantized frequency-domain coefficients are parsed from the code stream by Huffman entropy decoding, the original frequency-domain coefficients are restored by inverse quantization (multiplying the coefficients by the quantization table values), and the frequency-domain coefficients are converted into spatial-domain pixel data by the inverse discrete cosine transform (IDCT), finally yielding the decoded data of the Y, Cb and Cr components. For the Alpha component encoding result in the code stream, dedicated adaptive decoding logic is executed: the 1-byte control header is read first, the encoding mode (all-0 block, all-255 block, run-length encoded block or uncompressed block) is identified from the upper 2 bits, and the data portion is parsed according to the corresponding mode. In the all-0/all-255 modes an 8×8 data block with the corresponding pixel value is generated directly, in the run-length mode the run information is parsed to restore the data, and in the uncompressed mode the 64 bytes of original data are read directly, finally yielding the decoded data of the Alpha component.
After the Y, Cb and Cr decoded data have been converted into RGB component data, the RGB component data and the Alpha component data are fused in one-to-one pixel correspondence: the R, G, B color values of each pixel are combined with the Alpha transparency value at the corresponding position, and an image in RGBA format is finally output to realize display with a transparency effect. The core advantages of this process are that it retains the high efficiency and strong hardware compatibility of traditional JPEG encoding while overcoming JPEG's lack of transparency by adding an Alpha channel. The hardware adaptation cost is also low: the existing JPEG decoding modules (such as the Huffman decoder and the inverse quantization/IDCT unit) can be reused, and only an Alpha decoding unit needs to be added. The process is therefore particularly suitable for scenes that require transparent display, such as graphical user interfaces and game maps.
In a possible embodiment, when digital publishing software processes interactive illustrations, the editing team usually needs to handle image-text material with transparent backgrounds (such as schematic drawings in technical journals and animated characters in children's picture books). The JPEG decoding process of the present application fits such a scene well, and the specific application is as follows:
After the software receives the illustration file to be decoded (obtained through the JPEG image encoding process described above), it first reads the component count identifier (COMP_NUM) in the picture frame header. When the identifier is 4, the file is judged to be in YUVA format, containing the four components luminance (Y), chrominance (Cb, Cr) and transparency (Alpha). For the main patterns in the illustration (such as the lines of a schematic and the characters), the software calls a traditional JPEG decoding module to process the Y, Cb and Cr components: the quantized frequency-domain coefficients are parsed from the code stream by Huffman entropy decoding, the original coefficients are restored by inverse quantization, and they are converted into spatial-domain pixel data by the inverse discrete cosine transform (IDCT). The existing JPEG hardware decoding unit is fully reused and no dedicated module needs to be developed, which reduces the adaptation cost for education-industry software (such as electronic teaching material production tools). For the transparent elements in the illustration (such as the semitransparent background of an annotation box and the transparent cloak of a character), the software starts the dedicated Alpha component decoding logic: after reading the 1-byte control header, it identifies the encoding mode from the upper 2 bits. For example, the annotation box area is a "run-length encoded block" (suited to a gradual transparency effect) and the character's cloak is an "uncompressed block" (preserving fine transparent textures). The data is then parsed according to the corresponding mode to quickly generate the 8×8 transparency data blocks.
After decoding is completed, the software first converts the Y, Cb and Cr components into RGB color data and then fuses the RGB data with the Alpha component pixel by pixel. The R, G, B values of the annotation box are combined with the gradual Alpha values, so that the annotation content is highlighted without blocking the background images and text, and the color values of the character's cloak are combined with the fine Alpha texture to present a semitransparent look. The final output illustration in RGBA format can be used for cross-device typesetting.
The embodiment of the application also discloses an electronic device which comprises at least one processor, at least one memory and computer program instructions stored in the memory, wherein the computer program instructions, when executed by the processor, implement the JPEG image encoding and decoding method described above.
The embodiment of the application also discloses a computer readable storage medium in which a processor-executable program is stored, and the program, when executed by a processor, is used to perform the JPEG image encoding and decoding method described above.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the application. Thus, the present application is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
Claims (10)
1. A JPEG image encoding and decoding method, comprising:
acquiring an image to be processed, wherein the image comprises a luminance component, a chrominance component and a transparency component;
dividing the image into a plurality of data block groups, each of the data block groups comprising a luminance component data block, a chrominance component data block, and a transparency component data block;
encoding the luminance component data block, the chrominance component data block and the transparency component data block in each data block group to obtain a target code stream;
parsing the target code stream to obtain decoded data corresponding to the luminance component data block, the chrominance component data block and the transparency component data block;
and obtaining an output image according to the decoded data of each component data block.
2. The JPEG image codec method according to claim 1, wherein said encoding of the luminance component data block, the chrominance component data block, and the transparency component data block in each of said data block groups, results in a target bitstream, comprising:
respectively encoding the brightness component data block and the chrominance component data block in each data block group according to a JPEG standard format to obtain a corresponding brightness component encoding result and a corresponding chrominance component encoding result;
coding the transparency component data blocks in each data block group according to a preset coding mode to obtain a corresponding transparency component coding result;
and splicing the brightness component coding result, the chrominance component coding result and the transparency component coding result of the same data block group, and obtaining a target code stream according to the splicing results of the plurality of data block groups.
3. The JPEG image encoding and decoding method according to claim 2, wherein said encoding the transparency component data blocks in each of said data block groups according to a preset encoding mode, to obtain a corresponding transparency component encoding result, comprises:
Determining a target coding mode of the transparency component data block from a plurality of preset coding modes according to the data characteristics of the transparency component data block;
and obtaining a transparency component coding result comprising a control header and a data part based on the target coding mode, wherein the control header is used for identifying the target coding mode and the length of the data part.
4. A JPEG image codec method according to claim 3, wherein said determining a target coding mode of said transparency component data block from a preset plurality of coding modes based on data characteristics of said transparency component data block comprises:
If all pixel values of the transparency component data block are 0, selecting a first mode from the plurality of coding modes as a target coding mode;
wherein when the target coding mode is the first mode, the control header in the transparency component coding result is 0x3f, and the length of the data portion is 0.
5. A JPEG image codec method according to claim 3, wherein said determining a target coding mode of said transparency component data block from a preset plurality of coding modes based on data characteristics of said transparency component data block comprises:
if all pixel values of the transparency component data block are 255, selecting a second mode from the plurality of coding modes as a target coding mode;
Wherein when the target coding mode is the second mode, the control header in the transparency component coding result is 0x7f, and the length of the data portion is 0.
6. A JPEG image codec method according to claim 3, wherein said determining a target coding mode of said transparency component data block from a preset plurality of coding modes based on data characteristics of said transparency component data block comprises:
if the transparency component data block comprises a non-0 pixel value and a non-255 pixel value, executing run-length coding on the transparency component data block to obtain a run-length block;
if the total length of the run-length block is smaller than or equal to the preset byte length, selecting a third mode from the multiple coding modes as a target coding mode, wherein when the target coding mode is the third mode, the data part of the transparency component coding result is the run-length block;
And if the total length of the run-length block is larger than the preset byte length, selecting a fourth mode from the multiple coding modes as a target coding mode, wherein when the target coding mode is the fourth mode, the data part of the transparency component coding result is the original pixel data of the image.
7. The JPEG image codec method according to claim 2, wherein said parsing said target bitstream to obtain decoded data corresponding to said luminance component data block, said chrominance component data block, and said transparency component data block, comprises:
decoding a luminance component encoding result and a chrominance component encoding result in the target code stream in a JPEG standard format to obtain luminance component decoded data and chrominance component decoded data;
reading a control header of the transparency component encoding result in the target code stream:
if the control header is 0x3f, decoding the transparency component encoding result according to a first mode to obtain transparency component decoded data with all pixel values of 0;
if the control header is 0x7f, decoding the transparency component encoding result according to a second mode to obtain transparency component decoded data with all pixel values of 255;
if the control header identifies a third mode, decoding a data portion in the transparency component encoding result according to the third mode to obtain transparency component decoded data;
and if the control header identifies a fourth mode, decoding the original data of the data portion in the transparency component encoding result according to the fourth mode to obtain transparency component decoded data.
8. The JPEG image encoding and decoding method according to claim 1, wherein said obtaining an output image from said decoded data of each component data block comprises:
performing color space conversion on the decoded brightness component data block and the decoded chromaticity component data block to obtain RGB component data;
And correspondingly combining the RGB component data with the decoded transparency component data block according to pixels to obtain an output image in an RGBA format.
9. An electronic device comprising at least one processor, at least one memory, and computer program instructions stored in the memory, which when executed by the processor, implement the method of any of claims 1-8.
10. A computer readable storage medium having stored thereon computer program instructions, which when executed by a processor, implement the method of any of claims 1-8.
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202511668161.2A CN121120810A (en) | 2025-11-14 | 2025-11-14 | JPEG image coding and decoding method, device and medium |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| CN121120810A (en) | 2025-12-12 |
Family
ID=97950433
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202511668161.2A (Pending; published as CN121120810A) | 2025-11-14 | 2025-11-14 | JPEG image coding and decoding method, device and medium |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN121120810A (en) |
Citations (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN101742317A (en) * | 2009-12-31 | 2010-06-16 | 北京中科大洋科技发展股份有限公司 | Video compressing and encoding method with alpha transparent channel |
| CN110089114A (en) * | 2016-10-10 | 2019-08-02 | 三星电子株式会社 | For carrying out coding or decoded method and apparatus to luminance block and chrominance block |
| CN110636297A (en) * | 2018-06-21 | 2019-12-31 | 北京字节跳动网络技术有限公司 | Component-dependent subblock partitioning |
| CN112995664A (en) * | 2021-04-20 | 2021-06-18 | 南京美乐威电子科技有限公司 | Image sampling format conversion method, computer-readable storage medium, and encoder |
| WO2023070281A1 (en) * | 2021-10-25 | 2023-05-04 | Beijing Xiaomi Mobile Software Co., Ltd. | Decoding method, encoding method, decoding device, encoding device and display device for displaying an image on a transparent screen |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |