ENCODING AND DECODING OF VIDEO IMAGE
FIELD
[0001] The invention relates to a method, an apparatus, a computer program and computer memory means for encoding video image formed of consecutive still images. The invention also relates to a method, an apparatus, a computer program and computer memory means for decoding video image formed of consecutive still images.
BACKGROUND
[0002] Video image is encoded and decoded in order to reduce the amount of data so that the video image can be stored more efficiently in memory means or transferred using a telecommunication connection. An example of a video coding standard is MPEG-4 (Moving Pictures Expert Group), where the idea is to send video image in real time on a wireless channel. This is a very ambitious aim, as if the image to be sent is for example of cif size (288 x 352 pixels) and the transmission frequency is 15 images per second, then 36.5 million bits should be packed into 64 kilobits each second. The packing ratio would in such a case be extremely high, 570:1.
[0003] In order to transfer an image, the image is typically divided into image blocks, the size of which is selected to be suitable with the system. The image block information generally comprises information about the brightness, colour and location of an image block in the image itself. The data in the image blocks is compressed block-by-block using a desired coding method. Compression is based on deleting the less significant data. The compression methods are mainly divided into three different categories: spectral redundancy reduction, spatial redundancy reduction and temporal redundancy reduction. Typically various combinations of these methods are employed for compression.
[0004] In order to reduce spectral redundancy, a YUV colour model is for instance applied. The YUV model takes advantage of the fact that the human eye is more sensitive to the variation in luminance, or brightness, than to the changes in chrominance, or colour. The YUV model comprises one luminance component (Y) and two chrominance components (U, V). The chrominance components can also be referred to as cb and cr components. For example, the size of a luminance block according to the video coding standard H.263 is 16 x 16 pixels, and the size of each chrominance block is 8 x 8 pixels, together covering the same area as the luminance block. In this standard the combination of one luminance block and two chrominance blocks is referred to as a macro block. The macro blocks are generally read from the image line-by-line. Each pixel in both the luminance and chrominance blocks may obtain a value ranging between 0 and 255, meaning that eight bits are required to present one pixel. For example, value 0 of the luminance pixel refers to black and value 255 refers to white.
[0005] What is used to reduce spatial redundancy is for example the discrete cosine transform DCT. In discrete cosine transform, the pixel presentation in the image block is transformed to a spatial frequency presentation. Furthermore, only the signal frequencies that are present in the image block have high-amplitude coefficients, and the coefficients of the signals that are not present in the image block are close to zero. The discrete cosine transform is basically a lossless transform, and interference is caused to the signal only in quantization.
[0006] Temporal redundancy tends to be reduced by taking advantage of the fact that consecutive images generally resemble one another, and therefore instead of compressing each individual image, the motion data in the image blocks is generated. The basic principle is the following: a previously encoded reference block that is as good as possible is searched for the image block to be encoded, the motion between the reference block and the image block to be encoded is modelled and the motion vector coefficients are sent to the receiver. The difference between the block to be encoded and the reference block is indicated as a prediction error component, or prediction error frame. A reference picture, or reference frame, previously stored in the memory can be used in motion vector prediction of the image block. Such a coding is referred to as inter-coding, which means utilizing the similarities between the images in the same image string.
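By way of a non-limiting illustration of the motion modelling described above, a full search of the reference picture over a small window, minimizing the sum of absolute differences (SAD), can be sketched as follows. The function names, the SAD criterion and the search radius are merely illustrative choices and are not prescribed by the text:

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(a - b) for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))

def full_search(ref, cur_block, top, left, radius=2):
    """Find the motion vector (dy, dx) within +-radius that minimises the
    SAD between cur_block and a candidate block in the reference frame ref;
    (top, left) is the position of the block to be encoded."""
    n = len(cur_block)
    best = None
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            y, x = top + dy, left + dx
            # Skip candidates that fall outside the reference frame.
            if y < 0 or x < 0 or y + n > len(ref) or x + n > len(ref[0]):
                continue
            cand = [row[x:x + n] for row in ref[y:y + n]]
            cost = sad(cand, cur_block)
            if best is None or cost < best[0]:
                best = (cost, (dy, dx))
    return best[1]
```

Only the motion vector (dy, dx) and the prediction error would then need to be transmitted, rather than the block itself.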
[0007] Since the present invention focuses on reducing spatial redundancy, a closer look is next taken thereupon. Discrete cosine transform is performed to the macro block using the formula:

F(u,v) = (2/N) C(u) C(v) Σ(x=0 to N-1) Σ(y=0 to N-1) f(x,y) cos((2x+1)uπ/(2N)) cos((2y+1)vπ/(2N))   (1)

where x and y are the coordinates of the original block, u and v are the coordinates of the transform block, N = 8 and

C(u), C(v) = 1/√2, when u, v = 0; 1, otherwise.   (2)
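By way of a non-limiting illustration, formulas (1) and (2) can be sketched directly in code; the function name is merely illustrative:

```python
import math

def dct2(block):
    """2-D discrete cosine transform of an N x N block per formulas (1)-(2).
    The (0, 0) element of the result is the dc coefficient; the remaining
    elements are the ac coefficients."""
    n = len(block)
    c = lambda u: 1 / math.sqrt(2) if u == 0 else 1.0
    out = []
    for u in range(n):
        row = []
        for v in range(n):
            s = sum(block[x][y]
                    * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                    * math.cos((2 * y + 1) * v * math.pi / (2 * n))
                    for x in range(n) for y in range(n))
            row.append(2 / n * c(u) * c(v) * s)
        out.append(row)
    return out
```

With this normalization, a flat 8 x 8 block of value 115 (as in Table 2) yields a single dc coefficient of 8 · 115 = 920 and ac coefficients of zero; practical encoders may apply slightly different scaling and rounding, which may account for the value 924 shown in Table 2.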
[0008] Next, table 1 shows an example of how an 8 x 8 pixel block is transformed using the discrete cosine transform. The upper part of the table shows the non-transformed pixels, and the lower part of the table shows the result after the discrete cosine transform has been carried out, where the first element of value 1303, what is known as a dc coefficient, depicts the mean size of the pixels in the block, and the remaining 63 elements, what are known as ac coefficients, illustrate the spread of the pixels in the block.
[0009] As the values of the pixels in Table 1 show, they are widely spread. Consequently the result obtained after the discrete cosine transform also includes plenty of ac coefficients of different sizes.
Table 1
[0010] Table 2 illustrates a block in which the spread between pixels is small. As table 2 shows, the ac coefficients receive the same values, meaning that the block is compressed very efficiently.
115 115 115 115 115 115 115 115
115 115 115 115 115 115 115 115
115 115 115 115 115 115 115 115
115 115 115 115 115 115 115 115
115 115 115 115 115 115 115 115
115 115 115 115 115 115 115 115
115 115 115 115 115 115 115 115
115 115 115 115 115 115 115 115
924 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
Table 2
[0011] Next, the discrete cosine transformed block is "quantized", i.e. each element therein is basically divided by a constant. This constant may vary between different macro blocks. In addition, a higher divider is generally used for ac coefficients than for dc coefficients. The "quantization parameter", from which said dividers are calculated, ranges between 1 and 31. The more zeroes are obtained to the block, the better the block is packed, since zeroes are not sent to the channel. Coding methods that are irrelevant to the present invention can also be performed to the quantized blocks and finally a bit stream can be formed thereof that is sent to a decoder. An inverse quantization and an inverse discrete cosine transform are performed to the quantized blocks within the encoder, thus forming a reference image, from which the
blocks of the following images can be predicted. Hereafter the encoder thus sends the difference data between the following block and the reference blocks. Consequently the packing efficiency improves.
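By way of a non-limiting illustration, the quantization and inverse quantization described above can be sketched as follows. A single constant divider is assumed here for all elements of the block, which simplifies the dc/ac distinction made above; the function names are merely illustrative:

```python
def quantize(coeffs, divider):
    """Quantize transform coefficients by dividing by a constant divider.
    Coefficients smaller than about half the divider collapse to zero and,
    since zeroes are not sent to the channel, are lost for good."""
    return [[round(c / divider) for c in row] for row in coeffs]

def dequantize(levels, divider):
    """Inverse quantization: multiply the received levels by the divider.
    Values that descended to zero in quantization cannot be restored."""
    return [[lvl * divider for lvl in row] for row in levels]
```

For example, with divider 17 a dc coefficient of 1303 survives (as 1309 after the round trip), while an ac coefficient of 8 descends to zero and cannot be restored, illustrating the loss discussed in paragraph [0014].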
[0012] Quantization is a problem in video coding, as the higher the quantization used, the more information disappears from the image and the final result is unpleasant to watch.
[0013] After decoding the bit stream and performing the decoding methods unessential for the present invention, a decoder basically carries out the same measures as the encoder when generating a reference image, meaning that similar steps are performed to the blocks as in the encoder but inversely.
[0014] Finally the assembled video image is applied onto a display, and the final result depends to a great extent on the quantization parameter used. If an element in the block descends to zero during quantization, it can no longer be restored in inverse quantization.
BRIEF DESCRIPTION
[0015] It is an object of the invention to provide an improved method for encoding video image, an improved apparatus for encoding video image, an improved computer program for encoding video image and improved computer memory means for encoding video image. It is a further object of the invention to provide an improved method for decoding video image, an improved apparatus for decoding video image, an improved computer program for decoding video image and improved computer memory means for decoding video image.
[0016] As an aspect of the invention a method according to claim 1 is provided for encoding video image. A method for decoding video image according to claim 6 is provided as another aspect of the invention. A computer program according to claim 16 is also provided as an aspect of the invention. As a further aspect of the invention, computer memory means according to claim 17 are provided. As a still further aspect of the invention an apparatus according to claim 18 is provided for encoding video image. An apparatus for decoding video image according to claim 23 is also provided as an aspect of the invention.
[0017] Further preferred embodiments of the invention are disclosed in the dependent claims.
[0018] The invention is based on the idea that before discrete cosine transform, pre-processing is carried out in an encoder for a block. The pre-processing can be removed using inverse pre-processing in a decoder. On account of the pre-processing the packing efficiency increases, as pre-processing reduces the noise/deviation found in the image. As described above in connection with tables 1 and 2, the efficiency of discrete cosine transform increases while the spread of the blocks decreases. Consequently the quantization parameter used can be reduced, and in spite thereof the data transmission capacity required of the channel remains approximately the same. In practice, this means that higher-quality video image can be stored/transferred by means of the pre-processing and inverse pre-processing according to the invention than without the invention at the same data transmission rate.
BRIEF DESCRIPTION OF THE DRAWINGS
[0019] The preferred embodiments of the invention are described by way of example below with reference to the accompanying drawings, in which:
Figure 1 shows prior art apparatuses for encoding and decoding video image,
Figure 2 shows new types of apparatuses for encoding and decoding video image,
Figure 3 shows the pre-processing used for encoding video image,
Figure 4 shows the inverse pre-processing used for decoding video image,
Figure 5 shows prior art pixels at the interfaces between the apparatus parts illustrated in Figure 1,
Figure 6 shows new types of pixels at the interfaces between the apparatus parts illustrated in Figure 2,
Figure 7 is a flow chart illustrating a method for encoding video image formed of consecutive still images,
Figure 8 is a flow chart illustrating a method for decoding video image formed of consecutive still images, and
Figure 9 illustrates how pre-processing affects the pixels.
DESCRIPTION OF EMBODIMENTS
[0020] With reference to Figure 1, apparatuses for encoding and decoding video image are described. The face of a person 100 is filmed using a video camera 102. The camera 102 produces video image of individual consecutive still images, whereof one still image 104 is shown in the Figure. The camera 102 forms a matrix describing the image 104 as pixels, for example as described above, where both luminance and chrominance are provided with specific matrixes. A data flow 106 depicting the image 104 as pixels is next applied to an encoder 108. It is naturally also possible to provide such an apparatus, in which the data flow 106 is applied to the encoder 108, for instance along a data transmission connection or for example from computer memory means. In such a case, the idea is to compress un-compressed video image 106 using the encoder 108 for instance in order to be forwarded or stored.
[0021] Since our interest in the apparatuses concerned lies in the compression carried out to reduce spatial redundancy, only the essential parts of the encoder 108 and a decoder 120 are described. The operation of other parts is apparent to those skilled in the art on the basis of standards and textbooks, for instance the works incorporated herein by reference:
[0022] - ISO/IEC JTC 1/SC 29/WG 11 : "Generic coding of audiovisual objects - Part 2: Visual", pages 178, 179, 281.
[0023] - Vasudev Bhaskaran and Konstantinos Konstantinides: "Image and Video Compression Standards - Algorithms and Architectures, Second Edition", Kluwer Academic Publishers 1997, chapter 6: "The MPEG video standards".
[0024] The encoder 108 comprises discrete cosine transform means 110 for performing discrete cosine transform as described above for the pixels in each still image 104. A data flow 112 formed using discrete cosine transform is applied to quantization means 114 that carry out quantization using a selected quantization ratio. Other types of coding can also be performed to a quantized data flow 116 that are not further described in this context. The compressed video image formed by means of the encoder 108 is transferred over a channel 118 to the decoder 120. How the channel 118 is implemented is not described herein, since the different implementation alternatives are apparent for those skilled in the art. The channel 118 may for instance be a fixed or wireless data transmission connection. The channel 118 can also be interpreted as a transmission path, by means of which the video image is stored in memory means, for example on a laser disc, and by means of which the video image is read from the memory means and processed using the decoder 120.
[0025] The decoder 120 comprises inverse quantization means 122, which are used to decode the quantization performed in the encoder 108. The inverse quantization is unfortunately unable to restore the element of the block, the value of which descends to zero in quantization.
[0026] An inverse quantized data flow 124 is next applied to inverse discrete cosine transform means 126, which carry out inverse discrete cosine transform to the pixels in each still image 104. A data flow 128 obtained is then applied through other possible decoding processes onto a display 130, which shows the video image formed of still images 104.
[0027] The encoder 108 and decoder 120 can be placed into different apparatuses, such as computers, subscriber terminals of various radio systems like mobile stations, or into other apparatuses where video image is to be processed. The encoder 108 and the decoder 120 can also be connected to the same apparatus, which can in such a case be referred to as a video codec.
[0028] Figure 5 describes prior art pixels at the interfaces 106, 112,
116, 124 and 128 between the apparatus parts shown in Figure 1. The test image used is the first 8 x 8 luminance block in the first image of the test sequence "calendar_qcif.yuv" known to those skilled in the art. The interface 106 shows the contents of the data flow after the camera 102. The interface 112 depicts the contents of the data flow after the discrete cosine transform means 110. The interface 116 shows the contents of the data flow after the quantization means 114. The quantization ratio used is 17.
[0029] For the sake of simplicity, other known coding methods are not used, meaning that the data flow of the interface 116 is transferred along the channel 118 to the decoder 120. The interface 124 describes the contents of the data flow after the inverse quantization means 122. As Figure 5 shows, when the original data flow 112 before quantization is compared with the reconstructed data flow 124 after the inverse quantization, the ac component values, which have descended to zero and which are represented at the interface 116 as a result of the quantization, can no longer be restored. In practice this means that the original image 106 before decoding and the image reconstructed using the inverse discrete cosine transform means 126 described at the interface 128 no longer correspond with one another. Noise that degrades the quality of the image has appeared on the reconstructed image.
[0030] Figure 2 shows new types of apparatuses for encoding and decoding video image. Since the apparatuses in Figure 2 are based on the prior art apparatuses shown in Figure 1, only the parts that differ from the apparatuses explained in Figure 1 are described below.
[0031] In the encoder 108 discrete cosine transform means 110 communicate with pre-processing means 200 that modify the value of each pixel in the still image 104 before discrete cosine transform is carried out. The pre-processing means 200 comprise three different sets of means, which are described in Figure 3.
[0032] The first means is used to form a pixel value change between two consecutive pixel values. A person skilled in the art may select an appropriate and efficient way that conforms to the conditions for implementing this mathematical operation. In an embodiment, the pre-processing means 200 form the pixel value change by subtracting the previous pixel value from the posterior pixel value.
[0033] The second means is used to form a weighted change using the pixel value change and a predetermined weighting coefficient. A person skilled in the art may select an appropriate and efficient way that conforms to the conditions for implementing this mathematical operation. In an embodiment, the pre-processing means 200 form a weighted change by dividing the pixel value change by the weighting coefficient.
[0034] The third means is used to set the modified previous pixel value as the posterior pixel value using the weighted change. A person skilled in the art may select an appropriate and efficient way that conforms to the conditions for implementing this mathematical operation. In an embodiment, the pre-processing means 200 form the modified previous pixel value using the weighted change by adding the weighted change to the previous pixel value.
[0035] In an embodiment, the pre-processing means 200 perform the pre-processing using the following formula or a formula that is mathematically equivalent therewith:

yi = yi-1 + ((xi - yi-1) / k)   (3)

where yi is the pre-processed posterior pixel, yi-1 is the previous pixel, xi is the un-pre-processed posterior pixel, and k is the weighting coefficient.
[0036] The weighting coefficient k is preferably a number that is larger than one. In a preferred embodiment the weighting coefficient is two, in which case formula 3 obtains the following form

yi = yi-1 + ((xi - yi-1) / 2)   (4)

[0037] A mathematically equivalent (providing the same result) way for carrying out pre-processing is to employ a sliding weighted mean for calculation. The formula for the weighted mean is the following

yi = (a1 · yi-1 + a2 · xi) / M   (5)
[0038] In order that the posterior pixel value remains between the previous pixel value and the un-pre-processed posterior pixel value,

Σ(i=1 to M) ai = M

holds true for the weighting coefficients in the weighted mean formula, and since M is two, then a1 + a2 = 2.
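By way of a non-limiting illustration, the pre-processing of formula (3), in which each output pixel is the previous output plus the input difference divided by the weighting coefficient k, can be sketched as follows; the function name is merely illustrative, and the first pixel is passed through unchanged as in block 706 of Figure 7:

```python
def preprocess(pixels, k=2):
    """Pre-process one line of pixels per formula (3): the first pixel is
    passed through, and each later output moves from the previous output
    towards the current input by the fraction 1/k."""
    out = [pixels[0]]          # y1 = x1, as in block 706
    for x in pixels[1:]:
        out.append(out[-1] + (x - out[-1]) / k)   # yi = yi-1 + (xi - yi-1)/k
    return out
```

For example, with k = 2 the input line 100, 120, 80 becomes 100, 110, 95: the variation of the pixel values is smoothened, as illustrated in Figure 9, which is what lowers the spread seen by the discrete cosine transform.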
[0039] In the decoder 120, inverse discrete cosine transform means 126 communicate with inverse pre-processing means 214 that remove the pre-processing of each pixel in the still image 104 after the inverse discrete cosine transform has been carried out. The inverse pre-processing means 214 comprise three sets of means, which are described in Figure 4.
[0040] The first means is used to form a pixel value change between two consecutive pixel values. A person skilled in the art may select an appropriate and efficient way that conforms to the conditions for implementing this mathematical operation. In an embodiment, the inverse pre-processing means 214 form the pixel value change by subtracting the previous pixel value from the posterior pixel value.
[0041] The second means is used to form a weighted change using the pixel value change and in video encoding a predetermined weighting coefficient used to form the weighted change. A person skilled in the art may select an appropriate and efficient way that conforms to the conditions for implementing this mathematical operation. In an embodiment, the inverse pre-processing means 214 form the weighted change by multiplying the pixel value change using a weighting coefficient reduced by one.
[0042] The third means is used to set the modified posterior pixel value as the posterior pixel value using the weighted change. A person skilled in the art may select an appropriate and efficient way that conforms to the conditions for implementing this mathematical operation. In an embodiment, the inverse pre-processing means 214 form the modified posterior pixel value using the weighted change by adding the weighted change to the posterior pixel value.
[0043] In an embodiment the inverse pre-processing means 214 perform the inverse pre-processing using the following formula or a formula that is mathematically equivalent therewith:

xi = yi + ((yi - yi-1) · (k - 1))   (6)

where xi is the un-pre-processed posterior pixel, yi is the pre-processed posterior pixel, yi-1 is the previous pixel, and k is the weighting coefficient.
[0044] The weighting coefficient k is preferably a number that is larger than one. In a preferred embodiment the weighting coefficient is two, in which case formula 6 obtains the following form

xi = yi + (yi - yi-1)   (7)
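By way of a non-limiting illustration, the inverse pre-processing of formula (6), in which each original pixel is recovered from two consecutive pre-processed pixels, can be sketched as follows; the function name is merely illustrative, and the round trip is exact only in exact arithmetic, i.e. before the rounding and quantization of the intermediate coding steps:

```python
def inverse_preprocess(processed, k=2):
    """Remove the pre-processing per formula (6): the first pixel is passed
    through, and each later input pixel is recovered as
    xi = yi + (yi - yi-1) * (k - 1)."""
    out = [processed[0]]       # x1 = y1, as in block 808
    prev_y = processed[0]
    for y in processed[1:]:
        out.append(y + (y - prev_y) * (k - 1))
        prev_y = y
    return out
```

For example, with k = 2 the pre-processed line 100, 110, 95 is restored to the original 100, 120, 80, the mirror image of the pre-processing example given for formula (3).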
[0045] As described above, the pre-processing or inverse preprocessing can be performed to the luminance data and the chrominance data of the still image.
[0046] The pre-processing means 200 or the inverse pre-processing means 214 process still image 104 line-by-line, column-by-column, macro block-by-macro block, block-by-block or in accordance with another predetermined non-random way. What is important is that both the encoder 108 and the decoder 120 are familiar with the processing way employed. In principle the pre-processing can also be carried out twice, for example first line-by-line and then column-by-column.
[0047] The pre-processing means 200, the discrete cosine transform means 110, the quantization means 114, the inverse quantization means 122, the inverse discrete cosine transform means 126 and the inverse pre-processing means 214 can be implemented as a computer program operating in the processor, whereby for instance each required operation is implemented as a specific program module. The computer program thus comprises the routines for implementing the steps of the methods that will be described below. In order to promote the sales of the computer program, it can be stored into the computer memory means, such as a CD-ROM (Compact Disc Read Only Memory). The computer program can be designed so as to operate also in a standard general-purpose personal computer, in a portable computer, in a computer network server or in another prior art computer.
[0048] Said means 200, 110, 114, 122, 126 and 214 can also be implemented as an equipment solution, for example as one or more application-specific integrated circuits (ASIC) or as operation logic composed of discrete components. When selecting the way to implement the means, a person skilled in the art observes for example the required processing power and the manufacturing costs. Different hybrid implementations formed of software and equipment are also possible.
[0049] Figure 6 describes new types of pixels at the interfaces 202, 204, 206, 210, 212, 216 between the apparatus parts shown in Figure 2. The test image used is the same luminance block as in Figure 5, i.e. the first 8 x 8 luminance block in the first image of the test sequence "calendar_qcif.yuv". In the apparatus according to Figure 2, the contents of the data flow 106 after the camera 102 is the same as in the apparatus according to Figure 1 , and the data flow 106 is therefore not described again in Figure 6. The interface 202 depicts the contents of the data flow after the pre-processing means 200. The interface 204 shows the contents of the data flow after the discrete cosine transform means 110. The interface 206 depicts the contents of the data flow after the quantization means 114. The quantization ratio used is 11.
[0050] For the sake of simplicity, other known coding methods are not used, meaning that the data flow at the interface 206 is transferred along the channel 208 to the decoder 120. The interface 210 describes the contents of the data flow after the inverse quantization means 122. The interface 212 depicts the contents of the data flow after the inverse discrete cosine transform means 126. The interface 216 shows the data flow after the inverse pre-processing means 214.
[0051] In Figure 5, the sum of absolute values of the differences in each pixel between block 106 in the original image arriving from the camera and the reconstructed block 128 is calculated to be 677. This figure depicts the noise arriving at the block during encoding and decoding. Correspondingly in Figure 6, the sum of absolute values of the differences in each pixel between block 106 in the original image arriving from the camera and block 216 reconstructed in the decoder is calculated to be 596. When comparing the original block 106 with the reconstructed block 128 and the original block 106 with the reconstructed block 216 produced using pre-processing, it is observed that the form of the block remains better when pre-processing is used. Thus, while using the new encoder 108 and decoder 120 according to Figure 2, the quality of the reconstructed image is better, as a lower quantization ratio was used, while the required data transmission capacity of the channel 118, 208 remains approximately the same, since the packing efficiency remains nearly the same. In Figure 5 the packing ratio is approximately 1:25 and in Figure 6 the packing ratio is 1:23.
[0052] Figure 9 illustrates how pre-processing affects the pixels. The pixels selected are the eight original pixels on the first row at the interface 106 in Figure 5 and the eight pre-processed pixels on the first row at the interface 202 in Figure 6. The pixel's serial number 1 to 8 is shown on the horizontal axis and the vertical axis shows the pixel value between 0 and 120. The unbroken line depicts the original pixels and the dashed line the pre-processed pixels. As Figure 9 shows, the pre-processing smoothens the variation of the pixel values.
[0053] The applicant has performed tests, in which a peak signal-to-noise ratio (PSNR) estimator is used for the visual comparison of the image. The PSNR is calculated using the following formula:

psnr = 10 · log10( 255² / ( (1/(M·N)) · Σx Σy (f(x,y) - r(x,y))² ) )   (8)

where M x N is the image size, f(x,y) is the pixel in the original image and r(x,y) is the pixel in the image to be compared. In the comparison, the PSNR is calculated from the mean of the luminance and chrominance PSNRs.
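By way of a non-limiting illustration, the PSNR of formula (8) for 8-bit images can be sketched as follows; the function name is merely illustrative:

```python
import math

def psnr(original, reconstructed):
    """Peak signal-to-noise ratio per formula (8) for 8-bit images:
    10 * log10(255^2 / mse), where mse is the mean squared error
    between the original and the reconstructed image."""
    m = len(original)
    n = len(original[0])
    mse = sum((f - r) ** 2
              for row_f, row_r in zip(original, reconstructed)
              for f, r in zip(row_f, row_r)) / (m * n)
    # Identical images have zero error; the PSNR is then unbounded.
    return float('inf') if mse == 0 else 10 * math.log10(255 ** 2 / mse)
```

For example, an image whose every pixel deviates by 5 from the original has a mean squared error of 25 and hence a PSNR of 10 · log10(65025/25) ≈ 34.15 dB.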
[0054] When comparing the packing efficiency, the size of the packed video image is simply used. In pre-processing, two is used as the weighting coefficient as well as rounding to the nearest integer number.
[0055] The abbreviations used in the table are as follows:
- Stream name: file name
- Pre enc: use of pre-processing in the encoder ("+" stands for used, "-" for not used)
- Pre dec: use of pre-processing in the decoder
- Size: the size of a packed video image in bytes
- QP: the quantization parameter employed
- Psnr: the PSNR of the decoded video compared with the PSNR of the original video
- Size rate: packing ratio (Size un-processed / Size processed)
- Psnr rate: PSNR ratio (PSNR un-processed / PSNR processed).
[0056] Let us first take a closer look at the sequence "calendar_qcif.yuv", from which the first 20 images are processed.

Table 3

[0057] When quantization parameter 1 and pre-processing are used only in the encoder, which corresponds nicely with the conventional pre-processing, the packing efficiency improves by 40%, but the quality of the image decreases significantly (58%). If the decoding of the pre-processing is added to the decoder, then the quality of the image deteriorates only by 17%.
Table 4

[0058] When the quantization is increased to fifteen, it is noted that the efficiency of the pre-processing to be carried out in the encoder more than doubles, whereas the pre-processing performed in the decoder decreases in efficiency compared with the data run by means of a lower quantization. The size of the test data run by means of quantization parameter 10 is nearly unprocessed; the efficiency increases about 15 per cent, while the quality of the image remains almost unchanged.
Table 5
[0059] When extremely large quantization is used, it can be noticed that the efficiency of the pre-processing decreases. Now the same image quality is obtained by means of a packing efficiency that is about 3% better.
[0060] In the following, with reference to the flow chart in Figure 7, a method for encoding video image formed of consecutive still images is described. The method starts in block 700, in which the encoder starts to receive video image to be encoded. In block 702, the process starts processing the first un-processed part in the first image.
[0061] In block 704, one is set as the value of the counter i.
[0062] In block 706, the un-processed value (x1) is set as the pre-processed value (y1) of the first pixel.
[0063] In block 708, the value of the counter i is increased by one.
[0064] Then in block 710, the process starts performing the actual pre-processing, i.e. pre-processing is repeated for each pixel in the still image at a time before carrying out discrete cosine transform, said process comprising:
- forming a pixel value change between two consecutive pixel values,
- forming a weighted change using the pixel value change and a predetermined weighting coefficient, and
- setting the modified previous pixel value as the posterior pixel value using the weighted change.
[0065] The pre-processing can be carried out for example using formula 3 described above in block 710.
[0066] In block 712, it is tested whether the part of the image being processed is already finished, or in this example it is tested whether the value of the counter i exceeds the serial number of the last pixel in the image part when the value is increased by one. For instance, if a cif-sized image is processed line-by-line, then tests are carried out to know whether i+1>352. If the last pixel in the image part has not yet been processed, then the process proceeds as arrow 714 shows back to block 708, from where the process proceeds by increasing the value of the counter i by one. If the last pixel in the image part is processed in block 710, then the process proceeds in accordance with arrow 716 to block 718, where tests are carried out in order to know whether the entire image has been processed.
[0067] If all the image parts have not yet been processed, then the process proceeds as arrow 720 shows to block 702, from where the process proceeds by starting to process the following un-processed part of the image.
[0068] If all the parts in the image are processed, then the process proceeds from block 718 in accordance with arrow 722 to block 724, where discrete cosine transform is performed to the non-transformed pixels in the still image. The discrete cosine transform can be carried out in block 724 for the whole image at a time, or in accordance with the dashed block 724 for each pre-processed part of the image at a time before the following part in the image is started to be processed in block 702.
[0069] Then in block 726, tests are performed to find out whether the previous pre-processed image was the last one in said video image sequence. If not, the process proceeds as arrow 728 shows to block 702, where the processing continues by starting to process the first un-processed part in the following image. If the pre-processing of the video sequence is finished, the process proceeds from block 726 as arrow 730 shows to block 732, where the process is ended.
[0070] The method can be modified in accordance with the de- pendent claims. Since the contents thereof are explained above in connection with the encoder 108, the description is not repeated herein.
[0071] Finally, Figure 8 illustrates a method for decoding video image formed of consecutive still images.
[0072] The method starts from block 800 where the decoder starts receiving video image to be decoded.
[0073] In block 802, an inverse discrete cosine transform is carried out on the pixels in the still image.
[0074] Then in block 804, the first un-pre-processed part of the image is processed.

[0075] In block 806, one is set as the initial value of counter i.
[0076] In block 808, the pre-processed value (y1) is set as the inverse pre-processed value (x1) of the first pixel.
[0077] In block 810, the value of counter i is increased by one.
[0078] Then in block 812, the actual inverse pre-processing is started, i.e. inverse pre-processing is repeated for one pixel in the still image at a time after the inverse discrete cosine transform has been carried out, said inverse pre-processing comprising:
- forming a pixel value change between two consecutive pixel values,
- forming a weighted change using the pixel value change and a predetermined weighting coefficient used to form the weighted change in video image encoding, and
- setting the modified posterior pixel value as the posterior pixel value using the weighted change.

[0079] The inverse pre-processing can be carried out in block 812, for example using formula 6 described above.
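The three steps above can be sketched in code. Since formula 6 is not reproduced in this excerpt, the sketch assumes the encoder used the first-order recursion y[i] = y[i-1] + a * (x[i] - y[i-1]); under that assumption the inverse divides each pixel value change by the same weighting coefficient. The function name and coefficient a are hypothetical.

```python
def inverse_preprocess_part(y, a):
    """Hypothetical sketch of inverse pre-processing one image part.

    Assuming the encoder used y[i] = y[i-1] + a * (x[i] - y[i-1]),
    the inverse recovers x[i] = y[i-1] + (y[i] - y[i-1]) / a.
    """
    x = [0.0] * len(y)
    x[0] = y[0]                        # block 808: first pixel taken directly
    for i in range(1, len(y)):         # blocks 810-814: one pixel at a time
        change = y[i] - y[i - 1]       # pixel value change between consecutive values
        weighted = change / a          # undo the encoder's weighting coefficient
        x[i] = y[i - 1] + weighted     # recovered posterior pixel value
    return x


# Feeding back the pre-processed values from the encoding sketch
# reconstructs the original pixel values exactly.
print(inverse_preprocess_part([10.0, 15.0, 15.0, 22.5], 0.5))  # -> [10.0, 20.0, 15.0, 30.0]
```

Note that only the received pre-processed values y and the coefficient a are needed, which matches the decoder's position: it never sees the original pixel values.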
[0080] In block 814, it is tested whether the processing of the current image part is finished; in this example, it is tested whether the value of counter i, when increased by one, exceeds the serial number of the last pixel in the image part. If the last pixel in the image part has not yet been processed, the process proceeds in accordance with arrow 816 back to block 810, from where the processing continues by increasing the value of counter i by one. If the last pixel in the image part has been processed in block 812, the process proceeds in accordance with arrow 818 to block 820, where it is tested whether the entire image has been processed.
[0081] If all the parts in the image have not yet been processed, the process proceeds from block 820 as arrow 824 indicates to block 804, from where the processing continues by starting to process the following un-pre-processed part in the image. As Figure 8 shows, this path includes an optional block 802, illustrated by a dashed line, meaning that the inverse discrete cosine transform can be carried out for all the pixels in the image before the inverse pre-processing is started, or for each following image part to be inverse pre-processed at a time.
[0082] If all the parts in the image have been processed, the process proceeds from block 820 in accordance with arrow 822 to block 826, where it is tested whether the previously pre-processed image was the last one in the video sequence. If not, the process proceeds as arrow 828 shows to block 802, where the processing continues by carrying out the inverse discrete cosine transform on the pixels in the following image. If the inverse pre-processing of the video image sequence is finished, the process proceeds from block 826 in accordance with arrow 830 to block 832, where the method is ended.
[0083] The method can be modified in accordance with the accompanying dependent claims. Since the contents thereof are explained above in connection with the decoder 120, a further description is not repeated herein.

[0084] Figure 7 does not describe the quantization, which is carried out after block 724, and correspondingly Figure 8 does not describe the inverse quantization, which is carried out before block 802. It should be noted that the order of the method steps does not have to be the one described. If, for instance, the real-time requirement is not strict and the apparatus includes an adequate amount of memory, the discrete cosine transform according to block 724 can also be carried out for several images stored in the memory at a time.
[0085] Even though the invention has been described above with reference to the example in the accompanying drawings, it is obvious that the invention is not restricted thereto but can be modified in various ways within the scope of the inventive idea disclosed in the attached claims.