WO1993002526A1 - Procede de compression de sequences d'images numeriques - Google Patents
Method for compressing sequences of digital images (Procede de compression de sequences d'images numeriques)
- Publication number
- WO1993002526A1 (application PCT/CH1992/000148)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- sub
- bands
- data
- transformation
- images
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/90—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using coding techniques not provided for in groups H04N19/10-H04N19/85, e.g. fractals
- H04N19/94—Vector quantisation
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/60—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding
- H04N19/61—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding in combination with predictive coding
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/60—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding
- H04N19/63—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding using sub-band based transform, e.g. wavelets
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/85—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using pre-processing or post-processing specially adapted for video compression
Definitions
- image sequences are becoming increasingly important in modern imaging applications, such as high definition television (HDTV), teleconferencing, multimedia applications, medical imaging, robotics, satellite imagery, interactive video and entertainment.
- the aim of the present invention is to provide a method of compressing digital images intended for digital video transmission or for digital storage on media such as compact discs or optical discs, so as to obtain average transmission rates of the order of 1 to 10 Mb/s with higher quality than known systems, such as the CCITT H.261 or MPEG systems mentioned above, and with a relatively simple implementation.
- the invention relates to a method for compressing sequences of digital images comprising a step of decomposing the images by transformation into sub-bands, as defined in claim 1. It also relates to a device for implementing the method, as defined in claim 12, as well as a filter bank for implementing the method, as defined in claim 13, and a filter bank intended for fast multiresolution transformation for the compression of digital images, as defined in claim 15.
- the method of the invention makes it possible in particular to take into account not only the redundancy inside a sub-band, but also the dependence between the sub-bands, which leads to a higher efficiency than that of the known methods.
- the method of the invention has the advantage of being very simple for its implementation. It uses very little memory while being very efficient. Furthermore, the precision of the motion vectors is only limited by the arithmetic precision of the elementary operations of the space-time constraint.
- the structure of the synthesis filters, which is much less complex than that of the analysis filters, simplifies the decoding operation, which is vital for lowering the cost of the decoder.
- the proposed filters can be implemented efficiently in terms of polyphase components thanks to the QMF (quadrature mirror filter) type structures contained in the analysis and synthesis parts.
- the structure of the filter bank allows a VLSI implementation with a clock frequency half that proposed so far, the filters being obtained by optimization of a localization function both in image space and in frequency space.
- the multi-resolution organization of the data is taken into account by three different coding techniques, each giving rise to specific performances adapted to the properties of the respective data classes.
- the intermediate frequencies are coded by vector quantization (VQ) with a pyramidal structure.
- a pyramidal structure eliminates linear and non-linear spatial correlation, as well as linear and non-linear correlation across sub-bands.
- this pyramid structure consists of a low resolution image in one level of the pyramid and detail images in the other levels. The resulting pyramid transformation provides information at different levels of resolution.
- a pseudo-random scanning of the high-frequency sub-bands minimizes the visual distortion due to the overflow of the buffer memory by distributing it over the entire surface of the image. Pseudo-random is understood here in a sense analogous to that of the randomize function of a computer.
- the spatial sub-band of the continuous component is coded by a conventional technique of pulse code modulation.
- This process can be used to encode among others the ISO / CCIR 601 and CCITT / CIF formats.
- For the data to be processed which are in the CCIR 601 format one proceeds beforehand in the coder to a conversion from interlaced to progressive (figure 1), then in the decoder to a conversion from progressive to interlaced (figure 2), in order to restore the original format.
- These conversions are based on an interpolation with motion compensation. They are not necessary for progressive scan formats.
- the Gabor decomposition was chosen, on the one hand, because the Gabor functions, which are Gaussian functions modulated by complex exponentials, have an optimal localization in the joint spatial / spatial frequency domain.
- the majority of the receptive field profiles of the mammalian visual system can be modeled by this type of function.
- the partitioning of the spatial frequency domain into octave bands is motivated by natural image statistics and also by the sensitivity of the human visual system.
- Gabor functions do not form an orthogonal basis. Consequently, there is a priori no direct method for computing the transformation, as there is in the orthogonal case, where simple scalar products suffice.
- a method has already been proposed for carrying out the Gabor pyramid transformation. This technique is based on the criterion of adjustment by the method of least squares.
- the least squares solution shows that the weighting coefficients can be extracted by a simple multiplication of a matrix by a vector of data. If the set of Gabor functions is chosen independently of the image, the multiplicative matrix is constant. The reconstructed data are obtained by another multiplication, between the matrix of Gabor functions and the vector of the weighting coefficients.
- a parallel implementation is therefore used to carry out the transformation in real time.
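
The matrix-vector form of the least squares solution can be written out as a short derivation. This is only a sketch under assumed notation that does not appear in the original: x is the data vector, G is the matrix whose columns are the sampled Gabor functions (assumed to have full column rank), c is the vector of weighting coefficients, and the superscript H denotes the conjugate transpose, since Gabor functions are complex.

```latex
\hat{c} = \arg\min_{c} \lVert x - G c \rVert^{2}
\;\Longrightarrow\;
G^{H} G \, \hat{c} = G^{H} x
\;\Longrightarrow\;
\hat{c} = \underbrace{(G^{H} G)^{-1} G^{H}}_{M} \, x ,
\qquad
\hat{x} = G \, \hat{c}
```

Under these assumptions M depends only on the chosen Gabor functions, so it can be computed once and reused for every image, which matches the statement that the multiplicative matrix is constant.
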
- the chrominance is undersampled in the transform domain by eliminating the higher frequency components from the pyramid (Figure 4). This process does not deteriorate the visual quality of color images.
- the spatial continuous component (low resolution image, see Figure 3) is coded using pulse code modulation (PCM).
- the intermediate levels of the pyramid are coded using a hierarchical vector quantization (VQ) with a tree structure, as represented in FIG. 4.
- the highest spatial frequencies are selected adaptively and scalarly quantized (SQ / RL).
- the position information and the amplitude of the coefficients are coded separately.
- the adaptive quantization step and a variable length entropy encoder are controlled using a feedback strategy based on the occupation of the buffer memory.
- a differential inter-image technique is used. Using two previous images, the current image is predicted by a motion compensated extrapolation, and only the prediction error is coded and transmitted.
- the motion vectors are estimated hierarchically (block matching or spatio-temporal constraint). These same vectors are also used when converting from progressive to interlaced.
- an intra-image technique is applied in a fixed interval to completely update all the coefficients. This mechanism is also restarted after each scene change.
- Figure 1 is a block diagram of a coding device operating according to the method of the invention.
- Figure 2 is a block diagram of a decoding device operating according to the method of the invention.
- Figure 3 illustrates the three different data regions according to the three coding strategies.
- FIG. 4 illustrates the implementation of the vectors of vector quantization.
- Figure 5 is a block diagram of motion compensation.
- FIG. 6 shows an example of impulse response of a filter from the analysis filter bank.
- FIG. 7 shows an example of impulse response of a filter from the synthesis filter bank.
- FIG. 8 gives a representation in the frequency domain of the filters of the analysis filter bank.
- FIG. 9 gives a representation in the frequency domain of the filters of the bank of synthesis filters.
- the system can be used to encode the two formats ISO / CCIR 601 (interlaced) and CCITT / CIF (progressive).
- the ISO / CCIR 601 format consists of 288 by 720 interlaced images at a frequency of 50 fields per second for 625-line systems.
- the CCITT / CIF format consists of 288 by 360 progressive images at a frequency of 25 images per second.
- a first block (Figure 1) performs the conversion from interlaced to progressive. This first conversion is based on a spatial interpolation with motion compensation.
- a final block in the decoder (Figure 2) restores the initial format by converting from progressive to interlaced. This second conversion uses temporal interpolation with motion compensation.
- the missing lines are obtained by a spatio-temporal interpolation with motion compensation between the two neighboring lines existing on either side of the missing line. This makes it possible to go from an image frequency of 25 images per second to an image frequency of 50 progressive images per second.
- a motion-compensated temporal interpolation is used.
- the movement between two consecutive images is estimated by a hierarchical technique. These same motion vectors are also used for motion-compensated prediction, so they are only calculated once.
- Subband decomposition, and transform coding as a special case of subband decomposition, are very popular for data compression, thanks to the good quality of the results obtained at a given compression rate in comparison with other techniques.
- the transformation used in the present system is a Gabor pyramid transformation with multi-resolution.
- multiresolution techniques are very effective for image analysis and coding; see for example S. G. Mallat, "A Theory for Multiresolution Signal Decomposition: The Wavelet Representation", IEEE Transactions on Pattern Analysis and Machine Intelligence, volume 11, number 7, July 1989, pages 674-693, and A. Rosenfeld, "Multiresolution Image Processing and Analysis", Springer-Verlag, Berlin, Germany, 1984.
- the choice of the Gabor functions for the basis of the transformation is motivated by the fact that these functions have an optimal localization in the joint spatial / spatial frequency domain.
- the Gabor functions are the only ones to reach the lower limit of Heisenberg uncertainty in the space of signals. This principle states that the product of the extent of a signal in the spatial domain with its extent in the frequency domain is always greater than or equal to a constant. The minimum is reached precisely, when the signal is a Gabor function.
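
As a worked instance of this bound, consider a one-dimensional Gabor function with a Gaussian envelope of width σ and modulation frequency ω₀. The notation is an assumption made for this illustration: Δx and Δω are taken as the standard deviations of the normalized energy densities |g(x)|² and |G(ω)|², with ω an angular frequency; other conventions change the constant but not the conclusion.

```latex
\Delta x \, \Delta \omega \;\ge\; \frac{1}{2},
\qquad
g(x) = \exp\!\left(-\frac{x^{2}}{2\sigma^{2}}\right) \exp(j \omega_{0} x)
\;\Longrightarrow\;
\Delta x = \frac{\sigma}{\sqrt{2}}, \quad
\Delta \omega = \frac{1}{\sigma \sqrt{2}}, \quad
\Delta x \, \Delta \omega = \frac{1}{2}
```

The modulation only shifts the spectrum to ω₀ without widening it, so the product stays at the lower limit for any choice of σ and ω₀.
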
- the majority of receptive field profiles of the mammalian visual system can be modeled by this type of function.
- the power spectrum of natural images decreases exponentially as the spatial frequency increases.
- the synthesis filters are designed to contain only coefficients which are a sum or a difference of two powers of two at most.
- Several methods have been proposed to approximate a given filter, for example by a method based on min-max or least squares criteria by linear or quadratic programming (see the article by YC Lim and SR Parker, "FIR Filter Design over a discrete powers-of-two coefficient space ", IEEE transactions on ASSP Vol. 31 No. 3, 1983, Pages 583-591), and by a method based on simulated annealing (see the article by N. Benvenuto, M. Marchesi and A.
- the coefficients of these filters are programmed in a special chip according to a poly-phase structure.
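
To make the coefficient constraint concrete, the following sketch rounds each filter coefficient to the nearest value that is zero, a signed power of two, or a sum or difference of two powers of two. It is a brute-force nearest-value search and not one of the design methods cited in the text (min-max, least squares, simulated annealing); the exponent range and the function names are assumptions chosen for the example.

```python
from itertools import product


def two_power_candidates(min_exp=-8, max_exp=0):
    """Return all values 0, +/-2^a and +/-2^a +/- 2^b for a, b in [min_exp, max_exp]."""
    powers = [2.0 ** e for e in range(min_exp, max_exp + 1)]
    values = {0.0}
    for p in powers:
        values.update((p, -p))
    for p, q in product(powers, repeat=2):
        for sign_p, sign_q in product((1, -1), repeat=2):
            values.add(sign_p * p + sign_q * q)
    return sorted(values)


def quantize_coefficients(coeffs, min_exp=-8, max_exp=0):
    """Replace each coefficient by the closest admissible value (hypothetical helper)."""
    candidates = two_power_candidates(min_exp, max_exp)
    return [min(candidates, key=lambda c: abs(c - h)) for h in coeffs]


# Illustrative coefficients only; they are not taken from the patent.
quantized = quantize_coefficients([0.48, 0.75, -0.375])
```

Each quantized coefficient can then be applied with at most two shifts and one addition or subtraction, which is what makes multiplier-free hardware possible. Rounding coefficients one at a time ignores the effect on the overall frequency response, which is why dedicated optimization methods are used in practice.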
- Demultiplexing is achieved by simple addressing of the memory.
- the coefficients of the transformed image are coded according to three different methods depending on the spatial frequency to which they belong. These coding classes are shown in Figure 3.
- PCM Pulse code modulation
- the spatial sub-band of the continuous component is coded by a conventional technique of pulse code modulation. This technique is relatively robust in the presence of noise.
- VQ vector quantization
- SQ scalar quantization
- a relevant parameter is the size of the vectors, in that the larger they are, the better the exploitation of the correlation between coefficients.
- the chrominance coefficients are also included in the vectors, along with those of luminance (Figure 4). Based on experimental results described in the recommendation "Encoding parameters for digital television for studios"
- the highest level of the pyramid is scanned using a Peano-Hilbert scan within sub-blocks of the image, the sub-blocks being visited in a pseudo-random order. This scan converts the two-dimensional image sub-bands into a one-dimensional chain of numbers. This chain is then quantized using standard scalar quantization (SQ). The result is a chain of numbers with only a small number of bits each. These numbers are then compared to a threshold and set to zero if they are below the threshold. The vast majority of the coefficients will be smaller than the threshold. The chain is then divided into two chains, one being the sequence of non-zero coefficients and the other being a binary chain in which a one represents the position of a non-zero coefficient and a zero represents a zero coefficient.
- the binary chain is coded using run-length coding (RL) based on the Capon model (see in this regard the thesis of M. Kunt, "Comparison of coding techniques for the redundancy reduction of two-level facsimile images", thesis No. 183, LTS-DE, EPFL, 1974).
- the non-zero coefficients are coded using a Huffman code.
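
The separation of the scanned coefficients into an amplitude chain and a binary position chain can be sketched as follows. This is a minimal illustration: the function names are hypothetical, the run lengths are plain (bit, length) pairs rather than a Capon-model code, and the Huffman stage for the amplitudes is left out.

```python
def split_and_run_length(coeffs, threshold):
    """Zero the small coefficients, then separate amplitudes from positions."""
    # Coefficients whose magnitude is below the threshold are set to zero.
    kept = [c if abs(c) >= threshold else 0 for c in coeffs]
    # First chain: the non-zero amplitudes, in scan order.
    amplitudes = [c for c in kept if c != 0]
    # Second chain: 1 marks a non-zero coefficient, 0 marks a zero coefficient.
    position_map = [1 if c != 0 else 0 for c in kept]
    # Plain run-length description of the binary chain.
    runs = []
    for bit in position_map:
        if runs and runs[-1][0] == bit:
            runs[-1][1] += 1
        else:
            runs.append([bit, 1])
    return amplitudes, position_map, [tuple(r) for r in runs]


def merge(amplitudes, position_map):
    """Decoder side: put the amplitudes back at the marked positions."""
    values = iter(amplitudes)
    return [next(values) if bit else 0 for bit in position_map]


amplitudes, position_map, runs = split_and_run_length(
    [0, 3, -1, 0, 0, 7, 1, 0, -5, 0], threshold=2
)
```

Because most coefficients fall below the threshold, the binary chain is dominated by long runs of zeros, which is the situation in which run-length coding pays off.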
- a feedback from the buffer is used to define the threshold. If the data rate exceeds the maximum, the data flow is truncated and the threshold is raised for the next image, so that more coefficients are set to zero. Due to the pseudo-random order of the scanned sub-blocks, the visual effect of truncation is minimized.
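
The scanning order itself can be illustrated with a small sketch: the sub-band is cut into square sub-blocks, the sub-blocks are visited in a seeded pseudo-random order, and each sub-block is read along a Hilbert curve. The block size, the seed handling and the function names are assumptions for this example, and the sub-band dimensions are assumed to be multiples of the block size.

```python
import random


def hilbert_d2xy(order, d):
    """Map index d along a Hilbert curve to (x, y) on a 2^order by 2^order grid."""
    n = 1 << order
    x = y = 0
    t = d
    s = 1
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:
            if rx == 1:
                # Reflect the quadrant before swapping the axes.
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def scan_subband(band, block_order=1, seed=0):
    """Read a 2-D sub-band into a 1-D chain (hypothetical helper)."""
    block = 1 << block_order
    rows, cols = len(band), len(band[0])
    blocks = [(r, c) for r in range(0, rows, block) for c in range(0, cols, block)]
    # A seeded generator lets the decoder reproduce the same block order.
    random.Random(seed).shuffle(blocks)
    path = [hilbert_d2xy(block_order, d) for d in range(block * block)]
    chain = []
    for r0, c0 in blocks:
        for x, y in path:
            chain.append(band[r0 + y][c0 + x])
    return chain
```

If the chain is truncated when the buffer overflows, the lost coefficients belong to sub-blocks scattered over the whole image rather than to one contiguous region, which is the effect the text attributes to the pseudo-random order.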
- Multiplexing is carried out by simple addressing of the memory, then an inverse pyramidal transformation is carried out.
- the method described here uses motion-compensated prediction to reduce the temporal correlation between the images. Studies have shown that this method is very effective in reducing temporal redundancy (see on this subject the articles by A. Puri, H. M. Hang and D. L. Schilling, "An Efficient Block-Matching Algorithm for Motion-Compensated Coding", ICASSP, April 1987, pages 25.4.1-4, and by A. N. Netravali and J. D. Robbins, "Motion Compensated Television Coding-Part I", Bell System Technical Journal, volume 58, number 3, 1979, pages 629-668). The same displacement vectors are also used for the conversion from progressive to interlaced and for slow motion with good rendering of the movement.
- V (m, n) representing the field of motion
- F (m, n) the predicted image
- F (m, n) the interpolated image
- m and n are the indices of the rows and columns of the image.
- the movement estimation is carried out on the basis of the two previous images. Thus, as said above, no additional information is required.
- the motion estimation is performed based on the current image and the previous image. In this case, a better estimate is obtained, but additional information on the motion vectors is to be sent through the channel.
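
Block matching, one of the two estimation options named in the text, can be sketched on a toy example. The code below uses an exhaustive search with a sum of absolute differences (SAD) criterion instead of the hierarchical search of the method; the block size, search range, frame contents and function names are all assumptions for the illustration.

```python
def block_match(prev, curr, top, left, size=4, search=2):
    """Find the displacement (dy, dx) into prev that best matches a block of curr."""
    rows, cols = len(prev), len(prev[0])
    best, best_sad = (0, 0), None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            r, c = top + dy, left + dx
            # Skip candidate blocks that fall outside the previous image.
            if r < 0 or c < 0 or r + size > rows or c + size > cols:
                continue
            sad = sum(
                abs(curr[top + i][left + j] - prev[r + i][c + j])
                for i in range(size)
                for j in range(size)
            )
            if best_sad is None or sad < best_sad:
                best, best_sad = (dy, dx), sad
    return best, best_sad


def predict_block(prev, top, left, vector, size=4):
    """Build the motion-compensated prediction of a block from the previous image."""
    dy, dx = vector
    return [[prev[top + dy + i][left + dx + j] for j in range(size)] for i in range(size)]


# Toy frames: a small pattern moves one pixel down and one pixel right.
prev = [[0] * 10 for _ in range(10)]
prev[3][3], prev[3][4], prev[4][3] = 100, 50, 25
curr = [[0] * 10 for _ in range(10)]
curr[4][4], curr[4][5], curr[5][4] = 100, 50, 25

vector, sad = block_match(prev, curr, top=3, left=3)
predicted = predict_block(prev, 3, 3, vector)
error = [[curr[3 + i][3 + j] - predicted[i][j] for j in range(4)] for i in range(4)]
```

Here the vector points from the block in the current image to its match in the previous image, and only the prediction error would be coded. When the vectors are estimated from two previous images, the decoder can repeat the same search and no vectors need to be sent; when the current image is used, the vectors have to be transmitted as additional information.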
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Compression Or Coding Systems Of Tv Signals (AREA)
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CH2167/91-5 | 1991-07-19 | ||
| CH216791 | 1991-07-19 |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO1993002526A1 (fr) | 1993-02-04 |
Family
ID=4227458
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/CH1992/000148 (Ceased), WO1993002526A1 (fr) | 1991-07-19 | 1992-07-16 | Method for compressing sequences of digital images (Procede de compression de sequences d'images numeriques) |
Country Status (1)
| Country | Link |
|---|---|
| WO (1) | WO1993002526A1 (fr) |
Patent Citations (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JPS62264764A (ja) * | 1986-05-12 | 1987-11-17 | Nippon Telegr & Teleph Corp <Ntt> | Image information compression system |
| US4663660A (en) * | 1986-06-20 | 1987-05-05 | Rca Corporation | Compressed quantized image-data transmission technique suitable for use in teleconferencing |
| EP0253608A2 (fr) * | 1986-07-14 | 1988-01-20 | British Broadcasting Corporation | Video scanning system |
| EP0396368A2 (fr) * | 1989-05-04 | 1990-11-07 | AT&T Corp. | Perceptually-adapted image coding system |
Non-Patent Citations (8)
| Title |
|---|
| 1990 IEEE International Symposium on Circuits and Systems, 1-3 May 1990, New Orleans, LA, US, IEEE (New York, NY, US); B.R. Horng et al.: "The design of multiplierless two-channel linear-phase FIR filter banks with applications to image subband coding", pages 651-653, see the whole article * |
| 1990 IEEE International Symposium on Circuits and Systems, New Orleans, LA, 1-3 May 1990, vol. 2, IEEE (New York, NY, US); F.-M. Wang et al.: "Time-recursive deinterlacing for IDTV and pyramid coding", pages 1306-1309, see page 1308, paragraph 4 - 1309, paragraph 6 * |
| GLOBECOM '90, IEEE Global Telecommunications Conference & Exhibition, San Diego, CA, 2-5 December 1990, vol. 2, IEEE (New York, NY, US); H. Gharavi: "Subband based CCITT compatible coding for HDTV conferencing", pages 978-981, see abstract; figure 1 * |
| IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 38, no. 6, June 1990 (New York, US); G. Karlsson et al.: "Theory of two-dimensional multirate filter banks", pages 925-937, see figure 1; page 930, right-hand column, lines 25-49 * |
| International Conference on Acoustics, Speech, and Signal Processing, Albuquerque, 3-6 April 1990, vol. 4, IEEE, (New York, US); M. Antonini et al.: "Image coding using vector quantization in the wavelet transform domain", pages 2297-2300, see abstract, pages 2299-2300, paragraph II (cited in the application) * |
| International Conference on Acoustics, Speech, and Signal Processing, Tokyo, 7-11 April 1986, vol. 1, IEEE (New York, NY, US); S.E. Elnahas et al.: "Hybrid interframe coding of video signals with backward-acting motion detection", pages 165-167, see abstract * |
| Patent Abstracts of Japan, vol. 12, no. 150 (E-606), 10 May 1988, & JP,A, 62264764 (NIPPON TELEGR. & TELEPH. CORP.) 17 November 1987, see abstract; figure * |
| Signal Processing V, Eusipco, 90, 1990, Elsevier Science Publishers, B.V. (Amsterdam, NL); T. Ebrahimi et al.: "Sequence coding by Gabor decomposition", pages 769-772, see abstract (cited in the application) * |
Cited By (6)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| EP0800684A4 (fr) * | 1995-10-26 | 1998-03-25 | Motorola Inc | Method and device for encoding/decoding a displaced frame difference |
| WO1997039586A1 (fr) * | 1996-04-15 | 1997-10-23 | Faroudja, Yves, C. | Universal video disc record and playback employing motion signals for high quality playback of non-film sources |
| US5754248A (en) * | 1996-04-15 | 1998-05-19 | Faroudja; Yves C. | Universal video disc record and playback employing motion signals for high quality playback of non-film sources |
| WO2001047277A1 (fr) * | 1999-12-20 | 2001-06-28 | Sarnoff Corporation | Scalable video coding |
| US6907073B2 (en) | 1999-12-20 | 2005-06-14 | Sarnoff Corporation | Tweening-based codec for scaleable encoders and decoders with varying motion computation capability |
| CN104350746A (zh) * | 2012-05-31 | 2015-02-11 | 汤姆逊许可公司 | Image quality measurement based on local amplitude and phase spectra |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| PodilChuk et al. | | Three-dimensional subband coding of video |
| EP0448491B1 (fr) | | Method for coding and transmitting digital images belonging to an image sequence at at least two quality levels, and corresponding devices |
| KR100664928B1 (ko) | | Video coding method and apparatus |
| Westerink et al. | | Subband coding of images using vector quantization |
| EP0857392B1 (fr) | | Overlapping block zerotree wavelet image coder |
| EP0734164B1 (fr) | | Method and apparatus for encoding a video signal with a classification device |
| IE902321A1 (en) | | A method of processing video image data for use in the storage or transmission of moving digital images |
| de Queiroz et al. | | Nonexpansive pyramid for image coding using a nonlinear filterbank |
| FR2589020A1 (fr) | | Hybrid transform coding method for the transmission of image signals |
| FR2880743A1 (fr) | | Device and methods for scalable coding and decoding of image data streams, and corresponding signal, computer program and image quality adaptation module |
| EP0937291B1 (fr) | | Method and device for motion-compensated prediction |
| EP0668004B1 (fr) | | Method and device for bit-rate reduction for recording images on a video recorder |
| FR2670348A1 (fr) | | Device for coding images belonging to an image sequence, with line rearrangement before mathematical transformation, and corresponding image transmission system, receiver and coding method |
| US5629737A (en) | | Method and apparatus for subband coding video signals |
| KR100621582B1 (ko) | | Scalable video coding and decoding method, and apparatus therefor |
| KR20050075578A (ko) | | Scalable video encoding method and apparatus supporting closed-loop optimization |
| US20050163217A1 (en) | | Method and apparatus for coding and decoding video bitstream |
| KR100621584B1 (ko) | | Video decoding method using a smoothing filter, or video decoder |
| KR100755689B1 (ko) | | Video coding and decoding method with a hierarchical temporal filtering structure, and apparatus therefor |
| WO1993002526A1 (fr) | | Method for compressing sequences of digital images |
| Singhal et al. | | Source coding of speech and video signals |
| FR2597282A1 (fr) | | Quantization method in transform coding for the transmission of image signals |
| FR2654285A1 (fr) | | System for compressing digital images belonging to an image sequence, with adaptive quantization as a function of psychovisual information |
| Scotton et al. | | A low complexity video subband coder for ATM |
| EP0724812B1 (fr) | | Method and device for inter-frame coding with bit-rate regulation for recording images on a video recorder |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | AK | Designated states | Kind code of ref document: A1; Designated state(s): CA JP US |
| | AL | Designated countries for regional patents | Kind code of ref document: A1; Designated state(s): AT BE CH DE DK ES FR GB GR IT LU MC NL SE |
| | ENP | Entry into the national phase | Ref document number: 2091250; Country of ref document: CA |
| | NENP | Non-entry into the national phase | Ref country code: CA |
| | ENP | Entry into the national phase | Ref country code: CA; Ref document number: 2091250; Kind code of ref document: A; Format of ref document f/p: F |