HK1114280A - Quasi-constant-quality rate control with look-ahead - Google Patents
- Publication number
- HK1114280A (application HK08109546.2A)
- Authority
- HK
- Hong Kong
- Prior art keywords
- encoding
- data
- complexity
- multimedia data
- bit rate
- Prior art date
Abstract
Methods and apparatus efficiently encode multimedia data, such as live video streams. An encoding complexity of a predetermined time interval, such as 1 second, is estimated before the actual encoding that will be used. This permits the actual encoding to be performed with an a priori estimate of complexity, permitting the bits allocated for the predetermined time interval (bit rate) to be efficiently allocated within the predetermined time interval. Moreover, the estimated complexity can be provided to a device, such as a multiplexer, which can then allocate the available bandwidth for a collection of multiplexed video channels according to the encoding complexity anticipated for those video channels, which then permits the quality of a particular channel to remain relatively constant even when the bandwidth for the collection of multiplexed video channels is relatively constant.
Description
Claim of Priority under 35 U.S.C. § 119
This patent application claims priority to provisional application No. 60/660,908, filed 10/3/2005, entitled "METHOD AND APPARATUS FOR QUASI-CONSTANT-QUALITY RATE CONTROL WITH LOOK-AHEAD," which is assigned to the assignee of the present invention and is expressly incorporated herein by reference.
Technical Field
The present invention relates generally to digital video, and more specifically to video compression.
Background
Due to the explosive growth and great success of the Internet and wireless communications, as well as the increasing demand for multimedia services, streaming media over the Internet and over mobile/wireless channels has attracted considerable attention. In a heterogeneous Internet Protocol (IP) network, video is provided by a server and can be streamed by one or more clients. Wired connections include dial-up, Integrated Services Digital Network (ISDN), cable, digital subscriber line protocols (collectively referred to as xDSL), fiber, Local Area Networks (LAN), Wide Area Networks (WAN), and others. The transmission mode may be unicast or multicast.
Similar to heterogeneous IP networks are mobile/wireless communications. Transport of multimedia content over mobile/wireless channels is extremely challenging because these channels are often severely impaired by multi-path fading, shadowing, inter-symbol interference, and noise disturbances. Other factors, such as mobility and competing traffic, may also cause bandwidth variation and loss. Channel noise and the number of users served determine the time-varying nature of the channel environment.
Digital video is typically compressed for efficient storage and/or transmission. Many video compression standards currently exist.
A common problem with video compression is the trade-off between bandwidth (bits/second) and visual quality. Various metrics, such as peak signal to noise ratio (PSNR), can be used to assess visual quality. It should be appreciated that with a constant frame rate, the bits used to encode a video frame will be proportional to the bit rate or bandwidth of the video, and the terms (bits and bandwidth), although technically different, are often used interchangeably in the art and the correct interpretation can be determined from context.
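As an illustrative aside (not part of the patent), a quality metric such as PSNR is simple to compute from the mean squared error between an original and a reconstructed frame; the helpers `mse` and `psnr` below are hypothetical names, and frames are modeled as flat lists of 8-bit samples:

```python
import math

def mse(frame_a, frame_b):
    """Mean squared error between two equally sized frames (flat sample lists)."""
    assert len(frame_a) == len(frame_b) and frame_a
    return sum((a - b) ** 2 for a, b in zip(frame_a, frame_b)) / len(frame_a)

def psnr(frame_a, frame_b, max_value=255):
    """Peak signal-to-noise ratio in dB; higher means the frames are closer."""
    err = mse(frame_a, frame_b)
    if err == 0:
        return float("inf")  # identical frames: distortion-free
    return 10 * math.log10(max_value ** 2 / err)
```

For 8-bit video, PSNR values around 35-40 dB are commonly regarded as good visual quality, though PSNR only approximates perceived quality.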
The bandwidth required to achieve relatively good visual quality will vary with the complexity of the video being encoded. For example, relatively static footage (e.g., footage of a news anchor) may be encoded with relatively low bandwidth and relatively high visual quality. Conversely, relatively dynamic footage (e.g., footage of a sporting event) may consume a relatively large amount of bandwidth to achieve the same visual quality.
To achieve relatively constant quality, it is desirable to use a technique called Variable Bit Rate (VBR) encoding, which varies the number of bits available for encoding a frame. However, VBR techniques are generally not well suited to a transmission or broadcast context. Digital video content may be transmitted over a wide variety of media, such as optical networks, wired networks, wireless networks, satellites, and the like. When broadcasting, the communication medium is typically of limited bandwidth. Accordingly, Constant Bit Rate (CBR) techniques are commonly found in transmission or broadcast environments.
A problem with using Constant Bit Rate (CBR) techniques is that the visual quality will vary with the complexity of the video being encoded. When footage is relatively static (e.g., footage of a news anchor), more bits are consumed, or "wasted," than are needed to achieve a given level of quality. When footage is relatively dynamic (e.g., sports footage), quality may suffer under CBR. When visual quality is impaired, visual artifacts may become apparent and may be observed, for example, in the form of "blockiness."
There is therefore a need in the art to provide encoding techniques that can combine the advantageous properties of a relatively constant bit rate of the transmission medium with a relatively constant visual quality for the viewer to enjoy.
Disclosure of Invention
The systems and methods disclosed herein address the above stated needs by, for example, estimating the encoding complexity of one or more multimedia data in a predetermined time window, and using the estimated encoding complexity to determine the bit rate for encoding.
One aspect is a method of encoding received video data, wherein the method comprises: determining a first encoding complexity of a first portion of the video data; and encoding the first portion of video at least partially according to the first encoding complexity.
One aspect is an apparatus for encoding received video data, wherein the apparatus comprises: means for determining a first encoding complexity of a first portion of the video data; and encoding means for encoding the first portion of video according, at least in part, to the first encoding complexity.
One aspect is an apparatus that encodes received video data, wherein the apparatus comprises: a processor configured to determine a first encoding complexity of a first portion of the video data; and an encoder configured to encode the first portion of video data according to, at least in part, the first encoding complexity.
One aspect is a computer-program product embodied in a tangible medium having instructions for encoding received video data, wherein the computer-program product comprises: means for determining a first encoding complexity of a first portion of the video data; and means for encoding the first portion of video at least partially according to the first encoding complexity.
One aspect is a method for encoding multimedia data, wherein the method comprises: encoding first multimedia data corresponding to a selected window of data; and encoding second multimedia data different from the first multimedia data, the second multimedia data corresponding to the selected window of data, while the first multimedia data is re-encoded.
One aspect is an apparatus for encoding multimedia data, wherein the apparatus comprises: encoding means for encoding first multimedia data corresponding to a selected window of data and second multimedia data different from the first multimedia data, the second multimedia data corresponding to the selected window of data; and re-encoding means for re-encoding the first multimedia data when the encoding means encodes the second multimedia data.
One aspect is an apparatus for encoding multimedia data, wherein the apparatus comprises: a first encoder configured to encode first multimedia data corresponding to a selected window of data and to encode second multimedia data different from the first multimedia data, the second multimedia data corresponding to the selected window of data; and a second encoder configured to re-encode the first multimedia data while the first encoder is encoding the second multimedia data.
One aspect is a method of encoding video data for a plurality of video channels that are multiplexed and carried together at transmission, wherein the method comprises: receiving video frames for video encoding of the plurality of video channels; determining a relative amount of data needed to encode portions of the plurality of channels at about the same level of quality, wherein data for the encoded portions are multiplexed together and carried; allocating data for the encoded portions according to the determined relative amount of data; encoding the video frames according to the allocated data; and multiplexing the encoded video frames for transmission.
One aspect is a method of encoding a plurality of video channels multiplexed together in a band limited transmission medium, wherein the method comprises: receiving a plurality of video frames from a plurality of channels to be encoded, multiplexed, and then transmitted together, wherein the plurality of video frames includes a plurality of video frames corresponding to at least a predetermined time interval of each channel; allocating bits applicable to the plurality of channels among the plurality of channels such that visual quality of the plurality of channels is at about the same level regardless of differences in coding complexity; and encoding according to the allocated bits.
One aspect is a method of allocating available bandwidth between two or more video channels to be encoded in a compressed manner, the two or more video channels including at least a first video channel and a second video channel, wherein the method includes: estimating an encoding complexity associated with a set of video frames of a predetermined time period of the first video channel, wherein the predetermined time period is less than a duration of a video clip, and wherein the estimating is performed before the actual encoding that will be used; and performing the actual encoding of the set of video frames using information based at least in part on the estimated encoding complexity.
One aspect is a method of allocating available bandwidth among a plurality of channels, wherein the method comprises: receiving an encoding complexity metric for a set of video frames for each of the plurality of channels, wherein data for the set of video frames is multiplexed for transmission at a same time period; and allocating bits available for the time period among the multiplexed channels based at least in part on the coding complexity metric for each channel.
One aspect is an apparatus for video encoding, wherein the apparatus comprises: a first processing element configured to estimate an encoding complexity associated with a set of video frames corresponding to a predetermined time period, the predetermined time period being less than a time period associated with a length of a video clip; and a second processing element configured to encode the set of video frames using a bit rate selected at least in part according to the estimated encoding complexity.
One aspect is an apparatus for allocating available bandwidth among a plurality of channels, wherein the apparatus comprises: an allocation circuit configured to receive an encoding complexity measure for a set of video frames for each of the plurality of channels, wherein data for the set of video frames is multiplexed at a same time period; and a computation circuit configured to allocate bits available for the time period among the multiplexed channels based at least in part on the encoding complexity measure for each channel.
One aspect is a computer-program product embodied in a tangible medium having instructions for encoding multimedia data, wherein the computer-program product comprises: a module having instructions for encoding first multimedia data corresponding to a selected window of data and encoding second multimedia data different from the first multimedia data, the second multimedia data corresponding to the selected window of data; and a module having instructions for re-encoding, the instructions for re-encoding the first multimedia data while the module having instructions for encoding is encoding the second multimedia data.
Drawings
FIG. 1 is a system diagram illustrating encoding data for multiple channels of multimedia data;
FIG. 2 is a flow diagram generally illustrating an encoding process;
FIG. 3 is a flow diagram generally illustrating a process of encoding while selecting a bit rate according to the data complexity of a plurality of multimedia channels;
FIG. 4 is a system diagram illustrating an example of a multi-pass predictive encoder that may interact with a multiplexer;
FIG. 5 is a system diagram illustrating an example of a single pass look ahead encoder that can interact with a multiplexer;
FIG. 6 is a system diagram illustrating an example of a stand-alone multipass predictive encoder;
FIG. 7A is a system diagram illustrating an example of a stand-alone single-pass predictive encoder;
FIG. 7B is a system diagram illustrating an example of an encoding complexity determination device and an encoding device;
FIG. 8 is a system diagram illustrating an example of an apparatus for allocating bandwidth available to a multiplex system according to channel requirements;
FIG. 9 is a 3-D plot of the estimated visual quality function V as a function of the coding complexity C and the allocated bits B;
FIG. 10 illustrates an example of a method for encoding multimedia data; and
FIG. 11 illustrates an example of an apparatus for encoding multimedia data.
Detailed Description
In the following description, specific details are given to provide a thorough understanding of the embodiments. However, it will be understood by those skilled in the art that the embodiments may be practiced without these specific details. For example, electrical components may be shown in block diagram form in order not to obscure the embodiments in unnecessary detail. In other instances, such components, other structures and techniques may be shown in detail to further explain the embodiments. It will also be appreciated by those skilled in the art that electrical components shown as separate blocks may be rearranged and/or combined into one component.
It is also noted that some embodiments may be described as a process which is depicted as a flowchart, a flow diagram, a structure diagram, or a block diagram. Although a flowchart may describe the operations as a sequential process, many of the operations can be performed in parallel or concurrently, and the process can be repeated. Further, the order of the operations may be rearranged. A process is terminated when its operations are completed. A process may correspond to a method, a function, a procedure, a subroutine, a subprogram, etc. When a process corresponds to a function, its termination corresponds to a return of the function to the calling function or the main function.
The methods and apparatus described below efficiently encode multimedia data, such as a live video channel. The encoding complexity of a selected window is estimated prior to the actual encoding that will be used, where the window may be based on time (e.g., approximately 1 second) or on a selected amount or portion of the multimedia data. This permits the actual encoding to be carried out with an a priori estimate of complexity, permitting the bits allocated for the window (the bit rate) to be efficiently allocated within the selected window.
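The patent does not prescribe a particular look-ahead metric at this point (the "bandwidth ratio" metric is described later). As a hedged sketch only, one common proxy for a window's encoding complexity is the total sum of absolute differences (SAD) between consecutive frames; `window_complexity` and `split_into_windows` are hypothetical helper names:

```python
def window_complexity(frames):
    """Approximate the encoding complexity of one look-ahead window as the
    total sum of absolute differences (SAD) between consecutive frames,
    a simple proxy for motion/temporal activity.
    `frames` is a list of flat pixel-sample lists."""
    sad = 0
    for prev, cur in zip(frames, frames[1:]):
        sad += sum(abs(a - b) for a, b in zip(prev, cur))
    return sad

def split_into_windows(frames, frames_per_window):
    """Group a frame sequence into fixed-size look-ahead windows (superframes)."""
    return [frames[i:i + frames_per_window]
            for i in range(0, len(frames), frames_per_window)]
```

A static window yields a low score and a high-motion window a high score, which is the signal the rate controller needs before the bits for that window are committed.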
The bit rate may be provided as an input by a device that estimates a bit rate from bandwidth, quality, or both. For example, a multiplexer may be implemented to provide the bit rate. The quality may be based on a complexity-based content classification. Such classification is discussed in the co-pending patent application entitled "Content Classification for Multimedia Processing".
Furthermore, the estimated complexity may be provided to a device (e.g., a multiplexer) which may then allocate available bandwidth for a set of multiplexed multimedia channels (e.g., video channels) according to their expected encoding complexity, thereby permitting the quality of a particular channel to remain relatively constant even if the bandwidth of the set of multiplexed channels is relatively constant. This allows the channels within the channel set to have a variable bit rate and relatively constant visual quality, rather than a relatively constant bit rate and variable visual quality.
Fig. 1 is a system diagram illustrating encoding of multiple sources (e.g., channels) of multimedia data 102. The multimedia data 102 is encoded by a corresponding encoder 104; the encoder 104 is in communication with a Multiplexer (MUX) 106, and the MUX 106 is in communication with a transmission medium 108. For example, the multimedia data 102 may correspond to various content channels, such as a news channel, a sports channel, a movie channel, and the like. The encoder 104 encodes the multimedia data 102 into an encoding format specified for the system. Although described in the context of encoding video data, the principles and advantages of the disclosed techniques apply broadly to multimedia data, including, for example, visual data and/or audio data. The encoded multimedia data is provided to the multiplexer 106, which combines the various encoded multimedia data and sends the combined data to the transmission medium 108 for transmission.
The transmission medium 108 may correspond to a wide variety of media such as, but not limited to, digital satellite communications, such as DirecTV ®, digital cable, wired and wireless internet communications, optical networks, mobile telephone networks, and the like. In the case of a wireless communication system, the transmission medium 108 may comprise, for example, a portion of a code division multiple access (CDMA or CDMA2000) communication system, or alternatively, a Frequency Division Multiple Access (FDMA) system, an Orthogonal Frequency Division Multiple Access (OFDMA) system, a Time Division Multiple Access (TDMA) system such as GSM/GPRS (general packet radio service)/EDGE (enhanced data GSM environment) or TETRA (terrestrial trunked radio) mobile phone technology for the service industry, Wideband Code Division Multiple Access (WCDMA), a high data rate (1xEV-DO or 1xEV-DO gold multicast) system, or in general any wireless communication system using a combination of technologies. The transmission medium 108 may comprise, for example, modulation to Radio Frequency (RF). Typically, due to spectral constraints and the like, the transmission medium has a limited bandwidth and the data from multiplexer 106 to the transmission medium maintains a relatively Constant Bit Rate (CBR).
In conventional systems, using a Constant Bit Rate (CBR) at the output of the multiplexer 106 would require that the encoded multimedia data provided as input to the multiplexer 106 also be CBR. As described in the background, the use of CBR in encoding video content (for example) can result in variable visual quality, which is generally undesirable.
In the illustrated system, two or more encoders 104 communicate the expected encoding complexity of their input data. One or more encoders 104 may receive adapted bit rate control from multiplexer 106 in response. This permits an encoder 104 that expects to encode relatively complex multimedia to receive a higher bit rate or higher bandwidth (more bits per frame) for those frames of multimedia data, in a quasi-variable bit rate manner. This permits the multimedia data 102 to be encoded with a more constant visual quality. The additional bandwidth used by a particular encoder 104 that encodes relatively complex video comes from bits that would otherwise have been used to encode other multimedia data 102 had the encoders been constructed to operate at constant bit rates. This maintains the output of multiplexer 106 at a Constant Bit Rate (CBR).
Although individual sources of multimedia data 102 may be relatively "bursty" in the bandwidth they use, the cumulative sum of multiple sources of multimedia data tends to be less bursty. Bit rate from a channel encoding less complex multimedia may be reallocated, for example by multiplexer 106, to a channel encoding relatively complex multimedia, which can enhance the overall visual quality of the combined multimedia data. The disclosed techniques may be used to adaptively allocate bandwidth, so that little or no user intervention is required to set the bandwidth of the multimedia data.
In one example, encoder 104 provides multiplexer 106 with an indication of the complexity of a set of multimedia data (e.g., video frames) to be encoded and multiplexed together. For example, the multiplexer 106 may multiplex encoded data for a predetermined window of data, e.g., a 1 second long window of data, at a time. In the predetermined window of data, the output of multiplexer 106 should provide an output no higher than the bit rate specified for transmission medium 108.
For example, the complexity indication may correspond to an estimate of the number of bits to be used to encode the multimedia data of the window of data at a given level of quality. Multiplexer 106 analyzes the complexity indications and provides an allocated number of bits or bandwidth to each encoder 104, and each encoder 104 uses this information to encode its multimedia data in the group. This permits the multimedia data in the group to be individually of variable bit rate while the group as a whole still maintains a constant bit rate.
Figure 2 is a flow diagram generally illustrating an encoding process. It will be appreciated by those skilled in the art that the illustrated process can be modified in various ways. For example, in another embodiment, portions of the illustrated process can be combined, rearranged in an alternate sequence, removed, and so forth.
States 210 and 220 of the process will be described below from the perspective of an encoding device, such as encoder 104 in fig. 1, encoding in conjunction with other encoders via a multiplexer to efficiently use the bandwidth available to the group of encoders whose outputs are being multiplexed.
The illustrated process estimates the encoding complexity within a predetermined time period or window at state 210. This predetermined time period or window may vary over a relatively wide range. For example, this predetermined time period may be in the range of about 500 milliseconds to about 10000 milliseconds. A relatively good value for the predetermined time period is about 1 second. It should be noted that this predetermined time period (referred to as a "superframe") does not correspond to the time period of a group of pictures (GOP, that is, a group of pictures associated with an intra-coded frame), unless the group of pictures happens to match the predetermined time period. Rather, a superframe may contain a given number of consecutive pictures. It should be appreciated that in a system where a multiplexer or similar device collects various multimedia data in periodic intervals, such as a predetermined number of frames, for packaging in a multiplexed manner, the predetermined time period should match the periodic time interval used by the multiplexer. For example, the multiplexer may use a predetermined time interval of about 1 second within which to allocate bits or bandwidth to individual channels to comply with bandwidth limitations on the multiplexed data.
One example of estimating the encoding complexity is encoding by an encoding process, such as an encoding similar to the "first pass" encoding process in a two-pass or multi-pass encoding scheme. In the first pass encoding process, a standard default quality level, such as a default Quantization Parameter (QP) level, is typically used. Another technique for estimating the coding complexity is the bandwidth ratio metric, which will be described below.
In a conventional multi-pass encoding scheme, an entire multimedia clip (e.g., a 2-hour movie) is first encoded in a first pass, without saving the encoded video, and then encoded again. The metrics or statistics gathered from the first encoding pass (e.g., how complex a particular set of frames is or how much motion is present) are then used to adjust the bit rate used in the second or subsequent encoding passes. This improves the quality of the encoded video.
However, while such conventional multipass techniques are suitable for encoding non-real-time data, such as a movie for storage on a DVD, the lag present in encoding the entire clip is generally not suitable for encoding data in a broadcast or other distribution environment that would not accept a relatively large delay. Typically, multimedia data can carry live content, such as a news broadcast or a sporting event. Furthermore, the multimedia data may carry content that is outside the control of the entity distributing the multimedia, so that certain content is considered live even when it is not "live".
For live multimedia data such as news and sports events, large delays (e.g., lasting hours) are generally not acceptable. However, in many cases, a relatively short delay is acceptable and unnoticeable. The disclosed techniques with a predetermined time period or window may introduce relatively short delays, such as delays in the range of seconds. These delays are typically not objectionable to viewers of live content.
When the encoding complexity within a predetermined time period is known, this information can be used to efficiently allocate bits for the actual encoding 220 of the process. For example, the coding complexity information may be used independently of the coding complexity of other channels, and the actual encoding 220 would still have significant advantages. Small variations in bit rate can typically be accommodated by a buffer. The process can be improved even further when combined with bit rate allocations from other sources of multimedia data. Such a variation is illustrated in fig. 3.
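The claim that a buffer can absorb small bit rate variations can be made concrete with a standard leaky-bucket sketch (an illustration only, not the patent's mechanism): the encoder's output buffer fills with each window's variable bit count and drains at the channel's constant rate, and as long as occupancy stays within the buffer size, the variation is invisible to the channel. `buffer_trace` is a hypothetical helper:

```python
def buffer_trace(bits_per_window, channel_rate, buffer_size, initial_fill=0):
    """Track encoder output buffer occupancy: each window deposits its encoded
    bits while the channel drains a constant `channel_rate` bits per window.
    Returns occupancy after each window; raises if the buffer overflows,
    i.e., if the bit rate varied more than the buffer can absorb."""
    fill = initial_fill
    trace = []
    for bits in bits_per_window:
        fill += bits - channel_rate
        if fill > buffer_size:
            raise OverflowError("buffer overflow: bit rate varied too much")
        fill = max(fill, 0)  # channel idles when the buffer is empty
        trace.append(fill)
    return trace
```

With windows of 900, 1100, and 1000 bits against a 1000-bit/window channel and a 500-bit buffer, occupancy stays small; a 2000-bit window would overflow the same buffer.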
Figure 3 is a flow diagram generally illustrating a process of encoding when the bit rate is selected according to the complexity of multiple multimedia data sources. It will be appreciated by those skilled in the art that the illustrated process can be modified in various ways. For example, in another embodiment, portions of the illustrated process may be combined, rearranged in another sequence, removed, and so forth.
States 310 and 340 in the process will be described below from the perspective of an encoding device, such as encoder 104 in fig. 1, that encodes in conjunction with other encoders via a multiplexer to efficiently utilize the bandwidth available to the group of encoders whose outputs are being multiplexed. States 320 and 330 are written from the perspective of a multiplexing device (such as multiplexer 106 in fig. 1) or other bandwidth allocation device, if separate from the multiplexer.
The illustrated process estimates the encoding complexity within a predetermined time period or window at state 310. The estimating act 310 may be the same as described for estimating act 210 in the process described above in connection with fig. 2. It should be noted, however, that the predetermined time period used to estimate act 310 in fig. 3 should match the time period or window in which the multiplexer allocates bits or bandwidth.
The process continues at state 320 to retrieve coding complexity from other channels within the same predetermined time period. Within a particular time period, the encoded video from each encoded channel will be multiplexed and should not exceed the total bandwidth allocated to the multiplexed communication channel.
The process continues in state 330 with allocating bits among the various channels. For example, the coding complexity information may indicate the number of bits expected to be consumed for a given visual quality. These bit counts may be summed, and the number of bits allocated to each channel may depend on the relative number of bits expected to be consumed; e.g., a proportional share of the total available bandwidth may be allocated to each channel according to that channel's share of the expected bit consumption. In another example, one of various preselected bit rates, such as a low, medium, or high bandwidth, can be assigned to each channel depending on the relative number of bits expected to be consumed. It should also be noted that each system may have specific system limits on maximum and minimum bandwidth, and the bit allocation should not exceed these limits. The bits allocated to the group of frames during the predetermined period, referred to as a superframe, are then provided to an encoder for bit rate control at the superframe level.
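The proportional scheme above can be sketched in a few lines. This is a hedged illustration under stated assumptions: the patent leaves the exact allocation policy open, and `allocate_bits` is a hypothetical helper that splits the superframe budget in proportion to each channel's expected consumption, then clamps to per-channel system limits (a real system would also renormalize after clamping):

```python
def allocate_bits(expected_bits, total_budget, min_bits=0, max_bits=None):
    """Split a superframe's total bit budget among channels in proportion
    to each channel's expected bit consumption (its complexity indication),
    clamped to per-channel minimum/maximum limits."""
    total_expected = sum(expected_bits)
    shares = [total_budget * e / total_expected for e in expected_bits]
    if max_bits is None:
        max_bits = total_budget
    return [min(max(round(s), min_bits), max_bits) for s in shares]
```

For instance, three channels expecting bits in the ratio 1:1:2 would split a 4000-bit superframe as 1000, 1000, and 2000 bits; a channel whose proportional share falls below the system minimum is raised to that minimum.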
The encoder then encodes the superframe, i.e., the frames in the predetermined period, according to the allocated bits at state 340. This permits individual channels in the group to have a Variable Bit Rate (VBR) to obtain near-constant quality, while the multiplexed output of the multiple channels still maintains a Constant Bit Rate (CBR).
Fig. 4 is a system diagram illustrating an example of a multi-pass predictive encoder that can interact with a multiplexer. The illustrated system uses a two-pass architecture, but it should be understood that more than two passes may be used. The system includes a data memory 402, an optional preprocessor 404, a first pass encoder 406, a second pass encoder 408, a multiplexer 106, and a transmission medium 108. The system may be constructed in a variety of ways. For example, one or more of the optional preprocessor 404, first pass encoder 406, and second pass encoder 408 may be combined, separable, controllable by a common controller, and so on.
The data storage 402 stores incoming multimedia data. The data storage 402 can be implemented by a wide variety of devices, such as solid state memory (DRAM, SRAM), hard disk drives, and the like. The optional preprocessor 404 may compute a bandwidth ratio metric for use in allocating bits within a group of frames during the encoding process. The bandwidth ratio metric will be described later. In the illustrated system, pre-processor 404 operates independently of superframe level bit rate control. The optional preprocessor 404 provides the metrics to a first pass encoder 406 and a second pass encoder 408.
The optional preprocessor 404, first pass encoder 406, and second pass encoder 408 are arranged in a pipelined architecture. In such a pipelined architecture, the delay of each stage should be the same; that is, each stage should process the same amount of data. The amount of data corresponds to the frames in a predetermined time period or window; this group of frames is called a superframe. While the second pass encoder 408 is encoding a set of frames, the first pass encoder 406 is performing a first pass encoding on the next set of frames to be encoded by the second pass encoder 408, and the optional preprocessor 404 processes the next set of frames to be first-pass encoded by the first pass encoder 406. Raw data for the unencoded video may be retrieved from data store 402, and references to the data may be passed among the modules as memory pointers and the like.
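The staggered schedule described above can be sketched as follows. This is an illustrative serialization only: the real stages run concurrently in hardware or separate threads, whereas this loop executes one pipeline cycle at a time; `run_pipeline` and its stage callables are hypothetical names:

```python
def run_pipeline(superframes, preprocess, first_pass, second_pass):
    """Drive the three-stage pipeline one superframe per cycle: while the
    second pass works on superframe n, the first pass works on n+1 and
    the preprocessor on n+2. Returns second-pass outputs in order."""
    pre_out = {}    # preprocessor results, keyed by superframe index
    first_out = {}  # first-pass complexity results, keyed by index
    encoded = []
    n = len(superframes)
    for cycle in range(n + 2):  # two extra cycles to drain the pipeline
        if cycle < n:
            pre_out[cycle] = preprocess(superframes[cycle])
        if 1 <= cycle and cycle - 1 < n:
            i = cycle - 1
            first_out[i] = first_pass(superframes[i], pre_out[i])
        if 2 <= cycle and cycle - 2 < n:
            i = cycle - 2
            encoded.append(second_pass(superframes[i], first_out[i]))
    return encoded
```

Each superframe thus incurs a fixed latency of two pipeline cycles (e.g., about 2 seconds for a 1-second window) before its final encoding emerges, which is the "relatively short delay" the text describes as acceptable for live content.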
First pass encoder 406 encodes a frame of multimedia data for a predetermined time period. However, the actual encoded multimedia data frame is not necessarily used and is typically discarded. A complexity measure, such as the number of bits used to encode a frame at a given quality level, is provided to the multiplexer 106. Other reusable data, such as the calculation of Motion Vectors (MVs) and the sum of absolute pixel differences (SAD), may be provided from the first pass encoder 406 to the second pass encoder 408.
The complexity measure is used by the multiplexer 106 to allocate the bits available for the combined multiplexed data among the various channels. For example, the multiplexer 106 may use the bit allocation state 330 of the process described above in connection with FIG. 3. The allocated bits are then provided to the second pass encoder 408 for encoding those frames.
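One simple way a multiplexer could divide the available bits among channels in proportion to their reported complexity measures is sketched below; the proportional rule is an illustrative assumption, not the specific allocation process of FIG. 3:

```python
def allocate_bits(total_bits, complexities):
    """Split the bits available for one superframe across channels in
    proportion to each channel's estimated encoding complexity."""
    total_c = sum(complexities)
    if total_c == 0:
        # Degenerate case: no complexity reported, share equally.
        return [total_bits // len(complexities)] * len(complexities)
    return [int(total_bits * c / total_c) for c in complexities]
```

A channel whose look-ahead complexity doubles thus receives roughly twice the bits, which is what lets its quality stay quasi-constant while the aggregate rate stays fixed.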
The second pass encoder 408 then encodes the frame in the next cycle of the pipeline, and the encoded frame is provided to the multiplexer 106 for multiplexing with data from other channels and transmission over the transmission medium in subsequent cycles of the pipeline.
Fig. 5, 6 and 7 illustrate system variations similar to those shown in fig. 4. Other variations will be readily ascertainable by those skilled in the art. For example, one variation of the system illustrated in FIG. 4 is a system without the optional preprocessor 404. Removing the preprocessor 404 eliminates one stage of the pipeline and that stage's delay.
In the variation of fig. 5, the preprocessor 502 performs an analysis that is different from, and less complex than, the first pass encoding, and provides a complexity measure. This may be applicable to systems where, for example, processing power is limited. For example, the complexity may be determined using a bandwidth ratio metric as will be discussed below.
The variation illustrated in fig. 5 includes a data memory 402, a preprocessor 502, an encoder 504, a multiplexer 106, and a transmission medium 108. The multiplexer 106 may allocate a bit rate for the encoder 504 based on the complexity estimate from the pre-processor 502. The mapping between another metric, such as the bandwidth ratio from the pre-processor 502, and the bit-rate control of the encoder 504 may be linear, non-linear, mapped within a selected range via a look-up table, etc.
The variation illustrated in fig. 6 does not interact with the multiplexer. The variation of fig. 6 includes a data store 402, an optional preprocessor 404, a first pass encoder 602, and a second pass encoder 604. It should be appreciated that the number of encoding passes may be other than two. Here, the results of the first pass encoder are used by the second pass encoder without bit rate allocation based on the complexity of other channels. This may be applicable in systems where buffering allows a certain degree of bit rate variation to be tolerated, but a near-constant bit rate encoding is desired. In the illustrated system, the first pass encoder 602 performs a first pass encoding on the frames within a predetermined time period (also referred to as a superframe), and the complexity information is used to configure the encoding of the second pass encoder 604. One difference between the variation of fig. 6 and a conventional two-pass system is that a conventional two-pass system processes the entire media clip in each pass, which is impractical for certain types of data, such as live data.
The variation illustrated in fig. 7A also does not interact with the multiplexer. The variant includes a data store 402, a preprocessor 702, and a first pass encoder 704. The preprocessor 702 provides a time-windowed complexity measure to the first pass encoder 704. First pass encoder 704 uses the information to select a bit rate for encoding of corresponding multimedia data.
The variation illustrated in fig. 7B may, but need not, interact with the multiplexer. The variant comprises encoding complexity determining means 712 for providing a complexity indication to the encoding means 714. In one example, the encoding complexity determining means 712 may correspond to the pre-processor 502 and the encoding means 714 may correspond to the encoder 504. In another example, the encoding complexity determining means 712 may correspond to the first pass encoder 602 and the encoding means 714 may correspond to the second pass encoder 604. In one aspect, the encoding complexity determining means 712 provides complexity information for time-windowed data subsequently encoded by the encoding means 714. In another aspect, the complexity information may be provided to a multiplexer or another module, which may then provide an adjusted bit rate to the encoding means 714.
Fig. 8 is a system diagram showing an example of a processor 800 for allocating available bandwidth of a multiplexing system according to the needs or coding complexity of various multimedia data sources. For example, such a processor may be incorporated within the multiplexer 106 or may be independent of the multiplexer 106.
The processor 800 illustrated in fig. 8 includes a complexity collection circuit 802 and a dispatch circuit 804. The complexity collection circuit 802 retrieves and collects complexity estimates from, for example, multiple first pass encoders. The total bit rate 806, which may be, for example, a data value stored in a register, is allocated among the corresponding multimedia data. For example, there may be various bit rates set aside for high, medium, and low complexity, and a bit rate may then be selected based at least in part on the relative complexity estimates and provided to the dispatch circuit 804 accordingly. The dispatch circuit 804 then communicates with, for example, a second pass encoder, which uses the assigned bit rate for bit rate control at the superframe level.
Other aspects will now be described. Rate control may be implemented at multiple levels. It should be noted that in addition to rate control at the group of pictures (GOP) level and for intra and inter macroblocks, rate control at the "superframe" level, that is, rate control for a fixed number of consecutive pictures or for a window of data, may also be used. For example, rate control may be implemented within the other levels using conventional techniques as well as those described herein. It should also be noted that the size of the group of pictures may be larger or smaller than the size of the superframe, depending on the content and the size chosen for the superframe.
Bandwidth map generation
The human visual quality V may be a function of both the coding complexity C and the allocated bits B (also referred to as bandwidth).
It should be noted that the coding complexity measure C takes into account spatial and temporal frequencies from a human visual perspective. For distortions to which the human eye is more sensitive, the complexity value will be correspondingly higher. It can generally be assumed that V decreases monotonically with C and increases monotonically with B. An example of such a 3-D relationship is depicted in fig. 9.
To obtain constant visual quality, the ith object to be encoded (a frame or an MB) is assigned a bandwidth (Bi), and the bandwidth (Bi) satisfies the criteria expressed in equations 1 and 2.
Bi = B(Ci, V) (equation 1)
B = Σi Bi (equation 2)
In equations 1 and 2, Ci is the encoding complexity of the ith object, B is the total available bandwidth, and V is the visual quality achieved for the object. Human visual quality is difficult to express as an equation. Therefore, the above equation set is not precisely defined. However, if it is assumed that the 3-D model is continuous in all variables, the bandwidth ratio (Bi/B) can be treated as unchanged within the neighborhood of a (C, V) pair. The bandwidth ratio βi is defined in equation 3.
βi = Bi/B (equation 3)
The bit allocation problem may then be defined as expressed in equation 4.
βi = β0i, when (Ci, V) ∈ δ(C0, V0) (equation 4)
In equation 4 above, δ indicates a "neighborhood".
Coding complexity is affected by human visual sensitivity, both spatial and temporal. Girod's human visual model is one model that can be used to define spatial complexity. This model takes into account local spatial frequencies and ambient lighting. The resulting measure is called Dcsat. At the pre-processing point in the process, it is not known whether a picture will be intra-coded or inter-coded, so bandwidth ratios for both cases are generated. For intra-coded pictures, the bandwidth ratio is expressed in equation 5.
βINTRA = β0INTRA · log10(1 + αINTRA · Y² · Dcsat) (equation 5)
In the above equation, Y is the average luminance component of the MB, αINTRA is a weighting factor for the luminance-squared and Dcsat term that follows it, and β0INTRA is a normalization factor. For example, an αINTRA value of 4 results in good visual quality. The value of the scaling factor β0INTRA is not critical, so long as bits are allocated according to the ratios of βINTRA between the different video objects.
To understand the relationship, it should be noted that bandwidth is allocated in a logarithmic relationship to coding complexity. The luminance-squared term reflects the fact that coefficients with larger magnitudes are encoded using more bits. To prevent the logarithm from becoming negative, 1 is added to the term in parentheses. Logarithms with other bases may also be used.
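Equation 5 can be computed as sketched below; the Dcsat input is assumed to be supplied by the perceptual model, and the default parameter values (αINTRA = 4, β0INTRA = 1) follow the text above, while the function name is illustrative:

```python
import math

def beta_intra(y_mean, d_csat, alpha_intra=4.0, beta0_intra=1.0):
    """Bandwidth ratio for an intra-coded object (equation 5):
    beta_INTRA = beta0_INTRA * log10(1 + alpha_INTRA * Y^2 * D_csat).
    Adding 1 inside the logarithm keeps the result non-negative."""
    return beta0_intra * math.log10(1.0 + alpha_intra * y_mean ** 2 * d_csat)
```

The ratio is zero for a flat dark block and grows only logarithmically with complexity, consistent with the allocation rule described above.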
The bit allocation of inter coded pictures needs to take into account spatial as well as temporal complexity. This is expressed in equation 6 below.
βINTER = β0INTER · log10(1 + αINTER · SSD · Dcsat · exp(−γ‖MVP + MVN‖²)) (equation 6)
In equation 6, MVP and MVN are the forward and backward motion vectors of the current MB. It can be noted that the Y² term of the INTRA formula is replaced by SSD, which denotes the sum of squared differences.
To understand the role of ‖MVP + MVN‖² in equation 6, note the following feature of the human visual system: regions undergoing smooth, predictable motion (small ‖MVP + MVN‖²) attract attention and can be tracked by the eye, and generally cannot tolerate more distortion than static regions. However, regions undergoing fast or unpredictable motion (large ‖MVP + MVN‖²) cannot be tracked and can tolerate significant quantization. Experiments show that an αINTER value of 1 and a γ value of 0.001 achieve good visual quality.
βINTRA and βINTER may be calculated for each frame, after which a frame type determination is performed. If βINTER/βINTRA ≥ T, or if a scene change is detected, the frame is encoded as an I-frame; otherwise, it is encoded as a P-frame or a B-frame. The number of consecutive B-frames is content adaptive.
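The inter bandwidth ratio of equation 6 and the frame-type decision can be sketched as follows; the threshold T and all inputs are illustrative, the motion vectors are assumed to be 2-D tuples, and β0INTER is taken as 1:

```python
import math

def beta_inter(ssd, d_csat, mv_p, mv_n, alpha_inter=1.0, gamma=0.001):
    """Bandwidth ratio for an inter-coded object (equation 6); smooth,
    trackable motion (small ||MV_P + MV_N||^2) keeps the ratio high,
    while fast unpredictable motion suppresses it."""
    mv_norm_sq = (mv_p[0] + mv_n[0]) ** 2 + (mv_p[1] + mv_n[1]) ** 2
    return math.log10(
        1.0 + alpha_inter * ssd * d_csat * math.exp(-gamma * mv_norm_sq))

def choose_frame_type(b_inter, b_intra, threshold, scene_change=False):
    """Encode as an I-frame when inter coding offers no saving
    (beta_INTER / beta_INTRA >= T) or when a scene change is detected."""
    if scene_change or (b_intra > 0 and b_inter / b_intra >= threshold):
        return "I"
    return "P/B"
```

The motion term means a macroblock with large, erratic combined motion receives a smaller share of bits, matching the visual-tracking argument above.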
GOP level Rate Control (RC) and frame level RC
Different picture types have different coding efficiencies and different error robustness. The encoder and the decoder each maintain a buffer. The virtual buffer size determines how much burstiness can be tolerated. For example, if the virtual buffer size is set to hold an average of eight seconds of multimedia data, the Rate Control (RC) algorithm will maintain the average bit rate over an eight-second window. The instantaneous bit rate may be much higher or lower than the average bit rate, but over any eight seconds of data, the average bit rate should remain close to the target bit rate.
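A minimal check of this windowed-average behavior is sketched below, assuming a sliding window of fps × window_s frames and an illustrative tolerance; it verifies that the average rate stays near the target even when instantaneous rates swing widely:

```python
from collections import deque

def average_rate_ok(frame_bits, fps, window_s, target_bps, tol=0.05):
    """Check that over every full sliding window of `window_s` seconds
    the average bit rate stays within `tol` of the target, even though
    the instantaneous rate may vary (virtual-buffer behaviour)."""
    win = int(fps * window_s)
    window = deque(maxlen=win)
    for bits in frame_bits:
        window.append(bits)
        if len(window) == win:
            avg_bps = sum(window) * fps / win  # bits/window -> bits/second
            if abs(avg_bps - target_bps) > tol * target_bps:
                return False
    return True
```

With an eight-second window, this corresponds to a virtual buffer sized for eight seconds of data, as in the example above.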
The buffer fullness of the encoder buffer and that of the decoder buffer are dual to each other. An overflow event of the decoder buffer corresponds to an underflow event of the encoder buffer, and an underflow event of the decoder buffer correspondingly corresponds to an overflow event of the encoder buffer.
I frame QP assignment
At the start of the bitstream, a Quantization Parameter (QP) is calculated for the first I frame from the average bits-per-pixel (bpp) value, as illustrated by equation 7.
bpp = u(n1,0) / (F × W × H) (equation 7)
The equation is taken from the Joint Video Team (JVT) proposal. In the above equation, W and H are the width and height of the picture, respectively.
The encoder records the QPs of previously encoded P pictures. A smoothed QP is obtained by equation 8.
QP′I = (1 − α) × QP′I + α × QPP (equation 8)
In the above equation, α is an exponential weighting factor and QPP is the QP of the most recently encoded P frame. The QP for the first I frame in the GOP may be calculated using equation 9.
QPI = QP′I + ΔQP (equation 9)
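Equations 8 and 9 amount to an exponential smoothing followed by an offset, as in this sketch; the α and ΔQP values used are illustrative:

```python
def smooth_i_frame_qp(qp_i_prev, qp_p_last, alpha=0.5, delta_qp=0):
    """Equations 8 and 9: exponentially smooth the running I-frame QP
    toward the most recently coded P-frame QP, then apply the offset
    delta_qp to obtain the QP for the first I frame of the GOP."""
    qp_i_smoothed = (1 - alpha) * qp_i_prev + alpha * qp_p_last
    return qp_i_smoothed + delta_qp
```

The smoothing keeps I-frame QPs from jumping abruptly relative to the surrounding P frames.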
P frame QP assignment
As represented by equation 10, the lower bound of the frame size is initialized at the beginning of a group of pictures (GOP).
(equation 10)
Equation 10 is taken from JVT, where B(ni,0) is the buffer occupancy after encoding the (i−1)th GOP, B(n0,0) = 0 so that the buffer is empty at the beginning, u(ni,j) is the available channel bandwidth, and F is the predefined frame rate.
The encoding of an I-frame may take many bits. As represented by equation 11, the upper bound of the frame size is initialized at the beginning of a group of pictures (GOP).
U(ni,0) = (bI − B(ni,0)) × ω (equation 11)
Equation 11 above is taken from JVT, where bI is the maximum number of bits consumed by the initial I-frame and ω is the overflow protection factor for the buffer.
After each frame is encoded, the lower and upper bounds may be updated as expressed by equations 12 and 13.
(equation 12)
(equation 13)
Equations 12 and 13 are taken from JVT, where b(ni,j) is the number of bits produced by the jth frame in the ith GOP.
The remaining foreseen total bits may be calculated as expressed in equation 14.
(equation 14)
In equation 14, NRemainingLookAhead is the number of remaining look-ahead frames, βRemainingLookAhead is the corresponding bandwidth measure of the remaining look-ahead frames, and βProjectedBWSmoothingWin is the bandwidth measure of the projected bandwidth smoothing window. It can be seen that the task of emptying the buffer is distributed over the bandwidth smoothing window, the size of which can be specified by the user. For example, to maintain an average bit rate W over every eight seconds of video, the bandwidth smoothing window should be set to 8 times the frame rate (in frames per second).
The projection size of the P frame is expressed in equation 15.
(equation 15)
In the above equation, β(ni,j) is the bandwidth metric of the jth frame in the ith GOP, and it can be obtained from the bandwidth map (B).
To prevent underflow and overflow of the buffer occupancy, the clipping functions of equations 16 and 17 are applied to R(ni,j).
R(ni,j) = max{L(ni,j), f(ni,j)} (equation 16)
R(ni,j) = min{U(ni,j), f(ni,j)} (equation 17)
Equations 16 and 17 are taken from JVT. Quadratic modeling described in JVT may be applied to calculate Qstep and QP, where S is the SAD value obtained from motion estimation. Parameters in the quadratic model may be updated frame-by-frame using linear regression.
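The clipping of equations 16 and 17 is a simple clamp of the projected frame size to the buffer-derived bounds, sketched here:

```python
def clip_frame_target(f, lower, upper):
    """Equations 16 and 17: clamp the projected frame size f(n_i,j)
    between the lower bound L(n_i,j) and upper bound U(n_i,j) derived
    from buffer occupancy, yielding R(n_i,j)."""
    return min(upper, max(lower, f))
```

Values inside the bounds pass through unchanged; values outside are pulled to the nearest bound, preventing buffer underflow and overflow.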
As can be seen from the above discussion, the INTER frame RC is similar to the JVT proposal, with the following improved characteristics: large virtual buffers allow sufficient bit rate variation for quasi-constant quality; frame sizes are allocated according to the frame complexity map; and look-ahead frames (superframes) are used to exploit anti-causal statistics.
B frame QP assignment
This part may be the same as the JVT proposal. Equation 18 may be used if there is only one B frame between an I frame pair or a P frame pair.
Equation 18 above is taken from JVT, where QP1 and QP2 are the quantization parameters of the I-frame pair or P-frame pair.
Equation 19 may be used if the number of B frames between an I frame pair or a P frame pair is L (L > 1).
(equation 19)
The above equations are taken from JVT, where the difference between the quantization parameter of the first B frame and QP1 is determined by equation 20a.
α = max{−3, min{2, 2×L + QP2 − QP1}} (equation 20a)
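Assuming equation 20a is the JVT-style clamp of the offset 2L + QP2 − QP1 to the range [−3, 2], it can be sketched as:

```python
def first_b_frame_qp_offset(qp1, qp2, num_b):
    """Offset between the first B-frame QP and QP1 when num_b (L) > 1
    B-frames sit between an I/P pair: 2*L + QP2 - QP1, clamped to the
    range [-3, 2] (the clamped form of the JVT rule)."""
    return max(-3, min(2, 2 * num_b + qp2 - qp1))
```

The clamp keeps the first B-frame QP within a few steps of QP1 regardless of how far apart the anchor-frame QPs are.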
Superframe level Rate Control (RC)
Superframe level rate control is used so that the size of the superframe does not exceed a maximum specified value. In one aspect, if multi-pass encoding is implemented, the super-frame level RC is not implemented in the first pass.
After encoding the entire superframe, the encoder verifies whether the size of the superframe is below a limit. If not, the bandwidth map for each frame in the current superframe may be scaled down as described by equation 20 b. In equation 20b, P is a protection factor between 0 and 1.
(equation 20b)
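One plausible reading of the scale-down step is sketched below, assuming equation 20b scales each frame's bandwidth share by P × (limit / actual size) when the limit is exceeded; the exact form of equation 20b is not recoverable from the text, so this is an assumption:

```python
def scale_bandwidth_map(frame_bw, superframe_bits, max_bits, protection=0.9):
    """If an encoded superframe exceeds its size limit, scale the
    bandwidth map entry of each frame down; `protection` is the factor
    P in (0, 1) that leaves headroom below the limit. Assumed form of
    equation 20b, not taken verbatim from the source."""
    if superframe_bits <= max_bits:
        return list(frame_bw)  # within limit, no scaling needed
    scale = protection * max_bits / superframe_bits
    return [bw * scale for bw in frame_bw]
```

Scaling all frames by a common factor preserves the relative bit allocation within the superframe while bringing its total size under the limit.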
Basic cell level Rate Control (RC)
A basic unit may be one or more macroblocks.
INTRA basic unit QP adjustment
The relationship expressed in equation 21 can be used for Intra base unit QP adjustment.
(equation 21)
In equation 21, A is a non-negative parameter determined by experiment, and β̄ is the average bandwidth ratio of the video object's neighborhood. An A value of 0.08 results in good visual quality. The change in QP is further constrained by prescribed limits to prevent abrupt changes.
INTER base unit QP adjustment
INTER base units QP are assigned according to a quadratic model taken from JVT as shown in equation 22.
(equation 22)
Adaptive base unit size
The size of the base unit determines how often the QP can be adjusted. However, excessive QP adjustment increases overhead. Adaptive base unit sizing groups together macroblocks (MBs) with similar QPs and assigns them a single QP.
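A minimal sketch of grouping consecutive MBs with similar QPs into base units follows; the max_gap rule and the comparison against the unit's first QP are illustrative design assumptions:

```python
def group_macroblocks(mb_qps, max_gap=1):
    """Group consecutive macroblocks whose QPs are within `max_gap` of
    the unit's first QP into one base unit, each carrying a single QP,
    trading QP-adjustment granularity against signalling overhead."""
    units = []
    for qp in mb_qps:
        if units and abs(qp - units[-1][0]) <= max_gap:
            units[-1][1] += 1      # extend the current base unit
        else:
            units.append([qp, 1])  # start a new base unit
    return [(qp, count) for qp, count in units]
```

Runs of similar QPs collapse into one unit, so fewer QP changes need to be signalled, which is the overhead saving described above.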
Fig. 10 illustrates an example of a method for encoding multimedia data. The method comprises the following steps: encoding 1010 first multimedia data corresponding to the selected window of data; and encoding 1020 second multimedia data different from the first multimedia data, the second multimedia data corresponding to the selected window of data when the first multimedia data is re-encoded.
Fig. 11 illustrates an example of an apparatus for encoding multimedia data. The apparatus comprises: encoding means 1110 for encoding first multimedia data corresponding to a selected window of data and second multimedia data different from the first multimedia data, the second multimedia data corresponding to the selected window of data; and means 1120 for re-encoding the first multimedia data when the encoding means 1110 encodes the second multimedia data.
Those of skill in the art would understand that information and signals may be represented using any of a variety of different technologies and techniques. For example, data, instructions, commands, information, signals, bits, symbols, and chips that may be referenced throughout the above description may be represented by voltages, currents, electromagnetic waves, magnetic fields or particles, optical fields or particles, or any combination thereof.
Those of skill would further appreciate that the various illustrative logical blocks, modules, and algorithm steps described in connection with the examples disclosed herein may be implemented as electronic hardware, firmware, computer software, middleware, microcode, or combinations thereof. To clearly illustrate this interchangeability of hardware and software, various illustrative components, blocks, modules, circuits, and steps have been described above generally in terms of their functionality. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the overall system. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the disclosed methods.
The various illustrative logical blocks, modules, and circuits described in connection with the embodiments disclosed herein may be implemented or performed with the following means: a general purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or any combination thereof designed to perform the functions described herein. Some or all of the system may be implemented in a processor not shown. A general purpose processor may be a microprocessor, but in the alternative, the processor may be any conventional processor, controller, microcontroller, or state machine. A processor may also be implemented as a combination of computing devices, e.g., a combination of a DSP and a microprocessor, a plurality of microprocessors, one or more microprocessors in conjunction with a DSP core, or any other such configuration.
The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. A software module may reside in RAM memory, flash memory, ROM memory, EPROM memory, EEPROM memory, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art. An exemplary storage medium is coupled to the processor such that the processor can read information from, and write information to, the storage medium. In the alternative, the storage medium may be integral to the processor. The processor and the storage medium may reside in an ASIC. The ASIC may then reside in a user terminal. In the alternative, the processor and the storage medium may reside as discrete components in a user terminal.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other examples as well, and additional elements may also be added. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
Claims (53)
1. A method of encoding received video data, the method comprising:
determining a first encoding complexity of a first portion of the video data; and
encoding the first portion of video data based at least in part on the first encoding complexity.
2. The method of claim 1, further comprising:
receiving a target bit rate determined at least in part from the first encoding complexity, wherein encoding further comprises encoding the first portion to enable transmission of the encoded first portion at a bit rate approximating the target bit rate.
3. The method of claim 1, further comprising:
wherein the first encoding complexity is based on encoding of the first portion;
receiving a target bit rate determined at least in part from the first encoding complexity, wherein encoding further comprises re-encoding the first portion to enable transmission of the re-encoded first portion at a bit rate approximating the target bit rate.
4. The method of claim 1, further comprising:
wherein the first encoding complexity is not based on an encoding process;
a second encoding complexity of a first portion of the video data is retrieved, wherein a target bit rate is determined based at least in part on the first encoding complexity and the second encoding complexity.
5. The method of claim 1, wherein the first portion corresponds to video data for a predetermined period of time, the predetermined period of time being less than the duration of a video clip.
6. The method of claim 5, wherein the predetermined period of time is about 1 second.
7. The method of claim 1, wherein the first portion corresponds to a predetermined number of frames of video data.
8. The method of claim 1, wherein the first encoding complexity is based at least in part on a logarithm of a metric of an object, the metric being based on a combination of local spatial frequency and ambient light.
9. The method of claim 8, wherein the object is a macroblock or a frame.
10. An apparatus for encoding received video data, the apparatus comprising:
means for determining a first encoding complexity of a first portion of the video data; and encoding means for encoding the first portion of video data according, at least in part, to the first encoding complexity.
11. The apparatus of claim 10, further comprising means for receiving a target bit rate determined at least in part according to the first encoding complexity, wherein encoding further comprises encoding the first portion to enable transmission of the encoded first portion at a bit rate approximating the target bit rate.
12. The apparatus of claim 10, wherein the first encoding complexity is based on encoding the first portion, further comprising means for receiving a target bit rate determined at least in part according to the first encoding complexity, wherein the encoding means further comprises means for re-encoding the first portion to enable transmission of the re-encoded first portion at a bit rate that approximates the target bit rate.
13. The apparatus of claim 10, further comprising:
wherein the first encoding complexity is not based on an encoding process;
means for retrieving a second encoding complexity for a first portion of the video data, wherein a target bit rate is determined based at least in part on the first encoding complexity and the second encoding complexity.
14. The apparatus of claim 10, wherein the first portion corresponds to video data for a predetermined period of time, the predetermined period of time being less than a duration of a video clip.
15. The apparatus of claim 14, wherein the predetermined period of time is about 1 second.
16. The apparatus of claim 10, wherein the first portion corresponds to a predetermined number of frames of video data.
17. The apparatus of claim 10, wherein the first encoding complexity is based at least in part on a logarithm of a metric of an object, the metric being based on a combination of local spatial frequency and ambient light.
18. The apparatus of claim 17, wherein the object is a macroblock or a frame.
19. An apparatus for encoding received video data, the apparatus comprising:
a processor configured to determine a first encoding complexity of a first portion of the video data; and
an encoder configured to encode the first portion of video data according to, at least in part, the first encoding complexity.
20. The apparatus of claim 19, wherein the encoder is further configured to receive a target bit rate determined at least in part according to the first encoding complexity, wherein the encoder is further configured to encode the first portion to enable transmission of the encoded first portion at a bit rate approximating the target bit rate.
21. The apparatus of claim 19, wherein the first encoding complexity is based on an encoding of the first portion, wherein the encoder is configured to receive a target bit rate determined at least in part according to the first encoding complexity, wherein the encoder is further configured to re-encode the first portion to enable transmission of the re-encoded first portion at a bit rate that approximates the target bit rate.
22. The apparatus of claim 19, wherein the first encoding complexity is not based on an encoding process, wherein the encoder is further configured to retrieve a second encoding complexity for a first portion of the video data, wherein a target bit rate is determined at least in part from the first encoding complexity and the second encoding complexity.
23. The apparatus of claim 19, wherein the first portion corresponds to video data for a predetermined period of time, the predetermined period of time being less than a duration of a video clip.
24. The apparatus of claim 23, wherein the predetermined period of time is about 1 second.
25. The apparatus of claim 19, wherein the first portion corresponds to a predetermined number of frames of video data.
26. The apparatus of claim 19, wherein the first encoding complexity is based at least in part on a logarithm of a metric of an object, the metric being based on a combination of local spatial frequency and ambient light.
27. The apparatus of claim 26, wherein the object is a macroblock or a frame.
28. The apparatus of claim 26, further comprising a multiplexer, wherein the multiplexer is configured to provide a target bit rate for the encoder based at least in part on an encoding complexity.
29. A computer-program product embodied in a tangible medium and having instructions for encoding received video data, the computer-program product comprising:
means for determining a first encoding complexity of a first portion of the video data; and
means for encoding the first portion of video data based at least in part on the first encoding complexity.
30. The computer program product of claim 29, further comprising:
means for receiving a target bit rate determined at least in part according to the first encoding complexity, wherein the means for encoding further comprises instructions for encoding the first portion to enable transmission of the encoded first portion at a bit rate approximating the target bit rate.
31. The computer program product of claim 29, further comprising:
wherein the first encoding complexity is based on encoding of the first portion;
means for receiving a target bit rate determined at least in part according to the first encoding complexity, wherein the means for encoding further comprises instructions for re-encoding the first portion to enable transmission of the re-encoded first portion at a bit rate approximating the target bit rate.
32. The computer program product of claim 29, further comprising:
wherein the first encoding complexity is not based on an encoding process;
a module having instructions for retrieving a second encoding complexity for a first portion of the video data, wherein a target bit rate is determined at least in part from the first encoding complexity and the second encoding complexity.
33. The computer program product of claim 29 wherein the first portion corresponds to video data for a predetermined period of time, the predetermined period of time being less than the duration of a video clip.
34. The computer program product of claim 33, wherein the predetermined period of time is approximately 1 second.
35. The computer program product of claim 29, wherein the first portion corresponds to a predetermined number of frames of video data.
36. The computer program product of claim 29, wherein the first encoding complexity is based at least in part on a logarithm of a metric for an object, the metric being based on a combination of local spatial frequency and ambient light.
37. The computer program product of claim 36, wherein the object is a macroblock or a frame.
38. A method for encoding multimedia data, comprising:
encoding first multimedia data, the first multimedia data corresponding to the selected window of data; and
encoding second multimedia data different from the first multimedia data, the second multimedia data corresponding to the selected window of data when the first multimedia data is re-encoded.
39. The method of claim 38, wherein the selected window of data corresponds to about 1 second of data.
40. The method of claim 38, wherein the first multimedia data comprises video data and encoding the first multimedia data comprises:
determining an encoding complexity of the video data; and
encoding the video data.
41. The method of claim 38, wherein re-encoding the first multimedia data uses encoding statistics from encoding the first multimedia data.
42. An apparatus for encoding multimedia data, comprising:
encoding means for encoding first multimedia data corresponding to a selected window of data and second multimedia data different from the first multimedia data, the second multimedia data corresponding to the selected window of data; and
re-encoding means for re-encoding the first multimedia data while the encoding means encodes the second multimedia data.
43. The apparatus of claim 42, wherein the selected window of data corresponds to about 1 second of data.
44. The apparatus of claim 42, wherein the first multimedia data comprises video data and the encoding means comprises:
means for determining an encoding complexity of the video data; and
encoding means for encoding the video data.
45. The apparatus of claim 42, wherein the re-encoding means re-encodes the first multimedia data using encoding statistics from the encoding means.
46. An apparatus for encoding multimedia data, comprising:
a first encoder configured to encode first multimedia data corresponding to a selected window of data and to encode second multimedia data different from the first multimedia data, the second multimedia data corresponding to the selected window of data; and
a second encoder configured to re-encode the first multimedia data while the first encoder is encoding the second multimedia data.
47. The apparatus of claim 46, wherein the selected window of data corresponds to about 1 second of data.
48. The apparatus of claim 46, wherein the first multimedia data comprises video data and the first encoder comprises:
a processor configured to determine an encoding complexity of the video data; and
an encoder configured to encode the video data.
49. The apparatus of claim 46, wherein the second encoder is configured to receive encoding statistics from the first encoder to re-encode the first multimedia data.
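Claims 46-49 (and the parallel means-plus-function claims 42-45) describe two encoders running concurrently: the first encoder performs the first pass on the current window while the second encoder re-encodes the previous window using statistics received from the first. A minimal concurrency sketch under assumed interfaces follows; the queue-with-sentinel handoff and the toy statistics are illustrative choices, not details from the claims.

```python
import queue
import threading

def first_encoder(windows, stats_q):
    """First encoder: first-pass encode each window, then hand the window
    and its statistics (a toy summed 'complexity') to the second encoder."""
    for i, window in enumerate(windows):
        stats = {"index": i, "complexity": sum(window)}
        stats_q.put((window, stats))
    stats_q.put(None)  # sentinel: no more windows to re-encode

def second_encoder(stats_q, results):
    """Second encoder: re-encode each window using the received statistics,
    running concurrently with the first encoder's next window (claim 49)."""
    while True:
        item = stats_q.get()
        if item is None:
            break
        window, stats = item
        # A real re-encode would use stats to set quantization; here we
        # just record which window was re-encoded and its complexity.
        results.append((stats["index"], stats["complexity"]))

def run_pipeline(windows):
    """Wire the two encoders together with a FIFO statistics channel."""
    stats_q = queue.Queue()
    results = []
    t1 = threading.Thread(target=first_encoder, args=(windows, stats_q))
    t2 = threading.Thread(target=second_encoder, args=(stats_q, results))
    t1.start(); t2.start()
    t1.join(); t2.join()
    return results
```

Because the queue is FIFO and consumed by a single thread, windows are re-encoded in order even though the two encoders overlap in time.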
50. A computer-program product embodied in a tangible medium and having instructions for encoding multimedia data, the computer-program product comprising:
a module having instructions for encoding first multimedia data corresponding to a selected window of data and encoding second multimedia data different from the first multimedia data, the second multimedia data corresponding to the selected window of data; and
a module having instructions for re-encoding the first multimedia data while the module having instructions for encoding is encoding the second multimedia data.
51. The computer-program product of claim 50, wherein the selected window of data corresponds to about 1 second of data.
52. The computer-program product of claim 50, wherein the first multimedia data comprises video data and the module having instructions for encoding the first multimedia data comprises:
a module having instructions for determining an encoding complexity of the video data; and
a module having instructions for encoding the video data.
53. The computer-program product of claim 50, wherein the module having instructions for re-encoding the first multimedia data comprises instructions for using encoding statistics received from the module having instructions for encoding the first multimedia data.
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US60/660,908 | 2005-03-10 | | |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| HK1114280A true HK1114280A (en) | 2008-10-24 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| US8565301B2 (en) | Quasi-constant-quality rate control with look-ahead | |
| JP5351040B2 (en) | Improved video rate control for video coding standards | |
| US7974341B2 (en) | Rate control for multi-layer video design | |
| CA2688249C (en) | A buffer-based rate control exploiting frame complexity, buffer level and position of intra frames in video coding | |
| JP3756346B2 (en) | Method and system for processing multiple streams of video frames | |
| EP1639801B1 (en) | Multipass video rate control to match sliding window channel constraints | |
| KR100943875B1 (en) | Context-Adaptive Bandwidth Adjustment in Video Rate Control | |
| CN101855910A (en) | Video compression and transmission techniques | |
| WO2002096120A1 (en) | Bit rate control for video compression | |
| US7274739B2 (en) | Methods and apparatus for improving video quality in statistical multiplexing | |
| JP4362794B2 (en) | Video encoding apparatus and method | |
| HK1114280A (en) | Quasi-constant-quality rate control with look-ahead | |
| KR100949755B1 (en) | Method and apparatus for controlling rate of video sequence, video encoding apparatus | |
| Aliabad et al. | No-reference H.264/AVC statistical multiplexing for DVB-RCS | |
| Kim et al. | An ROI/xROI based rate control algorithm in H.264\|AVC for video telephony applications | |
| Yu et al. | Selective quality control of multiple video programs for digital broadcasting services | |
| Naito et al. | Optimal MPEG-2 encoder design for low bit-rate HDTV digital broadcasting | |
| KR20080001147A (en) | Image Coding Apparatus and Method for Low Latency Video Transmission Using Virtual Encoder Buffer | |
| HK1117684A (en) | Context-adaptive bandwidth adjustment in video rate control |