HK1141150A

HK1141150A - Dynamic stream interleaving and sub-stream based delivery

Info

Publication number: HK1141150A
Application number: HK10107299.1A
Authority: HK
Inventors: M. Luby; P. Pakzad; M. Watson; L. Vicisano; J. J. Clish
Original assignee: Digital Fountain, Inc.
Priority date: 2007-04-16
Filing date: 2008-04-16
Publication date: 2010-10-29

Description

Dynamic stream interleaving and sub-stream based delivery

Cross Reference to Related Applications

This application claims the benefit of U.S. Provisional Application No. 60/912,145, entitled "Dynamic Stream Interleaving and Sub-Stream Based Delivery", filed April 16, 2007. The contents of that application are incorporated by reference herein in their entirety for all purposes.

The present disclosure also incorporates by reference, for all purposes and as if fully set forth herein, the following commonly assigned applications/patents:

U.S. Pat. No.6,307,487 to Luby (hereinafter referred to as "Luby I");

U.S. Pat. No. 7,068,729 to Shokrollahi et al. (hereinafter referred to as "Shokrollahi I");

U.S. patent application No. 11/423,391 to Luby et al., entitled "Forward Error-Correcting (FEC) Coding and Streaming", filed June 9, 2006 (hereinafter "Luby II"); and

U.S. patent application No. 11/674,625 to Watson et al., entitled "Streaming and Buffering Using Variable FEC Overhead and Protection Period", filed February 13, 2007 (hereinafter "Watson").

Technical Field

The present invention relates to improving the delivery quality of streaming media, content move-to time, and scalable distributed delivery of streams, and to the use of FEC coding in various ways to improve streaming solutions. Streaming includes streaming of audio, video and data at constant or variable bit rates, for on-demand content, playlist content, or live broadcasts.

Background

As it is becoming more common to deliver high quality audio and video over packet-based networks, such as the internet, cellular and wireless networks, powerline networks, and many others, streaming media delivery is becoming more and more important. The quality of the delivered streaming media depends on a number of factors including the quality of the original content, the encoding quality of the original content, the ability of the receiving device to decode and display video, the timeliness and quality of the signal received at the receiver, etc. The delivery and timeliness of the signals received at the receiver is particularly important in order to establish a perceptually good streaming experience. Good delivery provides fidelity of the stream received at the receiver compared to the stream sent from the sender, while timeliness indicates how quickly the receiver can begin playing the content after initiating a request for the content.

Recently, it has become common practice to consider the use of Forward Error Correction (FEC) codes for protection of streaming media during transmission. When transmitted over packet networks, examples of which include the internet and wireless networks such as those standardized by organizations such as 3GPP, 3GPP2, and DVB, the source streams are placed into packets when generated or made available, and the packets are thus used to carry the source or content streams to the receivers in the order in which they were generated or made available.

In a typical application of FEC codes to such scenarios, an encoder uses the FEC code to create repair packets, which are then sent in addition to the original source packets containing the source stream. The repair packets have the property that, when source packet loss occurs, received repair packets can be used to recover the data contained in the lost source packets. Repair packets can be used to recover the contents of source packets that are lost entirely, but they can also be used to recover from partial packet loss, using either fully received repair packets or even partially received repair packets. Thus, fully or partially received repair packets can be used to recover fully or partially lost source packets.

In other instances, other types of corruption can affect the transmitted data, e.g. bit values may be flipped, and repair packets can then be used to correct such corruption and recover the source packets as accurately as possible. In other instances, the source stream is not necessarily sent in discrete packets, but may instead be sent, for example, as a continuous bit stream.

There are many examples of FEC codes that can be used to protect a source stream. Reed-Solomon codes are well-known codes for error and erasure correction in communication systems. For erasure correction over, for example, packet data networks, well-known efficient implementations of Reed-Solomon codes use Cauchy or Vandermonde matrices, as described in L. Rizzo, "Effective Erasure Codes for Reliable Computer Communication Protocols", Computer Communication Review, 27(2):24-36 (April 1997) (hereinafter "Rizzo") and Bloemer et al., "An XOR-Based Erasure-Resilient Coding Scheme", Technical Report TR-95-48, International Computer Science Institute, Berkeley, California (1995) (hereinafter "XOR-Reed-Solomon"), or elsewhere.

Other examples of FEC codes include LDPC codes, chain reaction codes such as those described in Luby I, and multi-level chain reaction codes such as in Shokrollahi I.

Examples of FEC decoding processes for variations of Reed-Solomon codes are described in Rizzo and XOR-Reed-Solomon. In those instances, decoding is applied after enough source and repair data packets have been received. The decoding process can be computationally complex and dependent on available CPU resources, which can take a significant amount of time to complete relative to the length of time spanned by the media in the block. The receiver has to take into account the length of time required for decoding when calculating the delay required between the start of reception of the media stream and the playback of the media. This delay due to decoding is perceived by the user as a delay between when they request a particular media stream and the start of playback. It is therefore desirable to minimize this delay.

In many applications, packets are further subdivided into symbols to which the FEC process is applied. A packet may contain one or more symbols (or less than one symbol, although symbols are usually not split across packets). A symbol may have any size, but its size is often at most the size of the packet. Source symbols are the symbols that encode the data to be transmitted. Repair symbols are symbols generated, directly or indirectly, from source symbols and sent in addition to them (i.e., the data to be transmitted can be fully recovered if all the source symbols are available and none of the repair symbols are).

Some FEC codes are block-based, where the encoding operation depends on one or more symbols in a block, and may be independent of symbols not in that block. Using block-based encoding, the FEC encoder can generate repair symbols for one block from source symbols in that block and then move to the next block, and need not involve source symbols other than the source symbols for the current block being encoded. In transmission, a source block including source symbols may be represented as an encoded block including encoded symbols (which may be some source symbols, some repair symbols, or both). In the presence of repair symbols, not all source symbols are required in each coding block.

For some FEC codes, particularly Reed-Solomon codes, the encoding and decoding time grows impractically as the number of encoded symbols per source block increases. Thus, in practice there is often an upper limit on the total number of encoded symbols that can be generated per source block (255 being an approximate practical limit for some applications). This is particularly so in the typical case where Reed-Solomon encoding or decoding is performed by dedicated hardware. For example, the MPE-FEC processes that use Reed-Solomon codes, included in the DVB-H standard to protect streams against packet loss, are implemented in dedicated hardware within a cellular telephone that is limited to a total of 255 Reed-Solomon encoded symbols per source block. Since symbols often need to be placed in separate packet payloads, this places a practical upper limit on the maximum length of the source block being encoded. For example, if a packet payload is limited to 1024 bytes or less and each packet carries one encoded symbol, an encoded source block can be at most 255 KB (kilobytes), and this is of course also an upper bound on the size of the source block itself.
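The arithmetic behind the 255 KB figure can be sketched as follows. This is only an illustration of the bound stated above, assuming one encoded symbol per packet and a symbol size equal to the payload size; the function names and the example count of 200 source symbols are hypothetical and do not come from the referenced standards.

```python
def max_encoded_block_bytes(max_encoded_symbols: int, payload_bytes: int) -> int:
    """Upper bound on the encoded source block size, assuming one
    encoded symbol per packet and symbol size == payload size."""
    return max_encoded_symbols * payload_bytes


def source_block_bytes(num_source_symbols: int, payload_bytes: int) -> int:
    """Size of the source portion when num_source_symbols of the
    encoded symbols are source symbols (the rest are repair)."""
    return num_source_symbols * payload_bytes


# 255 encoded symbols of 1024 bytes each
limit = max_encoded_block_bytes(255, 1024)
print(limit, "bytes =", limit // 1024, "KB")

# Hypothetical split: 200 source symbols + 55 repair symbols
print(source_block_bytes(200, 1024), "bytes of source data")
```

Any split between source and repair symbols stays below the encoded-block bound, which is why the bound also limits the source block itself.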

Other concerns include being able to decode source blocks fast enough to keep up with the source streaming rate, minimizing the decoding latency introduced by FEC decoding, and using only a small fraction of the available CPU on the receiving device at any point during FEC decoding.

Another concern is the ability to start playing a stream, for example using a personal computer to decode and render received audio and video streams, displaying the video on the computer screen and playing the audio through built-in speakers, or, as another example, using a set-top box to decode and render received audio and video streams, displaying the video on a television and playing the audio through a stereo system. A major issue is minimizing the delay between when the user decides to watch new content delivered as a stream and when that content starts playing, hereinafter referred to as the "content move-to time". One example of a content move-to is when a user is watching first content delivered via a first stream, decides to watch second content delivered via a second stream, and initiates an action to start watching the second content. The second stream may be sent from the same set of servers as the first stream or from a different set. Another example is when a user is visiting a website and decides to start watching first content delivered via a first stream by clicking a link in the browser window. A further example is when a user wants to seek to a new position (forward or backward) in the same content stream and start viewing from there. Minimizing the content move-to time is important for video viewing because it gives users a high-quality, fast content-surfing experience when searching and sampling a wide variety of available content. A high-quality, fast content-surfing experience is often positively correlated with the amount of content a user consumes.

The underlying FEC structure is often the main factor affecting the content move-to time. Another consideration is minimizing the time gap between the end of playback of one piece of content and the start of playback of another, which is preferably continuous, with little or no pause. For example, when one piece of content is a broadcast show and the next is an advertisement, or vice versa, a long gap between playing them (referred to herein as the "content transition time") is undesirable.

It is clearly desirable to minimize the content transition time while also minimizing the streaming rate to the receiver during the period around the transition.

Another consideration is to maximize the quality of the delivered stream when using a best-effort delivery network such as the internet, which may drop packets and cause a large variation in the amount of time it takes for packets to be delivered, while minimizing the use of network resources such as bandwidth.

Another consideration is to provide a robust and scalable stream delivery solution that allows components of the system to fail without adversely affecting the quality of the stream delivered to the receiver.

Interleaving can be used to provide better protection against channel impairments such as intermittent packet loss. For example, packet loss is often somewhat bursty, so spreading the transmission of a source block over a longer period of time can be beneficial. For some FEC codes, using large source blocks directly is practical, but for other FEC codes, such as Reed-Solomon codes, there is often a practical limit on the size of the source blocks that can be used. Therefore, in order to spread the transmission of the packets of a source block over a longer time interval, it is beneficial to interleave the transmission of packets containing encoded symbols for different source blocks.

Methods that address some of the considerations discussed above have been introduced previously. For example, some novel FEC source block formation and interleaving methods are described in Luby II. Some interleaving methods are static in the sense that the amount of interleaving is fixed for the entire stream. With such methods there is a trade-off between the quality of protection provided and the content move-to time: a larger amount of interleaving provides better stream protection but a longer content move-to time, and the trade-off is fixed for the entire duration of the stream transmission to the receiver.

Some methods, such as some of those described in Watson, provide a short content move-to time together with a larger amount of interleaving during most of the streaming. Some of the methods described in Watson transition dynamically from short initial source blocks to progressively longer source blocks, and transmit at a rate slightly faster than the content streaming rate during the transition period. Such an approach provides a short content move-to time while allowing the quality of protection to build up as the stream progresses. One way to apply some of the methods described in Watson is to determine the source block structure and perform the FEC encoding as the stream is being sent: the short-to-long source block structure is determined, and the FEC encoding performed, at whatever point in the stream each individual receiver joins, so source block structure formation and FEC encoding are performed uniquely for each receiver and the stream sent to each receiver is unique. However, it is sometimes desirable for the source block structure of a content stream to be determined independently of the delivery of the stream, e.g. independently of the receiver, of when the content is viewed and where in the content stream viewing begins, and of the order in which the data in the stream is delivered. This is particularly important if the content stream is to be delivered to a single receiver from multiple servers.

Accordingly, it is desirable to have improved processes and devices.

Disclosure of Invention

Embodiments of encoders, decoders and communication systems according to aspects of the present invention provide methods of dynamically interleaving streams, including methods of dynamically introducing greater amounts of interleaving as a stream is transmitted, independently of any source block structure. Some of the benefits of these methods are that they spread losses or errors in the channel over a much longer period of time in the original stream than if no interleaving were introduced, they provide superior protection against packet loss or packet corruption when used with FEC coding, they provide superior protection against network jitter, and they allow the content move-to time and the content transition time to be reduced to a minimum. Additional benefits include smoothing the rate of the transmitted stream, including across transitions from streaming one piece of content to another, and minimal content transition times.

Embodiments of encoders, decoders and communication systems according to aspects of the present invention may also divide a data stream into sub-streams, deliver the sub-streams to a receiver along different paths through the network, and receive at the receiver, concurrently, different sub-streams sent from potentially different servers. When used in conjunction with FEC encoding, the methods include delivering portions of the encoding of each source block from potentially different servers. Some of the benefits of these methods include improved content move-to time, robustness to server failures and path failures, robustness to disk failures, improved robustness to packet loss and/or corruption, improved scalability of the overall streaming solution, and improved balancing of content storage and streaming rate among servers.

Embodiments of encoders, decoders, and communication systems according to aspects of the invention may also combine dynamic interleaving with substream delivery. For example, the source block architecture and FEC encoding may be determined using dynamic interleaving, and the encoded stream may be divided into sub-streams, and a combination of the sub-streams may be delivered to the receiver using dynamic interleaving to provide a robust stream delivery system with minimal content move-to-time. The benefit of these combined approaches is the combination of the benefits of dynamic interleaving and sub-stream delivery.

The following detailed description and the accompanying drawings will provide a better understanding of the nature and benefits of the present invention.

Drawings

Fig. 1 is a block diagram of a communication system according to one embodiment of the present invention.

Fig. 2 is a diagram illustrating content move-to time.

Fig. 3A is a diagram illustrating the components of content move-to time.

Fig. 3B is a diagram illustrating CPU utilization during FEC decoding.

Fig. 4 is a diagram illustrating a source block structure of a content stream and a representation of the corresponding content stream rate for each source block.

Fig. 5 is a diagram illustrating an encoded block structure corresponding to the content stream of Fig. 4.

Fig. 6 is a diagram illustrating a receiver and the content move-to time corresponding to a basic transmitter method.

Fig. 7 is a diagram illustrating a streaming sub method.

Fig. 8 is a diagram illustrating static interleaving according to a band method of streaming transmission.

Fig. 9 is a diagram illustrating a receiver and the content move-to time corresponding to the static interleaved transmitter method.

Fig. 10 is a diagram illustrating a dynamic interleaved transmitter method when transmitting a new stream to a receiver.

Fig. 11 is a diagram illustrating the content move-to time and the long-term protection period experienced by a receiver for a dynamic interleaved transmitter method.

Fig. 12 is a diagram illustrating a content transition between two consecutive content segments for a dynamic interleaved transmitter method.

Fig. 13 is a diagram illustrating a content transition between two discontinuous content segments for a dynamic interleaved transmitter method.

Fig. 14 is a diagram illustrating encoded content streams to be distributed from a head-end server to respective distributed servers to be used in a sub-stream-based delivery method.

Fig. 15 is a diagram illustrating, for a sub-stream-based delivery method, a receiver that requests a content stream from various distributed servers and receives encoded content streams from some of those servers.

Detailed Description

Embodiments provide novel methods of dynamically interleaving streams, including methods of dynamically introducing greater amounts of interleaving as a stream is transmitted, independently of any source block structure, where the transmission is over a network or the like. Embodiments also provide novel methods of dividing a data stream into sub-streams, delivering the sub-streams to a receiver along different paths through the network, and receiving at the receiver, concurrently, different sub-streams sent from potentially different servers. When used in conjunction with FEC encoding, the methods include delivering portions of the encoding of each source block from potentially different servers. Embodiments also provide novel methods of combining dynamic interleaving with sub-stream delivery.

In the following, to simplify the description, the network carrying the data is assumed to be packet-based, with the recognition that one skilled in the art can readily see how the processes and methods described herein apply to other types of transport networks, such as continuous bit-stream networks. Likewise, to simplify the description, the FEC codes are assumed to protect against lost packets or lost partial data within packets, with the recognition that one skilled in the art can readily see how the processes and methods described herein apply to other types of data transmission corruption, such as bit flips. In this specification, the data to be encoded (source data) is assumed to have been divided into equal-length "symbols", which may be of any length (down to a single bit), although symbols may be of different lengths for different parts of the data.

Symbols may be carried over the data network in packets, with a whole number of symbols either explicitly carried or implied in each packet. In some cases, the length of a source packet is not a multiple of the symbol length, in which case the last symbol in the packet may be truncated. In this case, for the purposes of FEC coding, the last symbol is implicitly assumed to be padded out with a fixed pattern of bits, e.g. zero-valued bits, so that the receiver can still fill out the truncated last symbol into a full symbol even though these bits are not carried in the packet. In other embodiments, the fixed pattern of bits may be placed in the packet, effectively padding the symbols to a length equal to that of the packet. The size of a symbol is often measured in bits, where a symbol of M bits is selected from an alphabet of 2^M symbols. Non-binary digits are also contemplated, but binary bits are preferred because they are more commonly used.

The FEC codes considered for streaming are typically systematic FEC codes, i.e. the source symbols of a source block are included as part of the encoding of the source block and are thus transmitted. Those skilled in the art will recognize that the methods and processes described herein apply equally well to FEC codes that are not systematic. A systematic FEC encoder generates some number of repair symbols from a source block of source symbols, and the combination of the source symbols and at least some of the repair symbols forms the encoded symbols that are sent over the channel to represent the source block. Some FEC codes can efficiently generate as many repair symbols as needed, such as "information additive codes" or "fountain codes"; examples include "chain reaction codes" and "multi-level chain reaction codes". Other FEC codes, such as Reed-Solomon codes, can in practice generate only a limited number of repair symbols for each source block.

There are many other ways to carry symbols, and although the following description uses packets for simplicity, this is not meant to be limiting or comprehensive. In the context of the following description, the term "packet" is not meant to be restricted to what is literally sent as a single unit of data. Instead, it is meant to cover the broader notion of a logical grouping of symbols and partial symbols that may or may not be sent as a single unit of data.

There are also forms of data corruption other than symbol loss, such as symbols whose values are changed in transmission or otherwise corrupted, and the methods described below apply equally to them. Thus, although the following description often describes symbol loss, the methods apply equally well to other types of corruption and to FEC codes other than FEC erasure codes, such as FEC error-correcting codes.
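The implicit padding of a truncated last symbol can be sketched as below. This is a minimal illustration, assuming byte-aligned symbols and zero-valued padding; the function name `packet_to_symbols` is hypothetical.

```python
from typing import List


def packet_to_symbols(payload: bytes, symbol_size: int) -> List[bytes]:
    """Split a packet payload into fixed-size symbols.

    If the payload length is not a multiple of symbol_size, the last
    (truncated) symbol is filled out with zero bytes, mirroring the
    implicit padding the receiver applies for FEC purposes.
    """
    if symbol_size <= 0:
        raise ValueError("symbol_size must be positive")
    symbols = []
    for offset in range(0, len(payload), symbol_size):
        chunk = payload[offset:offset + symbol_size]
        symbols.append(chunk.ljust(symbol_size, b"\x00"))
    return symbols


# A 10-byte payload with 4-byte symbols: the third symbol is padded.
print(packet_to_symbols(b"abcdefgh12", 4))
```

Because sender and receiver both apply the same fixed pattern, the padding bytes never need to travel in the packet.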

FEC code examples

Fig. 1 is a block diagram of a communication system 100 that uses chain reaction FEC encoding. In communication system 100, an input file 101 or an input stream 105 is provided to an input symbol generator 110. Input symbol generator 110 generates a sequence of one or more input symbols (IS(0), IS(1), IS(2), ...) from the input file or stream, and each input symbol has a value and a position (shown in Fig. 1 as a parenthesized integer). The possible values of an input symbol, i.e. its alphabet, typically form an alphabet of 2^M symbols, so that each input symbol encodes M bits of the input file. The value of M is generally determined by the use of communication system 100, but a general-purpose system may include a symbol size input for input symbol generator 110, so that M can vary from use to use. The output of input symbol generator 110 is provided to an encoder 115.

Key generator 120 generates a key for each output symbol to be generated by encoder 115. Each key may be generated according to one of the methods described in Luby I or Shokrollahi I, or any comparable method that ensures that most of the keys generated for the same input file or block of data in a stream are unique, whether they are generated by this or another key generator. For example, key generator 120 may generate each key using a combination of the output of a counter 125, a unique stream identifier 130, and/or the output of a random number generator 135. The output of key generator 120 is provided to encoder 115. In other instances, such as some streaming applications, the set of keys may be fixed and reused for each block of data in the stream. In an exemplary embodiment, the number of keys that can be generated is governed by the resolution of the key generator rather than by the size or other characteristics of the input file or stream. For example, if the input is expected to be on the order of thousands of symbols or fewer, the key resolution might be 32 bits, allowing up to about 4 billion unique keys. One consequence of these relative numbers is that an encoder that encodes according to these keys can generate up to about 4 billion unique output symbols for 4000 input symbols. In practice, most communication systems will not lose a 0.999999 fraction of the symbols, so there is almost never a need to generate anywhere near 4 billion output symbols. The number of possible keys can therefore be treated as effectively infinite: keys need not be repeated, and the probability that two independent key choices yield the same key is vanishingly small. If that nevertheless ceases to hold for some reason, the resolution of the key generator can be increased so that the processes using the keys can behave as if there were an endless supply of keys.

From each key I provided by key generator 120, encoder 115 generates an output symbol, with a value B(I), from the input symbols provided by the input symbol generator.

The value of each output symbol is generated as some function of its key and of one or more of the input symbols, referred to herein as the output symbol's "associated input symbols" or simply its "associates". Typically, but not always, M is the same for input symbols and output symbols, i.e. they both encode the same number of bits. In some embodiments, the number K of input symbols is used by the encoder to select the associates. If K is not known in advance, such as when the input is a stream and K can vary from block to block in the stream, K may be just an estimate. The value K may also be used by encoder 115 to allocate storage for the input symbols.

Encoder 115 provides the output symbols to a transmitting module 140, and key generator 120 provides the key of each such output symbol to transmitting module 140. Transmitting module 140 transmits the output symbols and, depending on the key generation method used, may also transmit some data about the keys of the transmitted output symbols, over a channel 145 to a receiving module 150. Channel 145 is assumed to be an erasure channel, but that is not a requirement for proper operation of communication system 100. Modules 140, 145 and 150 may be any suitable hardware components, software components, physical media, or any combination thereof, as long as transmitting module 140 is adapted to transmit output symbols and any required data about their keys to channel 145, and receiving module 150 is adapted to receive symbols and possibly some data about their keys from channel 145. The value of K, if used to determine the associates, may be sent over channel 145, or it may be set ahead of time by agreement between encoder 115 and decoder 155.
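The key-space figures above can be checked with a short calculation. This is only a sketch under the assumption that keys are drawn independently and uniformly at random from a 32-bit space, which is one of several key generation options mentioned above; the function names are hypothetical, and the second function uses the standard birthday approximation.

```python
import math

KEY_BITS = 32
KEY_SPACE = 2 ** KEY_BITS  # number of distinct 32-bit keys


def pair_collision_probability(key_space: int = KEY_SPACE) -> float:
    """Probability that two independent uniform keys are identical."""
    return 1.0 / key_space


def any_collision_probability(num_keys: int, key_space: int = KEY_SPACE) -> float:
    """Birthday approximation of the probability that at least two of
    num_keys independent uniform keys coincide."""
    pairs = num_keys * (num_keys - 1) / 2
    return -math.expm1(-pairs / key_space)


print(KEY_SPACE)                        # about 4.29 billion keys
print(pair_collision_probability())    # chance two given keys match
print(any_collision_probability(4000))  # chance of any repeat among 4000 keys
```

Under these assumptions a given pair of keys almost never coincides, and even across a few thousand random keys a repeat is uncommon, which is consistent with requiring only that most keys be unique. A counter-based key avoids repeats entirely within one stream.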
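How a key and the value K can determine an output symbol may be sketched as follows. This is a toy illustration, not the encoder of Luby I or Shokrollahi I: it assumes the key seeds a pseudo-random generator that picks the associates, that the number of associates is chosen uniformly (real chain reaction codes use carefully designed degree distributions), and that the output value is the XOR of the associates. The function names are hypothetical.

```python
import random
from functools import reduce
from typing import List


def associates(key: int, k: int) -> List[int]:
    """Positions of the input symbols associated with this key.

    Toy rule (an assumption): seed a PRNG with the key, pick a degree
    uniformly in 1..k, then pick that many distinct positions.
    """
    rng = random.Random(key)
    degree = rng.randint(1, k)
    return sorted(rng.sample(range(k), degree))


def output_symbol(key: int, input_symbols: List[int]) -> int:
    """Value B(I) for key I: XOR of the associated input symbols."""
    positions = associates(key, len(input_symbols))
    return reduce(lambda a, b: a ^ b, (input_symbols[i] for i in positions))


symbols = [0x0F, 0xF0, 0x3C, 0xAA]  # K = 4 input symbols, M = 8 bits
key = 7
print(associates(key, len(symbols)), hex(output_symbol(key, symbols)))
```

Because the associates depend only on the key and K, a decoder that knows both can recompute which input symbols contributed to each received output symbol without that list being transmitted.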

Channel 145 may be a real-time channel such as a path through the internet or a broadcast link from a television transmitter to a television receiver or a telephone connection from one point to another, or channel 145 may be a storage channel such as a CD-ROM, disk drive, or web site, etc. Channel 145 may even be a combination of real-time and storage channels, such as a channel formed in the following manner: a person sends an input file from a personal computer to an Internet Service Provider (ISP) through a telephone line, the input file being stored on a web server and then sent to a recipient through the internet.

When channel 145 comprises a packet network, communication system 100 may not be able to assume that the relative order of any two or more packets is maintained in transit over channel 145. Thus, the key for the output symbol is determined using one or more of the key generation schemes described above, and not necessarily by the order in which the output symbols depart from the receiving module 150.

Receiving module 150 provides the output symbols to a decoder 155, and any data that receiving module 150 receives about the keys of these output symbols is provided to a key regenerator 160. Key regenerator 160 regenerates the keys for the received output symbols and provides these keys to decoder 155. Decoder 155 uses the keys provided by key regenerator 160, together with the corresponding output symbols, to recover the input symbols (again IS(0), IS(1), IS(2), ...). Decoder 155 provides the recovered input symbols to an input file reassembler 165, which generates a copy 170 of input file 101 or a copy 175 of input stream 105.

Media streaming applications

When used in media streaming applications, the source packets forming a source media stream are sometimes collected in groups called source blocks. For example, a source block may be the group of source packets spanning a fixed length of time, and, for example, a Reed-Solomon erasure code may be applied independently to each source block to generate repair packets that are sent to the receiver along with the original source packets of the source block.

At the transmitter, the source stream may be divided into source blocks on the fly as source packets arrive, with repair packets then generated and sent for each source block. It is sometimes preferable to minimize the total end-to-end delay added by the use of FEC codes, especially for live or interactive streaming applications, so it is sometimes preferable for the overall FEC design to delay source packets as little as possible at the sender before they are sent, and for all source and repair packets of a source block to be sent with as little total delay as possible. It is also preferable for the rate of the FEC encoded stream to be as smooth as possible, i.e. with as little variability as possible in the FEC encoded stream rate, or at least without amplifying any variability already present in the original source stream, because this makes the bandwidth usage of the FEC encoded stream more predictable and minimizes the impact on the network and on other, possibly competing, streams. It is also preferable for the data sent in the packets of a source block to be spread as evenly as possible over the period in which those packets are sent, since this provides the best protection against burst losses. It is also preferable for source blocks to be constructed so as to minimize the content move-to time and the content transition time. Finally, it is preferable for the logic at the receiver to be as simple as possible.

At the receiver, if a packet is lost or received with errors (which may be detected, for example, using a CRC check, and the packet discarded), then the repair packet may be used to recover the lost source packet, assuming that enough repair packets have been received.

In some applications, the packets are further subdivided into symbols to which the FEC process is applied. For some FEC codes, particularly Reed-Solomon codes, the encoding and decoding time becomes impractical as the number of coded symbols per source block grows, and there is often an upper limit on the total number of coded symbols that can be generated per source block. Since the symbols are placed in different packet payloads, this imposes a practical upper limit on the maximum length of the encoding of a source block, and therefore also on the size of the source block itself.

For many applications, it is advantageous to provide protection on data exceeding the maximum source block size when protection is to be provided over a long period of time or when the media stream rate is high. In these cases, using source blocks smaller than the maximum source block size and then interleaving source packets from different source blocks provides a solution in which source packets from individual source blocks being transmitted are spread over a larger period of time. For other applications, when short content move-to times are desired and when the source block structure is determined independently of the interleaving method, it is desirable to use shorter source blocks and start sending them in sequence when the receiver accesses the content, and then increase the amount of interleaving as content streaming continues in order to extend the sending of source blocks over a longer time interval to increase the level of protection against burst loss.

Another consideration is the ability to decode the source blocks fast enough to keep up with the source stream transmission speed, minimize the decoding delay introduced by FEC decoding, and use only a small fraction of the available CPU on the receiving device at any point during FEC decoding. It is therefore desirable to use source block interleaving that allows the FEC decoding of each source block to be spread as equally in time as possible, and that minimizes the FEC decoding delay.

Various embodiments described herein provide one or more of these advantages.

Streaming and FEC codes

To provide FEC protection for a source stream, the source stream may be a combination of one or more logical streams; examples include a combination of an audio RTP stream and a video RTP stream, a combination of a MIKEY stream and an RTP stream, a combination of two or more video streams, and a combination of control RTCP traffic and an RTP stream. When a source stream arrives at a sender in a format such as a source bitstream, a source symbol stream, or a source packet stream, the sender may buffer the stream into source blocks and generate a repair stream from the source blocks. The sender may schedule and transmit the source and repair streams, for example, in packets to be sent over a packet network. The FEC encoded stream is the combination of the source stream and the repair stream. The FEC receiver receives the FEC encoded stream, which may have been corrupted, e.g., by losses or bit flips. The FEC receiver attempts to reconstruct the original source blocks of the source stream, and schedules and makes available the original source stream at the receiver.

For many applications, the source block structure is determined in conjunction with the structure of the base layer stream, e.g., the GOP structure and/or frame structure of an H.264AVC video stream. For some of these applications, the source block structure is determined prior to and/or independent of the stream transmission order of packets, which may depend on when and where the stream is accessed by a receiver to receive the stream. For such applications, it is preferred that the source block structure is determined in the following manner: each source block comprises a contiguous set of source packets from the stream to allow for minimizing content move-to times and content transition times.

For some applications it is preferred that the source block structure formation and FEC encoding of the stream is performed before the stream is transmitted. One reason for this is that the stream can be sent to many receivers, thus completing the source block structure formation and FEC encoding for all receivers at once, which provides some scalability benefits.

For streaming applications, there are several key parameters that are inputs to the design of how FEC codes are used to protect the source stream and several key metrics that are often important for optimization.

One possible key input parameter in the design of the source block structure is the source block duration. The source block duration of a source block may be defined as the time taken to transmit the symbols generated from that source block if the source blocks are transmitted in sequence (i.e., not interleaved) and at normal speed (i.e., at substantially normal play-out speed). Alternatively, the source block duration may be defined as the playing time of the video represented by the source block. In some cases the two definitions coincide, but they may not. For simplicity of description, however, the term source block duration is used herein without specifying which definition is meant, on the simplifying assumption that the two definitions coincide. Those skilled in the art will recognize that the methods and processes described herein are suitable for either definition of source block duration, even if the two definitions do not coincide, and even where, in some cases, a source block is sent much faster than its play-out speed. Furthermore, those skilled in the art will recognize that there are other ways to specify the size or the playing time of a source block, for example by specifying the number of symbols in the source block and its symbol size.

The protection period of a source block (also referred to herein as the guard period) is the period of time over which the source block is transmitted, regardless of whether the transmission of packets from that source block is interleaved with the transmission of packets from other source blocks. Note that the protection period generally equals the source block duration if source block interleaving is not used, but it may be longer, and sometimes much longer, than the source block duration when interleaving is used.

The amount of protection of a source block is the number of FEC repair symbols sent for the source block, expressed as a fraction or percentage of the number of source symbols in the source block. For example, if the amount of protection is 20% and there are 10,000 source symbols in a source block, then 2,000 repair symbols are generated from the source block. The amount of protection is a relative concept, i.e., the amount of protection for the same source block may differ depending on where the source block is sent from and where it is sent to. For example, a source block may be sent from a first server to a second server with a 50% protection amount, whereas the same source block may be sent from the second server to a receiver with a 10% protection amount.
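The arithmetic in the example above can be sketched as follows. This is a minimal illustration, not part of the disclosure: the function name is hypothetical, the protection amount is passed as a whole-number percentage to keep the arithmetic exact, and rounding up when the product is not an integer is an assumption of this sketch.

```python
def repair_symbol_count(source_symbols: int, protection_percent: int) -> int:
    """Number of repair symbols for one source block.

    protection_percent is the amount of protection as a whole-number
    percentage of the source symbol count (20 means 20%).
    Rounding up is an assumption of this sketch; the text does not
    say how a fractional result would be handled.
    """
    if source_symbols < 0 or protection_percent < 0:
        raise ValueError("inputs must be non-negative")
    # Integer ceiling division: ceil(source_symbols * percent / 100).
    return -(-source_symbols * protection_percent // 100)


# The same source block protected differently on two hops, as in the text.
server_to_server = repair_symbol_count(10_000, 50)
server_to_receiver = repair_symbol_count(10_000, 10)
```

With 10,000 source symbols and 20% protection the function returns 2,000, matching the example in the text.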

The source block duration and the amount of protection for each source block may vary from source block to source block. For example, when the source blocks preferably do not span between certain source packets in the source stream, such as when the first packet is the last packet of a group of pictures (GOP) in an MPEG2 video stream and the second consecutive packet is the first packet of the next GOP, then the source blocks may be terminated after the first packet and new source blocks started at the second packet. This allows the FEC encoded blocks to be aligned with the video encoded blocks, which may have many advantages, including the advantage that receiver delay or channel move time may be minimized since it is possible to minimize the combination of video buffering and FEC buffering at the receiver. In other applications, it is advantageous to always maintain the same source block duration and/or source block size for each successive source block for a number of reasons. In some of the following description, for simplicity, it is assumed that the source block duration and the amount of protection are the same for each subsequent source block. It will be clear to those skilled in the art, after reading this disclosure, that this is not limiting, as it can be readily determined after reading this disclosure how the processes and methods described herein can be applied when the amount of protection or source block duration or both vary from one source block to the next, and when the source block size varies from one to the next.
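As a rough sketch of the GOP-aligned source block formation described above, the following code starts a new source block whenever a packet begins a new GOP. The `Packet` type, its `starts_gop` flag, and the function name are hypothetical and introduced only for illustration; how a real sender would detect GOP boundaries is not specified here.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Packet:
    seq: int          # position of the packet in the source stream
    starts_gop: bool  # hypothetical flag: True if this packet begins a GOP


def form_source_blocks(packets: List[Packet]) -> List[List[Packet]]:
    """Split a packet sequence into source blocks that never span a GOP boundary."""
    blocks: List[List[Packet]] = []
    for pkt in packets:
        # Terminate the current block before the first packet of each GOP.
        # The very first packet always opens a block, even without the flag.
        if pkt.starts_gop or not blocks:
            blocks.append([])
        blocks[-1].append(pkt)
    return blocks


# Seven packets with GOPs starting at packets 0, 3 and 5.
stream = [Packet(i, i in (0, 3, 5)) for i in range(7)]
block_sizes = [len(b) for b in form_source_blocks(stream)]
```

Note that the resulting source blocks can have different sizes and durations, which is the variable case the paragraph above says the methods can accommodate.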

To simplify some of the following discussion, it is sometimes assumed that the source symbols of the original stream arrive at a steady speed at the transmitter that performs source block formation and FEC encoding, and that, once the FEC receiver first makes source symbols available at the receiver, subsequent source symbols are made available by the FEC receiver at the same steady speed. This assumes that no source symbols are lost in the first source block from which source symbols are received, and that in each subsequent source block the loss of encoded symbols is at most the maximum that still allows successful FEC decoding. This simplifying assumption is not inherent in the operation or design of the processes and methods described subsequently and is not meant to limit them in any way; it is merely introduced as a tool to simplify some explanations of their properties. For example, for a variable speed stream, the corresponding condition is that the FEC receiver makes the source symbols available at the same or approximately the same speed at which they arrive at the transmitter. In some applications, it is preferable to deliver the decoded source symbols to the video player at the receiver as soon as possible in order to minimize content move-to time, and in such cases the source symbols may be delivered in bursts of source blocks. In some applications, it is desirable to split source block formation, FEC encoding and transmission into two or more separate steps. For example, as described below, source block formation and FEC encoding may be performed at one server, the encoded stream may then be divided into sub-streams, the sub-streams may be sent to one or more distributed servers and cached locally, and some or all of the sub-streams may then be sent from some of the distributed servers to the receiver.

Some important key metrics for minimization include transmitter delay, which is the delay introduced by the transmitter. For some applications, such as live video streaming, or interactive applications, such as video conferencing, it is desirable to minimize sender delay. One aspect of the overall design that helps minimize transmitter delay is that the transmitter transmits the encoded symbols of one or more initial source blocks of the stream to the receiver in a sequential order. Other design aspects for minimizing transmitter delay are described below.

Another important metric is content move-to-time. As shown in fig. 2, this is the time between when the receiver joins or requests the stream until when the FEC receiver first makes the source symbols available from the stream. In general, it is desirable to minimize content move-to-time, as this minimizes the amount of time between when the receiver joins a stream and when the stream first begins to become available at the receiver (e.g., for playing a video stream). One important aspect of minimizing content move-to-time is that the transmitter maintains the original transmission order of the encoded symbols of the original source block, but as described below, there are many other important design aspects that have a large impact on content move-to-time.

The content move-to-time typically includes multiple components. Examples of these components for a stream divided into sequential source blocks, where interleaving is not used, are shown in fig. 3A and 3B. Fig. 3A shows a single source block per protection period, and the example shows the case where the receiver joins the stream at the beginning of the source block. The two components of the content move-to-time in this example are the protection period and the FEC decoding delay. The receiver protection period for a source block is the time during which the receiver is buffering received coded symbols from the source block. Note that the transmitter protection period and the receiver protection period are the same if the channel between the transmitter and the receiver introduces no variation in the amount of time each bit, byte, symbol, or packet takes to travel from the transmitter to the receiver. In practice, therefore, the sender protection period may differ from the receiver protection period for the same source block because of network timing variations in delivery.

To simplify the description herein, it is assumed that the sender protection period and the receiver protection period are the same for each source block (and "protection period" is used synonymously for sender protection period and receiver protection period), but this need not always be the case. In other words, it is assumed that the network delivery time is the same for all data. Those skilled in the art can, after reading this disclosure, make the necessary changes to the methods and apparatus described herein to account for differences between sender and receiver protection periods caused by fluctuations in network delivery.

The protection period component of the content move-to-time is unavoidable: even if no source symbols are lost in the first source block, making the source symbols available must still be delayed by at least the protection period in order to ensure smooth delivery of all subsequent source symbols when coded symbols are lost in subsequent source blocks. During the protection period, some, most, or all of the FEC decoding of the source block may occur concurrently with the reception of the coded symbols. At the end of the protection period, there may be additional FEC decoding that occurs before the first source symbol of the source block is available from the FEC receiver, and this period is labeled FEC decoding delay in fig. 3A. In addition, even after the first source symbol is available, there may be additional FEC decoding that occurs before the second and subsequent source symbols of the source block are available. This additional FEC decoding is not shown in fig. 3A for simplicity, and it is assumed in this example that there are sufficient CPU resources available to decode all source symbols after the first at a sufficiently fast rate.

Another possible component of content move-to-time is the time between when the receiver requests to join a stream and when the first packet of that stream arrives at the receiver. This amount of time may be variable and depends on one or more round-trip times between the receiver and the one or more senders of packets of the stream. This component of content move-to-time is not described in detail herein, but those skilled in the art will recognize that it may sometimes be a significant contributor to content move-to-time that should be considered, and that the methods and processes described herein may be readily modified to account for it.

Fig. 3B shows two possible FEC decoding CPU utilization curves that may correspond to the example shown in fig. 3A. In one of the two curves shown in fig. 3B, the CPU utilization for FEC decoding is the same at each point in time, i.e., the CPU utilization is evenly distributed. This is a desirable CPU utilization curve because it predictably uses the same amount of CPU resources at each point in time and minimizes the peak CPU usage, given that the same total amount of CPU resources is needed to decode the entire source block. In the other of the two curves shown in fig. 3B, the CPU utilization for FEC decoding varies over time, and in particular the CPU utilization near and just after the end of reception of the coded symbols of the source block is significantly higher than at other points in time. This is not a desirable CPU utilization curve because CPU usage spikes at certain points in time, which may coincide with times when other processes, such as a video player, also place demands on the CPU, thus increasing the likelihood of causing, for example, glitches in the playing of the video stream. Therefore, a design goal of an FEC solution for protecting streams is to have the FEC decoder use the CPU as smoothly and uniformly as possible over time. As an example, the design criterion may be that the maximum CPU utilization at any point in time in the FEC decoding process, under the worst-case pattern of coded symbol loss, is below a certain threshold, e.g., at most 10% of the CPU is used over each 100 ms interval.
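The example criterion above (at most 10% of the CPU over each 100 ms interval) can be checked against a utilization trace with a sliding window. The sketch below is illustrative only: the trace format (one sample per millisecond giving the fraction of that millisecond spent on FEC decoding) and both function names are assumptions, not something the disclosure defines.

```python
from typing import Sequence


def max_window_utilization(busy: Sequence[float], window_ms: int = 100) -> float:
    """Highest average CPU utilization over any run of window_ms samples.

    busy[i] is the fraction (0.0 to 1.0) of millisecond i that the FEC
    decoder kept the CPU busy. This trace format is an assumption.
    """
    if window_ms <= 0 or len(busy) < window_ms:
        raise ValueError("trace must cover at least one full window")
    window_sum = sum(busy[:window_ms])
    best = window_sum
    # Slide the window one sample at a time, updating the running sum.
    for i in range(window_ms, len(busy)):
        window_sum += busy[i] - busy[i - window_ms]
        best = max(best, window_sum)
    return best / window_ms


def meets_cpu_budget(busy: Sequence[float], window_ms: int = 100,
                     threshold: float = 0.10) -> bool:
    """True if no window of window_ms exceeds the utilization threshold."""
    return max_window_utilization(busy, window_ms) <= threshold


# Two one-second traces with the same total decoding work (50 ms of CPU):
smooth = [0.05] * 1000               # evenly spread, as in the first curve
spiky = [0.0] * 950 + [1.0] * 50     # concentrated at the end of reception
```

Both traces consume the same total CPU time, but only the evenly spread one stays within the 10% budget, which mirrors the contrast between the two curves of fig. 3B.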

In some streaming applications, when the receiver happens to join a stream in the middle of a source block, the content move-to time may be as small as the source block duration plus the decoding delay when no source symbols are lost from that first partial source block, as long as the sender initially maintains the original transmission order and delivery speed of the source packets. In other video streaming applications, the sender always sends the stream to the receiver starting from the beginning of a GOP, in which case it is preferable that the start of the source block is aligned with the start of the GOP. Therefore, to minimize content move-to time, it is desirable for the transmitter to maintain the original transmission order of the source symbols of the initial source block.

FEC streaming solutions can also be used to minimize FEC end-to-end delay, which is the worst case overall delay for live streaming applications, introduced by the use of FEC between when the source packets are ready for streaming at the sender before FEC encoding is applied and when the source blocks can be played at the receiver after FEC decoding has been applied. For other types of streaming applications, such as on-demand streaming or playlist content streaming, FEC end-to-end delay is not a major consideration.

In all types of streaming applications, it is important to minimize content move-to times and content transition times. At the same time, it is important to minimize the transmission speed of the stream, i.e., to limit the transmission speed at all times, including during content move-to and content transitions, to no more than a small fraction above the content stream speed.

FEC streaming solutions may also be used to minimize fluctuations in transmission speed when FEC is used. One benefit of this is that in a packet network, flows with fluctuating sending speeds are more susceptible to packet loss due to congestion or buffer overflow when the peak in the sending speed of the flow coincides with the peak of other traffic at a point in the network with limited capacity. At least, the fluctuation in the speed of the FEC encoded stream should not be worse than the fluctuation in the speed of the original source stream, and it is preferable that the fluctuation in the speed of the FEC encoded stream becomes smaller when more FEC protection is applied to the original source stream. As a special case, if the original stream is transmitted at a constant speed, it is preferable to transmit the FEC encoded stream at as close to the constant speed as possible.

It is desirable that the times at which the last coded symbol of each successive source block is received be spread as uniformly as possible over time. The time when the last coded symbol of a source block is received is the time when all the information for decoding the source block is available to the FEC decoder, and this is typically the time when the FEC decoder must work hardest to complete decoding within a specified decoding delay budget under worst-case loss conditions. Thus, evenly spreading the reception of the last coded symbols of the source blocks allows for smoother use of the CPU for FEC decoding.

The FEC streaming solution should keep the logic at the FEC receiver as simple as possible. This is important in many environments because the FEC receiver may be built into a device with limited computing, memory and other resource capabilities. Also, in some cases there may be severe symbol loss or corruption in transmission, and the FEC receiver must then recover from catastrophic loss or corruption with little or no context for determining where in the stream reception resumes as conditions improve. Thus, the simpler and more robust the FEC receiver logic, the faster and more reliably the FEC receiver can begin recovery and again make the source symbols of the source stream available from the received FEC encoded stream.

Repair packets for source blocks may be sent before, after, or intermixed with source packets for source blocks, and as described herein, there are advantages to different strategies.

Some of the overall desirable features of FEC streaming solutions include:

1. Short content move-to time

2. Short content transition time

3. The transmit stream speed should be limited at all times, i.e., to no more than a small fraction above the content stream speed.

4. The transmit stream speed should be smooth and should be at least as smooth as the content stream speed.

5. When FEC encoding is used, source block formation and FEC encoding may be performed on the stream, and the same encoded stream may be transmitted to many receivers at possibly different times.

6. When FEC coding is used, protection against packet loss should be high even with a small source block duration and with the minimum amount of protection required, especially when the loss is somewhat bursty in nature.

7. When FEC coding is used, the source block should comprise a contiguous part of the stream.

8. When FEC encoding is used for live streaming applications, the FEC end-to-end delay should be small.

9. When FEC encoding is used, FEC decoding should spread its CPU utilization smoothly over time.

Basic transmission of FEC encoded streams

In this section, we describe the basic method and process by which the transmitter times the transmission of packets of a stream that may be FEC encoded. Let k be the number of source symbols in the source block, let T be the source block duration of the source block, and let p be the amount of protection expressed as a fraction, so that p × k repair symbols will be sent for the source block. The values of k, T and p may be determined dynamically as each source block is formed, so that the values of k and T for a source block are known to the source block formation process only when most or all of its source symbols have reached that process, and the value of p may be determined after all of its source symbols have reached the source block formation process, or by a separate process. The source block formation process may also vary the symbol size from one source block to another. Thus, many or all of these parameters for a particular source block may not be known to the source block formation process until most or all of the data for that source block has been received.

The following procedure describes a basic transmitter that does not use interleaving. For simplicity, it is assumed for this basic transmitter that the source block formation process has already been applied to the stream, that the stream has been divided into successive source blocks, each comprising k source symbols and having a source block duration of T seconds, and that p × k repair symbols have been generated for each such source block.

When the receiver requests a stream starting at a particular source block (or when the transmitter proactively transmits the stream without an explicit start request from the receiver), the basic transmitter transmits the (1 + p) × k coded symbols of the requested source block over a period of T seconds, then transmits the coded symbols of the next source block after the requested source block, and so on.

The basic sender has the following properties:

1. The protection period is T, which is the same as the source block duration.

2. The symbols transmitted for the source block are spread evenly over a period of T seconds. This means that the level of protection provided for loss when there is a burst interruption of fixed duration is not dependent on when the interruption occurred during transmission of the symbol, which is desirable.

3. The transmitter does not introduce fluctuations in the overall transmission speed of the symbols. In particular, if the original transmission speed of the source symbols is constant, the transmission speed of all symbols is still constant, and if the original arrival speed of the source symbols at the transmitter is variable, then sending the symbols of each source block at a constant speed at least dampens the fluctuation. This is a desirable attribute.

4. The content move-to time may be as small as T. This means a minimum buffering of (1+ p) × k symbols (assuming all source blocks comprise k source symbols), which is the smallest possible for a given guard period and is therefore desirable.
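The timing behavior of the basic transmitter described above can be sketched as follows. This is an illustrative sketch under stated assumptions rather than the transmitter itself: the function name and parameters are hypothetical, every source block is assumed to have the same k, T and repair count, and the symbols are simply indexed in sending order without taking any position on whether repair symbols go before, after, or between source symbols.

```python
from typing import List, Tuple


def basic_send_schedule(num_blocks: int, k: int, repair: int,
                        duration_t: float,
                        start_time: float = 0.0) -> List[Tuple[float, int, int]]:
    """Send times for a basic, non-interleaved transmitter.

    Each source block has k source symbols and `repair` (= p * k) repair
    symbols, and all of its coded symbols are spread evenly over
    duration_t seconds. Returns (send_time, block_index, symbol_index)
    tuples in sending order.
    """
    if k <= 0 or repair < 0 or duration_t <= 0:
        raise ValueError("k and duration_t must be positive, repair non-negative")
    per_block = k + repair            # (1 + p) * k coded symbols per block
    spacing = duration_t / per_block  # constant gap between symbols
    schedule = []
    for block in range(num_blocks):
        block_start = start_time + block * duration_t
        for j in range(per_block):
            schedule.append((block_start + j * spacing, block, j))
    return schedule


# Two blocks of k = 10 source symbols with 20% protection, T = 1 second.
schedule = basic_send_schedule(num_blocks=2, k=10, repair=2, duration_t=1.0)
```

In this sketch each block occupies exactly one interval of T seconds, so the protection period equals the source block duration and the symbol spacing within a block is constant, which corresponds to properties 1 to 3 above.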

One attribute of the basic transmitter is that the content move-to time is at least as long as the protection period, and the protection period is directly related to the quality of protection against burst loss. Therefore, a tradeoff sometimes needs to be made between the protection period and the content move-to time. For example, it is desirable to have content move-to times below one second while also having a protection period that spans several seconds, in order to provide better protection against temporary network outages or other intermittent network problems that cause bursty packet losses lasting on the order of tens or hundreds of milliseconds, and in some cases several seconds, while using a reasonably small amount of protection, such as 10%. It is desirable to be able to have a protection period that is much larger than the content move-to time, and this is one of many advantages provided by the interleaving method described in the next section.

Stream interleaving

This section describes novel methods and processes for obtaining a data stream and applying different time delays to different portions of the data stream in the following manner: some parts are delayed more than others in the transmission process. One of the more important aspects of these methods and processes is the means for dynamically adjusting the amount of delay incurred in different parts of a data stream when the stream is transmitted.

It is often preferable to align the source blocks with the group of pictures (GOP) structure or other frame structure of the video stream in order to minimize content move-to-time and provide better protection of the stream. In some applications, it is desirable that the interleaving process may occur independently of the source block formation process, which may be performed at different times or may be performed at different locations. In some cases, interleaving may be desirable, for example, to spread burst errors more evenly across the stream, even if the source block formation process is not used, for example because FEC encoding is not used. The methods described herein are applicable even when source block formation and FEC encoding are not used, as will be appreciated by those skilled in the art.

In some cases, there is an advantage in allowing the transmitter to interleave the transmission of symbols from different source blocks, which allows the symbols of each source block to be spread over a guard period that is longer than the source block duration. One reason for this is to provide better protection against time dependent losses (e.g., bursty losses), i.e., a smaller amount of protection is required to provide protection against burst losses of fixed duration as the protection period for the source block increases. Although the source block duration may be t seconds, the desired guard period for the source block may be p seconds, where p > t. Other desirable attributes of transmitters that use interleaving include (1) that the source packets are transmitted in their original order in the source block, and (2) that the time when the last coded symbol of each subsequent source block is received is spread as evenly as possible over time.
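The claim that a longer protection period lowers the amount of protection needed against a burst of fixed duration can be illustrated with a small calculation. The sketch below rests on idealized assumptions that the text does not state: the coded symbols of a block are spread evenly over the protection period, a single burst falls entirely within that period and erases every symbol sent during it, and the FEC code is ideal in the sense that any k received symbols suffice to decode. Under those assumptions a burst of duration B in a protection period P erases a fraction B / P of the symbols, so decoding needs (1 + a)(1 - B / P) >= 1, i.e. a protection amount a >= B / (P - B). The function and parameter names are this sketch's own.

```python
def min_protection_amount(burst_s: float, protection_period_s: float) -> float:
    """Smallest protection amount (as a fraction) that survives one burst.

    Idealized model (assumptions of this sketch): evenly spread symbols,
    one burst fully inside the protection period, and an ideal erasure
    code that decodes from any k received symbols.
    """
    if not 0 <= burst_s < protection_period_s:
        raise ValueError("burst must be shorter than the protection period")
    return burst_s / (protection_period_s - burst_s)


# A 0.5 second outage against protection periods of 1 s and 5.5 s.
short_period = min_protection_amount(0.5, 1.0)
long_period = min_protection_amount(0.5, 5.5)
```

In this model a half-second outage requires 100% protection when the protection period is one second, but only about 10% when interleaving stretches the protection period to 5.5 seconds.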

For the case where FEC coding is used, a method is introduced for statically interleaving the transmission of the coded symbols of the source blocks, together with a method for dynamically adjusting the amount of interleaving while the stream is transmitted. Typically there is little or no interleaving at the beginning of the transmission of the stream, so that the protection period is approximately the same as the source block duration, and more and more interleaving is smoothly introduced as the transmission of the stream progresses, so that the protection period becomes much longer than the source block duration. This allows the content move-to-time at the receiver to be minimized while providing progressively more protection against burst loss or corruption as transmission proceeds. Another advantage of the method is that it provides protection against progressively more network jitter as the transmission of the stream proceeds.

To simplify the following description, it is assumed that the source block formation and FEC encoding processes are performed before transmission of the stream. This is not a limitation of the method as those skilled in the art will recognize that the process of forming source blocks and performing FEC encoding on these source blocks and transmitting the stream as described below may run concurrently and in some cases this may be beneficial. Also, for some applications, the source block formation, FEC encoding process, and the method for interleaved transmission of streams as described below may be dynamically dependent on each other, i.e., how the source blocks are formed and FEC encoding may in some cases depend on the transmission stream policy.

Tape method of streaming

To describe the new interleaving method, it is beneficial to introduce the following tape method of stream transmission. Fig. 4 is an explanatory diagram of a content stream for which a source block structure has been determined. For each source block 405(1), 405(2), ..., the width 410(1), 410(2), ... shows the content play duration of that source block, and the height 415(1), 415(2), ... shows the average play speed of the content stream for that source block, where in this example different source blocks have different play speeds.

Fig. 5 shows the encoded block structure corresponding to fig. 4, i.e., FEC encoding has been applied to each source block to generate additional repair data 510(1), 510(2), ... for each source block, forming an encoded block. The heights 515(1), 515(2), ... indicate the amount of additional repair data 510(1), 510(2), ... generated for each source block in its encoded block, i.e., if an encoded block is transmitted over the same duration as the corresponding source block, the height indicates the average transmission speed of the encoded block. This figure is merely illustrative and not restrictive, since, for example, the amount of repair data generated for each encoded block may be greater than the amount transmitted for it, and the amount transmitted for each encoded block may vary between receivers. Moreover, fig. 5 is not meant to represent the ordering of source symbols and repair symbols within an encoded block.

Fig. 6 is an explanatory diagram showing the content move-to-time experienced by a receiver when a basic stream transmitter is used. Some components of the content move-to-time 605 include: the time 610 required for the receiver to receive enough of the first encoded block of the stream to decode the first source block; the time 620 required for the receiver to decode the first source block from the received portion of the first encoded block; and a reserved buffer time 630 reserved to absorb network jitter, variations in source block duration, and the time needed to decode source blocks from the portions of encoded blocks received during reception of the stream.

The tape method of sending streams is now described; one skilled in the art will recognize that there are many equivalent descriptions of similar methods, as well as variations on the description that yield variants of the methods described herein. Fig. 7 shows an example of the tape method corresponding to the encoded block structure shown in fig. 5. In the tape method, the data stream to be transmitted is depicted as a tape 705, where each position 710 along the X-axis of the tape corresponds to a different point in time in the encoded block structure, and where the height of the tape is always the same, e.g., nominally height 1, regardless of the speed of the encoded stream at that point along the tape. The sending of the stream represented by the tape is indicated by a moving line 720(1), 720(2) that extends from the top 725 of the tape to the bottom 730 of the tape. In one representation, the line 720(1), 720(2) moves over time to represent the order in which data from the encoded blocks of the stream is transmitted. Each point 740(1), 740(2) in the tape represents a piece of stream data to be transmitted; for example, each point may represent a packet of encoding symbols of an encoded block, or a single encoding symbol of an encoded block. Points that fall in the region corresponding to an encoded block 750(1), 750(2), ... represent data associated with that encoded block.

The sending process according to the tape method is represented by sweeping the line 720(1), 720(2) across the tape over time as the stream is transmitted; each time the line sweeps over a point, the stream data corresponding to that point is transmitted. Fig. 7 shows the line at two different times in the sending process, where line 720(1) is its position at a first time and line 720(2) is its position at a second time. Thus, during the time interval between the first time and the second time, the sending process transmits all data associated with points in the area bounded by 720(1), 720(2), 725, and 730. The points of each encoded block are preferably distributed evenly over the tape region of that encoded block according to their weights, where the weight of a point is the amount of data it represents; the points may be placed, e.g., randomly, pseudo-randomly, or deterministically by a process that ensures that they are evenly distributed according to the weight of each point.
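The sweep just described can be sketched in a few lines of Python. This is only an illustrative model under stated assumptions: the names `Point`, `Line`, `is_swept`, and `sent_between` are hypothetical, the tape height is normalized so that 0.0 is the bottom 730 and 1.0 is the top 725, and the line is taken to be straight.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Point:
    x: float     # position along the tape (seconds of stream time)
    y: float     # height in the tape: 0.0 = bottom 730, 1.0 = top 725
    weight: int  # amount of data the point represents (e.g. bytes)


@dataclass(frozen=True)
class Line:
    bottom: float  # position where the line meets the bottom of the tape
    top: float     # position where the line meets the top of the tape

    def x_at(self, y: float) -> float:
        """Position of the straight line at height y."""
        return self.bottom + y * (self.top - self.bottom)


def is_swept(point: Point, line: Line) -> bool:
    """True if the line has already passed over the point."""
    return point.x <= line.x_at(point.y)


def sent_between(points, first: Line, second: Line):
    """Points swept while the line moves from `first` to `second`."""
    return [p for p in points
            if is_swept(p, second) and not is_swept(p, first)]
```

For instance, with a line moving from `Line(0.0, 2.0)` to `Line(3.0, 5.0)`, a point at position 3.0 and height 0.5 lies between the two line positions and is therefore sent in that interval, while a point already behind the first line or still ahead of the second is not.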

As described above, the line 720 is straight, but one skilled in the art will recognize that there are many variations; for example, the line may be curved or may consist of a sequence of contiguous line segments, and it may change shape as it sweeps during the sending process. Other variations of the tape method also exist, including depicting the tape so that its height is not constant but varies according to the speed of the stream at each position 710 of the tape.

There are various methods for specifying the movement of a line through the tape during the transmission process, as described in more detail below.

Static interleaving method

The tape method of sending streams can be used to implement static interleaving of any depth for any type of content stream or encoded content stream, whether or not FEC encoding is used and whether or not a source block structure is used. For illustrative purposes, it is assumed that the source block structure has been defined and FEC encoding is used.

Referring to fig. 7, one way to achieve a given amount of static interleaving using the tape method of sending streams is described by way of example. In this example, each encoded block is interleaved with adjacent encoded blocks over an amount of time D, i.e., the interleaving depth is D. The receiver sends the values of the position X and of D when it requests a stream. The sending process at the transmitter is then as follows: the line 720 is initially configured so that it intersects the bottom 730 of the tape at position X − D and the top 725 of the tape at position X, and the sending process then sweeps the line 720 forward in time at the same speed as the play speed of the stream, i.e., at time t after the sending process starts, the line 720 intersects the bottom 730 of the tape at position X − D + t and the top 725 of the tape at position X + t.
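As a minimal sketch of the static sweep, the sending time of an individual point can be solved for directly from the line positions X − D + t and X + t. The function name `static_send_time` is hypothetical, the tape height is normalized to the range 0 (bottom) to 1 (top), and the choice to skip every point located before X is an assumption made for this illustration.

```python
def static_send_time(x, y, X, D):
    """Seconds after the start of sending at which a point is swept.

    The line meets the bottom of the tape (y = 0) at X - D + t and the
    top (y = 1) at X + t, so at height y it sits at X - D + t + y * D.
    Solving x = X - D + t + y * D for t gives the value returned here.
    Points located before X are skipped in this sketch (returns None).
    """
    if x < X:
        return None
    return (x - X) + D * (1.0 - y)
```

With X = 0 and D = 10, the data of an encoded block covering positions 0 to 2 is sent at times between 0 and 12 seconds, i.e., over the block duration plus the interleaving depth.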

In this illustration of the static interleaving method, if the method is used to send a newly requested stream to the receiver, it is beneficial for X to be a position in the stream at which play can start at the receiver, e.g., X is the starting position of an encoded block, or X is the starting position of a GOP in the video stream where the start of the encoded block is aligned with the start of the GOP. Also, in these cases it is beneficial for the transmitter not to send the receiver any data located before position X along the tape, since the receiver would typically receive only part of the encoded block concerned, most likely not enough to fully decode it.

Fig. 8 is an explanatory diagram illustrating the shape of the transmitted stream when the transmitter uses the static interleaving method just described. In this case, the static interleaving method is applied to the tape shown in fig. 7, which corresponds to the encoded stream shown in fig. 5. In this example, the receiver specifies the value of X as the starting position of the first encoded block 750(1) in fig. 7, so no data located before position X along the tape is sent. The receiver also specifies the value of D, which may be, for example, 10 seconds. The resulting stream of data sent by the transmitter according to this process is shown in fig. 8, where the areas 850(1), 850(2), ... represent the data sent for the respective encoded blocks. Note that the transmission speed shown in fig. 8 is a smoothed version of the original content speed shown in fig. 5.

Fig. 9 is an explanatory diagram showing the content move-to-time experienced by a receiver with the static interleaving method described above. Some components of the content move-to-time 905 include: the time 910 required for the receiver to receive enough of the first encoded block of the stream to decode the first source block, which is the sum of the source block duration and the interleaving depth D; the time 920 required for the receiver to decode the first source block from the received portion of the first encoded block; and a reserved buffer time 930 reserved for expected network jitter, variations in source block duration, and the time needed to decode source blocks from the portions of encoded blocks received during reception of the stream. Note that because the protection period, which in this case is the source block duration plus the interleaving depth D, may be much larger than the source block duration, the content move-to-time 905 with this method as described may be much larger than the source block duration.

Dynamic interleaving method

The tape method of sending streams can be used to implement dynamic interleaving at any pace and to any interleaving depth for any type of content stream or encoded content stream, whether or not FEC encoding is used and whether or not a source block structure is used. For illustrative purposes, it is assumed that the source block structure has been defined and FEC encoding is used.

Referring to fig. 7, one way of using the tape method of sending streams to achieve dynamic interleaving, starting with no interleaving and progressing to a given interleaving depth, is described by way of example. In a typical use of this approach, the first encoded block is transmitted with little or no interleaving, and subsequent encoded blocks are then smoothly interleaved more and more over time until an interleaving depth D with adjacent encoded blocks is reached. However, other uses of this method are also described below, and those skilled in the art will recognize that numerous other variations exist. In this example of how the parameters of the method may be expressed, the receiver, when requesting a stream, sends an initial upper position UI of the line 720, an initial lower position LI of the line 720, a final upper position UF of the line 720, a final lower position LF of the line 720, and a time value T. For simplicity, it is assumed hereinafter that UF ≥ UI, LF ≥ LI, UF ≥ LF, UI ≥ LI, and T ≥ 0. In general, it is preferable to have UF ≥ UI + T and LF ≥ LI + T to help ensure that data is always available at the receiver when needed. These values of UF, UI, LF, LI, and T allow a reserved buffer of content to be built up smoothly at the receiver while the interleaving is dynamically adjusted, as described in the examples below.

The sending method at the transmitter uses the parameters LI, UI, LF, UF, and T to carry out the tape method as follows. First, at sending time t = 0, the line 720 of fig. 7 is configured so that it intersects the bottom 730 of the tape at position LI and the top 725 of the tape at position UI. Then, during sending times t = 0 to t = T, the line 720 sweeps across the tape such that at time t it intersects the bottom 730 of the tape at position t·(LF − LI)/T + LI and the top 725 of the tape at position t·(UF − UI)/T + UI. Finally, for all sending times t > T, the line 720 sweeps across the tape such that at time t it intersects the bottom 730 at position t − T + LF and the top 725 at position t − T + UF, i.e., for t > T the interleaving is static with interleaving depth D = UF − LF.
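The two phases of the sweep can be written as a single piecewise function. The following is a minimal sketch; the function name `line_position` is hypothetical, and treating T = 0 as "static from the start" is an assumption made here to avoid dividing by zero.

```python
def line_position(t, LI, UI, LF, UF, T):
    """Return (bottom, top) positions of line 720 at sending time t >= 0."""
    if t < 0:
        raise ValueError("sending time must be non-negative")
    if T > 0 and t <= T:
        # Transition phase: both ends move linearly from (LI, UI) to (LF, UF).
        frac = t / T
        return (LI + frac * (LF - LI), UI + frac * (UF - UI))
    # Static phase: both ends advance at the play speed, with depth UF - LF.
    # With T == 0 the line starts directly at (LF, UF).
    return (LF + (t - T), UF + (t - T))
```

For example, with LI = UI = 0, LF = 105, UF = 115, and T = 100, the line starts with both ends at position 0, reaches (105, 115) at t = 100, and from then on keeps a constant depth of 10 seconds.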

Dynamic interleaving method for newly requested streams

One example use of the dynamic interleaving method is to send a newly requested stream to the receiver. As an example, as shown in fig. 10, the initial values may all be set to the same value I = UI = LI, i.e., there is no interleaving at the beginning, with S ≥ I, where S is the position at which the receiver will start playing the content stream. This ensures that the receiver is sent the entire tape of content from position S onward. As shown in fig. 10, it is preferable that S = LI, where S is a position in the content stream from which the content stream can be played, e.g., S is the start of an encoded block that is aligned with the start of a GOP. Furthermore, it is advantageous to have T ≤ LF − S. This ensures that, if the receiver plays the content at the content speed, the content is sent at least as fast as it is played at the receiver, and that a reserved buffer time of R = LF − S − T seconds is built up smoothly and is sustained once static interleaving is reached at sending time T after the start of sending to the receiver, where the reserved buffer can absorb network jitter, varying source block durations, and decoding times. The amount of interleaving is built up smoothly from no interleaving to D = UF − LF seconds of interleaving.

As a specific example of the dynamic interleaving method, suppose that a receiver is to access content from the beginning, that in steady state a reserved buffer of 5 seconds and an interleaving depth of 10 seconds are desired, and that during the period in which the interleaving and the reserved buffer are being built up the transmission speed is to be about 10% higher than the encoded stream speed. Then one possible setting of the parameters is: UI = LI = S, where S is the desired starting position, T = 100 seconds, LF = S + T + 5 seconds, and UF = LF + 10 seconds. Thus, if the content stream speed in this example is 1 Mbps and a protection amount of 10% is used, the encoded stream speed is 1.1 Mbps. For the first 100 seconds of transmission using the dynamic interleaving method with the parameter settings just described, the transmission speed is then about 1.21 Mbps, because 100 + (5 + 15)/2 = 110 seconds of the stream are transmitted in the first 100 seconds. After 100 seconds of transmission, the reserved buffer is 5 seconds and the interleaving depth is 10 seconds, and the transmission speed thereafter is 1.1 Mbps. In the few seconds before 100 seconds of transmission have elapsed, the transmission speed transitions smoothly from 1.21 Mbps to 1.1 Mbps. It should also be noted that at the beginning the transmission speed follows the encoded stream speed closely, and that as the interleaving depth and the reserved buffer grow smoothly, the transmission speed becomes smoother and more closely matches the average encoded stream speed.
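The arithmetic of this example can be checked with a short calculation. The sketch below assumes that points are spread uniformly over the tape, so that the amount of stream swept in the first T seconds is the mean advance of the two ends of the line; the function name `average_send_speed` and the choice S = 0 are illustrative only.

```python
def average_send_speed(encoded_speed, LI, UI, LF, UF, T):
    """Average sending speed over the first T seconds of sending.

    With uniformly spread points, the stream time swept in [0, T] is the
    mean of the advance of the bottom end and the advance of the top end.
    """
    swept_seconds = ((LF - LI) + (UF - UI)) / 2
    return encoded_speed * swept_seconds / T


S, T = 0.0, 100.0
LI = UI = S
LF = S + T + 5.0            # reserved buffer R = LF - S - T = 5 seconds
UF = LF + 10.0              # interleaving depth D = UF - LF = 10 seconds
encoded_speed = 1.0 * 1.10  # 1 Mbps content plus 10% protection, in Mbps

speed = average_send_speed(encoded_speed, LI, UI, LF, UF, T)
print(round(speed, 2))      # 1.21
```

The bottom end advances 105 seconds and the top end 115 seconds, so 110 seconds of the 1.1 Mbps encoded stream are sent in 100 seconds.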

The content move-to-time experienced by the receiver with the dynamic interleaving method described above is described with reference to fig. 11. Some components of the content move-to-time 1105 include: the time 1110 required for the receiver to receive enough of the first encoded block of the stream to decode the first source block; the time 1120 required for the receiver to decode the first source block from the received portion of the first encoded block; and an initial reserved buffer time 1130 reserved for expected network jitter, variations in source block duration, and the time needed to decode source blocks from the portions of encoded blocks received during reception of the stream.

Because the reserved buffer is built up over time when dynamic interleaving is used, the initial reserved buffer time 1130 may be much shorter than when the reserved buffer size is fixed for the entire duration of the stream transmission. For example, with a basic stream transmitter, the reserved buffer size may be set to two seconds to buffer against long-term network jitter of at most two seconds, whereas with the dynamic interleaving transmission method, the initial reserved buffer time 1130 may be set much shorter, e.g., 200 milliseconds, since it is unlikely that much network jitter will occur during the first few seconds of the stream transmission, and by the time it does occur the reserved buffer will have been substantially built up.

Because the protection period for each source block is built up gradually when dynamic interleaving is used, the initial source block duration can be much shorter than when the protection period equals the source block duration for the entire duration of the stream transmission. For example, with a basic stream transmitter, the source block duration may be set to 5 seconds and the protection amount to 20% to protect against short bursts of packet loss of 500 milliseconds, whereas with the dynamic interleaving transmission method, the source block duration may be set much shorter, e.g., 500 milliseconds, and the protection amount may be set lower, e.g., 5%, while providing the same level of protection against such bursts. This is because such bursts are unlikely to occur during the first few seconds of the stream transmission, and by the time they do occur a protection period has been substantially built up, e.g., the original source block duration plus an interleaving depth of 10 seconds.
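The comparison can be illustrated with a rough calculation. The sketch below rests on assumptions that are not stated in the text: an ideal FEC code that recovers a source block whenever the received data amounts to at least the source block size, data of each block spread evenly over its protection period, and a protection amount expressed as repair data relative to source data. The function name `tolerable_burst` is hypothetical.

```python
def tolerable_burst(protection_period, protection_amount):
    """Longest outage (seconds) an ideal FEC code could absorb.

    With repair data equal to a fraction p of the source data, a block
    stays recoverable if at most p / (1 + p) of its encoded data is lost.
    The block's data is assumed to be spread evenly over the period.
    """
    loss_fraction = protection_amount / (1.0 + protection_amount)
    return protection_period * loss_fraction


# Basic stream transmitter: 5 second blocks, 20% protection.
basic = tolerable_burst(5.0, 0.20)
# Dynamic interleaving in steady state: 0.5 second blocks plus
# 10 seconds of interleaving, 5% protection.
dynamic = tolerable_burst(0.5 + 10.0, 0.05)
```

Under these assumptions both settings cover a 500 millisecond burst: the first tolerates roughly 0.83 seconds and the second almost exactly 0.5 seconds, while the second uses a quarter of the protection amount.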

Thus, in summary, with the dynamic interleaving method the content move-to-time can be short, e.g., under 1 second compared with a content move-to-time of several seconds for a basic stream transmitter under the same network conditions, while the dynamic interleaving method can also provide superior long-term protection against network jitter and bursty packet loss.

There are many variations on how the parameters may be specified. For example, an initial starting position on the tape, an initial amount of interleaving, a final amount of interleaving, and a period over which to transition smoothly from the initial to the final interleaving may be specified instead. Alternatively, rather than specifying the period of the smooth transition from the initial to the final interleaving, the speed at which the transition is to be made may be specified relative to the content stream speed. As another example of a variation, further parameters may be known to the sender or specified by the receiver, e.g., the receiver may explicitly signal the starting position S from which it will start playing the content.

There are many variations of the dynamic interleaving method that can be appreciated by those skilled in the art. For example, the transmitter may decide to filter some encoded data from some or all of the encoded blocks and not transmit some encoded data to individual receivers, e.g., because those receivers do not experience much loss. As another variation, the source block structure may be predetermined, but FEC encoding operates to generate encoded blocks for individual receivers while the transmission process is running, or occasionally to generate a large number of repair symbols for certain encoded blocks when a receiver encounters a need for a greater amount of protection than previous receivers.

As another variation, which is often preferred, the receiver may control the initial parameters of the dynamic interleaving method, while the transmitter or group of transmitters determines the final target parameters. For example, the receiver may specify that the content stream is to start with an interleaving depth of 2 seconds and a reserved buffer of 1 second, and the server may then decide to transmit in such a way that an interleaving depth of 20 seconds and a reserved buffer of 10 seconds are reached within the first 2 minutes of transmission. One advantage of having the server or servers control the final parameters of the dynamic interleaving method is that live streaming is easier to support: parts of the content stream beyond the current time are not yet available, so the servers can steer the dynamic interleaving parameters toward final settings that work under the given constraints. Another advantage of having the server control the final parameters is that the server may in some cases adjust the parameters of multiple clients being served the same content stream from substantially the same position in the stream, in such a way that many receivers are eventually steered to the same final parameters. This makes the server more efficient in sending packets to these receivers, since the same packets from the content stream are then sent to all of these receivers at the same time.

Dynamic interleaving method for content segment transition

One example use of the dynamic interleaving method is when a receiver transitions from one content segment to the next in a list of content segments, such as when transitioning from a segment of an episode of a show to an advertisement and then back to the next segment of the show, where all transitions occur without any receiver interaction. Different content segments may be sent by different transmitters, e.g., the segments of the show episode may be sent to the receiver by a content server and the inserted advertisement by an advertisement server.

A first example is when the receiver has been watching a first content segment sent by a first transmitter using the dynamic interleaving method described above, and the first transmitter has been sending long enough that the full interleaving depth D and reserved buffer time R have been established. The dynamic interleaving method may then be used to achieve a smooth transition to the second content segment as follows.

1. Starting D + R seconds before the first content segment ends playing, the transmission speed of the first segment decreases linearly from the encoded stream speed to 0 over a period of D seconds, at which point the first transmitter stops transmitting the first segment.

2. D + R seconds before the first content segment ends playing, the receiver requests the second content segment from the second server using the parameters UI = 0, LI = −D, UF = 0, LF = −D, and T = 0. Assuming no network delay, the second server starts transmitting the stream of the second content segment, increasing its speed linearly over the first D seconds of transmission and thereafter transmitting at the encoded stream speed.

3. When the first content segment ends playing, the reserved buffer for the second content has been established as R seconds, and the interleaving depth has been established as D seconds. At this point, playback of the second content may begin.

Thus, the transition from the first content segment to the second content segment keeps the receiving speed at the receiver at the encoded stream speed, i.e., the transmission speed of the second content segment increases linearly as the transmission speed of the first content segment decreases linearly, in such a way that the combined speed over the transition is the same as if a single content segment were being transmitted continuously. Also, the reserved buffer protection and the interleaving protection for the second stream are the same as for the first stream in steady state. Fig. 12 is an illustration thereof.

Even if the start of transmission of the second content segment is slightly mistimed relative to the end of transmission of the first content segment, the net error in the stream transmission speed is small, because the ramp-down and the ramp-up are both smoothly linear. For example, if there is an error of 500 milliseconds in the transition timing between the two streams and the interleaving depth is 10 seconds, the error in the stream transmission speed is at most 5%. This also means that the parameters for the second content segment can be set somewhat more conservatively than described above, i.e., so that they build up slightly more reserved buffer and interleaving time than the first stream had rather than merely matching its values, and the resulting increase in the combined stream transmission speed during the content segment transition will be small.
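The ramp-down, the ramp-up, and the effect of a timing error can be sketched numerically. The model below is an assumption made for illustration: points are spread uniformly, so each transmitter's speed is proportional to the fraction of its line that lies inside its content segment, and speeds are expressed as fractions of the encoded stream speed. The names `ramp_down`, `ramp_up`, `combined`, and `offset` are hypothetical.

```python
def ramp_down(tau, D):
    """Fraction of the encoded stream speed sent by the first transmitter,
    tau seconds after the top of its line reaches the end of its segment."""
    return min(1.0, max(0.0, 1.0 - tau / D))


def ramp_up(tau, D, offset=0.0):
    """Fraction sent by the second transmitter, which starts `offset`
    seconds late (negative means early) relative to the ideal instant."""
    return min(1.0, max(0.0, (tau - offset) / D))


def combined(tau, D, offset=0.0):
    """Combined speed seen by the receiver during the transition."""
    return ramp_down(tau, D) + ramp_up(tau, D, offset)


D = 10.0
# Perfect timing: the combined speed stays at the encoded stream speed.
perfect = [combined(k / 10, D) for k in range(0, 101)]
# 500 ms late start: the combined speed dips by at most 0.5 / 10 = 5%.
worst = max(abs(combined(k / 10, D, offset=0.5) - 1.0)
            for k in range(-10, 120))
```

With an early start (negative `offset`) the same bound applies in the other direction, i.e., the combined speed briefly exceeds the encoded stream speed by at most the same fraction.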

A second example of a content segment transition is when the receiver is viewing a first content segment transmitted by a first transmitter using the dynamic interleaving method as described above, but the first transmitter has not been transmitting long enough for the full interleaving depth D and the reserved buffer time R to have been established. A smooth transition to the second content segment can then be achieved using the dynamic interleaving method as follows: all the receiver has to do in this case is set the parameters and request the second stream in such a way that the transition from the first stream to the second stream is as if the two content segments were concatenated and sent by one server. The details of how this type of transition can be implemented using the dynamic interleaving method can be derived by those skilled in the art.

A third example of a content segment transition is when the receiver is viewing a first content segment transmitted by a first transmitter using the dynamic interleaving method as described above, and there is a gap between the time when the first content segment ends playing at the receiver and the time when the second content segment is to begin playing. This may be the desired behavior, for example, when a first segment of an episode of a show ends playing, is followed by a non-streamed advertisement lasting, e.g., 30 seconds, and is then followed by immediate play of a second segment of the episode. In this case, the dynamic interleaving method may be used as follows, assuming for simplicity that the first content segment has been transmitted to the receiver long enough that the full interleaving depth D and the reserved buffer time R have been established. As in the first example above, the receiver sends a request for the second content segment before the first content segment ends playing, using the parameters UI = 0, LI = −D, UF = 0, LF = −D, and T = 0. This causes the second server to begin transmitting the second content segment at a speed that, when combined with the speed of the first content segment transmitted from the first server, equals the speed of a single encoded stream. Then, at the moment the first content segment finishes playing at the receiver, the receiver signals the second server to stop sending the second content segment, and the transmission speed to the receiver immediately drops to 0. The 30 second gap then occurs. At the end of the gap, the receiver immediately starts playing the second content segment and at the same time sends a request for the second content segment to the second server using the parameters UI = D + R, LI = R, UF = D + R, LF = R, and T = 0. This causes the second server to resume sending the second content segment from where it left off before the gap. The overall effect is that the second content segment starts playing immediately at the predetermined time, while the combined transmission speed to the receiver at every point during the transition equals the speed of one encoded stream whenever either of the two content segments is playing, and is 0 when neither is playing. Fig. 13 is an illustration thereof.
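The parameters of the resuming request follow from where the line stood when sending was paused. A minimal sketch, assuming the first request is issued D + R seconds before the first segment ends playing (as in the first example) and using illustrative values D = 10 and R = 5; the names `static_line` and `resume` are hypothetical.

```python
def static_line(t, LF, UF):
    """(bottom, top) of the line at sending time t when T = 0."""
    return (LF + t, UF + t)


D, R = 10.0, 5.0  # illustrative interleaving depth and reserved buffer

# First request: LF = -D, UF = 0, T = 0. Sending is paused D + R seconds
# later, at the moment the first segment finishes playing.
bottom, top = static_line(D + R, LF=-D, UF=0.0)

# The paused line position becomes the start of the resuming request.
resume = {"LI": bottom, "UI": top, "LF": bottom, "UF": top, "T": 0.0}
```

Here the paused line has its bottom at R and its top at D + R, so the receiver already holds R seconds of fully sent content when play of the second segment starts.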

Those skilled in the art will recognize that there are many other uses and variations of the dynamic interleaving method described above.

Method for sub-stream based delivery

Sub-stream based delivery is a method of taking an FEC encoded stream and dividing it into sub-streams such that, for example, approximately the same amount of each encoded block is included in each sub-stream. For example, the encoded stream may be divided into 40 sub-streams, each of which carries an amount of data approximately equal to 5% of each source block, so that in this example the amount of repair data generated for each source block using FEC encoding is approximately equal to the size of the source block. More generally, when FEC encoding is applied to each source block and sub-stream based delivery is then applied, the total encoded data for each source block is divided among the sub-streams such that approximately the same amount of encoded data for each source block is included in each sub-stream, where, if the FEC encoding is systematic, the encoded data for each source block comprises the original data of the source block plus the generated repair data, and where, if the FEC encoding is not systematic, the encoded data for each source block may consist of repair data.
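One simple way to obtain sub-streams of approximately equal size is to deal the encoding symbols of each encoded block out in round-robin order. This is only a sketch: the round-robin rule and the name `split_into_substreams` are assumptions, since the text requires only that each sub-stream receive about the same amount of each encoded block.

```python
def split_into_substreams(block_symbols, num_substreams):
    """Deal the encoding symbols of one encoded block round-robin
    into `num_substreams` sub-streams."""
    substreams = [[] for _ in range(num_substreams)]
    for index, symbol in enumerate(block_symbols):
        substreams[index % num_substreams].append(symbol)
    return substreams


# A source block of 1000 symbols with 1000 repair symbols (2000 in total)
# split into 40 sub-streams gives 50 symbols each, i.e. 5% of the source.
encoded_block = list(range(2000))
substreams = split_into_substreams(encoded_block, 40)
```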

One of the main ideas of sub-stream based delivery is to send the sub-streams of a stream along possibly different paths and through possibly different servers in order to achieve a number of desirable objectives. As an example, there may be servers, referred to hereinafter as head-end servers (HES), that ingest content streams into a sub-stream based delivery system, where some of the processing performed by an HES is to establish a source block structure for the content stream, FEC encode the stream, divide the encoded stream into sub-streams, and then send the sub-streams to other servers, referred to hereinafter as distributed servers (DS), which may be located in different data centers or at different network locations. An example can be seen in fig. 14, in which each of the DSs 1430 receives a different portion of the encoded content stream 1420 from the HES 1410. Some of the processing performed by a DS includes caching sub-streams of a content stream as they pass through on their way to a receiver, accepting requests from receivers for sub-streams of a particular content stream, and sending sub-streams to a receiver, e.g., in response to a receiver request for a particular sub-stream or according to a receiver subscription. A special case of a sub-stream is the original encoded stream without further division.

Multiple receivers in a sub-stream based delivery system may request and receive sub-streams of the same content segment starting at the same starting position, where requests for different sub-streams may be sent to different DSs; in this case, several different sub-streams of the same encoded stream with the same starting position may be sent from different DSs to the same receiver. An example can be seen in fig. 15, in which the receiver 1530 requests the content stream from each of the DSs 1510, 1520. In this case, one of the DSs 1520 does not respond to the request from the receiver, while the other DSs 1510 send their sub-streams to the receiver. Provided that the responding DSs together send enough data to the receiver, the receiver can fully recover the content stream using FEC decoding.

As an example, an original 1 Mbps content stream may be ingested at the HES; the HES forms a source block structure as the content stream passes through, adds as much repair data as there is source data in the original stream (100% repair), divides the resulting 2 Mbps encoded stream into sub-streams of 100 Kbps each, and sends the resulting 20 sub-streams to 20 different DSs. A receiver that wants to play the content stream starting from a particular position in the stream may send requests to 12 of the 20 DSs, asking each for the sub-stream it holds, starting at the specified starting position. In response, all 12 DSs simultaneously send the receiver their sub-streams of the encoded stream from the specified starting position, so that each of the 12 DSs sends to the receiver at a speed of 100 Kbps and the aggregate speed is 1.2 Mbps.
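The speeds in this example follow from simple arithmetic, reproduced here as a short check. The variable names are illustrative only, and speeds are expressed in Kbps with 1 Mbps taken as 1000 Kbps.

```python
content_kbps = 1000        # original 1 Mbps content stream
repair_fraction = 1.0      # 100% repair: as much repair data as source data
num_substreams = 20        # one sub-stream per DS
requested = 12             # sub-streams requested by the receiver

encoded_kbps = int(content_kbps * (1 + repair_fraction))  # encoded stream
substream_kbps = encoded_kbps // num_substreams           # per sub-stream
receive_kbps = requested * substream_kbps                 # at the receiver

print(encoded_kbps, substream_kbps, receive_kbps)  # 2000 100 1200
```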

The sub-stream based delivery system just described has several advantages, some or all of which may be found in embodiments of the present invention, including: 1) natural load balancing across a mix of popular and unpopular content, benefiting both the serving bandwidth capacity and the storage capacity of the servers; 2) resilience to path failures, i.e., after one path fails, the receiver still receives enough data over the other paths that the content stream can be fully recovered using FEC decoding; 3) robustness against DS crashes, DS disk failures, and the like; 4) sending data from multiple DSs rather than from a single server provides a greater chance of sustaining the aggregate transmission speed to the receiver without buffer starvation at the receiver, which is especially true if TCP or HTTP is used to send the sub-streams from the DSs to the receiver, but also holds if UDP is used; and 5) the only single points of failure in the overall system are the ingest point at the HES and the receiver itself, and not necessarily anywhere else.

Combined dynamic interleaving method and sub-stream based delivery method

The dynamic interleaving method and the sub-stream based delivery method described herein can be combined to provide further benefits, i.e., all the advantages of both methods can be found in a combined solution. For example, with the dynamic interleaving method in use, when a content stream is ingested into the system, the source block structure and the FEC encoding of the content stream can be produced by the HES. The sub-streams of the FEC encoded stream may be generated at the HES using the sub-stream method, and these sub-streams may then be sent to different DSs for storage. When a receiver wants to receive the content stream from a particular position in the stream, the receiver can send the appropriate dynamic interleaving parameters to all DSs that are to send it sub-streams, and the DSs will send their sub-streams to the receiver according to these parameters. The receiver can put together the packets received from the sub-streams for each source block to reconstruct the original content stream for play. The dynamic interleaving method allows the reserved buffer and the interleaving depth to grow during streaming, providing superior protection against bursty packet loss and network jitter while still giving the receiver a short channel move-to-time. The DSs in this exemplary solution do not have to perform FEC encoding, and they can deliver the content stream to receivers from different parts of the network over distributed paths, thereby increasing server diversity and path diversity for delivery and hence reliability and robustness to server and network failures.

In addition, the amount of protection per source block in this example may be much higher between the HES and the DSs than between the DSs and the receiver. For example, 20 sub-streams may be generated and sent from the HES to 20 DSs, whereas only 10 sub-streams are needed to recover the original content stream (a 100% protection amount). The receiver may request only 12 sub-streams (a 20% protection amount), e.g. from 12 of the 20 DSs. This allows the receiver to recover the original content stream even if one of the DSs fails and up to about 10% of the packets are lost in aggregate on the paths from the remaining 11 DSs from which the receiver receives sub-streams.
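The arithmetic of this example can be checked with a short sketch. It assumes, purely for illustration, equal-sized sub-streams and an ideal FEC code for which a source block is recoverable as soon as the received data equals the data carried by 10 complete sub-streams. The function names are hypothetical and are not part of this disclosure.

```python
from fractions import Fraction


def protection_amount(sent_substreams, needed_substreams):
    """Repair data expressed as a fraction of the source data."""
    return Fraction(sent_substreams - needed_substreams, needed_substreams)


def max_tolerable_loss(active_substreams, needed_substreams):
    """Largest aggregate packet-loss fraction that still leaves enough data.

    Assumes an ideal code: any data equal in amount to `needed_substreams`
    complete sub-streams suffices. Returns None if recovery is impossible.
    """
    if active_substreams < needed_substreams:
        return None
    return 1 - Fraction(needed_substreams, active_substreams)


hes_to_ds = protection_amount(20, 10)        # HES to DSs: 20 sent, 10 needed
ds_to_receiver = protection_amount(12, 10)   # DSs to receiver: 12 requested
after_one_failure = max_tolerable_loss(11, 10)  # one of the 12 DSs has failed

print(f"HES to DS protection:      {float(hes_to_ds):.0%}")
print(f"DS to receiver protection: {float(ds_to_receiver):.0%}")
print(f"Tolerable loss, 11 DSs:    {float(after_one_failure):.1%}")
```

Under these assumptions the tolerable aggregate loss after one DS failure is 1/11, a little over 9%, which is close to the 10% figure of the example. A practical FEC code may need slightly more received data than an ideal one, which would lower this bound somewhat.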

The exemplary solution described above has an additional property. With appropriate logic in a receiver that holds a list of more than 12 of the 20 DSs, when one of the 12 DSs from which the receiver is receiving a sub-stream fails, the receiver can automatically detect the failure and request another sub-stream from one of the DSs from which it is not currently receiving a sub-stream. The receiver thereby goes from receiving 11 sub-streams back to receiving 12 sub-streams from 12 different DSs, which increases the reliability of the stream.
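The receiver logic described in this paragraph might be sketched as follows. This is only an illustration under stated assumptions: failure detection is abstracted into a hypothetical `is_alive` callable (for example, a timeout on packet arrival), each DS is assumed to hold a distinct sub-stream, and the names `rebalance`, `active`, and `known` are invented for the sketch.

```python
def rebalance(active, known, is_alive, target=12):
    """Return the DSs to receive from after replacing failed DSs with spares.

    active   -- identifiers of DSs currently delivering sub-streams
    known    -- identifiers of all DSs on the receiver's list (more than target)
    is_alive -- hypothetical callable: True if the DS is still delivering data
    target   -- number of sub-streams the receiver wants to receive
    """
    # Keep every DS that is still delivering its sub-stream.
    survivors = [ds for ds in active if is_alive(ds)]
    # Spares are DSs on the list that the receiver is not currently using.
    spares = [ds for ds in known if ds not in active and is_alive(ds)]
    # Request just enough additional sub-streams to get back to the target.
    needed = max(0, target - len(survivors))
    return survivors + spares[:needed]


# Example: 20 known DSs, 12 in use, one of them fails.
known_ds = [f"ds{i}" for i in range(20)]
active_ds = known_ds[:12]
failed = {"ds3"}
new_active = rebalance(active_ds, known_ds, lambda ds: ds not in failed)
print(len(new_active), "ds3" in new_active)
```

Because the replacement comes from a DS that was not previously in use, the receiver again holds 12 distinct sub-streams, as in the example above.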

The changes to the methods that are required to combine the dynamic interleaving method and the sub-stream method are small. For example, the method of the dynamic interleaving method for spreading out the data within a coded block to determine transmission times needs to be enhanced so that each DS can decide how to spread the data of each sub-stream it holds within a coded block uniformly over the coded block region within the coded stream band (see fig. 7). A DS can make its decisions about spreading data uniformly independently of the decisions made by the other DSs, and in such a way that the aggregate of the data from all sub-streams within the coded block, sent by all DSs to the receiver, is spread fairly uniformly over the coded block region within the coded stream band (see fig. 7).
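One way such independent but jointly uniform spreading could work is sketched below. This is not the method of this disclosure but a minimal illustration under simplifying assumptions: the coded block region is reduced to a plain time interval, every sub-stream contributes the same number of packets to the block, and each DS knows only its own sub-stream index and the total number of sub-streams defined at the HES. The function name and parameters are hypothetical.

```python
def send_times(substream_index, total_substreams, packets_in_block,
               region_start, region_end):
    """Transmission times for one sub-stream's packets of one coded block.

    Each DS spaces its own packets evenly over the region and shifts them by
    a phase derived only from its own sub-stream index, so no coordination
    with other DSs is needed.
    """
    spacing = (region_end - region_start) / packets_in_block
    phase = substream_index / total_substreams  # fraction of one spacing
    return [region_start + (i + phase) * spacing
            for i in range(packets_in_block)]


# Example: 4 sub-streams, 2 packets each, region from t = 0 to t = 8.
per_ds = [send_times(j, 4, 2, 0.0, 8.0) for j in range(4)]
merged = sorted(t for times in per_ds for t in times)
print(merged)
```

In this example the merged schedule has one packet per time unit across the region even though each DS computed its times on its own. If the receiver requests only a subset of the sub-streams, the phases remain distinct, so the aggregate is still fairly, though not perfectly, uniform.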

As another example of the changes needed to combine the methods, it is beneficial to augment the information sent with each packet so that, when a receiver specifies a particular position in a stream to the DSs that are to send it sub-streams, each DS interprets that position within the sub-stream it is to send in a manner consistent with the interpretation of all other DSs that send sub-streams of the same content to the receiver. Those skilled in the art will recognize that these and possibly some other minor changes allow the dynamic interleaving method and the sub-stream based delivery method to be combined to provide further benefits.

While the invention has been described with respect to exemplary embodiments, those skilled in the art will recognize that numerous modifications are possible, and such modifications will be apparent to them upon reading the present disclosure. For example, the processes described herein may be implemented using hardware components, software components, and/or combinations thereof. Therefore, while the invention has been described with respect to exemplary embodiments, it will be appreciated that the invention is intended to cover all modifications and equivalents within the scope of the appended claims.

Claims

1. In a communication system including at least one transmitter and at least one receiver, a method for transmitting a content stream, comprising:

forming a connection between the receiver and a first transmitter;

receiving, at the receiver, a first content stream transmitted from the first transmitter, wherein the first content stream includes an initial amount of interleaving; and

adjusting the amount of interleaving contained in the first content stream independent of a source block structure of the first content stream during transmission of the first content stream.

2. The method of claim 1, wherein the initial amount of interleaving in the first content stream is configured such that there is no initial interleaving in the first content stream.

3. The method of claim 1, wherein the amount of interleaving in the first content stream is adjusted from the initial amount to a steady state amount.

4. The method of claim 3, wherein the interleaving in the first content stream transitions linearly between the initial amount and the steady-state amount.

5. The method of claim 1, wherein the amount of interleaving contained in the first content stream is adjusted as a function of time.

6. The method of claim 1, wherein the amount of interleaving contained in the first content stream is adjusted according to a difference between a playback speed of the first content stream and a transmission speed of the first content stream.

7. The method of claim 1, wherein the amount of interleaving contained in the first content stream is adjusted according to an amount of data loss experienced at the receiver.

8. The method of claim 1, wherein the source block structure of the first content stream does not change during transmission of the first content stream.

9. The method of claim 1, further comprising:

establishing a reserve buffer of content from the first content stream.

10. The method of claim 9, wherein the reserve buffer is established concurrently with any adjustments made to the amount of interleaving in the first content stream.

11. The method of claim 1, further comprising:

forming a second connection between the receiver and a second transmitter;

receiving, at the receiver, a second content stream transmitted from the second transmitter connected to the receiver, wherein the second content stream contains an initial amount of interleaving;

adjusting an amount of interleaving contained in the second content stream delivered to the receiver during transmission of the second content stream independent of a source block structure of the second content stream; and

transitioning between the first content stream and the second content stream in a manner that maintains an aggregate transmission speed of the first content stream and the second content stream substantially constant.

12. The method of claim 11, wherein the transition between the first content stream and the second content stream is performed over time as a function of the amount of interleaving contained in both the first content stream and the second content stream.

13. In a transmitter for transmitting data over a channel, a method for transmitting a content stream, comprising:

forming a connection between the transmitter and a receiver;

transmitting a content stream to the receiver, wherein the content stream contains an initial amount of interleaving; and

adjusting the amount of interleaving contained in the content stream independent of a source block structure of the content stream during transmission of the content stream.

14. In a receiver that receives data over a channel, a method for receiving a content stream, comprising:

forming a connection between the receiver and a first transmitter; and

receiving a first content stream transmitted from the first transmitter, wherein the first content stream contains an initial amount of interleaving that is adjustable during transmission of the first content stream independent of a source block structure of the first content stream.

15. In a communication system including at least one transmitter and at least one receiver, a method for transmitting a content stream, comprising:

forming connections between the receiver and a plurality of transmitters;

receiving, at the receiver, content streams transmitted from the plurality of transmitters, wherein each transmitter transmits a different sub-stream of the content stream to the receiver, and each sub-stream contains an initial amount of interleaving; and

adjusting the amount of interleaving contained in each content sub-stream independently of a source block structure of the content sub-stream during transmission of the content sub-stream.

16. The method of claim 15, wherein the source block structure of the content sub-stream does not change during transmission of the content sub-stream.

17. The method of claim 15, wherein an amount of interleaving included in each content sub-stream is independent of an amount of interleaving included in other content sub-streams.