HK1134368B

HK1134368B - System and method for providing redundancy management

Info

Publication number: HK1134368B
Application number: HK10102336.7A
Authority: HK
Inventors: P. Ojala; A. Lakaniemi
Original assignee: Nokia Technologies Oy
Priority date: 2006-09-26
Filing date: 2007-09-26
Publication date: 2014-05-30

Description

System and method for providing redundancy management

Technical Field

The present invention relates generally to speech coding. More particularly, the present invention relates to speech coding, error resilience, and speech transmission over packet-switched networks for voice over IP (VoIP) applications.

Background

This section is intended to provide a background or context to the invention that is recited in the claims. The description herein may include concepts that could be pursued, but are not necessarily ones that have been previously conceived or pursued. Therefore, unless otherwise indicated herein, what is described in this section is not prior art to the description and claims in this application and is not admitted to be prior art by inclusion in this section.

In existing packet-switched network transmission protocols, every IP packet in which the receiver detects bit errors is discarded. In other words, if an error is detected upon reception, the protocol stack does not pass the distorted packet to the application layer. For this reason, the application layer faces packet loss when IP packets are transmitted over error-prone wireless links or over any other medium that may introduce errors. On the other hand, in such a scheme no packet that reaches the application layer contains residual bit errors.

Because distorted packets are not passed to the application layer, the error concealment algorithm cannot make use of partially correct frames and must instead replace each lost frame in full. This situation becomes even more difficult when more than one consecutive packet is lost. Various approaches have been introduced to cope with such packet loss situations. Some conventional approaches include multiple description coding, where information is distributed over several IP packets, and application layer Forward Error Correction (FEC), where the FEC information is used to reconstruct lost packets.

In addition to the above, another approach to the packet loss problem is redundant transmission. The advantage of redundant transmission is its low computational requirement. Redundant transmission can be achieved by simply placing the current frame and one or more previous frames within the same packet. Decoding the redundant stream is also very straightforward: when a packet is lost, the receiver only needs to wait for the next packet to arrive in order to obtain the corresponding frame for the decoder. However, one problem with the frame redundancy technique is that it increases the bit rate. With frame redundancy, the bandwidth requirement is essentially doubled when one additional frame is appended to the IP packet in question, and it grows further with each additional frame. Furthermore, with frame redundancy, the total delay increases by an amount corresponding to the amount of redundancy, since the receiver needs to buffer the speech frames.
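The frame redundancy scheme described above can be illustrated with a minimal sketch. The function names `packetize` and `receive`, the representation of a packet as a list of `(frame_index, frame)` pairs, and the use of the list position as the packet sequence number are all assumptions made for illustration; they are not taken from any payload format.

```python
from collections import deque


def packetize(frames, redundancy=1):
    """Build one packet per frame; each packet carries the current frame
    plus up to `redundancy` previous frames (hypothetical layout)."""
    history = deque(maxlen=redundancy)  # most recent previous frames
    packets = []
    for index, frame in enumerate(frames):
        # Previous frames first, current frame last.
        packets.append(list(history) + [(index, frame)])
        history.append((index, frame))
    return packets


def receive(packets, lost):
    """Collect every frame that can be recovered when the packets whose
    sequence numbers are in `lost` never arrive."""
    recovered = {}
    for seq, payload in enumerate(packets):
        if seq in lost:
            continue  # the protocol stack dropped this packet
        for index, frame in payload:
            recovered.setdefault(index, frame)
    return recovered


frames = ["f0", "f1", "f2", "f3"]
packets = packetize(frames, redundancy=1)
single_loss = receive(packets, lost={1})
double_loss = receive(packets, lost={1, 2})
```

With 100% redundancy, a single lost packet is fully covered by the packet that follows it, whereas two consecutive losses leave one frame unrecoverable, so the error concealment algorithm would still have to replace that frame.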

Conventional solutions for codec mode adaptation and mode selection in redundant transmission typically either simply copy one or more previous frames into the packet while maintaining the existing codec bit rate or, alternatively, reduce the codec bit rate so that the resulting overall rate does not increase significantly. For example, when a narrowband speech call is set up using an AMR codec at 12.2 kbit/s, 100% redundancy (i.e., one previous speech frame is appended to the packet and transmitted together with the current frame) is achieved with the 5.9 kbit/s mode, which results in a bit rate of 11.8 kbit/s. Furthermore, 200% redundancy (i.e., two previous speech frames are appended) is achieved with the 4.75 kbit/s mode, which results in a bit rate of 14.25 kbit/s. AMR is an audio data compression scheme that uses link adaptation to select one of eight different bit rates based on link conditions. For AMR-WB, for example, 100% redundancy with the lowest possible rate (8.85 kbit/s) would result in 17.7 kbit/s when the nominal rate is 12.65 kbit/s in an AMR-WB arrangement. Furthermore, it is always possible that the network and/or the receiving terminal does not support the codec mode that is automatically selected for redundant transmission based on bit rate requirements.
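The rates quoted above follow from multiplying the codec mode rate by the number of frames carried in each packet. The short sketch below reproduces that arithmetic; the function name `total_rate_kbps` is hypothetical, and the calculation deliberately ignores payload header and IP/UDP/RTP overhead, which is an assumption of this illustration.

```python
def total_rate_kbps(mode_rate_kbps, redundant_frames):
    """Speech payload rate when each packet carries the current frame plus
    `redundant_frames` previous frames (1 = 100%, 2 = 200% redundancy)."""
    return round(mode_rate_kbps * (1 + redundant_frames), 2)


amr_100 = total_rate_kbps(5.9, 1)     # AMR 5.9 kbit/s mode, 100% redundancy
amr_200 = total_rate_kbps(4.75, 2)    # AMR 4.75 kbit/s mode, 200% redundancy
amrwb_100 = total_rate_kbps(8.85, 1)  # AMR-WB 8.85 kbit/s mode, 100% redundancy
```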

One previously proposed system for addressing the problems discussed above can be found at www.ietf.org/rfc/rfc3267.txt. In this system, the sender is responsible for selecting an appropriate amount of redundancy based on feedback received about the channel being used (e.g., in RTCP receiver reports). However, this system relies on feedback and may therefore run into problems if the appropriate information is not received from the decoding device.

Accordingly, it is desirable to develop systems and methods that address the above-mentioned problems.

Disclosure of Invention

Various embodiments of the present invention provide a system and method for more efficiently implementing redundancy management in speech coding applications. According to various embodiments of the present invention, a transmitting device selects a redundant transmission level that is appropriate for the current transmission channel conditions while, at the same time, selecting the most appropriate bit rate from the available codec mode set. In embodiments of the present invention, the level of redundant transmission used is generally optimal for the selected codec mode, and no re-negotiation of the codec is required. The codec mode change period and the restriction that the mode may only change to an adjacent mode may limit the adaptation speed of multi-rate codecs (this is discussed in www.ietf.org/rfc/rfc3267.txt). In this case, while stepping towards the optimal codec mode configuration, the redundant transmission uses intermediate modes within the negotiated adaptation limits. The system and method of various embodiments of the present invention can be applied to virtually any multi-rate speech coder, such as the 3GPP adaptive multi-rate (AMR) and AMR-WB coders.

These and other advantages and features of the invention, together with the organization and manner of operation thereof, will become apparent from the following detailed description when taken in conjunction with the accompanying drawings, wherein like elements have like numerals throughout the several drawings described below.

Drawings

FIG. 1 illustrates a generic multimedia communication system for use with the present invention;

FIG. 2 is a flow chart showing the implementation of various embodiments of the present invention;

FIG. 3 is a perspective view of an electronic device that may be used in the implementation of the present invention; and

FIG. 4 is a schematic representation of the mobile phone circuitry of FIG. 3.

Detailed Description

Various embodiments of the present invention provide a system and method for more efficiently implementing redundancy management in speech coding applications. According to various embodiments of the present invention, a transmitting device selects a redundant transmission level that is appropriate for the current transmission channel conditions while, at the same time, selecting the most appropriate bit rate from the available codec mode set. In embodiments of the present invention, the level of redundant transmission used is generally optimal for the selected codec mode, and no re-negotiation of the codec is required. The codec mode change period and the restriction that the mode may only change to an adjacent mode may limit the adaptation speed of the multi-rate encoder (this is discussed in www.ietf.org/rfc/rfc3267.txt). In this case, while stepping towards the optimal codec mode configuration, the redundant transmission uses intermediate modes within the negotiated adaptation limits. The system and method of various embodiments of the present invention can be applied to virtually any multi-rate speech coder, such as the 3GPP adaptive multi-rate (AMR) and AMR-WB coders.
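The effect of a mode change period combined with an adjacent-mode restriction can be sketched as follows. This is only an illustration under stated assumptions: the function names `mode_path` and `mode_schedule`, the `change_period` parameter (counted in frames), and the example mode set are hypothetical, and "adjacent" is taken to mean the neighbouring rate within the negotiated mode set.

```python
def mode_path(mode_set, current, target):
    """Modes visited when only a change to an adjacent mode of the
    negotiated set is allowed at each adaptation step."""
    ordered = sorted(mode_set)
    start, end = ordered.index(current), ordered.index(target)
    step = 1 if end > start else -1
    return [ordered[k] for k in range(start, end + step, step)]


def mode_schedule(mode_set, current, target, change_period=2):
    """Pair each mode on the path with the frame offset at which it can
    first be used when a change is allowed every `change_period` frames."""
    path = mode_path(mode_set, current, target)
    return [(n * change_period, mode) for n, mode in enumerate(path)]


negotiated = [4.75, 5.9, 7.4, 12.2]  # hypothetical negotiated mode set, kbit/s
schedule = mode_schedule(negotiated, current=12.2, target=4.75)
```

In this example the target mode is reached only six frames after the decision, and the intermediate modes 7.4 and 5.9 kbit/s are used on the way, which is the situation the paragraph above refers to.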

FIG. 1 shows a generic multimedia communication system for use with the present invention. As shown in FIG. 1, a data source 100 provides a source signal in an analog format, an uncompressed digital format, a compressed digital format, or any combination of these formats. The encoder 110 encodes the source signal into an encoded media bitstream. The encoder 110 may be capable of encoding more than one media type, such as audio and video, or more than one encoder 110 may be required to encode source signals of different media types. The encoder 110 may also get synthetically produced input, such as graphics and text, or it may produce an encoded bitstream of synthetic media. In the following, the processing of one coded media bitstream of only one media type is considered in order to simplify the description. It should be noted, however, that a real-time broadcast service typically includes several streams (typically at least one audio stream, a video stream, and a text subtitle stream). It should also be noted that the system may include many encoders, but only one encoder 110 is considered below to simplify the description without loss of generality.

The encoded media bitstream is transferred to the storage device 120. The storage device 120 may comprise any type of mass storage to store the encoded media bitstream. The format of the encoded media bitstream in the storage device 120 may be an elementary self-contained bitstream format, or one or more encoded media bitstreams may be encapsulated in a container file. Some systems operate "live", i.e., they omit storage and transfer the encoded media bitstream from the encoder 110 directly to the sender 130. The encoded media bitstream is then transferred to the sender 130, also referred to as the server, as needed. The format used in the transmission may be an elementary self-contained bitstream format or a packet stream format, or one or more encoded media bitstreams may be encapsulated in a container file. The encoder 110, the storage device 120, and the sender 130 may reside in the same physical device, or they may be contained in separate devices. The encoder 110 and the sender 130 may operate with live real-time content, in which case the encoded media bitstream is typically not stored permanently but rather buffered for short periods of time in the encoder 110 and/or the sender 130 to smooth out variations in processing latency, transfer latency, and encoded media bit rate.

The sender 130 transmits the encoded media bitstream using a communication protocol stack. The stack may include, but is not limited to, the Real-time Transport Protocol (RTP), the User Datagram Protocol (UDP), and the Internet Protocol (IP). When the communication protocol stack is packet-oriented, the sender 130 encapsulates the encoded media bitstream into packets. For example, when RTP is used, the sender 130 encapsulates the encoded media bitstream into RTP packets according to an RTP payload format. Typically, each media type has a dedicated RTP payload format. It is further noted that the system may comprise more than one sender 130, but for simplicity the following description considers only one sender 130.
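As a rough illustration of the encapsulation step, the sketch below prepends the 12-byte fixed RTP header to an opaque payload. The function name `rtp_packet`, the dynamic payload type value 96, and the example field values are assumptions for illustration only, and the payload bytes here do not follow any particular RTP payload format.

```python
import struct


def rtp_packet(payload, seq, timestamp, ssrc, payload_type=96, marker=False):
    """Prepend a fixed RTP header (version 2, no padding, no extension,
    no CSRC entries) to an already formatted payload."""
    first_byte = 2 << 6  # version field occupies the top two bits
    second_byte = (int(marker) << 7) | (payload_type & 0x7F)
    header = struct.pack(
        "!BBHII",  # network byte order: 1 + 1 + 2 + 4 + 4 = 12 bytes
        first_byte,
        second_byte,
        seq & 0xFFFF,
        timestamp & 0xFFFFFFFF,
        ssrc & 0xFFFFFFFF,
    )
    return header + payload


# Hypothetical values: one opaque two-byte payload.
packet = rtp_packet(b"\x01\x02", seq=7, timestamp=160, ssrc=0x1234)
```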

The sender 130 may or may not be connected to a gateway 140 through a communication network. The gateway 140 may perform different types of functions, such as translating a packet stream from one communication protocol stack to another, merging and splitting data streams, and processing data streams according to downlink and/or receiver capabilities, for example by controlling the bit rate of the forwarded stream according to prevailing downlink network conditions. Examples of gateways 140 include a multipoint conference control unit (MCU), a gateway between circuit-switched and packet-switched video telephony, a push-to-talk over cellular (PoC) server, an IP encapsulator in a digital video broadcasting-handheld (DVB-H) system, and a set-top box that forwards broadcast transmissions locally to a home wireless network. When RTP is used, the gateway 140 is referred to as an RTP mixer and acts as an endpoint of an RTP connection.

Alternatively, the encoded media bitstream may be transferred from the sender 130 to the receiver 150 by other means, for example by connecting a portable mass storage disk or device to the sender 130, storing the encoded media bitstream on it, and then connecting the disk or device to the receiver 150.

The system includes one or more receivers 150 that are typically capable of receiving, demodulating, and decapsulating the transmitted signal into an encoded media bitstream. Decapsulation may include removing data that the receiver cannot decode or does not wish to decode. The encoded media bitstream is typically further processed by a decoder 160, the output of which is one or more uncompressed media streams. Finally, the renderer 170 may reproduce the uncompressed media streams, for example, through a speaker or a display. The receiver 150, decoder 160 and renderer 170 may be in the same physical device or they may be contained in separate devices.

FIG. 2 is a flow chart showing the implementation of one embodiment of the present invention. At 200 in FIG. 2, the application layer FEC algorithm on the sender side determines that redundant transmission is to be used. At 210, the application layer FEC algorithm checks the available codec mode set, which was established in the preceding Session Description Protocol (SDP) offer-answer negotiation with the receiving device. At 220, the application layer FEC algorithm selects the available codec mode that, at the given redundancy level, best matches the current bit rate. For example, when the current rate is 12.2 kbit/s and the negotiated available codec modes are AMR-NB 12.2 kbit/s, 7.4 kbit/s and 4.75 kbit/s, the FEC algorithm selects the 7.4 kbit/s mode for 100% redundancy and the 4.75 kbit/s mode for 200% redundancy. If the available codec modes include only the current mode, as shown at 230, the FEC algorithm does not change modes. In addition, if the codec can only change modes every other frame because of restrictions in codec mode request signaling, a change to a lower mode cannot be performed immediately. The same applies when the codec mode adaptation step size is limited, i.e., when the codec can only change to an adjacent codec mode. Thus, if bandwidth limitations apply, redundant transmission is applied only once the codec has reached the desired mode suitable for redundant transmission.
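The selection at 220 and the fallback at 230 can be sketched as a nearest-rate search. This is a minimal illustration rather than the claimed implementation: the function name `select_mode`, the use of the absolute rate difference as the "best match" criterion, and the tie-breaking rule (the first listed mode wins) are assumptions, and packetization overhead is ignored.

```python
def select_mode(available_modes, current_rate, redundant_frames):
    """Pick the negotiated mode whose total rate, with `redundant_frames`
    previous frames appended, is closest to the current bit rate."""
    if set(available_modes) == {current_rate}:
        return current_rate  # step 230: only the current mode is available

    def distance(mode_rate):
        return abs(mode_rate * (1 + redundant_frames) - current_rate)

    return min(available_modes, key=distance)  # step 220


negotiated = [12.2, 7.4, 4.75]  # AMR-NB modes from the example, kbit/s
mode_100 = select_mode(negotiated, current_rate=12.2, redundant_frames=1)
mode_200 = select_mode(negotiated, current_rate=12.2, redundant_frames=2)
```

For 100% redundancy the candidates yield 24.4, 14.8 and 9.5 kbit/s, so the 7.4 kbit/s mode is closest to 12.2 kbit/s; for 200% redundancy they yield 36.6, 22.2 and 14.25 kbit/s, so the 4.75 kbit/s mode is chosen, matching the example in the text.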

FIGS. 3 and 4 show an illustrative electronic device 12 in which the present invention may be implemented. It should be understood, however, that the present invention is not intended to be limited to one particular type of electronic device 12. The electronic device 12 of FIGS. 3 and 4 includes a housing 30, a display 32 in the form of a liquid crystal display, a keypad 34, a microphone 36, an ear-piece 38, a battery 40, an infrared port 42, an antenna 44, a smart card 46 in the form of a UICC according to one embodiment of the invention, a card reader 48, radio interface circuitry 52, codec circuitry 54, a controller 56 and a memory 58. Individual circuits and elements are all of a type well known in the art, for example in the Nokia range of mobile telephones.

The present invention is described in the general context of method steps, which may be implemented in one embodiment by a program product including computer-executable instructions, such as program code, executed by computers in networked environments. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. Computer-executable instructions, associated data structures, and program modules represent examples of program code for executing steps of the methods disclosed herein. The particular sequence of such executable instructions or associated data structures represents examples of corresponding acts for implementing the functions described in such steps.

Software and web implementations of the present invention could be accomplished with standard programming techniques with rule-based logic and other logic to accomplish the various database querying steps, correlation steps, comparison steps and decision steps. It is also noted that the words "component" and "module," as used herein and in the claims, are intended to encompass an implementation using one or more lines of software code, and/or a hardware implementation, and/or a device for receiving manual inputs.

The foregoing descriptions of embodiments of the present invention have been presented for purposes of illustration and description. It is not intended to be exhaustive or to limit the invention to the precise form disclosed, and modifications and variations are possible in light of the above teachings or may be acquired from practice of the invention. The embodiments were chosen and described in order to explain the principles of the invention and its practical application to enable one skilled in the art to utilize the invention in various embodiments and with various modifications as are suited to the particular use contemplated.

Claims

1. A method for implementing redundant transmission in a packet switched network, comprising:

determining an available codec mode, the available codec mode being set according to a previous negotiation with the receiving device;

selecting a codec mode from said available codec modes; and

encapsulating the packet for transmission in accordance with a particular frame redundancy level selected in accordance with the selected codec mode;

wherein the selected codec mode is selected from the available codec modes to most closely match a current bit rate when the particular frame redundancy level is used.

2. The method of claim 1, wherein the particular frame redundancy level is only achieved for packet transmission after the selected codec mode is reached if codec adaptation is limited by an adaptation rate.

3. The method of claim 1, wherein the particular frame redundancy level is only achieved for packet transmission after the selected codec mode is reached if codec adaptation is limited by an adaptation step size to an adjacent mode.

4. The method of claim 1, wherein if the available codec modes include only a codec mode currently being used, continuing to use the codec mode currently being used.

5. The method of claim 1, wherein the particular frame redundancy level comprises 100% redundancy.

6. The method of claim 1, wherein the particular frame redundancy level comprises 200% redundancy.

7. The method of claim 1, wherein the available codec modes include a 3GPP adaptive multi-rate AMR codec mode.

8. The method according to claim 1, wherein the available codec modes comprise adaptive multi-rate wideband AMR-WB codec modes.

9. The method of claim 1, wherein the prior negotiation comprises a session description protocol, SDP, offer-answer negotiation.

10. An apparatus for implementing redundant transmission in a packet switched network, comprising:

means for determining an available codec mode, the available codec mode being set according to a previous negotiation with the receiving device;

means for selecting one codec mode from the available codec modes; and

means for encapsulating packets for transmission in accordance with a particular frame redundancy level selected in accordance with the selected codec mode;

11. The apparatus of claim 10, wherein the particular frame redundancy level is implemented for packet transmission only after the selected codec mode is reached if codec adaptation is limited by an adaptation rate.

12. The apparatus of claim 10, wherein the particular frame redundancy level is implemented for packet transmission only after the selected codec mode is reached if codec adaptation is limited by an adaptation step size to an adjacent mode.

13. The apparatus of claim 10, wherein if the available codec modes include only a codec mode currently being used, the codec mode currently being used continues to be used.

14. The apparatus of claim 10, wherein the particular frame redundancy level comprises 100% redundancy.

15. The apparatus of claim 10, wherein the particular frame redundancy level comprises 200% redundancy.

16. The apparatus of claim 10, wherein the available codec modes include a 3GPP adaptive multi-rate AMR codec mode.

17. The apparatus of claim 10, wherein the available codec modes comprise adaptive multi-rate wideband AMR-WB codec modes.

18. The apparatus of claim 10, wherein the prior negotiation comprises a session description protocol, SDP, offer-answer negotiation.