HK1149403B - Communication method and communication system - Google Patents
Description
Technical Field
The present invention relates to communication systems, and more particularly, to a method and system for server and client selectable video frame paths.
Background
For many users, multimedia communication technology has become a part of their daily lives. Multimedia technology is applicable to many popular portable or stationary devices, such as mobile phones, digital handheld audio and/or video playback devices, notebook or personal computers, televisions, projection devices, video and still camera displays, video games, set-top boxes, medical and scientific equipment, and home or commercial entertainment centers. Multimedia communication and playback devices have become very popular in the marketplace, driven both by the low cost of communication technology and by users' ever-increasing desire for higher-end media delivery systems.
Further limitations and disadvantages of conventional and traditional approaches will become apparent to one of skill in the art, through comparison of such systems with some aspects of the present invention as set forth in the remainder of the present application with reference to the drawings.
Disclosure of Invention
A method and/or system for a server and client selectable video frame path, substantially as shown in and/or described in connection with at least one of the figures, as set forth more completely in the claims.
According to an aspect of the present invention, there is provided a communication method including:
receiving, by a client device, a sequence of video frames from a media source, wherein:
receiving, over a second communication path, a video frame of the sequence of video frames that depends on the presence of one or more other video frames; and
receiving, over a first communication path, an independent video frame of the sequence of video frames that is independent of one or more other video frames; and
the client device processes the received sequence of video frames and corresponding audio content received from the media source.
Preferably, the method further comprises, in the client system, applying more robust processing to the independent video frames than to the dependent video frames.
Preferably, the method further comprises receiving the independent video frame at a higher security level than the dependent video frame.
Preferably, the method further comprises, in the client system, storing the dependent video frame and the independent video frame in separate queues, respectively.
Preferably, the method further comprises, in the client system, storing the dependent video frames and the independent video frames together in a single queue.
Preferably, the method further comprises generating repeated and/or inserted video frames based on the received independent video frames to compensate for loss of a plurality of the dependent video frames.
Preferably, the method further comprises synchronizing the corresponding audio content using the generated repeated video frames and/or inserted video frames.
Preferably, the method further comprises assembling the repeated video frames, the inserted video frames and/or the received sequence of video frames in display order.
Preferably, the method further comprises decoding the repeated video frames, the inserted video frames and/or the received sequence of video frames.
Preferably, the method further comprises performing pitch shifting (tone shifting) on the corresponding audio content.
Preferably, when encoding a video frame of the sequence of video frames, the media source limits the number of other video frames of the sequence of video frames on which that video frame depends.
Preferably, the second communication path has a higher data rate than the first communication path.
According to an aspect of the present invention, there is provided a communication system including:
one or more circuits in a client device, the one or more circuits to receive a sequence of video frames from a media source, wherein:
receiving, over a second communication path, a video frame of the sequence of video frames that depends on the presence of one or more other video frames; and
receiving, over a first communication path, an independent video frame of the sequence of video frames that is independent of one or more other video frames; and
the one or more circuits are operable to process the received sequence of video frames and corresponding audio content received from the media source.
Preferably, the one or more circuits are operable to apply more robust processing to the independent video frames than to the dependent video frames in the client system.
Preferably, the one or more circuits are operable to receive the independent video frame at a higher level of security than the dependent video frame.
Preferably, the one or more circuits are operable to store the dependent video frame and the independent video frame in separate queues, respectively, in the client system.
Preferably, the one or more circuits are operable to store the dependent video frames and the independent video frames together in a single queue in the client system.
Preferably, the one or more circuits are operable to generate repeated and/or inserted video frames based on the received independent video frames to compensate for loss of a plurality of the dependent video frames.
Preferably, the one or more circuits are operable to synchronize the corresponding audio content using the generated repeated video frames and/or inserted video frames.
Preferably, the one or more circuits are operable to assemble the repeated video frames, the inserted video frames and/or the received sequence of video frames in display order.
Preferably, the one or more circuits are operable to decode the repeated video frames, the inserted video frames and/or the received sequence of video frames.
Preferably, the one or more circuits are operable to perform pitch shifting (tone shifting) on the corresponding audio content.
Preferably, when encoding a video frame of the sequence of video frames, the media source limits the number of other video frames of the sequence of video frames on which that video frame depends.
Preferably, the second communication path has a higher data rate than the first communication path.
Various advantages, aspects and novel features of the invention, as well as details of an illustrated embodiment thereof, will be more fully described in the following description and drawings.
Drawings
FIG. 1A is a schematic diagram of an exemplary server and client system for selective delivery of multimedia data in accordance with a preferred embodiment of the present invention;
FIG. 1B is a diagram illustrating an exemplary reference video frame in a sequence of video frames including I-frames, P-frames, and B-frames in accordance with a preferred embodiment of the present invention;
FIG. 2 is a diagram of an exemplary media source server for selectively processing and transmitting multimedia data (over multiple paths) in accordance with a preferred embodiment of the present invention;
FIG. 3 is a flow chart of exemplary steps for implementing selective transmission of multimedia data in accordance with a preferred embodiment of the present invention.
Detailed Description
Certain embodiments of the present invention relate to a method and system for an alternative video frame path for delivering video frames to a media player. In various embodiments of the present invention, a sequence of video frames and corresponding audio are received from a media source and processed by a client system. In a sequence of video frames, a portion of the video frames includes some data that is used to reconstruct the video frames in the client system independently of data in other video frames. The independent portions of the video frames are received at a lower data rate over the first communication path. Another portion of the video frames in the sequence of video frames depends on data in one or more other frames to be reconstructed in the client system. The dependent portion of the video frame is received at a higher data rate over the second communication path. In this regard, the media source server is configured to limit the dependency of each frame when encoding a video frame in a sequence of video frames.
The portion of the video frames that contains independent video frame data is received at a higher level of security in the client system and is subject to more robust processing than the portion that contains dependent video frame data. In various embodiments of the present invention, the client system stores the independent video frame data and the dependent video frame data in separate path queues. In other embodiments of the invention, the client system stores the dependent video frame data and the independent video frame data in a single path queue. The independent video frame data is used to compensate for lost dependent video frames by generating repeated and/or interpolated frames. The received audio content is synchronized with the corresponding received video frames, repeated video frames, and/or inserted video frames. In addition, the received, repeated and/or inserted video frames are assembled and decoded in display order. In various embodiments of the present invention, the audio content is subject to an audio offset. In this manner, selected video frames may be processed in the client system according to their contribution to video frame reconstruction and/or audio synchronization.
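By way of illustration, the arrangement described above, in which independent frames and dependent frames arrive over different paths and may be kept in separate queues, can be sketched as follows. The Python class, dict layout, and field names are illustrative assumptions and are not part of any embodiment:

```python
from collections import deque

class DualPathQueue:
    """Illustrative client-side dual path queue: frames with no
    dependencies go to the robust-path queue, dependent frames to the
    less robust-path queue."""

    def __init__(self):
        self.robust = deque()       # independent (non-reference) frames
        self.less_robust = deque()  # dependent frames

    def enqueue(self, frame):
        # "depends_on" lists the indices of frames this frame references
        if frame["depends_on"]:
            self.less_robust.append(frame)
        else:
            self.robust.append(frame)

q = DualPathQueue()
q.enqueue({"idx": 0, "depends_on": []})      # I-frame -> robust queue
q.enqueue({"idx": 1, "depends_on": [0]})     # P-frame -> less robust queue
q.enqueue({"idx": 2, "depends_on": [0, 1]})  # B-frame -> less robust queue
```

A single integrated queue, as in other embodiments, would simply merge the two deques.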
FIG. 1A is a schematic diagram of an exemplary server and client system for selective delivery of multimedia data in accordance with a preferred embodiment of the present invention. As shown in FIG. 1A, there is shown a multimedia communication system 103, a media source server 107, encoded media 121, an offset transcoder 123, unencoded media 125, an offset encoder 127, source encoded media 129, time-stamped non-reference frame data 129a, time-stamped reference frame data 129b, time-stamped audio data 129c, a protocol stack path 137, a stack path 137a supporting more robust packet transfers, a stack path 137b supporting less robust packet transfers, a physical interface (PHY) 143, a wired and/or wireless communication network 160, a client system 109, a PHY 145, a protocol stack path 147, a more robust path 147a, a less robust path 147b, a dual or integrated path queue 149, queue management 151, stitching and frame dropping processing 153, video image recovery and audio synchronization 155, a decoder 157, and unencoded media 159.
The media source server 107 may comprise suitable logic, circuitry and/or code that may enable storage, retrieval and/or capture of multimedia data and may selectively communicate the multimedia data to the client system 109. In this regard, the multimedia data may be audio and/or video data. The media source server 107 assigns priorities to multimedia packet data and can transmit the multimedia packet data through a plurality of packet transmission methods based on the assigned priorities. The media source server 107 is operable to communicate with the client system 109 via one or more wireless and/or wired communication networks. The media source server 107 may be any suitable computing and/or communication device that can process multimedia data, such as a video server, a telephone, a website that provides a live video channel, a video-on-demand multicast or unicast, or a personal computer (PC) that can play a DVD or Blu-ray disc and send it over the internet to client devices.
The encoded media 121 may include video and/or audio data that is compressed to a specified format and/or encrypted according to standard compression methods (e.g., MPEG-1, MPEG-2, MPEG-4, or H.264). The encoded media may be received from another device or storage medium (e.g., a DVD, hard disk, or Blu-ray disc) or captured and encoded by the media source server 107. The encoded media 121 may be stored on the media source server 107.
The offset transcoder 123 may comprise suitable logic, circuitry, and/or code that may be operable to decode the encoded media 121, apply an offset to the decoded video frames, and/or re-encode the video frames. In this regard, for each target frame in the sequence of video frames, the offset transcoder 123 is operable to identify the number of frames on which the target frame depends for frame estimation. In this regard, frame estimation may use both the video image pixel data in a reference frame and information about how some elements in the reference frame change to construct another, similar frame. Exemplary frame estimation methods include frame repetition, motion vector estimation, motion vector interpolation, and/or various image transformations such as hue, density, and brightness variations.
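As a minimal, hypothetical illustration of frame estimation from a reference frame, the "image transformation information" below is reduced to a per-pixel brightness shift clamped to the 0..255 range; real codecs use motion vectors and richer transforms, and the function and data layout are assumptions, not taken from any embodiment:

```python
def estimate_frame(reference, brightness_delta):
    """Estimate a new frame from a reference frame's pixel data plus
    transformation information (here, a single brightness shift)."""
    return [[max(0, min(255, p + brightness_delta)) for p in row]
            for row in reference]

# A 2x2 "reference frame" of luma values and a brighter estimated frame.
reference_frame = [[100, 200], [0, 50]]
estimated = estimate_frame(reference_frame, 60)
```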
In various embodiments of the present invention, biasing of the video frames is achieved by limiting frame dependency to, for example, 0 or 1. In this regard, a portion of the frames in a sequence of video frames may not need to reference any other frame, but may instead serve as references for other frames. In addition, the offset transcoder 123 sets the priority of each frame in the sequence of video frames in accordance with the number of other frames that it references. After classifying and prioritizing the frames, the offset transcoder encodes the frames based on the method applied to the original encoded media 121, or may encode the frames using another method. After the frames are classified, prioritized and compressed, the compression ratio of the biased frame data is lower than that of the original encoded media 121 because frame references are used to a lesser degree.
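The prioritization step above can be sketched as follows; deriving each frame's priority directly from its reference count, and the dict layout, are illustrative assumptions:

```python
def prioritize(frames):
    """Assign each frame a priority equal to the number of other frames
    it references, so frames with no dependencies sort first."""
    for f in frames:
        f["priority"] = len(f["depends_on"])
    return sorted(frames, key=lambda f: f["priority"])

seq = [
    {"name": "B2", "depends_on": ["I0", "P1"]},
    {"name": "I0", "depends_on": []},
    {"name": "P1", "depends_on": ["I0"]},
]
ordered = prioritize(seq)  # independent I0 first, then P1, then B2
```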
The unencoded media 125 includes uncompressed video and/or audio data. The unencoded media 125 may be captured and encoded by the media source server 107. For example, the unencoded media 125 may be captured by a video camera on the media source server 107 or read from another device or storage medium. The unencoded media 125 may be stored on the media source server 107.
The offset encoder 127 may comprise suitable logic, circuitry, and/or code that may enable encoding of the unencoded media 125 and applying a bias to the video frames. In this regard, for each target frame in a sequence of video frames, the offset encoder 127 identifies the number of frames on which the target frame depends for frame estimation. For example, estimation of a target frame may rely on a reference frame for frame repetition, motion prediction, or other image transformations. In various embodiments of the present invention, biasing a frame is accomplished by limiting the dependency of the frame to, for example, 0 or 1. A portion of the frames in a sequence of video frames may not need to reference any other frames. Other frames may, for example, reference only one other frame. In addition, the offset encoder 127 sets the priority of each frame in the sequence of video frames according to the number of other frames that the frame references. After classifying and prioritizing the frames, the offset encoder encodes the frames in the specified format. Typically, the compression ratio of the offset-encoded frame data is lower than that of unbiased encoded data because the dependency of each frame is reduced.
Source encoded media 129 is exemplary frame data output by the offset transcoder 123 and/or the offset encoder 127. The source encoded media 129 may include time-stamped audio data 129c, time-stamped non-reference frame data 129a, and/or time-stamped reference frame data 129b. In this regard, the non-reference frame data 129a includes video frames that do not depend on other frames when estimating the non-reference frame image. The reference frame data 129b includes frame data that references other frames when estimating the video image. The time stamps of the audio data 129c and the video frames 129a and 129b may be used to classify and synchronize audio and images on the client system 109. In various embodiments of the present invention, the time-stamped non-reference frames 129a and the time-stamped reference frames 129b are classified according to the number of frames referenced by each frame. In this regard, frames that reference fewer other frames, e.g., the non-reference frames 129a, are delivered to the client system 109 over a different path than frames that rely on other frames for estimation information.
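The use of timestamps to merge the two classified streams back into display order on the client can be sketched as follows; the field name "ts" and the dict layout are assumptions made for illustration:

```python
def interleave_for_display(non_ref_frames, ref_frames):
    """Merge the separately delivered non-reference and reference frame
    streams back into a single display-ordered sequence by timestamp."""
    return sorted(non_ref_frames + ref_frames, key=lambda f: f["ts"])

non_ref = [{"ts": 0, "kind": "I"}, {"ts": 3, "kind": "I"}]  # robust path
ref = [{"ts": 1, "kind": "P"}, {"ts": 2, "kind": "B"}]      # fast path
display = interleave_for_display(non_ref, ref)
```

Corresponding audio samples could be aligned against the same timestamps.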
The protocol stack path 137 may comprise suitable logic, circuitry, and/or code that may enable execution of various communication protocols at various levels of security and/or robustness. In this regard, the protocol stack path 137 uses a set of protocols to format the source encoded media 129 in accordance with the OSI model (e.g., transport layer, network layer, data link layer, and physical layer). In various embodiments of the present invention, a stack path 137a that supports more robust packet transport may be used to format and route high priority frame data from source encoded media 129 to a specified range of ports (dedicated for reliable and/or secure transmission, e.g., via Transmission Control Protocol (TCP)). In this regard, TCP may guarantee proper sequential delivery of at least non-reference frames. Although TCP transfers are not as fast as some other transport layer protocols, only a portion of the frames are transferred in this way. The low priority frames are transmitted over the stack path 137b supporting less robust packet transmission. In this regard, the stack path 137b, which supports less robust packet transmission, may transmit packets containing lower priority video frames over a specified range of ports for faster but less reliable transmission, such as User Datagram Protocol (UDP).
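One way the two stack paths could be realized at the transport layer, consistent with the TCP/UDP split described above, is sketched below; the function, its signature, and the idea of selecting a socket type per path are illustrative assumptions rather than the claimed implementation:

```python
import socket

def make_path_socket(robust: bool) -> socket.socket:
    """Create a transport socket for one of the two stack paths:
    a reliable TCP (SOCK_STREAM) socket for the robust path carrying
    non-reference frames, or a faster but unreliable UDP (SOCK_DGRAM)
    socket for the less robust path carrying reference frames."""
    sock_type = socket.SOCK_STREAM if robust else socket.SOCK_DGRAM
    return socket.socket(socket.AF_INET, sock_type)

robust_path = make_path_socket(True)   # would carry non-reference frames
fast_path = make_path_socket(False)    # would carry reference frames
robust_path.close()
fast_path.close()
```

Binding each socket to a port in the dedicated range described above would complete the path setup.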
The physical interface (PHY) 143 may comprise suitable logic, circuitry, and/or code that may enable sending packets containing the source encoded media 129 from the protocol stack path 137 to the client system 109 via the wired and/or wireless communication network 160. Further, the PHY 145 may comprise suitable logic, circuitry, and/or code that may enable receiving the source encoded media from the PHY 143 via the wired and/or wireless network 160. The wired and/or wireless communication network 160 includes one or more networks adapted to communicate multimedia data. For example, the wired and/or wireless networks may include one or more of WAN, LAN, WLAN, WiFi, WiMax, Bluetooth, and ZigBee networks.
The protocol stack path 147 may comprise suitable logic, circuitry, and/or code that may enable receiving packets containing the source encoded media 129 from the single PHY 145 via the more robust path 147a and/or the less robust path 147b. The protocol stack path 147 removes the encapsulation of the lower protocol layers. Further, the protocol stack path 147 may route the source encoded media to the dual or integrated path queue 149.
The queue management 151 may comprise suitable logic, circuitry, and/or code that may enable receiving frames of the source encoded media 129 from the dual or integrated path queue 149, recovering images in the video frames, and synchronizing the audio and video frames. The video recovery and audio synchronization module 155 may compensate for lost video frames in a variety of ways. In one exemplary embodiment of the invention, the video recovery and audio synchronization module 155 may repeat one or more frames if a previous or subsequent frame is lost. In this way, the audio remains synchronized with the video images. When the number of lost frames is less than a specified threshold, the lost frames may be replaced by inserted frames, and the audio may remain synchronized with the inserted frames. In other cases, when the number of lost frames is greater than the specified threshold, the frames are dropped and the audio playback rate is increased to skip a certain number of frames and catch up with the video frames, thereby restoring synchronization of the audio and video images. In this regard, the increased audio rate results in a temporary audio offset until the video and audio are synchronized. In addition, the stitching and frame dropping processing 153 receives the recovered and synchronized frames and reassembles them based on timestamps and/or sequence numbers to generate the appropriate sequence. The queue management 151 forwards the assembled frames containing the source encoded media 129 to the decoder 157.
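The recovery policy above can be sketched as a small decision function; the threshold handling and the return format are illustrative assumptions, not the patent's specification:

```python
def compensate(lost_count, threshold, last_independent_frame):
    """Sketch of the loss-compensation policy: few losses are masked by
    repeating/inserting frames derived from the last independently
    decodable frame (audio stays in sync); many losses are dropped and
    the audio rate is increased to catch up with the video."""
    if lost_count == 0:
        return {"action": "none"}
    if lost_count <= threshold:
        return {"action": "insert",
                "frames": [last_independent_frame] * lost_count}
    return {"action": "drop", "audio": "increase_rate"}

result = compensate(2, 3, "I0")  # two lost frames, below threshold
```

The "increase_rate" branch corresponds to the temporary audio offset described above.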
The decoder 157 may comprise suitable logic, circuitry, and/or code that may enable decoding of offset video frames. In various embodiments, the decoded frames may be encoded without bias for storage on client system 109. In this regard, by removing the bias, the compression ratio may be increased, thereby making the storage more compact. In other embodiments of the present invention, the decoder 157 may output the unencoded media 159 for use by the client system 109.
In operation, a sequence of video frames and corresponding audio are received from the media source server 107 and processed by the client system 109. In the sequence of video frames, a portion of the video frames includes data that may be used to reconstruct those video frames at the client system 109 independently of data in other video frames. These independent video frames are received at a lower data rate over the first communication path. Another portion of the video frames in the sequence depends on one or more other video frames for reconstruction at the client system 109. These dependent video frames are received at a higher data rate over the second communication path. In this regard, the media source server 107 may limit the degree of dependency of each frame when encoding the video frames in the sequence of video frames.
The portion of the video frames containing independent frame data is received by the client system 109 at a higher security level, and more robust processing is applied to it, than the portion containing dependent frame data. In various embodiments of the present invention, the client system 109 stores the independent frame data and the dependent frame data in respective path queues 149. In other embodiments of the present invention, the client system 109 stores the independent frame data and the dependent frame data in the same path queue 149. The independent frame data is used to compensate for lost dependent video frames by generating repeated video frames and/or inserted video frames. The received audio content is synchronized with the corresponding received video frames, repeated video frames, and/or inserted video frames. In addition, the received, repeated and/or inserted video frames are assembled and decoded in display order. In various embodiments, the audio content may be subject to an audio offset.
FIG. 1B is a diagram illustrating exemplary reference video frames in a sequence of video frames including I-frames, P-frames, and B-frames, according to a preferred embodiment of the present invention. As shown in FIG. 1B, an exemplary sequence of video frames 102 is shown, including I-frames, B-frames, and P-frames. The video frame sequence 102 represents an exemplary video frame sequence encoded using a method such as MPEG-1 or MPEG-2. However, any other suitable compression standard may be used in accordance with embodiments of the present invention, such as MPEG-4 Part 10 (AVC) or H.264. The origin of each arrow shown in the sequence of video frames 102 indicates a reference frame whose image data is used, together with motion vector data or other image transformation information, to estimate the frame the arrow points to. In this regard, each arrow starts at a referenced frame and points to a referencing frame. The I-frames, B-frames, and P-frames are presented in display order, although the frames are encoded in an order in which each referenced frame precedes the frames that reference it. In this regard, a client system receives the data needed to reconstruct a plurality of images before decoding a referencing frame that depends on one or more referenced frames.
I-frames are intra-coded video frames. For example, the image data in an I-frame is generated from encoded pixel data, where the pixels span the image or a portion of the image. When the pixel data is decoded, it can be used to reconstruct the I-frame, and can also be used as reference data to construct P-frames and/or B-frames. A P-frame is a predictive video frame that may be generated by decoding pixel data using a reference from another image or frame, together with information describing how the image data should be transformed into the P-frame. For example, a motion vector, hue shift, or brightness shift may be applied to the reference frame to generate the P-frame. In the sequence of video frames 102, P-frames are displayed after the I-frame they reference, in display order, with some P-frames referencing previous P-frames. B-frames are bi-directionally predicted frames that reference two previously decoded video frames. In addition, B-frames may also be re-referencing frames, where a B-frame may reference a frame that itself references another frame. For example, a B-frame may reference a P-frame, which references an I-frame. In the unbiased video frame sequence 102, these references create dependencies between the frames in the sequence. When the information of an I-frame, or of another frame on which one or more referencing frames depend, is lost or corrupted during transmission between the media source server 107 and the client system 109, multiple frames are lost and cannot be efficiently reconstructed. This results in loss of synchronization and/or a noticeable interruption of the video.
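How a single lost referenced frame propagates through an unbiased sequence can be sketched by treating a frame as unreconstructable if it is lost or if any frame it references is unreconstructable; the frame layout below is an illustrative assumption:

```python
def unreconstructable(frames, lost):
    """Propagate loss through the dependency graph: return the set of
    frame indices that cannot be reconstructed."""
    bad = set(lost)
    changed = True
    while changed:
        changed = False
        for f in frames:
            if f["idx"] not in bad and any(d in bad for d in f["depends_on"]):
                bad.add(f["idx"])
                changed = True
    return bad

seq = [
    {"idx": 0, "depends_on": []},      # I-frame
    {"idx": 1, "depends_on": [0]},     # P-frame referencing the I-frame
    {"idx": 2, "depends_on": [0, 1]},  # B-frame referencing both
    {"idx": 3, "depends_on": [1]},     # P-frame referencing the P-frame
]
affected = unreconstructable(seq, {0})  # losing the I-frame
```

Losing the I-frame here makes the whole group unreconstructable, which is exactly the failure mode the biased encoding below is meant to limit.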
The exemplary video frame sequence 104 represents the result of a frame encoding method that employs an offset, in which the number of frames used to construct an image is reduced to one or two. When the image data in an I-frame (e.g., the last I-frame in display order in the sequence of video frames 104) is very different from the preceding B-frame, a second option for reconstructing the B-frame may be used, as indicated by the dashed reference arrow. In addition, the second option may also be used when a referenced P-frame is lost during transmission. In various embodiments of the present invention, there may be few P-frames and B-frames between I-frames, thereby reducing references to a limited level of dependency. Thus, data compressed using the offset encoding method of the video frame sequence 104 may comprise a greater amount of data, at a lower compression ratio, than data compressed using the encoding method represented by the video frame sequence 102. However, the encoding method used when processing the sequence of video frames 104 provides greater reliability in reconstructing the video frames in the client system 109 and provides higher synchronization performance.
In operation, the media source server 107 captures multimedia data, e.g., the unencoded media 125, or reads stored multimedia data, e.g., the encoded media 121. When the unencoded media 125 is used, the media source server 107 applies a bias to the media to limit the dependency of each video frame. When the encoded media 121 is used, the media is decoded by the offset transcoder 123 and then encoded with an offset to limit the frame dependency. In this regard, the encoded media output by the offset transcoder 123 may or may not conform to the same standard employed by the encoded media 121. The source encoded media 129 is output by the offset transcoder 123 and/or the offset encoder 127. The source encoded media 129 is time stamped to generate the time-stamped non-reference frame data 129a, the time-stamped reference frame data 129b, and the time-stamped audio data 129c. The time-stamped data is sent to the protocol stack path 137, where the classified and time-stamped video frames 129a and 129b are packetized and then sent through the stack path 137a supporting more robust packet transfers and/or the stack path 137b supporting less robust packet transfers. The packetized source encoded media 129 is sent to the single PHY 143 through either a dual path or an integrated path. The PHY 143 transmits the packetized time-stamped source encoded media 129 over a physical medium to the single PHY 145 in the client system 109. The PHY 145 sends the packetized source encoded non-reference frames 129a to the more robust path 147a and the packetized source encoded reference frames 129b to the less robust path 147b. The protocol stack path 147 transmits the source encoded media 129 to the dual or integrated path queue 149. The video image recovery and audio synchronization 155 and the stitching and frame dropping processing 153 in the queue management 151 compensate for lost or corrupted video frames, for example, using frame estimation methods such as frame repetition, frame insertion, frame dropping, and reassembly.
The time-stamped source encoded frames and the estimated frames are synchronized with the time-stamped audio and sent to the decoder 157. The unencoded media 159 will be used by the client system 109. In various embodiments of the invention, the unencoded media 159 may be encoded with higher compression ratios for subsequent storage.
FIG. 2 is a diagram illustrating an exemplary media source server for selectively processing and transmitting multimedia data over multiple paths according to a preferred embodiment of the present invention. As shown in FIG. 2, there is shown a media source server 207, a media capture device 217, an application layer process 219, optional encoded media 121, an optional offset transcoder 123, unencoded media 125, an offset encoder 127, time-stamped non-reference frame data 129a, time-stamped reference frame data 129b, time-stamped audio data 129c, a transport layer path 231, a transport path one 231a, a transport path two 231b, an Internet Protocol (IP) path 237, a robust IP path one 237a, a less robust IP path two 237b, a link layer path 239, a link path one 239a, a link path two 239b, and the PHY 143.
The media source server 207 is similar or identical to the media source server 107 depicted in FIG. 1A.
The media capture device 217 may be any suitable device capable of capturing multimedia data, such as a video camera and microphone, a cellular telephone, or a notebook computer equipped with a camera and microphone. In various embodiments of the present invention, a storage device is used to store information, including multimedia data distributed by the media source server 107.
The application layer 219 may comprise suitable logic, circuitry, and/or code that may enable encoding video frames using an offset and analyzing and classifying the encoded video frames based on the number of frames on which each frame depends. The optional encoded media 121, optional offset transcoder 123, unencoded media 125, offset encoder 127, time-stamped non-reference frame data 129a, time-stamped reference frame data 129b, and time-stamped audio data 129c are similar or identical to the correspondingly numbered components in FIG. 1A. In this regard, the optional encoded media 121 and optional offset transcoder 123 may be employed in various embodiments to distribute multimedia data retrieved from a storage device.
The transport layer path 231 may comprise suitable logic, circuitry, and/or code that may enable transport layer services for the encoded media data output by the application layer 219. In this regard, the transport layer path 231 comprises dual paths that may provide varying levels of reliability. For example, the more robust transport path one 231a may encapsulate the time-stamped non-reference frame data 129a in accordance with the TCP/IP protocol. In this manner, the frame data that is most important for robust frame reconstruction and for the synchronization of audio and video data on the client system 109 is transmitted in the most reliable manner. Although TCP/IP transport is slower than other transport layer methods, delivery of packets transported over TCP/IP is guaranteed, and the packets arrive in order. Further, the less robust but faster transport path two 231b may encapsulate the time-stamped reference frame data 129b in accordance with the UDP protocol. In this manner, transport path two 231b carries the frame data that is less important for robust frame reconstruction and for synchronization on the client system 109. Lost reference frames may be compensated for when video frames are reconstructed and audio is synchronized on the client system 109. For example, the stitching and drop frame processing 153 and video image recovery and audio synchronization 155 modules in the client system 109 may recover from lost or corrupted reference frames 129b more easily than from the loss of the frames that those frames reference.
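The split described above can be summarized as a simple mapping from a frame's dependency class to a transport path. The following sketch is illustrative only; the `Frame` type and the path names are assumptions for demonstration and do not appear in the patent:

```python
# Illustrative sketch: route time-stamped non-reference (independent) frame
# data over the reliable TCP/IP path and reference (dependent) frame data
# over the faster, less reliable UDP path, as described for paths 231a/231b.
from dataclasses import dataclass

@dataclass
class Frame:
    seq: int            # position in the video frame sequence
    is_reference: bool  # True if this frame depends on other frames
    payload: bytes

def select_transport_path(frame: Frame) -> str:
    """Map a frame to a transport-layer path by its dependency class."""
    # Independent (non-reference) frames are critical for reconstruction
    # and A/V sync, so they take the guaranteed, in-order TCP/IP path.
    if not frame.is_reference:
        return "transport_path_one_tcp"
    # Dependent (reference) frames tolerate loss, since the client can
    # repeat or insert frames to compensate; they take the faster UDP path.
    return "transport_path_two_udp"
```

Note that in the patent's terminology the "non-reference" frames 129a are the independent, high-priority frames, while the "reference" frames 129b are the dependent, lower-priority frames.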
The IP path 237 may comprise suitable logic, circuitry, and/or code that may enable internet protocols providing different levels of security. For example, the time-stamped non-reference frame data 129a output by TCP/IP transport path one 231a may be strongly encrypted by the robust IP path one 237a. In this regard, IPsec may be used to encrypt and/or authenticate each packet transmitted over the robust IP path one 237a. In this way, the key video frames 129a are protected from interception by unauthorized entities. In contrast, the less important time-stamped reference frame data 129b processed by UDP may be handled by the less robust IP path two 237b. The less robust IP path two 237b may use the IPv4 or IPv6 protocol without encrypting the frame data. Either IPv4 or IPv6 is sufficient for transferring the reference frame data 129b, because the reference frames 129b are of little use without the non-reference frames 129a needed to recover the video frames.
The link layer path 239 may comprise suitable logic, circuitry, and/or code that may enable classification of frame packets by priority as they are communicated to the PHY 143, and may enable routing operations in the wired and/or wireless communication network 160. In this regard, link path one 239a processes the non-reference frame data packets output by the robust IP path one 237a. Link path two 239b processes the reference frame data packets output by the less robust IP path two 237b. The link layer path 239 may set the transmission priority of queued packets based on the sequence of video frames to which each packet belongs and on whether the packet contains non-reference or reference frames. For example, the non-reference frames in a sequence of video frames are transmitted first, followed by the reference frames in the same sequence of video frames. After all frames in a video frame sequence have been transmitted, the next video frame sequence is processed.
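The queue ordering described for the link layer path 239 amounts to a two-level sort: earlier video frame sequences first, and within a sequence, non-reference packets ahead of reference packets. A minimal sketch, with packet field names assumed for illustration:

```python
# Illustrative sketch of the link-layer ordering for path 239: sort queued
# packets by (sequence_id, frame class), sending all of one video frame
# sequence before the next, and non-reference packets before reference ones.
def transmission_order(packets):
    """Return packets in transmit order: earlier sequences first; within a
    sequence, non-reference (high-priority) packets precede reference packets."""
    # In Python, False < True, so is_reference=False sorts first.
    return sorted(packets, key=lambda p: (p["sequence_id"], p["is_reference"]))

queue = [
    {"sequence_id": 2, "is_reference": False, "id": "B-nonref"},
    {"sequence_id": 1, "is_reference": True,  "id": "A-ref"},
    {"sequence_id": 1, "is_reference": False, "id": "A-nonref"},
]
# All of sequence 1 is transmitted before sequence 2 begins.
print([p["id"] for p in transmission_order(queue)])  # ['A-nonref', 'A-ref', 'B-nonref']
```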
In operation, media is captured by the media capture device 217 in the media source server 207 and encoded by the offset encoder 127 in an offset manner to limit reference frame dependencies. The encoded video frames are analyzed and classified according to the number of frames on which each target frame depends. Video and audio frames are time-stamped, and video frames with fewer dependencies are assigned a higher priority than frames that depend on a greater number of frames. Similarly, high priority frames may also be assigned a high level of external quality of service (QoS) that may ensure packet delivery. High priority frames, such as frames from the frame data 129a, are sent to transport path one 231a for transmission over the TCP/IP transport service, then to the robust IP path one 237a for encryption, and then to queues in link path one 239a. Low priority frames, such as frames from the frame data 129b, are sent to transport path two 231b for the UDP transport service, then to IP path two 237b, and then to queues in link path two 239b. The PHY 143 transmits all packets containing frames from a first sequence of video frames before transmitting any packets from a second sequence of video frames. Furthermore, the PHY 143 transmits packets containing high priority frame data in a sequence of video frames before transmitting packets containing low priority frame data in the same sequence of video frames.
Fig. 3 is a flow chart of exemplary steps for implementing selective transmission of multimedia data in accordance with a preferred embodiment of the present invention. Step 300 is the start step. In step 302, video frames and/or audio are retrieved or captured in the media source server 107. In step 304, if the video frames are not encoded, step 306 is performed. In step 306, in the application layer 219, the frames in a sequence of video frames are encoded with an offset; the video frames are analyzed, and the priority of each frame is set based on the number of frames referenced by that frame. The highest priority is assigned to independent frames that do not reference other frames. The number of frames that each frame may reference is limited or adjustable. In step 308, the non-reference frames 129a, reference frames 129b, and audio frames 129c are time-stamped in the application layer 219. In step 310, the frames are transmitted to the client system 109 according to their priority, where high priority frames are sent over the robust and secure path 137a and low priority frames are sent over a faster but less robust path. In step 312, frames transmitted over multiple paths are received and combined into a sequence of video frames in the client system 109. Lost frames may be compensated for by skipping frames, repeating frames, or inserting lost frames and adjusting the corresponding audio time or audio offset. In step 314, the video frames are decoded. Step 316 is an exemplary end step. If, in step 304, the video frames are already encoded, the method proceeds to step 318, in which the video frames are decoded.
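The priority assignment of step 306 can be sketched as a function of each frame's reference count, with independent frames scoring highest. The numeric scheme and the `max_refs` cap below are assumptions for illustration; the patent states only that the reference count is limited or adjustable:

```python
# Illustrative sketch of step 306: assign a priority to each frame based on
# the number of other frames it references. Independent frames (zero
# references) receive the highest priority; the cap models the encoder's
# limited/adjustable dependency depth.
def frame_priority(num_referenced: int, max_refs: int = 4) -> int:
    """Higher value = higher priority; independent frames score highest."""
    # Clamp, since the offset encoder limits how many frames may be referenced.
    num_referenced = min(num_referenced, max_refs)
    return max_refs - num_referenced

print(frame_priority(0))  # 4 -> independent frame, top priority
print(frame_priority(3))  # 1 -> heavily dependent frame, low priority
```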
In one embodiment of the invention, the client system 109 receives and processes a sequence of video frames and corresponding audio from the media source server 107. Within the sequence, a portion of the video frames contain data that can be used to reconstruct those video frames at the client system 109 independently of data in other video frames. These independent video frames are received at a low data rate over a first communication path. Another portion of the video frames in the sequence depend on one or more other video frames during reconstruction. These dependent video frames are received at a higher data rate over a second communication path. In this regard, the media source server 107 limits the dependencies of each frame when encoding the video frames in a sequence of video frames. In addition, the media source server 107 time-stamps the dependent video frames, the independent video frames, and/or the corresponding audio data. The media source server 107 applies more robust processing, in a more robust protocol stack, to the independent video frames, and applies less robust processing to the video frames that have dependencies. In addition, the media source server 107 encrypts the independent video frames. The media source server 107 may use a more reliable method for transmitting independent video frames than for transmitting dependent video frames, and may deliver the dependent video frames by faster and/or less reliable methods.
Video frames containing independent frame data are received by the client system 109 with a higher level of security, and are processed in a more robust manner, than video frames containing frame data that has dependencies. However, the video frames containing dependent frame data are received at a higher data rate. In various embodiments of the present invention, the client system 109 stores independent frame data and dependent frame data in different path queues 149. In other embodiments of the present invention, the client system 109 stores independent frame data and dependent frame data in the same path queue 149. The independent frame data is used to compensate for lost dependent video frames by generating repeated video frames and/or inserting video frames. The received audio content is synchronized with the corresponding received video frames, repeated video frames, and/or inserted video frames. In addition, the received, repeated, and/or inserted video frames are combined and decoded in display order. In various embodiments of the present invention, the audio content may be offset in time.
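The lost-frame compensation described above can be sketched as rebuilding the display-order frame list and repeating the most recent good frame wherever a dependent frame was lost. The data model below is an assumption for illustration, not the patent's implementation:

```python
# Illustrative sketch: compensate for lost dependent frames by repeating the
# last successfully received frame in display order, so that audio can be
# synchronized against the repeated frame's slot.
def fill_lost_frames(received, expected_indices):
    """Rebuild a display-order frame list, repeating the last good frame
    wherever an expected frame index was lost in transit."""
    by_index = {f["index"]: f for f in received}
    output, last_good = [], None
    for i in expected_indices:
        frame = by_index.get(i)
        if frame is None and last_good is not None:
            # Repeat the previous frame in place of the lost one.
            frame = {"index": i, "data": last_good["data"], "repeated": True}
        if frame is not None:
            output.append(frame)
            last_good = frame
    return output

received = [{"index": 0, "data": "I0"}, {"index": 2, "data": "P2"}]
result = fill_lost_frames(received, [0, 1, 2])
print([(f["index"], f.get("repeated", False)) for f in result])
# [(0, False), (1, True), (2, False)]
```

Inserting interpolated frames, or skipping and adjusting the audio offset instead, would be handled analogously at the same point in the pipeline.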
Another embodiment of the present invention provides a machine-readable and/or computer-readable storage and/or medium having stored thereon machine code and/or a computer program having at least one code section executable by a machine and/or a computer, for causing the machine and/or computer to perform the method and system described herein for server and client selectable video frame paths.
Accordingly, the present invention may be realized in hardware, software, or a combination of hardware and software. The present invention can be realized in a centralized fashion in at least one computer system, or in a distributed fashion where different elements are spread across several interconnected computer systems. Any kind of computer system or other apparatus adapted for carrying out the methods described herein is suited. A typical combination of hardware and software could be a general purpose computer system with a computer program that, when being loaded and executed, controls the computer system such that it carries out the methods described herein. The method is implemented in a computer system using a processor and a memory unit.
The present invention can also be embedded in a computer program product, which comprises all the features enabling the implementation of the methods described herein, and which, when loaded in a computer system, is able to carry out these methods. Computer program in the present context means any expression, in any language, code, or notation, of a set of instructions intended to cause a system having an information processing capability to perform a particular function either directly, or after either or both of the following: a) conversion to another language, code, or notation; b) reproduction in a different material form.
While the invention has been described with reference to several embodiments, it will be understood by those skilled in the art that various changes may be made and equivalents may be substituted without departing from the scope of the invention. In addition, many modifications may be made to adapt a particular situation or material to the teachings of the invention without departing from its scope. Therefore, it is intended that the invention not be limited to the particular embodiment disclosed, but that the invention will include all embodiments falling within the scope of the appended claims.
Claims (5)
1. A method of communication, comprising:
receiving, by a client device, a sequence of video frames from a media source, wherein:
receiving, over a second communication path, dependent video frames of the sequence of video frames that depend on one or more other video frames; and
receiving, over a first communication path and at a higher level of security, independent video frames of the sequence of video frames that, in contrast to the dependent video frames, do not depend on one or more other video frames;
processing, by the client device, the received sequence of video frames and corresponding audio content received from the media source, the independent video frames being more robustly processed by the client device than the dependent video frames;
the client device generating repeated and/or inserted video frames based on the received independent video frames to compensate for lost ones of the dependent video frames; and
the client device synchronizes the corresponding audio content using the generated repeated video frames and/or inserted video frames.
2. The method of claim 1, further comprising storing the dependent video frames and the independent video frames in separate queues in the client system.
3. The method of claim 1, further comprising storing the dependent video frames and the independent video frames together in a single queue in the client system.
4. A communication system, comprising:
one or more circuits in a client for receiving a sequence of video frames from a media source, wherein:
receiving, over a second communication path, dependent video frames of the sequence of video frames that depend on one or more other video frames; and
receiving, over a first communication path and at a higher level of security, independent video frames of the sequence of video frames that, in contrast to the dependent video frames, do not depend on one or more other video frames;
the one or more circuits are operable to process the received sequence of video frames and corresponding audio content received from the media source, the client device applying more robust processing to the independent video frames than to the dependent video frames;
generating repeated and/or inserted video frames based on the received independent video frames to compensate for lost ones of the dependent video frames; and
synchronizing the corresponding audio content using the generated repeated video frames and/or the inserted video frames.
5. The system according to claim 4, wherein said one or more circuits are operable to store said dependent video frames and said independent video frames in separate queues in said client system.
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US12/365,045 US20100199322A1 (en) | 2009-02-03 | 2009-02-03 | Server And Client Selective Video Frame Pathways |
| US12/365,045 | 2009-02-03 | | |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| HK1149403A1 HK1149403A1 (en) | 2011-09-30 |
| HK1149403B true HK1149403B (en) | 2013-08-23 |